
22131_Trinucleotide

2022-5-16 18:21 | Publisher: Hocassian | Views: 23 | Comments: 0 | Original author: 肇庆学院ACM合集


Pro.ID

22131

Title

Trinucleotide

Title link

http://10.20.2.8/oj/exercise/problem?problem_id=22131

AC

Submit

Ratio

0.00%

Time & Memory Limits

Time Limit: 2000/1000 MS (Java/Others) Memory Limit: 65536/65536 K (Java/Others)

Description

Bioinformatics, a relatively new discipline, has developed rapidly, and research on DNA (deoxyribonucleic acid) is very popular today. DNA is made up of nucleotides, written 'A', 'T', 'C', and 'G'.

From the literature we know that the "genomic signature", another hot topic in bioinformatics, is preserved in short DNA fragments. This problem therefore focuses on the "trinucleotide": a DNA fragment of length 3.

There are 64 kinds of trinucleotide: "AAA", "AAG", ..., "GGG". A DNA sequence of length L contains (L-2) trinucleotides. We analyze them statistically as follows:

We label the (L-2) trinucleotides from 1 to L-2, in the order in which they start in the sequence.

We consider every pair of trinucleotides; there are (L-2)*(L-3)/2 pairs in total. If the two trinucleotides in a pair are identical, we record their distance, defined as the difference between their labels.

From the recorded sample data we calculate the variance: S² = [(x₁-X)² + (x₂-X)² + ... + (xₙ-X)²]/n, where X = (x₁+x₂+...+xₙ)/n. If the sample size n=0, we take S² = X = 0.

For example, for the DNA sequence ATATATA:

We label the trinucleotides: L₁: ATA, L₂: TAT, L₃: ATA, L₄: TAT, L₅: ATA.

The identical pairs and their distances are (L₁, L₃)=2, (L₁, L₅)=4, (L₃, L₅)=2, and (L₂, L₄)=2, so the sample data are 2, 4, 2, 2.
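The labelling and pairing steps can be reproduced with a direct enumeration. The following is only a minimal sketch: the function name `pair_distances` is a hypothetical choice, not part of the problem, and the quadratic loop is meant for checking small examples rather than for full-size inputs.

```python
def pair_distances(seq):
    """Return the distances between every pair of identical trinucleotides.

    Labels are assigned by start position, so the distance between two
    trinucleotides is simply the difference of their indices.
    """
    # All length-3 windows of the sequence, in order of position.
    trinucleotides = [seq[i:i + 3] for i in range(len(seq) - 2)]
    distances = []
    for i in range(len(trinucleotides)):
        for j in range(i + 1, len(trinucleotides)):
            if trinucleotides[i] == trinucleotides[j]:
                distances.append(j - i)
    return distances


print(pair_distances("ATATATA"))  # prints [2, 4, 2, 2]
```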

The average X=(2+4+2+2)/4=2.5.

The variance S² = [(2-2.5)²+(4-2.5)²+(2-2.5)²+(2-2.5)²]/4 = 0.75
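The arithmetic above can be checked with a few lines of code. This sketch assumes the variance in the statement is the population variance (division by n, not n-1), which matches the formula given; `population_variance` is a hypothetical helper name.

```python
def population_variance(sample):
    """Population variance of a list of numbers; 0.0 for an empty sample."""
    n = len(sample)
    if n == 0:
        # Convention from the problem statement: variance is 0 when n = 0.
        return 0.0
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / n


print(population_variance([2, 4, 2, 2]))  # prints 0.75
```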

Given a DNA sequence, calculate the variance described above.

Input

The first line of input contains one integer T (T ≤ 100), the number of test cases. Each test case is a string consisting of 'A', 'T', 'C', and 'G', which is the DNA sequence. The length of the string is at least 3 and at most 100000.
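With strings of up to 100000 characters, a single sequence (for example, one made of a single repeated letter) can contain roughly 5×10⁹ identical pairs, so enumerating pairs one by one is probably too slow. One possible approach, which is not given in the problem and is offered here only as a sketch, keeps running totals per trinucleotide and uses the identity S² = (n·Σd² - (Σd)²)/n². The function name and the bookkeeping below are assumptions made for illustration; Python integers are used so that the large sums stay exact until the final division.

```python
from collections import defaultdict


def trinucleotide_variance(seq):
    """Variance of distances between identical trinucleotides, in one pass."""
    count = defaultdict(int)       # occurrences seen so far, per trinucleotide
    sum_pos = defaultdict(int)     # sum of their positions
    sum_pos_sq = defaultdict(int)  # sum of their squared positions

    n = 0         # number of identical pairs
    total = 0     # sum of distances over all pairs
    total_sq = 0  # sum of squared distances over all pairs

    for p in range(len(seq) - 2):
        key = seq[p:p + 3]
        c, s1, s2 = count[key], sum_pos[key], sum_pos_sq[key]
        # Position p pairs with each of the c earlier occurrences q:
        #   sum of (p - q)   = c*p - s1
        #   sum of (p - q)^2 = c*p^2 - 2*p*s1 + s2
        n += c
        total += c * p - s1
        total_sq += c * p * p - 2 * p * s1 + s2
        count[key] = c + 1
        sum_pos[key] = s1 + p
        sum_pos_sq[key] = s2 + p * p

    if n == 0:
        return 0.0  # convention from the problem statement
    # Exact integer numerator and denominator; one division at the end.
    return (n * total_sq - total * total) / (n * n)


print(trinucleotide_variance("ATATATA"))  # prints 0.75
```

Delaying the division until the last step is a deliberate choice here: the required relative error is small, and subtracting two large floating-point quantities could lose precision.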

Output

For each test case, output one line with the answer S², rounded to six decimal places. Your answer is considered correct if its relative error against the standard output is less than 1e-8.
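As a small illustration of the output format, the sketch below prints a value with six digits after the decimal point, which is how the sample output appears. The helper name `format_answer` is hypothetical.

```python
def format_answer(variance):
    """Format a variance with exactly six digits after the decimal point."""
    return f"{variance:.6f}"


print(format_answer(0.75))  # prints 0.750000
```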

Sample Input

1
ATATATA

Sample Output

0.750000

Source

ZS 2007 Ex1