Bioinformatics/Subsequence

From Rosetta Code

Revision as of 15:08, 20 March 2021

Bioinformatics/Subsequence is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.
Task

Randomly generate a string of 200 characters drawn from A, C, G, and T, representing a DNA sequence, then write a routine to find the positions of a subsequence within it.
Let the length of the subsequence be 4.

Ring

<lang ring>
row = 0
dnaList = []
base = ["A","C","G","T"]
long = 20
see "DNA sequence:" + nl
see " " + long + ": "

for nr = 1 to 200

   row = row + 1
   rnd = random(3)+1
   baseStr = base[rnd]
   see baseStr # + " "
   if (row%20) = 0 and long < 200
       long = long + 20
       see nl
       if long < 100
          see " " + long + ": "
       else
          see "" + long + ": "
       ok
   ok
   add(dnaList,baseStr)

next

strBase = ""
for n = 1 to 4

   rnd = random(3)+1
   strBase = strBase + base[rnd]

next

see "subsequence to search: " + strBase + nl

seqok = 0

for n = 1 to 197    # 200 - 4 + 1 possible start positions

   flag = 1
   for m = 0 to 3
       if dnaList[n+m] != strBase[m+1]
          flag = 0
          exit
       ok
   next
   if flag = 1
      seqok = 1
      see "start position of subsequence = " + n + nl
   ok

next

if seqok = 0
   see "subsequence not found" + nl
ok
</lang>

Output:
DNA sequence:
 20: GAGTATAAAAAGCGACATAG
 40: AAGCAGGGGGGGAACAGACA
 60: ACAATTGTGAAAACTAATCA
 80: ATACGGAAAAGGATAAACAT
100: GAGGGACTGCGGTTGGTAGG
120: CGATGAAACCTAAGAATGAA
140: AACGAGGAAGGTGTAAAGTG
160: ATGGGGTCATGGGACAGACA
180: TAGCTAAATGGATAAAAGCG
200: GGTGAAGTCGGTCGCAAACG
subsequence to search: ATGA
start position of subsequence = 79
start position of subsequence = 103
start position of subsequence = 116
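
The search step can also be written as a reusable function and checked against a fixed input instead of a random one. The following is a minimal sketch, not part of the original entry: the names findAll, cText, cSub and lastStart are hypothetical, and the test string is the row labelled 120 in the output above (bases 101 to 120).

<lang ring>
# Sketch only: findAll and its parameter names are illustrative assumptions.
dna = "CGATGAAACCTAAGAATGAA"     # bases 101 to 120 of the sample output
pattern = "ATGA"

positions = findAll(dna, pattern)
if len(positions) = 0
   see "subsequence not found" + nl
else
   for p in positions
       see "start position of subsequence = " + p + nl
   next
ok

# Return a list of every 1-based start position of cSub in cText.
func findAll cText, cSub
   found = []
   lastStart = len(cText) - len(cSub) + 1    # last position where cSub still fits
   for n = 1 to lastStart
       flag = 1
       for m = 1 to len(cSub)
           if cText[n+m-1] != cSub[m]
              flag = 0
              exit
           ok
       next
       if flag = 1
          add(found, n)
       ok
   next
   return found
</lang>

With this input the sketch should report positions 3 and 16 within the row, which correspond to positions 103 and 116 in the full sequence shown above. Computing the upper bound from the two lengths avoids hard-coding it for the 200-base, 4-base case.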