FASTA format: Difference between revisions

Revision as of 08:53, 4 March 2024

Walterpachl


Revision as of 11:48, 13 September 2016 (view source) rosettacode>Juan70 (Added OCaml) ← Older edit		Latest revision as of 08:53, 4 March 2024 (view source) Walterpachl (talk \| contribs) (→‎version 2: rewritten)
{{task}}

In [[wp:bioinformatics|bioinformatics]], long character strings are often encoded in a format called [[wp:FASTA format|FASTA]].

A FASTA file can contain several strings, each identified by a name marked by a <big><big><code>></code></big></big> (greater than) character at the beginning of the line.

;Task:
Write a program that reads a FASTA file such as:
<pre>
>Rosetta_Example_1
THERECANBENOSPACE
>Rosetta_Example_2
THERECANBESEVERAL
LINESBUTTHEYALLMUST
BECONCATENATED
</pre>
{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>
Note that a high-quality implementation will not hold the entire file in memory at once; real FASTA files can be multiple gigabytes in size.
<br><br>

=={{header|11l}}==
{{trans|Python}}
<syntaxhighlight lang="11l">V FASTA =
|‘>Rosetta_Example_1
  THERECANBENOSPACE
  >Rosetta_Example_2
  THERECANBESEVERAL
  LINESBUTTHEYALLMUST
  BECONCATENATED’

F fasta_parse(infile_str)
   V key = ‘’
   V val = ‘’
   [(String, String)] r
   L(line) infile_str.split("\n")
      I line.starts_with(‘>’)
         I key != ‘’
            r [+]= (key, val)
         key = line[1..].split_py()[0]
         val = ‘’
      E I key != ‘’
         val ‘’= line
   I key != ‘’
      r [+]= (key, val)
   R r

print(fasta_parse(FASTA).map((key, val) -> ‘#.: #.’.format(key, val)).join("\n"))</syntaxhighlight>
{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|Action!}}==
In the following solution, the input file [https://gitlab.com/amarok8bit/action-rosetta-code/-/blob/master/source/fasta.txt fasta.txt] is loaded from the H6 drive.
The Altirra emulator automatically converts the ASCII CR/LF sequence into character 155 of the ATASCII character set used by Atari 8-bit computers when one of the H6-H10 hard drives is used under DOS 2.5.
<syntaxhighlight lang="action!">PROC ReadFastaFile(CHAR ARRAY fname)
  CHAR ARRAY line(256)
  CHAR ARRAY tmp(256)
  BYTE newLine,dev=[1]

  newLine=0
  Close(dev)
  Open(dev,fname,4)
  WHILE Eof(dev)=0
  DO
    InputSD(dev,line)
    IF line(0)>0 AND line(1)='> THEN
      IF newLine THEN
        PutE()
      FI
      newLine=1
      SCopyS(tmp,line,2,line(0))
      Print(tmp) Print(": ")
    ELSE
      Print(line)
    FI
  OD
  Close(dev)
RETURN

PROC Main()
  CHAR ARRAY fname="H6:FASTA.TXT"

  ReadFastaFile(fname)
RETURN</syntaxhighlight>
{{out}}
[https://gitlab.com/amarok8bit/action-rosetta-code/-/raw/master/images/FASTA_format.png Screenshot from Atari 8-bit computer]
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATE
</pre>

=={{header|Ada}}==

The simple solution just reads the file (from standard input) line by line and directly writes it to the standard output.
<syntaxhighlight lang="ada">with Ada.Text_IO; use Ada.Text_IO;

procedure Simple_FASTA is

   Current: Character;

begin
   Get(Current);
   if Current /= '>' then
      raise Constraint_Error with "'>' expected";
   end if;
   while not End_Of_File loop -- read name and string
      Put(Get_Line & ": "); -- read name and write directly to output
      Read_String:
      loop
         exit Read_String when End_Of_File; -- end of input
         Get(Current);
         if Current = '>' then -- next name
            New_Line;
            exit Read_String;
         else
            Put(Current & Get_Line); -- read part of string and write directly to output
         end if;
      end loop Read_String;
   end loop;

end Simple_FASTA;</syntaxhighlight>

{{out}}
<pre>./simple_fasta < test.txt
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED</pre>

This is a boringly simple text transformation.

The following more complex solution reads the entire file into a map and then prints the data stored in the map. The output is exactly the same as for the simple text transformation.

''"Note that a high-quality implementation will not hold the entire file in memory at once; real FASTA files can be multiple gigabytes in size."''

When processing FASTA files, one may consume the input step by step to update an internal data structure and, at the end, output the answer to a given question. For the task at hand the required output is about the same as the input, thus we store the entire input. For another task, we would not store the entire file. If the task were, e.g., to count the number of characters for each string, we would store (name, number) pairs in our data structure.
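As a brief aside before the map-based Ada program, the (name, number) idea from the last paragraph can be sketched in Python. This is only an illustration under stated assumptions: the function name <code>sequence_lengths</code> and the in-memory sample are hypothetical and not part of any entry on this page. Only one counter per name is kept, never the sequences themselves.

```python
import io
from typing import Dict, Iterable, Optional


def sequence_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {name: number of sequence characters}, reading one line at a time.

    Hypothetical helper for illustration; only the counters are stored.
    """
    counts: Dict[str, int] = {}
    name: Optional[str] = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            name = line[1:]            # header line: start a new counter
            counts[name] = 0
        elif name is not None:
            counts[name] += len(line)  # sequence line: add its length
    return counts


# The task's example data, held in memory only to keep the sketch self-contained.
sample = io.StringIO(
    ">Rosetta_Example_1\n"
    "THERECANBENOSPACE\n"
    ">Rosetta_Example_2\n"
    "THERECANBESEVERAL\n"
    "LINESBUTTHEYALLMUST\n"
    "BECONCATENATED\n"
)
for name, count in sequence_lengths(sample).items():
    print(f"{name}: {count}")
```

Passing an open file object instead of the sample would process a real file line by line in the same way.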
<syntaxhighlight lang="ada">with Ada.Text_IO, Ada.Containers.Indefinite_Ordered_Maps; use Ada.Text_IO;

procedure FASTA is

   package Maps is new Ada.Containers.Indefinite_Ordered_Maps
     (Element_Type => String, Key_Type => String);
   Map: Maps.Map; -- Map holds the full file (as pairs of name and value)

   function Get_Value(Previous: String := "") return String is
      Current: Character;
   begin
      if End_Of_File then
         return Previous; -- file ends
      else
         Get(Current); -- read first character
         if Current = '>' then -- ah, a new name begins
            return Previous; -- the string read so far is the value
         else -- the entire line is part of the value
            return Get_Value(Previous & Current & Get_Line);
         end if;
      end if;
   end Get_Value;

   procedure Print_Pair(Position: Maps.Cursor) is
   begin
      Put_Line(Maps.Key(Position) & ": " & Maps.Element(Position));
      -- Maps.Key(X) is the name and Maps.Element(X) is the value at X
   end Print_Pair;

   Skip_This: String := Get_Value;
   -- consumes the entire file, until the first line starting with '>'.
   -- the string Skip_This should be empty, but we don't verify this

begin
   while not End_Of_File loop -- read the file into Map
      declare
         Name: String := Get_Line;
         -- reads all characters in the line, except for the first ">"
         Value: String := Get_Value;
      begin
         Map.Insert(Key => Name, New_Item => Value);
         -- adds the pair (Name, Value) to Map
      end;
   end loop;

   Map.Iterate(Process => Print_Pair'Access); -- print Map
end FASTA;</syntaxhighlight>

=={{header|Aime}}==
<syntaxhighlight lang="aime">file f;
text n, s;

f.affix(argv(1));

while (f.line(s) ^ -1) {
    if (s[0] == '>') {
        o_(n, s, ": ");
        n = "\n";
    } else {
        o_(s);
    }
}

o_(n);</syntaxhighlight>
{{Out}}
<pre>>Rosetta_Example_1: THERECANBENOSPACE
>Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED</pre>

=={{header|ALGOL 68}}==
{{Trans|ALGOL W}}
<syntaxhighlight lang="algol68">
BEGIN
    # read FASTA format data from standard input and write the results to #
    # standard output - only the ">" line start is handled                #

    BOOL at eof := FALSE;
    on logical file end( stand in, ( REF FILE f )BOOL: at eof := TRUE );

    WHILE STRING line;
          read( ( line, newline ) );
          NOT at eof
    DO
        IF line /= "" THEN # non-empty line #
            INT  start := LWB line;
            BOOL is heading = line[ start ] = ">"; # check for heading line #
            IF is heading THEN
                print( ( newline ) );
                start +:= 1
            FI;
            print( ( line[ start : ] ) );
            IF is heading THEN print( ( ": " ) ) FI
        FI
    OD
END
</syntaxhighlight>
{{out}}
<pre>

Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|ALGOL W}}==
<syntaxhighlight lang="algolw">begin
    % reads FASTA format data from standard input and write the results to standard output %
    % only handles the ">" line start %
    string(256) line;
    % allow the program to continue after reaching end-of-file %
    ENDFILE := EXCEPTION( false, 1, 0, false, "EOF" );
    % handle the input %
    readcard( line );
    while not XCPNOTED(ENDFILE) do begin
        % strings are fixed length in Algol W - we need to find the line
          length with trailing spaces removed %
        integer len;
        len := 255;
        while len > 0 and line( len // 1 ) = " " do len := len - 1;
        if len > 0 then begin % non-empty line %
            integer pos;
            pos := 0;
            if line( 0 // 1 ) = ">" then begin % header line %
                write();
                pos := 1;
            end if_header_line ;
            for cPos := pos until len do writeon( line( cPos // 1 ) );
            if line( 0 // 1 ) = ">" then writeon( ": " )
        end if_non_empty_line ;
        readcard( line );
    end while_not_eof
end.</syntaxhighlight>
{{out}}
<pre>

Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|Arturo}}==
<syntaxhighlight lang="rebol">parseFasta: function [data][
    result: #[]
    current: ø
    loop split.lines data 'line [
        if? `>` = first line [
            current: slice line 1 (size line)-1
            set result current ""
        ]
        else ->
            set result current (get result current)++line
    ]
    return result
]

text: {
    >Rosetta_Example_1
    THERECANBENOSPACE
    >Rosetta_Example_2
    THERECANBESEVERAL
    LINESBUTTHEYALLMUST
    BECONCATENATED
}

inspect.muted parseFasta text</syntaxhighlight>
{{out}}
<pre>[ :dictionary
	Rosetta_Example_1  :	THERECANBENOSPACE :string
	Rosetta_Example_2  :	THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED :string
]</pre>

=={{header|AutoHotkey}}==
<syntaxhighlight lang="autohotkey">Data =
(
>Rosetta_Example_1
Line 32 ⟶ 323:
Gui, add, Edit, w700,  % Data
Gui, show
return</syntaxhighlight>
{{out}}
<pre>>Rosetta_Example_1: THERECANBENOSPACE
>Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED</pre>

=={{header|AWK}}==
<syntaxhighlight lang="awk">
# syntax: GAWK -f FASTA_FORMAT.AWK filename
# stop processing each file when an error is encountered
Line 89 ⟶ 381:
    return
}
</syntaxhighlight>
{{out}}
<pre>
Line 95 ⟶ 387:
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|BASIC}}==
==={{header|QBasic}}===
{{works with|QBasic|1.1}}
{{works with|QuickBasic|4.5}}
<syntaxhighlight
lang="qbasic">FUNCTION checkNoSpaces (s$) FOR i = 1 TO LEN(s$) - 1 IF MID$(s$, i, 1) = CHR$(32) OR MID$(s$, i, 1) = CHR$(9) THEN checkNoSpaces = 0 NEXT i checkNoSpaces = 1 END FUNCTION OPEN "input.fasta" FOR INPUT AS #1 first = 1 DO WHILE NOT EOF(1) LINE INPUT #1, ln$ IF LEFT$(ln$, 1) = ">" THEN IF NOT first THEN PRINT PRINT MID$(ln$, 2); ": "; IF first THEN first = 0 ELSEIF first THEN PRINT : PRINT "Error : File does not begin with '>'" EXIT DO ELSE IF checkNoSpaces(ln$) THEN PRINT ln$; ELSE PRINT : PRINT "Error : Sequence contains space(s)" EXIT DO END IF END IF LOOP CLOSE #1</syntaxhighlight> ==={{header\|True BASIC}}=== {{trans\|QBasic}} <syntaxhighlight lang="qbasic">DEF EOF(f) IF END #f THEN LET EOF = -1 ELSE LET EOF = 0 END DEF FUNCTION checknospaces(s$) FOR i = 1 TO LEN(s$)-1 IF (s$)[i:1] = CHR$(32) OR (s$)[i:1] = CHR$(9) THEN LET checkNoSpaces = 0 NEXT i LET checknospaces = 1 END FUNCTION OPEN #1: NAME "m:\input.fasta", org text, ACCESS INPUT, create old LET first = 1 DO WHILE (NOT EOF(1)<>0) LINE INPUT #1: ln$ IF (ln$)[1:1] = ">" THEN IF (NOT first<>0) THEN PRINT PRINT (ln$)[2:maxnum]; ": "; IF first<>0 THEN LET first = 0 ELSEIF first<>0 THEN PRINT "Error : File does not begin with '>'" EXIT DO ELSE IF checknospaces(ln$)<>0 THEN PRINT ln$; ELSE PRINT "Error : Sequence contains space(s)" EXIT DO END IF END IF LOOP CLOSE #1 END</syntaxhighlight> =={{header\|BASIC256}}== <syntaxhighlight lang="basic256">open 1, "input.fasta" first = True while not eof(1) ln = readline(1) if left(ln, 1) = ">" then if not first then print print mid(ln, 2, length(ln)-2) & ": "; if first then first = False else if first then print "Error : File does not begin with '>'" exit while else if checkNoSpaces(ln) then print left(ln, length(ln)-2); else print "Error : Sequence contains space(s)" exit while end if end if end if end while close 1 end function checkNoSpaces(s) for i = 1 to length(s) - 1 if chr(mid(s,i,1)) = 32 or chr(mid(s,i,1)) = 9 then return False next i return True end 
function</syntaxhighlight>

=={{header|C}}==
<syntaxhighlight lang="c">#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t; getline() itself is POSIX */

int main(void)
{
	FILE * fp;
	char * line = NULL;
	size_t len = 0;
	ssize_t read;

	fp = fopen("fasta.txt", "r");
	if (fp == NULL)
		exit(EXIT_FAILURE);

	int state = 0;
	while ((read = getline(&line, &len, fp)) != -1) {
		/* Delete trailing newline */
		if (line[read - 1] == '\n')
			line[read - 1] = 0;
		/* Handle header lines */
		if (line[0] == '>') {
			if (state == 1)
				printf("\n");
			printf("%s: ", line+1);
			state = 1;
		} else {
			/* Print everything else */
			printf("%s", line);
		}
	}
	printf("\n");

	fclose(fp);
	if (line)
		free(line);
	exit(EXIT_SUCCESS);
}</syntaxhighlight>
{{out}}
<pre>Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED</pre>

=={{header|C sharp|C#}}==
<syntaxhighlight lang="csharp">using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

class Program
{
    public class FastaEntry
    {
        public string Name { get; set; }
        public StringBuilder Sequence { get; set; }
    }

    static IEnumerable<FastaEntry> ParseFasta(StreamReader fastaFile)
    {
        FastaEntry f = null;
        string line;
        while ((line = fastaFile.ReadLine()) != null)
        {
            // ignore comment lines
            if (line.StartsWith(";"))
                continue;

            if (line.StartsWith(">"))
            {
                if (f != null)
                    yield return f;
                f = new FastaEntry { Name = line.Substring(1), Sequence = new StringBuilder() };
            }
            else if (f != null)
                f.Sequence.Append(line);
        }
        // the last entry has no following header to trigger it; skip it if no header was seen
        if (f != null)
            yield return f;
    }

    static void Main(string[] args)
    {
        try
        {
            using (var fastaFile = new StreamReader("fasta.txt"))
            {
                foreach (FastaEntry f in ParseFasta(fastaFile))
                    Console.WriteLine("{0}: {1}", f.Name, f.Sequence);
            }
        }
        catch (FileNotFoundException e)
        {
            Console.WriteLine(e);
        }
        Console.ReadLine();
    }
}</syntaxhighlight>

=={{header|C++}}==
<syntaxhighlight lang="cpp">#include <iostream>
#include <fstream>
Line 137 ⟶ 634:
    return 0;
}</syntaxhighlight>

{{out}}
Line 144 ⟶ 641:
</pre>

=={{header|Clojure}}==
<syntaxhighlight
lang="clojure">(defn fasta [pathname]
  (with-open [r (clojure.java.io/reader pathname)]
    (doseq [line (line-seq r)]
      (if (= (first line) \>)
          (print (format "%n%s: " (subs line 1)))
        (print line)))))</syntaxhighlight>

=={{header|Common Lisp}}==
<syntaxhighlight lang="lisp">;; The input file as a parameter
(defparameter input #p"fasta.txt"
  "The input file name.")

;; * Reading the data
(with-open-file (data input)
  (loop
     :for line = (read-line data nil nil)
     :while line
     ;; Check if we have a header line using a simple test instead of a RegEx
     :if (char= #\> (char line 0))
     :do (format t "~&~a: " (subseq line 1))
     :else
     :do (format t "~a" line)))</syntaxhighlight>
{{out}}
<pre>Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED</pre>

=={{header|Crystal}}==
To run the code below online, paste it into the [https://play.crystal-lang.org/#/cr <b>playground</b>].
<syntaxhighlight lang="ruby">
# create tmp fasta file in /tmp/
tmpfile = "/tmp/tmp"+Random.rand.to_s+".fasta"
File.write(tmpfile, ">Rosetta_Example_1
THERECANBENOSPACE
>Rosetta_Example_2
THERECANBESEVERAL
LINESBUTTHEYALLMUST
BECONCATENATED")

# read tmp fasta file and store to hash
ref = tmpfile
id = seq = ""
fasta = {} of String => String
File.each_line(ref) do |line|
  if line.starts_with?(">")
    fasta[id] = seq.sub(/\s/, "") if id != ""
    id = line.split(/\s/)[0].lstrip(">")
    seq = ""
  else
    seq += line
  end
end
fasta[id] = seq.sub(/\s/, "")

# show fasta component
fasta.each { |k,v| puts "#{k}: #{v}"}
</syntaxhighlight>
{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|Delphi}}==
See [https://rosettacode.org/wiki/FASTA_format#Pascal Pascal].

=={{header|EasyLang}}==
<syntaxhighlight>
repeat
   s$ = input
   until s$ = ""
   if substr s$ 1 1 = ">"
      if stat = 1
         print ""
      .
      stat = 1
      print s$
   else
      write s$
   .
.
input_data
>Rosetta_Example_1
THERECANBENOSPACE
>Rosetta_Example_2
THERECANBESEVERAL
LINESBUTTHEYALLMUST
BECONCATENATED
</syntaxhighlight>

=={{header|F_Sharp|F#}}==
<syntaxhighlight lang="fsharp">
//FASTA format. Nigel Galloway: March 23rd., 2023.
let fN(g:string)=match g[0] with '>'->printfn "\n%s:" g[1..] |_->printf "%s" g
let lines=seq{use n=System.IO.File.OpenText("testFASTA.txt") in while not n.EndOfStream do yield n.ReadLine()}
printfn "%s:" ((Seq.head lines)[1..]); Seq.tail lines|>Seq.iter fN; printfn ""
</syntaxhighlight>
{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|Factor}}==
<syntaxhighlight lang="factor">USING: formatting io kernel sequences ;
IN: rosetta-code.fasta

: process-fasta-line ( str -- )
    dup ">" head? [ rest "\n%s: " printf ] [ write ] if ;

: main ( -- )
    readln rest "%s: " printf [ process-fasta-line ] each-line ;

MAIN: main</syntaxhighlight>
{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|Forth}}==
Developed with gforth 0.7.9
<syntaxhighlight lang="forth">1024 constant max-Line
char > constant marker

: read-lines begin  pad max-line >r over r> swap
                    read-line throw
             while  pad dup c@ marker =
                    if cr 1+ swap type ." : "
                    else swap type
                    then
             repeat drop ;

: Test s" ./FASTA.txt" r/o open-file throw
       read-lines
       close-file throw
       cr ;
Test
</syntaxhighlight>
{{out}}
<pre>
Rosetta_Example_1 : THERECANBENOSPACE
Rosetta_Example_2 : THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|FreeBASIC}}==
This program sticks to the task as described in the heading and doesn't allow for any of the (apparently) obsolete practices described in the Wikipedia article:
<syntaxhighlight lang="freebasic">' FB 1.05.0 Win64

Function checkNoSpaces(s As String) As Boolean
  For i As UInteger = 0 To Len(s) - 1
    If s[i] = 32 OrElse s[i] = 9 Then Return False  '' check for spaces or tabs
  Next
  Return True
End Function

Open "input.fasta" For Input As # 1

Dim As String ln, seq
Dim first As Boolean = True

While Not Eof(1)
  Line Input #1, ln
  If Left(ln, 1) = ">" Then
    If Not first Then Print
    Print Mid(ln, 2); ": ";
    If first Then first = False
  ElseIf first Then
    Print: Print "Error : File does not begin with '>'";
    Exit While
  Else
    If checkNoSpaces(ln) Then
      Print ln;
    Else
      Print : Print "Error : Sequence contains space(s)";
      Exit While
    End If
  End If
Wend

Close #1

Print : Print
Print "Press any key to quit"
Sleep</syntaxhighlight>
{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|Gambas}}==
<syntaxhighlight lang="gambas">Public Sub Main()
Dim sList As String = File.Load("../FASTA")
Dim sTemp, sOutput As String

For Each sTemp In Split(sList, gb.NewLine)
  If sTemp Begins ">" Then
    If sOutput Then Print sOutput
    sOutput = Right(sTemp, -1) & ": "
  Else
    sOutput &= sTemp
  Endif
Next

Print sOutput

End</syntaxhighlight>
Output:
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|Go}}==
<syntaxhighlight lang="go">package main

import (
Line 215 ⟶ 900:
                fmt.Println(err)
        }
}</syntaxhighlight>
{{out}}
<pre>
Line 221 ⟶ 906:
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|Haskell}}==
We pass the file path as an argument to the parseFasta function, which only does the file loading and result printing.

'''The first way'''

We parse FASTA by hand (generally not a recommended approach). We use the fact that groupBy walks the list from the head and groups the items by a predicate; here we first concatenate all the FASTA strings and then pair each of them with its name.

<syntaxhighlight lang="haskell">import Data.List ( groupBy )

parseFasta :: FilePath -> IO ()
parseFasta fileName = do
  file <- readFile fileName
  let pairedFasta = readFasta $ lines file
  mapM_ (\(name, code) -> putStrLn $ name ++ ": " ++ code) pairedFasta

readFasta :: [String] -> [(String, String)]
readFasta = pair . map concat . groupBy (\x y -> notName x && notName y)
 where
  notName :: String -> Bool
  notName = (/=) '>' . head

  pair :: [String] -> [(String, String)]
  pair []           = []
  pair (x : y : xs) = (drop 1 x, y) : pair xs</syntaxhighlight>

{{out}}
<pre>Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED</pre>

'''The second way'''

We parse FASTA using parser combinators. Normally you'd use something like Trifecta or Parsec, but here we use ReadP, because it is simple and also included in ghc by default. With other parsing libraries the code would be almost the same.

<syntaxhighlight lang="haskell">import Text.ParserCombinators.ReadP
import Control.Applicative ( (<|>) )
import Data.Char ( isAlpha, isAlphaNum )

parseFasta :: FilePath -> IO ()
parseFasta fileName = do
  file <- readFile fileName
  let pairs = fst . last . readP_to_S readFasta $ file
  mapM_ (\(name, code) -> putStrLn $ name ++ ": " ++ code) pairs

readFasta :: ReadP [(String, String)]
readFasta = many pair <* eof
 where
  pair    = (,) <$> name <*> code
  name    = char '>' *> many (satisfy isAlphaNum <|> char '_') <* newline
  code    = concat <$> many (many (satisfy isAlpha) <* newline)
  newline = char '\n'</syntaxhighlight>

{{out}}
<pre>Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED</pre>

=={{header|J}}==
Needs chunking to handle huge files.
<syntaxhighlight lang="j">require 'strings'  NB. not needed for J versions greater than 6.
parseFasta=: ((': ' ,~ LF&taketo) , (LF -.~ LF&takeafter));._1</syntaxhighlight>
'''Example Usage'''
<syntaxhighlight lang="j">   Fafile=: noun define
>Rosetta_Example_1
THERECANBENOSPACE
Line 237 ⟶ 978:
   parseFasta Fafile
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED</syntaxhighlight>

Nowadays, most machines have gigabytes of memory. However, if it's necessary to process FASTA content on a system with inadequate memory, we can use files to hold intermediate results. For example:

<syntaxhighlight lang="j">bs=: 2
chunkFasta=: {{
  r=. EMPTY
  bad=. a.-.a.{~;48 65 97(+i.)each 10 26 26
  dir=. x,'/'
  off=. 0
  siz=. fsize y
  block=. dest=. ''
  while. off < siz do.
    block=. block,fread y;off([, [ -~ siz<.+)bs
    off=. off+bs
    while. LF e. block do.
      line=. LF taketo block
      select. {.line
        case. ';' do.
        case. '>' do.
          start=. }.line-.CR
          r=.r,(head=. name,'.head');<name=. dir,start -. bad
          start fwrite head
          '' fwrite name
        case. do.
          (line-.bad) fappend name
      end.
      block=. LF takeafter block
    end.
  end.
  r
}}</syntaxhighlight>

Here, we're using a block size of 2 bytes, to illustrate correctness. If speed matters, we should use something significantly larger.

The left argument to <code>chunkFasta</code> names the directory used to hold content extracted from the FASTA file.
The right argument names that FASTA file. The result identifies the extracted headers and contents.

Thus, if '~/fasta.txt' contains the example file for this task and we want to store intermediate results in the '~temp' directory, we could use:

<syntaxhighlight lang="j">   fasta=: '~temp' chunkFasta '~/fasta.txt'</syntaxhighlight>

And, to complete the task:

<syntaxhighlight lang="j">   ;(,': ',,&LF)each/"1 fread each fasta
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED</syntaxhighlight>

=={{header|Java}}==
This implementation presumes the data file is well-formed.
<syntaxhighlight lang="java">
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
</syntaxhighlight>
<syntaxhighlight lang="java">
public static void main(String[] args) throws IOException {
    List<FASTA> fastas = readFile("fastas.txt");
    for (FASTA fasta : fastas)
        System.out.println(fasta);
}

static List<FASTA> readFile(String path) throws IOException {
    try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
        List<FASTA> list = new ArrayList<>();
        StringBuilder lines = null;
        String newline = System.lineSeparator();
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith(">")) {
                if (lines != null)
                    list.add(parseFASTA(lines.toString()));
                lines = new StringBuilder();
                lines.append(line).append(newline);
            } else {
                lines.append(line);
            }
        }
        list.add(parseFASTA(lines.toString()));
        return list;
    }
}

static FASTA parseFASTA(String string) {
    String description;
    char[] sequence;
    int indexOf = string.indexOf(System.lineSeparator());
    description = string.substring(1, indexOf);
    /* using 'stripLeading' will remove any additional line-separators */
    sequence = string.substring(indexOf + 1).stripLeading().toCharArray();
    return new FASTA(description, sequence);
}

/* using a 'char' array seems more logical */
record FASTA(String description, char[] sequence) {
    @Override
    public String toString() {
        return "%s: %s".formatted(description, new String(sequence));
    }
}
</syntaxhighlight>
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>
<br />
An alternate demonstration
{{trans|D}}
{{works with|Java|7}}
<syntaxhighlight lang="java">import java.io.*;
import java.util.Scanner;

Line 267 ⟶ 1,114:
        System.out.println();
    }
}</syntaxhighlight>

<pre>Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
Rosetta_Example_3: THISISFASTA</pre>

=={{header|JavaScript}}==
The code below uses Node.js to read the file.

<syntaxhighlight lang="javascript">
const fs = require("fs");
const readline = require("readline");

const args = process.argv.slice(2);
if (!args.length) {
    console.error("must supply file name");
    process.exit(1);
}

const fname = args[0];

const readInterface = readline.createInterface({
    input: fs.createReadStream(fname),
    console: false,
});

let sep = "";
readInterface.on("line", (line) => {
    if (line.startsWith(">")) {
        process.stdout.write(sep);
        sep = "\n";
        process.stdout.write(line.substring(1) + ": ");
    } else {
        process.stdout.write(line);
    }
});

readInterface.on("close", () => process.stdout.write("\n"));
</syntaxhighlight>

<pre>Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED</pre>

=={{header|jq}}==
Line 279 ⟶ 1,162:
in each cycle, only as many lines are read as are required to compose an output line. <br>
Notice that an additional ">" must be provided to "foreach" to ensure the final block of lines of the input file is properly assembled.
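For readers less familiar with jq, the sentinel idea described above can be sketched in Python. This is an illustrative sketch only, not a translation of the jq program: the generator name <code>fasta_records</code> is hypothetical. An extra <code>">"</code> is appended to the input stream so that the last record is flushed by the same branch that handles every other header line.

```python
from itertools import chain
from typing import Iterable, Iterator, List, Optional, Tuple


def fasta_records(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs; hypothetical helper for illustration."""
    name: Optional[str] = None
    parts: List[str] = []
    # The trailing ">" is a sentinel header: it closes the final record,
    # so no separate "flush at end of input" step is needed.
    for raw in chain(lines, [">"]):
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(parts)
            name, parts = line[1:], []
        elif name is not None:
            parts.append(line)


sample = [
    ">Rosetta_Example_1",
    "THERECANBENOSPACE",
    ">Rosetta_Example_2",
    "THERECANBESEVERAL",
    "LINESBUTTHEYALLMUST",
    "BECONCATENATED",
]
for name, sequence in fasta_records(sample):
    print(f"{name}: {sequence}")
```

Because the function is a generator, only the lines of the record currently being assembled are held in memory.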
<syntaxhighlight lang="jq">
def fasta:
  foreach (inputs, ">") as $line
Line 290 ⟶ 1,173:
  ;

fasta</syntaxhighlight>
{{out}}
<syntaxhighlight lang="sh">$ jq -n -R -r -f FASTA_format.jq < FASTA_format.fasta
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED</syntaxhighlight>

=={{header|Julia}}==
{{works with|Julia|0.6}}

<syntaxhighlight lang="julia">for line in eachline("data/fasta.txt")
    if startswith(line, '>')
        print(STDOUT, "\n$(line[2:end]): ")
    else
        print(STDOUT, "$line")
    end
end</syntaxhighlight>

=={{header|Kotlin}}==
{{trans|FreeBASIC}}
<syntaxhighlight lang="scala">// version 1.1.2

import java.util.Scanner
import java.io.File

fun checkNoSpaces(s: String) = ' ' !in s && '\t' !in s

fun main(args: Array<String>) {
    var first = true
    val sc = Scanner(File("input.fasta"))
    while (sc.hasNextLine()) {
        val line = sc.nextLine()
        if (line[0] == '>') {
            if (!first) println()
            print("${line.substring(1)}: ")
            if (first) first = false
        }
        else if (first) {
            println("Error : File does not begin with '>'")
            break
        }
        else if (checkNoSpaces(line))
            print(line)
        else {
            println("\nError : Sequence contains space(s)")
            break
        }
    }
    sc.close()
}</syntaxhighlight>

{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|Lua}}==
<syntaxhighlight lang="lua">local file = io.open("input.txt","r")
local data = file:read("a")
file:close()

local output = {}
local key = nil

-- iterate through lines
for line in data:gmatch("(.-)\r?\n") do
  if line:match("%s") then
    error("line contained space")
  elseif line:sub(1,1) == ">" then
    key = line:sub(2)
    -- if key already exists, append to the previous input
    output[key] = output[key] or ""
  elseif key ~= nil then
    output[key] = output[key] ..
line
  end
end

-- print result
for k,v in pairs(output) do
  print(k..": "..v)
end</syntaxhighlight>

{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|M2000 Interpreter}}==
Spaghetti code using Goto, but it works by partially reading an input stream, with no known size for each read (the data is assumed to be transmitted in packets). We create an object, FASTA_MACHINE, and run it. The object produces events, so we provide some functions as services. These functions are called like subs, but we have to use New if we want to shadow any variable with the same name (subs always include the New, as a Read New, so we don't need it there). If the module has no variables with the same names as the arguments of these functions, we can omit New. All these functions have the same scope as the module to which they belong.

We can use ";" for comments and ">" for a title. Each input packet may contain one character or many. The line feed is CRLF by default. Whitespace characters are spaces, nbsp, and tabs.
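The packet-driven approach described above (input arriving in pieces of unknown size, with line boundaries possibly falling inside a packet) can be sketched in Python as a small push-style parser. This is a simplified illustration under stated assumptions, not a port of the M2000 program: the class <code>FastaFeeder</code> and its methods <code>feed</code> and <code>close</code> are hypothetical names, and the formatted result is collected in a list only to keep the sketch short.

```python
from typing import List


class FastaFeeder:
    """Hypothetical push-style parser: feed() accepts text chunks of any size."""

    def __init__(self) -> None:
        self._buffer = ""             # text received but not yet ended by a newline
        self._in_record = False       # True once a ">" title line has been seen
        self.output: List[str] = []   # pieces of the formatted result

    def feed(self, chunk: str) -> None:
        """Add a packet and process every complete line now available."""
        self._buffer += chunk
        while "\n" in self._buffer:
            line, self._buffer = self._buffer.split("\n", 1)
            self._handle(line.rstrip("\r"))

    def close(self) -> str:
        """Process a final unterminated line, if any, and return the result."""
        if self._buffer:
            self._handle(self._buffer.rstrip("\r"))
            self._buffer = ""
        return "".join(self.output)

    def _handle(self, line: str) -> None:
        if line.startswith(";"):
            return                                  # comment line: dropped
        if line.startswith(">"):
            if self._in_record:
                self.output.append("\n")            # end the previous record
            # str.split() with no argument drops spaces, tabs and nbsp
            self.output.append("".join(line[1:].split()) + ": ")
            self._in_record = True
        elif self._in_record:
            self.output.append("".join(line.split()))


feeder = FastaFeeder()
# Packets deliberately cut in the middle of lines and of the CRLF pair.
for packet in [">Rosetta_Exa", "mple_1\r", "\nTHERECAN", "BENOSPACE\r\n;a comment\r\n",
               ">Rosetta_Example_2\r\nTHERECANBESEVERAL\r\nLINESBUT",
               "THEYALLMUST\r\nBECONCATENATED"]:
    feeder.feed(packet)
print(feeder.close())
```

Keeping the unfinished tail of the last packet in a buffer is what lets the parser accept one character or many per packet.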
<syntaxhighlight lang="m2000 interpreter"> Module CheckIt { Class FASTA_MACHINE { Events "GetBuffer", "header", "DataLine", "Quit" Public: Module Run { Const lineFeed$=chr$(13)+chr$(10) Const WhiteSpace$=" "+chr$(9)+chrcode$(160) Def long state=1, idstate=1 Def boolean Quit=False Def Buf$, waste$, Packet$ GetNextPacket: Call Event "Quit", &Quit If Quit then exit Call Event "GetBuffer", &Packet$ Buf$+=Packet$ If len(Buf$)=0 Then exit On State Goto GetStartIdentifier, GetIdentifier, GetStartData, GetData, GetStartIdentifier2 exit GetStartIdentifier: waste$=rightpart$(Buf$, ">") GetStartIdentifier2: If len(waste$)=0 Then waste$=rightpart$(Buf$, ";") : idstate=2 If len(waste$)=0 Then idstate=1 : Goto GetNextPacket ' we have to read more buf$=waste$ state=2 GetIdentifier: If Len(Buf$)=len(lineFeed$) then { if buf$<>lineFeed$ then Goto GetNextPacket waste$="" } Else { if instr(buf$, lineFeed$)=0 then Goto GetNextPacket waste$=rightpart$(Buf$, lineFeed$) } If idstate=2 Then { idstate=1 \\ it's a comment, drop it state=1 Goto GetNextPacket } Else Call Event "header", filter$(leftpart$(Buf$,lineFeed$), WhiteSpace$) Buf$=waste$ State=3 GetStartData: while left$(buf$, 2)=lineFeed$ {buf$=Mid$(buf$,3)} waste$=Leftpart$(Buf$, lineFeed$) If len(waste$)=0 Then Goto GetNextPacket ' we have to read more waste$=Filter$(waste$,WhiteSpace$) Call Event "DataLine", leftpart$(Buf$,lineFeed$) Buf$=Rightpart$(Buf$,lineFeed$) state=4 GetData: while left$(buf$, 2)=lineFeed$ {buf$=Mid$(buf$,3)} waste$=Leftpart$(Buf$, lineFeed$) If len(waste$)=0 Then Goto GetNextPacket ' we have to read more If Left$(waste$,1)=";" Then wast$="": state=5 : Goto GetStartIdentifier2 If Left$(waste$,1)=">" Then state=1 : Goto GetStartIdentifier waste$=Filter$(waste$,WhiteSpace$) Call Event "DataLine", waste$ Buf$=Rightpart$(Buf$,lineFeed$) Goto GetNextPacket } } Group WithEvents K=FASTA_MACHINE() Document Final$, Inp$ \\ In documents, "="" used for append data. 
Final$="append this" Const NewLine$=chr$(13)+chr$(10) Const Center=2 \\ Event's Functions Function K_GetBuffer (New &a$) { Input "IN:", a$ inp$=a$+NewLine$ while right$(a$, 1)="\" { Input "IN:", b$ inp$=b$+NewLine$ if b$="" then b$="n" a$+=b$ } a$= replace$("\N","\n", a$) a$= replace$("\n",NewLine$, a$) } Function K_header (New a$) { iF Doc.Len(Final$)=0 then { Final$=a$+": " } Else Final$=Newline$+a$+": " } Function K_DataLine (New a$) { Final$=a$ } Function K_Quit (New &q) { q=keypress(1) } Cls , 0 Report Center, "FASTA Format" Report "Simulate input channel in packets (\n for new line). Use empty input to exit after new line, or press left mouse button and Enter to quit. Use ; to write comments. Use > to open a title" Cls, row ' scroll from current row K.Run Cls Report Center, "Input File" Report Inp$ Report Center, "Output File" Report Final$ } checkit </syntaxhighlight> =={{header\|Mathematica}}/{{header\|Wolfram Language}}== Mathematica has built-in support for FASTA files and strings <syntaxhighlight lang="mathematica">ImportString[">Rosetta_Example_1 THERECANBENOSPACE >Rosetta_Example_2 THERECANBESEVERAL LINESBUTTHEYALLMUST BECONCATENATED ", "FASTA"]</syntaxhighlight> {{out}} <pre>{"THERECANBENOSPACE", "THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED"}</pre> =={{header\|Nim}}== <syntaxhighlight lang="nim"> import strutils let input = """>Rosetta_Example_1 THERECANBENOSPACE >Rosetta_Example_2 THERECANBESEVERAL LINESBUTTHEYALLMUST BECONCATENATED""".unindent proc fasta(input: string) = var row = "" for line in input.splitLines: if line.startsWith(">"): if row != "": echo row row = line[1..^1] & ": " else: row &= line.strip echo row fasta(input) </syntaxhighlight> {{out}} <pre> Rosetta_Example_1: THERECANBENOSPACE Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED </pre> =={{header\|Oberon}}== Works with A2 Oberon. 
<syntaxhighlight lang="Oberon">
MODULE Fasta;

IMPORT Files, Streams, Strings, Commands;

PROCEDURE PrintOn(filename: ARRAY OF CHAR; wr: Streams.Writer);
VAR
  rd: Files.Reader;
  f: Files.File;
  line: ARRAY 1024 OF CHAR;
  res: BOOLEAN;
BEGIN
  f := Files.Old(filename);
  ASSERT(f # NIL);
  NEW(rd,f,0);
  res := rd.GetString(line);
  WHILE rd.res # Streams.EOF DO
    IF line[0] = '>' THEN
      wr.Ln;
      wr.String(Strings.Substring2(1,line)^);
      wr.String(": ")
    ELSE
      wr.String(line)
    END;
    res := rd.GetString(line)
  END
END PrintOn;

PROCEDURE Do;
VAR
  ctx: Commands.Context;
  filename: ARRAY 256 OF CHAR;
  res: BOOLEAN
BEGIN
  ctx := Commands.GetContext();
  res := ctx.arg.GetString(filename);
  PrintOn(filename,ctx.out)
END Do;

END Fasta.
</syntaxhighlight>
{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|Objeck}}==
<syntaxhighlight lang="objeck">class Fasta {
  function : Main(args : String[]) ~ Nil {
    if(args->Size() = 1) {
      is_line := false;
      tokens := System.Utility.Parser->Tokenize(System.IO.File.FileReader->ReadFile(args[0]))<String>;
      each(i : tokens) {
        token := tokens->Get(i);
        if(token->Get(0) = '>') {
          is_line := true;
          if(i <> 0) {
            "\n"->Print();
          };
        }
        else if(is_line) {
          "{$token}: "->Print();
          is_line := false;
        }
        else {
          token->Print();
        };
      };
    };
  }
}
</syntaxhighlight>
{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|OCaml}}==
The program reads and processes the input one line at a time, and directly prints each chunk of data as it becomes available. The long strings are not concatenated in memory but just examined and processed as necessary: a line that is part of a sequence is printed as is, a name (what I call the label) is formatted, and newlines are emitted where needed.
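The streaming idea described above can also be sketched in Python, purely as an illustration of the technique and not as a translation of the OCaml program. The function name <code>print_fasta</code>, its parameters and the in-memory sample are assumptions made for this sketch; lines that appear before the first label are simply skipped, which is also an assumption.

```python
import io
import sys


def print_fasta(lines, out=sys.stdout):
    """Stream FASTA records to `out` one line at a time.

    Sequence lines are written as soon as they are read, so a whole
    sequence is never concatenated in memory.
    """
    seen_label = False
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            if seen_label:
                out.write("\n")          # finish the previous record
            out.write(line[1:] + ": ")   # the label, without the '>'
            seen_label = True
        elif seen_label:
            out.write(line)              # part of a sequence: print as is
        # assumption: data before the first label is ignored
    if seen_label:
        out.write("\n")


# Hypothetical in-memory input standing in for a real file or stdin.
sample = io.StringIO(
    ">Rosetta_Example_1\nTHERECANBENOSPACE\n"
    ">Rosetta_Example_2\nTHERECANBESEVERAL\nLINESBUTTHEYALLMUST\nBECONCATENATED\n"
)
print_fasta(sample)
```

Because the function accepts any iterable of lines, an open file object or <code>sys.stdin</code> could be passed instead of the sample.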
{{works with|OCaml|4.03+}}
<syntaxhighlight lang="ocaml">
(* This program reads from the standard input and writes to standard output.
 * Examples of use:
...
let print_fasta chan =
  let rec doloop currlabel line =
    if is_label line then begin
        if currlabel <> "" then print_newline ();
        let newlabel = get_label line in
        print_string (newlabel ^ ": ");
        doloop newlabel (read_in chan)
    end
    else begin
        print_string line;
        doloop currlabel (read_in chan)
    end
  in
  try
...
let () =
  print_fasta stdin
</syntaxhighlight>
{{out}}
 Rosetta_Example_1: THERECANBENOSPACE
 Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED

=={{header|Pascal}}==
<syntaxhighlight lang="pascal">
program FASTA_Format;
// FPC 3.0.2
var InF,
    OutF: Text;
    ch: char;
    First: Boolean=True;
    InDef: Boolean=False;

begin
  Assign(InF,'');
  Reset(InF);
  Assign(OutF,'');
  Rewrite(OutF);
  While Not Eof(InF) do
  begin
    Read(InF,ch);
    Case Ch of
      '>': begin
            if Not(First) then
              Write(OutF,#13#10)
            else
              First:=False;
            InDef:=true;
          end;
      #13: Begin
             if InDef then
             begin
               InDef:=false;
               Write(OutF,': ');
             end;
             Ch:=#0;
           end;
      #10: ch:=#0;
      else Write(OutF,Ch);
    end;
  end;
  Close(OutF);
  Close(InF);
end.
</syntaxhighlight>
FASTA_Format < test.fst
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|Perl}}==
<syntaxhighlight lang="perl">my $fasta_example = <<'END_FASTA_EXAMPLE';
>Rosetta_Example_1
THERECANBENOSPACE
>Rosetta_Example_2
THERECANBESEVERAL
LINESBUTTHEYALLMUST
BECONCATENATED
END_FASTA_EXAMPLE

my $num_newlines = 0;
while ( < $fasta_example > ) {
    if (/\A\>(.*)/) {
        print "\n" x $num_newlines, $1, ': ';
    }
    else {
        $num_newlines = 1;
        print;
    }
}</syntaxhighlight>
{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|Phix}}==
<syntaxhighlight lang="phix">
bool first = true
integer fn = open("fasta.txt","r")
if fn=-1 then ?9/0 end if
while true do
    object line = trim(gets(fn))
    if atom(line) then puts(1,"\n") exit end if
    if length(line) then
        if line[1]=='>' then
            if not first then puts(1,"\n") end if
            printf(1,"%s: ",{line[2..$]})
            first = false
        elsif first then
            printf(1,"Error : File does not begin with '>'\n")
            exit
        elsif not find_any(" \t",line) then
            puts(1,line)
        else
            printf(1,"\nError : Sequence contains space(s)\n")
            exit
        end if
    end if
end while
close(fn)
</syntaxhighlight>
{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|PicoLisp}}==
<syntaxhighlight lang="picolisp">(de fasta (F)
   (in F
      (while (from ">")
         (prin (line T) ": ")
         (until (or (= ">" (peek)) (eof))
            (prin (line T)) )
         (prinl) ) ) )
(fasta "fasta.dat")</syntaxhighlight>
{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|PL/M}}==
{{works with|8080 PL/M Compiler}} ... under CP/M (or an emulator)

Reads the data from the file named on the command line, e.g., if the program is stored in D:FASTA.COM and the data in D:FASTAIN.TXT, the following could be used: <code>D:FASTA D:FASTAIN.TXT</code>.<br>
Restarts CP/M when the program finishes.
<syntaxhighlight lang="plm">
100H: /* DISPLAY THE CONTENTS OF A FASTA FORMAT FILE */

   DECLARE FALSE LITERALLY '0', TRUE LITERALLY '0FFH';
   DECLARE NL$CHAR  LITERALLY '0AH';           /* NEWLINE: CHAR 10 */
   DECLARE CR$CHAR  LITERALLY '0DH';           /* CARRIAGE RETURN, CHAR 13 */
   DECLARE EOF$CHAR LITERALLY '26';            /* EOF: CTRL-Z */
   /* CP/M BDOS SYSTEM CALL, RETURNS A VALUE */
   BDOS: PROCEDURE( FN, ARG )BYTE; DECLARE FN BYTE, ARG ADDRESS; GOTO 5; END;
   /* CP/M BDOS SYSTEM CALL, NO RETURN VALUE */
   BDOS$P: PROCEDURE( FN, ARG ); DECLARE FN BYTE, ARG ADDRESS; GOTO 5; END;
   EXIT: PROCEDURE; CALL BDOS$P( 0, 0 ); END;  /* CP/M SYSTEM RESET */
   PR$CHAR: PROCEDURE( C ); DECLARE C BYTE; CALL BDOS$P( 2, C ); END;
   PR$STRING: PROCEDURE( S ); DECLARE S ADDRESS; CALL BDOS$P( 9, S ); END;
   PR$NL: PROCEDURE; CALL PR$STRING( .( 0DH, NL$CHAR, '$' ) ); END;
   FL$EXISTS: PROCEDURE( FCB )BYTE; /* RETURNS TRUE IF THE FILE NAMED IN THE */
      DECLARE FCB ADDRESS;          /* FCB EXISTS */
      RETURN ( BDOS( 17, FCB ) < 4 );
   END FL$EXISTS ;
   FL$OPEN: PROCEDURE( FCB )BYTE; /* OPEN THE FILE WITH THE SPECIFIED FCB */
      DECLARE FCB ADDRESS;
      RETURN ( BDOS( 15, FCB ) < 4 );
   END FL$OPEN;
   FL$READ: PROCEDURE( FCB )BYTE; /* READ THE NEXT RECORD FROM FCB */
      DECLARE FCB ADDRESS;
      RETURN ( BDOS( 20, FCB ) = 0 );
   END FL$READ;
   FL$CLOSE: PROCEDURE( FCB )BYTE; /* CLOSE THE FILE WITH THE SPECIFIED FCB */
      DECLARE FCB ADDRESS;
      RETURN ( BDOS( 16, FCB ) < 4 );
   END FL$CLOSE;

   /* I/O USES FILE CONTROL BLOCKS CONTAINING THE FILE-NAME, POSITION, ETC. */
   /* WHEN THE PROGRAM IS RUN, THE CCP WILL FIRST PARSE THE COMMAND LINE AND */
   /* PUT THE FIRST PARAMETER IN FCB1, THE SECOND PARAMETER IN FCB2 */
   /* BUT FCB2 OVERLAYS THE END OF FCB1 AND THE DMA BUFFER OVERLAYS THE END */
   /* OF FCB2 */

   DECLARE FCB$SIZE   LITERALLY '36';   /* SIZE OF A FCB */
   DECLARE FCB1       LITERALLY '5CH';  /* ADDRESS OF FIRST FCB */
   DECLARE FCB2       LITERALLY '6CH';  /* ADDRESS OF SECOND FCB */
   DECLARE DMA$BUFFER LITERALLY '80H';  /* DEFAULT DMA BUFFER ADDRESS */
   DECLARE DMA$SIZE   LITERALLY '128';  /* SIZE OF THE DMA BUFFER */

   DECLARE F$PTR ADDRESS, F$CHAR BASED F$PTR BYTE;

   /* CLEAR THE PARTS OF FCB1 OVERLAYED BY FCB2 */
   DO F$PTR = FCB1 + 12 TO FCB1 + ( FCB$SIZE - 1 );
      F$CHAR = 0;
   END;

   /* SHOW THE FASTA DATA, IF THE FILE EXISTS */
   IF NOT FL$EXISTS( FCB1 ) THEN DO; /* THE FILE DOES NOT EXIST */
      CALL PR$STRING( .'FILE NOT FOUND$' );CALL PR$NL;
      END;
   ELSE IF NOT FL$OPEN( FCB1 ) THEN DO; /* UNABLE TO OPEN THE FILE */
      CALL PR$STRING( .'UNABLE TO OPEN THE FILE$' );CALL PR$NL;
      END;
   ELSE DO; /* FILE EXISTS AND OPENED OK - ATTEMPT TO SHOW THE DATA */
      DECLARE ( BOL, GOT$RCD, IS$HEADING ) BYTE, DMA$END ADDRESS;
      DMA$END    = DMA$BUFFER + ( DMA$SIZE - 1 );
      GOT$RCD    = FL$READ( FCB1 ); /* GET THE FIRST RECORD */
      F$PTR      = DMA$BUFFER;
      BOL        = TRUE;
      IS$HEADING = FALSE;
      DO WHILE GOT$RCD;
         IF F$PTR > DMA$END THEN DO; /* END OF BUFFER */
            GOT$RCD = FL$READ( FCB1 ); /* GET THE NEXT RECORD */
            F$PTR   = DMA$BUFFER;
            END;
         ELSE IF F$CHAR = NL$CHAR THEN DO; /* END OF LINE */
            IF IS$HEADING THEN DO;
               CALL PR$STRING( .': $' );
               IS$HEADING = FALSE;
            END;
            BOL = TRUE;
            END;
         ELSE IF F$CHAR = CR$CHAR THEN DO; END; /* IGNORE CARRIAGE RETURN */
         ELSE IF F$CHAR = EOF$CHAR THEN GOT$RCD = FALSE; /* END OF FILE */
         ELSE DO; /* HAVE ANOTHER CHARACTER */
            IF NOT BOL THEN CALL PR$CHAR( F$CHAR ); /* NOT FIRST CHARACTER */
            ELSE DO; /* FIRST CHARACTER - CHECK FOR A HEADING LINE */
               BOL = FALSE;
               IF IS$HEADING := F$CHAR = '>' THEN CALL PR$NL;
               ELSE CALL PR$CHAR( F$CHAR );
            END;
         END;
         F$PTR = F$PTR + 1;
      END;
      /* CLOSE THE FILE */
      IF NOT FL$CLOSE( FCB1 ) THEN DO;
         CALL PR$STRING( .'UNABLE TO CLOSE THE FILE$' ); CALL PR$NL;
      END;
   END;

   CALL EXIT;

EOF
</syntaxhighlight>
{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|PowerShell}}==
When working with a real file, the content of the <code>$file</code> variable would be: <code>Get-Content -Path .\FASTA_file.txt -ReadCount 1000</code>. The best <code>-ReadCount</code> value for large files is not known in advance; it probably lies between 1,000 and 10,000, depending on the length of the file and the length of the records in it. Experimentation is the only way to find the optimum value.
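The batching performed by <code>-ReadCount</code> can be illustrated outside PowerShell. The following is a minimal Python sketch of the idea only, not part of the PowerShell solution; the function name <code>read_in_batches</code> and its <code>read_count</code> parameter are hypothetical names chosen for this example.

```python
from itertools import islice


def read_in_batches(lines, read_count=1000):
    """Yield lists of up to `read_count` lines from any iterable of lines.

    This mirrors the idea of handing lines downstream in batches
    instead of one at a time; the batch size is a tunable assumption.
    """
    iterator = iter(lines)
    while True:
        batch = list(islice(iterator, read_count))
        if not batch:
            return
        yield batch


# Tiny in-memory demonstration; a real run would pass an open file object.
demo = ["l1\n", "l2\n", "l3\n", "l4\n", "l5\n"]
for batch in read_in_batches(demo, read_count=2):
    print(len(batch), "line(s)")
```

Timing a run of this loop with several <code>read_count</code> values on a representative file is one way to carry out the experimentation mentioned above.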
{{works with|PowerShell|4.0+}}
<syntaxhighlight lang="powershell">
$file = @'
>Rosetta_Example_1
THERECANBENOSPACE
>Rosetta_Example_2
THERECANBESEVERAL
LINESBUTTHEYALLMUST
BECONCATENATED
'@

$lines = $file.Replace("`n","~").Split(">").ForEach({$_.TrimEnd("~").Split("`n",2,[StringSplitOptions]::RemoveEmptyEntries)})

$output = New-Object -TypeName PSObject

foreach ($line in $lines)
{
    $name, $value = $line.Split("~",2).ForEach({$_.Replace("~","")})

    $output | Add-Member -MemberType NoteProperty -Name $name -Value $value
}

$output | Format-List
</syntaxhighlight>
{{Out}}
<pre>
Rosetta_Example_1 : THERECANBENOSPACE
Rosetta_Example_2 : THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>
===Version 3.0 Or Less===
<syntaxhighlight lang="powershell">
$file = @'
>Rosetta_Example_1
THERECANBENOSPACE
>Rosetta_Example_2
THERECANBESEVERAL
LINESBUTTHEYALLMUST
BECONCATENATED
'@

$lines = $file.Replace("`n","~").Split(">") | ForEach-Object {$_.TrimEnd("~").Split("`n",2,[StringSplitOptions]::RemoveEmptyEntries)}

$output = New-Object -TypeName PSObject

foreach ($line in $lines)
{
    $name, $value = $line.Split("~",2) | ForEach-Object {$_.Replace("~","")}

    $output | Add-Member -MemberType NoteProperty -Name $name -Value $value
}

$output | Format-List
</syntaxhighlight>
{{Out}}
<pre>
Rosetta_Example_1 : THERECANBENOSPACE
Rosetta_Example_2 : THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|PureBasic}}==
<syntaxhighlight lang="purebasic">EnableExplicit
Define Hdl_File.i,
       Frm_File.i,
       c.c,
       header.b

Hdl_File=ReadFile(#PB_Any,"c:\code_pb\rosettacode\data\FASTA_TEST.txt")
If Not IsFile(Hdl_File) : End -1 : EndIf
Frm_File=ReadStringFormat(Hdl_File)

If OpenConsole("FASTA format")
  While Not Eof(Hdl_File)
    c=ReadCharacter(Hdl_File,Frm_File)
    Select c
      Case '>'
        header=#True
        PrintN("")
      Case #LF, #CR
        If header
          Print(": ")
          header=#False
        EndIf
      Default
        Print(Chr(c))
    EndSelect
  Wend
  CloseFile(Hdl_File)
  Input()
EndIf</syntaxhighlight>
{{out}}
<pre>Rosetta_Example_1 : THERECANBENOSPACE
Rosetta_Example_2 : THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED</pre>

=={{header|Python}}==
...
and I use a generator expression yielding key, value pairs as soon as they are read, keeping the minimum in memory.
<syntaxhighlight lang="python">import io

FASTA='''\
...
        yield key, val

print('\n'.join('%s: %s' % keyval for keyval in fasta_parse(infile)))</syntaxhighlight>

{{out}}
<pre>Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED</pre>

=={{header|R}}==
<syntaxhighlight lang="rsplus">
library("seqinr")

data <- c(">Rosetta_Example_1","THERECANBENOSPACE",">Rosetta_Example_2","THERECANBESEVERAL","LINESBUTTHEYALLMUST","BECONCATENATED")
fname <- "rosettacode.fasta"
f <- file(fname,"w+")
writeLines(data,f)
close(f)

fasta <- read.fasta(file = fname, as.string = TRUE, seqtype = "AA")
for (aline in fasta) {
  cat(attr(aline, 'Annot'), ":", aline, "\n")
}
</syntaxhighlight>
{{out}}
<pre>
>Rosetta_Example_1 : THERECANBENOSPACE
>Rosetta_Example_2 : THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|Racket}}==
<syntaxhighlight lang="racket">
#lang racket
(let loop ([m #t])
...
                 (current-output-port)))))
(newline)
</syntaxhighlight>

=={{header|Raku}}==
(formerly Perl 6)
<syntaxhighlight lang="raku" line>grammar FASTA {

    rule TOP    { <entry>+ }
    rule entry  { \> <title> <sequence> }
    token title { <.alnum>+ }
    token sequence { ( <.alnum>+ )+ % \n { make $0.join } }

}

FASTA.parse: q:to /§/;
>Rosetta_Example_1
THERECANBENOSPACE
>Rosetta_Example_2
THERECANBESEVERAL
LINESBUTTHEYALLMUST
BECONCATENATED
§

for $/<entry>[] {
    say ~.<title>, " : ", .<sequence>.made;
}</syntaxhighlight>
{{out}}
<pre>Rosetta_Example_1 : THERECANBENOSPACE
Rosetta_Example_2 : THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED</pre>

=={{header|REXX}}==
Neither
REXX version reads the entire file into memory at one time; lines are processed as they are read (one line at a time).

===version 1===
This REXX version correctly processes the examples shown.
<syntaxhighlight lang="rexx">/*REXX program reads a (bio-informational) FASTA file and displays the contents. */
Parse Arg ifid .                       /* iFID: the input file to be read.          */
If ifid=='' Then
  ifid='FASTA.IN'                      /* Not specified?  Then use the default      */
name=''                                /* the name of an output file (so far)       */
d=''                                   /* the value of the output file's contents   */
Do While lines(ifid)\==0               /* process the FASTA file contents           */
  x=strip(linein(ifid),'T')            /* read a line (a record) from the input     */
                                       /* and strip trailing blanks                 */
  If left(x,1)=='>' Then Do            /* a new file id                             */
    Call out                           /* show output name and data                 */
    name=substr(x,2)                   /* and get the new (or first) output name    */
    d=''                               /* start with empty contents                 */
    End
  Else                                 /* a line with data                          */
    d=d||x                             /* append it to output                       */
  End
Call out                               /* show output of last file used.            */
Exit

out:
If d\=='' Then                         /* if there are data                         */
  Say name':' d                        /* show output name and data                 */
Return</syntaxhighlight>
{{out|output|text=  when using the default input filename:}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

===version 2===
This REXX version handles (see the ''talk'' page):
::*   blank lines
::*   sequences that end in an asterisk   ['''*''']
::*   sequences that contain blanks, tabs, and other whitespace
::*   sequence names that are identified with a semicolon   [''';''']
<syntaxhighlight lang="rexx">/*REXX program reads a (bio-informational) FASTA file and displays the contents. */
Parse Arg iFID _ .                     /*iFID: the input file to be read.           */
If iFID=='' Then iFID='FASTA2.IN'      /*Not specified?  Then use the default.      */
name=''                                /*the name of an output file (so far).       */
data=''                                /*the value of the output file's stuff.      */
Do While lines(iFID)\==0               /*process the FASTA file contents.           */
  x=strip(linein(iFID),'T')            /*read a line (a record) from the file,      */
                                       /*--------- and strip trailing blanks.       */
  Select
    When x=='' Then                    /* If the line is all blank,                 */
      Nop                              /* ignore it.                                */
    When left(x,1)==';' Then Do
      If name=='' Then name=substr(x,2)
      Say x
      End
    When left(x,1)=='>' Then Do
      If data\=='' Then
        Say name':' data
      name=substr(x,2)
      data=''
      End
    Otherwise
      data=space(data||translate(x, ,'*'),0)
    End
  End
If data\=='' Then
  Say name':' data                     /* show output of last file used.            */
</syntaxhighlight>
'''input:'''   The   '''FASTA2.IN'''   file is shown below:
<pre>
;LCBO - Prolactin precursor - Bovine
...
IENY
</pre>
{{out|output|text=  when using the default input filename:}}
<pre>
;LCBO - Prolactin precursor - Bovine
; a sample sequence in FASTA format
...
MCHU - Calmodulin - Human, rabbit, bovine, rat, and chicken: ADQLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQDMINEVDADGNGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISAAELRHVMTNLGEKLTDEEVDEMIREADIDGDGQVNYEEFVQMMTAK
gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]: LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLVEWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLGLLILILLLLLLALLSPDMLGDPDNHMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGVLALFLSIVILGLMPFLHTSKHRSMMLRPLSQALFWTLTMDLLTLTWIGSQPVEYPYTIIGQMASILYFSIILAFLPIAGXIENY
</pre>

=={{header|Ring}}==
<syntaxhighlight lang="ring">
# Project : FASTA format

a = ">Rosetta_Example_1
THERECANBENOSPACE
>Rosetta_Example_2
THERECANBESEVERAL
LINESBUTTHEYALLMUST
BECONCATENATED"

i = 1
while i <= len(a)
      if substr(a,i,17) = ">Rosetta_Example_"
         see nl
         see substr(a,i,18) + ": " + nl
         i = i + 17
      else
         if ascii(substr(a,i,1)) > 20
            see a[i]
         ok
      ok
      i = i + 1
end
</syntaxhighlight>
{{out}}
<pre>
>Rosetta_Example_1:
THERECANBENOSPACE
>Rosetta_Example_2:
THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|Ruby}}==
<syntaxhighlight lang="ruby">def fasta_format(strings)
  out, text = [], ""
  strings.split("\n").each do |line|
...
EOS
puts fasta_format(data)</syntaxhighlight>

{{out}}
...

=={{header|Run BASIC}}==
<syntaxhighlight lang="runbasic">a$ = ">Rosetta_Example_1
THERECANBENOSPACE
>Rosetta_Example_2
...
end if
i = i + 1
wend</syntaxhighlight>
{{out}}
<pre>>Rosetta_Example_1: THERECANBENOSPACE
>Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED</pre>

=={{header|Rust}}==
This example is implemented using an [https://doc.rust-lang.org/book/iterators.html iterator] to reduce memory requirements and encourage code reuse.

<syntaxhighlight lang="rust">
use std::env;
use std::io::{BufReader, Lines};
use std::io::prelude::*;
use std::fs::File;

fn main() {
    let args: Vec<String> = env::args().collect();
    let f = File::open(&args[1]).unwrap();
    for line in FastaIter::new(f) {
        println!("{}", line);
    }
}

struct FastaIter<T> {
    buffer_lines: Lines<BufReader<T>>,
    current_name: Option<String>,
    current_sequence: String
}

impl<T: Read> FastaIter<T> {
    fn new(file: T) -> FastaIter<T> {
        FastaIter { buffer_lines: BufReader::new(file).lines(),
                    current_name: None,
                    current_sequence: String::new() }
    }
}

impl<T: Read> Iterator for FastaIter<T> {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        while let Some(l) = self.buffer_lines.next() {
            let line = l.unwrap();
            if line.starts_with(">") {
                if self.current_name.is_some() {
                    let mut res = String::new();
                    res.push_str(self.current_name.as_ref().unwrap());
                    res.push_str(": ");
                    res.push_str(&self.current_sequence);
                    self.current_name = Some(String::from(&line[1..]));
                    self.current_sequence.clear();
                    return Some(res);
                } else {
                    self.current_name = Some(String::from(&line[1..]));
                    self.current_sequence.clear();
                }
                continue;
            }
self.current_sequence.push_str(line.trim()); } if self.current_name.is_some() { let mut res = String::new(); res.push_str(self.current_name.as_ref().unwrap()); res.push_str(": "); res.push_str(&self.current_sequence); self.current_name = None; self.current_sequence.clear(); self.current_sequence.shrink_to_fit(); return Some(res); } None } } </syntaxhighlight> {{out}} <pre>Rosetta_Example_1: THERECANBENOSPACE Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED</pre> =={{header\|Scala}}== <syntaxhighlight lang="scala">import java.io.File import java.util.Scanner object ReadFastaFile extends App { val sc = new Scanner(new File("test.fasta")) var first = true while (sc.hasNextLine) { val line = sc.nextLine.trim if (line.charAt(0) == '>') { if (first) first = false else println() printf("%s: ", line.substring(1)) } else print(line) } println("~~~+~~~") }</syntaxhighlight> =={{header\|Scheme}}== <syntaxhighlight lang="scheme">(import (scheme base) (scheme file) (scheme write)) (with-input-from-file ; reads text from named file, one line at a time "fasta.txt" (lambda () (do ((first-line? #t #f) (line (read-line) (read-line))) ((eof-object? line) (newline)) (cond ((char=? #\> (string-ref line 0)) ; found a name (unless first-line? ; no newline on first name (newline)) (display (string-copy line 1)) (display ": ")) (else ; display the string directly (display line))))))</syntaxhighlight> {{out}} <pre>Rosetta_Example_1: THERECANBENOSPACE Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED</pre> =={{header\|Seed7}}== <syntaxhighlight lang="seed7">$ include "seed7_05.s7i"; const proc: main is func local var file: fastaFile is STD_NULL; var string: line is ""; var boolean: first is TRUE; begin fastaFile := open("fasta_format.in", "r"); if fastaFile <> STD_NULL then while hasNext(fastaFile) do line := getln(fastaFile); if startsWith(line, ">") then if first then first := FALSE; else writeln; end if; write(line[2 ..] 
<& ": ");
        else
          write(line);
        end if;
      end while;
      close(fastaFile);
    end if;
    writeln;
  end func;</syntaxhighlight>
{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|Sidef}}==
{{trans|Ruby}}
<syntaxhighlight lang="ruby">func fasta_format(strings) {
    var out = []
    var text = ''
    for line in (strings.lines) {
        if (line.begins_with('>')) {
            text.len && (out << text)
            text = line.substr(1)+': '
        }
        else {
            text += line
        }
    }
    text.len && (out << text)
    return out
}

fasta_format(DATA.slurp).each { .say }

__DATA__
>Rosetta_Example_1
THERECANBENOSPACE
>Rosetta_Example_2
THERECANBESEVERAL
LINESBUTTHEYALLMUST
BECONCATENATED</syntaxhighlight>
{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|Smalltalk}}==
Works with Pharo Smalltalk
<syntaxhighlight lang="smalltalk">
FileLocator home / aFilename readStreamDo: [ :stream |
    [ stream atEnd ] whileFalse: [
        | line |
        ((line := stream nextLine) beginsWith: '>')
            ifTrue: [
                Transcript
                    cr;
                    show: (line copyFrom: 2 to: line size);
                    show: ': ' ]
            ifFalse: [ Transcript show: line ] ] ]
</syntaxhighlight>
{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|Tcl}}==
<syntaxhighlight lang="tcl">proc fastaReader {filename} {
    set f [open $filename]
    set sep ""
...
}

fastaReader ./rosettacode.fas</syntaxhighlight>
{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|TMG}}==
Unix TMG: <!-- C port of TMG processes 1.04 GB FASTA file in 38 seconds on a generic laptop -->
<syntaxhighlight lang="unixtmg">prog:   ignore(spaces)
loop:   parse(line)\loop parse(( = {} ));
line:   ( name | = {} | seqns );
name:   <>> ignore(none) smark string(nonl) scopy *
        ( [f>0?]
= {} | = {} ) [f=0]
        = { 1 2 <: > };
seqns:  smark string(nonl) scopy
        [f=0];

none:   <<>>;
nonl:   !<<
>>;
spaces: << >>;

f:  1;</syntaxhighlight>

=={{header|uBasic/4tH}}==
<syntaxhighlight lang="text">If Cmd (0) < 2 Then Print "Usage: fasta <fasta file>" : End
If Set(a, Open (Cmd(2), "r")) < 0 Then Print "Cannot open \q";Cmd(2);"\q" : End

Do While Read (a)                      ' while there are lines to process
  t = Tok (0)                          ' get a line

  If Peek(t, 0) = Ord(">") Then        ' if it's a marker
    Print Show (Chop(t, 1)); ": "; Show (FUNC(_Payload(a)))
    Continue                           ' get the payload and print it
  EndIf

  Print "Out of sequence" : Break      ' this should never happen
Loop

Close a                                ' close the file
End                                    ' and end the program

_Payload                               ' get the payload
  Param (1)
  Local (4)

  b@ := ""                             ' start with an empty string

  Do
    c@ = Mark(a@)                      ' mark its position
  While Read (a@)                      ' now read a line
    d@ = Tok (0)                       ' get the line
    If Peek (d@, 0) = Ord(">") Then e@ = Head(a@, c@) : Break
    b@ = Join (b@, d@)                 ' marker? reset position and exit
  Loop                                 ' if not add the line to current string
Return (b@)                            ' return the string</syntaxhighlight>
{{out}}
<pre>Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED

0 OK, 0:431 </pre>

=={{header|V (Vlang)}}==
<syntaxhighlight lang="Vlang">
const data = (
">Rosetta_Example_1
THERECANBENOSPACE
>Rosetta_Example_2
THERECANBESEVERAL
LINESBUTTHEYALLMUST
BECONCATENATED"
)

fn main() {
    mut i := 0
    for i <= data.len {
        if data.substr_ni(i, i + 17) == ">Rosetta_Example_" {
            print("\n" + data.substr_ni(i, i + 18) + ": ")
            i = i + 17
        }
        else {
            if data.substr_ni(i, i + 1) > "\x20" {print(data[i].ascii_str())}
        }
        i++
    }
}
</syntaxhighlight>
{{out}}
<pre>
>Rosetta_Example_1: THERECANBENOSPACE
>Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|Wren}}==
{{trans|Kotlin}}
More or less.
<syntaxhighlight lang="wren">import "io" for File

var checkNoSpaces = Fn.new { |s| !s.contains(" ") && !s.contains("\t") }

var first = true

var process = Fn.new { |line|
    if (line[0] == ">") {
        if (!first) System.print()
        System.write("%(line[1..-1]): ")
        if (first) first = false
    } else if (first) {
        Fiber.abort("File does not begin with '>'.")
    } else if (checkNoSpaces.call(line)) {
        System.write(line)
    } else {
        Fiber.abort("Sequence contains space(s).")
    }
}

var fileName = "input.fasta"
File.open(fileName) { |file|
    var offset = 0
    var line = ""
    while(true) {
        var b = file.readBytes(1, offset)
        offset = offset + 1
        if (b == "\n") {
            process.call(line)
            line = "" // reset line variable
        } else if (b == "\r") { // Windows
            // wait for following "\n"
        } else if (b == "") { // end of stream
            System.print()
            return
        } else {
            line = line + b
        }
    }
}</syntaxhighlight>
{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|XPL0}}==
<syntaxhighlight lang="xpl0">proc Echo;      \Echo line of characters from file to screen
int Ch;
def LF=$0A, EOF=$1A;
[loop [Ch:= ChIn(3);
      case Ch of
        EOF: exit;
        LF:  quit
      other ChOut(0, Ch);
      ];
];

int Ch;
[FSet(FOpen("fasta.txt", 0), ^i);
loop [Ch:= ChIn(3);
     if Ch = ^> then
        [CrLf(0);
        Echo;
        Text(0, ": ");
        ]
     else ChOut(0, Ch);
     Echo;
     ];
]</syntaxhighlight>
{{out}}
<pre>
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED
</pre>

=={{header|zkl}}==
<syntaxhighlight lang="zkl">fcn fasta(data){ // a lazy cruise through a FASTA file
   fcn(w){      // one string at a time, -->False garbage at front of file
      line:=w.next().strip();
...
      })
   }.fp(data.walker()) : Utils.Helpers.wap(_);
}</syntaxhighlight>
This assumes that white space at the front or end of a string is extraneous (except on ">" lines). It is lazy and works for objects that support iterating over lines (i.e., most).
*The fasta function returns an iterator that wraps a function taking an iterator. Uh, yeah. An initial iterator (Walker) is used to get lines, hold state, and push a line back when the start of the next string has been read. The function sucks up one string (using the iterator). The wrapping iterator (wap) traps the exception when the function waltzes off the end of the data and provides the API for foreach (etc).
FASTA file:
<syntaxhighlight lang="zkl">foreach l in (fasta(File("fasta.txt"))) { println(l) }</syntaxhighlight>
FASTA data blob:
<syntaxhighlight lang="zkl">data:=Data(0,String,
   ">Rosetta_Example_1\nTHERECANBENOSPACE\n"
   ">Rosetta_Example_2\nTHERECANBESEVERAL\nLINESBUTTHEYALLMUST\n"
   "BECONCATENATED");
foreach l in (fasta(data)) { println(l) }</syntaxhighlight>
{{out}}
<pre>