Read entire file

Revision as of 17:21, 5 July 2010

Load the entire contents of some text file as a single string variable.

If applicable, discuss: encoding selection, the possibility of memory-mapping.

Of course, one should avoid reading an entire file at once if the file is large and the task can be accomplished incrementally instead (in which case check File IO); this is for those cases where having the entire file is actually what is wanted.

ALGOL 68

In official ALGOL 68 a file is composed of pages, lines and characters. ALGOL 68 Genie and ELLA ALGOL 68RS do not support this concept: they adopt the Unix concept of files being "flat", and hence containing only characters.

In official/standard ALGOL 68 only: <lang algol68>MODE BOOK = FLEX[0]FLEX[0]FLEX[0]CHAR; ¢ pages of lines of characters ¢
BOOK book;

FILE book file;
INT errno = open(book file, "book.txt", stand in channel);

get(book file, book)</lang>

Once a "book" has been read into a book array it can still be associated with a virtual file and again be accessed with standard file routines (such as readf, printf, putf, getf, new line, etc.). This means data can be directly manipulated from an array cached in "core" using transput (stdio) routines.

In official/standard ALGOL 68 only: <lang algol68>FILE cached book file; associate(cached book file, book)</lang>

E

<lang e><file:foo.txt>.getText()</lang>

The file is assumed to be in the default encoding.

Forth

Works with: GNU Forth

<lang forth>s" foo.txt" slurp-file ( str len )</lang>

J

To memory map the file:

<lang j> require'jmf'

  JCHAR map_jmf_ 'var';'foo.txt'</lang>

Caution: updating the value of the memory-mapped variable will update the file, and this characteristic remains when the variable's value is passed, unmodified, to a verb which modifies its own local variables.

Java

Before Java 7 there was no single method to do this in Java (probably because reading an entire file at once could fill up memory quickly); later versions provide java.nio.file.Files.readAllBytes (Java 7) and Files.readString (Java 11). The version here simply appends the contents as they are read line by line, as in File IO. <lang java>import java.io.BufferedReader; import java.io.FileReader; import java.io.IOException;

public class ReadFile {

   public static void main(String[] args) throws IOException{
       String fileContents = readEntireFile("./foo.txt");
   }

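   private static String readEntireFile(String filename) throws IOException {
       StringBuilder contents = new StringBuilder();
       // try-with-resources (Java 7+) closes the reader even if reading fails
       try (BufferedReader in = new BufferedReader(new FileReader(filename))) {
           String line;
           // readLine() returns null at end of file; ready() only reports
           // whether a read would block, so it is not an end-of-file test
           while ((line = in.readLine()) != null) {
               contents.append(line).append('\n');
           }
       }
       return contents.toString();
   }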

}</lang>

Perl

<lang perl>open my $fh, '<', $filename or die "Cannot open $filename: $!";
my $text = do { local $/; <$fh> };</lang>
or
<lang perl>use File::Slurp;
my $text = read_file($filename);</lang>

PicoLisp

Using 'till' is the shortest way: <lang PicoLisp>(in "file" (till NIL T))</lang> To read the file into a list of characters: <lang PicoLisp>(in "file" (till NIL))</lang> or, more explicit: <lang PicoLisp>(in "file" (make (while (char) (link @))))</lang> Encoding is always assumed to be UTF-8.

PowerShell

<lang powershell>Get-Content foo.txt</lang> This will only detect Unicode correctly with a BOM in place (even for UTF-8). With explicit selection of encoding: <lang powershell>Get-Content foo.txt -Encoding UTF8</lang> However, both return an array of strings, which is fine for pipeline use. If a single string is desired, the array needs to be joined: <lang powershell>(Get-Content foo.txt) -join "`n"</lang>

PHP

<lang php>file_get_contents($filename)</lang>

Python

<lang python>open(filename).read()</lang>

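In Python 2 this returns a byte string and does not assume any particular encoding. In Python 3, opening the file in text mode decodes it with a platform-dependent default encoding and returns a Unicode string.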
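The task also asks about encoding selection and memory-mapping. The following is a minimal sketch for Python 3, not a definitive implementation: the helper names read_text, read_bytes and read_mapped are hypothetical, and UTF-8 is assumed as the default encoding purely for illustration.

```python
import mmap
import os


def read_text(path, encoding="utf-8"):
    """Return the whole file as one str, decoded with an explicit encoding.

    The function name and the UTF-8 default are assumptions of this sketch.
    """
    with open(path, encoding=encoding) as f:
        return f.read()


def read_bytes(path):
    """Return the whole file as uninterpreted bytes."""
    with open(path, "rb") as f:
        return f.read()


def read_mapped(path, encoding="utf-8"):
    """Memory-map the file read-only, then decode the mapped bytes."""
    with open(path, "rb") as f:
        # mmap cannot map an empty file, so handle that case separately.
        if os.fstat(f.fileno()).st_size == 0:
            return ""
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Slicing copies the mapped region into a bytes object.
            return m[:].decode(encoding)
```

Note that read_text uses text mode, so line endings are translated to "\n" on reading, while read_bytes and read_mapped see the bytes exactly as stored. Copying the whole mapping with m[:] gives up most of the benefit of memory-mapping; it is done here only so that the result is a single string, as the task requires.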

Ruby

<lang ruby>IO.read(filename)</lang>

Tcl

This reads the data in as text, applying the default encoding translations. <lang tcl>set f [open $filename]
set data [read $f]
close $f</lang> To read the data in as uninterpreted bytes, either use fconfigure to put the handle into binary mode before reading, or (from Tcl 8.5 onwards) do this: <lang tcl>set f [open $filename "rb"]
set data [read $f]
close $f</lang>