Talk:Read entire file

From Rosetta Code
Revision as of 22:43, 7 March 2013 by rosettacode>Dkf (→‎Encoding selection: bytes vs chars)

Encoding selection

The task description mentions "encoding selection". What has encoding got to do with reading an entire file? If you read a file into memory, you just read the file. Encoding is something to consider when you are manipulating the data. I assume "encoding" refers to text encoding. However, the file may be a picture file, binary data, or whatever. --PauliKL 16:09, 6 March 2013 (UTC)

The primary objective is to read a file into a string, not necessarily into memory. Some languages are encoding-aware and behave accordingly. Ruby, for instance:
<lang Ruby>['ASCII', 'UTF-8'].map { |e| File.open('foo', encoding: e).read.size }
# => [3, 1]</lang>
--Isopsephile 16:43, 6 March 2013 (UTC)
It depends on whether you're reading the file as a sequence of bytes or as a sequence of characters. The encoding is only required when converting from bytes to characters (or vice versa, of course). Failure to properly distinguish between the two concepts has been the cause of a huge amount of pain over the past decade or two, pain which we're only gradually emerging from as an industry. Thank goodness for Unicode and UTF-8. –Donal Fellows 22:43, 7 March 2013 (UTC)
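The bytes-versus-characters distinction described above can be sketched in Ruby as follows. This is only an illustration, not part of the task: the helper name `byte_and_char_counts` is hypothetical, and the sample file is assumed to hold a single euro sign, which takes three bytes in UTF-8.

```ruby
require 'tempfile'

# Hypothetical helper (not from the task description): returns
# [byte_count, char_count] for the file at +path+, where the characters
# are counted after interpreting the bytes in the given encoding.
def byte_and_char_counts(path, encoding)
  raw  = File.binread(path)                   # raw bytes, no decoding applied
  text = File.read(path, encoding: encoding)  # same bytes, tagged with an encoding
  [raw.bytesize, text.length]
end

Tempfile.create('foo') do |f|
  f.binmode
  f.write("\u20AC")   # assumed sample content: one euro sign, three bytes in UTF-8
  f.flush
  p byte_and_char_counts(f.path, 'UTF-8')     # => [3, 1]
  p byte_and_char_counts(f.path, 'US-ASCII')  # => [3, 3]
end
```

The byte count is the same in both calls; only the character count changes with the encoding, which is the point being made about where an encoding is actually needed.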