Talk:Read entire file: Difference between revisions

From Rosetta Code

Revision as of 20:00, 11 November 2016

Encoding selection

The task description mentions "encoding selection". What has encoding got to do with reading an entire file? If you read a file into memory, you just read the file. Encoding is something to consider when you are manipulating the data. I assume "encoding" refers to text encoding. However, the file may be a picture file, binary data, or whatever. --PauliKL 16:09, 6 March 2013 (UTC)

The primary objective is to read a file into a string, not necessarily into memory. Some languages are encoding-aware and behave accordingly. Ruby, for instance:
<lang Ruby>['ASCII', 'UTF-8'].map { |e| File.open('foo', encoding: e).read.size }
# => [3, 1]</lang>
Isopsephile 16:43, 6 March 2013 (UTC)
It depends on whether you're reading the file as a sequence of bytes or as a sequence of characters. The encoding is only required when converting from bytes to characters (or vice versa, of course). Failure to properly distinguish between the two concepts has been the cause of a huge amount of pain over the past decade or two, pain which we're only gradually emerging from as an industry. Thank goodness for Unicode and UTF-8. –Donal Fellows 22:43, 7 March 2013 (UTC)
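To make the bytes-versus-characters distinction concrete, here is a minimal Ruby sketch. It is only an illustration under assumptions: the helper name `byte_and_char_counts` and the sample file content (a single euro sign, U+20AC, which UTF-8 encodes as three bytes) are made up for this example and are not part of the task.

```ruby
require 'tempfile'

# Hypothetical helper: read the same file once as raw bytes and once as
# UTF-8 text, then report how many bytes and how many characters were seen.
def byte_and_char_counts(path)
  bytes = File.binread(path)                  # raw bytes, no decoding applied
  chars = File.read(path, encoding: 'UTF-8')  # same bytes, interpreted as UTF-8
  {
    bytes: bytes.bytesize,
    chars: chars.size,
    same_bytes: bytes.bytes == chars.bytes    # the data on disk is identical
  }
end

# Assumed sample data: one character that needs three bytes in UTF-8.
Tempfile.create('encoding_demo') do |f|
  f.binmode
  f.write("\u20AC")
  f.flush
  p byte_and_char_counts(f.path)
end
```

The byte count and the character count differ (3 versus 1 for this sample) even though the underlying data is identical, so the encoding only matters at the point where bytes are turned into characters.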

Fortran

There was some stuff claiming that it's impossible to read the file into memory, and that the only way is some convoluted loop with goto (!). Sorry, that's wrong as of the current Fortran standard. I replaced this garbage with an example that allocates a character string of exactly the right size and reads the file using stream access.

I also showed another example using Intel Fortran, which uses the Windows API to create a memory map of the file.

Arbautjc (talk) 20:00, 11 November 2016 (UTC)
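For readers who do not use Fortran, the first approach described above (ask for the file size, allocate a buffer of exactly that size, fill it with one read) can be sketched in Ruby, the language already used on this page. This is an analogue for illustration only, not the Fortran code added to the task page; the helper name `slurp_exact` is hypothetical.

```ruby
require 'tempfile'

# Hypothetical helper: read a whole file with a single sized read,
# loosely mirroring "allocate exactly the right size, then read".
def slurp_exact(path)
  File.open(path, 'rb') do |f|
    size = f.size                        # file length in bytes
    buffer = String.new(capacity: size)  # reserve the space up front
    f.read(size, buffer)                 # one read fills the buffer
    buffer
  end
end

# Assumed sample data, written to a temporary file for the demonstration.
Tempfile.create('slurp_demo') do |f|
  f.binmode
  f.write("line one\nline two\n")
  f.flush
  data = slurp_exact(f.path)
  puts data.bytesize == File.size(f.path)
end
```

Opening the file in binary mode keeps the read byte-oriented, so no encoding conversion or line-by-line loop is involved.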