Talk:Read entire file: Difference between revisions

(→‎Encoding selection: bytes vs chars)
:<lang Ruby>['ASCII', 'UTF-8'].map { |e| File.open('foo', encoding: e).read.size }
# => [3, 1]</lang>[[User:Isopsephile|Isopsephile]] 16:43, 6 March 2013 (UTC)

: It depends on whether you're reading the file as a sequence of bytes or as a sequence of characters. The encoding is only required when converting from bytes to characters (or ''vice versa'', of course). Failure to properly distinguish between the two concepts has been the cause of a huge amount of pain over the past decade or two, pain which we're only gradually emerging from as an industry. Thank goodness for Unicode and UTF-8. –[[User:Dkf|Donal Fellows]] 22:43, 7 March 2013 (UTC)
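
: A minimal sketch of the byte/character distinction described above. It assumes the file holds the single character "€" (three bytes in UTF-8); the discussion does not say what <code>foo</code> actually contains, and the helper name <code>byte_and_char_counts</code> is made up for illustration.

```ruby
require 'tempfile'

# Hypothetical helper: writes `text` to a temporary file as UTF-8, then reads
# it back twice, once as raw bytes and once decoded as UTF-8 characters.
def byte_and_char_counts(text)
  Tempfile.create('foo') do |f|
    f.binmode                        # write the bytes exactly as given
    f.write(text.encode('UTF-8'))
    f.flush

    raw     = File.binread(f.path)                  # sequence of bytes, no decoding
    decoded = File.read(f.path, encoding: 'UTF-8')  # sequence of characters

    [raw.bytesize, decoded.size]
  end
end

# Assumed sample content: U+20AC EURO SIGN
puts byte_and_char_counts("\u20AC").inspect  # => [3, 1]
```

: Reading with <code>File.binread</code> involves no encoding at all; the encoding only matters for the second read, where bytes are interpreted as characters.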