UTF-8: Difference between revisions

87 bytes added ,  14 years ago
m (Added to the encyclopedia.)
m (Can scan UTF-8 sequence backwards)
 
(2 intermediate revisions by one other user not shown)
Line 1:
{{encyclopedic}}[[Category:Encyclopedia]]'''Unicode Transformation Format, 8-bit representation''', or UTF-8, is a variable-length encoding of [[Unicode]] code points as sequences of octets (eight-bit bytes). It was originally developed for [[Bell Labs]]' [[Plan 9]] operating system by Ken Thompson (co-creator of [[Unix]]) and Rob Pike in 1992. It is widely used on Unix-like systems and for XML documents.
 
Some advantages of UTF-8:
Line 5:
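* subsumes 7-bit ASCII: every ASCII character is encoded as the same single byte
* one can detect the start of a character, because continuation bytes always have the bit pattern 10xxxxxx and lead bytes never do
* one can scan characters in both directions, forward and backward
* can encode code points up to 31 bits long in the original design, using sequences of up to six bytes (RFC 3629 later restricted UTF-8 to code points up to U+10FFFF, at most four bytes)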
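
The start-detection and backward-scanning properties listed above can be illustrated with a minimal Python sketch. The helper names (is_continuation, char_start, chars_backward) are hypothetical and chosen only for this illustration, and the sketch assumes the input is well-formed UTF-8.

```python
def is_continuation(byte: int) -> bool:
    """Return True if the byte has the bit pattern 10xxxxxx."""
    return (byte & 0xC0) == 0x80


def char_start(data: bytes, i: int) -> int:
    """Step backward from index i to the first byte of the character containing it."""
    while i > 0 and is_continuation(data[i]):
        i -= 1
    return i


def chars_backward(data: bytes):
    """Yield the characters of well-formed UTF-8 data from last to first."""
    end = len(data)
    while end > 0:
        start = char_start(data, end - 1)
        yield data[start:end].decode("utf-8")
        end = start


if __name__ == "__main__":
    sample = "a\u00e9\u20ac".encode("utf-8")  # 1-, 2- and 3-byte characters
    print(list(chars_backward(sample)))
```

Because no lead byte ever matches 10xxxxxx, the scan needs no state beyond the current index, so it can begin at an arbitrary byte offset. Input that is not valid UTF-8 would need extra checks that this sketch leaves out.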