String: Difference between revisions
Puppydrum64 (talk | contribs) |
In computer programming, a '''[[w:String (computer science)|string]]''' is a finite sequence of characters or symbols.
==How
Strings are stored using a character encoding, such as [[ASCII]] or [[Unicode]]. An encoding is a scheme that maps each character in a set to a specific numeric value. Standardized encodings exist to support portability and interoperability between systems.
For example, this is the encoding of <code>"Hello World"</code> in ASCII:
<pre>0x48,0x65,0x6c,0x6c,0x6f,0x20,0x57,0x6f,0x72,0x6c,0x64</pre>
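The byte values above can be reproduced with a short C routine. This is only a minimal sketch: the function name <code>hex_codes</code> is made up for this illustration, and it assumes the compiler uses an ASCII-compatible character set.

<syntaxhighlight lang="c">
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the numeric value of each character in s
   to out as comma-separated hexadecimal, e.g. "0x48,0x65".
   Returns the number of characters written (not counting the terminator),
   or 0 if s is empty or out is too small. */
size_t hex_codes(const char *s, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return 0;
    out[0] = '\0';

    for (size_t i = 0; s[i] != '\0'; i++) {
        /* Put a comma before every value except the first. */
        int n = snprintf(out + used, out_size - used, "%s0x%02x",
                         i == 0 ? "" : ",",
                         (unsigned)(unsigned char)s[i]);
        if (n < 0 || (size_t)n >= out_size - used) {
            out[0] = '\0';   /* not enough room: report failure */
            return 0;
        }
        used += (size_t)n;
    }
    return used;
}
</syntaxhighlight>

Calling <code>hex_codes("Hello World", buf, sizeof buf)</code> with a 64-byte buffer should fill <code>buf</code> with the same list of values shown above.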
Compilers for languages such as [[C]] add an extra "null" byte <code>(0x00)</code> at the end of the string when the string is stored in computer memory. This byte is called the '''null terminator''' and is added to make it easy for the computer to determine the end of the string.
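The effect of the null terminator can be seen by comparing the size of a string in memory with its visible length. In the sketch below, <code>count_until_null</code> and <code>greeting</code> are names chosen for this example; the loop does essentially what the standard <code>strlen</code> function does.

<syntaxhighlight lang="c">
#include <stddef.h>

/* Counts characters by walking through memory until the null
   terminator (0x00) is found. */
size_t count_until_null(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* The compiler stores 12 bytes here: the 11 visible characters
   plus the null terminator it appends. */
static const char greeting[] = "Hello World";
</syntaxhighlight>

Here <code>count_until_null(greeting)</code> gives 11, while <code>sizeof greeting</code> gives 12, the extra byte being the terminator.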
Why
==Control
The first 32 characters of ASCII and Unicode (codes 0 through 31) are reserved for control codes. Most of these are relics of the old teletype days and are no longer of use to most computers today, but a few are still widely used, such as 0 for the null terminator and 8 for backspace. Most of them have no key on a typical keyboard; they are used internally by the computer as a signal to perform certain tasks, or to mark the beginning or end of various data.
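As a rough illustration, the range check and a few of the codes still in common use can be written out in C. The identifiers below are invented for this example, and the numeric values assume an ASCII-compatible character set.

<syntaxhighlight lang="c">
/* A few control codes that are still in common use (ASCII values). */
enum {
    CODE_NUL = 0,   /* null terminator */
    CODE_BS  = 8,   /* backspace */
    CODE_TAB = 9,   /* horizontal tab */
    CODE_LF  = 10,  /* line feed (new line) */
    CODE_CR  = 13   /* carriage return */
};

/* Returns 1 if c is one of the first 32 codes (0 through 31), else 0. */
int is_control_code(int c)
{
    return c >= 0 && c < 32;
}
</syntaxhighlight>

The space character (code 32) is the first code outside this range, so <code>is_control_code(' ')</code> returns 0.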
==Escape
An escape character is a distinguished character used to signal that the following character in a string is to be interpreted in a special way. Without escape characters, we'd run into a problem if we wanted quotation marks to appear in our string when printed to the screen, since quotation marks often mark the beginning and end of the string literal. The same is true if we wanted to have a string that happened to include the comment character in it. In many languages, the backslash <code>\</code> is the escape character, but this varies depending on the language.
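In C, for instance, a quotation mark or a backslash inside a string literal is written with a leading backslash. The sketch below uses made-up names (<code>quoted</code>, <code>path</code>, <code>count_char</code>) to show that the escape character itself is not stored in the resulting string.

<syntaxhighlight lang="c">
#include <stddef.h>

/* \" places a quotation mark inside the string instead of ending it. */
static const char quoted[] = "She said \"hi\"";

/* \\ places a single backslash inside the string. */
static const char path[] = "C:\\temp";

/* Counts how many times the character c appears in s. */
size_t count_char(const char *s, char c)
{
    size_t n = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] == c)
            n++;
    }
    return n;
}
</syntaxhighlight>

The string <code>quoted</code> contains two quotation marks and no backslashes, while <code>path</code> contains exactly one backslash.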
Escape characters are also used to encode special instructions that otherwise the user would have an extremely difficult time supplying to the computer. In C and many other languages, the two character sequence <
==Substitution
This is a similar concept to escape characters, but instead allows you to print a variable value, such as a number or the result of some calculation. The syntax for doing so will vary depending on your programming language, but here's an example from C:
<syntaxhighlight lang="c">
printf("The sum of A plus B is %d", sum(a, b));
</syntaxhighlight>
Here, <code>%</code> is the substitution character, and <code>d</code> tells the computer to format the value as a decimal integer. The argument that follows the format string supplies the value that replaces the <code>%d</code>. Nearly all languages with a built-in print function will handle conversion of numeric data to text characters for you.
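The same substitution can be done into a buffer rather than to the screen, which makes the result easy to inspect. This is a sketch only: <code>format_sum</code> is a hypothetical helper, built on the standard <code>snprintf</code> function.

<syntaxhighlight lang="c">
#include <stdio.h>

/* Hypothetical helper: writes the message into out (at most out_size
   bytes, including the null terminator), with %d replaced by the
   decimal text of a + b. Returns whatever snprintf returns. */
int format_sum(char *out, size_t out_size, int a, int b)
{
    return snprintf(out, out_size, "The sum of A plus B is %d", a + b);
}
</syntaxhighlight>

For example, <code>format_sum(buf, sizeof buf, 2, 3)</code> leaves the text <code>The sum of A plus B is 5</code> in <code>buf</code>, provided the buffer is large enough.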