Literals/String: Difference between revisions

From Rosetta Code

Revision as of 07:43, 7 February 2008

Task
Literals/String
You are encouraged to solve this task according to the task description, using any language you may know.

Show literal specification of characters and strings. If supported, show how verbatim strings (quotes where escape sequences are quoted literally) and here-strings work. Also, discuss which quotes expand variables.

Ada

Single character literals require single quotes

  ch : character := 'a';

String literals use double quotes

  msg : string := "hello world";
  empty : string := "";  -- an empty string

The length of a string in Ada is equal to the number of characters in the string; Ada does not employ a terminating null character as C does. A string can have zero length, but zero-length strings are not often used. Ada's String type is a fixed-length string: it cannot be extended after it is created. If you need to extend a string, use either a bounded string, which has a predetermined maximum length, similar to C strings, or an unbounded string, which can expand or shrink to match the data it contains.

C

In C, single characters are contained in single quotes.

char ch = 'z';

Strings are contained in double quotes.

char str[] = "hello";

This means that 'z' and "z" are different. The former is a character, while the latter is a string: an array of two characters, 'z' and the string-terminator null '\0'. Likewise, str above is an array of six characters: the letters 'h', 'e', 'l', 'l', 'o', and the string-terminator null '\0'.

C has no raw string feature and no way of literally quoting a string containing line breaks. C also has no built-in mechanism for expanding variables within strings.

C#

C# uses single quotes for characters and double quotes for strings, just as C does. C# also supports verbatim strings, which begin with @" and end with ". Verbatim strings may contain line breaks, so they also serve as here-strings.

C++

Quoting is essentially the same in C and C++. In both languages, prefixing an L to an opening quote indicates that a character is a wide character or that a string is an array of wide characters.

D

Character literals:

  char c = 'a';

Regular strings support C-style escape sequences.

  auto str = "hello";   // UTF-8
  auto str2 = "hello"c; // UTF-8
  auto str3 = "hello"w; // UTF-16
  auto str4 = "hello"d; // UTF-32

Literal string (escape sequences are not interpreted):

  auto str = `"Hello," he said.`;
  auto str2 = r"\n is slash-n";

Specified delimiter string:

  // Any character is allowed after the first quote;
  // the string ends with that same character followed 
  // by a quote.
  auto str = q"$"Hello?" he enquired.$";
  // If you include a newline, you get a heredoc string:
  auto otherStr = q"EOS
  This is part of the string.
      So is this.
  EOS";

Token string:

  // The contents of a token string must be valid code fragments.
  auto str = q{int i = 5;};
  // The contents here are not legal tokens in D, so this is an error:
  auto illegal = q{@?};

Hex string:

  // assigns value 'hello' to str
  auto str = x"68 65 6c 6c 6f";

Java

char a = 'a'; //prints as: a
String b = "abc"; //prints as: abc
char doubleQuote = '"'; //prints as: "
char singleQuote = '\''; //prints as: '
String singleQuotes = "''"; //prints as: ''
String doubleQuotes = "\"\""; //prints as: ""

LaTeX

Since LaTeX is a markup language rather than a programming language, quotes are displayed rather than interpreted. However, quotes do deserve special mention in LaTeX. Opening (left) quotes are denoted with backquotes and closing (right) quotes are denoted with apostrophes (straight single quotes). Single quotes use a single symbol and double quotes use two symbols. For example, to typeset 'a' is for "apple" in LaTeX, one would type

`a' is for ``apple''

One common mistake is to use the same symbol for opening and closing quotes, which results in one of the quotes being backward in the output. Another common mistake is to use a double quote symbol in the input file, rather than two single quotes, to produce a double quote in the output.

Python

Python makes no distinction between single characters and strings. One can use single or double quotes.

'c' == "c" # character
'text' == "text"
' " '
" ' "
'\x20' == ' '
u'unicode string'
u'\u05d0' # unicode literal

As shown in the last examples, Unicode strings are single or double quoted with a "u" or "U" prefix.
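The point that a "character" is just a string of length one can be checked directly. The following is a minimal sketch that assumes Python 3.3 or later, where every string literal is already Unicode and the "u" prefix is accepted but changes nothing; under the Python 2 interpreters this section describes, u'...' yields a separate unicode type instead. The variable names are illustrative only.

```python
# A single "character" is an ordinary string of length 1.
ch = 'c'
assert type(ch) is str
assert len(ch) == 1
assert ch == "c"      # quote style does not matter
assert ch[0] == ch    # indexing a string yields another string

# Assumption: Python 3.3+, where the u prefix is allowed and has no effect.
aleph = u'\u05d0'
assert aleph == '\u05d0'
assert len(aleph) == 1
assert ord(aleph) == 0x05D0
```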

Verbatim (a.k.a. "raw") strings are contained within either single or double quotes, but have an "r" or "R" prepended to indicate that backslash characters should NOT be treated as "escape sequences." This is useful when defining regular expressions as it avoids the need to use sequences like \\\\ (a sequence of four backslashes) in order to get one literal backslash into a regular expression string.

r'\x20' == '\\x20'
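To make the regular-expression remark concrete, here is a small sketch using the standard re module. It shows that a raw literal keeps its backslashes, and that the raw and non-raw spellings of a pattern matching one literal backslash are the same string. The variable names are illustrative only.

```python
import re

# In a raw string the backslash is kept as an ordinary character.
raw = r'\x20'
assert len(raw) == 4          # backslash, 'x', '2', '0'
assert raw == '\\x20'

# A regex that matches one literal backslash needs two backslashes.
# Without the r prefix, each of those must itself be escaped.
pattern_raw = r'\\'
pattern_plain = '\\\\'
assert pattern_raw == pattern_plain

# The subject string a\b\c contains two backslashes.
matches = re.findall(pattern_raw, r'a\b\c')
assert len(matches) == 2
```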

The Unicode and raw string modifiers can be combined to prefix a raw Unicode string. This must be done as "ur" or "UR" (not with the letters reversed, as in "ru").

Here-strings are denoted with triple quotes.

''' single triple quote '''
""" double triple quote """

The "u" and "r" prefixes can also be used with triple quoted strings.

Triple quoted strings can contain any mixture of double and single quotes as well as embedded newlines, etc. They are terminated by unescaped triple quotes of the same type that initiated the expression. They are generally used for "doc strings" and other multi-line string expressions, and are also useful for "commenting out" blocks of code.
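A short sketch of these properties, assuming Python 3 (the names doc, lines and raw_block are made up for illustration): a triple quoted literal holds both kinds of quotes and a real line break, and the "r" prefix still suppresses escape processing inside it.

```python
# Both quote styles and a line break inside one triple quoted literal.
doc = '''He said "hello" and she said 'goodbye'.
This is the second line.'''
lines = doc.split('\n')
assert len(lines) == 2
assert lines[0] == 'He said "hello" and she said \'goodbye\'.'
assert lines[1] == 'This is the second line.'

# With the r prefix, \n and \t stay as two characters each;
# only the physical line break produces a newline character.
raw_block = r"""C:\new\table
end"""
assert raw_block.startswith('C:\\new\\table')
assert raw_block.count('\n') == 1
```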

PowerShell

PowerShell makes no distinction between characters and strings. Single-quoted strings do not interpolate variable contents, but double-quoted strings do. Escape sequences are likewise not interpreted within single quotes: their characters are kept literally.

PowerShell here-strings begin with @' (or @") followed immediately by a line break and end with a line break followed by '@ (or "@). Escape sequences and variables are interpolated in @" quotes but not in @' quotes.

Seed7

In Seed7, single characters are contained in single quotes.

var char: ch is 'z';

Strings are contained in double quotes.

var string: stri is "hello";

This means that 'z' and "z" are different. The former is a character while the latter is a string. Seed7 strings are not null terminated (they do not end with \0). They can contain any sequence of Unicode (UTF-32) characters, including \0. In source code, Unicode characters are written in the UTF-8 encoding. Empty strings are also allowed. The backslash is used to allow double quotes and control characters in strings. A string literal can also be broken across several lines.

var string: example is "this is a string\
                       \ which continues in the next line\n\
                       \and contains a line break";

There is no built-in mechanism for expanding variables within strings.