Quoting constructs

From Rosetta Code
Quoting constructs is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Pretty much every programming language has some form of quoting construct to allow embedding of data in a program, be it literal strings, numeric data or some combination thereof.

Show examples of the quoting constructs in your language. Explain where they would likely be used, what their primary use is, what limitations they have and why one might be preferred over another. Is one style interpolating and another not? Are there restrictions on the size of the quoted data? The type? The format?

This is intended to be open-ended and free form. If you find yourself writing more than a few thousand words of explanation, summarize and provide links to relevant documentation; but do provide at least a fairly comprehensive summary here, on this page, NOT just a link to [See the language docs].

Note: This is primarily for quoting constructs for data to be "embedded" in some way into a program. If there is some special format for external data, it may be mentioned but that isn't the focus of this task.


Go

package main
 
import (
	"fmt"
	"os"
	"regexp"
	"strconv"
)
 
/* Quoting constructs in Go. */
 
// In Go a Unicode codepoint, expressed as a 32-bit integer, is referred to as a 'rune'
// but the more familiar term 'character' will be used instead here.
 
// Character literal (single quotes).
// Can contain any single character including an escaped character.
var (
	rl1 = 'a'
	rl2 = '\'' // single quote can only be included in escaped form
)
 
// Interpreted string literal (double quotes).
// A sequence of characters including escaped characters.
var (
	is1 = "abc"
	is2 = "\"ab\tc\"" // double quote can only be included in escaped form
)
 
// Raw string literal (single back quotes).
// Can contain any character including a 'physical' new line but excluding back quote.
// Escaped characters are interpreted literally i.e. `\n` is backslash followed by n.
// Raw strings are typically used for hard-coding pieces of text perhaps
// including single and/or double quotes without the need to escape them.
// They are particularly useful for regular expressions.
var (
	rs1 = `
first"
second'
third"
`

	rs2 = `This is one way of including a ` + "`" + ` in a raw string literal.`
	rs3 = `\d+` // a sequence of one or more digits in a regular expression
)
 
func main() {
	fmt.Println(rl1, rl2) // prints the code point value not the character itself
	fmt.Println(is1, is2)
	fmt.Println(rs1)
	fmt.Println(rs2)
	re := regexp.MustCompile(rs3)
	fmt.Println(re.FindString("abcd1234efgh"))

	/* None of the above quoting constructs can deal directly with interpolation.
	   This is done instead using library functions.
	*/

	// C-style using %d, %f, %s etc. within a 'printf' type function.
	n := 3
	fmt.Printf("\nThere are %d quoting constructs in Go.\n", n)

	// Using a function such as fmt.Println which can take a variable
	// number of arguments, of any type, and print them out separated by spaces.
	s := "constructs"
	fmt.Println("There are", n, "quoting", s, "in Go.")

	// Using the function os.Expand which requires a mapper function to fill placeholders
	// denoted by ${...} within a string.
	mapper := func(placeholder string) string {
		switch placeholder {
		case "NUMBER":
			return strconv.Itoa(n)
		case "TYPES":
			return s
		}
		return ""
	}
	fmt.Println(os.Expand("There are ${NUMBER} quoting ${TYPES} in Go.", mapper))
}
Output:
97 39
abc "ab	c"

first"
second'
third"

This is one way of including a ` in a raw string literal.
1234

There are 3 quoting constructs in Go.
There are 3 quoting constructs in Go.
There are 3 quoting constructs in Go.
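
As a supplementary sketch (not part of the original entry), the difference between raw and interpreted literals can be checked directly, and the standard strconv package can produce or parse the quoted forms at run time. The helper names rawNewline, interpretedNewline and quoteForSource are hypothetical, introduced only for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// rawNewline returns a raw literal: backslash followed by 'n' (2 bytes).
func rawNewline() string { return `\n` }

// interpretedNewline returns an interpreted literal: one newline byte.
func interpretedNewline() string { return "\n" }

// quoteForSource (hypothetical helper) returns s as a double-quoted Go
// string literal, escaping quotes, backslashes and control characters.
func quoteForSource(s string) string {
	return strconv.Quote(s)
}

func main() {
	// Escapes are only processed in interpreted (double-quoted) literals.
	fmt.Println(len(rawNewline()), len(interpretedNewline()))

	s := "\"ab\tc\""
	fmt.Println(quoteForSource(s))

	// The %q verb quotes strings (double quotes) and runes (single quotes).
	fmt.Printf("%q %q\n", s, 'a')

	// strconv.Unquote reverses the process; it also accepts raw literals.
	u, err := strconv.Unquote("`\\d+`")
	fmt.Println(u, err)
}
```

This is mainly useful when a program has to emit Go source or log strings unambiguously; for ordinary embedded data the literal forms shown earlier are sufficient.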

Julia

Note: Almost all of the documentation below is quoted from various portions of the Julia documentation at docs.julialang.org.

  • As with all objects in Julia, the size of a quoted string is limited by the maximum allocatable memory object in the underlying OS (32 or 64 bit).
  • Quoted strings are considered to contain a series of Unicode characters, but invalid Unicode within strings does not itself throw any errors. Therefore, strings may potentially contain any values.
  • String literals are delimited by double quotes or triple double quotes:
 
julia> str = "Hello, world.\n"
"Hello, world.\n"
 
julia> """Contains "quote" characters and
a newline"""
"Contains \"quote\" characters and \na newline"
 
  • Both single and triple quoted strings may contain interpolated values. Triple-quoted strings are also dedented to the level of the least-indented line. This is useful for defining strings within code that is indented. For example:
 
julia> str = """
           Hello,
           world.
         """
"  Hello,\n  world.\n"
 
  • Julia allows interpolation into string literals using $:
 
julia> greet = "Hello"; whom = "world";
 
julia> "$greet, $whom.\n"
"Hello, world.\n"
 
  • The shortest complete expression after the $ is taken as the expression whose value is to be interpolated into the string. Thus, you can interpolate any expression into a string using parentheses:
 
julia> "1 + 2 = $(1 + 2)"
"1 + 2 = 3"
 
  • Julia reserves the single quote ' for character literals, not for strings:
 
julia> 'π'
'π': Unicode U+03C0 (category Ll: Letter, lowercase)
 
  • Commands passed to functions such as run() are written between backticks. Such expressions create a Cmd object, which is used for running a child process from Julia:
 
julia> mycommand = `echo hello`
`echo hello`
 
julia> typeof(mycommand)
Cmd
 
julia> run(mycommand);
hello
 
  • Julia uses the colon : in metaprogramming for quoting symbols and other code:
 
julia> a = :+
:+
 
julia> typeof(a)
Symbol
 
julia> b = quote + end
quote
#= REPL[3]:1 =#
+
end
 
julia> typeof(b)
Expr
 
julia> eval(a) == eval(b)
true
 
julia> c = :(2 + 3)
:(2 + 3)
 
julia> eval(c)
5
 

Phix

Single quotes are used for single ASCII characters, eg 'A'. Multibyte Unicode characters are typically held as UTF-8 strings.
Double quotes are used for single-line strings, with backslash interpretation, eg "one\ntwo\nthree\n".
The concatenation operator & along with a couple more quotes can certainly be used to mimic string continuation, however it is technically an implementation detail rather than part of the language specification as to whether that occurs at compile-time or run-time.
Phix does not support interpolation other than printf-style, eg printf(1,"Hello %s,\nYour account balance is %3.2f\n",{name,balance}).
Back-ticks and triple-quotes are used for multi-line strings, without backslash interpretation, eg

constant t123 = `
one
two
three
`

or (entirely equivalent, except the following can contain back-ticks which the above cannot, and vice versa for triple quotes)

constant t123 = """
one
two
three
"""

Both are also equivalent to the top double-quote one-liner. Note that a single leading '\n' is automatically stripped.
Several builtins such as substitute, split, and join are often used to convert such strings into the required internal form.
Regular expressions are usually enclosed in back-ticks, specifically to avoid backslash interpretation.
You can also declare hexadecimal strings, eg

x"1 2 34 5678_AbC" -- same as {0x01, 0x02, 0x34, 0x56, 0x78, 0xAB, 0x0C}
-- note however it displays as {1,2,52,86,120,171,12}
-- whereas x"414243" displays as "ABC" (as all chars)

Literal sequences are represented with curly braces, and can be nested to any depth, eg

{2, 3, 5, 7, 11, 13, 17, 19}
{1, 2, {3, 3, 3}, 4, {5, {6}}}
{{"John", "Smith"}, 52389, 97.25}
{} -- the 0-element sequence

Raku

The Perl philosophy, which Raku thoroughly embraces, is that "There Is More Than One Way To Do It" (often abbreviated to TIMTOWTDI). Quoting constructs is an area where this is enthusiastically espoused.

Raku has a whole quoting specific sub-language built in called Q. Q changes the parsing rules inside the quoting structure and allows extremely fine control over how the enclosed data is parsed. Every quoting construct in Raku is some form of a Q syntactic structure, using adverbs to fine tune the desired behavior, though many of the most commonly used have some form of "shortcut" syntax for quick and easy use. Usually, when using an adverbial form, you may omit the Q: and just use the adverb.

In general, all quoting structures have theoretically unlimited length; in practice they are limited by memory size. It is probably better to keep them under a gigabyte or so, though the data can be read as a supply without holding the whole thing in memory at once. They can hold multiple lines of data. How the new-line characters are treated depends entirely on any white-space adverb applied. The Q forms use some bracketing character to delimit the quoted data, usually a Unicode bracket pair ([], {}, <>, ⟪⟫, whatever) that has an "open" and a "close" bracketing character, but they may use any non-identifier character as both opener and closer: ||, //, ??, the list goes on. The universal escape character for constructs that allow escaping is backslash "\".

The following exposition barely scratches the surface. For much more detail see the Raku documentation for quoting constructs for a comprehensive list of adverbs and examples.

The most commonly used
  • Q[ ], common shortcut: 「 」
The most basic form of quoting. No interpolation, no escaping. What is inside is what you get. No exceptions.
「Ze backslash characters!\ \Zay do NUSSING!! \」 -> Ze backslash characters!\ \Zay do NUSSING!! \


  • "Single quote" quoting. - Q:q[ ], adverbial: q[ ], common shortcut: ' '
No interpolation, but allow escaping quoting characters.
'Don\'t panic!' -> Don't panic!


  • "Double quote" quoting. - Q:qq[ ], adverbial: qq[ ], common shortcut: " "
Interpolates: embedded variables, logical characters, character codes, closures.
"Hello $name, today is {Date.today} \c[grinning face] \n🦋" -> Hello Dave, today is 2020-03-25 😀
🦋
Where $name is a variable containing a name (one would imagine), {Date.today} is a closure - a code block to be executed and the result inserted, \c[grinning face] is the literal emoji character 😀 as a character code, \n is a new-line character and 🦋 is an emoji butterfly. Allows escape sequences, and indeed, requires them when embedding data that looks like it may be an interpolation target but isn't.


Every adverbial form has both a q and a qq variant to give the 'single quote' or "double quote" semantics. Only the most commonly used are listed here.


  • "Quote words" - Q:qw[ ], adverbial: qw[ ], common shortcut: < >
No interpolation, but allow escaping quote characters. (Inherited from the q[] escape semantics)
< a β 3 Б 🇩🇪 >
Parses whatever is inside as a white-space separated list of words. Returns a list with all white space removed. Any numeric values are returned as allomorphs.
That list may be operated on directly with any listy operator or it may be assigned to a variable.
say < a β 3 Б 🇩🇪 >[*-1] # What is the last item in the list? (🇩🇪)
say +< a β 3 Б 🇩🇪 > # How many items are in the list? (5)


  • "Quote words with white space protection" - Q:qww[ ], adverbial: qww[ ]
May preserve white space by enclosing it in single or double quote characters, otherwise identical to qw[ ].
say qww< a β '3 Б' 🇩🇪 >[2] # Item 2 in the list? (3 Б)


  • "Double quote words" quoting. - Q:qqw[ ], adverbial: qqw[ ], common shortcut: << >> or « »
Interpolates similar to standard double quote, but then interprets the interpolated string as a white space separated list.


  • "Double quoted words with white space protection" - Q:qqww[ ], adverbial: qqww[ ]
Same as qqw[ ] but retains quoted white space.


  • "System command" - Q:qx[ ], adverbial: qx[ ]
Execute the string inside the construct as a system command and return the result.


  • "Heredoc format" - Q:q:to/END/; END, adverbial: q:to/END/; END
Return structured text between two textual delimiters. Depending on the adverb, may or may not interpolate (same rules as other adverbial forms). Leading white space equal to the indentation of the final delimiter is removed from each line of the returned text. The text delimiter is user chosen (and is typically, though not necessarily, uppercase), as is the delimiter bracket character.

There are other adverbs that give precise control over what interpolates and what doesn't; they may be applied to any of the above constructs. See the doc page for details. There is another whole sub-genre dedicated to quoting regexes.

REXX

There are no "escape" characters used in the REXX language.

The different types (or styles) of incorporating quoted constructs are largely a matter of style.

/*REXX program demonstrates various ways to express a string of characters  or  numbers.*/
a= 'This is one method of including a '' (an apostrophe) within a string.'
b= "This is one method of including a ' (an apostrophe) within a string."
 
/*sometimes, an apostrophe is called */
/*a quote. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
c= "This is one method of including a "" (a double quote) within a string."
d= 'This is one method of including a " (a double quote) within a string.'
 
/*sometimes, a double quote is also */
/*called a quote, which can make for */
/*some confusion and bewilderment. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
f= 'This is one method of expressing a long literal by concatenations, the ' || ,
'trailing character of the above clause must contain a trailing ' || ,
'comma (,) === note the embedded trailing blank in the above 2 statements.'
/*──────────────────────────────────────────────────────────────────────────────────────*/
g= 'This is another method of expressing a long literal by ' ,
"abutments, the trailing character of the above clause must " ,
'contain a trailing comma (,)'
/*──────────────────────────────────────────────────────────────────────────────────────*/
h= 'This is another method of expressing a long literal by ' , /*still continued.*/
"abutments, the trailing character of the above clause must " ,
'contain a trailing comma (,) --- in this case, the comment /* ... */ ' ,
'essentially is not considered to be "part of" the REXX clause.'
/*──────────────────────────────────────────────────────────────────────────────────────*/
i= 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109
 
/*This is one way to express a list of */
/*numbers that don't have a sign. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
j= 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109,
71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181
 
/*This is one way to express a long */
/*list of numbers that don't have a */
/*sign. */
/*Note that this form of continuation */
/*implies a blank is abutted to first */
/*part of the REXX statement. */
/*Also note that some REXXs have a */
/*maximum clause length. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
k= 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109,
71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181
 
/*The J and K values are identical,*/
/*superfluous and extraneous blanks are*/
/*ignored. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
l= '-2 3 +5 7 -11 13 17 19 -23 29 -31 37 -41 43 47 -53 59 -61 67 -71 73 79 -83 89 97 -101'
 
/*This is one way to express a list of */
/*numbers that have a sign. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
m= a b c d f g h i j k l /*this will create a list of all the */
/*listed strings used (so far) into */
/*the variable M (with an */
/*intervening blank between each */
/*variable's value). */


Ring

 
text = list(3)
 
text[1] = "This is 'first' example for quoting"
text[2] = "This is second 'example' for quoting"
text[3] = "This is third example 'for' quoting"
 
for n = 1 to len(text)
see "text for quoting: " + nl + text[n] + nl
str = substr(text[n],"'","")
see "quoted text:" + nl + str + nl + nl
next
 
Output:

text for quoting:
This is 'first' example for quoting
quoted text:
This is first example for quoting

text for quoting:
This is second 'example' for quoting
quoted text:
This is second example for quoting

text for quoting:
This is third example 'for' quoting
quoted text:
This is third example for quoting

Wren

Library: Wren-fmt

The only quoting construct Wren has is the string literal which is a sequence of characters (usually interpreted as UTF-8) enclosed in double-quotes. It can include various escape sequences as listed in the Literals/String#Wren task. Unlike many other languages, Wren doesn't currently support any form of 'raw' or 'verbatim' string whereby escape sequences etc. are interpreted literally.

However, it does support interpolation which enables any Wren expression, whatever its type or complexity, to be embedded in a string literal by placing it in parentheses immediately preceded by a % character. A literal % character is represented by the escape sequence \%.

If the expression is not a string, then it is automatically converted to one by applying its type's toString method. All classes have such a method which is usually written explicitly or can be just inherited from the Object class which sits at the top of the type hierarchy.

Interpolated expressions can also be nested though this is not usually a good idea as they can quickly become unreadable.

It can be argued that interpolated strings which contain anything other than simple expressions (for example formatting information) are hard to read anyway and, although not part of the standard language, the above module contains methods modelled after C's 'printf' function family to meet this objection.

Here are some examples of all this.

import "/fmt" for Fmt
 
// simple string literal
System.print("Hello world!")
 
// string literal including an escape sequence
System.print("Hello tabbed\tworld!")
 
// interpolated string literal
var w = "world"
System.print("Hello interpolated %(w)!")
 
// 'printf' style
Fmt.print("Hello 'printf' style $s!", w)
 
// more complicated interpolated string literal
var h = "Hello"
System.print("%(Fmt.s(-8, h)) more complicated interpolated %(w.map { |c| "%(c + "\%")" }.join())!")
 
// more complicated 'printf' style
Fmt.print("$-8s more complicated 'printf' style $s\%!", h, w.join("\%"))
Output:
Hello world!
Hello tabbed	world!
Hello interpolated world!
Hello 'printf' style world!
Hello    more complicated interpolated w%o%r%l%d%!
Hello    more complicated 'printf' style w%o%r%l%d%!

zkl

Quoting text: zkl has two types of text: parsed and raw. Strings are limited to one line, no continuations.

Parsed text is in double quotes ("text\n") and supports escapes ("\n" is newline), UTF-8 ("\Ubd;" or "\u00bd"), etc.

"Raw" text is unparsed, useful for things like regular expressions and unit testing of source code. It uses the form 0'<sigil>text<sigil>. For example 0'<text\n> is the text "text\\n". There is no restriction on sigil (other than it is one character).

Text blocks are multiple lines of text that are gathered into one line and then evaluated (thus can be anything, such as string or code and are often mixed). #<<< (at the start of a line) begins and ends the block. A #<<<" beginning tag prepends a " to the block. For example:

#<<<
text:=
"
A
";
#<<<

is parsed as text:="\nA\n";

Other data types are pretty much as in other languages.