Quoting constructs: Difference between revisions

Revision as of 16:41, 7 March 2024

Revision as of 14:27, 24 August 2021 (view source) Puppydrum64 (talk \| contribs) m (→‎{{header\|6502 Assembly}}) ← Older edit		Latest revision as of 16:41, 7 March 2024 (view source) Tromp (talk \| contribs) (BLC data embedding)
(35 intermediate revisions by 14 users not shown)
{{task}}
Pretty much every programming language has some form of quoting construct to allow embedding of data in a program, be it literal strings, numeric data or some combination thereof.

Note: This is primarily for quoting constructs for data to be "embedded" in some way into a program. If there is some special format for external data, it may be mentioned but that isn't the focus of this task.

=={{header|6502 Assembly}}==
The following uses VASM syntax for quoting constructs. There is no built-in support for interpolation, escape characters, etc. What constitutes an escape character depends on the code that uses the embedded strings. A function that needs this embedded data can take a pointer to the data as an argument; the pointer is easily obtained by loading the low byte and high byte of the label into consecutive zero page RAM locations.

<syntaxhighlight lang="6502asm">LookUpTable: db $00,$03,$06,$09,$12 ;a sequence of pre-defined bytes
MyString: db "Hello World!",0 ;a null-terminated string
GraphicsData: incbin "C:\game\gfx\tilemap.chr" ;a file containing the game's graphics</syntaxhighlight>

Most 6502 assemblers use Motorola syntax, which uses the following conventions:
* A value with no prefix is interpreted as a base 10 (decimal) number. $ represents hexadecimal and % represents binary. Single or double quotes represent ASCII.
* Unlike immediate values in instructions, quoted data does NOT begin with a #. For example, <code>dw $1234</code> represents the literal value 0x1234, not "the value stored at memory address 0x1234."
* Multiple values can be put on the same line, separated by commas. <code>DB</code> only needs to be before the first data value on that line. Or, you can put each value on its own line. Both are valid and have the same end result when the code is assembled.
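As a rough illustration of the prefix conventions above, the following Python sketch converts individual data tokens to integers. It is not how any particular assembler is implemented; the function names <code>parse_motorola_literal</code> and <code>assemble_db</code> are made up for this example, and quoted strings longer than one character are deliberately not handled.

```python
def parse_motorola_literal(token):
    """Return the integer value of one Motorola-style data token.

    Hypothetical helper for illustration only:
      $..  -> hexadecimal, %.. -> binary, 'c' or "c" -> ASCII code,
      anything else -> decimal.
    """
    token = token.strip()
    if token.startswith("$"):
        return int(token[1:], 16)
    if token.startswith("%"):
        return int(token[1:], 2)
    if len(token) == 3 and token[0] == token[-1] and token[0] in "'\"":
        return ord(token[1])
    return int(token, 10)


def assemble_db(operands):
    """Encode a comma-separated operand list as bytes.

    Assumes no operand contains a comma (so no quoted commas).
    bytes() raises ValueError if a value does not fit in 8 bits.
    """
    return bytes(parse_motorola_literal(t) for t in operands.split(","))


table = assemble_db("$00,$03,$06,$09,$12")
print(table.hex())
```

Note that the operand list carries no <code>#</code> prefix, matching the second convention above: the values are the data itself.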
6502 Assembly uses <code>db</code> or <code>byte</code> for 8-bit data and <code>dw</code> or <code>word</code> for 16-bit data. 16-bit values are written by the programmer in big-endian order, but stored little-endian. For example, the following two data blocks are equivalent. You can write it either way, but the end result is the same.
<syntaxhighlight lang="6502asm">dw $ABCD
db $CD,$AB</syntaxhighlight>

Most assemblers support "C-like" operators, and there are a few additional ones:
* < or LOW() means "The low byte of." For example, <code><$3456</code> evaluates to $56.
* > or HIGH() means "The high byte of." For example, <code>>$78AB</code> evaluates to $78.

These two operators are most frequently used with labeled memory addresses, like so:
<syntaxhighlight lang="6502asm">lookup_table_lo: byte <Table00,<Table01,<Table02
lookup_table_hi: byte >Table00,>Table01,>Table02</syntaxhighlight>

=={{header|68000 Assembly}}==
Formatting is largely dependent on the assembler and the syntax. Generally speaking, assemblers that use Motorola syntax follow these conventions:
* A value with no prefix is interpreted as a base 10 (decimal) number. $ represents hexadecimal and % represents binary. Single or double quotes represent ASCII.
* Unlike immediate values in instructions, quoted data does NOT begin with a #. For example, <code>DC.L $12345678</code> represents the literal value 0x12345678, not "the value stored at memory address 0x12345678."
* The length of the data must be specified, and if the value given is smaller than that size, it will get padded to the left with zeroes. If the value is too big to fit in the specified size, the assembler reports an error and assembly is cancelled.
* Multiple values can be put on the same line, separated by commas. <code>DC._</code> only needs to be before the first data value on that line. Or, you can put each value on its own line. Both are valid and have the same end result when the code is assembled.
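The size rule in the list above (pad small values on the left with zeroes, reject values that do not fit) can be sketched in Python. This is only a model of the described behaviour, not any assembler's actual code: the function name <code>dc</code> and the <code>SIZES</code> table are hypothetical, and only unsigned values are modelled.

```python
# Width in bytes of each DC size suffix (names chosen for this sketch).
SIZES = {"B": 1, "W": 2, "L": 4}


def dc(size, *values):
    """Model DC.B / DC.W / DC.L: big-endian, zero-padded on the left.

    Assumption: values are unsigned; a value that does not fit in the
    chosen size raises ValueError, standing in for the assembly error.
    """
    width = SIZES[size]
    out = bytearray()
    for value in values:
        if not 0 <= value < 1 << (8 * width):
            raise ValueError(f"${value:X} does not fit in DC.{size}")
        out += value.to_bytes(width, "big")
    return bytes(out)


print(dc("W", 0x01, 0x02).hex())   # same bytes as DC.W $0001,$0002
print(dc("L", 0x12345678).hex())
```

The 68000 stores multi-byte values big-endian, which is why <code>to_bytes</code> is called with <code>"big"</code> here.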
You should always follow byte data with <code>EVEN</code>, which adds an extra byte of padding if the total number of bytes before it was odd. This is necessary for your code to comply with the CPU's alignment rules.
<syntaxhighlight lang="68000devpac">ByteData:
 DC.B $01,$02,$03,$04,$05
 even
WordData:
 DC.W $01,$02
 DC.W $03,$04
;the above was the same as DC.W $0001,$0002,$0003,$0004
LongData:
 DC.L $00000001,$00000002,$00000004,$00000008
MyString:
 DC.B "Hello World!",0 ;a null terminator will not be automatically placed.
 even</syntaxhighlight>

<code>DS._</code> reserves a block of space. The number after it specifies how many bytes/words/longs' worth of zeroes to place. Some assemblers support values besides zero, others do not.
<syntaxhighlight lang="68000devpac">DS.B 8 ;8 bytes, each equals zero
DS.W 16 ;16 words, each equals zero
DS.L 20 ;20 longs, each equals zero</syntaxhighlight>

In addition to constants, a label can also be specified. If a label is defined with an <code>EQU</code> statement, the label will be replaced with the assigned value during the assembly process.
<syntaxhighlight lang="68000devpac">ScreenSize equ $1200

MOVE.W (MyData),D0

MyData DC.W ScreenSize</syntaxhighlight>

Code labels, on the other hand, get replaced with the memory address they point to. This can be used to make a lookup table of various data areas, functions, etc. Since there is no "24-bit" data constant directive, you'll have to use <code>DC.L</code> for code labels. The top byte will always be zero in this case.
<syntaxhighlight lang="68000devpac">PrintString:
;insert your code here

FunctionTable:
 DC.L PrintString ;represents the address of the function "PrintString"</syntaxhighlight>

Most constants can be derived from compile-time expressions, which are useful for explaining what the data actually means.
The expressions are evaluated during the assembly process, and the resulting object code will have these calculations already completed, so your program doesn't have to waste time doing them. Most "C-like" operators are supported, but as always the exact syntax depends on your assembler. Parentheses will aid the assembler in getting these correct, but sometimes it still doesn't do what you expect. <syntaxhighlight lang="68000devpac">DC.B $0200>>3 ;evaluates to $0040. As long as the final result fits within the designated storage size, you're good. DC.W 4+5 ;evaluates to $0009 DC.W (4030)-1 ;evaluates to $1199 DC.L MyFunction+4 ;evaluates to the address of MyFunction, plus 4.</syntaxhighlight> We can use this technique to get the length of a region of data, which the assembler can calculate for us. <syntaxhighlight lang="68000devpac">TilemapCollision: DC.B $11,$11,$11,$11,$11,$11,$11,$11,$11,$11 DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01 DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01 DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01 DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01 DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01 DC.B $11,$00,$00,$00,$00,$00,$00,$00,$00,$01 DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01 DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01 DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01 DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01 DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01 DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01 DC.B $11,$11,$11,$11,$11,$11,$11,$11,$11,$11 TilemapCollisionEnd: MOVE.W #(TilemapCollisionEnd-TilemapCollision)-1,D0 ;gets the length of this region of memory, minus 1, into D0. ; Again, even though the "operands" of this expression are longs, ; their difference fits in 16 bits and that's all that matters.</syntaxhighlight> For quoting binary data in another file, you can use the <code>incbin</code> directive to embed it directly in your source code. This is handy for graphics data and music. 
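The two layout ideas used in this section, <code>EVEN</code> padding and taking the difference of two labels to get a region's length, can be sketched in Python as follows. The class <code>DataSection</code> and the label names are invented for this example, and the table contents are arbitrary; a real assembler does considerably more.

```python
class DataSection:
    """Toy model of a data section: a byte buffer plus label offsets."""

    def __init__(self):
        self.data = bytearray()
        self.labels = {}

    def label(self, name):
        # A label simply records the current offset.
        self.labels[name] = len(self.data)

    def dc_b(self, *values):
        self.data += bytes(values)

    def even(self):
        # Add one padding byte only if the current offset is odd.
        if len(self.data) % 2:
            self.data.append(0)


section = DataSection()
section.label("ByteData")
section.dc_b(1, 2, 3, 4, 5)      # five bytes leave the offset odd
section.even()                   # so one byte of padding is added

section.label("Table")
for _ in range(3):               # three rows of ten bytes each
    section.dc_b(*[0x11] * 10)
section.label("TableEnd")

# Equivalent in spirit to (TableEnd-Table), computed before run time.
region_length = section.labels["TableEnd"] - section.labels["Table"]
print(section.labels["Table"], region_length)
```

Because the length is derived from the labels, it stays correct when rows are added to or removed from the table.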
=={{header|Applesoft BASIC}}==
Real precision numbers (also called "floating point" numbers) and quoted strings are constructs within expressions. Literal strings, real precision numbers, and quoted strings are constructs within DATA statements.

Numbers start with a digit from 0 to 9 or a sign + or - and can include a two digit signed exponent. The real numbers must be in the range from -1.7E+38 to 1.7E+38. Reals with an absolute value less than 2.9388E-39 are converted to zero.

Quoted strings start with the double quote. Quoted strings are terminated with the double quote or by the end of a statement. A quote cannot be embedded in a quoted string. Most control characters can be embedded in quoted strings, but this is usually discouraged.

Literals can be used in DATA statements. These are strings that do not start with a double quote and can have a double quote included in the literal string.
=== Quoted constructs within expressions ===
<syntaxhighlight lang="text">? 0 : ? -326.12E-5 : ? HELLO : ? "HELLO" : ? "HELLO</syntaxhighlight>
The literal HELLO is interpreted as a variable name, and its value 0 is printed.
{{out}}
<pre>
0
-3.2612E-03
0
HELLO
HELLO
</pre>
=== Quoted constructs within DATA statements ===
<syntaxhighlight lang="text">
10 DATA 0,-326.12E-5,HELLO,"HELLO","HELLO
20 READ A%: PRINT A%: READ A: PRINT A: READ A$: PRINT A$: READ A$: PRINT A$: READ A$: PRINT A$
30 DATA AB"C
40 READ A$: PRINT A$</syntaxhighlight>
{{out}}
<pre>
0
-3.2612E-03
HELLO
HELLO
HELLO
AB"C
</pre>
=={{header|Binary Lambda Calculus}}==
The ability to embed arbitrary binary data of any length with zero overhead is one of the defining features of BLC, in which a closed lambda term is parsed from the start of the program, and then applied to the rest of the program, making the latter the quoted part.
Even the simple hello world program, which in BLC is <code> Hello, world!</code>, follows this pattern, with the initial space encoding the lambda term <code>\x.x</code> for the identity function.

The restriction is that only one string of data can be so embedded. If we need to embed more pieces, then we can concatenate self-delimiting descriptions, which incur some logarithmic overhead, e.g. by use of the Levenshtein encoding.

=={{header|BQN}}==
BQN programs manipulate data of seven types:
* Character
<syntaxhighlight lang="bqn">'a'
'b'
@</syntaxhighlight>
<code>@</code> is a symbol that represents the null character. Character literals can contain a newline (<code>@+10</code> is recommended, however).
* Number
<syntaxhighlight lang="bqn">123
1.23
123E5
¯1234
∞
π</syntaxhighlight>
<code>∞</code>, <code>¯∞</code> and <code>π</code> are constants which represent infinity, negative infinity and pi.
* Function: A block <code>{}</code> which takes one or two arguments: <code>𝕩</code> and/or <code>𝕨</code>
* 1-Modifier: A block similar to a function, which can take one extra function argument <code>𝔽</code> on the left.
* 2-Modifier: A modifier which can take two function arguments, <code>𝔽</code> and <code>𝔾</code>.
* Namespace: A block where any data member is exported using <code>⇐</code> assignment.
* Array: consists of any of the above.
** Regular array notation
<syntaxhighlight lang="bqn">⟨1, 2, 3⟩</syntaxhighlight>
You can nest arrays in arrays. Separators can be <code>,</code>, <code>⋄</code> and newline.
** Stranding
<syntaxhighlight lang="bqn">1‿2‿3</syntaxhighlight>
Any expression which doesn't fit in a single atom must be put in parentheses.
** Strings
<syntaxhighlight lang="bqn">"Hello World"
"Quoted "" String"</syntaxhighlight>
Any sequence of characters including newlines can be put inside a string. Quotes are escaped by typing two quotes.
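The quote-doubling rule for strings described above is simple enough to sketch in Python. The helper names <code>bqn_string_literal</code> and <code>bqn_string_value</code> are hypothetical and only model the escaping rule as stated here; they are not part of any BQN implementation.

```python
def bqn_string_literal(text):
    """Render text as a string literal by doubling embedded quotes."""
    return '"' + text.replace('"', '""') + '"'


def bqn_string_value(literal):
    """Recover the text from a literal written with doubled quotes."""
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("not a double-quoted literal")
    body = literal[1:-1]
    # After removing every doubled quote, no lone quote may remain.
    if '"' in body.replace('""', ''):
        raise ValueError("unescaped quote inside literal")
    return body.replace('""', '"')


literal = bqn_string_literal('Quoted " String')
print(literal)
print(bqn_string_value(literal))
```

Since a newline needs no escaping under this rule, text containing line breaks passes through both helpers unchanged.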
=={{header\|C++}}== <syntaxhighlight lang="c++"> #include <iostream> #include <string> #include <vector> int main() { // C++ uses double quotes for strings and single quotes for characters. std::string simple_string = "This is a simple string"; char letter = 'A'; std::cout << simple_string << " " << letter << std::endl; // C++ can implement multiline strings. std::string multiline_string = R"( An example of multi-line string. Text formatting is preserved. This is a raw string literal, introduced in C++ 11.)"; std::cout << multiline_string << std::endl; // C++'s primitive data types: bool, char, double, float, int, long, short, // can be used to to store data, for example, const int block_length = 64; std::cout << "block length = " << block_length << std::endl; // Vectors of these data types are also possible, for example, std::vector<double> state = { 1.0, 2.0, 3.0 }; } </syntaxhighlight> {{ out }} <pre> This is a simple string A An example of multi-line string. Text formatting is preserved. This is a raw string literal, introduced in C++ 11. block length = 64 </pre> =={{header\|Ecstasy}}== <syntaxhighlight lang="java"> module test { @Inject Console console; void run() { // characters are single quoted Char ch = 'x'; console.print( $"ch={ch.quoted()}"); // strings are double quoted String greeting = "Hello"; console.print( $"greeting={greeting.quoted()}"); // multi-line strings use '\|' as a left column // the start of the first line escapes the '\|' to indicate the start of the multiline // a trailing escape indicates that the current line continues without a linefeed String lines = \\|first line \|second line\ \| continued ; console.print($\|lines= \|{lines} ); // the $"..." is a template string, containing {expressions} // the multi-line form of the template string uses $\| String name = "Bob"; String msg = $\|{greeting} {name}, \|Have a nice day! 
\|{ch}{ch}{ch} ; console.print($\|msg= \|{msg} ); } } </syntaxhighlight> {{out}} <pre> ch='x' greeting="Hello" lines= first line second line continued msg= Hello Bob, Have a nice day! xxx </pre> =={{header\|FreeBASIC}}== {{trans\|Ring}} <syntaxhighlight lang="freebasic">'In FB there is no substr function, then 'Function taken fron the https://www.freebasic.net/forum/index.php Function substr(Byref soriginal As String, Byref spattern As Const String, Byref sreplacement As Const String) As String ' in <soriginal> replace all occurrences of <spattern> by <sreplacement> Dim As Uinteger p, q If sreplacement <> spattern Then p = Instr(soriginal, spattern) If p Then q = Len(sreplacement) If q = 0 Then q = 1 Do soriginal = Left(soriginal, p - 1) + sreplacement + Mid(soriginal, p + Len(spattern)) p = Instr(p + q, soriginal, spattern) Loop Until p = 0 End If End If Return soriginal End Function Dim As String text(1 To 3) text(1) = "This is 'first' example for quoting" text(2) = "This is second 'example' for quoting" text(3) = "This is third example 'for' quoting" For n As Integer = 1 To Ubound(text) Print !"text for quoting:\n"; text(n) Print !"quoted text:\n"; substr(text(n),"'",""); !"\n" Next n Sleep</syntaxhighlight> {{out}} <pre>Same as Ring input.</pre> =={{header\|Go}}== <~~lang~~syntaxhighlight lang="go">package main import ( Line 94 ⟶ 388: } fmt.Println(os.Expand("There are ${NUMBER} quoting ${TYPES} in Go.", mapper)) }</~~lang~~syntaxhighlight> {{out}} Line 111 ⟶ 405: There are 3 quoting constructs in Go. There are 3 quoting constructs in Go. 
</pre> =={{header\|J}}== J provides four mechanisms for inline data: * A sequence of numbers, for example <tt>1 2 3</tt> * A sequence of characters, for example <tt>'1 2 3'</tt> * A newline terminated multiline script, for example: <syntaxhighlight lang="j">0 :0 1 2 3 4 5 6 )</syntaxhighlight> * (in recent J versions), an embeddable newline terminated multiline script, for example: <syntaxhighlight lang="j">{{)n 1 2 3 4 5 6 }}</syntaxhighlight> Note that a multiline <code>{{)n</code> construct discards the leading newline, but the construct can also be used for single line strings (more concise than <code>'</code> delimited strings when <code>'</code> appears multiple times in the string). For example <code><nowiki>{{)n1 2 3}}</nowiki></code> is the same value as <code>'1 2 3'</code>. Sequences of numbers or characters which contain exactly one element are treated specially -- they do not have a length of their own. J also has a [https://www.jsoftware.com/help/dictionary/dcons.htm constant language] for numbers, which gives special significance to embedded letters. For example <tt>12e3</tt> is the floating point value 12000 (but J extends this notation to support some numbers in bases other than 10, extended precision integers, rational values and complex values and approximations involving certain commonly used constants, such as pi). The multiline scripts are special cases of the mechanisms for defining verbs, adverbs and conjunctions (what might be called functions or macros or operators or procedures in other languages) which instead provide the raw characters of the definition. The old form (beginning with <tt>0 : 0</tt> and ending with a line containing a single right parenthesis and no other displayable characters) is different from the new form (beginning with <tt>{{)n</tt> and ending with a line which has <tt>}}</tt> and no other characters preceding it) in the way that any following part of a surrounding sentence is arranged. 
These values of <tt>A</tt> would be equivalent: <syntaxhighlight lang="j">NB. no trailing linefeed A=: '1 2 3' NB. removing linefeed A=: 0 : 0-.LF 1 2 3 ) NB. removing linefeed A=: {{)n 1 2 3 }}-.LF</syntaxhighlight> Also, the <nowiki>{{}}</nowiki> forms are nestable. So, for example, this would also define an equivalent value for <tt>A</tt>: <syntaxhighlight lang="j">{{ {{ A=: {{)n 1 2 3 }}-.LF }}'' }}''</syntaxhighlight> Here, we are defining verbs inline and immediately evaluating them (by providing an argument (which is ignored because it is not referenced)). The use of an unbalanced right parenthesis as an escape character was inherited from APL. The double curly brace mechanism was a compromise between J's existing use of curly braces and visual conventions used in a variety of other languages. =={{header\|Java}}== <syntaxhighlight lang="java"> import java.util.List; public final class QuotingConstructs { public static void main(String[] args) { // Java uses double quotes for strings and single quotes for characters. String simple = "This is a simple string"; char letter = 'A'; // A Text Block is denoted by triple quotes. String multiLineString = """ This is an example of multi-line string. Text formatting is preserved. Text blocks can be used for a multi-line string. """; System.out.println(multiLineString); // Java's primitive data types: boolean, byte, char, double, float, int, long, short, // can be used to to store data, for example, final int blockLength = 64; // Arrays or lists of these data types are possible, for example, double[] state = new double[] { 1.0, 2.0, 3.0 }; // Custom data types can be stored in a record or a class, for example, record Circle(int centreX, int centreY, double radius) {} // A list of custom data types: List<Circle> circles = List.of( new Circle(1, 2, 1.25), new Circle(-2, 3, 2.50) ); } } </syntaxhighlight> {{ out }} <pre> This is an example of multi-line string. Text formatting is preserved. 
Text blocks can be used for a multi-line string.
</pre>

[...]

of such values. Such data can be included in a jq program wherever an expression is allowed, but in a jq program, consecutive JSON values must be specified using "," as a separator, as shown in this snippet:
<syntaxhighlight lang="jq">def data: "A string", 1, {"a":0}, [1,2,[3]] ;
</syntaxhighlight>
Long JSON strings can be broken up into smaller JSON strings and concatenated using the infix "+" operator, e.g.
<syntaxhighlight lang="jq">
"This is not such a " + "long string after all."</syntaxhighlight>

"Raw data", such as character strings that are not expressed as JSON strings, cannot be included in jq programs textually but must be "imported" in some manner, e.g. from environment variables, text files, or using command-line options.

=={{header|Julia}}==
[...]
* String literals are delimited by double quotes or triple double quotes:
<syntaxhighlight lang="julia">
julia> str = "Hello, world.\n"
"Hello, world.\n"
[...]
a newline"""
"Contains \"quote\" characters and \na newline"
</syntaxhighlight>
* Both single- and triple-quoted strings may contain interpolated values. Triple-quoted strings are also dedented to the level of the least-indented line. This is useful for defining strings within code that is indented. For example:
<syntaxhighlight lang="julia">
julia> str = """
Hello,
[...]
"""
" Hello,\n world.\n"
</syntaxhighlight>
* Julia allows interpolation into string literals using $:
<syntaxhighlight lang="julia">
julia> "$greet, $whom.\n"
"Hello, world.\n"
</syntaxhighlight>
* The shortest complete expression after the $ is taken as the expression whose value is to be interpolated into the string.
Thus, you can interpolate any expression into a string using parentheses:
<syntaxhighlight lang="julia">
julia> "1 + 2 = $(1 + 2)"
"1 + 2 = 3"
</syntaxhighlight>
* Julia reserves the single quote ' for character literals, not for strings:
<syntaxhighlight lang="julia">
julia> 'π'
'π': Unicode U+03C0 (category Ll: Letter, lowercase)
</syntaxhighlight>
* Julia requires commands sent to functions such as run() to be surrounded by backticks. Such expressions create a Cmd object, which is used for running a child process from Julia:
<syntaxhighlight lang="julia">
julia> mycommand = `echo hello`
`echo hello`
[...]
julia> run(mycommand);
hello
</syntaxhighlight>
* Julia uses the colon : in metaprogramming for quoting symbols and other code:
<syntaxhighlight lang="julia">
julia> a = :+
:+
[...]
julia> eval(c)
5
</syntaxhighlight>

=={{header|Lua}}==
Lua has three string definition syntaxes: single- and double-quotes, which are equivalent; and long-bracket pairs [[ ]] which may span multiple lines. Long-bracket pairs may be specified to an arbitrary level, which may be useful for quoting Lua source code itself (which might use long-brackets). Lua strings are variable-length arrays of bytes, not 0-terminated (as in C), so they may contain arbitrary raw binary data. Commonly escaped characters are supported, as are decimal (<code>\ddd</code>) and hexadecimal (<code>\xXX</code>) byte escapes.
<syntaxhighlight lang="lua">s1 = "This is a double-quoted 'string' with embedded single-quotes."
s2 = 'This is a single-quoted "string" with embedded double-quotes.'
s3 = "this is a double-quoted \"string\" with escaped double-quotes."
s4 = 'this is a single-quoted \'string\' with escaped single-quotes.'
s5 = [[This is a long-bracket "'string'" with embedded single- and double-quotes.]] s6 = [=[This is a level 1 long-bracket ]]string[[ with [[embedded]] long-brackets.]=] s7 = [==[This is a level 2 long-bracket ]=]string[=[ with [=[embedded]=] level 1 long-brackets, etc.]==] s8 = [[This is a long-bracket string with embedded line feeds]] s9 = "any \0 form \1 of \2 string \3 may \4 contain \5 raw \6 binary \7 data \xDB" print(s1) print(s2) print(s3) print(s4) print(s5) print(s6) print(s7) print(s8) print(s9) -- with audible "bell" from \7 if supported by os print("some raw binary:", #s9, s9:byte(5), s9:byte(12), s9:byte(17))</syntaxhighlight> {{out}} <pre>This is a double-quoted 'string' with embedded single-quotes. This is a single-quoted "string" with embedded double-quotes. this is a double-quoted "string" with escaped double-quotes. this is a single-quoted 'string' with escaped single-quotes. This is a long-bracket "'string'" with embedded single- and double-quotes. This is a level 1 long-bracket ]]string[[ with [[embedded]] long-brackets. This is a level 2 long-bracket ]=]string[=[ with [=[embedded]=] level 1 long-brackets, etc. This is a long-bracket string with embedded line feeds any form ☺ of ☻ string ♥ may ♦ contain ♣ raw ♠ binary data █ some raw binary: 64 0 1 2</pre> =={{header\|Nim}}== Line 215 ⟶ 640: Tuples literals are defined as a list of values between parentheses. Field names may be specified by preceding a value by the name followed by a colon.<br/> <syntaxhighlight lang="nim"> ~~<lang Nim>~~ echo "A simple string." echo "A simple string including tabulation special character \\t: \t." Line 253 ⟶ 678: # Tuples. echo ('a', 1, true) # Tuple without explicit field names. 
echo (x: 1, y: 2) # Tuple with two int fields "x" and "y".</~~lang~~syntaxhighlight> {{out}} Line 272 ⟶ 697: ('a', 1, true) (x: 1, y: 2)</pre> =={{header\|Perl}}== Please consult the [[https://www.rosettacode.org/wiki/Literals/String#Perl String#Perl]] page that covers almost everything on syntax. The following are just some random supplements that mainly focus on usages. <syntaxhighlight lang="perl" line># 20221202 Perl programming solution use strict; use warnings; print <<`EXEC` # superfluous alternative to qx/ / and ` ` sleep 2; ls /etc/resolv.conf EXEC ; # only with quoted begin tag then you can have spaces in between print <<END # so << 'END' or << "END" and semi-colon is always optional Make sure that the end tag must be exactly the same as the begin tag. END ; # the above wouldn't have worked had it been something like # END␣ ␣ ␣ (with redundant trailing spaces) print <<"HERE1", <<"HERE2" # it is also possible to stack heredocs Hello from HERE1 HERE1 Hello from HERE2 HERE2 ; my $haystack = 'Santa says HoHoHo'; # a quoted pattern expanded before my $needle = "\x48\x6F"; # the regex is interpreted print "1) Found.\n" if $haystack =~ /$needle{3}/; # Matches Hooo print "2) Found.\n" if $haystack =~ /($needle){3}/; # Do what you mean # due to autoconversion, things may still work the same { use Benchmark; # under (usually overlooked) scalar interpolation my ( $iterations, $x, $y ) = 1e7, rand, rand; timethese( $iterations, { 'normal' => ' $x + $y', 'useless' => '"$x" + "$y"' } ); } # however in the 2nd case the boxing and unboxing are unnecessary { # the following illustrate some behaviors under array interpolation my @Y_M_D = sub{$_[5]+1900,$_[4]+1,$_[3]}->(localtime(time)); local $\ = "\n"; print @Y_M_D; # YMD print "@Y_M_D"; # Y M D local $, = '-'; # output field separator print @Y_M_D; # Y-M-D print "@Y_M_D"; # Y M D local $" = '_'; # interpolated list separator print "@Y_M_D"; # Y_M_D }</syntaxhighlight> =={{header\|Phix}}== Line 281 ⟶ 755: Back-ticks and 
triple-quotes are used for multi-line strings, without backslash interpretation, eg
<!--<syntaxhighlight lang="phix">-->
 <span style="color: #008080;">constant</span> <span style="color: #000000;">t123</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">`
 one
 [...]
 three
 `</span>
<!--</syntaxhighlight>-->
or (entirely equivalent, except the following can contain back-ticks which the above cannot, and vice versa for triple quotes)
<!--<syntaxhighlight lang="phix">-->
 <span style="color: #008080;">constant</span> <span style="color: #000000;">t123</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">"""
 one
 [...]
 three
 """</span>
<!--</syntaxhighlight>-->
Both are also equivalent to the top double-quote one-liner. Note that a single leading '\n' is automatically stripped.<br>
[...]
You can also declare hexadecimal strings, eg
<!--<syntaxhighlight lang="phix">-->
 <span style="color: #000000;">x</span><span style="color: #008000;">"1 2 34 5678_AbC"</span> <span style="color: #000080;font-style:italic;">-- same as {0x01, 0x02, 0x34, 0x56, 0x78, 0xAB, 0x0C}
 -- note however it displays as {1,2,52,86,120,171,12}
 -- whereas x"414243" displays as "ABC" (as all chars)</span>
<!--</syntaxhighlight>-->
Literal [http://phix.x10.mx/docs/html/sequences.htm sequences] are represented with curly braces, and can be nested to any depth, eg
<syntaxhighlight lang="phix">{2, 3, 5, 7, 11, 13, 17, 19}
{1, 2, {3, 3, 3}, 4, {5, {6}}}
{{"John", "Smith"}, 52389, 97.25}
{} -- the 0-element sequence</syntaxhighlight>

=={{header|Quackery}}==
In Quackery, everything is code except when it is data. The objects it supports are operators ("primitives" or virtual op-codes), numbers (bigints) and nests (dynamic arrays that can contain operators, numbers, and nests, delimited by <code>[</code> and <code>]</code>).
The behaviour of a number is to put its value on the data stack - in essence, numbers are self-quoting. However, in common with operators and nests, they can be explicitly quoted by preceding them with <code>'</code> (pronounced "quote"). The behaviour of <code>'</code> is to put the Quackery object that follows it on the data stack rather than evaluate it, which is the default behaviour for all objects.

Any object can be given a name during compilation, using the word <code>is</code>, which is a "builder"; a word that is executed during compilation. The behaviour of <code>is</code> is to add the named object to the compiler's dictionary of named objects. So, for example, the number <code>12</code> could be declared as a constant called <code>dozen</code> with the Quackscript <code>12 is dozen</code>. However, it is preferable to restrict the use of <code>is</code> to naming nests (i.e. <code>[ 12 ] is dozen</code>) so that the Quackery decompiler <code>unbuild</code> can differentiate between the constant <code>dozen</code> and the literal <code>12</code>.

(Aside: The word <code>builds</code> is analogous to <code>is</code>, but creates new builders. The Quackery compiler is extensible, so new quoting constructs can be created as required.)

Other relevant builders include <code>hex</code>, which parses the whitespace delimited string following it as a hexadecimal number, <code>char</code>, which will embed a non-whitespace character (represented by a number), and <code>$</code>, which will embed a string (represented by a nest of characters).

The behaviour of <code>$</code> is to consider the first non-whitespace character following it as the string's delimiter, and treat everything following it as the string to be embedded until it encounters a second instance of the delimiter.
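The delimiter rule just described can be modelled with a few lines of Python. This is only a sketch of the rule as stated, not the Quackery implementation; the function name <code>read_dollar_string</code> is hypothetical, and its argument is assumed to be the source text that follows the <code>$</code>.

```python
def read_dollar_string(source):
    """Return (string, remaining_source) for the text following `$`.

    The first non-whitespace character is taken as the delimiter, and
    the string runs up to the next occurrence of that same character.
    """
    rest = source.lstrip()
    if not rest:
        raise ValueError("no delimiter found")
    delimiter = rest[0]
    end = rest.find(delimiter, 1)
    if end == -1:
        raise ValueError("string is not terminated")
    return rest[1:end], rest[end + 1:]


for text in (' "Hello World!"', " 'Hello World!'", " ZHello World!Z"):
    print(read_dollar_string(text)[0])
```

One consequence of the rule is visible in the model: the string cannot contain its own delimiter, so a different delimiter must be chosen when the text includes a <code>"</code>.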
Typically the delimiter for a string is <code>"</code>, but any non-whitespace character is acceptable, so the following are equivalent: <code>$ "Hello World!"</code>, <code>$ 'Hello World!'</code>, <code>$ ZHello World!Z</code>.

The builder <code>constant</code> evaluates the object that precedes it at runtime and embeds it as a literal during compilation, so <code>[ 10 18 ** ] constant</code> compiles as <code>1000000000000000000</code>.

Also relevant are look-up tables and ancillary stacks.

The word <code>table</code> is used to create look-up tables. As an example, the behaviour of the nest <code>[ table 2 3 5 7 11 13 17 19 23 29 ]</code> is to take a number, n, in the range <code>0</code> to <code>9</code> (inclusive) from the data stack and put the corresponding small prime number on the data stack. <code>table</code> is not limited to returning numbers; it will handle any Quackery object, so it can be considered a quoting construct.

The word <code>stack</code> is used to create ancillary stacks, which, like the data stack, are places to hold data. They are the Quackery analogue to variables, but can be pre-loaded with data much like tables. As an example, this stack is preloaded with two items; there is an empty nest <code>[ ]</code> on the top of the stack, and beneath that, the number <code>123</code>: <code>[ stack 123 [ ] ]</code>.

=={{header|Raku}}==
The Perl philosophy, which Raku thoroughly embraces, is that "There Is More Than One Way To Do It" (often abbreviated to TIMTOWTDI). Quoting constructs are an area where this is enthusiastically espoused.

Raku has a whole quoting-specific sub-language built in, called Q. Q changes the parsing rules inside the quoting structure and allows extremely fine control over how the enclosed data is parsed.
Every quoting construct in Raku is some form of a Q syntactic structure, using adverbs to fine-tune the desired behavior, though many of the most commonly used have some form of "shortcut" syntax for quick and easy use. Usually, when using an adverbial form, you may omit the Q: and just use the adverb.

Line 402 ⟶ 896:

The different types (or styles) of incorporating quoted constructs are largely a matter of style.
<syntaxhighlight lang="rexx">/*REXX program demonstrates various ways to express a string of characters or numbers.*/
a= 'This is one method of including a '' (an apostrophe) within a string.'
b= "This is one method of including a ' (an apostrophe) within a string."
Line 462 ⟶ 956:
/*the variable L (with an */
/*intervening blank between each */
/*variable's value. */</syntaxhighlight><br><br>

=={{header|Ring}}==
{{incomplete|Ring|<u>Explain</u> where they would likely be used, what their primary use is, what limitations they have and why one might be preferred over another. Is one style interpolating and another not? Are there restrictions on the size of the quoted data? The type? The format?}}
<syntaxhighlight lang="ring">
text = list(3)
Line 477 ⟶ 972:
see "quoted text:" + nl + str + nl + nl
next
</syntaxhighlight>
{{out}}
text for quoting: <br>

Line 497 ⟶ 992:

====Characters====
Character literals are prefixed by a "$". Conceptually, they are the elements of strings, although effectively only the codePoint is stored in strings. But when accessing a string element, instances of Character are used. Characters can be queried for properties such as being uppercase or lowercase.
<syntaxhighlight lang="smalltalk">$a
$Å
$日</syntaxhighlight>
====Strings====
String literals are enclosed in single quotes. Conceptually, they hold instances of Character as elements, but actually the underlying storage representation is chosen to be space-efficient.
Typically, underneath are classes like SingleByteString, TwoByteString and FourByteString, but this is transparent to the programmer. Strings can hold any Unicode character; UTF8 is only used when strings are exchanged with the external world (which is good, as it makes operations like stringLength much easier).
<syntaxhighlight lang="smalltalk">'hello'
'日本語'
</syntaxhighlight>
Traditional Smalltalk-80 does not support any escapes inside strings, which is occasionally inconvenient.<br>Smalltalk/X supports an extended syntax for C-like strings:
{{works with|Smalltalk/X}}
<syntaxhighlight lang="smalltalk">c'hello\nthis\tis a C string\0x0D'</syntaxhighlight>
and also embedded expressions:
<syntaxhighlight lang="smalltalk">e'hello world; it is now {Time now}\n'</syntaxhighlight>
====Arrays====
Literal arrays are written as #(...), where the elements are space-separated; each literal array element can again be any type of literal, optionally omitting the '#' character:
<syntaxhighlight lang="smalltalk">#( 1 1.234 (1/2) foo 'hello' #(9 8 7) (99 88 77) $λ '日本語' true [1 2 3] false)</syntaxhighlight>
Here, the third element is a fraction, followed by the symbol #'foo', a string, two arrays, a character, another string, the boolean true, a byteArray and the boolean false.
====ByteArrays====
A dense collection of byte-valued integers is written as #[..]. Conceptually, they are arrays of integer values in the range 0..255, but use only one byte per element of storage. They are typically used for bulk storage such as bitmap images, or when exchanging such data with external functions.
<syntaxhighlight lang="smalltalk">#[ 1 2 16rFF 2r0101010 ]</syntaxhighlight>
====Symbols====
These are like symbol atoms in Lisp/Scheme, written as #'...' (i.e. like a string with a hash prefix). If the symbol contains no special characters, or has the form allowed for a message selector, the quotes can be omitted.
Symbols are often used as keys in dictionaries, especially for message selectors and global/namespace name bindings. They can be quickly compared using "==", which is a pointer compare (identity), instead of "=", which compares the contents (equality).
<syntaxhighlight lang="smalltalk">#'foo'
#'foo bar baz'
#foo. " same as #'foo' "
#'++'
#++ " same as #'++' "
#a:b:c: " same as #'a:b:c:' "</syntaxhighlight>
====Blocks====

Line 535 ⟶ 1,030:

Blocks thus represent a piece of code which can be stored in an instance variable, passed as an argument or returned from a method. <br>Block syntax is very compact:
<syntaxhighlight lang="smalltalk">[ expression . expression ... expression ]</syntaxhighlight>
or for a block with arguments:
<syntaxhighlight lang="smalltalk">[:arg1 :arg2 :... :argN | expression . expression ... expression ]</syntaxhighlight>
Blocks are one of the fundamental building blocks of Smalltalk (no pun here), as the language (Compiler) does not specify any syntax for control structures. Control structures like if, while, etc. are all implemented as library functions, and defined e.g. in the Boolean, Block or Collection classes. <br> If you have a block at hand, it can be evaluated by sending it a "value" message:
<syntaxhighlight lang="smalltalk">aBlock value. "evaluate the block, passing no argument"
anotherBlock value:1 value:2. "evaluate the block, passing two arguments"</syntaxhighlight>
The most basic implementation of such a control structure is found in the Boolean subclasses True and False, which implement e.g. "<tt>ifTrue:arg</tt>" and "<tt>ifFalse:</tt>".
Here are those two as concrete examples:
<syntaxhighlight lang="smalltalk">in the True class:
ifTrue: aBlock
    ^ aBlock value "I am true, so I evaluate the block"
Line 550 ⟶ 1,045:
in the False class:
ifTrue: aBlock
    ^ nil "I am false, so I ignore the block"</syntaxhighlight>
Thus, the expression <tt>"someBoolean ifTrue:[ 'hello' print ]"</tt> will either evaluate the lambda or not, depending on the someBoolean receiver. Obviously, you can teach other objects how to respond to "value" messages and then use them as if they were blocks.

Line 561 ⟶ 1,056:

====Inline Object====
{{works with|Smalltalk/X}}
<syntaxhighlight lang="smalltalk">#{ foo: <someConstant> bar: <someConstant> }</syntaxhighlight>
Generates a literal constant instance of an anonymous class, with two instance vars: foo and bar. The object is dumb in that it only provides getter and setter functions. These are used e.g. when returning structured multiple values from a method.

Line 572 ⟶ 1,067:

Similar to byteArrays, there are dense arrays of ints, floats, doubles or bits (i.e. they use much less memory compared to regular arrays, which hold pointers to their elements). They are also perfect when calling out to C-language functions. The syntax is analogous to the Scheme language's syntax:
{{works with|Smalltalk/X}}
<syntaxhighlight lang="text">#u16( 1 2 3 ). " an array of unsigned int16s "
#u32( 1 2 3 ). " an array of unsigned int32s "
#u64( 1 2 3 ). " an array of unsigned int64s "
Line 582 ⟶ 1,077:
#f64( -1 2.0 3 ). " an array of float64s "
#b( 1 0 1 1 0 0 ). " an array of bits "
#B( true false true true ). " an array of booleans "</syntaxhighlight>

=={{header|Wren}}==

Line 603 ⟶ 1,098:

Here are some examples of all this.
<syntaxhighlight lang="wren">import "./fmt" for Fmt

// simple string literal
Line 632 ⟶ 1,127:
"""
System.print(r)
</syntaxhighlight>
{{out}}
Line 646 ⟶ 1,141:
Single (") or dual ("") double-quotes can be included without problem.
</pre>

=={{header|Z80 Assembly}}==
{{trans|6502 Assembly}}
Quoting constructs are very straightforward. The use of quotation marks tells the assembler that the data inside those quotation marks is to be assembled as ASCII values. A null terminator must go outside the quotation marks. There is no built-in support for control codes, as the assembler does not assume that any "putS" routine exists. Whatever functions you create that use these strings will have to be given the capability of handling control codes.

Unlike some languages, there is no limit on how long a string can be. A text string can span as many lines as you want it to, since the null terminator is the only way the CPU knows it will end (assuming that your "putS" routine uses a null terminator).

<syntaxhighlight lang="z80">MyString: byte "Hello World",0 ;a null-terminated string
LookupTable: byte &03,&06,&09,&0C ;a pre-defined sequence of bytes (similar in concept to enum in C)
TileGfx: incbin "Z:\game\gfx\tilemap.bmp" ;a file containing bitmap graphics data</syntaxhighlight>

For most Z80 assemblers, the following are standard:
* A value with no prefix is interpreted as a base 10 (decimal) number. $ or & represents hexadecimal and % represents binary. Single or double quotes represent ASCII.
* Multiple values can be put on the same line, separated by commas. <code>DB</code> only needs to be before the first data value on that line. Alternatively, each value can go on its own line. Both forms are valid and have the same end result when the code is assembled.

Z80 Assembly uses <code>db</code> or <code>byte</code> for 8-bit data and <code>dw</code> or <code>word</code> for 16-bit data. 16-bit values are written by the programmer in big-endian order, but are stored little-endian (low byte first).
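The byte order described above can be checked outside the assembler. The following is a minimal sketch in Python (not Z80 code, and not tied to any particular assembler). It assumes only what the text states, namely that a 16-bit data word is stored low byte first; the helper name <code>word_le</code> is made up for this illustration.

```python
import struct


def word_le(value: int) -> bytes:
    """Hypothetical helper: the two bytes a 16-bit data word occupies in memory."""
    # "<H" means little-endian unsigned 16-bit, i.e. the low byte comes first.
    return struct.pack("<H", value)


value = 0xABCD
low, high = value & 0xFF, value >> 8  # split the word into its two bytes by hand

stored = word_le(value)
assert stored == bytes([low, high])   # the low byte is stored first
print(stored.hex(" "))                # prints "cd ab"
```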
For example, the following two data blocks are equivalent. You can write it either way, but the end result is the same.
<syntaxhighlight lang="z80">word $ABCD
byte $CD,$AB</syntaxhighlight>

Most assemblers support "C-like" operators, and there are a few additional ones:
* < or LOW() means "The low byte of." For example, <$3456 evaluates to $56.
* > or HIGH() means "The high byte of." For example, >$78AB evaluates to $78.

These two operators are most frequently used with labeled memory addresses, like so:
<syntaxhighlight lang="z80">lookup_table_lo: byte <Table00,<Table01,<Table02
lookup_table_hi: byte >Table00,>Table01,>Table02</syntaxhighlight>

The <code>incbin</code> directive can be used for embedding raw graphics data, text, or music.

=={{header|zkl}}==

Line 665 ⟶ 1,194:

A #<<<" beginning tag prepends a " to the block. For example:
<syntaxhighlight lang="zkl">#<<<
text:=
"
A
";
#<<<</syntaxhighlight>
is parsed as text:="\nA\n";