Quoting constructs: Difference between revisions
Thundergnat (talk | contribs) m (syntax highlighting fixup automation) |
Line 11:
The following uses VASM syntax for quoting constructs. There is no built-in support for interpolation, escape characters, etc. What constitutes an escape character depends on the code that is using the embedded strings. A function that needs this embedded data can take a pointer to the data as an argument; the pointer is easily obtained by loading the low byte and high byte of the label into consecutive zero page RAM locations.
<syntaxhighlight lang="6502asm">
MyString: db "Hello World!",0 ;a null-terminated string
GraphicsData: incbin "C:\game\gfx\tilemap.chr" ;a file containing the game's graphics</syntaxhighlight>
Most 6502 assemblers use Motorola syntax, which uses the following conventions:
Line 23:
6502 Assembly uses <code>db</code> or <code>byte</code> for 8-bit data and <code>dw</code> or <code>word</code> for 16-bit data. 16-bit values are written by the programmer in big-endian, but stored little-endian. For example, the following two data blocks are equivalent. You can write it either way, but the end result is the same.
<syntaxhighlight lang="6502asm">dw $ABCD
db $CD,$AB</syntaxhighlight>
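The claim that a 16-bit value written as $ABCD is stored as the bytes $CD, $AB can be checked outside the assembler. The following Python sketch is only an illustration (Python is not part of any 6502 toolchain); it packs the value in little-endian order and compares the result with the two bytes written out by hand.

```python
import struct

# Pack the 16-bit value $ABCD the way a dw/word directive stores it on the 6502:
# low byte first, then high byte (little-endian).
stored = struct.pack("<H", 0xABCD)

# The same two bytes written out by hand, as a db/byte directive would emit them.
by_hand = bytes([0xCD, 0xAB])

print(stored == by_hand)  # True
print(stored.hex(" "))    # cd ab
```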
Most assemblers support "C-like" operators, and there are a few additional ones:
Line 31:
These two operators are most frequently used with labeled memory addresses, like so:
<syntaxhighlight lang="6502asm">lookup_table_lo:
byte <Table00,<Table01,<Table02
lookup_table_hi:
byte >Table00,>Table01,>Table02</syntaxhighlight>
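What the low-byte and high-byte operators compute can be modelled in a few lines of Python. This is a sketch under assumptions: the addresses below are hypothetical stand-ins for the labels Table00 to Table02, and the helper names <code>lo</code> and <code>hi</code> are invented for the illustration.

```python
def lo(addr: int) -> int:
    """Low byte of a 16-bit address (what the assembler's < operator yields)."""
    return addr & 0xFF

def hi(addr: int) -> int:
    """High byte of a 16-bit address (what the assembler's > operator yields)."""
    return (addr >> 8) & 0xFF

# Hypothetical addresses standing in for the labels Table00..Table02.
tables = [0x8000, 0x8123, 0x82FE]

lookup_table_lo = bytes(lo(a) for a in tables)
lookup_table_hi = bytes(hi(a) for a in tables)

# Recombine entry 1, as a routine would after loading both halves.
index = 1
address = lookup_table_lo[index] | (lookup_table_hi[index] << 8)
print(hex(address))  # 0x8123
```

Splitting the addresses into two byte tables lets an 8-bit index register select an entry from each table without any multiplication.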
Line 44:
* Multiple values can be put on the same line, separated by commas. <code>DC._</code> only needs to be before the first data value on that line. Or, you can put each value on its own line. Both are valid and have the same end result when the code is assembled. You should always follow byte data with <code>EVEN</code>, which adds an extra byte of padding if the total number of bytes before it was odd. This is necessary for your code to comply with the CPU's alignment rules.
<syntaxhighlight lang="68000devpac">
DC.B $01,$02,$03,$04,$05
even
Line 56:
MyString:
DC.B "Hello World!",0 ;a null terminator will not be automatically placed.
even</syntaxhighlight>
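The effect of <code>EVEN</code> on byte data can be modelled as follows. This Python sketch assumes the data starts at an even address and that the padding byte is zero; the helper name <code>pad_even</code> is hypothetical.

```python
def pad_even(data: bytes) -> bytes:
    """Append one zero byte when the length is odd, mimicking EVEN."""
    return data + b"\x00" if len(data) % 2 else data

five = pad_even(bytes([1, 2, 3, 4, 5]))     # odd length: one padding byte is added
text = pad_even(b"Hello World!" + b"\x00")  # 13 bytes including the terminator: padded to 14
four = pad_even(bytes([1, 2, 3, 4]))        # already even: unchanged

print(len(five), len(text), len(four))  # 6 14 4
```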
<code>DS._</code> reserves a block of space. The number after it specifies how many bytes/words/longs' worth of zeroes to place. Some assemblers support fill values other than zero; others do not.
<syntaxhighlight lang="68000devpac">
DS.W 16 ;16 words, each equals zero
DS.L 20 ;20 longs, each equals zero</syntaxhighlight>
In addition to constants, a label can also be specified. If a label is defined with an <code>EQU</code> statement, the label will be replaced with the assigned value during the assembly process.
<syntaxhighlight lang="68000devpac">
MOVE.W (MyData),D0
MyData
DC.W ScreenSize</syntaxhighlight>
Code labels, on the other hand, get replaced with the memory address they point to. This can be used to make a lookup table of various data areas, functions, etc. Since there is no "24-bit" data constant directive, you'll have to use <code>DC.L</code> for code labels. The top byte will always be zero in this case.
<syntaxhighlight lang="68000devpac">
;insert your code here
FunctionTable:
DC.L PrintString ;represents the address of the function "PrintString"</syntaxhighlight>
Most constants can be derived from compile-time expressions, which are useful for explaining what the data actually means. The expressions are evaluated during the assembly process, and the resulting object code will have these calculations already completed, so your program doesn't have to waste time doing them. Most "C-like" operators are supported, but as always the exact syntax depends on your assembler. Parentheses will aid the assembler in getting these correct, but sometimes it still doesn't do what you expect.
<syntaxhighlight lang="68000devpac">
DC.W 4+5 ;evaluates to $0009
DC.W (40*30)-1 ;evaluates to 1199 ($04AF)
DC.L MyFunction+4 ;evaluates to the address of MyFunction, plus 4.</syntaxhighlight>
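The arithmetic the assembler performs for such expressions can be reproduced by hand. The Python sketch below is illustrative only; the address assigned to <code>MY_FUNCTION</code> is a hypothetical value, since the real one depends on where the assembler places the label.

```python
# Constant expressions of the kind an assembler folds before the program runs.
nine = 4 + 5
last_index = (40 * 30) - 1

# Hypothetical address for the label MyFunction, chosen only for this sketch.
MY_FUNCTION = 0x001000
entry_plus_4 = MY_FUNCTION + 4

print(f"${nine:04X}")          # $0009
print(f"${last_index:04X}")    # $04AF (decimal 1199)
print(f"${entry_plus_4:08X}")  # $00001004
```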
We can use this technique to get the length of a region of data, which the assembler can calculate for us.
<syntaxhighlight lang="68000devpac">
DC.B $11,$11,$11,$11,$11,$11,$11,$11,$11,$11
DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01
Line 108:
;gets the length of this region of memory, minus 1, into D0.
; Again, even though the "operands" of this expression are longs,
; their difference fits in 16 bits and that's all that matters.</syntaxhighlight>
Line 123:
=== Quoted constructs within expressions ===
<syntaxhighlight lang="text">? 0 : ? -326.12E-5 : ? HELLO : ? "HELLO" : ? "HELLO</syntaxhighlight>
The literal HELLO is interpreted as a variable name, and its value 0 is printed.
{{out}}
Line 134:
</pre>
=== Quoted constructs within DATA statements ===
<syntaxhighlight lang="text"> 10 DATA 0,-326.12E-5,HELLO,"HELLO","HELLO
20 READ A%: PRINT A%: READ A: PRINT A: READ A$: PRINT A$: READ A$: PRINT A$: READ A$: PRINT A$
30 DATA AB"C
40 READ A$: PRINT A$</syntaxhighlight>
{{out}}
<pre>
Line 152:
* Character
<syntaxhighlight lang="bqn">
'b'
@</syntaxhighlight>
<code>@</code> is a symbol that represents the null character. Characters can contain a literal newline (<code>@+10</code> is recommended, however).
* Number
<syntaxhighlight lang="bqn">
1.23
123E5
¯1234
∞
π</syntaxhighlight>
<code>∞</code>, <code>¯∞</code> and <code>π</code> are constants which represent infinity, negative infinity and pi.
Line 175:
* Array: consists of any of the above.
** Regular array notation <syntaxhighlight lang
** Stranding <syntaxhighlight lang
** Strings <syntaxhighlight lang="bqn">
"Quoted "" String"</syntaxhighlight>
=={{header|FreeBASIC}}==
{{trans|Ring}}
<syntaxhighlight lang="freebasic">
'Function taken from the https://www.freebasic.net/forum/index.php
Line 212:
Print !"quoted text:\n"; substr(text(n),"'",""); !"\n"
Next n
Sleep</syntaxhighlight>
{{out}}
<pre>Same as Ring input.</pre>
=={{header|Go}}==
<syntaxhighlight lang="go">
import (
Line 294:
}
fmt.Println(os.Expand("There are ${NUMBER} quoting ${TYPES} in Go.", mapper))
}</syntaxhighlight>
{{out}}
Line 319:
* A sequence of numbers, for example <tt>1 2 3</tt>
* A sequence of characters, for example <tt>'1 2 3'</tt>
* A newline terminated multiline script, for example: <syntaxhighlight lang="j">0 : 0
1 2 3
4 5 6
)</syntaxhighlight>
* (in recent J versions), an embeddable newline terminated multiline script, for example: <syntaxhighlight lang="j">{{)n
1 2 3
4 5 6
}}</syntaxhighlight>
Note that a multiline <code>{{)n</code> construct discards the leading newline, but the construct can also be used for single line strings (more concise than <code>'</code> delimited strings when <code>'</code> appears multiple times in the string). For example <code><nowiki>{{)n1 2 3}}</nowiki></code> is the same value as <code>'1 2 3'</code>.
Line 336:
The multiline scripts are special cases of the mechanisms for defining verbs, adverbs and conjunctions (what might be called functions or macros or operators or procedures in other languages) which instead provide the raw characters of the definition. The old form (beginning with <tt>0 : 0</tt> and ending with a line containing a single right parenthesis and no other displayable characters) is different from the new form (beginning with <tt>{{)n</tt> and ending with a line which has <tt>}}</tt> and no other characters preceding it) in the way that any following part of a surrounding sentence is arranged. These values of <tt>A</tt> would be equivalent:
<syntaxhighlight lang="j">
A=: '1 2 3'
Line 347:
A=: {{)n
1 2 3
}}-.LF</syntaxhighlight>
Also, the <nowiki>{{}}</nowiki> forms are nestable. So, for example, this would also define an equivalent value for <tt>A</tt>:
<syntaxhighlight lang="j">{{
{{
A=: {{)n
Line 357:
}}-.LF
}}''
}}''</syntaxhighlight>
Here, we are defining verbs inline and immediately evaluating them (by providing an argument (which is ignored because it is not referenced)).
Line 367:
of such values. Such data can be included in a jq program wherever an expression is allowed, but
in a jq program, consecutive JSON values must be specified using "," as a separator, as shown in this snippet:
<syntaxhighlight lang="jq">
"A string", 1, {"a":0}, [1,2,[3]]
;
</syntaxhighlight>
using the infix "+" operator, e.g. <syntaxhighlight lang="jq">
"This is not such a"
+ "long string after all."</syntaxhighlight>
"Raw data", such as character strings that are not expressed as JSON strings,
cannot be included in jq programs textually but must be "imported" in some manner, e.g. from
Line 386:
* String literals are delimited by double quotes or triple double quotes:
<syntaxhighlight lang="julia">
julia> str = "Hello, world.\n"
"Hello, world.\n"
Line 393:
a newline"""
"Contains \"quote\" characters and \na newline"
</syntaxhighlight>
* Both single- and triple-quoted strings may contain interpolated values. Triple-quoted strings are also dedented to the level of the least-indented line. This is useful for defining strings within code that is indented. For example:
<syntaxhighlight lang="julia">
julia> str = """
Hello,
Line 402:
"""
" Hello,\n world.\n"
</syntaxhighlight>
* Julia allows interpolation into string literals using $:
<syntaxhighlight lang="julia">
julia> "$greet, $whom.\n"
"Hello, world.\n"
</syntaxhighlight>
* The shortest complete expression after the $ is taken as the expression whose value is to be interpolated into the string. Thus, you can interpolate any expression into a string using parentheses:
<syntaxhighlight lang="julia">
julia> "1 + 2 = $(1 + 2)"
"1 + 2 = 3"
</syntaxhighlight>
* Julia reserves the single quote ' for character literals, not for strings:
<syntaxhighlight lang="julia">
julia> 'π'
'π': Unicode U+03C0 (category Ll: Letter, lowercase)
</syntaxhighlight>
* Julia requires commands sent to functions such as run() be surrounded by backticks. Such expressions create a Cmd object, which is used for running a child process from Julia:
<syntaxhighlight lang="julia">
julia> mycommand = `echo hello`
`echo hello`
Line 428:
julia> run(mycommand);
hello
</syntaxhighlight>
* Julia uses the colon : in metaprogramming for quoting symbols and other code:
<syntaxhighlight lang="julia">
julia> a = :+
:+
Line 454:
julia> eval(c)
5
</syntaxhighlight>
=={{header|Lua}}==
Lua has three string definition syntaxes: single- and double-quotes, which are equivalent; and long-bracket pairs [[ ]] which may span multiple lines. Long-bracket pairs may be specified to an arbitrary depth, which may be useful for quoting Lua source code itself (which might use long-brackets). Lua strings are variable-length arrays of bytes, not 0-terminated (as in C), so they may contain arbitrary raw binary data. Commonly escaped characters and decimal/hexadecimal notation are supported.
<syntaxhighlight lang="lua">
s2 = 'This is a single-quoted "string" with embedded double-quotes.'
s3 = "this is a double-quoted \"string\" with escaped double-quotes."
Line 479:
print(s8)
print(s9) -- with audible "bell" from \7 if supported by os
print("some raw binary:", #s9, s9:byte(5), s9:byte(12), s9:byte(17))</syntaxhighlight>
{{out}}
<pre>This is a double-quoted 'string' with embedded single-quotes.
Line 504:
Tuple literals are defined as a list of values between parentheses. Field names may be specified by preceding a value with the name followed by a colon.<br/>
<syntaxhighlight lang="nim">
echo "A simple string."
echo "A simple string including tabulation special character \\t: \t."
Line 542:
# Tuples.
echo ('a', 1, true) # Tuple without explicit field names.
echo (x: 1, y: 2) # Tuple with two int fields "x" and "y".</syntaxhighlight>
{{out}}
Line 570:
Back-ticks and triple-quotes are used for multi-line strings, without backslash interpretation, eg
<!--<
<span style="color: #008080;">constant</span> <span style="color: #000000;">t123</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">`
one
Line 576:
three
`</span>
<!--</
or (entirely equivalent, except the following can contain back-ticks which the above cannot, and vice versa for triple quotes)
<!--<
<span style="color: #008080;">constant</span> <span style="color: #000000;">t123</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">"""
one
Line 586:
three
"""</span>
<!--</
Both are also equivalent to the top double-quote one-liner. Note that a single leading '\n' is automatically stripped.<br>
Line 593:
You can also declare hexadecimal strings, eg
<!--<
<span style="color: #000000;">x</span><span style="color: #008000;">"1 2 34 5678_AbC"</span> <span style="color: #000080;font-style:italic;">-- same as {0x01, 0x02, 0x34, 0x56, 0x78, 0xAB, 0x0C}
-- note however it displays as {1,2,52,86,120,171,12}
-- whereas x"414243" displays as "ABC" (as all chars)</span>
<!--</
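The grouping shown in the comments above can be reproduced with a short Python sketch. The rule it applies (split on spaces and underscores, then read each group two hex digits at a time, leaving a lone final digit as its own value) is an assumption inferred from this single example, not a statement of Phix's actual parser; the function name <code>hex_string_bytes</code> is hypothetical.

```python
import re

def hex_string_bytes(literal: str) -> list[int]:
    """Split on spaces and underscores, then read each group two hex digits at a time."""
    values = []
    for group in re.split(r"[ _]+", literal.strip()):
        for i in range(0, len(group), 2):
            values.append(int(group[i:i + 2], 16))
    return values

print(hex_string_bytes("1 2 34 5678_AbC"))                # [1, 2, 52, 86, 120, 171, 12]
print(bytes(hex_string_bytes("414243")).decode("ascii"))  # ABC
```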
Literal [http://phix.x10.mx/docs/html/sequences.htm sequences] are represented with curly braces, and can be nested to any depth, eg
<syntaxhighlight lang="phix">
{1, 2, {3, 3, 3}, 4, {5, {6}}}
{{"John", "Smith"}, 52389, 97.25}
{} -- the 0-element sequence</syntaxhighlight>
=={{header|Raku}}==
Line 691:
The different types (or styles) of incorporating quoted constructs are largely a matter of style.
<syntaxhighlight lang="rexx">
a= 'This is one method of including a '' (an apostrophe) within a string.'
b= "This is one method of including a ' (an apostrophe) within a string."
Line 751:
/*the variable L (with an */
/*intervening blank between each */
/*variable's value. */</syntaxhighlight>
=={{header|Ring}}==
{{incomplete|Ring|<u>Explain</u> where they would likely be used, what their primary use is, what limitations they have and why one might be preferred over another. Is one style interpolating and another not? Are there restrictions on the size of the quoted data? The type? The format?}}
<syntaxhighlight lang="ring">
text = list(3)
Line 767:
see "quoted text:" + nl + str + nl + nl
next
</syntaxhighlight>
{{out}}
text for quoting: <br>
Line 787:
====Characters====
Character literals are prefixed by a "$". Conceptually, they are the elements of strings, although effectively only the code point is stored in strings. When a string element is accessed, however, an instance of Character is returned. Characters can be asked whether they are uppercase, lowercase, etc.
<syntaxhighlight lang="smalltalk">
$Å
$日</syntaxhighlight>
====Strings====
String literals are enclosed in single quotes. Conceptually, they hold instances of Character as elements, but the underlying storage representation is chosen to be space efficient. Typically, the underlying classes are SingleByteString, TwoByteString and FourByteString, but this is transparent to the programmer. Strings can hold any Unicode character; UTF8 is only used when strings are exchanged with the external world (which is good, as it makes operations like stringLength much easier).
<
'日本語 </
Traditional Smalltalk-80 does not support any escapes inside strings, which is inconvenient, occasionally.<br>Smalltalk/X supports an extended syntax for C-like strings:
{{works with|Smalltalk/X}}
<
and also embedded expressions:
<
====Arrays====
Literal arrays are written as #(...), where the elements are space separated; each literal array element can be any type of literal again, optionally omitting the '#'-character:
<
Here, the third element is a fraction, followed by the symbol #'foo', two arrays, a character, another string, the boolean true, a byteArray and the boolean false.
====ByteArrays====
A dense collection of byte-valued integers is written as #[..]. Conceptually, they are arrays of integer values in the range 0..255, but use only one byte of storage per element. They are typically used for bulk storage such as bitmap images, or when exchanging such data with external functions.
<
====Symbols====
These are like symbol atoms in Lisp/Scheme, written as #'...' (i.e. like a string with hash prefix).
If the characters do not contain special characters or are of the form allowed for a message selector, the quotes can be omitted. Symbols are often used as keys in dictionaries, especially for message selectors and global/namespace name bindings. They can be quickly compared using "==", which is a pointer compare (identity), instead of "=", which compares the contents (equality).
<syntaxhighlight lang="smalltalk">
#'foo bar baz'
#foo. " same as #'foo' "
#'++'
#++ " same as #'++' "
#a:b:c: " same as #'a:b:c:' "</syntaxhighlight>
====Blocks====
Line 825:
Blocks thus represent a piece of code which can be stored in an instance variable, passed as argument or returned from a method.
<br>Block syntax is very compact:
<
or for a block with arguments:
<
Blocks are one of the fundamental building blocks of Smalltalk (no pun here), as the language (Compiler) does not specify any syntax for control structures. Control structures like if, while, etc. are all implemented as library functions, and defined eg. in the Boolean, Block or Collection classes.
<br>
If you have a block at hand, it can be evaluated by sending it a "value" message:
<syntaxhighlight lang="smalltalk">
anotherBlock value:1 value:2. "evaluate the block, passing two arguments"</syntaxhighlight>
The most basic implementation of such a control structure is found in the Boolean subclasses True and False, which implement eg. "<tt>ifTrue:arg</tt>" and "<tt>ifFalse:</tt>". Here is the first of the two as a concrete example:
<syntaxhighlight lang="smalltalk">
ifTrue: aBlock
^ aBlock value "I am true, so I evaluate the block"
Line 840:
in the False class:
ifTrue: aBlock
^ nil "I am false, so I ignore the block"</syntaxhighlight>
Thus, the expression <tt>"someBoolean ifTrue:[ 'hello' print ]"</tt> will either evaluate the lambda or not, depending on the someBoolean receiver.
Obviously, you can teach other objects how to respond to "value" messages and then use them as if they were blocks.
Line 851:
====Inline Object====
{{works with|Smalltalk/X}}
<
foo: <someConstant>
bar: <someConstant>
}</
Generates a literal constant instance of an anonymous class, with two instance vars: foo and bar.
The object is dumb in that it only provides getter and setter functions. These are used eg. when returning structured multiple values from a method.
Line 862:
Similar to byteArrays, there are dense arrays of ints, floats, doubles or bits (i.e. they use much less memory compared to regular arrays, which hold pointers to their elements). They are also perfect when calling out to C-language functions. The syntax is analogous to the Scheme language's syntax:
{{works with|Smalltalk/X}}
<syntaxhighlight lang="text">#u16( 1 2 3 ). " an array of unsigned int16s "
#u32( 1 2 3 ). " an array of unsigned int32s "
#u64( 1 2 3 ). " an array of unsigned int64s "
Line 872:
#f64( -1 2.0 3 ). " an array of float64s "
#b( 1 0 1 1 0 0 ). " an array of bits "
#B( true false true true ). " an array of booleans "</syntaxhighlight>
=={{header|Wren}}==
Line 893:
Here are some examples of all this.
<syntaxhighlight lang="wren">
// simple string literal
Line 922:
"""
System.print(r)
</syntaxhighlight>
{{out}}
Line 944:
the null terminator is the only way the CPU knows it will end (assuming that your "putS" routine uses a null terminator.)
<syntaxhighlight lang="z80">
byte "Hello World",0 ;a null-terminated string
LookupTable:
byte &03,&06,&09,&0C ;a pre-defined sequence of bytes (similar in concept to enum in C)
TileGfx:
incbin "Z:\game\gfx\tilemap.bmp" ;a file containing bitmap graphics data</syntaxhighlight>
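Why the terminator matters can be seen in a small model of a string-printing routine. The Python sketch below is hypothetical: <code>put_s</code> stands in for the "putS" routine mentioned above, and the <code>memory</code> layout simply places the lookup table bytes directly after the string, as in the listing.

```python
def put_s(memory: bytes, start: int) -> str:
    """Collect characters from memory[start] until a zero byte is reached."""
    chars = []
    i = start
    while memory[i] != 0:  # without the terminator, this loop would read into the next data
        chars.append(chr(memory[i]))
        i += 1
    return "".join(chars)

# The string, its terminator, then the lookup table bytes that follow it in memory.
memory = b"Hello World" + b"\x00" + bytes([0x03, 0x06, 0x09, 0x0C])
print(put_s(memory, 0))  # Hello World
```

If the zero byte were missing, the loop would keep going and treat the table bytes as characters.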
For most Z80 assemblers, the following are standard:
Line 956:
Z80 Assembly uses db or byte for 8-bit data and dw or word for 16-bit data. 16-bit values are written by the programmer in big-endian, but stored little-endian. For example, the following two data blocks are equivalent. You can write it either way, but the end result is the same.
<syntaxhighlight lang="z80">word $ABCD
byte $CD,$AB</syntaxhighlight>
Most assemblers support "C-like" operators, and there are a few additional ones:
Line 964:
* > or HIGH() means "The high byte of." For example, >$78AB evaluates to $78.
These two operators are most frequently used with labeled memory addresses, like so:
<syntaxhighlight lang="z80">lookup_table_lo:
byte <Table00,<Table01,<Table02
lookup_table_hi:
byte >Table00,>Table01,>Table02</syntaxhighlight>
The <code>incbin</code> directive can be used for embedding raw graphics data, text, or music.
Line 989:
A #<<<" beginning tag prepends a " to the block.
For example:
<syntaxhighlight lang="zkl">#<<<
text:=
"
A
";
#<<<</syntaxhighlight>
is parsed as text:="\nA\n";