Talk:S-expressions

Symbols and strings

To be more generally useful, it's probably better to distinguish between quoted and unquoted strings instead of giving numbers special treatment. 0x1, 1d0, 13#4bc, 1.3f, 1_000 may or may not be parsed as numbers depending on what the definition of literal numbers is, and can be deferred to a separate step -- as long as the parse remembers that they are not quoted. On the other hand, it's more likely than not that "data" and data mean completely different things, so the parser better remember that information instead of making it optional. --Ledrug 10:48, 16 October 2011 (UTC)

you are of course right, i just didn't want to make the task to hard. in languages that don't support symbols, an object would need to be created, if that can be done. otherwise, how can a symbol be represented?

That's the task implementor's problem, isn't it? What do you think the S stands for in S-expression? String expression? I think not. To call symbols unquoted strings is beyond laughable. A correct implementation of this task replaces A with pointers to the same object in (A A A). There is no S-expression without interning.24.85.131.247 04:51, 17 October 2011 (UTC)

that is not even true in lisp: in common lisp (A A A), the first A is a function and the other two A are a variable.

s-expressions are just data. the interpretation of the meaning of atoms in an s-expression is entirely up to the application. in this application they are strings.--eMBee 05:37, 17 October 2011 (UTC)

whether it is useful to distinguish between quoted and unquoted strings also depends on what is done with the input. unless you age writing an interpreter of sorts, the input is just data. and if the language can only handle strings as data, then what good is it to have a special representation for unquoted strings?

but if anyone wants to distinguish between quoted and unquoted strings and skip numbers instead, they are free to do so--eMBee 12:00, 16 October 2011 (UTC)

Well, without context, it's not like you can actually expect the code to do something useful to the symbols anyway. All the parser needs to do is distinguish between "123" and 123, "data" and data, just stick a is_quoted flag somewhere on the strings. If your usage later needs to tell symbols from strings, look that flag up; if not, it does no harm. For numbers, just assume you can check the unquoted strings and see if they match some patterns later. It's probably simpler this way and more language neutral (parsing numbers is likely language dependent). --Ledrug 13:37, 16 October 2011 (UTC)

that is not as easy as it sounds, not every language can associate flags with strings without creating a new class, in which case it usually isn't a string anymore.

take a look at the pike example, now that i introduced the Symbol class without handling numbers in the parser, all but quoted strings become Symbols and i find myself having to emulate not only strings but numbers (still incomplete) as well. if i would parse numbers upfront i could store them as such and the Symbol class would be simpler. i will still have to deal with strings in the Symbol class, but with numbers out of the way i could require explicit casting to use Symbols as strings. as long as Symbols can contain numbers the Symbol class has to tell if it is a string, int or float and behave accordingly, because if i have to check what the type of the symbol is before i can use it that would just be to cumbersome.--eMBee 14:03, 16 October 2011 (UTC)

i could of course stick all tokens into a token class and handle the conversion at a later step. but then in order to use the input that conversion step is mandatory and i just end up with more complicated code for something that was supposed to be simple.--eMBee 14:15, 16 October 2011 (UTC)