Talk:S-expressions

From Rosetta Code
Revision as of 20:13, 17 October 2011 by rosettacode>Paddy3118 (→‎So... what is the point here?: nested lists of string/float/int)

Symbols and strings

To be more generally useful, it's probably better to distinguish between quoted and unquoted strings instead of giving numbers special treatment. 0x1, 1d0, 13#4bc, 1.3f, 1_000 may or may not be parsed as numbers depending on what the definition of literal numbers is, and can be deferred to a separate step -- as long as the parse remembers that they are not quoted. On the other hand, it's more likely than not that "data" and data mean completely different things, so the parser better remember that information instead of making it optional. --Ledrug 10:48, 16 October 2011 (UTC)

you are of course right, i just didn't want to make the task too hard. in languages that don't support symbols, an object would need to be created, if that can be done. otherwise, how can a symbol be represented?
That's the task implementor's problem, isn't it? What do you think the S stands for in S-expression? String expression? I think not. To call symbols unquoted strings is beyond laughable. A correct implementation of this task replaces A with pointers to the same object in (A A A). There is no S-expression without interning. --24.85.131.247 04:51, 17 October 2011 (UTC)
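For readers unfamiliar with the term, interning as described above could be sketched in Python roughly as follows. This is only an illustration: the Symbol class and its _table attribute are hypothetical names, not taken from the task description or from any solution on the page.

```python
class Symbol:
    """Hypothetical interned symbol: one shared object per distinct name."""

    _table = {}  # maps a name to the single Symbol object for that name

    def __new__(cls, name):
        try:
            return cls._table[name]
        except KeyError:
            sym = super().__new__(cls)
            sym.name = name
            cls._table[name] = sym
            return sym

    def __repr__(self):
        return self.name


# Reading (A A A) would then yield three references to one object.
syms = [Symbol(tok) for tok in "A A A".split()]
print(syms[0] is syms[1] is syms[2])
```

With this arrangement, comparing two symbols is an identity check rather than a character-by-character string comparison.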
that is not even true in lisp: in common lisp (A A A), the first A is a function and the other two A are a variable.
s-expressions are just data. the interpretation of the meaning of atoms in an s-expression is entirely up to the application. in this application they are strings.--eMBee 05:37, 17 October 2011 (UTC)
whether it is useful to distinguish between quoted and unquoted strings also depends on what is done with the input. unless you are writing an interpreter of sorts, the input is just data. and if the language can only handle strings as data, then what good is it to have a special representation for unquoted strings?
but if anyone wants to distinguish between quoted and unquoted strings and skip numbers instead, they are free to do so--eMBee 12:00, 16 October 2011 (UTC)
Well, without context, it's not like you can actually expect the code to do something useful to the symbols anyway. All the parser needs to do is distinguish between "123" and 123, "data" and data, just stick an is_quoted flag somewhere on the strings. If your usage later needs to tell symbols from strings, look that flag up; if not, it does no harm. For numbers, just assume you can check the unquoted strings and see if they match some patterns later. It's probably simpler this way and more language neutral (parsing numbers is likely language dependent). --Ledrug 13:37, 16 October 2011 (UTC)
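The flag-based approach suggested here might look like the following in Python. It is a minimal sketch under several assumptions: the names Atom, parse and looks_like_number are hypothetical, backslash escapes inside quoted strings are matched but not decoded, and "number" is taken to mean whatever Python's float() accepts, which is only one of many possible definitions.

```python
import re
from typing import NamedTuple


class Atom(NamedTuple):
    """Hypothetical token record: the text plus whether it was quoted."""
    text: str
    is_quoted: bool


# One token: a parenthesis, a double-quoted string, or a bare word.
TOKEN = re.compile(r'\s*(?:([()])|"((?:[^"\\]|\\.)*)"|([^\s()"]+))')


def parse(src):
    """Return the list of top-level expressions found in src."""
    src = src.rstrip()
    stack = [[]]
    pos = 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise ValueError(f"bad input at offset {pos}")
        pos = m.end()
        paren, quoted, bare = m.groups()
        if paren == "(":
            stack.append([])
        elif paren == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        elif quoted is not None:
            stack[-1].append(Atom(quoted, True))
        else:
            stack[-1].append(Atom(bare, False))
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


def looks_like_number(atom):
    """Deferred check: only unquoted atoms can count as numbers."""
    if atom.is_quoted:
        return False
    try:
        float(atom.text)
    except ValueError:
        return False
    return True


tree = parse('(data "data" 123 "123")')
```

The parser itself never decides what a number is; a caller that cares can apply looks_like_number (or a stricter pattern) afterwards, and a caller that does not can ignore the flag.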
that is not as easy as it sounds, not every language can associate flags with strings without creating a new class, in which case it usually isn't a string anymore.
take a look at the pike example, now that i introduced the Symbol class without handling numbers in the parser, all but quoted strings become Symbols and i find myself having to emulate not only strings but numbers (still incomplete) as well. if i would parse numbers upfront i could store them as such and the Symbol class would be simpler. i will still have to deal with strings in the Symbol class, but with numbers out of the way i could require explicit casting to use Symbols as strings. as long as Symbols can contain numbers the Symbol class has to tell if it is a string, int or float and behave accordingly, because if i have to check what the type of the symbol is before i can use it that would just be too cumbersome.--eMBee 14:03, 16 October 2011 (UTC)
i could of course stick all tokens into a token class and handle the conversion at a later step. but then in order to use the input that conversion step is mandatory and i just end up with more complicated code for something that was supposed to be simple.--eMBee 14:15, 16 October 2011 (UTC)

So... what is the point here?

Is the point here to represent data in a way which is natural to the language (thus, for example, allowing the language to throw errors for unquoted character sequences which have been reserved), or is it to emulate another system? And are we producing a result which displays pleasantly, or do we want type annotations? And if we want type annotations, what types do we support and when do we use them? (In my experience, S expressions are simple only if you ignore most of the details of how they are implemented, and the task description seems ambivalent about where to draw the line. That said, any concept is simple once you understand it, but here I am focusing on the task description and not on the abstract concepts.) --Rdm 17:34, 17 October 2011 (UTC)

It seemed straightforward for Python as I was given sample input and a specific example of what the intermediate Python data structure has to be. (Nested lists with ints as ints, floats as floats, strings for the rest). Maybe it needs to be emphasized for other languages too? --Paddy3118 19:16, 17 October 2011 (UTC)
So am I supposed to emulate python's data language (and, if so, where is the specification for that)? Or am I supposed to use native types (which do not precisely match the word syntax being asked for, but would allow support for things like symbols, rational numbers, complex numbers and representations of functions)? Anyways, for now, I am not implementing any data language, since none was specified. --Rdm 19:22, 17 October 2011 (UTC)
If your language has native support for nested lists of floats, ints and strings, then wouldn't that be enough to emulate the python? (Or nested lists of a variant type that could hold a string/float/int)? --Paddy3118 20:13, 17 October 2011 (UTC)
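As a rough illustration of the "nested lists of string/float/int" reading, here is a short Python sketch. It is not the solution posted on the task page; the names atom and parse_sexp are hypothetical, escapes inside quoted strings are not handled, and malformed input such as an unterminated quote is not reported. Note that under this reading "123" and data both end up as plain strings, so the quoted/unquoted distinction discussed in the first section is lost.

```python
import re


def atom(token):
    """Convert one token: quoted -> str, then try int, then float, else str."""
    if token.startswith('"'):
        return token[1:-1]
    for convert in (int, float):
        try:
            return convert(token)
        except ValueError:
            pass
    return token


def parse_sexp(src):
    """Return the top-level expressions in src as nested Python lists."""
    tokens = re.findall(r'\(|\)|"[^"]*"|[^\s()"]+', src)
    stack = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(atom(tok))
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


sample = '((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))'
result = parse_sexp(sample)[0]
print(result)
```

A language with a variant type that can hold a string, a float or an int could mirror the same structure, with the variant taking the place of Python's dynamically typed list elements.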