Talk:Compiler/Simple file inclusion pre processor

What is the task?

Currently, there's no specific feature set required of a preprocessor. So how would we distinguish between an adequate and inadequate implementation? --Rdm (talk) 15:43, 22 July 2022 (UTC)

Good point - I've split the task description (which I hope isn't too TLDR) into sections, including one on the minimum requirements. --Tigerofdarkness (talk) 19:51, 22 July 2022 (UTC).

wrong language

This task would probably be more useful and help compare different languages better (the point of this site) if it handled the same syntax used by Compiler/Sample_programs, rather than having ALGOL 68 process ALGOL 68, C process C, PL/1 process PL/1, Phix process Phix, etc. --Pete Lomax (talk) 17:35, 6 June 2021 (UTC)

You have a point, but by processing the actual language, it compares the language's pre-processor facilities and also how hard pre-processing the source is.

I strongly suspect that the syntax of Algol 68, PL/1 and COBOL (for example) makes the implementation harder than a C pre-processor would be, which was part of my motivation.

To pre-process Algol 68, PL/1 and COBOL for this task requires some level of lexical analysis, whilst a C pre-processor need only find lines that start with "#include" ( possibly with spaces before and after the "#" ). --Tigerofdarkness (talk) 18:25, 6 June 2021 (UTC)
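To illustrate how little the C-style case needs, here is a minimal Python sketch (not taken from any of the task's entries). It assumes the quoted form #include "file" only, and it reads sources from a hypothetical in-memory dictionary named files rather than from disk. The names INCLUDE_LINE and expand_includes are made up for the example.

```python
import re

# A C-style include line: optional spaces before and after the "#",
# then the word include and a file name in double quotes.
INCLUDE_LINE = re.compile(r'^\s*#\s*include\s+"([^"]+)"\s*$')

def expand_includes(name, files, active=()):
    """Return the lines of files[name], with each include line replaced
    by the (recursively expanded) lines of the file it names.

    files is a hypothetical mapping of file name -> source text.
    active holds the chain of files currently being expanded, so that
    a file including itself is reported instead of looping forever.
    """
    if name in active:
        raise ValueError(f"recursive include of {name}")
    out = []
    for line in files[name].splitlines():
        m = INCLUDE_LINE.match(line)
        if m:
            out.extend(expand_includes(m.group(1), files, active + (name,)))
        else:
            out.append(line)
    return out
```

No tokenising is done at all: a line is either an include line or it is copied through unchanged. That is the sense in which the C case is easier than Algol 68, PL/1 or COBOL.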

Thanks for considering and attempting the task, BTW. --Tigerofdarkness (talk) 18:36, 6 June 2021 (UTC)

As per the abortive Phix entry, each submission could/should explain some of the issues that might arise were it applied to the specific language itself instead of the toy C.

(And at least in the Phix case extol the virtues of any existing builtin mechanisms that provide similar functionality.)

The task is about how the builtin mechanism can be provided - the Phix compiler (as I understand it) is written in Phix and so presumably the facilities are provided by code in Phix? --Tigerofdarkness (talk) 21:42, 6 June 2021 (UTC)

It would without question still require some form of lexical analysis anyway, eg

/*
#include tests
*/
#include hello_world.t
#include phoenix_number.t
print("#include tests completed\n")

with expected output of

Hello, World!
142857
#include tests completed
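A rough Python sketch of the minimum analysis that example calls for: an #include inside a /* ... */ comment or inside a string literal must be left alone. It assumes the unquoted #include file form shown above, non-nested comments, and a hypothetical in-memory dictionary files. As a known simplification, a /* appearing inside a string literal would be mistaken for the start of a comment. The contents given to hello_world.t and phoenix_number.t in the usage below are invented.

```python
import re

# Only a line that begins with #include (after optional spaces) counts,
# so "#include" inside a string literal later in a line is never matched.
INCLUDE_LINE = re.compile(r'^\s*#include\s+(\S+)\s*$')

def comment_state_after(line, in_comment):
    """Return True if a /* ... */ comment is still open at the end of line."""
    i = 0
    while i < len(line):
        marker = "*/" if in_comment else "/*"
        j = line.find(marker, i)
        if j == -1:
            return in_comment
        in_comment = not in_comment
        i = j + 2
    return in_comment

def expand(name, files):
    """Expand include lines in files[name], skipping those inside comments."""
    out = []
    in_comment = False
    for line in files[name].splitlines():
        m = None if in_comment else INCLUDE_LINE.match(line)
        if m:
            out.extend(expand(m.group(1), files))
        else:
            out.append(line)
            in_comment = comment_state_after(line, in_comment)
    return out
```

Usage, with made-up contents for the two included files:

```python
files = {
    "tests.t": '/*\n#include tests\n*/\n'
               '#include hello_world.t\n#include phoenix_number.t\n'
               'print("#include tests completed\\n")',
    "hello_world.t": 'print("Hello, World!\\n")',
    "phoenix_number.t": 'print(142857)',
}
print("\n".join(expand("tests.t", files)))
```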

I would specifically state the task need not attempt to resolve any issues caused by using/declaring the same identifiers in multiple files.

I (still) think the task should require toy C to be targeted, with any host-language-specific code parked on sub-pages. --Pete Lomax (talk) 20:17, 6 June 2021 (UTC)

I'm not sure why you mention "issues caused by using/declaring the same identifiers in multiple files" - my mention of "Many programming languages allow file inclusion, so that for example, standard declarations or code-snippets can be stored in a separate source file and be included into the programs that require them" was meant to indicate why pre-processors were used.

Phix includes are granted a new scope, so if the existing handling was replaced with a simple preprocessor it would break it completely. Basically then my entry is going to have to stand as it is. --Pete Lomax (talk) 23:48, 6 June 2021 (UTC)

Some lexical analysis is required to parse #include lines, but not as much as in e.g. PL/1 where you could see: main: procedure; %include "someCode.incl.pl1"; end main;.
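For contrast, a deliberately naive Python sketch of the PL/1-style case, where the directive can appear in the middle of a line. It assumes the exact %include "file"; form quoted above and a hypothetical in-memory dictionary files, and it makes no attempt to skip comments or string literals, which a real PL/1 pre-processor would have to do.

```python
import re

# %include "name"; anywhere in the text, in either letter case.
PL1_INCLUDE = re.compile(r'%include\s+"([^"]+)"\s*;', re.IGNORECASE)

def expand_inline(text, files):
    """Replace each %include "name"; with the expanded text of files[name].

    Naive on purpose: a %include inside a comment or a string literal is
    expanded too, which is exactly the lexical analysis this sketch lacks.
    """
    return PL1_INCLUDE.sub(
        lambda m: expand_inline(files[m.group(1)], files), text)
```

Because the directive is not tied to the start of a line, the text before and after it on the same line has to be preserved, so the line-by-line approach that suffices for C does not carry over.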

I still argue that the task as written is a task that compares languages. In particular it compares the pre-processing facilities of languages.

The Self-hosting compiler task is a much harder, language-specific task - it doesn't, for example, say "write a C compiler in your language". The Compiler/lexical_analyzer and related tasks are about writing a compiler for a simple C-like language. The Algol 68 pre-processor in Algol 68 is similar in size to one of those tasks, so I think this task is not too large for an RC task.

Also, I doubt that a C pre-processor written in anything but C would get much use for anything other than bootstrapping... --Tigerofdarkness (talk) 21:42, 6 June 2021 (UTC)

I believe our sample language could use one line: #include "file" and #define call value. Usage could be:

~~ Header.h ~~
#define area(h, w) h*w

~~ Source.t ~~
#include "Header.h"
#define width 5
#define height() 6
area = #area(height, width)#;

yielding code output of:

area = 6*5;

Or:

/* Include Header.h */
/* Define area(h, w) as h*w */
/* End Header.h */
/* Define width() as 5 */
/* Define height() as 6 */
/* Use area, height, and width */
area = 6*5;

This should be in addition to the existing preprocessor definition, as both are useful demonstrations. The new one would provide identical support for our sample language across multiple source languages.

Hi Jwells - thanks for considering this Task. When I created it I thought it would be a way to compare how different languages (could) handle file inclusion (not that all do). The difficulty of handling e.g. PL1's %Include compared to e.g. C's #include seemed an interesting way of comparing languages. I also thought having a set of file-inclusion pre-processors available on RC would make it easier for the solutions to other tasks to use file inclusion for common things. That seemed a good idea at the time...

Whilst the task isn't intended to be "Implement the C pre-processor in your language", for languages that don't have their own specific syntax, that is an option for this task. I see from your User page that you are doing stuff in Kotlin - which I imagine doesn't have inbuilt file inclusion - if you wanted to implement a C-style #include in Kotlin, that would be nice.

I was hoping to keep the task reasonably simple so it didn't require implementation of #define or #if-#endif etc., however I agree they are useful things to have. I gather that the C pre-processor is used in some current Fortran programming, for example. Maybe implementing a macro processor should be a task too?

BTW, you should probably sign your comments - add --~~~~ (without the "nowiki" tags) to the end of the comment.

--Tigerofdarkness (talk) 12:23, 22 May 2022 (UTC)