GSoC Week 6: More full_type shenanigans

25 Jul 2022

This blog post is related to my GSoC 2022 project.

In the previous week, I worked on finishing up as much of the EDL parser as possible, after which I will be able to write passes to interpret the parsed code and then implement the game_buffer_save and game_buffer_load functions.

There was not a very high quantity of code checked in this time, mainly due to my college classes starting leading to me having less time to do things, and parsing declarations being a new thing leading to me taking my time with interpreting things. The week was all about full_type, with regards to plumbing it through the whole type parser I wrote the week before the previous one. Most of the functions take a pointer to a full_type, and the only one which does not is TryParseTypeID() which returns a full_type instead.

The way full_type works is that it is divided into three parts:

The first part to store the jdi::definition which represents the rule <type-specifier-seq>
The second part to store the jdi::ref_stack which represents the rule <declarator>
The third part to store any modifiers such as long, const and unsigned

Thus, a full_type wholly represents a variable declaration minus the initializer. After finishing the type specifier and declarator parsers, the only thing remaining was initializers, and they are pretty simple: there are three tokens which can start an initializer, equals (=), opening curly brace ({) and opening parenthesis (().

An equals sign can be followed by an expression having the assignment operator’s precedence.
An opening curly brace can be followed by an (optionally designated) initializer list, which consists of initializer clauses (either brace initializers or assignment-precedence expressions followed by commas) followed by a closing curly brace.
Much like brace initializers, an opening parenthesis can be followed by an initializer list, followed by a closing parenthesis.

Regarding last week’s issue of similar prefixes for abstract and non-abstract declarators, the solution was quite simple: merge the two rules together, which led to me getting rid of the separate code path for abstract declarators entirely and led to three different options for declarator parsing:

ABSTRACT: The declarator definitely does not contain a qualified-id (type arguments to templates)
NON_ABSTRACT: The declarator definitely contains a qualified-id (variable declarations)
MAYBE_ABSTRACT: The declarator may or may not contain a qualified-id (function parameters)

With <qualified-id> and <ptr-operator>, the fix was also pretty trivial: make a new function TryParseMaybeNestedPtrOperator, which parses a nested name specifier until it maybe hits a *, in which case it treats it as a pointer to member declaration.

After this, only function-parameter and template-argument parsers are left in declarators. I have put the framework in place to finish implementing them and the hold-up currently is the fact that jdi::ref_stack::parameter and jdi::arg_key::aug_value take a jdi::AST which is not compatible with a enigma::parsing::AST, so I cannot currently parse default values for function parameters or NTTP arguments for template parameters. There are two possible solutions to this: inherit from these two classes and modify them to take an EDL AST, or write methods to transform the EDL AST into a JDI AST. Of these, the latter is definitely much easier, as the former would require updating pretty much everything touching jdi::full_type or the classes that come under it.

As it stands right now, I would estimate roughly 80% of the EDL parser is finished, with the remaining issues being nags to be discussed with @Josh. This week, I need to finish new-expressions, delete-expressions, temporary object initialization, block statements, switch statements and a few missing cases in declarators.

Temporary object initialization will be the hardest challenge of these, as the difference between them and a declarator cannot be established before parsing them, so we need an infinite amount of lookahead to be able to implement it. Consider the following:

int(*(*a)[10]) = nullptr;

and

int(*(*a)[10] + b);

These two are basically the same upto the + operator. Therefore, there is no way to know what rule we are parsing when we begin, which is an issue as the code paths for parsing types and temporary object initialization are vastly different. The only way to fix this is to pass everything to the type parser, and have it bail out with an AST::Node when it detects any token which can be an operator. This will be done by considering the declarator parsed upto now to be an expression, and use routines to convert it as such.

dc03

GSoC Week 6: More full_type shenanigans

Related Posts

GSoC 2023 Week 19 and 20 - That's All, Folks! 29 Sep 2023

GSoC 2023 Week 17 and 18 - Same Old, Same Old 10 Sep 2023

GSoC 2023 Week 15 and 16 29 Aug 2023

GSoC 2023 Week 19 and 20 - That's All, Folks!
29 Sep 2023

GSoC 2023 Week 17 and 18 - Same Old, Same Old
10 Sep 2023

GSoC 2023 Week 15 and 16
29 Aug 2023