GSoC Week 6: More full_type shenanigans
25 Jul 2022
This blog post is related to my GSoC 2022 project.
In the previous week, I worked on finishing up as much of the EDL parser as possible, after which I will be able to write
passes to interpret the parsed code and then implement the game_buffer_save
and game_buffer_load
functions.
There was not a very high quantity of code checked in this time, mainly due to my college classes starting leading to me
having less time to do things, and parsing declarations being a new thing leading to me taking my time with interpreting
things. The week was all about full_type
, with regards to plumbing it through the whole type parser I wrote
the week before the previous one. Most of the functions take a pointer to a full_type
, and the only one which
does not is TryParseTypeID()
which returns a full_type
instead.
The way full_type
works is that it is divided into three parts:
-
The first part to store the
jdi::definition
which represents the rule<type-specifier-seq>
-
The second part to store the
jdi::ref_stack
which represents the rule<declarator>
-
The third part to store any modifiers such as
long
,const
andunsigned
Thus, a full_type
wholly represents a variable declaration minus the initializer. After finishing the type
specifier and declarator parsers, the only thing remaining was initializers, and they are pretty simple: there are three
tokens which can start an initializer, equals (=
), opening curly brace ({
) and opening parenthesis
((
).
-
An equals sign can be followed by an expression having the assignment operator’s precedence.
-
An opening curly brace can be followed by an (optionally designated) initializer list, which consists of initializer clauses (either brace initializers or assignment-precedence expressions followed by commas) followed by a closing curly brace.
-
Much like brace initializers, an opening parenthesis can be followed by an initializer list, followed by a closing parenthesis.
Regarding last week’s issue of similar prefixes for abstract and non-abstract declarators, the solution was quite simple: merge the two rules together, which led to me getting rid of the separate code path for abstract declarators entirely and led to three different options for declarator parsing:
-
ABSTRACT
: The declarator definitely does not contain a qualified-id (type arguments to templates) -
NON_ABSTRACT
: The declarator definitely contains a qualified-id (variable declarations) -
MAYBE_ABSTRACT
: The declarator may or may not contain a qualified-id (function parameters)
With <qualified-id>
and <ptr-operator>
, the fix was also pretty trivial: make a
new function TryParseMaybeNestedPtrOperator
, which parses a nested name specifier until it maybe hits a *
,
in which case it treats it as a pointer to member declaration.
After this, only function-parameter and template-argument parsers are left in declarators. I have put the framework in place
to finish implementing them and the hold-up currently is the fact that jdi::ref_stack::parameter
and jdi::arg_key::aug_value
take a jdi::AST
which is not compatible with a enigma::parsing::AST
, so I cannot currently parse
default values for function parameters or NTTP arguments for template parameters. There are two possible solutions to this:
inherit from these two classes and modify them to take an EDL AST, or write methods to transform the EDL AST into a JDI
AST. Of these, the latter is definitely much easier, as the former would require updating pretty much everything touching
jdi::full_type
or the classes that come under it.
As it stands right now, I would estimate roughly 80% of the EDL parser is finished, with the remaining issues being nags
to be discussed with @Josh
. This week, I need to finish new-expressions, delete-expressions, temporary object initialization,
block statements, switch statements and a few missing cases in declarators.
Temporary object initialization will be the hardest challenge of these, as the difference between them and a declarator cannot be established before parsing them, so we need an infinite amount of lookahead to be able to implement it. Consider the following:
int(*(*a)[10]) = nullptr;
and
int(*(*a)[10] + b);
These two are basically the same upto the +
operator. Therefore, there is no way to know what rule we are
parsing when we begin, which is an issue as the code paths for parsing types and temporary object initialization are vastly
different. The only way to fix this is to pass everything to the type parser, and have it bail out with an AST::Node
when it detects any token which can be an operator. This will be done by considering the declarator parsed upto now to be
an expression, and use routines to convert it as such.