GSoC Week 3: Master Of Buffers

This blog post is related to my GSoC 2022 project.

This week, I worked on Part I of my GSoC project, i.e. the buffer functions. During this time I refactored most of the functions and worked out how to make them conform to how they work in GameMaker Studio (GMS).

Firstly, to make serialization and deserialization of data more consistent and easy to refactor, I implemented two functions to do the job: serialize_to_type() and deserialize_from_type().

These two functions are pretty simple:
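As a rough illustration of the idea, here is a minimal sketch of such a pair of helpers. It assumes trivially copyable types and the host's byte order. The names serialize_value() and deserialize_value() are hypothetical stand-ins, and the signatures are not ENIGMA's actual ones.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helper: copy the object representation of `value` into a
// byte vector (host byte order, no endianness conversion).
template <typename T>
std::vector<std::uint8_t> serialize_value(const T& value) {
  static_assert(std::is_trivially_copyable_v<T>,
                "only trivially copyable types can be copied byte by byte");
  std::vector<std::uint8_t> bytes(sizeof(T));
  std::memcpy(bytes.data(), &value, sizeof(T));
  return bytes;
}

// Hypothetical helper: rebuild a T from sizeof(T) bytes starting at `bytes`.
// The caller must guarantee that at least sizeof(T) bytes are readable.
template <typename T>
T deserialize_value(const std::uint8_t* bytes) {
  static_assert(std::is_trivially_copyable_v<T>,
                "only trivially copyable types can be copied byte by byte");
  T value;
  std::memcpy(&value, bytes, sizeof(T));
  return value;
}
```

Using std::memcpy rather than a pointer cast avoids alignment and strict-aliasing problems when the bytes sit at an arbitrary offset inside a buffer.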

After this, I did some refactoring, where I renamed the macros get_buffer and get_bufferr to GET_BUFFER and GET_BUFFER_R respectively. These are the macros used to access buffers, and in debug mode they are responsible for checking that the buffer being accessed actually exists. Then, I changed the global enigma::buffers variable to be an AssetArray<BinaryBufferAsset> instead of a std::vector<BinaryBuffer*>.

AssetArray is effectively to ENIGMA what std::vector is to the C++ standard library. It allows making a container of assets, which require three methods to be implemented:

These methods make it possible to delete elements from the sequence without having to use the erase-remove idiom, which can be quite slow as it is O(n) in terms of time complexity. To implement these methods without changing the BinaryBuffer class, I created the BinaryBufferAsset class. This class simply wraps a BinaryBuffer inside a std::unique_ptr, and the methods manipulate the smart pointer. A nullptr value means that the wrapped BinaryBuffer has been “destroyed”.
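To make the wrapper idea concrete, here is a minimal sketch under some assumptions. BufferAssetSketch and AssetArraySketch are hypothetical, simplified stand-ins for BinaryBufferAsset and AssetArray, and the three method names (type_name(), is_destroyed(), destroy()) are illustrative guesses, not necessarily the ones ENIGMA requires.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Stand-in for ENIGMA's BinaryBuffer; the real class has more state.
struct BinaryBuffer {
  std::vector<std::uint8_t> data;
};

// Hypothetical wrapper: owns the buffer through a unique_ptr, and a null
// pointer marks the asset as destroyed.
class BufferAssetSketch {
 public:
  BufferAssetSketch() = default;
  explicit BufferAssetSketch(std::unique_ptr<BinaryBuffer> buffer)
      : ptr_(std::move(buffer)) {}

  static const char* type_name() { return "buffer"; }
  bool is_destroyed() const { return ptr_ == nullptr; }
  void destroy() { ptr_.reset(); }  // frees the buffer, leaves the slot in place
  BinaryBuffer* get() { return ptr_.get(); }

 private:
  std::unique_ptr<BinaryBuffer> ptr_;
};

// Hypothetical, heavily simplified container: destroying an asset only marks
// its slot, so no later elements are shifted and existing IDs stay valid.
template <typename Asset>
class AssetArraySketch {
 public:
  std::size_t add(Asset asset) {
    slots_.push_back(std::move(asset));
    return slots_.size() - 1;
  }
  bool exists(std::size_t id) const {
    return id < slots_.size() && !slots_[id].is_destroyed();
  }
  void destroy(std::size_t id) {
    if (exists(id)) slots_[id].destroy();
  }
  Asset& get(std::size_t id) { return slots_[id]; }

 private:
  std::vector<Asset> slots_;
};
```

In this sketch, destroying a buffer is a constant-time reset of one smart pointer, and the IDs handed out for other buffers keep pointing at the same slots.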

I also gave types to three of the enums which are used for the buffer constants:

Previously, all these values were just passed around as standard integers. The problem with this was that you could pass the wrong type of value to a function and there would be no error, as the enum value would be implicitly converted to an int. In fact, it was possible to not even realize that there was an error, as the erroneous value could have had the same integer value as the correct type of value.
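The effect of naming the enum types can be shown with a small sketch. The type names buffer_seek_t and buffer_type_t, the numeric values, and the seek_base() function are assumptions made for illustration; only the constant names follow GMS.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical typed enums; the numeric values are assumed.
enum buffer_seek_t { buffer_seek_start = 0, buffer_seek_relative = 1, buffer_seek_end = 2 };
enum buffer_type_t { buffer_fixed = 0, buffer_grow = 1, buffer_wrap = 2, buffer_fast = 3 };

// Hypothetical function: returns the offset that a seek is measured from.
// Because the parameter is a buffer_seek_t and not an int, only seek
// constants are accepted.
std::size_t seek_base(buffer_seek_t base, std::size_t position, std::size_t size) {
  switch (base) {
    case buffer_seek_start:    return 0;
    case buffer_seek_relative: return position;
    case buffer_seek_end:      return size;
  }
  return 0;  // unreachable for valid enumerators
}

using seek_fn = decltype(&seek_base);

// A seek constant is accepted...
static_assert(std::is_invocable_v<seek_fn, buffer_seek_t, std::size_t, std::size_t>);
// ...but a constant from another enum is rejected, even though buffer_grow
// and buffer_seek_relative share the integer value 1...
static_assert(!std::is_invocable_v<seek_fn, buffer_type_t, std::size_t, std::size_t>);
// ...and so is a plain int.
static_assert(!std::is_invocable_v<seek_fn, int, std::size_t, std::size_t>);
```

C++ converts an enum to an int implicitly but never an int (or another enum) to an enum, so a typed parameter is enough to turn the silent mix-up into a compile error.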

After that, I added the ability to have sub-directories of tests within the SimpleTests/ directory. Previously, the test harness would only check for directories ending with .sog in the top-level directory. However, as I have decided to give each buffer function its own separate test, keeping them all in the main tests directory would crowd it and make the tests harder to navigate. Therefore, when the test harness finds a directory ending with the .multi extension, it considers all the directories within that directory as tests too. This means that I can group all the buffer tests, each of which begins with the “buffer_” prefix, into the buffers.multi sub-directory.
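The discovery rule described above could be sketched with std::filesystem as follows. This is not the actual harness code; discover_tests() is a hypothetical function written only to illustrate the rule.

```cpp
#include <algorithm>
#include <filesystem>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical sketch: collect test directories under `root`.
// - A top-level directory ending in .sog is a test.
// - A top-level directory ending in .multi is a group, and every directory
//   directly inside it is treated as a test.
std::vector<fs::path> discover_tests(const fs::path& root) {
  std::vector<fs::path> tests;
  for (const fs::directory_entry& entry : fs::directory_iterator(root)) {
    if (!entry.is_directory()) continue;
    const fs::path ext = entry.path().extension();
    if (ext == ".sog") {
      tests.push_back(entry.path());
    } else if (ext == ".multi") {
      for (const fs::directory_entry& child : fs::directory_iterator(entry.path())) {
        if (child.is_directory()) tests.push_back(child.path());
      }
    }
  }
  // directory_iterator gives no ordering guarantee, so sort for stable runs.
  std::sort(tests.begin(), tests.end());
  return tests;
}
```

The sketch only descends one level, matching the description: groups are not nested inside other groups.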

Finally, after having set all this up, I began work on the buffer functions themselves. The problems here did not come from the implementation being complex (in fact, the functions are quite trivial in terms of functionality); they came from wanting to conform to how GMS works, so that transitioning from it to ENIGMA would be easier.

So, a quick changelog of each function:

Out of all of these, buffer_fill() has caused the most pain so far. While not really difficult to implement, getting it to conform (mostly) with GMS’s implementation was not fun. In fact, it has taken 6 commits so far, each one updating it after I found a new edge case and tested it in GMS to make sure my implementation conforms properly.

I think the reason that I have been having these issues is mainly because of buffer alignment and how I think about it versus how GMS thinks about it. Initially, I implemented buffer alignment by adding padding bytes after the written data, so that the next element being written was always aligned.

What is interesting is that GMS does the opposite: it writes padding bytes until the data being written would be aligned, and only then does it write the data itself. This means that instead of writing padding for the next element, GMS writes padding for the current element. Logically, this makes more sense as it uses less space. However, it was confusing when I first encountered it, as I saw buffer_fill() not write zeroes after it wrote data to a buffer.
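The difference between the two strategies is easy to see in a small sketch. It models a buffer as a plain byte vector and assumes zero-valued padding and an alignment of at least 1. The function names are hypothetical and this is not the ENIGMA or GMS implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Zero bytes needed to bring `size` up to a multiple of `alignment`
// (assumes alignment >= 1).
std::size_t padding_for(std::size_t size, std::size_t alignment) {
  return (alignment - size % alignment) % alignment;
}

// My initial approach: write the data, then pad so that the NEXT write
// starts on an aligned offset.
void write_pad_after(Bytes& buffer, const Bytes& data, std::size_t alignment) {
  buffer.insert(buffer.end(), data.begin(), data.end());
  buffer.insert(buffer.end(), padding_for(buffer.size(), alignment), std::uint8_t{0});
}

// The behaviour described for GMS: pad until the CURRENT write is aligned,
// then write the data.
void write_pad_before(Bytes& buffer, const Bytes& data, std::size_t alignment) {
  buffer.insert(buffer.end(), padding_for(buffer.size(), alignment), std::uint8_t{0});
  buffer.insert(buffer.end(), data.begin(), data.end());
}
```

With an alignment of 4, writing a single byte leaves a 4-byte buffer under the first strategy and a 1-byte buffer under the second. Once a further 4-byte value is written, both buffers hold the same 8 bytes, so the only difference is the trailing padding after the last write.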

I have tried to validate the way these functions work as much as I can, by running GMS inside a Windows VM to test how the reference implementation works, and also by writing as many tests as I can under the buffers.multi group. Hopefully, these tests cover enough edge cases so that the buffer functions work properly in normal use. The functions currently remaining for refactoring are:

The functions currently not implemented at all are:

Of these, the ones I will definitely implement are:

I am also interested in buffer_load_partial() and buffer_set_used_size(). Hopefully, I can finish these functions this week, and get back to working on the EDL parser, so that I can have as much time as possible to work on Part II of my GSoC project before college begins.

Incompatibilities with GMS: