GSoC Week 4: Buffer, buffer on the wall, who's the fairest of them all?
11 Jul 2022This blog post is related to my GSoC 2022 project.
In the previous week, I worked on finishing up the functions which I covered in last week’s writeup. These included functions for saving buffers, copying them, calculating checksums, compression / decompression and encoding.
buffer_copy
, buffer_save
and buffer_save_ext
are pretty mundane functions, and
they work pretty intuitively.
-
buffer_copy
takes two buffers, offsets into them, and the number of bytes to write. If either of the buffers arebuffer_wrap
, it wraps around whenever required. Withbuffer_grow
, it expands only the destination buffer if required, and truncates the number of bytes written if it goes past the source buffer end. In the case ofbuffer_fixed
andbuffer_fast
, it does the same, except it does not expand the destination buffer. -
buffer_save
takes a buffer, and a file name to write to. Just does a byte-by-byte write to the file. -
buffer_save_ext
takes a buffer, an offset into it, and a file to write to. It truncates the number of bytes written if it goes off of the end of the buffer (even inbuffer_wrap
’s case), however it also wraps around the offset if the buffer isbuffer_wrap
.
Both the buffer save functions truncate the file on writing, i.e. the previous contents of the file are cleared
before writing to it. One interesting optimization that comes into effect here is using the native std::ofstream::write
function versus using std::transform
with a std::ofstreambuf_iterator
. The problem lies in the
buffers being a std::vector<std::byte>
, whereas the file writing functions expect a stream of const char *
.
With std::ofstream::write
, the code looks like this:
file.write(reinterpret_cast<const char *>(bytes.data()), size);
With std::transform
, the code looks like this:
std::transform(bytes.begin(), bytes.end(), std::ostreambuf_iterator<char>(file),
[](std::byte b) { return static_cast<char>(b); });
where file
is a std::ofstream
and bytes
is a std::vector<std::byte>
.
With std::transform
, which I would say is the idiomatic way to do things, the bytes are written one-by-one,
which means that it will be harder to have fast batch writes and instead can lead to multiple writes being done at a
slower speed. Therefore, I made two functions, write_to_file
and read_from_file
, which do a
reinterpret_cast
before calling std::ofstream::write
and std::ifstream::read
.
While I have not benchmarked this, it should allow the filesystem or the kernel to schedule writes and reads more efficiently instead of one byte at a time. The only problem is that these functions are pretty unsafe as they take a pointer and a number of bytes, so there is no way to verify if there is any undefined behaviour in accessing unowned memory. It is up to the caller to ensure that everything is within the correct bounds.
With these three functions done, I moved onto buffer_base64_encode
and buffer_base64_decode
,
which are the functions which (again, pretty intuitively) handle encoding and decoding a buffer in the base64 format. For
these functions, I wrote two functions to abstract the logic, internal_buffer_base64_encode
and internal_buffer_base64_decode
,
though I do wonder how useful they really are given that they can just be inlined into the main functions. Anyways,
base64 encoding is quite interesting. Instead of being used for encrypting data or compressing it, base64 encoding is done
to make sure that data can be transmitted across machines where the only guarantee is they all understand the ASCII encoding.
To do this, base64 considers the input sequence as sets of triplets of bytes (i.e. 24 bits) and converts them into quadruplets
of 6-bit units. The reason 3 bytes are used is because 24 is the lowest common multiple of 8 bits and 6 bits, and the reason
6-bit units are used is because the range of values that can be stored in 6 bits is 0 to 63, which means that each possible
value can be mapped conveniently to 26 uppercase ASCII English characters, 26 lowercase ASCII English characters, 10 numeric
ASCII digits and 2 extra characters (usually +
and /
). Then, these characters are written to a
string and transmitted.
The mapping from bits to ASCII is exactly in the order described before:
-
0:
A
, 1:B
, … 25:Z
-
26:
a
, 27:b
, … 51:z
-
52:
0
, 53:1
, … 61:9
-
62:
+
, 63:/
In case the size of the input is not a multiple of 3, two situations can occur:
-
One extra byte: The top 6 bits of the byte are used for the first ASCII character, and the lower 2 bits are used as the top two bits of the next character. Two equals signs (
=
) are written as padding characters. -
Two extra bytes: Doing the same thing as one extra byte, however in this case 3 characters are written as two bytes take up 16 bits which require at least 3 6-bit units. An extra equals sign is written as the padding character.
Examples for each case:
input | s | t | r | |||||||||||||||||||||
decimal | 115 | 116 | 114 | |||||||||||||||||||||
bits | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0 |
decimal | 28 | 55 | 17 | 50 | ||||||||||||||||||||
output | c | 3 | R | y |
input | s | t | ||||||||||||||||||||||
decimal | 115 | 116 | ||||||||||||||||||||||
bits | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
decimal | 28 | 55 | 16 | 0 | ||||||||||||||||||||
output | c | 3 | Q | = |
input | s | |||||||||||||||||||||||
decimal | 115 | |||||||||||||||||||||||
bits | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
decimal | 28 | 48 | 0 | 0 | ||||||||||||||||||||
output | c | w | = | = |
During decoding, the exact reverse is done: each group of 4 6-bit units is converted into a triplet of bytes, using the
inverse of the mapping described above. Note that the padding character (=
) is considered to map to the value
0.
It is also possible that the stream is not properly terminated with padding characters, in which case the string may have 2 or 3 extra characters at the end. It is not possible for a stream to have only one extra character, as each byte maps to at least 2 base64 characters (8 bits requires at least 12 bits, i.e. 2 6-bit units).
With these done, I moved onto CRC-32. This is very straightforward, just using the crc32()
function from
the zlib library which is a dependency of ENIGMA. One problem is, however, that the generator polynomial that zlib uses
for its crc32 calculation is different from the one used by GameMaker Studio, so I have no way of knowing how the one in
GMS behaves. I have tried to approximate what I think makes sense, wrapping the offset with buffer_wrap
and
truncating the write when it goes off of the buffer end.
Also using zlib, I implemented buffer_compress
and buffer_decompress
, using the standard
Z_DEFAULT_COMPRESSION
mode which is also used by GMS. Both of these functions compress / decompress chunks
of 16k of data at a time.
I then implemented the MD5 and SHA1 checksum functions, buffer_md5
and buffer_sha1
. Unlike before,
these functions do not wrap the offset when using buffer_wrap
, they instead return the checksums for an empty
stream. The problem with these functions lay in whether to add a dependency or not. OpenSSL is too large to be used for
just 2 functions, and while the Boost libraries do have these functions, they are part of the internal workings thus their
API is not stable. The same issue arose with other libraries, where along with adding another dependency, there would also
have to be licence considerations.
Therefore, I decided to use the reference implementations from the RFCs for the MD5 and SHA1 algorithms. This way, there
would be no issues with having to add additional licenses for third party code. While the SHA1 code was pretty easy to integrate,
the MD5 code was so old it used ANSI C style function declarations, where the types of parameters were specified separately
from the names of the parameters. It also had a macro PROTO_LIST
, where it would not emit function parameters
if the compiler being used did not support them. So, I modernized the code a little, using fixed width integer types both
in the MD5 and SHA1 code, and “modern” function declarations.
buffer_get_surface
and buffer_set_surface
are pretty simple. The former copies a surface’s texture
data to a buffer, and the latter does the opposite, writing a buffer to a surface’s texture. Simply put, a surface is
an area which can be drawn on, like the viewport which the user sees on their monitor. Each pixel’s color value is stored
in ENIGMA using the BGRA format where 1 byte is used for each component, which means 4 bytes are required per pixel.
buffer_load_partial
is also quite straightforward, abort if the file offset is greater than the file size,
truncate if the bytes being read go beyond the end of the file, grow the destination buffer if it is buffer_grow
,
and wrap the offset for buffer_wrap
. Finally, I implemented some string functions (at the request of @R0bert)
using the MD5 and SHA1 implementations I had used. These are:
-
md5_string_utf8
-
md5_string_unicode
-
md5_file
-
sha1_string_utf8
-
sha1_string_unicode
-
sha1_file
These functions are very straightforward too, just calculating the checksums of the strings or the files passed to them.
With the _unicode
functions, the string is considered to be in the UTF-16LE (little endian) format, and is
first converted into a byte stream before its checksum is calculated.
Now that the only thing that remains in Part I of my project is documenting the buffer functions, I will shift my focus back to the EDL compiler.
I have already worked on finishing
the BNF grammar for EDL, and am coordinating with @Josh on finishing up the parser implementation. Therefore, I
will work on the EDL parser this week, and also try finishing up the buffer functions documentation using doxygen comments
in the buffers.h
header file, as the enigma wiki seems to be in a non-functional state currently.
Incompatibilities with GMS:
buffer_crc32
: different generator polynomial, maybe different behaviour