What's the big deal with sized types anyway?
It's not like we need all parts of a struct to have well-defined sizes at compile-time, except perhaps for tiny things like calling conventions!
Yet here we are, fighting our compilers every day over petty details like that.
Imagine, instead, a world that endorses, nay, embraces unsized types! Wouldn't it be great?
Every type, every variable—runtime-sized! Pointers—not required when dealing with varying sizes! Strings inlined wherever you want; variable-sized integers straight in your codebase at no performance penalty!
That... that would be the ultimate power to model data, to define file formats, to access memory, to... to... shape worlds!
For starters, types would no longer be primitive, static, frail things.
We have to get rid of that notion that types have a compile-time size.
Instead, types would have non-const parameters and fields that determine their size, at runtime!
See how much structs look like functions now!
struct sliced_apple(int slice_count) {
    // Size depends on non-const type parameter.
    array(apple_slice, slice_count) slices;
    color color;
    // sizeof(sliced_apple(x)) =
    //     x * sizeof(apple_slice) + sizeof(color)
};
struct sliced_pear {
    // Size depends on a field.
    int slice_count;
    array(pear_slice, slice_count) slices;
};
struct sliced_string {
    // HA! Just kidding, you can't slice a string unless you know whether you are slicing by bytes, codepoints, graphemes, glyphs, ligatures, tokens, words, lines, phrases, sentences, C chars, C wchars, Windows wchar_t-s, signed chars, unsigned chars, null chars, or who-knows-what chars. Or non-chars.
};
(Better yet, imagine having to render a sliced_string!)
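For the record, C has shipped a much tamer cousin of sliced_pear since C99: the flexible array member. The sketch below is only an approximation of the idea, assuming a made-up pear_slice with a single float field; sliced_pear_size and sliced_pear_new are hypothetical helper names, not part of the imaginary language above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the article's pear_slice. */
typedef struct { float thickness; } pear_slice;

/* C99 flexible array member: the array length lives in a field, not in the type. */
typedef struct {
    int slice_count;
    pear_slice slices[];
} sliced_pear;

/* The "sizeof" of one particular pear, computed at runtime. */
size_t sliced_pear_size(int slice_count) {
    assert(slice_count >= 0);
    return sizeof(sliced_pear) + (size_t)slice_count * sizeof(pear_slice);
}

/* Allocates a zeroed pear with room for slice_count slices; the caller frees it. */
sliced_pear *sliced_pear_new(int slice_count) {
    sliced_pear *pear = calloc(1, sliced_pear_size(slice_count));
    if (pear != NULL) {
        pear->slice_count = slice_count;
    }
    return pear;
}
```

Note that sizeof(sliced_pear) quietly pretends the slices are not there, which is why the size has to be a function rather than a constant.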
In fact, types would have if-s and else-s, just like normal functions!
It's like those fancy ADT enums in functional programming languages!
Except, now you can customize them to your heart's content. No more fighting the syntax to share state between state machine states!
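In today's C, the if/else ends up in a function sitting next to the type rather than inside it. Here is a minimal sketch of that, assuming a made-up byte layout (one tag byte, then a payload whose length depends on the tag); fruit_size and the FRUIT_* tags are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tags for a fruit stored as raw bytes. */
enum { FRUIT_WHOLE = 0, FRUIT_SLICED = 1 };

/* The "if/else inside a type", written as the size function it boils down to.
   Returns the size of the fruit starting at bytes[0], or 0 if the tag is
   unknown or the fruit does not fit in len bytes. */
size_t fruit_size(const uint8_t *bytes, size_t len) {
    assert(bytes != NULL);
    if (len < 1) {
        return 0;
    }
    if (bytes[0] == FRUIT_WHOLE) {
        return 1;                             /* tag only */
    } else if (bytes[0] == FRUIT_SLICED) {
        if (len < 2) {
            return 0;
        }
        size_t size = 2 + (size_t)bytes[1];   /* tag, count, then one byte per slice */
        return size <= len ? size : 0;
    }
    return 0;
}
```

A whole fruit {0} measures 1 byte, while a sliced one {1, 3, 10, 20, 30} measures 5: same type, different sizes, decided by a branch.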
Also, it wouldn't be hard to imagine an array type defined in terms of for loops...
...though there might be some questions regarding accessing the elements of such arrays.
// TODO: make a Duff's device-like coroutine using structs
The only limitation of our imagined system would be that the size of a field's type should be determined entirely by variables that come before it.
And perhaps we should forbid using mutable global variables in type definitions.
Otherwise, it'd be far too easy to make buffer overflows! And this is not C!
Granted, there are a few affordances we would lose out on...
For one, we wouldn't be able to get the offsets of elements inside structs any more—at least not without an instance of the struct.
...but it's not like anyone is using that for anything important, right? [citation needed]
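To make the loss concrete, here is a rough C sketch contrasting the two worlds. It assumes a plate stored as raw bytes, where each pear is a 32-bit slice_count followed by 4-byte slices; boxed_pear, SLICE_BYTES, and plate_offset_of_b are hypothetical names, not anything the imaginary language defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sized type: the offset of a field is a compile-time constant. */
typedef struct { int32_t slice_count; int32_t color; } boxed_pear;
static const size_t boxed_color_offset = offsetof(boxed_pear, color);

enum { SLICE_BYTES = 4 };  /* assumed size of one pear_slice */

/* Unsized type: where plate.b starts depends on the value of
   plate.a.slice_count, so we need an actual plate to read it from. */
size_t plate_offset_of_b(const uint8_t *plate) {
    int32_t a_slice_count;
    memcpy(&a_slice_count, plate, sizeof a_slice_count);
    assert(a_slice_count >= 0);
    return sizeof a_slice_count + (size_t)a_slice_count * SLICE_BYTES;
}
```

With two slices in pear a, pear b starts at byte 12; with none, at byte 4. The first offset is a number; the second is a function call.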
Also, we would lose that old hack of multiplying element size by index to get an array element.
Instead, we would walk along the array to add up the sizes of all the elements before the one we want, just like our great-grandparents' linked lists, in O(n) time.
(But it's just O(n) runtime, so it's not like it matters, right? Developer time gains here are insane!)
Okay, okay, finee... maybe that last one was actually important. O(n) time for common operations like array-index access sounds bad...
However, any sufficiently smart compiler would be able to transform it to the equivalent old O(1) time. Obviously.
(Compilers that fail to do so are left as an exercise to the reader.)
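Until that compiler arrives, here is roughly what the walk looks like by hand, together with the usual workaround. This is a sketch under assumptions: each element is a length byte followed by that many payload bytes, the lengths are trusted (no bounds checks), and element_at and build_offsets are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* O(n) indexing: skip over every element that comes before index. */
const uint8_t *element_at(const uint8_t *array, size_t count, size_t index) {
    assert(index < count);
    const uint8_t *cursor = array;
    for (size_t i = 0; i < index; i++) {
        cursor += 1 + (size_t)cursor[0];   /* length byte + payload */
    }
    return cursor;
}

/* The workaround: one O(n) pass fills a table of start offsets,
   after which array + offsets[i] is an O(1) lookup. */
void build_offsets(const uint8_t *array, size_t count, size_t *offsets) {
    size_t offset = 0;
    for (size_t i = 0; i < count; i++) {
        offsets[i] = offset;
        offset += 1 + (size_t)array[offset];
    }
}
```

The catch is that the table is just a pile of pointers in disguise, and it goes stale the moment any element changes size.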
But with all of that implemented, we might finally achieve...
// A variable-sized type
struct sliced_pear {
    int slice_count;
    array(pear_slice, slice_count) slices;
};
// Two variable-sized fields next to each other in memory
struct fruit_plate {
    sliced_pear a;
    sliced_pear b;
};
fruit_plate plate;
// A few pointers to the variable-sized type
sliced_pear* a_pointer = &plate.a;
sliced_pear* b_pointer = &plate.b;
a_pointer->slice_count++; // Then, modifying one pointer...
&plate.b != b_pointer;    // ...has to shift the other one!
// Oops. :o)
Enlightenment?
After all, a type that changes size is only safe when wrapped in a pointer.
...Or perhaps at the end of a struct.
But never in the middle.
The middle is dangerous. If the size of a field changes, it invalidates all pointers to subsequent fields, easily overflows/underflows into them, and makes for all kinds of exciting and novel bugs.
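The same bug can be reproduced in plain C by simulating the plate as a byte buffer. This is a sketch under assumptions: each pear is one count byte followed by one byte per slice, the plate has a fixed capacity, and fruit_plate, plate_b, and plate_grow_a are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { PLATE_CAPACITY = 16 };

/* Pear a sits at the start of bytes; pear b follows it immediately. */
typedef struct {
    uint8_t bytes[PLATE_CAPACITY];
    size_t used;
} fruit_plate;

/* Where pear b currently starts: right after a's count byte and slices. */
uint8_t *plate_b(fruit_plate *plate) {
    return plate->bytes + 1 + (size_t)plate->bytes[0];
}

/* Appends a slice to pear a, shifting pear b one byte to the right. */
void plate_grow_a(fruit_plate *plate, uint8_t new_slice) {
    assert(plate->used < PLATE_CAPACITY);
    uint8_t *b = plate_b(plate);
    size_t b_bytes = plate->used - (size_t)(b - plate->bytes);
    memmove(b + 1, b, b_bytes);   /* make room in the middle */
    *b = new_slice;               /* the old start of b now belongs to a */
    plate->bytes[0]++;
    plate->used++;
}
```

Any pointer obtained from plate_b before the call still points at the same byte afterwards, except that byte is now a's newest slice rather than b's count. Nothing crashes; the data is simply wrong.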
In conclusion
Don't blame your favorite compiler for still not fully supporting ?Sized in 2025.
(2025!!)
Thank the Lord instead.
Unsized types are nasty.
This presentation-article has been my second post of #100DaysToOffload.
Thanks for browsing; hope you enjoyed the C/Rust humor!
As a curious note, ImHex's pattern language does have structs with if/else-s and some version of loops, just like the imaginary language described earlier.
...It is, however, a domain-specific language for describing binary data and not a systems programming language.