What's the big deal with sized types anyway?

It's not like we need all parts of a struct to have well-defined sizes at compile-time, except perhaps for tiny things like calling conventions!

Yet here we are, fighting our compilers every day over petty details like that.

Imagine, instead, a world that endorses, nay, embraces unsized types! Wouldn't it be great?
Every type, every variable—runtime-sized! Pointers—not required when dealing with varying sizes! Strings inlined wherever you want; variable-sized integers straight in your codebase at no performance penalty!

That... that would be the ultimate power to model data, to define file formats, to access memory, to... to... shape worlds!

(Image: a couple of high-rises dissolving under exposure to the sheer, uncontrolled power of runtime-sized types.)

For starters, types would no longer be primitive, static, frail things.

We have to get rid of that notion that types have a compile-time size.

Instead, types would have non-const parameters and fields that determine their size, at runtime!

See how much structs look like functions now!

struct sliced_apple(int slice_count) {
  // Size depends on non-const type parameter.
  array(apple_slice, slice_count) slices;
  color color;
  // sizeof(sliced_apple(x)) =
  //   x * sizeof(apple_slice) + sizeof(color)
};

struct sliced_pear {
  // Size depends on a field.
  int slice_count;
  array(pear_slice, slice_count) slices;
};

struct sliced_string {
  // HA! Just kidding, you can't slice a string unless you know whether you
  // are slicing by bytes, codepoints, graphemes, glyphs, ligatures, tokens,
  // words, lines, phrases, sentences, C chars, C wchars, Windows wchar_t-s,
  // signed chars, unsigned chars, null chars, or who-knows-what chars.
  // Or non-chars.
};

(Better yet, imagine having to render a sliced_string!)
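
For comparison, the closest thing present-day C offers to sliced_pear is the C99 flexible array member. Below is a minimal sketch, assuming a made-up pear_slice with a single weight_grams field (the article never defines one); sliced_pear_size and sliced_pear_new are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the article's pear_slice type. */
typedef struct { int weight_grams; } pear_slice;

/* C99 flexible array member: the unsized array must be the last field. */
typedef struct {
    int slice_count;
    pear_slice slices[];
} sliced_pear;

/* Runtime size in bytes of a pear holding `count` slices. */
size_t sliced_pear_size(int count) {
    assert(count >= 0);
    return sizeof(sliced_pear) + (size_t)count * sizeof(pear_slice);
}

/* Allocates a zeroed pear, or returns NULL on allocation failure.
   The caller releases it with free(). */
sliced_pear *sliced_pear_new(int count) {
    sliced_pear *pear = calloc(1, sliced_pear_size(count));
    if (pear != NULL) {
        pear->slice_count = count;
    }
    return pear;
}
```

Note the two concessions C demands: the runtime-sized array has to be the last field, and the struct only really lives behind a pointer.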

In fact, types would have if-s and else-s, just like normal functions!

It's like those fancy ADT enums in functional programming languages!

Except, now you can customize them to your heart's content. No more fighting the syntax to share state between state machine states!

struct expression {
  int type;
  if (type == 0) {
    int constant_value;
  } else if (type == 1) {
    expression addend;
    expression base;
  } else if (type == 2) {
    expression minuend;
    expression subtrahend;
  } else if (type == 3) {
    ...
  }
};
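
To get a feel for what such a struct means in bytes, here is a rough C sketch that walks an inline, variable-sized encoding of expression. Everything about the encoding is an assumption made for illustration: a one-byte tag, a native-endian int32_t constant, and only the first three type values. The names expression_size and expression_eval are hypothetical, and the code trusts its input to be well-formed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed tag values, mirroring type == 0, 1, 2 in the struct above. */
enum { EXPR_CONSTANT = 0, EXPR_SUM = 1, EXPR_DIFFERENCE = 2 };

/* Returns the encoded size in bytes of the expression starting at `p`.
   The size is only known after looking inside, tag by tag. */
size_t expression_size(const unsigned char *p) {
    if (p[0] == EXPR_CONSTANT) {
        return 1 + sizeof(int32_t);
    }
    assert(p[0] == EXPR_SUM || p[0] == EXPR_DIFFERENCE);
    size_t first = expression_size(p + 1);
    size_t second = expression_size(p + 1 + first);
    return 1 + first + second;
}

/* Evaluates the expression starting at `p`. */
int32_t expression_eval(const unsigned char *p) {
    if (p[0] == EXPR_CONSTANT) {
        int32_t value;
        memcpy(&value, p + 1, sizeof value);
        return value;
    }
    assert(p[0] == EXPR_SUM || p[0] == EXPR_DIFFERENCE);
    const unsigned char *first = p + 1;
    /* The second operand starts wherever the first one happens to end. */
    const unsigned char *second = first + expression_size(first);
    int32_t a = expression_eval(first);
    int32_t b = expression_eval(second);
    return p[0] == EXPR_SUM ? a + b : a - b;
}
```

Finding the second operand already requires measuring the first one, which is a small preview of the indexing trouble discussed later.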

Also, it wouldn't be hard to imagine an array type defined in terms of for loops...

...though there might be some questions regarding accessing the elements of such arrays.

struct array(struct inner, int length) {
  for (int i = 0; i < length; i ++) {
    inner element;
  }
};
// Later...
array(int, 4) my_array;
my_array.element // Er... um, they all have the same name?
// TODO: make a Duff's device-like coroutine using structs

The only limitation of our imagined system would be that the size of a field's type should be determined entirely by variables that come before it.

struct zip_file {
  for (int i = 0; i < directory.files_count; i ++) {
    array(
      byte,
      directory.files[i + 1].offset - directory.files[i].offset
    ) file_data;
  }
  // Oops: the file sizes come from the directory, but the directory comes after the files.
  central_directory directory;
};
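
Real ZIP readers dodge this circularity by starting from the back of the file: they look for the end-of-central-directory record, which begins with the bytes 50 4B 05 06 and is at least 22 bytes long. A minimal sketch of that backwards scan follows; the function name is hypothetical, the whole archive is assumed to be in memory, and a real reader would go on to validate the record's fields.

```c
#include <assert.h>
#include <stddef.h>

/* Scans backwards for the end-of-central-directory signature.
   Returns the record's offset, or -1 if no candidate is found. */
long find_end_of_central_directory(const unsigned char *data, size_t length) {
    const size_t min_record_size = 22;
    assert(data != NULL);
    if (length < min_record_size) {
        return -1;
    }
    /* The first candidate is the last offset where a full record fits. */
    for (size_t i = length - min_record_size + 1; i-- > 0;) {
        if (data[i] == 0x50 && data[i + 1] == 0x4B &&
            data[i + 2] == 0x05 && data[i + 3] == 0x06) {
            return (long)i;
        }
    }
    return -1;
}
```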

And perhaps we should forbid using mutable global variables in type definitions.

Otherwise, it'd be far too easy to cause buffer overflows! And this is not C!

int my_very_special_number = 4;
struct dont_do_this {
  array(int, my_very_special_number) my_ints;
};
// Later...
dont_do_this a;
my_very_special_number = 5; // xoxo
a.my_ints[4]                // -- an adjacent memory buffer
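
One way out, sketched here in plain C, is to have each instance snapshot its length at creation and check indices against that snapshot instead of the global. The int_buffer type and its helpers are hypothetical names, not something the imagined language provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct {
    int length;    /* captured at creation; the global is never consulted again */
    int values[];  /* C99 flexible array member */
} int_buffer;

/* Allocates a zeroed buffer of `length` ints, or returns NULL on failure.
   The caller releases it with free(). */
int_buffer *int_buffer_new(int length) {
    assert(length >= 0);
    int_buffer *buffer =
        calloc(1, sizeof(int_buffer) + (size_t)length * sizeof(int));
    if (buffer != NULL) {
        buffer->length = length;
    }
    return buffer;
}

/* Bounds-checked read: returns false instead of touching adjacent memory. */
bool int_buffer_get(const int_buffer *buffer, int index, int *out) {
    if (index < 0 || index >= buffer->length) {
        return false;
    }
    *out = buffer->values[index];
    return true;
}
```

With this, bumping my_very_special_number to 5 after creating a buffer of 4 changes nothing: index 4 is still rejected.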

Granted, there are a few affordances we would lose out on...

For one, we wouldn't be able to get the offsets of elements inside structs any more—at least not without an instance of the struct.

struct apple {
  int slice_count;
  array(apple_slice, slice_count) slices;
  color color;
};

apple my_apple;
color* x = &my_apple.color;                              // allowed: taking a pointer
x = (color*)((char*)&my_apple + offsetof(apple, color)); // not allowed: offsetof
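
In today's C, the stand-in for that missing offsetof is a function that takes an instance and computes the offset at runtime. A minimal sketch, assuming a packed layout (header, then slices, then color) and made-up definitions for apple_slice and color; apple_header, apple_color_offset and apple_color are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the article's types. */
typedef struct { unsigned char rgb[3]; } color;
typedef struct { int weight_grams; } apple_slice;

/* Fixed-size prefix; the slices and then the color follow it in memory. */
typedef struct { int slice_count; } apple_header;

/* The offset of the color field depends on this particular apple. */
size_t apple_color_offset(const apple_header *apple) {
    assert(apple->slice_count >= 0);
    return sizeof(apple_header) +
           (size_t)apple->slice_count * sizeof(apple_slice);
}

/* Pointer to the color of this apple, found via the runtime offset. */
color *apple_color(apple_header *apple) {
    return (color *)((unsigned char *)apple + apple_color_offset(apple));
}
```

The offset stops being a property of the type and becomes a property of each value.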

...but it's not like anyone is using that for anything important, right? [citation needed]

Also, we would lose that old hack of multiplying element size by index to get an array element.

Instead, we would walk along the array to find out the total size of the elements before it, just like our great-grandparents' linked lists, in O(n) time.

array(apple, 175) a_bushel;

a_bushel[60].color // okay
(apple*)((void*)&a_bushel + sizeof(apple) * 60) // sizeof which apple?

(But it's just O(n) runtime, so it's not like it matters, right? Developer time gains here are insane!)
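
The walk looks roughly like this in C. This is a sketch under an assumed layout, where each record is an int slice_count followed by that many one-byte slices; record_size and record_at are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Size in bytes of the record starting at `record`. */
size_t record_size(const unsigned char *record) {
    int slice_count;
    memcpy(&slice_count, record, sizeof slice_count);
    assert(slice_count >= 0);
    return sizeof slice_count + (size_t)slice_count;
}

/* Finds record number `index` by measuring every record before it,
   so the cost grows linearly with `index`. */
const unsigned char *record_at(const unsigned char *records, size_t index) {
    const unsigned char *cursor = records;
    for (size_t i = 0; i < index; i++) {
        cursor += record_size(cursor);
    }
    return cursor;
}
```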

Okay, okay, finee... maybe that last one was actually important. O(n) time for common operations like array-index access sounds bad...

However, any sufficiently smart compiler would be able to transform it to the equivalent old O(1) time. Obviously.

(Compilers that fail to do so are left as an exercise for the reader.)

[[please_dont_make_me_solve_the_halting_problem]]
[[i_beg]]
[[i_promise_max_size_is(40)]]
struct apple {
  // ...
};
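
If the compiler did take that max-size promise at face value, the boring way to get O(1) indexing back would be to reserve the maximum for every element. A sketch in C, assuming the limit of 40 bytes from the attribute above; apple_slot, apple_store and apple_offset are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { APPLE_MAX_SIZE = 40 };  /* mirrors i_promise_max_size_is(40) */

/* Every slot reserves the maximum, however small the apple inside is. */
typedef struct {
    unsigned char used;                   /* payload bytes actually in use */
    unsigned char bytes[APPLE_MAX_SIZE];
} apple_slot;

/* Stores `size` payload bytes in slot `index`.
   The promise is checked here rather than trusted. */
void apple_store(apple_slot *bushel, size_t index,
                 const void *payload, size_t size) {
    assert(size <= APPLE_MAX_SIZE);
    bushel[index].used = (unsigned char)size;
    memcpy(bushel[index].bytes, payload, size);
}

/* O(1) again: the byte offset of a slot is a single multiplication. */
size_t apple_offset(size_t index) {
    return index * sizeof(apple_slot);
}
```

The price is that every apple now costs as much memory as the largest one, which is more or less where sized types started.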

But with all of that implemented, we might finally achieve...

// A variable-sized type
struct sliced_pear {
  int slice_count;
  array(pear_slice, slice_count) slices;
};

// Two variable-sized fields next to each other in memory
struct fruit_plate {
  sliced_pear a;
  sliced_pear b;
};
fruit_plate plate;

// A few pointers to the variable-sized type
sliced_pear* a_pointer = &plate.a;
sliced_pear* b_pointer = &plate.b;

a_pointer->slice_count++;  // Then, modifying one pointer...
&plate.b != b_pointer     // ...has to shift the other one!
                          // Oops. :o)
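
The same accident can be staged in ordinary C with a byte buffer. This is a sketch under an assumed layout, where each pear is an int slice_count followed by that many one-byte slices and the two pears sit back to back; pear_size, plate_b and plate_grow_a are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Size in bytes of the pear starting at `pear`. */
size_t pear_size(const unsigned char *pear) {
    int slice_count;
    memcpy(&slice_count, pear, sizeof slice_count);
    assert(slice_count >= 0);
    return sizeof slice_count + (size_t)slice_count;
}

/* Address of pear b, recomputed from the current size of pear a. */
unsigned char *plate_b(unsigned char *plate) {
    return plate + pear_size(plate);
}

/* Appends one slice to pear a, sliding pear b one byte forward.
   The caller guarantees that the plate buffer has a spare byte. */
void plate_grow_a(unsigned char *plate, unsigned char new_slice) {
    unsigned char *b = plate_b(plate);
    memmove(b + 1, b, pear_size(b));  /* make room by moving pear b */
    *b = new_slice;                   /* the new slice takes b's old first byte */
    int slice_count;
    memcpy(&slice_count, plate, sizeof slice_count);
    slice_count++;
    memcpy(plate, &slice_count, sizeof slice_count);
}
```

Any pointer to pear b saved before plate_grow_a now points one byte short, into the tail of pear a, and nothing in the type system notices.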

Enlightenment?

After all, a type that changes size is only safe when wrapped in a pointer.

...Or perhaps at the end of a struct.

But never in the middle.

The middle is dangerous. If the size of a field changes, it invalidates all pointers to subsequent fields, easily overflows/underflows into them, and makes for all kinds of exciting and novel bugs.

(Image: buildings collapsing because the ground is changing size beneath them.)

In conclusion

Don't blame your favorite compiler for still not fully supporting ?Sized in 2025.

(2025!!)

Thank the Lord instead.


Unsized types are nasty.

[[tightly_packed]]
// Yay, recursive type parameters!
struct int(int(_) bits = 64) {
  array(bool, bits) value;
};

int(max(size_a, size_b) + 1) add(int(size_a) a, int(size_b) b) {
  // Wait, if we do + 1 in the output type definition,
  // isn't this recursive too?
}

This presentation-article has been my second post of #100DaysToOffload.

Thanks for browsing; hope you enjoyed the C/Rust humor!

As a curious note, ImHex's pattern language does have structs with if/else-s and some version of loops, just like the imaginary language described earlier.

...It is, however, a domain-specific language for describing binary data and not a systems programming language.
