
Dump Rust Struct or Enum Memory Representation as Bytes

Evergreen 🌳 - Published November 3, 2023

I often find myself wanting to know how a struct is laid out in memory. Unfortunately, there doesn't seem to be a good way to get this information without adding annoying compiler flags. If you're the kind that prefers printf debugging, like me, you'd probably rather reach for a one-liner that lets you poke around and figure out what's going on for yourself.

Before we get too far I should note: everything that happens after this point should be considered undefined behaviour! I won't be held liable for any segfaults or bugs you might introduce from this dangerous knowledge.

Okay? Well here's that dirty way to dump the memory representation of a Rust type as a slice of bytes:

fn get_memory<'a, T>(input: &'a T) -> &'a [u8] {
	unsafe {
		std::slice::from_raw_parts(
			input as *const _ as *const u8,
			std::mem::size_of::<T>()
		)
	}
}
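As a quick sanity check, here is a minimal sketch that runs the helper on a type whose bytes are easy to predict. It assumes nothing beyond the function above (repeated so the snippet compiles on its own); the u32 value is an arbitrary choice for illustration. A u32 has no padding, so the dump should match what to_ne_bytes reports.

```rust
/// Same helper as above, repeated so this sketch compiles on its own.
fn get_memory<'a, T>(input: &'a T) -> &'a [u8] {
    unsafe {
        std::slice::from_raw_parts(
            input as *const _ as *const u8,
            std::mem::size_of::<T>(),
        )
    }
}

fn main() {
    // Arbitrary example value; a u32 has no padding bytes.
    let value: u32 = 0x0102_0304;
    let bytes = get_memory(&value);

    // The byte order depends on the target's endianness, so compare
    // against the native-endian bytes instead of hard-coding them.
    let expected = value.to_ne_bytes();
    assert_eq!(bytes, &expected[..]);

    println!("u32: {:?}", bytes);
}
```

Comparing against to_ne_bytes rather than a literal array keeps the check valid on both little-endian and big-endian machines.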

Here's what it looks like on a struct:

struct Cat {
	name: &'static str
}

println!("Cat: {:?}", get_memory(&Cat {
	name: "Sir Whiskers",
}));

// Cat:
// [106, 144, 81, 0, 110, 85, 0, 0, 12, 0, 0, 0, 0, 0, 0, 0]

On an enum:

enum Animal {
	Cat {
		name: &'static str
	},

	Dog {
		name: &'static str,
		good_boy: bool
	}
}

println!("{:?}", get_memory(&Animal::Cat {
	name: "Sir Whiskers"
}));

// Animal::Cat:
// [
//   0, 0, 0, 0, 0, 0, 0, 0,
//   106, 144, 81, 0, 110, 85, 0, 0,
//   12, 0, 0, 0, 0, 0, 0, 0
// ]

println!("{:?}", get_memory(&Animal::Dog {
	name: "Cleo",
	good_boy: true
}));

// Animal::Dog:
// [
//   1, 1, 0, 0, 0, 0, 0, 0,
//   146, 144, 81, 0, 110, 85, 0, 0,
//   4, 0, 0, 0, 0, 0, 0, 0
// ]

It works on Zero Sized Types:

struct Ghost;

println!("{:?}", get_memory(&Ghost));
// []

And built-in types like &str:

println!("&'static str: {:?}", get_memory(&"Sir Whiskers"));
// &'static str:
// [106, 144, 81, 0, 110, 85, 0, 0, 12, 0, 0, 0, 0, 0, 0, 0]

Some Observations

It's super fun to explore and figure out for yourself some parts of Rust that you probably wouldn't need to worry about every day. If you're observant you've probably seen some cool stuff in the examples already. Here are a couple of things I spotted:

Struct Sizes

The memory representation of Cat { name: "Sir Whiskers" } is identical to that of the "Sir Whiskers" string literal reference (&str) on its own.

Rust is clever in that all structs are only the size of their component parts. There isn't any superfluous runtime type information or function pointers stored on the struct.

In some cases, however, a struct may be slightly larger than its components depending on their alignment and size. You'll hear more about this later.
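Both claims can be checked with std::mem::size_of, without dumping any bytes. The sketch below is illustrative only: Padded is a hypothetical struct that is not part of the examples above, and the sizes in the comments are what current compilers produce rather than something the default layout formally promises.

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Cat {
    name: &'static str,
}

// Hypothetical struct for illustration: 1 + 4 = 5 bytes of fields,
// but the u32 needs 4-byte alignment, so the size is rounded up.
#[allow(dead_code)]
struct Padded {
    flag: u8,
    count: u32,
}

fn main() {
    // A struct with a single field is no bigger than that field.
    println!("&str:   {} bytes", size_of::<&str>());
    println!("Cat:    {} bytes", size_of::<Cat>());

    // Five bytes of fields end up occupying eight bytes.
    println!("Padded: {} bytes", size_of::<Padded>());
}
```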

&str and &[u8]

Since Cat, Animal::Cat and the &str example all use the same string literal, they get merged together and all end up pointing to the same memory location.

It's important to note as well that the bytes [106, 144, 81, 0, 110, 85, 0, 0, 12, 0, 0, 0, 0, 0, 0, 0] aren't actually the contents of the string. Instead, they are a pair of values: a pointer to the memory location where the string data lives, and the length of the string or slice in bytes. On a 64-bit machine the type is 16 bytes, split into 8 bytes for the pointer and 8 bytes for the length.

If you look carefully, the second half of the bytes starts with 12, which is the length of the string "Sir Whiskers".
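To make the pointer and length halves easier to see, here is a rough sketch that splits the raw bytes of a &str into machine words. The helper name fat_pointer_words is made up for this example. It deliberately does not assume which word comes first, since the field order of a &str is an implementation detail.

```rust
use std::convert::TryInto;
use std::mem::size_of;

/// Hypothetical helper: reads the raw bytes of a `&str` reference and
/// splits them into native-endian, pointer-sized words.
fn fat_pointer_words(s: &&str) -> Vec<usize> {
    let bytes = unsafe {
        std::slice::from_raw_parts(
            s as *const &str as *const u8,
            size_of::<&str>(),
        )
    };

    bytes
        .chunks_exact(size_of::<usize>())
        .map(|chunk| usize::from_ne_bytes(chunk.try_into().unwrap()))
        .collect()
}

fn main() {
    let name = "Sir Whiskers";
    let words = fat_pointer_words(&name);

    // One word should match the address of the string data and the
    // other should match its length in bytes.
    println!("words:   {:x?}", words);
    println!("address: {:x}", name.as_ptr() as usize);
    println!("length:  {:x}", name.len());
}
```

The address will differ from run to run and machine to machine, but the length word should always be c (12 in hexadecimal).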

Enums

The memory representation of the enum examples is a little bit peculiar and slightly harder to decipher than the other simpler examples.

Rust enums are implemented as a tagged union. This means that there is a tag (a single byte in these examples) that records which variant the value is, followed by enough space for any one of the variants to fit. In practice, this means an enum can be massive simply because one of its variants is large.
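The "big variant makes a big enum" effect is easy to see with size_of. Message below is a hypothetical enum invented for this sketch. The exact size is left to the compiler, so the comment only claims a lower bound.

```rust
use std::mem::size_of;

// Hypothetical enum for illustration: one empty variant and one
// variant carrying a 64-byte payload.
#[allow(dead_code)]
enum Message {
    Ping,
    Payload([u8; 64]),
}

fn main() {
    // Every Message value, even Message::Ping, reserves enough room
    // for the largest variant, so this prints at least 64.
    println!("Message: {} bytes", size_of::<Message>());
}
```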

In the Animal::Cat and Animal::Dog examples we can see that the first byte is a 0 when it's a Cat and a 1 when it's a Dog. Weirdly, however, the second byte is 0 for the Cat while our Dog example has a 1. Then there seems to be a bunch of empty space until the ninth byte, where our familiar 16 bytes of string reference appear.

You've probably picked up on it by now, but the reason for the empty space and the extra 1 comes down to alignment.

On a 64-bit target, &[u8] and &str have an alignment of 8 bytes. This means that they can only be stored at addresses that are a multiple of 8. Since the first byte is used for the tag, the next multiple of 8 isn't until the ninth byte, so Rust pads with empty space to make sure everything aligns properly.

The good_boy field on Dog is a bool, which has an alignment of 1. To reduce the overall size of the enum, Rust moves this field from the end, where it was defined, to just after the tag. That's where that seemingly extra 1 comes from.
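The alignment arithmetic can be sketched in a few lines. The round_up helper is a hypothetical function written for this example, and the printed alignment for &str depends on the target (8 on a typical 64-bit machine).

```rust
use std::mem::align_of;

/// Hypothetical helper: rounds `offset` up to the next multiple of
/// `align`. Assumes `align` is a power of two, which is true of every
/// alignment Rust reports.
fn round_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) & !(align - 1)
}

fn main() {
    let str_align = align_of::<&str>();
    let bool_align = align_of::<bool>();

    println!("align_of::<&str>() = {}", str_align);
    println!("align_of::<bool>() = {}", bool_align);

    // After a 1-byte tag at offset 0, the next free offset is 1.
    // A bool can sit right there, but a &str has to wait for the
    // next aligned offset.
    println!("bool can start at offset {}", round_up(1, bool_align));
    println!("&str can start at offset {}", round_up(1, str_align));
}
```

Offset 8 is the ninth byte, which is where the string reference shows up in the dumps above.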

To further demonstrate this, here is that same enum but with a couple more fields added:

enum DetailedAnimal {
    Cat {
        name: &'static str,
        age: u8,
        legs: u8,
        eyes: u8,
    },
    
    Dog {
        name: &'static str,
        age: u8,
        legs: u8,
        eyes: u8,
        good_boy: bool
    }
}


println!("DetailedAnimal::Cat:  {:?}", get_memory(
    &DetailedAnimal::Cat {
        name: "Sir Whiskers",
        legs: 4,
        eyes: 1,
        age: 11,
    }
));

// DetailedAnimal::Cat:
// [
//   0, 11, 4, 1, 0, 0, 0, 0,
//   106, 144, 81, 0, 110, 85, 0, 0,
//   12, 0, 0, 0, 0, 0, 0, 0
// ]

println!("DetailedAnimal::Dog:  {:?}", get_memory(
    &DetailedAnimal::Dog {
        name: "Cleo",
        good_boy: true,
        age: 10,
        legs: 3,
        eyes: 2,
    }
));

// DetailedAnimal::Dog:
// [
//   1, 10, 3, 2, 1, 0, 0, 0,
//   146, 144, 81, 0, 110, 85, 0, 0,
//   4, 0, 0, 0, 0, 0, 0, 0
// ]

As you can see, the extra fields fill up space that would otherwise be empty padding. In fact, despite having more fields, DetailedAnimal is exactly the same size as Animal.
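If you'd rather confirm the sizes than count bytes by hand, something like the following should do it. The enum definitions are copied from the examples above. The default layout isn't formally guaranteed, so treat the result as what current compilers happen to produce.

```rust
use std::mem::size_of;

#[allow(dead_code)]
enum Animal {
    Cat { name: &'static str },
    Dog { name: &'static str, good_boy: bool },
}

#[allow(dead_code)]
enum DetailedAnimal {
    Cat {
        name: &'static str,
        age: u8,
        legs: u8,
        eyes: u8,
    },
    Dog {
        name: &'static str,
        age: u8,
        legs: u8,
        eyes: u8,
        good_boy: bool,
    },
}

fn main() {
    // The extra u8 fields land in bytes that were previously padding,
    // so both lines should report the same number.
    println!("Animal:         {} bytes", size_of::<Animal>());
    println!("DetailedAnimal: {} bytes", size_of::<DetailedAnimal>());
}
```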

Homework

I've created a Rust playground with all these examples if you're keen to have a bit of a dig around: link to playground. If you want to explore a little further here are some ideas for poking around the memory:

  • What does adding #[repr(C)] to your struct/enum do?
  • What happens if your enum has variants of very different sizes?
  • What does a None or a Some look like for Option<T>?
  • What does memory that is not being used look like? Is it zeroed out?

You can also read much more detailed explanations on all the concepts mentioned here:

Have fun!


Bennett is a Software Engineer working at CipherStash. He spends most of his day playing with TypeScript and his nights programming in Rust. You can follow him on Github or Twitter.
This work by Bennett Hardwick is licensed under CC BY-NC-SA 4.0.