Endings and Beginnings
I wrote a bootloader (see the first post for why). Let’s see what has happened and what is yet to come.
UEFI-compatible firmware thankfully provides abstractions that make writing bootloaders very straightforward and high-level in comparison to the legacy BIOS API.
uefi-rs maps them nicely to Rust data structures and methods.
The Rust support for this environment is rather good,
although some not-so-frequently-used features are not (yet)
supported (see a previous post for more details).
But that’s not all, sadly. The machine state specified by Multiboot still requires a bit of x86-specific assembly code, mainly on x86_64 systems.
towboot shows that it’s possible to load and boot operating systems on present-day hardware by using Rust, thus gaining type- and memory-safety[1] and dependency management in comparison to just using C[2].
I successfully tested towboot by booting a simple test kernel (as described in the previous post), hhuOS and GNU HURD. It loaded the kernels and modules, passed the correct command lines, memory maps and video information, and had the CPU in the correct state when jumping into the kernel.
I also tried booting FlingOS, but that failed — at least in parts because the memory layout described in the kernel’s ELF headers was incompatible with the one provided by OVMF at boot (and the workaround I described in a previous post that worked for GNU HURD did not work in this case).
towboot is pretty simple and there are many things that could be added or improved:
towboot currently only supports kernels that follow version 1 of the Multiboot specification and only provides this version of the boot information to kernels.
Support for version 2 would be possible, useful and relatively easy to add, although we’d need a library that supports it[3].
With support for Multiboot 2, towboot could pass a pointer to the UEFI System Table to compatible kernels which would enable them to access UEFI variables, for instance. Also, ACPI, SMBIOS and all that stuff.
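The Multiboot 2 boot information format already defines a tag for exactly this. A sketch of the 64-bit variant (the tag type and layout follow the Multiboot 2 specification; the struct name is my own):

```rust
/// Sketch of the Multiboot 2 boot-information tag that a bootloader could
/// use to hand a kernel the UEFI System Table pointer. Tag type 12 is the
/// "EFI 64-bit system table pointer" tag in the Multiboot 2 specification;
/// tags are 8-byte aligned.
#[repr(C, align(8))]
struct Efi64SystemTableTag {
    typ: u32,     // MULTIBOOT_TAG_TYPE_EFI64 = 12
    size: u32,    // size of this tag in bytes
    pointer: u64, // physical address of the UEFI System Table
}

fn main() {
    let tag = Efi64SystemTableTag {
        typ: 12,
        size: core::mem::size_of::<Efi64SystemTableTag>() as u32,
        pointer: 0, // would be the real System Table address at runtime
    };
    println!("tag type {}, {} bytes", tag.typ, tag.size);
}
```

A kernel walking the tag list at boot would find this tag by its type and read the pointer from it.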
Most modern systems come with Secure Boot enabled (see section 32.5 of the UEFI specification); if you have purchased a computer in the last ten years, it probably has it. This prevents UEFI applications that are not properly signed from being loaded (see this document).
So, what do we do about this? We could just sign the main bootloader executable with a key that is trusted by the firmware. This would be enough to be able to boot, but it would entirely circumvent the security measures: any Multiboot-compatible kernel could be booted, no matter whether it is correctly signed or not.
Properly implementing Secure Boot would mean adding signatures for at least the kernel[4] to the Multiboot standard in a backwards-compatible way[5] and then requiring the kernels themselves to verify any code loaded into kernel space at runtime. The former could be done either by prepending or appending the signature to the kernel image for Multiboot 1 or by adding a new tag to Multiboot 2.
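To illustrate the appending idea, here is a hypothetical footer format (the magic value and layout are entirely made up, not part of any specification): signature-unaware bootloaders only load the segments described in the ELF headers and ignore trailing bytes, while a signature-aware bootloader could locate and verify the signature from the end of the file:

```rust
// Made-up footer layout: [kernel image][signature][magic (8)][sig_len (8)].
// This is a sketch of one possible backwards-compatible scheme, not an
// existing format.
const SIG_MAGIC: &[u8; 8] = b"TBSIGv1\0"; // invented marker

fn append_signature(image: &mut Vec<u8>, signature: &[u8]) {
    image.extend_from_slice(signature);
    image.extend_from_slice(SIG_MAGIC);
    image.extend_from_slice(&(signature.len() as u64).to_le_bytes());
}

fn extract_signature(image: &[u8]) -> Option<&[u8]> {
    if image.len() < 16 {
        return None;
    }
    // Last 8 bytes: signature length; 8 bytes before that: magic.
    let (rest, len_bytes) = image.split_at(image.len() - 8);
    if &rest[rest.len() - 8..] != SIG_MAGIC {
        return None; // no signature footer present
    }
    let sig_len = u64::from_le_bytes(len_bytes.try_into().unwrap()) as usize;
    let body = &rest[..rest.len() - 8];
    if body.len() < sig_len {
        return None;
    }
    Some(&body[body.len() - sig_len..])
}

fn main() {
    let mut image = b"fake kernel".to_vec();
    append_signature(&mut image, b"fake signature");
    assert_eq!(extract_signature(&image), Some(&b"fake signature"[..]));
    println!("signature round-trip ok");
}
```

An unsigned image simply lacks the magic, so the same loader can handle both cases.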
towboot only supports loading kernels for the 32-bit x86 platform, as specified by Multiboot 1. While x86_64 kernels are only officially supported from Multiboot 2 on, there are bootloaders supporting elf64 Multiboot 1 kernels[6].
In order to add support for loading a 64-bit kernel, we’d need to detect CPU support and, if the firmware is running in 32-bit Protected Mode, switch to Long Mode first[7].
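Detecting CPU support is the easy part. A sketch of the CPUID check (the mode switch itself, with page tables, EFER.LME and a far jump, is omitted; a 32-bit build would use core::arch::x86 instead):

```rust
// Sketch: check whether the CPU supports Long Mode before deciding
// whether an elf64 kernel can be loaded at all.
#[cfg(target_arch = "x86_64")]
fn cpu_supports_long_mode() -> bool {
    use core::arch::x86_64::__cpuid;
    // Extended leaf 0x8000_0001, EDX bit 29 is the "LM" (Long Mode) flag.
    let max_extended = unsafe { __cpuid(0x8000_0000) }.eax;
    if max_extended < 0x8000_0001 {
        return false; // extended leaf not available
    }
    let edx = unsafe { __cpuid(0x8000_0001) }.edx;
    edx & (1 << 29) != 0
}

#[cfg(not(target_arch = "x86_64"))]
fn cpu_supports_long_mode() -> bool {
    false // not applicable on this architecture
}

fn main() {
    println!("long mode supported: {}", cpu_supports_long_mode());
}
```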
Currently, when towboot reads a file, it copies the complete content into a newly allocated buffer. This happens synchronously. Both allocating and reading take a noticeable amount of time (the former only on debug builds).
This could perhaps be improved by reading small chunks at a time. That way, towboot could show some sort of progress display to the user — be it a progress bar or just dots. However, this would decrease the performance and would surely add complexity to the code.
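A sketch of such chunked reading, using std::io for illustration rather than the UEFI file protocol towboot actually goes through:

```rust
use std::fs::File;
use std::io::{Read, Write};

// Sketch: read a file in fixed-size chunks and print one dot per chunk
// as a minimal progress display. The real code would use uefi-rs instead.
fn read_with_progress(path: &str) -> std::io::Result<Vec<u8>> {
    const CHUNK: usize = 1 << 20; // 1 MiB per read
    let mut file = File::open(path)?;
    let mut data = Vec::new();
    let mut buf = vec![0u8; CHUNK];
    loop {
        let n = file.read(&mut buf)?;
        if n == 0 {
            break; // end of file
        }
        data.extend_from_slice(&buf[..n]);
        print!("."); // one dot per chunk
        std::io::stdout().flush()?;
    }
    println!();
    Ok(data)
}

fn main() -> std::io::Result<()> {
    // Demo: write a 3 MiB dummy "module" to a temp file and read it back.
    let path = std::env::temp_dir().join("towboot_demo_module.bin");
    std::fs::write(&path, vec![0xAAu8; 3 * (1 << 20)])?;
    let data = read_with_progress(path.to_str().unwrap())?;
    println!("read {} bytes", data.len());
    Ok(())
}
```

The trade-off mentioned above shows up here directly: every chunk boundary costs an extra call into the firmware, and the progress printing itself takes time.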
UEFI systems with more than one CPU also support running code simultaneously[8], but interfacing with the file system is only possible on the so-called bootstrap processor[9], so it cannot be done in parallel.
But with a release build, loading modules of about 30 megabytes happens almost immediately, so there is not much to optimize, anyway.
Currently, towboot follows file paths from the root of the partition it has been loaded from. With some additional logic, it should be possible to allow for paths relative to the UEFI shell’s current working directory or to the path of the configuration file, or for absolute paths including the file system identifier (HD0a1:, for instance).
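Parsing such paths could look like this sketch; the function name is mine, and how an identifier like HD0a1: would be resolved to an actual UEFI file system handle is left open:

```rust
/// Sketch: split an optional file-system identifier off a path.
/// "HD0a1:\towboot.toml" -> (Some("HD0a1"), "\towboot.toml")
/// "kernels\hhuOS.elf"   -> (None, "kernels\hhuOS.elf")
fn split_fs_path(input: &str) -> (Option<&str>, &str) {
    match input.split_once(':') {
        // only treat the prefix as an identifier if it looks like one
        // (non-empty, no path separators before the colon)
        Some((fs, path)) if !fs.is_empty() && !fs.contains('\\') && !fs.contains('/') => {
            (Some(fs), path)
        }
        _ => (None, input),
    }
}

fn main() {
    assert_eq!(
        split_fs_path("HD0a1:\\towboot.toml"),
        (Some("HD0a1"), "\\towboot.toml")
    );
    assert_eq!(
        split_fs_path("kernels\\hhuOS.elf"),
        (None, "kernels\\hhuOS.elf")
    );
    println!("path parsing ok");
}
```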
That’s what I did in 2020/2021 when I wrote towboot as part of my bachelor’s thesis. If you haven’t already, please take a look at the other blogposts in this series and at the code.
I’d also like to use this space to thank the Operating Systems research group at HHU who let me do all of this. :) Also thanks to everyone who proofread these texts and to all of the people who endured listening to me talking about UEFI and Rust.
So, what’s next?
Again, thanks to the Operating Systems research group at HHU, I’m currently working on Multiboot 2 support. And you’ll probably get blogposts about that soon-ish. (Hopefully at least not in two years.) So stay tuned.
[1] Interfacing with some of the UEFI APIs, memory management and jumping to the kernel requires a few pieces of unsafe code.
[2] Also, the tooling is much better: you get access to better IDE integration, linting, unit testing, documentation and much, much more.
[3] There are currently two crates for Multiboot 2 on crates.io, but they do not support setting values yet.
[4] The modules could be signed, too, but just signing the kernel would bring about the level of security modern Linux distributions provide.
[5] Meaning that bootloaders that are not aware of signatures should still be able to load signed kernels.
[7] You might ask: “But wait, doesn’t the firmware always run in the CPU’s native mode? Why would you need to check CPU support? Why would you need to switch modes?” That would be wrong: yes, the firmware mostly runs in the CPU’s native mode, but that’s not always the case.
[9] This is just a fancy name for the first CPU.