
Running cargo test on Bare Metal: Adding libtest Support to GNAT Pro for Rust
Testing embedded software is notoriously awkward. The code runs on hardware you may not have on your desk, the usual tooling assumes an operating system, and the feedback loop is slow. Rust's #[test] attribute and cargo test are indispensable in hosted environments, but what does it take to make them work on a bare-metal target? This post walks through how we added libtest support to GNAT Pro for Rust using the AMD Zynq UltraScale+ MPSoC as the concrete example.
The adacore-zynqmp crate provides Rust support for the Zynq UltraScale+ MPSoC (ZynqMP), described in a previous blog post. It targets aarch64-unknown-none, a bare-metal AArch64 target with no operating system. One of its features is std support via newlib, a C standard library designed for embedded systems. Newlib provides the POSIX-like syscall interface that Rust's standard library expects, and the crate supplies the missing pieces as Rust functions with C linkage: write() for UART output, sbrk() for heap allocation, and _exit() for program termination.
With the std feature enabled, you can write idiomatic Rust that uses String, Vec, println!, and the rest of the standard library, and it runs on bare metal. The natural next question is: can you also use #[test]?
Why libtest
For a target that already has std support via newlib, libtest is the least-friction option. You get the familiar #[test] attribute, assert_eq!, timing output, and cargo test integration, all without any additional dependencies or custom harness code. Tests look exactly like they would for any hosted Rust crate.
The alternatives each have drawbacks on ZynqMP:
Custom test harness. Rust lets you opt out of the built-in test harness by setting harness = false in Cargo.toml and supplying your own test harness. This gives full control but means writing the harness itself: discovering tests, running them, formatting output. It is the right choice for #[no_std] targets, but on a target that already has std, it means reinventing what you already have.
defmt-test. The defmt-test crate provides a #[defmt_test::tests] attribute macro that generates a test harness suitable for no_std targets. It integrates with probe-rs for flashing and output, and is a popular choice for Cortex-M targets. It is not applicable to AArch64 targets like the ZynqMP, where probe-rs has no support.
embedded-test. The embedded-test crate is a more architecture-flexible alternative to defmt-test. It supports ARM, RISC-V, and Xtensa targets, and uses semihosting to communicate test results back to the host via probe-rs. Similar to defmt-test, it requires probe-rs for flashing, so it also does not apply to the ZynqMP.
Beyond these constraints, defmt-test and embedded-test both assume a hardware-in-the-loop setup: a physical device connected via a debug probe. The libtest + QEMU approach has no such requirement, which makes it more practical for CI environments and for developers who do not have a ZynqMP board on their desk.
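To make the cost of the custom-harness option concrete, the following is a minimal sketch of the work a hand-written harness takes on: keeping a table of tests, running them, and formatting the output. The names TestCase, Summary, and run_all are hypothetical and do not come from any existing crate. The sketch uses std (println!, String) so that it runs on a host, whereas a real no_std harness would write to its own output channel.

```rust
/// A hypothetical test table entry: a name plus a function that reports
/// failure through its return value instead of panicking.
pub struct TestCase {
    pub name: &'static str,
    pub run: fn() -> Result<(), String>,
}

/// Totals collected over one harness run.
#[derive(Debug, PartialEq, Eq)]
pub struct Summary {
    pub passed: usize,
    pub failed: usize,
}

/// Runs every test sequentially on the calling thread, prints one line per
/// test, and returns the totals.
pub fn run_all(tests: &[TestCase]) -> Summary {
    println!("running {} tests", tests.len());
    let mut summary = Summary { passed: 0, failed: 0 };
    for test in tests {
        match (test.run)() {
            Ok(()) => {
                println!("test {} ... ok", test.name);
                summary.passed += 1;
            }
            Err(reason) => {
                println!("test {} ... FAILED ({})", test.name, reason);
                summary.failed += 1;
            }
        }
    }
    summary
}

/// Example test that passes.
pub fn addition_works() -> Result<(), String> {
    if 2 + 2 == 4 {
        Ok(())
    } else {
        Err("2 + 2 != 4".to_string())
    }
}

/// Example test that fails, to exercise the failure path.
pub fn always_fails() -> Result<(), String> {
    Err("intentional failure".to_string())
}
```

Even this small version has to maintain the test table by hand, which is the discovery step that #[test] and libtest otherwise provide for free.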
What libtest Actually Needs
The appeal of libtest is clear, but "in principle available" is not the same as working out of the box. Enabling it required understanding exactly what libtest depends on at runtime, and filling in the gaps.
Threads
libtest's default execution model spawns each test in its own thread. This serves two purposes: panics in one test do not abort the entire suite, and tests can run in parallel. On bare metal with no thread support, this fails immediately.
The solution already existed in libtest for other constrained targets. libtest has an explicit list of targets that skip thread-per-test execution: emscripten, wasm, and zkvm. Adding target_os = "none" with target_env = "newlib" to that list was a small addition in library/test/src/lib.rs:
let supports_threads = !cfg!(target_os = "emscripten")
    && !cfg!(target_family = "wasm")
    && !cfg!(target_os = "zkvm")
    && !(cfg!(target_os = "none") && cfg!(target_env = "newlib"));
With threading disabled, tests run sequentially in the main thread. Panics from assert! macros still propagate, but they terminate the test binary at the first failure rather than being confined to the failing test. This is a reasonable trade-off on bare metal: the system state after a panic is unreliable regardless, and the failing test is still identified in the output before the binary exits.
The same constraint rules out #[should_panic] in its intended use. Without thread isolation, libtest cannot catch a panic from a test function; on this target, the panic strategy is abort, so catch_unwind does not catch anything either. When a #[should_panic] test actually panics (the case the attribute is designed for), the panic aborts the entire suite, and libtest never observes it to mark the test as passing. The opposite case is handled correctly: a #[should_panic] test that does not panic is still reported as a failure, since libtest only needs to notice that the function returned normally. Nothing warns you at compile time.
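One way to keep error paths covered on such a target is to test a fallible function through its return value rather than through an expected panic. The sketch below assumes a hypothetical function percentage(), which is not part of adacore-zynqmp, and shows a test that asserts on the error value where a hosted project might have reached for #[should_panic].

```rust
/// Hypothetical function under test: converts a ratio to a percentage and
/// reports invalid input as an error instead of panicking.
pub fn percentage(part: u32, whole: u32) -> Result<u32, &'static str> {
    if whole == 0 {
        return Err("whole must be non-zero");
    }
    if part > whole {
        return Err("part must not exceed whole");
    }
    // Widen to u64 so that `part * 100` cannot overflow.
    Ok((u64::from(part) * 100 / u64::from(whole)) as u32)
}

#[cfg(test)]
mod tests {
    use super::percentage;

    // No panic is involved, so this test behaves the same with or without
    // thread isolation or unwinding.
    #[test]
    fn zero_whole_is_rejected() {
        assert_eq!(percentage(1, 0), Err("whole must be non-zero"));
    }

    #[test]
    fn quarter_is_twenty_five_percent() {
        assert_eq!(percentage(1, 4), Ok(25));
    }
}
```

This only helps for code that can report failure as a value. Code that must panic by design still cannot be verified with libtest on this target.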
Timing
libtest reports how long each test takes. This requires std::time::Instant, which in turn requires the platform to provide a monotonic clock. For the newlib platform abstraction layer (PAL) in Rust's standard library, the implementation had previously been delegated to the unsupported stub, which panics at runtime:
// library/std/src/sys/pal/unsupported/time.rs
pub fn now() -> Instant {
    panic!("time not implemented on this platform")
}
Making timing work required following the dependency chain all the way down to hardware.
Instant and clock()
The first step was a real time.rs for the newlib PAL. The POSIX clock() function nominally returns the amount of processor time consumed by the calling process, expressed in clock ticks of CLOCKS_PER_SEC Hz. On a bare metal target with no scheduler and a single thread of execution, "processor time" and elapsed wall-clock time coincide, so clock() is a suitable source of per-test durations for libtest:
// library/std/src/sys/pal/newlib/time.rs
unsafe extern "C" {
    pub safe fn clock() -> libc::clock_t;
}
impl Instant {
    pub fn now() -> Instant {
        // Reject values that cannot be represented as u64 microseconds.
        let Ok(clk) = clock().try_into() else {
            panic!("failed to determine processor time")
        };
        // By convention, the value returned by clock() is in microseconds.
        Instant(Duration::from_micros(clk))
    }
    // ...
}
The newlib PAL interprets the value returned by clock() as microseconds via Duration::from_micros(clk). This is a convention between the Rust PAL and whatever C-side times() implementation provides the underlying counter, not something derived from newlib's CLOCKS_PER_SEC macro. The contract is that clock() returns microseconds; any times() provider used with this PAL must report tms_utime in microseconds to match.
Newlib's clock() cooperates with this convention: it calls the POSIX times() function and returns the sum of all four CPU-time fields of the tms struct (user, system, and the corresponding children fields), without further scaling. Setting only tms_utime and leaving the others at zero, as the implementation below does, makes that sum equal to tms_utime, so the microsecond value flows through unchanged.
One real consequence: if newlib's CLOCKS_PER_SEC macro on this target is not 1,000,000, C code that calls clock() and divides by CLOCKS_PER_SEC to obtain seconds will compute a wrong result, because our clock() returns microseconds regardless of what the macro says. The Rust call chain is internally consistent at microsecond resolution; the divergence is a known limitation of the C interface.
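The divergence can be shown with a few lines of arithmetic. In the sketch below, rust_view() mirrors what the PAL does with a raw clock() value, and c_view_seconds() mirrors what portable C code would do with the same value. The clocks_per_sec parameter stands in for the C macro, and the value 100 used in the usage note is purely hypothetical; the actual value on a given target has to be checked in its newlib headers.

```rust
use std::time::Duration;

/// What the Rust PAL does: treat the raw `clock()` value as microseconds.
pub fn rust_view(clock_value: u64) -> Duration {
    Duration::from_micros(clock_value)
}

/// What portable C code does: divide by `CLOCKS_PER_SEC` to obtain whole
/// seconds. `clocks_per_sec` stands in for the C macro.
pub fn c_view_seconds(clock_value: u64, clocks_per_sec: u64) -> u64 {
    clock_value / clocks_per_sec
}
```

For a raw value of 2_500_000, rust_view() yields 2.5 seconds. c_view_seconds() agrees (2 whole seconds) only when clocks_per_sec is 1_000_000; with a hypothetical macro value of 100 it reports 25_000 seconds.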
times() and the Hardware Counters
POSIX times() returns the elapsed real time since an arbitrary point in the past and fills in a tms struct with the calling process's user and system CPU time, all in clock ticks. POSIX reports the tick rate of times() through sysconf(_SC_CLK_TCK) rather than CLOCKS_PER_SEC, but because newlib's clock() passes the tms fields through unscaled, the two units are effectively the same here. The adacore-zynqmp crate needed to implement times() using what the ZynqMP actually provides. The AArch64 architecture includes a generic timer with two system registers: cntvct_el0 (the current counter value) and cntfrq_el0 (the counter frequency in Hz). These are readable without any OS involvement:
use core::arch::asm;
/// Reads the current value of the generic timer's virtual counter.
#[inline]
fn cntvct() -> u64 {
    let value: u64;
    // SAFETY: reading the counter register has no side effects.
    unsafe { asm!("mrs {0}, cntvct_el0", out(reg) value); }
    value
}
/// Reads the frequency of the generic timer in Hz.
#[inline]
fn cntfrq() -> u64 {
    let freq: u64;
    // SAFETY: reading the frequency register has no side effects.
    unsafe { asm!("mrs {0}, cntfrq_el0", out(reg) freq); }
    freq
}
The times() implementation uses these to fill in the tms struct. To honor the Rust PAL's microsecond convention, it scales the raw counter to 1_000_000 ticks per second and stores the result in tms_utime, leaving the other three fields at zero. Newlib's clock() then sums the four fields and returns the microsecond value unchanged. The u128 arithmetic avoids overflow in the intermediate product:
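#[unsafe(no_mangle)]
extern "C" fn times(buf: *mut tms) -> c_long {
    const MICROS_PER_SEC: u128 = 1_000_000;
    // Scale the raw counter to microseconds; u128 keeps the product from overflowing.
    let ticks = (cntvct() as u128) * MICROS_PER_SEC / (cntfrq() as u128);
    // Report failure if the result does not fit in the return type.
    let Ok(ticks) = c_long::try_from(ticks) else {
        return -1;
    };
    // Only tms_utime carries the value, so the sum computed by newlib's clock() equals it.
    unsafe {
        (*buf).tms_utime = ticks;
        (*buf).tms_stime = 0;
        (*buf).tms_cutime = 0;
        (*buf).tms_cstime = 0;
    }
    ticks
}
The full dependency chain, from libtest down to silicon: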
libtest
└─ std::time::Instant::now()
   └─ clock() [POSIX, provided by newlib]
      └─ times() [POSIX syscall, implemented in adacore-zynqmp]
         └─ cntvct_el0 / cntfrq_el0 [AArch64 hardware registers]
The example crate includes a regression test that compares Instant::elapsed() directly against the elapsed cntvct_el0 ticks, which exercises the entire chain end to end.
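The scaling step at the bottom of this chain is plain integer arithmetic, so it can be checked on a host without the hardware registers. The sketch below is not the crate's code: counter_to_micros() mirrors the u128 computation in times() with i64 standing in for c_long, counter_to_micros_u64() shows where a 64-bit product would overflow, and agrees_within() is a hypothetical helper of the kind an end-to-end comparison might use. The 100 MHz frequency in the usage note is an assumed value for illustration only.

```rust
use std::time::Duration;

const MICROS_PER_SEC: u128 = 1_000_000;

/// Scales a raw counter value to microseconds using a u128 intermediate.
/// Returns `None` if the frequency is zero or the result does not fit in
/// an `i64` (standing in for `c_long` on a 64-bit target).
pub fn counter_to_micros(count: u64, freq_hz: u64) -> Option<i64> {
    if freq_hz == 0 {
        return None;
    }
    let micros = u128::from(count) * MICROS_PER_SEC / u128::from(freq_hz);
    i64::try_from(micros).ok()
}

/// The same computation in 64-bit arithmetic. The multiplication overflows
/// once `count` exceeds `u64::MAX / 1_000_000`, which `checked_mul` reports
/// as `None`.
pub fn counter_to_micros_u64(count: u64, freq_hz: u64) -> Option<u64> {
    count.checked_mul(1_000_000)?.checked_div(freq_hz)
}

/// Returns true if two measurements of the same interval differ by at most
/// `tolerance`.
pub fn agrees_within(a: Duration, b: Duration, tolerance: Duration) -> bool {
    let diff = if a > b { a - b } else { b - a };
    diff <= tolerance
}
```

With an assumed 100 MHz counter, 250_000_000 ticks scale to 2_500_000 microseconds. For a counter value of u64::MAX, the u128 version still produces a result, while the 64-bit version reports overflow.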
Exit Code Propagation
For CI, it is not enough that test results appear in terminal output. The test runner must exit with a non-zero status when tests fail and must not hang when a test panics. The semihosting feature (which implies std) addresses both problems directly. _exit() uses a semihosting SYS_EXIT call rather than a soft reset: success maps to ADP_STOPPED_APPLICATION_EXIT and any non-zero status maps to ADP_STOPPED_RUN_TIME_ERROR. QEMU exits with code zero for the former and non-zero for the latter, so cargo test can detect failures from the process exit status rather than parsing output.
The same feature also avoids a second failure mode. Without semihosting, a panicking test traps into the default exception handler, which spins in a busy loop and leaves QEMU running until the test runner kills it on timeout. With semihosting enabled, the default exception handler calls _exit(-1) instead, terminating QEMU immediately with a non-zero exit code.
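The status-to-reason mapping described in this section can be sketched as a small pure function. This is an illustration, not the crate's implementation. The constant names follow the ones used in this post, the numeric values are the reason codes the Arm semihosting specification assigns to ADP_Stopped_ApplicationExit and ADP_Stopped_RunTimeErrorUnknown, and the trap instruction that actually issues SYS_EXIT is omitted.

```rust
/// Reason code for a normal application exit (Arm semihosting specification).
pub const ADP_STOPPED_APPLICATION_EXIT: u64 = 0x20026;
/// Reason code for an unspecified run-time error (Arm semihosting specification).
pub const ADP_STOPPED_RUN_TIME_ERROR: u64 = 0x20023;

/// Maps a C-style exit status to the semihosting reason code passed to
/// `SYS_EXIT`: zero means success, anything else is reported as an error.
pub fn exit_reason(status: i32) -> u64 {
    if status == 0 {
        ADP_STOPPED_APPLICATION_EXIT
    } else {
        ADP_STOPPED_RUN_TIME_ERROR
    }
}
```

Both an ordinary failing status such as 1 and the -1 used by the exception handler map to the error reason, which is all a CI job needs in order to see a non-zero exit from QEMU.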
UART Initialization
libtest writes output (the test count, each test name, the final summary) immediately, before any user code in the test binary has a chance to run. Test output goes to UART, so if UART requires explicit initialization before first use, the earliest output is lost.
Previously, adacore-zynqmp expected the application to initialize UART before writing to it, which is fine for ordinary binaries that define their own entry point. In a test binary, however, libtest generates the entry point and starts printing before any application code runs, so there is no opportunity to perform that initialization. The fix was to move initialization inside the UART accessor itself, guarded by spin::Once: the first call performs the hardware setup, subsequent calls return immediately. With this change, the first byte libtest writes triggers initialization transparently.
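The initialize-on-first-access pattern can be demonstrated on a host with std::sync::OnceLock, used here as a stand-in for spin::Once so that the example has no external dependencies. The Uart struct, the baud rate, and the init counter are hypothetical and only serve to make the behavior observable. On the real target the setup function would program the UART registers.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how many times the simulated hardware setup has run.
static INIT_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the UART driver state.
pub struct Uart {
    pub baud_rate: u32,
}

/// Simulated hardware setup. It records that it ran so the one-time
/// behavior can be checked.
fn init_hardware() -> Uart {
    INIT_COUNT.fetch_add(1, Ordering::SeqCst);
    Uart { baud_rate: 115_200 }
}

/// Accessor that performs the setup on the first call and returns the
/// already initialized instance on every later call.
pub fn uart() -> &'static Uart {
    static UART: OnceLock<Uart> = OnceLock::new();
    UART.get_or_init(init_hardware)
}

/// Number of times the setup has run so far.
pub fn init_count() -> usize {
    INIT_COUNT.load(Ordering::SeqCst)
}
```

Calling uart() any number of times runs the setup exactly once, so whichever caller writes first, libtest included, triggers initialization without knowing about it.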
Entry Point Naming
libtest generates a main function as the test binary's entry point. The std runtime does the same in any std binary: the user writes fn main(), and the compiler synthesizes a C-ABI main symbol for the startup code to call. Neither happens in a no_std binary, which is where adacore-zynqmp's entry! macro comes in: it wraps the user's function and exposes it under a fixed symbol that the startup code can call.
Previously, the macro emitted a function named __main, and the startup code called __main directly, bypassing the conventional main symbol entirely. That worked for no_std applications using entry!, but it meant libtest's generated main (and std's) was never reached. Changing the startup code to call main and updating entry! to export the user's function under that name (via #[unsafe(export_name = "main")] rather than a literal fn main) makes main the universal entry point: provided by libtest in test builds, by the std runtime in std binaries, and by entry! in no_std binaries. The three sources are mutually exclusive in any given build, so there is no linker conflict.
One corner case to be aware of: a bin crate that uses entry! and also defines #[test] functions in the same crate must gate the entry! invocation with #[cfg(not(test))], since otherwise both entry! and libtest will emit main during test builds, and the link will fail.
The Result
With these changes in place, cargo test works as expected. For example:
$ cargo test --target aarch64-unknown-none
running 3 tests
test tests::hashmap_insert_and_lookup ... ok
test tests::instant_measures_elapsed_time ... ok
test tests::vec_grows_across_reallocations ... ok
test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
(The accompanying example crate runs an additional regression test for the timing chain, so its output shows four tests rather than three.)
Writing tests for bare metal code looks exactly like writing tests for any other Rust crate. A project using adacore-zynqmp declares it as a regular dependency with std and as a dev-dependency with semihosting, so that production builds do not pull in semihosting:
[dependencies]
adacore-zynqmp = { version = "0.2.0", features = ["std"] }
[dev-dependencies]
adacore-zynqmp = { version = "0.2.0", features = ["semihosting"] }
Cargo unifies the features from [dependencies] and [dev-dependencies] when building tests, so test builds get semihosting while production builds do not.
Tests are written as usual in the project's own source. The use adacore_zynqmp as _; import brings the crate's startup code and exception vectors into the link without naming any specific item. Without it, the linker would omit the crate entirely:
use adacore_zynqmp as _;
#[cfg(test)]
mod tests {
    #[test]
    fn vec_grows_across_reallocations() {
        let mut values = Vec::new();
        for value in 0..1000 {
            values.push(value);
        }
        assert_eq!(values.len(), 1000);
        assert_eq!(values[500], 500);
    }
    #[test]
    fn hashmap_insert_and_lookup() {
        use std::collections::HashMap;
        let mut map = HashMap::new();
        map.insert("hello", 1);
        map.insert("world", 2);
        assert_eq!(map.get("hello"), Some(&1));
        assert_eq!(map.len(), 2);
    }
    #[test]
    fn instant_measures_elapsed_time() {
        let start = std::time::Instant::now();
        let _: Vec<u32> = (0..10_000).collect();
        assert!(start.elapsed() > std::time::Duration::ZERO);
    }
    // ...
}
Running the tests requires a QEMU runner. Add the following to .cargo/config.toml, alongside the rustflags entry that adacore-zynqmp already requires:
[target.aarch64-unknown-none]
rustflags = ["-C", "link-arg=-Tlink.x"]
# A triple-quoted TOML string allows the command to continue across lines.
runner = """qemu-system-aarch64 -machine xlnx-zcu102,secure=on,virtualization=on \
    -m 2G -nographic -no-reboot -semihosting-config enable=on,target=native -kernel"""
Two flags are worth noting. -semihosting-config enable=on,target=native enables semihosting in QEMU and directs semihosting calls to the host process, which is what makes SYS_EXIT terminate QEMU with the correct exit code. -no-reboot makes QEMU exit rather than restart when the machine resets, which acts as a safety net if the semihosting path is not reached.
Two differences from the standard experience are worth knowing. First, libtest normally captures each test's output and only shows it on failure. On bare metal, UART output is unbuffered and goes directly to the terminal regardless of whether the test passes or fails, so println! calls from all tests are always visible.
Second, flags passed via cargo test -- <args> (test filtering, --list, --ignored, --nocapture, and the rest) do not reach libtest. The QEMU runner does not forward command-line arguments into the guest, so those flags are interpreted by QEMU itself and usually cause it to fail. Wiring this up would require pairing QEMU's -append with SYS_GET_CMDLINE on the guest side.
A complete working example is available at github.com/AdaCore/rust-zynqmp/tree/main/examples/libtest.
Conclusion
Getting cargo test to work on a bare-metal target ended up requiring five distinct pieces: disabling thread-per-test execution, implementing a monotonic clock using AArch64 hardware registers, propagating the exit status out of QEMU via semihosting, lazy UART initialization, and recognizing main as a valid entry point name. The implementation is split between adacore-zynqmp and GNAT Pro for Rust's version of the Rust standard library. The libtest threading guard and the newlib PAL timing implementation live in the standard library; the remaining pieces are in the public adacore-zynqmp crate. Customers using GNAT Pro for Rust 27 or newer can enable cargo test on bare-metal targets using the adacore-zynqmp crate. Discussions on upstreaming this capability into the public repositories are ongoing. In the meantime, reach out if you would like to use this capability in your own projects.
The broader lesson is that libtest's dependencies are shallow and concrete. If your target has std support through newlib, each dependency has a clear owner: threading policy lives in libtest, timing lives in the standard library's PAL, and the remaining pieces (UART initialization, entry point wiring, and exit code propagation) live in the board support crate. The result is a largely standard cargo test experience on bare metal: #[test], assert_eq!, timing output, and reliable CI integration. The gaps relative to a hosted environment are narrow but real: #[should_panic] cannot confirm an expected panic, test output is not captured per test, and cargo test -- <args> flags (test filtering, --list, --ignored, and so on) do not reach libtest because the QEMU runner does not forward arguments into the guest.
Author
Tobias Reiher

Tobias Reiher is a Senior Rust Engineer and the Technical Lead of the RecordFlux technology at AdaCore. With a decade of experience in designing and implementing secure systems, he is committed to creating correct software.





