felix hernandez vieyra · March 28, 2026 · 3 min read

do list comprehensions dream of type safety?

PyO3 · maturin · Rust · Python

There is a class of problem in Python that no amount of async or clever caching will solve: CPU-bound work. The kind where your server is not waiting on a database or an API, it is just grinding through data, row by row.

I hit this a couple times when building data exports on Python microservices. On my pet project, users can export their financial data as CSV, Excel, or PDF. The server collects the data from the db, and then needs to serialise potentially thousands of transaction rows into formatted files.

Potentially is the keyword here. In reality, I just thought it would be a great excuse to get some hands-on experience with Rust bindings after reading about them for so long. My DAU count must be around 2: my girlfriend and me. We are still far from thousands of rows.

Anyway, I wrote the serialisation layer in Rust, and I was surprised by how simple it was.

The tool that makes this so painless is PyO3, a crate that lets you write native Python modules in Rust. I used it together with maturin, a build tool that compiles your Rust code into a Python wheel. For a language that gets such bad press, this was surprisingly simple.

Here's a little summary of what you need to set up for your crab bindings: a Cargo.toml that declares PyO3 as a dependency with the extension-module feature, and a pyproject.toml that tells Python's packaging tools to use maturin as the build backend.

On the Rust side, you annotate functions with #[pyfunction] and register them in a #[pymodule]. A function that takes a Python list of dicts and returns bytes looks almost like normal Rust. You extract values from Python dicts, work with native Rust types internally, and return bytes. PyO3 handles the conversion for you.
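To make the "native Rust types in, bytes out" shape concrete, here is a minimal sketch of just the Rust-native core of such a function, with the PyO3 glue left out so it compiles with the standard library alone. The Transaction fields, the column layout, and the helper names are assumptions for illustration, not the actual module; only the export_csv name and its rows and currency arguments mirror the call made from Python.

```rust
/// One transaction row using native Rust types.
/// Hypothetical fields: the real schema is not shown in this post.
#[derive(Debug, Clone)]
pub struct Transaction {
    pub date: String,
    pub description: String,
    /// Amount in integer cents, which avoids float rounding.
    pub amount_cents: i64,
}

/// Quote a field if it contains a comma, a double quote, or a line break,
/// doubling any embedded double quotes.
fn csv_field(value: &str) -> String {
    let needs_quotes = value.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r'));
    if needs_quotes {
        format!("\"{}\"", value.replace('"', "\"\""))
    } else {
        value.to_string()
    }
}

/// Format integer cents as a decimal string, e.g. -1250 becomes "-12.50".
fn format_amount(cents: i64) -> String {
    let sign = if cents < 0 { "-" } else { "" };
    let abs = cents.unsigned_abs();
    format!("{}{}.{:02}", sign, abs / 100, abs % 100)
}

/// Serialise rows into CSV bytes: a header line, then one line per row.
/// The column order is an assumption made for this sketch.
pub fn export_csv(rows: &[Transaction], currency: &str) -> Vec<u8> {
    let mut out = String::from("date,description,amount,currency\n");
    for row in rows {
        out.push_str(&format!(
            "{},{},{},{}\n",
            csv_field(&row.date),
            csv_field(&row.description),
            format_amount(row.amount_cents),
            csv_field(currency),
        ));
    }
    out.into_bytes()
}
```

In a real module, the #[pyfunction] wrapper would first extract each Python dict into a Transaction, call a function like this one, and hand the resulting Vec<u8> back to Python as bytes.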

From Python's perspective, the Rust is invisible. You just import scribe and call scribe.export_csv(rows, currency) as if nothing had happened. It returns bytes, and it is just a function call. The GIL is clueless.

Initially I considered a separate service that would do the heavy lifting and let me play around with other Rust crates inside my background tasks, but as someone whose daily driver is Python, learning how to use Rust bindings just made more sense.

In any case, I kept the boundary clean on purpose. The Python server handles everything it is good at: auth, db queries, request routing. It collects the data, converts db records to plain dicts, and hands them off to the crate to do the compute-heavy serialisation.

Maturin is also fantastic. You can just run maturin develop and it compiles and installs the Rust module into your active virtual environment. Go edit some Rust code, run the command again, and the next Python process you start picks up the new version. The feedback loop is fast enough that it barely feels like you are working across two languages.

The performance difference is real but honestly secondary to why I like this pattern. The bigger win is that Rust's type system and memory safety guarantees apply to the serialisation logic. Having the compiler reject type mismatches and enforce exhaustive pattern matching on export scopes is worth the setup cost alone.
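As an illustration of that exhaustiveness check, here is a small sketch. The ExportScope variants and the file_suffix helper are invented for this example; the real scopes are not listed in this post.

```rust
/// Hypothetical export scopes, made up for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExportScope {
    AllTime,
    Year(u16),
    Month { year: u16, month: u8 },
}

/// Build a file-name suffix for a scope.
/// There is deliberately no wildcard `_` arm: every variant is listed,
/// so the compiler checks that none has been forgotten.
pub fn file_suffix(scope: ExportScope) -> String {
    match scope {
        ExportScope::AllTime => "all".to_string(),
        ExportScope::Year(year) => format!("{}", year),
        ExportScope::Month { year, month } => format!("{}-{:02}", year, month),
    }
}
```

If another variant (say, a quarter) were added to the enum later, file_suffix would stop compiling with a non-exhaustive patterns error until the new arm was written.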

If the hot path runs once a day on ten rows, the complexity is honestly not justified. But if you have a Python application with CPU-intensive work, and the interface between the two sides is simple data in, bytes out, PyO3 and maturin make it pretty easy to drop in a compiled module without rearchitecting your system.

I want to finish by addressing the elephant in the room: writing Rust code and really understanding the language are two very different things. The language introduces a few obscure concepts for us, the uninitiated.

If you ask me, this is certainly one of the biggest benefits of having artificial intelligence as a development copilot, as I now have immediate access to a wealth of knowledge.