
Just a SQLite SPoF

2024-01

Huidong Yang

So this is my 11th project. Since project #8 I had been on a light-try streak, so the degree of completion of those last three was significantly lower, but I wanted to experiment with many things: Tauri on Windows 7 (Edge), Elm in a Chrome extension (with lots of message passing), etc. It was also partially because I have team commitments now, so not going all-in gives me needed schedule flexibility.

With the latest endeavor, I wanted to slow down and stick to something for longer. Oh, I also started using Fossil for version control, for real, for the first time. Staging by hunk is the only thing I really missed, but it's still amazing to see how much a single person can achieve ("GitHub in a box"): a webby UI, account-based access control, all backed by SQLite.

The project has gone through several stages of preliminary development:

  • Initial DB schema design
  • Basic schema migration scheme
  • Session-based access control (as opposed to JWT), using cookies

A practical lesson learned from schema migration: if you add a new foreign-key column that must not be null, you can't get an easy/auto migration by simply writing an ALTER TABLE statement, because you can't specify a default. If the FK could be null, you would be able to specify the default as null; but since it can't be null, and yet the (auto) migration mechanism must have something to fill in the new column, this simply won't work, and you have to migrate manually.
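A minimal sketch of what "migrate manually" can look like: rebuild the table instead of altering it in place. The table and column names (note, notebook) are hypothetical, not the project's actual schema, and the backfill value is just an illustration.

```rust
// Hypothetical manual migration that adds a NOT NULL foreign key.
// A plain `ALTER TABLE note ADD COLUMN notebook_id INTEGER NOT NULL
// REFERENCES notebook(id)` would fail, because SQLite has nothing to
// fill the new column with. So: create, copy, drop, rename.
const MIGRATION_ADD_NOTEBOOK_FK: &str = "
CREATE TABLE note_new (
    id          INTEGER PRIMARY KEY,
    body        TEXT NOT NULL,
    notebook_id INTEGER NOT NULL REFERENCES notebook(id)
);
-- Backfill: park all existing rows in a notebook created beforehand.
INSERT INTO note_new (id, body, notebook_id)
    SELECT id, body, 1 FROM note;
DROP TABLE note;
ALTER TABLE note_new RENAME TO note;
";

fn main() {
    // In the real server this script would be handed to the migration
    // runner; here we just show its shape.
    assert!(MIGRATION_ADD_NOTEBOOK_FK.contains("RENAME TO note"));
    println!("{MIGRATION_ADD_NOTEBOOK_FK}");
}
```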

Some entities are actually inseparable parts that are only separated for normalization reasons, because relational databases are designed to be "flat": no embedding, only references. Therefore, to avoid such breaking changes in the schema, designers should identify such intrinsically coupled relations and always introduce them together.

Schema Migration

Sure, auto migration generation (like in Android Room) is nice, but DIY helps one understand things better, which is invaluable. Turns out, something sensible can be designed without crazy stunts. Do note that the server application should bundle the migration scripts as part of its executable, rather than reading them from some external config file at runtime; fortunately, Rust has include_str!.

The presence of each migration script (a bunch of ALTER TABLE statements) indicates a schema version bump, so if you start your version at 1 and you have one migration file, then after that migration you will be at version 2. So migration file naming doesn't matter; it's the number of files, and the order in which you include them in the application, that do.
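The rule above can be sketched in a few lines. This is an illustration of the versioning arithmetic only, with placeholder scripts; in the real server each script would be embedded via include_str! and executed in a transaction, bumping PRAGMA user_version.

```rust
// Sketch: N bundled scripts take the schema from version 1 to 1 + N,
// applied strictly in the order they are listed. Names are illustrative.
fn migrate(current: u32, scripts: &[&str]) -> u32 {
    let mut version = current;
    for (i, _sql) in scripts.iter().enumerate() {
        let produces = i as u32 + 2; // script i takes the schema to version i + 2
        if produces > version {
            // conn.execute_batch(_sql) would run here in the real thing.
            version = produces;
        }
    }
    version
}

fn main() {
    let scripts = ["/* v1 -> v2 */", "/* v2 -> v3 */"];
    assert_eq!(migrate(1, &scripts), 3); // fresh DB catches up to latest
    assert_eq!(migrate(3, &scripts), 3); // already up to date: no-op
    println!("latest schema version: {}", migrate(1, &scripts));
}
```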

Rusqlite vs SQLx

I don't want an abstraction layer over multiple specific SQL implementations. Although generality in programming is generally good, what's more important is to understand the scope of the project at hand. I pick SQL for my project not because the traditional client-server SQL setup is particularly interesting, but because of this often overlooked vision of SQLite: a DB engine can be embedded in the server application itself, i.e. as a library, and still strike a sweet balance between cost-effectiveness and performance. That is something I resonate with.

In many scenarios, optimizing for maximal scalability is NOT optimal. Understanding what you want is much more meaningful. Scope identification, if you will. I only ever want this app of mine to run on a single machine, at low cost, with good-enough performance. And who knows, the industry might figure out scalability by the time this thing ever hits the bottleneck? (There's "libsql", for instance.)

And I don't want the builder pattern, or ORM. These are not necessarily bad, OK? Just distracting, because the point of this project is to actually have fun with SQLite proper.

And Rusqlite is pretty nifty. Yes, we need some async adapter like tokio-rusqlite to work with the server framework of choice, Axum, but I can get used to call_unwrap.

And in the server code, the "db" module is purposefully void of data generation or processing logic (random generation, hashing, calculation); it just takes in ready-to-use arguments that are necessary for querying or mutating the stored data. The data-heavy aspects live in the "data" module. Separation of concerns is good: if something goes wrong, or runs slow, we want to know whether it's the DB or our data crunching.
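A toy sketch of that layering, under assumptions: the module names come from the post, but the functions inside them are made up, and the hashing is a std stand-in (a real app would use a proper password hash like argon2).

```rust
mod data {
    // All data crunching lives here; the db layer never hashes anything.
    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};

    pub fn hash_password(password: &str) -> u64 {
        let mut h = DefaultHasher::new(); // stand-in, NOT a real password hash
        password.hash(&mut h);
        h.finish()
    }
}

mod db {
    // The db layer only receives ready-to-use arguments. In the real code
    // this would be a rusqlite statement; here we just show that the
    // query layer never sees the raw password.
    pub fn insert_user_sql(username: &str, password_hash: u64) -> String {
        format!("INSERT INTO user (name, pw_hash) VALUES ('{username}', {password_hash})")
    }
}

fn main() {
    let hash = data::hash_password("hunter2");
    let sql = db::insert_user_sql("alice", hash);
    assert!(sql.contains("alice"));
    assert!(!sql.contains("hunter2")); // raw secret never reaches the db layer
    println!("{sql}");
}
```

If a request is slow, profiling now has a clean seam: time spent in `data` is our crunching, time spent in `db` is SQLite.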

Axum

Finally got the chance to try it for real. Very ergonomic, awesome DX, compared to what I previously used, Warp, which had a nice philosophy (filter/pipeline/functional design), but whose compilation time got out of control quickly. Remember how people devised macros to balance the chained filter tree to cope with this? And passing in context/state was way too verbose (back then I had to manually add Redis to every individual filter/handler), whereas in Axum this is handled extremely DRY, previously with the Extension layer, and now even better with the with_state API. Of course, I only treat these as magic; my passion is quite limited so far, one game at a time.

Back when I first played with Axum, I thought it didn't even support CORS which was a deal breaker, which led me to Warp. But later I realized that it could just be me not reading the instructions - a central design of Axum is to leverage the existing Tower ecosystem, and indeed, the author himself made "tower_http" to provide those niceties.

But so far, I think where Axum shines is in the following two concerns:

  • No-nonsense error handling: all I need is an overall enum Error type that implements IntoResponse (using a status code and some JSON body). Didn't even end up trying some supposedly handy goodies like anyhow; for me, hand-writing these error types serves as documentation anyway. This is super nice because it streamlines the shit out of early error returns using the ? operator, so the code reads linearly. And we can leverage the From trait to reduce the need for error type mapping, say from some sub-type like rusqlite::Error to the top-level Error.
  • Extractors - this is pretty cool! I initially thought they were only meant for what the name suggests, namely extracting data, like a request header or body, but then I tried to implement a custom extractor for verifying the access token in the cookie, which involves a DB query (note that FromRequestParts gives us access to the state, which in my case is precisely the DB connection), and I was mind-blown. This is a significant DX boost. I haven't seen such ergonomic bliss for quite a while. Well done.
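The error-handling shape from the first bullet can be sketched without pulling in axum at all; the IntoResponse impl is elided here (in the real server it maps each variant to a status code and JSON body), and the variant names are hypothetical.

```rust
// Sketch of a top-level error enum plus the From conversions that make
// `?` work: every fallible step ends in `?` and the code reads linearly.
use std::num::ParseIntError;

#[derive(Debug, PartialEq)]
enum Error {
    BadInput(String),
    // In the real app there'd also be e.g. Db(rusqlite::Error), and the
    // enum would implement axum's IntoResponse.
}

// `From` lets `?` convert sub-errors into the top-level Error automatically.
impl From<ParseIntError> for Error {
    fn from(e: ParseIntError) -> Self {
        Error::BadInput(e.to_string())
    }
}

// A handler-shaped function: no match arms, no map_err noise.
fn parse_page(raw: &str) -> Result<u32, Error> {
    let page: u32 = raw.parse()?; // ParseIntError -> Error via From
    Ok(page)
}

fn main() {
    assert_eq!(parse_page("3"), Ok(3));
    assert!(parse_page("not-a-number").is_err());
    println!("ok");
}
```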

Simpler Access Control

In the last client-server project, I tried JWT, and that was my first implementation of an access control mechanism. In retrospect, that was probably not the best path for learning's sake, because the more traditional approach, namely "sessions", is more intuitive, and it's the proper first game to play. JWT is for scalability, but again, that's just outside the scope of my concern.

Along the way, I had the chance to learn, for the first time, what "cookie" is all about. It's free pizza: the token is transparently sent along with each request, no need to manually set the Authorization header. Sure, there are some security-related settings to take care of, but it wasn't tedious enough for me to resort to yet another lib like "cookie-rs". But I was a bit surprised when I eventually read that in CORS scenarios the client side needs to send requests a little differently (withCredentials), and fortunately, Elm has riskyRequest readily made for us.
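For concreteness, here is a hedged sketch of building a session Set-Cookie header by hand, which is roughly what skipping a cookie lib amounts to. The cookie name and attribute choices below are illustrative, not the post's exact settings.

```rust
// Build a Set-Cookie header value for a session token. Note that in a
// cross-origin (CORS) setup, credentials require SameSite=None together
// with Secure on the server side, plus the client opting in
// (withCredentials / Elm's riskyRequest).
fn session_cookie(token: &str, max_age_secs: u32) -> String {
    format!(
        "session={token}; Max-Age={max_age_secs}; Path=/; HttpOnly; Secure; SameSite=None"
    )
}

fn main() {
    let header = session_cookie("abc123", 86400);
    // HttpOnly keeps the token out of reach of client-side JS.
    assert!(header.contains("HttpOnly"));
    println!("Set-Cookie: {header}");
}
```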