It was a slow start at the beginning of the month, but I told myself that patience was required, just as it would be after an extended break: a comfortable ramp-up of workload. This is obvious when it comes to physical exercise, but people, myself included, tend to believe that the brain works differently and can snap into a state of maximum output instantly. I'm very glad that I kept calm during that initial slow phase; indeed, the virtue of mental robustness, the optimism and perseverance we hold on to when we're not feeling super productive, is key to unlocking and truly exploring what we can do.
In the end, when the problem was solved, it felt simple, partly because, when it is dark, any exploration and experimentation can seem impossibly hard. But I don't think it was a trivial problem to begin with, either: it's good that it feels simple now, as that means the solution is rather clean and neat. Plus, when I say "dark", it was in fact far from pitch black; I wasn't walking territory no one had ever walked, it was just my first time.
Summary
- In Arrow, a long-overdue revamp of the IDB schema led to efficient state persistence, and replacing List with Dict in the app model eliminated the predominant hot spot as I predicted (previously, Html.Lazy only masked the performance issue temporarily).
- Finally started reading some high-level Safe Network code, and along the way found a "wee bug" in sn_api, and was encouraged to submit a (wee) patch.
- (Extra) Participated in a local photography contest.
Last month I said I would like to see myself working on multitasking in October. And I failed. Well, it was a definite failure at the "daily" scale. But that made me think: maybe daily multitasking in general isn't such a great idea after all. I guess it works only if the tasks have already been "booted up", and we have a general idea of how each can progress, or at least the confidence that progress can be made. Neither of October's two tasks could really qualify. The Arrow data revamp seemed particularly daunting, especially given that I hadn't touched the code in almost exactly one year and I wanted to provide a smooth migration of old user data; and Safe Network code reading? I had very little experience reading any sort of code, and now I was suddenly opening up the code of my hero project? So both tasks felt like the "crawling in the dark" kind of challenge, taken on by a total noob. And you are asking for daily concurrency? Can't spawn a second thread!
Well, multitasking is just a means through which you hope to optimize throughput, but if it becomes so daunting that the mere thought of it freezes you, then give yourself a break; you can always try it again later. Instead, focus on one task deemed primary, and just forget about the rest for a while. It's almost always psychological: once you see light in one place, all the others start to seem less dark. Near the end of the month, the Arrow work was finishing up very nicely, and I could gather enough energy to start compiling sn_api and digging in. I believe I made the right call to start from the top, namely the CLI app, and to pick a single entry point, the Sequence data API (because writing Arrow and the like made me care about it). So, looking back, I did achieve multitasking nevertheless, at the "monthly" scale.
Arrow Data Revamp
To recap, there were two main performance issues:
- Given the app model, List.filter doesn't scale with the growing pool of records, where a subset of records needs to be frequently retrieved (by ID) from a big list. This is definitely the most devastating performance flaw.
- As the persistence mechanism, IDB had been used to emulate a naive key-value store (like localStorage), where the entire list of records is stored at a single key. Therefore, to save a single record update, all the records had to be rewritten. This is outrageous, but thanks to IDB's efficiency it never actually bothered my daily UX. Still, I think it's time.
IDB Done Right
It does take some reading to get to know IDB, but looking back, I think it has just the right amount of complexity, for my app model at least, and that shows that the design work that went into IDB's API was pretty brilliant: a SQL layer on top would definitely be overkill for my needs. Rather, the basic feature set of this API fits Arrow data very naturally:
- Object stores: instead of saving an entire list as a single value, it is natural to use separate IDB object stores for different types of records, which are then saved at individual keys. This change makes precise, "delta" storing possible (touching only what's changed), avoiding the wasteful re-saving of unchanged records. Whole-store retrieval also remains easy and efficient with methods like getAll().
- Keys: I had long wondered: if a record to be stored already has a unique ID field, wouldn't it be redundant to explicitly store this ID as the key to the value? It turns out IDB has this exact concern covered: we can specify a common keyPath for an object store, and then for all objects put into this store, no key has to be given explicitly; instead, the key is "marked" within the value itself without being stored separately (such keys are sensibly termed "in-line keys").
- Transactions: a nice feature that feels "production-grade"/"industrial-strength", whatever that means. It gives us the guarantee that either all parts of a logical action succeed, or nothing takes place (i.e. it makes the action atomic). This effectively prevents state/data corruption.
Finally, it's worth mentioning that the syntactic sugar provided by the "idb" library is totally worth the dependency. In my opinion, of course; after all, some may weigh "zero-dependency" more heavily. But I believe it's no accident that "async/await" is everywhere across languages, and I hope IDB adopts such an API wrapper officially one day. Again, ergonomics matters. #WeAreNoMachines.
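Since Arrow is an Elm app, the actual IDB calls (put into the right object store, getAll, transactions) live on the JavaScript side of a port. Purely as a hedged sketch, with made-up names that are not Arrow's real ones, the Elm side of a "delta" save might be as small as this:

port module Storage exposing (saveParent)

import Json.Encode as Encode

type alias Parent =
    { id : Int
    , myChildren : List Int
    }

-- The JS handler of this port would do a single put() into a
-- "parents" object store created with keyPath "id", so only this
-- one record is ever written, never the whole list.
port putParent : Encode.Value -> Cmd msg

saveParent : Parent -> Cmd msg
saveParent parent =
    putParent <|
        Encode.object
            [ ( "id", Encode.int parent.id )
            , ( "myChildren", Encode.list Encode.int parent.myChildren )
            ]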
Dict Saves the Day
Performance optimization can take many iterations and can rarely be declared finished work. But as one of the perks of dogfooding, I've long known the predominant issue: typing in input fields (e.g. to create a new record) gets increasingly sluggish as the number of records grows over time. As I mentioned last month, this problem was once significantly mitigated via Html.Lazy, but resurfaced later when the number kept growing. I didn't really give it much thought until maybe a couple of months ago, when the dots finally got connected, and the answer turned out to be almost too obvious. Back in the beginning, I learned everything from elm-todomvc, including the exclusive use of lists as record containers. Lists and list functions, so quintessential of functional programming, right? But a newbie does not know when and where to stop emulating the example under study.

Alternatively, I could have read more extensively on the subject before hacking away on my learning project, and sometimes I wish I had; but I actually do not regret it, because I believe first-hand learning, namely learning from one's real mistakes, gives a result different from simply reading and following advice, even that of experts. The latter can definitely save a lot of time, but I doubt believing and following something that one hasn't internalized is any different from plain emulation. And I still believe the way to truly internalize knowledge and skills is the hard way: do it yourself and see what happens. Another way to put it is simply the scientific method: hypothesize, test, and if wrong, hypothesize differently and loop.
But of course, one has to prioritize, because few people today can afford learning everything the hard way. But for me, designing and developing applications is a skill versatile enough to merit such an expensive approach, and I find it truly rewarding. This is one of the luxuries that I get to enjoy.
To recap, it dawned upon me one day that List.filter was the predominant hot spot that caused the terrible typing latency, and the obvious solution was to replace lists with dicts as the record containers.
To demonstrate the problem, let's say we have a pool of "parent" records, and another pool of "child" records (from all the different parents), where each parent keeps track of its children via a list of child IDs. Now the old design was that the two pools were stored as lists, and therefore, to retrieve the child records of a given parent, it takes
List.filter (\child -> List.member child.id parent.myChildren) allChildren
We see that the running time is O(NM), where N is List.length parent.myChildren, and M is List.length allChildren.
Now if we use Dict as the container, we then retrieve a parent's children by
List.filterMap (\childId -> Dict.get childId allChildren) parent.myChildren
Elm's Dict is implemented as a balanced binary search tree (a red-black tree, specifically), so the running time becomes O(N log M). We learned from calculus that log is about the slowest-growing function there is, but I only truly felt it after completing the revamp: typing went all swift again, as responsive as back when there were very few records, even though by that point a parent already had hundreds of children and the entire children pool had thousands. Tell me nothing gets internalized!
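For concreteness, here is a minimal sketch of what the container change amounts to at the model level (the types and field names are made up, not Arrow's actual model):

import Dict exposing (Dict)

type alias Child =
    { id : Int, title : String }

type alias Parent =
    { id : Int, myChildren : List Int }

-- Before: { parents : List Parent, children : List Child },
-- where every lookup by ID meant scanning a whole list.
-- After: records keyed by their IDs, so each lookup is O(log M).
type alias Model =
    { parents : Dict Int Parent
    , children : Dict Int Child
    }

childrenOf : Parent -> Model -> List Child
childrenOf parent model =
    List.filterMap (\childId -> Dict.get childId model.children) parent.myChildren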
With this pretty straightforward change, the major, and only perceptible, performance issue was eliminated. However, it's worth mentioning that it did involve making hundreds of (small) changes throughout the codebase, and Elm was the hero there. One might be able to keep in one's head the vast majority of the places to take care of, but it takes only one missed spot to render the program broken, and quite possibly, the more you manage to cover (but imperfectly), the subtler and sneakier the bug can be. No wonder it is common for JS developers to carry out a major change by rewriting the entire program from scratch instead of scratching around the existing code. Me? Elm's compiler just kept telling me what was still broken, to the point where it honestly started to feel tedious. And that is a heroic feat to pull off: turning something potentially hellish into something merely boring.
Simpler Is Better
Taking care of edge cases is good, in principle, but there is a catch: if there are many edge cases, so numerous that you deem the benefit not worth the extra effort, then I think it is actually worse to take care of only a selected few of them. Essentially, it is a trade-off between simplicity and robustness, both highly valuable properties; yet the tricky part is, I find it typical that handling merely a small fraction of all possible edge cases takes a disproportionately large "jump" in complexity from the simplistic architecture: basically you pay a lot up front for very little in return. And it is a particularly bad investment if the architecture is in an early phase of design. You don't want to spend much refining things that might soon be thrown away.
Ideally, you either handle all the cases or nothing. One might even argue that partial robustness is not really that far from no robustness at all. But in practice, things are not that clear-cut. This is especially true when there is no known systematic way to fix something once and for all (think of the seemingly endless releases of OS security patches over the decades). So I think one rule of thumb is: if there is strong, real-world evidence (such as usage data) demanding special care for a certain edge case, then go ahead; otherwise, keep it simple!
Why keep it simple though? I think it's mainly for the sake of evolving it in the future. If something can be truly solved and finalized, then it doesn't really matter how complex it is (we can just treat it as an immutable mass of binary code). But writing applications for human use is so different from say, proving theorems. There can always be better ways to design things, and even the criteria of "better" can change over time. In math and science, simplicity is mostly appreciated as a form of beauty, but in engineering, it has a much more salient instrumental value.
Complexity is such a burden. That was how I felt the moment I started this fairly large-scale revamp. For instance, back then I was worried about pathological scenarios where stored data got tampered with, causing state inconsistency, so I wrote repairModel to handle a few such cases. But the thing is, there are so many other ways to mess things up! Half-assed patching gives half-assed robustness, and besides the extra cognitive load it brings, this seemingly benign effort can induce a false sense of security, or otherwise hinder the development of a truly principled and systematic solution that eliminates the weakness altogether.
Plus, after nearly two years of dogfooding, not a single instance came up in practice where the patching went into use. With that counterevidence, I took the liberty to remove the convoluted and superfluous logic during the early stages of the revamp process.
Another simplification I decided to make was eliminating all the "UID" states, both persistent and in-memory. Again, this idea was directly copied from "elm-todomvc", and in DB-speak, such UIDs serve as simple key generators for various records. But these states have to be manually maintained, which is not only tedious but also makes the app model unnecessarily fragile. I realized that, under the assumptions held in the model, such UIDs can simply be derived from the primary data, and thus don't have to be kept in the state set. Now, what are the assumptions? Well, none is absolutely necessary, since I'm only dealing with natural numbers. But there is one assumption that has held well so far: records are never deleted (deletion is not even implemented, on purpose), and given this, deriving a new ID is as simple as querying the size of the container. Nevertheless, here I did go a step further and guard the ID generation function against pathological scenarios where the no-deletion assumption is violated (e.g. when IDB is manipulated outside the app), because 1) failure to handle such cases would result in data loss (previous records being overwritten), and 2) the extra logic required is inexpensive and strictly encapsulated within the ID generation function:
newId : Dict Id (Ided a) -> Id
newId dict =
    let
        -- With no deletions ever, size + 1 is always a fresh ID
        easyId =
            Dict.size dict + 1
    in
    if Dict.member easyId dict then
        -- Some record must have been deleted behind the app's back;
        -- fall back to one past the largest existing ID
        maxId dict + 1

    else
        easyId
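(maxId isn't shown in the post; assuming Id is just Int, a minimal version could be:)

maxId : Dict Id (Ided a) -> Id
maxId dict =
    Dict.keys dict
        |> List.maximum
        |> Maybe.withDefault 0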
That means normally (practically always), I only pay the extra (log) time for a membership check, but in return I get an essential data integrity guarantee.
In summary, this story isn't advocating blind neglect of edge cases for the sake of simplicity. As a matter of fact, most useful applications have intrinsic complexity that programmers need to face head-on and tackle diligently. My point here is to raise awareness of the cost of introducing complexity in the name of robustness. If it's cheap and doesn't leak, then go for it. If it is done systematically, leading to a complete solution, then perfect! At the very least, a patch has to be justified by a strong, real-world use case. Otherwise, I think it is wise to keep things simple for the moment, especially for an early-stage app that is expected to undergo multiple major changes.
Migration UX
OK, the data migration required by the IDB rework was actually the simplest kind of schema change: recall that previously, entire lists of objects were serialized into single key-value pairs, all inside one object store, whereas the new schema allocates separate object stores, each storing individual objects of a given type (so basically the old keys are now the store names). Notably, the migration involved no change to any of the object types, so I'd rather consider it a mere data reorganization.
Two options came to mind regarding how the migration would be initiated after users with existing (old) data updated the app to the new version.
- in-situ: new app can migrate existing data in IDB to the new schema
- via file: new app takes in a file of data exported from the old app
UX-wise, it's clear that option 1 is superior, as it does not require users to export the old data before updating the app to the new version. Indeed, option 2 can be particularly troublesome with web apps, which are typically served at a single versionless URL without downgrade support. What happens if a user forgets to export the latest data before the app gets "transparently" updated? Without access to the old app, the user would have to resort to some outdated data file, if one is available at all, for a frustratingly incomplete migration. Therefore, option 1 is a must-have for decent UX.
However, is option 2 then useless?
Not completely. I could think of the following use cases that would require a data file for migration:
- When a user switches to a different browser or device to run the new app, or when the app is now served from a different address. In such cases, where the old IDB data is no longer accessible from the new app instance, the only way to restore the data is via a data file.
- When deleting the old IDB object store is preferred. Note that the only place an object store may be deleted is while handling the upgradeneeded event, which is triggered when running the new app for the first time. But if we delete the old store on IDB upgrade, then in-situ data migration becomes impossible. Therefore, by default, the old store must not be deleted automatically, in order to ensure that option 1 is always available to everyone. Now, some users, namely the neat freaks (myself included), would strongly prefer not to leave this vestige behind, and thus they would have to use a data file for migration. But the question is, how does the app know which way a user wants to go? The solution I came up with was to let those "advanced users" explicitly opt out of option 1, e.g. by setting a special key in localStorage, which serves both as the command to delete the old store and as confirmation that they know what they are doing, in particular that they have exported the data beforehand.
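On the Elm side, taking in an exported data file (option 2) is just ordinary file handling. A minimal, hypothetical sketch using elm/file (message and function names are made up, a JSON export is assumed, and decoding/writing into the new schema is left out):

import File exposing (File)
import File.Select as Select
import Task

type Msg
    = GotFile File
    | GotContent String

-- Ask the user for the data file exported from the old app
pickExportedFile : Cmd Msg
pickExportedFile =
    Select.file [ "application/json" ] GotFile

-- Read its content; decoding and storing into the new IDB schema
-- would then happen when GotContent arrives
readExportedFile : File -> Cmd Msg
readExportedFile file =
    Task.perform GotContent (File.toString file)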
SN Code Reading
Safe Network (SN) is a huge project, and understanding it requires knowledge of not only networking and distributed computing, but also cryptography. But that's not the point: with enough motivation, we can learn anything with the resources available today. The question is, how can an outsider build up the motivation to "dig deeper" and "get involved"?
I believe merely thinking that SN is important is not sufficient. We can talk about the flaws of the current internet all day, but we go right back to it, because the world is already running on it (the content alone, in spite of the imperfect tech used to serve it, is absolutely irreplaceable). Moreover, a lot of optimization effort is being put into remedying the notorious weaknesses of centralized architectures: major cloud/CDN companies are all building distributed systems, just without giving up centralized control. There have been reports showing that gigantic data centers are far more efficient and eco-friendly than grassroots-run rigs. And monopolies have already been established in vital application domains such as search, social/communication, and content publishing.
So in the near future, there is no way for SN to compete in the arena of web content (unless there comes an automated, scalable mechanism to easily "mirror" content from the clearnet). But even without winning the content game, SN will still have its place in communication and publishing, in addition to data storage, and fixing these three problems is by any measure a super-big deal. Besides, SN needn't aim to blindly make available whatever is available on the clearnet. As Internet V2, it should take the opportunity to run a filter (in the sense of evolution/natural selection, not of censorship) and leave only what people find valuable and worthy (and the exact proportion of such gems among the whole web is a curious number).
So, we don't have to be too disheartened by the competition. When you build something better, people will come.
Where was I? Oh, regarding how to get started. The code base is large, so my strategy is top-down. The highest-level repo is sn_api, in which there's a crate sn_cli that implements a full-featured user application for interacting with the network. This app uses the library crate sn_api in the same repo, and that library is my entry point.
When you start reading it, you quickly find yourself digging into two lower-level libraries: sn_client and sn_data_types. I only had time to look into a handful of functions, since I didn't manage to start until the end of the month. But I'm still very glad that I have finally taken my first baby step. It's so liberating to see the psychological wall dissolve. Since all I'm looking at right now is high-level stuff, it's no trouble getting a rough idea of some APIs, such as Sequence and Map. I encountered a mysterious spot in store_sequence where the data was stored twice, and the MaidSafe team responded to my question on the dev forum, kindly acknowledging that it was an oversight during a major refactoring, and encouraging me to submit a patch. The team and the community members are indeed very supportive of newcomers.
sn_client is a nice entry point to the lower level code, e.g. its ConnectionManager directly uses qp2p, but I didn't get the chance to look at that. sn_data_types uses rust-crdt, which I know nothing about, but it's also something on the radar.
Photo Contest
And what's up with the photography contest? What, afraid of losing? That's exactly the point: I caution myself that life would be far less enjoyable, or even meaningful, if I just let that, winning or losing, take over the motivation and pleasure mechanisms. Nobody gets high from losing, but very often, that's not the point. It can be as simple as wanting to be part of something we resonate with. Photography is something I enjoy (do I have to say it: as an amateur), and from my recent visit, I kind of like the brand-new local museum organizing the contest. So go ahead.
Coding Plan
Elm-UI. Yup. For Arrow. OK, my original plan was to first try it out on something much simpler, but I trust it to be capable of more than a mere hello-world kind of thing. The author Matthew Griffith's presentation on this library was one of the most exciting Elm talks ever, and Richard Feldman called it the killer app of Elm. So I reasoned: if I were just to doodle something up on a blank canvas, I wouldn't learn what Elm-UI could give me in practice. Instead, if I set the well-defined goal of re-implementing the current Arrow UI using Elm-UI, it will become clear whether Elm-UI is truly capable of, and ergonomic in, handling a real-world task for me, besides empowering us with a design language far superior to HTML+CSS. Or, the shorter question: is Elm-UI ready now (for me)?
Elm-UI the syntax/interface is already marvelous, I have no doubt; but internally it still has to deal with CSS (so that we don't have to), and that makes it hard to perfect. I'm a practical person, though: if it can handle the existing UI, I will consider it fully capable, so that I can move on and use it to design and implement more UI functionality.
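To give a flavor of that "design language" (a toy sketch, nothing to do with Arrow's actual UI), laying things out in Elm-UI reads like this:

import Element exposing (column, el, fill, layout, padding, row, spacing, text, width)

main =
    layout [] <|
        column [ width fill, padding 20, spacing 12 ]
            [ el [] (text "Arrow")
            , row [ spacing 8 ]
                [ text "layout as data"
                , text "no CSS in sight"
                ]
            ]

Rows, columns, padding, and spacing are stated directly as data, which is exactly the ergonomics I want to test against a real UI.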