To Learn Distributed Networking

July 2021

Huidong Yang ✉

July 24, 2021

This is a new experiment. My general experience with reading- or lecture-based learning hasn't been very good, not because of bad performance in school, but because I found a big gap between meeting or exceeding the requirements of school and true internalization, aka the psychological definition of true learning: change of behavior. Even being able to teach the subject doesn't guarantee that. Maybe that level of deep understanding and mastery isn't required, sadly, for us to function in the world in most scenarios? But anyway, I want to pursue a more satisfying learning process than what I had in school. Now, I'm not saying I didn't get much out of school. On the contrary, I think almost everything I got and valued, I got from school, just sometimes in an indirect way, e.g. when I got stressed out from school work, I tended to distract myself by learning about something new, mostly from the internet.

Since leaving school, I have found one such method of learning that is more satisfying, namely project-based learning, and in particular, coding-based learning. Yes, back in school, we also had course projects, but I personally found them rather hampered (which is understandable given all the constraints and my own limitations). So I was glad that I finally had the chance to do that outside the framework of formal education, all in and all out. But again, I doubt I would have been motivated or equipped to initiate such efforts if I hadn't had the formal education.

But now I see that project-based learning isn't a cure-all. At the very least, it still needs the help of old-fashioned stuff, like reading textbooks and papers. But as I said, I didn't have a great experience with that aspect of school. I am particularly upset about the leakiness of memory, or the fragility of learning without true internalization. This poses a particularly daunting challenge when taking on a complex field of study that involves a tremendous amount of knowledge and insight, in both breadth and depth.

I still believe firmly that ultimately, the test of internalization is whether you can produce something of your own after the learning phase (ref. Richard Feynman, "What I cannot create, I do not understand."). And I feel that by now I know what that "making" phase feels like. I just can't say the same for the phase before it, namely gaining a deep understanding of something that has much more theoretical depth than the kind of "learning" that mostly takes mere imitation and fluency.

The subject in question is distributed networking. I have little background in the technical aspects of computer networks, but I know how vital the internet is to all of us, to humanity, and I was inspired by quite a few people who spoke about the problems of today's internet, including its heavy centralization (via servers and data centers), which leads to both privacy violations and security vulnerabilities. Most notably, the Scottish engineer David Irvine, who created MaidSafe (the original envisioning and design of the Safe Network, under development), has had a subtle but profound influence on the choices I made along my path. Interestingly, Irvine was the one who spoke very positively of Rust and Elm, which first prompted me to learn the two languages. As a result, they have both become my "main" languages, in the sense that all my self-initiated (post-school) projects use them.

Making Arrow has been a lot of fun, and I believe building useful standalone ("local-only") webapps that do not rely on talking to remote servers is one way to do good with code, but that's not the complete solution. Every single day, I'm reminded of how critical the internet is, mainly for the connectedness of data, people's data. That kind of sharing and exchange deserves a much more reliable and less fragile overlay network on top of the bare internet. To me, the Safe Network, among all the "better internet" projects, is the only one that does not compromise, and has the ambition to build a complete solution, instead of solving a limited subset of problems while still relying on the ways of the status quo. And that is also why it is taking so long. Yet what we want is important for our future: to put the ultimate super-brain (coined after superego) back under the control of all the people.

And contributing to that will surely take a lot of learning to prepare oneself. I was occupied by coding tasks, and couldn't really commit to a systematic learning plan in this huge field. Maybe I was even scared of giving up, or of getting lost. But recently, more vivid signals have reinforced my motivation to prepare myself, so that I grow capable of contributing to the essential parts of the effort. I believe it's never too late to learn, as long as the right approach is taken. That means patience and perseverance, obviously, but also the technical know-how for learning a large amount of nontrivial knowledge, in a way that actually empowers the learner to critically evaluate solutions and explore new ones.

Late last year (Oct - Nov, 2020) I tried to start by reading the Safe Network code. It was fun and useful, and I had the chance to talk to the MaidSafe team in the process of trying to understand the code. Even earlier (Apr, 2020) I tried to pick up a textbook online and read some introductory materials. But I felt disorganized and distracted, and things didn't stick, for lack of a better word.

I didn't have a concrete plan. Now I want to try this again. More seriously. That's why I want to keep a public journal about this long journey. My doubts are gone now. I will take this path. Now the question is, how to best walk the path?

A few general methods came to mind:

Mind map: Emphasizing the overall, systematic structuring of the knowledge, that is, the breadth, but not the depth.
Q&A: Focusing on a particular point of confusion/difficulty, emphasizing depth, but not the breadth.

I get the importance of the first approach, but to me personally, developing a systematic grasp of a field is rather the easy part. One can obtain such a knowledge tree by literature search, plus logical organization.

So I prefer to put my efforts mostly into the second approach. "Learning by asking questions" is what we often hear, but I think the more vital part is trying to answer the questions you have raised, most likely with some help, e.g. from static info (internet, library), or if you're lucky, by talking to people (online, offline). But much more importantly, you then have to digest all the acquired information and give an answer yourself.

So I think up next I will start adding entries in JoT that are each a list of (question, answer) pairs on the subject I'm learning. I'd like to file these pieces under the new category "Flashcard", because 1) I prefer not to have the special character in "Q&A" in the URL (joking?), and 2) although I was absolutely not a fan of flashcards in my school days (never actually used them) as they were notorious for being a one-off, rote cramming tool, now I am able to see one aspect of them that shines: as a more agile implementation of Q&A. They are meant to be frequently revisited, so as long as no rote memorization is involved, such revisits can help solidify one's understanding of important issues in a field. Honestly, I still view internalization as a bit of magic. Many things that I thought I had internalized in the past now turn out to be beyond my recall (which I know does not equal complete forgetting, yet it is still frustrating. But hey, maybe it is only natural, the "use it or lose it" mechanism in action). So I think with the flashcard mindset, there's one easy way to check whether you've got internalization (maybe there's a grayscale to it), i.e. revisit the question on the card, and try to answer it without mechanically recalling. Reason instead, with the help of only internalized things.

And of course, the second, and probably more reliable check, is through real projects.

Again, I'm not saying systematic organization of knowledge should be ignored. So maybe some other pieces will be about that.

So for August, I'm going to study Kademlia DHT, on which MaidSafe's routing algorithm is based. Specifically, I'll read 1) the original Kademlia paper, and then 2) the MaidSafe DHT paper. Afterwards, I'll write two Q&A ("Flashcard") pieces.

Some potential extensions:

Code up some DHT demos? Although I'm totally clueless about that.
Read other related papers on Kademlia DHT (optimizations, comparisons with other DHT algorithms).

Finally, a note to myself: when writing such pieces, the complexity, or the cognitive load, of the language itself must be reduced to a minimum. The essence of the answers will be strictly technical and explanatory. They are the wrong place to try to overachieve in writing aesthetics. To put it another way, pretend that you have an audience who doesn't appreciate having to open a dictionary while reading an answer.