
August 2020 Living

Huidong Yang

Notarization for MFA Removal

Oh, this tedious errand I had put on the back burner since the beginning of the year, during the initial virus outbreak. It wasn't anything urgent; I wasn't actively using the account anyway. But as I started working on the backup app, where S3 is the first cloud storage I'm trying to support as a backup destination, I felt I needed to put this to an end; enough backlogging for a chore. See, at some point, the stress of not doing something is going to exceed that of getting it done. Without root account access I could still connect as an IAM user, so it was all good during the dev and testing phases. But I think a powerful motivator for carrying this learning project toward a serious product that can take care of my own data is that I have to eventually start using it for real. That means I would have to pay for storage, which in turn means this MFA lock-out needed to be fixed sooner or later.
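(For the record, "connecting as an IAM user" looked roughly like the following; this is a minimal sketch, and the profile, bucket, and key names are hypothetical placeholders, not my actual setup.)

    import boto3

    # Credentials come from a named profile in ~/.aws/credentials,
    # belonging to a dedicated IAM user scoped to the backup bucket,
    # so no root-account access is needed for day-to-day work.
    session = boto3.session.Session(profile_name="backup-app")  # hypothetical profile
    s3 = session.client("s3")

    # Push one archive as a backup object; versioning and encryption
    # are left to the bucket's own configuration.
    s3.upload_file("archive.tar.zst", "my-backup-bucket", "2020/08/archive.tar.zst")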

The removal requires 3 documents:

  1. proof of identity (e.g. passport)
  2. proof of address (e.g. internet bill)
  3. the form

Essentially, the form legally protects AWS from any potential liabilities after the MFA removal, and it requires notarization of one's signature to be valid. The city's notary public also requires a translation of the form, with professional certification. Luckily, I found one firm that offered just the certification part at half the price of the full service (though in real life, certification meant just stamp pressing: "I don't have time", as they pleaded. Well, I do. I just needed their "status", so to speak). It took me quite a lot of time to nail the translation with confidence, but not without perks. In the process I learned a few words, notably "execute", "acknowledge", and "instrument", that have highly specialized legal senses, such that if you used them that way in everyday life, it would certainly sound mental. I think this is an unnecessary barrier preventing legal literacy from developing in the wider population; they could have selected, or even invented, special-purpose terms instead. But then again, the language is well known to borrow utterly generic words for use in technical fields, for instance, "work" in physics and "root" in mathematics. It feels very economical to reuse common words, but is the cost of ambiguity and confusion worth it?

But I digress. The fee was almost outrageously high, given that all they had to do was apply their stamp to some standard documents, twice. OK, law school isn't cheap, I get it. But hey, rip off much? I digress again. Once AWS received all the documents, the MFA device was removed within about 24 hours. Done.

Oh, there was this one hiccup: the notary public must be provided with a clean, unmarked form, which was no longer the case after my translation got professionally certified, as the stamp sprawled all over it. Fortunately, I had the form on my phone that day. Talk about services that are not very composable.

As a last note to myself, let's recap why this lock-out happened. My phone died at the beginning of the year. I was using LastPass's authenticator app, which unfortunately did not back itself up to their server automatically. Now, there is an alternative way to pass the MFA, but it requires both the email address and the contact phone number under the root account. So essentially, it was this double whammy that locked me out dead: not only did I lose my phone, but I had also long lost access to the old phone number on record, which, ridiculously, was my office number back at BI.

But not everything was bad: knowing which phone number was on record helped me deduce which of the two addresses must have been entered ca. 2011-2012. Without that certainty, I would have had to submit two forms, each with a "possible" proof of address.

With this knowledge, I promptly updated the phone number on the root account after regaining access. I'm now using a different authenticator app, which does have backup enabled automatically, and since I now have access to both the email address and the phone number, a lock-out won't happen again when the MFA device dies in the future (I just have to keep the number).

Drobo Incident

The take-home message is: bad things can turn out to be good things, if you deal with them and learn. When I got the Drobo, I thought it was the end of ever worrying about data loss, because the chance of two disks failing at the same time is reassuringly low. OK, the main motivation for getting a Drobo was that attaching ever more external drives to my computer just didn't seem like a scalable approach: the USB hub was already showing power issues, and the boot time increased at least linearly with the number of drives to spin up. A Drobo allows me to expand storage capacity without increasing the number of attached devices.

Only after the incident did I realize the quite obvious: the disks inside the Drobo are not truly independent. The most straightforward case: if the Drobo enclosure itself glitches out, you lose all data access right away, in spite of the redundancy setup within (and you can't just retrieve the data off the disks using a regular enclosure). While you're waiting for the Drobo to be fixed, you're locked out.

In my case, it wasn't a complete breakdown, and I regained read-only (RO) access as soon as I reset the Drobo to break it out of its infinite reboot loop and then put the device in RO mode. The glitch was apparently caused by a loose connection in one of the bays (the first time it happened, the Drobo reported a removed disk, with the indicator turned off; when the incident recurred, the disk was marked by the Drobo as failed, with the indicator turned red). A telltale sign of the lock-up was the disk activity indicator: normally blinking, it froze into a constant green light.

At that time I wasn't sure whether the disk in question had really failed, but once I put it back in, Drobo locked up again; only after using a fresh replacement disk did everything go back to normal.

After that, I got a regular disk enclosure, erased the disk (two passes), and ran a full read surface test; the result showed no bad or weak sectors. I was worried that the heat due to Drobo's poor ventilation design had done real damage (SMART showed that the disk once reached a maximum temperature of 64 degrees Celsius), but now I believe it was just a loose connection, for which, again, the Drobo itself was to blame, as the device was set up ca. December 2018 and has sat on a stable table ever since.

Now let's get back to the take-home message. A RAID-based device is really nice, but after this incident, I've learned that you can't count the files stored in a RAID box as having two independent copies, because, again, the whole box can break down, not to mention the scorching environment it puts the disks in. RAID just makes the device significantly more reliable than an individual disk; a solid local backup scheme still requires replication across two truly independent devices. On forums I saw, besides funny pics of melted Drobos, someone outraged that Drobo customer support had asked "Are your files backed up somewhere else?", as if the point of having a Drobo was that people didn't have to do backups any more. Well, I fell into that trap myself. So thanks, incident, for the early warning.
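To make the point concrete, here's a back-of-the-envelope sketch; the failure rates are purely illustrative assumptions, not measurements of any actual hardware:

    # Back-of-the-envelope comparison with purely illustrative numbers.
    p_disk = 0.03       # assumed annual failure rate of one disk (~3%)
    p_enclosure = 0.02  # assumed annual failure rate of the enclosure (~2%)

    # Losing data through the disks requires (crudely) two independent
    # disk failures close together; the redundancy makes this tiny.
    p_two_disks = p_disk * p_disk

    print(f"double disk failure: {p_two_disks:.2%}")  # 0.09%
    print(f"enclosure failure:   {p_enclosure:.2%}")  # 2.00%

Even under these made-up numbers, the enclosure is over twenty times more likely to take the data offline than a double disk failure is to destroy it, which is why the box as a whole can't count as two independent copies.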

But one practical matter remains: the Drobo's storage is most likely the biggest you have, so typically you can't replicate its entire contents on another independent device, unless of course you own multiple such RAID boxes of comparable capacity. I certainly don't, so I strategize. My solution is not technical, but rather semantic: have a clear notion of important or valuable files versus expendable ones. Store the set of files that demand enhanced redundancy on a single disk (e.g. the 8 TB disk just rescued from the Drobo is plenty for my needs), and maintain a mirror on the Drobo, as sketched below. The reason to store the source/working copy on a standard disk instead of the Drobo is twofold:

  1. Every time the computer wakes up, the OS typically finds the Drobo in a disconnected state for a short while, which tends to upset several applications that rely on its availability (driver flaws?); this, along with its notorious susceptibility to overheating (I bought a mini fan just for it!), makes it far from ideal as an always-on, long-running storage device.
  2. By using something of a more restricted capacity, you kind of have to watch the space usage more consciously (with the Drobo, I tend to just forget about the notion of storage space as a limited resource).
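For my own reference, the mirror job boils down to something like this; a minimal sketch, with the volume paths as hypothetical placeholders:

    import subprocess

    SOURCE = "/Volumes/Work8TB/important/"  # working copy on the plain disk (assumed path)
    MIRROR = "/Volumes/Drobo/important/"    # replica on the Drobo (assumed path)

    # -a preserves metadata; --delete prunes files gone from the source,
    # so the mirror stays an exact replica rather than drifting;
    # trailing slashes sync directory contents, not the directory itself.
    subprocess.run(["rsync", "-a", "--delete", SOURCE, MIRROR], check=True)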

Yes, it's work to decide what's expendable, but I think setting this clear boundary between "redundancy classes" helps us set up backups that are more efficient and economical. After all, considering the ever increasing amount of data, maintaining full redundancy (every file having an independent replica) is unnecessarily expensive for me, both money- and time-wise.

File Reorganization

Because of the Drobo incident, I got strongly motivated to reexamine how all my stuff had been organized. It's the sort of thing rarely done; the last big revamp was... so long ago I can't recall. But this one is a bigger change, and the most satisfying so far.

I'm willing to believe that over the years I have developed more solid "fitness" metrics for file organization schemes, and this time, the new factor that stands out the most is ergonomics for backup. I've really learned that the worst backup is one where you cherry-pick different parts within some directory (because you deem only some of them worthy), and/or set up a multitude of messy, fractured backup jobs. To some extent, backup applications promote this approach with their handy include/exclude functionality, but the fact is, over time I always ended up forgetting what was and wasn't backed up, and when that happened, it was a slippery slope to becoming completely apathetic about the backup: I ended up not maintaining it, and an unmaintained backup has very limited value.

The key to fixing this conundrum lies precisely in how we organize the files in the first place. One must keep backup in mind when deciding how to group or split things. I understand the very tempting ideal of organizing everything in one massive tree and setting up the backup as an overlay on top of it (like tagging metadata). But the problem is, backup is physical: if you put everything into this tree and cherry-pick a subset for backup, what happens when the storage hardware holding the tree fails? You end up with a scattered mess of partially backed-up stuff.

So I think a solid approach is to always back up an entire tree, no exclusions. But again, if we organize everything into a single tree, most likely we won't be able to afford the storage requirements, not in a sustainable way. Therefore, a natural solution is to build multiple trees, a forest, with each individual tree either completely backed up or not at all, but never partially. Ideally, even the expendable trees are stored on a RAID device for improved reliability. (I mean, if they are so expendable that you don't even bother, you might as well delete them to reclaim some free space, right?)

So that's what I do with this new scheme of file organization. For instance, there are four trees based on two dimensions, ownership and media type, namely:

  1. I created them, they are text-based files (small)
  2. I created them, they are rich-media files (large)
  3. Others created them, they are text-based files (small)
  4. Others created them, they are rich-media files (large)

For the two trees that I created, I aim to maintain two independent local replicas plus one remote replica, for fairly solid protection; for the two that others created, I aim to maintain two independent local replicas only, if I believe the contents are not well backed up elsewhere, or even no backup at all if I believe they are robustly backed up elsewhere (e.g. published source code).
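The policy boils down to a tiny decision rule. A sketch for my own future reference (the function is made up, and note that media type affects tree placement but not replica counts):

    def replicas(created_by_me: bool, well_backed_up_elsewhere: bool) -> tuple[int, int]:
        """Target (local, remote) replica counts under the scheme above."""
        if created_by_me:
            return (2, 1)  # two independent local replicas + one remote
        # Others' work: mirror locally only when it isn't safely
        # available elsewhere; otherwise keep no backup at all.
        return (0, 0) if well_backed_up_elsewhere else (2, 0)

    print(replicas(True, False))   # my own files -> (2, 1)
    print(replicas(False, True))   # e.g. published source code -> (0, 0)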

A related but minor change: Tree 1's source copy (my own text files) was moved out of Dropbox and is now synced over the LAN across my two computers, with Dropbox repurposed as the remote backup destination for now. Dropbox is supposedly one of the best in the business, but it's also one of the best demonstrations of the flaws of server-centric designs. You can't dismiss the fragile nature of such architectures by arguing "who on earth would deliberately interfere with such a useful, general-purpose service?". And with Dropbox, it's not just the common SPOF problem; I find it particularly ironic that Dropbox, normally capable of LAN sync, will simply refuse to sync over the LAN if the server is unreachable (I can only guess it's because the design puts the server in the special role of "single source of truth", and without it live, the local agents are forbidden to take any action, since in various cases syncing requires making decisions based on some trusted truth).

Finally, it's worth noting that I now have the entirety of my internal SSD, in three partitions, backed up daily via incremental disk imaging. Previously, I included only two partitions (system and application) in this daily backup routine; the third partition, for other data, was left out because of its relatively large size. Instead, I ran a multitude of piecemeal backup jobs (e.g. of various application data) on a manual, sporadic basis, which just fell apart over time: they all went stale. Regular, automatic backup of the data vital for system and operational integrity is a critical task to keep up, day in, day out, so you'd better make it ergonomic (which basically means "set and forget"; you can never beat laziness, you know).


As an extra note, regarding the new organization of Tree 4, I find my newest approach refreshing yet back-to-basics. Taxonomy is in general a hard problem. Over more than a decade I have tried various ways. In the beginning I was principled but "a priori", in the sense that the structure felt neat (to my taste at the time), but because it was not data-driven, i.e. not based on the actual files that I had, there tended to be overly long paths that could have been collapsed, or even empty folders where nothing belonged. Later, my way became much more pragmatic: it emphasized convenience, and sometimes followed conventions established elsewhere. But then it was not as elegant as I would have hoped, and I just lived with that. Now, with this new opportunity and motivation, I decided to give it another try, and I like the result, for this time I believe the new taxonomy is finally going to stay, at least in its main branches, because it's as basic and logical as things get. Tree 4 is essentially my personal "favorites" of the web. We all have such a tree. But will this tree age well over a long period of time, even throughout your life? How often do you find yourself rebuilding it from scratch? I believe every such iteration has something to do with the evolution of one's own mental model of the world. I find the process satisfying, even though each major revamp can take significant effort to reach a new "final" form.

Of course, the nitty-gritty of my way of organizing things might not be super relevant, so I will mention just the two main branches of the tree: "education" and "entertainment". It's a very old classification, and in some sense, a lousy one, because lots of things can be both. But I have returned to this basic scheme, because in practice, working with what I actually have, I find the boundary well defined most of the time; and in those fuzzy cases, I have come up with a very specific measure to disambiguate stuff: everything is by default entertainment, unless you have actually learned enough from it to do something for real on your own, and that includes not only concrete skills, but your general values and the life choices that you have made along your path. And with this simple division there is an implicit goal: to move more stuff, or shift the weight, from entertainment to education, because that would mean that, instead of mere passive consumption, you are truly learning (as one definition of learning is change of behavior).

Silicone Sealant

This has also been on my mind for a long time, and I finally managed to get to it! The kitchen bench around the sink is a wet place, and mold develops easily along the joints, as no sealant was applied there during the initial installation. The worse problem was that the silicone sealant the installer did apply around the sink was not mold-resistant, and within two years, the fungal intrusion had reached a terminal state. A bleach-based gel could temporarily remove the stains at the hard junctions, but was not effective on the filthy sealant.

Again, just like when I learned drilling, I watched several YouTube videos beforehand. A spray bottle of diluted detergent is a common trick to render the applied area (including your fingers) non-stick, and it seemed that the secret to doing a neat job was really just having a well-designed profiling kit at hand: basically little scraping pieces with cut corners that trim and smooth the silicone into a consistent, uniform finish. Fingers might work where precision is not required, and I later found that fingers, made non-stick with the solution, were helpful for reinforcing the sealant's attachment by applying slight pressure after the use of a profiler.

I didn't have a nice profiler with me, well, because I had deemed this job strictly for August and it was the very last day (yup, a stupid obsession with deadlines). What I did have was a piece of hard plastic that came with the sealant. It wasn't made for rendering the thickness I needed, so I cut one corner wider with pliers and smoothed the edge with sandpaper, which gave me a decent makeshift tool for getting a fairly smooth and neat surface at the bench joint.

It turned out that applying and smoothing the silicone at an open joint was the easy part. When it came to the sink, I realized that I first had to thoroughly remove the old sealant, cutting all the way in underneath, instead of just trimming the visible portions poking around the edges where the black mold revealed itself. Once I had cut in far enough, I realized that the whole sink could be slightly lifted and detached from the bench, and what was revealed then was immensely more disgusting. But at the end of the day, it was just more scraping with a blade, finishing all four sides.

When everything underneath was clean, the time finally came to apply the new sealant, which was supposedly anti-fungal. The tricky part was that I had to prop the whole sink up so that I could apply the sealant along the sides of the opening in the bench, but some firm, broad-based metal containers did the job. I then noticed that the sink had its own seal underneath, a spongy material that had gone fragile after being soaked for so long; a portion of it had fallen off and was dangling, so I had to glue it back as well. What a distraction. I had pencil-marked the contour of the sink beforehand, so I knew how much sealant to apply around the opening, and where to put the sink back. When the sealant was finally applied, I put the sink back down and watched the extra silicone get squeezed out.

This final part was tricky, and I did not do a good job at it. The squeezed-out silicone was irregular and messy, and while I was attempting to smooth it out, it had already started solidifying... the end result wasn't pretty. But I told myself that the critical part was to create a firm, mold-resistant seal underneath; the visible portions were purely aesthetic. I decided that I would simply cut away the exposed sealant around the contour, because from experience, the old sealant wore out quickly around the sink edges anyway.

Not a perfect job, but for a first-timer? I can live with it. Now, every time I see the nicely sealed joint on the bench, where water can no longer sit and breed mold, it feels really nice! The sink trimming, though, is still a to-do as of this writing. I will get to it later.