So March has been about the GUI. My plan was to tackle the more challenging tasks first, that is, the things I had never done before, and then move on to the actual "art work", which would feel easy and relaxing in comparison, since it's all about graphic details and usability / ergonomics. Just take your time, you can't fail at that.
It turned out that the most mentally stressful part was exactly that long grind after the excitement of getting the novel problems solved. Well, to be fair, it wasn't all psychological: there was also the technical challenge of "responsive design", that is, making the UI adapt to mobile screens as well as the desktop. It's not really hard to do per se; the mind-boggling part was the struggle to find an elegant way to do it, particularly with Elm-UI.
Previously I had only heard that artists and visual designers often "agonize over every detail" of their work. Now I know how that actually feels. I agonized, too. And the funny thing is, I wasn't even trying to get it perfect yet. At this stage I just wanted an initial draft of a GUI design that was decent and could serve as a foundation for future iterations. Even that was not easy.
But I have to admit, most of the difficulty was indeed psychological. I still believe it was the right decision to work on the novel challenges first, but part of what made the rest of the month so tough was the sharp contrast between the two phases in pace, mode of working, and mindset. When you make a mechanism work for the first time, you can check the box, be happy, take a break, and move on to the next challenge. GUI design, by contrast, is filled with getting lost in an immense space of options, trying to rely on your intuition and experience just to settle on a single point you can hold onto and write the first line of code... That is the grind. I can now better appreciate the role of a product's design team, and how much mentally easier it is for a programmer who only has to take a final design document from them and code it up (especially with something as intuitive as Elm-UI)!
But I turned that into a learning experience. The growth of a programmer isn't just about technical leveling-up; no less important is learning how to deal with one's feelings.
To be clear, I knew making a good GUI was hard. I used to read blogs by indie developers building their commercial products, and it seemed the process of improvement never ends, often hinging on very minute details. (Side note: maybe being a commercial product does play a role here. From what I see, excellence in GUI doesn't often come out of open-source projects, with the exception perhaps of those backed by the tech giants, which might be an indicator that making a good GUI is 1) a lot of hard work, and 2) very often underappreciated relative to its cost in time and energy.)
In retrospect, I think what made it harder, especially at the beginning of the GUI work, was my underestimation of what even a very first iteration would take. I didn't have that much GUI experience, and I think I forgot that. Responsive design in particular, done in Elm-UI rather than via CSS media queries, was a complete first for me. So I had a cognitive distortion, and that was the real source of the distress. I recall the time before my formal programming education, when I played with making webpages and a simple image overlay viewer in JavaScript... The point is, I could spend tremendous amounts of time on that precisely because I knew so little, yet I never felt "unproductive", no "performance anxiety" whatsoever, because it was all play and fun. Yes, things are clearly different now, because I have projects I consider "real", whatever that means. But the seriousness also makes me more vulnerable to negative emotions, which may in turn hold back my mental growth as a programmer. Having a schedule with a monthly target has been a great mechanism so far, but I need to pay close attention to such warning signs.
To be fair, I eventually reached the goal set for March: there is now a mostly complete first iteration of a responsive GUI, and I felt immensely happy and relieved. But I think it's important not to simply ignore the period of distress that came before. Winning the battle, so to speak, doesn't automatically make you immune to PTSD. So that's my take: it's more meaningful to reflect on what went wrong, provided you are in a positive, growth mindset when doing the analysis.
OK, now let's get into the technical aspects. The two mechanisms I designed and implemented for the very first time were 1) multilingual support, and 2) dark & light theme support. They're pretty common today and are necessary components of a mature, production-quality, "commercial-grade" if you will, GUI product, which is why I wanted to seize the opportunity to give them a try. And to me personally, they both matter: my parents don't speak English, and my eyes (and a lot of other people's) feel less strained in dark mode. So I'm very glad I made these two work!
Multilingual Translation
The strategy is that only one "default" language is hard-coded into my source code, and its translations into other languages are hosted as network resources. This is for two reasons. First, translations are essentially repeated information, so putting them all in the source code would be pretty WET. Second, apart from Chinese, my native tongue, I can't tell whether a translation from English into another language is well-executed, and machine translation is still sub-human, so quality control would be an issue.
Unsurprisingly, keeping translations out of the source makes the necessary mechanisms more complex, but the result wasn't too crazy. The fetch API is pretty handy, and I devised a simple versioning scheme to enable caching of these network resources in localStorage.
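To make the versioning idea concrete, here is a minimal sketch of the decision the app can make at startup against whatever was left in localStorage. The names (Cached, Decision, decide, currentVersion) and the version string are made up for illustration, not my actual code.

type alias Cached =
    { version : String
    , payload : String
    }

type Decision
    = UseCache String
    | FetchFresh

-- The translation-file version this particular build expects.
currentVersion : String
currentVersion =
    "3"

-- Given whatever copy was previously stored in localStorage (handed to Elm
-- via flags or a port), decide whether it can be reused or must be re-fetched.
decide : Maybe Cached -> Decision
decide maybeCached =
    case maybeCached of
        Just cached ->
            if cached.version == currentVersion then
                UseCache cached.payload

            else
                FetchFresh

        Nothing ->
            FetchFresh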
My approach to i18n is to keep the usage of the canonical language ("en-US") as simple as possible (but still natural), that is, to avoid unnecessarily complex patterns (typically the highly localized ones) that would be hard to carry over to other language systems. Whenever possible, stick to a highly decoupled, "modular" way of describing your UI components, so that only short, self-contained tokens need to be translated at a time. When complete sentences are unavoidable, make sure their meanings are unambiguous even without context.
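As an illustration of what such "short, self-contained tokens" could look like, here is a hedged sketch; the identifiers (Phrase, enUS) and the example phrases are hypothetical, not my real ones.

-- Each UI phrase is a token in a union type, so translators only ever
-- deal with small, self-contained pieces of text.
type Phrase
    = SaveButton
    | CancelButton
    | ItemDeleted

-- The canonical language is hard-coded; other languages come from the
-- dictionaries fetched over the network.
enUS : Phrase -> String
enUS phrase =
    case phrase of
        SaveButton ->
            "Save"

        CancelButton ->
            "Cancel"

        ItemDeleted ->
            "The item has been deleted."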
This minimalism isn't because I view language as mere cold, dumb symbols of instruction. But in a GUI, verbal descriptions are mainly for orientation: once you get comfortable with the UI, the words become pretty much invisible in your mind, and you rely instead on shapes and colors to navigate (hence the frequent use of icons in GUIs). A GUI is therefore not the place to express ideas through sophisticated literary composition (the design of the software itself speaks to people instead). Words only have to do one thing: be helpful, and then get out of your way.
Another note: for multilingual UI support, one critical part to get right is the very first page that new users land on, because this is where non-English speakers will feel the most helpless. It is imperative that a list of available languages be presented to them before anything else, and the list must include each language's label in its native text, in addition to the English name.
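For what it's worth, a sketch of the shape that menu's data might take; the record fields and the entries are placeholders, not my actual list.

type alias LanguageOption =
    { id : String
    , nativeLabel : String
    , englishName : String
    }

availableLanguages : List LanguageOption
availableLanguages =
    [ { id = "en-US", nativeLabel = "English", englishName = "English" }
    , { id = "zh-CN", nativeLabel = "简体中文", englishName = "Chinese (Simplified)" }
    ]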
Now there is a subtle complexity in implementing this mechanism. New users, or "guests", by definition haven't created their account yet. For registered users, the language preference is stored in an account-specific database, retrieved upon a successful login, and synced with the in-memory state as they change it. But this whole machinery is not available to guests! To make it work for them while keeping the logic simple and straightforward, my approach is to maintain a separate state representing the guest's language preference. To be more nuanced, I use a Maybe String for the guest language ID, because Nothing can then encode "intention", i.e. whether the guest has made a conscious choice or just left the default. For example, say the default language is "en-US". If the guest simply dismisses the language menu, the state stays Nothing; and if this turns out to be a returning user who had previously picked a non-English preference, then upon login the guest state will not be pushed to the database, and the old preference is loaded into the app instead of being overridden. But if the user intentionally picked some language (it could even be English), the guest state becomes, e.g., Just "en-US", and that signals a push of this guest preference to the database upon login.
Keeping two separate preferences, one for logged-in users, the other for guests, is by no means the simplest possible approach, but to me it is by far the most intuitive, because then you don't have to think about resolving various kinds of conflicts before and after login, as you'd have to do if you used a single preference.
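A minimal sketch of the intention-encoding described above; the type, function, and field names here are illustrative, and the Bool simply stands for "push the guest preference to the database".

type alias Preferences =
    { languageId : String }

-- Resolve the guest's choice against the stored account preference at login.
resolveOnLogin : Maybe String -> Preferences -> ( Preferences, Bool )
resolveOnLogin guestChoice stored =
    case guestChoice of
        Nothing ->
            -- No conscious choice was made as a guest: keep the stored
            -- preference and push nothing to the database.
            ( stored, False )

        Just chosenId ->
            -- The guest picked a language on purpose (possibly the default):
            -- it overrides the stored preference and gets pushed upstream.
            ( { stored | languageId = chosenId }, True )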
Color Themes
This time, everything is in the source code. Yes, some applications offer users color-scheme customization, but as an occasional user of "user style" browser addons myself, I see that kind of customization as essentially a last resort born of helplessness: users turn to it when they know the developers will never produce anything they find decent, and achieving anything decent that way is a lot of work. I believe this kind of delegation of responsibility is not healthy. When it comes to aesthetics, I do believe that the maker of the machinery should also be the author of its visual presentation (and good visual design can greatly facilitate the use of the underlying functions, as we see in all kinds of everyday hardware products). That's why I want to design the palettes (Dark and Light, for now) with care, and I take responsibility for producing decent results and continuing to improve them.
One technique I find particularly useful when designing multiple palettes is using "semantic" colors, as opposed to physical ones. For instance, we can define a semantic color "Background", which maps to either a dark or a light color depending on which theme is in use. This way, we have a single color scheme specification, regardless of how many themes we create.
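A sketch of what this can look like in Elm-UI; the RGB values are placeholders, and Theme, SemanticColor, and toColor are just names I'm using for illustration.

import Element exposing (Color, rgb255)

type Theme
    = Dark
    | Light

type SemanticColor
    = Background
    | Foreground

-- One specification, resolved to physical colors per theme.
toColor : Theme -> SemanticColor -> Color
toColor theme semantic =
    case ( theme, semantic ) of
        ( Dark, Background ) ->
            rgb255 30 30 30

        ( Dark, Foreground ) ->
            rgb255 230 230 230

        ( Light, Background ) ->
            rgb255 250 250 250

        ( Light, Foreground ) ->
            rgb255 20 20 20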
There are a few subtle aspects to semantic colors, though:
- In most cases, you probably need two versions of a given semantic color, one for use in the background and the other in the foreground, primarily for the sake of contrast and theme compatibility. For example, say we have a semantic color named "Informative" that maps to some darkish blue for the Dark theme and some light blue for Light. You will then realize that this only works well as the background color of elements (because in Dark mode the font color is light and in Light mode the font is dark, so the contrast works nicely), but when you apply it to the foreground (i.e. text), it absolutely fails. The solution is to define two semantic colors, or rather a pair, such as "FgInformative" and "BgInformative", and the nice part is, you don't need 4 physical colors in total to map them to! For example, BgInformative maps to darkish blue in Dark mode and light blue in Light, and for FgInformative you simply switch the two around, giving nice contrast in both themes. (Note: the aforementioned "Background" color, used as the main background of the app, by definition doesn't have two versions, but together with its counterpart "Foreground" it still makes the same kind of pair; it's equivalent to naming them "BgMain" and "FgMain".)
- Although most often we want to define semantic colors in a general-purpose way, as in the examples above, for some UI constructs or effects they may not work. For instance, a classic visual effect is the "joint" effect, where two blocks are joined with a "seam"-like 3D illusion by coloring two adjacent borders properly. These typically use grayscale colors, so you might think you could reuse your Background and Foreground variants for them. But no: the problem is that these general-purpose semantic colors tend to move in opposite directions along the spectrum between Dark and Light. For instance, if you define a semantic color that is stronger than the normal one, such as "BgSalient" for the background, then in Dark mode that could mean true black, and in Light, true white. You might find that BgSalient works as the dark line (I call it the "trench") in the joint effect, but when you switch to Light mode you will see a bright white line, which can never work as the trench. What you need there in Light mode is a much lighter gray that appears just dark enough against your main light background. The point is, for such highly specialized visual effects, sometimes you just can't use the general-purpose semantic colors. But that is not a problem at all! The benefit of using a union type to define all your semantic colors is that you can simply add more when needed; for this example, we might define a new pair named "JointTrench" and "JointRidge" and map them to physical colors independently of the mappings of the other, more general semantic colors (see the sketch after this list).
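Extending the sketch above, the "Informative" pair and the specialized joint colors might look like this. Again, the physical values are placeholders; the point is only that the same two blues swap between themes, while the joint colors get their own independent mapping.

type PairedColor
    = BgInformative
    | FgInformative
    | JointTrench
    | JointRidge

pairedToColor : Theme -> PairedColor -> Color
pairedToColor theme semantic =
    case ( theme, semantic ) of
        -- The same two blues serve both variants, swapped between themes.
        ( Dark, BgInformative ) ->
            rgb255 25 55 100

        ( Dark, FgInformative ) ->
            rgb255 150 190 255

        ( Light, BgInformative ) ->
            rgb255 150 190 255

        ( Light, FgInformative ) ->
            rgb255 25 55 100

        -- The joint effect needs its own mapping: in Light mode the trench
        -- is a light gray, not the near-white a general BgSalient would give.
        ( Dark, JointTrench ) ->
            rgb255 0 0 0

        ( Dark, JointRidge ) ->
            rgb255 70 70 70

        ( Light, JointTrench ) ->
            rgb255 200 200 200

        ( Light, JointRidge ) ->
            rgb255 255 255 255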
Responsive Design
Besides the visual design itself, which is just irreducible hard work dependent on one's skills and experience at a given point in time, another major source of struggle during this phase of development was deciding on a good way to do responsive design in Elm-UI, that is, how to write different code for different devices.
On one hand, we want minimal code duplication; on the other hand, if we have only a single set of view functions for both devices, then each view function has to take a device parameter, and that can have negative performance consequences. One obvious mistake to avoid is taking the raw device dimensions (the viewport size) as a parameter and repeating the device classification in every view function. Classification should be done only once, in the root view function, and when a descendant view function needs to take cases on the device, pass in the union type that represents the variants (e.g. Mobile vs Desktop).
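As a stand-in for the Device.classify used in the demo further below, here is a minimal sketch of doing the classification exactly once; the 700px breakpoint is an arbitrary assumption, and Elm-UI's own classifyDevice helper could play a similar role.

type Device
    = Mobile
    | Desktop

type alias Viewport =
    { width : Int
    , height : Int
    }

-- Run once in the root view; descendants only ever see the resulting variant.
classify : Viewport -> Device
classify viewport =
    if viewport.width < 700 then
        Mobile

    else
        Desktop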
Now the question is, when to use a single function for multiple device types, and when to use separate functions for each device (so that none has to take the device parameter)?
So far, my rule of thumb is the following: if the overall layout and ordering of UI components differ between devices, then make separate functions. But if the difference is only in styling details, then use a single function and take cases on the device type. Essentially, I think it boils down to whether you can cleanly extract a single piece of logic expressing the overall arrangement of elements and just parameterize the device-dependent details.
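For the "only styling details differ" case, this is the kind of shared function I mean; a sketch only, with a made-up greeting function and placeholder padding and font-size values.

import Element exposing (Element, el, padding, text)
import Element.Font as Font

-- Same layout on both devices; only spacing and font size branch on the type.
greeting : Device -> String -> Element msg
greeting device name =
    let
        ( pad, size ) =
            case device of
                Mobile ->
                    ( 8, 16 )

                Desktop ->
                    ( 24, 20 )
    in
    el [ padding pad, Font.size size ]
        (text ("Hello, " ++ name))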
Ultimately, I think there is no single winning choice in general. Having more separate device-tailored functions means more functions in your compiled code (up to doubling the number if you consider just mobile vs desktop), which might increase the asset size, though I haven't done anything to check that. Having more "shared" functions that take cases on the device type means more functions depend on the device, and thus must be reevaluated when, say, a viewport resize crosses a device-type boundary, which is not a good thing, especially if you want to make the most out of Lazy (clearly, what is much worse is passing the raw viewport size instead of the device type into such view functions, as the reevaluation will then happen far more frequently during a resize).
Nevertheless, I noticed in my code that if you branch the view code at the very top of the hierarchy (namely, by taking cases on the device type as close to the root view function as possible), then even if many of the descendant functions are shared between device types, they won't be reevaluated upon a device change, because the top function passes a specific device variant (effectively a constant) to them. So when the device type changes, only the top function gets reevaluated, not the downstream component functions. That is to say, in such a usage pattern, a function that takes cases on a parameter passed in as a "constant" by its caller is, in terms of Lazy performance, as efficient as a function that doesn't have the parameter at all.
As a demo, say ui is my top-level Elm-UI function that does the device classification and then takes cases on it.
ui : Model -> Element Msg
ui model =
    let
        device =
            Device.classify model.viewport
    in
    column []
        (case device of
            Mobile ->
                [ mainUi Mobile model.mainStuff
                , navMui model.navStuff
                ]

            Desktop ->
                [ navDui model.navStuff
                , mainUi Desktop model.mainStuff
                ]
        )
Note that in each device branch, I pass the device variant (as a constant) to mainUi, whereas I use two separate functions for the nav component, navMui for mobile and navDui for desktop, which don't take the device parameter. The point is, upon a device-type change, only the ui function is reevaluated, switching the active branch to load a different UI tree; there is no reevaluation of downstream UI components (since I didn't pass the device variable through), so there is no performance difference between the device-shared mainUi and the pair of separate nav functions.
GUI Mockup
For a long time I thought my way of doing GUI art work was just primitive / low-tech, while pros use graphics editors to produce "real-life" visual specifications of what to code up. As discussed above, I can now fully appreciate the benefit of having a concrete target before writing the code, "code to target" if you will.
But producing such high-quality mockups is a real profession, that is, a skill that is not trivial to obtain. I designed the icon for Arrow last year in Inkscape, and I could gauge from that process how far I was from being able to put my thoughts onto an SVG file.
One way to think of the professional practice of producing accurate mockups is that it is a form of communication overhead: if the GUI designer could neuro-link to the coder with the same fidelity as an SVG, and the coder could intuitively write the code to achieve the visual spec, then this long process could be short-circuited.
And I do have this neuro-link with myself. With Elm-UI, I can more confidently than ever produce accurate results from a visual spec. So this may not be that low-tech after all. This is the benefit of a monolithic organization (organism).
OK, that was all half-joking of course. And even though I didn't end up trying to become an Inkscape guru before starting the GUI work, I did use my pencil and scratchpad to sketch up some rough schemes of various UI components that I couldn't wrap my head around. This is essentially "paper Inkscape", isn't it?
I do hope that someday I will enter the right mind space to just sit down and practice Inkscape, like an artist. At the very least it'll make the icon creation process in the future much smoother.
But still, for a solo developer who does both the visual design and the implementation of the GUI, isn't the whole point of Elm-UI to eliminate the extra step of producing this intermediate representation as a vector graphics file? As long as we have an idea of what we want to achieve, maybe with some sketching, isn't Elm-UI there precisely so we can iterate on the details in real code even faster than by drawing pictures?
In a sense, all Inkscape does is to provide an intuitive and reliable way to create an SVG specification of what you want. Now replace SVG with HTML/CSS. At least in theory, if Elm-UI is more intuitive and reliable, and thus efficient, in producing the visual result than a graphics editor, and it produces the real end-product instead of a mockup, then we can argue that it renders such tools obsolete, unless, again, the GUI designer cannot neuro-link to the coder. That is to say, the process that we regard as professional-grade, and industrial-strength, is indeed essentially the communication overhead.
Now this same argument, coupled with reality, also shows how much hard work it is to make GUI software, by way of proof by contradiction. If complex, high-quality GUIs could be easily produced by solo developers, why would the tech companies hire what is essentially a bunch of artists to draw fake stuff, when the coders could do it all alone? Therefore, not many people can master both GUI design and implementation, and even those who do might not always be able to afford it, due to time or energy constraints.