Android App (2)

March 2023

Huidong Yang ✉

April 6, 2023

So the progress has been decent - basic database backup & restore, weekly stats (aggregations), and I've been using the app already! With the Material 3 default style, which I have to say is a significant improvement over Material 2, I get a very nice GUI without much effort in that department. That's a very comforting, compensating reward, as I've been kind of criticizing the "messiness" of the Android ecosystem compared to hardcore designed systems like Elm and Rust. As I said initially, I didn't come to Android for its architectural elegance, or for Kotlin (though Kotlin is no doubt a better Python, big-time), but for the ubiquity and featurefulness of the hardware. But after spending two months with it, I've started to like it more. Room (w/ KSP) is pretty good, for embracing SQL instead of trying to hide it away; Jetpack Compose is, again, quite ergonomic for UI making, with its reactive architecture and emulated "functional" style (plus all the needed escape hatches for Android practitioners, like SideEffect). I really appreciate the completeness of the Android world: SQLite support for serious data persistence and querying, a standardized i18n mechanism, build variants and optimizations, release/deploy tools, and of course, a pro-level GUI design system (namely Compose + Material). All these efforts add up to industrial-strength capability and quality, and for all these reasons I think Android will stay, even with shiny new things like Flutter and Fuchsia in the making/on the horizon.

Over the years, I've come to appreciate more and more the things that work well now, even though they're flawed. It's brave to pursue ideals, and the future might depend on those efforts, but at the same time, I can now see the amazing value of putting in solid effort using the technologies of today.

Now some specific notes about dev experiences during the month.

Modeling Homogeneous Collections of Polymorphic Items

Excuse me for the title, but it's a very practical concern that arose when I tried to get the concrete item types back from a List (or generics in general). I can't, because of type erasure.

To demonstrate the problem, let's borrow an example from CS 100-level courses. Say we've defined type Animal = Dog | Pigeon, whatever. Now once we've declared a collection as val animals: List<Animal>, we can no longer check (i.e. by using when (animals) is...) which specific animals, dogs or pigeons, the list contains. Unless we check the items inside themselves, that is. For example, we can use animals.filterIsInstance<Dog>() to get back the list of dogs, if we know the list is homogeneous (all dogs). Alternatively, we can use unsafe casting, basically asserting that we know by construction that the list is, say, all dogs, but then we need to tell the compiler to @Suppress("UNCHECKED_CAST").
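To make the two workarounds concrete, here is a minimal sketch. The fields (name, ringId) and the helper names dogsByFilter and dogsByCast are assumptions made up for illustration, not code from the app.

```kotlin
sealed interface Animal
data class Dog(val name: String) : Animal       // illustrative field
data class Pigeon(val ringId: Int) : Animal     // illustrative field

// Workaround 1: check every item at runtime and keep only the dogs.
fun dogsByFilter(animals: List<Animal>): List<Dog> =
    animals.filterIsInstance<Dog>()

// Workaround 2: assert homogeneity ourselves; the compiler cannot verify it.
@Suppress("UNCHECKED_CAST")
fun dogsByCast(animals: List<Animal>): List<Dog> =
    animals as List<Dog>

fun main() {
    val animals: List<Animal> = listOf(Dog("Rex"), Dog("Fido"))
    println(dogsByFilter(animals))
    println(dogsByCast(animals))

    // Because of erasure, the cast itself succeeds even on a mixed list;
    // a problem would only surface later, when an item is used as a Dog.
    val mixed: List<Animal> = listOf(Dog("Rex"), Pigeon(7))
    println(dogsByCast(mixed).size) // prints 2
}
```

Note that the cast version silently accepts the mixed list, which is exactly why it feels "not right".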

Neither is satisfactory. The instance filtering is wasteful, because we know they're all of a certain variant/subclass. The unsafe casting is OK, but just not right. Why doesn't the compiler know? You might say, well, because, type erasure. But if you think about it, the compiler is not at fault here. Why?

When you declare a collection of type List<Animal>, it is by definition a heterogeneous list, and whether all the items are of the same concrete type is totally incidental. So you see, type erasure isn't messing with us because the compiler is incompetent; it is an intrinsic behavior of generic types. We simply haven't expressed our constraints to the compiler, and the compiler is supposed to only work with what the user has conveyed to it.

That's why the title is about "modeling". Back to the example, if we want the compiler to understand that our list is guaranteed to be homogeneous, yet we want to unify such lists among different item types, then we need to model our data like so.

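// Assumes Animal, Dog and Pigeon are declared elsewhere, with Dog and Pigeon as subtypes of Animal.
sealed interface Animals {
    // Common view of the items, for code that doesn't care about the variant
    val list: List<Animal>

    // Each variant narrows the item type; this is allowed because List<out E> is covariant
    data class Dogs(
        override val list: List<Dog>
    ) : Animals

    data class Pigeons(
        override val list: List<Pigeon>
    ) : Animals
}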

And when you need a homogeneous animal collection, you declare val animals: Animals, and pattern matching will be done as follows:

when (animals) {
    is Animals.Dogs -> {
        // animals.list is now List<Dog>
    }
    is Animals.Pigeons -> {
        // animals.list is now List<Pigeon>
    }
}
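Putting the pieces together, the following is a self-contained sketch of the whole idea. The Dog and Pigeon fields and the describe function are hypothetical, chosen only so that each branch has something variant-specific to touch.

```kotlin
sealed interface Animal
data class Dog(val name: String) : Animal       // illustrative field
data class Pigeon(val ringId: Int) : Animal     // illustrative field

// The "tagger" type: each variant wraps a list that is homogeneous by construction.
sealed interface Animals {
    val list: List<Animal>

    data class Dogs(override val list: List<Dog>) : Animals
    data class Pigeons(override val list: List<Pigeon>) : Animals
}

// No cast and no filtering: the smart cast narrows animals.list in each branch.
fun describe(animals: Animals): String = when (animals) {
    is Animals.Dogs -> "dogs: " + animals.list.joinToString { it.name }
    is Animals.Pigeons -> "pigeons: " + animals.list.joinToString { it.ringId.toString() }
}

fun main() {
    println(describe(Animals.Dogs(listOf(Dog("Rex"), Dog("Fido")))))
    println(describe(Animals.Pigeons(listOf(Pigeon(7)))))
    // Animals.Dogs(listOf(Pigeon(7))) would be rejected at compile time.
}
```

Since the interface is sealed, the when is exhaustive without an else branch, so adding a third variant later makes the compiler point at every place that needs updating.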

If this sounds like too much trouble, and you'd rather just suppress the unchecked cast, then you probably want a dynamically typed language. I had the same doubt (proof: I bothered to create a branch named "try/no-unchecked-cast"), but what helped me settle on the more complicated path was asking myself the question, What would I do in Elm? (ref: nb/2023-03-24)

type Animal
    = Dog
    | Pigeon

type Animals
    = Dogs (List Dog)
    | Pigeons (List Pigeon)

Is that it? Not quite. There's no such thing as List Dog, because Dog is not a type, but a variant (namely a value, or "instance") of Animal. This is where functional and OO styles differ. In Kotlin, the Dog and Pigeon classes are standalone types, but in Elm, you can't phrase the idea that a collection only contains one specific variant of some union type, because union types are by nature used to model heterogeneity instead (you can however still check the list items using pattern matching, etc., but the point here is exactly that we don't want to check, for we want to model the data such that it can only be homogeneous and the compiler knows that too).

So how to do this in Elm? Note that what we must obtain is a way to state that the Dogs variant consists of a homogeneous list of Dog "stuff", and the same goes for Pigeons. And thus the natural step to take is that we define such "stuff" types.

-- presumably, they're some records, possibly with different fields,
-- but they could in principle be anything, even union types themselves (e.g. for modeling dog variants)
type alias DogData = { ... }
type alias PigeonData = { ... }

type Animal
    = Dog DogData
    | Pigeon PigeonData

type Animals
    = Dogs (List DogData)
    | Pigeons (List PigeonData)

So you see, Animal and Animals are rather similar in structure now, and indeed, by nature, union types are just devices to make different things be treated the same way. In Animal, we want individual DogData and PigeonData to be treated as the same type, whereas in Animals, we want homogeneous List DogData and List PigeonData to be treated as the same type.

In Kotlin, a data class kind of serves the dual purpose of being the "tag" and the associated data "record", and multiple such classes can be unified via a sealed interface, which, as I mentioned before, is essentially how a union type is done OO style.
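To see the dual role, here is a hedged sketch that transliterates the Elm model into Kotlin with the tag and the record kept apart, next to the usual collapsed form. The TaggedAnimal name, the record fields, and the two label functions are all made up for this comparison.

```kotlin
// Elm-like: records first, then tags that wrap them.
data class DogData(val name: String)
data class PigeonData(val ringId: Int)

sealed interface TaggedAnimal {
    data class Dog(val data: DogData) : TaggedAnimal        // tag wrapping a record
    data class Pigeon(val data: PigeonData) : TaggedAnimal
}

// Usual Kotlin: the data class is the tag and the record at once.
sealed interface Animal
data class Dog(val name: String) : Animal
data class Pigeon(val ringId: Int) : Animal

fun labelTagged(a: TaggedAnimal): String = when (a) {
    is TaggedAnimal.Dog -> a.data.name          // unwrap the record, then read the field
    is TaggedAnimal.Pigeon -> "#${a.data.ringId}"
}

fun label(a: Animal): String = when (a) {
    is Dog -> a.name                            // the tag check already gives us the fields
    is Pigeon -> "#${a.ringId}"
}

fun main() {
    println(labelTagged(TaggedAnimal.Dog(DogData("Rex"))))
    println(label(Dog("Rex")))
}
```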

When to Diverge UI

This is a frequently encountered challenge in UI code organization. Once we have union types, it's typically the case that, although the variants differ from one another in their specific details, they share a substantial amount of UI structure, especially in terms of overall layout.

Generally speaking, there are two alternative approaches:

(1) Diverge Early and Once - starting from the root view, as soon as the UI tree starts to depend on which variant the data is, diverge right away, and make sure that this will be the single point of bifurcation (e.g. even if a column layout is shared, as long as more than one of its children diverge, we must diverge on this parent column). From then on, each data variant has its own view function. However, most likely, these variants will still have common view code down the hierarchy, in order to maintain a coherent UI component style; therefore, you just extract all those common parts into helper functions, and call them in each variant's view branch whenever needed.
(2) Diverge Late and As Often as Needed - overall, we maintain a single trunk of view code (in terms of the layout), and then diverge locally whenever some particular subtree depends on which variant the data is, by calling different subtree view functions at those diverging points all over the hierarchy.
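The two shapes can be sketched in plain Kotlin, with strings standing in for the view tree. This is an assumption-heavy toy: in a real app these would be Compose functions, and the names header, viewEarly, viewLate, dogsView and pigeonsView, as well as the item fields, are hypothetical.

```kotlin
sealed interface Animal
data class Dog(val name: String) : Animal
data class Pigeon(val ringId: Int) : Animal

sealed interface Animals {
    val list: List<Animal>
    data class Dogs(override val list: List<Dog>) : Animals
    data class Pigeons(override val list: List<Pigeon>) : Animals
}

// Shared helper, reused by both styles.
fun header(title: String, count: Int): String = "== $title ($count) =="

// Diverge early and once: one `when` at the root, one view function per variant.
fun viewEarly(animals: Animals): String = when (animals) {
    is Animals.Dogs -> dogsView(animals.list)
    is Animals.Pigeons -> pigeonsView(animals.list)
}

fun dogsView(dogs: List<Dog>): String =
    (listOf(header("Dogs", dogs.size)) + dogs.map { "dog ${it.name}" }).joinToString("\n")

fun pigeonsView(pigeons: List<Pigeon>): String =
    (listOf(header("Pigeons", pigeons.size)) + pigeons.map { "pigeon #${it.ringId}" }).joinToString("\n")

// Diverge late and as often as needed: a single trunk with local `when`s.
fun viewLate(animals: Animals): String {
    val title = when (animals) {
        is Animals.Dogs -> "Dogs"
        is Animals.Pigeons -> "Pigeons"
    }
    val rows = when (animals) {
        is Animals.Dogs -> animals.list.map { "dog ${it.name}" }
        is Animals.Pigeons -> animals.list.map { "pigeon #${it.ringId}" }
    }
    return (listOf(header(title, animals.list.size)) + rows).joinToString("\n")
}

fun main() {
    val dogs = Animals.Dogs(listOf(Dog("Rex"), Dog("Fido")))
    println(viewEarly(dogs))
    println(viewLate(dogs) == viewEarly(dogs)) // same output, different code organization
}
```

In the early version the layout (header, then rows) is repeated per variant; in the late version the layout is written once and only the variant-dependent pieces branch.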

I think there's no single best choice between the two approaches. But there are some guiding principles:

Clearly, (2) achieves maximal code reuse, at the cost of potentially multiple points of divergence throughout the view tree. But if there is only one overall UI layout structure, and the points of divergence are well localized, it can be a good choice, provided that case analysis on the union-type data isn't too cumbersome (e.g. with the aforementioned "tagger" type that wraps homogeneous lists of polymorphic items, it's just a simple "when...is"; by contrast, having to do an unchecked cast every time you diverge the view is too messy).

Nonetheless, (1) isn't always the inferior approach either. Early divergence is especially suitable if the UI has multiple fairly distinct overall layout structures, e.g. desktop vs. mobile. If within each layout, components are to be arranged in a fairly independent manner, then you might as well maintain separate UI trees up front. Now how about code reuse? Well, as long as it's easy to extract view code into helper functions, it's not really a big deal. Jetpack Compose, thanks to its functional style, is pretty good at this; with Elm (+ Elm-UI), it's even more worry-free and reliable.

Interestingly, we see these two approaches in real action in traditional CSS responsive design vs. the Elm-UI way. CSS, e.g. with mobile first, maintains one main set of styles (for small screens), and then patches it (using media queries) with modifications meant for larger screens, which is the late-divergence style. Elm-UI itself is completely flexible in terms of early vs. late divergence, but because you don't have to pick one, say between mobile first and desktop first, you can just maintain two views in parallel if you deem them too different (or if they need to evolve independently), and the two can still share code easily using functions.

For my own project, I started with early divergence, because back then diverging required an unchecked cast, so I didn't want to do it more than once. But after the refactoring, I switched to late divergence. The difference in the actual code isn't too substantial, because so far the UI tree isn't that deep. But I do have multiple locations of divergence. And again, the main rationale is that, since this is a phone app, there is just a single overall layout for my data view.

But to be honest, getting rid of unchecked cast by using the tagger type was the real move of substance here. UI code organization, code reuse and refactoring, these are pretty flexible subject matters; as long as you take the functional approach, how you write the view code is mostly chosen either in a pragmatic way, or even as a matter of personal style. (Art, not science.)