Language features across the boundary

Topic markers and context can help keep you DRY

  1. Topic markers
    1. Implicit topics
  2. Programming analog
    1. Extension functions with receivers
      1. Further parallels
  3. Conclusion

In Western languages we are used to re-declaring the topic we are talking about over and over. There is hardly a sentence without a pronoun, and we often have to find creative ways to reduce repetition so as not to sound dry and robotic.

Imagine a typical conversation between person A and B:

A: How is your new teacher?
B: I don't like him. He's too strict.
A: Is he worse than Mr. X?
B: No, but he still thinks it's the 80s.
...

This could go on for quite a bit, but after the first sentence it should be clear to everyone what the topic is. Then why do we have to repeatedly refer to it?

Think about introducing yourself to an audience. Nobody is in doubt who you are talking about, yet practically every sentence will contain an I or my.

If you grew up with such a language, this feels normal, although as developers we tend to get an allergic reaction when we see excessive repetition.

Once we discover an elegant alternative, we realize that languages like English, German, and Danish are really quite verbose. They are said to be subject-prominent.

Topic markers

Some natural languages, like Japanese and Korean, support a feature called topic markers. They allow you to declare a topic, and then largely omit referencing it throughout the conversation.

Going back to the previous example, once the topic is established, we can stop referring to it.

A: How is new teacher?
B: Don't like. Too strict.
A: Worse than Mr. X?
B: No, but still thinks it's the 80s.
...

This sounds unnatural in English, but is the default in Japanese. In fact, we can omit nearly all pronouns, as they are usually inferred from the context:

  • When A asks B a question, B is not in doubt whose teacher A is talking about.
  • When B states her opinion, both know whom it is about.
  • As long as the topic doesn’t change, there is no need to refer to the teacher with pronouns.

Implicit topics

Context-heavy languages assume a lot of implicit information, allowing for very succinct communication. Japanese is known for being on the extreme end of the context-heavy scale, as almost every sentence will omit some information.

In high context cultures many of the words you might think of as essential in another linguistic setting become unnecessary or out of place. Japanese is one such language, where things are often implied by context and mutual understanding.

Source

A classic is the sentence 僕はウナギだ, which can be interpreted as I am an eel, which doesn’t make much sense. The implicit context is that we are in a restaurant, where the sentence is used to state your order, as in I'd like the eel, please.

More generally, you omit pronouns in most conversations. When talking about yourself, stating your thoughts, or telling a story about your kids, you won’t have to say I / my. Once you open your mouth, it is clear that you are the topic. When seeing a colleague and asking a question, you won’t say You. The other person understands that he is the one being asked.

Once I went to a book store, where a band was playing. Just as I entered, they stopped playing, and I went up to one of the members to ask whether they would continue later. The entire conversation was:

Me: もう終わった? # ~ Already finished?
He: まだ。       # ~ Still.

I was taken aback by the curtness of his answer, but this is really quite common.

Programming analog

The first similar concept that comes to mind is a plain class, from which we can reference other members of the class without qualification.

As mentioned, in Japanese you can omit declaring yourself as the topic. Let’s imagine a class about ourselves, called I.

class I {
    private fun think(about: String) {}
    
    fun say() {
        think("No more subject!")
        think("Context is awesome!")
    }
}

As opposed to the following, where every call has to name the subject (and think would need to be non-private):

val i = I()

fun say() {
    i.think("therefore I am")
    i.think("repetition is boring")
}

Generally, good design will produce classes with low subject repetition. Similar to how we organize our prose to minimize repetition and keep the reader engaged. If we find ourselves frequently referring to the same subject, then that is a sign that the code belongs in a (topic) class.

This works well when we own the class, but what if the topic is out of our control?

Extension functions with receivers

Have you ever had to write a function which operates on an external type, like the below?

fun mutate(external: External) {
    external.name = "Test"
    external.age = 10_000
    external.location = getCurrentLocation()
    external.status = ...
    external.yetAnotherField = ...
    // 10 more fields to set
}

This is super annoying. We basically want a way to declare external as our topic, and then stop referring to it in every single line, as if the function were a member of the class itself.

In Kotlin, extension functions are declared by defining a receiver type for the function. Within its scope, the receiver becomes this, and can be omitted.

fun External.mutate() {
    name = "Test"
    age = 10_000
    location = getCurrentLocation()
    status = ...
    ...
}

Such receivers can be thought of as topic markers, indicating that the following block of code will invoke the topic without explicitly referencing it.
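To make the extension above concrete, here is a minimal runnable sketch. The External class, its three fields, and the getCurrentLocation stub are assumptions made up for illustration, since the original type is not shown.

```kotlin
// Hypothetical stand-in for a type we do not own.
class External {
    var name: String = ""
    var age: Int = 0
    var location: String = ""
}

// Hypothetical stub; the real implementation is not part of the article.
fun getCurrentLocation(): String = "Tokyo"

// External is the receiver: inside the body it is `this` and can be omitted.
fun External.mutate() {
    name = "Test"                    // same as this.name = "Test"
    age = 10_000
    location = getCurrentLocation()
}

// Creates an instance and applies the extension to it.
fun mutatedExternal(): External {
    val external = External()
    external.mutate()                // the topic is named once, at the call site
    return external
}
```

The topic is mentioned exactly once, where mutate() is called. Everything inside the function body is understood to be about that receiver.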

The standard library contains generic helper functions, like apply, to facilitate this inline.

StringBuilder().apply {
    append("Hello ")
    names.forEach { append(it).append(", ") }
    setLength(length - 2) // drop the trailing ", "
    append("!")
}.toString() // "Hello A, B!" for names = listOf("A", "B")

Without apply, we would have to declare the subject in every line, sometimes even twice. We could actually create a similar helper for the Japanese topic marker は:

infix fun <T> T.は(block: T.() -> Unit) = block()

RestaurantOrder() は { unagi() }
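As a rough sketch of how such a topic-marker helper could be exercised end to end: the RestaurantOrder class and its items list below are hypothetical, and this variant of は returns the receiver (the one-liner above returns Unit) purely so that the result can be inspected.

```kotlin
// Hypothetical order type; the article only names RestaurantOrder and unagi().
class RestaurantOrder {
    val items = mutableListOf<String>()
    fun unagi() { items += "eel on rice" }
}

// Runs the block with the left-hand value as its receiver (the "topic").
// Returning the receiver is an assumption added here for inspection.
infix fun <T> T.は(block: T.() -> Unit): T {
    block()
    return this
}

// Declares the order as the topic, then just says "unagi".
fun orderEel(): List<String> {
    val order = RestaurantOrder() は { unagi() }
    return order.items
}
```

Kotlin allows Unicode letters in identifiers, so は is a legal function name, and the infix modifier lets it sit between the topic and the block the way the particle does in a sentence.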

Receivers are frequently leveraged to enable custom DSLs / type-safe builders. For example, a type-safe way to build HTML in code:

html {
    header {
        stylesheet(color = "blue")
    }
    body {
        form {
            textColumn("name")
        }
        button("OK")
    }
    footer()
}

Another cool thing here is the notion of nested receivers / topics, just as in natural language. The textColumn only makes sense in the context of a form in a body in html.
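To show where the nesting comes from, here is a heavily simplified sketch of how such a builder might be wired up. All types (Html, Body, Form) and the content property are hypothetical and do not mirror any real library; only the body/form/textColumn part of the example is covered.

```kotlin
// Hypothetical, minimal builder types for illustration only.
class Form {
    val columns = mutableListOf<String>()
    fun textColumn(name: String) { columns += name }
}

class Body {
    val forms = mutableListOf<Form>()
    // The lambda runs with a Form as receiver, so textColumn() needs no qualifier.
    fun form(block: Form.() -> Unit) { forms += Form().apply(block) }
}

class Html {
    var content: Body? = null
    // The lambda runs with a Body as receiver, so form { } is available inside.
    fun body(block: Body.() -> Unit) { content = Body().apply(block) }
}

// Entry point: establishes Html as the outermost topic.
fun html(block: Html.() -> Unit): Html = Html().apply(block)

// Each nested block narrows the topic: html, then body, then form.
fun buildPage(): Html = html {
    body {
        form {
            textColumn("name")
        }
    }
}
```

Each lambda parameter is a function type with a receiver, which is what makes textColumn resolvable only once a form is the current topic. In a real DSL you would probably also annotate the builder classes with a @DslMarker annotation, so that inner blocks cannot implicitly call functions of outer receivers.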

Going back to the eel example, how could we express this in code? First, we can always assume an implicit context of I. Next, we are in a restaurant and, facing the waiter, it is clear that we intend to order food.

class I {
    fun RestaurantOrder.unagi() = order(Menu.EelOnRice, 1)
}

I { // implicit I
    restaurant { // when in restaurant
        order { // when about to order
            unagi() // when in the right context, a single word is sufficient
        }
    }
}

Notice that we can define local extensions! This really allows us to express the concept of sub-topics, i.e. the unagi restaurant order only applies to me.
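The snippet above is closer to pseudocode, since I { ... }, restaurant and order are not defined anywhere. One way it might be made to compile is sketched below. Restaurant, the lines list, the restaurant and order helpers, and the use of with(I()) in place of I { ... } are all assumptions added for illustration.

```kotlin
enum class Menu { EelOnRice }

// Hypothetical order type that records what was ordered.
class RestaurantOrder {
    val lines = mutableListOf<Pair<Menu, Int>>()
    fun order(item: Menu, quantity: Int) { lines += item to quantity }
}

// Hypothetical restaurant context.
class Restaurant {
    // Runs the block with a fresh order as receiver and returns the filled order.
    fun order(block: RestaurantOrder.() -> Unit): RestaurantOrder =
        RestaurantOrder().apply(block)
}

class I {
    // Member extension: only visible where an I is an implicit receiver.
    fun RestaurantOrder.unagi() = order(Menu.EelOnRice, 1)

    // Establishes the restaurant as the current topic.
    fun restaurant(block: Restaurant.() -> RestaurantOrder): RestaurantOrder =
        Restaurant().block()
}

fun eelOrder(): RestaurantOrder =
    with(I()) {            // implicit I
        restaurant {       // when in a restaurant
            order {        // when about to order
                unagi()    // in the right context, a single word is sufficient
            }
        }
    }
```

Outside of an I scope, calling unagi() on a RestaurantOrder does not compile, which is exactly the sub-topic behavior described above.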

Further parallels

In Japanese, adding back the omitted subject references still leads to grammatically correct sentences, but they will be perceived as awkward, and you will have trouble making yourself understood. This is analogous to unclean code with excessive duplication, or unnecessary qualifications. For others reading your code, this noise is an impairment to understanding your true intent.

Conclusion

In programming, topics are generally handled by defining classes of cohesive members, which can operate on the topic without explicit references. Kotlin adds the magic by allowing the same behavior on external types via extensions. Bonus points for local extensions!

Natural languages like Japanese have a special topic marker particle, which feels similar to defining the receiver on an extension function. In context-heavy languages we can omit a lot of implicit information, often leading to much shorter, more concise sentences.

To me, both Kotlin and Japanese have a sense of elegance, as it is super easy to produce highly expressive and concise output.


© 2024. All rights reserved.