For as long as I can remember, I’ve been what I would call a visual learner when it comes to language. I enjoyed my few opportunities to participate in spelling bees when I was in grade school, though my one chance to go to the board-wide contest was dashed by “paramecium” (though on the bright side, I haven’t misspelled it either time that I’ve used the word since then). But while I suspect that many people who read this article will be almost exclusively familiar with languages in Latin script, such as English, anyone familiar with some form of Chinese, for example, knows that there is more than one way to put a language together. So today, allow me to present you with a different kind of system for a language, and how to create valid words in that language.
Let’s start with an alphabet. One nice thing about the system we’ll be using is that we can pick any collection of symbols that we want: letters, numbers, tally marks, you name it. As one might expect, our words are going to be strings of symbols in our alphabet. However, we’re going to tweak the familiar English rules of wordsmithing a bit.
First, let’s consider that every symbol in our alphabet is itself a word. This might seem odd to a native English speaker, since English only uses two, maybe three, one-letter words, but it isn’t uncommon in other languages, especially ancient ones, to have a symbol that can act as either a letter or a word in its own right, depending on the context.
Second, we’re going to add another ingredient to our language: extra symbols that we need in order to “glue” strings of symbols together. For example, let’s say we have a symbol “^” that glues two words together to make a new word. We’ll also need parentheses, but these are really just for bookkeeping reasons that I’ll go over shortly.
Since I’m a math guy, let’s say that our alphabet consists of the counting numbers (0, 1, 2, 3, etc.), plus this “^” symbol as our connector. So, for example, “1^2” would be a word in this language.
Since “^” glues together words, not just letters (but remember, every letter is itself a word in this system), we could build a longer word like “(1^2)^4”, where I’ve added parentheses to make it clear which two words are being glued together by each instance of “^”. Since “^” can only glue two words together at a time, a string like “1^2^4” would be ambiguous, and so isn’t a valid word in our language.
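To make the rule concrete, here is a minimal sketch of a checker for these words. The function name `parse_word` and the nested-pair output are my own choices for illustration, not part of the system described above. It returns the structure of a valid word, or `None` for a string that isn’t one. One liberty to flag: the sketch tolerates redundant parentheses such as “(1)”, which the text doesn’t rule on either way.

```python
def parse_word(s, glue="^"):
    """Return a nested pair for a valid word, or None if s is not a word.

    Assumed grammar (an illustration, not a definitive spec):
      word    := operand | operand glue operand
      operand := counting number | "(" word ")"
    """
    pos = 0

    def operand():
        nonlocal pos
        if pos < len(s) and s[pos] == "(":
            pos += 1
            inner = word()
            if inner is None or pos >= len(s) or s[pos] != ")":
                return None
            pos += 1
            return inner
        start = pos
        while pos < len(s) and s[pos] in "0123456789":
            pos += 1
        return int(s[start:pos]) if pos > start else None

    def word():
        nonlocal pos
        left = operand()
        if left is None:
            return None
        if pos < len(s) and s[pos] == glue:
            pos += 1
            right = operand()
            if right is None:
                return None
            return (left, right)  # the glue joins exactly two words
        return left

    result = word()
    # Anything left over (like the "^4" in "1^2^4") makes the string invalid.
    return result if pos == len(s) else None
```

With this sketch, `parse_word("(1^2)^4")` gives `((1, 2), 4)`, while `parse_word("1^2^4")` gives `None`, matching the ambiguity described above.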
You can imagine that we could iterate this process on and on, building up bigger and bigger words (and accumulating more and more parentheses), and eventually we’d end up with a sort of catalogue for our language, containing all possible words that we could build.
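The iteration can be sketched in a few lines, with two assumptions that are mine rather than the text’s: the alphabet is cut down to a small finite list (the full set of counting numbers never ends), and we stop after a fixed number of gluing rounds. The name `catalogue` is just for illustration.

```python
from itertools import product

def catalogue(alphabet, rounds, glue="^"):
    """Build every word reachable in the given number of gluing rounds."""
    words = set(alphabet)  # every symbol is already a word

    def wrap(w):
        # Parentheses are only needed around words that already contain glue.
        return f"({w})" if glue in w else w

    for _ in range(rounds):
        # Glue every ordered pair of the words built so far.
        words |= {wrap(a) + glue + wrap(b) for a, b in product(words, repeat=2)}
    return words
```

Even with the two-symbol alphabet `["1", "2"]`, the catalogue goes from 2 words to 6 after one round and 38 after two, which gives a feel for how quickly it balloons.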
Great, now we have a language… kind of. It really isn’t useful for anything right now. So far, we’ve come up with a bunch of words, but we still need to decide what each of them means, turning our catalogue into a dictionary, so to speak.
Now, writing a dictionary sounds quite daunting to me, so let’s do what any good mathematician does when faced with a tedious task that is secondary to the point they’re trying to make: just assume that they’ve already done it.
…Poof! And now we have a dictionary. Easy-peasy.
Having this dictionary is nice and all, but you may be able to see how it could be unwieldy to have to deal with each word as its own distinct entity, especially since our dictionary, as we’ve defined it, would contain words with more characters than there are atoms in the observable universe (the most conservative estimate I found for this number was about 10⁷⁸). One way around this would be to have some rules regarding which words look different but mean the exact same thing. Put another way, these rules would be about which words are equal to each other. In a sense, we could think of this as reducing our dictionary down to a thesaurus.
Since the whole point of this style of language is to be able to talk about math (in case you hadn’t inferred that from the title), let’s look at a math-y example of one of these languages. As before, let’s say that our alphabet is the counting numbers, but now, let’s have our connecting symbol be “+”, the usual notion of addition. In this case, saying the “usual notion of addition” is doing some work behind the scenes here, because not only does it implicitly tell us what the definition of every word in the language is, it also says that all of these definitions are themselves words in this language. So we get the thesaurus as well as the dictionary.
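As a rough sketch of what “the usual notion of addition” buys us, suppose (this representation is my assumption, purely for illustration) that a word is either a counting number or a pair of words glued by “+”, so that “(1+2)+4” is written as `((1, 2), 4)`. The hypothetical helpers `meaning` and `same_meaning` then play the roles of dictionary and thesaurus.

```python
def meaning(word):
    """Look a word up in the 'dictionary': its definition under addition."""
    if isinstance(word, int):
        return word  # a single symbol means itself
    left, right = word  # a glued word is a pair of smaller words
    return meaning(left) + meaning(right)

def same_meaning(a, b):
    """The 'thesaurus' rule: two words are equal when their definitions agree."""
    return meaning(a) == meaning(b)
```

Here `meaning(((1, 2), 4))` is `7`, which is itself a one-symbol word in the language, and `same_meaning((1, 2), (2, 1))` is `True` even though the two words look different.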
Speaking generally again, referring to the contents of a “thesaurus” like this as a language may still seem overly generous, as at this point we really just have a bunch of words that mean things. And since we hand-waved away all our definitions, there’s still nothing resembling grammar or anything that could reasonably be called a sentence in this language.
So there is more to this story, but that will have to wait for next time. And if you’ve been following my other posts and are joining me on this detour, don’t worry, I promise that it’ll all come back to categories in the end.
Image by Lazur via Openclipart