April 06, 2005

Reverse Polish Notation and Language

Ever heard of Reverse Polish notation (RPN)? It was a way to allow early calculators to parse complicated mathematical expressions, using a minimum amount of memory, without needing brackets. Lately I've been thinking about its application to languages, and this time I'm not talking about programming languages. (Although Forth and Joy use RPN, and LISP and its relatives use Polish notation, or PN.)
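As a rough illustration of why RPN needs so little machinery, here's a minimal Python sketch of a stack-based evaluator. The names `eval_rpn` and `OPS` are made up for this example, and it only handles four arithmetic operators on space-separated tokens.

```python
import operator

# Hypothetical operator table for this sketch: every operator is binary.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def eval_rpn(expression):
    """Evaluate a space-separated RPN expression using a single stack."""
    stack = []
    for token in expression.split():
        if token in OPS:
            if len(stack) < 2:
                raise ValueError("operator %r is missing an operand" % token)
            # The operand pushed last is the right-hand argument.
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("expression left extra operands on the stack")
    return stack[0]

# (3 + 4) * 2, written without brackets
print(eval_rpn("3 4 + 2 *"))  # 14.0
# 3 + (4 * 2), written without brackets
print(eval_rpn("3 4 2 * +"))  # 11.0
```

The only state the evaluator keeps is the stack of pending operands, and the grouping is carried entirely by token order.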

There are people out there who construct languages (conlangs) for a variety of purposes, from providing atmosphere for fiction (Klingon) to facilitating international communication (Esperanto). A subset of this group, often called loglangers (for "logical language makers"), is concerned (in part) with creating languages that are unambiguous. One type of ambiguity they want to eliminate is attachment ambiguity. Here's an example:

(John saw the lady who watched the crowd) with a telescope.

John saw (the lady who watched the crowd with a telescope).

We solve this problem in English by putting a comma after "crowd" if we want the first meaning, but English can't handle "X saw Y who saw Z who saw W with a telescope", "X saw Y and Z who saw W with a telescope", or any number of other complicated expressions.

Some loglangers have solved this problem by including bracket-like "elidable terminators" (optional words for marking the end of a phrase or other expression) in their language. I'm specifically thinking of Lojban. However, if you treat verbs as operators and noun phrases as operands, you can use RPN or PN to solve this problem entirely with word order:

John the lady who the crowd watched with a telescope saw

John the lady who the crowd with a telescope watched saw

The only words that have moved are "watched" and "saw", switching from Subject Verb Object (SVO) order to SOV order. Actually, there's a fourth element, the prepositional phrase, so what I've really done is switch from SVOP to SOPV. SOVP would also solve the attachment ambiguity.
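To make the operator/operand idea concrete, here's a toy Python sketch. It's a simplification, not the exact word order above: it assumes that "saw", "watched" and "with" are all binary operators written after their two operands, it drops the articles, and it lets "lady watched crowd" stand in for the relative clause. The names `parse_postfix` and `show` are invented for illustration.

```python
# Toy vocabulary (an assumption for this sketch): every operator is binary.
OPERATORS = {"saw", "watched", "with"}

def parse_postfix(tokens):
    """Build a tree from postfix tokens; no brackets are needed."""
    stack = []
    for tok in tokens:
        if tok in OPERATORS:
            if len(stack) < 2:
                raise ValueError("operator %r is missing an operand" % tok)
            right = stack.pop()
            left = stack.pop()
            stack.append((tok, left, right))
        else:
            stack.append(tok)  # a noun phrase is a plain operand
    if len(stack) != 1:
        raise ValueError("sentence left extra operands on the stack")
    return stack[0]

def show(tree):
    """Render a tree with explicit brackets so the attachment is visible."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    return "(%s %s %s)" % (show(left), op, show(right))

# Reading 1: the telescope attaches to the seeing.
reading_1 = parse_postfix("John lady crowd watched saw telescope with".split())
# Reading 2: the telescope attaches to the watching.
reading_2 = parse_postfix("John lady crowd watched telescope with saw".split())

print(show(reading_1))  # ((John saw (lady watched crowd)) with telescope)
print(show(reading_2))  # (John saw ((lady watched crowd) with telescope))
```

The two token orders differ only in where "saw" lands, and each one produces exactly one tree.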

SOV is actually the most common word order in natural languages. Could that be because it allows speakers and listeners to parse the language using a minimum amount of memory, without attachment ambiguity?


At April 09, 2005, Blogger Fraxas said...

Once again, I'm going to take a couple tiny little phrases from your ((excellent) article I mostly agree with) and beat them into the ground.

It only makes sense to say that SOV languages require less memory than SVO languages if you're talking about memory devoted to the actual grammar of the language, and even then there are bigger influences on the overall complexity of a grammar. Exceptions spring to mind. Anyway, if you're talking about memory-per-sentence, I disagree completely: for two semantically equivalent sentences (which is the only rational basis for comparison of syntactic form), the same memory is required for each because they're semantically identical! Admittedly, it might be chunked differently, but that's irrelevant to your choice of words above. ;)

Also, you've made a couple thinkos in the article itself. Firstly,

(John saw the lady who watched the crowd) with a telescope

should be

John saw (the lady who watched the crowd) with a telescope

and your comment about how we solve this in English is an affront to the prescriptive grammarian in me. "We" may well solve this problem with a comma after "crowd", but that's a non-standard extension of the semantics of the comma. In fact, it's a backformation: in *spoken* English we'd pause for a quarter-beat, which is also what we do when we encounter a comma, but not all spoken quarter-beat pauses are acceptable written commas in English.

The correct way to reformulate the first sentence is as follows: John saw with a telescope the lady who watched the crowd. In fact, separating an adverbial phrase from its verb is discouraged, if not verboten, in written English in general.

At April 10, 2005, Blogger JeremyHussell said...

Well, I have to admit sympathy for descriptive linguistics. You don't write "John broke with a rock the window", you write "John broke the window with a rock". So, basically, I'm directly contradicting your last sentence. :-)

Similarly, I distinctly remember being taught how to use a comma like that in high school English class, with the teacher having us read a sentence on the blackboard, then adding the comma and asking us what the meaning had changed to.

I really did mean:

(John saw the lady who watched the crowd) with (a telescope)


(John) saw (the lady who watched the crowd) (with a telescope)

Actually, there's no standard way to parse sentences. You may treat saw as an operator that takes a variable number of arguments, while I treat saw and with as binary operators.

