Only the wicked flee when no one pursueth.
and asked the students to translate this into first-order logic (FOL), e.g.,
\forall x, y .
(Flees(x, y) \land \neg \exists z . Pursues(z, x)) \rightarrow isWicked(x)
Every now and then I get a student who hears the above proverb and defines
When the chuckles subsided, a couple of things occurred to me.
One was that archaic forms such as "pursueth" were alien to some international students (e.g., those from India and China), about as unrecognizable as cursive handwriting.
A second revelation, the main topic of this post, was that the entropy of typos, word-sense ambiguities, and spelling ambiguities varies from case to case and is not easily bounded by a constant. Where it can be measured, the predictive entropy would be an interesting quantity to track. Someday, this might aid in the recognition of double entendres and intended puns.
But how might one use this information in general? Specifically, how would one hook a quantitative analyzer of text- or speech-based discourse up to a training corpus and discover the highest-impact typos?
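One way to make the question concrete, purely as a sketch: treat each token's plausible readings as a probability distribution and score the token by the Shannon entropy of that distribution. Everything here is an assumption for illustration, not an established method: the function names (`edits1`, `candidate_distribution`, `entropy_bits`, `rank_by_ambiguity`) are hypothetical, the candidate readings are simply the corpus words within one character edit of the token, the weights are raw unigram counts, and "impact" is equated with entropy.

```python
import math
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """Return every string one edit away from `word`
    (single deletion, adjacent transposition, substitution, or insertion)."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def candidate_distribution(token, counts):
    """Assumption: the plausible readings of `token` are the corpus words
    within one edit of it (plus the token itself, if the corpus contains it),
    weighted by raw corpus frequency. Returns {} if nothing matches."""
    candidates = [w for w in edits1(token) | {token} if counts.get(w, 0) > 0]
    total = sum(counts[w] for w in candidates)
    return {w: counts[w] / total for w in candidates}


def entropy_bits(dist):
    """Shannon entropy, in bits, of a probability distribution."""
    return sum(-p * math.log2(p) for p in dist.values())


def rank_by_ambiguity(tokens, counts):
    """Sort distinct tokens by the entropy of their candidate readings,
    highest first (ties broken alphabetically)."""
    scored = {t: entropy_bits(candidate_distribution(t, counts)) for t in set(tokens)}
    return sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Toy corpus, invented for this example.
    corpus = "we hear from the firm that the form came from the farm".split()
    counts = Counter(corpus)
    for token, bits in rank_by_ambiguity(["form", "fomr", "hear"], counts):
        print(f"{token}: {bits:.2f} bits")
```

A token whose neighbors are all rare scores near zero bits, while one that sits between several common words scores high. A more serious version would condition on context (an n-gram or neural language model instead of unigram counts) and, for speech, would use phonetic rather than character-level distance.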