Originally published on Mon, 02/03/2014 - 14:44
My college roommate Bob told me a joke about the Lone Ranger. It went something like this:
The Lone Ranger and Tonto are on a long and lonely ride into unfamiliar territory. Suddenly, they spot on the ridge above them an angry Apache war party preparing to attack. The Lone Ranger turns to Tonto and asks, "What do we do now, Tonto?" Tonto clears his throat and says, "What do you mean by 'we', kemo sabe?"
Words often seem to have many senses, although how those senses should be distinguished or graded is controversial. A sense-enumerative lexicon such as Princeton's WordNet assigns many senses to open-class words such as "set" or "play". A continuing research problem in semantics is recognizing the lexical characteristics that are salient in text understanding.
Even a seemingly simple word such as "we" can show remarkable sense variation. Five sense variants of "we" (and "us") can be easily distinguished:
- Intimate we ("we" as in "you and I") "We should go out sometime."
- Narrative we ("myself and others but not you") Question: "What did you do last night?" Answer: "We went to a movie."
- Instructive we ("I say we but I mean you") A teacher chastising students: "We do not talk during exams!"
- Haughty we ("we" as "I") "We are not amused." Mark Twain is often credited with the line, "Only kings, presidents, editors, and people with tapeworms have the right to use the editorial 'we'."
- Tribal we ("like-minded people" – a favorite of politicians and preachers) "We need to take back Congress."
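One way to make these five variants concrete is to record, for each sense, who the pronoun actually refers to. The sketch below is only an illustration: the names `WeSense`, `Referents`, and `REFERENTS` are hypothetical, and marking the tribal sense as leaving the addressee undetermined is my own assumption rather than anything a lexicon specifies.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class WeSense(Enum):
    """The five sense variants of "we" listed above (names are illustrative)."""
    INTIMATE = "intimate"        # "you and I"
    NARRATIVE = "narrative"      # "myself and others but not you"
    INSTRUCTIVE = "instructive"  # "I say we but I mean you"
    HAUGHTY = "haughty"          # "we" as "I"
    TRIBAL = "tribal"            # "like-minded people"


@dataclass(frozen=True)
class Referents:
    """Who a given sense of "we" picks out."""
    speaker: bool
    addressee: Optional[bool]  # None: the sense alone does not settle it
    others: bool


# Assumption: this table is one reading of the list above, not an
# established sense inventory.
REFERENTS = {
    WeSense.INTIMATE: Referents(speaker=True, addressee=True, others=False),
    WeSense.NARRATIVE: Referents(speaker=True, addressee=False, others=True),
    WeSense.INSTRUCTIVE: Referents(speaker=False, addressee=True, others=False),
    WeSense.HAUGHTY: Referents(speaker=True, addressee=False, others=False),
    WeSense.TRIBAL: Referents(speaker=True, addressee=None, others=True),
}


def senses_excluding_addressee():
    """Return the senses in which "we" definitely leaves the listener out."""
    return [sense for sense, ref in REFERENTS.items() if ref.addressee is False]
```

Seen this way, Tonto's question is about the `addressee` field: the Lone Ranger intends the intimate sense, and Tonto declines to be included. Note that the table only describes the senses. It does nothing to choose among them for a given sentence.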
Understanding requires knowledge of context, even for "simple" function words. This is hard, and it helps explain the performance limitations of text analytics that do not attempt deep parsing and reasoning.