Category Archives: lexicography

Answers I Wish I Could Send: Etymology Edition

[Ed. note: one in a series.  Emails are only lightly edited for–if you can believe it–clarity.]

Your online dictionary defines “peak” as “a pointed or projecting part of a garment; especially :  the visor of a cap or hat”; and tentatively derives the word from “pike”. This is false. “Peak” derives from “beak” (which is why “bill” is a synonym). If I am correct, your definition should be modified.

Your logic is unassailable: “peak” looks like the word “beak,” and both hats and birds have a bill. Or rather, only the hats that truly matter–good American hats–have a bill. I don’t know why we didn’t see this before.

Oh, wait–we didn’t see it before because that’s not how etymology works. Imagine being tasked with creating ancestral photo albums for everyone in your family. You start with your second-cousin; you have, as your guide and starting point, a photo of that cousin that was taken yesterday. You are led to a large, dusty room that is overflowing, Hoarders-style, with pictures. The pictures go back hundreds of years, and several are stained or torn so badly that you can only guess at who the person in frame is. Some of those pictures will be of this cousin; many of these pictures will be of people who look vaguely like your cousin; many will be of other people you don’t know; there are several of Stinky, the neighbor’s dog. The door behind you creaks shut and locks. There are closed doors to your EAST and SOUTH; to your NORTH is a dimly lit brass lantern.

This is etymology. You are likely to be eaten by a grue.

The reason that there are so few etymologists in the world is not for lack of education or desire; it’s because etymology is really frickin’ hard sometimes. Lines of derivation aren’t always clear, and you don’t just need a pretty good hint that one word derives from another, but a whole corpus full of literature that supports that. So if we give an etymology for something–even if we qualify it with “probably”–then you can expect that there’s some actual evidence for that.

 In the case of “peak,” it looks likely that it is an alteration of the earlier word “pike.” Did you know that both “peak” and “pike” were spelled “pyke” at one point? Granted, it was a point about 600 years ago now, so unless you read Middle English for fun and profit, you probably don’t know that. Etymologists do, though, because it is their job to read Middle English for “fun” and (snort) “profit.”  Not all hope for your theory is lost, however: most scholars qualify the “pike” etymology with a “probably” or “possibly.”  If we discover that “peak” and “beak” both came from some crazy Proto-Indo-European root that means “to be conspicuous to idiots,” then we will gladly update our entry.

 Question: I looked up the word “mien” and noticed the following etymology:

Origin of MIEN

by shortening & alteration from “demean”

First Known Use: 1522

However, in French, they have the same word which can mean (1) mine (mining) or even (2) someone’s expression or outward appearance.

The world is abundant, mon ami. There are many orthographic combos that appear in languages around the globe, as pervasive as late-fall ennui. That doesn’t necessarily mean that all those words are related.

Think of it: a whole life’s experience–love, death, the rains in Provence, her kiss in Milan, the flowers Mémère used to set out at dinner–to be summed up using a handful of symbols. Though we live life together, we experience it alone. The form sin shows up in English and Spanish and Norwegian and Irish and Vietnamese–it even shows up in the language of man’s dreams (Esperanto). Yet none of these sins are related. So many worlds, so few characters to share an experience. It is inevitable that we should tread on each other’s words and give them our own meanings.

In short: the English “mien” really is a shortening of “demean,” and even if it was influenced by the French mien, that is not its origin. Everything dies.

I recently read, in, I believe, the Webster’s Unabridged version, that the origin of the term “Nosy Parker” was unknown~~I believe that this term originated from a series of movies, in which the lead actor was Lionel Barrymore,known as Dr. Gillespie~~these movies, each with a different title, featured Dr. Gillespie in the lead role as not only a doctor, but a solver of mysteries~~he is wheelchair bound in each of the series, and is looked after, fretted over, and followed around by his nurse, Miss (or Mrs.) Parker~~she is constantly trying to find out what he is up to, and listens through the door, reads his messages, whatnot~~hence~~she was nosy Parker, the nurse who could not let anything alone~~~This,I feel, is where the term “Nosy Parker” comes from~~~

Please excuse my tardy reply; I was hypnotized by your tildes. They have a very William Carlos Williams feel to them:

reads his messages, whatnot



~~she was nosy Parker

the nurse who could not

let anything



In any event, that would be a wonderful etymology for “Nosy Parker,” but alas, time is not on your side. “Nosy Parker” first showed up in print in the late 1800s; Lionel Barrymore’s movies date to the 1940s. Generally speaking, the word shows up in print after it is coined, not before, though we cannot discount the existence of a band of time-traveling linguistic trolls who have an inexplicable love of Lionel Barrymore.

Sadly, this state of affairs is fairly common in etymology: there is a perfect, spot-on story about how a word came to be, and then the horrible linear nature of time (as we experience it) screws it all up. “Doozy,” for instance, is supposedly a shortened form of “Duesenberg,” a make of tres classy cars. But “doozy” shows up before any Duesenbergs do. Is that disappointing–or, dare I say, a waste of a good car? Yes. Yes it is. But no amount of wishing, willing, secret incantations, or flux capacitors will change the facts.

I’d just like to say, though your app states that the origin of the word “gorp” is “unknown,” most everybody knows that it is an acronym for “Good Old Raisins and Peanuts.”

Well, you know scholars: dumber than most.

Here is a truth universally acknowledged: we like language to make some goddamned sense. Most of the complaints we hear about how horrible English is are because it (or one of its constituents) “doesn’t make logical sense.” And if something’s origin is shrouded in mystery, it is, in a way, nonsensical–there’s no reason, event, or word combo we can blame for that word. Calling trail mix “gorp” for no discernible reason goes against our instinct for causality and our desire for tidiness. So we invent reason: “Good Old Raisins and Peanuts.” After all, trail mix has raisins in it (sometimes) and peanuts in it (sometimes), and raisins and peanuts are both good (debatable) and old (sure, why not). There it is! There’s our reason! Why can’t you just see it?

Acronymic etymologies are, by and large, total horseshit. Acronyms weren’t really popular until the late 19th century, and very, very few of them have entered English as words. So, no, it’s not “Port Out Starboard Home” or “Constable On Patrol” or “Ship High In Transit,” even though these are all logical within a flawed and totally imaginary system. No, it’s not “Fornication Under Consent of King” or “Found Under Carnal Knowledge” or “For Unlawful Carnal Knowledge” or “Fornication Unallowed in the Commonwealth of the King.” (I mean, ponder for a moment: if sexytimes were actually outlawed in the Commonwealth, don’t you think that there’d be ample record of it?)

The origins of the word “jut.”  It seems obvious the word originates with the name of the Danish peninsula Jutland described in Wikipedia as a peninsula that “juts out” in Northern Europe.  Although there may not be a documented relationship, are you able to include the obvious in the possible origins words?

Yes, we are absolutely able to do that. It’s obvious: Jutland JUTS OUT, so clearly we got the word “jut” from Jutland. While we’re at it, we are also going to change the word “boot” to “bitaly,” and I have to revise the etymology of “ballsack” to note that we probably got it from the name of that famous ribald, Honoré de Balzac.

Etymologies in dictionaries are pretty much about documented linguistic relationships. As fitting as it is that Jutland happens to jut out into the Baltic like it does, it is merely a happy coincidence. Sometimes these happy coincidences also lead to documented linguistic relationships, but we always make a note of it. “Redingote,” for instance, is a funny little word that refers to a style of coat worn by men in the 18th century. It looks sort of like “riding coat,” doesn’t it? And hey, look at that: we have documented evidence that “redingote” is actually the French adaptation (borrowed back into English) of the English “riding coat”!

But it must all come back to the documentation. Etymologists are just crackpots with evidence behind them. We don’t truck much in variable origin stories–that’s really more DC’s and Marvel’s purview.

Question: I regard Webster’s very highly, and use it very much. But I am quite shocked about the lack of knowledge about so many Words’ origin, when the answer is just across the North sea. In Norwegian, Icelandic, Danish or Swedish. The Word QUALM is a very good example.

What about Finnish, huh? Or Faeroese? NOT “ACROSS THE NORTH SEA” ENOUGH FOR YOU?

It’s a common misconception among people who really, really love their native language a lot that their native language is the Ur-language, the language from which all other languages sprang. This misconception is hard to counter: I mean, if you are positive that there is a family resemblance between Norwegian and, say, Amharic, then you are damned well going to see a family resemblance. “The word for ‘water’ in Amharic is /whah/ and in Norwegian it’s ‘vann’. SO OBVIOUS.”

Except, well, no. One of the things that etymologists must consider when weighing whether X word in Y language came from B word in C language is whether or not speakers of C language ever had contact with speakers of Y language during the time that the word first showed up in Y language. If Norway gave English speakers the word “qualm,” then you’d think we’d have some clear evidence of that from the 1500s, when “qualm” showed up in English. But we don’t. We know–because, again, etymologists read all sorts of weird stuff–that there were similar words in a bunch of Germanic languages for the 200 or so years around when “qualm” showed up in English. But not in Norwegian. Not only that, but English speakers didn’t have a ton of exposure to Norwegians in the 1500s. We were more into the Dutch at that point, sorry. So the likelihood that the English “qualm” came from Norwegian is <hearty laughter>.

To sum up: if there is an Ur-language from which all languages today descended, it is lost to time and it’s deffers not Norwegian. We are sorry to disappoint; thanks for writing.


Filed under correspondence, etymology, lexicography

Repossession: Reclaimed Slurs and Lexicography

[Ed. note: this post contains language that is considered extremely inflammatory. Caveat lector.]

People forward language articles to me all the time–usually the same article multiple times, until my inbox is nothing but language links and plaintive requests to buy more booze, please. But no one forwarded me Talib Kweli’s recent Medium post on language, probably because it was about the history and uses of the word “nigger.” I asked one of my frequent-forwarders if he had seen the post. “I had,” he wrote, “but I figured you’d have already seen it. I was not going to be the one to forward you a post on the n-word.”

The n-word. I think about slurs on a regular basis, in part because I have to explain to people why they’re entered in some of their dictionaries. It’s not unusual for me to open my email in the morning and see a message with the subject “NIGGER”; after a decade of answering these emails, I still wince when I see the subject line, stark in black and white.

Language has power, and slurs are a remarkably tidy way of asserting that power. They are not simply  neutral descriptors for a person or a group of people (“she’s a lexicographer”), nor are they merely expressive terms used as a vent for the speaker’s emotions and which could be used of any person in any group (“she’s a rotten fucker”). Slurs are descriptors that target one characteristic or aspect of a group and denigrate a member of that group (or the whole group) on the basis of that one aspect (“she’s a spic”). They are cruelly ingenious: because they are often taboo, never to be spoken and never to be discussed, they are prone to gathering around them ancillary attitudes and stereotypes about the slurred. Someone called an “uppity nigger” or a “castrating bitch” or a “flamboyant faggot” can only ignore the comment and feel the mottle of rage and misplaced shame creep up their back: to turn around and call out the speaker only confirms the stereotype they were just slammed with.

But people who are denied the dignity of an honest response, over and over again, will get wily. Language belongs to everyone, oppressed and oppressor alike. And so those at the sharp end of those words have sometimes snatched them out of the hands of their attackers and owned them as labels. It’s effective: as Kweli notes, “Why wouldn’t you want to embody that which most scares your oppressor and change its meaning?”

But language is not a political system you can overthrow; it’s personal. Slur reclamation is risky business for the oppressed, the oppressor, and the lexicographer alike.

Slurs are never a pleasant thing to define. Reading the citational evidence for them requires some internal preparation: you are about to see centuries of the ugliest ass-end of humanity on parade and it is your job not only to muscle through it, but to engage it, analyze it, explain it in detail. It is a cavalcade of suck, and you are its unwilling but unapologetic emcee. But when slurs are reclaimed, they become Janus-faced and fragmented, and what was once a straightforward (if horrible) usage is no longer.

Kweli ends his piece on “nigger” and “nigga”  with some practical usage advice:

Say nigger or nigga as much as you like, just be prepared to deal with the consequences of your actions. The consequences of context. The word has racial connotations, and those connotations are different for white people and black people, whether we choose to accept that or not. It’s about personal responsibility.

This is true, but the lexicographer looking to provide usage information can’t gloss over the “consequences of context.” If use of “nigger” or “nigga” really is about personal responsibility within context, and a lexicographer’s job is to explain how a word is generally used in context, how can a lexicographer possibly talk about the consequences of usage when they are unique to every individual speaker and his or her context? Some may think it’s socially appropriate to dismiss a white person’s use of the positive “nigga,” but it is not lexicographically appropriate to do so. If a language belongs to the whole of its speakers and a lexicographer must report on use, then for lexicographers, Eminem’s use of “nigger” is just as valid as Ice-T’s use of “nigger” is just as valid as Mark Twain’s is just as valid as Ted Joans’ is just as valid as the frothing racist Internet commenter’s–and that’s just looking at American uses of the word.

In the great ebb and flow of slur reclamation, lexicographers are often stuck knee-deep in the muck left in its wake, grubbing around for something solid to grab on to. Slurs may exist within a context, but much of that context is not just personal, it’s nonlexical. My male friend can complain about an early-morning meeting he didn’t want to participate in yet did so cheerily because he “wasn’t going to be a bitch about it,” and I know that he is not saying that whiny, uncaffeinated petulance before 7am is the purview of nasty women because I know him, and I know he likes turning a vocabularic expectation (“asshole”) on its head (“bitch”). But if the guy next to me on this flight, who I don’t know but who I already assume to be something of a douche because he has taken up the empty seat between us with his papers, his empty soda can, and half of his left leg, complains that he doesn’t want to be a bitch, but could I move my bag from the DMZ of the unoccupied seat, I will damn sure assume that he is denigrating women with that use of “bitch,” because he is, as I have already unerringly determined, something of a douche, and denigrating women is exactly what a douche would do.

Names good and bad are used in relationship, and lexicographers cannot possibly parse the intricacies of every relationship on the planet (because lexicographers’ closest relationships are with their favorite pens and their coffee mugs, and these are generally nonverbal entities). This goes triple for reclaimed slurs. You’re asking people who took a job specifically because it promised almost no human interaction to delve into the grossest, wrongest human interactions in history and the efforts to right or repair or avenge those interactions, and then concisely describe the lexical fallout from centuries of that. Can you imagine the sort of usage paragraph that would appear at an entry for “nigga” if we tried to accurately describe the word as it’s used by every American who uses or has opinions about it?

The positive “nigga” is derived from “nigger,” and as such, has a share in the controversy surrounding “nigger.” It is generally spoken and used primarily within groups of young black men who are friends, except when it is used in groups of young white men who are friends, or young Latino or Hispanic men who are friends, or young Asian men who are friends, or other groups of young men of various races and ethnicities who are friends. It is rarely used among friends without permission (usually implicit) from the majority of the group, or from the person in the group who may take the most offense at use of this word. Though current evidence shows its use is most common among men, it is also sometimes used by women who are socialized within a community where use of “nigga” is tolerated or encouraged, unless that woman is considered an outsider to the community regardless of whether she truly is or not. The earliest modern uses of  the positive “nigga” are attested to in rap and hip-hop songs by black artists, though its use within the black community is hotly contested from both within the black community (in so far as you can call the majority of black Americans “the black community” without being reductionist and therefore possibly racist) and without. Use of “nigga” between different  groups considered minority or marginalized is also a point of contention. Only use “nigga” if your friends use “nigga” and you feel comfortable enough within that social circle to risk alienating people you love, or unless you are a rap or hip-hop recording artist who feels the same about his or her or thon’s listening community.

The result is that dictionaries and lexicographers have taken an imperfect tack: we sit and wait until “usage settles out,” as we say. We are reticent–and sometimes, not equipped–to enter into the difficult conversation about how slurs are used and how they are changing, because that involves entering into the difficult conversation about human pain and oppression. And this is hard for us, because lexicography has been the province of privilege since the year dot. You look at old pictures of any dictionary company and what do you see? A tweed of old, white guys with Ivy League degrees. Hell, the biggest scandals to come out of lexicography  are that the Oxford English Dictionary was edited by a Scotsman and that the editors of Webster’s Third New International Dictionary had abandoned all human decency and entered “ain’t” (a word that had been around for centuries, had been in dictionaries before the Third, and has not incited riots or led to anyone’s death, as far as I know). There are plenty of modern lexicographers who don’t fit the old paradigm, who want to delve into some of these questions thoughtfully and objectively. We are nonetheless scared shitless that, even with all the facts in front of us, and even with all our training, we are still blinded enough by our privilege and institutional baggage that the minute we ask “What about ‘nigga’?”, we will unwittingly perpetuate oppression.

It’s a funny thing: lexicography as a discipline has to deal with the dirty, ugly ways that language has been used and abused by and for power, and yet the tradition is one of British genteelness, of Yankee restraint, of safe distance from the political realities of some words. We bleat out the caveat that dictionary definitions describe “words, not things,” but as often as we draw that line in the sand, lexicographers also must admit that sometimes, the word is the thing.

About ten years ago, I got a phone call from a gentleman who found “nigger” in his family dictionary. I vividly remember the call; his polite but bristling questions, the stuffiness of the little phone booth I was in. I assumed that he wanted the word removed from the dictionary, so I explained to him why it was entered, gave some of the history of the word, how we don’t make up the words that go into the language but just record them. He listened–thoughtfully, honestly–to my explanation, and then said, “I understand that. But I’m thinking of my 10-year-old daughter. The word ‘nigger’ shouldn’t exist for her. She should not have to confront that in a dictionary, which is supposed to tell her what words really mean. So I want you to explain to her–she’s sitting right here–that the first part of that last sentence in that definition is wrong.”

I blinked hard. The first part of that last sentence. We don’t write definitions in sentences. While I stared at the entry, it hit me over the head like a shelf of Unabridgeds: he was not complaining about any of the definitions of “nigger” which we mark as “offensive.” He was referring to the last sentence of the usage paragraph. That sentence begins, “Its use by and among blacks is not always intended or taken as offensive.”  The offense was that “nigger” is not always offensive.

Our conversation continued, but did not go well. Though we were each listening carefully, we talked past each other, worried that the other might be missing our point and so preemptively overexplaining our positions.

“Let me ask,” he said suddenly. “Do you have children?”

“Two,” I said. “Two daughters. In fact, one is almost your daughter’s age.”

“And how would you feel,” he continued, “if your children had grown up–I don’t know what race you are–hearing their friends use this word and then being told it was fine? How would you as a parent feel if you had been called this word all your life by people who set fire to your yard and chased you out of your town, or threw rocks and bottles at you on your way to school, even after Jim Crow was defeated; if everywhere you went, this was the word that the world saw you as and threw at you until you believed that was all you’d amount to–how would you feel, after all that, if your little girl came home and told you the dictionary said that being called a nigger was no big deal?”

I couldn’t give him a lexicographer’s answer. We weren’t just talking about words any more.


Filed under general, lexicography, making word sausage

The Times, They Are A-Changing (And So Should Your Dictionary)

I was on an airplane heading to Georgia for a conference when I got into my usual “take my mind off the possibility this plane will suddenly plummet from the sky” conversation with my seatmate. Talk turned to dictionaries, and my seatmate began heaping praise on her old one. She had, she told me proudly, a Webster’s Second, and there was no way in heaven or on earth she was going to give it up for one of those silly modern dictionaries. “My son keeps trying to get me to use a dictionary on my phone, but I tell him, ‘Those new dictionaries aren’t the same quality as the one I have at home.'”

I opened my mouth to say that, nice though the definitions in the Second are, they are almost 80 years out of date, when the supercell we were flying past let out a little meteorological burp and the plane flew right through it. I am not entirely sure, but I believe we may have flipped over several times, and I am certain that the sound that came out of my mouth was not a spirited defense of the modern dictionary (though it was certainly “spirited” in the “possessed by banshees” sense). Our bounce through North Carolina airspace lasted only ten seconds, and afterwards my seatmate excused herself to the lavatory, so our conversation was over.

Had the conversation continued, I would have said this: old dictionaries are nostalgia bombs in more ways than one. The heft of the Second and the Third is glorious: tooled leather and gold-leaf embossing, that powdery vanilla smell of old paper as you smooth the pages back. Then you see this:

[Image: “Negrito,” Webster’s Second. doo dee doo dee doo WHAT]

Consulting old dictionary definitions is like having dinner with your grandparents. The evening usually starts off well enough, with your grandparents telling stories of their life during the war or down on the farm, and then there is that one point where your dear old granny says something that is slightly outré and you know that the whole conversation is slowly going off the rails, but before you can think of some tactful way to change the subject, your dear grandma is using words like “Japs” and “Eye-ties” and “the blacks,” words that make you inadvertently screech your fork across your plate. And when you look for some sign of self-consciousness–some sign that she should know better, Grandma–all you see is the same little old lady who was there before the vileness came tumbling out of her mouth, slowly daubing her meatloaf with mashed potato.

I have been reminded of the chronological fixedness of old dictionaries as we have begun working on the Unabridged Dictionary. It’s no secret that most dictionaries in print today are written using another dictionary as the base; the Unabridged is being built on the very doughty scaffolding of Webster’s Third New International Dictionary. We review the entries in the Third, add (many, many) new entries, and flesh out or correct entries that need it, and in no time at all, idiomatically speaking, the dictionary we’re working on is no longer the Third but a new critter entirely. But this transformational work is not as easy as you’d think, because the Third is 50 years old, and some of the language used and the implicit attitudes expressed therein are like those dinners with Grandma after she’s polished off her second martini. It’s not that the definers of the Third were trying to be offensive, it’s just that society and our cultural ethos have changed a little since 1961. When the Third was released, there was no Equal Pay Act or fully ratified Fourteenth Amendment or Roe v. Wade; sodomy was a felony in every state in the U.S.; and one of the top pop hits was “Runaround Sue,” a song that we today would call “slut-shaming.” Considering the time, it’s frankly amazing that the Third is as careful and circumspect as it is.

For dictionaries that are updated more frequently–even dictionaries updated every 10 years–this de-Archie-Bunkering happens naturally. You notice, for instance, that there’s mention of women in the citations for “firefighter” or “CEO,” and all you do is make sure that you edit out the masculine pronoun in the definition. Or let’s say that you undertake a revision and discover that what was formerly called “Black English” is now called “African-American Vernacular English.” Fine: you search the data for any label that reads “Black English” and make the change. In this way, the dictionary is updated for modern mores in manageable nibbles. But the fact is that you are catching things as you encounter them, rather than hunting for them. For the Unabridged, we’d have to grab our pitchforks and head into the forest looking for the monsters.

It all begins with lists (if there is one thing we are good at, it is making depressing lists). We compiled lists of every word in the Third, the latest Collegiate, and the Learner’s Dictionary that was given any sort of stigmatizing label, regardless of whether that label was current (dated, old-fashioned, vulgar, obscene) or not (abuse, contempt). Then we began to think of words we had encountered in our many jaunts through the Third that struck us as culturally sensitive or potentially offensive: “Negro,” for instance, or “colored.” This list grew as each of us began thinking about awkward family dinners with That One Uncle who likes to talk loudly with his mouth full and eventually lapses into saying horrible things that make our eyes widen and our mothers tsk in disapproval. As we each delved into the archives of our mythic That One Uncle, we together sang the body apoplectic: “Do we have ‘Asiatic’ on the list?” “Do we have ‘homosexual’ on the list?” “Please tell me that ‘Arab’ and ‘Muslim’ are on the list.” “Oh good Lord, we absolutely need to put ‘redskin’ on the list.” And because everything’s better in threes, we had a third list of words that might be potentially sexist: any word with a masculine pronoun in the definition; any word with a gender-specific term (“woman,” “girl,” “mistress,” “man,” “boy”) in the definition; words ending in the suffix “-ette” or “-ess”; any word with the affix “-man.” Compiling these lists was deeply exhausting work, mostly because we’d swing between being riled up about and deeply embarrassed by the imaginary collective -isms of That One Uncle. 

Eventually, we had our list of words. But we weren’t ready to revise yet, because first, we had to search through every entry in the Third that contained any member of those lists. If “man” or “boy” appeared in a definition, a usage note, an example sentence or verbal illustration, an etymology, or even a subject label, the word where it appeared was put on the Potentially Offensive List. When all was said and done, we had thousands and thousands of entries to go through.
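For the programmatically inclined, the matching rule in that sweep can be sketched in a few lines of Python. This is purely an illustration of "flag any entry where a listed term shows up in any field," not a description of how the search was actually run. The field names, the whole-word matching, and the toy entries (apart from the "runner" definition, which is quoted from the Third) are all assumptions made up for the example.

```python
import re

# Hypothetical field names; the Third's real data format is not described here.
FIELDS = ("definition", "usage_note", "example", "etymology", "subject_label")


def build_pattern(terms):
    """Compile one regex that matches any listed term as a whole word."""
    # Longest terms first so longer alternatives are tried before shorter ones.
    alternatives = "|".join(
        re.escape(term) for term in sorted(terms, key=len, reverse=True)
    )
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


def flag_entries(entries, terms):
    """Return the headwords of entries where any field contains a listed term."""
    pattern = build_pattern(terms)
    flagged = []
    for entry in entries:
        # A missing field is treated as empty text.
        if any(pattern.search(entry.get(field, "")) for field in FIELDS):
            flagged.append(entry["headword"])
    return flagged


# Toy data for illustration only.
entries = [
    {"headword": "runner", "definition": "a seaman engaged for a short single voyage"},
    {"headword": "fireman", "definition": "a man who fights fires"},
    {"headword": "manual", "definition": "a book of instructions"},
]

print(flag_entries(entries, {"man", "boy", "seaman"}))
```

All a sketch like this does is build the pile. It says nothing about what to do with any entry in it, and deciding that still takes an editor reading each one.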

This is the point at which my dear friends who are computational linguists want to hear about the programmatic handling of these entries, but the truth is that everything had to be done by hand. Despite Philip Gove’s zeal for order and systematic defining, none of these terms had parallel handling in the Third, so it wasn’t as simple as swapping out “Negro” with “African-American,” for instance. Some of these terms were also a little too nuanced for a simple search-and-replace. The word “primitive” as it is applied to people groups is culturally outdated, but that doesn’t mean that every instance of the word “primitive” in the Third needs to be swapped out with…what, exactly? Is there a single synonymous word for this particular sense of “primitive” that would fit every stigmatized use of it in the Third? How would we know without having a real, live, myopic and undercaffeinated editor look at every stigmatized use of “primitive” first? Our stalwart and defiantly cheerful Cross-Reference department began sorting through 50 years of fodder for awkward family dinners, and then an equally cheerful group of editors (and me) began to update these entries.

There is something utterly dispiriting about encountering that volume of offensiveness, but it can also motivate you. I am making this goddamned better, you think, because no one else should have to deal with That One Uncle in this dictionary, and you swallow the bile and bite back the “WTF!”s and keep editing “Negro” out of entries.

But as you may guess, offensiveness isn’t always so easily predictable. Take, for instance, the entry in the Third for “atheistic,” which I had in one of my early defining batches. The definition reads, in full, “relating to, characterized by, or given to atheism : GODLESS, IMPIOUS, IRREVERENT.”

“Oh my God,” I muttered, then paused briefly to regret my word choice. To a lexicographer, that boldface colon between “atheism” and “godless” is not just a cute way of breaking up space, but a way to signal that the things on either side of that colon are exactly synonymous. That means that if someone is describing another person as “atheistic,” according to that definition, they mean both that that person subscribes to atheism and that they are impious, irreverent, and godless. I believe that this definition wasn’t a malicious attack on atheists–it was just sloppy defining. These are two separate meanings and shouldn’t have been shoved together into one. But that boldface colon in the middle of the entry makes what could have been a perfectly neutral definition into a moral judgment on atheists.

There were occasional reprieves: sometimes the issues we uncovered weren’t completely depressing. While looking through the entry for “runner,” I ran across the definition “a seaman engaged for a short single voyage” and howled like a 12-year-old boy. “Seaman” went on the Potentially Offensive List; that sense of “runner” has yet to be fixed.

And there’s the rub (hur hur hur): the Unabridged is a work in progress. We’ve already changed thousands of entries, but there are, as our Director of Defining has put it, “no doubt many more excitingly offensive things to be discovered.”

Lexicographers like to remind people often and loudly that a dictionary is a record of the English language as it is used–and it is, fully and totally, from its entry list to the language used in the definitions. That’s why I cringe when people tell me they prefer to cite Webster’s 1828 or Webster’s Second when discussing what words mean today. Both those dictionaries are perfectly serviceable and scholarly dictionaries of their day, but the sun set on that day a long time ago. By all means, love your old dictionaries–cherish them for the works of art that they are, keep them around to remind you of days gone by–but maybe don’t look up “Negrito” in them.


Filed under lexicography, making word sausage

The Voice of Authority: Morality and Dictionaries

Last Thursday was a rare treat in our house: one of those nights where the homework was done early, the dinner was cooked by someone else, and snow was in the forecast. The evening stretched out, molasses-lazy. My eldest daughter sauntered into the kitchen where I was spending some meditative time with the pots and a scrub brush.

“So,” she began lightly, “I wanted to talk to you about your pottymouth.”

I hummed. She does not approve of my penchant for cussing.

“When I came into your office today, you said the s-word. Cursing is evidence of a lack of creativity.” It is always a delight to hear your feeble parenting parroted back at you.

“A guy said something stupid on the radio this morning and then defended it by misquoting the dictionary. I was just frustrated, that’s all.”

She whisked a dishtowel off the shelf and began drying pots. “Lance Armstrong?”


“Are you talking about Lance Armstrong?”

“No. What are you talking about?”

She put the pot lid away before answering. “So,” she breezed, “maybe don’t watch the Lance Armstrong interview until after I’m in bed, okay?”


That morning, John Mackey, CEO of grocery chain Whole Foods, told NPR that he had been wrong to call Obama’s new health care plan “socialist,” as he had been doing for years. “It’s more like fascism,” he said, conjuring images of jackbooted Brownshirts roughing up old ladies and forcing flu shots on them. Not surprisingly, lookups of “fascism” spiked.

So did the outcry from the people who generally shop at Whole Foods–people my father would call “crunchy-nuts-and-berries types,” people who talk about sustainably harvested herring and know how to pronounce “quinoa.” John Mackey backpedaled, and twelve hours later was telling another radio host that he made a boo-boo as regards his choice of words:

I was trying to distinguish it between socialism so I took the dictionary definition of fascism, which is when the means of production are still owned privately but the government controls it — that’s a type of fascism.

I was finishing up my shift in the syntax mines with one more lookup tweet. Lookups of “fascism” were off the charts, and as I read the transcript of Mackey’s apology, both my mouth and the door to my office flew open. In popped my eldest daughter, and out popped “Oh, you have got to be shitting me.”

“Mom!” she scolded. Then, “Never mind, I’ll come back when you’re civilized.”

Later, while I washed dishes and waited for snow, Lance Armstrong appeared on everyone’s TV and told Oprah that he didn’t think that doping was cheating, and guess who absolved him of it?

He insisted that given the widespread culture of doping in the sport during those years, it was not possible to win the Tour without doping.

“Did you feel you were cheating?” Winfrey asked.

“At the time, no,” Armstrong said, explaining it with moral relativism. “I looked up cheat in the dictionary and the definition was to gain an advantage on a rival. I viewed it as a level playing field.”

Armstrong’s justification is laughable, of course, as is the reporter’s modifying clause in the final paragraph. We hear it and holler, “C’mon!” We may even check the dictionary, whereupon we leave a Seen & Heard comment at the entry for “cheat” that reads, “Lance Armstrong! C’mon!” But the fact is that appealing to an external authority to justify your position is, like the McRib sandwich and idiocy, an ontological constant: “the scriptures tell us…”; “the Constitution states…”; “my dad says…”. The dictionary is an authority, and so gets dragged into all manner of arguments.

“How come,” countless editorial emails begin, “you say that ‘biannual’ can mean ‘once every two years’ or ‘twice a year’? Stupidest, most useless definition ever! C’mon! Make up your mind! I have a bet riding on this.” When I write and say no one has won the bet, that “biannual” really can–and does–mean “once every two years” and “twice a year,” I often get the reply, “Whatever, tl;dr. Which meaning is right? I have a bet riding on this.” You can hear them grouse at their monitors: “Just pick one, Dictionary, because authorities do not contradict themselves. Once they do, they cease being authoritative, and you’re not doing so hot right now.”

Sometimes the stakes are higher. Ten years ago, we added a second subsense to the noun “marriage” that covered uses of “marriage” that refer to same-sex unions. Someone eventually noticed.

Outrage! screamed about 4,000 emails, all flooding my inbox in the space of a week. How dare you tell us that gay marriage is okay now?

I was not surprised, honestly: I drafted a long, thoughtful reply about how words get into the dictionary, noting that this sense of “marriage” had been used by both proponents and opponents of same-sex marriage since at least 1921, and finishing with the caution that the dictionary merely serves to record our language as it is used. I spent the next two weeks doing nothing but sending this reply out to everyone and their mother.

The problem–because when it comes to correspondence on this scale, there’s always a problem–was that I was making assumptions about what sort of authority people took the dictionary for. I realize that I’m sort of biased since I’m on the inside, but I assume we all know the dictionary is only an authority on the meanings and uses of words. These particular correspondents, however, believe that the dictionary is the publishing arm of the New World Order as run by a liberal, elitist cabal who is out to destroy everything a rational person and the annals of history hold dear. To them, the dictionary is a political tool and therefore a back-door authority on life itself, and this entry in particular was evidence of a conspiracy to force us all into SCOTUS-mandated gay marriages with Ellen DeGeneres or Anderson Cooper. They responded accordingly: Noah Webster is turning in his grave knowing that his dictionary, our moral barometer, can no longer tell the difference between right and wrong. Some people were not so sentimental: “Drink a cup of battery acid and eat broken glass, whore of Babylon,” answered one correspondent.

I closed my eyes and pressed my fingertips into my orbital sockets until I saw explosions, then forwarded the email to our President. “Do I qualify for hazard pay now? And the battery acid comment reminds me that we’re out of coffee upstairs.”

What proof do people have that the dictionary is not merely a record of language? Plenty, my correspondents sputter: everywhere you look, people are citing dictionary definitions as justifications for all sorts of wrong things. “The Supreme Court uses the dictionary in making their decisions!” one of my correspondents warned. “The dictionary is an authority on how we live life, and our morals, and it’s a pretty piss-poor one in my opinion.”

This is true: courts will sometimes use dictionary definitions in their deliberations. But though I am not a lawyer, something tells me they are not basing their judgments solely on the dictionary. As for the dictionary being a moral guide, it never was and it never should be. We enter the words “murder” and “headcheese” into the dictionary, but that shouldn’t be read as advocacy for trying either one of them. 

One of Merriam-Webster’s marketing taglines used to be “The Voice of Authority.” In truth, it’s a tagline that makes me uneasy: it makes the dictionary sound like the fatuously beaming spokesperson for capital-A Authority, and all that a sneaky or powerful person needs to do to validate whatever shenanigans they are up to is align themselves with that mouthpiece, possibly appropriate it and use it to their advantage. I’m not pointing fingers at John Mackey or Lance Armstrong: I, too, have gone to the dictionary in the past to defend my own personal and totally non-lexical beefs with someone (pray for us now and in the hour of our peeving). But the people who tend to point to a dictionary definition and defend their moral high-ground based on it remind me of the kids I knew growing up who would close their eyes, open their Bibles, and declare that whatever verse their finger touched was going to be God speaking directly to them. Sometimes they landed on “Be not afraid, for I am with you,” and they’d trot to the playground and tell Angela to “shut up, God told me he was with me and I am going to ask him to make you barf all over your dress because you are stuck-up and dumb.” Other days, those kids were quiet and refused to play double-dutch or Chinese jumprope; that morning, their finger landed on “Now Esau was a hairy man.” For them, the Bible’s primary use was for sticking it to that big idiotface Angela.

So it is with the dictionary: if some people treat the Bible like a holy slot machine that occasionally pays out big, then others treat the dictionary like the defense’s case-clinching surprise witness. People escort the dictionary to the stand and use it to destroy the prosecution: “The Voice of Authority says that government oversight of health care is fascism”; “The Voice of Authority gives/does not give gay marriage validity”; “The Voice of Authority says I didn’t cheat.” We go with this line of reasoning, but only up to a certain point: no one ever says, “The Voice of Authority compels you to eat headcheese.” In that case, we recognize that the dictionary is just a book that tells you what people mean when they use the word “headcheese.” No one in their right mind would think that the dictionary is in bed with Big Deli.

I lampoon “The Voice of Authority” at home– “Hey, the Epiglottis of Authority is telling you to quit farting around and do your homework now.”–but I cringe when I see intelligent people imbue The Voice of Authority with moral weight. In the preface to his very first dictionary, the 1806 Compendious Dictionary of the English Language, Noah Webster spends time highlighting the wrongs of lexicographers before him. In the midst of his genteel rant, he notes:

This fact is a remarkable proof of the indolence of authors, of their confidence in the opinions of a great man, and their willingness to live upon the labors of others. It shows us also the extensive mischiefs resulting from the mistakes of an eminent author, and the danger of taking his opinions upon trust.

It’s a passage I reflect on frequently when trying to explain that the dictionary really isn’t an unchanging and infallible dispensary of moral wisdom, nor is it a prop for your personal convictions. It’s a book that tells you how people use words. Noah Webster treated it that way; the Supreme Court treats it that way; we should all treat it that way. The Epiglottis of Authority means it.


UPDATE: Via this Washington Post article, I find that James Brudney (Fordham U) and Lawrence Baum (Ohio State) recently published a study on how SCOTUS has used the dictionary. The whole paper is available for free download, but the last few sentences of their abstract tell you everything you need to know:

Yet our findings demonstrate that the image of dictionary usage as heuristic and authoritative is a mirage. This contrast between the exalted status ascribed to dictionary definitions and the highly subjective way the Court uses them in practice reflects insufficient attention to the inherent limitations of dictionaries, limitations that have been identified by other scholars and by some appellate judges. Further, the justices’ subjective dictionary culture is likely to mislead lawyers faced with the responsibility to construct arguments for the justices to review. The Article concludes by offering a three-step plan for the Court to develop a healthier approach to its dictionary habit.

Both the article and the paper are worth the read, if only to find that in 2008, one member of the Court decided to cite the definition of “promote” from Webster’s Second New International Dictionary in writing a majority opinion. Webster’s Second, I hasten to remind you, has been arguably out of date since 1935 and inarguably out of print since 1961.



Filed under correspondence, general, lexicography

“God,” Guns, and Group Defining

When people want to make small talk with me—before they realize that I am terrible at it and not worth the time and effort—they will ask what I do, and then sometimes respond with, “So, you pretty much know everything, right?”

I have just taken to smiling wearily and saying, “Yes, I know everything.” I have teenagers, and often enough they are happy to disabuse those people of this asinine notion.

No one knows everything, and lexicographers are just like the rest of humanity (only slightly quieter and perhaps a little more openly deranged). There you are as a lexicographer, minding your own business with “harpy,” when you scan downscreen to your next word and encounter “harquebus” in all its Francophonic glory. You flip through your mental card catalog of Words I Have Seen, find the one labeled “harquebus,” and find your memory has only written, “from a novel, maybe Count of Monte Cristo? Is that a novel? SEE ALSO: sandwiches I have loved.”

Fortunately, the lexicographer doesn’t have to rely on this mental catalog. The lexicographer relies on citations. But what do you do when the citations are less than helpful? Here, for instance, the citations are all variants on “She pulled a harquebus from her corset/stomacher/stocking and shot him dead,” which gives you nothing besides a genus term for your definition (“a gun”) and a ten-minute respite as you ponder whether a gun would even fit inside a corset—or good Lord, a stocking, wouldn’t stockings fall down or even tear under the weight of a what’s-a-hoozy—harquebus? And why are heroines in these novels always pulling weapons from their underwear, anyway?

You return to the citations with a sigh and a determination to carefully study the cover of the next trashy novel you see, just to observe whether the buxom, swooning lass’s dress has pockets in it or not.

The problem with “harquebus” is not just that the citations are maddeningly vague and all pulled from Harlequin novels. The fact is that the word “harquebus” refers to a very specific thing, and you need to know a bit about the thing “harquebus” in order to define the word “harquebus.” Or, at the very least, you need to know enough about the thing to know whether these particular uses for the thing are valid.

You do not know that. But fortunately, there’s a guy on the editorial floor with a thing for Renaissance-era weaponry, and he will know.

You know he knows because of a précis of wonder and beauty: the Specialized Subjects list. This is a document that tells you everything that every editor on the floor knows. It is full of surprises and is one of the best ways to get to know your co-workers without having to actually talk to them. Of course the senior etymologist “has at least superficial familiarity with most European languages, best within Slavic, Celtic, and Germanic,” but did you know that he also is a mushroom-picking philatelist? Likewise, our French editor is a weapons enthusiast. The quiet health nut, it turns out, loves cigars. I know about the 9th-century Latin Mass, knitting, and muscle cars.

The list is handy for general definers who are stuck with “hot rod,” but it’s also handy for the Director of Defining, who uses it when a group of words (say, music theory terms) should be defined by someone with superior knowledge of the subject. Welcome to “group defining,” the ever-deepening hole into which you daily and hourly dig yourself by proclaiming that you have any knowledge of any subject whatsoever. For the new Unabridged Dictionary, I have been given, as a group definer, all the religion terms. This is what an interdisciplinary degree and a penchant for reading and marking books like “Freethinkers: A History of American Secularism” will get you: a batch for revision that is about 10,000 entries long. (I’m one-sixth of the way through and am currently stuck on the entry for “god.” See you in whichever afterlife destination you feel like condemning me to.)

There is something very tricky about group defining, because that is where you find yourself balancing the thing-ness and the word-ness of a definition. A harquebus, as I have learned from the guy with a thing for Renaissance-era weaponry, is a matchlock gun that is heavy enough that it was usually fired from a support. Those characteristics are what distinguish a harquebus from a blunderbuss, which was “probably a better choice for stuffing into a corset,” says my colleague. The distinguishing characteristics of a harquebus therefore belong in the definition for “harquebus,” even if the batch of citations I have at hand don’t mention any of them. The group definer has specialized knowledge, as well as a whole raft of odd books they can plunder for citations so our formal evidence matches up with reality.

But even a good raft of odd books can’t catch everything. I spent about two weeks revising three related theology entries because each of those words was used, for quite a long time, very deliberately incorrectly. They were employed by one side of a theological argument as rhetoric and epithets to discredit the legitimacy of the other side. It’s as if the whole early Christian church was at a hockey game together and someone started a “Monophysites suck” chant that went on for roughly 1,000 years. But if you aren’t someone who knows about the initial theological brouhaha and the way it resonated through the Middle Ages–perhaps because you never had to write a paper on the Nestorian and Eutychian controversies, because you chose a better degree than I did–you wouldn’t know that was the case.

Lexicographers talk with a sort of heavy-breathing fetishism about the corpus, the citations, the data. It will give us all the answers. But every corpus in the world has holes in it, limitations. That’s part of why a good dictionary is compiled by people–living, breathing, awkward people who can look through that corpus, give advice, and do some citational spackling based on the knowledge and experience they gleaned from outside the office. Lexicographers may throw around the size of their corpus, but it’s the people sifting painstakingly through that corpus, like archaeologists weighing potsherds, that make all the difference.

When my children were little, they learned that the word “wedgie” referred to “the condition of having one’s clothing wedged between the buttocks,” as the Collegiate so toffishly puts it. They were absolutely ecstatic: here was a word for this thing that happened to them pretty much constantly! And it was a good word, too, a word that had great screechability and ended in a long-e for maximum sustain. Best of all, it had to do with butts. For about three days, both the six-year-old and the two-year-old hollered the word “wedgie” constantly.

Now, like most parents with young children, my husband and I were desperate for some little veil of ivoried respectability to drape over this big, nekkid waller of parenthood that was so often punctuated (primarily in public spaces, usually with a finger or two up a nostril) with “MAMA! I HAVE A WEDGIE!” So I told my kids not to call it a “wedgie”—I told them to call it “an issue.”

They did, for many years. And while people may have cocked their heads to hear a worried-looking preschooler say, “Mama, I have an issue,” the veil of respectability slid artfully into place. For a while.

The day soon came when both my children learned that when other people use the word “issue,” they are not referring to wedgies. They are referring to vital and unsettled matters that generally require discussion.

“Yes,” I answered, as my eldest explained this to me in tones of deep-purple mistrust, “but isn’t a wedgie basically the same thing in our house? Besides, no one else knew what we were talking about. They thought that you were just deeply interested in the election.”

She frowned so deeply that the tip of her nose met her eyebrows. “But you write dictionaries: you knew it wasn’t like that in the real world.”

It’s a refrain I call to mind every time I read endless citations for “god” that use the word vaguely at best, and it is my mumbled offering of thanks for a team of editors who have wide, varied experiences and specialties I can draw on when the citations leave me hanging. When people come to the dictionary and look up a word like “harquebus” they expect you to give them the definition from the real world: the world where women don’t stuff a gun the size of a musket into their corsets, no matter what the citations tell you; the world where “Monophysite” is not a politicized slur; the world where a wedgie is a wedgie.


Filed under lexicography, making word sausage

No Logic in “Etymological”: A Response I Actually Sent

Today I got an email from someone who watched the “irregardless” video and was appalled (though in the gentlest and kindest manner possible) that I said “irregardless” was a word. It’s not logical! Just look at that sloppy coinage: “ir-” and “regardless.” Why, it should mean “WITH regard to,” not “without regard to”! Who in their right mind is going to use “irrespective” and “regardless”–both perfectly serviceable words–to create a synonym of each word that looks like it should mean the opposite of what it does?

I drafted the reply I wanted to send and saved it to my Nobody Knows The Trouble I Seen folder. Midway through my real response, though, I changed my mind: this guy needed to see the NKTTIS response. Something about the tone of his letter was bothering me. It was not, as these letters usually are, arrogant. It was sad.

English is a little bit like a child. We love and nurture it into being, and once it gains gross motor skills, it starts going exactly where we don’t want it to go: it heads right for the goddamned light sockets. We put it in nice clothes and tell it to make friends, and it comes home covered in mud, with its underwear on its head and someone else’s socks on its feet. We ask it to clean up or to take out the garbage, and instead it hollers at us that we don’t run its life, man. Then it stomps off to its room to listen to The Smiths in the dark.

Everything we’ve done to and for English is for its own good, we tell it (angrily, as it slouches in its chair and writes “irregardless” all over itself in ballpoint pen). This is to help you grow into a language people will respect! Are you listening to me? Why aren’t you listening to me??

Like well-adjusted children eventually do, English lives its own life. We can tell it to clean itself up and act more like one of the Classical languages (I bet Latin doesn’t sneak German in through its bedroom window, does it?). We can threaten, cajole, wheedle, beg, yell, throw tantrums, and start learning French instead. But no matter what we do, we will never really be the boss of it. And that, frankly, is what makes it so beautiful.

Here’s the response for your erudition. (That is a fancy way of saying “for to make you smart”!)


I’m glad you enjoyed the video, which did indeed generate a lot of email. You raise a number of points, so I hope you’ll forgive the lengthy reply.

You’re right that “irregardless” is an odd blend of “irrespective” and “regardless,” but to jettison it sheerly because people “foolishly and incorrectly” created a blend without any regard to the etymological logic of the word is–to be blunt and etymologically logical–ridiculous. We’d have to get rid of thousands of words if we could only use the etymologically pure ones. I’m not just talking about the “to utterly destroy” sense of “decimate” here: “hangnail,” “apron,” and “pea” would have to go, as they were coined through sloppy misreadings of “angnail,” “napron,” and “pease”; “derring-do” gets the axe (or is it “ax”?) for being a slightly deaf phonetic rendering of Middle English’s dorring don; “airplane” is banned as a needless alteration of the earlier “aeroplane”; and so on.

Further, what do we do about those words like “decimate” that have dared to stray from their etymological moorings? Should we dump them, and if so, where is our chronological line of demarcation? Pedants argue that the “utterly destroy” sense of “decimate” is a modern invention, a festering boil upon the shining face of Proper English, but that particular use is 400 years old. In fact, most uses that people rail against are: shortenings and abbreviations go back to the 12th century, Chaucer created some highly illogical compound words, and Shakespeare verbed nouns.

As someone who spends her workday determining whether “however” is an adverbial conjunction or a conjunctive adverb and quietly cussing to herself, I appreciate that you want English to be a logical and tidy language. You’re not the first person to wish this, and you won’t be the last. Unfortunately, English stopped being logical and tidy about 1500 years ago, give or take, and no amount of correction will fix–or has fixed–this. And if I may go one further, all these horrifying and “wrong” words still have not managed to destroy (or even decimate, in the etymologically correct sense) the English language. It barrels on.

Language expansion, much like a good party, tends to be a bit messy. Happily, the English language is big enough for all of us. And if you take that sentence less as an expression of hope and more as a death knell for a much beloved language, well, there’s always Esperanto.


Filed under correspondence, general, lexicography, the decline of English

Seeing Cerise: Defining Colors in Webster’s Third

When you spend all your time in a book, you think you know it. All the editors at Merriam-Webster know the Third, but now that we’re undertaking a revision of the beast, we’re ears-deep in it, drowning in stuffy single-statement definitions. Each of us breathes a bit shallower when we start futzing around with Philip Babcock Gove’s defining style, waiting for his ghost to dock our pay or perhaps cuff us upside the head as we sully his great work. Add to this the fact that, it’s true, familiarity does breed contempt. At least once a batch, I look at a perfectly constructed definition, accurate and dispassionate to the point of inhumanity, and wish I could add a wildly inappropriate example sentence just to liven things up a bit, like <Doctors suggest you eat kale until your pee is neon green with excess micronutrients.> So you may understand why, while I was slogging my way through a B batch, I was delighted to run across this:

begonia n3 : a deep pink that is bluer, lighter, and stronger than average coral (sense 3b), bluer than fiesta, and bluer and stronger than sweet william — called also gaiety

I lit up like a used car lot. As I was at my desk on the editorial floor, and my cubemate was in a foul mood owing to an e-mail he had received about the thesaurus entry for “love,” I very carefully laid my palms flat on my desk to keep myself from clapping and merely mouthed the words “average coral (sense 3b)” four times. It was, as far as I could tell, an accurate definition–but it was so evocative and full of personality that I began to wonder if it had been slipped in after Gove shuffled off this mortal coil and joined the editorial floor invisible.

So began a deep-pink goose chase through the Third, as I looked for “fiesta,” then “sweet william,” and then “average coral.” I eventually ended up at “coral,” where sense 3c yielded up the fresh wonder, “a strong pink that is yellower and stronger than carnation rose, bluer, stronger, and slightly lighter than rose d’Althaea, and lighter, stronger, and slightly yellower than sea pink.” Carnation rose was clearly the color of the pinkish flower on the tin of Carnation Evaporated Milk, and Rose d’Althaea was clearly Scarlett O’Hara’s flouncy cousin, but it was the last color that captivated me. “Sea pink,” I murmured, and incurred the harrumphing wrath of my neighbor. As he stalked off to find a quieter corner, I wanted to stand up and shout, “I grew up 1500 miles from an ocean! I didn’t know the sea was pink!”

The Third’s color definitions became my break from defining or proofreading. After staring into the middle distance for a few seconds, I’d think of a color and look it up in the Third, invariably ending my chromatic excursions with a fool grin on my face. Vermillion: “a variable color averaging a vivid reddish orange that is redder, darker, and slightly stronger than chrome orange, redder and darker than golden poppy, and redder and lighter than international orange.” Lapis lazuli blue: “a moderate blue that is redder and duller than average copen and redder and deeper than azurite blue, dresden blue, or pompadour.” Cadet: “a grayish blue that is redder and paler than electric, redder and duller than copenhagen, and less strong and very slightly redder than Gobelin.” Electric! Copen! International orange! Prior to “begonia,” the Third was a middle-aged management man with a Brylcreemed combover, in well-pressed shirt-sleeves and pants that were a bit too tight at the waist, full of busy self-importance. Now, he was the same middle-aged manager, but unbeknownst to the rest of the office, he danced flamenco on the weekends.

How did all this flamenco dancing slip past Gove, the authoritarian curmudgeon who oversaw the creation of the Third?

Of course, nothing of this magnitude would have slipped past Gove. The color definitions in the Third were very carefully engineered in accordance with Gove’s vision of a dictionary that was not only completely objective and precise, but was also the most scientifically minded dictionary of its day. One only need look as far as the masthead of the Third to see the lengths that Gove went to: 202 lengths, all listed under the tidy heading, “Outside Consultants.” These consultants were pedigreed and heavily degreed experts in their respective fields, and their job was to provide direction for specialty areas that in-house editors may not have had much experience with, such as the Mayan calendar, traffic regulations, and (gasp) coffee. Gove took his color definitions seriously. There are seven consultants listed for color; there are only four total consultants for mathematics and physics.

The color definitions in the Third are a meeting of old and new. The chief color consultant for the Third was Isaac H. Godlove, a man whose name means nothing to you unless you study the history of color theory. Since fewer people study the history of color theory than do lexicography full-time, I will tell you that Godlove was the chairman of the Committee of Measurement and Specification of the Inter-Society Color Council, a member of the Colorimetry Committee of the Optical Society, director of the Munsell Research Laboratory (which gave rise to the Munsell Color Company, a company that was evidently formed specifically to standardize colors), and a guy whose business cards must have been double-thick fold-out jobbies. He was also the color consultant for Webster’s Second New International Dictionary.

For Webster’s Second, Dr. Godlove developed a system of defining colors by hue, saturation, and brilliance. “Cherry,” for instance, is defined in the Second as “A bright-red color; specif., a color, yellowish-red in hue, of very high saturation and medium brilliance.” If this doesn’t call to mind an exact color–and I don’t see how it could unless you were a colorimetrist–the Second helpfully requests that you also see the entry for “color.” The entry for “color” is three columns long in the Second, begins with the label “Psychophysics,” and includes a lively discussion on the different ways to measure hue, the nature of light waves, and the neurochemical impulses that, when combined, potentially yield the sensation we refer to as “color.” There are graphs and two color plates. It is serious business.

Godlove’s work as a colorist was brilliant, and Gove likely knew it. (He may have been a workaholic perfectionist who pioneered the Rule of Silence, but he wasn’t a moron.) To duplicate this sort of defining system would have cost time and money, and Gove hated anything that breathed inefficiency. It seemed best, then, to use the framework that Godlove had set up for the Second. There was one snag: these standardized definitions that appealed to an objective standard set up by The Standards People couldn’t stand on their own. Every definition followed the same pattern: “a color, [color name] in hue, of [high/medium/low] saturation, and [high/medium/low] brilliance Cf. COLOR.” But apart from one reference to an indistinct and very subjectively observed color, like “yellowish yellow-green” at “holly green,” there was nothing in the definition to orient the casual reader apart from the color plates given at the colossal brain-twisting entry at “color.” And, of course, there weren’t color swatches for every color defined in the Second. “Holly green” is only the yellowish yellow-green that is of low saturation and medium brilliance, whatever that may be.

Gove called Godlove back in to work on the color definitions of the Third, and to entice him, he gave him a team of color theorists to boss around. As astonishing as it sounds, color names had been increasingly standardized since the 1930s, and their use had even been analyzed in mass-marketing–very sciencey!–and these guidelines and findings were to be incorporated into the Third. Who better to do this than the man who helped pioneer color standards?

The working files for the Third begin with the Black Books: our editorial style guide as written by Gove and adhered to by editors under pain of death (or a stern note from Gove, which was essentially the same thing). The Black Books are 600-plus pages of single-spaced directions filed in loose-leaf black binders, and they used to sit on the top of one of our long banks of citation drawers, lending that little warren an air of regimented malevolence. You only had to look at them to feel the ghost of Gove march past you, wondering why you were gawking instead of busting your hump on the E file.

The Black Books have much to say on many things, but less to say on the color definitions than you’d think. Perhaps the very first sentence is all that Gove needed to say: “Godlove’s psychophysical defs of color names and their references had better be regarded as sacrosanct.” Full stop. General editors were absolutely not to be mucking about in the color definitions.

Gove let Godlove use the latest scientific techniques in discussing color: there are color plates in the Third, as there were in the Second, and there is an entire page devoted to explaining the color charts and descriptive color names in the Third, as well as a five-page-long dye chart tucked neatly in between the first and second homographs of the word “dye.” (The explanation of color charts in the Third abandons the discussion of psychophysicality and favors equations. Very Cold War.) But there are two big differences between the Second and the Third.

The first is that the color definitions in the Third were to be relational–that is, every color could be defined as being more or less of something than another color entered in the Third. Formulaic statements regarding the hue, saturation, and brilliance (now called “lightness”) of a color were insufficient. The other revolution is that the analyzed work of “color specialists from Sears Roebuck and Montgomery Ward,” as Gove put it, would be used in defining the color names in the Third. In other words, users of the Third were not just going to get the names of colors that were considered scientific standards: they were going to get the names of this fall’s fashions in the Monkey Ward’s catalog. Gove sums up: “The range therefore is in the direction of the layman.”

And what a kaleidoscope the layman got. You could spend an hour alone getting lost in “cerise” (“a moderate red that is slightly darker than claret (sense 3a), slightly lighter than Harvard crimson (sense 1), very slightly bluer and duller than average strawberry (sense 2a), and bluer and very slightly lighter than Turkey red”). No doubt people did. That may explain why we don’t define colors this way anymore.

The Third, with its zeal for modernism and science and objectivity, sometimes lost sight of the forest for all the xylem and phloem. As specific as the definition of “cerise” is–and as smart as I am–all I get out of that is that “cerise” means “moderate red” and that there is more than one sense of “Harvard crimson,” which must really piss Yale off.

Let’s also take into account that if we’re doing our job–defining from citations–then colors are frustratingly, pound-on-the-desk difficult to pin down. Text-only citations give you absolutely nothing to go on: “Misses large, available in Cranberry, Olive, Cinnamon, Ochre, Cadet, Holly, Taupe.” These might as well be the names of the Seven Dwarves for all the information they give me.

Clearly, then, you need a color swatch. That should make matters easier. Here’s a swatch for you:

That is a quick Google image search for “taupe color swatch.” Some of those colors are distinctly not what I think of when I think of “taupe.” And that’s part of the problem.

Even taking printing or monitor differences into account, the fact is that the use of color names is standard, but the things those names represent are not. One man’s “taupe” is another’s “beige” is another’s “bone” is another’s “eggshell” is another’s “sand” is another’s “tan.” By the time I came around, we had given up on Godlove’s precision and instead gave the very first part of the Third’s definition for most colors: “cerise” is, in the Collegiate, “a moderate red.” That’s not terribly specific, but it does allow for variations in reproduction, marketing uses, and psychophysical observations of a wide variety of colors that are called “cerise.” (Please do not tell me you are red-green colorblind.)

The only place where a little poetry comes back into the dictionary is at the definitions for the basic Roy G. Biv: the colors of the visible spectrum. In defining those colors, we hearken back to generations of lexicographers before us (even back to Grumpy Uncle Noah) and play a bit of word association: when I say “blue,” the first thing you picture is…what?

For some poor schmuck, stuck indoors at some point in the 1850s revising Webster’s 1847 dictionary, blue was the clear sky. Collegiate definers have determined that red is blood or rubies. Green is growing grass, or maybe it’s emeralds, and yellow is ripe lemons or sunflowers. Whimsy does still take a backseat to practical matters, though. “Orange” presented problems–after all, what’s orange? Oranges, of all things, and you can’t say, with a straight face, that the color orange is the color of oranges without deserving a good smack.

You’d think that this word association would work well enough, but there’s always tweaking that needs to be done. Cerise, for instance, is the color of…what, exactly? I’ll tell you what: it is the color of a suit set my grandmother owned and only wore to Christmas brunches at the Aviation Club, where she would sit me down in my velveteen layer-cake of a holiday dress and demand my silence while she and Mrs. Tannendorf would drink mimosas and bloody Marys and pine for the good old days of Eisenhower. That suit is, I am telling you, exactly cerise, but that doesn’t do you much good, does it? You also can’t make sweeping assumptions about your reader. Sunflowers are yellow–but chances are good that if someone learning English knows what the word “sunflower” means, they probably know what “yellow” means as well. We had to get a bit more creative when we wrote our own ESL dictionary (here the ghost of Gove frowns): “orange” in our Learner’s Dictionary is not a color between red and yellow, as it is in the Collegiate. It is the color of fire or carrots.

It’s not that these picturesque color definitions are more correct or incorrect than the definitions before them. But defining colors is a bit like defining the word “love”: likely to make you sound like a nitwit in the real world.  You could argue that a straight-up scientific approach is best; that no comparisons should be made at all in color definitions. But after the labyrinth of the Third’s “cerise,” the simplest route is beguiling: Yellow is the color of the sun or ripe lemons. Green is grass; red is blood, brown is coffee or chocolate. And blue is still the color of the clear sky.

(Please do not tell me you are blue-green colorblind.)


Filed under famous lexicographers, general, history, lexicography, making word sausage

Assembling the Treasury, Wordhoard, Synonymicon, Thesaurus

All lexicographers, regardless of where on the prescriptivist/descriptivist spectrum they fall, like to tell you they are totally objective when writing their dictionaries. They get worked up into a veritable froth if you suggest otherwise, maybe even raising their voices to conversational levels and daring to make eye contact when they tell you that you are utterly wrong. Lexicography’s underlying tenet is complete objectivity! Get thee behind me, John Dryden!

Notice how they conveniently fail to talk about thesauruses when objectivity comes up.

Unlike dictionaries, there is no one approach to compiling a thesaurus, no Unified Theory of Synonyms. The main goal that all of them have is to present an entry word and a group of words related to that entry word, but how those words are specifically related to the entry–and how they are presented–is varied, to say the least.

I grew up using a Roget’s Thesaurus (and I use the indefinite article advisedly, as “Roget’s” is not a trademarked name in my part of the world). Like other dorks of my genus, I spent many a Sunday afternoon sprawled out on the couch, paging through a reference book. The dictionary and encyclopedia were hauled out any old time, but the Roget’s was reserved for dim, snow-muffled days, when it was too cold to go sledding and I was feeling as pensive and thoughtful as a nine-year-old can possibly feel. Roget’s had an elevating effect on me, and I’d be so moved by its profundity that I’d read it aloud to our dog. “Section one,” I’d intone solemnly to Buffy, our crabby Airedale, whose spot on the couch I was bogarting. “Existence. Being, subsistence, entity, essence. Ens. ” She’d huff and I’d sigh, and we’d stare out the window at the whiteout, feeling deeply for a few seconds about dog treats and life, respectively.

Roget’s is brim-full of existential gravitas because of how it was compiled. In the early 1800s, one Peter Mark Roget thought that a collection of words arranged by semantically related clusters within larger, epistemological categories would be a useful tool for the discerning scholar, and fifty years later, Roget’s Thesaurus was released to the public. Roget’s focused–and continues to focus, under a slew of different names and publishers–on grouping terms within larger semantic ideas and divisions. “Existence,” the first subcategory, includes nouns, verbs, adjectives, adverbs, formal philosophical terms, slang, and a wide variety of words that have something to do with the idea of existence, including “fact,” “sloth,” and–so cheery!– “grim reality.”

A brilliant system, though perhaps a little too much for the average teenager looking for another synonym of “desire” to use in this frickin’ essay on Wuthering Heights. But without it, teachers would no longer be regaled with fabulously inane and overly thesaurized essay sentences like, “Heathcliff dementated because of his propensity for Catherine,” or “Linton was Heathcliff’s foeman in wanting to pay one’s court to Catherine.” Dr. Peter Mark Roget, we salute you.

Dr. Roget’s categorization system isn’t to everyone’s taste, however. Discerning gentleman-scholars may have had the time to take the measure of each of the given synonyms, but the college student running on caffeine and youth, scrambling for an impressive synonym of “admonition” at 4:00am, may not. Some folks prefer a dictionary-like presentation of terms, and Roget’s is intentionally not dictionary-like. Enter the competitors (Merriam-Webster) and the losers who get to try to one-up Dr. Roget (me).

Where Roget’s revels in its epistemological abstractness, the thesaurus I was tapped to write was going to revel in its solidity. Unlike Roget’s, M-W thesauri deal entirely in listing related groups of lexical synonyms and antonyms. Instead of rambling chunks of loosely related words and an index that was half the size of the book, we’d present a list of words that mean exactly the same thing as the headword.  Words that were close would be called “related words” or “near synonyms,” and near synonyms would be grouped together by similarity of meaning. Same deal with antonyms. Very tidy.

And because we like tidy things, two groups of words that are commonly perceived as synonyms and antonyms would not be entered, because they were not lexically tidy: members of a genus and complementary pairs. This meant no “sofa” and “furniture,” nor any “black” and “white.” “Sofa” is not a lexical synonym of “furniture” because the word “sofa” does not mean “furniture.” Rather, a sofa is a type of furniture–it’s a member of a genus. And “black” is not the lexical antonym of “white.” If you look up “black” in the dictionary, its definition isn’t “not white.” “Black” and “white” are a complementary pair, like “knife” and “fork,” and “lexicographer” and “boring.” Not lexical, not eligible for entry. I nodded: yes, this is what we are good at, the lexical thing. This will be easy.

And it wasn’t.

Before you can find synonyms, you need to figure out what the meaning core of your headword will be. You must begin with a meaning that is broad enough to encompass most of the synonyms a person will want, but narrow enough that there’s some significant difference between it and another headword.

That seems like common sense until you begin writing and realize that you may need to have a bunch of synonyms in mind before you start actually looking for that bunch of synonyms. I worked backwards: I’d doodle out a list of possible synonyms for “general” and then begin looking for the common meaning they shared. When drafting that meaning, I also had to learn to avoid a common device used in dictionary defining: the synonymous cross-reference. Single-word cross-references in dictionary definitions are synonyms, and why waste a synonym in a meaning core when you can put it in the synonym list? (Because it is easy and I am lazy, that’s why.)

Once the meaning core is in place, you begin the hunt. The first M-W thesaurus was compiled, yes, by hand, with editors flipping through the Third and trying to keep track of all the possible synonyms for “love.” It was an overwhelming task, one sure to induce some strong hallucinations and psychotic breaks, and perhaps that explains why “chatty” was not listed as a synonym of “glib” in the first edition of the Collegiate Thesaurus but “well-hung” was. I had it easier, but even with a computer and a searchable dictionary database, finding and ordering synonyms and near synonyms was tricky. My nature was working against me: I am a splitter–a definer who likes detailing every possible denotative nook and connotative cranny of a word’s meaning–and so perhaps not the best person in the world to write a thesaurus. ‘Togs,’ I reasoned, means ‘clothing’, but it also refers to clothing worn for a specific purpose. Is that enough lexical synonymy to include ‘togs’ as a synonym? Or is it a near synonym? A vacuum whirred downstairs. It was 6:00pm, and I was going to be locked in the building overnight with nothing to eat and a bunch of boring, pedantic ghosts if I didn’t leave pronto. Synonym it is.

I began to rearrange lists of words by register, then by connotation, then for no other reason than they looked right next to each other. “Gear” and “rig” seemed to fit together–they are more technical words, referring to specific types of clothes used in particular activities, like mountaineering. And “costume” and “garb” sat well next to each other–they refer to dress-up, fanciful, special-occasion clothes. But those two groups are distinct: “costume” and “rig” didn’t work together, and that seemed right to me. It’s just like assembling a puzzle, I reassured myself. A puzzle with blank pieces you color in as you place them, and in the end you hope you have come up with a convincing representation of the Mona Lisa.

This sort of derangement is both de facto and de rigueur in the lexicography biz, but I was unprepared for one thing: that what seems right to me may not be what seems right to other editors. My thesaurus batches were returned; my carefully constructed near synonym groups had been scattered and re-formed. “Rig,” “outfit,” and “costume” ended up together, with “gear” left out. Other near synonyms were dropped; some were promoted to true synonyms. I was so thrown by this that it took me a while to notice the crowning glory of the revision: my managing editor added two true synonyms I had, in all my shuffling, missed: “clothes” (with the comment “!!!”) and “habiliment(s).” “Clothes” is exactly the sort of obvious synonym it’s easy to overlook when you are slogging through a dictionary, trying to find every last possible synonym or antonym of a word. I had fallen prey to Well-Hung Syndrome. As for “habiliments,” I had never seen the word before in my life. Assuming the Drudge’s Hunch, I stared at my reconfigured entry. I rubbed my face, and then rubbed it some more, until it began to look like a flatiron steak. The revisions made me feel a bit dumb and defensive.

Defensive, yes, because isn’t lexicography coolly objective? And isn’t my very objective read of “gear” and “outfit” perfectly fine the way it is, since I’m totally and completely objective? But if we’re both objective and we disagree, then how objective are we being? One of us, I tutted to myself, was not being objective.

In truth, neither of us was being 100% objective. The very nature of grouping, ranking, and sorting near synonyms means that a certain amount of subjectivity will inevitably be a part of the process. I stuff my word sausage differently than the managing editor stuffs his.

Nonetheless, my turd-stirring nature won out. I padded over to the managing editor’s cubicle and interrupted his reading and marking with my concerns regarding objectivity. He listened patiently to my concerns about the order of near synonyms, but when I brought up “habiliments,” his brows beetled. “Kory,” he sighed, “that sort of catch is exactly why we do this as a group. ‘Habiliments’ is a synonym for ‘clothing,’ even if you don’t know the word. And if you really think that ‘outfit’ doesn’t belong with ‘gear’ and ‘costume,’ then write your reasons down on a pink and I’ll consider it.”


He shook his newspaper in irritation. “Last I checked, my title was not ‘Dictator.’ This is a group effort.”

Lexicographers get defensive about objectivity because we know that, no matter how much training we have, we cannot be truly objective because we experience language subjectively. (We are, contrary to popular belief, fully human and not at all robots.) Sometimes our own personal experience with the language is invaluable: that subjective sprachgefühl helps guide a lexicographer when defining, when editing, when rubbing a jumble of synonyms between your hands to discover their relative heft and shape.

Sprachgefühl isn’t just weighed against written evidence–it is put against other sprachgefühlen. Every citation we take, every rewording of a definition, every example sentence penned is a subjective use of language. But when considered together, subjectivity fades into a picture of patterned, communal–and objective–use. Language is a human group effort, and so should lexicography be.


Filed under lexicography, making word sausage, thesaurizing

A Letter to a Prospective Lexicographer

We regularly receive letters from people who want an editorial job at M-W and ask for more information on lexicography. It’s my job to answer those letters. Here is the response I wish I could send.

Thank you for your interest in becoming an editor at Merriam-Webster.  I am happy to share some information on the field of lexicography with you.

There are only three formal requirements for becoming a Merriam-Webster editor. First, we respectfully ask that you be a native speaker of English. I think I should break this to you now, before you begin shopping for tweeds and practicing your “tally ho what”: we focus primarily on American English. It’s not that we don’t like British English and its speakers. Indeed, we have an instinctual, deep love for any people who, upon encountering a steamed pudding with currants in it for the first time, thought, “The name of this shall be ‘Spotted Dick’.” But since we are the oldest American dictionary company around, and we are located in a particularly American part of the world, we feel it’s best to play to our strengths.

Second, we ask that you have a degree from an accredited college or university. It needn’t be an advanced degree, nor does it need to be a linguistics degree. Dare I say it? I dare: most of us got degrees in things other than linguistics. While you are gasping in outrage, incredulity, and a little bit of disdain, allow me to say that all Merriam-Webster lexicographers end up dealing with words from a wide variety of fields–economics, business, physics, math, cooking, music, law, ancient hair-care techniques, and so on–and it helps to have a cadre of trained experts in those fields who will look up at you dolefully from their own defining batch when you too-nonchalantly wander over to their cubicle and ask them for their opinions on “EBITDA.”

If you feel that this information on degrees is so broad as to be unhelpful, know that we seem to collect medievalists for some reason. Our costume parties are awkward, rare, and yet entirely historically accurate.

Third, you must be possessed of sprachgefühl. This is an innate sense of the rhythm of language, as well as one of those delicious German words you’ll hear thrown around the office a bit (but not as often as you’ll hear “weltschmerz”). How do you know if you have sprachgefühl? You don’t know. Even if you think you might have it, you won’t really know if you are possessed of it until you’re here, letting the sentence “It’s time to plant out the lettuce” pad around inside your head, paying careful attention to how it rubs up against the language centers of your brain. Sprachgefühl is also evidently one of those things, like eyesight and hearing, that can dull with overuse: after several decades of working here, you will find that occasionally you go a little deaf as regards the natural rhythm of English, and you’ll trudge to your car at the end of a very long Thursday convinced that you are actually a native speaker of some weird Low German dialect and not English.

It’s okay if sprachgefühl eludes you; once you make this life-changing discovery, you are free to quit and pursue a career where your average weekly wage will not be a buck-fifty and as many Necco wafers as you can nick from the receptionist’s candy bowl at the end of every work day.

Those are the formal requirements for a job here. I would add these caveats regarding the lexicographical lifestyle:

1. In addition to sprachgefühl, it is also a good idea to be possessed of what the late lexicographer Fred Cassidy called “sitzfleisch.” Lexicography is so sedentary a calling that it makes load-bearing walls look busy by comparison.

2. It is not a good idea to come in thinking that you are All That as regards grammar and usage. You will have to set aside your grammatical prejudices in light of evidence, and if you are nothing but swagger and self-aggrandizement, then you will fall particularly hard the first time the Director of Defining tells you it’s totally idiomatic to use “nauseous” to mean “feeling sick.” Swagger and self-aggrandizement are not part of the lexicographer’s idiom. Fidgeting, social awkwardness, and a penchant for bad puns are.

3. “I knew that the work in which I engaged is generally considered as drudgery for the blind, as the proper toil of artless industry; a task that requires neither the light of learning, nor the activity of genius, but may be successfully performed without any higher quality than that of bearing burdens with dull patience, and beating the track of the alphabet with sluggish resolution.”

Heed the words of His Cantankerousness Samuel Johnson, the patron saint of the lexicographer. This passage is excerpted from his 1747 letter to the Earl of Chesterfield in which Johnson proposes writing a new dictionary of the English language. “Bearing burdens with dull patience,” “beating the track of the alphabet with sluggish resolution”–and that’s what he thought before he started writing his dictionary.

It may well be that none of this dissuades you. That’s fine: slight derangement is not grounds for disqualification from a career in lexicography.

You should know, however, as you move forward in your search that jobs in lexicography are few and far between. Our late Editor in Chief used to tell people it was just a matter of being in the right place at the right time. This is so vague as to be maddening, so I am happy to clarify: it is a matter of being in one of the offices of a dictionary company just as the Editor in Chief says, “I think we may need to hire some more lexicographers.”

Take heart: one of my coworkers wrote once every three months for over a year about editorial jobs until finally our Director of Defining hired her. She’s a fabulous editor and we are lucky to have her. She also has a linguistics degree. All God’s critters got a place in the choir.

It’s worth noting that, though lexicography moves so slowly it is technically a solid, it is nonetheless changing. New online tools mean that you have more information at your fingertips, which means you must engage that sprachgefühl a lot more and know how to use a computer. (You’d be surprised.) Modern lexicographers have the luxury of writing for an online medium, where space is not at a premium and no one has to proofread the dictionary’s end-of-line breaks in 4-point type on blue galleys ever, ever again. When I came on, all new editorial hires were required to read and take extensive notes on the front matter to Webster’s Third New International Dictionary, Unabridged. This is no longer required, thanks to the tireless work of Amnesty International. And, of course, we’re allowed to talk inside the building now.

I hope this information, while not particularly encouraging, is helpful. If you are still interested, against all better judgment, in a career in lexicography, do feel free to send us your cover letter and resumé. We will keep it on file for a year, occasionally taking it out to marvel at your enthusiasm and shake our heads in wonder.


Filed under general, lexicography

The Impossible Task: Cross-Referencing the Unabridged

As I mentioned on the Twit Machine recently, I have been working on a very exciting project: a new edition of Webster’s Third New International Dictionary, Unabridged.

“About frickin’ time!” fans of the Third hollered in one thunderous voice, and with good reason: the Third was released in 1961. It has been updated by means of an Addenda Section once every seven or so years, but an A-Z revision has been long overdue. We will be the first people to tell you that, longingly, as we peer out from underneath the production schedule.

And so we’ve begun the long, slow work of revising and updating. There is a stately surrealism to stripping down and refurbishing one of America’s most celebrated and controversial dictionaries, kind of like taking the Pope underwear-shopping. When you get right down to it, you are left there in your small mortality, looking at the boxer-briefs of something that has been revered and hallowed for longer than you’ve been on this earth, and that is unsettling.

Nonetheless, here I am, staring intently at the varicosities of the Third and doing my best to patch them up.

Over the years, I’ve been asked why we don’t just slap some new words into the Third while we’re mucking about with new Collegiate editions. Hell, it’s just data, my dictionary-loving friends would say. It’s just an entry. It’ll only take you two extra minutes.

I have discovered that it’s not just an entry, and it’s not just two extra minutes, because of something called “cross-reference.”

Every dictionary you use has rules about the words entered therein, and one of the basic rules of any decent dictionary is that you cannot use a word in the definitions, usage notes, or example sentences that is not defined somewhere, somehow in that very dictionary. That sounds sensible, but you’d be surprised how many discount dictionaries don’t follow this rule–and what a difficult rule it is to follow, even in this digital age. In order to make sure that this rule is followed, we have a whole group of editors whose job is to beat the track of the alphabet, hoovering up all the information they can about the words in this book, and making everything tidy.
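For the programmers in the audience, the rule itself can be sketched in a few lines. What follows is a toy illustration and not our actual tooling: the function name `undefined_words`, the `toy` dictionary, and the crude word-splitting pattern are all assumptions made up for the example, and it ignores inflected forms, homographs, and open compounds, which is where the real trouble lives.

```python
import re

def undefined_words(entries):
    """Map each headword to the words used in its text that are not entered anywhere.

    `entries` is a hypothetical {headword: definition text} mapping.
    """
    headwords = {headword.lower() for headword in entries}
    problems = {}
    for headword, text in entries.items():
        # Treat hyphenated compounds such as "douche-canoe" as single words.
        used = set(re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower()))
        missing = sorted(used - headwords)
        if missing:
            problems[headword] = missing
    return problems

# A five-entry toy dictionary (invented for illustration).
toy = {
    "taupe": "a brownish gray",
    "gray": "a color",
    "color": "a hue",
    "a": "one",
    "one": "a",
}

print(undefined_words(toy))  # {'taupe': ['brownish'], 'color': ['hue']}
```

Even in this five-entry dictionary, two of the definitions use a word that is entered nowhere. Scale that up to an unabridged dictionary and you begin to see the problem.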

I was recently pulled from doing some subject-specific defining and put on the ever expanding task of making sure new entries are entered properly into the data. Part of this involves some cross-reference work, but “not a lot,” as the Director of Editorial Operations put it. “Just a bit.”

Silly me, I took “just a bit” at face value. In fact, “just a bit” means “there’s quite a lot and you will only find and correct a little bit of it.”

My very first entry gave me trouble. There was a word in a quotation that looked odd. I don’t think that’s supposed to be hyphenated, I thought, and so I went to the Third. No, indeed, it was entered in the Third as a closed compound, and I patted myself on the back for being so observant. Mid-pat, I realized I then had to do something about that.

There are options available to the editor doing cross-reference, but none of them is easy. The simplest choice is to alter the quotation to omit the troublesome word. Of course, as luck would have it, this wasn’t possible in this case, as the word to be omitted was the verb of the sentence, and a verbless example sentence was certainly going to raise a few eyebrows when this new dictionary came out. Well, then, I’d just have to find another quotation to sub in. Off to the citation files, where I found the absolute perfect substitute. Oh, it was gorgeous: short, idiomatic, completely covering the contextual meaning and connotation of the word in question, and the author’s name made me giggle (last name: Butters). This was it. After running it through the cross-reference gauntlet, I discovered it used two words not entered in this dictionary.

The next option is to see if the compounding style of this word is going to change at all in the new edition. We base this on citational information, so a quick search of the database showed me that the hyphenated and closed compounds had roughly the same amount of use. I shoot an e-mail to the Director of Defining and ask him if he has any advice. His response is, “Look through the revision files. Quickly.” Because like all dictionaries, this one has a deadline and we will make many, many people (not least of whom, the Publisher) sad if we push it back.

The revision files yield many surprises, chief of which is that some of the entries in it are from editors who came and went 20 years ago–the Third has, let’s remember, been in need of revision for a long time–and their notes have been appended to by successive generations of editors who are correcting or reiterating their point. (“Style was once open; now determinedly hyphenated. A. Editor, 1982.” “Style now closed; ignore previous note. B. Editor, 1986.” “Word is open compound. Ignore A. & B., they are morons. C. Editor, 1992.”) I open one notes file. It is several hundred pages long.

After some searching, I find a note for this entry that leads me to believe that the hyphenated compound will not be entered. I make an assortment of irritated editorial noises and, after opening the cit files again, start looking for a third replacement sentence. An hour has gone by and I have spent it on one quotation at one entry. The word I am agonizing over is not even the word I’m entering: it is peripheral, incidental. But when you are doing cross-reference, nothing is peripheral or incidental.

Some variation of this continues for the rest of the letter, then progressive batches, and the number of annoying e-mails I send to my colleagues skyrockets. I can almost hear the server groan when I hit “New Message” and begin my fourteenth e-mail of the day to one of the science editors. “Me again. What are you going to do with ‘thumb drive’? I’m sure you haven’t even given it a thought, but can you give it one for me in, say, the next ten minutes?” I send more e-mails to the Director of Defining. “Howdy. Do you have any thoughts on how to handle the expansion of ‘HIPAA’?” And again, later: “One more: can I edit ‘douche-canoe’ down to ‘douche …’ in this quotation for ‘bromantic,’ or will I have to enter a new sense of ‘canoe’? If I’m doing that, should I just enter ‘douche-canoe’?”

It’s not just a matter of hunting down compounding styles. There are the new entries that require other new entries, each of those requiring two new entries, one of which will require substantial revision to another four entries, two of which will require new etymologies. One medical entry requires that I re-open nine letters for revision and ask our Pronunciation Editor for six new prons in letters he’d already done. It takes me four hours to enter all this into the file.

At one point, I spend time trying to find a better quotation for a word to avoid the dread hyphenated-but-not-entered-as-such compound, only to discover 30 minutes into my search that the hyphen in question is actually an end-of-line break, and so not a real hyphen at all. The only upside to this is that the quotation I can now retain was written by someone with another chortle-inducing name. We take joy where we can find it.

Every inquiry leads me down a garden path of more inquiry, until I am lost in the weeds and just want to lie down in the grass and sleep for many years. I’m in so many different letters at once, I can’t tell you where I am in the project. (Here the Publisher frowns.) And here is the most perverse thing of all: even with all the time I’m putting in making sure that all these entries are tidy, there is no way I will catch every cross-reference error. Words that I assume are entered are not; styles that I assume are fine will be changed; words will be dropped or modified during copyediting, setting off another string of cross-reference changes. When I try to explain what the cross-reference work is like to another general definer, I sum it up by saying, “Google ‘ping-pong balls, mousetraps, and nuclear chain reaction.’” The ping-pong balls are the entries. All those sprung, upended mousetraps are me.

That is why we have Cross-Reference, the stalwart department who does this for every damn book we publish. Cross-Reference consists of the sweetest people on the editorial floor, but make no mistake: they are brilliant in ways that blabbering dilettantes like me cannot possibly comprehend. Consider: I have only done cross-reference work digitally, but there are people in our Cross-Ref department who remember the days when they did this by hand–when checking on the proposed styling of a new entry involved a silent plod across the editorial floor, a short aerobics routine that involved carefully lifting and stacking galleys, and tens of thousands of index cards. At one point, I asked one of the Cross-Ref editors how they knew that a styling change would be made later in the alphabet. “Oh,” she said, “you just keep track. Most of it just sticks in there, in all those nooks and crannies in your mind.”

I considered, not for the first time, that I must have a very smooth brain.

They not only catch mistakes, but are lightning fast. They have to be: by the time they get a finished dictionary, they usually only have a few weeks to do their work before the book is due at the printer’s, and the printer gets very cranky if we are late. When the defining work is done, everyone breathes a huge sigh of relief and we celebrate with doughnuts, but no one gives a thought to the tireless drudges who are still–quietly, cheerfully–making sure that we haven’t used “douche-canoe” in an entry without defining it. There is very little glory in lexicography, and where there is glory, definers and etymologists get it all. But Cross-Ref are the ones who actually deserve it.

So when you read a dictionary entry in the new unabridged and have to look up another word in said book, raise a glass to the masterful editors of Cross-Reference, and be very glad that I am not one of them.


Filed under lexicography, making word sausage