Tag Archives: unabridged

The Times, They Are A-Changing (And So Should Your Dictionary)

I was on an airplane heading to Georgia for a conference when I got into my usual “take my mind off the possibility this plane will suddenly plummet from the sky” conversation with my seatmate. Talk turned to dictionaries, and my seatmate began heaping praise on her old one. She had, she told me proudly, a Webster’s Second, and there was no way in heaven or on earth she was going to give it up for one of those silly modern dictionaries. “My son keeps trying to get me to use a dictionary on my phone, but I tell him, ‘Those new dictionaries aren’t the same quality as the one I have at home.’”

I opened my mouth to say that, nice though the definitions in the Second are, they are almost 80 years out of date, when the supercell we were flying past let out a little meteorological burp and the plane flew right through it. I am not entirely sure, but I believe we may have flipped over several times, and I am certain that the sound that came out of my mouth was not a spirited defense of the modern dictionary (though it was certainly “spirited” in the “possessed by banshees” sense). Our bounce through North Carolina airspace lasted only ten seconds, and afterwards my seatmate excused herself to the lavatory, so our conversation was over.

Had the conversation continued, I would have said this: old dictionaries are nostalgia bombs in more ways than one. The heft of the Second and the Third is glorious: tooled leather and gold-leaf embossing, that powdery vanilla smell of old paper as you smooth the pages back. Then you see this:

[Image: the entry for “Negrito” in Webster’s Second]

Consulting old dictionary definitions is like having dinner with your grandparents. The evening usually starts off well enough, with your grandparents telling stories of their life during the war or down on the farm, and then there is that one point where your dear old granny says something that is slightly outré and you know that the whole conversation is slowly going off the rails, but before you can think of some tactful way to change the subject, your dear grandma is using words like “Japs” and “Eye-ties” and “the blacks,” words that make you inadvertently screech your fork across your plate. And when you look for some sign of self-consciousness–some sign that she should know better, Grandma–all you see is the same little old lady who was there before the vileness came tumbling out of her mouth, slowly daubing her meatloaf with mashed potato.

I have been reminded of the chronological fixedness of old dictionaries as we have begun working on the Unabridged Dictionary. It’s no secret that most dictionaries in print today are written using another dictionary as the base; the Unabridged is being built on the very doughty scaffolding of Webster’s Third New International Dictionary. We review the entries in the Third, add (many, many) new entries, and flesh out or correct entries that need it, and in no time at all, idiomatically speaking, the dictionary we’re working on is no longer the Third but a new critter entirely. But this transformational work is not as easy as you’d think, because the Third is 50 years old, and some of the language used and the implicit attitudes expressed therein are like those dinners with Grandma after she’s polished off her second martini. It’s not that the definers of the Third were trying to be offensive, it’s just that society and our cultural ethos have changed a little since 1961. When the Third was released, there was no Equal Pay Act or fully ratified Fourteenth Amendment or Roe v. Wade; sodomy was a felony in every state in the U.S.; and one of the top pop hits was “Runaround Sue,” a song that we today would call “slut-shaming.” Considering the time, it’s frankly amazing that the Third is as careful and circumspect as it is.

For dictionaries that are updated more frequently–even dictionaries updated every 10 years–this de-Archie-Bunkering happens naturally. You notice, for instance, that there’s mention of women in the citations for “firefighter” or “CEO,” and all you do is make sure that you edit out the masculine pronoun in the definition. Or let’s say that you undertake a revision and discover that what was formerly called “Black English” is now called “African-American Vernacular English.” Fine: you search the data for any label that reads “Black English” and make the change. In this way, the dictionary is updated for modern mores in manageable nibbles. But the fact is that you are catching things as you encounter them, rather than hunting for them. For the Unabridged, we’d have to grab our pitchforks and head into the forest looking for the monsters.

It all begins with lists (if there is one thing we are good at, it is making depressing lists). We compiled lists of every word in the Third, the latest Collegiate, and the Learner’s Dictionary that was given any sort of stigmatizing label, regardless of whether that label was current (dated, old-fashioned, vulgar, obscene) or not (abuse, contempt). Then we began to think of words we had encountered in our many jaunts through the Third that struck us as culturally sensitive or potentially offensive: “Negro,” for instance, or “colored.” This list grew as each of us began thinking about awkward family dinners with That One Uncle who likes to talk loudly with his mouth full and eventually lapses into saying horrible things that make our eyes widen and our mothers tsk in disapproval. As we each delved into the archives of our mythic That One Uncle, we together sang the body apoplectic: “Do we have ‘Asiatic’ on the list?” “Do we have ‘homosexual’ on the list?” “Please tell me that ‘Arab’ and ‘Muslim’ are on the list.” “Oh good Lord, we absolutely need to put ‘redskin’ on the list.” And because everything’s better in threes, we had a third list of words that might be potentially sexist: any word with a masculine pronoun in the definition; any word with a gender-specific term (“woman,” “girl,” “mistress,” “man,” “boy”) in the definition; words ending in the suffix “-ette” or “-ess”; any word with the affix “-man.” Compiling these lists was deeply exhausting work, mostly because we’d swing between being riled up about and deeply embarrassed by the imaginary collective -isms of That One Uncle. 

Eventually, we had our list of words. But we weren’t ready to revise yet, because first, we had to search through every entry in the Third that contained any member of those lists. If “man” or “boy” appeared in a definition, a usage note, an example sentence or verbal illustration, an etymology, or even a subject label, the word where it appeared was put on the Potentially Offensive List. When all was said and done, we had thousands and thousands of entries to go through.
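For the computationally inclined, the flagging step described above can be sketched in a few lines. This is only an illustration, not how the work was actually done: the entries, the field names, the watch list, and the `flag_entries` function are all invented, and the matching is a naive whole-word search. The result is a worklist for a human editor, not a set of automatic changes.

```python
import re

# Hypothetical watch list and entries, invented for illustration only.
WATCH_TERMS = {"man", "boy", "primitive"}

ENTRIES = [
    {"headword": "firefighter",
     "definition": "a man who fights fires",
     "example": "the boy wanted to be a firefighter"},
    {"headword": "harpy",
     "definition": "a foul creature of Greek mythology",
     "example": ""},
]

def flag_entries(entries, watch_terms):
    """Return (headword, field, term) for every whole-word hit in any field."""
    hits = []
    for entry in entries:
        for field, text in entry.items():
            if field == "headword":
                continue  # only the body of the entry is searched
            words = set(re.findall(r"[a-z]+", text.lower()))
            for term in sorted(words & watch_terms):
                hits.append((entry["headword"], field, term))
    return hits

for hit in flag_entries(ENTRIES, WATCH_TERMS):
    print(hit)
# ('firefighter', 'definition', 'man')
# ('firefighter', 'example', 'boy')
```

Note that a whole-word search like this one would sail right past “seaman” or “boyish,” which is one small reason a real pass needs more care than a toy script.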

This is the point at which my dear friends who are computational linguists want to hear about the programmatic handling of these entries, but the truth is that everything had to be done by hand. Despite Philip Gove’s zeal for order and systematic defining, none of these terms had parallel handling in the Third, so it wasn’t as simple as swapping out “Negro” with “African-American,” for instance. Some of these terms were also a little too nuanced for a simple search-and-replace. The word “primitive” as it is applied to people groups is culturally outdated, but that doesn’t mean that every instance of the word “primitive” in the Third needs to be swapped out with…what, exactly? Is there a single synonymous word for this particular sense of “primitive” that would fit every stigmatized use of it in the Third? How would we know without having a real, live, myopic and undercaffeinated editor look at every stigmatized use of “primitive” first? Our stalwart and defiantly cheerful Cross-Reference department began sorting through 50 years of fodder for awkward family dinners, and then an equally cheerful group of editors (and me) began to update these entries.

There is something utterly dispiriting about encountering that volume of offensiveness, but it can also motivate you. I am making this goddamned better, you think, because no one else should have to deal with That One Uncle in this dictionary, and you swallow the bile and bite back the “WTF!”s and keep editing “Negro” out of entries.

But as you may guess, offensiveness isn’t always so easily predictable. Take, for instance, the entry in the Third for “atheistic,” which I had in one of my early defining batches. The definition reads, in full, “relating to, characterized by, or given to atheism : GODLESS, IMPIOUS, IRREVERENT.”

“Oh my God,” I muttered, then paused briefly to regret my word choice. To a lexicographer, that boldface colon between “atheism” and “godless” is not just a cute way of breaking up space, but a way to signal that the things on either side of that colon are exactly synonymous. That means that if someone is describing another person as “atheistic,” according to that definition, they mean both that that person subscribes to atheism and that they are impious, irreverent, and godless. I believe that this definition wasn’t a malicious attack on atheists–it was just sloppy defining. These are two separate meanings and shouldn’t have been shoved together into one. But that boldface colon in the middle of the entry makes what could have been a perfectly neutral definition into a moral judgment on atheists.

There were occasional reprieves: sometimes the issues we uncovered weren’t completely depressing. While looking through the entry for “runner,” I ran across the definition “a seaman engaged for a short single voyage” and howled like a 12-year-old boy. “Seaman” went on the Potentially Offensive List; that sense of “runner” has yet to be fixed.

And there’s the rub (hur hur hur): the Unabridged is a work in progress. We’ve already changed thousands of entries, but there are, as our Director of Defining has put it, “no doubt many more excitingly offensive things to be discovered.”

Lexicographers like to remind people often and loudly that a dictionary is a record of the English language as it is used–and it is, fully and totally, from its entry list to the language used in the definitions. That’s why I cringe when people tell me they prefer to cite Webster’s 1828 or Webster’s Second when discussing what words mean today. Both those dictionaries are perfectly serviceable and scholarly dictionaries of their day, but the sun set on that day a long time ago. By all means, love your old dictionaries–cherish them for the works of art that they are, keep them around to remind you of days gone by–but maybe don’t look up “Negrito” in them.

Filed under lexicography, making word sausage

“God,” Guns, and Group Defining

When people want to make small talk with me—before they realize that I am terrible at it and not worth the time and effort—they will ask what I do, and then sometimes respond with, “So, you pretty much know everything, right?”

I have just taken to smiling wearily and saying, “Yes, I know everything.” I have teenagers, and often enough they are happy to disabuse those people of this asinine notion.

No one knows everything, and lexicographers are just like the rest of humanity (only slightly quieter and perhaps a little more openly deranged). There you are as a lexicographer, minding your own business with “harpy,” when you scan downscreen to your next word and encounter “harquebus” in all its Francophonic glory. You flip through your mental card catalog of Words I Have Seen, find the one labeled “harquebus,” and find your memory has only written, “from a novel, maybe Count of Monte Cristo? Is that a novel? SEE ALSO: sandwiches I have loved.”

Fortunately, the lexicographer doesn’t have to rely on this mental catalog. The lexicographer relies on citations. But what do you do when the citations are less than helpful? Here, for instance, the citations are all variants on “She pulled a harquebus from her corset/stomacher/stocking and shot him dead,” which gives you nothing besides a genus term for your definition (“a gun”) and a ten-minute respite as you ponder whether a gun would even fit inside a corset—or good Lord, a stocking, wouldn’t stockings fall down or even tear under the weight of a what’s-a-hoozy—harquebus? And why are heroines in these novels always pulling weapons from their underwear, anyway?

You return to the citations with a sigh and a determination to carefully study the cover of the next trashy novel you see, just to observe whether the buxom, swooning lass’s dress has pockets in it or not.

The problem with “harquebus” is not just that the citations are maddeningly vague and all pulled from Harlequin novels. The fact is that the word “harquebus” refers to a very specific thing, and you need to know a bit about the thing “harquebus” in order to define the word “harquebus.” Or, at the very least, you need to know enough about the thing to know whether these particular uses for the thing are valid.

You do not know that. But fortunately, there’s a guy on the editorial floor with a thing for Renaissance-era weaponry, and he will know.

You know he knows because of a précis of wonder and beauty: the Specialized Subjects list. This is a document that tells you everything that every editor on the floor knows. It is full of surprises and is one of the best ways to get to know your co-workers without having to actually talk to them. Of course the senior etymologist “has at least superficial familiarity with most European languages, best within Slavic, Celtic, and Germanic,” but did you know that he also is a mushroom-picking philatelist? Likewise, our French editor is a weapons enthusiast. The quiet health nut, it turns out, loves cigars. I know about the 9th-century Latin Mass, knitting, and muscle cars.

The list is handy for general definers who are stuck with “hot rod,” but it’s also handy for the Director of Defining, who uses it when a group of words (say, music theory terms) should be defined by someone with superior knowledge of the subject. Welcome to “group defining,” the ever-deepening hole into which you daily and hourly dig yourself by proclaiming that you have any knowledge of any subject whatsoever. For the new Unabridged Dictionary, I have been given, as a group definer, all the religion terms. This is what an interdisciplinary degree and a penchant for reading and marking books like “Freethinkers: A History of American Secularism” will get you: a batch for revision that is about 10,000 entries long. (I’m one-sixth of the way through and am currently stuck on the entry for “god.” See you in whichever afterlife destination you feel like condemning me to.)

There is something very tricky about group defining, because that is where you find yourself balancing the thing-ness and the word-ness of a definition. A harquebus, as I have learned from the guy with a thing for Renaissance-era weaponry, is a matchlock gun that is heavy enough that it was usually fired from a support. Those characteristics are what distinguish a harquebus from a blunderbuss, which was “probably a better choice for stuffing into a corset,” says my colleague. The distinguishing characteristics of a harquebus therefore belong in the definition for “harquebus,” even if the batch of citations I have at hand doesn’t mention any of them. The group definer has specialized knowledge, as well as a whole raft of odd books they can plunder for citations so our formal evidence matches up with reality.

But even a good raft of odd books can’t catch everything. I spent about two weeks revising three related theology entries because each of those words was used, for quite a long time, very deliberately incorrectly. They were employed by one side of a theological argument as rhetoric and epithets to discredit the legitimacy of the other side. It’s as if the whole early Christian church was at a hockey game together and someone started a “Monophysites suck” chant that went on for roughly 1,000 years. But if you aren’t someone who knows about the initial theological brouhaha and the way it resonated through the Middle Ages–perhaps because you never had to write a paper on the Nestorian and Eutychian controversies, because you chose a better degree than I did–you wouldn’t know that was the case.

Lexicographers talk with a sort of heavy-breathing fetishism about the corpus, the citations, the data. It will give us all the answers. But every corpus in the world has holes in it, limitations. That’s part of why a good dictionary is compiled by people–living, breathing, awkward people who can look through that corpus, give advice, and do some citational spackling based on the knowledge and experience they gleaned from outside the office. Lexicographers may throw around the size of their corpus, but it’s the people sifting painstakingly through that corpus, like archaeologists weighing potsherds, that make all the difference.

When my children were little, they learned that the word “wedgie” referred to “the condition of having one’s clothing wedged between the buttocks,” as the Collegiate so toffishly puts it. They were absolutely ecstatic: here was a word for this thing that happened to them pretty much constantly! And it was a good word, too, a word that had great screechability and ended in a long-e for maximum sustain. Best of all, it had to do with butts. For about three days, both the six-year-old and the two-year-old hollered the word “wedgie” constantly.

Now, like most parents with young children, my husband and I were desperate for some little veil of ivoried respectability to drape over this big, nekkid waller of parenthood that was so often punctuated (primarily in public spaces, usually with a finger or two up a nostril) with “MAMA! I HAVE A WEDGIE!” So I told my kids not to call it a “wedgie”—I told them to call it “an issue.”

They did, for many years. And while people may have cocked their heads to hear a worried-looking preschooler say, “Mama, I have an issue,” the veil of respectability slid artfully into place. For a while.

The day soon came when both my children learned that when other people use the word “issue,” they are not referring to wedgies. They are referring to vital and unsettled matters that generally require discussion.

“Yes,” I answered, as my eldest explained this to me in tones of deep-purple mistrust, “but isn’t a wedgie basically the same thing in our house? Besides, no one else knew what we were talking about. They thought that you were just deeply interested in the election.”

She frowned so deeply that the tip of her nose met her eyebrows. “But you write dictionaries: you knew it wasn’t like that in the real world.”

It’s a refrain I call to mind every time I read endless citations for “god” that use the word vaguely at best, and it is my mumbled offering of thanks for a team of editors who have wide, varied experiences and specialties I can draw on when the citations leave me hanging. When people come to the dictionary and look up a word like “harquebus” they expect you to give them the definition from the real world: the world where women don’t stuff a gun the size of a musket into their corsets, no matter what the citations tell you; the world where “Monophysite” is not a politicized slur; the world where a wedgie is a wedgie.

Filed under lexicography, making word sausage

The Impossible Task: Cross-Referencing the Unabridged

As I mentioned on the Twit Machine recently, I have been working on a very exciting project: a new edition of Webster’s Third New International Dictionary, Unabridged.

“About frickin’ time!” fans of the Third hollered in one thunderous voice, and with good reason: the Third was released in 1961. It has been updated by means of an Addenda Section once every seven or so years, but an A-Z revision has been long overdue. We will be the first people to tell you that, longingly, as we peer out from underneath the production schedule.

And so we’ve begun the long, slow work of revising and updating. There is a stately surrealism to stripping down and refurbishing one of America’s most celebrated and controversial dictionaries, kind of like taking the Pope underwear-shopping. When you get right down to it, you are left there in your small mortality, looking at the boxer-briefs of something that has been revered and hallowed for longer than you’ve been on this earth, and that is unsettling.

Nonetheless, here I am, staring intently at the varicosities of the Third and doing my best to patch them up.

Over the years, I’ve been asked why we don’t just slap some new words into the Third while we’re mucking about with new Collegiate editions. Hell, it’s just data, my dictionary-loving friends would say. It’s just an entry. It’ll only take you two extra minutes.

I have discovered that it’s not just an entry, and it’s not just two extra minutes, because of something called “cross-reference.”

Every dictionary you use has rules about the words entered therein, and one of the basic rules of any decent dictionary is that you cannot use a word in the definitions, usage notes, or example sentences that is not defined somewhere, somehow in that very dictionary. That sounds sensible, but you’d be surprised how many discount dictionaries don’t follow this rule–and what a difficult rule it is to follow, even in this digital age. In order to make sure that this rule is followed, we have a whole group of editors whose job is to beat the track of the alphabet, hoovering up all the information they can about the words in this book, and making everything tidy.
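The rule itself is easy to state in code, which is part of why people assume it is easy to follow. Here is a minimal sketch, assuming a toy dictionary that maps each headword to a single definition string; the data and the `undefined_words` function are invented for illustration and say nothing about how our actual tools work. It ignores inflected forms, multiple senses, usage notes, and example sentences, which is to say it ignores most of the job.

```python
import re

# Hypothetical miniature dictionary: headword -> definition text.
DICTIONARY = {
    "a": "one",
    "one": "a thing",
    "thing": "one",
    "gun": "a thing",
    "harquebus": "a heavy matchlock gun",
}

def undefined_words(dictionary):
    """Map each headword to the words in its definition that have no entry."""
    headwords = {h.lower() for h in dictionary}
    missing = {}
    for headword, definition in dictionary.items():
        # Keep hyphenated compounds whole, so "douche-canoe" is not
        # silently satisfied by an entry for "canoe".
        used = re.findall(r"[a-z]+(?:-[a-z]+)*", definition.lower())
        gaps = sorted({w for w in used if w not in headwords})
        if gaps:
            missing[headword] = gaps
    return missing

print(undefined_words(DICTIONARY))
# {'harquebus': ['heavy', 'matchlock']}
```

Treating a hyphenated compound as one token is a deliberate choice here: whether a word is entered open, closed, or hyphenated is exactly the kind of thing a cross-reference check has to catch.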

I was recently pulled from doing some subject-specific defining and put on the ever expanding task of making sure new entries are entered properly into the data. Part of this involves some cross-reference work, but “not a lot,” as the Director of Editorial Operations put it. “Just a bit.”

Silly me, I took “just a bit” at face value. In fact, “just a bit” means “there’s quite a lot and you will only find and correct a little bit of it.”

My very first entry gave me trouble. There was a word in a quotation that looked odd. I don’t think that’s supposed to be hyphenated, I thought, and so I went to the Third. No, indeed, it was entered in the Third as a closed compound, and I patted myself on the back for being so observant. Mid-pat, I realized I then had to do something about that.

There are options available to the editor doing cross-reference, but none of them is easy. The simplest choice is to alter the quotation to omit the troublesome word. Of course, as luck would have it, this wasn’t possible in this case, as the word to be omitted was the verb of the sentence, and a verbless example sentence was certainly going to raise a few eyebrows when this new dictionary came out. Well, then, I’d just have to find another quotation to sub in. Off to the citation files, where I found the absolute perfect substitute. Oh, it was gorgeous: short, idiomatic, completely covering the contextual meaning and connotation of the word in question, and the author’s name made me giggle (last name: Butters). This was it. After running it through the cross-reference gauntlet, I discovered it used two words not entered in this dictionary.

The next option is to see if the compounding style of this word is going to change at all in the new edition. We base this on citational information, so a quick search of the database showed me that the hyphenated and closed compounds had roughly the same amount of use. I shoot an e-mail to the Director of Defining and ask him if he has any advice. His response is, “Look through the revision files. Quickly.” Because like all dictionaries, this one has a deadline and we will make many, many people (not least of whom, the Publisher) sad if we push it back.

The revision files yield many surprises, chief of which is that some of the entries in them are from editors who came and went 20 years ago–the Third has, let’s remember, been in need of revision for a long time–and their notes have been appended to by successive generations of editors who are correcting or reiterating their point. (“Style was once open; now determinedly hyphenated. A. Editor, 1982.” “Style now closed; ignore previous note. B. Editor, 1986.” “Word is open compound. Ignore A. & B., they are morons. C. Editor, 1992.”) I open one notes file. It is several hundred pages long.

After some searching, I find a note for this entry that leads me to believe that the hyphenated compound will not be entered. I make an assortment of irritated editorial noises and, after opening the cit files again, start looking for a third replacement sentence. An hour has gone by and I have spent it on one quotation at one entry. The word I am agonizing over is not even the word I’m entering: it is peripheral, incidental. But when you are doing cross-reference, nothing is peripheral or incidental.

Some variation of this continues for the rest of the letter, then progressive batches, and the number of annoying e-mails I send to my colleagues skyrockets. I can almost hear the server groan when I hit “New Message” and begin my fourteenth e-mail of the day to one of the science editors. “Me again. What are you going to do with ‘thumb drive’? I’m sure you haven’t even given it a thought, but can you give it one for me in, say, the next ten minutes?” I send more e-mails to the Director of Defining. “Howdy. Do you have any thoughts on how to handle the expansion of ‘HIPAA’?” And again, later: “One more: can I edit ‘douche-canoe’ down to ‘douche …’ in this quotation for ‘bromantic,’ or will I have to enter a new sense of ‘canoe’? If I’m doing that, should I just enter ‘douche-canoe’?”

It’s not just a matter of hunting down compounding styles. There are the new entries that require other new entries, each of those requiring two new entries, one of which will require substantial revision to another four entries, two of which will require new etymologies. One medical entry requires that I re-open nine letters for revision and ask our Pronunciation Editor for six new prons in letters he’d already done. It takes me four hours to enter all this into the file.
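The way one new entry drags in others can be pictured as a simple worklist. The sketch below is hypothetical throughout: the draft definitions, the set of existing headwords, and the `entries_needed` function are made up for illustration. Starting from one new word, it collects every not-yet-entered word that the drafts would force into the book.

```python
from collections import deque

def entries_needed(new_word, proposed, existing):
    """Return every not-yet-entered word that adding new_word drags in.

    proposed maps a candidate headword to the words its draft definition uses;
    existing is the set of headwords already in the dictionary.
    """
    needed = []
    queue = deque([new_word])
    seen = set(existing) | {new_word}
    while queue:
        word = queue.popleft()
        needed.append(word)
        for used in proposed.get(word, []):
            if used not in seen:  # not entered and not already queued
                seen.add(used)
                queue.append(used)
    return needed

# Invented draft definitions, reduced to the words they use.
existing = {"a", "close", "friend"}
proposed = {
    "bromantic": ["bromance"],
    "bromance": ["a", "close", "friend", "bro"],
    "bro": ["a", "friend"],
}

print(entries_needed("bromantic", proposed, existing))
# ['bromantic', 'bromance', 'bro']
```

A real cascade is worse than this, because each of those entries can also change the style, etymology, or pronunciation of entries that already exist.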

At one point, I spend time trying to find a better quotation for a word to avoid the dread hyphenated-but-not-entered-as-such compound, only to discover 30 minutes into my search that the hyphen in question is actually an end-of-line break, and so not a real hyphen at all. The only upside to this is that the quotation I can now retain was written by someone with another chortle-inducing name. We take joy where we can find it.

Every inquiry leads me down a garden path of more inquiry, until I am lost in the weeds and just want to lie down in the grass and sleep for many years. I’m in so many different letters at once, I can’t tell you where I am in the project. (Here the Publisher frowns.) And here is the most perverse thing of all: even with all the time I’m putting in making sure that all these entries are tidy, there is no way I will catch every cross-reference error. Words that I assume are entered are not; styles that I assume are fine will be changed; words will be dropped or modified during copyediting, setting off another string of cross-reference changes. When I try to explain what the cross-reference work is like to another general definer, I sum it up by saying, “Google ‘ping-pong balls, mousetraps, and nuclear chain reaction.’” The ping-pong balls are the entries. All those sprung, upended mousetraps are me.

That is why we have Cross-Reference, the stalwart department who does this for every damn book we publish. Cross-Reference consists of the sweetest people on the editorial floor, but make no mistake: they are brilliant in ways that blabbering dilettantes like me cannot possibly comprehend. Consider: I have only done cross-reference work digitally, but there are people in our Cross-Ref department who remember the days when they did this by hand–when checking on the proposed styling of a new entry involved a silent plod across the editorial floor, a short aerobics routine that involved carefully lifting and stacking galleys, and tens of thousands of index cards. At one point, I asked one of the Cross-Ref editors how they knew that a styling change would be made later in the alphabet. “Oh,” she said, “you just keep track. Most of it just sticks in there, in all those nooks and crannies in your mind.”

I considered, not for the first time, that I must have a very smooth brain.

They not only catch mistakes, but are lightning fast. They have to be: by the time they get a finished dictionary, they usually only have a few weeks to do their work before the book is due at the printer’s, and the printer gets very cranky if we are late. When the defining work is done, everyone breathes a huge sigh of relief and we celebrate with doughnuts, but no one gives a thought to the tireless drudges who are still–quietly, cheerfully–making sure that we haven’t used “douche-canoe” in an entry without defining it. There is very little glory in lexicography, and where there is glory, definers and etymologists get it all. But Cross-Ref are the ones who actually deserve it.

So when you read a dictionary entry in the new Unabridged and have to look up another word in said book, raise a glass to the masterful editors of Cross-Reference, and be very glad that I am not one of them.

Filed under lexicography, making word sausage