Unreading: Citations, Corpora, and Craziness

There comes a point in every career when you realize that your job has changed you irredeemably, for better or for worse.  That point came for me the day I bolted up mid-meal from the dinner table, turned the radio up, and began a scrabbling search for pen and paper. “What are you doing?” my daughter asked.

I shushed her. “Someone on the radio said ‘ho-bag.’ Mama’s taking a citation.”

At Merriam-Webster, every editor spends part of their workday reading a wide variety of print and online sources, looking for words that catch their eye. When you find a word that makes you stop short–could be a new word, could be an old one–you underline it and bracket the context. That chunk of language eventually ends up on a 3×5 index card which is filed alphabetically with others of its kind. BOOM: you have just created a citation.

Allow me to indulge in some lexicographical wiener-measuring for a moment: our citation files are enormous.  The files are split between the old-school paper citations, which go back to the 1800s, and the new citation database. The paper files take up about 40% of the editorial floor and are tended by an  Editorial Librarian who, until she retired, spent almost 30 years on a kick-stool in front of the catalogs, filing away citations, moving citations from one drawer to another, relabeling catalog drawers. Between the two citation sets, we’ve collected well over 100,000,000 indexed words.  Well, so what? Here’s what: each of those 100,000,000+ words in our corpus was read and collected by a living, breathing editor.

I know what you’re thinking, because it was the first thing I thought when I heard about reading and marking: HELL YES SIGN ME UP FOR THAT.  Get paid to read? Not edit, not revise–just read?? Let me give you a couple of well-intentioned warnings (Happy Holidays!).

When I began reading and marking, I would begin reading an article and get halfway through it before realizing that I hadn’t marked a thing. I had made the classic rookie mistake of engaging with the content. If you’re on the hunt for interesting vocabulary–and particularly if you’re reading something that piques your interest–you need to intentionally miss the forest for the trees.  You must focus only on the language used without caring at all about the point made with that language.  But you can’t just skim. No, you need to be able to read closely enough to catch a subtle grammatical or lexical shift in a word, but not so closely that you forget your primary objective (MAKE CITATIONS). It’s not reading, and it’s not not-reading. It’s unreading.

You also don’t always get to read things that you would, well, want to read.  I would rather shove a dull grease pencil into my eye than read political rhetoric. Yet for 6 years, I read and marked both The Nation and National Review (balance!). Everyone wants to read the fun stuff, but someone’s got to read Today’s Chemist at Work or the latest batch of minutes from the UN Human Rights Task Force, and Noah Webster help you if the editor in charge of reading and marking happens to pass your desk and see that your pile is lacking or manageable.

There’s a more insidious and lasting difficulty that comes from reading and marking, though: once you do it, you can’t stop. (This also applies to proofreading, defining, copyediting, and eating Funyuns.) I attempted to re-read Dracula recently and was pulled up short on page 1, when Stoker used “thirsty” in a way I wasn’t familiar with. “Hmm,” I thought, “do we have this in the files?” I had to–was compelled to–log on to our system and search the corpus until I was sure that we had, in fact, indexed this use of “thirsty.” By then, I had forgotten about Bram and his fear of female sexuality and was mindlessly, reflexively searching for other things to mark.

Here’s a short list of unusual things that I’ve seen marked for the cit files:

  • menus
  • TV dinner cartons
  • beer bottles
  • diaper boxes
  • napkins
  • photos of store signs (seriously, though, how could I not? “Route 66 Dinor”! That’s an amazing and very specific regional spelling of “diner,” you guys!)
  • orchestra/ballet/roller derby programs
  • VHS and DVD covers
  • the Yellow Pages

Allow me to reiterate my point: relating to the printed word like this is not normal. This is not something you should aspire to.

When you spend your day taking words apart and describing their every movement in painful, meticulous detail, you develop a very strange relationship to them. I imagine it’s not much different than being a doctor: an attractive person comes into your office and takes off all their clothes, and you stare at the sphygmomanometer. (That’s a piece of medical equipment, by the way, not the name of an anatomical structure.) Spending eight hours a day in relative isolation with only disembodied, stripped-down words will change you, and not always for the better.

Case in point: William Chester Minor, the famous madman of Simon Winchester’s book The Professor and the Madman. If you’ve read the book, you know that Minor was integral to the production of the Oxford English Dictionary and batshit crazy to boot. But here’s a little tidbit Winchester’s book omits: two years before the battlefield event that precipitated his nervous breakdown, William Minor was a lexicographer at the G. & C. Merriam Co. His name is in the 1864 edition of The American Dictionary of the English Language.  Joshua Kendall, in his excellent Nation article on Minor and the 1864, notes that “mental instability would also plague numerous nineteenth-century American lexicographers, including [Noah] Webster and his sole assistant on the 1828 dictionary, James Gates Percival, as well as Webster’s successor as editor, his son-in-law Chauncey Goodrich.”

What a track record! What a profession! WTF!

There is a little hope, however. Just because you aren’t reading exciting things doesn’t mean you won’t stumble on something that another editor marked that redeems the day. For the last edition of the Collegiate Dictionary, I worked on “heavy.” It was…heavy.  Lots of the citations dealt with dreariness, death, and gloom: I read dozens and dozens of citations for everything from “heavy injuries” to “heavy depression” to “heavy rain.”  And then I flipped over the next citation and read this:

“When it comes to heavy studio-craft, Bon Jovi is no Def Leppard.”

The juxtaposition–the sheer inanity of that statement when read immediately after “troops sustained heavy casualties”–was just the sort of bad-taste-in-a-loud-tie palate cleanser I needed to keep going. I finished the batch and didn’t require a long rest in a mental hospital afterwards. Thanks, editor who had to mark Rolling Stone that year.


Filed under famous lexicographers, lexicography, making word sausage

20 responses to “Unreading: Citations, Corpora, and Craziness”

  1. How could The Professor and the Madman have omitted that? It seems like a rather important and relevant detail. Also, I had no idea that so many lexicographers went nuts. Who knew lexicography had such a dark side?

    • I do formal logic, and lots of logicians and mathematicians who work on foundational set theory, mathematical logic, and so on all had various mental health problems. I’ve been doing both lexicography (for bilingual translation) and work in that same area of logic for years, for fun, on my own. The net social result of all of it is that I feel more like an animal documentarian than a participant in conversations at parties.

      I was also a massage therapist years ago, and yes, learning anatomy does temporarily kill one’s libido. I was looking at a naked partner then, but all I could think of was how she would look if all of her skin were torn clean off. I learned years later in university that Buddhist monks practice a meditation of exactly that sort to curb their sexual desires (they imagine sexually appealing people and strip them to the bone in their minds).

  2. I wonder what you think of James Murray’s “Directions to Readers for the Dictionary” as a guide for modern lexicographers: are they still essentially the right advice? Here are the relevant points:

    5. Make a quotation for every word that strikes you as rare, obsolete, old-fashioned, new, peculiar, or used in a peculiar way.

    6. Take special note of passages which show or imply that a word is either new and tentative, or needing explanations as obsolete or archaic, and which thus help to fix the date of its introduction or disuse.

    7. Make as many quotations as you can for ordinary words, especially when they are used significantly, and tend by the context to explain or suggest their own meaning. […]

    9. It is not necessary to quote a full sentence; but the quotation must be sufficient to show the meaning, or use, and to make connected sense.

    I once had a doctor who asked me to take off all my clothes before examining me. It made me so nervous I could hardly answer any of his questions, though I am not a physically modest person (nude is one thing, naked and helpless in front of a stranger, quite another). I never went back to him.

    • korystamper

      I generally agree with Murray, though I take issue with point 9. I think that a few sentences of context help provide a variety of meaning types that are important for the lexicographer’s work: denotative, connotative, tonal, register, collocational. As a definer, I’d much rather have a longer citation to draw information from than read a short one and feel like I have to hunt down the original source in order to make sure I’m reading it correctly. Standard disclaimers apply; if any other lexicographers want to chime in, I certainly would love to hear their opinions.

      There’s another important thing to keep in mind: Murray was writing for readers, not trained lexicographers. I know that the longer I do defining and the more I delve into the vagaries of English grammar, the more likely I am to notice an odd adverbial use of a conjunction, or pinpoint a true adjectival use of what was formerly an attributive noun. That said, Murray’s instructions really are top-notch, and his program for collecting citations for the OED was brilliant: it made the OED a manageable project. Well, a slightly more manageable project.

  3. Are you saying that a writer of anything can give new meaning to a word just by using it in a certain context?

    • korystamper

      Sure, a writer can do whatever he or she wants. Hey, I love it when writers give words new meanings! Keeps me in a paycheck.

      Dictionaries are built on lexical critical mass, so a writer’s pet coinage will not make it into the dictionary if other writers don’t use it. But perhaps I’ll go over the criteria for entry in another blog post….

      • I use words in new contexts all the time in my books. I suppose I should have said an official new meaning. But you answered my question. Thank you. And yes, a post on the criteria would be interesting reading.

  4. Marc Leavitt

    I admit to doing the same thing. As a retired editor, I can’t read without editing. It’s a “Stop me before I kill again” compulsion. I just shot off two emails to the NY Times to correct mistakes in copy. I’m afraid it’s an occupational hazard.

  5. The French have a word for it: la déformation professionnelle.

  6. Hullo! I teach English abroad and am fascinated by the way language changes. Any of the print materials you mention could also serve to show English in context–diaper boxes, anyone?

    I’m back in the States for Christmas, and all of the new usages I’ve been hearing pop out like neon signs in the dark. For example, I overheard someone at the airport say, “Why not? I ain’t got shit else to do,” which was the first time I’d heard “shit else” swapped for “anything.”

    I came across your blog this morning and look forward to reading more. I vote to hear about criteria for entry in a future post!

  7. That’s the best (well, possibly the only, but still) account of what it’s like to read for citations I’ve ever read. I once read an anthology for citations, and was brought up short a few months later when I read two different works (an interview with one author and a critical appreciation of another) that referred to stories in the anthology, and I realized that not only had I entirely missed the points of the stories, but barely had any idea what they were actually about.

  8. PJ

    As a Civil War surgeon, Minor would certainly have been frequently exposed to mercury “medicines.” I wonder if there was mercury in the inks that Webster, Percival, and Goodrich used. Maybe they did their marking with red ink made from cinnabar.

    • korystamper

      I’m sure Minor had other issues and that lexicography isn’t the thing that sent him off the cliff’s edge into insanity. But after almost 14 years of doing this, I will say this: I wouldn’t be surprised if lexicography was the gentle sloping road that ran along that edge.

  9. Pingback: Weekly favorites (Dec 26-Jan 1) | Adventures in Freelance Translation

  10. Dan Dalton

    I’m curious about a bit of pronunciation. You refer to “the cit files,” and it brought me up short. When you say this aloud (or in your internal monologue, I suppose, silent as your work is), does it rhyme with “bit” or “bite”?

    • korystamper

      “Cit” rhymes with “bite”; it’s short for “citation” and retains the pronunciation of the first syllable of “citation.” I’m fine saying it, but typing it pulls me up short. In fact, any time I type a homophone of “cit,” I stop and re-read what I’ve written very carefully to make sure I’ve used the right one.

      • johnwcowan

        Using cit file instead of cite file reminds me of the modern habit among audio people of writing mic instead of mike as short for microphone. That’s bad enough, but when it comes to miced and micing, I really jib at it: what was so wrong with mike, miked, miking, whose pronunciations are transparent?

        Signed, Disgusted in NYC

        • korystamper

          Dear Disgusted:

          Thanks for your comment. While I understand your concern, “cit” is a longstanding abbreviation of the noun “citation,” whereas “cite” is a verb. I cannot speak for “mic” as God knows what those people were thinking. They were probably all high or something.

          Kory Stamper, Voice of Authority

  11. johnwcowan

    Don’t know about you, but I at least also use cite as a noun (by analogy with quote, no doubt), so add that to your cite file and smoke it.

    Seriously, I grew up in what is called “a lawin’ family” (in To Kill A Mockingbird), and I found lots of usage by googling for [“any cites” -animal -import -export -endangered -species] (to eliminate references to CITES). This one is particularly interesting, as it is an abstract of a journal article: “Please note that for non-indexed journals like IJP&PT the Web of Science® only includes cites from JCR journals, and thus no self-cites nor any cites from other non-indexed journals.” The OED also has four cites for the word, though I think the first is just a typo (it is “cite. omitted” which should be “cit. omitted”, pronounced of course “cite omitted”).

  12. Hank Gillette

    Would it have killed you to include Stoker’s use of the word “thirsty” that you referenced? I had to stop reading your post and look it up (thank you, Project Gutenberg).

    For the curious, the usage was:

    “I had for dinner, or rather supper, a chicken done up some way with red pepper, which was very good but thirsty.”
