Assembling the Treasury, Wordhoard, Synonymicon, Thesaurus

All lexicographers, regardless of where on the prescriptivist/descriptivist spectrum they fall, like to tell you they are totally objective when writing their dictionaries. They get worked up into a veritable froth if you suggest otherwise, maybe even raising their voices to conversational levels and daring to make eye contact when they tell you that you are utterly wrong. Lexicography’s underlying tenet is complete objectivity! Get thee behind me, John Dryden!

Notice how they conveniently fail to talk about thesauruses when objectivity comes up.

Unlike dictionaries, there is no one approach to compiling a thesaurus, no Unified Theory of Synonyms. The main goal that all of them have is to present an entry word and a group of words related to that entry word, but how those words are specifically related to the entry–and how they are presented–is varied, to say the least.

I grew up using a Roget’s Thesaurus (and I use the indefinite article advisedly, as “Roget’s” is not a trademarked name in my part of the world). Like other dorks of my genus, I spent many a Sunday afternoon sprawled out on the couch, paging through a reference book. The dictionary and encyclopedia were hauled out any old time, but the Roget’s was reserved for dim, snow-muffled days, when it was too cold to go sledding and I was feeling as pensive and thoughtful as a nine-year-old can possibly feel. Roget’s had an elevating effect on me, and I’d be so moved by its profundity that I’d read it aloud to our dog. “Section one,” I’d intone solemnly to Buffy, our crabby Airedale, whose spot on the couch I was bogarting. “Existence. Being, subsistence, entity, essence. Ens. ” She’d huff and I’d sigh, and we’d stare out the window at the whiteout, feeling deeply for a few seconds about dog treats and life, respectively.

Roget’s is brim-full of existential gravitas because of how it was compiled. In the early 1800s, one Peter Mark Roget thought that a collection of words arranged by semantically related clusters within larger, epistemological categories would be a useful tool for the discerning scholar, and fifty years later, Roget’s Thesaurus was released to the public. Roget’s focused–and continues to focus, under a slew of different names and publishers–on grouping terms within larger semantic ideas and divisions. “Existence,” the first subcategory, includes nouns, verbs, adjectives, adverbs, formal philosophical terms, slang, and a wide variety of words that have something to do with the idea of existence, including “fact,” “sloth,” and–so cheery!– “grim reality.”

A brilliant system, though perhaps a little too much for the average teenager looking for another synonym of “desire” to use in this frickin’ essay on Wuthering Heights. But without it, teachers would no longer be regaled with fabulously inane and overly thesaurized essay sentences like, “Heathcliff dementated because of his propensity for Catherine,” or “Linton was Heathcliff’s foeman in wanting to pay one’s court to Catherine.” Dr. Peter Mark Roget, we salute you.

Dr. Roget’s categorization system isn’t to everyone’s taste, however. Discerning gentleman-scholars may have had the time to take the measure of each of the given synonyms, but the college student running on caffeine and youth, scrambling for an impressive synonym of “admonition” at 4:00am, may not. Some folks prefer a dictionary-like presentation of terms, and Roget’s is intentionally not dictionary-like. Enter the competitors (Merriam-Webster) and the losers who get to try to one-up Dr. Roget (me).

Where Roget’s revels in its epistemological abstractness, the thesaurus I was tapped to write was going to revel in its solidity. Unlike Roget’s, M-W thesauri deal entirely in listing related groups of lexical synonyms and antonyms. Instead of rambling chunks of loosely related words and an index that was half the size of the book, we’d present a list of words that mean exactly the same thing as the headword. Words that were close would be called “related words” or “near synonyms,” and near synonyms would be grouped together by similarity of meaning. Same deal with antonyms. Very tidy.

And because we like tidy things, two groups of words that are commonly perceived as synonyms and antonyms would not be entered, because they were not lexically tidy: members of a genus and complementary pairs. This meant no “sofa” and “furniture,” nor any “black” and “white.” “Sofa” is not a lexical synonym of “furniture” because the word “sofa” does not mean “furniture.” Rather, a sofa is a type of furniture–it’s a member of a genus. And “black” is not the lexical antonym of “white.” If you look up “black” in the dictionary, its definition isn’t “not white.” “Black” and “white” are a complementary pair, like “knife” and “fork,” and “lexicographer” and “boring.” Not lexical, not eligible for entry. I nodded: yes, this is what we are good at, the lexical thing. This will be easy.

And it wasn’t.

Before you can find synonyms, you need to figure out what the meaning core of your headword will be. You must begin with a meaning that is broad enough to encompass most of the synonyms a person will want, but narrow enough that there’s some significant difference between it and another headword.

That seems like common sense until you begin writing and realize that you may need to have a bunch of synonyms in mind before you start actually looking for that bunch of synonyms. I worked backwards: I’d doodle out a list of possible synonyms for “general” and then begin looking for the common meaning they shared. When drafting that meaning, I also had to learn to avoid a common device used in dictionary defining: the synonymous cross-reference. Single-word cross-references in dictionary definitions are synonyms, and why waste a synonym in a meaning core when you can put it in the synonym list? (Because it is easy and I am lazy, that’s why.)

Once the meaning core is in place, you begin the hunt. The first M-W thesaurus was compiled, yes, by hand, with editors flipping through the Third and trying to keep track of all the possible synonyms for “love.” It was an overwhelming task, one sure to induce some strong hallucinations and psychotic breaks, and perhaps that explains why “chatty” was not listed as a synonym of “glib” in the first edition of the Collegiate Thesaurus but “well-hung” was. I had it easier, but even with a computer and a searchable dictionary database, finding and ordering synonyms and near synonyms was tricky. My nature was working against me: I am a splitter–a definer who likes detailing every possible denotative nook and connotative cranny of a word’s meaning–and so perhaps not the best person in the world to write a thesaurus. ‘Togs,’ I reasoned, means ‘clothing’, but it also refers to clothing worn for a specific purpose. Is that enough lexical synonymy to include ‘togs’ as a synonym? Or is it a near synonym? A vacuum whirred downstairs. It was 6:00pm, and I was going to be locked in the building overnight with nothing to eat and a bunch of boring, pedantic ghosts if I didn’t leave pronto. Synonym it is.

I began to rearrange lists of words by register, then by connotation, then for no other reason than they looked right next to each other. “Gear” and “rig” seemed to fit together–they are more technical words, referring to specific types of clothes used in particular activities, like mountaineering. And “costume” and “garb” sat well next to each other–they refer to dress-up, fanciful, special-occasion clothes. But those two groups are distinct: “costume” and “rig” didn’t work together, and that seemed right to me. It’s just like assembling a puzzle, I reassured myself. A puzzle with blank pieces you color in as you place them, and in the end you hope you have come up with a convincing representation of the Mona Lisa.

This sort of derangement is both de facto and de rigueur in the lexicography biz, but I was unprepared for one thing: that what seems right to me may not be what seems right to other editors. My thesaurus batches were returned; my carefully constructed near synonym groups had been scattered and re-formed. “Rig,” “outfit,” and “costume” ended up together, with “gear” left out. Other near synonyms were dropped; some were promoted to true synonyms. I was so thrown by this that it took me a while to notice the crowning glory of the revision: my managing editor added two true synonyms I had, in all my shuffling, missed: “clothes” (with the comment “!!!”) and “habiliment(s).” “Clothes” is exactly the sort of obvious synonym it’s easy to overlook when you are slogging through a dictionary, trying to find every last possible synonym or antonym of a word. I had fallen prey to Well-Hung Syndrome. As for “habiliments,” I had never seen the word before in my life. Assuming the Drudge’s Hunch, I stared at my reconfigured entry. I rubbed my face, and then rubbed it some more, until it began to look like a flatiron steak. The revisions made me feel a bit dumb and defensive.

Defensive, yes, because isn’t lexicography coolly objective? And isn’t my very objective read of “gear” and “outfit” perfectly fine the way it is, since I’m totally and completely objective? But if we’re both objective and we disagree, then how objective are we being? One of us, I tutted to myself, was not being objective.

In truth, neither of us was being 100% objective. The very nature of grouping, ranking, and sorting near synonyms means that a certain amount of subjectivity will inevitably be a part of the process. I stuff my word sausage differently than the managing editor stuffs his.

Nonetheless, my turd-stirring nature won out. I padded over to the managing editor’s cubicle and interrupted his reading and marking with my concerns regarding objectivity. He listened patiently to my concerns about the order of near synonyms, but when I brought up “habiliments,” his brows beetled. “Kory,” he sighed, “that sort of catch is exactly why we do this as a group. ‘Habiliments’ is a synonym for ‘clothing,’ even if you don’t know the word. And if you really think that ‘outfit’ doesn’t belong with ‘gear’ and ‘costume,’ then write your reasons down on a pink and I’ll consider it.”


He shook his newspaper in irritation. “Last I checked, my title was not ‘Dictator.’ This is a group effort.”

Lexicographers get defensive about objectivity because we know that, no matter how much training we have, we cannot be truly objective because we experience language subjectively. (We are, contrary to popular belief, fully human and not at all robots.) Sometimes our own personal experience with the language is invaluable: that subjective sprachgefühl helps guide a lexicographer when defining, when editing, when rubbing a jumble of synonyms between your hands to discover their relative heft and shape.

Sprachgefühl isn’t just weighed against written evidence–it is put against other sprachgefühlen. Every citation we take, every rewording of a definition, every example sentence penned is a subjective use of language. But when considered together, subjectivity fades into a picture of patterned, communal–and objective–use. Language is a human group effort, and so should lexicography be.



Filed under lexicography, making word sausage, thesaurizing

18 responses to “Assembling the Treasury, Wordhoard, Synonymicon, Thesaurus”

  1. So which is more challenging (if such a thing can be measured): editing the dictionary, or the thesaurus? Or the Usage Guide? 😉

  2. Attempting to distinguish between “synonyms” and “near-synonyms” is what’s not objective. Except in technical matters, where “Creutzfeldt-Jakob disease” is exactly the same as “Jakob-Creutzfeldt disease”, there are no synonyms, only near-synonyms. No two words have exactly the same field of use, even if we stick to prose uses. The moment your organization attempted to draw that distinction, they fell into a pit and were eaten by a grue.

    I’m not even sure it makes sense to say that A and B are “more synonymous” than B and C are. Would anyone care to offer a numerical estimate of that?

  3. Mr. Cowan,

    Every broad generalization is incorrect, including this one. There are plenty of examples of (especially noun) synonyms. Some things are just known by different names to different people. For example, you could build a long list of scatological synonyms that don’t each refer to specific sizes, shapes, colors, or odors; they’re more or less interchangeable, and therefore actual synonyms. Or the simple synonymous pair nursing/breastfeeding.

    I get where you’re coming from, though. Smiles, grins, and smirks are all different facial expressions. I just don’t think it’s as absolute as you imply.

    But I don’t think people use thesauri to find a word that is an exact synonym of this other word. I sure don’t. I go to the thesaurus when I need a word that sort of means this other word but is slightly more specific, or descriptive, or compelling. In other words, I use a thesaurus to find the RIGHT word to replace the almost-right word I already have.

    • Andy:

      Oh, I agree that near-synonyms belong in thesauruses. I just don’t think there’s any point sweating over what is a synonym and what is a near-synonym.

      In particular, I consider register to be part of a word’s meaning, because it partially specifies what the effect of using the word is on the listeners. You can use feces in some contexts where you can’t use its four-letter synonym, but to say “It’s just a crock of feces” is if anything more objectionable than the alternative, because it’s too graphic. Similarly, some nouns on this particular synonym list can be either mass nouns or count nouns, but log and grogan are invariably count nouns.

      As for nursing and breastfeeding, they are only synonyms over part of their semantic range. You can go to nursing school for four years, but you only go to breastfeeding school (at La Leche League, e.g.) for a few weeks.

  4. Charming Charlie

    Brilliant. I like to know how the word sausage is made, and I like the pretty words you use to explain how. I’m only disappointed that nowhere in here is a dinosaur pun. THESAURUS ROAR.

    “Thesaurus” has no entry in m-w online’s thesaurus. Neither does “meta!”

  5. I don’t have a print dictionary, but I do have a thesaurus. Mine is the Oxford Thesaurus, which is nice and simple: much like a dictionary but with a list of synonyms instead of a definition under each headword (occasionally some antonyms thrown in as an afterthought). I turn to it in those situations where I can’t think of the word I want, but all I know is that it’s in the same ballpark as something else. I don’t see the point in making thesauruses any more complicated than that.

    Regarding “near synonyms”, I have to say I don’t like the term. I think it’s far better to say that near synonyms are literally synonyms. If two words mean exactly the same thing (putting aside whether that’s a mythical notion), then they are “perfect synonyms”, which I take to be a subclass of synonyms in general.

    Incidentally, one topic you might consider at some point is an overview of the reasons people use dictionaries, and whether some of these more than others define the target audience. For example: (a) To discover the meaning of a word you don’t know. (b) To check the spelling of a word. (c) To see how good the lexicographer is. (d) To double-check that the word you’re about to use is actually the one you want and doesn’t have connotations you’re not aware of. For my part, when I turn to a dictionary it’s usually (d).

    • There’s also an (e) that might be a sub-(a) reason: Word origin.

      Recently I was talking with someone about words/terms that are used in Colorado but are unknown in Maine (and the geographical reverse), and “passel” leaped to mind. The person I was talking with asked where it came from, and at the time I could only scratch my head and say, “Dunno. I’m guessing corrupted French? Cowboys and Scotsmen seem to have had a thing for corrupting French.” When I got home I checked it out with M-W online (for those wondering, it’s corrupted “parcel”).

      • Not corruption but something called “early loss of /r/”. Certain words lost their /r/ sounds after vowels even before the general loss of such sounds in the English of England. Examples are bass (the fish, which is not pronounced with an /r/ anywhere), passel, bust (from burst), cuss (from curse), the obscene ass (from arse). The /r/ sound returned in some words: we no longer say hoss, gal, skasely. In the process, parsnip picked up an /r/ that didn’t originally belong there.

  6. korystamper

    My blog readers are so smart. I can always be assured that the comments left here will make me feel both proud and slightly intimidated.

    Regarding near and true synonyms and the differentiation thereof, it is a bit of a pragmatic stretch. Even us linguistic-free-love descriptivists understand that “clothes” and “clothing” have slightly different meanings if you consider connotation, register, and frequency to be part of their meaning. “Clothes” and “clothing” are mostly synonymous. But practically speaking, if you are going to move away from Roget’s vaguely Leibnizian system of organization, you have to draw lines somewhere, and straight-up lexical synonymy based primarily on denotative meaning is as good a place to draw that line as any. In truth, it’s a very Govian approach, so maybe I’ll appeal to history as our excuse.

    As for Adrian’s question about why people use dictionaries, my own experience tells me that it’s mostly a) and b), with some d) thrown in for good measure. The only people who care about c) are other lexicographers, language scholars, and/or nutbars. Not that there’s anything wrong with nutbars. Some of my best friends are nutbars!

    One more shocking admission: though I have a number of them at home, and have great lust in my heart for the Oxford Historical Thesaurus, I don’t use thesauruses. Not even the one I wrote.

  7. Till

    An off-topic question here from Germany.
    Being a reader of your blog for some time now and assuming you do not allow typos, I stumbled across the word “sprachgefühlen” in the last paragraph. (Quote “… is put against other sprachgefühlen”)

    Why did you use the dative case of “Sprachgefühl” in this context?
    Was it your “sprachgefühl” that made you apply the American grammar as in “against whom” to the German word, or did you stick to rules – maybe even your own?
    With that goes the second question: Is “against whom” a clear dative case construction (still?). I know you’re not a grammarian, but I assume you know the answer anyway.

    • Till: I think it was intended as a plural: anglophones often get German plurals mixed up. I assume the correct plural would be Sprachgefühle, but it isn’t normally used in English — I at least would say “against the Sprachgefühl of other people.”

      • As for datives, the English dative pronouns us, you, him, her, them, whom displaced their accusative counterparts long ago, but they are not especially dative today; by contrast, me and it descend ultimately from accusative forms.

      • Till

        …and that would *sound* correct for me.
        I was just wondering if there was a rule governing the case adaption of imbedded foreign words.

        • korystamper

          John and Till: It was intended as plural, though a facetious one. I wrote the post at grumble-grumble a.m. and didn’t have enough brainpower at that point to remember and use the correct German plural and then find a clever way of explaining that I was using the German plural. You will often hear people pretend at German by adding “sch” to the beginning of words and “-en” to the end, so I went with that.

          The moral of this story: don’t use the big words when writing so late/early.

          John: zero plural didn’t even occur to me! But it’s right: the English “sprachgefühl” is a noncount noun.

  8. Luiz Benevides

    Oddly enough, I’m shifting from M-W to your blog then back to M-W and then to Malinowski’s classic “Argonauts of the Western Pacific”. Your straightforward subjectivity statements look a great deal like the writing of modern anthropology’s founding father. Fortunately you look better yourself. Keep on gracefully paddling those still waters.

