By the time I arrived at COBUILD, as part of the 1993 intake hired to work on the second edition of the dictionary, the whole project had been fully computerized for many years.

This meant working on screen at terminals connected to a mainframe computer that hummed away in a separate room, still with green text on a black background, as described by Andrew Delahanty in Part 1. The mainframes were named after Shakespearean characters (Titania was one) and would sometimes overheat and need time to recover, giving us an afternoon off.

There was a pleasant contrast between the high-tech, sophisticated nature of the project and the elegant Victorian building where we worked, its large sash windows overlooking a beautiful garden where we sometimes ate our lunch in the summer.

It was also a great venue for seminars and parties, both of which would bring in members of the English department of the University of Birmingham, to which COBUILD was attached, and of the wider university.

Compiling on screen using a purpose-built text editor required acquiring a new set of skills, as I had previously worked only on paper, but the thing that really blew my mind was the corpus. Until then I had only seen concordance output on paper: on my previous project we were able to request a printed sample of lines for particularly difficult entries. Working directly with the corpus was a revelation.

I was almost paralyzed for several weeks, overwhelmed by the amount and quality of data I was expected to process. The corpus, soon to be rebranded as The Bank of English, was small by today’s standards, but the insights it provided into the behavior of English were like nothing I had ever seen before.

At COBUILD we worked with the corpus in a way that was different from the way I have used it elsewhere.

Using specially developed software, we lexicographers (and grammarians) would analyze the evidence for the word we were compiling. We would then base our changes to existing entries from the first edition on that evidence, along with all the new entries and senses we were adding.

We were a large team and there was always a colleague available to discuss problematic entries or difficult decisions about splitting senses, but the evidence provided by the corpus was the basis of all the work we did. I don’t think we ever looked at any other learner’s dictionary. It sounds very arrogant, but we had no need to; everything we needed was in front of us.

I have worked on many corpus-based dictionaries and other projects since then, and I rarely work on a dictionary that does not use corpus evidence to some extent.

A corpus is always my first port of call when I encounter a new word or meaning. However, I think the COBUILD dictionary remains unique in being so directly and entirely based on what only a corpus can give: evidence of how language really works.
