The data is in; the news is good

Is ‘data’ singular or plural? In academic (particularly scientific) writing, it’s standard to see ‘these data are’. But otherwise, it’s increasingly common to see ‘this data is’. The question is whether we treat ‘data’ as meaning ‘distinct pieces of information’ or just ‘information’ (a grammatically singular mass noun).

Simon Rogers has a nice summary of the debate.

The academic usage follows tradition – and the word’s Latin origin, because of course ‘data’ is the Latin plural of ‘datum’. Case closed?

No: many linguists now believe that English is not Latin.

English has a lot of words taken directly from other languages, but that doesn’t mean that those words, as English words, need to follow the conventions of the original language. We lifted ‘schadenfreude’ and ‘dachshund’ from German, but we don’t give them the initial capitals that German nouns get.

For ‘data’, I think the best analogy is ‘news’. I watched the BBC’s fine adaptation of Henry IV Part 1 last night, and my ears pricked up when Jeremy Irons (in the title role) asked:

But wherefore do I tell these news to thee?

It turns out that when ‘news’ was first recorded, in the late 14th century, it was a plural – meaning ‘new things’. Over time, it came to be treated as singular, although as late as the 19th century there were still a few occurrences of ‘these news’. But plural ‘news’ sounds utterly alien to modern ears.

To most non-academics, plural ‘data’ isn’t there yet, but it does sound a bit odd and stuffy. It’s on its way out; in many areas of English, it’s already left.

So judge your audience: if you’re writing in the kind of niche where plural ‘data’ remains the norm, you’ll probably do best to match that style. But otherwise, more of your readers will find ‘these data are’ peculiarly technical than will find ‘this data is’ depressingly illiterate.

  • Lev  On July 8, 2012 at 12:51 pm

    I know of another similar case: the gerund of the verb “to mosaic”, as in “make a mosaic from several pictures”. Is it “mosaicing” or “mosaicking”?

    According to Googlefight, “mosaicing” is used twice as often. However, I’ve been told that academic articles use “mosaicking” more often. (I don’t have any evidence on that, though.)

  • pauldanon  On July 8, 2012 at 1:05 pm

    Hi. It’s not principally about singular/plural, nor is it about Latin. It’s about the difference between so-called count and non-count nouns. In “there’s a lot of water on the ground” the “water” is non-count. You can’t count it the way you can bottles and the custom for non-count nouns is that the verb is singular. In “the waters here are health-giving” the “waters” is count and, because it’s plural, so is the verb. In other words, English has a group of nouns which lack plural forms and whose verbs are singular. “Data” is one such noun.

    • londonstatto  On July 8, 2012 at 4:35 pm

      Agreed – the analogy I use is sand. You don’t have one sand, you have one grain of sand. And in English you don’t have one datum, you have one piece of data.

      • Tom  On July 8, 2012 at 5:45 pm

        I think we’re making the same point in different language.
        As I say, the divide is between treating the word as grammatically plural (although not logically plural – even in scientific papers you don’t see ‘these 17 data’) and treating it as a grammatically singular mass (i.e. non-count) noun like information or news or sand.

  • Mandy Collins  On July 9, 2012 at 1:22 pm

    What’s your view on ‘media’? Sometimes I want to use it as a plural (which it is, strictly speaking) but sometimes it’s singular too. Depends on the context, I suspect.

    • Tom  On July 10, 2012 at 10:10 pm

      I use plural. I’ve definitely seen singular used, but I think it’s a minority usage. At the moment. I guess it’s analogous with ‘the press’ – but I like to treat that as plural too.

      Although, thinking about it, in the plural ‘media’ is normally used to mean the outlets, such as the BBC and the NY Times etc, while in the singular, ‘medium’ is normally used to mean print or TV or radio etc as a whole. So there’s an asymmetry there. Possibly that’s significant, but I’m really quite tired…

