The Wall Street Journal published a blog post in which it decided to treat data as a singular noun which, according to the rules of subject-verb agreement, takes a singular verb, much like information.
For the WSJ this is good English:
the data is collected
However, many traditionalists contend that data is in fact the Latin plural of the singular, datum, and therefore we should be saying:
the data are collected
Although this may be the case, data is no longer a Latin word (or rather, it is now an English word of Latin origin), so it has to conform to English grammar rules. But there is no agreement on how it should be used.
Checking the n-grams reveals a small but significant difference between American English and British English. The first graph shows how data is used in American English:
This next graph shows how it is used in British English:
It would appear, then, that whilst more people tend to use data as plural, the gap between singular and plural usage is much smaller in American English than in British English. For both, however, it must be said that over the past 25 years or so, while the number of people using data as singular has stayed roughly the same, the number using data as plural has been falling. Will the day soon come when "data is" becomes more popular than "data are"?
And what does this mean in the classroom? Quite simply, in class you can class it as either singular or plural – as long as you are consistent!