I had a first play with the Guardian APIs yesterday, and was pleasantly surprised by how simple and effective they were. I'm not one of those Ruby-on-Rails kids who mash Google up with the Facebook in between grinding a fat half-pipe on an ollie, but even I found them really great to work with.
And, being the kind of childish individual who still gets kicks out of looking rude words up in the dictionary, I decided to use the API to examine changes in obscenity in the last 10 years.
To do this, I took a group of common swear-words, and searched The Guardian archives for the number of articles containing each one for each of the last 10 years. Counting articles rather than individual occurrences gave me an idea of the prevalence of each word, without being overly affected by, say, the vulgar rantings of Mr Brooker. I then divided this number by the total number of articles available for that year - giving the percentage of Guardian content containing each profanity, and accounting for the fact that the total volume of digital content, and by extension UK liberal coarseness, has grown year on year.
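For anyone wanting to try something similar, here is a rough Python sketch of that calculation. It is not the code I actually ran: the endpoint URL, the `q`, `from-date`, `to-date` and `api-key` parameters and the `response.total` field are assumptions based on the Guardian's Content API search endpoint, and the function names are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; check the Guardian's API documentation before relying on it.
SEARCH_URL = "https://content.guardianapis.com/search"


def search_url(word, year, api_key):
    """Build a search URL for one calendar year.

    Passing word=None leaves out the query term, so the result count
    covers every article available for that year.
    """
    params = {
        "from-date": f"{year}-01-01",
        "to-date": f"{year}-12-31",
        "api-key": api_key,
    }
    if word is not None:
        params["q"] = word
    return f"{SEARCH_URL}?{urlencode(params)}"


def fetch_count(word, year, api_key):
    """Return the number of matching articles (assumes a response.total field)."""
    with urlopen(search_url(word, year, api_key)) as resp:
        return json.load(resp)["response"]["total"]


def percent_containing(word, year, count_articles):
    """Percentage of a year's articles that contain `word`.

    `count_articles(word, year)` is any callable returning an article count;
    it is called with word=None to get the total for the year.
    """
    total = count_articles(None, year)
    if total == 0:
        return 0.0
    return 100.0 * count_articles(word, year) / total
```

With a real key, `percent_containing("shit", 2005, lambda w, y: fetch_count(w, y, MY_KEY))` would give one point on the chart, where `MY_KEY` stands in for your own API key. Passing the counter in as a function keeps the arithmetic separate from the network call, so the sums can be checked without hammering the API.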
I present the results below. Good news! The y-axis runs from 0 to 0.9%; even a liberal rag like the Grauniad seems to be pretty resistant to the tidal wave of filth which some might have you believe is flooding the English language nowadays.
A few observations:
- Swearing is growing slowly year-on-year, across the board;
- Unusually, in 2001, swearing stayed more-or-less level. Bastard declined after 2001 - probably an after-effect of 9/11 - whilst most other swearing grew;
- Wank is massively underperforming over the last decade, whilst cock is flat;
- Shit has grown disproportionately and steadily since 2005, whilst fuck has gone as far as it can;