I had a first play with the Guardian APIs yesterday, and was pleasantly surprised by how simple and effective they were. I'm not one of those Ruby-on-Rails kids who mashes Google up with the Facebook in between grinding a fat half-pipe on an ollie, yet even I found them really great to work with.
And, being the kind of childish individual who still gets kicks out of looking rude words up in the dictionary, I decided to use the API to examine changes in obscenity in the last 10 years.
To do this, I took a group of common swear-words, and searched The Guardian archives for the number of articles containing each one for each of the last 10 years. This gave me an idea of the prevalence of each word, without being overly affected by, say, the vulgar rantings of Mr Brooker. I then divided this number by the total number of articles available for that year - giving the percentage of Guardian content containing each profanity, and accounting for the fact that the total volume of digital content, and by extension UK liberal coarseness, has grown year on year.
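For the curious, the sum itself is roughly the following. This is only a sketch in Python, not the script I actually ran: the function name `prevalence_by_year` is made up for illustration, it assumes the per-year article counts have already been fetched from the API, and the numbers in the example are invented placeholders rather than real Guardian figures.

```python
def prevalence_by_year(matches, totals):
    """Return {year: percentage of that year's articles containing the word}.

    matches: {year: number of articles containing the word}
    totals:  {year: total number of articles available for that year}
    """
    result = {}
    for year, total in totals.items():
        if total == 0:
            continue  # no articles for that year, so no percentage to report
        # A year with no matching articles counts as zero, not as missing.
        result[year] = 100.0 * matches.get(year, 0) / total
    return result


# Invented placeholder counts, for illustration only.
example_matches = {1999: 30, 2000: 45}
example_totals = {1999: 10000, 2000: 12000}
example_result = prevalence_by_year(example_matches, example_totals)
print(example_result)
```

Counting articles rather than individual occurrences is what stops one especially sweary column from skewing a whole year.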
I present the results below. Good news! The y-axis runs from 0 to 0.9%; even a liberal rag like the Grauniad seems to be pretty resistant to the tidal wave of filth which some might have you believe is flooding the English language nowadays.
A few observations:
- Swearing is growing slowly year-on-year, across the board;
- Unusually, in 2001, swearing stayed more-or-less level. Bastard declined after 2001 - probably an after-effect of 9/11, after which most other swearing grew;
- Wank is massively underperforming over the last decade, whilst cock is flat;
- Shit has grown disproportionately and steadily since 2005, whilst fuck has gone as far as it can.
genius!
Posted by: James Tindall | April 02, 2009 at 03:41 PM
You forgot cockweasel: http://browse.guardian.co.uk/search?search=cockweasel
Posted by: Simon W | April 02, 2009 at 04:37 PM
Arse. And balls!
Posted by: Jay | April 02, 2009 at 07:39 PM
My friend Steve has been doing something interesting with this too: http://guardiantrends.appspot.com/
Posted by: Rob Bradford | April 02, 2009 at 09:10 PM
As the editor who once made the mistake of putting 'Fuck Cilla Black' on the cover of G2 I am very glad to see that suggestions that we are a bunch of potty-mouths are completely unfounded...genius graph
Posted by: ian katz | April 02, 2009 at 09:44 PM
You should have used twat too. Nonetheless, this is hilarious.
Posted by: paul | April 02, 2009 at 09:45 PM
genius. Don't get back to work.
Mark
Posted by: Mark | April 02, 2009 at 10:01 PM
Bastard declined after 2001 - probably an after-effect of 9/11
You're kidding, right?
Posted by: Rob | April 02, 2009 at 11:23 PM
excellent
Posted by: duncan | April 03, 2009 at 11:12 AM
Back in the 1950s I was threatened with the sack as sports editor of a local newspaper for using the word 'bloody' in a quote from Jimmy Greaves. Times have changed for the better, I swear. Norman Giller, dyslexic dol frat
Posted by: Norman Giller | April 03, 2009 at 11:41 AM
Tom, have you considered running a search for misspelled words? I can imagine it'd be several degrees harder to execute than this example, but could be very useful in the context of the recent debate over subediting positions at newspapers.
I applied for the API with this in mind, but (unsurprisingly?) haven't heard back from them...
Posted by: Conrad Quilty-Harper | April 03, 2009 at 11:56 AM
I got a comment deleted once for saying the Hungarian equivalent of wanker.
Posted by: Pestinpest | April 03, 2009 at 02:38 PM
...can we have a Y axis please?.... : -)
Posted by: emily bell | April 03, 2009 at 02:56 PM
Emily: Y runs from 0 to 0.9%...
Conrad: yeah I was wondering about that, but trying to think about how to count misspellings. There's a few common ones ("teh") which you could go for, but I reckon a spellchecker would require some training to get to grips with all sorts of names, language, etc...
Posted by: Tom Hume | April 03, 2009 at 04:32 PM
"Wank is massively underperforming over the last decade"
Nah, it's just not used very often outside of Derek & Clive.
Clearly, you should have looked for wankers instead-- I'm sure there are plenty more of them these days.
Posted by: luther blissett | April 04, 2009 at 07:05 AM
If you replaced the offensive terms with
Banker
Politician
Pension
Expenses
Justice and
Scrutiny
you'd get similar results
Posted by: Ron MOULE | April 04, 2009 at 01:58 PM
in south africa before and after the war "skaap" (sheep) was a fighting word and "bloody" got you kicked out of society. then along came american negro influence and motherfucker now is as common as blimey. fuck as you say has peaked - it's time we went (back) to blasphemy and here we can be inspired by the cultures that never let it go - i.e. the Greek gamo tin panagia sou - I fuck your virgin Mary. tasty.
Posted by: justme | April 04, 2009 at 09:09 PM
Sadly the utilisation of arse is showing decline....
Posted by: Matt Gibson | April 05, 2009 at 09:52 AM
You divided by the number of articles. But perhaps you should divide by the number of words. The trend could be a function of the word count increasing, or could be larger than is shown if the word count has been decreasing.
Posted by: mankoff | April 06, 2009 at 03:04 PM
Well played sir, loving it!
Posted by: Chris Heilmann | April 09, 2009 at 04:23 PM
Hmm. Looking for 'cockweasel', as Simon W suggested above, returns results that say that Anna Pickard used it twice and Heidi Stephens wrote the said word once. Nobody else wrote the word cockweasel in the Guardian over those years, apparently.
Which makes me wonder - would there be any way of picking out the sweariest writers (with the option of excluding Charlie Brooker, who would probably cause the profanoscope to explode)?
Or to total up all the swears by men and those by women and put them up as a boys vs girls bar graph?
I think much fun is being missed :)
Posted by: John P | April 15, 2009 at 10:45 AM
This being the Gruniad, did you think about searching for misspelled swear words like cnut and bsatard? Of course searching for fcuk would definitely push that trendline upwards.
Posted by: John W | July 07, 2009 at 02:48 PM