Social data, or so-called “big data”, can provide significant insights into your customers, brands, company, partners, industry and competitors. However, like any other dataset, you get what you ask for, and this is particularly the case in big data (see The future is now: 10 startups leading the way in ‘big data’).
To prove this point, we decided to examine the term “misogynist” in relation to our two political leaders (i.e. Gillard and Abbott).
Firstly, as most of us are aware, big data analysis technologies are keyword-driven and parametrized. Looking for Australian mentions, we landed on 8,665 mentions over the last month. What is wrong with that data, then? How valid is it?
We then analyzed the same keywords and looked specifically for mentions with an unknown country. Based on the diagram below, searching on a specific country parameter would have excluded 28% (or 4,405 mentions) of the data. Generally, when searching for a specific source of mentions such as “Australia”, there will be a similar “loss” of data, between 25 and 40%, because that is the share of mentions which have no geographic information associated with them.
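As a rough illustration of how this “loss” can be measured, here is a minimal Python sketch. It assumes each mention is a dictionary with a hypothetical `country` field that is empty or missing when no geographic information is available. The field names and the sample records are invented for illustration and are not the dataset discussed in this post.

```python
from collections import Counter


def country_breakdown(mentions):
    """Count mentions per country, grouping missing values under 'unknown'."""
    return Counter((m.get("country") or "unknown") for m in mentions)


def unknown_share(mentions):
    """Fraction of mentions that carry no country information."""
    counts = country_breakdown(mentions)
    total = sum(counts.values())
    return counts["unknown"] / total if total else 0.0


# Hypothetical sample records, NOT the data from this post.
sample = [
    {"source": "twitter", "country": None},
    {"source": "twitter", "country": "AU"},
    {"source": "blog", "country": "AU"},
    {"source": "facebook", "country": ""},
    {"source": "news", "country": "US"},
]

if __name__ == "__main__":
    print(country_breakdown(sample))
    print(f"Share with unknown country: {unknown_share(sample):.0%}")
```

Running a check like this before applying a country filter shows how many mentions the filter would silently drop.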
When we look at the detail of this unknown-country data (below), we can see that the largest contributor is Twitter. The gap would be further compounded by social data from Facebook (if we were monitoring Facebook data), as country data isn’t defined in the Facebook streams.
So what lessons does this provide?
- With Social Data you get what you ask for
- Spend time researching your keywords and start at the highest possible level of keyword definition
- Research keywords by company, brand, competitor, industry terms and constantly hone these keywords
- Research country dataset variances and carefully examine when and how to exclude data for certain countries (or unknown-country data)
- …and one from iGo2: hire a professional
Research notes:
- Search a: misogynist AND Gillard
- Search b: misogynist AND Abbott
- Dataset: 16th Sept 2012 to 15th Oct 2012
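The two searches and the date range above could be expressed roughly as follows. This is only a sketch: the monitoring tool's real query semantics are not described in this post, so the case-insensitive substring matching used here is an assumption (it also matches “misogynistic”, for example), and the names `SEARCHES` and `matches` are hypothetical.

```python
from datetime import date

# Search a and search b from the notes: both terms must be present.
SEARCHES = {
    "a": ("misogynist", "gillard"),
    "b": ("misogynist", "abbott"),
}

# Dataset window: 16 Sept 2012 to 15 Oct 2012, inclusive.
START, END = date(2012, 9, 16), date(2012, 10, 15)


def matches(text, posted, terms):
    """True if the mention falls inside the window and contains every term."""
    lowered = text.lower()
    return START <= posted <= END and all(term in lowered for term in terms)


if __name__ == "__main__":
    mention = "Gillard labels Abbott a misogynist"
    for name, terms in SEARCHES.items():
        print(name, matches(mention, date(2012, 10, 9), terms))
```

Note that a single mention can satisfy both searches, so counts from search a and search b should not simply be added together.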