New Study Points to Inherent Biases in Big Data Collection

Posted June 25, 2015

Big data – which has become quite the buzzword these days – refers to automatically generated information about people’s behaviour. It is called “big” because a single dataset can easily include millions of observations.

New research suggests that drawing wide-ranging conclusions based only on data obtained from Facebook, Twitter and other social networking sites is prone to biases in terms of age, gender, race/ethnicity, online experience and Internet skills, so that the voices of those who are not on these sites are effectively unheard. Image credit: luckey_sun via flickr.com, CC BY-SA 2.0.

Unlike traditional surveys, which are based on explicit questions, big data is created whenever people take actions while using an online service or system. With every click, users of Facebook, Twitter and other social media leave behind digital traces of themselves that can be used by businesses, governments and other groups that rely on “big data”.

But while the information derived from social media networks can certainly shed some light on mass-scale social trends, some analyses based on this method of data collection are prone to biases from the get-go.

This is the conclusion of a new study from Northwestern University, titled “Is Bigger Always Better? Potential Biases of Big Data Derived from Social Network Sites”.

Published in the journal The Annals of the American Academy of Political and Social Science, the study points out that since people don’t join social media outlets randomly, the data generated by analysing their online behaviour is potentially biased in terms of demographics, socioeconomic background or Internet skills.

“The problem is that the only people whose behaviours and opinions are represented are those who decided to join the site in the first place,” said study author Eszter Hargittai, the April McClain-Delaney and John Delaney Professor in the School of Communication. “If people are analysing big data to answer certain questions, they may be leaving out entire groups of people and their voices.”
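The mechanism can be illustrated with a small worked example. The sketch below is not taken from the study: the two groups, their sizes, their joining rates and their opinion rates are all invented numbers, and it assumes that within each group, joining the site is unrelated to the opinion held. It simply shows how an estimate drawn only from site members can drift away from the figure for the whole population when groups join at different rates.

```python
# Hypothetical population: every number here is invented for illustration
# and does not come from the Northwestern study.
population = [
    # (group label, number of people, share who join the site, share holding opinion A)
    ("higher Internet skills", 4000, 0.80, 0.70),
    ("lower Internet skills", 6000, 0.20, 0.30),
]


def share_holding_opinion(groups, members_only):
    """Return the share holding opinion A among everyone, or among site members only.

    Assumes that, within a group, joining the site is independent of the opinion.
    """
    holders = 0.0
    total = 0.0
    for _label, size, join_rate, opinion_rate in groups:
        # Count either the whole group or only the part that joined the site.
        counted = size * join_rate if members_only else size
        holders += counted * opinion_rate
        total += counted
    return holders / total


true_share = share_holding_opinion(population, members_only=False)
observed_share = share_holding_opinion(population, members_only=True)

print(f"Whole population:  {true_share:.2f}")
print(f"Site members only: {observed_share:.2f}")
```

With these made-up inputs, the members-only figure overstates the opinion because the group that joins more readily also holds it more often; the under-represented group barely moves the result.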

For this particular study, the Web Use Project – Hargittai’s research group that focuses on how differences in Internet use contribute to social inequality – looked at a type of big data analysis that draws wide-ranging conclusions based on data obtained from users of particular sites and services.

While there have already been a number of studies documenting the challenges of big data research, this is the first one to provide empirical evidence of potential biases.

Hargittai used two datasets: a nationally representative sample from the Pew Internet Project and her own data collected from wired, young, educated adults.

It turns out that age, gender, financial status and Internet skills all help determine which sites and services people choose to use.

“The less privileged are not on these sites [Facebook, Twitter and the like], so their opinions are not there either,” she said. “Even among young adults who are generally thought of as the most active on social network sites, we see socioeconomic differences when it comes to Twitter and Tumblr. We also see gender and skill differences on who is on what site.”

The paper concludes with some preliminary advice on study design aimed at avoiding biases, and suggests supplementing big data with other sources of information to decrease the negative impacts of leaning too hard on social networking sites alone.

Sources: study abstract, phys.org.
