It’s easy to form the mental image of a hacker hunched over a computer, probing for a way to get your personal information, whether to sell it, acquire credit cards in your name or use your health insurance.
It does happen, but University of New Mexico Department of Computer Science Professor Stephanie Forrest and Ph.D. student Benjamin Edwards, working with Steven Hofmeyr from the Lawrence Berkeley National Laboratory (Berkeley Lab), say it is not happening more frequently than it did a decade ago. Data breaches, in general, are not growing in size.
“Cybersecurity has become a global problem, and to tackle it effectively will require careful analysis of complex datasets from diverse sources,” said Forrest. “This study illustrates how modern data science can shed light on one of today’s most challenging problems.”
In a new paper, titled “Hype and Heavy Tails: A Closer Look at Data Breaches,” which won the Best Paper Award at the Workshop on the Economics of Information Security in June, the researchers looked at both malicious and negligent breaches. Malicious breaches occur when attackers specifically target someone’s personal information. Negligent breaches occur when someone’s private information is accidentally exposed, for example, if a database of personnel records is stored on a laptop that is lost or stolen.
They used information published by the Privacy Rights Clearinghouse, a private non-profit that tracks public reports of data breaches, and they note that their results are drawn from publicly acknowledged data breaches.
The researchers constructed a statistical model based on public data about breaches collected over the last decade and used the model to analyze trends and make predictions about future breaches. The data clearly showed that information is exposed twice as often through negligence as it is through malicious attacks. Using expanded data that includes high-profile data breaches from this summer, the model also predicts that there is a 98.2 percent chance of a breach that exposes more than 5 million records during the next three years.
What is the bottom line, that is, what is the real cost in dollars of these data breaches? Estimating the financial impact of breaches accurately requires modeling their cost, not just their size and frequency. The research team applied some existing cost models to project that over the next three years, data breaches could cost individuals, companies and public entities up to $180 billion.
“With this work, our goal was to answer the questions: Are security breaches getting bigger? Are they happening more frequently? And when they do happen, are the impacts more catastrophic? When we fit the cyber security data to the statistical model, we found a ‘long tail’ distribution, which is liable to distort public perception,” says Hofmeyr. “It’s kind of like if you’ve just experienced a big earthquake, you may suddenly be scared of big earthquakes, even though the probability for big earthquakes hasn’t changed.
“It’s the same for security. And, the reason that we can say that is because we have this principled statistical model, which gives us a more comprehensive and contextual view than simply looking at averages.”
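Hofmeyr’s point about averages can be illustrated with a small simulation. The sketch below is not the researchers’ model: it assumes, purely for illustration, that breach sizes follow a log-normal distribution, and the parameters `mu` and `sigma` are made-up values rather than figures from the paper. It shows how, in a heavy-tailed sample, a handful of very large events can pull the average far above the size of a typical event.

```python
import random
import statistics


def simulate_breach_sizes(n, mu=8.0, sigma=2.5, seed=0):
    """Draw n hypothetical breach sizes (records exposed).

    Assumption: sizes are log-normal; mu and sigma are illustrative only.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


def summarize(sizes):
    """Compare the mean with the median and measure how much of the
    total comes from the largest 1 percent of events."""
    ordered = sorted(sizes, reverse=True)
    top_count = max(1, len(ordered) // 100)
    return {
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "top_1pct_share": sum(ordered[:top_count]) / sum(ordered),
    }


summary = summarize(simulate_breach_sizes(10_000))
for name, value in summary.items():
    print(f"{name}: {value:,.2f}")
```

In a run like this, the mean lands well above the median and the largest 1 percent of simulated breaches account for a large share of all records exposed, which is why a single headline breach can make the trend look worse than it is.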
There’s a takeaway message for public policy experts in this. Industry reports, which are widely circulated and difficult to confirm, often use inappropriate statistical techniques and should be taken with a large grain of salt. Policies that encourage uniform reporting of security problems would provide clarity in this very murky area.
Edwards summed it up. “So much of our current understanding about security problems relies on private data and opaque analysis methods. Studies like ours provide a rational counterpoint for policy makers and they show the benefit of putting data about security problems into the public domain.”
This research was partly supported by the U.S. Department of Energy’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. It was also supported by the National Science Foundation and the Defense Advanced Research Projects Agency.