Pretty much the motto of my profession is “word choice matters.” I say it a lot. It appears somewhere in the marginalia of pretty much everything I’ve ever edited. Words have denotation, and connotation. There are considerations for dialect, and for popular use.
It can be fiddly and annoying to be queried so; I get it. You know what you meant, and you grabbed the word in your head that, to you, meant that thing. One of the glories of having your work edited is that someone who isn’t you can hold up a mirror, to make sure that the word on the page means as close as possible to what you meant in your head, to the greatest number of people, no matter where they’re from or what language they natively speak. Here at AlienVault, we’ve had some great discussions about how the connotations of words differ among our Irish speakers, who learned Hiberno-English (which gets the hyphen when none of the others do), Chinese speakers, who learned British English, and Americans, who learned American English with intense regional dialect (the Texans and the Californians are occasionally mutually unintelligible).
But there’s one thing that none of us tolerate: the choosing of a word to deliberately mislead. When one works in fiction, one is used to the painting of pictures with words. When one chooses to work primarily in technology, it’s often because one is way more comfortable with the nicely concrete, if entirely mutable. In technology, a thing is, or it is not. It’s variations on a theme of zeros and ones, no matter whether it’s software or hardware.
It is therefore maddening beyond belief when the unambiguous words of technology are used to mislead the non-technical public. I’m of course talking about the Cambridge Analytica debacle, which is being referred to across the media landscape as “a data breach.”
A data breach is when someone who is not authorized to handle specific information obtains access to that information. It’s a non-trivial failure of the security measures a responsible company or reasonable individuals would have in place. It implies wrongdoing, it implies malice, it implies a victim/attacker relationship.
But when data is harvested and used with the unknowing opt-in of thousands of people, that’s not a breach. There are no hackers here, just people who knew how to use freely given personal data to manipulate people who are not very technically astute, to some political end.
We’ve been regularly covering data breaches for years. No one hacked into Facebook’s servers exploiting a bug, like hackers did when they stole the personal data of more than 140 million people from Equifax. No one tricked Facebook users into giving away their passwords and then stole their data, like Russian hackers did when they broke into the email accounts of John Podesta and others through phishing emails.
Facebook obviously doesn't want the public to think it suffered a massive security breach, like Yahoo did in 2013 and 2014. We agree, not because we want to minimize the significance of the Cambridge Analytica story, but because the real story is far more troubling: This data collection was par for the course. In other words, it was a feature, not a bug. And while the process that Kogan exploited is no longer allowed, Facebook still collects, and then sells, massive amounts of data on its users.
As Zeynep Tufekci, the author of Twitter and Tear Gas, put it, Facebook’s vehement defense that this was not a data breach is itself a damning statement of what’s wrong with Facebook, and with Silicon Valley’s ad industry in general.
“If your business is building a massive surveillance machinery, the data will eventually be used & misused,” Tufekci, a University of North Carolina professor who studies the social impact of technology, wrote on Twitter. “There is no informed consent because it's not possible to reasonably inform or consent.”
Facebook’s security team, Tufekci concluded, can’t mitigate the company’s business model, which is predicated on collecting as much of our data, and our friends’ data, as possible.
We can condemn the misuse of this data, and Facebook’s data collection practices, without calling it a data breach, a term that may confuse readers and distract them from what we believe is the real problem here: Silicon Valley giants have built massive data collection machines with almost no guardrails on how they are used.
It's incredibly important, when explaining technology to those who are not fluent in its nuance, to use your words precisely, in a consistent and clear way. And as much as little sideslips like “on premise” and “crypto” make us all twitch, the use of the term “breach” when no actual breach occurred not only misleads people, it blurs the line between those victimized by breaches, and those manipulated without their explicit consent. Most importantly, it makes people lose trust in the ability of security researchers and security tools to keep them safe from breaches. We absolutely must not allow the dilution of this term, or we risk losing ground we cannot afford to lose in the effort to educate people to be active participants in their own security, and the security of the places they work and the virtual places they play.