newstatesman | It’s been said in some more breathless quarters of the internet that
this is the “data breach” that could have “caused Brexit”. Given it was a
US-focused bit of harvesting, that would be the most astonishing piece
of political advertising success in history – especially as among the
big players in the political and broader online advertising world,
Cambridge Analytica are not well regarded: some of the people who are
best at this regard them as little more than “snake oil salesmen”.
One of the key things this kind of data would be useful for – and what
the original academic study it came from looked into – is finding which
Facebook Likes correlate with personality traits, or with other Facebook
Likes.
The dream scenario for this would be to find that every woman in your
sample who liked “The Republican Party” also liked “Chick-Fil-A”,
“Taylor Swift” and “Nascar racing”. That way, you could target ads at
people who liked the latter three – but not the former – knowing you had
a good chance of reaching people likely to appreciate the message
you’ve got. This is a pretty widely used, but crude, bit of Facebook
advertising.
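
For the curious, the idea can be sketched in a few lines of Python. This
is only an illustration of the crude version described above, not
anything Cambridge Analytica or Facebook is known to run: the sample
data, the function names and the "every fan must also like it" threshold
are all invented for the example.

```python
from collections import Counter

# Hypothetical toy sample: user id -> set of pages liked (invented data).
SAMPLE_LIKES = {
    "u1": {"The Republican Party", "Chick-Fil-A", "Taylor Swift", "Nascar racing"},
    "u2": {"The Republican Party", "Chick-Fil-A", "Taylor Swift", "Nascar racing"},
    "u3": {"Chick-Fil-A", "Taylor Swift", "Nascar racing"},
    "u4": {"Taylor Swift"},
    "u5": {"Chick-Fil-A", "Nascar racing"},
}


def proxy_likes(likes_by_user, target, min_share=1.0):
    """Pages liked by at least `min_share` of the users who like `target`."""
    fans = [likes for likes in likes_by_user.values() if target in likes]
    if not fans:
        return set()
    # Count how many fans of the target also like each other page.
    counts = Counter(page for likes in fans for page in likes if page != target)
    return {page for page, n in counts.items() if n / len(fans) >= min_share}


def lookalike_audience(likes_by_user, target, proxies):
    """Users who like every proxy page but have not liked the target itself."""
    if not proxies:
        return set()
    return {
        user
        for user, likes in likes_by_user.items()
        if proxies <= likes and target not in likes
    }


proxies = proxy_likes(SAMPLE_LIKES, "The Republican Party")
audience = lookalike_audience(SAMPLE_LIKES, "The Republican Party", proxies)
```

In this toy sample the three proxy pages are the ones from the dream
scenario, and the only person picked out is "u3", who likes all three
without having liked the party page. Real samples are far noisier, so
the threshold would have to sit well below 100 per cent.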
When people say Cambridge Analytica may have used this information to
build algorithms that would still be useful after all the original data
was deleted, this is what they’re talking about. That’s possible, but it
misses a much, much bigger part of the picture.
So, everything’s OK then?
No. Look at it this way: the data we’re all getting excited about here
is a sample of public profile information from 50 million users,
harvested from 270,000 people.
Facebook itself, daily, has access to all of that public information,
and much more, from two billion people. That is around 40 times more
people than the 50 million profiles in the Cambridge Analytica data, and
around 7,000 times more than the 270,000 people it was harvested from.
It is also much deeper and richer, thanks to its real-time updating
status.
If Facebook wants to sell advertising based on correlations, to
advertisers looking for an audience open to their message, its data
would be vastly more powerful and useful than a small (in big data
terms), four-year-out-of-date bit of Cambridge Analytica data.
Facebook aren’t anywhere near alone in this world: every day your
personal information is bought and sold, bundled and retraded. You won’t
know the names of the brands, but the actual giants in this industry
don’t deal in tens of millions of records: they deal in hundreds of
millions, or even billions. One advert I saw today referred to a
company which claimed real-world identification of 340 million people.
This is how lots of real advertising targeting works: people can buy up
databases of thousands or millions of users, from all sorts of sources,
and turn them into the ultimate custom audience – match the IDs of
these people and show them this advert. Or they can do the tricks
Cambridge Analytica did, but refined and with much more data behind them
(there’s never been much evidence Cambridge Analytica’s model worked
very well, despite their sales pitch boasts).
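
The "match the IDs" step is simple enough to sketch. The version below
assumes the identifiers are email addresses and that both sides compare
SHA-256 hashes of a tidied-up address rather than the raw text, which is
a common way of doing list matching. The function names and every
address in it are made up for illustration.

```python
import hashlib


def normalise_and_hash(email):
    """Trim, lower-case and SHA-256 hash an email address."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def match_audience(purchased_emails, platform_users):
    """Return platform user ids whose hashed email appears in the bought list.

    `platform_users` maps hashed email -> platform user id (hypothetical).
    """
    hashed = {normalise_and_hash(email) for email in purchased_emails}
    return sorted(uid for h, uid in platform_users.items() if h in hashed)


# Invented example data: what the platform holds, and what the buyer uploads.
platform_users = {
    normalise_and_hash("alice@example.com"): "p1",
    normalise_and_hash("bob@example.com"): "p2",
}
purchased = ["  Alice@Example.com ", "carol@example.com"]

matched = match_audience(purchased, platform_users)
```

Here only "p1" is matched: the stray spaces and capitals in the bought
list are cleaned up before hashing, and the address the platform has
never seen simply drops out.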
The media has a model for reporting on “hacks” and “breaches”, and for
reporting on companies in the spotlight giving evidence to public
authorities, and most outlets have been following those well-trodden
routes.
But doing so is like doing forensics on the burning of a twig, in the
middle of a raging forest fire. You might get some answers – but they’ll
do you no good. We need to think bigger.