Wednesday, April 12, 2017
ISP Data Pollution: Hiding the Needle in a Pile of Needles?
theatlantic | The basic idea is simple.
Internet providers want to know as much as possible about your browsing
habits in order to sell a detailed profile of you to advertisers. If the
data the provider gathers from your home network is full of confusing,
random online activity, in addition to your actual web-browsing history,
it’s harder to make any inferences about you based on your data output.
Steven Smith, a senior staff member at MIT’s Lincoln Laboratory, cooked up a
data-pollution program for his own family last month, after the Senate
passed the privacy bill that would later become law. He uploaded the
code for the project, which is unaffiliated with his employer, to GitHub.
For a week and a half, his program has been pumping fake web traffic
out of his home network, in an effort to mask his family’s real web
activity.
Smith’s algorithm begins by stringing together
a few words from an open-source dictionary and googling them. It grabs
the resulting links in a random order, and saves them in a database for
later use. The program also follows the Google results, capturing the
links that appear on those pages, and then follows those links, and so
on. The table of URLs grows quickly, but it’s capped at around 100,000 entries to
keep the computer’s memory from overloading.
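The harvesting step described above can be sketched roughly as follows. This is a minimal illustration, not Smith's actual code: the `UrlPool` class, the `random_query` helper, and the random-eviction policy used once the cap is reached are all assumptions made for the example, and the search and link-extraction steps are left out.

```python
import random

MAX_URLS = 100_000  # approximate cap mentioned in the article


def random_query(words, rng, n=3):
    """Build a search query from n distinct dictionary words (n is an assumption)."""
    return " ".join(rng.sample(words, n))


class UrlPool:
    """Bounded table of harvested URLs (hypothetical helper for illustration)."""

    def __init__(self, max_size=MAX_URLS, rng=None):
        self.max_size = max_size
        self.rng = rng if rng is not None else random.Random()
        self._urls = []
        self._seen = set()

    def add(self, url):
        if url in self._seen:
            return  # already stored, nothing to do
        if len(self._urls) >= self.max_size:
            # Assumed policy: overwrite a random existing entry so the
            # table never grows past the cap.
            idx = self.rng.randrange(len(self._urls))
            self._seen.discard(self._urls[idx])
            self._urls[idx] = url
        else:
            self._urls.append(url)
        self._seen.add(url)

    def add_shuffled(self, links):
        """Store a batch of result links in random order."""
        links = list(links)
        self.rng.shuffle(links)
        for link in links:
            self.add(link)

    def pick(self):
        """Return a random stored URL to visit next."""
        return self.rng.choice(self._urls)

    def __len__(self):
        return len(self._urls)
```

Capping the table bounds memory use regardless of how long the crawler runs; which entries get dropped at the cap is a design choice the article does not describe.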
A program called PhantomJS, which mimics a person using a web browser, regularly downloads data from the URLs that have been
captured—minus the images, to avoid downloading unsavory or infected
files. Smith set his program to download a page about every five
seconds. Over the course of a month, that’s enough data to max out the
50 gigabytes of data that Smith buys from his internet service provider.
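A quick back-of-the-envelope check of those figures, assuming a 30-day month, a constant five-second interval around the clock, and decimal gigabytes (none of which the article specifies):

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30            # assumption: a 30-day month
INTERVAL_SECONDS = 5           # one page about every five seconds
DATA_CAP_BYTES = 50 * 10**9    # assumption: 50 GB in decimal units

# Number of page downloads in a month at a constant rate.
pages_per_month = SECONDS_PER_DAY * DAYS_PER_MONTH // INTERVAL_SECONDS

# Average page size that would exactly use up the monthly cap.
avg_page_bytes = DATA_CAP_BYTES / pages_per_month

print(pages_per_month)                 # 518400
print(round(avg_page_bytes / 1000))    # 96 (kilobytes per page)
```

Under these assumptions, pages averaging a little under 100 kilobytes each would be enough to reach the cap, which is plausible for pages fetched without images.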
Although it relies heavily on randomness, the program tries to emulate user
behavior in certain ways. Smith programmed it to visit no more than 100
domains a day, and to occasionally visit a URL twice—simulating a user
reload. The pace of browsing slows down at night, and speeds up again
during the day. And as PhantomJS roams around the internet, it changes
its camouflage by switching between different user agents, which are
identifiers that announce what type of browser a visitor is using. By
doing so, Smith hopes to create the illusion of multiple users browsing
on his network using different devices and software. “I’m basically
using common sense and intuition,” Smith said.
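The behavioral rules described above might look something like this in code. It is only a sketch: the night hours, the slowdown factor, the revisit probability, and the user-agent strings are made-up placeholders, and the function and class names are hypothetical rather than taken from Smith's project.

```python
import random
from urllib.parse import urlparse

MAX_DOMAINS_PER_DAY = 100     # limit mentioned in the article
BASE_DELAY_SECONDS = 5.0      # roughly one page every five seconds
NIGHT_SLOWDOWN = 4.0          # assumption: how much slower at night
REVISIT_PROBABILITY = 0.05    # assumption: chance of a simulated reload

# Placeholder identifiers, not real browser user-agent strings.
USER_AGENTS = [
    "ExampleBrowser/1.0 (Desktop)",
    "ExampleBrowser/2.0 (Phone)",
    "OtherBrowser/5.1 (Tablet)",
]


def delay_for_hour(hour):
    """Seconds to wait between pages; longer at night (hours are assumed)."""
    is_night = hour >= 23 or hour < 7
    return BASE_DELAY_SECONDS * (NIGHT_SLOWDOWN if is_night else 1.0)


class DailyDomainBudget:
    """Tracks distinct domains visited today and enforces the daily limit."""

    def __init__(self, limit=MAX_DOMAINS_PER_DAY):
        self.limit = limit
        self.domains = set()

    def allows(self, url):
        domain = urlparse(url).netloc
        if domain in self.domains:
            return True            # already counted today
        if len(self.domains) >= self.limit:
            return False           # a new domain would exceed the cap
        self.domains.add(domain)
        return True

    def reset(self):
        """Call once per day to start a fresh budget."""
        self.domains.clear()


def next_url(candidate, previous, rng):
    """Occasionally repeat the previous URL to simulate a page reload."""
    if previous is not None and rng.random() < REVISIT_PROBABILITY:
        return previous
    return candidate


def pick_user_agent(rng):
    """Choose a browser identity at random for the next request."""
    return rng.choice(USER_AGENTS)
```

Each rule is kept as a small independent piece so that the thresholds, which Smith says come from common sense and intuition, can be tuned separately.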
By CNu at April 12, 2017