May 09, 2011
Hard data on why your users should avoid file-sharing sites
A new study provides solid proof that people all over the world are actively Dumpster diving in file-sharing services
By Woody Leonhard
There's a new study (PDF) that you and your recalcitrant clients should read. Writing for the fourth Usenix Workshop on Large-Scale Exploits and Emergent Threats (LEET '11) last month, researchers at the University of Leuven and the graduate school at Institut Eurécom started poking around file-sharing sites, and what they found will raise -- no, curl -- your eyebrows.
Although there are many variations on the theme, file-sharing sites generally have you upload a file, then hand you a URL that other people can use to access the file. You send the URL to your co-workers, family, or 10,000 of your closest friends. They simply pop the URL into a Web browser and download the file.
The study found that many of the most popular sites generate sequential URLs: If your file is located at www.fileshareplace.com/id=123456, the next file is located at www.fileshareplace.com/id=123457. The researchers put together a little crawler that poked at sequential URLs and downloaded whatever it could find. Online data Dumpster diving, one URL at a time.
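To see why sequential URLs are so easy to walk, here is a minimal Python sketch. It is not the researchers' crawler and makes no network requests. The host is the placeholder from the example above, and the function names and the random-token alternative are assumptions added purely for illustration.

```python
import secrets

# Placeholder host taken from the example above; not a real service.
BASE = "www.fileshareplace.com"


def neighbor_urls(known_id, count=3):
    """Return the URLs that follow a known sequential ID.

    Anyone holding a single valid link can compute these with simple addition.
    """
    return [f"{BASE}/id={known_id + i}" for i in range(1, count + 1)]


def random_url(nbytes=16):
    """Hypothetical alternative: build the ID from a cryptographically
    strong random token, so that knowing one link reveals nothing about
    any other link."""
    return f"{BASE}/id={secrets.token_urlsafe(nbytes)}"


if __name__ == "__main__":
    print(neighbor_urls(123456))  # the next three guessable links
    print(random_url())           # different on every run
```

With sequential IDs, one known link exposes every upload near it. With a 16-byte random token, there are far too many possible IDs for a crawler to enumerate, though the link is still only as private as the people you send it to.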
The researchers intentionally created a slow crawler, so they wouldn't get knocked off any of the hosting sites. The result? One month of harvesting netted 310,735 unique files, including 27,700 JPEGs, 13,400 ZIPs, 7,000 PDFs, 4,000 DOCs, 1,200 XLSs, and almost 1,000 PPTs. They even came up with a handful of SQL files -- presumably databases sitting there on the file-sharing sites, ripe for the taking.
But, but, but ... I can hear your users sputter ... but, nobody really goes data Dumpster diving like that, do they?
Well, yes they do. The researchers put honeypot files on all of the file-sharing servers that use sequential URLs. They used different kinds of files -- HTML, EXE, PDF, and DOC files, all of which "phoned home" in various ways, many of them requiring the user to knowingly click to allow the interaction. They didn't post or otherwise identify the URLs involved.
In a one-month trial, 275 files phoned home from 80 unique IP addresses. Half of the honeypotted IPs were in Russia and 25 percent were in Ukraine, but the other 25 percent came from all over, including the U.S., U.K., Europe, and the Middle East. There's no question that people all over the world are actively Dumpster diving in file-sharing services.