If, like me, you have spent any of the past few weeks binge-watching the second season of Netflix’s House of Cards, you will know that there is a lot of talk about the “Deep Web.” Mentions of illegal drug and arms trading, child pornography and ‘hitman’ services might make you feel you are missing out. The Deep Web, however, is a far more intricate concept than that. So welcome to the “Into the Deep Web” series: over the coming month(s) I will look at the parts of the World Wide Web which are vast, hidden and sometimes illegal.
To start off the series I would like to look at the basics of what the Deep Web is, and is not, and at how it came about and has evolved. First you should know that the Deep Web has existed ever since the World Wide Web came into existence. The most general definition of the Deep Web would be: “World Wide Web content which is not indexed by standard search engines.”
Search engines like Google or Bing index websites and their content by releasing crawlers: computer programs which navigate from one hyperlink to another at very high speed and report information (such as page content, outgoing links and server details) back to the search engine’s servers, which index it so that you see a list of websites when you search for “Post Cold War economic policies in Hungary.” The Deep Web, then, consists of the pages these crawlers cannot reach, either because no working hyperlink points to them, or because they are only accessible with a password or through a program called TOR (which we will look at later).
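The crawling loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a real search-engine crawler: the `fetch` callable is a hypothetical stand-in for an HTTP client, and the point is simply that a page no hyperlink reaches is never visited, which is exactly what makes it “deep.”

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


def crawl(seed_url, fetch, max_pages=100):
    """Breadth-first crawl outward from a seed page.

    `fetch(url)` returns the HTML of a page (a stand-in for a real
    HTTP client). Pages that no hyperlink chain reaches from the seed
    are never visited -- that unreachable remainder is, by this
    article's definition, part of the Deep Web.
    """
    seen = {seed_url}
    frontier = deque([seed_url])
    index = {}
    while frontier and len(index) < max_pages:
        url = frontier.popleft()
        parser = LinkExtractor(url)
        parser.feed(fetch(url))
        index[url] = parser.links  # a real engine would store page content too
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return index
```

Run against a handful of in-memory pages, the crawler indexes everything reachable from the seed, while a page nothing links to stays invisible, no matter how public its URL technically is.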
So how much of the World Wide Web is beyond the reach of these search engines? Are there places where Facebook and Google cannot find your personal information and sell it to companies for billions of dollars? It is hard to say: the most recent estimate accepted by the academic community dates from 2001, and Google has since become a lot bigger, and better, at finding information on the web. That estimate, made by Michael K. Bergman in his white paper “The Deep Web: Surfacing Hidden Value” (which, ironically, I found published in its entirety and for free on a Deep Web website), was that the Deep Web is 400 to 550 times bigger than the surface web. That is not a very precise figure, but it is the only one we have to work with and the only one not deemed completely wrong.
That was in 2001, when most Deep Web content was still publicly accessible as long as you knew the link to the website you wanted to find. Most of it consisted of databases of academic articles, company databases or private networks protected by a firewall. There was no room yet for the illegal drug trade or identity theft you think of now when you hear about the Deep Web.
This changed in 2002, when a few internet nerds started the TOR project (short for The Onion Router), adapting technology developed by the U.S. Naval Research Laboratory into an open-source browser bundle which protects your identity on the internet. The TOR browser is an internet browser just like Firefox or Chrome, but it bounces your traffic through several relays across the globe before delivering it to you. This makes it impossible for crawlers to find websites hosted inside the network and extremely hard for anyone to analyse your internet traffic.
This made it possible to create websites which are not only hidden, but also anonymous for both the host and the visitors. And that small bit of anonymity was the foundation for the Darknet: the part of the Deep Web where you can anonymously trade drugs, arms or PayPal accounts without being traced by Google or, say, the FBI.
Next week I will write about the Darknet and how it has evolved from 2002 until now: how illegal marketplaces like the Silk Road operated for years without being busted by the FBI, and how easy it is to order drugs online and have them delivered to your doorstep by mail.