Most of them are in English and contain illegal content

Apr 8, 2016 15:10 GMT

With 71 percent of respondents in a recent Ipsos poll saying that the Dark Web should be shut down, it may be a good idea to understand what can actually be found in this hidden, hard-to-reach part of the Internet before forming an opinion.

This part of the Web is protected by encryption and hidden from normal browsers, so users need special technologies like Tor, I2P, and Freenet to access it.

Because of this, most of us have never accessed the Dark Web and would barely know how to start without consulting a tutorial or asking for help in advance.

Only 29,532 .onion sites discovered

In recent research, Intelliagg and Darksum set out to map the Tor-based Dark Web for the rest of us and to check whether it deserves its popular reputation as a place for finding illegal content.

Employing a series of automated scripts, the two companies scanned, crawled, and indexed all Tor websites they could reach. Their data mining operations revealed only 29,532 .onion websites, a number that is smaller than previous estimates and almost insignificant compared to the billions of Internet domains in use today.

The researchers said that they saw many Tor websites come online and then disappear forever, so the true figure may actually be smaller than their estimate of roughly 30,000, since many of those sites are bound to go away and never return.

To put it in numbers, 54% of these 30K .onion sites disappeared during the course of the survey, with researchers reasoning they must have been temporary domains set up for cyber-crime campaigns, probably used to host C&C servers or other types of temporary services.
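As a rough illustration of what that percentage means in absolute terms, the short Python sketch below applies the 54% figure to the 29,532 sites found. It assumes the percentage refers to the full count of discovered sites, which the report summary does not state explicitly, and the function name is purely illustrative.

```python
# Figures taken from the article; applying 54% to the full count is an assumption.
TOTAL_ONION_SITES = 29_532
DISAPPEARED_SHARE = 0.54


def split_sites(total: int, disappeared_share: float) -> tuple[int, int]:
    """Return (disappeared, remaining) site counts, rounded to whole sites."""
    disappeared = round(total * disappeared_share)
    return disappeared, total - disappeared


disappeared, remaining = split_sites(TOTAL_ONION_SITES, DISAPPEARED_SHARE)
print(f"Disappeared during the survey: about {disappeared:,}")
print(f"Still reachable at the end:    about {remaining:,}")
```

Under that assumption, roughly 15,900 sites vanished during the survey and about 13,600 remained reachable.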

Manual inspection reveals that 68% of .onion sites contain illegal content

As expected, most of the remaining Tor sites were written in English (76%), with German (4%) and Chinese (3.7%) the next most popular languages.

Organized in categories, most of the sites were used for file sharing, leaking data, financial fraud, and news media. But this categorization was hardly representative of their real content.

Researchers then used another set of automated scripts that scanned each site's entire content to determine its real purpose.

They found that the script labeled 52 percent of all the sites as having legal content, while only 48 percent could be considered illegal under UK and US law.

Since scripts are prone to errors, researchers also took a sample of 1,000 sites and had human operators review their content. Based on this manual review, the two companies said that the balance shifted, and 68 percent of the sites contained illegal content.
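To get a feel for how much uncertainty a 1,000-site sample carries, the Python sketch below computes an approximate 95% margin of error for the 68 percent figure. It assumes the sample was drawn at random and uses the standard normal approximation for a proportion. The article does not say how the sample was selected, so treat the result as indicative only.

```python
import math

# Figures from the article; random sampling is an assumption, not a reported fact.
SAMPLE_SIZE = 1_000
ILLEGAL_SHARE = 0.68
Z_95 = 1.96  # z-score for a 95% confidence level


def margin_of_error(share: float, n: int, z: float = Z_95) -> float:
    """Normal-approximation margin of error for a sample proportion."""
    return z * math.sqrt(share * (1 - share) / n)


moe = margin_of_error(ILLEGAL_SHARE, SAMPLE_SIZE)
low, high = ILLEGAL_SHARE - moe, ILLEGAL_SHARE + moe
print(f"68% +/- {moe * 100:.1f} points ({low:.1%} to {high:.1%})")
```

With these assumptions the margin is just under three percentage points, so sampling noise alone would not account for the gap between the manual estimate and the 48 percent produced by the automated scripts.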

"We believe it is important for the public to gain a better understanding of the contents of the dark web in order for there to be a proper debate about its nature, dangers - and potential benefits," researchers said in their report. "Misunderstanding about the dark net is rife, and has been fuelled by often misleading media coverage."

Dark Web main categories broken down