Gartner’s definition of Dark Data

Complex Algorithm to Find Patterns in Dark Data


I was reading Ahmed Banafa’s post on “Understanding Dark Data” on LinkedIn, and it got me thinking about Gartner’s definition of Dark Data.

I may be cynical, but I think Gartner’s definition of Dark Data has a lot to do with marketing and selling consultancy.

I’d define Dark Data as data you can’t see but know is there because of the influence it’s having on other data. Pretty much exactly like dark matter. We don’t define dark matter as matter we can see that isn’t doing anything useful at the moment. It’s matter we can’t see, but we know it’s there from the effects it’s having.

But that doesn’t mean that dark data wouldn’t be useful. There’s a great bit in Doctor Who where the Cyber-Planner, aka ‘Mr. Clever’, says “Doctor, why is there no record of you anywhere in the Cyberiad? Oh, you’re good. Oh you’ve been eliminating yourself from history. You know you could be reconstructed by the hole you’ve left”. (Series 7 ep 12 – Nightmare in Silver)

You can potentially reconstruct the dark data from the effect it’s having. Of course this isn’t easy, but there are plenty of examples in history of people managing to do just that. During WWII, the Allies were very interested in intelligence about the German war machine. Naturally, the Germans didn’t publish that data or make it available to the Allies. However, the Allies came up with a number of workarounds. For example, by analysing the metals the Germans were putting in their bomb casings, they could work out which metals the Germans were short of. How? Because you make your bomb casing out of whatever abundant scrap metal you have hanging around. It’s going to be blown up, so you don’t put anything scarce in it that you need elsewhere. By looking for which metals were absent (“the hole”), the Allies could work out what the Germans were running low on. Like I said, not easy, but very valuable.
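To make the “look for the hole” idea a little more concrete, here is a minimal sketch in Python. It is only an illustration: the function name `missing_items`, the threshold, and the sample data are all invented for this post and are not historical measurements or anyone’s actual method. It assumes you know which items you would normally expect to see, and flags the ones that almost never turn up in what you actually observe.

```python
from collections import Counter


def missing_items(expected, observations, threshold=0.05):
    """Return the expected items seen in fewer than `threshold` of observations.

    Hypothetical helper: `expected` is the set of things we would normally
    expect to see, `observations` is a list of sets of things actually seen.
    """
    if not observations:
        raise ValueError("need at least one observation")

    # Count in how many observations each item appears.
    counts = Counter()
    for obs in observations:
        counts.update(set(obs))

    total = len(observations)
    # An expected item that (almost) never shows up is "the hole".
    return sorted(item for item in expected if counts[item] / total < threshold)


# Invented example data, purely for illustration.
expected_metals = {"iron", "copper", "nickel", "tin"}
casings = [
    {"iron", "copper"},
    {"iron", "tin", "copper"},
    {"iron", "copper"},
    {"iron"},
]

scarce = missing_items(expected_metals, casings)
print(scarce)
```

The absent item doesn’t prove scarcity on its own, of course. It just tells you where to start digging, which is about as much as you can ask of data you can’t see.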

By defining dark data as “the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes”, Gartner is making it easy to sell the idea to companies: you already have this data, you just need to do something with it, so let us help you. Giving it a trendy name makes it easy to market too. “Let me help you exploit your unused data” is a bit boring. “Let me help you exploit your DARK DATA”, now that’s marketing 😉

Do I think what they’re trying to do is wrong? No, I actually think it’s a great idea, it’s just the purist in me wishes they’d chosen a different name 🙂

Image Copyright: kentoh / 123RF Stock Photo



Author: Jamie

I blend and clarify data in novel ways to create new and illuminating insights.
