Recently I attended a lecture by Yvonne Hofstetter in which she talked about three stages of data fusion (= what can be “done” with big data): the aggregation and storage (1), the analysis (2) and finally the automated action informed by the analysis (3). Without going any further into her view of things, I’d like to use this trisection going forward.

It seems rather obvious from the public debate that “we” have hardly the slightest idea of what’s going on behind the curtains of gigantic data centers when ridiculously sophisticated algorithms take the data of millions or billions of users between their teeth and crunch it to bits and pieces (some nice drama in that sentence). We only know that stage 1 is indeed happening on a grand scale and we safely assume that stage 2 is going on… but what kind of analysis and what kind of “stage 3 actions” are being done, could be done or will be done? So damn clueless we are. Ok, there is “personalized advertising”: Gmail delights you with ads matching words in a thread and Amazon shows you stuff you might be interested in (or stuff you JUST bought and therefore want to buy again right away?? wtf!) through its myriad shop windows (= ad spots on other sites). We see tag clouds (word frequency analysis) of hashtagged Twitter conversations and topic-mood mappings of the latter as a result of scanning tweets for emotions, and we hear about impressive (or irritating) successes in predicting pregnancy or sickness from online traces before the person notices it themselves.
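
To make the most harmless of these examples concrete: the word frequency analysis behind a tag cloud fits in a few lines. This is only a rough sketch of the idea, not how Twitter or anyone else actually does it; the function name, the tiny stopword list and the sample tweets are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real one would be much longer.
STOPWORDS = frozenset({"the", "and", "for", "are", "who", "with"})


def word_frequencies(tweets, top_n=5):
    """Count how often each word occurs across all tweets.

    Words are lowercased, stopwords and words shorter than three
    letters are dropped. Returns the top_n (word, count) pairs,
    most frequent first, which is what a tag cloud would scale by.
    """
    counts = Counter()
    for tweet in tweets:
        words = re.findall(r"[a-z']+", tweet.lower())
        counts.update(
            w for w in words if len(w) > 2 and w not in STOPWORDS
        )
    return counts.most_common(top_n)


# Made-up sample conversation around a hashtag.
sample_tweets = [
    "Big data is watching #bigdata",
    "data, data everywhere",
    "who owns my data?",
]
top_words = word_frequencies(sample_tweets)
print(top_words[0])
```

The point is that this kind of stage 2 analysis is trivial; whatever sits beyond it is the part we cannot see.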

But WHAT ELSE, for Christ’s sake? Given the endless stream of investments and power (electrical and political, hah) flowing into that sector, those can only be the low-hanging fruit, no?
I don’t pretend to know anything substantial on this matter myself at this point. That’s why I think it would be a great idea to find out together (what a motivating, inclusive and positive way to put it, isn’t it?). In previous posts I mentioned the notion “show me what you can do with my data” as an (educational) strategy to raise awareness among citizens. Now I’d like to put some meat on this notion.

So – how about a forum/wiki/stackoverflow-thingy that gathers scenarios and possibilities around “stage 3 analytically informed actions” in a robust, scientific and fact-checked way (meaning no blossoming of conspiracy theories for the emotionally hurt among us)? Anyone could post questions and scenarios, but they would only bubble up in the charts once they get the stamp of being “expert”-approved for technical correctness (that status could be gained through a trust-earning mechanism like the one on stackoverflow).
And THEN I’d suggest actually executing one or some of them in a controlled fashion! Maybe just once, maybe as a yearly vote that concludes with the execution of a scenario. Maybe with data that was collected semi-legally, or fully legally through participants who signed up for it in a “donate (parts of) your personal data for the cause” kind of way.
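
For the curious, the “expert-approved scenarios bubble up” part could be sketched roughly like this. Everything here is an assumption made up for illustration: the reputation threshold, the number of required approvals and all the names. It is not a description of how stackoverflow actually ranks anything.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical values: reputation needed to count as an "expert",
# and how many expert approvals a scenario needs to be stamped.
EXPERT_REPUTATION = 1000
REQUIRED_EXPERT_APPROVALS = 2


@dataclass
class Scenario:
    title: str
    upvotes: int = 0
    # Reputation scores of the users who vouched for technical correctness.
    approver_reputations: List[int] = field(default_factory=list)

    def is_expert_approved(self) -> bool:
        experts = [
            rep for rep in self.approver_reputations
            if rep >= EXPERT_REPUTATION
        ]
        return len(experts) >= REQUIRED_EXPERT_APPROVALS


def rank(scenarios: List[Scenario]) -> List[Scenario]:
    """Approved scenarios first, then by popularity within each group."""
    return sorted(
        scenarios,
        key=lambda s: (s.is_expert_approved(), s.upvotes),
        reverse=True,
    )


# Made-up entries: a popular but unvetted scenario and a vetted one.
chart = rank([
    Scenario("Mind-control via ads", upvotes=900,
             approver_reputations=[50, 120]),
    Scenario("Mass detour matchmaking", upvotes=40,
             approver_reputations=[1500, 2300, 10]),
])
print([s.title for s in chart])
```

The design choice worth arguing about is the sort key: popularity alone never beats the expert stamp, so a hysterical but well-upvoted scenario stays below a boring but vetted one.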

For instance, I might agree to have my social activity and my geo-information tracked for a month prior to the “showdown”. That showdown could then be a scenario where, for instance, sudden little deviations from a daily route are suggested to thousands of users at once with the purpose of bringing together as many potential new couples as possible (matched by personality and social analysis) by having them mysteriously meet at locations they were automatically directed to.
Ok, that is a rather romantic (or big-brother-fanning) idea and doesn’t nearly account for the potential of stage 3 automation. But as I said, I am short on stage 3 imagination myself… and well, this might be a fun start to the annual data automation tournament :)

I think if broadcast and executed well, it (not necessarily the strange example above) could have a big impact in terms of enriching the so-called “public debate” on the matter of “big data” – people would finally have realistic fuel for their imagination instead of being pushed around by everything between idealized pro-marketing and hysterical contra-paranoia.

I would prefer to see such mass-educational/explorative measures taken by governmental organizations: either by a national organization like the German “Federal Office for Information Security, BSI” or the “National Cyberdefence Centre, NCAZ”, or by an EU organization like the “European Cybercrime Centre, EC3” (I doubt the US would currently have the credibility to host such activities). But if not government, then grassroots: Anonymous, the Chaos Computer Club…?

