analytically informed automated actions… but WHAT?

Recently I attended a lecture by Yvonne Hofstetter in which she talked about three stages of data fusion (= what can be “done” with big data): the aggregation and storage (1), the analysis (2) and finally the automated action informed by the analysis (3). Without going any further into her view of things, I’d like to use this trisection going forward.

It seems rather obvious from the public debate that “we” have hardly the slightest idea of what’s going on behind the curtains of gigantic data centers when ridiculously sophisticated algorithms take the data of millions or billions of users between their teeth and crunch them to bits and pieces (some nice drama in that sentence). We only know that stage 1 is indeed happening on a grand scale and we safely assume that stage 2 is going on… but what kind of analysis and what kind of “stage 3 actions” are being done, could be done or will be done? So damn clueless we are. Ok, there is “personalized advertising”; Gmail delights you with ads matching words in a thread and Amazon shows you stuff you might be interested in (or stuff you JUST bought and therefore want to buy again right away?? wtf!) through its myriad shop windows (= ad spots on other sites). We see tag clouds (word frequency analysis) of hashtagged Twitter conversations and topic-mood mappings of the latter as a result of scanning tweets for emotions, and we hear about impressive (or irritating) successes in predicting pregnancy or sickness from online traces before the person notices it themselves.

But WHAT ELSE for Christ’s sake? Given the endless stream of investments and power (electrical and political, hah) flowing into that sector, those can only be the low-hanging fruit, no?
I don’t pretend to know anything substantial on this matter myself at this point. That’s why I think it would be a great idea to find out together (what a motivating, inclusive and positive way to put it, isn’t it). In previous posts I mentioned the notion “show me what you can do with my data” as an (educational) strategy to raise awareness among citizens. Now I’d like to put some meat on this notion.

So – how about a forum/wiki/stackoverflow-thingy that gathers scenarios and possibilities around “stage 3 analytically informed actions” in a robust, scientific and fact-checked way (meaning no blossoming of conspiracy-theories for the emotionally hurt among us)? Anyone could post questions and scenarios, but they are likely to bubble up in the charts only when they get the stamp of being “expert”-approved for technical correctness (that status could be gained through a trust-earning mechanism like on stackoverflow).
And THEN I’d suggest actually executing one/some of them in a controlled fashion! Maybe just once, maybe a yearly vote that concludes with the execution of a scenario. Maybe with data that was collected semi-legally or fully legally through participants who signed up for it in a “donate (parts of) your personal data for the cause” kind of way.

For instance I might agree to have my social activity and my geo-information tracked for a month prior to “showdown”. That showdown could then be a scenario in which sudden little deviations from a daily route are suggested to thousands of users at once, with the purpose of bringing as many potential new couples (the matching of which was determined by personality and social analysis) together by having them mysteriously meet at locations they were automatically directed towards.
Ok, that is a rather romantic, or rather big-brother-fanning, idea and does not nearly account for the potential of stage 3 automation. But as I said I am short of stage3-imagination myself… and well, this might be a fun start of the annual data automation tournament :)

I think if broadcast and executed well it (not necessarily the strange example above) could have a big impact in terms of enriching the so-called “public debate” on the matter of “big data” – people would finally have realistic fuel for their imagination instead of being pushed around by everything between idealized pro-marketing and hysterical contra-paranoia.

I would prefer to see such mass-educational/explorative measures taken by governmental organizations: either by a national organization like the German “Federal Office for Information Security, BSI” or the “National Cyberdefence Centre, NCAZ”, or by an EU organization like the “European Cybercrime Centre, EC3” (I doubt the US would currently have the credibility to host such activities). But if not government, then grassroots: Anonymous, the Chaos Computer Club…?

Tabs i currently have open on that topic:

EU-regulations are long-term upgrades

Lately I can’t help but constantly see analogies between programming, politics and law.
Recently I was in a situation that many programmers find themselves in from time to time. I had to upgrade a system gradually, meaning the old version had to stay functional while I was building the new version underneath it. Then, one piece at a time, I could switch off the old parts and have their function taken over by the new code – if something breaks I can quickly locate the source of the error because I go step by step. Much, much safer than trying to jump straight from old to new.

As you can imagine, during the transition the volume of code is twice (or even more, because of temporary transition helper methods) what it was in the old version and what it will be in the new version. And depending on how much the concepts of the two versions differ, you might have fundamentally different logics operating side by side. For me this feels like building up to a complexity that can be absolutely exhausting and error-prone. A bit like having a fever while having to concentrate really hard.
So it is SUCH a relief when I can finally comment out the old code piece by piece and eventually erase all traces of it. It’s like a healing process for the code that I was the doctor for! :) I know it is now set up in a more elegant, robust and future-proof way and this feels really really good.
Maybe the emotional (and functional) curve is comparable to having a difficult conversation with your partner. While it happens it might feel overloaded and heated – but once you settle on new common ground it feels (hopefully) like a meaningful relationship-upgrade.
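
To make the upgrade process described above a bit more tangible, here is a minimal sketch in Python. It is not the system I actually upgraded; all names (`GradualUpgrade`, `total_old`, `total_new`) are made up for illustration, and I am assuming the simplest possible setup where each function can be switched over on its own.

```python
from typing import Callable, Dict, Sequence, Set


def total_old(prices: Sequence[float]) -> float:
    """Hypothetical legacy code that has to keep working during the transition."""
    total = 0.0
    for price in prices:
        total += price
    return total


def total_new(prices: Sequence[float]) -> float:
    """Hypothetical replacement, built underneath the old version."""
    return float(sum(prices))


class GradualUpgrade:
    """Keeps old and new implementations side by side and switches one piece at a time."""

    def __init__(self) -> None:
        self._old: Dict[str, Callable] = {}
        self._new: Dict[str, Callable] = {}
        self._switched: Set[str] = set()

    def register(self, name: str, old: Callable, new: Callable) -> None:
        self._old[name] = old
        self._new[name] = new

    def switch_to_new(self, name: str) -> None:
        self._switched.add(name)

    def roll_back(self, name: str) -> None:
        # Only possible while the old code has not been erased yet.
        if name not in self._old:
            raise RuntimeError(f"old implementation of {name!r} is already retired")
        self._switched.discard(name)

    def call(self, name: str, *args):
        impl = self._new[name] if name in self._switched else self._old[name]
        return impl(*args)

    def retire_old(self, name: str) -> None:
        # The relieving part: erase the old code once the new one has taken over.
        if name not in self._switched:
            raise RuntimeError(f"{name!r} has not been switched to the new code yet")
        del self._old[name]


upgrade = GradualUpgrade()
upgrade.register("total", total_old, total_new)

before = upgrade.call("total", [1, 2, 3])  # still served by the old code
upgrade.switch_to_new("total")
after = upgrade.call("total", [1, 2, 3])   # now served by the new code
assert before == after                     # same behavior, so it is safe to continue
upgrade.retire_old("total")
```

While both dictionaries are filled, the code volume really is doubled, and the registry itself is one of those temporary transition helpers that disappear at the end.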

Ok, where am I going with this?

I see the described dynamic as parallel to what is happening in many European minds and anti-EU parties regarding yet another highly granular “regulation from Brussels” that, in their view, should be solely a matter of “national regulation” instead.
I don’t think that way. On the contrary, I am so happy about every little piece of regulation that comes from Brussels because it reminds me so much of my code-upgrading process. I can’t wait to experience how things shift to more elegance and higher integration as the old “code” breaks away while the new is being built.
Yuval Harari writes in his excellent book “Sapiens: A Brief History of Humankind” that the one trend in human history that is evident over the centuries is the trend towards unification of humanly created cultural structures. I totally support the idea of a strong Europe and see the regulation-pain as part of an upgrading process.

Happy voting tomorrow :)

it’s boring to be predictable

While I understand and partly agree with the critical voices regarding ever-increasing data collection, I want to add a perspective that I haven’t heard in that discussion so far. It might sound naive or even awkward in that context; nevertheless I think it deserves a place.

Let me start like this; what are some possible counter-measures if someone knows something about you and is about to benefit from that information or cause you damage? For instance someone knows about your affair and you are afraid your partner might hear it from them.

1) try to silence the information-holder by begging, offering something in return or by threatening
2) expect the breach of this information and
2a) disseminate a scenario in which upon disclosure of the information it becomes part of a story that reduces the harm or even flips things around
2b) discredit the messenger for
2ba) the very reason of harming you in this way
2bb) any other reason
3) “come out” and share the sensitive information yourself in order to get ahead of the information-holder, so that the disclosure and its context are controllable

Afterwards there are different ways to deal with what happened.

1) be angry and plan ways to harm the messenger in return
2) deny the truth of the revelation and live with that lie
3) find and reveal proof that the information-holder acted with bad intentions when exposing you and that his action might therefore deserve even more condemnation than your secret
4) accept what happened and integrate it into your personality

Ok, these lists are definitely incomplete, and relating them to data aggregation mechanisms online may well be a different story. You can hardly blame an individual with bad intentions against you in particular on the other end. It’s more about rigorous business plans and harvesting from the masses. Nor do you often have the slightest idea what “they” are looking for and what “they” could use it for.

What I find interesting though is point 3 of the counter-measures in combination with point 4 of the ways to deal with it.
Am I totally off or overly romantic in assuming that an increase in observation/tracking CAN also cause humans to develop more “transparent personalities”? Meaning less secretive, more likely to admit mistakes, to reveal intentions and to face their own ballast in terms of trauma, loss or disease.
The scenario here would be: “Well, if AI-algorithms in some dozen data-centers around the world concluded that my shopping-behavior is likely to be influenced by a loss I suffered in early childhood, then I could just as well stop hiding that painful fact from my friends!?”
Or to say it in an image: the stuff we don’t like anyone to see suggests itself to be pictured as some dark sea behind us. Data collection mechanisms use razor-sharp math to find patterns in your traces – they shoot myriad laser-beams into your dark sea. You could rightfully see this as intrusion and be furious about it – but you could also use this as a trigger to “light up the darkness” (oh, so cheesy) of your sea and maybe dry some of it out?

But again, it might be naive to even talk about character development as an effect of algorithmic intrusion into your life. After all we really can’t tell what the mathematical masterminds (meaning both humans and self-evolving AI) have built, are building and will build to harness the data of billions of people and systems.

A different angle, but also targeting character development. I personally think it is boring to be predictable. I remember scenes in my past where I was either being told that this or that of my thoughts and actions was easily predictable, or I noticed that it was – and I just don’t like it. I don’t like to be swallowed entirely by someone else’s intellect; it provokes me and eventually stimulates my sense of competition and of getting better at something. I would imagine many people “have this symptom” to different degrees? In that way, being confronted with one’s own predictability can be an engine for development, no? The scenario here would be: “Oh well, if I am just this consumer-sheep that is all too easy to target with ads, then I’d better upgrade my preferences or complicate some of my consumer-patterns – let’s see if those algorithms still ‘get me right’”. But even if that kind of competition weren’t stimulated – I would hope being confronted with one’s own predictability is at least an engine for reflection.
So, one government-strategy of ‘maturing your citizens’ could be portals aka “show me what you can do with my data” or “who am I under the lens of your data”?

powerful context = the right data structure

Very much enjoyed how Alan Kay (probably in 1987) builds up in this talk, with a nice example, towards a context that does most of the thinking for you, and then compares it to choosing the right data structure, which helps compute results just through its inherent structure.

I transcribed the three crucial sentences starting at 3:55:

If you want to be good at solving a problem and acting much smarter than you are, then you have to find your context so it’s gonna do most of the thinking for you.
Most computer scientists know this because it goes under another heading called “choose the appropriate data structure before you start tinkering around with the algorithm”. Find the right data structure, it will have most of the results computed almost automatically as part of its inherent structure.
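
A small illustration of that idea, as I understand it (my own toy example in Python, not something from the talk): grouping anagrams. The function name `group_anagrams` and the word list are assumptions made up for this sketch. Once each word is filed in a dictionary under its sorted letters, the groups are simply there; no word ever has to be compared with another word.

```python
from collections import defaultdict
from typing import Dict, List


def group_anagrams(words: List[str]) -> List[List[str]]:
    """Return the groups of words that are anagrams of each other."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for word in words:
        # The key is the word's letters in sorted order, so all anagrams share a key.
        key = "".join(sorted(word.lower()))
        groups[key].append(word)
    # The "result" is already sitting in the structure; only filter out loners.
    return [group for group in groups.values() if len(group) > 1]


anagram_groups = group_anagrams(["listen", "silent", "enlist", "google", "gogole", "cat"])
```

The algorithm part shrinks to a single loop because the dictionary key carries the whole notion of “being an anagram”.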


– – Addition Feb 22, 2014

Also enjoyed this statement from Michael Blaha in this interview about UML for Database Design:

Often the most difficult aspect of software development is abstracting a problem and thinking about it clearly — that is the purpose of conceptual data modeling.
A conceptual model lets developers think deeply about a system, understand its core essence, and then choose a proper representation. A sound data model is extensible, understandable, ready to implement, less prone to errors, and usually performs well without special tuning effort.

caring for the quality of neuronal wiring

Recently I had a dream that spanned a whole morning through several wake-up/half-awake/fall-asleep iterations. It was about trying to understand an algorithm; some sort of long-winded structure that describes a mechanical process. I can’t quite remember. Anyhow, what I do remember was my ongoing attempt to grasp what this thing does as a whole. I remember sensations of understanding parts of it but failing to comprehend how the parts interact. I remember sensations of “comprehension-pockets” blurring into confusion again when trying to focus on the next higher level of functionality. Interestingly I showed more persistence in this ongoing mental scenery than I probably would have demonstrated when facing a challenge of that type while awake. There was no frustration building up – it was more like a stoic inquiry into the right sequence of understanding parts that would lead to understanding the whole…

I’d like to use the incident of this dream to share some thoughts I am having as a result of the last few years in my biography that are best described as follows: running from academia, being immersed in “alternative” (experience-based) education and since winter 2012 being back on track with computer science and mathematics in an academic context.
At this point I can best follow through with this intention by posting a revised version of an email I sent to the Viewpoints Research Institute in Los Angeles a few months ago (as a motivational letter explaining what attracts me to them for my internship in summer 2014 – I ended up choosing an internship at the LRZ but would be very happy to have the chance at some point to join forces with the VPRI, in particular to work with (=learn from) Alan Kay).

I’ve been fascinated with math, programming and algorithms since I saw the Mandelbrot set for the first time as a result of my own code in the middle of some night during my time at secondary school. […] As a result of the past years I have become very sensitive to what real content created through deep original thinking is, in comparison to shiny packaging (which does have its place though) of the work of someone else or even just the repeating of phrases within a social club.

I am too young to have experienced the “early days of computation” myself – but I am eager to comprehend this storyline not just in names and pictures (that too), but also by really understanding what kind of possibility spaces were unlocked through the contributions of the various people thinking in this sphere over the last decades and how they were building on the shoulders of thinking from the last centuries and so forth. There is more to computation than ever faster processing units and ever more satisfying customer-experiences. And I have a feeling at least the past two decades have mostly focused on that business/excitement-part of computation. However, I am not bold enough to claim to know what that untapped potential is all about.

The second point is the focus on learning. I am a very visual thinker and need to “see” the patterns and dynamics of things to tie new knowledge into my brain. It makes me curious what can possibly be misunderstood about a particular problem, because understanding possible gaps or “bad” entry-points requires comprehending “the space around” a particular problem. Like Alan pointed out in his presentation about conveying Pythagoras’ theorem: there are many different conceptual approaches to explaining it – but each of them has a different value attached as to how well it prepares the brain on the receiving end for similar problems in the future. Most approaches can help understand this particular problem for the next exam – but might very well cause confusion later on when other problems are explained with a very different conceptual strategy. On the other hand, an approach can set a “neuronal base” for thousands of other problems to come in theoretical and very practical situations that a person might encounter throughout a life-time!

I’d dare say that caring about the quality of the knowledge-graph in the brains of people is the most important investment we can make as a society. And it is also the most expensive investment – because just to be open to (wanting to) hear anything about Pythagoras’ theorem you need a stable infrastructure surrounding you that allows for peace of mind in terms of covering basic and social needs. It cannot be overestimated how much an open, curious and peaceful mind is worth in very real costs when seen statistically across a society. However, as a result of that rich infrastructure it becomes increasingly important among those who benefit from it to deal with the various distracting elements (entertainment, status…) long enough for new knowledge to get anchored. A brain after learning something is not the same as it was before. The neurons are physically restructured. It is that focus on salience in patterns that sets apart the noise from the trajectory into the future. Not in a sci-fi / trans-humanistic / futuristic / singularity / whatever-way, but in a very real causal way.

teaching-materials based on what you already know

I am learning C++ and JavaScript right now. I already know Java to some degree. I would love a thoroughly curated tutorial/explanation-site that offers me teaching materials based on what i already know.

Not necessarily restricted to programming languages. First step would be to find out what the person knows through direct checkboxing or through test-questions. Next step is to find out what the person wants to learn. Then search for tutorials, teaching materials, analogies, stories etc. to match that specific mapping of  “already-know to want-to-learn”.
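
A minimal sketch of how such a lookup could work, assuming a tiny hand-made catalog. The class `Material`, the function `find_materials` and all catalog entries are hypothetical; no such site or database exists yet.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Material:
    title: str   # hypothetical tutorial title
    known: str   # what the learner is assumed to know already
    target: str  # what the material teaches


def find_materials(catalog: Iterable[Material], known: Iterable[str], target: str) -> List[Material]:
    """Return materials that teach `target` starting from something the person already knows."""
    known_set = set(known)
    return [m for m in catalog if m.target == target and m.known in known_set]


# Made-up example entries, only to show the mapping "already-know to want-to-learn".
catalog = [
    Material("Pointers for people used to Java references", known="Java", target="C++"),
    Material("Prototypes explained via Java classes", known="Java", target="JavaScript"),
    Material("C++ for Python programmers", known="Python", target="C++"),
]

matches = find_materials(catalog, known=["Java"], target="C++")
```

The checkbox or test-question step would fill the `known` list, and the “want to learn” step would supply `target`.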

Of course such a mapping-database has to be built first. So i’d start with a large survey where people submit their mappings. Then there would come a phase of identifying people who can create (or select / adapt existing) teaching materials for all the categories. So an experienced C++ and Java programmer could sit down and think about what kind of snippets, analogies or tutorials could help someone who already knows Java and wants to learn C++…

Anyone who has gone through learning material can tag it with other potentially relevant mappings (“escalate the relevancy of the content vertically and/or horizontally”) à la “i think this material could also benefit people who want to transition from a meat-based diet to a vegetarian diet” (unlikely that this example fits together with the previous Java > C++ example… but who knows).

Playlists could be assembled à la “this really helped me to get to B based on the A that i already knew – now i can easily get from B to C and i recommend this order of learning instead of trying to get straight from A to C”.
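
The playlist idea can be sketched as a path search over recommended learning steps. This is only one possible reading: i assume here that every recommendation “X helped me get to Y” is stored as an edge, and that a breadth-first search over those edges finds the order with the fewest steps. The function `learning_path` and the topic names are made up.

```python
from collections import deque
from typing import Dict, List, Optional


def learning_path(steps: Dict[str, List[str]], known: List[str], target: str) -> Optional[List[str]]:
    """Breadth-first search from what the learner knows to what they want to learn."""
    queue = deque((topic, [topic]) for topic in known)
    seen = set(known)
    while queue:
        topic, path = queue.popleft()
        if topic == target:
            return path
        for next_topic in steps.get(topic, []):
            if next_topic not in seen:
                seen.add(next_topic)
                queue.append((next_topic, path + [next_topic]))
    return None  # nobody has recommended a route yet


# Hypothetical recommendations collected from learners: A -> B and B -> C, but no A -> C.
recommended_steps = {
    "Java": ["C++ basics", "JavaScript basics"],
    "C++ basics": ["C++ templates"],
}

playlist = learning_path(recommended_steps, known=["Java"], target="C++ templates")
```

Because nobody recommended jumping straight from Java to templates, the search naturally returns the detour via the basics.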

Of course also some gamification with rewards and stuff that incentivises both the creators of teaching materials as well as the learners.

I think if you can tie the desired learning very specifically to things your student already knows you can get very efficient in building new territory in the knowledge-graph of the student’s mind. Somewhat like a personal master-student setup but on a large internet-database kind of scale?

[Addition Dec 29, 2013] I would like to add a critical voice to this idea. “Sideloading” knowledge by specifically building on what you already know is great for quick access – for hacks. Nevertheless there is a strong point to be made for knowing something from the ground up and learning the language and metaphors of a domain independently of the access points it may have to other domains…

memorization tokens

One of these situations that makes me stop and smile about myself and reflect about patterns, habits and just how i (and possibly everyone else) work… and it’s time for a fresh post here – so let me share the story with you.

Just when i switched off my computer at work i remembered something that i wanted to do online. Ok, i’ll do it first thing at home. But how to remember? Yeah, i could send myself an email from my iPhone as i have done thousands of times before. But hold on – actually there is a little physical thing on my desk that will most definitely help me remember once i retrieve it from my trouser pocket or backpack at home. But what now – pocket or backpack? Which option is more likely to yield the desired effect; namely the token “falling into my hands” at home without having to remember that it is there? How can i outsmart my future forgetful self, basically. I choose the pocket – knowing that i might not have reason to check my pocket again right this evening, but surely at some point throughout the coming days. At the latest before the next washing, because that (pocket-checking of trousers before putting them in the washing machine) is a routine i can (almost) certainly trust my future self to follow. So i think to myself: the future-reminder will definitely not be lost, it will trigger at the very latest in about a week from now. Which i deem acceptable as the matter isn’t that urgent.

Alright then, i bike home along the Isar river through a rainy but still lovely Munich. My thoughts wander around and the memorization in question here is quickly lost. UNTIL…

I come home, unlock the door… and have a funny moment of thinking where to put the key before taking off my shoes. Because apparently i learned from my forgetfulness of the previous weeks, where i repeatedly left the key after entering the apartment either on the entrance-shelf or on the kitchen-shelf. After taking off my shoes i would forget that i put the key there and it would stay there until the next morning when i would have a short freak-out moment searching for my keys when packing my backpack for the day. Sooo, i decide to put the keys in my pocket… and GUESS WHAT i discover in my pocket when putting the keys in! Yep, the memorization token, right there!

To sum up: the result of a learning-loop regarding key-placing-forgetfulness led to the immediate (instead of max. one week until the washing machine) retrieval of the memorization token. And as a result to the online action that the token was “charged with”.

Cool, check, done, NICE! Then i went for a run along the Isar and found myself continuously amused by this story and thinking about how clever it would be to analyze one’s daily workflows/routines with the intention to identify these “pockets” (now meant abstractly, not (necessarily) the trouser pocket). Pockets where memorization tokens can be placed physically (or virtually) because of an action that you will reliably perform in the future within a certain timeframe and that’ll make you “automatically” retrieve that very token again… strategically capitalizing on your reliable patterns to memorize things in a way where you can forget them with the comfort of knowing that they WILL find you again.
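
Just for fun, the pocket-or-backpack decision can be written down as a tiny selection rule. This is a sketch under my own assumptions: every “pocket” is described by the longest time until the routine that empties it is sure to happen, and the numbers below are guesses, not measurements. The names `Pocket` and `choose_pocket` are made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Pocket:
    name: str
    routine: str               # the reliable action that makes the token find me again
    max_days_until_checked: float


def choose_pocket(pockets: List[Pocket], deadline_days: float) -> Optional[Pocket]:
    """Pick the pocket whose routine is guaranteed to trigger soonest within the deadline."""
    in_time = [p for p in pockets if p.max_days_until_checked <= deadline_days]
    if not in_time:
        return None  # no routine is reliable enough, better send that email after all
    return min(in_time, key=lambda p: p.max_days_until_checked)


# Guessed worst-case intervals for the two options from the story.
pockets = [
    Pocket("trouser pocket", "emptying pockets before the washing machine", 7),
    Pocket("backpack", "cleaning out the backpack", 30),
]

relaxed = choose_pocket(pockets, deadline_days=7)  # the matter isn't that urgent
urgent = choose_pocket(pockets, deadline_days=1)   # nothing reliable within a day
```

Routines that fire even earlier than their worst case, like the key landing in the same pocket that very evening, are a bonus that this rule does not model.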