Science + Data Journalism = new/s/leak

On January 1st, we officially started to build our “Network of Searchable Leaks” or, in short: new/s/leak. Our goal is to bring together the latest research in language technology and data visualization to help journalists keep their heads above water when faced with a dataset like the famous Cablegate. The idea is to build a network of all actors (people, organizations, places) and show who does what, with whom, where, and when.

What sounds like magic is actually feasible using current research results: sceptics might want to look at the Network of the Day (in German), which will be the starting point for our new tool.
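
To make the idea a little more concrete, here is a minimal sketch of how such a network can be derived from documents. It is not the project's actual code: it assumes the entities have already been extracted by some language technology step, and the toy documents and the function name cooccurrence_network are made up for illustration. Two entities are connected whenever they are mentioned in the same document, and the edge weight counts how often that happens.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy input: each document is reduced to the set of entities
# (people, organizations, places) that an extraction step found in it.
documents = [
    {"Alice", "Embassy Berlin", "Berlin"},
    {"Alice", "Bob", "Berlin"},
    {"Bob", "Embassy Berlin"},
]


def cooccurrence_network(docs):
    """Count how many documents mention each pair of entities together."""
    edges = Counter()
    for entities in docs:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(entities), 2):
            edges[pair] += 1
    return edges


network = cooccurrence_network(documents)
for (a, b), weight in network.most_common():
    print(f"{a} -- {b}: {weight}")
```

A real tool would additionally need the entity extraction itself, timestamps for the timeline, and some filtering to keep the graph readable.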

At some point, we want to arrive at something resembling this sketch from our project proposal:

An early wireframe for our software


The first kickoff with all project players in one room happened on January 18 (after several internal kickoffs and the meeting at Datenlabor): we were all warmly welcomed and well-caffeinated guests of our visualization colleagues from the Interactive Graphics Systems Group at TU Darmstadt. We had lots of constructive discussions about journalists’ needs, search, visual data representations, and our project name (which was the only question we had to postpone).
The most important outcome is that we are on the right track:

Four TU Darmstadt computer science students (Lukas Raymann, Patrick Mell, Bettina Johanna Ballin and Nils Christopher Böschen) have already built a prototype as their software project. It shows a network of entities from the underlying documents, together with a timeline:


The first new/s/leak prototype

The screenshot offers a glimpse of something that could have helped the people who had to work double shifts to browse the 2 million records of the Cablegate leaks, if new/s/leak had been around at the time.

The next steps will bring more search functionality, dynamic changes in the network, and more data.


We made it!

Happy news: the VW Foundation has officially decided to fund our project with the working title DIVID-DJ: Data Extraction and Interactive Visualization of Unexplored Textual Datasets for Investigative Data-Driven Journalism.
We are one of eight projects funded as part of the initiative “Science and Data Journalism”. Our goal is to create a piece of software that visualizes the content of large text data collections, to help journalists work with data leaks.

The VW Foundation invited all project partners to a kickoff meeting at TU Dortmund, where all projects were introduced prior to the “Daten-Labor” conference of Netzwerk Recherche. The project funding will officially start in January 2016.

More details to come!