Associated Whistleblowing Press Interview
Below is an interview about the leaking process of the Associated Whistleblowing Press. The interview was conducted in two parts, the first on 11/24/2012 and the second on 11/26/2012. It took place over the Jabber instant messenger and has been edited from log form for readability, but the questions and responses are exactly as provided.
 Part One
M.C.: So I've looked at AWP some already but I was wondering if you could tell me a bit more about it and its goals. Specifically, what would the ideal outcome of the release of a set of documents be for AWP?
AWP: I think the main point of the AWP is the network structure. Even though we have an international focus, as in, we work on the Internet, we want to work with local communities. In this sense the ideal outcome would be a set of documents that could empower local actors to face issues in their communities. The point is that information on a local scale can be more powerful than on a global, maybe more abstract, level.
By all that I mean that the outcome has to be an improved public control of institutions, accountability for their actions and so on...
M.C.: ah, makes sense
AWP: Like for example, massive releases make an impact on public awareness, but if the release is small but contains relevant information, it will have a definite impact on a local level… And by local I'm referring to anything from national level to cities, towns, neighborhoods, organizations or events (a period of elections, for example).
And I guess the effect could be stronger the closer it gets to the micro level hehe. If you can prove fraud during an election it will have a strong effect, for example.
M.C.: huh, that's interesting. Most leaking sites seem to be on the global or national scale but I've noticed a few on the local level. Since you work with local communities, how do you determine what will be relevant on the local scale? And do you work with people in those specific communities during the pre- and post- release process?
AWP: As we think in terms of network, each local whistleblowing platform (as we call them) is a node in the network and we are prepared to give them a certain level of autonomy. We understand that they are the most qualified people to discern what is relevant or not. Obviously our relation must always be based on trust. Right now we are preparing a node in Spain. First of all it will be national-level, although we have discussed making smaller nodes for certain activist platforms. The work we will always have to do with them is set up a functional working group: get them involved in the process and start building a network of people who will participate in moving the project around in the media (traditional and non-traditional), managing any information we receive, analyzing and publishing. Post release is an interesting question and we have different ideas on the method. Our hope is to incentivize a collective response to any issues, depending on the scale of course. We have this idea of trying to bridge the gap between news and action.
"but I've noticed a few on the local level." Which ones do you mean?
M.C.: I don't know how active they are and I haven't been able to reach them yet but there seem to be several country specific leaking sites and a few for major cities. I saw most of these on LeakDirectory, which seems to be down right now
AWP: Oh ok thanks for the tip, we're looking to collaborate and speak with other groups and so on.
M.C.: But I remember several leaking sites in Canada, a few in the UK, one in Israel, and there were others too. Most were country specific but there were a few for major cities, the one that comes to mind first is BaltiLeaks. I'm not sure how active all of them are though
So generally do you or the local nodes initially receive leaked info? Or both?
AWP: Ok, will investigate. I think we know who is listing on LeakDirectory, we'll ask.
All info goes straight to our central servers and then we share the info for analysis. The people we trust the most in each node will help with verification and with organizing a team for analysis.
M.C.: ah, okay. And after you receive info what steps does it generally go through to prepare it for release?
AWP: We have a set of basic data management policies which will be found constantly in all our sites, before submission etc. Every node has to follow these guidelines even though they are essentially self-organized and relatively autonomous. Basically we state that we'll only accept material that serves to prove corruption and abuse, that is, no first-hand accounts or rumors. Afterwards we have to verify the material working with the nodes, considering possible forgery, use, reason or cost for faking information. We'll ask for external opinions if necessary. Then we have to redact certain information to avoid violating people's privacy, if they appear in the information and are unrelated for example. We also have to clean the files' possible metadata, for source protection etc. Then analysis, which ideally would be done in a hard news format: when, where, why and so on, and publishing through our media contacts. You have a full version of these policies on our site if you want to read them
 Part Two
M.C.: Your website mentions that you cross check documents and gives some of the things you consider (motivation for forgery) but is there any consideration not listed there? And generally what tools or techniques would you use to verify its accuracy?
AWP: You mean considerations strictly related to cross-checking?
M.C.: cross-checking or verification of documents' authenticity
AWP: There are both conceptual tools and electronic tools. Conceptual cross-checking means detecting the accuracy of the document regarding historical facts, actors and their behaviours. Although the central node of AWP provides guidance on this type of cross-checking, it is work to be done by the local working groups. Our feeling is that the local community is more prepared to detect, conceptually, whether a submitted material is accurate or not.
As for electronic cross-checking, there are tools and practices to detect the authenticity of scanned documents, photos and videos. This is work mostly to be performed by specialists in the field of forgery, who we are in contact with. But actually I consider that the most relevant and decisive task regarding cross-checking is performed in the conceptual checking, since there is a point where electronic forgery is almost perfect.
There are also some ways of cross-checking which are performed after the document is published.
M.C.: ah, what are those?
AWP: It happens when the party affected by the leak publicly stresses that the information is authentic, directly or indirectly.
M.C.: ah, right, like if you get a request by the organization it was allegedly leaked from to take it down?
AWP: Not only a request directly sent to us, but there are cases when organizations publicly admit that some of their classified information was leaked.
M.C.: Do you have any way the people looking at the document can tell if an organization has done this or if it got past electronic and/or conceptual cross-checking?
AWP: Sorry, can you re-structure the question?
M.C.: Sure. If someone is looking at a document, could they tell if an organization has said directly or indirectly that the information is authentic? Or if it got past your cross-checking?
AWP: We will only release documents which passed our standards of cross-checking. So if the document is published by us, it will be implicit that it passed our cross-checking. As for confirmation from the organization where the document was leaked from, we shall point out this confirmation in the same 'release page' of the document. Documents will have a 'front page' where this kind of information will be displayed: the 'cross-checking seal' of the AWP and possible information about organizations' confirmation of authenticity.
M.C.: ok, great. So how do you clean up metadata from different types of documents? What tools do you use to do this and do you have to do parts of this by hand?
AWP: This kind of information we are not able to disclose publicly since it could facilitate forgery. But there are several methods of 'metadata cleaning'.
Most of it is indeed manually done.
M.C.: ah, okay. I was wondering if it was automated if there might even be a way to do it before the document is sent
AWP: There are tools available on the Internet which could be used by potential whistle-blowers, indeed
That is also an idea that we could implement
M.C.: You mention a two-part analysis step on your website. How long does each part of this step take? And how much does it vary between different types of releases? Also, is this done primarily by the local groups or AWP?
AWP: Part 1 strictly depends on the kind of batch. It takes longer the bigger the batch is, since it consists of classifying and tracking the batch. Part 2 is about news-making and should take the 'standard' time of writing a journalistic piece. Some journalists take months preparing an investigative piece, others can take 1 or 2 days. What we are going to publish in our newsroom, http://whistle.is, will be 'hard news', which usually takes no more than 2 days per subject.
As for how long media partners will take to write a piece, we sincerely can't tell.
We are studying ways of providing crowdsourcing tools where journalists and public can investigate and publish together on big batches.
M.C.: ah, have you found anything useful in crowdsourcing so far? I've been wondering about how viable that is for a while
Also, is it mostly the media partners who do the second part of the analysis step?
AWP: We are going to launch a tool for crowdsourced Cablegate analysis and newsmaking this week. We plan to have it as a case study which could be applied to other databases and batches in the future. Yes, tools are still to be created, since all this is really new in terms of historicity.
- don't publish it before 28 Nov 2012 *** http://cablegate.awp.is
M.C.: oh wow, that looks interesting
AWP: we are still fixing minor bugs, text and presentation as well as increasing the database. but that is the major idea.
M.C.: I'll definitely be keeping an eye on that to see how it goes. The main other similar site I have seen is Crowdleaks but that looks like it is also under development
AWP: Yes, unfortunately. I never saw Crowdleaks really in activity.
- you might find it interesting as well. this is the graph tool we will have attached to each cable in the cablegate tool. this graphic is the graphic for 08beijing3055 for example: http://126.96.36.199/cab/08BEIJING3055.gml.html
- last link must be opened with Opera browser or latest firefox*
M.C.: Do you ever remove/redact names or other information from documents?
AWP: Yes, we redact all information regarding third parties who are not directly involved in the wrongdoing to be reported in the document.
M.C.: Is that done manually when you first read through the document in the first part of the analysis step?
AWP: Yes, exactly. Depending on the batch, it can be done in the second part as well.
M.C.: So, just to be sure I have this right, the steps you typically go through after receiving a document are conceptual cross-checking, electronic cross-checking, removal of metadata, read-through for large sets of documents, analysis of documents and redaction in this step or the previous one, and publication on the news wire? And most of these steps with the exception of electronic cross-checking and potentially some metadata removal are done by hand?
Oh, also, do you use any system for tracking which documents have been read or analyzed in the analysis step?
AWP: Yes, I think you understood correctly. But there are cases and cases. It does not mean that we are going to analyse each file separately in each batch. For example, say we receive 10 GB of e-mails from a certain organization. We won't cross-check all of them, we will cross-check the database as a whole.
M.C.: that makes sense. But doesn't removal of names get tricky there because someone not related to the wrongdoing could easily be mentioned in such a large database of emails?
AWP: Systems for cross-checking/analysis tracking are also still to be created and should fit specific need according to each batch or database they cover.
M.C.: ah, ok. Could you go through what features might be most useful for a few different types of batches or databases?
AWP: It's complicated. I could send a mail with a list :-)
But basically it should cover all the types of fields, be searchable through these fields, have commenting features and selection of different 'status modes' for each single file. Also, the ability to cluster files into pre-determined rules (by tags, keywords, field types, size, type of file etc).
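As a rough illustration of the feature list above (searchable fields, comments, per-file status modes, and clustering by rule), a tracking tool might be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the class names, the status values, and the example field names are all hypothetical, and AWP did not describe any implementation in the interview.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    # Hypothetical status modes; the interview does not name any.
    UNREAD = "unread"
    IN_REVIEW = "in_review"
    VERIFIED = "verified"
    REJECTED = "rejected"


@dataclass
class FileRecord:
    # One record per file in a batch. All attribute names are assumptions.
    name: str
    file_type: str
    size_bytes: int
    fields: dict = field(default_factory=dict)    # e.g. sender, date, subject
    tags: set = field(default_factory=set)
    status: Status = Status.UNREAD
    comments: list = field(default_factory=list)


class BatchTracker:
    """Keeps track of which files in a batch have been read or analyzed."""

    def __init__(self):
        self.records = []

    def add(self, record):
        self.records.append(record)

    def search(self, field_name, needle):
        """Case-insensitive substring search within a single field."""
        needle = needle.lower()
        return [r for r in self.records
                if needle in str(r.fields.get(field_name, "")).lower()]

    def set_status(self, name, status, comment=None):
        """Change the status of one file and optionally attach a comment."""
        for r in self.records:
            if r.name == name:
                r.status = status
                if comment:
                    r.comments.append(comment)
                return r
        raise KeyError(name)

    def cluster(self, rule):
        """Group file names by a rule.

        `rule` is a function of a record that returns either one key
        (such as a file type) or a collection of keys (such as tags).
        """
        groups = defaultdict(list)
        for r in self.records:
            keys = rule(r)
            if not isinstance(keys, (set, frozenset, list, tuple)):
                keys = [keys]
            for key in keys:
                groups[key].append(r.name)
        return dict(groups)
```

With a tracker like this, `tracker.cluster(lambda r: r.file_type)` would group a batch by file type and `tracker.cluster(lambda r: r.tags)` by tag, while `tracker.search("sender", "alice")` would look inside a single field. Passing the clustering rule in as a function is one way to let the rules vary per batch, which is the need AWP describes.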
M.C.: ah, okay. If you get a chance an email with a list could be useful at some point
also, since you mentioned large databases of emails, what if some of the emails pass your moral policies but others do not or are not significant? Would all of the emails be published or just some of them?
AWP: Just some of them.
Although electronic (and in some cases conceptual) cross-checking can be done massively, documents are considered individually when it comes to moral policies.
M.C.: ah, ok
M.C.: How would you define leaking and whistleblowing? Do you think of them as synonyms or different things? And how does AWP fit into that definition?
AWP: Leaking and whistle-blowing originally are not the same thing. Recently, tho, these practices are getting closer to each other when it comes to public opinion. I wrote something here that could be useful to understand: http://whistle.is/?p=283 . Originally leaking should be simply 'the activity of making secret information publicly accessible' and whistle-blowing the 'ethical leak', that is, leaking with the purpose of revealing wrongdoing, crimes, corruption etc.
AWP is a media agency focused on 'whistle-blowing'.
M.C.: thanks! So I know AWP is fairly new. Have any of your local nodes released documents yet?
AWP: We currently have only 1 node, although the plans are to expand it during the next 2 months. This node, the Icelandic one, created 57 days ago, has not released files yet.
M.C.: ah, okay. I'm interested to hear how it goes when it does. Are there any tools that you don't have or don't exist now that would be helpful for you in processing documents? What would these be? And you mentioned some tools for metadata cleaning and cross-checking- could you tell me if these are free and open source or not? Also, is your crowd sourcing platform free and open source?
AWP: Our crowd sourcing platform for the Cablegate files is free and open source. As for the rest, can I think about it calmly and send it to you by e-mail? Do you have a PGP/GPG key?
M.C.: Sure. And http://pgp.mit.edu:11371/pks/lookup?search=Shidash&op=index the top one there is my key (7F063537)
And I think that is all my questions unless you have anything you want to add or any questions for me
AWP: I think that is ok for now
thanks for getting in touch :-)