What do you get if you cross anti-corruption activists with mathematicians? It’s no joke. This was exactly what we did in Cape Coast, Ghana, recently, bringing these two groups together to analyse procurement data for evidence of corruption ‘red flags’ in a two-day hackathon at AIMS Ghana.
By Liz Dávid-Barrett (University of Sussex) and Mihály Fazekas (University of Cambridge).
Follow us on Twitter @corruption_red
The anti-corruption world has a tendency to pin its hopes on transparency as a solution, and recently in particular on the idea that big data will revolutionise the fight against corruption. The logic is compelling. Whereas corruption measurement has long relied heavily on subjective perceptions, big data allows for more objective evidence about how – and how much – administrative procedures are subverted and manipulated.
But some profound obstacles remain. The most obvious is that, before data can be analysed, it needs to be collected. Yet many governments – especially in developing countries – struggle to collect relevant data of a reasonable quality.
Another problem is that we need people to analyse data, and they need to know what they are looking for. That means knowing the local context and the favoured tricks for conducting and covering up corrupt behaviour.
Our aim is to bridge these two gaps. In our research, funded by the British Academy/DFID Anti-Corruption Evidence programme, we collect and analyse data about public procurement for corruption risks. We also collaborate with the African Maths Initiative, who have developed a free user-friendly software platform for statistical analysis. The software makes it easier for groups to analyse our data using our method, and is an important step in turning transparency into accountability.
We look for red flags such as tenders being advertised only for a very short time (which benefits insiders who had prior knowledge), non-competitive procedures being used (often invoked on spurious grounds), or supposedly open competitions that attract only one bidder. While none of these is necessarily evidence of corruption, analysing big datasets allows us to spot patterns which – at the very least – flag up areas that warrant further investigation.
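The screening logic described above can be sketched in a few lines of code. This is a minimal illustration using a toy contracts table, not the authors' actual schema or thresholds: the field names, the seven-day advertising cutoff, and the example records are all hypothetical.

```python
# Toy red-flag screening over procurement contracts.
# Field names, thresholds and records are illustrative assumptions.
from datetime import date

contracts = [
    {"id": 1, "advertised": date(2019, 1, 1), "deadline": date(2019, 1, 4),
     "procedure": "open", "bidders": 1},
    {"id": 2, "advertised": date(2019, 2, 1), "deadline": date(2019, 3, 1),
     "procedure": "open", "bidders": 5},
    {"id": 3, "advertised": date(2019, 3, 1), "deadline": date(2019, 3, 20),
     "procedure": "direct", "bidders": 1},
]

def red_flags(c):
    """Return the list of red flags raised by a single contract."""
    flags = []
    if (c["deadline"] - c["advertised"]).days < 7:      # very short advertising window
        flags.append("short_ad_period")
    if c["procedure"] != "open":                        # non-competitive procedure
        flags.append("non_competitive")
    if c["procedure"] == "open" and c["bidders"] == 1:  # open call, single bidder
        flags.append("single_bidder")
    return flags

for c in contracts:
    print(c["id"], red_flags(c))
```

Each flag on its own proves nothing; the point is that, run over hundreds of thousands of contracts, patterns in where the flags cluster become visible.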
We initially focused on development aid spent through national procurement systems, compiling a database of more than 500,000 contracts from more than 100 countries, covering almost 20 years. We share these data openly and have used them to analyse how local regime type and changes in procurement rules influence corruption patterns.
We then worked with the African Data Initiative to develop a procurement-specific menu on their open-source statistics software, R-Instat. They originally developed R-Instat to address the need for greater access to statistical analysis tools in Africa, and have used the tool for a number of other development goals, most prominently helping farmers to analyse and act upon climate data.
The procurement menu in R-Instat gives users simple, drop-down ways to analyse the data (for video guides, see here, here and here). We have uploaded two datasets of World Bank-funded contracts spanning a wide range of developing countries and many years. You can choose any one of our ‘red flags’ and easily see its prevalence in a particular country over time, or compare countries or sectors in the same year.
Visualisations are important in data analysis, so R-Instat also includes many graph and map options as well as tables. In our Ghana workshop, we brought the data, the method and the software, and added two more key ingredients: anti-corruption activists from Ghanaian NGOs, who know all about how politicians and public officials manipulate the procurement process to divert public money into their pockets; and some of Africa’s best maths students, studying at AIMS Ghana, who know how to use statistics to analyse data.
Over one-and-a-half days, working in interdisciplinary groups, the teams analysed our donor aid data using R-Instat. Even in such a short time, they were able to produce graphs comparing the prevalence of red flags across countries or over time. They learned that there are different ways of counting contracts – by number or by value – and these might yield different results. And they noticed that sometimes a lot of data are missing, which might be a red flag in itself.
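The difference between counting contracts by number and by value is worth a concrete illustration. In this made-up example, one large single-bidder contract dwarfs three small competitive ones, so the two measures tell very different stories:

```python
# Sketch: the same red flag measured two ways.
# Contract values and flags are invented for illustration.
contracts = [
    {"value": 1_000_000, "single_bidder": True},
    {"value": 10_000,    "single_bidder": False},
    {"value": 20_000,    "single_bidder": False},
    {"value": 30_000,    "single_bidder": False},
]

# Share of contracts flagged, by simple count.
by_count = sum(c["single_bidder"] for c in contracts) / len(contracts)

# Share of total spending that flows through flagged contracts.
total = sum(c["value"] for c in contracts)
by_value = sum(c["value"] for c in contracts if c["single_bidder"]) / total

print(f"prevalence by count: {by_count:.0%}")   # 25%
print(f"prevalence by value: {by_value:.0%}")   # 94%
```

By count, only one contract in four is flagged; by value, almost all of the money is, which is arguably the figure that matters for lost public funds.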
By the end of the workshop, the groups had started to delve deeper into the data. They spotted surprising patterns and sought to understand what might lie behind them, or used their knowledge of how corruption occurs to think about how red flags might relate to one another. Are companies that win because they are the only bidder more likely to be registered in offshore secrecy jurisdictions, for example?
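A question like that one boils down to comparing the rate of one flag within and outside the group raised by another. A minimal sketch, on invented data with a hypothetical `offshore` field:

```python
# Do single-bidder wins co-occur with offshore registration?
# All records and field names here are hypothetical.
contracts = [
    {"single_bidder": True,  "offshore": True},
    {"single_bidder": True,  "offshore": False},
    {"single_bidder": True,  "offshore": True},
    {"single_bidder": False, "offshore": False},
    {"single_bidder": False, "offshore": False},
    {"single_bidder": False, "offshore": True},
]

def share_offshore(cs):
    """Fraction of the given contracts won by offshore-registered firms."""
    return sum(c["offshore"] for c in cs) / len(cs)

single = [c for c in contracts if c["single_bidder"]]
rest   = [c for c in contracts if not c["single_bidder"]]

print(f"offshore share among single-bidder wins: {share_offshore(single):.0%}")
print(f"offshore share among the rest:           {share_offshore(rest):.0%}")
```

A large gap between the two shares would not prove corruption, but it would mark exactly the kind of pattern that warrants closer investigation.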
Playing with the data is easy in R-Instat, and that means that users can easily experiment, test ideas and theories, or look for evidence to support or refute rumours. As one of the participants said afterwards, “now we can talk about corruption in this country in a more focused, targeted and sensible manner…cos too often the overly political and emotive discussions tend to obfuscate the real issues and even impact”.
The workshop helped show what can be done with evidence. We still need data as a starting point, but by showcasing what can be done with the right tools, we also hope to create advocates who will lobby governments for greater transparency as well as for tighter control over how public money is spent.