EU Transparency

Budget monitoring and data standards

Effective citizen budget monitoring requires data – and lots of it – but, just as importantly, the data needs to be accessible and comprehensible. Thanks to the European Transparency Initiative driven forward by Commission Vice-President Siim Kallas, the EU has come a long way in a short time on accepting the principles of budget transparency, in particular through the new rules requiring the publication of end beneficiaries of EU funds. Where the Commission’s transparency advocates appear to have taken their eye off the ball is in how these new rules have been implemented.

Taking the case of EU farm subsidies, the implementing rules require each member state to maintain a website database where citizens can search for recipients of farm subsidies and find two pieces of information: how much they get and the municipality where they are located. It is a shame that there is no requirement to explain why the money was paid. As Agriculture Commissioner Mariann Fischer Boel told an audience in July 2006,

“Telling the public about who gets how much money is only half of the story. The other half is explaining what the money is for.”

Gaps in the data aside, the main problem with the ‘web platform approach’ favoured by the Commission and many other public institutions is that the data is locked up in a website, so it’s not possible to analyse the data in its entirety. That makes it hard to find out even simple things, such as which region got the most – or the least – in farm subsidies, who the largest recipients are, or how the payments are distributed. These basic tasks of analysis require having the entire dataset in one place, so it can be analysed using any one of a number of statistical software applications. With web platforms, the data is presented in the way that the government wishes it to be presented. The citizen, and the external budget monitor, is disempowered.
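To make the point concrete, here is a minimal Python sketch of the kind of analysis that becomes possible once the whole dataset is in one file. The column names (`recipient`, `region`, `amount_eur`) and the sample rows are invented for illustration; they do not reflect any member state's actual file layout.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw data file; the columns and figures are made up.
RAW_CSV = """recipient,region,amount_eur
Farm A,North,120000
Farm B,North,30000
Farm C,South,45000
Farm D,South,5000
Farm E,East,250000
"""

def load_payments(text):
    """Parse CSV text into a list of (recipient, region, amount) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [(r["recipient"], r["region"], float(r["amount_eur"])) for r in reader]

def totals_by_region(payments):
    """Sum the payments made in each region."""
    totals = defaultdict(float)
    for _, region, amount in payments:
        totals[region] += amount
    return dict(totals)

def largest_recipients(payments, n=3):
    """Return the n largest payments, biggest first."""
    return sorted(payments, key=lambda p: p[2], reverse=True)[:n]

def share_of_top(payments, n=1):
    """Fraction of all the money that goes to the n largest recipients."""
    total = sum(p[2] for p in payments)
    top = sum(p[2] for p in largest_recipients(payments, n))
    return top / total

payments = load_payments(RAW_CSV)
regions = totals_by_region(payments)

print("Region with most:", max(regions, key=regions.get))
print("Region with least:", min(regions, key=regions.get))
print("Two largest recipients:", [p[0] for p in largest_recipients(payments, 2)])
print("Share going to the top recipient:", round(share_of_top(payments, 1), 2))
```

None of these questions can be answered from a search form that returns one recipient at a time; all of them take a few lines once the raw file is available.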

Raw data is much easier to process than inaccessible web platforms

High quality data holds the key to budget transparency

It would be a trivial task for a government to publish the raw data file alongside its web platform, but only a few member states have chosen to do this (e.g. the UK, the Czech Republic and Belgium). The Commission’s own Financial Transparency System is guilty of the same sin of only providing a restricted web platform and not making the entire dataset available.

There are reasons why governments might remain keen to build a web platform. They may not think that there is sufficient civil society interest and capacity to do the job better, and they may wish to present the data in context, alongside explanations, and to provide tools for interactivity and feedback from citizens. All well and good. But if a government goes down this path, it should be required to create an open system with public access to each layer of the website – data, analysis and presentation – as described very well by Richard Allan, Chair of the UK Government’s Power of Information Taskforce (see also the illustration of such an open system below).

Access should be provided at all layers of websites that publish government data

Faced with an impenetrable web platform, the prospective citizen budget monitor’s only option is to ‘screen-scrape’ the government website. Screen-scraping involves writing an automated routine that queries every single record in the database and records the output in a data file. It’s a highly skilled activity and the domain of a small handful of computer programmers. Nils Mulvad and Simon Roe of are both proficient at screen-scraping. So too are Julian Todd, whose work includes tracking votes in the UK Parliament and administering a healthy dose of transparency to the United Nations, and Richard Pope, who runs – a fantastic site that helps people find out about applications for new buildings near where they live. But their labours would not be necessary if governments took the simple and costless step of publishing the raw datasets that lie behind their web platforms in a simple, machine-readable data format (such as XML or CSV).
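For readers who have never seen one, a screen-scraper can be sketched roughly as follows in Python. This is an illustration only: the page layout, the `fetch_page` helper and the sample records are all hypothetical, and the pages are canned strings rather than real downloads. A real scraper would fetch each page over HTTP and cope with whatever markup the particular government website happens to use.

```python
import csv
import io
from html.parser import HTMLParser

# Stand-in for the government website: two hypothetical result pages.
# A real scraper would download each page over HTTP instead.
FAKE_PAGES = {
    1: "<table><tr><td>Farm A</td><td>Smalltown</td><td>1200.50</td></tr>"
       "<tr><td>Farm B</td><td>Bigtown</td><td>300.00</td></tr></table>",
    2: "<table><tr><td>Farm C</td><td>Smalltown</td><td>75.25</td></tr></table>",
}

def fetch_page(page_number):
    """Return the HTML for one results page, or None past the last page."""
    return FAKE_PAGES.get(page_number)

class RowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row being read, if any
        self._in_cell = False  # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._row.append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            self._row[-1] = self._row[-1].strip()
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data

def scrape_all():
    """Request page after page until none is left, gathering every row."""
    rows = []
    page = 1
    while True:
        html = fetch_page(page)
        if html is None:
            break
        parser = RowParser()
        parser.feed(html)
        parser.close()
        rows.extend(parser.rows)
        page += 1
    return rows

def to_csv(rows):
    """Write the scraped rows out as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["recipient", "municipality", "amount"])
    writer.writerows(rows)
    return out.getvalue()

rows = scrape_all()
print(to_csv(rows))
```

Even in this toy form the scraper depends entirely on the layout of the pages it reads, so it breaks whenever the website is redesigned. A published raw file has no such fragility.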

As well as the barrier of restrictive web platforms, another big obstacle to accessing budget data in the EU is the way so many public authorities choose to publish data in formats that are not accessible for analysis and re-use. Governments in Spain are very fond of publishing their CAP payment data in PDF files that can run to thousands of pages and are next to impossible to convert back to a data file. Many governments are publishing details of expenditure under the Structural and Cohesion Funds and the European Fisheries Fund in the form of PDF files. It is very difficult – and sometimes close to impossible – to extract the underlying data from a PDF file. The cynical part of me thinks that perhaps this is an intentional decision on the part of governments who wish to comply with the letter of the laws on transparency but want to make sure that citizens are prevented from analysing the data for themselves.

The current edition of the Economist features an article praising budget transparency in the United States but also describing how it is possible for a politician to come unstuck if their transparency measures are not implemented properly by officials:

“On the campaign trail Sarah Palin sometimes bragged that she had, as the reforming governor of Alaska, put the state’s books online. She did sign the legislation, but the result is a clunky collection of spreadsheets and PDFs.”

So what can be done? Well, for a start, those in the Commission responsible for the European Transparency Initiative should set down some technical guidelines, starting with the requirement that all data should be provided in a raw form, fully machine-readable. What does that mean? Tom Steinberg of mySociety suggested to me the following, which I think is at the very least a good place to start:

“Data must be made available in electronic formats that separate each piece of information into discrete, appropriately categorised units which can be automatically imported by a computer into a database”

That means No to PDFs, No to clunky and inaccessible web platforms, No to Word Documents. It means Yes to XML feeds and Yes to CSV files. Following these rules actually requires less work on the part of governments and should cost taxpayers less money. In this new era of budget pressure, that’s something that everyone can welcome.
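Steinberg’s test, that the data can be “automatically imported by a computer into a database”, is easy to demonstrate. The Python sketch below loads a small CSV file into an in-memory SQLite database and queries it. The column names, including a `scheme` column saying what the money was for, and the sample rows are assumptions made up for this example.

```python
import csv
import io
import sqlite3

# Hypothetical machine-readable file: one record per line,
# one discrete, labelled piece of information per column.
RAW_CSV = """recipient,municipality,scheme,amount_eur
Farm A,Smalltown,Direct payments,1200.50
Farm B,Bigtown,Rural development,300.00
"""

def import_into_database(csv_text):
    """Load the CSV text into an in-memory SQLite table and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE payments ("
        "recipient TEXT, municipality TEXT, scheme TEXT, amount_eur REAL)"
    )
    rows = [
        (r["recipient"], r["municipality"], r["scheme"], float(r["amount_eur"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn

conn = import_into_database(RAW_CSV)
count = conn.execute("SELECT COUNT(*) FROM payments").fetchone()[0]
total = conn.execute("SELECT SUM(amount_eur) FROM payments").fetchone()[0]
print(count, "records imported, total", total, "euro")
```

The import needs no human judgement at any step, which is exactly what a PDF or a Word document cannot offer.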

The same principles should apply to civil society websites that reuse government data. is now three years old and currently being rebuilt from the ground up, with faster search performance, new tools for user generated content and a fully-featured API that will allow anyone to make use of the underlying dataset for their own purposes. It’s our intention that the site will meet the very highest standards of accessibility, performance and openness.

4 Responses to “Budget monitoring and data standards”

  1. david osimo
    on Feb 10th, 2009
    @ 1:55 pm

    Great stuff. In Brussels on March 16th you will be able to meet Jose Alonso of W3C who’s active on this theme. Plus my old article on measuring transparency deals exactly with this issue, using the 8 principles of govt data as reference. Machine-readable data is for me the key term.

  2. Simon Roe
    on Feb 12th, 2009
    @ 3:40 pm

    I would add that a standard format is important too. If every country publishes its data in a machine-readable way, but there is no easy way to compare the data (one publishing postcodes, one lat/long, one publishing in Euro, the other in the local currency etc) then the job of analyzing all the data is only slightly easier.

  3. How to spend €10 billion |
    on Apr 27th, 2009
    @ 4:59 pm

    [...] The European Commission’s own Financial Transparency System, which publishes data on end beneficiaries of a range of grants and contracts, was launched at the end of 2008, but with a fatal flaw, common to many government data websites. The website only offered a web-form based interface and there was no way to access the entire database, without building a screen-scraper. The Commission has since made amends and a couple of weeks ago it published the whole data set in not one but three formats: CSV, XLS and XML. Three cheers for the Commission – at least someone is listening to my plea for better data standards. [...]

  4. Lots of new farm subsidy data now online |
    on May 17th, 2009
    @ 8:27 pm

    [...] For more on the need for European countries to do better when it comes to releasing budget data, read my article over at  [...]
