No, this is not another reflection from my research on economic data on African economies, but the allegations by Chris Giles in the Financial Times against the inequality data used in Thomas Piketty’s bestselling book. Giles shows that Piketty has indeed made some questionable choices in compiling the data for his historical series.
Piketty has responded. Not very convincingly, and not addressing the claims made by Giles about deliberately changing the data. Matt O’Brien at the Washington Post has looked at the different series and says that “they’re embarrassing, but they don’t change the big picture.”
1) The mistakes in the data are not as serious as they were in the underlying data used by Kenneth Rogoff and Carmen Reinhart for their book This Time Is Different: Eight Centuries of Financial Folly, but the Piketty defence is equally weak. As Tyler Cowen also notes, it is not good enough to say that the data does not matter because we all know that the argument is correct by and large. That is simply not true. Rogoff and Reinhart, like Piketty, derived most of their credibility from backing up their arguments with data and more data. When the data are wrong, it defeats the purpose. If “everyone” agreed on the stylized facts, then the so-called factual statistical analysis was not needed in the first place.
2) Research based on erroneous and missing data is done all the time. And most of the time researchers get away with it. Even when grave weaknesses in the data are documented, they are demoted to the footnotes, and the big conclusions live on. The dispute on the settler mortality statistics between David Albouy and Daron Acemoglu, Simon Johnson, and James Robinson is one such example. As I have pointed out in Poor Numbers: How We Are Misled by African Development Statistics and What to Do about It, the prevailing consensus seems to be that bad data are better than no data. In this sense evidence is just an afterthought. (Matthew Yglesias makes the same point here.)
3) Certainly there are genuine trade-offs between no evidence and speculative analysis based on weak data or even produced data. The response from Piketty I have most sympathy for is this one:
I certainly agree that available data sources on wealth are much less systematic than for income. In fact, one of the main reasons why I am in favor of wealth taxation and automatic exchange of bank information is that this would be a way to develop more financial transparency and more reliable sources of information on wealth dynamics.
It is important and true that if we limit our analysis to the officially recorded information we may end up ‘seeing like a state’. Moreover, that is why debates like those around the data revolution are so important – evidence collected by states today will anchor policy and academic debates tomorrow.