Data Analysis using XBRL

It’s a common financial news headline, something like ‘S&P earnings up 10%’. Most of us take it on faith, since few of us are ever going to fact-check it. That would mean going through 500 financial reports, making a nice long list, and then constantly updating it. It is easier to just accept what has been said.

But over the past ten years, the companies in the Standard & Poor’s (S&P) 500 index have been filing their reports in a format called Extensible Business Reporting Language (XBRL). It is intended to make accounting statements computer-readable, so with a bit of work they should be readable automatically.

So the question becomes: could XBRL be used to see what is happening across these 500 companies, and maybe glean some insights that haven’t been published yet?

Methodology

The three components needed are:

Reading Engine - The data source is the Securities and Exchange Commission’s (SEC) website, since all of these companies are required to post their financial statements there in XBRL format. The site is very good at showing what companies have published recently, and it makes the files easy to download. However, the files are not that easy to read. No common software reads them, and a single filing can consist of up to six individual files that all need to be read. But with some programming work, I was able to read the files, separate out the data I need, and store it.

Database - The database has to be structured so that duration items can be stored for time spans of different lengths, for example the 3-month and 9-month figures of the same fiscal year. It also needs to allow pulling out 500 companies’ worth of data, and to support addition, subtraction and ratio analysis. It will also need a list of the 500 companies in the S&P 500, along with the ability to update and change that list as needed.

Analysis - The hard work.
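
To make the first two components a little more concrete, here is a minimal Python sketch. It is not the code used for this project. It assumes a tiny, made-up XBRL instance (the SAMPLE string, the company identifier and the figures are all invented), reads only the instance file rather than all of the files in a real filing, and skips everything except simple duration facts. The function names and the table layout are hypothetical. The point is only to show how each value can be stored with its own start and end date, so that the 3-month and 9-month figures for the same fiscal year can sit side by side and feed a ratio.

```python
import sqlite3
import xml.etree.ElementTree as ET
from datetime import date

# Made-up, heavily trimmed instance document. Real filings are far larger
# and also define units, instant contexts, dimensions and more.
SAMPLE = """<xbrl xmlns="http://www.xbrl.org/2003/instance"
      xmlns:us-gaap="http://fasb.org/us-gaap/2023">
  <context id="Q3">
    <entity><identifier scheme="http://www.sec.gov/CIK">0000000001</identifier></entity>
    <period><startDate>2023-07-01</startDate><endDate>2023-09-30</endDate></period>
  </context>
  <context id="YTD">
    <entity><identifier scheme="http://www.sec.gov/CIK">0000000001</identifier></entity>
    <period><startDate>2023-01-01</startDate><endDate>2023-09-30</endDate></period>
  </context>
  <us-gaap:Revenues contextRef="Q3" unitRef="USD" decimals="0">120</us-gaap:Revenues>
  <us-gaap:Revenues contextRef="YTD" unitRef="USD" decimals="0">330</us-gaap:Revenues>
  <us-gaap:NetIncomeLoss contextRef="Q3" unitRef="USD" decimals="0">12</us-gaap:NetIncomeLoss>
  <us-gaap:NetIncomeLoss contextRef="YTD" unitRef="USD" decimals="0">30</us-gaap:NetIncomeLoss>
</xbrl>"""

XBRLI = "{http://www.xbrl.org/2003/instance}"


def read_facts(xml_text):
    """Return (company id, concept, start, end, value) for each duration fact."""
    root = ET.fromstring(xml_text)
    contexts = {}
    for ctx in root.iter(XBRLI + "context"):
        cik = ctx.findtext(f"{XBRLI}entity/{XBRLI}identifier")
        start = ctx.findtext(f"{XBRLI}period/{XBRLI}startDate")
        end = ctx.findtext(f"{XBRLI}period/{XBRLI}endDate")
        if start and end:  # this sketch ignores instant (balance sheet) contexts
            contexts[ctx.get("id")] = (cik, start, end)
    facts = []
    for el in root:
        ref = el.get("contextRef")
        if ref in contexts:
            concept = el.tag.split("}", 1)[1]  # drop the namespace part of the tag
            cik, start, end = contexts[ref]
            facts.append((cik, concept, start, end, float(el.text)))
    return facts


def store(conn, facts):
    """Keep one row per company, concept and time span."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS fact (
               cik TEXT, concept TEXT, period_start TEXT, period_end TEXT,
               value REAL,
               PRIMARY KEY (cik, concept, period_start, period_end))"""
    )
    conn.executemany("INSERT OR REPLACE INTO fact VALUES (?, ?, ?, ?, ?)", facts)


def months(start, end):
    """Approximate length of a time span in months."""
    days = (date.fromisoformat(end) - date.fromisoformat(start)).days
    return round(days / 30.4)


def net_margin(conn, cik, start, end):
    """Net income divided by revenue for one company over one time span."""
    row = conn.execute(
        """SELECT n.value / r.value
             FROM fact n JOIN fact r
               ON n.cik = r.cik
              AND n.period_start = r.period_start
              AND n.period_end = r.period_end
            WHERE n.concept = 'NetIncomeLoss' AND r.concept = 'Revenues'
              AND n.cik = ? AND n.period_start = ? AND n.period_end = ?""",
        (cik, start, end),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
store(conn, read_facts(SAMPLE))
for start, end in [("2023-07-01", "2023-09-30"), ("2023-01-01", "2023-09-30")]:
    print(months(start, end), "months:", net_margin(conn, "0000000001", start, end))
```

Keying each row on both the start and the end date is one way to let overlapping time spans coexist. A list of S&P 500 members could live in a second table and be joined on the company identifier, which would make it easy to update as the index changes.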

By Phil Gaiser