Published November, 2015

Every year, public companies file one 10-K and three 10-Qs, and each of these reports contains facts that duplicate facts reported in previous filings. XBRL US analyzed filings for 498 companies[1] representing the market to identify how public company disclosures can be made more effective.

Key findings from the research:

  • Significant amounts of duplicate data suggest ways to reduce the burden on preparers:
    • Data reported by these companies for the most recent 10-K included 1.244 million numeric fact values. Of these, 574,622 were reported for the first time. The remaining 669,016 numeric facts (54%) had already been reported in previous filings submitted to the Commission.
    • Further analysis shows that fact values are often reported multiple times across a company’s historical filings. The 669,016 recurring fact values appeared 2.124 million times across all filings submitted since the start of the SEC’s XBRL program (large companies began filing in XBRL in 2009; smaller companies began later under the program’s phased implementation).
    • On average, each of these recurring numeric fact values was reported 3.18 times.
  • Certain industry schedules are labor-intensive to prepare and likely not useful in HTML format. One example is the Real Estate Investment Trust (REIT), which is required to file SCHEDULE III REAL ESTATE AND ACCUMULATED DEPRECIATION, listing its property holdings and the associated properties of each holding. In a paper-based format this schedule is impractical for a user of the data to consume and tedious for a filer to prepare.
    • REITs within the analyzed datasets were identified by filtering on SIC code 6798. The six companies in this industry averaged 6,316 facts, versus only 2,497 for all companies. Although these values are rarely repeated, they are highly structured and easily reported in an electronic filing.
    • For example, Realty Income Corp reported 28,795 fact values in its latest 10-K. The bulk of these facts appear in SCHEDULE III REAL ESTATE AND ACCUMULATED DEPRECIATION of the 10-K filing.
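
The headline figures in the findings above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the text; the variable names are illustrative, and the 2.124 million figure is entered as rounded in the source, so the computed average may differ from the published 3.18 in the last digit.

```python
# Figures quoted in the key findings
first_time_facts = 574_622
repeated_facts = 669_016
appearances_of_repeated = 2_124_000  # "2.124 million", already rounded in the source
companies = 498

# Total numeric facts in the latest 10-Ks (about 1.244 million)
total_facts = first_time_facts + repeated_facts

# Share of facts that had already been reported (about 54%)
repeat_share = repeated_facts / total_facts

# Average number of times each recurring fact was reported
# (roughly 3.17 to 3.18, depending on rounding of the input)
avg_reports_per_repeated_fact = appearances_of_repeated / repeated_facts

# Average numeric facts per company (about 2,497)
avg_facts_per_company = total_facts / companies

print(f"total facts: {total_facts:,}")
print(f"previously reported: {repeat_share:.0%}")
print(f"avg reports per recurring fact: {avg_reports_per_repeated_fact:.2f}")
print(f"avg facts per company: {avg_facts_per_company:,.0f}")
```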

The burden on public company preparers could be significantly reduced, with no impact on the data available to investors, if preparers were required to report only the numeric values that are new in each filing. Similarly, eliminating the HTML requirement for data that is difficult to consume in paper format would also reduce the workload for public company filers.

How was this data compiled?

The data used for this analysis came directly from XBRL filings submitted to the Securities and Exchange Commission (SEC). When a company files with the SEC, XBRL US automatically downloads that data into a Postgres database. As part of this process, each fact is automatically assigned a hash that identifies it as a unique fact value. If the same fact value is reported in a subsequent filing, it will have the same hash. This means we can count the occurrences of a given hash in the XBRL US database to identify how many times a fact value has been reported.

For this analysis we identified the latest 10-K for companies in the S&P 500 and counted the number of numeric fact values in that filing. This filing also gave us a list of hash identifiers. We then counted how many times those same hash identifiers appeared in previous filings. From these counts we can determine how many times a fact has been reported in earlier filings, which fact has been reported the most times, and the total number of times a given hash identifier has ever appeared.
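
The counting step described above can be sketched in a few lines of Python. This is a minimal illustration, not the XBRL US implementation: it assumes the hashes have already been extracted into plain lists, and the function name `summarize_repeats` and its inputs are hypothetical.

```python
from collections import Counter

def summarize_repeats(latest_hashes, prior_hashes):
    """Compare fact hashes in the latest 10-K with those in earlier filings.

    latest_hashes: hashes of numeric facts in the latest 10-K (assumed input)
    prior_hashes:  hashes from all earlier filings, one entry per reported fact
    """
    prior_counts = Counter(prior_hashes)   # how often each hash appeared before
    latest = set(latest_hashes)            # distinct facts in the latest filing

    # Facts in the latest filing that were already reported at least once
    repeated = {h: prior_counts[h] for h in latest if prior_counts[h] > 0}
    new = latest - set(repeated)

    most_repeated = max(repeated, key=repeated.get) if repeated else None
    return {
        "facts_in_latest": len(latest),
        "new_facts": len(new),
        "repeated_facts": len(repeated),
        "prior_appearances": sum(repeated.values()),
        "most_repeated": most_repeated,
    }

# Toy example with made-up hash strings
summary = summarize_repeats(["a", "b", "c"], ["a", "a", "b", "x"])
print(summary)
```

In a database setting the same result would more likely come from a grouped count over the fact table; the in-memory version is used here only to keep the example self-contained.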

The hash identifier is composed of the following components:

  • Element Name
  • Dimensions
  • Period Reported
  • Decimals
  • Units
  • Reporting Company (CIK)
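
One way such an identifier could be built from the components listed above is sketched below. This is an assumption-laden illustration: the source does not say which hash function, field order, or serialization XBRL US uses, so SHA-256, the `|` separator, the function name `fact_hash`, and the sample values (including the CIK) are all hypothetical. The sketch uses only the six listed components.

```python
import hashlib

def fact_hash(element_name, dimensions, period, decimals, units, cik):
    """Build a deterministic identifier from the listed components.

    dimensions is a dict of axis -> member; it is sorted so that the
    order in which dimensions appear in a filing does not change the hash.
    """
    dims = ";".join(f"{axis}={member}" for axis, member in sorted(dimensions.items()))
    key = "|".join([element_name, dims, period, str(decimals), units, str(cik)])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()

# The same fact reported in two different filings (illustrative values only)
in_10k = fact_hash(
    "us-gaap:Revenues",
    {"SegmentAxis": "RetailMember", "RegionAxis": "USMember"},
    "2014-01-01/2014-12-31", -6, "USD", "0000000000",
)
in_later_10q = fact_hash(
    "us-gaap:Revenues",
    {"RegionAxis": "USMember", "SegmentAxis": "RetailMember"},  # different order
    "2014-01-01/2014-12-31", -6, "USD", "0000000000",
)
other_period = fact_hash(
    "us-gaap:Revenues",
    {"SegmentAxis": "RetailMember", "RegionAxis": "USMember"},
    "2015-01-01/2015-12-31", -6, "USD", "0000000000",
)

print(in_10k == in_later_10q)   # same components, same hash
print(in_10k == other_period)   # different period, different hash
```

Because the hash depends only on these components, a fact repeated in a later filing maps to the same identifier, which is what makes counting repeats a simple grouping operation.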

Download (zip): Duplicate Values Reported by Public Companies 11-30-15, published November 2015.


