Lists of similar concepts

This topic contains 14 replies, has 4 voices, and was last updated by  David Tauriello 4 months, 3 weeks ago.

  • #119805

    Tim Bui

    Hello, since companies are free to create their own tag (concept) names, does XBRL US keep a list of similar concepts in the edgar_db database so that users can save time on standardization?

    For example, to get a simple Total Revenue, so far I found 5 different concepts:

    1. Revenues
    2. SalesRevenueNet
    3. SalesRevenueGoodsNet
    4. TotalRevenuesAndOtherIncome
    5. RevenueFromContractWithCustomerIncludingAssessedTax

    To find Nonrecurring charges (such as impairment and restructuring), so far I found 13 concepts:

    1. Assetacquisitioncharge
    2. AssetImpairmentCharges
    3. ImpairmentOfLongLivedAssetsToBeDisposedOf
    4. RestructuringCostsAndAssetImpairmentCharges
    5. ImpairmentOfIntangibleAssetsIndefinitelivedExcludingGoodwill
    6. Impairmentandrestructuringexpenses
    7. RestructuringSettlementAndImpairmentProvisions
    8. GoodwillImpairmentLoss
    9. RestructuringCharges
    10. RestructuringCosts
    11. RestructuringChargesAndAcquisitionRelatedCosts
    12. GoodwillAndIntangibleAssetImpairment
    13. ImpairmentOfLongLivedAssetsHeldForUse

    The above 13 non-recurring items are exclusive of the additional 7 gain/loss concepts below:

    1. GainsLossesOnExtinguishmentOfDebt
    2. DerivativeGainLossOnDerivativeNet
    3. GainLossOnSaleOfBusiness
    4. GainLossOnSaleOfNonstrategicBusinessesAndAssets
    5. GainLossRelatedToLitigationSettlement
    6. GainLossOnDispositionOfAssets1
    7. GainLossOnDispositionOfIntangibleAssets

    Thank you!
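One way to make groups like the ones in this post usable in code is to invert them into a lookup from tag name to a standardized label. The sketch below is only an illustration: the labels `TotalRevenue` and `NonrecurringGainLoss` are hypothetical names, the tag lists are copied from the post, and matching is case-insensitive because tag casing varies between posts in this thread.

```python
# Hypothetical standardized labels mapped to tags listed in the post above.
CONCEPT_GROUPS = {
    "TotalRevenue": [
        "Revenues",
        "SalesRevenueNet",
        "SalesRevenueGoodsNet",
        "TotalRevenuesAndOtherIncome",
        "RevenueFromContractWithCustomerIncludingAssessedTax",
    ],
    "NonrecurringGainLoss": [
        "GainsLossesOnExtinguishmentOfDebt",
        "DerivativeGainLossOnDerivativeNet",
        "GainLossOnSaleOfBusiness",
        "GainLossOnSaleOfNonstrategicBusinessesAndAssets",
        "GainLossRelatedToLitigationSettlement",
        "GainLossOnDispositionOfAssets1",
        "GainLossOnDispositionOfIntangibleAssets",
    ],
}


def build_lookup(groups):
    """Invert {label: [tags]} into {lowercased tag: label}."""
    lookup = {}
    for label, tags in groups.items():
        for tag in tags:
            key = tag.lower()
            # A tag assigned to two groups would make the mapping ambiguous.
            if key in lookup and lookup[key] != label:
                raise ValueError(f"{tag} is assigned to two groups")
            lookup[key] = label
    return lookup


LOOKUP = build_lookup(CONCEPT_GROUPS)


def standardize(tag):
    """Return the standardized label for a tag, or None if it is unmapped."""
    return LOOKUP.get(tag.lower())
```

Returning `None` for unmapped tags keeps the gaps visible, which helps when the groups are still being built up.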

  • #119925

    David Tauriello

    Tim – we don’t maintain lists like the ones posted here; you’ve done a great job to create a few groups for standardizing elements and tagged it so others can find it easily and post additional groups or additions to those listed above.

  • #119958

    Tim Bui

    For trend analysis, we need to look at historical numbers. However, some companies changed the tag names so it is very difficult to pull a consistent list of numbers using the API. If anyone has a solution, please share with me.

    For example, Microsoft (CIK: 0000789019) used 4 tags for its total revenue over the last few years:
    – For FY 6/18 (which has data for 6/18, 6/17, and 6/16) it used “revenuefromcontractwithcustomerexcludingassessedtax”
    – For FY 6/17, it used “SalesRevenueGoodsNet”
    – For FY 6/15, it used “SalesRevenueNet”
    – For FY 6/10, it used “Revenues”
    Using only one of the above tags will get only 3 years of data. If we use all 4 tags, then we have duplications such as the list below. Even with 4 tags, I am not able to get Total Rev for FY2018.

    entity.cik  period.fiscal-year  period.end  concept.local-name    fact.value
    0000789019  2009                2009-07-01  Revenues              58437000000
    0000789019  2010                2010-07-01  Revenues              62484000000
    0000789019  2009                2009-07-01  SalesRevenueNet       58437000000
    0000789019  2010                2010-07-01  SalesRevenueNet       62484000000
    0000789019  2011                2011-07-01  SalesRevenueNet       69943000000
    0000789019  2012                2012-07-01  SalesRevenueNet       73723000000
    0000789019  2013                2013-07-01  SalesRevenueNet       77849000000
    0000789019  2014                2014-07-01  SalesRevenueNet       86833000000
    0000789019  2014                2014-07-01  SalesRevenueGoodsNet  72948000000
    0000789019  2017                2017-07-01  SalesRevenueGoodsNet  57190000000
    0000789019  2017                2017-07-01  SalesRevenueNet       89950000000
    0000789019  2016                2016-07-01  SalesRevenueGoodsNet  61502000000
    0000789019  2016                2016-07-01  SalesRevenueNet       85320000000
    0000789019  2015                2015-07-01  SalesRevenueGoodsNet  75956000000
    0000789019  2015                2015-07-01  SalesRevenueNet       93580000000
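One possible way to collapse the duplicates in the listing above is to rank the tags and keep, for each fiscal year, only the value from the highest-ranked tag that is present. The sketch below uses the rows exactly as posted. The ranking itself is an assumption rather than an XBRL US recommendation; `SalesRevenueGoodsNet` is ranked last because in these rows it is always smaller than `SalesRevenueNet` for the same year, which suggests it is a partial figure.

```python
# Rows copied from the listing above: (fiscal year, concept.local-name, fact.value)
ROWS = [
    (2009, "Revenues", 58437000000),
    (2010, "Revenues", 62484000000),
    (2009, "SalesRevenueNet", 58437000000),
    (2010, "SalesRevenueNet", 62484000000),
    (2011, "SalesRevenueNet", 69943000000),
    (2012, "SalesRevenueNet", 73723000000),
    (2013, "SalesRevenueNet", 77849000000),
    (2014, "SalesRevenueNet", 86833000000),
    (2014, "SalesRevenueGoodsNet", 72948000000),
    (2017, "SalesRevenueGoodsNet", 57190000000),
    (2017, "SalesRevenueNet", 89950000000),
    (2016, "SalesRevenueGoodsNet", 61502000000),
    (2016, "SalesRevenueNet", 85320000000),
    (2015, "SalesRevenueGoodsNet", 75956000000),
    (2015, "SalesRevenueNet", 93580000000),
]

# Assumed ranking, most preferred first. Adjust per company after checking filings.
TAG_PRIORITY = [
    "RevenueFromContractWithCustomerExcludingAssessedTax",
    "Revenues",
    "SalesRevenueNet",
    "SalesRevenueGoodsNet",
]


def one_value_per_year(rows, priority):
    """Keep one (tag, value) per fiscal year, chosen by tag rank."""
    rank = {tag.lower(): i for i, tag in enumerate(priority)}
    best = {}
    for year, tag, value in rows:
        r = rank.get(tag.lower())
        if r is None:
            continue  # tag is not in the ranking, so ignore the row
        if year not in best or r < best[year][0]:
            best[year] = (r, tag, value)
    return {year: (tag, value) for year, (_, tag, value) in sorted(best.items())}


SERIES = one_value_per_year(ROWS, TAG_PRIORITY)
```

This only removes duplicates among rows that were returned. It cannot supply a year, such as FY2018 here, for which no row came back.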

  • #119959

    Tim Bui

    The trend in Gross Margin (GrossProfit / Total Revenue) says a lot about the business condition of a company. However, not every company reports Gross Profit. For those that do not, we have to subtract Cost of Goods Sold from Total Revenue. I do not have a solution for getting consistent Total Revenue numbers yet, but for Cost of Goods Sold I have found the following tags:
    – CostOfGoodsAndServicesSold
    – CostOfRevenue
    – CostOfGoodsSoldExcludingDepreciationDepletionAndAmortization
    – CostsAndExpenses
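The fallback described in this post can be sketched as a small function: use `GrossProfit` when it is reported, otherwise subtract the first available cost tag from total revenue. The order of `COST_TAGS` simply follows the list above and is an assumption. Whether each tag is a fair stand-in for Cost of Goods Sold for a given company is a judgment call that the code does not make.

```python
# Cost tags from the post above, tried in the order listed (an assumed order).
COST_TAGS = [
    "CostOfGoodsAndServicesSold",
    "CostOfRevenue",
    "CostOfGoodsSoldExcludingDepreciationDepletionAndAmortization",
    "CostsAndExpenses",
]


def gross_margin(facts, revenue):
    """Return gross margin as a fraction, or None if it cannot be computed.

    facts   -- dict of {tag name: value} for one company and period
    revenue -- total revenue for the same period, however it was obtained
    """
    if not revenue:
        return None  # missing or zero revenue: margin is undefined
    gross_profit = facts.get("GrossProfit")
    if gross_profit is None:
        # Fall back to revenue minus the first cost tag that is present.
        cost = next((facts[tag] for tag in COST_TAGS if tag in facts), None)
        if cost is None:
            return None
        gross_profit = revenue - cost
    return gross_profit / revenue
```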

  • #123764

    David Tauriello

    Tim – to get lists of concepts in the base (standard) US GAAP and IFRS Taxonomies, query: /dts/search?dts.taxonomy=US%20GAAP,IFRS&fields=dts.id,dts.taxonomy-name

    With each dts.id from above that you want concept information for, query like this:


    Every company references a base taxonomy in its filing as a starting point. The company filing is essentially a taxonomy that is inheriting from the base and adding concepts as necessary according to the company’s policies and practices with respect to its financial statements.

    NOTE: you could include concept.local-name with a comma-delimited list before the fields that are returned to get only specific concepts.

    If you pull an entire taxonomy, understand that there’s a great deal of information and a significant number of records involved, so this may take a while. Fortunately, these taxonomies don’t change – new releases annually – so the details only need to be pulled one time.
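For anyone assembling these queries in code rather than in a spreadsheet, a minimal URL builder might look like the sketch below. `BASE_URL` is an assumption and should be checked against the XBRL US API documentation, `build_query` is a hypothetical helper name, and authentication is not handled here.

```python
from urllib.parse import quote

# Assumed base URL; confirm against the XBRL US API documentation.
BASE_URL = "https://api.xbrl.us/api/v1"


def build_query(endpoint, filters, fields, base_url=BASE_URL):
    """Build a '<endpoint>/search' URL from filter parameters and a field list.

    Filter values are percent-encoded but commas are kept, so that
    'US GAAP,IFRS' becomes 'US%20GAAP,IFRS' as in the query above.
    """
    parts = [
        f"{name}={quote(str(value), safe=',')}" for name, value in filters.items()
    ]
    parts.append("fields=" + ",".join(fields))
    return f"{base_url}/{endpoint}/search?" + "&".join(parts)


# The taxonomy query from the post above.
DTS_QUERY = build_query(
    "dts",
    {"dts.taxonomy": "US GAAP,IFRS"},
    ["dts.id", "dts.taxonomy-name"],
)
```

Restricting results to specific concepts, as the NOTE above suggests, would mean adding a `concept.local-name` entry with a comma-delimited value to `filters`.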

  • #124615

    Tim Bui

    Hi David,
    I finally got around to applying the code that you described above. It’s amazing how much information the call returns.

    A couple of questions, please:
    1. Using =CONCATENATE(A1&"/concept/search?dts.id=257590&fields=concept.local-name.sort(ASC),concept.*,label.*") returns only about 2,000 concept local names. How can I change the query to get the complete list, so that I can pick and choose what I need?

    2. I am still trying to shorten my queries so that I can get facts for more companies per call (while staying under the length limit that Google imposes). Can I use concept.id in place of concept.local-name in the call?

    As always, thank you for your help!


  • #124655

    David Tauriello

    Use ENDPOINT.offset(integer) in the fields= portion of your query to get additional concepts. Something like concept.offset(2001) will show concepts after 2,000 (for Power User and Sole Practitioner Individuals, as well as all Organizational XBRL US Members). This offset works for all endpoints (fact, dts, etc.)

    See the documentation and this thread for more information: https://xbrl.us/forums/topic/how-to-get-a-sample-of-records-via-xbrl-api-for-evaluation/#post-116618.

    Again – because you’re querying for the base taxonomy concepts, the details you return will not change (these are published once), so your best/most effective approach will be to copy the details to a tab (or file) and use that as your reference.

    Like other “id” parameters, concept.id returns a unique integer that corresponds with concept.local-name so it can be used as a reliable substitute.
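The offset approach in this reply can be scripted by generating one `fields=` value per page. The sketch below assumes a page size of 2,000 (the "about 2000" figure mentioned earlier in the thread) and follows the `concept.offset(2001)` example literally for where the second page starts. Both are assumptions to verify against the API documentation, and `paged_fields` is a hypothetical helper name.

```python
def paged_fields(fields, endpoint, total, page_size=2000):
    """Return one fields string per page needed to cover `total` records.

    The first page carries no offset. Later pages append
    '<endpoint>.offset(N)', with N following the 'concept.offset(2001)'
    example from the reply above.
    """
    pages = []
    for start in range(0, total, page_size):
        if start == 0:
            pages.append(fields)
        else:
            pages.append(f"{fields},{endpoint}.offset({start + 1})")
    return pages


# Example: field strings for a concept list of roughly 5,000 records.
PAGES = paged_fields(
    "concept.local-name.sort(ASC),concept.*,label.*", "concept", 5000
)
```

Since the base taxonomy details do not change once published, the combined pages can be saved to a file and reused rather than fetched again.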

  • #124690

    Tim Bui

    Thank you, David. I was able to download 17,035 concepts. The next step is to understand these data and select which ones to use.

    Thank you again for your help.

  • #133617

    Tommy Carstensen

    Tim, did you ever manage to create a list of identical concepts? It’s a bit of a mess when it’s not standardized. I’m surprised the SEC chose to allow companies to use random names as they see fit. It completely defeats the purpose of XBRL.

    • #133655

      Tim Bui

      Hi Tommy,

      I started on the standardization but have not finished it yet. At the beginning, I planned to use data from XBRL US, but David Tauriello pointed me to the SEC website, where I can get all of the data from all filers more efficiently (https://www.sec.gov/dera/data/financial-statement-data-sets.html). I am importing these data into SQL Server to do the standardization. I am learning the SEC data structure and think I have a way to do better standardization, but I need to test this method further. I would be happy to share my methodology with you if it works. In the meantime, if you want, I can give you what I have done so far, but it is incomplete. I use financial information for investing, so I only standardize the items that I think are relevant to my work.
      Yes, I agree that whichever entity (SEC, AICPA, CFA, …) allows companies to name their tags (concepts) at will really weakens the case for XBRL and disadvantages “smaller” financial data users like me. It’s illegal to fudge the numbers, but accountants can name the numbers whatever they want and can change these names when it’s convenient, which makes peer comparison or historical comparison extremely difficult. Data providers such as Bloomberg, FactSet, Capital IQ, and Thomson Reuters will be in business for a very long time.
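A database-side version of this kind of standardization can be sketched with a mapping table joined to the facts. The example below uses Python's built-in `sqlite3` in place of SQL Server so that it is self-contained. The `facts` and `tag_map` tables are a hypothetical, simplified schema (not the layout of the SEC data sets), and the three sample rows are taken from the Microsoft listing earlier in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE facts (cik TEXT, fiscal_year INTEGER, tag TEXT, value INTEGER);
    CREATE TABLE tag_map (tag TEXT PRIMARY KEY, standard_name TEXT);
    """
)

# Sample rows from the Microsoft listing earlier in the thread.
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("0000789019", 2010, "Revenues", 62484000000),
        ("0000789019", 2014, "SalesRevenueNet", 86833000000),
        ("0000789019", 2014, "SalesRevenueGoodsNet", 72948000000),
    ],
)

# Hypothetical mapping; SalesRevenueGoodsNet is left unmapped on purpose.
conn.executemany(
    "INSERT INTO tag_map VALUES (?, ?)",
    [("Revenues", "TotalRevenue"), ("SalesRevenueNet", "TotalRevenue")],
)

# Facts that have a standardized name.
labeled = conn.execute(
    """
    SELECT f.fiscal_year, m.standard_name, f.tag, f.value
    FROM facts f JOIN tag_map m ON m.tag = f.tag
    ORDER BY f.fiscal_year
    """
).fetchall()

# Tags that still need a decision: present in facts, absent from the mapping.
unmapped = conn.execute(
    """
    SELECT DISTINCT f.tag
    FROM facts f LEFT JOIN tag_map m ON m.tag = f.tag
    WHERE m.tag IS NULL
    ORDER BY f.tag
    """
).fetchall()
```

The second query is the useful one while the mapping is incomplete, because it lists exactly which tags have not been classified yet.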

      • #133680

        Tommy Carstensen

        Aye Tim, I think Morningstar and the rest of them will continue to thrive as long as the data is not standardised. I wanted to plot the data over time and across companies within an industry, but that turned out not to be trivial, because there is no requirement for the data to be homogeneous over time and across industries. I hope the law and XBRL specifications are changed and data is homogenised going forward to enable small fintech companies to compete.

        Here is my first attempt to plot data for General Mills over time:

        I shall be watching this thread to learn more about your attempts to standardise the data. I appreciate your efforts on behalf of the community. Thanks!

      • #139650

        D Q

        I too am interested in standardizing concepts for investing purposes.
        I don’t know SQL but DM me if I can help.

  • #139657

    Tim Bui

    Hello DQ, I am still working on standardizing the tags. I think I am on the right track, but this is not a trivial task. I was able to download 87MM (yes, million) lines of data from the SEC, and I am now parsing them using SQL Server. It is slow going because I have to check and recheck that the data match the 10-Ks and 10-Qs. I am happy to give the results to whoever wants them, because they are not proprietary data and I too am a beneficiary of communities like XBRL US (David Tauriello at XBRL US has spent a lot of time bringing me up to speed).

    To use this massive amount of data, one will need a database for sorting and screening. Microsoft Excel or Access cannot handle this much data. Maybe we can all put in a request to XBRL US to let members contribute to this standardization effort by creating a depository area on XBRL US pgAdmin.

    Commercial data providers do provide their own standardization, but depending on one’s needs, the standardization has to be customized somewhat. Sorry for the oxymoron of “customizing the standardization.” For example, companies report several types of Accounts Receivable. There are 948 distinct tags for AccRec alone. Most data providers have just one line for AccRec. I try to break them down into 4 subcategories: Acc_Rec_Trade_Short_Term, Acc_Rec_Finance_Short_Term, Acc_Rec_Trade_Long_Term, and Acc_Rec_Finance_Long_Term. The Acc_Rec_Trade items are receivables from regular customers. The Acc_Rec_Finance items are receivables from financing activities, such as a promissory note coming due or GM financing its dealers’ floorplans. Short term or long term determines whether they sit in current assets or long-term assets. The change in each of these subcategories gives readers of the financial statements a different type of information.
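The four-way split described above (trade vs. financing, short term vs. long term) could be expressed as a tag-to-bucket table plus a function that totals values per bucket. In the sketch below, the bucket names are spelled `Acc_Rec_<kind>_<term>`, which is an assumed normalization of the names in the post. The four tag assignments are illustrative guesses at how a few common receivable tags might be classified, not the author's actual mapping of the 948 tags.

```python
from collections import defaultdict

# Illustrative assignments only; a real mapping would cover many more tags.
AR_BUCKETS = {
    "AccountsReceivableNetCurrent": "Acc_Rec_Trade_Short_Term",
    "AccountsReceivableNetNoncurrent": "Acc_Rec_Trade_Long_Term",
    "NotesAndLoansReceivableNetCurrent": "Acc_Rec_Finance_Short_Term",
    "NotesAndLoansReceivableNetNoncurrent": "Acc_Rec_Finance_Long_Term",
}


def bucket_totals(facts):
    """Sum receivable values per bucket for one company and period.

    facts -- dict of {tag name: value}
    Returns (totals per bucket, sorted list of tags with no bucket).
    """
    totals = defaultdict(int)
    unmapped = []
    for tag, value in facts.items():
        bucket = AR_BUCKETS.get(tag)
        if bucket is None:
            unmapped.append(tag)  # surface tags that still need classifying
        else:
            totals[bucket] += value
    return dict(totals), sorted(unmapped)
```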

    In the meantime, I highly recommend checking out the XBRL XL website (https://xbrlxl.com/) created by Jim Truscott. Jim is also working on standardization. He gave a demonstration of his Excel API, hosted by XBRL US, a few months back.

    Let’s hope we hear from XBRL US on this matter.

    • #140245

      David Tauriello

      To use this massive amount of data, one will need a database for sorting and screening. Microsoft Excel or Access cannot handle this much data. Maybe we can all put in a request to XBRL US to let members contribute to this standardization effort by creating a depository area on XBRL US pgAdmin.

      I’ve raised this idea internally; another possibility might be creating some sort of common Google Sheet that holds these standardized terms (taking off from Peter Guldberg’s template for balance sheet)

  • #139658

    Tim Bui

    By the way, DQ, there is a company named Intrinio (https://intrinio.com/) that provides standardization in Excel. The prices seem very reasonable. I tested their system and found their interface very easy to use. I think their data is suitable for most purposes. I do my own parsing of the raw data because I wanted further subsegments for my own use.
