Posted on Friday, January 19, 2024
By David Tauriello, Vice President of Operations, XBRL US
Everyone appreciates a good tip. While no substitute for a comprehensive review of all relevant information, shortcuts that enhance efficiency, save money, or broaden a perspective are useful and can have a lasting impact.
Since 2018, we’ve been making data we’ve copied into our Public Filings Database from US and European regulators available through the XBRL API.
In that time, we’ve answered thousands of questions from XBRL US Members and non-Members about the data, the XBRL API and resources we’ve made available that use the XBRL API to query our data collection. From what we know about these resources and what we’ve heard from people working with them, the following tips & techniques might offer insight to your next querying session – click on the tip to learn more.
Start small, grow a query and know your limits.
Sort 'Fields to Return' before querying to quickly scan results.
Focus on 'Fields to Return' to build more powerful queries.
Understand ultimus and unique and the different capacities of each.
Get accurate with facts across a specific report type.
Explore dimensions for a more complete picture of the entity's story.
Output the add-in's auto-populated results into your worksheet.
Query the cube to save on processing time.
Use additional options in the spreadsheet's ‘showData’ function to customize results.
Adapt what's available to support your continued development.
The XBRL Filed Data spreadsheet extension lets you create basic queries that can be further manipulated in the cell after you use the default 'Function + Data' option to return data to a worksheet. Be sure to try the other output options, too: the 'Data' option is good for one-off results; 'Query' helps with template-building or sharing queries with others; and 'SQL' prints a command that can be used with the Public Filings Database.
An initial query that is specific (perhaps with an answer you already know or can easily check) will test and confirm your understanding of the XBRL API. Most filtering parameters in the XBRL API support comma separated lists (ie. period.fiscal-year=2023,2022,2021 or concept.local-name=assets,liabilities), so you can easily add or remove filters to get what you need. You might find it’s quicker to edit a query in the worksheet cell than to modify and re-run it from the task pane.
A secondary benefit of iterating while building a query is that by focusing initial query trials on confirming known results before expanding the search, you'll save time because large result sets (all facts reported by a company across all years) can take more time to gather and return than smaller results.
Most XBRL US Members can use the XBRL API to return all available data for a query, in sets of 5,000 rows at a time (most 10-K reports have fewer than 3,000 facts). Non-members can get up to 1,000 rows in sets of 100. By sorting results and using an endpoint’s offset field, programming languages can iterate or loop a query to get the maximum data, and spreadsheet extension users can specify the offset in the corresponding text field as the starting point for results.
In the XBRL Filed Data spreadsheet extension, the 'Fields to Return' are arranged alphabetically by default. These can be rearranged by dragging the labels up or down and the + sign displays a searchable list of other attributes that can be added to the results.
Within the list of fields selected, each can be set as an ascending or descending sort using the arrows to the right of each field in the list. The XBRL API will apply sorting in the order the fields are arranged left-to-right when you look at the results from the query (top-to-bottom in the 'Fields to Return' of the add-in).
Using sort on at least one field that contains different values is a requirement for using the offset parameter (offset is necessary to get any/all results above the initial limit for an account).
For example,if you select the fact function and paste 0000320193 (which is the SEC entity identifier for Apple Inc.) in the 'or Get Facts by Entity Code' field then click 'Get' to return Apple facts, you'll get a list (ordered by the fact.id). If you then click the up arrow on concept.local-name and click 'Get' again, the facts will be sorted in alphabetical order. Adding sort to other fields after the alphabetical sort (period.fiscal-year, for example) will apply that sort after the alphabetical sort. Like adding parameters, sort can be manipulated manually in the query, as well, by appending .sort(ASC) or .sort(DESC) to fields.
The forms for creating queries in the spreadsheet extension task pane are not a complete set of options for searching. Most of the 'Fields to Return' are also searchable as parameters when they're added in the 'Additional Parameters' field (they're just not displayed to keep options simple).
For example, paste 647772 in the 'Select Report by Entity Name' and click 'Get' to return Apple facts in a 10-K, then add the parameter concept.is-monetary=true as an 'Additional Parameter' and click 'Get' again to query for only monetary facts in the report. Add &concept.is-base=false to 'Additional Parameters' and you'll have monetary concepts that are extended elements in Apple's 10-K.
One of the first things you might notice about data in an XBRL report is that duplicate values are common. That's because the same fact can be reported in multiple reports, such as prior year comparisons.
To help reduce redundancy, a parameter like fact.ultimus=true instructs the database to return a fact if it corresponds to the latest instance of a reported fact as defined by its attributes.
For example, Apple Inc.'s annual revenues for 2021 appears in multiple company reports:
https://api.xbrl.us/api/v1/fact/search?entity.code=0000320193&period.fiscal-year=2021&period.fiscal-period=Y&concept.local-name=Assets&fact.has-dimensions=false&fields=concept.local-name,fact.value,period.fiscal-period,period.fiscal-year,report.document-type,report.filing-date.
By the way, XBRL API queries display in the browser - click the link, log in and see the results.
Add &fact.ultimus=true to the query above and get only the latest instance of this fact appearing in a report.
The unique parameter also reduces repetition of facts by comparing the contents of each row of results to all others and eliminating all but one (unique) occurrence for each. What's important to note is that if any result field has information that differs from others (ie. report.document-type is 10-K vs. 10-Q) the rows are considered unique. Unique is a checkbox option near the 'Get' button in the spreadsheet extension or you can add &unique as a query filter to create a unique result set.
Try adding &unique to the end of this query to reduce the result to a single row:
https://api.xbrl.us/api/v1/fact/search?entity.code=0000320193&report.document-type=10-K&period.fiscal-year=2021&period.fiscal-period=Y&concept.local-name=Assets&fact.has-dimensions=false&fields=concept.local-name,fact.value,period.fiscal-period,period.fiscal-year
A third option to help reduce repetition - fact.accuracy-index - is similar to ultimus index, although this parameter works at the report level (and only goes as far back as 2018). A fact-accuracy-index parameter value of 1 returns the most-accurate version of the fact. It can be used as a parameter either with a single report (by id, accession, or URL), or can be applied to a specific report.document-type (ie. 10-K) to isolate results. The latter case is important to researchers, as regulators like the SEC expand XBRL requirements to include additional report types that are not as well-known as 10-Ks and 10-Qs. This discussion has more detail about ultimus, unique and accuracy.
Most SEC filers report non-dimensionalized facts on the company’s face financial statements (Balance Sheet, Income Statement and Cash Flow). A fact with no dimensions is itself a total - this is sometimes called the default.
Companies often report facts that are disaggregated by additional descriptors (ie. sales by region or product). These are dimensionalized facts - a dimension is a way of reporting a fact like Revenues that may be reported as a total, but may also be broken down into types, like revenue by product type or geography. When creating a query, the category (for example Eastern Region or Western Region) is called a member of the dimension.
A fact with a dimension (or multiple dimensions) has members that must ‘roll up’ to the non-dimensional default value. Said another way - the dimensionalized facts combined always tell a complete story about its non-dimensionalizied counterpart (ie. the total). Whether it’s a story about product and service offerings, geographic location performance or another way of dissecting data, a dimension and its members give a complete accounting of the total.
In the fact and cube endpoints of the XBRL API, the dimension-pair attribute is a ‘shorthand’ mechanism that shows both facets of this fact.
Some fields on the extension create auto-populated lists of options to choose as you type them in (on the fact function, 'Select Report by Entity Name' and 'Filter by Concept Name' do this). You can get these results as a query with data in the spreadsheet by clicking the blue icon that appears next to the field's label.
For example, type Apple Inc. in the 'Report by Entity Name' field and select a report, then put the term "cash" in the 'Filter by Concept Name' field and click the blue icon next to the label for this parameter. This will put the function, query and a list of all the concepts in the report that include the string "cash".
Next, edit the cell where the query is located to add "concept.namespace" as a field to be included in the results, and you will be able to differentiate base from extension taxonomy elements created by the company.
An XBRL report contains cubes which represent financial tables. A cube can be extracted in its entirety by selecting the cube function. The FERC spreadsheet template is an example of the use of the cube endpoint to extract financial statement data using a single query.
In addition to returning results for the query from the URL, three optional arguments - type, fieldname and showheaders - can be defined in the spreadsheet function to refine results. The last position sets whether the column headers are displayed, and the other two work together to filter the output - read more about implementation instructions for each option.
The 'basic' report views in our spreadsheet templates and code samples in Jupyter notebooks are designed to let new users get results with a few keystrokes (everyone's got a favorite company they'd like to know more about). The underlying queries and formulas in the spreadsheets can be re-purposed in new spreadsheets to gather data across companies, sectors or specific report types.
When you need to know more about what's possible (nesting queries, text searches in documents), review the documentation, post questions to the discussion forum or join an office hour. Get more on these options on the XBRL Data Community page.
Point of View
Recent PoV Posts
- Identifying the Obligor for Municipal Securities Jul 16
- Defining Success for the FDTA Jul 01
- Stay Within the Guardrails for Successful Data Quality May 28
In the last few months
- August 2024 (1)
- July 2024 (2)
- May 2024 (1)