Laura Rusu's Replies

Forum Replies Created

Viewing 15 posts - 406 through 420 (of 598 total)

← 1 … 27 28 29 … 40 →

Author

Posts
Tuesday, August 24, 2021 at 9:38 AM in reply to: Lists of similar concepts #193556

Tim Bui
Participant

I am grateful for your sending the codes, Peter! I am out of town, but will study it when I get back. Thanks again!

Tuesday, August 24, 2021 at 8:15 PM in reply to: Lists of similar concepts #193567

Peter W Reed
Participant

Be forewarned: the snippet I provided was only tested for “AAPL” 2014. I’m now testing “AAPL” 2015 and found that the labels have changed. The salient change that needs to be made to the program is shown below. I ran into this when I was using BeautifulSoup. The number of variations is small, but handling them made my program look like spaghetti. This is why I moved over to xmlschema and a well-formed dictionary.

2014: xmlTmp_xbrl['labelLink'][0]['label']
2015: xmlTmp_xbrl['link:labelLink'][0]['link:label']
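
One way to absorb this kind of key variation without branching throughout the program is a small lookup helper that tries each known spelling in turn. The sketch below is only an illustration: `get_first` and `get_labels` are hypothetical names, and the toy dictionaries merely mimic the two shapes shown above rather than real filing content.

```python
def get_first(mapping, *candidate_keys):
    """Return the value for the first candidate key present in mapping."""
    for key in candidate_keys:
        if key in mapping:
            return mapping[key]
    raise KeyError(f"none of {candidate_keys} found")


def get_labels(xml_dict):
    """Fetch the label list whether or not the keys carry the 'link:' prefix."""
    label_links = get_first(xml_dict, "link:labelLink", "labelLink")
    return get_first(label_links[0], "link:label", "label")


# Toy dictionaries shaped like the two variants above (contents are made up).
doc_2014 = {"labelLink": [{"label": ["Net sales"]}]}
doc_2015 = {"link:labelLink": [{"link:label": ["Net sales"]}]}

labels_2014 = get_labels(doc_2014)
labels_2015 = get_labels(doc_2015)
```

If further variants turn up in other years, they can be added to the candidate lists in one place.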

Wednesday, August 25, 2021 at 1:15 PM in reply to: Lists of similar concepts #193597

Mikko Olkkonen
Participant

Peter, is the goal of your machine learning program to find tags corresponding to a certain higher-level concept, such as Revenues or Free Cash Flow? That is basically what I am trying to do, as follows:
1) I am turning financial statement data (https://www.sec.gov/dera/data/financial-statement-data-sets.html) into a local mongo database. For this I am using my deramongo.sh script, which can be found on my github.
2) I output all tag-common name pairs from my local mongo database by using my script mapping.js (sample output map3.txt, with about one million tag-common name pairs, is on my github).
3) From that map3.txt I can query any tag. My algorithm searches for all common names corresponding to the given tag. Next it searches for new tags corresponding to each of the common names found in the previous stage. Then new common names are searched for all the tags found, and so on. For example, if I query the Revenues tag, this “fuzzy logic” algorithm converges to synonym tags like Revenue, SalesRevenueNet, SalesRevenueGoodsNet, RevenueFromSaleOfGoods and many others, some of which are unfortunately clearly of poor quality, i.e. not really synonyms for Revenues.
Anyway, I think that this way I can produce adequate buckets for some tens of concepts that are of interest to me. I am very interested in hearing if your machine learning program can do something like this. Maybe my (almost complete) inventory of tag-common name pairs (map3.txt) is relevant input to your machine learning program?
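
The alternating search in step 3 can be sketched as a breadth-first expansion over tag/common-name pairs. This is a minimal illustration and not the code in the repository: `expand_synonyms`, the `max_rounds` parameter and the sample pairs are all assumptions made for the example.

```python
from collections import defaultdict


def expand_synonyms(pairs, seed_tag, max_rounds=10):
    """Alternate tag -> common names -> tags until no new tags appear."""
    tag_to_names = defaultdict(set)
    name_to_tags = defaultdict(set)
    for tag, name in pairs:
        key = name.strip().lower()  # compare common names case-insensitively
        tag_to_names[tag].add(key)
        name_to_tags[key].add(tag)

    found_tags = {seed_tag}
    frontier = {seed_tag}
    for _ in range(max_rounds):
        # Collect every common name used by the tags found in the last round.
        names = set()
        for tag in frontier:
            names |= tag_to_names[tag]
        # Collect every tag that shares one of those common names.
        new_tags = set()
        for name in names:
            new_tags |= name_to_tags[name]
        new_tags -= found_tags
        if not new_tags:
            break
        found_tags |= new_tags
        frontier = new_tags
    return found_tags


# Made-up pairs for illustration only.
sample_pairs = [
    ("Revenues", "Total revenues"),
    ("Revenues", "Net sales"),
    ("SalesRevenueNet", "Net sales"),
    ("SalesRevenueNet", "Sales"),
    ("SalesRevenueGoodsNet", "Sales"),
    ("Assets", "Total assets"),
]
revenue_bucket = expand_synonyms(sample_pairs, "Revenues")
```

Capping `max_rounds` at a small number is one possible way to limit drift toward the poor-quality matches mentioned above, since each extra round admits tags that are further removed from the seed.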

Wednesday, August 25, 2021 at 10:06 PM in reply to: Lists of similar concepts #193606

Peter W Reed
Participant

Your approach is more sophisticated than mine. I’ll look at your github this weekend. On the Edgar page for a company’s 10-Q/K filing there is a Schema file (.xsd) and five xml files. I use the Python module ‘xmlschema’ to parse the linkbase file for the common name to gaap tag mapping. The result is a dictionary with about 500 records, most of which are of little interest and need to be cleaned out. Below is one record of the 500 in the dictionary. I use tuples for file storage.
(‘total net sales’, ‘lab_us_gaap_revenuefromcontractwithcustomerexcludingassessedtax’)

My ML algorithm interest isn’t in the creation of the mapping. Instead, I’ve been using Clustering to analyze large groups of stocks, like the ETF QQQ. I need a method to gather a large number of stocks’ common attributes (“Net Sales”) over a long period. The investment strategy I use and want to improve upon is selecting the “best” twelve stocks that make up an ETF like QQQ.

Saturday, September 4, 2021 at 7:50 PM in reply to: Lists of similar concepts #193851

Peter W Reed
Participant

I searched github “olkkonen in:name” for your repository. There were no public repositories for me to view.

I decided to have another go at the Excel downloads for 10-K and 10-Q. The approach to obtaining commonality/comparability between filings over time and between companies is context. Context being the sheet and sub-sections within sheets. I’ll let you know if I have success.

In the meantime can you invite me into your private github? Or did I just not find the correct repo? Thank you.

Sunday, September 5, 2021 at 1:15 AM in reply to: Getting started with the XBRL Filed Data Add-in for Excel #193858

Farrukh Javaid
Participant

Hello David,
I cannot login on the XBRL API in the excel sheet despite using the current account credentials / ClientID / Client Secret etc. Every time I try to login it says login failed.
Can you please help me with how to proceed with the login and use the data?
Thank you.

Best Regards
Farrukh Javaid

Sunday, September 5, 2021 at 7:31 AM in reply to: Lists of similar concepts #193861

Mikko Olkkonen
Participant

Peter, My repo should be public at
https://github.com/molkko/deramongo

Monday, September 6, 2021 at 8:51 AM in reply to: Getting started with the XBRL Filed Data Add-in for Excel #193879

David Tauriello
Keymaster

Hi Farrukh – please try the troubleshooting steps listed above, starting with using a private or incognito browser window to login and generate pairs at https://xbrl.us/access-token – make sure you login with your XBRL US Web account credentials (email and password), and not the ‘Sign in with Google’ option, and also be sure you have logged in with the same email address you used to request provisioning for your account.

Monday, September 6, 2021 at 9:12 AM in reply to: Using fact.accuracy-index #193881

David Tauriello
Keymaster

Hi Bruno – I wanted to follow up on this thread, as I don’t think I responded fully to your questions about element values not appearing.

Missing facts in a report might be due to an account limitation. Non-member accounts return a maximum of 1,000 facts, 100 facts at a time (fact.offset is used in the fields part of the query to get past 100 … set it to 200 and it will return facts 200 – 300 for the query).

So, if you changed the green cells limiting the report section so it returns the full report, you will only get 100 facts.
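
The paging described above can be sketched as a loop that advances the offset by the page size until the cap is reached or a short page comes back. This is only an outline under stated assumptions: `fetch_page` is a hypothetical callable standing in for whatever issues the request with fact.offset set, and the fake version below just returns numbers so the loop can be exercised without an account.

```python
def fetch_all_facts(fetch_page, page_size=100, max_facts=1000):
    """Collect pages of facts by stepping the offset in page_size increments."""
    facts = []
    for offset in range(0, max_facts, page_size):
        page = fetch_page(offset)
        facts.extend(page)
        if len(page) < page_size:  # a short page means nothing is left
            break
    return facts


# Stand-in for a real request: pretends the query matches `total` facts.
def fake_fetch_page(offset, total=250, page_size=100):
    return list(range(offset, min(offset + page_size, total)))


all_facts = fetch_all_facts(fake_fetch_page)
```

The defaults of 100 and 1,000 mirror the non-member limits mentioned above.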

On the issue of a specific dts.id not returning data, it may be that the report name changed between filings, so if you looked for ‘Balance Sheet’ in one period and in the next it’s gone, the report may have been renamed to ‘Statement of Operations’ (this would be a very extreme circumstance)

We posted a report compare template recently (copying the first columns a couple more times) – https://xbrl.us/wp-content/uploads/2021/08/tri-compare-reports.zip – maybe using this template helps your searching.

Monday, September 6, 2021 at 1:00 PM in reply to: The XBRL API #193884

Rick Labs
Participant

Total beginner. Was wondering if the API has its own query processor, or is it built on something like SPARQL? Any references to the engineering design philosophy / history of the API query processor, and/or general documentation of the query portion of the API very much appreciated.

Example query 1 – I send in a comma separated list of 10 tickers (all pre-selected, each within a single industry) and a year of interest. It returns SALES for EACH ticker for that specified year.

Example query 2 – I send in a single ticker – system “looks up” the INDUSTRY CODE for that ticker, searches all firms with that industry code, sorts on sales, returns the top 10 (or less, if fewer than 10 companies found with that industry code). Returns: COMPANY NAME, TICKER, SALES for the results set.

Example query 3 – same as #1 above, but for each of the 10 tickers, return sales for the past 20-40 quarters, with each column aligned so it’s for (roughly) the same three month period.

Are queries like the above even possible? If possible are they experimental or burdensome on the API/query processor (or back end database) at this point?

Are there “gotchas” to watch for to be sure the data returned is correctly date aligned? (Some year ends / quarter ends differ across different companies.)

I’m very new and have not yet had the chance to tour the full existing documentation set or plumb this forum. I apologize if this is all super easy to find. I find any design philosophy / history type material the very best place to start. Links to actual examples of that class of queries (above, multiple company data returned in one go) most appreciated too.

Thanks in advance.

Rick
Investment Manager looking at industries and sector fundamental data

Wednesday, September 8, 2021 at 7:13 AM in reply to: Lists of similar concepts #193908

Peter Miller
Participant

Hi guys,

since our goal is to create standardized financial statements, I wonder if it wouldn’t be advantageous to use the “Bulk data” zip-file which can be downloaded on the SEC website. It might contain in json format all the financial data which is available for all companies.

Isn’t it easier to map these common names to standardized names instead of linking the xbrl tags to standardized names? What are your thoughts?

Wednesday, September 8, 2021 at 12:47 PM in reply to: Lists of similar concepts #193918

Peter W Reed
Participant

I don’t understand the distinction between common names and standardized names. I use the “XBRL TAXONOMY EXTENSION LABEL LINKBASE DOCUMENT” that comes with each 10-Q/K submittal. One reason I use this xml file is custom labels. For example, “lab_us-gaap_ComprehensiveIncomeNetOfTax” is mapped to “COMPREHENSIVE INCOME ATTRIBUTABLE TO COSTCO” in their latest 10-K filing.

The work Mikko Olkkonen is doing on github looks promising for standardization. Note: Mikko there is a broken url in your readme. I’ll submit a comment on github later.

The issue I’ve found with the bulk data and API material is timeliness. It seems to be about a quarter out of date.

Back to your question on standardization of common name to gaap tag. Below is a snippet from an API call I submitted. The mapping is here, but it is the same mapping found in the LABEL LINKBASE DOCUMENT file.

{"cik":909832,"taxonomy":"us-gaap","tag":"GrossProfit","label":"Gross Profit","description":"Aggregate revenue less cost of goods and services sold or operating expenses directly attributable to the revenue generation activity."
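
A record like this can be reduced to the same kind of (common name, tag) tuple used for file storage. The sketch below is an assumption-laden illustration: the record is trimmed to a few of the fields shown above and closed by hand, `label_tag_pair` is a hypothetical helper, and the "taxonomy:tag" key format is an arbitrary choice for the example.

```python
import json

# Trimmed copy of the record above; the closing brace is added for the example.
raw = '{"cik": 909832, "taxonomy": "us-gaap", "tag": "GrossProfit", "label": "Gross Profit"}'


def label_tag_pair(record):
    """Build a (lowercased common name, 'taxonomy:tag') tuple from one record."""
    return (record["label"].lower(), f'{record["taxonomy"]}:{record["tag"]}')


pair = label_tag_pair(json.loads(raw))
```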

Wednesday, September 8, 2021 at 2:48 PM in reply to: Using fact.accuracy-index #193922

Bruno Lerer
Participant

Hi, David and thanks for your response.

I downloaded and played with the report compare template and it is indeed much better and easier to work with.

To try it out, I used dts.ids from smaller filers (as opposed to the usual suspects like Apple, Microsoft, etc.). Interestingly enough, once you move into that territory, dts.ids frequently (I would say 9 out of 10) result in no response, for some reason. And as I mentioned previously, of the 3 random reports I managed to get balance sheets for, one had Total Assets and Total Liabilities, the second had Assets and Liabilities and the third had Assets but only Liabilities, Current. That would definitely make concept searches a little difficult…

One initial comment would be to repeat a point from my previous post, which seems more important after testing the template.

As before, one needs to “get dts.id values from the Select Report by Entity Name field on the fact function dropdown of the XBRL Filed Data add-in”. If what you are after is the balance sheet or income statement, you are probably looking for either 10-Ks or 10-Qs. The drop down list generated by entering something like “natio” in the Entity Name field can have dozens (sometimes more than a hundred, it seems) of entries, most of which are 8-Ks and their ilk. It would be really, really great to somehow be able to filter the drop down list to include only 10-Ks (or 10-Qs) so as to make it easier for the user to drill down to a particular report of a specific entity.

I’ll continue to work with the template and will be happy to report anything interesting I run into.

Friday, September 10, 2021 at 7:17 AM in reply to: Using fact.accuracy-index #193972

David Tauriello
Keymaster

Bruno – thanks for your reply. If you have or find a dts.id for an ‘empty’ financial report (10-K or 10-Q) please post it here or email info@xbrl.us. We create a dts.id from the files shared to SEC that are copied to our database; we’ve missed reports at the SEC before (I’m not aware of any missing now), but I’ve never seen a report in our repository that didn’t have the underlying data.

The sections of reports that are displayed by the template are based on how the company has created the report, according to requirements established by the SEC, which allows for latitude regarding element selection (or creation) and element labels. Generally, companies can use any label they like for facts, and this may be the variance you are describing.

In the report comparison Excel template, ungroup the left side of the report to see the relationship query, which was updated again this week to use the ‘preferred label’ for report facts (ensuring the front side text from the report is what’s displayed in the template). The concept.local-name (which you can see by unhiding columns to the right of any of the example reports in the template) is ultimately what can be used to confirm/compare facts. We’re using preferred labels in this view to help users understand that the metadata for facts is a facet of XBRL that gives users control of information display.

Appreciate your interest in a filter for the report listing – this is a bit complex, as it requires a filtering option on the Entity Name. If you know the exact name, keep typing … eventually, you’ll get reports for ‘National Steel Co’ (or whatever ‘natio’ you need) and the list will be shorter.

Friday, September 10, 2021 at 10:59 AM in reply to: Query examples – 10 tickers in one go? #193977

David Tauriello
Keymaster

Hi Rick – thanks for writing and for your interest. Dive right in and try the XBRL API! We have tools and templates working with Google Sheets, Excel in Office365 or with any other web-connected programming interface (we’ve posted a couple of Jupyter notebooks on our XBRL Data Community page). See the Documentation & Discussion links to the right of this post to get started. Non-members can get 100 records at a time, up to 1,000 for a specific query (see this page for additional details).

All of the queries you posed are possible – your second example will need two independent queries (/report for SIC code and company, then /fact for the details you specified, plus some post-query coding or formula work to index and match so you can display any /report details you need that aren’t available through /fact – most are already in there). Your third query sounds like the first – to get data into columns by company name and rows by period, you would likely need to code or use formulas.
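
For the post-query step on the third example, a rough sketch of reshaping results into rows by period and columns by company might look like the following. It assumes the query results have already been reduced to (company, period end date, value) records; the function names, the calendar-quarter bucketing rule and the sample figures are all made up for illustration.

```python
from datetime import date


def calendar_quarter(period_end):
    """Label a period end date with its calendar quarter, e.g. '2021Q2'."""
    quarter = (period_end.month - 1) // 3 + 1
    return f"{period_end.year}Q{quarter}"


def pivot_by_period(records):
    """Arrange (company, period_end, value) records as {quarter: {company: value}}."""
    table = {}
    for company, period_end, value in records:
        table.setdefault(calendar_quarter(period_end), {})[company] = value
    return dict(sorted(table.items()))


# Made-up records: two companies whose quarter ends fall on different days.
records = [
    ("Company A", date(2021, 3, 31), 100),
    ("Company B", date(2021, 3, 27), 80),
    ("Company A", date(2021, 6, 30), 110),
    ("Company B", date(2021, 6, 26), 85),
]
table = pivot_by_period(records)
```

Bucketing by calendar quarter only aligns periods roughly. A fiscal quarter that ends a few days into the following month would land in the next bucket, so the rule would need adjusting for companies with 52/53-week fiscal years.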

The XBRL API Documentation link includes a PDF with additional details. Our implementation of the XBRL API connects to our Public Filings Database (Postgres) with a load process importing new reports posted by the US SEC every few minutes. We’re evaluating several additional data sources to add to our collection.