Forum Replies Created
-
AuthorPosts
-
Peter MillerParticipant
Hi,
thanks a lot.
In your experience, is it possible for the same “us gaap tag” to be assigned to different “common names” across different companies and different time periods?
Peter W ReedParticipant
Thanks for the tip. I read everything I could find by Mr. Charles Hoffman. The “problem” I have with his terminology tables is that they stopped being updated around 2015 (or earlier). Also, using the “XBRL TAXONOMY EXTENSION LABEL LINKBASE DOCUMENT” gives me the mapping for the specific 10-Q/K of the company of interest.
I got the xmlschema library working well. The software is in a pseudo code state – i.e. a hack. When I get it cleaned up and universal I’ll put it on GitHub.
By “Universal” I mean there are at least two variations on the Python dictionary keys used in the Edgar LABEL LINKBASE files.
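One way to cope with key variations like these is to try a short list of candidate keys in order. The sketch below is only an illustration: the specific variants shown (`labelLink` versus `link:labelLink`, `label` versus `link:label`) are assumptions about what a parsed linkbase dictionary might look like, not the variations found in the scripts described here, and the helper names are hypothetical.

```python
def first_present(mapping, candidates):
    """Return the value stored under the first candidate key that exists."""
    for key in candidates:
        if key in mapping:
            return mapping[key]
    raise KeyError(f"none of {candidates!r} found; keys are {list(mapping)!r}")


# Assumed key variants (hypothetical): with and without a namespace prefix.
LABEL_LINK_KEYS = ("labelLink", "link:labelLink")
LABEL_KEYS = ("label", "link:label")


def iter_labels(linkbase_dict):
    """Yield every label record, whichever key variant the dictionary uses."""
    for link in first_present(linkbase_dict, LABEL_LINK_KEYS):
        yield from first_present(link, LABEL_KEYS)


# Two made-up dictionaries that differ only in their key spelling.
plain = {"labelLink": [{"label": [{"$": "Cash"}]}]}
prefixed = {"link:labelLink": [{"link:label": [{"$": "Cash"}]}]}

print([rec["$"] for rec in iter_labels(plain)])
print([rec["$"] for rec in iter_labels(prefixed)])
```

Keeping the candidate lists in one place means a newly discovered variant only needs one extra entry.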
Peter W ReedParticipant
Yes, even with the same company in different fiscal reporting periods for a given year. I was hoping that doing the 10-K mappings would be sufficient for that year. It didn’t work that way for me.
I’m still hoping the XBRL taxonomy files for a given year can provide a universal solution. I don’t have a background in accounting/finance, which may be required to solve this.
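A quick way to see how widespread this is would be to collect (tag, common name) pairs from every filing and flag any tag that ends up with more than one name. The following is a minimal sketch, not the scripts discussed in this thread: the record layout, the filing identifiers, and the function names are assumptions, and the two sample names are the ones quoted elsewhere in this thread for COST.

```python
from collections import defaultdict


def names_per_tag(records):
    """Group common names by gaap tag.

    records: iterable of (filing_id, gaap_tag, common_name) tuples,
    a hypothetical layout chosen for this sketch.
    """
    grouped = defaultdict(set)
    for _filing_id, gaap_tag, common_name in records:
        grouped[gaap_tag].add(common_name)
    return dict(grouped)


def ambiguous_tags(records):
    """Return only the tags that carry more than one common name."""
    return {
        tag: sorted(names)
        for tag, names in names_per_tag(records).items()
        if len(names) > 1
    }


TAG = "us-gaap:treasurystockacquiredaveragecostpershare"

# Illustrative records; the filing identifiers and the last row are made up.
sample = [
    ("filing-A", TAG, "Average price per share"),
    ("filing-B", TAG, "Treasury Stock Acquired, Average Cost Per Share"),
    ("filing-A", "us-gaap:exampletag", "Example name"),
]

for tag, names in ambiguous_tags(sample).items():
    print(tag, names)
```

Tags that come back from `ambiguous_tags` are the ones where a 10-K mapping alone would not cover the 10-Q filings of the same year.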
Peter W ReedParticipant
Thanks for the tip on Jim Truscott. I started writing some scripts to populate Peter Gulberg’s spreadsheets, but quickly saw I’d run into the same common name versus XBRL tag name problem. I think I have resolved my LinkBase scripts to create. I’m anxious to begin constructing machine learning programs, which is my primary goal.
Have you looked at the excel sheet in the XBRL taxonomy Excel workbook calling out depth for each gaap tag? I hope to use that sheet to ensure features (gaap tags) are independent in a financial sense of the word.
Peter W ReedParticipant
I use the Edgar search page to get to the list of a company’s 10-Q/K filings. I hope you recognize this task. On that page is a link to the company’s filings by period. If you click that link you get to the detailed Edgar filing. Below is an example for Apple. That page has the complete AAPL filing for that period. You’ll see the Linkbase file on that page.
Tim BuiParticipant
Peter, the Linkbase file sounds so promising. I use the new Edgar filing page all the time, but I can’t find the Linkbase file anywhere. Would you please give more detailed steps on how to get this file? Thanks!
Peter W ReedParticipant
I just discovered that the link I pasted in my response gets you to the Edgar page for 10-K/Q. Sorry, it is most likely done on purpose to foil bots. Use the link and then drill down to the 10-K/Q list. From that page you’ll click on individual filings. “Filing” is the name of the button to click.
https://www.sec.gov/edgar/browse/?CIK=320193&owner=exclude Edgar landing page for AAPL
Peter W ReedParticipant
It appears the Edgar website has methods in place to hinder automated programs. Also, I just tried to paste an image into my response. It failed.
The best I can do is provide a URL to Apple’s Edgar first page. Open up the 10-K hyperlink. You’ll see a “View all…” button; click there. Finally, click on an individual “Filing” button and you are there. I created a text cheat sheet that makes the clicks easier/faster. The files in the bottom half of the final page contain the linkbase, the schema (XSD), and the XML file version of the 10-K/Q. The XBRL! file is there too (click on it if you’ve not seen this before).
Tim BuiParticipant
Got it, Peter. Thank you for your instruction. I see several XML files in the lower half named XBRL TAXONOMY EXTENSION CALCULATION LINKBASE DOCUMENT, XBRL TAXONOMY EXTENSION DEFINITION LINKBASE DOCUMENT, and XBRL TAXONOMY EXTENSION LABEL LINKBASE DOCUMENT. Thank you again!
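When working from a list of file names instead of the page descriptions, the same documents can often be told apart by suffix. The sketch below assumes the naming pattern commonly seen in Edgar filings (`_cal.xml`, `_def.xml`, `_lab.xml`, `_pre.xml`, plus a `.xsd` schema); that pattern is an assumption and worth checking against the filing at hand. The file names and the `classify` helper are hypothetical.

```python
# Assumed suffix-to-document mapping; verify against the actual filing page.
LINKBASE_SUFFIXES = {
    "_cal.xml": "calculation linkbase",
    "_def.xml": "definition linkbase",
    "_lab.xml": "label linkbase",
    "_pre.xml": "presentation linkbase",
}


def classify(filename):
    """Guess the document type of a filing file from its name."""
    lower = filename.lower()
    for suffix, kind in LINKBASE_SUFFIXES.items():
        if lower.endswith(suffix):
            return kind
    if lower.endswith(".xsd"):
        return "schema"
    return "other"


# Made-up file names used only to exercise the helper.
files = [
    "example-20210925_cal.xml",
    "example-20210925_lab.xml",
    "example-20210925.xsd",
    "example-20210925.htm",
]

for name in files:
    print(name, "->", classify(name))
```

For the tag-to-common-name mapping discussed in this thread, the label linkbase is the file of interest.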
Peter W ReedParticipant
import pandas as pd
import requests
import xmlschema

# xml_url and headers are defined elsewhere; one request is made per file
r = requests.get(xml_url, headers=headers)
# c: r.text of the schema (.xsd) file; d: r.text of the Linkbase or XML Extract file
xs = xmlschema.XMLSchema11(c)
xmlTmp_xbrl = xs.to_dict(d)

The Python module I’ve had the best success with is “xmlschema”. The Schema (.xsd) file “c” is the r.text from browsing to that file. “d” is the r.text for the Linkbase, or XML Extract file. The result is a hierarchical dictionary.
I had BeautifulSoup working a few months back, but it was messy due to all of the XBRL variations I found. The dictionary is easier/cleaner to use.
Tim BuiParticipant
Thanks for sending the Python instruction, Peter. It’s wonderful. This will help me a lot.
Peter W ReedParticipant
Glad it was helpful. Below is a cut and paste from a DataFrame I created using BeautifulSoup. It didn’t transfer well and not all columns are shown. The index is the gaap tag “us-gaap:treasurystockacquiredaveragecostpershare”. For this tag there are two common names used in the COST 10-Q/K from 2016 to 2021.
There is a total of 140 rows in the DataFrame for COST. This is the intersection of all common names used throughout the study period, 2016-2021. My intention is to make each record complete and self-describing (eleven columns).
Index | Value | Duration | EndDate | Ticker | Type | Period | CmmNam0 | CmnNam1
us-gaap:treasurystockacquiredaveragecostpershare | 151 | D | 2016-11-20 | COST | 10Q | Q1 | Average price per share | Treasury Stock Acquired, Average Cost Per Share
Peter MillerParticipant
Hello Peter,
From my understanding, the created dictionary shows the US-GAAP tag but not the common name (Verbose Label). To link these, it would be necessary to have both.
Kind Regards
Peter W ReedParticipant
The code below extracts the common name and the associated gaap tag name. The dictionary is “xmlTmp_xbrl” from the linkbase file. I drop two record types I’m not interested in. By creating two lists I’m able to easily create two DataFrames: one with the gaap tag as the index (df2) and the second with the common name as the index (df1).
Note, this code is for a COST 10-Q filing. The keys could be different for different stocks.
import re

gaapTag = []
commonName = []
for one in xmlTmp_xbrl['labelLink'][0]['label']:
    # keep only labels attached to us-gaap tags
    if re.search("us-gaap_", one['@xlink:label']):
        if one['$'].endswith("]"):
            continue  # skip label text ending in "]"
        elif one['@xlink:label'].endswith("Abstract_lbl") or one['@xlink:label'].endswith("TextBlock_lbl"):
            continue  # skip Abstract and TextBlock records
        else:
            gaapTag.append(one['@xlink:label'])
            commonName.append(one['$'])

# df1 is indexed by common name, df2 by gaap tag
df1 = pd.DataFrame(gaapTag, index=commonName, columns=['GAAP_Tag'])
df2 = pd.DataFrame(commonName, index=gaapTag, columns=['Common_Name'])
Peter W ReedParticipant
I hope my code snippet is clear (the dictionary is for AAPL, not COST). Let me know if you have any questions about the code.
I went back to the standards document – “http://www.xbrl.org/specification/xbrl-2.1/rec-2003-12-31/xbrl-2.1-rec-2003-12-31+corrected-errata-2013-02-20.html”. Section 5.2.2.2 addresses the labels. The first paragraph for that section is cut and pasted below:
“Although each taxonomy defines a single set of elements representing a set of business reporting Concepts, the human-readable XBRL documentation for those concepts, including labels (strings used as human-readable names for each concept) and other explanatory documentation, is contained in a resource element in the label Linkbase. The resource uses the @xml:lang attribute to specify the language used (via the XML standard lang attribute) and an optional classification of the purpose of the documentation (via a role attribute).”
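To make the role and language attributes in that paragraph concrete, here is a small sketch that reads them from a label linkbase using only the standard library. The XML fragment is hand-written for illustration and is not taken from any filing; which of the two COST names is the terse label and which is the standard label is an assumption, and `labels_by_role` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

LINK_NS = "http://www.xbrl.org/2003/linkbase"
XLINK_NS = "http://www.w3.org/1999/xlink"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hand-written fragment; the role assignment per label is assumed.
SAMPLE = """<link:linkbase xmlns:link="http://www.xbrl.org/2003/linkbase"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <link:labelLink xlink:type="extended"
      xlink:role="http://www.xbrl.org/2003/role/link">
    <link:label xlink:type="resource"
        xlink:label="us-gaap_TreasuryStockAcquiredAverageCostPerShare_lbl"
        xlink:role="http://www.xbrl.org/2003/role/label"
        xml:lang="en-US">Treasury Stock Acquired, Average Cost Per Share</link:label>
    <link:label xlink:type="resource"
        xlink:label="us-gaap_TreasuryStockAcquiredAverageCostPerShare_lbl"
        xlink:role="http://www.xbrl.org/2003/role/terseLabel"
        xml:lang="en-US">Average price per share</link:label>
  </link:labelLink>
</link:linkbase>"""


def labels_by_role(xml_text):
    """Return one record per label resource with its role, language, and text."""
    root = ET.fromstring(xml_text)
    rows = []
    for label in root.iter(f"{{{LINK_NS}}}label"):
        role_uri = label.get(f"{{{XLINK_NS}}}role", "")
        rows.append({
            "label_id": label.get(f"{{{XLINK_NS}}}label"),
            "role": role_uri.rsplit("/", 1)[-1],  # last path segment of the role URI
            "lang": label.get(f"{{{XML_NS}}}lang"),
            "text": (label.text or "").strip(),
        })
    return rows


for row in labels_by_role(SAMPLE):
    print(row["role"], "|", row["lang"], "|", row["text"])
```

If a filing carries several labels for the same concept under different roles, filtering on the role may be one way to pick a consistent common name per tag.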