Forum Replies Created

Viewing 15 posts - 556 through 570 (of 598 total)
  • in reply to: Extracting Blocktext tags (i.e. sections of filings) #203004
    David Tauriello
    Keymaster

    Hi Satish – these facts are HTML encoded; there is no ‘plain text’ version – the data is in there, but might be under several HTML tags for formatting purposes. You have a couple of options:

    • use regex in your routine to remove tags after you’ve retrieved the data (something like <.*?> should leave you with plain text, which might be tough to read … maybe replace each tag with a space, tab or line break rather than deleting it?)
    • concatenate the fact.id with this string to create a URL that renders the fact: CONCAT( https://csuite.xbrl.us/php/dispatch.php?Task=htmlExportFact&FactID= , xxxxxx ) – we’re using this approach in some of the spreadsheet templates posted in the XBRL Data Community
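
    Both options above can be sketched in a few lines of Python. This is only a rough illustration: the helper names strip_tags and fact_url are made up for this example, and the regex is the simple one suggested above, so it will not cope with every HTML edge case (a ">" inside an attribute value, for instance).

```python
import html
import re

# The simple pattern suggested above; DOTALL lets a tag span line breaks.
TAG_RE = re.compile(r"<.*?>", re.DOTALL)

# URL prefix quoted in the post; the fact.id is appended to it.
FACT_URL_PREFIX = "https://csuite.xbrl.us/php/dispatch.php?Task=htmlExportFact&FactID="


def strip_tags(value: str) -> str:
    """Replace each tag with a space, decode entities, collapse whitespace."""
    text = TAG_RE.sub(" ", value)
    text = html.unescape(text)
    return re.sub(r"\s+", " ", text).strip()


def fact_url(fact_id) -> str:
    """Build the link that renders a single fact in the browser."""
    return f"{FACT_URL_PREFIX}{fact_id}"


plain = strip_tags("<p>R&amp;D <b>costs</b></p>")
link = fact_url(221545926)
```

    Replacing tags with a space (rather than nothing) keeps words from adjacent table cells or paragraphs from running together.
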
    in reply to: Extracting Blocktext tags (i.e. sections of filings) #203027
    Satish Sahoo
    Participant

    Hi David,
    Thanks a lot for your response. This is very helpful.
    I have another quick, related question. I see that, at least since Inline XBRL started, the section files are posted separately with the filings on the EDGAR website. Is it possible to point to the URLs of those files using the API? I’m not sure whether this is within the API framework; if it is, that would be great. Thanks

    in reply to: Extracting Blocktext tags (i.e. sections of filings) #203044
    Satish Sahoo
    Participant

    Hi David,
    After some more digging, I was able to get to the files that contain the text for any particular TextBlock tag. But I realized that fact.value doesn’t contain the entire block; it seems to be truncated. Is there a size limit on the fact.value output? If so, is there a setting that makes fact.value return the entire text block?

    As an example you can check the fact.value of the following fact id.
    https://csuite.xbrl.us/php/dispatch.php?Task=htmlExportFact&FactID=221545926

    Thanks

    in reply to: Extracting Blocktext tags (i.e. sections of filings) #203046
    David Tauriello
    Keymaster

    Hi Satish – thanks for your question. As part of the process to keep our Public Filings Database current, we make exact copies of the documents submitted to the SEC, FERC and other regulators that contain XBRL (as .xml instances or .html files that have inline XBRL in them). We do not copy the exhibit files (.htm but without XBRL), images, text files, etc. You can use the report.sec-url field to get the page on EDGAR where these additional files exist.

    in reply to: Extracting Blocktext tags (i.e. sections of filings) #203047
    David Tauriello
    Keymaster

    Hi Satish – there might be a character limit if you’re trying to get the HTML from a spreadsheet. This is why the spreadsheet uses a hyperlink to the browser view of the fact when there’s a “<\” character combination.

    If you query with curl or python, or use an API testing tool, you should see all of the HTML (the data in the HTML we present is the same data in our database).
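
    A rough sketch of what such a Python query might look like, using only the standard library. Everything specific here is an assumption to check against the API documentation: the base URL and /fact/search path, the "fields" parameter, bearer-token authentication, and the "data" key in the JSON response. The function names are made up for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumption: base URL of the XBRL API; confirm against the API documentation.
API_BASE = "https://api.xbrl.us/api/v1"


def build_fact_request(fact_id, access_token):
    """Prepare (but do not send) a request for one fact's id and value."""
    query = urllib.parse.urlencode(
        {"fact.id": fact_id, "fields": "fact.id,fact.value"}
    )
    # Assumption: facts are searched at /fact/search with a bearer token.
    return urllib.request.Request(
        f"{API_BASE}/fact/search?{query}",
        headers={"Authorization": f"Bearer {access_token}"},
    )


def fetch_fact_value(fact_id, access_token):
    """Send the request and return the untruncated fact.value string."""
    with urllib.request.urlopen(build_fact_request(fact_id, access_token)) as resp:
        payload = json.load(resp)
    # Assumption: results come back as a list under the "data" key.
    return payload["data"][0]["fact.value"]


request = build_fact_request(221545926, "YOUR_ACCESS_TOKEN")
```

    Calling fetch_fact_value(221545926, token) with a valid token and then checking len() on the result is a quick way to confirm whether the full HTML came back.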

    in reply to: Extracting Blocktext tags (i.e. sections of filings) #203054
    Satish Sahoo
    Participant

    Thanks, David. I think you pointed me in the right direction. I guess the truncation happens when I write the JSON list returned by the API into a pandas DataFrame. So the API is still producing the entire section; it’s just the output rendering that causes the truncation. Thanks
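
    For anyone who hits the same thing, here is a small sketch of that behavior with made-up data (the record below is hypothetical, not real API output). By default pandas shortens long cells when it prints a DataFrame (the display.max_colwidth option), while the stored value stays complete.

```python
import pandas as pd

# Hypothetical stand-in for a long TextBlock value returned by the API.
long_html = "<p>" + "x" * 500 + "</p>"
records = [{"fact.id": 1, "fact.value": long_html}]
df = pd.DataFrame(records)

# The cell itself holds the complete string.
stored_value = df.loc[0, "fact.value"]

# The default printed view abbreviates long cells with "...".
default_view = repr(df)

# Lifting the column-width limit makes the printed view show whole cells.
with pd.option_context("display.max_colwidth", None):
    full_view = repr(df)
```

    Reading the cell directly (df.loc[...]) or writing the column to a file sidesteps the display limit entirely.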

    in reply to: The XBRL API #203100
    David Tauriello
    Keymaster

    A Public Filings Database User writes:

    I wonder if there is a flag that differentiates among the Balance Sheet, Income Statement and Cash Flow statements in the database. Is there a way to tie each value in the fact table to one of the statements?

    The relationship table holds information about the structure of the report (sections, order of line items as defined in the presentation linkbase), and the fact table holds the data corresponding to the concepts that are the report’s line items.

    One of the recent enhancements to the XBRL Filed Data spreadsheet extension was the addition of a new custom function – =XBRL.showSQL() – that translates XBRL API queries back into SQL statements that can be used with the Public Filings Database.

    You can use the Full ESEF or SEC Report template (https://xbrl.us/wp-content/uploads/2022/08/FullReport-DimensionPivot-Template.xlsx) to generate the following queries for facts and relationships. The data rows are ‘lined up’ in the report with a lookup on concept.local-name, which is synonymous with relationship.target-name (fact.element_local_name and target_qname.local_name in the SQLs). Hopefully, that gets you on the path to a single SQL query that accomplishes your goal.

    SQL #1

    SELECT (fact.fact_value)::text AS "fact.value"
         , (fact.element_local_name)::varchar AS "concept.local-name"
         , (re.is_base)::boolean AS "concept.is-base"
         , (fact.fact_id)::int AS "fact.id"
         , (fact.fiscal_year)::int AS "period.fiscal-year"
         , (fact.fiscal_period)::text AS "period.fiscal-period"
    FROM fact
    JOIN element AS element ON fact.element_id = element.element_id
    JOIN report AS fact_report ON fact.accession_id = fact_report.report_id
    JOIN report_element AS re ON fact.element_id = re.element_id AND re.report_id = fact_report.report_id AND re.report_id = fact.accession_id
    WHERE fact_report.report_id in ('496120')
    ORDER BY (fact.element_local_name)::varchar ASC

    SQL #2

    SELECT (target_qname.local_name)::varchar AS "relationship.target-name"
         , (dts_relationship.to_element_id)::int AS "relationship.target-concept-id"
         , (label_uri.uri)::varchar AS "relationship.preferred-label"
         , (rel_dts_network.description)::varchar AS "network.role-description"
         , (dts_relationship.tree_sequence)::int AS "relationship.tree-sequence"
         , (dts_relationship.tree_depth)::int AS "relationship.tree-depth"
    FROM dts_relationship
    JOIN element AS to_element ON dts_relationship.to_element_id = to_element.element_id
    JOIN qname AS target_qname ON to_element.qname_id = target_qname.qname_id
    LEFT JOIN uri AS label_uri ON dts_relationship.preferred_label_role_uri_id = label_uri.uri_id
    JOIN dts_network AS rel_dts_network ON rel_dts_network.dts_network_id = dts_relationship.dts_network_id
    JOIN qname AS el_qname ON rel_dts_network.extended_link_qname_id = el_qname.qname_id
    WHERE rel_dts_network.dts_id in ('624413') AND LOWER(el_qname.local_name) in ('presentationlink')
    ORDER BY (rel_dts_network.description)::varchar ASC, (dts_relationship.tree_sequence)::int ASC

    The raw output from =XBRL.showSQL() can be cleaned up and made ready to use by wrapping the formula in =CLEAN() – this will remove the beginning and ending quotes and change all double quotes to singles … just remove the offset and limit and you’re good to go.
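
    If it helps to see the lookup outside of SQL, here is a minimal Python sketch of the same idea: match each fact’s concept.local-name to relationship.target-name and pick up the network.role-description, which names the section (statement) where the line item is presented. The rows and role descriptions below are hypothetical examples, not output from the queries above, and the function names are made up for this sketch.

```python
from collections import defaultdict

# Hypothetical rows shaped like the output of SQL #1 (facts) ...
facts = [
    {"concept.local-name": "Assets", "fact.value": "1000"},
    {"concept.local-name": "Revenues", "fact.value": "250"},
    {"concept.local-name": "EntityRegistrantName", "fact.value": "Example Co"},
]

# ... and SQL #2 (presentation relationships).
relationships = [
    {"relationship.target-name": "Assets",
     "network.role-description": "0002 - Statement - Balance Sheet"},
    {"relationship.target-name": "Revenues",
     "network.role-description": "0004 - Statement - Income Statement"},
]


def sections_by_concept(relationship_rows):
    """Map each target concept name to the sections it is presented in."""
    lookup = defaultdict(list)
    for row in relationship_rows:
        name = row["relationship.target-name"]
        description = row["network.role-description"]
        if description not in lookup[name]:
            lookup[name].append(description)
    return lookup


def tag_facts(fact_rows, relationship_rows):
    """Attach the list of matching sections to every fact row."""
    lookup = sections_by_concept(relationship_rows)
    return [
        {**fact, "sections": lookup.get(fact["concept.local-name"], [])}
        for fact in fact_rows
    ]


tagged = tag_facts(facts, relationships)
```

    A concept can appear in more than one section, which is why each fact gets a list; facts that are not in any presentation network end up with an empty list.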

    in reply to: The XBRL API #204888
    Tim Bui
    Participant

    Hello, I am trying to get the holdings of different ETFs from different issuers, such as SPY, IVV, … Does anyone know how to get this data in CSV form so that I can upload it into my database for further analysis? I can go to each issuer’s website and download the holdings ticker by ticker, but that is very tedious. Thank you in advance for your help!

    in reply to: Lists of similar concepts #205327
    Anonymous
    Inactive

    I haven’t been successful with my standardization attempts, but I built this basic infrastructure to run quick queries and try different approaches; maybe someone will find it useful: https://github.com/gmzi/edgarQ

    in reply to: Lists of similar concepts #205332
    Peter W Reed
    Participant

    Gaston, thank you for the GitHub program; it looks like a very good tool. I wish I had seen it 2.5 years ago. I wrote a set of modules to scrape the same information from EDGAR. If my program becomes too unwieldy to maintain, I’ll switch over to your downloading style.

    One comment on standardization: IMHO it is impossible if it also has to work across all time periods. I’ve found instances where the GAAP concept changed over time (I’m not certain “concept” is the right term). This is not to say we can’t apply NLP/ML to bridge the terms.

    A site I’ve found useful in addition to the XBRL taxonomy files (Excel version) is CalcBench – https://www.calcbench.com/home/standardizedmetrics (try it with any GAAP tag). Note – I have no conflict of interest with the company.

    in reply to: Lists of similar concepts #205423
    Anonymous
    Inactive

    Thanks Peter!! I see CalcBench provides a lot of guidance about which sections to look at. I’ll check how far I can get in the free tier, but the search function seems really powerful.

    in reply to: Lists of similar concepts #206908
    Peter Miller
    Participant

    Hi guys,

    Would it be possible for us to create a chat group (on Discord or something similar)?

    There we could discuss approaches to standardize financial data. I feel like everyone has tried something different and we have never pooled our knowledge. In any case, I have invested over 500 hours in this topic and some failures could have been avoided if I had discussed my approach beforehand.

    That being said, I’m sure the “ultimate approach” will require manual assignment of “tag keys” (which will have to be created first). This alone is a team effort.

    Also, I think we can support each other from a technical point of view. My only Python experience is three years of working with pandas, and I am not very good at reading XBRL documents.

    I’m not up to date on what the most secure and convenient chat platform is these days.

    Does anyone have any suggestions?

    in reply to: Lists of similar concepts #206930
    Peter W Reed
    Participant

    Hi Peter,

    I think a chat room for this discussion topic is a good idea. If our goal is to build software that automates the process, GitHub comes to mind. GitHub has a discussion feature (which I have not used).

    With that said, I’ve abandoned the Esperanto approach and decided to leverage GAAP terminology instead. The indexes of my DataFrames are GAAP tags, e.g., us-gaap:TreasuryStockSharesAcquired. Instead of a traditional concept grouping such as “Balance Sheet” aggregation, I’m using the “calculation” sheet in the XBRL taxonomy file (.xls). Here is a link to the 2021 version: 2021_xbrl_taxonomy. It has a column for the GAAP tag and its associated common name.

    in reply to: Lists of similar concepts #207001
    Peter Miller
    Participant

    Okay guys, I have started a GitHub discussion. I am new to GitHub but please join and tell me your thoughts.

    https://github.com/ThePythonDude/XBRL-financial-statement-standardization/discussions/1

    in reply to: Lists of similar concepts #207102
    Peter W Reed
    Participant

    Hi Peter, I commented on your GitHub post. As I scrolled through the posts here, I recognized lost opportunities. So many ideas have been shared. It is time we captured them in an open-source environment like GitHub.
