Data Provenance & Compliance
One-page summary for procurement, ops, legal, and compliance teams evaluating WARN Firehose as a data vendor. Everything below is designed to be forwarded internally without further annotation.
Short version: Every data point we sell is sourced from public U.S. federal or state government filings. We do not access, license, or redistribute material non-public information. We aggregate, normalize, and cross-reference — that is the product.
1. Where the data comes from
All six primary datasets are retrieved from government systems via their public APIs, bulk downloads, or publicly available filing portals. No scraping behind authentication, no data brokers, no non-public feeds.
| Dataset | Source | Public access |
|---|---|---|
| WARN Act filings | State labor departments (50 states) | State portals, FOIA, required disclosure |
| SEC 8-K Item 2.05 (restructuring) | SEC EDGAR submissions API | sec.gov/edgar |
| Chapter 11 bankruptcy dockets | PACER court RSS feeds + FJC archive | Federal courts public access |
| H-1B / LCA petitions | DOL OFLC disclosure data | dol.gov |
| JOLTS labor turnover | BLS public API | bls.gov/jlt |
| DOL unemployment claims | DOL weekly claims release | doleta.gov |
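For teams that want to spot-check a source directly, here is a minimal sketch of how 8-K filings carrying Item 2.05 could be picked out of a payload shaped like the SEC EDGAR submissions API response. The function name `restructuring_8ks` and the sample values are illustrative assumptions, not our production code. The sketch assumes the response exposes parallel `form`, `items`, `filingDate`, and `accessionNumber` arrays under `filings.recent`.

```python
from typing import Any


def restructuring_8ks(submissions: dict[str, Any]) -> list[dict[str, str]]:
    """Return the 8-K filings whose item list includes Item 2.05."""
    recent = submissions["filings"]["recent"]
    rows = zip(
        recent["form"],
        recent["items"],
        recent["filingDate"],
        recent["accessionNumber"],
    )
    hits = []
    for form, items, filed, accession in rows:
        # Assumed format: items is a comma-separated string such as "2.05,9.01".
        item_codes = {code.strip() for code in items.split(",")}
        if form == "8-K" and "2.05" in item_codes:
            hits.append({"accession": accession, "filed": filed})
    return hits


# Illustrative payload with made-up accession numbers and dates.
sample = {
    "filings": {
        "recent": {
            "accessionNumber": [
                "0000000000-24-000001",
                "0000000000-24-000002",
                "0000000000-24-000003",
            ],
            "filingDate": ["2024-03-01", "2024-02-15", "2024-01-10"],
            "form": ["8-K", "10-Q", "8-K"],
            "items": ["2.05,9.01", "", "5.02"],
        }
    }
}

print(restructuring_8ks(sample))
```

Only the first sample row is both an 8-K and tagged with Item 2.05, so it is the only one returned. A compliance reviewer could run the same filter against a live response to confirm that a signal traces back to a public filing.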
2. MNPI analysis
Is any of this material non-public information? No. Every input to every signal we produce is published by the issuing government agency under a statutory disclosure regime. The WARN Act requires notice at least 60 days before a covered mass layoff or plant closing. An 8-K Item 2.05 filing is due within four business days after the company commits to the restructuring plan. Bankruptcy petitions are court records.
Are our derived scores MNPI? No. A score computed from public inputs using a published methodology remains non-MNPI. We publish the methodology on the Risk Signals page so clients and their compliance teams can audit it. A fund can replicate the score internally from the same sources; what we offer is saved integration time, not exclusive access.
Do we provide any non-public information? No. We do not have relationships with company insiders, inside counsel, or anyone who would be a source of MNPI. We do not buy or license data from third-party vendors whose provenance we cannot verify.
3. Alternative-data classification
For compliance officers running alt-data intake review, we fall into the narrowest and safest subcategory:
- Public-source aggregator: no exclusive data, no panels, no web-tracking, no location data, no expert networks, no transactional/card data, no web traffic, no satellite imagery.
- No PII: all data is company-level. Individual employee names and addresses are filtered out of WARN raw data where present.
- No consumer data: the product contains no data about natural persons as consumers or employees. Individual H-1B beneficiaries are identified only by employer and occupation per DOL's public disclosure format.
Most funds' alt-data intake checklists are built around the panel-data and web-scraping risk categories. Our answer to every relevant question is "not applicable — public government data."
4. Licensing & usage rights
Customers receive a non-exclusive license to use the data for internal analysis, investment decisions, and research. Redistribution of raw data to third parties is not permitted. Derived works (e.g. a research note citing our data) are permitted with attribution. Internal model inputs are permitted without attribution.
The underlying records are public facts published by U.S. federal and state government bodies, and facts as such are not copyrightable; our product is the aggregation, normalization, cross-reference, and scoring layer. Contract language covers service-level commitments and the prepared data layer, not the underlying public facts.
5. Security posture
6. Contract / procurement support
We can sign a standard MSA + DPA bundle. We also accept client-paper on hedge fund annual contracts. Typical turnaround on a standard contract is 3–5 business days. We have MNPI-aware language ready that addresses the points above in the form procurement/compliance teams expect.
7. Diligence FAQ
Do you have SOC 2? Not today. We are a focused team that started with two datasets and is now expanding to a six-dataset product. For institutional buyers who require SOC 2, we offer a mutual NDA plus our internal security checklist, with a commitment to complete SOC 2 Type I within the contract term if the engagement warrants it.
Who sees our queries? Query logs are retained for 30 days for rate-limit and abuse prevention, then aggregated. We do not share query patterns with anyone. We do not sell "what funds are searching for." If your compliance team requires no query logging, we can offer this on the annual contract tier.
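To make the retention policy concrete for reviewers, the sketch below shows one way a "30 days, then aggregated" rule could work. It is an illustration only, not a description of our actual pipeline: the function `apply_retention`, the log entry fields, and the choice of per-day counts as the aggregate are all assumptions.

```python
from collections import Counter
from datetime import date, timedelta

RETENTION_DAYS = 30  # retention window stated in the answer above


def apply_retention(raw_logs: list[dict], daily_counts: Counter, today: date) -> list[dict]:
    """Keep raw entries inside the window; fold older ones into per-day counts."""
    cutoff = today - timedelta(days=RETENTION_DAYS)
    kept = []
    for entry in raw_logs:
        if entry["day"] < cutoff:
            # Only a count survives for expired entries; the query text is dropped.
            daily_counts[entry["day"]] += 1
        else:
            kept.append(entry)
    return kept


# Hypothetical log entries for illustration.
logs = [
    {"day": date(2024, 5, 1), "query": "example query A"},
    {"day": date(2024, 6, 20), "query": "example query B"},
]
counts: Counter = Counter()
remaining = apply_retention(logs, counts, today=date(2024, 6, 30))
```

In this example the May 1 entry is older than 30 days, so only a count for that day is retained, while the June 20 entry stays in the raw log.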
What if a company sues over our score? Our disclaimers make clear the scores are algorithmic estimates derived from public data. We are not a rating agency. We have not received any such claim. Indemnity for claims from third parties against the client arising from use of the data is addressed in the MSA.
Ready to move to diligence?
We can send the MSA, security checklist, and sample data extract together. Most funds complete intake in under a week.