
Data Provenance & Compliance

One-page summary for procurement, ops, legal, and compliance teams evaluating WARN Firehose as a data vendor. Everything below is designed to be forwarded internally without further annotation.

Short version: Every data point we sell is sourced from public U.S. federal or state government filings. We do not access, license, or redistribute material non-public information. We aggregate, normalize, and cross-reference — that is the product.

1. Where the data comes from

All six primary datasets are retrieved from government systems via their public APIs, bulk downloads, or publicly available filing portals. No scraping behind authentication, no data brokers, no non-public feeds.

Dataset | Source | Public access
WARN Act filings | State labor departments (50 states) | State portals, FOIA, required disclosure
SEC 8-K Item 2.05 (restructuring) | SEC EDGAR submissions API | sec.gov/edgar
Chapter 11 bankruptcy dockets | PACER court RSS feeds + FJC archive | Federal courts public access
H-1B / LCA petitions | DOL OFLC disclosure data | dol.gov
JOLTS labor turnover | BLS public API | bls.gov/jlt
DOL unemployment claims | DOL weekly claims release | doleta.gov
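
For teams that want to spot-check one of these sources independently, here is a minimal sketch that filters 8-K filings reporting Item 2.05 out of a payload shaped like the JSON served by the SEC's public submissions endpoint (data.sec.gov/submissions/CIK##########.json). The column-oriented field names (`form`, `items`, `filingDate`, `accessionNumber` under `filings.recent`) follow that public format as commonly documented and should be verified against the SEC's own documentation. The function name and the sample values are made up for illustration, and this is not a description of our production pipeline.

```python
from typing import Any


def item_205_filings(submissions: dict[str, Any]) -> list[dict[str, str]]:
    """Return 8-K filings whose reported items include 2.05."""
    recent = submissions.get("filings", {}).get("recent", {})
    forms = recent.get("form", [])
    items = recent.get("items", [])
    dates = recent.get("filingDate", [])
    accessions = recent.get("accessionNumber", [])

    matches = []
    # The "recent" block is column-oriented: index i in each list is one filing.
    for form, item_str, filed, accession in zip(forms, items, dates, accessions):
        reported = {part.strip() for part in item_str.split(",")}
        if form == "8-K" and "2.05" in reported:
            matches.append({"accession": accession, "filed": filed})
    return matches


# Made-up sample payload (not real filings), mirroring the assumed layout.
sample = {
    "filings": {
        "recent": {
            "form": ["8-K", "10-Q", "8-K"],
            "items": ["2.05,9.01", "", "5.02"],
            "filingDate": ["2024-01-15", "2024-02-01", "2024-03-10"],
            "accessionNumber": [
                "0000000000-24-000001",
                "0000000000-24-000002",
                "0000000000-24-000003",
            ],
        }
    }
}

print(item_205_filings(sample))
```

Only the first sample row is returned, since it is the only 8-K whose item list contains 2.05. To run this against live data, the same function would be applied to the parsed JSON response for a given CIK.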

2. MNPI analysis

Is any of this material non-public information? No. Every input to every signal we produce is published by the issuing government agency or court under a statutory disclosure regime. The WARN Act generally requires notice at least 60 days before a covered plant closing or mass layoff. An 8-K reporting Item 2.05 is due within four business days of the company committing to the restructuring plan, typically at board approval. Bankruptcy petitions are public court records.
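
The two timing windows cited above can be worked through with simple date arithmetic. The sketch below is illustrative only: the helper names are hypothetical, the business-day counter skips weekends but ignores federal holidays (an assumption that a real filing calendar would need to correct), and the WARN calculation uses a flat 60 calendar days without modeling statutory exceptions.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Advance `start` by `days` weekdays (Mon-Fri); holidays are ignored."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            remaining -= 1
    return current


def warn_earliest_layoff(notice_date: date) -> date:
    """Earliest layoff date under a flat 60-calendar-day notice period."""
    return notice_date + timedelta(days=60)


trigger = date(2024, 3, 7)  # a Thursday, chosen arbitrarily
eight_k_due = add_business_days(trigger, 4)
layoff_earliest = warn_earliest_layoff(trigger)

print(eight_k_due)      # 2024-03-13
print(layoff_earliest)  # 2024-05-06
```

In this example a Thursday trigger pushes the four-business-day window across a weekend to the following Wednesday, while a WARN notice dated the same day points to a layoff no earlier than 60 days out.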

Are our derived scores MNPI? No. A score computed from public inputs using a published methodology remains non-MNPI. We publish the methodology on the Risk Signals page so clients and their compliance teams can audit it. A fund can replicate the score internally from the same sources; we save integration time rather than provide exclusive access.

Do we provide any non-public information? No. We do not have relationships with company insiders, inside counsel, or anyone who would be a source of MNPI. We do not buy or license data from third-party vendors whose provenance we cannot verify.

3. Alternative-data classification

For compliance officers running alt-data intake review, we fall into the narrowest and safest subcategory: public government data.

Most funds' alt-data intake checklists are built around the panel-data and web-scraping risk categories. Our answer to every relevant question is "not applicable — public government data."

4. Licensing & usage rights

Customers receive a non-exclusive license to use the data for internal analysis, investment decisions, and research. Redistribution of raw data to third parties is not permitted. Derived works (e.g. a research note citing our data) are permitted with attribution. Internal model inputs are permitted without attribution.

The underlying data consists of public facts drawn from federal and state government records, and facts themselves are not copyrightable; our product is the aggregation, normalization, cross-reference, and scoring layer. Contract language covers service-level commitments and the prepared data layer, not the underlying public facts.

5. Security posture

Transport: TLS 1.2+ on all API + webhook traffic
At rest: Encrypted EBS volumes, backups in S3 with SSE-S3
Auth: API keys + tier-scoped rate limits; no client secrets stored plaintext
Delivery options: REST API, webhooks, Snowflake share, S3 parquet drop
Hosting: AWS (us-east-1). No data outside continental U.S.
Data retention: 10-year window on SEC + bankruptcy, all-time on labor data
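
Client security teams sometimes want to enforce the transport floor on their own side as well. As a rough sketch using only the Python standard library, the snippet below builds a TLS context that refuses anything older than TLS 1.2 and prepares an authenticated request. The endpoint URL, the `Bearer` header scheme, and the function names are placeholders assumed for illustration; the actual values come from the API documentation supplied at onboarding.

```python
import ssl
import urllib.request

# Hypothetical endpoint, used only to make the example concrete.
API_URL = "https://api.example.com/v1/warn"


def build_tls_context() -> ssl.SSLContext:
    """Create a client context that rejects protocols older than TLS 1.2."""
    ctx = ssl.create_default_context()  # verifies certificates and hostnames
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def build_request(api_key: str) -> urllib.request.Request:
    """Attach the API key; the header scheme here is an assumption."""
    return urllib.request.Request(
        API_URL,
        headers={"Authorization": f"Bearer {api_key}"},
    )


ctx = build_tls_context()
req = build_request("YOUR_API_KEY")

# To actually send the request (not executed here):
# with urllib.request.urlopen(req, context=ctx, timeout=10) as resp:
#     body = resp.read()
```

Keeping the key in an environment variable or secrets manager rather than in source code mirrors the "no client secrets stored plaintext" practice on the client side.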

6. Contract / procurement support

We can sign a standard MSA + DPA bundle. We also accept client paper (the client's own contract templates) on hedge fund annual contracts. Typical turnaround on a standard contract is 3–5 business days. We have MNPI-aware language ready that addresses the points above in the form procurement and compliance teams expect.

7. Diligence FAQ

Do you have SOC 2? Not today. We are a small team that started with two datasets and is expanding into a six-dataset product. For institutional buyers who require SOC 2, we offer a mutual NDA plus our internal security checklist, with a commitment to complete SOC 2 Type I within the contract term if the engagement warrants it.

Who sees our queries? Query logs are retained for 30 days for rate-limit and abuse prevention, then aggregated. We do not share query patterns with anyone. We do not sell "what funds are searching for." If your compliance team requires no query logging, we can offer this on the annual contract tier.

What if a company sues over our score? Our disclaimers make clear the scores are algorithmic estimates derived from public data. We are not a rating agency. We have not received any such claim. Indemnity for claims from third parties against the client arising from use of the data is addressed in the MSA.

Ready to move to diligence?

We can send the MSA, security checklist, and sample data extract together. Most funds complete intake in under a week.

This page is provided for the convenience of institutional data buyers and their compliance teams. It summarizes our data practices as of the date below but is not a contract. Binding terms are in the MSA and Terms of Service.
Version 1.0 · Last updated 2026-04-22