Last updated: 2026-05-12. Material methodology changes are noted in the changelog at the bottom of this page.

Sources at a glance

Dataset | Source | Cadence | Coverage | License
WARN Act notices | 50 state workforce agencies | Daily | 1988–present, 47/50 states | Public records / CC0
H-1B / LCA petitions | USCIS + DOL Office of Foreign Labor Certification | Monthly | FY2009–present (LCA back to FY2012) | Public records / CC0
SEC 8-K filings | SEC EDGAR | Daily | 2014–present (items 1.03 & 2.05) | Public records / CC0
Bankruptcy filings | PACER, SEC, FJC IDB | Daily | 2015–present | Public records / CC0
DOL unemployment claims | U.S. Department of Labor | Weekly | 1984–present | Public records / CC0
JOLTS labor turnover | BLS Job Openings & Labor Turnover Survey | Monthly | 2000–present | Public records / CC0

WARN Act data — 50 state scrapers

The Worker Adjustment and Retraining Notification (WARN) Act requires US employers with 100+ employees to give 60 days' advance notice of mass layoffs and plant closures. Each state collects these filings and publishes them through its own labor or workforce agency; there is no central federal repository.

WARN Firehose operates 50 independent state-specific scrapers, each tailored to that agency's publishing format (PDF, HTML table, Excel, or CSV). Every scraper runs daily at 03:00 UTC and:

  1. Fetches the latest filings from the agency website (or PDF, where the state still publishes that way)
  2. Parses structured fields: company name, city, county, state, employees affected, notice date, effective date, layoff type
  3. Generates a deterministic record ID: {STATE}-{YEAR}-{md5[:8]} so the same filing produces the same ID on re-scrape (idempotent ingest)
  4. Validates against a schema and writes to SQLite with WAL mode
  5. Fires webhooks to active subscribers when a filing matches one of their watched companies
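
Steps 3 and 4 can be sketched in a few lines of Python. This is an illustrative sketch, not the production scraper code. The fields fed into the hash, the use of the notice date for {YEAR}, the table name warn_records, and its columns are all assumptions made for the example. Only the {STATE}-{YEAR}-{md5[:8]} ID shape, SQLite, and WAL mode come from the description above.

```python
import hashlib
import sqlite3


def make_record_id(state: str, company: str, city: str, notice_date: str) -> str:
    """Build an ID shaped like {STATE}-{YEAR}-{md5[:8]}."""
    # ASSUMPTION: which fields are hashed is not documented; this sketch
    # hashes a whitespace- and case-normalized join of four fields and
    # takes the year from an ISO-formatted notice date (YYYY-MM-DD).
    key = "|".join(p.strip().lower() for p in (state, company, city, notice_date))
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return f"{state.strip().upper()}-{notice_date.strip()[:4]}-{digest[:8]}"


def open_db(path: str) -> sqlite3.Connection:
    """Open the database, request WAL journaling, and create the table."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    # Hypothetical schema, for illustration only.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS warn_records (
               record_id TEXT PRIMARY KEY,
               state TEXT NOT NULL,
               company TEXT NOT NULL,
               city TEXT,
               notice_date TEXT NOT NULL,
               employees_affected INTEGER,
               source_url TEXT NOT NULL
           )"""
    )
    return conn


def ingest(conn: sqlite3.Connection, row: dict) -> bool:
    """Insert a parsed filing; return True if new, False if already stored."""
    record_id = make_record_id(
        row["state"], row["company"], row["city"], row["notice_date"]
    )
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO warn_records VALUES (?, ?, ?, ?, ?, ?, ?)",
            (
                record_id,
                row["state"],
                row["company"],
                row["city"],
                row["notice_date"],
                row["employees_affected"],
                row["source_url"],
            ),
        )
    # rowcount is 1 when the row was inserted and 0 when the primary key
    # already existed, which is what makes a re-scrape a no-op.
    return cur.rowcount == 1
```

Because the ID is derived only from the filing's own fields, calling ingest twice with the same parsed row returns True the first time and False the second, and the table still holds a single record.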

Source URLs for every record are preserved in the database. Each public record page on this site links back to the originating state agency filing.

Cross-dataset joins

Public WARN notices alone don't tell the whole story. We cross-reference each WARN record against five other federal datasets to surface signals that individual sources miss:

These joins power the Risk Signal API and the cross-referenced data tables on every company page.
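
As a rough illustration of what such a join can look like, the sketch below matches WARN records to SEC 8-K filings in SQLite. It is a sketch under stated assumptions, not the site's actual join logic. The join key (a crude normalized company name), the 90-day window, the table and column names, and the sample rows are all hypothetical. Only the 8-K item numbers 1.03 and 2.05 come from the sources table above.

```python
import sqlite3

# ASSUMPTION: a hand-picked suffix list, for illustration only.
SUFFIXES = {"inc", "corp", "corporation", "llc", "ltd", "co"}


def normalize_name(name: str) -> str:
    """Crude join key: lowercase, strip punctuation, drop common suffixes."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(word for word in cleaned.split() if word not in SUFFIXES)


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical tables and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE warn (record_id TEXT PRIMARY KEY, name_key TEXT, notice_date TEXT);
        CREATE TABLE sec_8k (accession TEXT PRIMARY KEY, name_key TEXT,
                             item TEXT, filed_date TEXT);
        """
    )
    conn.execute(
        "INSERT INTO warn VALUES (?, ?, ?)",
        ("CA-2024-0a1b2c3d", normalize_name("Acme Corp."), "2024-01-15"),
    )
    conn.executemany(
        "INSERT INTO sec_8k VALUES (?, ?, ?, ?)",
        [
            ("hypothetical-0001", normalize_name("ACME Corporation"), "2.05", "2024-01-10"),
            ("hypothetical-0002", normalize_name("Globex LLC"), "1.03", "2024-01-12"),
        ],
    )
    return conn


def find_8k_matches(conn: sqlite3.Connection, window_days: int = 90) -> list:
    """Pair WARN records with 8-K filings on name key within a date window."""
    return conn.execute(
        """
        SELECT w.record_id, f.accession, f.item
        FROM warn AS w
        JOIN sec_8k AS f ON f.name_key = w.name_key
        WHERE ABS(julianday(f.filed_date) - julianday(w.notice_date)) <= ?
        ORDER BY w.record_id, f.accession
        """,
        (window_days,),
    ).fetchall()
```

With the sample rows, "Acme Corp." and "ACME Corporation" both reduce to the key "acme", so find_8k_matches pairs the WARN record with the item 2.05 filing and leaves the Globex filing out. Name-based matching of this kind produces false positives and misses, so a production join would likely need a stronger identifier where one is available.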

Normalization rules

Public datasets are messy. WARN Firehose applies the following normalization in the ingest pipeline:

Documented gaps

We do not generate or estimate data we don't have. The following gaps are real and disclosed:

Arkansas, Wyoming, and New Hampshire do not publish WARN Act data publicly. We carry no records from these three states. We do not extrapolate from BLS claims data to fill the gap. State agencies have been contacted; if they begin publishing, we will ingest within 7 days.
LCA petitions before FY2012 are unavailable: the DOL legacy URL paths return HTTP 404. As a result, FY2008–FY2011 LCA records are not on the site.
The USCIS H-1B Employer Data Hub publishes data only back to FY2009. FY2008 and earlier years are not available through public APIs.
Privacy redactions: states sometimes redact small-employer filings (under 50 affected workers). We mirror their redaction: if the source state PDF lists "Confidential" as the company name, we publish it as such with the source link rather than guessing.

Data quality status

Current quality metrics (refreshed 2026-03-08):

Accuracy guarantees

WARN Firehose surfaces public records as filed. We do not guarantee that the underlying agencies are correct. If a state agency publishes an error (wrong employer name, wrong employee count, wrong effective date), we mirror that error until they correct it. We are a data aggregator, not a primary source.

If you find a record we ingested incorrectly — for example, a parsing failure that misattributed a row — email [email protected] with the record ID and source URL and we will investigate within one business day.

Open source

The scraper pipeline, normalization code, and SEO page generators are publicly auditable at github.com/sendkamal. Issues, PRs, and reports of parsing failures are welcome.

Changelog
