All redistricting analytics must eventually ground themselves in the same initial dataset. This module explores the bedrock of American democracy: the Public Law (PL) 94-171 data release. By law, the Census Bureau must deliver extremely granular counts of population and race (down to the block level) to the states within one year of Census Day. However, as computing power has increased, the threat of "re-identifying" individual respondents from this aggregate data led the Bureau to adopt Differential Privacy, a mathematical framework under which statistical "noise" is deliberately injected into the 2020 Census block data. As analysts, we must now model the political world on a foundational dataset that we know is intentionally inexact.
In This Module
- Covers: The statutory requirements of PL 94-171, Census Block geography, and the Disclosure Avoidance System (Differential Privacy).
- Why it matters: If you construct a multi-million-dollar Section 2 lawsuit arguing a district is a few hundred people short of a majority-minority threshold, defense lawyers will immediately attack the error introduced by Differential Privacy. You must understand the noise in the data before you present the outcome.
- After this module, the reader can: Define the PL 94-171 structure and legally contextualize the noise parameters of the 2020 Census data.
Reading List
Conceptual
-
A high-level primer on the single most important dataset in the American political system. It explains what specific variables the Census is legally required to deliver, the timeline for delivery, and how state legislatures instantly load this data into GIS software to begin drawing maps.
-
The official explanation directly from the source. The Bureau explains that due to modern algorithmic matching, traditional data anonymization no longer works. To protect privacy, they inject mathematical noise at the granular block level (e.g., reporting 5 people in a block that actually has 3). This concept is crucial for analysts defending data in court.
Methods
-
A highly technical, essential paper. The authors tested how the injected "noise" ripples upward geographically. Because congressional districts must be as nearly equal in population as practicable, and are in practice often balanced to within a single person, attempting to balance districts using artificially "noisy" census blocks creates serious headaches: a map that looks perfectly balanced in the published counts may not be balanced in the true ones. The authors evaluate whether this noise actually impacts the partisan outcomes of the generated maps.
Technical Reference
-
The literal codebook. As an analyst, you are going to be interacting directly with the raw tabular files (e.g., Table P1: Race, and Table P2: Hispanic or Latino, and Not Hispanic or Latino by Race). Practitioners must keep this documentation at hand to understand precisely how the variables are coded and how multiracial respondents are aggregated.
Key Concepts
What is Public Law 94-171 and why is it the most important dataset in American electoral politics?
Public Law 94-171 requires the U.S. Census Bureau to deliver extremely granular population and race counts, down to the census block level, to state legislatures within one year of each decennial census. This dataset is the single most consequential in American politics because it is the data that state legislatures load directly into GIS software to draw district maps that determine political representation for the next decade.
What is Differential Privacy and how does noise injection affect redistricting accuracy?
Differential Privacy is a mathematical framework under which the Census Bureau deliberately injects statistical "noise" into block-level population counts to prevent re-identification of individual respondents. While this protects privacy, it creates a fundamental problem for redistricting: states must draw congressional districts with populations as nearly equal as practicable, yet the block counts used to balance them are inexact. The noise carries into any district assembled from blocks (state population totals, which the Bureau publishes as enumerated, are the notable exception), creating margin-of-error headaches for VRA litigation.
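To make the idea of noise injection concrete, here is a minimal sketch in Python. It is not the Bureau's Disclosure Avoidance System: it simply adds rounded Gaussian noise to each block count and clips negative results to zero as a crude stand-in for post-processing. The function name `noisy_block_counts`, the block counts, and the `sigma` value are all hypothetical choices for illustration, not Bureau parameters.

```python
import random

def noisy_block_counts(true_counts, sigma, seed=0):
    """Return a noisy copy of block counts (illustrative only).

    Adds integer-rounded Gaussian noise to every count, then clips at
    zero so that no block reports a negative population.
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    noisy = []
    for count in true_counts:
        perturbed = count + round(rng.gauss(0, sigma))
        noisy.append(max(0, perturbed))  # crude stand-in for post-processing
    return noisy

# Hypothetical true populations for eight census blocks.
true_counts = [3, 0, 12, 47, 1, 0, 8, 25]

# sigma=2.0 is an arbitrary illustrative value, not a Bureau parameter.
published = noisy_block_counts(true_counts, sigma=2.0)
errors = [p - t for p, t in zip(published, true_counts)]

print("published:", published)
print("errors:   ", errors)
```

Note how the same absolute error matters far more for a block of 3 people than for a block of 47, which is why very small blocks are the least reliable unit in the release.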
How does Differential Privacy noise ripple upward through geographic scales to affect redistricting?
Kenny, Kuriwaki, McCartan, and colleagues tested how noise injected at the census block level propagates to higher geographic scales. Because states typically draw congressional districts to near-zero population deviation, balancing districts on artificially noisy block counts means that a map that is equal on paper may be unequal in the true counts. Defense lawyers in VRA cases will immediately attack the Differential Privacy error in any majority-minority threshold claim.
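One way to reason about a threshold claim under noise is a small simulation. The sketch below is not the method used by Kenny and colleagues and does not reproduce the Bureau's algorithm; it assumes independent rounded Gaussian noise on each block's minority and non-minority counts, and every number in it (block counts, `sigma`, the 50% threshold, the number of trials) is hypothetical. It estimates how often a district whose true minority share is just above 50% would appear to fall below 50% in the published data.

```python
import random

def fraction_below_threshold(minority, other, sigma,
                             threshold=0.5, trials=500, seed=1):
    """Fraction of simulated releases in which the district's minority
    share falls below `threshold` after noise is added to every block."""
    rng = random.Random(seed)
    below = 0
    for _ in range(trials):
        # Perturb each block independently, clipping negatives to zero.
        noisy_minority = sum(
            max(0, m + round(rng.gauss(0, sigma))) for m in minority
        )
        noisy_other = sum(
            max(0, o + round(rng.gauss(0, sigma))) for o in other
        )
        total = noisy_minority + noisy_other
        if total > 0 and noisy_minority / total < threshold:
            below += 1
    return below / trials

# Hypothetical district of 300 blocks: 15,050 minority VAP vs. 15,000
# other VAP, i.e. a true minority share just over 50%.
minority_vap = [51] * 50 + [50] * 250
other_vap = [50] * 300

risk = fraction_below_threshold(minority_vap, other_vap, sigma=2.0)
print(f"Simulated releases below the threshold: {risk:.1%}")
```

The design point is that absolute error in a district total grows with the number of blocks summed, so a margin of 50 people across 300 blocks is within reach of the noise, even though each block is only off by a few people.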
What specific Census tables must a redistricting analyst pull from the PL 94-171 release?
Analysts must interact directly with Table P1 (Race), Table P2 (Hispanic or Latino, and Not Hispanic or Latino by Race), and Table P3 (Race for the Population 18 Years and Over) to isolate the Voting Age Population (VAP). Practitioners must understand precisely how racial variables are coded, how multiracial respondents are aggregated, and how to cross-reference Total Population with VAP to build legally defensible demographic baselines.
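The cross-referencing described above can be sketched in a few lines of Python. The field names below follow the naming used by the Census Data API for the 2020 PL 94-171 tables (`P1_001N` for total population, `P3_001N` for the population 18 and over, `P3_004N` for Black or African American alone, 18 and over); treat them as assumptions and confirm each one against the technical documentation before relying on it. The block rows and the helper `vap_summary` are hypothetical.

```python
# Assumed field names; verify against the PL 94-171 technical documentation.
P1_TOTAL = "P1_001N"                # Table P1: total population
P3_TOTAL_18PLUS = "P3_001N"         # Table P3: total population 18 and over
P3_BLACK_ALONE_18PLUS = "P3_004N"   # Table P3: Black alone, 18 and over

def vap_summary(blocks):
    """Sum block rows into total population, VAP, and Black-alone VAP share."""
    total_pop = sum(row[P1_TOTAL] for row in blocks)
    vap = sum(row[P3_TOTAL_18PLUS] for row in blocks)
    black_vap = sum(row[P3_BLACK_ALONE_18PLUS] for row in blocks)
    share = black_vap / vap if vap else 0.0  # guard against empty districts
    return {
        "total_pop": total_pop,
        "vap": vap,
        "black_alone_vap": black_vap,
        "black_alone_vap_share": share,
    }

# Hypothetical block rows for a proposed district.
blocks = [
    {"P1_001N": 120, "P3_001N": 90, "P3_004N": 40},
    {"P1_001N": 80, "P3_001N": 60, "P3_004N": 35},
    {"P1_001N": 0, "P3_001N": 0, "P3_004N": 0},  # unpopulated block
]

summary = vap_summary(blocks)
print(summary)
```

This sketch counts only respondents who reported a single race. Respondents reporting two or more races sit in separate cells of the same tables, so the share changes depending on whether you use an "alone" or an "alone or in combination" definition; state which one you used.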
Goal: Add the exact Census variables required for your project to your Methodology Portfolio.
You cannot simply request "data" from the Census Bureau. You must specify exactly which geometries and which tables you are pulling.
- Define the Geometry: Specify the geographic boundaries you will be pulling from the TIGER/Line Shapefiles. Are you pulling the geometry at the Block, Block Group, or Tract level?
- Define the Tables: Identify the specific PL 94-171 tables required for your hypothesis. If you are trying to prove racial vote dilution, you must cross-reference Table P1 (Race) with Table P3 (Race for the Population 18 Years and Over) to isolate the Voting Age Population (VAP).
- Acknowledge DP Impact: Add a footnote to your Portfolio explicitly stating that your block-level counts have been processed through the Disclosure Avoidance System and therefore carry error for which no block-specific margin is published.
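The geometry choice in the first step also determines how you roll block counts up. A 15-digit census block GEOID concatenates state (2 digits), county (3), tract (6), and block (4), and the first digit of the block code is the block group, so higher-level units can be derived by truncating the identifier. The Python sketch below uses that structure; the helper `aggregate_blocks`, the GEOIDs, and the counts are hypothetical examples rather than real blocks.

```python
from collections import defaultdict

# Number of leading GEOID digits that identify each geographic level.
GEOID_PREFIX_LENGTH = {"tract": 11, "block_group": 12, "block": 15}

def aggregate_blocks(block_counts, level):
    """Sum block-level counts up to the requested level by GEOID prefix."""
    if level not in GEOID_PREFIX_LENGTH:
        raise ValueError(f"unknown level: {level!r}")
    prefix_length = GEOID_PREFIX_LENGTH[level]
    totals = defaultdict(int)
    for geoid, count in block_counts.items():
        # GEOIDs must be kept as strings so leading zeros survive.
        if len(geoid) != 15 or not geoid.isdigit():
            raise ValueError(f"not a 15-digit block GEOID: {geoid!r}")
        totals[geoid[:prefix_length]] += count
    return dict(totals)

# Hypothetical block GEOIDs mapped to hypothetical population counts.
block_counts = {
    "010010201001000": 10,
    "010010201001001": 5,
    "010010201002000": 7,
    "010010202001000": 20,
}

by_block_group = aggregate_blocks(block_counts, "block_group")
by_tract = aggregate_blocks(block_counts, "tract")
print(by_block_group)
print(by_tract)
```

Recording the level you aggregated to, alongside the table and field names you pulled, gives your Portfolio a specification that someone else can reproduce.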