Keeping Federal Data Secure

Matthew Jensen

Spring 2024

In just the past few years, the U.S. government has suffered dozens of high-profile data breaches of its civilian agencies. Leaks, hacks, and simple mistakes have exposed Americans' data, in some cases widely to the public, in others to malicious actors and enemies abroad.

Since 2020, notable losses of information have flowed from the Internal Revenue Service (IRS) and other offices of the Treasury Department, the House Select Committee to Investigate the January 6th Attack, the Department of Energy, the U.S. Patent and Trademark Office, NASA, the Federal Aviation Administration, the National Institutes of Health, the Securities and Exchange Commission (SEC), at least 27 U.S. attorneys' offices, the Administrative Office of the U.S. Courts, the U.S. Supreme Court, and others. From the Office of Management and Budget's (OMB) reporting, we learn that the government has kept to itself the details of tens of thousands of other security incidents. Thousands more have likely gone unnoticed.

America's administrative data are not safe: Information insecurity prevails. And it presents a major challenge for government activity now and for the foreseeable future.

The Biden administration would like to use this unfortunate reality as a reason to claim expansive new regulatory powers that would control information technology across all sectors of the economy, while simultaneously expanding the administrative state to collect more information than ever. Given the inability of even the most heavily protected corners of the government to keep information secure today, this approach is sure to fail.

Instead, information insecurity should be a powerful impetus for decentralization. The most secure data, after all, are those that aren't centralized. Ultimately, this will require streamlining government programs to require less information — particularly less intimate and sensitive data whose unintended disclosure, modification, or deletion could do the most damage to Americans and American interests. Where programs can't be streamlined for compelling reasons, it might require giving Americans more choice about how their data are used under an informed-consent framework.

But how can we know which government programs to streamline and downsize, where to offer informed consent, and how to balance the need for information security against other priorities? Congressional and public oversight on this front has focused on the flow of dollars and cents. Information insecurity now demands as careful a measurement of another valuable resource that the government collects and uses: data.

America needs stronger information-system reporting requirements to measure the intentional flow of information into, within, and out of the U.S. government. It also needs better transparency surrounding federal information-security failures, which will improve our understanding of accidental flows. With these in place, Congress and the public will have the motivation and the tools they need to begin streamlining government programs to protect Americans and American institutions that depend on information security.

HOW THE FEDS USE DATA

The era of Big Data is here, and it has by no means left the federal government behind. To the contrary, many of its administrative agencies, including those outside the national- and homeland-security apparatus (which are our focus here), are now the stewards of massive datasets with sensitive details on millions of American citizens and organizations.

These datasets are gathered through day-to-day interactions between the government and the governed. As Americans file taxes, claim benefits, respond to censuses and mandatory surveys, fill out regulatory forms, engage with the criminal-justice system, and otherwise interact with their government, their information is collected into datasets, connected, stored, and shared.

That government administrators seek to maximize their collection and use of administrative data is unsurprising. New technological advances in storage and computation, paired with methodological leaps in statistics, econometrics, machine learning, and, most recently, artificial intelligence, make it easier than ever to generate insights from data and to act on those insights.

The data resources available to the government are especially rich thanks to their often law-enforced accuracy and comprehensiveness. Most, if not all, government agencies today use administrative data to advance their missions, encouraged to pursue "evidence-based policymaking" and scientific governance by numerous executive orders, procedures, guidelines, and laws.

Examples of massive datasets in government abound. At the IRS, the Statistics of Income Databank covers Americans across many years of their tax lives, as well as their relationships with businesses and people. The IRS and academic researchers use this information to study taxpayer behavior, report on inequality over time, and otherwise support research on taxes and society. Other datasets of taxpayer information include the Individual Master File, which includes all individual taxpayers' detailed records; the Business Master File, which includes business taxpayers' records; and the Compliance Data Warehouse (CDW). The CDW is used, among other applications, to target audits and otherwise enforce taxpayer compliance. It integrates numerous IRS datasets on taxpayers and sends data to other government actors and contractors, including the agency's Lead and Case Analytics system that is managed by Palantir Technologies and powered by its Gotham platform.

At the Office of Personnel Management (OPM), data pipelines gather detailed information on Americans who work for, or have applied to work for, the federal government, as well as those who have undergone background investigations for security clearances. OPM's datasets are drawn from information gathered through application forms and interviews, including interviews performed under the watchful eye of a lie-detector needle and intended to disclose specific information that could pose a risk of blackmail. These datasets are used in the day-to-day administration of the country's public-sector workforce and security-cleared contractors, and to assess new techniques for managing the workforce in a scientific manner.

Beyond the use of datasets by the agencies that gather them, external data sharing has also been widely encouraged by law, executive order, and the inherent civic mindedness of many public servants. Many of these agencies now share data widely to advance the missions of their peers in not only the federal government, but state and local governments, research institutions, private-sector companies, and, at times, foreign governments as well.

The Statistics of Income team at the IRS, for example, prepares "public-use" files that it sells to researchers, accounting firms, policy advocates, and others for roughly $10,000 per data year. The team harnesses various techniques to keep the data in this file anonymized. Yet the IRS also shares information from tax records without anonymization (or the degradation in data quality that accompanies it). According to the agency's disclosure reporting, taxpayer information was shared 22.3 billion times in 2022, down from 27.5 billion times in 2021 but up from 2.5 billion in 1995.

Historical trends are somewhat muddy because the IRS's disclosure accounting system counts as multiple disclosures multiple years of a taxpayer's data and information gathered from several types of tax returns, such as 1099s, W2s, and 1040s. Nonetheless, the scale of sharing is massive. Moreover, these estimates exclude the intended release of tabular and otherwise anonymized data, such as the public-use file, as well as unintended disclosures from hacks, leaks, and statistical attacks. The IRS may be the most extreme example of information collection and sharing, but it is not alone.

FEDERAL DATA INSECURITY

In their rush to harness Big Data, agencies and the policymakers that oversee them seem to have missed an important aspect of this era — one that is fundamentally at odds with efforts toward centralized decision-making and control at the federal level: the enormous risk and significant costs of data leaks.

Such risks are exacerbated each time a new data variable is collected, connected with another, duplicated, and shared. The risk profile changed fundamentally with digitization, when the cost of copying information fell dramatically and now flirts with zero.

The threats are not theoretical, as many Americans whose information has already been lost will tell you. In 2021, for example, a former IRS contractor, Charles Littlejohn, gave taxpayer information to ProPublica that the news outlet has described as "a vast trove of Internal Revenue Service data on the tax returns of thousands of the nation's wealthiest people, covering more than 15 years." ProPublica has since disclosed many details from the records, using them as the starting point for a series of intimate reports publicizing the tax practices and financial details of numerous wealthy Americans. Littlejohn, the leaker, was sentenced to five years in prison.

In 2014, OPM was breached on two separate occasions. In one case, personnel information on over 4.2 million current and former government employees — with information spanning their entire careers in government service — was lost. In the second case, information on over 21.5 million Americans — some of whom were not government employees — was lost.

In a hearing on the breach in front of the House Committee on Oversight and Government Reform, OPM's chief information officer confirmed that the stolen information included "highly sensitive information gathered in background investigations of current and former Federal employees." In the same hearing, the then-director of OPM was asked whether the information, which was characterized as constituting "crown jewels material in terms of potential blackmail," pertained to "people in the military and intelligence communities." The director suggested moving those discussions to a private off-the-record hearing, and the committee obliged. Unfortunately, the information we do have in the public record indicates that the answer is probably "yes."

Just last year, in February 2023, Russian businessman Vladislav Klyushin was convicted by a federal jury in Boston of hacking U.S. earnings reports on their way to the SEC. He and four Russian compatriots stole the reports from United States-based filing agents and used them to trade ahead of and beat more scrupulous investors in America's public markets. According to a press release from the U.S. attorney's office for the District of Massachusetts, the five co-conspirators netted around $90 million from their cheating. Klyushin did not directly hack U.S. government servers, but his case offers a potent example of how government policy that requires centralizing information and controlling the timing of its release makes that information ripe for the taking.

Malicious actors, including foreign adversaries, have many means to exploit data losses from federal agencies. They can use leaked financial data for industrial espionage, gaining a competitive edge in key industries or even specific business deals. They can front-run markets, as the Klyushin case illustrates, or manipulate them by releasing information selectively to a few private parties or the general public. They can blackmail individuals with threats to release sensitive and potentially embarrassing information on child-support payments, health status, tax evasion, divorce preparations, or charitable contributions to unpopular causes. They can release such information widely to sow societal discord. They can modify official records to sow further discord. And perhaps most commonplace of all data exploitations, they can commit identity theft and fraud, often affecting the most vulnerable Americans in devastating ways.

The same ease with which data pipelines can be built to collect, connect, store, and share information makes it easy for data to be lost when those pipelines are breached. Risk vectors and data adversaries abound. The next hacker might be a statistician, a con artist, a teenager, or a nation-state. The next internal leaker could be a security consultant, a disgruntled public employee, or even a congressional committee or the president himself. Data-sharing partners, such as peer governments (local and international), contractors, academics, and other trusted counterparties to data-sharing agreements, multiply the risk vectors. Too many leaks have resulted from simple mistakes.

And federal security incidents aren't rare. The OMB tracks cases where information slipups have occurred. Many have led directly to information breaches, while others only presented the opportunity. In fiscal years 2020 and 2021, OMB tabulated over 63,000 information-security incidents. Thousands more hacks and leaks may have evaded the government's notice or reporting. Of the 63,000 incidents, the OMB presented details on just 13 — these were the ones that reached the OMB-defined threshold for a "major incident."

For private enterprise, data security has proven a generational challenge, occupying the thoughts of business leaders, regulators, and the public alike, many of whom are keenly aware that for every massive leak — like that from Equifax in 2017 — there are dozens of smaller breaches that don't make the national news. Many companies now advertise their data-security practices as important differentiators from their competitors. And while some consumers continue to use intrusive private services without worry, others pick and choose among private competitors based on data-security concerns.

As citizens rather than consumers, however, Americans do not have the luxury of picking and choosing how data are collected and used.

A MISGUIDED APPROACH

The Biden administration's proposed solutions to these problems — increasing information centralization and attempting to stop its loss through extensive new regulation — are unlikely to work.

In his first executive order on his first day in office, President Biden took steps to ensure that Americans are identified by race across federal datasets despite minorities' being generally less trusting of data linkages. In response, the Treasury Department has already imputed racial classifiers onto tax data. The administration then attempted to increase bank reporting requirements for accounts containing just $600. Firm opposition from Republicans, including Idaho senator Mike Crapo, Texas congressman Kevin Brady, and Georgia congressman Drew Ferguson, stalled that effort. But after this setback, the Democrats hit a home run for data centralization with the Inflation Reduction Act, which gives the IRS $80 billion for system modernization and clear orders to ramp up data-driven tax enforcement.

From a technological perspective, the IRS expansion is also fraught because no funds are allocated for the maintenance or the improvement of the legacy systems that are currently in use and will be for some time. Rather, funds are only provided to modernize the system. If anything, the turmoil of the transition may increase the odds of leaks in the short term.

Meanwhile, the administration's 2023 budget proposal called for numerous new policies that would dramatically increase data requirements for even basic tax enforcement to succeed. One of these calls for a minimum tax on income for Americans with measured wealth of over $100 million, where "income" is redefined to include unrealized capital gains. Executing this proposal would require the IRS to take new measurements and evaluate liabilities and assets every year in order to avoid sticking taxpayers with massive default liabilities based on assumed rates of return. The budget would also widely enhance reporting requirements for digital assets and bank accounts, both foreign and domestic.

The other prong of the administration's approach has been to enhance cybersecurity and information regulation. Already in place are new private-sector reporting requirements on hacks and ransomware payments, as well as new security-architecture requirements for the public sector and the contractors that serve it. These moves represent only a portion of an expansive agenda.

In a February 2022 article in Foreign Affairs, Chris Inglis, then Biden's appointee as the nation's first national cyber director, and Harry Krejsa, then the acting assistant national cyber director for strategy and research, wrote that "the United States needs a new social contract for the digital age — one that meaningfully alters the relationship between public and private sectors and proposes a new set of obligations for each." Jen Easterly, a tremendously accomplished cyber warrior currently heading America's leading cyber-defense agency, the Cybersecurity and Infrastructure Security Agency, praised Inglis and Krejsa explicitly in a Foreign Affairs article of her own last year and then expanded:

What the United States faces is less a cyber problem than a broader technology and culture problem. The incentives for developing and selling technology have eclipsed customer safety in importance....This is not the first time that American industry has made safety a secondary concern. For the first half of the twentieth century, conventional wisdom held that automotive accidents were the fault of bad drivers....Any car manufactured today has an array of standard safety features — seatbelts, airbags, antilock brakes, and so on. No one would think of purchasing a car that did not have seatbelts or airbags, nor would anyone pay extra to have these basic security elements installed....Consumers and businesses alike expect that cars and other products they purchase from reputable providers will not carry risk of harm. The same should be true of technology products. This expectation requires a fundamental shift of responsibility.

This vision permeates the Biden administration's National Cybersecurity Strategy. Released last year, the policy document outlines, among other things, "how the Federal Government will use all tools available to reshape incentives and achieve unity of effort [between the private and public sectors] in a collaborative, equitable, and mutually beneficial manner."

The pair of articles by Biden's appointed cyber defenders are worth reading in full; they offer a shared vision in articulate prose. But they miss an essential point: Centralization raises the stakes of government information-security procedures. Under a centralized system, any one incident, whether it be a function of malicious actions or poor management, can lead to a great deal of information being released all at once.

Can information safety really be achieved everywhere and always in federal data systems? Can government regulate its way to information security while collecting more information than ever?

One natural place to look for answers is where government pays the most attention to information security and has the most control over it: the national-security agencies. So far, we've discussed the widespread failures of civilian administrative agencies to protect information. But what of the most secure corners of government?

Edward Snowden, Julian Assange, and Chelsea Manning became household names for the holes they poked in America's national-security information defenses. But let's set those aside as part of the distant past, before our leaders had the opportunity to fully recognize the scope of the information-security challenge.

What of more recent history — say, 2023? It doesn't look good. Hundreds of the Joint Chiefs of Staff's most sensitive briefs on the war in Ukraine — sourced from the National Security Agency, the State Department's Bureau of Intelligence and Research, and the CIA — and other matters of utmost national interest were lost to a leak that allegedly came from a Massachusetts Air National Guard member. Their disclosure to American and foreign eyes wasn't noticed for over a year. Before that, the U.S. Marshals Service lost a trove of sensitive law-enforcement information that included personally identifiable data on Marshals Service employees, subjects under investigation, and other unspecified third parties. The Defense Department exposed to the public more than three terabytes of emails through a server misconfiguration. The FBI lost control of its information systems for investigating images of child sexual exploitation. Those are just a few of the highlights from a single year.

Even in the most secure corners of government, the most sensitive information is far from secure. If these agencies can't keep their information secure, how are the less sophisticated and more lightly regulated agencies to keep theirs safe?

No new security architecture can turn disaster into success overnight — and that's assuming the agencies attempt to conform to security guidelines. The Treasury's inspector general for tax administration and the Government Accountability Office have written dozens of reports over the last 10 years about IRS failures to conform to security standards. Law-review articles by such experts as the University of Washington's Michael Hatfield lay out the problems to the public. Yet the IRS continues to ignore basic principles outlined by OMB directives and even its own internal-revenue manuals. Similarly, the massive OPM hacks have been traced back to that agency's failure to follow prevailing executive-branch security guidelines.

While better information security is worth pursuing, and the government must fight the cybersecurity war with continuous improvement to its own security, the Biden administration's insistence that government collect more information than ever while regulating the entire American technology industry to help keep it safe does not make sense. When so much can be lost in a single breach, information defenders will always remain on the brink of failure.

A BETTER DIRECTION

Ultimately, the best way to protect data from government breaches is to avoid collecting it to begin with. This will require streamlining and simplifying government programs. But how can we know which programs to pare back, and which new programs to avoid implementing? And how can policymakers motivate the public to shrink government with information security in mind if the vast majority of security incidents are kept secret?

A few concrete suggestions might get us started on that path. The first two outlined below focus on security-incident transparency, and are intended to offer the public and Congress a better understanding of the current state of information insecurity. The remaining three focus on enhancing the transparency of information collection and sharing, which will provide us with a better understanding of which programs require information to be collected and shared, thereby placing information at risk. All five proposals build on the organizational infrastructure already established by the Privacy Act of 1974, the Congressional Budget and Impoundment Control Act of 1974, the Tax Reform Act of 1976, the E-Government Act of 2002, and the Federal Information Security Modernization Act of 2014 (FISMA). To a large extent, these reforms could be adopted through either legislation or executive action.

First, and most simply, the OMB director should broaden the definition of "major incident." Recall that of the 63,000 security incidents identified by OMB in fiscal years 2020 and 2021, details were reported to Congress and the public on just the 13 major incidents.

FISMA requires agencies to disclose and provide details on major incidents to Congress that note, in the language of the statute:

the threats and threat actors, vulnerabilities, and impacts relating to the incident; the risk assessments...of the affected information systems [conducted] before the date on which the incident occurred; the status of compliance of the affected information systems with applicable security requirements at the time of the incident; and the detection, response, and remediation actions.

This is a strong and sensible framework; it would be much more useful if it were applied to more than 0.02% of known incidents. FISMA delegated responsibility for defining major incidents to the OMB director, and thus far, OMB directors have chosen to define such incidents narrowly. Most recently, a memo from OMB director Shalanda Young offering "Fiscal Year 2023 Guidance on Federal Information Security and Privacy Management Requirements" defines a major incident as any incident or a breach involving:

personally identifiable information (PII) that, if exfiltrated, modified, deleted, or otherwise compromised, is likely to result in demonstrable harm to the national security interests, foreign relations, or the economy of the United States, or to the public confidence, civil liberties, or public health and safety of the American people.

The memo refers to documents from the Cybersecurity and Infrastructure Security Agency and the National Institute of Standards and Technology that can assist officials in making these subjective decisions. It then establishes a more objective criterion: "This memorandum requires a determination of major incident for any unauthorized modification of, unauthorized deletion of, unauthorized exfiltration of, or unauthorized access to the PII of 100,000 or more people." Since this threshold applies system by system, a single attack that compromises, say, 10 systems across the government that each lose information on 75,000 individuals (for a total loss of information on 750,000 individuals) might not meet the threshold.
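The gap between a per-system threshold and a per-attack threshold can be made concrete with a short sketch. The function names and data layout below are hypothetical illustrations, not part of the OMB memo; only the 100,000-person threshold and the scenario of 10 systems losing 75,000 records each come from the text above.

```python
# Threshold from the OMB memo quoted above: PII of 100,000 or more people.
MAJOR_INCIDENT_THRESHOLD = 100_000


def is_major_per_system(affected_by_system: dict[str, int]) -> bool:
    """Apply the threshold to each system separately (the reading described above)."""
    return any(n >= MAJOR_INCIDENT_THRESHOLD for n in affected_by_system.values())


def is_major_aggregate(affected_by_system: dict[str, int]) -> bool:
    """Hypothetical alternative: apply the threshold to the attack's total.

    Summing assumes the affected populations do not overlap, matching the
    750,000 total used in the text.
    """
    return sum(affected_by_system.values()) >= MAJOR_INCIDENT_THRESHOLD


# The scenario from the text: one attack, 10 systems, 75,000 individuals each.
attack = {f"system_{i}": 75_000 for i in range(1, 11)}

print(sum(attack.values()))         # total individuals across all systems
print(is_major_per_system(attack))  # no single system reaches the threshold
print(is_major_aggregate(attack))   # the combined loss does
```

Under the per-system test the attack is never flagged, even though the combined loss is several times the threshold; under the aggregate test it is.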

Outside of federal agencies, Congress has taken a particular interest in the information security of another group of entities: the providers, clearinghouses, and health plans covered under the Health Insurance Portability and Accountability Act of 1996 (HIPAA). HIPAA requires these entities (many with far fewer resources than a federal agency) to provide public notice of much smaller breaches. If a HIPAA-covered entity suffers a breach affecting just 500 individuals, it must notify not just those affected, but relevant regulatory bodies and the public as well. The OMB standard for federal agencies like the IRS could be set at 500, or even lower; the current threshold of 100,000 is much too high.

Second, the executive branch should compile a consolidated record of the security failures associated with each federal information system. Collecting such a record could be mandated through an update to OMB guidance relating to the E-Government Act of 2002. That act established a requirement for federal agencies, under the guidance of the OMB, to produce and update privacy impact assessments (PIAs) for their information systems that address:

(I) what information is to be collected; (II) why the information is being collected; (III) the intended use of the agency of the information; (IV) with whom the information will be shared; (V) what notice or opportunities for consent would be provided to individuals regarding what information is collected and how that information is shared; (VI) how the information will be secured; and (VII) whether a system of records is being created under section 552a of title 5, United States Code, (commonly referred to as the "Privacy Act").

A 2003 memo by Joshua Bolten, then OMB director under President George W. Bush, established the OMB's guidance for these requirements. After updating the definition of "major incident" as per the recommendation above, the president could update the Bolten memo, or Congress could amend the E-Government Act, to require agencies to add the history of major incidents associated with each system to that system's PIA. Linking the history of information-security failures with each information system will offer policymakers and the public a clear view of where the most significant security challenges lie.

Third, the federal government should enforce and enhance reporting requirements on taxpayer-information sharing. The IRS, which collects more information than any other administrative agency, faces unique security risks. The Tax Reform Act of 1976 added to the Internal Revenue Code, via section 6103(p)(3)(C), a requirement that the agency compile and release an annual report containing information that would help the public understand intentional flows of taxpayer information. But the IRS has not complied with this requirement since at least 1996.

According to the act, the IRS is supposed to report counts of requests for taxpayer-information disclosures, instances of disclosures, and taxpayers whose information is disclosed. These counts must be disaggregated by federal agency, as well as by each agency, commission, or body that receives taxpayer information for state tax administration and several other specified purposes. The IRS, however, only reports counts of disclosure instances, does not disaggregate those counts as the law specifies, and does not report at all on counts of requests for disclosure or counts of taxpayers affected.

This reporting non-compliance is particularly worrisome in light of IRS-information recipients' poor safeguarding practices. Of the 50 safeguard reviews conducted by the IRS in 2020 (the most recent year for which I've been able to find the IRS's safeguarding reports), 48 (or 96%) showed deficiencies in computer security, 46 (92%) showed deficiencies in secure storage, and 44 (88%) showed deficiencies in restricting access to federal tax information.

Congress's Joint Committee on Taxation (JCT) has legal authority under section 6103(p)(3)(B) of the Tax Reform Act to correct the IRS's reporting non-compliance; it should do so if the IRS refuses to comply independently. Congress should also consider enhancing reporting requirements by mandating that the IRS disaggregate data for a broader set of purposes. One example could be sharing information with foreign governments, which has grown in importance since the passage of the Foreign Account Tax Compliance Act in 2010 and has led to perverse outcomes, such as information sharing with Russia until April 2022 — more than a month after it invaded Ukraine. Congress should also require disclosure reporting on information sharing with contractors, external researchers, and special employees under the Intergovernmental Personnel Act. These enhancements could be adopted through updates to the Internal Revenue Code, by executive order, or through the JCT's 6103(p)(3)(B) authority.

Fourth, policymakers should add information-volume estimates to the PIAs mentioned above. Either Congress or the president could enhance the PIA framework with inspiration from the 6103(p)(3) tax-information-sharing regime. Under current law and the Bolten memo, for each information system, a PIA must report "with whom the information will be shared." Agency PIAs tend to document systems with which they directly share and receive sensitive but unclassified information (SBU) and PII: The PIA for the IRS Individual Master File, for example, lists 20 IRS systems that receive SBU/PII and four non-IRS federal agencies that receive data directly from the system (some of which may themselves go on to disseminate the data to other systems). The PIA notes that the system does not (directly, at least) disseminate SBU/PII to state and local agencies, Treasury and IRS contractors, or other sources.

By updating the E-Government Act of 2002 or the guidance on the act offered in the Bolten memo, Congress or the president could require that each PIA include upper-bound estimates of the number of individuals and organizations whose information will be shared with each recipient system, including systems within the same agency; systems of other federal agencies; contractors' systems; systems of state, local, and international governments; and any others. Policymakers could go further by requiring agencies to include the specific data variables that they share. Given that all of this meta-information is readily available from the Security Assessment & Authorization systems as well as other sources, none of these steps should pose a significant administrative challenge.

Similarly, the Privacy Act of 1974 requires agencies to publish a system-of-records notice (SORN) in the Federal Register for each information system from which information will be retrieved by personal identifier. The SORN must contain an accurate accounting of records disclosures to other agencies. An executive order or amendment to the Privacy Act could require SORN updates to include historical figures of how many disclosures were made to recipients other than the individual whose information was disclosed, the categories of information that were disclosed, and how many individuals were affected.

These enhancements to the PIA and SORN frameworks would offer a high-level network view, across the entire federal government, of how information is shared within agencies, where sensitive information is sent, and the general scope of the information shared.
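To make the idea of a network view concrete, consider a minimal sketch in Python of how PIA-style sharing records could be assembled into a map of information flows. Everything here is an assumption for illustration: the record fields, the function names, and the system names and figures are hypothetical and are not drawn from any actual PIA or SORN.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SharingRecord:
    """One hypothetical PIA line item: a source system sharing data with a recipient."""
    source: str           # system that holds the data
    recipient: str        # system that receives it
    recipient_type: str   # e.g. "same agency", "federal agency", "contractor"
    max_individuals: int  # upper-bound estimate of people whose data are shared


def build_flow_graph(records):
    """Map each source system to its recipients and their upper-bound volumes."""
    graph = defaultdict(dict)
    for r in records:
        graph[r.source][r.recipient] = r.max_individuals
    return dict(graph)


def downstream_systems(graph, start):
    """Return every system reachable from `start`, directly or indirectly."""
    seen, stack = set(), [start]
    while stack:
        for nxt in graph.get(stack.pop(), {}):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                stack.append(nxt)
    return seen


# Illustrative, made-up records; not taken from any real disclosure.
records = [
    SharingRecord("System A", "System B", "same agency", 1_000_000),
    SharingRecord("System A", "Agency X", "federal agency", 250_000),
    SharingRecord("Agency X", "Contractor Y", "contractor", 40_000),
]

graph = build_flow_graph(records)
reachable = downstream_systems(graph, "System A")
```

In this toy example, following the records outward from "System A" surfaces the contractor that receives data only indirectly, which is the kind of onward dissemination that a single system's PIA does not reveal by itself.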

Fifth, the congressional scorekeepers, namely the JCT and the Congressional Budget Office (CBO), should be required to estimate information-collection requirements in new legislation. While the first four flow-transparency reforms above would offer Congress and the public a far better understanding of how the government collects and shares information under current law, Congress should also avoid enacting legislation that would require significant new information operations, or at least know how a new piece of legislation would affect the government's information-collection requirements.

To that end, policymakers can build on the framework of the Congressional Budget and Impoundment Control Act of 1974, which established Congress's JCT and the CBO as congressional scorekeepers. To perform their budget- and revenue-estimating duties, JCT and CBO staff must develop a precise and detailed understanding of proposed laws. It would be a significant but not overly burdensome new obligation to have them estimate additional information requirements contained in proposed laws.

Under such a requirement, JCT and CBO staff could predict which new data variables the government would need to collect or share in order to administer a proposed law, as well as which data variables it would no longer need to collect or maintain. They could likewise estimate how many individuals and organizations the government would need to begin collecting data from, and how many it could stop collecting data from. A new law increasing the standard deduction, for instance, would mean that the government would no longer need to collect variables related to the administration and enforcement of itemized deductions from some number of taxpayers. The JCT could list those variables and estimate the numbers of taxpayers affected.
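As a rough illustration of what such an information score might look like, the sketch below records, for each data variable, the estimated change in the number of people who must supply it. The structure, the names, and the figures are all hypothetical assumptions made for the example; they do not reflect any existing JCT or CBO product or any actual estimate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableEstimate:
    """Hypothetical score line: one data variable and the change in who must report it."""
    variable: str
    change_in_filers: int  # negative = fewer people must supply this variable


def summarize(estimates):
    """Split a score into variables collected from more people and from fewer."""
    expanded = [e.variable for e in estimates if e.change_in_filers > 0]
    reduced = [e.variable for e in estimates if e.change_in_filers < 0]
    return expanded, reduced


# Made-up figures for a hypothetical standard-deduction increase.
score = [
    VariableEstimate("mortgage interest paid", -5_000_000),
    VariableEstimate("charitable contributions", -5_000_000),
]
expanded, reduced = summarize(score)
```

Here the hypothetical bill expands no collections and reduces two, so a reader of the score could see at a glance that the proposal shrinks the government's information footprint.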

Additionally, prevailing scorekeeping conventions (notably the "one-sided bets" doctrine) task the CBO and the JCT with estimating the probability that information-stewardship failures affecting the federal budget will occur. By asking those two entities to estimate how proposed laws would affect the government's information-collection duties, the reforms described above would help them carry out this task.

JCT staff taking on these information responsibilities would require legislation or direction from the JCT's members; the CBO doing the same would require legislation or new conference rules. The executive branch, meanwhile, could require similar reforms from its analogous institutions: The Treasury's Office of Tax Analysis could estimate how tax proposals would affect the government's collection of information, the OMB's Resource Management Offices and Budget Review Division could do the same for spending proposals, and the OMB's Office of Information and Regulatory Affairs could do so for proposed regulation.

The broader set of recommendations in this section is not intended to be exhaustive, nor should it require establishing significant new institutions or even expanding bureaucratic resources. Rather, these recommendations are intended as simple first steps that could make a big difference for our ability to understand the administrative state through the lens of information flows, both intended and unintended.

SAFEGUARDING FEDERAL DATA

Data stored by the U.S. government cannot be assumed safe today. Any and all administrative information that has been collected might be lost at any moment, and whenever information is shared, the likelihood of such loss grows.

A Congress or White House that takes seriously this aspect of the modern data era could use transparency to motivate policymakers to act with information security in mind. Keeping federal data more secure would not require collecting and centralizing more of it, but gathering less and decentralizing what is gathered. This means policymakers must reconceive the federal government's entire approach to data security — and the sooner they do so, the better.

Matthew Jensen is the director of the Office for Fiscal and Regulatory Analysis and serves in the Center for American Prosperity at the America First Policy Institute.
