Economic data flows and collects from sources both varied and unique. But which sources are significant and why? And how complex does this world of big data get when it comes to trying to explore the economic landscape? Cue Ms. Frizzle’s “Seatbelts, everyone!” line and let’s take a tour of economic data sources on the Magic School Bus!
When it comes to sectoring schemes, it’s practically a requirement to start with the North American Industry Classification System (NAICS). For background, this sector scheme standardizes categories so that federal statistical agencies can classify business establishments for the purpose of collecting, analyzing, and publishing statistical data related to the U.S. business economy. The classification structure is a hierarchical code: it ranges from a two-digit code identifying one of 20 broad industry sectors all the way down to a six-digit code identifying one of roughly a thousand detailed national industries (the sixth digit is specific to the U.S., Canada, or Mexico).
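To make the nesting concrete, here is a minimal sketch of how a six-digit code carries all of its broader categories as prefixes. The level names follow the published NAICS structure, but the function `naics_hierarchy` is a hypothetical helper written for this article, not part of any official tool.

```python
# Digits-to-level mapping from the NAICS structure.
NAICS_LEVELS = {
    2: "sector",
    3: "subsector",
    4: "industry group",
    5: "NAICS industry",
    6: "national industry",
}


def naics_hierarchy(code: str) -> dict[str, str]:
    """Return every ancestor prefix of a NAICS code, keyed by level name.

    Hypothetical helper for illustration only.
    """
    if not code.isdigit() or not 2 <= len(code) <= 6:
        raise ValueError(f"expected a 2- to 6-digit NAICS code, got {code!r}")
    return {
        name: code[:digits]
        for digits, name in NAICS_LEVELS.items()
        if digits <= len(code)
    }


# 111110 is the six-digit code for Soybean Farming.
print(naics_hierarchy("111110"))
```

One caveat if you aggregate on the first two digits: a few sectors span more than one two-digit prefix (Manufacturing is 31-33, Retail Trade is 44-45, Transportation and Warehousing is 48-49), so a real roll-up needs a small lookup table on top of this.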
For historical context on the importance and purpose of NAICS, allow me to defer to the U.S. Census Bureau:
“NAICS was developed under the auspices of the Office of Management and Budget (OMB), and adopted in 1997 to replace the Standard Industrial Classification (SIC) system. It was developed jointly by the U.S. Economic Classification Policy Committee (ECPC), Statistics Canada, and Mexico's Instituto Nacional de Estadistica y Geografia, to allow for a high level of comparability in business statistics among the North American countries.”
This effort was a huge win for economists at the time that NAICS was introduced. The system only appreciates in value as the world economy and its industries become more diverse on their trend toward globalization—helping more detailed economic policy studies take wing.
Awesome, we can now accurately classify a business with NAICS as our encyclopedia of “What industry is this business in?” But if our next question is “What data exists about this industry and what are the behaviors of the industry?” then things get very complicated.
Let’s start our tour chronologically, taking a look at the oldest established sources. First up is the Bureau of Labor Statistics (BLS).
Congress established the BLS (originally the Bureau of Labor, housed in the Department of the Interior) with the Bureau of Labor Act (23 Stat. 60) on 27 June 1884 to collect information about employment and labor, and the BLS does an impressive job of it. There are a lot of key historical milestones in the 130+ years that the BLS has existed but, for our purposes, let’s highlight three events to showcase a few key datasets.
Employment, wage, and salary data are crucial to any economic analysis since they affect the whole economy whenever you want to model a potential change (or even understand the current breakdown of industries). Furthermore, BLS data are very detailed when it comes to the number of sectors they describe (including 6-digit NAICS categorization in the reports). That’s crucial since every industry is unique and the more granular the data, the more accurate the base of your analysis can be. These data are published quarterly as part of the aptly named Quarterly Census of Employment and Wages (QCEW). QCEW’s sister dataset, the Consumer Expenditure Survey (CE), comes out every year (although lagged by one year, so the data released this year describe last year’s economic activity). It breaks households into categories based on income level and lets us see the spending patterns of each category.
“Economic Census” is a pretty broad and, as a result, imprecise term for lumping together all of these various programs and surveys. However, all of these datasets are housed on census.gov, so for our purposes it’ll work.
For some context, per records, the census began with the Census of Manufactures in 1810 to take economic account of a handful of producing and service-based industries. And then industrialization changed everything.
In 1902, Congress authorized the establishment of a permanent Census Bureau and directed that a census of manufactures be taken every five years. The 1905 manufacturing census was a milestone, marking the first time a census of any kind was taken separately from the decennial population census. The rest, as they say, is history.
Today, the Economic Census provides a handful of types of data which add even more depth to our understanding of the economy:
The Bureau of Economic Analysis (BEA), this branch of the U.S. data tree, got its start shortly after the Great Depression in an effort to better understand how production links to local economies across the country and how much it matters to them. The Department of Commerce even tipped its hat to the BEA, describing its estimation process for GDP as “the greatest achievement of the 20th century.”
What the BEA provides is the annually released National Income and Product Accounts (NIPAs), a dataset which includes national totals for data points like U.S. employment, GDP, capital investment, and Personal Consumption Expenditures (PCE). You can think of this as an atlas for the economy: you get the big picture, but some of the finer details need filling in from other sources like the BLS and U.S. Census.
The BEA also releases benchmark input-output tables every 5 years. These are the proverbial “national checkbooks” which describe what any given industry pays every other industry for its inputs to production. The number of rows in this checkbook increases every time the BEA defines a new industry sector to describe the unique production functions which emerge as new business types enter the national economy.
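As a rough illustration of the “checkbook” idea, here is a toy three-industry table. Every number and industry name below is made up for the example and is far simpler than a real BEA benchmark table. The sketch shows how the payments one industry makes to the others can be turned into input shares per dollar of that industry’s output.

```python
# Toy transactions table (hypothetical numbers, in millions of dollars).
# purchases[seller][buyer] is what `buyer` pays `seller` for inputs.
industries = ["farming", "manufacturing", "trucking"]
purchases = {
    "farming":       {"farming": 10.0, "manufacturing": 40.0, "trucking": 0.0},
    "manufacturing": {"farming": 20.0, "manufacturing": 50.0, "trucking": 15.0},
    "trucking":      {"farming": 5.0,  "manufacturing": 30.0, "trucking": 5.0},
}
# Total output of each industry (also hypothetical).
total_output = {"farming": 100.0, "manufacturing": 400.0, "trucking": 80.0}


def input_shares(buyer: str) -> dict[str, float]:
    """Inputs bought from each seller per dollar of the buyer's output."""
    return {
        seller: purchases[seller][buyer] / total_output[buyer]
        for seller in industries
    }


for buyer in industries:
    print(buyer, input_shares(buyer))
```

Reading down one buyer’s shares gives a simple picture of that industry’s production function: how many cents of each input it takes to produce a dollar of output. Whatever is left over after intermediate inputs goes to things like wages, taxes, and profit.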
Regional Economic Accounts (REA) follow NIPAs and input-output tables but are lagged by 1 year. They contain information about employee compensation and proprietor employment and income at state- and county-level detail. Of the data sources we’re covering here, this is the only one that includes information about total employee compensation, meaning what employees get paid including benefits and payroll taxes.
And there’s more! Output for most service sectors, past-year deflators, state-level tax data, and net commuting rates also enrich the information available in that atlas of the U.S. economy. All you need now is to figure out what the names of all those side roads and alleys are and where they lead, which brings us to...
As its name suggests, the U.S. Department of Agriculture (USDA) is where all the agricultural economic activity lives. The USDA is especially valuable as a data source because many other sources introduced in this article treat agriculture as a single sector or industry whereas the USDA provides far richer detail.
The additional detail finds its way into three major USDA data sets:
As you can imagine, production and sales for farmers differ widely depending on the crops that they’re growing and where those crops are grown. Also, what a farmer produces in one year may not be sold until the following year (or later). We can use all three of these USDA data sets together to reconcile what’s materially contributed to the economy in our “checkbook” snapshot of the economy for an isolated year. The USDA data helps fill in some of those gaps in the map which the other data sources don’t provide.
This one isn’t strictly an economic data provider, but their data does inform the way that you might accurately describe the trade that counties in the United States share with each other.
Specifically, these good folks provide data which can be used to extrapolate a travel-cost index which details how much it costs in terms of time and money to move goods from one county to another relative to all other counties. This is especially useful for building a gravity model of trade—but that’s a topic for a whole other article. If, for example, two counties share a border then it’s reasonable to assume that trade might flow freely between them. But if that border is scribed by an impassable mountain range, then more adjoining counties may be involved in the trade process between our first two mountainous counties. These data are also broken out by mode of transportation so if there ain’t no trains in a county, then you’re going to have to hire a company in the trucking sector to deliver.
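Without getting into that other article, the basic intuition can be sketched with the common textbook form of a gravity model: trade between two places rises with their economic size and falls with the cost of moving goods between them. This is only an illustration under assumed values. The function name, the parameters `k` and `beta`, and all the numbers are hypothetical, and this is not necessarily the specification you would fit to the actual travel-cost data.

```python
def gravity_flow(
    origin_size: float,
    dest_size: float,
    travel_cost: float,
    k: float = 1.0,
    beta: float = 2.0,
) -> float:
    """Textbook gravity form: k * size_i * size_j / cost_ij ** beta.

    `travel_cost` stands in for a travel-cost index between two counties.
    `k` and `beta` are assumed values here; in practice they are estimated.
    """
    if travel_cost <= 0:
        raise ValueError("travel_cost must be positive")
    return k * origin_size * dest_size / travel_cost ** beta


# Two county pairs with identical economic sizes (hypothetical units).
# The second pair is separated by a mountain range, so its cost index is higher.
open_border = gravity_flow(100.0, 50.0, travel_cost=1.0)
mountain_border = gravity_flow(100.0, 50.0, travel_cost=3.0)
print(open_border, mountain_border)
```

With these assumed values, tripling the travel cost cuts the predicted flow to one ninth, which is the mountain-range effect described above expressed as a number.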
This one sounds oddly specific, I know, but if you want to get a complete picture of the U.S. employment landscape, then you’ll have to make a stop here. BLS QCEW data only covers employees who fall under state and federal unemployment insurance programs. Railroad employees don’t fall under that umbrella; they’re covered by their own program.
This details the employment data for colleges and universities all around the world. As you can imagine, layering this dataset into those we’ve already talked about gives you even more granularity into how higher education sectors pay and structure their workforce.
And, finally, what atlas of the U.S. economy could be complete without output for fishing sectors? NOAA’s got you covered.
Whew! That’s quite a haul of information to help meet our initial goals of identifying the sector a project of interest falls into, the relationships and characteristics the sector exhibits, and where to find standardized data pertaining to it. Bear in mind too, as we step off of our Magic School Bus tour of U.S. economic data sources, that the big takeaway is this: no two data sets are created equal or are equally thorough. And this is just the 30,000-foot view; there are even more data sources to explore. In many cases, the economic questions you’re trying to answer require more than one source. Getting a complete, holistic snapshot of the economy takes consulting multiple data sets, knowing what’s not represented, and filling in the missing pieces wherever possible.