Leveraging Federal Data Collections for Analysis of the Causes and Consequences of Place-Based Innovation with Small Area Innovation Rate Estimation

Place-based innovation, the policy interest in developing the local endowments, institutions, and interactions that dynamic innovation ecosystems require, places new demands on data that federal collections were never designed to satisfy. To date, local measures of innovation incidence have relied on patent data, which are available at the county level. However, patents are a weak innovation indicator: not all innovations are patentable; firms may prefer other means of intellectual property protection even for patentable inventions; and distinctions between product, process, and business practice innovation are usually unavailable. Innovation data collected in the Annual Business Survey (ABS) address all these weaknesses but are too sparse to provide accurate estimates of innovation incidence for all but the largest metropolitan areas. This project will investigate the feasibility of using the much larger Economic Census (EC), which contains no innovation data, to substantially increase the number of firms observed in a small area and thereby produce more accurate innovation rate estimates. It does so by predicting the innovation behavior of firms in the EC from variables that appear in both the EC and the ABS, using a technique called small area estimation. This method “borrows strength” from a much larger general dataset (EC) to enhance the predictive power of a smaller, more detailed dataset (ABS). It is regularly used to produce local estimates of phenomena of policy interest that would be prohibitively expensive to measure directly, such as disease incidence or childhood poverty rates. This project is the first to apply these techniques to innovation data.
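The prediction step described above can be sketched on synthetic data. The following is a minimal illustration, not the project's method: it fits a plain logistic regression on a small "survey" sample and averages the predicted probabilities over every firm in a larger "census" file, which is closer to what the small area estimation literature calls a synthetic estimator. All names (`fit_logistic`, `design_matrix`, `small_area_rates`), the covariates, the coefficients, and the sample sizes are hypothetical, and survey weights are ignored for simplicity.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, ridge=1e-6):
    """Fit logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        # A tiny ridge term keeps the Hessian invertible.
        hessian = X.T @ (X * w[:, None]) + ridge * np.eye(X.shape[1])
        gradient = X.T @ (y - p) - ridge * beta
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

def design_matrix(size_class, sector, n_size, n_sector):
    """Intercept plus dummy columns for size class and sector (first level dropped)."""
    cols = [np.ones(len(size_class))]
    cols += [(size_class == k).astype(float) for k in range(1, n_size)]
    cols += [(sector == k).astype(float) for k in range(1, n_sector)]
    return np.column_stack(cols)

def small_area_rates(beta, X_census, area, n_areas):
    """Average predicted innovation probabilities over the census firms in each area."""
    p = 1.0 / (1.0 + np.exp(-X_census @ beta))
    totals = np.bincount(area, weights=p, minlength=n_areas)
    counts = np.bincount(area, minlength=n_areas)
    return totals / np.maximum(counts, 1)

# Synthetic "census": every firm has covariates and an area code.
rng = np.random.default_rng(0)
n_size, n_sector, n_areas = 3, 4, 20
n_census, n_survey = 20000, 1500
area_c = rng.integers(0, n_areas, n_census)
# Areas differ in firm-size mix, so their innovation rates differ by composition.
p_large = rng.uniform(0.2, 0.8, n_areas)
size_c = rng.binomial(n_size - 1, p_large[area_c])
sector_c = rng.integers(0, n_sector, n_census)
X_c = design_matrix(size_c, sector_c, n_size, n_sector)

# Hypothetical true coefficients; the outcome would be unobserved in a real census.
true_beta = np.array([-1.5, 0.6, 1.2, 0.3, -0.4, 0.8])
innov_c = rng.random(n_census) < 1.0 / (1.0 + np.exp(-X_c @ true_beta))

# Synthetic "survey": a simple random subsample where innovation is observed.
survey_idx = rng.choice(n_census, n_survey, replace=False)
beta_hat = fit_logistic(X_c[survey_idx], innov_c[survey_idx].astype(float))

# Borrow strength: predict for all census firms, then aggregate by area.
rates = small_area_rates(beta_hat, X_c, area_c, n_areas)
```

Here each area's estimate draws on roughly a thousand census firms instead of the few dozen surveyed firms that fall in it. A real application would also need to account for the survey's sample design.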

The goal of this project is to produce the Small Area Innovation Rate Estimation (SAIRE). Preliminary analysis using the ABS has found that commonly used control variables such as industry sector, firm size category, and the state where the firm is located are predictive of innovation behavior, so model-based estimates would improve on naïve local area estimates. The project will investigate possible gains in efficiency from replacing the fixed effects used in the preliminary analysis with random effects in a generative Bayesian multilevel model. In addition to the expected gains in efficiency from the aspatial pooling that a random effects specification provides, estimation of innovation phenomena may be improved by modeling spatial dependence across proximate small areas. More precise innovation rate estimates may also be possible by adding other firm or local characteristics to the predictive model, such as cloud computing use or local human capital endowments. The research presents two major methodological challenges: 1) incorporating the complex sample design into the small area estimation, because the probability of selection and the probability of innovation may depend on the same variables, such as firm size; and 2) assessing the extent to which firm-level variables in the ABS are predictive of establishment-level innovation in the EC for multi-unit firms. Accurate meso-level SAIRE measures would inform the targeting and evaluation of place-based innovation initiatives such as the Regional Innovation Engines program and would help address questions, such as the role of innovation in reallocation growth, that cannot be analyzed using current microdata.
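One common way to write down a multilevel model of the kind described above is sketched here. It is an illustrative specification under stated assumptions, not the project's actual model: the notation ($y_i$, $x_i$, $u_a$, $\phi_a$, and so on) is introduced only for this sketch, and the intrinsic conditional autoregressive prior is one of several possible choices for spatial dependence.

```latex
% Firm i located in small area a[i]; y_i = 1 if the firm reports an innovation.
\begin{align}
  y_i \mid p_i &\sim \mathrm{Bernoulli}(p_i), \\
  \operatorname{logit}(p_i) &= x_i^{\top}\beta + u_{a[i]}, \\
  u_a &\sim \mathcal{N}(0, \sigma_u^2)
      && \text{(aspatial random effect: partial pooling across areas)}.
\end{align}
% Optional spatial extension: add a structured effect \phi_a whose conditional
% mean is the average over the set \partial a of n_a neighboring areas.
\begin{align}
  \operatorname{logit}(p_i) &= x_i^{\top}\beta + u_{a[i]} + \phi_{a[i]}, \\
  \phi_a \mid \phi_{-a} &\sim
    \mathcal{N}\!\left(\frac{1}{n_a}\sum_{b \in \partial a}\phi_b,\;
                       \frac{\sigma_\phi^2}{n_a}\right).
\end{align}
% Area-level innovation rate: average the fitted probabilities over the
% N_a Economic Census units in area a (drop \phi_a in the aspatial model).
\begin{equation}
  \theta_a = \frac{1}{N_a}\sum_{i \in \mathrm{EC}_a}
             \operatorname{logit}^{-1}\!\left(x_i^{\top}\beta + u_a + \phi_a\right).
\end{equation}
```

Compared with fixed effects, the random effect $u_a$ shrinks estimates for sparsely sampled areas toward the overall mean, which is the source of the efficiency gain from pooling. The spatial term shrinks them toward neighboring areas instead.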

The lead investigator for this project is Zheng Tian, assistant research professor at Penn State and NERCRD. Timothy Wojan, an ORISE Established Scientist Fellow at the NSF’s National Center for Science and Engineering Statistics, and NERCRD Director Stephan Goetz are co-investigators.

Funding Agency: U.S. National Science Foundation

Principal Investigator: Zheng Tian, NERCRD

Lead Institution: NERCRD

Accompanying Institution(s): National Center for Science and Engineering Statistics

Start Date: July 2024

End Date: June 2026
