Thursday, March 11, 2010

American Community Survey: Using Multi-Year Estimates

Multi-year estimates for three-year and five-year periods are a product of the American Community Survey (ACS). These estimates, which provide continually updated information, must be used with care to understand what they do and do not represent.

The series of monthly samples that constitute the ACS can be combined in a variety of ways to produce survey-based estimates for geographic areas and population groups. Every year the Census Bureau produces three sets of estimates, for one-year, three-year, and five-year periods; the three sets of estimates are cumulated over 12, 36, and 60 months, respectively. The first time all three sets of period estimates were released was late 2010; the one-year estimates covered 2009, the three-year estimates 2007–2009, and the five-year estimates 2005–2009.

The reason to produce multi-year period estimates from the ACS is to cumulate the ACS sample over time in order to be able to provide estimates for small geographic areas. Only geographic areas with 65,000 or more people receive one-year (in addition to three-year and five-year) period estimates—one-year estimates for smaller areas would be much too imprecise to publish. Geographic areas with 20,000 or more people receive three-year (and five-year) period estimates, while the smallest geographic areas—small governmental units, census tracts, and block groups—receive only five-year period estimates.
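The population thresholds described above can be written as a simple lookup. The following Python sketch is illustrative only: the function name `acs_period_estimates` is hypothetical, and it considers population size alone, not the type of geographic area.

```python
def acs_period_estimates(population: int) -> list[str]:
    """Return the ACS period estimates an area of this size receives.

    Thresholds are the ones stated in the text: 65,000 or more people for
    one-year estimates, 20,000 or more for three-year estimates, and
    five-year estimates for every area.
    """
    periods = ["five-year"]  # every area receives five-year estimates
    if population >= 20_000:
        periods.insert(0, "three-year")
    if population >= 65_000:
        periods.insert(0, "one-year")
    return periods


# Illustrative population sizes (not real places).
for pop in (4_000, 20_000, 64_999, 65_000):
    print(f"{pop:>7,}: {', '.join(acs_period_estimates(pop))}")
```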

Table 1: Major Geographic Areas and Type of American Community Survey Estimates Received

Construction of Multi-year Estimates

The starting point for producing multi-year period estimates is to pool the survey data over the three or five years involved into a single file, in the same way that one-year data are pooled for the 12 months of data collected within a calendar year. Then, new weights are calculated for the combined sample to make the sample cases reflect the average total population for the relevant three- or five-year period. This procedure makes it possible to use a larger sample, not only to adjust for non-response to the survey, but also to control the estimates to independently derived population and housing unit totals. A larger sample is advantageous because the adjustments can be made for finer categories, such as age, sex, race, and ethnicity categories for the population controls. However, users must keep in mind that the estimates represent an average for the entire period and not for any particular month or year.

Other adjustments made to the three-year and five-year period data files are as follows.

The latest available geographic boundaries are used for all of the months in the combined sample. For example, if a city annexed a town in 2008, then its three-year period estimates for 2005–2007 will include all of the data collected within the original city boundaries for those three years, but its three-year period estimates for 2006–2008, 2007–2009, and so on will include all of the data collected within the enlarged city boundaries for all three of the relevant years.

The latest vintage of population and housing unit estimates is used as controls, including any corrections to prior year estimates. These controls are the averages of the population and housing unit estimates for the three- or five-year period.
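As a rough illustration of what it means for a control to be a period average, the Python sketch below averages annual population estimates over a three-year and a five-year period. The population figures and the function name `period_control` are invented for the example; the actual controls are built for detailed demographic categories, which this sketch does not attempt to reproduce.

```python
from statistics import mean


def period_control(annual_estimates: dict[int, float],
                   first_year: int, last_year: int) -> float:
    """Average the annual estimates for the years first_year..last_year."""
    years = range(first_year, last_year + 1)
    return mean(annual_estimates[year] for year in years)


# Made-up annual population estimates for a growing area.
population_by_year = {
    2005: 48_000, 2006: 49_000, 2007: 50_000, 2008: 51_000, 2009: 52_000,
}

three_year = period_control(population_by_year, 2007, 2009)
five_year = period_control(population_by_year, 2005, 2009)
print(f"2007-2009 control: {three_year:,.0f}")
print(f"2005-2009 control: {five_year:,.0f}")
```

In this made-up case neither control equals the 2009 population of 52,000, which is why a period estimate should not be read as an estimate for the latest year.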

A model-assisted weighting step is added to the weighting process to improve the precision of subcounty estimates. This step uses a generalized regression (GREG) approach, based on an administrative file of person characteristics by age, sex, race, and Hispanic origin maintained by the Census Bureau that is matched to the Master Address File (MAF). Note that the administrative data are not used directly in ACS estimates; they are used only in the weighting process.

Income amounts are adjusted for inflation. This adjustment is made because respondents to the ACS are asked to report their income for the prior 12 months. First, to put income amounts that are reported for differing 12-month reference periods on a comparable calendar-year basis, the Census Bureau expresses them in constant dollar terms by using the national consumer price index for urban consumers-research series (CPI-U-RS) for the latest calendar year covered by an estimate. For three-year period estimates for, say, 2007–2009, the incomes for people sampled in 2007 and 2008 (which were already adjusted to calendar 2007 or 2008 on a one-year basis) were adjusted to calendar year 2009 by the ratio of the annual average CPI for 2009 divided by the annual average CPI for 2007 or 2008, as the case may be. This adjustment expresses all of the reported income amounts for a given period (one year, three years, or five years) in a comparable manner with regard to purchasing power as of the most recent calendar year in the period. Such an adjustment should not be confused with a current-year estimate, given that incomes may grow faster (or slower) than prices.
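The ratio adjustment described above can be sketched in a few lines of Python. The index values below are placeholders chosen for easy arithmetic, not actual CPI-U-RS figures, and the function name `adjust_to_latest_year` is hypothetical.

```python
def adjust_to_latest_year(amount: float, reported_year: int,
                          latest_year: int, cpi: dict[int, float]) -> float:
    """Express an amount reported for one year in latest-year dollars.

    Multiplies by the ratio of the annual average CPI for the latest year
    to the annual average CPI for the year the amount refers to.
    """
    return amount * cpi[latest_year] / cpi[reported_year]


# Placeholder annual average index values; NOT actual CPI-U-RS figures.
cpi = {2007: 200.0, 2008: 208.0, 2009: 210.0}

# The same nominal income reported in each year of a 2007-2009 period.
for year in (2007, 2008, 2009):
    adjusted = adjust_to_latest_year(40_000.0, year, 2009, cpi)
    print(f"{year}: 40,000 reported -> {adjusted:,.2f} in 2009 dollars")
```

Amounts reported for the latest year are left unchanged, because the ratio is then 1.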

For poverty estimates, the Census Bureau’s method for determining poverty status for families and their members does not require adjusting income amounts for inflation. The Census Bureau compares the income of a family (or unrelated individual) for a 12-month reporting period, not adjusted for inflation, to an average of 12-month nominal dollar poverty thresholds by family size and type for that same period. (The official poverty measure is constructed in a similar manner except that the income and thresholds refer to a calendar year.) For a five-year period estimate, then, the poverty rate is the average rate of everyone in the sample over the five years.

Some housing costs are adjusted for inflation. While the Census Bureau makes no inflation adjustments for the one-year period estimates of housing value, rent, utilities, property taxes, and other housing costs, it does make such adjustments for the three-year and five-year period estimates. It adjusts housing costs for inflation by using the ratio of the annual average CPI value for the latest year of the three-year or five-year period to the annual average CPI value for the year for which the amounts were reported. These three-year and five-year period estimates for rent, housing value, utilities, and other housing amounts expressed in dollars for the latest year of the period are not the same as estimates for the latest year.

Some estimates are suppressed (that is, not published). The Census Bureau deletes entire tables or collapses the cells in tables containing one-year and three-year period estimates that are highly imprecise. Such suppression is not applied to five-year period estimates so that information for small areas is available for aggregation into larger areas. However, some small-area estimates are suppressed when the Census Bureau’s Disclosure Review Board (DRB) determines that their release could lead to disclosure of data for an individual. Also, fewer tables of five-year period estimates are made available for block groups than for larger areas because of very small sample sizes.

Working with Multi-year Estimates

For geographic areas where one-year and multi-year period estimates are available from the ACS, users need to determine which set to use, considering the trade-off between having the most recent data possible (which favors one-year period estimates) and having the most precise data possible (which favors multi-year period estimates). When working with the multi-year period estimates, users need to take care to understand what they do and do not represent. For example, a five-year period estimate that 10 percent of people in a county or city live in poor families could reflect any of the following: a constant 10 percent across the five years; a steady increase from, say, 7 percent to 13 percent; a corresponding steady decrease; a rise and decline in the percentage across the years; and so on. To obtain an indication of the likely pattern that underlies a five-year (or three-year) estimate, users need to apply local knowledge of the conditions in the area over the period. They can also examine the published one-year estimates for a larger area that contains the area of interest.
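To make the point concrete, the Python sketch below builds several invented yearly poverty-rate series that all produce the same 10 percent five-year figure. It assumes, for simplicity, that the sample and the population are the same size every year, so that the pooled period estimate equals the simple mean of the yearly rates; real period estimates are computed from the pooled, reweighted sample.

```python
def period_average(yearly_rates: list[float]) -> float:
    """Simple mean of yearly rates (assumes equal yearly sample sizes)."""
    return sum(yearly_rates) / len(yearly_rates)


# Invented yearly poverty rates (percent) for one area over five years.
patterns = {
    "constant":         [10.0, 10.0, 10.0, 10.0, 10.0],
    "steady increase":  [7.0, 8.5, 10.0, 11.5, 13.0],
    "steady decrease":  [13.0, 11.5, 10.0, 8.5, 7.0],
    "rise and decline": [8.0, 10.0, 14.0, 10.0, 8.0],
}

for name, rates in patterns.items():
    print(f"{name:<17} {rates} -> {period_average(rates):.1f} percent")
```

Every series averages to the same value, so the five-year figure alone cannot distinguish among them.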

Users of multi-year population counts also need to recognize that the counts (for example, the number of persons in poverty) are average counts across the period. Some small areas may appreciably increase or decrease in population size over a three- or five-year period, making it difficult to interpret an average count for the period.

Some considerations for users of multi-year ACS estimates for common applications are outlined below.

Precision. Even when ACS data are combined over five years, the resulting estimates are less precise than the comparable Census 2000 long-form-sample estimates, and are often highly imprecise. In particular, ACS five-year estimates for areas with small populations, such as census tracts and block groups within cities and other larger areas and small governmental jurisdictions (such as cities, towns, school districts, and some rural counties), can be subject to very high levels of sampling error. Equally, estimates of the characteristics for small population groups in large areas, such as schoolchildren or disabled persons, can be very imprecise. In such cases, where possible, users are advised to aggregate estimates for smaller areas or population groups to form estimates for larger areas or groups.

Interarea comparisons. It is important not to mix and match ACS period estimates in making comparisons across geographic areas. For example, in comparing the percentage of poor people or employed people in each county of a state, it would be inappropriate to use one-year period estimates for large counties, three-year period estimates for medium-sized counties, and five-year period estimates for small counties, because the estimates would cover different time periods and economic conditions may have changed between them. An appropriate alternative would be to use five-year period estimates for all counties. If more recent estimates are desired, then one-year or three-year period estimates could be compared for large counties and for Public Use Microdata Areas (PUMAs, which are usually combinations of counties) for the remainder of each state.
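One way to apply this rule is to choose the shortest period that every area in the comparison receives. The Python sketch below does so from population size alone, using the 65,000 and 20,000 thresholds. The county names and populations are invented, and `shortest_common_period` is a hypothetical helper.

```python
def shortest_common_period(populations: dict[str, int]) -> str:
    """Return the shortest ACS period available for every area listed.

    The smallest area determines the answer, because it receives the
    fewest sets of estimates.
    """
    smallest = min(populations.values())
    if smallest >= 65_000:
        return "one-year"
    if smallest >= 20_000:
        return "three-year"
    return "five-year"


# Invented counties and populations.
counties = {"County A": 310_000, "County B": 72_000, "County C": 8_500}
print(shortest_common_period(counties))  # limited by County C

large_only = {name: pop for name, pop in counties.items() if pop >= 65_000}
print(shortest_common_period(large_only))
```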

Comparisons across time. For small geographic areas for which only multi-year period estimates are produced, the study of change over time is complicated. Users will be tempted to compare the change from one year to the next for small geographic areas by comparing overlapping three-year or five-year period estimates: for example, by comparing the percent of poor people for a small city in 2005–2007 with the percent poor in 2006–2008, or by comparing a census tract in 2005–2009 with 2006–2010. However, estimates of change based on differences between overlapping three-year or five-year period estimates are generally not useful, because the overlapping pairs of estimates contain much data in common (see Table 2). For example, estimates for 2005–2009 and 2006–2010 contain the same data for 2006, 2007, 2008, and 2009, so the only data that could generate a change are for 2005 and 2010. Yet for a small area that only has five-year period estimates available, the sampling variability of the difference between the two single years (2005 and 2010) is very large, and almost inevitably much larger than the difference itself, so that no conclusions can be drawn about the statistical significance of any difference. Furthermore, even with nonoverlapping estimates, the estimates of differences will generally be fairly imprecise. Analyses of change will be most productive in two situations: for small geographic areas, where multi-year estimates must be compared, only when major changes have occurred; and for large geographic areas for which one-year estimates are precise enough to be published.

Table 2: Overlapping Years in Successive ACS Five-Year Estimates, 2005–2011
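The overlap between successive period estimates can be enumerated directly. The Python sketch below lists, for each pair of consecutive five-year periods from 2005 through 2011, the years the two periods share and the years that appear in only one of them. The function names are hypothetical.

```python
def years_in(period: tuple[int, int]) -> set[int]:
    """Set of calendar years covered by a (first_year, last_year) period."""
    first, last = period
    return set(range(first, last + 1))


def compare_periods(earlier: tuple[int, int],
                    later: tuple[int, int]) -> tuple[list[int], list[int]]:
    """Return (shared years, years found in only one of the two periods)."""
    a, b = years_in(earlier), years_in(later)
    return sorted(a & b), sorted(a ^ b)


five_year_periods = [(2005, 2009), (2006, 2010), (2007, 2011)]

for earlier, later in zip(five_year_periods, five_year_periods[1:]):
    shared, unshared = compare_periods(earlier, later)
    print(f"{earlier[0]}-{earlier[1]} vs {later[0]}-{later[1]}: "
          f"shared {shared}, not shared {unshared}")
```

Only the years that are not shared can contribute to a difference between the two period estimates.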

A final point regarding differences between multi-year period estimates is that, just like the estimates themselves, they can reflect a variety of patterns in the underlying one-year estimates. For example, a change of two percentage points between two nonoverlapping three-year period estimates of the percent of poor people would occur if the estimate for each of the first three years was, say, 10 percent and that for each of the second three years was 12 percent. Alternatively, the same two-percentage-point change could reflect an estimate of 10 percent for each of the first five years and an estimate of 16 percent for the sixth year. Users need to be aware of the possible underlying patterns and find ways to distinguish between them based on other sources or on ACS data at other levels of aggregation.
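The arithmetic behind this example can be checked with a short Python sketch. It treats each three-year estimate as the simple mean of three yearly rates, which assumes equal sample sizes and populations in every year; the function name `three_year_change` is hypothetical.

```python
def three_year_change(yearly_rates: list[float]) -> float:
    """Difference between the means of years 4-6 and years 1-3."""
    if len(yearly_rates) != 6:
        raise ValueError("expected six yearly rates")
    first_period = sum(yearly_rates[:3]) / 3
    second_period = sum(yearly_rates[3:]) / 3
    return second_period - first_period


# Yearly percent poor under the two patterns described in the text.
step_change = [10.0, 10.0, 10.0, 12.0, 12.0, 12.0]
late_spike = [10.0, 10.0, 10.0, 10.0, 10.0, 16.0]

for name, rates in (("step change", step_change), ("late spike", late_spike)):
    print(f"{name}: {three_year_change(rates):.1f} percentage points")
```

Both series yield the same difference between the two three-year averages, although the year-by-year histories are quite different.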

Despite their limitations, the multi-year period estimates are generally an improvement over the once-a-decade census long-form-sample estimates. Since they are updated every year, they provide a more recent picture of the characteristics of an area than is possible from the long-form sample.

Constance F. Citro

Graham Kalton

See also Long form; Sampling for Content
