COMPARABILITY OF DATA

The BRFSS is a cross-sectional surveillance survey currently involving 52 reporting areas. It is important to note that any survey will have natural variation over sample sites; therefore, some variation between states is to be expected. The complex sample design and the multiple reporting areas complicate the analysis of the BRFSS. Although CDC works with the states to minimize deviations, in 1996 there were some deviations in sampling and weighting protocols, as well as slight differences in question wording, populations covered on some sections, sample size, response rates, and collection or processing procedures. This section identifies these issues for 1996.

A. 1996 Data Anomalies and Deviations from Sampling Frame and Weighting Protocols

Alaska

Alaska's sample design is made up of three random digit-dialed (RDD) strata and four listed strata, where phone numbers are selected from a list of putative household numbers. The sampling frame contains all of the household numbers in the RDD strata and an estimated 45% of all household numbers in the listed strata. This represents an exclusion of approximately 9% of all residential numbers in Alaska.

In 3 out of 4 geographic strata, prefixes are assigned to a high (RDD) or low (listed) density stratum depending largely on the number of active household numbers in the exchange of 10,000 numbers, with about 2,000 being the threshold. All numbers in the fourth geographic stratum are assigned to the listed density stratum.

In the RDD strata, the probability that a number is selected depends on the number of active household numbers in its exchange. A number of prefixes equal to the target sample size are selected at random in proportion to their number of active household numbers. (A prefix can be selected more than once.) Once a prefix is selected into the RDD sample, 48 suffixes are randomly generated.
Once the entire sample for the stratum is generated, duplicate numbers are deleted without replacement. The final result is h lists of phone numbers per stratum, each containing 48 or fewer unique numbers, where h is the target number of completes per stratum. Phone numbers are called sequentially from each list until one complete per list is obtained.

California

California's 1996 sample design utilizes a sampling frame consisting of hundred blocks that contain three or more listed household telephone numbers. Telephone numbers are sampled in direct proportion to the number of listed household numbers in the hundred block from which each is selected. The design also varies probabilities of selection by the county to which a prefix is assigned according to one of various criteria such as total households or population. The criteria may have varied at the different times sample records were selected.

During sample selection, if a known business number is initially selected, it is replaced by the next eligible number on the sampling frame, which increases the probability of selection of numbers preceded by known business numbers. Any number selected within the previous five to six months by any of the clients used by California's vendor is also replaced by the next available number on the sampling frame unless this would result in an insufficient number of sampling units being provided. No adjustment has been made for the differential probabilities associated with selection being made in proportion to hundred block listings, county, or replacement of business or recently selected numbers.

Hawaii

The Hawaii Department of Health generates a monthly stratified RDD sample from a list of working prefixes supplied by GTE Hawaiian Telephone Company. These prefixes are divided into six strata. In Hawaii's data, strata five and six (Molokai and Lanai Islands) are coded as a single stratum (stratum five).
There is no adjustment for relative probabilities of selection between the two strata. The stratum-specific probabilities of selection were determined by dividing the number of household telephone in each recorded stratum into estimates of the telephone households in each stratum. A more appropriate adjustment would have been to divide the number of telephone numbers generated in each stratum by the size of the sampling frame. There may also be differences in the probabilities of selection of telephone numbers and clustering within strata that are not taken into account in the weighting.

Nevada

Nevada's 1996 sample design utilizes a sampling frame consisting of hundred blocks that contain five or more listed household telephone numbers. Telephone numbers are sampled in direct proportion to the number of listed household numbers in the hundred block from which each is selected. The design also varies probabilities of selection by the county to which a prefix is assigned according to one of various criteria such as total households or population. The criteria may have varied at the different times sample records were selected.

During sample selection, if a known business number is initially selected, it is replaced by the next eligible number on the sampling frame, which increases the probability of selection of numbers preceded by known business numbers. Any number selected within the previous five to six months by any of the clients used by Nevada's vendor is also replaced by the next available number on the sampling frame unless this would result in an insufficient number of sampling units being provided. No adjustment has been made for the differential probabilities associated with selection being made in proportion to hundred block listings, county, or replacement of business or recently selected numbers.

Oregon

In Oregon for 1996, density weights are missing for January through June. All records for this time period received a density weight of 1.
For July through December, there were 18 records with a weight of 4; all other records received a weight of 1.

Texas

In 1996, Texas used a sample design in which only phone numbers from hundred blocks with one or more listed household numbers were included in the sampling frame. Such hundred blocks were estimated to contain 97.1% of all household numbers in Texas. Numbers from this frame were selected with an equal probability of selection.

Other areas

In a few states, a portion of sample records intended for use during one month may have been completed in another month. This deviation should only affect analyses based on monthly, rather than annual, data.

B. Other 1996 limitations of the data

In addition to departing from the standard for sample designs, California modified the wording of the mammography and Papanicolaou (PAP) smear questions. These questions may have limited comparability to those of other reporting areas. California also asked the HIV/AIDS section questions of persons 18-45 years of age rather than of those 18-64 years of age as specified.

Oklahoma's sample includes a disproportionate number of persons 65 years or over compared with the state population. An age distribution in the survey that differs substantially from the population distribution may produce biased estimates of risk factor prevalences, particularly since many characteristics of individuals are affected by age. Further information on age distribution is shown in Table 5 of the Summary Quality Control Report, available on the Internet web site: www.cdc.gov/nccdphp/brfss.

The respondent race distribution differed between the sample and the population for some reporting areas, which may produce biased estimates of risk factor prevalences. The discrepancy between the percentage Non-white in the sample and the percentage Non-white in the population is an indicator of racial bias of the sample.
The percentage Non-white in the sample is affected by interstate differences in the protocol for coding the race of Hispanics. For example, Texas and Idaho coded Hispanics as Other Race unless respondents identified themselves as belonging to one of the standard race categories the first time they were asked the question. New Mexico, by contrast, probed for one of the standard race categories. Finally, California coded the race of Hispanics as White unless they identified themselves as belonging to one of the other standard races the first time they were asked the question. These differences in protocol affect the percentage Non-white in states with large Hispanic populations and may introduce bias in the race-specific risk factor prevalences for these areas. (See Tables 2-4 in the Summary Quality Control Report for additional information.)

Income non-response varies substantially by reporting area. Although the median non-response rate on this item is 5.6%, it ranges from 1.2% to 31.1% across states. Thirty-one percent of Oklahoma's respondent records were missing income, as were about 20% of the records in South Carolina and Arizona. Compared with other items on the survey, income non-response is relatively high. For example, item non-response for educational attainment has a median of less than 1% and a maximum of 5.6%.

Telephone coverage averages about 95% for U.S. states as a whole, but non-coverage ranges from 3.2% in North Dakota to 13.3% in New Mexico. It is estimated that 24% of households in Puerto Rico are without telephones.

Dual questionnaires and/or partial year coverage occurred in Illinois, Kentucky, Tennessee, and New York. Illinois used dual questionnaires and collected data on core items involving exercise, nutrition, and weight control, and on modules concerning oral health, hypertension, cholesterol, immunization, injury control, and alcohol, for only 6 months of the interviewing period in 1996.
Kentucky used dual questionnaires and collected data on the following modules for only part of the year: diabetes, smokeless tobacco, health care utilization, oral health, hypertension, cholesterol, immunization, colorectal screening, injury control, alcohol, and firearms. Tennessee collected data for the modules on sexual behavior, oral health, preventive counseling, colorectal screening, and firearms for only part of the year. New York used dual questionnaires and collected data on the following modules for only part of the year: sexual behavior, oral health, preventive counseling, colorectal screening, and firearms.

Data users will need to alter program code so that the usual "missing/dk/refused" codes are not combined with the "9's" that appear in records because of noncoverage in the states mentioned here.
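
One way to keep noncoverage "9's" apart from true item non-response is sketched below in Python. It is an illustration only, not CDC program code. The numeric codes (7 for don't know, 9 for refused), the record fields (state, month, value), and the months shown for Illinois are all assumptions made for the example; the actual codes and fielding months must be taken from the 1996 codebook and from each state's interviewing schedule.

```python
# Illustrative sketch only. Codes, field names, and coverage months are
# assumptions for this example and are NOT taken from the BRFSS codebook.
DK_CODE = 7        # assumed "don't know" code
REFUSED_CODE = 9   # assumed "refused" code; collides with noncoverage 9's


def module_asked(state, month, coverage):
    """Return True if the module was fielded in this state and month.

    `coverage` maps a state to the set of months in which the module was
    asked. States absent from the mapping are assumed to have asked the
    module all year.
    """
    months = coverage.get(state)
    return True if months is None else month in months


def classify(value, asked):
    """Label a response so that noncoverage is never counted as refusal."""
    if not asked:
        return "not_asked"      # noncoverage: exclude from denominators
    if value is None:
        return "missing"
    if value in (DK_CODE, REFUSED_CODE):
        return "dk_refused"
    return "valid"


def item_nonresponse_rate(records, coverage):
    """Item non-response among records that were actually asked the item."""
    statuses = [
        classify(r["value"], module_asked(r["state"], r["month"], coverage))
        for r in records
    ]
    asked = [s for s in statuses if s != "not_asked"]
    if not asked:
        return None
    return sum(s != "valid" for s in asked) / len(asked)


# Hypothetical coverage: suppose a module was asked in Illinois only in
# July through December (the actual months are not stated in this section).
coverage = {"IL": {7, 8, 9, 10, 11, 12}}

records = [
    {"state": "IL", "month": 3, "value": 9},   # 9 from noncoverage
    {"state": "IL", "month": 8, "value": 9},   # 9 from a true refusal
    {"state": "IL", "month": 9, "value": 1},
    {"state": "IL", "month": 10, "value": 2},
]

rate = item_nonresponse_rate(records, coverage)
```

In this sketch the March record is labeled "not_asked" and dropped from the denominator, so only the August refusal counts toward item non-response. Combining the two kinds of 9's would instead overstate non-response for the affected states.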