Tailoring Community-Based Wellness Initiatives With Latent Class Analysis — Massachusetts Community Transformation Grant Projects

Introduction Community-based approaches to preventing chronic diseases are attractive because of their broad reach and low costs, and as such, are integral components of health care reform efforts. Implementing community-based initiatives across Massachusetts’ municipalities presents both programmatic and evaluation challenges. For effective delivery and evaluation of the interventions, establishing a community typology that groups similar municipalities provides a balanced and cost-effective approach. Methods Through a series of key informant interviews and exploratory data analysis, we identified 55 municipal-level indicators of 6 domains for the typology analysis. The domains were health behaviors and health outcomes, housing and land use, transportation, retail environment, socioeconomics, and demographic composition. A latent class analysis was used to identify 10 groups of municipalities based on similar patterns of municipal-level indicators across the domains. Results Our model with 10 latent classes yielded excellent classification certainty (relative entropy = .995, minimum class probability for any class = .871), and differentiated distinct groups of municipalities based on health-relevant needs and resources. The classes differentiated healthy and racially and ethnically diverse urban areas from cities with similar population densities and diversity but worse health outcomes, affluent communities from lower-income rural communities, and mature suburban areas from rapidly suburbanizing communities with different healthy-living challenges. Conclusion Latent class analysis is a tool that may aid in the planning, communication, and evaluation of community-based wellness initiatives such as Community Transformation Grants projects administered by the Centers for Disease Control and Prevention.


Introduction
Chronic diseases are among the leading causes of preventable death in the United States, accounting for roughly 75% of the nation's health care costs (1). These diseases are related to tobacco use, physical inactivity, and poor diet. Community-based approaches to preventing chronic diseases are attractive because of their broad reach and low costs, especially relative to most medical interventions.
A data-driven classification system that groups municipalities, the level at which Mass in Motion generally operates, according to relevant characteristics balances these concerns, supporting a tailored-yet-efficient approach to program implementation and evaluation. DPH, in partnership with the Metropolitan Area Planning Council and the University of Massachusetts Medical School, designed an empirically based "Prevention and Wellness Community Typology" using latent class analysis (LCA) that classifies municipalities into distinct groups based on prevention needs, community assets, and other contextual information. These groupings help DPH customize communications messages, set realistic goals for intervention outcomes, and evaluate the success of community-based wellness initiatives.
LCA is a statistical method for identifying underlying groups of similar individuals or units from a heterogeneous sample. LCA methods have been applied to study mental health problems (5,6), substance use patterns (7,8), skin cancer risk (9), back pain symptoms (10), and obesity-related health behaviors (11-13). However, we found only 1 paper that applied LCA to categorize US communities or neighborhoods into general archetypes (14). In this report, we show a novel application of LCA models to improve and better understand a suite of ongoing community-based public health interventions supported by CDC.

Methods
After key informant interviews with regional planners and public health practitioners, literature reviews, and exploratory data analyses, we identified 6 domains of municipal-level characteristics expected to affect the implementation or evaluation of community-based prevention strategies in Massachusetts. These domains are composed of 1) health behaviors or outcomes relevant to program goals, 2) housing and land use characteristics, 3) transportation patterns, 4) retail environment, 5) socioeconomics, and 6) demographic composition. These domains were used only to frame the a priori selection of input indicators, not to represent distinct latent variables for which separate class solutions would be estimated.
The domain descriptions are as follows: Domain 1 (health behaviors and outcomes) captures baseline metrics that the CTG program seeks to improve (eg, fruit and vegetable intake). Domains 2 through 4 include local conditions that are involved in community-based interventions. For example, domain 2 (housing and land use) includes an indicator of subsidized housing inventory because communities are working to promote tobacco-free living in this setting. Similarly, domain 3 (transportation patterns) includes measures relevant to the walking environment. Domain 4 (retail environment) includes counts and densities of local business establishments with which many initiatives may require collaborations. Domains 5 (socioeconomics) and 6 (demographics) are expected to affect the way interventions work across communities. Domain 5 includes conditions such as median household income, which can affect the ability of residents to change some of their health behaviors. Domain 6 focuses on demographic composition because interventions may work differently among different subpopulations. For example, responses to active living and healthy eating interventions may vary by age.
We selected 55 variables to represent the 6 domains. The Appendix provides detailed metadata on each variable. Briefly, the sources and types of data were as follows:
• Massachusetts Behavioral Risk Factor Surveillance System (BRFSS) (2009): community-level prevalence estimates for selected obesity-related risk outcomes and behaviors, including diabetes, hypertension, current smoking, obesity, fruit and vegetable consumption, and physical activity among adults. Municipal-level BRFSS estimates are constructed by using a small-area estimation method (15,16) that weights data according to age, sex, race/ethnicity, and poverty rates.
• Hospital discharge data from the Massachusetts Center for Health Information and Analysis (2010): annual rates of hospitalizations for obesity-related health outcomes for 1) hypertension and hypertensive diseases, 2) transient ischemic attack, 3) major cardiovascular disease, 4) heart disease, and 5) cerebrovascular disease. We include both unadjusted and age-adjusted rates because the 2 indicators provide different information. The former indicates the magnitude of disease burden in communities, whereas the latter highlights geographic disparities by allowing for the comparison of municipalities after removing age as a determinant of hospitalization.
We compiled data for all 351 municipalities for each indicator, except where estimates were deemed unstable by DPH, in which case data were coded as missing. We ranked municipalities on each indicator and used each municipality's decile on that indicator as the model input. For example, Boston, Massachusetts' most populous municipality, was assigned a value of 10 for its 2010 population value rather than 617,594, the actual number of residents. We used this approach to prioritize relative similarity above absolute similarity so that municipalities with particularly extreme values on multiple indicators would not end up in classes by themselves while the bulk of communities were grouped into 1 large class. Roughly equal-sized groups provide functional peer groups for communities and make stratified sampling by class possible. Deciles operationalized this idea of relative similarity better than raw values did. The resulting data set was a matrix of 351 municipalities by 55 indicators, populated with decile values for each municipality-indicator cell.
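The rank-to-decile step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function name `to_deciles`, the town labels, and all population values other than Boston's 617,594 are hypothetical, and the tie-breaking rule (`method="first"`) is an assumption because the source does not say how ties were handled.

```python
import numpy as np
import pandas as pd

def to_deciles(values: pd.Series) -> pd.Series:
    """Rank the non-missing values and map each rank to a decile from 1 to 10.

    Missing values (e.g., estimates flagged as unstable) remain missing.
    """
    ranks = values.rank(method="first")      # NaN inputs keep a NaN rank
    n_valid = values.notna().sum()           # rank only among observed values
    return np.ceil(ranks * 10 / n_valid)

# Hypothetical populations for 19 towns, plus Boston's value from the text.
population = pd.Series(
    [float(500 * (i + 1)) for i in range(19)] + [617594.0],
    index=[f"town_{i:02d}" for i in range(19)] + ["Boston"],
)
population["town_07"] = np.nan               # an indicator coded as missing

deciles = to_deciles(population)
print(deciles)
```

Because only the rank enters the model, Boston receives the same value as any other top-decile municipality, which is how the approach limits the influence of extreme values.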
LCA, also known as finite mixture modeling, allows researchers to detect underlying (latent) subgroups from observable variables. Classes are identified such that the observed variables are independent conditional on class membership; variables that are usually highly correlated, such as median household income and housing unit density, would therefore no longer be correlated within each class.
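The conditional (local) independence assumption can be written compactly. The notation below is ours rather than the source's: y_{ij} is the value of indicator j for municipality i, \pi_c is the prevalence of class c, and P(y_{ij} | c) stands for the class-specific probability (or density) of that value; here C = 10 and J = 55.

```latex
% Marginal likelihood of one municipality's indicator profile under local independence,
% followed by the posterior probability of class membership (Bayes' rule).
P(\mathbf{y}_i) = \sum_{c=1}^{C} \pi_c \prod_{j=1}^{J} P(y_{ij} \mid c),
\qquad
P(c \mid \mathbf{y}_i) =
  \frac{\pi_c \prod_{j=1}^{J} P(y_{ij} \mid c)}
       {\sum_{c'=1}^{C} \pi_{c'} \prod_{j=1}^{J} P(y_{ij} \mid c')}
```

The product over j is where the independence assumption enters: within a class, the joint probability of the indicators factors into the individual indicator probabilities.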
Because the goal of this analysis was to help program staff tailor intervention approaches to community needs and context, we constrained our model to a 10-class solution, which was thought to be the highest number of classes tolerable from a program planning and evaluation perspective. Because observed variables (eg, population size) predicted missing data patterns, we were able to use full-information maximum likelihood estimation methods to predict the probabilities of each community belonging to each latent class. Each municipality was assigned to the latent class to which it had the highest probability of belonging. The analysis was conducted using Mplus version 6.1 (Muthén and Muthén, Los Angeles, California).
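As a rough sketch of the assignment rule, the Python below takes a matrix of posterior class probabilities, assigns each unit to its most probable class, and computes relative entropy using the commonly cited definition, 1 minus the total posterior entropy divided by N ln K. The function names and the 3-by-3 posterior matrix are hypothetical; the actual posteriors came from the fitted model and are not reproduced in the text.

```python
import numpy as np

def modal_assignment(posteriors: np.ndarray) -> np.ndarray:
    """Return the 1-based index of the most probable class for each row."""
    return posteriors.argmax(axis=1) + 1

def relative_entropy(posteriors: np.ndarray) -> float:
    """Relative entropy: 1 - sum_i sum_k(-p_ik ln p_ik) / (N ln K).

    Values near 1 indicate sharp classification; 0 means every unit is
    equally likely to belong to every class.
    """
    n_units, n_classes = posteriors.shape
    # Clip inside the log only, so that exact zeros contribute 0 * finite = 0.
    log_p = np.log(np.clip(posteriors, 1e-300, None))
    total_entropy = -(posteriors * log_p).sum()
    return 1.0 - total_entropy / (n_units * np.log(n_classes))

# Hypothetical posterior probabilities for 3 units and 3 classes.
example = np.array([
    [0.98, 0.01, 0.01],
    [0.05, 0.90, 0.05],
    [1 / 3, 1 / 3, 1 / 3],
])
classes = modal_assignment(example)
entropy = relative_entropy(example)
print(classes, round(entropy, 3))
```

A perfectly certain classification (each row a 0/1 indicator) gives a relative entropy of 1, so a value close to 1 means nearly every unit has one dominant class.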

Results
Health-relevant community characteristics varied widely among Massachusetts municipalities (Table 1), highlighting the need to tailor community prevention efforts. For example, municipal smoking prevalence ranged from 4.7% to 29.2%, and the percentage of residents ever diagnosed with hypertension ranged from 10.3% to 34.5%. Sociodemographic composition and environmental characteristics also varied widely. The percentage of the population that was non-Hispanic white in 2010 ranged from roughly 20% to 98% at the municipal level, while the percentage of housing that was high density or multifamily ranged from 0% to 100%.
The overall high prevalence of obesity and obesity-related unhealthy behaviors indicated the need for community-based wellness interventions in Massachusetts, with average municipal-level obesity estimates exceeding 25%. Our latent class model yielded excellent classification certainty (Table 2; relative entropy = .995, minimum class probability for any class = .871). In terms of model fit, the Akaike Information Criterion was 80525.5, the Bayesian Information Criterion (BIC) was 82896.0, and the sample size-adjusted BIC was 80948.2. The most populous class contains 51 municipalities (14.5% of the state total), whereas the smallest class contains 20 (5.7%). Model fit statistics did not support the a priori selection of a 10-class solution. The log likelihood was not replicated, and 10 classes were not preferable to 9 classes according to a Vuong-Lo-Mendell-Rubin adjusted likelihood ratio test (P = .83). However, a 10-class solution offered the maximum differentiation that program staff could accommodate and was therefore preferred for programmatic reasons.
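The three information criteria reported above are linked by simple arithmetic, which the following Python sketch uses as a consistency check. It assumes the standard definitions (AIC = -2LL + 2p, BIC = -2LL + p ln N, and the sample size-adjusted BIC with N replaced by (N + 2)/24) and N = 351 municipalities; the implied parameter count and log likelihood are back-calculated here and are not values reported in the source.

```python
import math

N = 351                      # municipalities in the analysis
AIC = 80525.5                # values as reported in the text
BIC = 82896.0
SABIC = 80948.2

# BIC - AIC = p * (ln N - 2), so the number of free parameters p follows directly.
p_implied = (BIC - AIC) / (math.log(N) - 2.0)

# AIC = -2 * LL + 2 * p  =>  LL = (2 * p - AIC) / 2
loglik_implied = (2.0 * p_implied - AIC) / 2.0

# Sample size-adjusted BIC replaces N with (N + 2) / 24 in the penalty term.
sabic_check = -2.0 * loglik_implied + p_implied * math.log((N + 2) / 24.0)

print(f"implied free parameters: {p_implied:.1f}")
print(f"implied log likelihood:  {loglik_implied:.1f}")
print(f"recomputed adjusted BIC: {sabic_check:.1f} (reported {SABIC})")
```

Under these assumptions the reported values agree with one another and imply roughly 614 free parameters. That count would be consistent with, for example, 10 x 55 class-specific means, 55 class-invariant variances, and 9 class proportions, but the source does not state the parameterization, so this reading is conjecture.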
The classification clearly separated the groups of communities from each other. Table 1 shows the mean class-level values of variables across the 10 classes, highlighting the nature of each class and demonstrating the general utility of LCA in characterizing multidimensional complexity.
Class 1 (n = 49) includes Massachusetts' least densely populated and most rural communities, which tend to have somewhat older and less racially and ethnically diverse residents than the state overall. There are very few retail outlets, physicians, or subsidized housing units in these communities, and the low-density land use patterns contribute to high per-household vehicle miles traveled (VMT). These municipalities have among the lowest incidence of many negative health outcomes, but indicators of healthy behavior are moderate or poor (eg, childhood obesity, births to mothers who smoked anytime during pregnancy).
In terms of population size and density and geographic location within the state, class 2 is similar to class 1, though with a somewhat more diverse housing stock and a more balanced age structure. Class 2 municipalities struggle with high unemployment, low household incomes, and high poverty rates, and have one of the worst health profiles with low prevalence of fruit and vegetable consumption and physical activity and high prevalence of smoking, births to mothers who smoked during pregnancy, diabetes, and hypertension.
Class 3 is largely composed of moderate-density coastal and western Massachusetts communities (Figure) that are popular seasonal destinations and home to many retirees, as evidenced by the high share of residents aged 65 or older (22%, higher than any other class). Population declined slightly on average in the past decade. Consistent with these land use patterns and demographics, household VMT is lower than in classes 4 and 5, which are of a similar size and density. Health behaviors and outcomes are generally in the middle of the classes, except for hypertension prevalence, which is among the highest in the state.
Class 4 contains relatively small but rapidly suburbanizing communities with a large share of the population aged less than 18 years. Population in these municipalities grew by 9% on average since 2000, and half the housing stock was built after 1970, mostly in low-density single-family subdivisions. As a result, this is the most car-dependent class, where the average household drives 87 miles per day and more than 90% of workers commute by car. These communities rank near the median on socioeconomic indicators (poverty, income) and health behaviors, but have high rates of hospitalization for cardiovascular and heart disease and transient ischemic attack, or "ministroke."
Class 5 contains the wealthiest communities in the commonwealth. Median household income is the highest, and poverty rate the lowest of our 10 classes. More than a quarter of the residents are under the age of 18, the highest percentage of young people of any class. These towns also have the lowest childhood obesity rates and the highest rates of physical activity and fruit and vegetable consumption. Class 5 also has among the lowest class-level diabetes prevalence, hypertension diagnoses, and adult obesity.
Class 6 includes a mix of small urban communities and mid-sized suburbs that share numerous socioeconomic and health challenges. Poverty and unemployment rates are well above the state average, household incomes are relatively low, and both health behaviors and outcomes (including smoking, exercise, eating fruits and vegetables, obesity, and hospitalizations) are poor. These communities have among the largest share of residents aged 65 or older, are predominately non-Hispanic white, and have been growing slowly over the past decade.
Class 7 consists of mid-size, moderate density, generally wealthy suburbs in Greater Boston that exhibit generally healthy behaviors and average health outcomes. Compared with other suburban communities, they have a high Asian population (5% average) and a large share of residents aged under 18 (26% average). These communities grew rapidly in the past 10 years yet still have a low share of subsidized housing. Grade 1 and grade 10 obesity rates are among the state's lowest, as are the rates of adult obesity.
Class 8 has largely mature suburbs and small urban communities in Greater Boston that are characterized by racially and ethnically diverse populations, moderate socioeconomic status, and relatively poor health outcomes. Although they have a relatively diverse housing stock, a large share of the land area is devoted to commercial and industrial uses, mostly automobile-oriented development. It has among the highest class-level hospitalization rates for common chronic diseases, though it falls in the middle of the pack with respect to resident health behaviors.
Class 9 includes a range of higher-density suburbs and compact cities clustered around Boston. This class is characterized by compact development patterns, as indicated by high average population density, a low percentage of commuters traveling by motor vehicle, and a large share of high-density and multifamily housing. In particular, this class has the highest class-level proportion of Asian residents. Class 9 enjoys an excellent health profile, with the lowest adult smoking and obesity rates and highest rates of physical activity and eating fruits and vegetables.
Class 10 is composed of the state's most populous, densely populated, racially and ethnically diverse, and urban communities. These communities are much less reliant on automobile travel than most other places in the state, but experience substantial socioeconomic challenges, with low incomes, high rates of poverty and unemployment, and a large share of subsidized housing. They perform poorly on almost all health measures. Residents struggle with high rates of physical inactivity, smoking, obesity, and chronic diseases.

Discussion
We present a novel application of LCA methodology to address programmatic and evaluation challenges associated with implementing a municipal-based wellness intervention program across a large number of heterogeneous communities. The approach considered intervention inputs (eg, retail environment), effect modifiers (eg, age), and outcome measures of the prospective interventions (eg, obesity prevalence). The typology has excellent classification certainty and yielded roughly equal-sized classes. Combining expert local knowledge with an empirical approach was crucial in evaluating face validity and selecting a programmatically useful solution despite model fit statistics. To our knowledge, such an application in community health planning has not been reported in the literature.
Although our analysis did not incorporate spatial relationships among municipalities, the typology map reveals that class membership is spatially clustered, reflecting the history of community development, migration patterns, and economic changes in Massachusetts. Land use, transportation, and retail environment indicators clearly distinguished rural, suburban, and urban communities. The level of urban development, however, was not the sole driver of class differentiation. For example, 2 predominantly rural classes (ie, class 1 and class 2) are quite similar with respect to the built environment, housing, and land use characteristics, transportation patterns, the retail environment, and demographics, yet exhibit distinctly different socioeconomic characteristics, health behaviors, and health outcomes. Class 2 communities were poorer and fared worse in overall health profiles, with higher rates of smoking, obesity, hypertension, and hospitalizations for chronic diseases. This distinction informs the design and delivery of public health interventions in those 2 classes.
The prevention and wellness community typology derived from our analysis serves as a basis for 1) establishing proper evaluation benchmarks, 2) establishing efficient-yet-tailored communications campaigns, 3) facilitating knowledge exchange across peer communities, and 4) using cost-effective, context-specific intervention selections and staff training. One of these applications is already in progress: field and telephone survey sample frames used the typology as a stratification variable, ensuring that data on health behaviors, the walking environment, and the food environment
Despite these successes, we note limitations to our analysis. First, the model was constrained to produce 10 classes for pragmatic reasons despite model fit statistics indicating that 2 or more of these classes could be combined. Although we found meaningful distinctions among all 10 groups and made the face validity of the solution our priority, others seeking to apply LCA to public health practice should consider a data-driven approach to obtain a statistically optimal class solution. In making an empirical determination of how many classes exist, technical improvements in model specification, such as relaxing local independence assumptions for highly related variables, should be used. Failing to relax these assumptions may artificially inflate the number of classes detected by LCA, though this was not a major concern in our analysis with a predetermined 10-class solution. Second, although our input variables were selected based on 6 domains that were considered directly relevant to Mass in Motion and CTG programs, it is possible that some inputs are redundant while other relevant indicators were overlooked. Finally, rank-based classification aimed to capture relative rather than absolute similarity among communities and as such served to limit extreme values. For example, the state's largest community (Boston) has roughly 7,500 times the population of the smallest community (Gosnold). With a decile classification, that ratio is not preserved. Although this methodology limits the impact of extreme values, data from outliers could have been treated in other ways that might have affected the class structure. Absolute similarities may be considered in a refined typology.
LCA, as demonstrated here, is an effective statistical method that could help improve implementation and evaluation of community-based wellness efforts nationwide. Our work in Massachusetts is ongoing. The 5-year evaluation plan