First Stage Sampling
Geographic Information System (GIS) CASPER Toolbox
Using GIS software rather than the U.S. Census website provides more flexibility in the selection of the sampling frame by allowing the user to select portions of a county, city, or other available geopolitical areas. A CASPER Toolbox is available for GIS sample selection that can automatically select clusters. It is currently available in ArcGIS Desktop for those with GIS expertise. CDC also provides sampling and mapping support to requesting jurisdictions. Email CASPER@cdc.gov for more information.
CASPER uses a two-stage cluster sampling methodology. In the first stage, clusters (traditionally 30) are selected with a probability proportional to the estimated number of households in the clusters. A cluster is a non-overlapping section in a geographic area with a known number of households. For this reason, U.S. Census blocks are most commonly used.
Selecting a CASPER Sample
- Selecting a CASPER sample requires a list of all clusters (e.g., census blocks) within your sampling frame, including the number of households within each cluster. This can be downloaded from the Census website or from population-based files within Geographic Information System (GIS) software.
- Number the households by assigning each cluster the cumulative sum of households up to and including that cluster, so that each cluster corresponds to a range of household numbers. For example, if the first three clusters contain 10, 15, and 5 households, their cumulative sums are 10, 25, and 30, and household numbers 11 through 25 fall in the second cluster.
- Select 30 clusters by using a random number generator to draw 30 numbers between one and the total number of households within your sampling frame, then selecting the entire cluster in which each random number (i.e., household) is located. Some clusters may be chosen two or three times; this is acceptable, and teams would then conduct 14 (or 21) interviews in that cluster instead of the standard seven.
- All clusters are chosen without substitution – meaning that clusters originally selected are those that are assessed with no changes or modifications. Any departure from this design (30×7 cluster sampling) is not considered a CASPER. In situations where the 30×7 design may not be feasible or ideal, and a change in methodology is warranted, modified CASPERs may be acceptable but must be described as modified in report(s). See Modified CASPERs for more information.
- Develop maps via the Census website or GIS software so teams can easily navigate to the selected clusters. For more information and detail on selecting clusters, please see the CASPER Toolkit, Section 2.4.
To aid in the CASPER process and ensure sampling is correct, CDC provides sampling and mapping support to any requesting jurisdiction or agency. Email CASPER@cdc.gov for assistance.
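The cumulative-sum selection described in the steps above can be sketched in a few lines of Python. This is only an illustration, not the CASPER Toolbox or an official CDC tool; the function names `select_clusters` and `interviews_per_cluster`, the dictionary input, and the `seed` parameter are assumptions made for the example.

```python
import bisect
import itertools
import random
from collections import Counter


def select_clusters(households_by_cluster, n_clusters=30, seed=None):
    """Draw clusters with probability proportional to household count.

    households_by_cluster: dict mapping a cluster ID (e.g., a census block
    ID) to its number of households. Hypothetical input format.
    """
    rng = random.Random(seed)
    cluster_ids = list(households_by_cluster)
    # Cumulative sums: cluster i covers household numbers
    # cumulative[i - 1] + 1 through cumulative[i].
    cumulative = list(
        itertools.accumulate(households_by_cluster[c] for c in cluster_ids)
    )
    total_households = cumulative[-1]

    selected = []
    for _ in range(n_clusters):
        # Random household number between 1 and the total, inclusive.
        household = rng.randint(1, total_households)
        # First cluster whose cumulative sum reaches this household number.
        index = bisect.bisect_left(cumulative, household)
        selected.append(cluster_ids[index])
    return selected


def interviews_per_cluster(selected, per_selection=7):
    """A cluster drawn k times receives k * 7 interviews."""
    return {c: k * per_selection for c, k in Counter(selected).items()}
```

Because each draw is independent, a large cluster can appear more than once in the result, which matches the 14 or 21 interviews described above. Clusters with zero households are never selected, since no household number falls within their range.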
Second Stage Sampling
Typically, a single individual conducts the first stage of sampling (selecting 30 clusters), but it is the responsibility of the interview teams to appropriately select the households to interview within each cluster.
Systematic Random Sampling
To select the seven households to interview, conduct systematic random sampling. To do this:
- Count (or estimate) the number of households within the selected cluster.
- Divide that number by 7; the result is your sampling interval, n.
- Starting at a random point, travel through the cluster in a serpentine method (i.e., walk up one side of the street and then turn and walk down the other side in such a manner that every house within the selected cluster is passed) to select every nth household for interview.
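As a rough sketch of the steps above, the selection can be simulated in Python once the households are listed in the order a team would pass them on the serpentine route. The function name `systematic_sample` is hypothetical, and two details are assumptions not stated in the text: the interval is rounded down to a whole number, and counting wraps back to the beginning of the route when the end is reached.

```python
import random


def systematic_sample(route, n_interviews=7, seed=None):
    """Pick every nth household along a serpentine route.

    route: list of households in the order they are passed when walking
    the cluster. Hypothetical input format.
    """
    rng = random.Random(seed)
    count = len(route)
    if count == 0:
        return []
    # Sampling interval n = households / 7, rounded down (assumption),
    # and never less than 1 for very small clusters.
    interval = max(1, count // n_interviews)
    # Random starting point somewhere along the route.
    start = rng.randrange(count)
    # Wrap around the end of the route (assumption) so that seven
    # households are reached regardless of where the team starts.
    n_picks = min(n_interviews, count)
    return [route[(start + k * interval) % count] for k in range(n_picks)]
```

For a cluster of 35 households, the interval is 5, so a team starting at the 12th household on the route would visit the 12th, 17th, 22nd, 27th, 32nd, 2nd, and 7th households. In the field, teams usually count as they walk rather than working from a list, so the sketch only illustrates the arithmetic.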
The most scientifically sound and representative approach is to select the seven households and continue returning to them until an interview is completed. However, it is important to balance the scientific ideal against the realities of a disaster or other field situation. Therefore, interview teams should attempt to visit each selected household three times, and may then replace the household if an interview was not successful (e.g., household refused, nobody answered after three attempts, language barrier). Overall, keeping the sample as complete and representative as possible requires sound judgment and quality training of interview teams.
For more information, please see the CASPER Toolkit, Section 3.4.
Things to Avoid
Convenience sampling is a form of non-probability sampling that involves selection based on availability, opportunity, or convenience. Examples include going to households where people are outside, or to a household that a previous interviewee recommended because they know the residents would answer.
Target sampling is a form of non-probability sampling that involves intentionally sampling a certain population or group. Examples include going to the household that looks the most damaged or that seems likely to give the “best” results.
Sequential sampling is going to one house after another in sequential order, which will likely bias your sample toward one section of your cluster. Note that some circumstances may make sequential sampling necessary, such as clusters with fewer than 10 households or clusters that are so difficult to navigate that sequential sampling is the only way to complete the CASPER in the time allowed. If this is the case, it is extremely important that a random starting point is selected before going into the field.