This task will outline how to calculate confidence intervals for geometric means. See "How to Perform Statistical Tests and Calculate Confidence Limits with Degrees of Freedom" in the Variance Estimation module for basic programming steps for calculating confidence limits.

When the data are highly skewed, you will need to transform them. For example, you can obtain the geometric mean by applying a log transformation to the data, averaging the logged values, and then exponentiating the result.
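The log-transform route to a geometric mean can be sketched in a few lines of Python. This is only an unweighted illustration: the `geometric_mean` helper and the sample values are hypothetical, and the sketch ignores the survey weights and design variables that the SUDAAN procedure in this task accounts for.

```python
import math
import statistics


def geometric_mean(values):
    """Geometric mean: exponentiate the arithmetic mean of the logs."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical right-skewed values, for illustration only (not NHANES data).
sample = [50.0, 100.0, 200.0, 800.0]

arith = statistics.mean(sample)
geo = geometric_mean(sample)

# For skewed data the arithmetic mean is pulled toward the large values,
# while the geometric mean stays closer to the bulk of the data.
print(f"arithmetic mean = {arith:.1f}, geometric mean = {geo:.1f}")
```

For these made-up values the geometric mean is well below the arithmetic mean, which is the pattern you would expect for a right-skewed variable.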

In this example, you will be calculating geometric means for the fasting serum triglyceride variable. Obtain the geometric mean and its standard error directly from the SUDAAN *proc descript* procedure, and then output them to a SAS dataset where the CI can be constructed directly. The explanations in the summary table below provide an example that you can follow.

IMPORTANT NOTE

These programs use variable formats listed in the Tutorial Formats page. You may need to format the variables in your dataset the same way to reproduce results presented in the tutorial.

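Statements | Explanation |
---|---|
proc sort data=analysis_data; by sdmvstra sdmvpsu; | Use the SAS procedure, *proc sort*, to sort the data by strata (sdmvstra) and primary sampling units (sdmvpsu) before running the SUDAAN procedure. |
proc descript data=analysis_data geometric atlevel1=1 atlevel2=2; | Use the SUDAAN procedure, *proc descript*, to generate the estimates. Use the geometric option to request geometric means. Use the atlevel1 and atlevel2 options to count the strata (level 1 of the nest statement) and the PSUs (level 2). The counts are needed later to calculate the degrees of freedom. |
nest sdmvstra sdmvpsu; | Use the nest statement with strata (sdmvstra) and PSU (sdmvpsu) to account for the sample design. |
weight wtsaf4yr; | Use the morning fasting sample weight for 4 years of data (wtsaf4yr) in the weight statement. |
subpopn ridageyr>=20 /name="Adults 20 years and older"; | Use the subpopn statement to restrict the analysis to adults 20 years and older. |
class age1 riagendr/nofreq; | Use a class statement to list discrete variables upon which subgroups are based. In this example, gender (riagendr) and age group (age1). |
var lbxtr; | Use a var statement to name the analysis variable, here fasting serum triglyceride (lbxtr). |
tables riagendr*age1; | Use the tables statement to request estimates for gender by age group. |
print nsum geomean segeomean/style=NCHS geomeanfmt=f6.0 segeomeanfmt=f6.1; | Use the print statement to display the sample size (nsum), the geometric mean (geomean), and its standard error (segeomean) in the stated formats. |
output nsum geomean segeomean atlev1 atlev2/filename=tg9902 replace; | Use an output statement to save the sample size, the geometric mean, its standard error, and the strata and PSU counts (atlev1, atlev2) to a dataset (tg9902). |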

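Statements | Explanation |
---|---|
data newtg9902; set tg9902; df=atlev2-atlev1; | Use the data step to create a new dataset (newtg9902) from the SUDAAN output dataset (tg9902). Calculate the degrees of freedom (df) as the number of PSUs (atlev2) minus the number of strata (atlev1). |
drop PROCNUM TABLENO VARIABLE _C1 _C2 ATLEV1 ATLEV2; | Use a drop statement to remove variables that are not needed in the printed output. |
ll=round(geomean+tinv(.025,df)*segeomean); ul=round(geomean+tinv(.975,df)*segeomean); geomean=round(geomean); segeomean=round(segeomean,.1); ciwidth=ul-ll; | Use these statements to calculate the lower limit (ll) and upper limit (ul) of the 95% confidence interval, round the geometric mean and its standard error, and calculate the width of the confidence interval (ciwidth). |
proc print split='/' noobs; format age1 age1fmt. riagendr sexfmt. nsum 7.0 geomean 6.0 segeomean 6.1 df 2.0; label ll='Lower/Limit' ul='Upper/limit' df='Degrees/of/freedom' ciwidth='Confidence/interval/width'; title1 'Geometric mean of serum triglyceride and 95% Confidence'; run; | Use the print procedure to display the results with the listed formats, labels, and title. The split='/' option breaks each column label at the slash. |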
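The arithmetic in the data step above can be checked by hand with a short Python sketch. It mirrors the SAS statements only loosely. The function names (`sas_round`, `geomean_ci`) and the input numbers are hypothetical. The t critical value is passed in rather than computed, so you would look it up for your own degrees of freedom (in SAS, `tinv(.975, df)`).

```python
import math


def sas_round(x):
    """Round to the nearest integer with halves rounded up.

    Approximates the SAS ROUND function for the non-negative values
    used here (Python's built-in round() uses banker's rounding).
    """
    return math.floor(x + 0.5)


def geomean_ci(geomean, segeomean, atlev1, atlev2, t_crit):
    """Confidence limits built as geomean -/+ t_crit * segeomean.

    atlev1 is the number of strata and atlev2 the number of PSUs, so
    df = atlev2 - atlev1. t_crit is the upper t quantile for that df
    (assumed to be supplied by the caller).
    """
    df = atlev2 - atlev1
    ll = sas_round(geomean - t_crit * segeomean)
    ul = sas_round(geomean + t_crit * segeomean)
    return {"df": df, "ll": ll, "ul": ul, "ciwidth": ul - ll}


# Hypothetical inputs, for illustration only (not tutorial results):
# 15 strata and 30 PSUs give df = 15; the 0.975 t quantile for
# 15 degrees of freedom is approximately 2.131.
result = geomean_ci(geomean=120.0, segeomean=4.0,
                    atlev1=15, atlev2=30, t_crit=2.131)
print(result)
```

As in the SAS code, the width is taken from the rounded limits, so it can differ slightly from twice the unrounded margin.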

- If you used the *proc univariate* procedure on the fasting serum triglycerides and compared the mean and median values, you would see that the difference is substantial as triglyceride is a highly skewed variable. Therefore, you should use geometric means.
- The geometric mean for males increases up to age 40-49 years and then declines.
- The geometric mean for females increases up to age 60-69 years and then declines.
- The confidence interval (CI) is wider for males than for females, and is widest for males 40-49 years, indicating more variability in the mean serum triglycerides in this group.
- Confidence intervals can also be used for a first glance at whether two groups differ. For example, the CIs for mean serum triglycerides for total males (CI 124, 137) and total females (CI 111, 118) do not overlap, indicating that the two groups are likely to be different. However, a test for statistical difference, such as a t-test, should be performed to definitively determine whether the means for two population subgroups differ significantly.
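
The first-glance overlap check described in the last point can be sketched as follows. The `intervals_overlap` helper is hypothetical, and the limits are the ones quoted above for total males and total females. It is a screening step only and does not replace a formal test.

```python
def intervals_overlap(a, b):
    """Return True if closed intervals a=(lo, hi) and b=(lo, hi) share any point."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return a_lo <= b_hi and b_lo <= a_hi


males = (124, 137)    # CI quoted in the text for total males
females = (111, 118)  # CI quoted in the text for total females

# False here means the intervals do not overlap, which suggests
# (but does not prove) a difference between the two groups.
print(intervals_overlap(males, females))
```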