Task 4: How to Estimate Distributions of Usual Intake for a Single Episodically-consumed Dietary Constituent using the NCI Method

In this example, the distribution of the percent of usual energy intake from saturated fat in men will be estimated.

This example uses the demoadv dataset (download at Sample Code and Datasets).   The variables w0304_0 to w0304_16 are the weights (dietary weights and Balanced Repeated Replication [BRR] weights) used in the analysis of 2003-2004 dietary data that requires the use of BRR to calculate standard errors. The model is run 17 times, including 16 runs using BRR (see Module 18, task 4 for more information).  BRR uses weights w0304_1 to w0304_16.  (Note: if 4 years of NHANES data are used, 32 BRR runs are required. Additional weights are found in the demoadv dataset.)

 

Info iconIMPORTANT NOTE

Note: If 4 years of NHANES data are used, 32 BRR runs are required. Additional weights are found in the demoadv dataset.

uld not be expected to vary by more than 1%.

 

A SAS macro is a useful technique for rerunning a block of code when you want to change only a few variables; the macros rununi, runbivar, and rundist are created and called in this example. The rununi macro calls the NLMixed_Univariate macro and the BoxCox_Survey macro, and fits a univariate model.  The runbivar macro calls the NLMixed_Bivariate macro and fits a bivariate model.  Finally, the rundist macro calls the Distrib_Bivariate and Percentiles_Survey macros and estimates the distribution of the ratio of the ubiquitously-consumed dietary constituent to energy.

The NLMixed_Univariate, BoxCox_Survey, NLMixed_Bivariate, Distrib_Bivariate and Percentiles_Survey macros used in this example were downloaded from the NCI website.  Version 1.0 of the macros was used.  Check this website for macro updates before starting any analysis.  Additional details regarding the macros and additional examples may also be found on the website.

 

 

Step 1: Create a dataset so that each row corresponds to a single person day

Statements Explanation

data adultm;
format agegrp ageg. ;
 
set nh.demoadv;
  if riagendr= 1 and ridageyr>= 19 ;  
  
IF RIDAGEYR > = 19 AND RIDAGEYR  <= 30 THEN AGEGRP= 5
  
IF RIDAGEYR >= 31 AND RIDAGEYR <= 50 THEN AGEGRP= 6
  
IF RIDAGEYR >= 51 AND RIDAGEYR <= 70 THEN AGEGRP= 7
  
IF RIDAGEYR > 70    THEN AGEGRP= 8
array a (*) age_19to30 age_31to50 age_51to70 age_71plus;
do i = 1 to dim(a); 
  a(i) =
0
end ;
  a(agegrp-
4 ) = 1 ;
if w0304_0 ne . ;  
drop i;
run ;

 

The demoadv dataset is used.  Men ages 19 years and older are selected.  The ordinal variable agegrp is created.  Next, dummy variables corresponding to the age groups are created (age_19to30 age_31to50 age_51to70 age_71plus).  This type of dummy coding must be used with the MIXTRAN macro because there is no class statement in the NLMIXED procedure.

data day1;
set adultm;
DAY_WK=DR1DAY;
DRTSFAT=DR1TSFAT;
DRTKCAL=DR1TKCAL;
day= 1 ;
run ;

 

data day2;
set adultm;
DAY_WK=DR2DAY;
DRTSFAT=DR2TSFAT;
DRTKCAL=DR2TKCAL;
day= 2 ;
run ;

The variables DR1TSFAT and DR2TSFAT are NHANES variables representing total saturated fat (g) from all foods and beverages reported on the 24-hour recalls on days 1 and 2 respectively.  To create a dataset with 2 records per person, the adultm dataset is set 2 times to create 2 datasets, one where day=1 and one where day=2.  The same variable name, DRTSFAT, is used for saturated fat on both days.  It is created by setting it equal to DR1TSFAT for day 1 and DR2TSFAT for day 2.  The variables DR1TKCAL and DR2TKCAL are NHANES variables representing total energy consumed on days 1 and 2 respectively from all foods and beverages.

data sfat;
set day1 day2;
if DAY_WK in ( 1 , 6 , 7 ) then weekend= 1 ;
  else if DAY_WK in ( 2 , 3 , 4 , 5 ) then weekend= 0 ;
run ;

Finally, these data sets are appended, and dummy variables are created for weekend days. 

 

 

Step 2: Sort the dataset by respondent and day

It is important to sort the dataset by respondent and intake day (day 1 and 2) because the NLMIXED procedure uses this information to estimate the model parameters.

 

proc sort ; by seqn day; run ;

 

 

Step 3: Create a macro to fit a univariate statistical model for a ubiquitously-consumed dietary constituent

One SAS macro is created in Step 3, rununi. The rununi macro calls the NLMixed_Univariate and Boxcox_Survey macros to obtain preliminary estimates for the bivariate model.  After creating this macro and running it 1 time, it may be called several times, each time changing the macro variables.

Multiple or no covariates may be used in the modeling.  This example uses age group and adjusts the model for weekend and sequence effects.

 

Statements Explanation

%macro rununi (data, foodvar, modeltype, nloptions, outlib, run);

 

The start of the rununi macro is defined.  The variables in parentheses are the macro variables that will be used in the macro.

data nhanes (keep=id age race repeat dayofweek weekend agegrp age_19to30 age_31to50 age_51to70 age_71plus

      w0304_&run recall_food);

 set &data;
  id        = seqn;
  age       = ridageyr;
  race      = ridreth1;
  repeat    = day;
  dayofweek = DAY_WK;
  weekend   = weekend;
  recall_food = &foodvar;
  run;

The dataset (&data) is set, and the variables are renamed.  Note that day of week and weekend are used in this example.

proc sort data=nhanes;  by id repeat;  run;

The data are sorted by id and repeat.

data nhanes;
  set nhanes;
  if (repeat = 2 ) then repeat2 = 1 ;
  else repeat2 = 0 ;
run;

A dummy variable called repeat2 is created.  This variable is equal to 0 for day 1, so that the parameter only needs to be added to the model for day 2.

proc means data=nhanes noprint;
  where (recall_food >
0 );
  var recall_food;
  output out=min_a(keep=min_a) min=min_a;
  run;

The minimum amount consumed on a consumption day is calculated and saved in a dataset called min_a.

data nhanes;
  merge nhanes min_a;
  modeltype = upcase( "&modeltype" );
  if (modeltype = "ONEPART" & recall_food = 0 ) then recall_food = min_a / 2 ;
  run;

The minimum amount is added to the dataset.  For one-part models (amount models), zero values are set to ½ of the mimimum amount.

data nhanes_boxcox;
set nhanes;
  by id;
if (first.id);
if (modeltype ^= "ONEPART" & recall_food = 0 ) then delete;
run;

A dataset is created to be used to find the best Box-Cox transformation to normality.  For this dataset, zero values are deleted.

%BoxCox_Survey (data =nhanes_boxcox,    subject = id,                
   var     = recall_food,

   weight  = w0304_&run,
         
   print   = Y,
       ntitle  = 2 );

The macro BoxCox_Survey is called to find the best Box-Cox transformation.

data _null_;
set _lambda;
call symput( "lambda" ,   trim(left(put(lambda_recall_food, 4.2 )))); run;

The dataset _lambda from the BoxCox_Survey macro is set; the macro variable lambda is created.

proc sort data=nhanes; by id repeat; run;

title2 "Males 19+" ;

% NLMixed _Univariate (data    = nhanes,

       lambda      = &lambda,

       subject     = id,

       repeat      = repeat,

       response    = recall_food,

       modeltype   = &modeltype,

       covars_prob = repeat2 weekend age_19to30 age_31to50 age_51to70,

       covars_amt  = repeat2 weekend age_19to30 age_31to50 age_51to70,

       replicate_var= w0304_&run,

       nloptions  = &nloptions,

       print      = Y,

       ntitle     = 3 );

After sorting the data to make sure they are in the correct order, the macro NLMixed_Univariate is called to fit the model.

The Box-Cox transformation parameter (lambda) is set equal to the value selected by macro BoxCox_Survey through macro variable &.

data & outlib..p arms_u_&foodvar._&run;
set parms_u;
run;

The dataset outlib.parms_u_&foodvar._&run is created to save the parameters from the model.

data & outlib..p red_u_&foodvar._&run;
set pred_x_u;
run;

The dataset outlib.pred_u_&foodvar._&run is created to save the predicted values from the model.

proc datasets nolist;  delete nhanes_boxcox min_a parms_u pred_x_u nhanes;  run;  quit;

Unneeded datasets are deleted.

%mend rununi;

The end of the rununi macro.

 

Step 4: Create a macro to fit a bivariate statistical model for two ubiquitously-consumed dietary constituents

One SAS macro is created in Step 4, runbivar. The runbivar macro calls the NLMixed_Bivariate macro to fit a bivariate model of two ubiquitously-consumed dietary constituents. After creating this macro and running it 1 time, it may be called several times, each time changing the macro variables.

Multiple or no covariates may be used in the modeling; in this example, age group is used, and the model is adjusted for weekend and sequence effects.

 

Statements Explanation

%macro runbivar (data, foodvar1, foodvar2, modeltype, nloptions, outlib, run);

The start of the runbivar macro is defined. The variables in parentheses are the macro variables that will be used in the macro.

data nhanes (keep=id age race repeat dayofweek weekend agegrp age_19to30 age_31to50 age_51to70 age_71plus
w0304_&run recall_food1 recall_food2);
  set &data;
  id        = seqn;
  age       = ridageyr;
  race      = ridreth1;
  repeat    = day;
  dayofweek = DAY_WK;
  weekend   = weekend;
  recall_food1 = &foodvar1;
  recall_food2 = &foodvar2;
  run;

The dataset (&data) is set, and the variables are renamed.  Note that age, race, repeat, day of week and weekend are used in this example.

proc sort data=nhanes;  by id repeat;  run;

The data are sorted by the variables id and repeat.

data nhanes;
  set nhanes;
  if (repeat = 2 ) then repeat2 = 1 ;
  else repeat2 = 0 ;
run;

A dummy variable called repeat2 is created.  This variable is equal to 0 for day 1, so that the parameter only needs to be added to the model for day 2.

proc means data=nhanes noprint;
  where (recall_food1 > 0 );
  var recall_food1;
  output out=min_a1(keep=min_a1) min=min_a1;
  run;

proc means data=nhanes noprint;
  where (recall_food2 > 0 );
  var recall_food2;
  output out=min_a2(keep= min_a2) min=min_a2;
  run;

data min_a;
  merge min_a1 min_a2;
  run;

The minimum amount consumed on a consumption day is calculated for food 1 and saved in a dataset called min_a1.
The minimum amount consumed on a consumption day is calculated for food 2 and saved in a dataset called min_a2.

The datasets min_a1 and min_a2 are merged.

data nhanes;
merge nhanes min_a;
modeltype = upcase( "&modeltype" );
if (modeltype = "ONEPART" & recall_food1 = 0 ) then recall_food1 = min_a1 / 2 ;
if (recall_food2 = 0 ) then recall_food2 = min_a2 / 2 ;
run;

The minimum amount is added to the dataset.  For one-part models (amount models), zero values are set to ½ of the mimimum amount.

data init_parms_f1;
set & outlib..p arms_u_&foodvar1._&run;
rename a_intercept        = A1_intercept
         a_repeat2          = A1_repeat2
         a_weekend          = A1_weekend
         a_age_19to30       = A1_age_19to30
         a_age_31to50       = A1_age_31to50
         a_age_51to70       = A1_age_51to70
         a_logsde           = A1_LogSDe
         a_lambda           = A1_Lambda
         logsdu2            = LogSDu2;
run;

The dataset from the univariate model for food 1 is set, and the parameters are renamed for the bivariate model.

data init_parms_f2;

set & outlib..p arms_u_&foodvar2._&run;
rename a_intercept        = A2_intercept
         a_repeat2          = A2_repeat2
         a_weekend          = A2_weekend
         a_age_19to30       = A2_age_19to30
         a_age_31to50       = A2_age_31to50
         a_age_51to70       = A2_age_51to70
         a_logsde           = A2_LogSDe
         a_lambda           = A2_Lambda
         logsdu2            = LogSDu3;
run;

The dataset from the univariate model for food 2 is set, and the parameters are renamed for the bivariate model.

data init_parms;
merge init_parms_f1 init_parms_f2;
keep A1_intercept A1_repeat2 A1_weekend A1_age_19to30 A1_age_31to50 A1_age_51to70 A2_intercept A2_repeat2 A2_weekend A2_age_19to30 A2_age_31to50 A2_age_51to70 A1_LogSDe A2_LogSDe LogSDu2 LogSDu3;
run;

Initial estimates for the two food or nutrients are combined.

data lambdas (keep=a1_lambda a2_lambda);
  merge init_parms_f1 init_parms_f2;
  call symput( "lambda1" ,trim(left(put(a1_lambda, 4.2 ))));
  call symput( "lambda2" ,trim(left(put(a2_lambda, 4.2 ))));
run;

Macro variables lambda1 and lambda2 are created.  Note that these macro variables are the same Box-Cox parameters that were selected by macro BoxCox_Survey inside macro rununi (see step 3).

proc print data=lambdas;
  var a1_lambda a2_lambda;
  title3 "Box-Cox Transformation Parameters" ;
  run;

Print the Box-Cox parameters.

title2;

Title2 is reset.

%NLMixed_Bivariate (data           = nhanes,

       init_parms     = init_parms,

       lambda1        = &lambda1,

       lambda2        = &lambda2,

       subject        = id,

       repeat         = repeat,

       response1      = recall_food1,

       response2      = recall_food2,

       modeltype      = &modeltype,

       covars_prob1   = repeat2 weekend age_19to30 age_31to50 age_51to70,

       covars_amt1    = repeat2 weekend age_19to30 age_31to50 age_51to70,

       covars_amt2    = repeat2 weekend age_19to30 age_31to50 age_51to70,

       replicate_var  = w0304_&run,

       nloptions      = &nloptions,

       print          = Y,

        ntitle         = 3 );

The macro NLMixed_Bivariate is called to fit the model.

The Box-Cox transformation parameters (lambda1 and lambda2) are set equal to the values selected by macro BoxCox_Survey via macro variables &lambda1 and &lambda2.

data & outlib..p arms_b_&foodvar1._&foodvar2._&run;
  set parms_b;
  run;

The dataset outlib.parms_b_&foodvar._&run is created to save the parameters from the model.

data & outlib..p red_b_&foodvar1._&foodvar2._&run;
  set pred_x_b;
  run;

The dataset outlib.pred_b_&foodvar._&run is created to save the predicted values from the model.

proc datasets lib=work nolist;
delete nhanes min_a1 min_a2 init_parms_f1 init_parms_f2 init_parms lambdas;
run;
quit;

Unneeded datasets are deleted.

%mend runbivar;

The end of the runbivar macro.

 

Step 5: Create a macro to estimate the distribution of a ratio of two ubiquitously-consumed dietary constituents

One SAS macro is created in Step 5, rundist. The rundist macro calls the Distrib_Bivariate macro and the Percentiles_Survey macro to estimate the distribution of a ratio of two ubiquitously-consumed dietary constituents. After creating this macro and running it 1 time, it may be called several times, each time changing the macro variables.

 

Statements Explanation

%macro rundist(foodvar1, foodvar2, modeltype, seedv, outlib, run);

The start of the rundist macro is defined.  The variables in parentheses are the macro variables that will be used in the macro.

%global seed;

%let seed = &&seedv;    

Set seed for the random number generator.

data parms;

  set & outlib..p arms_b_&foodvar1._&foodvar2._&run;

  run;

data pred;

  set & outlib..p red_b_&foodvar1._&foodvar2._&run;

  run;

proc sort data=pred;  by id repeat;  run;

The parameter estimates and predicted values that were calculated in the runbivar macro are set to the parms and pred datasets, respectively. The dataset pred is sorted by id and repeat.

data pred;

  merge pred parms(keep=a1_repeat2 a2_repeat2 a1_weekend a2_weekend);

  run;

data pred (drop=a1_repeat2 a2_repeat2 a1_weekend a2_weekend);
  set pred;
  by id;
  if (first.id);

  if (repeat2 = 1 ) then do;
    repeat  = 1 ;
    repeat2 = 0 ;
    pred_x_a1 = pred_x_a1 - a1_repeat2;     pred_x_a2 = pred_x_a2 - a2_repeat2;     end;

  if (weekend = 1 ) then do;
    weekend = 0 ;
    pred_x_a1 = pred_x_a1 - a1_weekend;
    pred_x_a2 = pred_x_a2 - a2_weekend;
    end;

  day_wgt = 4 / 7 ;
  output;

  weekend = 1 ;
  pred_x_a1 = pred_x_a1 + a1_weekend;
  pred_x_a2 = pred_x_a2 + a2_weekend;
  day_wgt = 3 / 7 ;
  output;
  run;

Because weekend was used, two records are created per subject, one for weekday, and one for weekend.

SAS assumes repeat2 is a covariate in the model.  The predicted value is calculated for day 1, i.e., when repeat2=0.  SAS also assumes that weekend is a covariate in the model set to 0 if the recall was Monday-Thursday, and 1 if the recall was for Friday-Sunday.

For each pseudo individual, Macro Distrib_Bivariate generates usual weekday and usual weekend intakes of saturated fat (energy) and then calculates usual intake as the weighted average of the usual weekday and usual weekend intakes.   Variable day_wgt specifies the weight for weekdays (4/7) and weekends (3/7).  Note: these can be changed in this macro.

data _null_;

  set pred;

  min_a1 = min_a1 / 2 ;

  call symput( "min_a1" ,trim(left(put(min_a1, best12. ))));

  min_a2 = min_a2 / 2 ;

  call symput( "min_a2" ,trim(left(put(min_a2, best12. ))));

run;

Macro variables min_a1 and min_a2 are created for food 1 and food 2.  These variables equal ½ of the minimum amount.

title2 "Males 19+" ;

%Distrib_Bivariate (param = parms,
      predicted      = pred,
      subject        = id,       modeltype      = &modeltype,       nsim_mc        = 100 ,       day_wgt        = day_wgt,       min_a1         = &min_a1,                min_a2         = &min_a2,             print          = N,             ntitle         = 2 );

The minimum amount is added to the dataset.  For one-part models (amount models), zero values are set to ½ of the mimimum amount.

options notes;

title2;

Notes are turned back on in the log and title2 is reset.

data mcsim;
  set _mcsim;
  t_density =  100 * (( 9 * t1) / t2);
run;

title2 "Table 1: Percentiles by Age" ;

data mcsim2;
  set mcsim;
  Subpopulation = agegrp;
  format Subpopulation ageg. ;
run;

A dataset called mcsim is created from setting the dataset _mcsim from the Distrib_Bivariate macro. The usual nutrient density is defined for saturated fat (9 kcal/g).

A dataset called mcsim2 is created from mcsim, defining the subpopulation variable from the variable agegrp.

%Percentiles_Survey (data = mcsim2,
     byvar  = subpopulation,
     var    = t_density,
     weight = w0304_&run,
     cutpoints  = 10 12 15 ,
     print      = N,
     ntitle     = 2 );

The macro Percentiles_Survey is called to calculate the percentiles of the usual nutrient density.  The cutpoints correspond to 10%, 12%, and 15% of calories from saturated fat.

data pctl;
  set _percentiles;
      by subpopulation;
  run;

The dataset _percentiles is created from the macro Percentiles_Survey.

%if &run= 0 %then %do ;

title2 "Estimated Mean and Percentiles" ;

title3 "Men, ages 19+" ;

proc print data=pctl label;

  id subpopulation;

  var Mean Pctile5 Pctile10 Pctile25 Pctile50 Pctile75 Pctile90 Pctile95;

  format Mean Pctile5 Pctile10 Pctile25 Pctile50 Pctile75 Pctile90 Pctile95 7.2 ;

  run;

title2 "Estimated Cut-Point Probabilities" ;

proc print data=pctl label;
  id subpopulation;
  var Prob1-Prob3;
  format Prob1-Prob3
7.2 ;
  label Prob1 =
"Prob(X <= 10)" |         Prob2 = "Prob(X <= 12)"
        Prob3 = "Prob(X <= 15)" ;
  run;

%end ;

Summary tables are printed for the point estimate run.

data & outlib..p ctl_b_&foodvar1._&foodvar2._&run;
  set pctl;
  run=&run;
  run;

The means, percentiles, and cutpoint proportions are saved for each run.

proc datasets nolist;
delete parms pred mcsim _mcsim mcsim2 pctl _percentiles;
run;
quit;

Unneeded datasets are deleted.

%mend rundist;

The end of the rundist macro.

 

 

Step 6: Run the rununi, runbivar, and rundist macros 

Now that the rununi, runbivar, and rundist macros have been created, they need to be called.

 

Statements Explanation

%include "C:\NHANES\Macros\ratio.macros.v1\
NLMixed_univariate.macro.v1.1.sas"
;

%include "C:\NHANES\Macros\ratio.macros.v1\
boxcox_survey.macro.v1.1.sas"
;

%rununi (data=sfat, foodvar=DRTSFAT, modeltype=ONEPART, nloptions= technique=trureg  maxfunc= 10000 maxiter= 200 , outlib=work, run= 0 );

First, the univariate model of saturated fat is run to get starting values.  This requires the NLMixed_Univariate macro and the BoxCox_Survey macro, so these are included.

The foodvar is set equal to DRTSFAT to fit the univariate model for saturated fat.  A one-part model (ONEPART) is specified, as saturated fat is ubiquitously consumed.  Options for NLMIXED are set.  The macro variable run is set equal to 0 to indicate the base run.

%rununi (data=sfat, foodvar=DRTKCAL, modeltype=ONEPART, nloptions= technique=trureg maxfunc= 10000 maxiter= 200 , outlib=work, run= 0 );

The univariate model for energy is run to get starting values.

%include "C:\NHANES\Macros\ratio.macros.v1\
NLMixed_bivariate.macro.v1.1.sas"
;

%runbivar (data=sfat, foodvar1=DRTSFAT, foodvar2=DRTKCAL, modeltype=ONEPART, nloptions=technique=trureg maxfunc= 10000 maxiter= 200 ,

outlib=work,run= 0 );

The bivariate model of saturated fat and energy is run.  This requires the NLMixed_Bivariate macro so this is included.

%include "C:\NHANES\Macros\ratio.macros.v1\
distrib_bivariate.macro.v1.1.sas"
;

%include "C:\NHANES\Macros\ratio.macros.v1\
percentiles_survey.macro.v1.1.sas"
;

 

%rundist(foodvar1=DRTSFAT, foodvar2=DRTKCAL, modeltype=ONEPART, seedv= 0 , outlib=work, run= 0 );

The bivariate distribution of saturated fat and energy is estimated.  This requires the Distrib_Bivariate macro and the Percentiles_Survey macro, so these are included.

%macro BRR204(run, outlib, foodvar1, foodvar2);

%do run= 1 %to 16 ;

options nonotes;

%put BRR Run &run;

%rununi (data=sfat, foodvar=DRTSFAT, modeltype=ONEPART, nloptions= maxfunc= 10000 maxiter= 200 , outlib=work, run=&run);

%rununi (data=sfat, foodvar=DRTKCAL, modeltype=ONEPART,  nloptions= maxfunc= 10000 maxiter= 200 , outlib=work, run=&run);

%runbivar (data=sfat, foodvar1=DRTSFAT, foodvar2=DRTKCAL, modeltype=ONEPART, nloptions=technique=trureg maxfunc= 10000 maxiter= 200 ,

outlib=work,run=&run);

%rundist(foodvar1=DRTSFAT, foodvar2=DRTKCAL, modeltype=ONEPART, seedv= 0 , outlib=work, run=&run);

Next, a macro called BRR204 is defined to obtain the datasets needed for BRR.  A dataset called brr_runs is created from the output of the Percentiles_Survey macro.

proc append base=brr_runs data=& outlib..p ctl_b_&foodvar1._&foodvar2._&run;

The output from the BRR runs is saved in the brr_runs dataset.

%end ;

The end of the do loop for the BRR runs.

%mend BRR204;

The end of the BRR204 macro.

%BRR204(foodvar1=DRTSFAT, foodvar2=DRTKCAL, outlib=work)

The BRR204 macro is called.

proc print data =brr_runs; run ;

The BRR runs are printed.

 

 

Step 7: Calculate BRR SEs and print final estimates 

 

Statements Explanation

data brr_runs2;

set brr_runs;

rename mean=bmean pctile1-pctile99=bpctile1-bpctile99 prob1-prob3=bprob1-bprob3;

run ;

The dataset brr_runs is set to create a new dataset, brr_runs2, and variables are renamed to indicate they are from BRR runs.

proc sort data =pctl_b_DRTSFAT_DRTKCAL_0; by subpopulation;

proc sort data =brr_runs2; by subpopulation;

The data are sorted before merging.

data distall;
  merge pctl_b_DRTSFAT_DRTKCAL_0     brr_runs2; by subpopulation;
  array bvar (*) bmean bpctile1-bpctile99 bprob1-bprob3;
  array varo  (*) mean pctile1-pctile99 prob1-prob3;
  array dsqr (*) dbmean dbpctile1-dbpctile99 dbprob1-dbprob3;
  do i= 1 to dim(bvar);
 dsqr[i]=(bvar[i]-varo[i])** 2 ;
  end ;
run ;

The datasets pctl_b_drtsfat_drtkcal_0 (from the rundist macro for the first run) and brr_runs2 are merged, and the squared difference between the BRR estimate and the parameter from the first run is created.

proc means data =distall sum noprint by subpopulation;
  var   dbmean dbpctile1-dbpctile99 dbprob1-dbprob3;;
output out =sums sum =sum_dbmean sum_dbpctile1-sum_dbpctile99 sum_dbprob1-sum_dbprob3;
run ;

The sum of squares is computed.

data brr;
  set sums;
  array sumt (*) sum_dbmean sum_dbpctile1-sum_dbpctile99 sum_dbprob1-sum_dbprob3;
  array se  (*) mean pctile1-pctile99 prob1-prob3;
  do j= 1 to dim(sumt);

 se[j]=- 1 *sqrt((sumt[j])/( 16 * .49 ));
  end ;
  keep mean pctile1-pctile99 prob1-prob3 subpopulation;
run ;

The standard errors are computed.  Each SE is multiplied by -1 to make it print out in parentheses in the final step.

data toprint1;
    set pctl_b_DRTSFAT_DRTKCAL_0;
    line= 1 ;
    keep subpopulation mean pctile1-pctile99 prob1-prob3 line;
run ;

To create the final dataset, the point estimates are saved in a file called toprint1. The variable line will identify them as estimates.

data toprint2;
    set brr;
    line= 2 ;
    keep subpopulation mean pctile1-pctile99 prob1-prob3 line;
run ;

The standard errors are saved in a dataset called toprint2.  The variable line will identify them as standard errors.

data nh.m20task4;
set toprint1 toprint2;
run ;

The final dataset is created by appending toprint1 and toprint2.

proc sort data =nh.m20task4;
    by subpopulation line;
run ;

The final dataset is sorted.

proc print data =nh.m20task4 split = ' ' noobs ;

var subpopulation line  mean  pctile5 pctile10 pctile25 pctile50 pctile75 pctile90 pctile95 prob1-prob3;

format line line.   mean  pctile1-pctile99  negparen10.1

prob1-prob3 negparen6.2 ;

title 'Usual Intake of %Saturated Fat' ;

title2 'NHANES 2003-04' ;

run ;

The final dataset is printed.  The format negparen will make the standard errors print in parentheses.

 

 

 

Step 8: Interpret estimates

Info iconIMPORTANT NOTE

Note: Your results may vary slightly, as a random seed is used to estimate the distribution of usual intake. However, they would not be expected to vary by more than 1%.

 

 

 

 

close window icon Close Window to return to module page.