Comparison of performance rating criteria in proficiency testing programs.
Proceedings of the 1999 American Statistical Association, Section on Physical and Engineering Sciences, Minneapolis, Minnesota, June 2-4, 1999. Baltimore, MD: American Statistical Association, 2000 Jun; :179-184
Two major proficiency testing programs in the industrial hygiene field are compared with regard to their power to detect laboratories having a large bias and poor precision. Although the two programs are similar in many respects, such as the analytes tested, the frequency of test rounds, and the number of samples used in each test round, their evaluation procedures and rating criteria are quite different. To expand the range of substances used in proficiency testing and to reduce its cost in each program, cooperation between the two programs will begin in 2000. Cooperation will involve exchange of samples, coordination of logistics, and harmonization of analytical methods, but will permit each program to continue using its own rating criterion. This study provides a detailed power comparison of the two programs in rating laboratories at various bias and precision levels. In the Workplace Analysis Scheme for Proficiency (WASP) program, participating laboratories are classified into three categories: "average", "better than average", and "worse than average". Laboratories in the Proficiency Analytical Testing (PAT) program are rated as "proficient" or "non-proficient". If a "non-proficient" rating in PAT is compared to a "worse than average" classification in WASP, the study shows that WASP is more sensitive in detecting laboratories with poor performance. That is, a laboratory having a large bias or poor precision is less likely to be rated "non-proficient" in PAT than to be classified as "worse than average" in WASP. Although the PAT criterion is simpler, it is not as powerful because PAT converts each quantitative laboratory result to a qualitative value. The effect of this information loss can be measured by the number of samples required to maintain the same power in both programs. The required number of samples for WASP is only 60%-80% of the number of samples used in PAT.
To improve the PAT rating criterion, a modified rating criterion based on z-scores was previously proposed by Schlecht and Song (1997). This study demonstrates that the modified PAT criterion is equivalent to the WASP criterion.
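The information-loss argument in the abstract can be illustrated with a small calculation. The sketch below is not the actual PAT or WASP rating rule, which the abstract does not specify. It assumes standardized results that are normally distributed, a hypothetical qualitative rule (each result is scored only as inside or outside ±2, and a laboratory is flagged when at least 2 of 4 results fall outside), and a hypothetical quantitative rule (a two-sided test on the mean z-score) tuned to the same false-alarm rate. The function names, limits, and sample size are all assumptions chosen for illustration, and only bias, not precision, is examined.

```python
from math import comb, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # reference distribution of a standardized result


def qualitative_power(bias, sigma, n, limit=2.0, min_outliers=2):
    """Chance of flagging a lab when each result is scored only as
    acceptable/unacceptable (hypothetical rule, not the real PAT rule)."""
    result = NormalDist(bias, sigma)
    # Probability that a single standardized result falls outside +/- limit.
    p_out = result.cdf(-limit) + (1.0 - result.cdf(limit))
    # Binomial probability of fewer than min_outliers unacceptable results.
    p_fewer = sum(
        comb(n, k) * p_out**k * (1.0 - p_out) ** (n - k)
        for k in range(min_outliers)
    )
    return 1.0 - p_fewer


def quantitative_power(bias, sigma, n, alpha):
    """Chance of flagging a lab with a two-sided test on the mean z-score
    at false-alarm rate alpha (hypothetical rule, not the real WASP rule)."""
    cutoff = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0) / sqrt(n)
    mean_z = NormalDist(bias, sigma / sqrt(n))
    return mean_z.cdf(-cutoff) + (1.0 - mean_z.cdf(cutoff))


n_samples = 4  # assumed number of samples per round
# False-alarm rate of the qualitative rule for an unbiased lab.
alpha = qualitative_power(0.0, 1.0, n_samples)
# Power of both rules for a lab biased by 2 standard deviations.
power_qual = qualitative_power(2.0, 1.0, n_samples)
power_quant = quantitative_power(2.0, 1.0, n_samples, alpha)

print(f"false-alarm rate: {alpha:.4f}")
print(f"qualitative power: {power_qual:.4f}")
print(f"quantitative power: {power_quant:.4f}")
```

Under these assumptions the rule that keeps the quantitative results detects the biased laboratory more often than the rule that first reduces each result to acceptable or unacceptable, at the same false-alarm rate. The size of the gap depends entirely on the assumed limits and sample size, so it should not be read as reproducing the 60%-80% figure reported in the study.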
Performance-capability; Testing-equipment; Statistical-analysis; Industrial-hygiene; Industrial-hygiene-programs; Industrial-equipment; Workplace-monitoring; Workplace-studies; Work-analysis; Sampling-equipment; Sampling
National Institute for Occupational Safety and Health, 4676 Columbia Parkway, Cincinnati, Ohio, 45226
American Statistical Association