NIOSHTIC-2 Publications Search
Data profiling using base SAS software: a quick approach to understanding your data.
Proceedings of the 31st Annual SAS Users Group International Conference, March 26-29, 2006, San Francisco, California. Cary, NC: SAS Institute Inc, Paper 161-31, 2006 Mar; :1-5
"Data Profiling is the use of analytical techniques about data for the purpose of developing a thorough knowledge of its content, structure and quality" (<a href="http://www.bitpipe.com"target="_blank">http://www.bitpipe.com</a>). While this terminology is most often associated with Data Warehousing and high-level business intelligence efforts, these techniques are valuable tools for the everyday data manager or data analyst. SAS Version 9 software offers various avenues for performing data profiling such as, SAS/ETL and SAS Data Quality Solution. These tools however, may not be available for some SAS users, may require additional training, and may be overkill if an understanding of the content of a file is all that is needed; that is, no data cleansing or other transformations are required. This paper discusses an application using only base SAS software which provides basic statistics, frequencies, ranges, outlier, and structural information for each variable in a table. The result of the application is a condensed report detailing the information about the content of a data file. The application was written using the Windows environment and can be run from the SAS Display Manager. For those who have SAS/IntrNet software, a front end is also available to provide a user friendly interface. Current enhancements under development include running the application from SAS Enterprise Guide as a stored process.
Computer-software; Analytical-processes; Analytical-methods; Information-systems; Information-processing; Surveillance-programs
Susan J. Nowlin, IT Specialist, National Institute for Occupational Safety and Health, 4676 Columbia Parkway, MS-R4, Cincinnati, OH 45226
Proceedings of the 31st Annual SAS Users Group International Conference, March 26-29, 2006, San Francisco, CA