NIOSHTIC-2 Publications Search

Data profiling using base SAS software: a quick approach to understanding your data.

Authors
Nowlin SJ
Source
Proceedings of the 31st Annual SAS Users Group International Conference, March 26-29, 2006, San Francisco, California. Cary, NC: SAS Institute Inc, Paper 161-31, 2006 Mar; :1-5
NIOSHTIC No.
20030855
Abstract
"Data Profiling is the use of analytical techniques about data for the purpose of developing a thorough knowledge of its content, structure and quality" (<a href="http://www.bitpipe.com"target="_blank">http://www.bitpipe.com</a>). While this terminology is most often associated with Data Warehousing and high-level business intelligence efforts, these techniques are valuable tools for the everyday data manager or data analyst. SAS Version 9 software offers various avenues for performing data profiling such as, SAS/ETL and SAS Data Quality Solution. These tools however, may not be available for some SAS users, may require additional training, and may be overkill if an understanding of the content of a file is all that is needed; that is, no data cleansing or other transformations are required. This paper discusses an application using only base SAS software which provides basic statistics, frequencies, ranges, outlier, and structural information for each variable in a table. The result of the application is a condensed report detailing the information about the content of a data file. The application was written using the Windows environment and can be run from the SAS Display Manager. For those who have SAS/IntrNet software, a front end is also available to provide a user friendly interface. Current enhancements under development include running the application from SAS Enterprise Guide as a stored process.
Keywords
Computer-software; Analytical-processes; Analytical-methods; Information-systems; Information-processing; Surveillance-programs
Contact
Susan J. Nowlin, IT Specialist, National Institute for Occupational Safety and Health, 4676 Columbia Parkway, MS-R4, Cincinnati, OH 45226
Publication Date
20060326
Document Type
Conference/Symposia Proceedings
Email Address
snowlin@cdc.gov
Fiscal Year
2006
NIOSH Division
DSHEFS
Source Name
Proceedings of the 31st Annual SAS Users Group International Conference, March 26-29, 2006, San Francisco, CA
State
OH
Content source: National Institute for Occupational Safety and Health Education and Information Division