Data profiling using base SAS software: a quick approach to understanding your data.
Authors
Nowlin SJ
Source
Proceedings of the 31st Annual SAS Users Group International Conference, March 26-29, 2006, San Francisco, California. Cary, NC: SAS Institute Inc, Paper 161-31, 2006 Mar; :1-5
"Data Profiling is the use of analytical techniques about data for the purpose of developing a thorough knowledge of its content, structure and quality" (http://www.bitpipe.com). While this terminology is most often associated with data warehousing and high-level business intelligence efforts, these techniques are valuable tools for the everyday data manager or data analyst. SAS Version 9 software offers various avenues for performing data profiling, such as SAS/ETL and SAS Data Quality Solution. These tools, however, may not be available to some SAS users, may require additional training, and may be overkill if an understanding of the content of a file is all that is needed; that is, when no data cleansing or other transformations are required. This paper discusses an application, written using only base SAS software, that provides basic statistics, frequencies, ranges, outliers, and structural information for each variable in a table. The result of the application is a condensed report detailing the content of a data file. The application was written in the Windows environment and can be run from the SAS Display Manager. For those who have SAS/IntrNet software, a front end is also available to provide a user-friendly interface. Enhancements currently under development include running the application from SAS Enterprise Guide as a stored process.
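The kind of per-variable profile the abstract describes (type, missing and distinct counts, frequencies, ranges, outliers) can be illustrated with a minimal sketch. This is not the paper's application: it is written in Python rather than base SAS, the function names `profile_column` and `profile_table` are hypothetical, and the 1.5 × IQR outlier rule is an assumption, since the abstract does not say how outliers are identified.

```python
import statistics
from collections import Counter


def profile_column(values):
    """Summarize one column: counts, frequencies, range, and simple outlier flags."""
    # Treat None and the empty string as missing (an assumption for this sketch).
    present = [v for v in values if v is not None and v != ""]
    summary = {
        "n": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "top_values": Counter(present).most_common(3),
    }
    # A column is numeric only if every non-missing value is an int or float.
    numeric = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    summary["type"] = "numeric" if numeric else "character"
    if numeric:
        summary["min"] = min(present)
        summary["max"] = max(present)
        summary["mean"] = statistics.fmean(present)
        if len(present) >= 4:
            # Assumed outlier rule: values beyond 1.5 * IQR from the quartiles.
            q1, _, q3 = statistics.quantiles(present, n=4)
            fence = 1.5 * (q3 - q1)
            summary["outliers"] = sorted(
                v for v in present if v < q1 - fence or v > q3 + fence
            )
        else:
            summary["outliers"] = []
    elif present:
        # Structural information for character data: observed value lengths.
        lengths = [len(str(v)) for v in present]
        summary["min_length"] = min(lengths)
        summary["max_length"] = max(lengths)
    return summary


def profile_table(rows):
    """Profile every column of a table given as a list of dicts."""
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    return {
        name: profile_column([row.get(name) for row in rows]) for name in columns
    }
```

Calling `profile_table` on a list of row dictionaries returns one summary dictionary per column, which could then be printed as a condensed report. In base SAS, comparable information is typically gathered with procedures such as PROC CONTENTS, PROC MEANS, and PROC FREQ; whether the paper's application uses those procedures is not stated in the abstract.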