SESUG 2021 Conference Proceedings



Education/Institutional Research

Paper Authors Title Key Takeaways
Paper 41 Peter Zsiga Two-way Survey Analysis for Units and Organization with Means and Percentages Favorable and Unfavorable SAS is extremely helpful in reporting out survey results by question, location, and response.
      PROC SGPANEL allows at least 35 small graphs to appear on one page.
      Dropping non-responses from percentage calculations produces accurate results for individual survey questions.
Paper 65 Kelly Smith Moving beyond Frequency and Percentage to Chi Square, t-Tests, and Correlation Analysis. Successful inferential analysis requires checking data suitability against test assumptions by such steps as addressing missing values and checking variable distributions. Once data suitability has been established, inferential analysis provides actionable information and results.
Paper 81 Maham Khan, James Farley, Zhong Zheng and Glendalis Gonzales Understanding the Effects of Campus Safety on College Student Retention and Completion: A Panel Data Analysis * This paper explores the relationship between campus safety and student outcomes pertaining to retention and completion rates. One hundred and thirty public and private Research Tier 1 (R1) universities were used in this analysis. The analysis does not show a strong association between campus security and student retention or completion rates; however, socio-economic and admissions-related factors do present themselves as affecting student success.



Healthcare/Pharmaceuticals

Paper Authors Title Key Takeaways
Paper 68 Tamar Roomian Using PROC SQL to restructure data for common healthcare applications PROC SQL can be used to manipulate electronic medical record data for common healthcare analysis needs.
      The GROUP BY statement with aggregate functions within PROC SQL can be used to transpose data from “long” to “wide”
      PROC SQL in conjunction with the INTNX function can be used to join data with unequal date conditions.
Paper 71 Hong Zhang and Huei-Ling Chen Page Margin Checking Macro for RTF files The audience should be able to learn the technique utilized in our macro and the key RTF syntax related to page orientation and page margins.
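
The Paper 68 takeaway about transposing "long" data to "wide" with GROUP BY and aggregate functions can be sketched as follows. This example runs the SQL through Python's built-in `sqlite3` module rather than PROC SQL, so the dialect is SQLite's. The `labs` table, its columns, and the values are invented for illustration.

```python
import sqlite3

# In-memory database with a made-up "long" table: one row per patient per test.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE labs (patient_id INTEGER, test TEXT, value REAL)")
conn.executemany(
    "INSERT INTO labs VALUES (?, ?, ?)",
    [
        (1, "A1C", 6.1), (1, "LDL", 110.0),
        (2, "A1C", 7.4), (2, "LDL", 95.0),
        (3, "A1C", 5.6),  # patient 3 has no LDL result
    ],
)

# One output row per patient: each CASE expression keeps only the values of
# one test, and MAX collapses the group down to that single value.
wide = conn.execute(
    """
    SELECT patient_id,
           MAX(CASE WHEN test = 'A1C' THEN value END) AS a1c,
           MAX(CASE WHEN test = 'LDL' THEN value END) AS ldl
    FROM labs
    GROUP BY patient_id
    ORDER BY patient_id
    """
).fetchall()

for row in wide:
    print(row)
```

A test that a patient never had comes back as a missing value (`None` here) in the wide row.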



Know Your SAS: Advanced

Paper Authors Title Key Takeaways
Paper 33 Kirk Paul Lafler Under the Hood: The Mechanics of SQL Query Optimization Techniques The SELECT statement’s purpose is to retrieve (or read) data from one or more underlying tables (or views). The clauses execute in the following order: 1. FROM 2. INTO 3. ON 4. WHERE 5. GROUP BY 6. HAVING 7. SELECT 8. ORDER BY
      SAS users can control how much information the SAS System writes to the SAS log by specifying the MSGLEVEL= SAS System option in an Options statement.
      The PROC SQL _METHOD option can be specified to analyze a query process or for debugging purposes. The complete list of codes available with the _METHOD option is: SQXCRTA = Create table as Select; SQXSLCT = Select statement or clause; SQXJSL = Step loop join (Cartesian); SQXJM = Merge join operation; SQXJNDX = Index join operation; SQXJHSH = Hash join operation; SQXSORT = Sort operation; SQXSRC = Source rows from table; SQXFIL = Rows filtration; SQXSUMG = Summary stats (aggregates) with GROUP BY clause; SQXSUMN = Summary stats with no GROUP BY clause.
Paper 42 Thomas Billings SAS® Macro to Identify Potentially Obsolete Variables in a File Obsolete variables can be identified by checking for as-of dates when a variable is populated, and identifying the latest populated date and comparing it to a reference date.
Paper 56 Kiran Venna Finding variable names and count of variables with missing values at various positions within an observation using SAS arrays. How SAS arrays help find the count and names of variables with missing values in an observation.
Paper 69 William Smith Utilizing SAS Macros to Deduplicate Your Data Data deduplication in SAS is easy and applicable to users of all skill levels.
Paper 7 Bruce Gilsen Copying Data Between SAS® and JSON Files JavaScript Object Notation (JSON) is an open standard file format and data interchange format used for some of the same purposes as XML.
      Starting in SAS® 9.4, you can copy SAS data sets to JSON files with PROC JSON. Starting in SAS 9.4 TS1M4, you can copy JSON files to SAS data sets with the JSON engine.
      Copying data from SAS to JSON with PROC JSON is relatively straightforward. Copying data from JSON to SAS can be simple or complicated, depending on the data.
Paper 80 Zach Acuff Coarsening Continuous Variables: An Automated Method to Create Categorical Versions of Continuous Variables While Mitigating Loss of Precision This SAS macro is a great and easy-to-use tool for creating a reasonably faithful categorical version of a continuous variable.
Paper 87 Jayanth Iyengar FROM %LET TO %LOCAL; METHODS, USE, AND SCOPE OF MACRO VARIABLES IN SAS PROGRAMMING The scope of a macro variable determines where it can be used.
      The scope of a macro variable depends on the method used to define it and the location of the definition.
Paper 8 Ronald Fehd Using LaTeX document class sugconf to write your paper SAS programs are text files which contain statements that read an external data source; the output can be a .pdf
      LaTeX documents are text files which contain both the LaTeX statements and data --- text --- to be typeset into a pdf.
      This paper shows basic and advanced LaTeX statements --- markup commands --- for SAS authors.
Paper 93 Troy Hughes Yo Mama is Broke Cause Yo Daddy is Missing: Autonomously and Responsibly Responding to Missing or Invalid SAS® Data Sets Through Exception Handling Routines You will learn how to utilize &SYSCC to programmatically detect missing, locked, or otherwise invalid data sets.
      You will learn how to initialize and utilize user-defined global macro variables as return codes that demonstrate program success or decry program failure.
      You will learn how to write modular SAS macro functions that improve the quality of your software.
Paper 99 Josh Horstman Using the Output Delivery System to Create and Customize Excel Workbooks The ODS EXCEL destination allows you to create native Microsoft Excel files directly from SAS.
      The ODS EXCEL statement has an OPTIONS option that supports numerous Excel-specific sub-options which allow you to customize your Excel workbook by changing the sheet names, column widths, and row heights, hiding or freezing rows or columns, and much more.
      You can customize nearly any aspect of the visual appearance of a Microsoft Excel workbook generated using the ODS EXCEL destination by using built-in ODS style templates, creating your own style with PROC TEMPLATE, or applying inline styling within various SAS procedures.
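
The idea behind Paper 56, counting and naming the variables that are missing within a single observation, can be sketched outside SAS as well. The Python below is only an analogy to looping over a SAS array. The function `missing_profile`, the variable names, and the sample rows are hypothetical, and `None` stands in for a SAS missing value.

```python
def missing_profile(row, variables):
    """Return (count, names) for the variables that are missing in one row.

    `row` is a dict representing one observation; `variables` plays the
    role of the array's element list and fixes the order that is checked.
    """
    names = [v for v in variables if row.get(v) is None]
    return len(names), names


# Made-up observations; None marks a missing value.
variables = ["sbp", "dbp", "weight", "height"]
rows = [
    {"id": 1, "sbp": 120, "dbp": None, "weight": None, "height": 170},
    {"id": 2, "sbp": 135, "dbp": 85, "weight": 82, "height": 181},
]

profiles = {r["id"]: missing_profile(r, variables) for r in rows}
for obs_id, (count, names) in profiles.items():
    print(obs_id, count, names)
```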



Know Your SAS: Foundations

Paper Authors Title Key Takeaways
Paper 11 Melvin Alexander Using JMP® and R Integration to Analyze Virtual Chat Messages during a Coronavirus Pandemic This presentation is an example of a growing trend among coding specialists and developers in the digital 4.0 age: the use of Integrated Development Environments (IDEs).
      IDEs provide a single place for SAS tools to interface with other software repositories (e.g., R and Python) and to organize data analysis, reporting, and visualization work.
      SAS/JMP and R Integration gives SAS/JMP users the same capabilities as more advanced text analytics solutions (e.g., sentiment analysis) within the base SAS/JMP Environment.
Paper 12 Stephen Sloan Twenty Ways to Run your SAS programs faster and use less space There are a variety of techniques that can be used to help run a SAS program efficiently.
      While each technique might be small, when combined they can lead to significant savings in time and space.
Paper 26 Aaron Brown A SAS® Toolbox for File and Folder Manipulation: Copy, Rename, Delete, or Zip via Functions or X Commands SAS can create folders via DLCREATEDIR.
      Macros are useful for looping through lists of files or folders, to create/rename/delete them.
      Both SAS functions and X commands can be used to delete files and folders.
Paper 37 Richann Watson and Louise Hadden What Kind of WHICH Do You CHOOSE to be? WHICH and CHOOSE functions are efficient and can replace a series of if-then-else statements.
      WHICH functions return the index number of the first match in an item list.
      CHOOSE functions return the value from the selection list that matches the index specified.
Paper 57 Michael Raithel Using the SAS® HPBIN Procedure to Create Format Value Ranges for Numeric Variables This macro program uses PROC HPBIN to create format ranges based on either quartile or bucket binning.
      This macro can be used to compute numerical boundaries that can be used to define the Start/End values in PROC FORMAT statements.
      The macro provides a scientific, defensible, repeatable method of setting format Start/End values.
Paper 72 David Bosak The reporter package: A Powerful and Easy-to-use Reporting Package for R Creating reports with existing R packages is hard
      Creating reports with the reporter package is easy
      More information can be found at reporter.r-sassy.org
Paper 73 Deb Cassidy Where Where is a Problem There are many ways to write the same logic but not everything works as you might expect.
Paper 75 Thomas Billings Tips for Input/Handling of Dates in Command Files and Production Standard date handling can be made more flexible by checking to determine if an override-date macro variable is populated, and using it when relevant.
      Instead of INTNX and HOLIDAY function checking to determine last workday, an easier approach is an SQL pull of latest date from a table that is populated only for workdays.
      SYSPARM - available via OS SAS® command, macro variable, and function - is a method to set start, stop dates (and other parameters).
Paper 82 Julia Skinner May i? Lessons learned from using nested DO loops in a family card game Any project, no matter how small, provides an opportunity for learning.
      Thorough and accurate requirements gathering is critical to success.
      When using nested DO loops, parameters need to be chosen thoughtfully.
Paper 84 Jonathan Duggins and Jim Blum PROC REPORT: Tips and Customizations for Quickly Creating Customized Reports PROC REPORT is much more flexible than PROC PRINT.
      Usages in PROC REPORT allow for a variety of report layouts and contents.
      Styles allow for customized aesthetics.
Paper 85 Ronald Fehd A Configuration File Companion: testing and using environment variables and options; templates for startup-only options initstmt and termstmt one third of options are startup-only: can only be assigned in configuration files
      testing configuration files is a non-trivial task and requires other startup-only options on command line
      examples of initstmt and termstmt show tricks for %put statement
Paper 96 Kent Phelps and Ronda Phelps Base SAS® & SAS® Enterprise Guide®: Automate Your SAS® World with Dynamic Code The Power To Know through a dynamic FILENAME statement.
      The Power To Transform static code into dynamic code using the SET INDSNAME option and a Macro variable.
      The Power To Execute dynamic and static code automatically using the CALL EXECUTE command.
Paper 98 Josh Horstman Getting Started with Data Step Hash Objects Hash objects are simply data tables stored in memory along with built-in methods for fast and efficient access to that data.
      Use the FIND method of a hash object to retrieve information based on the values of a key variable or combination of variables.
      A hash iterator object can be used in conjunction with a hash object to offer sequential access to the data in the hash object.
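
To make the Paper 37 takeaways concrete, the sketch below mimics the described behavior in Python: a WHICH-style lookup returns the 1-based position of the first match (0 when nothing matches), and a CHOOSE-style lookup returns the item at a given 1-based position. The functions `which` and `choose` are hypothetical stand-ins written for this example, not the SAS functions themselves, and this `choose` handles only positive indexes.

```python
def which(target, *items):
    """Return the 1-based position of the first item equal to target, else 0."""
    for position, item in enumerate(items, start=1):
        if item == target:
            return position
    return 0


def choose(index, *items):
    """Return the item at the given 1-based position."""
    if not 1 <= index <= len(items):
        raise IndexError(f"index {index} is outside 1..{len(items)}")
    return items[index - 1]


def day_label(code):
    """Map a day code to a label without an if/elif chain."""
    position = which(code, "M", "T", "W")
    return choose(position, "Mon", "Tue", "Wed") if position else "Unknown"


print(day_label("W"))  # Wed
print(day_label("X"))  # Unknown
```

Pairing the two lookups replaces a chain of conditional statements with two parallel lists, which is the substitution the first takeaway describes.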



Leadership / Team Building / Career Development

Paper Authors Title Key Takeaways
Paper 100 Josh Horstman So You Want To Be An Independent Consultant: 2021 Edition Working as an independent consultant can be a very fulfilling and rewarding lifestyle, but it's not for everyone. It requires hard work, careful planning, and an ability to motivate oneself.
      Networking is an extremely powerful tool for the independent consultant. It's worth spending time and effort to build and maintain a robust professional network.
      When starting out as an independent consultant, do your homework to decide what type of legal entity will work best for you. Consider consulting an attorney with experience in business formation.
Paper 30 Kirk Paul Lafler Differentiate Yourself Here are my top ways for SAS users to develop and/or improve their skills: 1) Download, use and learn valuable techniques using the “free” SAS OnDemand for Academics software 2) Access and learn valuable techniques from the SAS Technical Support website at support.sas.com 3) Access valuable and useful tips and techniques on SAS Support Communities 4) Download and use the “free” Online SAS documentation 5) Access and learn application-oriented techniques using the published “White” papers on LexJansen.com 6) Learn from the experts by reading user-written books from SAS Press 7) Attend, Volunteer and Present at in-house, local, regional, special-interest and international SAS user groups 8) Collaborate / Network with other SAS users 9) Become certified with SAS Certification exams
      Here's my list of ways to differentiate yourself from the competition: 1) Earn an Advanced Degree and/or Certification 2) Attend, Volunteer and Present at in-house, local, regional, special-interest and international SAS user groups 3) Share Your Knowledge (or Mentor) Colleagues 4) Collaborate with others 5) Read, Study and Learn from the published “white” papers on LexJansen.com 6) Access and Learn One New Technique Each Day from the Content on support.sas.com 7) Become SAS Certified (Base, Advanced, Clinical Trials, Predictive Modeler, Statistical Business Analyst, BI Content Developer, Visual Business Analyst, Data Integration Developer, Data Quality Steward, Platform Administrator) 8) Build Social Networking Connections on LinkedIn, Blogs, etc.
      Here's my list of MOOC courses and content to help SAS users improve their analytical, programming, statistical, and reporting skills: 1) SAS OnDemand for Academics software 2) SAS Technical Support website at support.sas.com 3) Online SAS Documentation in PDF and HTML format on support.sas.com 4) Published “White” Papers on LexJansen.com 5) SAS Press and SAS Talks Webinars 6) SAS YouTube Channel Videos 7) In-house and Local SAS Users Group meetings
Paper 35 Barbara Okerson Asking the Right Questions: Designing Surveys to Produce Valid and Reliable Results Careful attention needs to be placed on question wording for surveys to produce valid results.
Paper 50 Kirk Paul Lafler Exploring the Skills Needed by the Data Scientist The Bureau of Labor Statistics (BLS) (accessed on April 9th, 2021) projects that employment of computer and information research scientists will grow 15 percent from 2019 to 2029, much faster than the average for all occupations.
      The majority of Data Scientists hold an undergraduate degree in a quantitative field such as Mathematics, Statistics, Decision Sciences, Computer Science, Management Information Systems, and Economics to name a few. Many also hold graduate degrees and/or certificates in the field of data science.
      Data Science / Analytics Skills. Technical Skills: Programming including Python, R and SAS; SQL; Statistics; Excel; Hadoop; Machine Learning (ML); Artificial Intelligence (AI); Knowledge with Structured and Unstructured Data; Data Wrangling; Knowledge of Analytical Functions; Data Visualization. Non-Technical Skills: Critical Thinking; Intellectual Curiosity; Business Acumen; Verbal / Written Communication; Ability to Work in a Team.
Paper 63 Brian Varney What Level am I? A Look at Categorizing a Programmer as a Beginner, Intermediate, or Advanced Categorizing a SAS programmer's skill set.
Paper 64 Kelly Smith Successful Communication with Data Phobic Audiences Keep your audience in mind when preparing and delivering data presentations.
      Headline titles and callouts help your audience understand and retain key information from your graphics.
      Follow accessibility guidelines to make your presentations beneficial to diverse audiences.
Paper 79 Kelly Smith Developing Ethical Data Use and Users Data ethics touches every part of a project, from selecting data to reporting results.
      Data misuse doesn't have to be intentional to cause harm. Incidental misuse of data can still harm the innocent.
      Proactive ethics approaches are good for business and for society.
Paper 94 Troy Hughes Badge in Batch with Honeybadger: Generating Conference Badges with Quick Response (QR) Codes Containing Virtual Contact Cards (vCards) for Automatic Smart Phone Contact List Upload You will learn how to generate customized QR codes that contain contact information and which can be scanned by cell phones.
      You will learn how to generate customized conference badges that include attendee name (and other optional personal information), personalized QR code, and conference logo.
      You will learn the benefits of data-driven software design, in that all of this customization is accomplished only through modification of an attendee data set and a CSS file--with no need to modify the underlying code.



Planning and Administration

Paper Authors Title Key Takeaways
Paper 27 Denise Kruse This Week’s Forecast: Read From The Cloud How to collect data read from RDBMS Introduction of SASTRACE for bringing more detail into the SAS log
      Introduction of %MDUEXTR macro for pulling metadata from SAS Management Console user entries
      General guidance on how to approach the challenge of collecting metrics for records read from RDBMS
Paper 34 Kirk Paul Lafler SAS® Performance Tuning Techniques Efficiency and performance strategies can be classified into five areas: CPU Time, Data Storage, Elapsed Time, I/O, and Memory.
      A sampling of CPU techniques are: 1) Use KEEP= or DROP= data set options to retain desired variables. 2) Use WHERE statements, WHERE= data set option, or WHERE clauses to subset SAS datasets. 3) Create and access SAS datasets rather than ASCII or EBCDIC raw data files. 4) Use IF-THEN / ELSE or SELECT-WHEN / OTHERWISE in the DATA step, or a CASE expression in PROC SQL to conditionally process data. 5) Use the DATASETS procedure COPY statement to copy datasets, as opposed to DATA-SET constructs. 6) Use DATA step hash techniques to perform lookups and merges (or joins). 7) Turn off the Macro facility when not needed by specifying the NOMACRO system option. 8) Avoid unnecessary sorting - understand when a sort is needed. 9) Use procedures that support the CLASS statement to take advantage of group processing without sorting. 10) Use the Stored Program Facility for complex DATA steps. 11) CPU time and elapsed time can be reduced with the SASFILE statement.
      A sampling of input/output (I/O) techniques are: 1) Read only data that is needed from external data files. 2) Minimize the number of times a large data set is read by subsetting in a single DATA step. 3) Use KEEP= or DROP= data set options to retain only desired variables. 4) Use a WHERE statement, WHERE data set option or PROC SQL WHERE-clause to subset data. 5) Use data compression for large data sets. 6) Use the SQL procedure to consolidate steps. 7) Store data in SAS data sets, not external files to avoid excessive read processing. 8) Perform data subsets as early as possible to reduce the number of reads. 9) Use indexed data sets to improve access to data subsets. 10) The BUFNO= option can be specified to adjust the number of open page buffers when processing SAS data sets.
Paper 60 Louise Hadden Management of Metadata and Documentation When Your Data Base Structure is Fluid: What to do if Your Data Dictionary has a Varying Number of Variables A data dictionary designed to guide data collection can be used to programmatically drive the creation of essential SAS tools such as variable labels, value labels, format assignment statements and codebooks.
      If the number of instances of a variable (for example, the number of medical tests) in a data dictionary is represented by a wildcard or wildcards, SAS tools can be used to iterate variables for metadata purposes.
      Conditional macro logic and functions such as INDEX, INDEXC and SUBSTR can be used to create valuable metadata products.
Paper 95 Troy Hughes GIS Challenges of Cataloging Catastrophes: Serving up Geowaffles with a Side of Hash Tables to Conquer Big Data Point-in-Polygon Determination and Supplant SAS® PROC GINSIDE PROC GINSIDE is the SAS out-of-the-box solution for point-in-polygon analysis...but it is both slow AND inefficient.
      The GeoWaffle methods, with full code demonstrated in this white paper, run up to 25 times faster (while producing identical results) than PROC GINSIDE.
      GeoWaffles are ideal when three criteria are met: 1) you have BIGGGGGG DATA, 2) you have static shapefiles/maps that change little over time, and 3) you have the need to recurrently process point-in-polygon analysis as part of GIS workflows.



Reporting and Graphics

Paper Authors Title Key Takeaways
Paper 38 Richann Watson and Louise Hadden "Bored"-Room Buster Bingo - Create Bingo Cards Using SAS® ODS Graphics ODS Graphics tools, such as the Statistical Graphics (SG) procedures and Graph Template Language (GTL), can be used for more than just producing series plots, bar charts, or scatter plots.
      Random selection of items can be done using the RAND function, which replaces the older random selection functions, such as RANUNI and UNIFORM.
      Using SAS does not always have to be about work. Sometimes you just want to do something fun, like play bingo.
Paper 59 Louise Hadden Dressing Up your SAS/GRAPH and SG Procedural Output with Templates, Attributes and Annotation SAS/GRAPH and SG Procedures (ODS Graphics) have parallel capabilities. However, SAS/GRAPH is available via special license only, while SG Procedures are available in BASE SAS.
      Both SAS/GRAPH and SG Procedures incorporate the ability to annotate graphics created or imported within SAS, with some overlapping tools (for example, the SAS-provided %CENTROID macro).
      SG Procedures and SAS/GRAPH both allow users to customize attributes in SAS graphics; ODS graphics have standardized the customization process with ATTR statements and ATTRMAPS.
Paper 66 Dennis Beal Tips for Customizing Graphs Using Real Coronavirus Testing Data SAS/GRAPH is a powerful tool for customizing graphs without ODS
Paper 77 Jim Blum and Jonathan Duggins Getting Started with Attribute Maps: Methods for Creating and Storing Custom Style Definitions for Graphs, Charts, and Maps Attribute maps are helpful for setting consistent styles.
      Attribute maps require some planning and effort



Statistics and Data Analysis

Paper Authors Title Key Takeaways
Paper 13 Stephen Sloan and Kevin Gillette Assigning agents to districts under multiple constraints using PROC CLP PROC CLP, a SAS OR product, provides a useful technique for assigning agents to districts when subject to multiple constraints
      PROC CLP provides a feasible solution
Paper 16 Ross Bettinger Missing Value Imputation The reader will understand the theoretical background underlying the analysis and imputation of missing values.
      The reader will learn about fuzzy logic and how fuzzy concepts can be used in clustering to extend "crisp" k-means clustering into "fuzzy" c-means clustering.
      The reader will observe how fuzzy c-means clustering can be used to perform imputation on missing values by analysis of a case study.
Paper 28 Deanna Schreiber-Gregory Back to Basics: Running an Analysis from Data to Refinement in SAS Analytics is a complex study with multiple steps. Skipping a step can be detrimental to your research!
Paper 3 KANNAN DEIVASIGAMANI and Douglas Lunsford Statistical Test Selector for Researchers Plan the study and identify the tests required in a research study.
      Learn and refer to analysis of the test from other research studies.
      Define an expanded scope for further research, adding value for future researchers to pursue.
Paper 51 Austin Brown A Macro to Utilize a Nonparametric Multiple Stream Process Quality Control Chart in SAS How a novel nonparametric control charting scheme can be implemented in SAS
Paper 58 Chun Du Use SAS Enterprise Miner Workstation 15.1 to Do Predictive Analysis for Mobile Strategy Games Industry Machine Learning
      SAS Enterprise Miner
      TEXT CLUSTER & TOPIC ANALYTICS
Paper 67 Tamar Roomian A SAS macro program to calculate the Fragility Index The Fragility Index is a measure to determine the statistical stability of 2x2 contingency tables when the outcome of interest is rare.
      This macro program calculates the Fragility Index and Fragility Quotient for a list of studies.
Paper 6 Jenhao Cheng Converting Remeasurement Data into Percentile Ranks Based on Baseline Data Using PROC SQL: Patient Experience Measures for CMS Value-Based Purchasing Program Learn how to convert data values in one dataset into percentile ranks with another dataset as the reference.
      Learn how to join tables with more advanced SQL queries: full cross join, unequal join, nearest join, self join, union operator, and some built-in functions.
      Gain a basic understanding of the patient experience data used for the CMS Value-Based Purchasing program.
Paper 89 Bruce Lund Screening, Binning, Transforming Predictors for a Generalized Logit Model This paper discusses methods to screen, bin, or transform predictors prior to the model fitting stage for the generalized logit model.
      SAS(R) macros are discussed and used in examples in the screening, binning, and transforming of predictors for a generalized logit model.
      Comparisons and extensions are made to screening, binning, or transforming of predictors for the cumulative logit model.
Paper 90 Jason Brinkley Using PROC SURVEYSELECT to create data files with all pairwise combinations of data PROC SURVEYSELECT for sampling
      Matching of datasets
      Summarizing within data associations
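
For readers unfamiliar with the Fragility Index mentioned under Paper 67, the sketch below works through one commonly used definition. That definition is assumed here rather than taken from the paper: starting from a statistically significant 2x2 table, convert non-events to events one at a time in the group with fewer events, and count how many conversions it takes for the two-sided Fisher exact p-value to reach or exceed alpha. The Fragility Quotient is then taken as the index divided by the total sample size. This is plain Python rather than the paper's SAS macro, and the function names and example counts are hypothetical.

```python
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Add up every table that is no more likely than the observed one.
    p_value = sum(
        p for p in (prob(x) for x in range(low, high + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


def fragility_index(events1, n1, events2, n2, alpha=0.05):
    """Count the event flips needed before the result stops being significant.

    Returns 0 when the starting table is already non-significant.
    """
    # Work on the group with fewer events (assumed convention).
    if events1 > events2:
        events1, n1, events2, n2 = events2, n2, events1, n1
    flips = 0
    while (events1 < n1 and
           fisher_exact_p(events1, n1 - events1, events2, n2 - events2) < alpha):
        events1 += 1
        flips += 1
    return flips


# Made-up study: 0 of 50 events in one arm, 10 of 50 in the other.
fi = fragility_index(0, 50, 10, 50)
fq = fi / (50 + 50)  # Fragility Quotient under the assumed definition
print(fi, fq)
```

A small index means that reclassifying only a handful of patients would overturn the significant finding.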



e-Posters

Paper Authors Title Key Takeaways
Paper 21 Hengwei Liu Some Linux Shell Scripts for SAS® Programmers Set up short shell scripts to handle some routine tasks.
Paper 32 Kirk Paul Lafler Ten Rules for Better Charts, Figures and Visuals In Rougier, Droettboom and Bourne’s published paper, “Ten Simple Rules for Better Figures,” a set of rules to improve figure design and to explain common pitfalls associated with the production of better figures is presented. Rule 1: Know Your Audience; Rule 2: Identify Your Message; Rule 3: Adapt the Figure to the Support Medium; Rule 4: Captions Are Not Optional; Rule 5: Do Not Trust the Defaults; Rule 6: Use Color Effectively; Rule 7: Do Not Mislead the Reader; Rule 8: Avoid “Chartjunk”; Rule 9: Message Trumps Beauty; Rule 10: Get the Right Tool.
      Good graphical design begins with displaying data clearly and accurately. Data (and information) should be conveyed effectively and without ambiguity. Unnecessary information often distracts from the message, therefore it should be excluded.
      Whether the data you work with is large or small, or some size in between, the SGPLOT procedure is a powerful tool for handling many of your graphical needs. You’ll be able to let your visuals do the talking by helping your audience see hidden, or hard to see, things in your data, while avoiding the obvious by surprising and engaging your audience. PROC SGPLOT creates single-cell bar charts, box plots, bubble plots, dot plots, histograms, line plots, scatter plots, and an assortment of other plot types quickly and easily.
Paper 5 Abbas Tavakoli and Navid Tavakoli Using QUANTREG to Examine time and Drug on Histamine HA level in Mice Using Quantile regression
      Application in animal study
      Using SAS Macro
Paper 61 Louise Hadden Looking for the Missing(ness) Piece The ubiquitous PROC FREQ has a number of helpful options, including an option to produce a summary table detailing the number of non-missing values and missing values in a data base.
      PROC FREQ NLEVELS accommodates information from special missing numeric values in SAS.
      The PROC FREQ NLEVELS output data set can be manipulated to traffic light values which may indicate data quality issues, for example a completely missing variable.
Paper 74 Stephen Sloan A Quick Look at Fuzzy Matching Programming Techniques Using SAS® Software Fuzzy Matching is a useful technique when joining two or more files on a character variable that might have spelling differences
      There are multiple techniques for fuzzy matching that are best for different circumstances
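
As a small illustration of the fuzzy-matching idea in Paper 74, the sketch below scores a misspelled name against a list of candidates and keeps the best one if it clears a threshold. It uses Python's standard `difflib.SequenceMatcher` similarity ratio, which is an assumption of this example and not one of the SAS techniques the paper covers. The helper `best_match`, the 0.8 threshold, and the names are likewise made up.

```python
from difflib import SequenceMatcher


def best_match(name, candidates, threshold=0.8):
    """Return (candidate, score) for the closest candidate.

    The candidate is None when even the best score is below the threshold.
    Comparison is case-insensitive.
    """
    scored = [
        (SequenceMatcher(None, name.lower(), c.lower()).ratio(), c)
        for c in candidates
    ]
    score, candidate = max(scored)
    return (candidate, score) if score >= threshold else (None, score)


# Made-up master list and two lookups: one typo, one unrelated string.
master = ["John Smith", "Jane Smyth", "Joan Smithers"]
match, score = best_match("Jonh Smith", master)
no_match, low_score = best_match("Zebra", master)
print(match, round(score, 2))
print(no_match)
```

The threshold is the main tuning decision: too low and unrelated records join, too high and genuine spelling variants are missed, which is one reason different techniques suit different circumstances.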