SESUG 2010 Abstracts

Beyond the Basics


%whatChanged: A Tool for the Well-Behaved Macro
Frank DiIorio
Paper BB-01

The power and usefulness of macros are undeniable. Also above dispute is the ability of a poorly constructed macro to introduce chaos into programs that use it. Many of these undesirable side effects can be prevented by following some simple design and coding practices.

One of these practices, and the focus of this paper, is that the macro should produce only what is described in its documentation. If, for example, it is supposed to create a dataset and a macro variable containing the number of observations in the dataset, then these items should be the only changes to the SAS environment once the macro terminates. There are no built-in SAS tools that compare “snapshots” of resources and settings. Not surprisingly, %whatChanged, the macro described in this paper, performs this task.

This paper very briefly discusses some principles of good macro design. It then describes the design, coding, and use of %whatChanged. While discussion of the macro is useful in and of itself, the other goal of the paper is to give the reader some insight and motivation to develop similar tools.

The paper is appropriate for anyone charged with developing, debugging, or enhancing macros. A basic knowledge of the macro language is assumed.
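
The snapshot idea can be approximated with the dictionary tables. The sketch below is not the author's %whatChanged code (dataset names are illustrative, and it ignores the OFFSET rows that long macro values occupy); it captures the global macro variables before and after a test run and lists any that were added, dropped, or altered:

    /* snapshot global macro variables before the macro under test */
    proc sql;
       create table before as
       select name, value
       from dictionary.macros
       where scope = 'GLOBAL';
    quit;

    /* ... run the macro being tested here ... */

    /* snapshot again and keep any differences */
    proc sql;
       create table changed as
       select coalesce(a.name, b.name) as name
       from before as a
            full join
            (select name, value from dictionary.macros
             where scope = 'GLOBAL') as b
       on a.name = b.name
       where a.name is missing
          or b.name is missing
          or a.value ne b.value;
    quit;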



While You Were Sleeping, SAS Was Hard At Work
Andrea Wainwright-Zimmerman
Paper BB-02

Automating and scheduling SAS code to run overnight has many advantages, but there are also many pitfalls to be aware of. This paper will discuss how to make sure you save the log and receive e-mail messages at critical points, along with other recommendations and considerations on how to make it all work.
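
As a taste of two of those safeguards, the hedged sketch below routes the log to a dated file and mails a checkpoint message; the paths and addresses are invented, and FILENAME EMAIL assumes the EMAILSYS/EMAILHOST options are configured at your site:

    /* route the log to a permanent, dated file */
    proc printto log="C:\jobs\logs\nightly_&sysdate9..log" new;
    run;

    /* e-mail a checkpoint message at a critical point in the job */
    filename notify email to='oncall@example.com'
                          subject="Nightly job checkpoint &sysdate9 &systime";
    data _null_;
       file notify;
       put 'Extract step completed without errors.';
    run;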



Which SASAUTOS Macros Are Available To My SAS Session
Harry Droogendyk
Paper BB-03

The SAS installation process makes a number of SAS-supplied macros available to your SAS session via the SASAUTOS option. In addition, useful macros created by users in your organization may be made available to SAS users via SASAUTOS or by way of the Compiled Stored Macro facility. But… how do I know which macros are available? And, if a macro has been defined in more than one location, which one will be utilized? This paper will explain the different methods of defining and storing macros, discuss how SAS searches for macros, and demonstrate how to specify the appropriate options to allow access to compiled stored macros and define the SASAUTOS search order.

The meaty stuff will highlight parts of %list_sasautos, a useful utility macro written by the author that identifies all of the macros available to your SAS session.
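
Short of running %list_sasautos itself, you can at least display the current settings that govern the search; a minimal sketch:

    /* where will SAS look for autocall macros, and in what order? */
    %put NOTE: SASAUTOS=%sysfunc(getoption(sasautos));
    %put NOTE: %sysfunc(getoption(mautosource));

    /* is a compiled stored macro library active? */
    %put NOTE: %sysfunc(getoption(mstored));
    %put NOTE: SASMSTORE=%sysfunc(getoption(sasmstore));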



QC Your SAS and RDBMS Data Using Dictionary Tables
Harry Droogendyk
Paper BB-04

In the context of our daily occupations we are always examining data. Whether we're testing ETL processes that populate data marts, verifying data pulled for testing, or just becoming acquainted with unfamiliar data, there are some rudimentary things we typically do. Simple analysis of continuous variables (min, max, mean, etc.) and frequency distributions of categorical variables are often used to provide quick insight.

This paper presents a macro that does the QC work for you, driving the process from dictionary tables, whether the data is from SAS datasets or sourced from any of the DB systems accessible by SAS.
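
The dictionary-driven approach might look like this reduced sketch (the library and dataset names are placeholders, and the author's macro is far more general):

    /* pull the numeric column names for one table from the dictionary */
    proc sql noprint;
       select name into :numvars separated by ' '
       from dictionary.columns
       where libname = 'MYLIB'
         and memname = 'MYDATA'
         and type    = 'num';
    quit;

    /* quick profile of every numeric variable found */
    proc means data=mylib.mydata n nmiss min mean max;
       var &numvars;
    run;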



SAS/Data Integration Studio – Creating and Using a Generated Transformation
Jeff Dyson
Paper BB-05

SAS/Data Integration Studio (DI Studio) transformations are packaged tasks that give data architects the tools needed to construct ETL job flows. While DI Studio has built-in transformations that satisfy the majority of an architect’s needs, there are occasions when the user must create a custom routine to meet a non-standard requirement. The DI Studio generated transformation is a perfect solution to fill this void when these non-standard requirements are repeatable.



ExcelXP on Steroids: Adding Custom Options To The ExcelXP Tagset
Mike Molter
Paper BB-06

The multitude of options available with ODS’s ExcelXP tagset has allowed users access to dozens of Excel features when creating spreadsheets from SAS, but not all of them. ExcelXP is a SAS-made tool, but because it is a tagset, users have the ability to modify it. In this paper we’ll discuss strategies for adding simple functionality to ExcelXP. Users of all levels will not only see the brief, intuitive tagset code used to produce the required XML for these specific examples, but will also realize the power over their output that this and other tagsets give them. Those with more experience with XML and tagset coding will learn a little more about the inner workings of ExcelXP as well as general strategies for adding any functionality to any tagset.
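
For readers new to the tagset, the existing options are supplied on the ODS statement itself; a small example (dataset names invented, SALES assumed sorted by REGION):

    ods tagsets.excelxp file='sales.xml'
        options(sheet_interval='bygroup'   /* one worksheet per BY group */
                sheet_label='Region'
                autofilter='all');

    proc print data=sales noobs;
       by region;
    run;

    ods tagsets.excelxp close;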



Mobile Macros - Get Up to Speed Somewhere New Fast
Patricia Hettinger
Paper BB-07

Have you ever been faced with this scenario? It’s your first day on the job and you know absolutely nothing about the data you’re supposed to be working with. The documentation is scanty and people are too busy to give a detailed orientation – if they know enough to do so in the first place. This paper gives some macros the reader can use anywhere and explains some SAS concepts behind them. It also details how you can use SAS Enterprise Guide to create input screens for more flexibility.


SAS and Relational Databases: What You Must Know
Patricia Hettinger
Paper BB-08

It is a rare shop that has data in just SAS format. DB2, Teradata, Oracle, SAP - all of these relational databases have their own idiosyncrasies that may not become apparent until your DBA calls. Just because you can assign a libname to a database and use PROC SQL doesn't mean it will run the same way as it would against a SAS source. This paper will address what you must know before using SAS/ACCESS® for more efficient queries and consistent results.
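
A typical starting point looks like the sketch below (the Oracle connection values are placeholders). Whether the GROUP BY is performed by the database or by SAS after dragging every row across the wire is exactly the kind of behavior the paper examines:

    /* hypothetical connection values */
    libname ora oracle user=scott password=XXXXXXXX path=proddb schema=sales;

    proc sql;
       /* eligible parts of this query can be passed to Oracle */
       create table work.order_counts as
       select region, count(*) as n_orders
       from ora.orders
       group by region;
    quit;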



A Serious Look at Macro Quoting
Ian Whitlock
Paper BB-10

You can make decisions in macro code with %IF and do looping with %DO loops. But there are times when you don't understand why the beast does what it does. Now what? It is time to come to this presentation.

It is time to take a serious look at macro quoting. I have often said that anyone who thinks macro quoting is simple probably doesn't understand the problem - and I have been there. Now I want to explain how simple it can be.

Everything relevant to this paper is in BASE SAS®. Although the examples have been executed on a PC under Windows, the examples are independent of any particular operating system.

You might think this paper is a compendium of all the macro quoting functions. It is not. It is more about the subject of macro quoting and understanding than about the macro quoting functions. On the other hand, a minimum set of quoting functions is discussed.
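
As a taste of the subject, the most common quoting need is masking tokens so the macro processor does not act on them; a minimal example:

    /* %STR masks the semicolons so the %LET gets the whole statement */
    %let step = %str(proc print data=sashelp.class; run;);

    /* %NRSTR additionally masks & and %, so &sysdate stays unresolved */
    %let literal = %nrstr(&sysdate);

    %put NOTE: step=&step;
    %put NOTE: literal=&literal;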



The Path, The Whole Path, And Nothing But the Path, So Help Me Windows
Art Carpenter
Paper BB-11

Have you ever needed to determine a directory path dynamically? Have you needed to know the name of the currently executing program and where it resides? There are a number of tools available to the SAS programmer that can be used to determine path and location information. These tools range from common to obscure and from simple to complex and they utilize SAS Language functions, System Options, SASHELP views, DICTIONARY tables, and OS Environmental Variables.

This presentation will take a survey of a number of these techniques for finding and determining path and location information. This information can be important for all levels of programmers, and you should be aware of these techniques even if you think that you will never need them.
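
Two of the simpler members of that toolbox are sketched below; SAS_EXECFILEPATH applies to the Windows Enhanced Editor, while the SYSIN option applies to batch runs (the survey in the paper may cover different tools):

    /* interactive Windows editor session: path of the open program */
    %let progpath = %sysget(SAS_EXECFILEPATH);

    /* batch session: the program file named on -SYSIN */
    %let batchprog = %sysfunc(getoption(sysin));

    %put NOTE: editor program: &progpath;
    %put NOTE: batch program: &batchprog;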



ODS Layout for RTF - A custom tagset
Richard A. DeVenezia
Paper BB-12

ODS LAYOUT is a preproduction feature in SAS® 9.0 through SAS 9.2. Preproduction means it is not officially supported nor fully developed. Regardless of its status, LAYOUT is a powerful tool for the PRINTER and PDF destinations. It will let you “easily mix graphics, images, text, and tables, and arrange them on a page,” producing eye-popping screen or print copy. Hold it, wait, back up... PRINTER and PDF destinations only? What about that RTF report I wanted to create?

This paper will demonstrate that the production ODS system has the ability to extend the standard RTF tagset to do what ODS LAYOUT does. Details of the Tagset and RTF languages will be discussed.



The DoW-Loop Unrolled
Paul M. Dorfman
Paper BB-13

The DoW-loop is a nested repetitive DATA step programming structure. It is intentionally organized to isolate DO-loop instructions related to a certain break event from actions performed before and after the loop in a logically natural manner, i.e. in full rhythm with the DATA step implied-loop automatics. Readily recognizable in its most well-known form by the DO UNTIL (LAST.ID) construct, which naturally lends itself to control-break processing of key-grouped data, the DoW-loop is much more generic and morphologically diverse. In this paper, we examine the internal logic of the DoW-loop and use the power of example and the DATA step debugger to reveal its aesthetic beauty and pragmatic utility. Flagging each observation in a group based on conditions within the group in a single DATA step, natively aggregating cumulative totals, dynamically splitting a file into a number of files by ID via a hash - the DoW-loop lends itself to all these and other tasks as an ideal logical vehicle, radically simplifying the alignment between stream-of-consciousness and SAS® DATA step code.
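
In its canonical form the DoW-loop computes a per-group aggregate and outputs once per group, with the before- and after-loop actions cleanly separated (SALES is an invented dataset, sorted by ID):

    data totals;
       do until (last.id);            /* within-group logic lives here */
          set sales;
          by id;
          total = sum(total, amount); /* TOTAL resets for each group   */
       end;
       output;                        /* after-loop action: one row/ID */
       keep id total;
    run;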



PROC_CODEBOOK, Automating the Review and Documentation of SAS Files
James Terry
Paper BB-14

If a SAS file has variable names and formats for categorical variables (both highly recommended practices), the PROC_CODEBOOK macro can be called to produce a comprehensive, well-formatted, easy-to-read codebook. In the heading, the codebook provides the codebook title(s), file name, file label, date created, number of observations, number of variables, and file organization. The body of the codebook provides each variable's name, label, type, format, and length, along with the mean, the range of values, and the frequency category, number, and percent. Optional footnotes provide additional documentation. In addition, a codebook warning report is produced for categories that are not used, variables with all missing data, and variables that are constant.

To produce a codebook, the user provides one or two titles for the codebook (title1 and title2), specifies the file organization (%let organization = user defined), and calls the macro; it is that easy.



Coders' Corner


A View Toward Performance
Ed Heaton
Paper CC-01

Often we need to preprocess our data through a DATA step before we submit it to some SAS procedure. When our data are very large -- billions of rows -- this processing of reading the entire dataset, making the modifications to each row, writing the entire dataset back out to disk, and then reading the entire dataset back into the PROC can take a very long time.

The author proposes the use of a DATA view to process the data one row at a time and then to send that row directly to the PROC for processing. With very large datasets the time savings can be on the order of 60%.
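
A minimal sketch of the proposed technique (dataset and variable names invented):

    /* the view describes the preprocessing but stores no rows */
    data trans / view=trans;
       set big.claims;
       cost_per_day = cost / max(days, 1);
    run;

    /* each row is built on demand and flows straight into the PROC; */
    /* the intermediate dataset is never written to disk             */
    proc means data=trans mean sum maxdec=2;
       var cost_per_day;
    run;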


Using SAS to Produce Report-Ready Summaries of Likert-Type Survey Data: PROC TABULATE, Output Delivery System, PROC TEMPLATE
Imelda Go
Paper CC-02

Beginners in SAS can use simple methods, such as the DATA Step, PROC FREQ, and PROC MEANS, to compute the summary statistics for their survey data. However, the form in which default SAS output appears is typically not compatible with the appearance of formal reports. In the worst-case scenario, one can get the required statistics from the SAS output and type them manually into the report. Using Likert data as an example, this paper shows how, with data set manipulation, a programmer can use PROC TABULATE, the Output Delivery System, and PROC TEMPLATE to produce customized report-ready output in rich text format (RTF). It is possible to produce RTF output that can be included in formal reports with little or no revision.
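
The heart of the approach can be sketched as follows (survey variable names are illustrative, and the full paper adds PROC TEMPLATE styling):

    ods rtf file='likert_summary.rtf';

    proc tabulate data=survey;
       class item response;                 /* e.g. RESPONSE = 1..5 */
       table item='Survey item',
             response*(n='Count' colpctn='Percent'*f=5.1);
    run;

    ods rtf close;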


Recoding “ALL THAT APPLY” Variables from Handhelds and Portable Computers
Wafa Handley, Barbara Bibb, Lilia Filippenko, Jay Levinsohn, Donna Medeiros
Paper CC-03

How do you turn “ALL THAT APPLY” (ATA) questions into useful data structures? With the advent of Computer Assisted Interviewing (CAI) as a primary venue for survey data collection, most commercially available CAI software provides developers with tools to interface with the software. The tools enable developers to access the CAI software database to create data structure documents with variable names and attributes to facilitate data processing. With the introduction of handhelds as a mode of data collection, the range of methods and tools to collect data is quite broad. As yet there is no dominant mobile tool such as the Blaise questionnaire development software, so converting handheld database structures or handheld storage conventions for certain types of data for SAS® typically requires new programming.

Prior to the use of handhelds for data collection, application programmers at RTI International adopted three approaches to facilitate data analysis for the ATA survey items. The approaches utilize Blaise as the questionnaire development application used in the CAI software, and the SAS® system software to produce the final data files.

ATA responses from the handhelds were stored as one variable per question compared to individual response storage options available with larger capacity portable devices. The single variables were then “deconstructed” using SAS® Decimal to Binary conversion functions to create analysis variables representing chosen responses.

Our presentation will: (1) demonstrate the need for recoding “All That Apply” variables, (2) describe the different approaches taken in recoding, and (3) review the raison d’être of each approach.
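
The deconstruction step described above can be sketched like this (the variable names and the bit assignment are assumptions):

    /* Q1 holds one bit-coded integer for an 8-choice ALL THAT APPLY item */
    data recoded;
       set handheld;
       length bits $ 8;
       bits = put(q1, binary8.);               /* decimal -> binary text  */
       array pick(8) q1_choice1-q1_choice8;    /* one 0/1 flag per choice */
       do i = 1 to 8;
          /* rightmost (low-order) bit taken as choice 1 */
          pick(i) = input(substr(bits, 9 - i, 1), 1.);
       end;
       drop i bits;
    run;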


Making Sense of Census Data
Robert S. Matthews
Paper CC-04

The United States Census Bureau publishes a vast amount of data on many different facets of the U.S. population. One of the most utilized resources is the data available from the decennial census. This data is summarized and stratified in many different ways and can be used for a myriad of purposes. The task of actually extracting particular pieces of information from the census data can be daunting since there are literally hundreds of files containing thousands of individual variables. Not only is the data itself voluminous, but the documentation is also very extensive. An example of extracting three variables from the 2000 census data is presented in this paper to illustrate the tasks involved in reading, understanding, and using the data.


By Your Command: Executing Windows DLLs from SAS Enterprise Guide
Darryl Putnam
Paper CC-06

Have you ever wanted to use system shell commands (copying and moving files, creating directories, etc.) from SAS Enterprise Guide 4.1? The SAS Enterprise Guide default installation turns off this functionality. You can re-enable the X command using -ALLOWXCMD, but do you really want just anyone to be able to issue commands like FORMAT C: on your SAS server? I think not. Luckily, you can run external DLLs from SAS under Windows by using the SASCBTBL attribute table and the MODULE family of call routines and functions.

This paper will demonstrate how to set up the SASCBTBL table to call a select number of Windows DLLs. A brief description of the MSDN library and how to turn Windows functions into SAS-callable routines will be presented.
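
As a flavor of the setup, the familiar Sleep entry point in the Windows API can be described and called as below; a minimal sketch, not one of the paper's examples:

    /* describe the DLL entry point in the SASCBTBL attribute table */
    filename sascbtbl temp;
    data _null_;
       file sascbtbl;
       put 'routine Sleep minarg=1 maxarg=1 stackpop=called module=kernel32;';
       put 'arg 1 num input byvalue format=pib4.;';
    run;

    /* call the routine: pause for 3000 milliseconds */
    data _null_;
       call module('Sleep', 3000);
    run;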


How to run an error check to stop SAS
Dan Blanchette
Paper CC-07

I wrote a SAS macro named RUNQUIT that allows interactive SAS users to stop SAS from processing the rest of their code without ending the SAS session or losing any data or code. It also works in batch mode. You simply type "%runquit;" instead of "run;" or "quit;" in your code, and SAS will stop whenever an error occurs.

RUNQUIT essentially presses the interrupt button (the icon with the exclamation point "!" in a circle on the SAS toolbar) for you when an error occurs in your code. Choosing to cancel submitted statements in the pop-up window that the interrupt button presents makes SAS stop processing any more of your code but does not end your SAS session.
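
A minimal rendering of the idea, consistent with the description above (the author's distributed macro may differ in detail):

    %macro runquit;
       ; run; quit;                  /* close the step normally          */
       %if &syserr ne 0 %then %do;   /* did the step just run error out? */
          %abort cancel;             /* cancel remaining submitted code  */
       %end;
    %mend runquit;

    data work.example;
       set work.doesnotexist;        /* error: later steps are skipped */
    %runquit;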


%RESTRUCT - SAS® macro with Proc Univariate
Milorad Stojanovic
Paper CC-08

Proc Univariate has a long history as a part of the SAS Base software package. It is very handy and useful. It allows users to quickly analyze raw as well as final data at a glance. It does, however, come with a price – users often have to browse many pages of Proc Univariate report results. Macro %RESTRUCT relieves you of that burden by summarizing the data you need to examine most. The summary results are presented in a very condensed form. Users should specify their source of data, statistics they would like to have and the output file location.


IF and %IF You Don't Understand
Ian Whitlock
Paper CC-09

Many beginning SAS® macro programmers are confused about the difference between the DATA step IF and a %IF in a DATA step generated by a macro, and when to use which. Simple examples are used and discussed so that you may never be confused about this issue again.

This presentation is appropriate to all operating systems and SAS products where code is written.
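
The distinction in one sketch: %IF decides what code is generated, at macro execution time, while IF decides what happens to each observation, at DATA step execution time (dataset names invented):

    %macro subset(level);
       data keep;
          set all;
          /* %IF runs while the step is being generated: exactly one */
          /* of the two IF statements below reaches the compiler     */
          %if &level = HIGH %then %do;
             if score > 90;          /* DATA step IF: tested per row */
          %end;
          %else %do;
             if score > 50;
          %end;
       run;
    %mend subset;

    %subset(HIGH)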


Automation of Data Updates: A Case Study
Carry Croghan
Paper CC-10

In December of 2009, the United States Environmental Protection Agency initiated a year-long study to measure a suite of traffic-related air pollutants adjacent to I-15 in Las Vegas, Nevada. Measurements were collected simultaneously at four sites and stored at five-minute intervals. The data were transmitted daily to a central computer using WinCollect Data Evaluation and Reporting Software (Version 3.3). Once a week, the data were exported into an Excel (2003) spreadsheet, which was then translated to a SAS® dataset. To efficiently update the SAS database on a weekly basis, several different SAS tools were utilized, including PROC IMPORT, the UPDATE statement, %INCLUDE, and batch calls with parameter pass-throughs. Each of these tools will be described in detail, including why it was utilized. By automating the process of updating the SAS database, the time to complete the weekly data extraction task and the probability of human error were greatly reduced.
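
One of the listed tools, a batch call with a parameter pass-through, works roughly like this (the path, file naming, and parameter are invented; DBMS=XLS assumes SAS/ACCESS® Interface to PC Files):

    /* command line: pass the extract date into the batch run */
    /*    sas.exe weekly_update.sas -sysparm "20100607"       */

    /* inside weekly_update.sas: pick the parameter up as &SYSPARM */
    %let rundate = &sysparm;
    proc import datafile="C:\study\exports\vegas_&rundate..xls"
                out=work.newdata dbms=xls replace;
    run;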


Using DICTIONARY Views to Eliminate Tedious Visual Review
Christine Davies
Paper CC-11

The SASHELP dictionary views are a powerful tool that may be overlooked at times. As an example of their utility, they were used to overcome the challenge of inconsistent naming of datasets across years for a project. This is illustrated in the example code, where the dictionary views in the SASHELP library were used to populate macro variables with dataset names, which were then used in conjunction with the FILEEXIST function to perform conditional processing on each dataset based on its date last modified.
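
The pattern can be sketched as follows (the library and naming convention are invented):

    /* collect candidate dataset names from the dictionary view */
    data _null_;
       set sashelp.vtable;
       where libname = 'RAWLIB' and memname like 'VISIT%';
       call symputx(cats('ds', _n_), memname);
       call symputx('nds', _n_);          /* last assignment wins */
    run;

    %put NOTE: found &nds candidate datasets, first is &ds1;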


Application Dispatcher: Some Tweaks and Tricks
Carol Martell
Paper CC-12

In the process of converting a ColdFusion/MySQL dynamic website to one that uses SAS/IntrNet® Application Dispatcher to query data, we developed a few workarounds to cope with miscellaneous obstacles. The paper presents techniques enriching the dictionary tables and tricks for passing and parsing multiple values from a single form field. This paper is for the intermediate to advanced SAS® user having familiarity with Application Dispatcher.
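
For context, when a form sends several values under one field name, Application Dispatcher surfaces them as numbered macro variables, with FIELD0 holding the count; looping over them then reduces to a sketch like this (field name invented):

    /* a multi-select named STATE might arrive as STATE0=3, */
    /* STATE1=NC, STATE2=SC, STATE3=VA                      */
    %macro showvals(field);
       %local i;
       %do i = 1 %to &&&field.0;
          %put NOTE: &field value &i is &&&field.&i;
       %end;
    %mend showvals;

    %showvals(state)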


SAS® Tips for Institutional Researchers to Track Student Outcomes Efficiently
Vijayalakshmi Sampath
Paper CC-13

Analysts and programmers in the Institutional Research (IR) field at higher education institutions often work on research studies where a cohort of students must be tracked across semesters (longitudinal studies) in order to summarize their outcomes, such as retention, course success rate, and graduation. Some studies also involve tracking outcomes and summarizing results for different types of student cohorts for a given semester (cross-sectional studies). Since student information in college databases is constantly changing, researchers use static versions of the data for all official reporting. The static IR files in many colleges are generated after the census date for every semester, resulting in multiple data files. As a result, tracking students across multiple files or tracking multiple cohorts becomes challenging and tedious for multi-year studies. This paper presents innovative SAS programming logic to track and output results of various student outcomes under the multiple-semester and multiple-cohort scenarios. It utilizes macro programming and arrays within and outside of the DATA step in a robust manner to achieve this.


Bars and Lines: A Quick Introduction to PROC GBARLINE
Garland D. Maddox
Paper CC-15

Ever had a need to create a vertical bar chart and then overlay it with one or more plots? PROC GBARLINE provides a convenient and elegant method of doing just that. This paper provides a brief introduction to PROC GBARLINE with an emphasis on using some of the enhancements introduced in SAS® Version 9.2. This is a basic to intermediate presentation.
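
Basic usage is compact; a sketch with invented data names:

    /* bars show monthly revenue; the overlaid line, the forecast */
    proc gbarline data=sales;
       bar  month / sumvar=revenue discrete;
       plot       / sumvar=forecast;
    run;
    quit;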


Which Job Sent *THAT* Error Message - How to Generate a Lookup List From Your Metadata
Robert Janka
Paper CC-16

Imagine you are a newly hired SAS® Platform Administrator and Data Integrator. It is trial by fire … jobs are failing! How would you quickly figure out which nightly jobs send which error messages? How would you match up those error texts to the appropriate jobs and create a lookup list that will help not only you, but your co-workers as well? The author found himself in just such a situation. He will present an example using DATA step metadata functions to automatically generate this lookup list.


Your Friendly Neighborhood Webcrawler: A Guide to Crawling the Web with SAS®
James Cox
Paper CC-17

The World Wide Web has a plethora of information; from stock quotes to movie reviews, market prices to trending topics, almost anything can be found at the click of a button. Many SAS users are interested in analyzing data found on the Web, but how do you get this data into the SAS environment? Various methods are available, such as designing your own Web crawler in SAS DATA step code or utilizing the %TMFILTER macro in SAS® Text Miner. In this paper, we will review the general architecture of a Web crawler. We will discuss the methods of getting Web information into SAS, and examine code from an experimental internal project called SAS Search Pipeline. We will also offer advice on how to easily customize a Web crawler to suit individual needs and how to import specific data into SAS® Enterprise Miner™.
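
The simplest of the ingestion methods, reading a single page through the URL access method, looks like this (the address is illustrative):

    filename src url 'http://www.example.com/index.html';

    /* read the raw HTML one line at a time */
    data rawhtml;
       infile src lrecl=32767 length=len;
       input line $varying32767. len;
    run;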


SAS® Abbreviations Are Your Friends; Use a Template Method to Code!
Elizabeth Ceranowski
Paper CC-18

Often, coders find themselves using the same procedures or sequence of procedures over and over again. It would be very useful to create a "shell" of a procedure or program that can be used to cut down on repetitive typing. (For example, to submit the generic code "proc print; run;"). This quick demonstration will show how to create/edit/use SAS® abbreviations to quickly interject code into the Enhanced Editor window in SAS® for Windows or in the code node in SAS® Enterprise Guide®.

This technique can be easily used by any level of SAS user from beginning to expert!



Foundations and Fundamentals


Take Control: Understanding and Controlling Your Do-Loops
Sarah A. Woodruff, Toby Dunn
Paper FF-01

The term “loop” describes any control structure that causes a set of programming logic to be executed iteratively. Not knowing when and how to use these structures properly causes many to produce SAS® code that is at best an illegible jumble and at worst a virtually useless quagmire. This paper performs an in-depth examination of the underlying concepts involved in DO-loop construct theory, including basic constructs such as DO WHILE and DO UNTIL loops as well as advanced constructs such as the DO-loop of Whitlock (DoW). In addition, the paper describes sound yet simple methods to help determine which loop is needed and rules for how to easily create them.

“Using loops is one of the most complex aspects of programming; knowing how and when to use each kind of a loop is a decisive factor in constructing high-quality software” - Steve McConnell (Code Complete, 2nd Edition).
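
The basic contrast the paper starts from fits in a few lines: a top-tested DO WHILE may execute zero times, while a bottom-tested DO UNTIL always executes at least once.

    data _null_;
       i = 10;
       do while (i < 5);    /* condition tested before each pass */
          i + 1;            /* never executes: 10 is not < 5     */
       end;

       j = 10;
       do until (j >= 5);   /* condition tested after each pass  */
          j + 1;            /* executes once even though j >= 5  */
       end;

       put i= j=;           /* i=10 j=11 */
    run;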


Building the Better Macro: Best Practices for the Design of Reliable, Effective Tools
Frank DiIorio
Paper FF-02

The SAS® macro language has power and flexibility. When badly implemented, however, it demonstrates a chaos-inducing capacity unrivalled by other components of the SAS System. It can generate or supplement code for practically any type of SAS application, and is an essential part of the serious programmer's tool box. Collections of macro applications and utilities can prove invaluable to an organization wanting to routinize work flow and quickly react to new programming challenges. But the language's flexibility is also one of its implementation hazards. The syntax, while sometimes rather baroque, is reasonably straightforward and imposes relatively few spacing, documentation, and similar requirements on the programmer. In the absence of many rules imposed by the language, the result is often awkward and ineffective coding. Some amount of self-imposed structure must be used during the program design process, particularly when writing systems of interconnected applications. This paper presents a collection of macro design guidelines and coding best practices. It is written primarily for programmers who create systems of macro-based applications and utilities, but will also be useful to programmers just starting to become familiar with the language.


The Data Step; Your Key To Successful Data Processing In SAS
Don Kros
Paper FF-03

The DATA step is one of the basic yet most powerful building blocks in SAS programming. Its power allows users to create the data sets that are used in a SAS program's analysis and reporting procedures. Understanding the basic structure, functioning, and components of the DATA step is fundamental to learning how to create your own SAS data sets. In this paper, we will discuss what a SAS DATA step is, why it is needed, how the DATA step works, and what information you can supply in your SAS DATA step to assist with getting the most out of a single data pass.


SAS Formats: Effective and Efficient
Harry Droogendyk
Paper FF-04

SAS formats, whether they be the vanilla variety supplied with the SAS system, or fancy ones you create yourself, will increase your coding and program efficiency. (In)Formats can be used effectively for data conversion, data presentation and data summarization, resulting in efficient, data-driven code that's less work to maintain. Creation and use of user-defined formats, including picture formats, are also included in this paper.
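
A small sample of the user-defined variety: one VALUE format for grouping and one PICTURE format for presentation (datasets and cutoffs invented):

    proc format;
       value agegrp                  /* group without creating new variables */
          low -< 18 = 'Minor'
          18  -< 65 = 'Adult'
          65 - high = 'Senior';
       picture pct                   /* presentation: 0.372 -> 37.2% */
          low - high = '009.9%' (mult=1000);
    run;

    proc freq data=people;           /* summarize by group; data unchanged */
       tables age;
       format age agegrp.;
    run;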


Fun with Functions
Yogini Thakkar
Paper FF-05

How often do we sit back and take the time to improve our SAS programming skills by learning to use something better than what we have routinely used in the past? I invite you to join me as we come out of our comfort zone to examine what is new in SAS 9 and find out how we can use these new tools to improve our code. In this paper we will revisit some commonly used functions, see how they have been improved in SAS 9, and find out how this will help us simplify our code while reducing the chance of errors.

A few things that will be covered in this paper include examining how the FIND function can work better than the INDEX function and how the new COMPRESS function helps us minimize code. We will also look at the new ANY and NOT families of functions.
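
A few of the functions in question, in action:

    data _null_;
       s = 'Total Dose: 150 mg daily';
       pos    = find(s, 'DOSE', 'i');   /* 'i' modifier: ignore case    */
       digits = compress(s, , 'kd');    /* 'k'eep 'd'igits only -> 150  */
       first  = anydigit(s);            /* position of the first digit  */
       put pos= digits= first=;
    run;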


Merging into Hash: Some Practical Examples of Converting MERGE Statements into Hash Objects
Ying Liu
Paper FF-06

Merging data from two or more datasets is a common process as we manipulate our data for reporting and analysis. Prior to SAS® Version 9, the MERGE statement was the most common approach to accomplish this task for a DATA step programmer. Although MERGE is effective and robust, there are some potential downsides; the most common is the cost of sorting. To use a MERGE, the datasets normally must go through multiple sorts: each dataset has to be sorted by the common variables before the merge step occurs, and the result dataset often has to be sorted into its most common access order after the merge. CPU time can increase significantly because of the sorting procedures; in addition, if the datasets are large, extended disk space is needed to accommodate the sorts. With the introduction of an in-memory search method, the SAS hash object, merging with a hash look-up table substantially improves the data management process, not only increasing efficiency but also improving code transparency. This paper will illustrate some techniques used to convert programs using a MERGE statement to programs using a hash object in the match-merge table relationship. Both one-to-one and one-to-many relationships will be covered.
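
In outline, a simple left-table lookup converts like the sketch below (names invented; the paper also covers the one-to-many case):

    data matched;
       if _n_ = 1 then do;
          if 0 then set lookup;               /* define DESCR in the PDV */
          declare hash h(dataset: 'lookup');  /* loaded once, no sorting */
          h.defineKey('id');
          h.defineData('descr');
          h.defineDone();
       end;
       set master;                            /* need not be sorted      */
       if h.find() = 0 then output;           /* keep only matched rows  */
    run;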


The MEANS/SUMMARY Procedure: Getting Started
Art Carpenter
Paper FF-07

The MEANS/SUMMARY procedure is a workhorse for most data analysts. It is used to create tables of summary statistics as well as complex summary data sets. The user has a great many options which can be used to customize what the procedure is to produce. Unfortunately most analysts rely on only a few of the simpler basic ways of setting up the PROC step, never realizing that a number of less commonly used options and statements exist that can greatly simplify the procedure code, the analysis steps, and the resulting output.

This tutorial begins with the basic statements of the MEANS/SUMMARY procedure and follows up with introductions to a number of important and useful options and statements that can provide the analyst with much needed tools. With this practical knowledge, you can greatly enhance the usability of the procedure and then you too will be doing more with MEANS/SUMMARY.
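
For instance, a single unsorted pass can produce a summary dataset for every combination of the classification variables:

    proc means data=sales noprint;
       class region product;          /* no sorting or BY needed        */
       var revenue;
       output out=summ                /* _TYPE_ marks each CLASS combo  */
          sum=total_rev mean=avg_rev;
    run;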


Leave Your Bad Code Behind: 50 Ways to Make Your SAS® Code Execute More Efficiently.
William E. Benjamin Jr
Paper FF-08

This laundry list of tips, gathered from over 25 years of SAS programming experience, will show 50 ways to help SAS programmers make their code run faster and more efficiently. General topics will include doing more than one thing in each DATA step, combining steps to make simple tasks take less code, using macro variables to simplify maintenance, using built in features rather than writing your own code, ways to save disk space, using sorts for more than just sorting data, and ways to make the program code just read better (code that is easier to read is easier to maintain). The list is broken into categories to allow readers to find something they can use faster. This list can work as a primer for making code changes that will improve the systems and processes to which it is applied.


Know What Your Business Client Wants: An Introduction to How Analytics Is Used to Understand Loyalty Programs in the Hospitality Industry
Tracy Li-moshenko
Paper FF-09

One of the data and analytical challenges that industries such as hospitality face is the identification of their unique guests. Without proper identification, it is not quite possible to get a holistic understanding of the guest transaction history, which may render data mining exercises less insightful.

Besides the obvious benefit of building customer loyalty through monetary rewards or more personalized service, a loyalty program has the added benefit of allowing the business to gather more personal information than would normally be collected either at a hotel reception or through its reservation systems. In addition, issuing members a ‘loyalty program card’ and assigning a unique membership number make it possible for the business to identify and track these members’ transactions over time.

This paper introduces readers to the loyalty program in the hospitality industry and shows how an analytics team utilizes the database to assist the loyalty team in better understanding its members.

In particular, given the design of the data warehouse, there exist different ways of identifying transactions for the loyalty program, either at the guest level or at the member level. Knowing what the business client wants, or what kind of business question is to be addressed, is therefore especially important, in that it may warrant the analyst taking a different approach to the relational database. The paper shows how combining a contextual understanding of business questions with fundamental SAS data management techniques is essential for conducting proper analysis.


It’s Five O’Clock Somewhere!!! Handling Dates And Times In SAS
Toby Dunn, Sarah A. Woodruff
Paper FF-10

Dates and times - we use these concepts every day, and they are strewn throughout our SAS® code. So what is it about dates and times that causes us problems? Why do they give so many programmers, both new and seasoned, headaches? More importantly, do they have to be so hard to deal with? We are here to say "No, they don't." In this paper we will take you through the ins and outs of working with SAS date, datetime, and time values. Once armed with a range of concepts and functions, you will come to see just how easy it is to work with date and time values in SAS, whether in open or macro code. Once you have mastered these skills, you too will be singing "It's Five O'Clock Somewhere" and looking to kick back with your favorite beverage.
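
A taste of the toolkit covered: INTCK counts interval boundaries crossed, and INTNX advances a date to a new interval.

    data _null_;
       start  = '15JAN2010'd;
       months = intck('month', start, today());        /* boundaries crossed */
       qtr1st = intnx('qtr', today(), 1, 'beginning'); /* start of next qtr  */
       put months= qtr1st= date9.;
    run;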


Evolve from a Carpenter’s Apprentice to a Master Woodworker: Creating a Plan for Your Reports and Avoiding Common Pitfalls in REPORT Procedure Coding
Allison Booth
Paper FF-11

Do you sometimes get into a routine of using the same techniques to create PROC REPORT code only to end up with reports that no longer suit your needs? You're sure that there's a better way to code in PROC REPORT, but you don't know where to start. If you're frustrated with the time that you spend creating reports that are less than ideal, consider taking the time to create a coding plan before you begin to code.

Although PROC REPORT coding is flexible and can be used with various SAS® applications, there are pitfalls. Just like an inexperienced carpenter’s apprentice, you might code in familiar ways that you know will work, rather than in ways that work optimally with PROC REPORT. Making a coding plan and being aware of the pitfalls will enable you to code PROC REPORT easily, thereby taking you from an apprentice-level carpenter to a master-level woodworker.


SAS® Programmer's Paradise: New Goodies in SAS® Enterprise Guide® 4.3
Stephen Slocum
Paper FF-12

Are you looking for a SAS® development environment that actually helps you write and manage SAS programs? It's here! This paper describes the many new productivity enhancements for programmers in SAS® Enterprise Guide® 4.3. There is a new program editor with syntax completion for hundreds of SAS procedures and statements, support for parentheses matching, built-in function help and disambiguation, program "tidy" functions, and more. There is also a SAS code analyzer that can read your program and turn it into a process flow, or modify it to optimize the program steps to run in parallel on a grid computing environment. The SAS Enterprise Guide project file now supports relative paths, making it easier to manage your projects, programs, and data in standard source management systems. And it has never been easier to turn your SAS program into a stored process so that you can share your expertise with the world.


Point-and-Click Programming Using SAS® Enterprise Guide®
Mira Shapiro, Kirk P. Lafler
Paper FF-13

SAS® Enterprise Guide® empowers organizations to exploit the power of SAS by offering programmers, business analysts, statisticians and end-users powerful built-in wizards to perform a multitude of reporting and analytical tasks, access multi-platform enterprise data sources, deliver data and results to a variety of mediums and outlets, perform important data manipulations without the need to learn complex coding constructs, and support data management and documentation requirements quickly and easily. Attendees learn how to use the graphical user interface (GUI) to access tab-delimited and Excel input files; subset, group, and summarize data; join two or more tables together; flexibly export results to HTML, PDF and Excel; and visually manage projects using flowcharts and diagrams.



Hands-On Workshops


Statistical Analysis - The First Steps
Jennifer Waller
Paper HOW-01

For both statisticians and non-statisticians, knowing what data look like before more rigorous analyses is key to understanding what analyses can and should be performed. After all the data have been cleaned up and statistical analysis begins, it is a good idea to perform some descriptive and basic inferential statistical tests. How to run and look at descriptive statistics using PROC FREQ, PROC MEANS, and PROC UNIVARIATE, and how to plot your data using some of the statistical graphics options in SAS® 9.2, will be presented. Additionally, how to perform and interpret one- and two-sample t-tests using PROC TTEST and chi-square tests using PROC FREQ will be presented.
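
The inferential pieces reduce to a few familiar steps (dataset and variable names invented):

    /* two-sample t-test: compare SCORE between the two GROUP levels */
    proc ttest data=study;
       class group;
       var score;
    run;

    /* chi-square test of association between two categorical variables */
    proc freq data=study;
       tables group*outcome / chisq;
    run;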


Macro Quoting - How and Why
Ian Whitlock
Paper HOW-02

Macro quoting requires precise understanding. Sometimes it is correct to quote and sometimes it is wrong. To quote correctly, you must know what needs to be hidden (or masked) from whom, during what part of the process, and how code going to the SAS compiler gets unquoted.

Examples will be used to enable you to develop the intuition needed to know what to quote, how to do it, and how to remove the quoting when it is needed. A minimum number of quoting functions will be introduced along the way to accomplish these tasks.

The student should have some experience writing SAS® code and at least a minimum knowledge of how to generate it using the macro facility. In particular, the statements %MACRO, %IF, %DO, %LET, and %PUT should be recognized, along with some simple functions such as %UPCASE and %SUBSTR or %SCAN. Although the student should know the purpose of these tools, s/he need not have a great deal of experience using them.

The course is appropriate to any operating system and any SAS product which involves the need to understand and/or write code.


PROC TABULATE: Doing More
Art Carpenter
Paper HOW-03

Although PROC TABULATE has been a part of Base SAS® since early Version 6, this powerful analytical and reporting procedure is very underutilized. TABULATE is different; its statement structure is unlike that of any other procedure. Because the programmer who wishes to learn the procedure must essentially learn a new programming language, one with a radically different statement structure than elsewhere within SAS, many do not make the effort.

Once the foundation is laid, a number of intermediate-level techniques, options, and statements that provide the TABULATE programmer with a wide range of power and flexibility are presented. These include techniques that are often underutilized even by the more experienced TABULATE programmer.
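
The unfamiliar piece is the TABLE statement, where commas separate dimensions and asterisks nest; for example:

    proc tabulate data=sashelp.class;
       class sex age;
       var height;
       /* rows: SEX nested with AGE, plus an ALL summary row;  */
       /* columns: N and MEAN of HEIGHT, mean with one decimal */
       table sex*age all,
             height*(n mean*f=8.1);
    run;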


SAS/GRAPH® Elements You Should Know – Even If You Don't Use SAS/GRAPH
Art Carpenter
Paper HOW-04

We no longer live or work in a line printer - green bar paper environment. Indeed many of today’s programmers do not even know what a line printer is or what green bar paper looks like. Our work environment expects reports which utilize various fonts, with control over point size and color, and the inclusion of graphic elements. In general we are expected to produce output that not only conveys the necessary information, but also looks attractive. In the line printer days little could be done with the style and appearance of reports, and consequently little was expected. Now a great deal of control is possible, and we are expected to take advantage of the tools available to us. We can no longer sit back and present a plain vanilla report.

The Output Delivery System gives us a great deal of the kind of control that we must have in order to produce the kinds of reports and tables that are expected of us. Although we will often include graphical elements in our tables, it turns out that a number of options, statements, and techniques that are associated with SAS/GRAPH can be utilized to our benefit even when we are NOT creating graphs. Learn how to take advantage of these graphical elements, even when you are not using SAS/GRAPH.


Two Guys on Hash
Paul M. Dorfman, Peter Eberhardt
Paper HOW-05

The SAS® hash object is no longer new, yet its use is still not widespread. Where it is used, the strengths and capabilities of the hash object are often underutilized. In this workshop we will quickly step through some introductory steps to ensure everyone is 'on the same page'; however, the workshop does assume attendees have some knowledge of the hash object - if not from practical experience, at least from attendance at an introductory workshop.

Once we lay some introductory groundwork, we will work through some more interesting and challenging examples of the hash object in action.

Take a deep breath (but don't inhale) as we start our journey into the world of Hash.


How To Use Proc SQL select into for List Processing
Ronald Fehd
Paper HOW-06

The SAS® macro language is simple, yet powerful. List processing with PROC SQL is also simple, yet powerful. This Hands-On Workshop paper provides programmers with the knowledge to use the PROC SQL SELECT INTO clause with the various SQL dictionary tables to replace macro arrays and %DO loops.

The expected audience is intermediate to advanced users and macro programmers.
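
The core pattern replaces a macro array plus %DO loop with a single query; a minimal sketch:

    proc sql noprint;
       select memname
          into :dslist separated by ' '
          from dictionary.tables
          where libname = 'WORK' and memtype = 'DATA';
    quit;

    /* &DSLIST now holds a space-delimited list usable anywhere */
    %put NOTE: WORK contains: &dslist;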


Traffic Lighting Your Multi-Sheet Microsoft Excel Workbooks the Easy Way with SAS®
Vince DelGobbo
Paper HOW-07

"Traffic lighting" is the process of applying visual formatting to data. This paper explains how to use Base SAS®9 software to create multi-sheet Microsoft Excel workbooks (for Excel versions 2002 and later), and then traffic light the values that lie outside a range. You will learn step-by-step techniques for quickly and easily creating attractive, multi-sheet Excel workbooks that contain your SAS output using the ExcelXP ODS tagset. The techniques that are presented in this paper can be used regardless of the platform on which SAS software is installed. You can even use them on a mainframe! Creating and delivering your workbooks on-demand and in real time using SAS server technology is also discussed. Although the title is similar to previous papers by this author, this paper contains new and revised material not previously presented.



Planning and Support


If you Have Programming Standards, Please Raise Your Hand: An Everyman’s Guide
Dianne Louise Rhodes
Paper PS-01

When a new project starts, or a new manager takes over an old project, the programming team is faced with a new culture. Often, this means that they are asked to adhere to yet another set of programming standards, use style sheets, and submit to peer reviews. This process is usually short-lived because of the lack of a practical approach to applying and enforcing standards, and a failure to budget time and resources for it in the project schedule. This paper goes through a step-by-step process of developing programming standards, classifying them, and entering them into a database. This database can then be used to develop style sheets and checklists for peer review and testing. Through peer reviews, and in preparation for them, programmers learn good programming practices. We describe the most common standards in detail, and why and how they should be applied.


Supporting SAS Whether You Are a User or Not
Robert Jackson, Stephanie R. Thompson
Paper PS-02

With the proliferation of malware and numerous security threats, many organizations have implemented stringent security policies. Can there be a balance between solid security and the ability for SAS users to have some form of autonomy? Many IT support personnel are non-SAS users and many SAS users are not broadly versed in IT security issues. Is this a recipe for disaster or is there a support model that can work for both sides of the table? This panel will discuss some models currently in use, the pros and cons of different models, and maybe even come up with some new ideas.


If You Can’t Learn It From A Book, Why Are You Reading This?
Steve Noga
Paper PS-03

People management, an inexact science if ever there was one, is often thrust upon individuals as they move up the corporate ladder. With promotion comes added responsibility whether a person is ready for it or not. How is a person supposed to learn to manage other people? Seminars, and online courses, and books, Oh My! The choices are numerous and chances are that knowledge will be obtained. However, how did you learn to deal with missing data where none was expected? You experienced the frustration after your program produced erroneous results and you learned how to prepare for this situation the next time. Management is no different. Welcome to my observations from twenty-one years of managing SAS® programmers.


At Your Service: Your Roadmap to Support from SAS®
Kathy Council
Paper PS-04

At YOUR service. How to make the most of the products and services from SAS Publishing, SAS Education, Technical Support, and support.sas.com.

You have world class support from SAS at your fingertips. But where do you start? How do you navigate the sea of information available to you from SAS? How do you find the resources you need to do your job? This paper will provide you with practical tips, tricks, and techniques to find exactly what you need to use SAS. You’ll save time, learn about upcoming content and, best of all, become a more proficient and expert user of SAS.

This paper will benefit the new user and the seasoned SAS user alike. The intended audience is anyone interested in learning more about how to effectively use the services available from SAS.


SAS-L and Beyond
Joe Kelley
Paper PS-05

In the beginning was SAS-L (http://listserv.uga.edu/archives/sas-l.html). And it was good, but did not reach far enough. So a Usenet newsgroup was added: comp.soft-sys.sas (http://groups.google.com/group/comp.soft-sys.sas/topics?hl=en&lnk). It, too, was good. But more was needed. Now there is a wiki, sasCommunity.org (http://www.sascommunity.org/wiki/Main_Page), and the discussion forums at support.sas.com (http://support.sas.com/forums/index.jspa). There is also Lex Jansen’s site (http://lexjansen.com/) and many more. These will give you a good place to start and offer support, code examples, best practices, information, gossip, jokes and arguments. This presentation will go over these and perhaps provide some information on future developments.



Posters


Using SAS to Examine Missing Data in Psychometric Research
JoAnne Herman, Elizabeth Register, Abbas Tavakoli
Paper PO-01

This study examined the effect of missing data on the development and testing of the Register-Connectedness Scale for Older Adults. A convenience sample of 428 community-dwelling older adults participated in this study. Three factor analyses were run to develop the Register-Connectedness Scale (72 Likert items) for older adults. These runs used no imputation, single imputation, and multiple imputation for missing data, respectively. Our study indicated that there were differences in the results of the factor analysis when imputation was used to replace missing data as compared to using no imputation. Researchers should consider using imputation methods to help mitigate problems caused by missing data.


SAS® Maps as Tools to Display and Clarify Healthcare Outcomes
Barbara B. Okerson
Paper PO-02

Changes in healthcare and other industries often have a spatial component. Maps can be used to convey this type of information to the user more quickly than tabular reports and other non-graphical formats. SAS®, SAS/GRAPH® and ODS graphics provide SAS programmers with the tools to not only create professional and colorful maps, but also the ability to display spatial data in a meaningful manner that aids in the understanding of changes that have transpired. This paper illustrates the creation of a number of different maps for displaying change over time with examples from the healthcare arena. Examples include choropleth, bubble, and distance maps and introduce the new GEOCODE procedure.

Results included in this paper were created with version 9.1.3 of SAS on a Windows XP platform and use Base SAS, SAS/STAT® and SAS/GRAPH. SAS Version 9.1 or later is required for ODS graphics extensions. SAS Version 9.2 is required for the DATAIMPORT and GEOCODE procedures. The techniques represented in this paper are not platform-specific and can be adapted by both beginning and advanced SAS users.
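
A choropleth of the simplest kind needs only a response dataset keyed to the map; a sketch (the response data are invented):

    /* RATES: one row per state, with STATE (FIPS code) and RATE */
    proc gmap data=rates map=maps.us all;
       id state;
       choro rate / levels=5;      /* shade states into five buckets */
    run;
    quit;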


LAG Function Combined with Conditional Functions – Useful in Identifying Differences in Like Data
Andrew Hummel
Paper PO-03

The LAG function is useful in identifying subtle differences in rows with similar data. This is especially valuable when the data set contains a large number of rows. Additionally, when conditional functions are used in conjunction with the LAG function, specific limits can be used to flag only particular differences between rows.
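
The pattern in miniature (note that LAG is executed on every row; burying it inside an IF is the classic pitfall):

    proc sort data=readings;
       by meter timestamp;
    run;

    data flagged;
       set readings;
       by meter;
       prev = lag(value);               /* queue updated on every row  */
       if first.meter then prev = .;    /* don't compare across meters */
       flag = (prev ne . and abs(value - prev) > 100);
    run;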


Solving Kenken Puzzles -- By Not Playing
John R. Gerlach
Paper PO-04

Solving Kenken® puzzles requires more than making sure that numbers are used only once in a row and column of a matrix. Unlike Sudoku puzzles, which can use any symbol and have sub-matrices, Kenken puzzles require actual integers and have groups of contiguous cells, called cages. And, unlike a sub-matrix that contains a unique collection of numbers or symbols, Kenken puzzles have cages that must contain natural numbers representing a total as a function of the cage's assigned arithmetic operation. For example, consider a 4x4 Kenken puzzle having a cage containing 3 cells whose total is 11 as a function of simple addition. One possible set of 3 numbers would be: 4+3+4=11. The objective is to complete the grid using numbers ranging from 1 to N so that both cage arithmetic and row / column uniqueness are satisfied.

Depending on the size of the NxN grid, the number (and size) of the cages, as well as the arithmetic operations used, a Kenken puzzle offers a formidable challenge for logic puzzle fans. However, rather than play the game of considering numerous possible sets ranging from two integers, for subtraction and division, to N-digits, for addition and multiplication, this paper proposes a SAS® solution that obtains the viable sets for each cage straight-away and solves the puzzle by identifying the only appropriate collection of cage-specific sets.


ES_ANOVA: A SAS® Macro for Computing Point and Interval Estimates of Effect Sizes Associated with Analysis of Variance Models
Jeffrey D. Kromrey, Bethany A. Bell
Paper PO-05

Measures of effect size are recommended to communicate information on the strength of relationships. Such information supplements the reject/fail-to-reject decision obtained in statistical hypothesis testing. Because sample effect sizes are subject to sampling error, as is any sample statistic, computing confidence intervals for these statistics is a useful strategy to represent the magnitude of uncertainty about the corresponding population effect sizes. This paper provides a SAS macro for computing common effect sizes associated with analysis of variance models. By utilizing data from PROC GLM ODS tables, the macro produces point and interval estimates of eta-squared, partial eta-squared, omega-squared, and partial omega-squared. This paper provides the macro programming language, as well as results from an executed example of the macro.
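
The ODS-table capture at the heart of the macro can be illustrated with a simplified eta-squared only (the macro itself adds the other measures and their confidence intervals):

    ods output OverallANOVA=aov;       /* capture GLM's ANOVA table */
    proc glm data=study;
       class group;
       model score = group;
    run;
    quit;

    data eta;                          /* eta-squared = SS(model)/SS(total) */
       merge aov(where=(source='Model')           rename=(ss=ss_model))
             aov(where=(source='Corrected Total') rename=(ss=ss_total));
       eta_squared = ss_model / ss_total;
       keep eta_squared;
    run;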


How to Monitor “Don’t Know” and “Refusal” Non-responses in a Large National Survey -- Using Simple SAS Macros, a Few PROCs, and Data Steps.
Mariah Cheng, Timothy Monbureau
Paper PO-07

Monitoring data quality is a critical and sometimes daunting task during any data collection effort. One simple way to assess the quality of typical interview data involves tracking the number of valid and missing responses to survey items. Such tracking may lead to the early detection of problematic questions, enabling researchers to redesign instruments before proceeding further with the potentially costly collection of flawed data. By recoding data into the four outcome categories of “Don’t Know,” “Refusal,” “Legitimate Skip,” and “Valid Response,” we use SAS macros and the common SAS procedures of CONTENTS, TRANSPOSE, FREQ, APPEND, and SUMMARY in some uncommon ways to determine response rates for specific questions, larger questionnaire sections, and the survey as a whole. Our technique also facilitates the informative calculation of both relative response rates, which are based on the number of legitimate responses, and absolute response rates, which are based on the total number of respondents. We use the well-known Longitudinal Study of Adolescent Health (Add Health) Wave IV interview as an illustration here.
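
The recode-then-count idea can be sketched with special missing values and a format; the coding scheme below is an assumption, not necessarily the authors':

    proc format;                       /* map recoded values to outcomes */
       value respfmt
          .d    = "Don't Know"
          .r    = 'Refusal'
          .l    = 'Legitimate Skip'
          other = 'Valid Response';
    run;

    proc freq data=recoded;            /* response-rate table per item */
       tables q1-q25 / missing;
       format q1-q25 respfmt.;
    run;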


Proc Report Data = Subject.Event_Chronology;
Christina Carty, Elizabeth Spence
Paper PO-09

Data management staff at the VA Cooperative Studies Program Coordinating Center in Perry Point, Maryland, were challenged with the task of generating a report to display, in chronological order, selected data points collected on 10 case report forms. The information was used to determine if MI or other cardiac events occurred for a given patient. Because the classification results play an essential role in the final analysis reporting for the study, it was very important that the data be conveyed accurately in a user-friendly fashion; not an easy task for a large number of variables and two thousand patients!

Since the chronology lists were produced during the ongoing recruitment phase of the study, a mechanism was employed to screen each subject’s data to assure all required records were received, and no queries were left unresolved. Required values were extracted from each form dataset and manipulated where necessary, then compiled into a master database containing multiple observations for each subject. Finally, a complete chronology report was created and sent to an outside lab for MI classification as auditing was performed simultaneously to track IDs throughout the process.

This paper will demonstrate how Base SAS® was used to filter and manipulate the data. We will then show how Proc Report and the Output Delivery System (ODS) were utilized to format and display the data in an easy-to-read MS Word document.


Tips and Tricks of Efficient SAS Programming for SDTM Data
Eric Qi, Fikret Karahoda
Paper PO-10

Dataset sizes increased dramatically after the pharmaceutical industry adopted the SDTM data model. For example, the SUPPQUAL, CF, and LB domains can easily reach several gigabytes (GB) in size. The need to process this data efficiently becomes more important even with today's high-speed computer resources. In this paper, we will discuss a case that shows how efficient programming plays an important role in handling large SAS® datasets.


Guide to ODS Graphics Editor in SAS® 9.2
Mirjana Stojanovic
Paper PO-12

This paper describes the new ODS Graphics Editor in SAS® 9.2. The presentation begins with a review of ODS Graphics, then explains the ODS Graphics Editor in more detail, along with the ODS Graphics Designer and how the designer differs from the editor. The ODS Graphics Editor is a point-and-click editor for graphs that are produced by procedures that use ODS Statistical Graphics. What are the prerequisites for editing a graph? How do you create an editable graph? How do you customize graph properties to satisfy your needs? The paper explores some of the new features of ODS Graphics in SAS 9.2.
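
For reference, the key prerequisite is asking ODS for editable output: run a procedure that uses ODS Statistical Graphics with SGE output turned on (a minimal sketch):

    ods listing sge=on;             /* write editable .sge graph files */

    proc sgplot data=sashelp.class;
       histogram height;            /* an ODS Statistical Graphics plot */
    run;

    ods listing sge=off;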


Customizing Saved Proc Import Code
Carolyn D. Williams
Paper PO-13

PROC IMPORT is a great SAS tool that I use often when converting raw data to SAS. Usually SAS zips through the import procedure and the output is exactly what I expect. However, occasionally there are glitches and gotchas with numeric and character data types which can really make for a very bad day. The GUESSINGROWS= option value is valid for 1 to 32767 rows. Beyond that, SAS is no longer looking at the data to guess. What happens at observation number 82767 when SAS has defined a numeric storage location but the raw data has the value "302~40.33" or some other non-numeric value? This poster takes a serious and sometimes humorous walk down the yellow brick road to find the key to the import wizard's heart, so that we might learn how to customize import statements and tame the wicked data no matter what direction it comes from.
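
The starting point, before any hand customization, is simply to push the scan depth to its limit and then capture the generated code (path invented):

    proc import datafile='C:\data\raw.csv'
                out=work.raw
                dbms=csv
                replace;
       guessingrows=32767;   /* scan as deep as the option allows */
    run;

    /* then recall the DATA step that PROC IMPORT generated and, for */
    /* a column that turns non-numeric late in the file, switch its  */
    /* informat to character, e.g.:  informat amount $16.;           */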


Fitting multivariate random-effects models using SAS® PROC GLIMMIX
Lei Li
Paper PO-14

Multinomial ordinal data arise when measures of an outcome are scaled into ordered categories. Ordinal data can be analyzed using multinomial logit models (Agresti 2002), but the analysis is often complicated by the clustered nature of the data. One approach to handling intra-cluster correlation is logistic regression with a random intercept. Another is a random-intercept multivariate logit model, in which multiple logits of the outcomes are assumed to share a common random intercept.

The multivariate modeling approach can be further extended by incorporating a vector of Multivariate Normal random intercepts to allow more flexibility in the correlation structure among multivariate outcomes. In this paper we describe a continuation-ratio logit model with Multivariate Normal random intercepts and use SAS® PROC GLIMMIX to fit the model to survival data among premature infants admitted into hospital neonatal intensive care units.
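
A minimal sketch of the building block the paper extends, with hypothetical dataset and variable names: a random-intercept logistic model in PROC GLIMMIX.

    proc glimmix data=nicu method=laplace;
       class hospital;
       model survived(event='1') = gest_age birth_wt
             / dist=binary link=logit solution;
       random intercept / subject=hospital;  /* intra-cluster correlation */
    run;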


A SAS/AF® Application for Organizing the Data Management Activities of the CHIMES Follow Up Study
Emily A. Mixon, Valisa R. Brown, Karen B. Fowler
Paper PO-16

The CMV & Hearing Multicenter Screening (CHIMES) Study is a multi-center study to define the contribution of congenital cytomegalovirus (CMV) infection in childhood hearing loss. The CHIMES study consists of two parts, the screening study and the follow up study. The follow up study consists of children who test positive for congenital CMV infection. The participants are followed from infancy to 4 years of age to monitor any changes in their hearing status. All data for the CHIMES study are entered by scanning TELEform® designed data forms and using the TELEform® software to read, process, and verify the data. Once the data are verified they are saved in a comma separated value (CSV) format. SAS® programs are run on the data while in the CSV files for identifying logic and possible data entry errors prior to making the data permanent in SAS® datasets. Once the data are converted to SAS® data files any changes made to the data are tracked using SAS/FSEDIT® screens. Using the data in the SAS® data files, a listing of CMV confirmation results received for the enrollment visit of new study participants may be generated. This SAS/AF® application has been developed to help organize the data management tasks of the follow up study. Separate frames with icons are used to list the twenty-four pre-SAS programs and sixteen FSEDIT screens the user may choose to run with a third frame available for generating the list of CMV confirmation results by way of an ODS PDF report form.


Technique of Using PROC SQL
Hui-Ping Chen
Paper PO-17

SAS PROC SQL implements Structured Query Language (SQL), which is used to communicate with databases. That is, PROC SQL can be used to retrieve or update data in related tables or databases. You can use PROC SQL to generate data or obtain results that are the same as what the DATA step or other SAS procedures yield; however, it works somewhat differently. For example, you do not need to sort the data before merging. You can also sum results or count rows vertically without transposing the data into a horizontal structure. In SQL, a row is the same as an observation, and a column is the same as a variable, in the DATA step. This paper will introduce techniques for using PROC SQL efficiently.
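
A minimal sketch of the no-sort join mentioned above, with hypothetical table and column names; a DATA step merge would require both tables to be sorted first.

    proc sql;
       create table combined as
       select a.patient_id, a.visit_date, b.lab_result
       from visits as a
            inner join labs as b
            on a.patient_id = b.patient_id;  /* no PROC SORT needed */
    quit;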


Analysis and Visual Review of Error Matrices in SAS® Stat Studio
Robert Seffrin
Paper PO-18

The United States Department of Agriculture’s National Agricultural Statistics Service uses ground reference data and satellite imagery to create an annual Cropland Data Layer (CDL) land cover classification product. The CDL and survey data are used to estimate crop acreage. A review of the quality of the CDL begins with a cross tabulation of the CDL against a validation data set to produce an error matrix, also referred to as a confusion matrix or contingency table. The power of SAS® Stat Studio is employed to calculate the statistics using SAS/IML® and IMLPlus and to generate insightful linked graphics and maps to review the quality of the land cover classification at the state and county level.
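
A minimal sketch of the error-matrix cross tabulation that starts such a review, with hypothetical dataset and variable names:

    proc freq data=cdl_validation;
       /* Rows: validation ("truth") category; columns: CDL class. */
       tables truth*cdl_class / norow nocol nopercent;
    run;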


Sample Size Calculation to Evaluate Mediation Analysis
Rajendra Kadel
Paper PO-19

Mediation analysis is used to examine the change in the strength of association between a primary predictor and an outcome after adjustment for a mediator. Mediation models are widely used in the social and biomedical sciences. Before conducting mediation studies, researchers want to know the sample size (i.e., number of subjects) required to achieve adequate power when testing for mediation. To the author’s knowledge, there is no SAS® procedure that performs sample size calculation for mediation analysis. The author presents a macro that implements the methodology for sample size calculation for mediation analysis described by Vittinghoff et al. (2009); it calculates sample sizes for linear regression models. A very basic understanding of SAS® is enough to use this macro; macro programming skills are not required. When the macro is invoked, a series of windows pops up asking the user to input the required information. Output is presented in the SAS output window as well as in a Microsoft Word document. The SAS® products involved in this sample size calculation are Base SAS® and the SAS® macro language. The macro was tested on SAS® 9.1 and above under Windows.

Keywords: SAS® macro, mediation analysis, linear regression models, sample size


Developing a Telco Revenue Forecasting and Device Optimization Analytics Tool
Lan Guan
Paper PO-21

To meet its rising data revenue target and compete with other players in the wireless industry, a large US telco engaged us to build a flexible, predictive analytics tool to maximize Apps/Services revenue by providing ad-hoc scenario capability and by optimizing its wireless device portfolio mix. Essentially, the telco is looking for a sophisticated analytics solution to answer whether it can achieve annual revenue targets for Apps/Services based on the projected device mix and, more importantly, how the projected device mix (device portfolio mix or total number of devices) can be altered to meet the revenue targets. Based on multiple years of historical wireless device sales and ARPU data, an end-to-end SAS solution was designed and developed to achieve these analytics objectives. The solution encompasses a complex set of SAS analytics modules, including SAS/ETS for time series forecasting and SAS/OR for device optimization. Key outputs of the analytics tool help the telco project the monthly device sales required to meet the data revenue forecast by addressing the number of device SKUs required. It also provides its product managers with intuitive desktop-based scenario planning and what-if analyses to identify alternate solutions and options.
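
A minimal sketch of the SAS/ETS forecasting piece, assuming a hypothetical monthly device-sales series; PROC ESM fits an exponential smoothing model and projects twelve months ahead.

    proc esm data=device_sales out=forecasts lead=12;
       id month interval=month;
       forecast units / model=addwinters;  /* additive Winters: trend + seasonality */
    run;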


The SAS User Group Community Activity
Don Kros
Paper PO-22

Internal SAS user groups are an effective, inexpensive way to receive practical information and technical content by meeting with peers who share the goal of getting more out of the SAS platforms, products, technologies, and resources available. Our user group has grown from 5 to 400 active users in 3 years. Typically, a user group is run by volunteers, independent of enterprise management, who meet on a regular basis to discuss and share information on a variety of technical and user topics. How did we do it, and what can you take away from our experience?


Using Dictionary Tables to Explore SAS Datasets
Phillip Julian
Paper PO-23

This paper presents a SAS dataset profiling tool that uses only the Base SAS language. The tool was inspired by my need to know much more about my data without having data analysis tools like SAS Data Integration Studio, DataFlux, or JMP.

The tool uses information from the SAS dictionary tables to answer questions about data structure, contents, classes, and attributes. The SAS program will find and analyze all SAS datasets at any level in a directory tree.

Once SAS datasets are located, they are analyzed by a generic profiler program that summarizes all character values and computes a set of statistics for all numeric variables, producing a "fingerprint" of the data.
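
A minimal sketch of the kind of dictionary-table query such a profiler is built on (the libref is hypothetical): one row per variable for every dataset in a library.

    proc sql;
       create table profile as
       select memname, varnum, name, type, length, format
       from dictionary.columns
       where libname = 'MYLIB'
       order by memname, varnum;
    quit;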



Reporting and Information Visualization


Business intelligence 2.0: Are we there yet?
Greg Nelson
Paper RIV-01

Business intelligence (BI) has been with us for years now and, we would argue, not much has changed. Advances in the consumer communications and entertainment worlds have dwarfed those in commercial software applications. Consumers have unprecedented access to information and to tools with which to consume it. Social media, mobile access, augmented reality, and 3D views of pictures and video have blurred the lines between our private and work personas and have fundamentally changed the way the consumer uses information. Conversely, BI has seemed to lack any real innovation. In this paper, we will outline what the next chapter of innovation should contain for business intelligence, analytics, and data integration. Some hints of these features exist today, while others have lagged behind as the industry has endured a number of mergers and acquisitions. The vendors that have survived the economic and industry turmoil can now turn from treading water to making waves.


Using Linux Shell Commands, the vi Editor, and Base SAS to Parse through Log Files and Gather Log Information
Fuad J. Foty
Paper RIV-02

Reading log files can be fun, but not when one has to examine several dozen, or sometimes several thousand, of these files and calculate real time and CPU time to see if the process is running according to plan. There are many ways to do this: manually, or in some smart automated way. By using Linux shell commands such as tail, grep, and cut, the vi editor, and Base SAS, one can parse through log files and gather log information such as CPU time, real time, and the other important details one needs to make decisions.
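
A minimal sketch of the Base SAS half of the approach, with a hypothetical log path; a production version would loop over many files.

    data log_times;
       infile '/logs/nightly_job.log' truncover;
       input line $char200.;
       /* Keep only the NOTE lines that carry timing information. */
       if index(line, 'real time') or index(line, 'cpu time');
    run;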


Distance mapping in health and health care: SAS® as a tool for health geomatics
Barbara B. Okerson
Paper RIV-03

Geomatics is the science concerned with using mathematical methods with spatial data to learn about the earth's surface. Health geomatics is used to improve the understanding of relationships between people, location, time, and health. Geomaticists assist in discovering and eliminating disease, aid in public health disease prevention and health promotion initiatives, and support healthcare service planning and delivery through defining spatial relationships. Distance mapping is one of many tools in the geomatics toolbox that furthers this understanding.

SAS® provides a comprehensive set of graphics tools and procedures as part of the SAS/GRAPH® product that can be used for distance mapping, including the new GEOCODE procedure and the GEODIST function, available with SAS 9.2. This paper combines the graphics available in SAS/GRAPH with other available SAS tools to explore spatial relationships in the health care arena. Examples include maps, grid squares, area charts, and X-Y plots.
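
A minimal sketch of the SAS 9.2 GEODIST function mentioned above; the coordinates are illustrative.

    data distances;
       km    = geodist(35.7796, -78.6382, 36.0726, -79.7920);      /* kilometers */
       miles = geodist(35.7796, -78.6382, 36.0726, -79.7920, 'M'); /* miles */
    run;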

The analytic graphics in this paper were developed with version 9.1.3 or 9.2 of SAS executing on a Windows XP platform. SAS 9.2 is required for the GEOCODE procedure. The graphics represented in this paper are not platform-specific and can be adapted by both beginning and advanced SAS users.


SAS Proc Report and ODS ExcelXP Tagsets to Produce Customized Excel Output Without DDE
Mira Shapiro
Paper RIV-04

After completing what appeared to be a routine request for a data-driven report with Proc Tabulate, one of my tools of choice, the additional requirement of an asterisk in selected cells in the requested Excel spreadsheet became an opportunity to explore some other techniques and add some new skills to my SAS repertoire. While it would have been possible to take the Proc Tabulate output and use DDE to create the desired spreadsheet, I decided to use Proc Report and ODS ExcelXP Tagsets. The solution to producing this report required the use of many features of Proc Report including across variables, nested columns, call define, and style. The most interesting challenge was that the requester wanted the results delivered via an Excel spreadsheet with an asterisk in some of the columns based on two variables that were not being displayed in the report. The solution evolved into a short SAS program with a Proc Report step doing all of the data summarization and the majority of the formatting. This paper will briefly outline the DDE approach that might have been taken and will detail the solution that makes use of Proc Report and ODS ExcelXP Tagsets.

This presentation is for SAS Users who want to use some of the advanced features of Proc Report and gain an understanding of how to exploit Proc Report’s underlying column structure. In addition, in a client server SAS implementation where DDE is not feasible, this approach gives an alternative for producing customized Excel output.
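
A minimal sketch of the combination described, with hypothetical names: ODS ExcelXP wrapped around a PROC REPORT whose COMPUTE block consults NOPRINT variables to mark selected cells with an asterisk.

    ods tagsets.excelxp file='summary.xml' options(sheet_name='Summary');
    proc report data=summary nowd;
       column site value flag txt;
       define site  / group 'Site';
       define value / analysis sum noprint;
       define flag  / analysis max noprint;  /* hidden driver variable */
       define txt   / computed 'Total';
       compute txt / character length=12;
          txt = put(value.sum, comma10.);
          if flag.max = 1 then txt = catx(' ', txt, '*');
       endcomp;
    run;
    ods tagsets.excelxp close;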


ODS RTF TEXT My New Best Friend!
Brian Spruell
Paper RIV-05

Recently, I completed a project which required generating twenty-two reports for a client. Twenty-one of those reports went to individual parties who contract with the client. The twenty-second report was an overall summary which went directly to the client. I had finished the code that generated the needed tables and figures; the difficulty began when we needed to add text to the report. Originally we were going to paste the standard text into each report (a report ran anywhere from eighteen to twenty pages). I was fearful this approach would be too time consuming and leave open the possibility of a large number of human errors. Fortunately, I learned about ODS RTF TEXT, which enabled me to add the standard text in one place within my SAS code and reduced the human manipulation of the output to zero.

This paper is intended for beginning SAS programmers. It shows them a tool which can be quite useful when generating standard output reports, and one the author wishes he had learned years ago.
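
A minimal sketch of the statement itself, with a hypothetical file name; the standard paragraph lives once in the code, so every report gets identical wording.

    ods rtf file='contractor_report.rtf';
    ods rtf text='This report summarizes contractor activity for the current quarter.';
    proc print data=sashelp.class(obs=5);  /* stands in for the real tables */
    run;
    ods rtf close;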


SBSBOXPLOT: A SAS® Macro for Generating Side-by-Side Boxplots
Jason A. Schoeneberger, Grant B. Morgan, Bethany A. Bell
Paper RIV-06

Good research practice includes a basic exploration of data prior to engaging in more sophisticated, inferential analyses. Generating location and dispersion summary statistics such as the mean or median and standard deviation or interquartile range helps describe variable distributions. Whereas these descriptive statistics can be reported in tabular or narrative format, they can also be reported visually. One efficient graphical display that clearly communicates a host of summary statistics, including the mean, median, 10th, 25th, 75th, and 90th percentiles, and the minimum and maximum values, is the box-and-whisker plot (Tukey, 1977), often simply referred to as a boxplot. When examining a boxplot, a reader can quickly assess measures of central tendency and dispersion, as well as the shape of the distribution, for any given continuous variable. Moreover, side-by-side boxplots communicate patterns in the data by presenting graphical displays in clusters or blocks that accentuate relationships among categorical or discrete variables or factors (e.g., examining standardized test scores by subject and student race/ethnicity, or adolescent BMI by student gender and age). However, the SAS programming necessary to generate side-by-side boxplots is unintuitive and, at best, cumbersome. To ease this burden, SBSBOXPLOT was developed, using various data management procedures in conjunction with PROC BOXPLOT, to facilitate the generation of side-by-side boxplots that display distributional information disaggregated by factors important for a basic understanding of one’s data. This paper provides the macro programming language, as well as results from an executed example of the macro.
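
A minimal sketch of the PROC BOXPLOT pattern the macro wraps, with hypothetical names: the factors are combined into one grouping variable, the data are sorted by it, and one PLOT statement draws the side-by-side boxes.

    data bmi2;
       set bmi_data;
       length grp $ 12;
       grp = catx('-', gender, age);  /* combine factors for display */
    run;
    proc sort data=bmi2;
       by grp;                        /* PROC BOXPLOT requires sorted groups */
    run;
    proc boxplot data=bmi2;
       plot bmi*grp / boxstyle=schematic;
    run;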


Yes! SAS® ExcelXP WILL NOT Create a Microsoft Excel® Graph; But SAS Users Can Command Microsoft Excel® to Automatically Create Graphs From SAS® ExcelXP Output.
William E. Benjamin Jr
Paper RIV-08

With everything becoming more “security aware”, hearing that Microsoft Excel 2007 can ignore DDE is no surprise. The SAS® ODS tagset ExcelXP creates *.xml output, and *.xml output cannot contain graphs. Nowadays, no one wants Excel spreadsheets with any kind of macros in the workbook: most default security settings will not open “macro-infected” workbooks without an authorized digital signature (just try to get one of those). So how can SAS® programmers, without “Admin” rights on their computers, get graphs into their Excel workbooks? One way is to build them in Excel yourself, one at a time… But did you really start writing SAS® code to build graphs one at a time, “in Excel”? This paper shows you how to create data using SAS®, then command Microsoft Excel® to read the data, create a graph or fully reformat a worksheet, and build a *.xls/*.xlsx file (note: no embedded macros can be saved in *.xlsx files) without putting an Excel macro into the Excel workbook. And the programs will do it all while you watch, for multiple sheets in a workbook, over and over and over again. This paper defines the pieces and parts of SAS® code and Excel® code that you can write to create a fully integrated system for building and formatting macro-free Excel 2007 workbooks, without downloading any external data or code from the “forbidden unsecure Internet”, using SAS® Version 9 (SAS® Version 8 if Internet downloads are available) and Excel 2007.


Geocoding Crashes in Limbo
Carol Martell, Daniel Levitt
Paper RIV-09

In North Carolina, crash locations are documented only with the road names for the nearest intersection. Some projects require more accurate locations for importing into ArcGIS. We use a combination of SAS® and Google Earth to more accurately pinpoint crash locations. This paper discusses how we streamline a manual process to make it feasible.


Producing Maps Using SAS Enterprise Guide
Harmon L. Jolley
Paper RIV-10

Plotting data onto color-coded maps provides a geographic view of the data, as is often done in market analysis. SAS/GRAPH can be used to produce maps, both interactively and in automated batch scheduling. However, the data analyst may be unfamiliar with SAS/GRAPH programming, or may prefer a point-and-click method of generating the SAS/GRAPH code. This paper shows how SAS Enterprise Guide provides such an interface for producing maps.


Model Visualization Using JMP®, SAS®, and Excel
Jon Weisz
Paper RIV-11

Engineers, financial analysts and statisticians need ways to communicate models and to perform what-if analyses. This paper will highlight the use of the Profiler feature in JMP to visualize and perform what-if analyses for models defined using SAS/STAT®, Microsoft Excel and engineering tools. We will discuss cases that highlight the difference between model creation and model consumption.


The Systems Development Life Cycle (SDLC) as a Standard: Beyond the Documentation
Dianne Louise Rhodes
Paper RIV-12

Has your company adopted the Systems Development Life Cycle (SDLC) as a standard for benchmarking progress on a project? Has it developed Word and other templates for documents created during the SDLC? In three of my most recent positions, the stress was put on completing the documents according to schedule rather than on the work itself: cataloguing requirements, analyzing them, developing a good design document, and thoroughly testing the resulting code. When I first started programming in SAS, I was lucky to get any user requirements at all; it was always “I’ll know it when I see it.” But with the emphasis on the documentation rather than on the analytical work behind it, a project still falls behind schedule because of missed requirements. If the requirements are not thoroughly complete when coding begins, the project is likely to fail in the testing phase, especially if the independent test team gets a better, more complete version of the requirements than the development team. We discuss in detail the work involved in producing sound requirements, design, and testing protocols, consider the retirement of legacy software as part of the SDLC, and offer some useful templates (in Excel, perhaps) to help non-programmers specify the reports they want.


Introduction to Graphics Using SAS/GRAPH® Software
Mike Kalt
Paper RIV-13

This session introduces two of the most commonly used SAS/GRAPH procedures – GCHART and GPLOT – and illustrates how they are used to produce bar charts, pie charts, scatter plots, and line plots. The session will also demonstrate how to export graphs produced by SAS/GRAPH to Web pages, Microsoft Office products, and PDF files.
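
A minimal starter for the two procedures covered; sashelp.class ships with SAS, so this runs as-is.

    proc gchart data=sashelp.class;
       vbar sex;            /* bar chart of a categorical variable */
    run;
    quit;
    proc gplot data=sashelp.class;
       plot weight*height;  /* scatter plot of two continuous variables */
    run;
    quit;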



Statistics and Data Analysis


A Taste of ADaM
Beilei Xu, Changhong Shi
Paper SDA-03

The Analysis Data Model (ADaM) and the ADaM Implementation Guide (ADaMIG) published by the Clinical Data Interchange Standards Consortium (CDISC) provide the fundamental principles of analysis datasets and specifications for two standard data structures: the subject-level analysis dataset ADSL and the Basic Data Structure (BDS) which is a general structure that provides "one proc away" readiness for many common analyses. This paper presents the detailed steps used in drug project work to create ADaM BDS datasets, illustrated by ADLP, a dataset to support analysis of lipids endpoints. Particular emphasis is given to the following implementation considerations: 1. number of ADaM datasets needed; 2. derivation of analysis endpoints, analysis windows, analysis values, and imputation of missing values; and 3. setup of analysis flags and population flags.

Keywords: ADaM, CDISC, analysis dataset


Detecting Medicaid Data Anomalies Using Data Mining Techniques
Aran J. Canes, Qiling Shi, Shenjun Zhu
Paper SDA-04

The purpose of this study is to use statistical and data mining techniques in Base SAS® and SAS® Enterprise Miner™ to proactively reduce the number of false positives caused by data anomalies in Medicaid pharmacy claim data when employing a rule-based approach to identify overpayments. Typically, rule-based techniques are based on specific state Medicaid laws and policies, using certain formulas to detect and identify overcharged payments. A false positive is defined as an identified overpayment that is erroneous, the claim having actually been paid correctly, due to data anomalies or unknown factors. False positives substantially increase the amount of time and resources spent by the auditors. The specific objective of the study is to detect and reduce data anomalies by examining the relationships among key variables such as Medicaid amount paid (MAP), average wholesale price (AWP), and quantity of service in Medicaid pharmacy claim data.

Pharmacy claim data were simulated and the overpayment was calculated by a rule-based approach developed by AdvanceMed Corporation. Different data mining techniques such as the studentized residual, leverage, Cook’s distance, DFFITS and clustering were utilized to capture the abnormal claims and reduce the number of false positives. The results of this analysis indicated that the clustering statistical method is the best approach to detect these kinds of data anomalies, followed by the DFFITS method.
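
A minimal sketch of the regression diagnostics named above, with hypothetical dataset and variable names; PROC REG writes each statistic to an output dataset for downstream flagging.

    proc reg data=pharmacy_claims;
       model map = awp quantity;
       output out=diagnostics
              rstudent=rstud    /* studentized residual */
              h=leverage
              cookd=cooks_d
              dffits=dffits_v;
    run;
    quit;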


SAS Macros for Estimating the Attributable Benefit of an Optimal Treatment Regime
Jason S. Brinkley
Paper SDA-05

For many diseases there are several treatment options, and sometimes there is no consensus on the best treatment to give individual patients. What works well for some patients may be a very bad option for others. One may be interested in developing an algorithm that assigns treatments in such a way that each individual patient receives the one that is best for them. We define the optimal treatment regime as the strategy that most greatly reduces the number of poor outcomes. Using large observational databases, many researchers can already fit statistical models that determine the effectiveness of such treatment strategies, but it would be useful to have a single measure that indicates the effectiveness of imposing a strategy. One such measure is the attributable benefit of a treatment regime, which uses notions from causal inference to estimate the proportion of poor outcomes that could have been prevented had a particular strategy been implemented. This talk will present two easy-to-use SAS macros for estimating the attributable benefit of the optimal treatment regime and its variance from a logistic regression model-based treatment regime. The first macro uses a simple disease/treatment model that is supplied by the user. The second macro provides a doubly-robust estimate of attributable benefit that incorporates the simple disease/treatment model with a propensity model for treatment. While the theory behind estimating these measures is complex, users with an understanding of PROC LOGISTIC in SAS can implement and interpret the output of these macros.


Potential Change in Reliability Measures Based on Decreased Sample Size for the Census Coverage Measurement Survey
Vincent T. Mule
Paper SDA-06

As part of the evaluation of the 2010 Census, the U.S. Census Bureau conducts the Census Coverage Measurement (CCM) Survey. This survey produces the net coverage results of undercounts or overcounts of the Census. In addition to net coverage, our program has been asked to estimate the components of census coverage, which include erroneous enumerations and omissions. A proposal was made to reduce our sample from the planned 300,000 housing units to help offset the costs of operational enhancements intended to reduce the nonsampling error in our estimates. This paper presents the results of a simulation study assessing the potential change in reliability measures for our proposed estimates, and shows how SAS procedures such as PROC SURVEYSELECT and PROC SGPANEL were used.
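
A minimal sketch of the sampling step, with hypothetical names and sizes; the actual CCM design is more complex than simple random sampling.

    proc surveyselect data=ccm_frame out=subsample
                      method=srs sampsize=150000 seed=20100;
    run;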


A SAS Macro to Compute Added Predictive Ability of New Markers Predicting a Dichotomous Outcome
Kevin Kennedy, Michael Pencina
Paper SDA-07

Risk prediction is an important field in applied statistics. For example, in clinical research, predicting the development of adverse medical conditions is the object of many studies. The concept, however, is multi-disciplinary: published models exist to predict dichotomous outcomes ranging from winners in the NCAA basketball tournament to the likelihood of a bleeding complication after percutaneous coronary intervention. It is important to note, however, that these models are dynamic, as changes in population and technology lead to the discovery of new and better methods to predict outcomes. An important consideration is when to add new markers to existing models. While a significant p-value is an important condition, it does not necessarily imply an improvement in model performance. Traditionally, receiver operating characteristic (ROC) curves and the corresponding area under the curve (AUC) are used to compare models, with a statistical test due to DeLong et al. for two correlated AUCs. However, the clinical relevance of this metric has been questioned by researchers. To address this issue, Pencina and D’Agostino have proposed two statistics to evaluate the significance of novel markers. The Integrated Discrimination Improvement (IDI) measures the new model’s improvement in average sensitivity without sacrificing average specificity. The Net Reclassification Improvement (NRI) measures the correctness of reclassification of subjects based on their predicted probabilities of events under the new model, with the option of imposing meaningful risk categories.
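
For orientation, the NRI as defined by Pencina et al. is the sum of the net proportions of subjects moving in the correct direction among events and non-events:

    NRI = [P(up | event) - P(down | event)]
        + [P(down | nonevent) - P(up | nonevent)]

where up and down denote movement to a higher or lower risk category under the new model.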


Using SAS Text Miner 4.1 to create a term list for patients with PTSD within the VA
Matthew R. Richardson, Stephen L. Luther, Donald Berndt
Paper SDA-08

SAS Text Miner 4.1 was used to perform statistical text mining to supplement efforts to develop a clinical vocabulary for post-traumatic stress disorder (PTSD) in the VA. A total of 405 veterans with PTSD were identified through administrative data within the Tampa VA and then combined with a comparison group of 392 veterans with no known PTSD symptoms or diagnosis. The patient notes from this cohort were captured to produce the dataset analyzed. Using all possible combinations of frequency weight and term weight, 24 different models were run, of which 21 produced usable results. In each model, terms were ranked and scored based on global weights, and then ranked and scored again based on the absolute value of the coefficient in a stepwise regression analysis. Scores for each model and each analysis were combined to produce a master score, with the highest-scored term receiving the highest rank. In all, 449 terms were identified in the regression analysis and 343 in the global weight analysis, for a total of 585 distinct terms. Future studies will focus on incorporating processes that identify terms carrying greater clinical relevance, as well as on the improved techniques behind SAS Text Miner 4.2.


A Macro for Calculating Summary Statistics on Left Censored Environmental Data using the Kaplan-Meier Method
Dennis J. Beal
Paper SDA-09

Calculating summary statistics such as the mean, standard deviation, and an upper confidence limit on the mean is straightforward when the data values are known. However, environmental data often are reported back from the analytical laboratory as left censored, meaning the actual concentration for a given contaminant was not detected above the method detection limit. Therefore, the true concentration is known only to be between 0 and the reporting limit. The nonparametric Kaplan-Meier product limit estimator has been widely used in survival analysis on right censored data, but only recently has this method been applied to left censored data. Kaplan-Meier can be used on left censored data with multiple detection limits with minimal assumptions. This paper presents a SAS macro that calculates the mean, standard deviation, and standard error of the mean of a left censored environmental data set using the nonparametric Kaplan-Meier method. Kaplan-Meier has been shown to provide more robust estimates of the mean and standard deviation of left censored data than other methods such as simple substitution and maximum likelihood estimates. This paper is intended for intermediate users of Base SAS.
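
A minimal sketch of one standard way to bring right-censored Kaplan-Meier machinery to bear on left-censored data, the "flipping" trick; this is an assumption about the general approach, not the macro's exact code, and all names are hypothetical.

    data flipped;
       set env_data;          /* conc, detect (1 = detected) */
       flip   = 100 - conc;   /* constant must exceed the maximum conc */
       censor = (detect = 0); /* nondetects become right censored */
    run;
    proc lifetest data=flipped outsurv=km;
       time flip*censor(1);   /* back-transform the mean afterward */
    run;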


Using SAS PROC CLUSTER to Determine University Benchmarking Peers
Elayne Reiss, Sandra Archer, Robert L. Armacost, Ying Sun, Yun Fu
Paper SDA-10

This paper will explore the steps taken by a large public research institution to develop a list of peer institutions for consistent use in future benchmarking studies. At the heart of this process is the use of two tools available in SAS/STAT® – PROC CLUSTER and PROC FASTCLUS. Both of these procedures address clustering in differing ways, but this paper will demonstrate how we utilized the strengths of both procedures to strengthen our analysis. In describing the analytical process, we will focus on data selection and preparation concerns, the specifics of using these particular SAS procedures for our benchmarking application, and the ways this application of clustering fits into benchmarking activities as a whole.
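
A minimal sketch of the two procedures in tandem, with hypothetical metrics: standardize, cluster hierarchically, then refine with k-means.

    proc standard data=institutions out=std mean=0 std=1;
       var enrollment research_exp endowment;
    run;
    proc cluster data=std method=ward outtree=tree;
       var enrollment research_exp endowment;
       id inst_id;
    run;
    proc fastclus data=std maxclusters=8 out=clusters;
       var enrollment research_exp endowment;
    run;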


Developing a Model for Person Estimation in Puerto Rico for the 2010 Census Coverage Measurement Program
Colt S. Viehdorfer
Paper SDA-11

One of the goals of the 2010 Census Coverage Measurement (CCM) program is to estimate net coverage error for persons in housing units for the 2010 Census, with Puerto Rico results being calculated independently from the rest of the United States. General logistic regression is being used for the estimation of net error for the 2010 Census as opposed to a post-stratification method, as was done in previous census coverage measurement surveys. Unlike post-stratification, logistic regression allows the use of continuous variables. This paper will outline the steps I took to develop a logistic regression model for net coverage error estimation in Puerto Rico, using data from the 2000 Census and its coverage measurement survey. I will explain how I determined the main effects to be included in the model using various exploratory and statistical techniques, and will also examine different model selection procedures for deciding on a final model, which will include main effects and interactions. The main effects chosen in this paper will be proposed as the main effects for the 2010 CCM model. However, specific interaction terms to include in the model will be determined using the procedures outlined in this paper once the actual 2010 CCM data become available.
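
A minimal sketch of the modeling machinery described, with hypothetical names: a continuous covariate, a CLASS effect, an interaction, and an automatic selection method.

    proc logistic data=ccm2000;
       class tenure / param=ref;
       model match(event='1') = age tenure age*tenure
             / selection=stepwise slentry=0.10 slstay=0.10;
    run;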


Stationarity Testing in High Frequency Seasonal Time Series
David A. Dickey
Paper SDA-12

Seasonal differencing is often applied when reporting, for example, monthly sales; new car sales are often reported as up or down from the same period last year. We modify the seasonal tests of Dickey, Hasza, and Fuller (1984) to investigate results for less typical, long-period cases such as 1,440 minutes per day, 52 weeks per year, 365 days per year, and so forth, obtaining some nice properties, including a surprising effect of deterministic terms in the models. An example of weekly natural gas storage numbers will be used to illustrate the results.
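
A minimal sketch of a seasonal unit-root test in PROC ARIMA, with hypothetical names; DLAG= requests the seasonal Dickey-Hasza-Fuller-style test at the stated span.

    proc arima data=gas_storage;
       identify var=storage stationarity=(adf=(0,1) dlag=52);
    run;
    quit;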


The Graph Template Language and the Statistical Graphics Procedures -- An Example-Driven Introduction
Warren Kuhfeld
Paper SDA-13

This tutorial provides a gentle, parallel, and example-driven introduction to the graph template language (GTL) and to the statistical graphics (SG) procedures. With the GTL and the SG procedures, you can easily create professional looking statistical graphics and modify the graphs that SAS automatically produces. Examples are provided of the basic graphs that are produced by the SGPLOT, SGSCATTER, and SGPANEL procedures. Most graphs are produced in at least two ways. One graph is created with the GTL, PROC TEMPLATE, and PROC SGRENDER. The other graph is created more directly with PROC SGPLOT or one of the other SG procedures. Each example provides you with prototype programs for getting started with the GTL and with the SG procedures. While you do not need to know the GTL to make many useful graphs, understanding the GTL enables you to create custom graphs that cannot be produced by the SG procedures. This tutorial also presents examples that help you safely customize elements of the default templates that SAS provides such as titles, axis labels, colors, lines, markers, ticks, grids, axes, reference lines, and legends.
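
A minimal illustration of the parallel approach, runnable as-is with the shipped sashelp.class data: the same scatter plot produced directly with PROC SGPLOT and again with a GTL template rendered by PROC SGRENDER.

    proc sgplot data=sashelp.class;
       scatter x=height y=weight;
    run;

    proc template;
       define statgraph scatter;
          begingraph;
             layout overlay;
                scatterplot x=height y=weight;
             endlayout;
          endgraph;
       end;
    run;
    proc sgrender data=sashelp.class template=scatter;
    run;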


Take a Whirlwind Tour Around SAS 9.2
Diane Hatcher
Paper SDA-14

A bevy of new SAS 9.2 capabilities is available in the platform for Business Analytics. Come and join us for a whirlwind tour of what’s new across the suite of client applications for Business Intelligence and Data Integration. We’ll start the trip with SAS Management Console, which highlights updates to the metadata architecture that will surface throughout the rest of the tour. Other featured stops include Information Map Studio, Enterprise Guide, the Add-in for Microsoft Office, Web Report Studio, and BI Dashboard in the Information Delivery Portal. We’ll even take a jaunt over to Data Integration Studio to see how user productivity can be taken even higher. Finally, we’ll wrap up the tour with some integration with Microsoft SharePoint. Experience the excitement of the new capabilities and features that integrate intelligence across your organization.


The Next Generation: SAS/STAT 9.22
Phil Gibbs
Paper SDA-15

Caught up with SAS/STAT 9.2? Want to add even more statistical tools to your arsenal? Then get ready for the 9.22 release of SAS/STAT software, which works with Base SAS 9.2 to deliver methodological advances in addition to customer-requested features.

New functionality for postprocessing equips the modern linear modeling procedures with comparable capabilities, including the latest techniques for testing complex research hypotheses. The new SURVEYPHREG procedure fits Cox regression models to survey data. The EFFECTPLOT statement uses ODS Graphics to create plots of effects from models produced by the GENMOD, LOGISTIC, and ORTHOREG procedures. Exact Poisson regression is one of several new exact methods for categorical data analysis. Other new features include updated spatial analysis capabilities, classification variable support in the ROBUSTREG procedure, and model averaging in the GLMSELECT procedure. This talk provides an overview of these exciting new enhancements to the statistical software.
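
A minimal sketch of the new EFFECTPLOT statement in one supporting procedure, with hypothetical dataset and variable names:

    proc logistic data=trial;
       class treat / param=ref;
       model response(event='1') = treat dose;
       /* Fitted probability against dose, one curve per treatment. */
       effectplot slicefit(x=dose sliceby=treat);
    run;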



Weekend Workshops


Reusable Programs and List Processing: Concepts and Development, Techniques and Tools
Ronald Fehd
Paper WW-01

This course covers the principles of writing, developing and consolidating reusable programs. The goal is to make you a more valued and productive programmer or analyst.

At first, you will be shown how to move beyond simple "cut and paste" code into reusable code that handles repetitive tasks efficiently.

Next, you will learn how to look at the process used to write code. This will allow you to develop the requirements documents usually needed but rarely provided by supervisors and clients. It will also give you insight into how large-scale applications can be broken down into smaller, repeatable tasks.


Beginning PROC REPORT
Art Carpenter
Paper WW-02

Although PROC REPORT has been available since Version 6.07, the procedure is generally underutilized. One reason is that the syntax of the procedure is unique within the SAS System. Learning the basic structure in an organized way allows the programmer to easily transition from simple to increasingly complex tables.

This seminar will show how PROC REPORT works and thinks through a series of increasingly complex examples; a minimal starter sketch appears after the list below. Examples will include:

•An introduction to the basic syntax of the PROC step
•Introduction to the COLUMN, DEFINE, COMPUTE, BREAK, and RBREAK statements
•The addition of text to headers and value descriptions
•The use of the DEFINE statement to form groups and columns
•The generation of breaks before and after groups
•The generation of breaks before and after the report
•The use of ODS with PROC REPORT, including STYLES and traffic lighting
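
As a minimal starter of the kind the seminar builds from (sashelp.class ships with SAS):

    proc report data=sashelp.class nowd;
       column sex age height;
       define sex    / group 'Gender';
       define age    / group;
       define height / analysis mean format=6.1 'Mean Height';
       break after sex / summarize;
       rbreak after / summarize;
    run;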


Using Dictionary Tables in Pharmaceutical Applications
Frank DiIorio
Paper WW-03

SDTM, ADaM and a host of other emerging standards have added complexity to the already challenging life of pharmaceutical industry programmers. These standards are typically represented as metadata that describe the attributes of deliverables such as datasets and displays.

SAS dictionary tables are another, complementary metadata source. These tables contain a wealth of information about a SAS session, describing contents of datasets and views, identifying macro variables, titles and footnotes, ODS destinations, and characteristics of external files. The tables are useful in and of themselves (think “utility macros”). And they become even more valuable to programmers who must ensure deliverables’ compliance with standards.

This workshop takes attendees on a tour of the more commonly used dictionary tables. It:

•Presents an overview of how the tables are created and maintained
•Illustrates the relationships among the tables
•Demonstrates different ways to view the tables’ contents
•Identifies usage quirks and “features”
•Gives examples of how they can be used for both generalized and pharma-specific applications


Linear, Generalized, and Nonlinear Mixed Models
David A. Dickey
Paper WW-04

In this course we first review the ideas of fixed and random effects as well as REML estimation. We review the historic approaches to such models in simple analysis of variance situations, touching on the estimation of variance components by the method of moments and their relationship to heritability calculations in genetics. We show how REML estimation allows the extension of these computations to unbalanced mixed model cases, to repeated measures, and to random coefficients models. We then discuss models in which the response is not normally distributed, conditional on the predictors. Binomial and Poisson data are leading examples of these models. In these models the mean and variance are often functions of the same parameter, for example, in the binomial case with n trials the mean number of successes is np and the variance of this number is np(1-p) so the mean and variance are linked to each other. In practice, the estimated mean and variance may depart sufficiently from that relationship to require adjustments. Such adjustments will be addressed as will the alternate strategy of changing to a more appropriate distribution when such departure from expected behavior is encountered. Some such distributions, for example the zero inflated Poisson or ZIP distribution, require nonlinear models that, in cases of interest to us, also include random effects. A ZIP model using PROC NLMIXED will be demonstrated.
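
A minimal sketch of one common NLMIXED formulation of a zero-inflated Poisson with a random intercept; this is a standard textbook parameterization, not necessarily the course's exact code, and all names are hypothetical.

    proc nlmixed data=counts qpoints=5;
       parms b0=0 b1=0 a0=-1 logs2u=0;
       eta    = b0 + b1*x + u;
       lambda = exp(eta);
       p0     = logistic(a0);  /* zero-inflation probability */
       if y = 0 then ll = log(p0 + (1 - p0)*exp(-lambda));
       else ll = log(1 - p0) - lambda + y*log(lambda) - lgamma(y + 1);
       model y ~ general(ll);
       random u ~ normal(0, exp(logs2u)) subject=cluster;
    run;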


SAS Enterprise Guide 4.2 – Getting to Know You
Andy Ravenna
Paper WW-05

The latest release of SAS Enterprise Guide provides perhaps the quickest and slickest point-and-click version yet to access and analyze your data. However, both longtime SAS users and those new to SAS often don’t know where to begin. Longtime users are accustomed to typing all of their code into the Program Editor window and hitting the Submit key. New users are puzzled about where to get started and how to ‘make something happen’, such as creating a report. This beginning tutorial introduces SAS Enterprise Guide 4.2 to both old and new users of SAS. It focuses on the key points of a typical session: creating a project, accessing your data, building a query, and producing a report. It also answers several common questions for first-time users, such as ‘Why can’t I sort my data?’ and ‘How can I copy a task from one project to another project?’ Attendees receive enough information about this release that they can return to the office with the confidence to get started with SAS Enterprise Guide.


Moving to Release 9.2 of SAS/GRAPH Software
Mike Kalt
Paper WW-06

Release 9.2 of SAS/GRAPH software contains a number of enhancements and changes from previous releases. If you have been using a previous release of SAS/GRAPH, you may notice a different (and improved) look to your graphs. You will learn how to take advantage of changes and new features in the software, use new SAS/GRAPH procedures, and ensure that SAS/GRAPH applications developed in previous releases produce the same output in Release 9.2.