SAS classes in Mumbai
How to start using SAS
The topics
- An overview of the SAS system
- Reading raw data and creating a SAS data set
- Combining SAS data sets and match-merging SAS data sets
- Formatting data
- Introducing some simple regression procedures
- Summary report procedures
Basic Screen Navigation
- Main:
  - Editor: contains the SAS program to be submitted.
  - Log: contains information about the processing of the SAS program, including any warning and error messages.
  - Output: contains reports generated by SAS procedures and DATA steps.
- Side:
  - Explorer: navigate to other objects such as libraries.
  - Results: navigate your Output window.
SAS programs
A SAS program is a sequence of steps that the user submits for execution.
- DATA steps are typically used to create SAS data sets.
- PROC steps are typically used to process SAS data sets (that is, to generate reports and graphs, edit data, sort data, and analyze data).
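As a minimal sketch of the two kinds of steps, the program below creates a small data set in a DATA step and prints it in a PROC step. The data set name work.scores, its variables, and the inline values are hypothetical and chosen only for illustration.

```sas
/* DATA step: create a temporary data set from inline data (hypothetical values) */
DATA work.scores;
    INPUT Name $ Score;
    DATALINES;
Ann 90
Bob 85
;
RUN;

/* PROC step: produce a listing report of the data set */
PROC PRINT DATA=work.scores;
    TITLE 'Scores listing';
RUN;
```

After submitting, the Log window should report the number of observations read, and the listing should appear in the Output window.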
SAS Data Libraries
- A SAS data library is a collection of SAS files that are recognized as a unit by SAS.
- A SAS data set is one type of SAS file stored in a data library.
- The Work library is a temporary library: when SAS is closed, all the data sets in the Work library are deleted. To create a permanent SAS data set, store it in your own library.
SAS Data Libraries
- Identify SAS data libraries by assigning each a library reference name (libref) with the LIBNAME statement:
    LIBNAME libref 'file-folder-location';
  E.g.: LIBNAME readData 'C:\temp\sas class\readData';
- Rules for naming a libref:
  - The name must be 8 characters or fewer.
  - The name must begin with a letter or underscore.
  - The remaining characters must be letters, numbers, or underscores.
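The difference between the temporary Work library and a permanent library can be sketched as follows. The libref mylib, the folder path, and the data set names are assumptions for illustration; the folder must already exist on your system.

```sas
/* Assign the libref mylib to an existing folder (hypothetical path; adjust to your system) */
LIBNAME mylib 'C:\temp\mylib';

/* One-level name: stored in the temporary Work library, deleted when SAS closes */
DATA scratch;
    x = 1;
RUN;

/* Two-level name (libref.name): stored permanently in the mylib folder */
DATA mylib.keep_me;
    SET scratch;
RUN;
```

A one-level name such as scratch is shorthand for work.scratch, so only the second data set survives after the SAS session ends.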
Reading a raw data set into the SAS system
To create a SAS data set from a raw data file, you must:
- Start a DATA step and name the SAS data set being created (DATA statement).
- Identify the location of the raw data file to read (INFILE statement).
- Describe how to read the data fields from the raw data file (INPUT statement).
Reading an external raw data file into the SAS system
LIBNAME readData 'C:\temp\sas class\readData';
DATA readData.wa80;
    INFILE 'k:\census\stf2_wa80.txt';
    INPUT @10 SUMRYLVL $2. @40 COUNTY $3.
          @253 TABA1 9.0 @271 TABA2 9.0;
RUN;
- The LIBNAME statement assigns the libref 'readData' to a data library.
- The DATA statement creates a permanent SAS data set named 'wa80'.
- The INFILE statement points to a raw data file.
- The INPUT statement
  - names the SAS variables
  - identifies the variables as character or numeric ($ indicates character data)
  - specifies the locations of the fields in the raw data
  - can be written as column, formatted, list, or named input
- The RUN statement marks the end of a step.
Example 1
- Reading raw data separated by spaces
/* Create a permanent SAS data set named HighLow1;
   read the data file temperature1.dat using list input */
DATA readData.HighLow1;
    INFILE 'C:\sas class\readData\temperature1.dat';
    INPUT City $ State $ NormalHigh NormalLow
          RecordHigh RecordLow;
RUN;
/* The PROC PRINT step creates a listing report of the
   readData.HighLow1 data set */
PROC PRINT DATA = readData.highlow1;
    TITLE 'High and Low Temperatures for July';
RUN;
temperature1.dat:
Nome AK 55 44 88 29
Miami FL 90 75 97 65
Raleigh NC 88 68 105 50
Example 2
- Reading multiple lines of raw data per observation
/* Read the data file using the line pointers slash (/) and pound-n (#n).
   The slash (/) moves to the next line; #n moves to the nth line
   for that observation. The slash (/) can be replaced by #2 here */
DATA readData.highlow2;
    INFILE 'C:\sas class\readData\temperature2.dat';
    INPUT City $ State $
          / NormalHigh NormalLow
          #3 RecordHigh RecordLow;
RUN;
PROC PRINT DATA = readData.highlow2;
    TITLE 'High and Low Temperatures for July';
RUN;
temperature2.dat:
Nome AK
55 44
88 29
Miami FL
90 75
97 65
Raleigh NC
88 68
105 50
Example 3
- Reading multiple observations per line of raw data
temperature3.dat:
Nome AK 55 44 88 29 Miami FL 90 75 97 65 Raleigh NC 88 68 105 50
/* To read multiple observations per line of raw data, use double trailing at
   signs (@@) at the end of the INPUT statement */
DATA readData.highlow3;
    INFILE 'C:\sas class\readData\temperature3.dat';
    INPUT City $ State $ NormalHigh NormalLow RecordHigh
          RecordLow @@;
RUN;
PROC PRINT DATA = readData.highlow3;
    TITLE 'High and Low Temperatures for July';
RUN;
Reading an external raw data file into the SAS system
- Reading raw data arranged in columns (column input):
    INPUT FILEID $ 1-5 RECTYP $ 6-9 SUMRYLVL $ 10-11
          URBARURL $ 12-13 SMSACOM $ 14-15;
- Reading raw data with mixed input styles (column and formatted input):
    INPUT FILEID $ 1-5 @10 SUMRYLVL $2. @253 TABA1 9.0
          @271 TABA2 9.0;
/* The @n is the column pointer, where n is the number of the column
   SAS should move to. The $w. informat reads standard character data, and
   w.d reads standard numeric data, where w is the total width and d
   is the number of decimal places. */
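A small self-contained sketch of mixing column input with formatted input is shown below. The data set work.fields, its variable names, and the inline rows are hypothetical; DATALINES stands in for an external file so the step can be run as is.

```sas
DATA work.fields;
    /* ID: column input (columns 1-3)
       Code: formatted input, 2 characters starting at column 5
       Amount: formatted input, 6 columns wide starting at column 8 */
    INPUT ID $ 1-3 @5 Code $2. @8 Amount 6.2;
    DATALINES;
A01 WA 123.45
B02 OR   7.50
;
RUN;

PROC PRINT DATA=work.fields;
RUN;
```

Because the raw values here already contain a decimal point, the d part of the 6.2 informat does not rescale them; it only applies when the field has no explicit decimal point.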
Reading Delimited or PC Database Files with the IMPORT Procedure
- If your data file has the proper extension, use the simplest form of the IMPORT procedure:
    PROC IMPORT DATAFILE = 'filename' OUT = data-set;
    RUN;

Type of File                           Extension          DBMS Identifier
Comma-delimited                        .csv               CSV
Tab-delimited                          .txt               TAB
Excel                                  .xls               EXCEL
Lotus files                            .wk1, .wk3, .wk4   WK1, WK3, WK4
Delimiters other than commas or tabs                      DLM

- Examples:
    1. PROC IMPORT DATAFILE='c:\temp\sale.csv' OUT=readData.money; RUN;
    2. PROC IMPORT DATAFILE='c:\temp\bands.xls' OUT=readData.music; RUN;
Reading Files with the IMPORT Procedure
- If your file does not have the proper extension, or it uses a delimiter other than commas or tabs, you must use the DBMS= option and the DELIMITER= statement:
    PROC IMPORT DATAFILE = 'filename' OUT = data-set
                DBMS = identifier;
        DELIMITER = 'delimiter-character';
    RUN;
- Example:
    PROC IMPORT DATAFILE = 'C:\sas class\readData\import2.txt'
                OUT = readData.sasfile DBMS = DLM;
        DELIMITER = '&';
    RUN;
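To try the delimiter form without an existing file, the sketch below first writes a tiny '&'-delimited file to a temporary fileref and then imports it. The fileref demo, the data set work.scores, and the file contents are hypothetical, and the sketch assumes your SAS installation accepts a fileref in DATAFILE=.

```sas
/* Create a temporary file to import (hypothetical contents) */
FILENAME demo TEMP;
DATA _NULL_;
    FILE demo;
    PUT 'Name&Score';
    PUT 'Ann&90';
    PUT 'Bob&85';
RUN;

/* DBMS=DLM with DELIMITER= tells PROC IMPORT how the fields are separated */
PROC IMPORT DATAFILE=demo OUT=work.scores DBMS=DLM REPLACE;
    DELIMITER='&';
    GETNAMES=YES;   /* first line holds the variable names */
RUN;

PROC PRINT DATA=work.scores;
RUN;
```

The single quotes around the PUT strings keep the ampersand from being treated as a macro variable reference.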
Format in SAS data set
- Standard formats (selected):
  - Character: $w.
  - Date, time, and datetime: DATEw., MMDDYYw., TIMEw.d, ...
  - Numeric: COMMAw.d, DOLLARw.d, ...
- Use the FORMAT statement:
    PROC PRINT DATA=sales;
        VAR Name DateReturned CandyType Profit;
        FORMAT DateReturned DATE9. Profit DOLLAR6.2;
    RUN;
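A runnable sketch of the same FORMAT statement, assuming a small sales data set built from inline data. The rows are hypothetical; only the variable names come from the slide.

```sas
DATA work.sales;
    /* :MMDDYY10. reads the date text into a SAS date value (hypothetical rows) */
    INPUT Name $ DateReturned :MMDDYY10. CandyType $ Profit;
    DATALINES;
Adriana 03/21/2012 MP 17.5
Nathan 03/25/2012 CD 9.25
;
RUN;

PROC PRINT DATA=work.sales;
    VAR Name DateReturned CandyType Profit;
    /* Without FORMAT, DateReturned would print as a raw day count */
    FORMAT DateReturned DATE9. Profit DOLLAR6.2;
RUN;
```

The FORMAT statement inside PROC PRINT changes only how the values are displayed in this report; the stored values are unchanged.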
Format in SAS data set
- Create your own custom formats in two steps:
  - Create the format using PROC FORMAT and the VALUE statement.
  - Assign the format to the variable using the FORMAT statement.
- General form of a simple PROC FORMAT step:
    PROC FORMAT;
        VALUE name range-1='formatted-text-1'
                   range-2='formatted-text-2' ...;
    RUN;
- The name in the VALUE statement is the name of the format you are creating. It cannot be longer than eight characters in older SAS releases (SAS 9 allows longer names) and must not start or end with a number. If the format is for character data, the name must start with a $.
Format in SAS data set
Example:
/* Step 1: Create the formats for certain variables */
PROC FORMAT;
    VALUE genFmt 1 = 'Male'
                 2 = 'Female';
    VALUE money
        low-<25000  = 'Less than 25,000'
        25000-50000 = '25,000 to 50,000'
        50000<-high = 'More than 50,000';
    VALUE $codeFmt
        'FLTA1'-'FLTA3'   = 'Flight Attendant'
        'PILOT1'-'PILOT3' = 'Pilot';
RUN;
/* Step 2: Assign the formats to the variables */
DATA fmtData.crew1;
    SET fmtData.crew;
    FORMAT Gender genFmt. Salary money. JobCode $codeFmt.;
RUN;
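The two steps can be tried end to end with a self-contained sketch like the one below. The data set work.crew and its two rows are hypothetical stand-ins for fmtData.crew, which is not shown in the slides.

```sas
/* Step 1: define a numeric format and a character format */
PROC FORMAT;
    VALUE genFmt 1 = 'Male'
                 2 = 'Female';
    VALUE $codeFmt 'FLTA1'-'FLTA3'   = 'Flight Attendant'
                   'PILOT1'-'PILOT3' = 'Pilot';
RUN;

/* Hypothetical crew records */
DATA work.crew;
    INPUT Gender JobCode $;
    DATALINES;
1 PILOT2
2 FLTA1
;
RUN;

/* Step 2: apply the formats when printing */
PROC PRINT DATA=work.crew;
    FORMAT Gender genFmt. JobCode $codeFmt.;
RUN;
```

Note the trailing period when a format is used (genFmt., $codeFmt.), which is absent when the format is defined in the VALUE statement.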
Format in SAS data set
- Permanently store formats in a SAS catalog by:
  - Creating a format catalog with the LIB= option in the PROC FORMAT statement
  - Setting the format search option (FMTSEARCH=)
- Example:
    LIBNAME fmtData 'C:\sas class\Format';
    OPTIONS FMTSEARCH=(fmtData.fmtvalue);
    PROC FORMAT LIB=fmtData.fmtvalue;
        VALUE genFmt 1 = 'Male' 2 = 'Female';
    RUN;
Combining SAS Data Sets: Concatenating and Interleaving
- Use the SET statement in a DATA step to concatenate SAS data sets.
- Use the SET and BY statements in a DATA step to interleave SAS data sets.
Combining SAS Data Sets: Concatenating and Interleaving
- General form of a DATA step concatenation:
    DATA SAS-data-set;
        SET SAS-data-set1 SAS-data-set2 ...;
    RUN;
- Example:
    DATA stack.allEmp;
        SET stack.emp1 stack.emp2 stack.emp3;
    RUN;
Combining SAS Data Sets: Concatenating and Interleaving
- General form of a DATA step interleave:
    DATA SAS-data-set;
        SET SAS-data-set1 SAS-data-set2 ...;
        BY BY-variable;
    RUN;
- Sort all SAS data sets by the BY variable first, using PROC SORT.
- Example:
    PROC SORT DATA=stack.emp2 OUT=stack.emp2_sorted; BY Salary;
    RUN;
    /* emp1 and emp3 are assumed to be sorted by Salary already */
    DATA stack.allEmp;
        SET stack.emp1 stack.emp2_sorted stack.emp3;
        BY Salary;
    RUN;
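The difference between concatenating and interleaving can be seen with two tiny data sets, as in the sketch below. The data set names and rows are hypothetical; both inputs are entered already sorted by Salary, so no PROC SORT is needed here.

```sas
DATA work.emp1;
    INPUT Name $ Salary;
    DATALINES;
Ann 30000
Cy 50000
;
RUN;

DATA work.emp2;
    INPUT Name $ Salary;
    DATALINES;
Bob 40000
Dee 60000
;
RUN;

/* Concatenate: all of emp1, then all of emp2 */
DATA work.stacked;
    SET work.emp1 work.emp2;
RUN;

/* Interleave: observations are taken in Salary order across both inputs */
DATA work.interleaved;
    SET work.emp1 work.emp2;
    BY Salary;
RUN;
```

In work.stacked the order is Ann, Cy, Bob, Dee; in work.interleaved it is Ann, Bob, Cy, Dee.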
Match-Merging SAS Data Sets
- One-to-one match merge
- One-to-many match merge
- Many-to-many match merge
- The SAS statements for all three types of match merge are identical and take the following form:
    DATA new-data-set;
        MERGE data-set-1 data-set-2 data-set-3 ...;
        BY by-variable(s); /* indicates the variable(s) that control
                              which observations to match */
    RUN;
Merging SAS Data Sets: A More Complex Example
- Example: Merge two data sets to acquire the names of the group team that is scheduled to fly next week.

combData.employee
EmpID   LastName
E00632  Strauss
E01483  Lee
E01996  Nick
E04064  Waschk

combData.groupsched
EmpID   FlightNum
E04064  5105
E00632  5250
E01996  5501

/* To match-merge the data sets by the common variable EmpID, the data sets
   must be ordered by EmpID */
PROC SORT DATA=combData.groupsched;
    BY EmpID;
RUN;
Merging SAS Data Sets: A More Complex Example
/* Simply merge the two data sets */
DATA combData.nextweek;
    MERGE combData.employee combData.groupsched;
    BY EmpID;
RUN;

EmpID   LastName  FlightNum
E00632  Strauss   5250
E01483  Lee
E01996  Nick      5501
E04064  Waschk    5105
Merging SAS Data Sets: A More Complex Example
- Eliminating nonmatches: use the IN= data set option to determine which data set(s) contributed to the current observation.
- General form of the IN= data set option:
    SAS-data-set (IN=variable)
- variable is a temporary numeric variable that has two possible values:
  - 0 indicates that the data set did not contribute to the current observation.
  - 1 indicates that the data set did contribute to the current observation.
Merging SAS Data Sets: A More Complex Example
/* Keep only the employees who are scheduled to fly next week. */
LIBNAME combData 'K:\sas class\merge';
DATA combData.nextweek;
    MERGE combData.employee
          combData.groupsched (IN=InSched);
    BY EmpID;
    IF InSched=1;  /* true only when groupsched contributed */
RUN;

EmpID   LastName  FlightNum
E00632  Strauss   5250
E01996  Nick      5501
E04064  Waschk    5105
Merging SAS Data Sets: A More Complex Example
/* Find employees who are not in the flight scheduled group. */
LIBNAME combData 'K:\sas class\merge';
DATA combData.nextweek;
    MERGE combData.employee (IN=InEmp)
          combData.groupsched (IN=InSched);
    BY EmpID;
    IF InEmp=1;    /* InEmp is true: employee contributed */
    IF InSched=0;  /* InSched is false: groupsched did not contribute */
RUN;

EmpID   LastName  FlightNum
E01483  Lee
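The IN= examples can be reproduced without the K: drive by entering the two tables inline, as in this sketch. The Work data set names are assumptions, FlightNum is read as character for simplicity, and routing rows with OUTPUT into two data sets is a variation on the slides' subsetting IF, not the slides' own method.

```sas
DATA work.employee;
    INPUT EmpID $ LastName $;
    DATALINES;
E00632 Strauss
E01483 Lee
E01996 Nick
E04064 Waschk
;
RUN;

DATA work.groupsched;
    INPUT EmpID $ FlightNum $;
    DATALINES;
E04064 5105
E00632 5250
E01996 5501
;
RUN;

/* MERGE with BY requires both inputs to be ordered by EmpID */
PROC SORT DATA=work.groupsched;
    BY EmpID;
RUN;

/* Route matches and nonmatches to separate data sets in one pass */
DATA work.scheduled work.unscheduled;
    MERGE work.employee   (IN=InEmp)
          work.groupsched (IN=InSched);
    BY EmpID;
    IF InEmp AND InSched THEN OUTPUT work.scheduled;
    ELSE IF InEmp THEN OUTPUT work.unscheduled;
RUN;
```

With the rows above, work.scheduled holds Strauss, Nick, and Waschk, and work.unscheduled holds Lee, matching the two result tables on the slides.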
Different Types of Merges in SAS
- One-to-Many Merging
    DATA work.three;
        MERGE work.one work.two;
        BY X;
    RUN;

Work.one
X  Y
1  A
2  B
3  C

Work.two
X  Z
1  A1
1  A2
2  B1
3  C1
3  C2

Work.three
X  Y  Z
1  A  A1
1  A  A2
2  B  B1
3  C  C1
3  C  C2
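The one-to-many tables above can be recreated and merged with the following sketch. The values are the ones shown on the slide; reading them with DATALINES is an assumption made so the example is self-contained.

```sas
DATA work.one;
    INPUT X Y $;
    DATALINES;
1 A
2 B
3 C
;
RUN;

DATA work.two;
    INPUT X Z $;
    DATALINES;
1 A1
1 A2
2 B1
3 C1
3 C2
;
RUN;

/* Each Y value from work.one is carried across every matching row of work.two */
DATA work.three;
    MERGE work.one work.two;
    BY X;
RUN;

PROC PRINT DATA=work.three NOOBS;
RUN;
```

Both inputs are entered in X order, so no sorting step is needed before the merge.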
Different Types of Merges in SAS
- Many-to-Many Merging
    DATA work.three;
        MERGE work.one work.two;
        BY X;
    RUN;

Work.one
X  Y
1  A1
1  A2
2  B1
2  B2

Work.two
X  Z
1  AA1
1  AA2
1  AA3
2  BB1
2  BB2

Work.three
X  Y   Z
1  A1  AA1
1  A2  AA2
1  A2  AA3
2  B1  BB1
2  B2  BB2
Some simple regression analysis procedures
- The REG procedure
- The LOGISTIC procedure
The REG procedure
- The REG procedure is one of many regression procedures in the SAS System.
- The REG procedure allows several MODEL statements and gives additional regression diagnostics, especially for detection of collinearity. It also creates plots of model summary statistics and regression diagnostics.
- General form of a PROC REG step:
    PROC REG <options>;
        MODEL dependents=independents </options>;
        PLOT <yvariable*xvariable>;
    RUN;
An example
PROC REG DATA=water;
    MODEL Water = Temperature Days Persons / VIF;
    MODEL Water = Temperature Production Days / VIF;
RUN;

PROC REG DATA=water;
    MODEL Water = Temperature Production Days;
    PLOT STUDENT.*PREDICTED.;
    PLOT STUDENT.*NPP.;
    PLOT NPP.*r.;
    PLOT r.*NQQ.;
RUN;
The LOGISTIC procedure
- Binary or ordinal responses with continuous independent variables:
    PROC LOGISTIC <options>;
        MODEL dependents=independents </options>;
    RUN;
- Binary or ordinal responses with categorical independent variables:
    PROC LOGISTIC <options>;
        CLASS categorical-variables </options>;
        MODEL dependents=independents </options>;
    RUN;
Example 
PROC LOGISTIC data=Neuralgia; 
CLASS Treatment Sex; 
MODEL Pain= Treatment Sex Treatment*Sex Age Duration; 
RUN;
Overview of Summary Report Procedures
- PROC FREQ: produces frequency counts
- PROC TABULATE: produces one- and two-dimensional tabular reports
- PROC REPORT: produces flexible detail and summary reports
The FREQ Procedure
- The FREQ procedure displays frequency counts of the data values in a SAS data set.
- General form of a simple PROC FREQ step:
    PROC FREQ DATA = SAS-data-set;
        TABLE SAS-variables </options>;
    RUN;
The FREQ Procedure
- Example:
    PROC FREQ DATA = class.crew;
        FORMAT JobCode $codefmt. Salary money.;
        TABLE JobCode*Salary / NOCOL NOROW OUT = freqTable;
    RUN;
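Since class.crew is not shown in the slides, the sketch below builds a small hypothetical stand-in and cross-tabulates JobCode by Location instead of by formatted Salary. The data set names and all rows are assumptions for illustration.

```sas
DATA work.crew;
    /* :$12. lets Location hold more than the default 8 characters */
    INPUT JobCode $ Location :$12.;
    DATALINES;
PILOT1 Cary
PILOT2 Frankfurt
FLTA1 Cary
FLTA2 Cary
;
RUN;

PROC FREQ DATA=work.crew;
    /* Two-way table without row and column percentages;
       the cell counts are also saved to work.freqTable */
    TABLES JobCode*Location / NOCOL NOROW OUT=work.freqTable;
RUN;
```

The OUT= data set contains one row per combination with its COUNT and PERCENT, which is convenient for further processing.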
The TABULATE Procedure
- PROC TABULATE displays descriptive statistics in tabular format.
- General form of a simple PROC TABULATE step:
    PROC TABULATE DATA=SAS-data-set;
        CLASS class-variables;
        VAR analysis-variables;
        TABLE row-expression,
              column-expression </options>;
    RUN;
The TABULATE Procedure
- Example:
    TITLE 'Average Salary for Cary and Frankfurt';
    PROC TABULATE DATA=class.crew FORMAT=dollar12.;
        WHERE Location IN ('Cary','Frankfurt');
        CLASS Location JobCode;
        VAR Salary;
        TABLE JobCode, Location*Salary*mean;
    RUN;
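A self-contained variant of the TABULATE example is sketched below, using a hypothetical work.crew data set in place of class.crew. The salaries and job codes are made up for illustration.

```sas
DATA work.crew;
    INPUT JobCode $ Location :$12. Salary;
    DATALINES;
PILOT1 Cary 72000
PILOT1 Frankfurt 68000
FLTA1 Cary 31000
FLTA1 Frankfurt 29000
FLTA1 Cary 33000
;
RUN;

TITLE 'Average Salary by Job Code and Location';
PROC TABULATE DATA=work.crew FORMAT=DOLLAR12.;
    CLASS Location JobCode;   /* grouping variables */
    VAR Salary;               /* analysis variable */
    /* Rows: JobCode; columns: mean Salary within each Location */
    TABLE JobCode, Location*Salary*MEAN;
RUN;
```

The comma in the TABLE statement separates the row dimension from the column dimension, and the asterisk nests Salary and its MEAN statistic under each Location.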
The REPORT procedure
- The REPORT procedure combines features of the PRINT, MEANS, and TABULATE procedures.
- It enables you to:
  - create listing reports
  - create summary reports
  - enhance reports
  - request separate subtotals and grand totals
The REPORT procedure
- Example:
    PROC REPORT DATA=class.crew NOWD HEADLINE HEADSKIP;
        COLUMN JobCode Location Salary;
        DEFINE JobCode / GROUP WIDTH=8 'Job Code';
        DEFINE Location / GROUP 'Home Base';
        DEFINE Salary / FORMAT=dollar10. 'Average Salary' MEAN;
        RBREAK AFTER / SUMMARIZE DOL;
    RUN;


Sas classes in mumbai

  • 2. How to start using SAS
  • 3. The topics  An overview of the SAS system  Reading raw data/ create SAS data set  Combining SAS data sets & Match merging SAS Data Sets  Formatting data  Introduce some simple regression procedure  Summary report procedures
  • 4. Basic Screen Navigation  Main:  Editor contains the SAS program to be submitted.  Log contains information about the processing of the SAS program, including any warning and error messages  Output contains reports generated by SAS procedures and DATA steps  Side:  Explore navigate to other objects like libraries  Results navigate your Output window
  • 5. SAS programs A SAS program is a sequence of steps that the user submits for execution. Data steps are typically used to create SAS data sets PROC steps are typically used to process SAS data sets (that is, generate reports and graphs, edit data, sort data and analyze data
  • 6. SAS Data Libraries  A SAS data library is a collection of SAS files that are recognized as a unit by SAS  A SAS data set is one type of SAS file stored in a data library  Work library is temporary library, when SAS is closed, all the datasets in the Work library are deleted; create a permanent SAS dataset via your own library.
  • 7. SAS Data Libraries  Identify SAS data libraries by assigning each a library reference name (libref) with LIBNAME statement LIBNAME libref “file-folder-location”; Eg: LIBNAME readData 'C:tempsas classreadData‘;  Rules for naming a libref:  The name must be 8 characters or less  The name must begin with a letter or underscore  The remaining characters must be letters, numbers or underscores.
  • 8. Reading raw data set into SAS system  In order to create a SAS data set from a raw data file, you must  Start a DATA step and name the SAS data set being created (DATA statement)  Identify the location of the raw data file to read (INFILE statement)  Describe how to read the data fields from the raw data file (INPUT statement)
  • 9. Reading external raw data file into SAS system LIBNAME readData 'C:tempsas classreadData‘; DATA readData.wa80; INFILE “k:censusstf2_wa80.txt”; INPUT @10 SUMRYLVL $2. @40 COUNTY $3. @253 TABA1 9.0 @271 TABA1 9.0; RUN;  The LIBNAME statement assigns a libref ‘readData ’ to a data library.  The DATA statement creates a permanent SAS data set named ‘wa80’.  The INFILE statement points to a raw data file.  The INPUT statement - name the SAS variables - identify the variables as character or numeric ($ indicates character data) - specify the locations of the fields in the raw data - can be specified as column, formatted, list, or named input  The RUN statement detects the end of a step
  • 10. Example 1  Reading raw data separated by spaces /* Create a SAS permanent data set named HighLow1; Read the data file temperature1.dat using listing input */ DATA readData.HighLow1; INFILE ‘C:sas classreadDatatemperature1.dat’; INPUT City $ State $ NormalHigh NormalLow RecordHigh RecordLow; RUN; /* The PROC PRINT step creates a isting report of the readData.HighLow1 data set */ PROC PRINT DATA = readData.highlow1; TITLE ‘High and Low Temperatures for July’; RUN; temperature1.dat: Nome AK 55 44 88 29 Miami FL 90 75 97 65 Raleign NC 88 68 105 50
  • 11. Example 2  Reading multiple lines of raw data per observation /* Read the data file using line pointer, slash(/) and pount-n (#n). The slash(/) indicates next line, the #n means to go to the n line for that observation. Slash(/) can be replaced by #2 here */ DATA readData.highlow2; INFILE ‘C:sas classreadDatatemperature2.dat’; INPUT City $ State $ / NormalHigh NormalLow #3 RecordHigh RecordLow; PROC PRINT DATA = readData.highlow2; TITLE ‘High and Low Temperatures for July’; RUN; temperature2.dat: Nome AK 55 44 88 29 Miami FL 90 75 97 65 Raleign NC 88 68 105 50
  • 12. Example 3  Reading multiple observations per line of raw data temperature3.dat: Nome AK 55 44 88 29 Miami FL 90 75 97 65 Raleign NC 88 68 105 50 /* To read multiple observations per line of raw data,use double railing at signs (@@) at the end of INPUT statement */ DATA readData.highlow3; INFILE ‘C:sas classreadDatatemperature3.dat’; INPUT City $ State $ NormalHigh NormalLow RecordHigh RecordLow @@; PROC PRINT DATA = readData.highlow3; TITLE ‘High and Low Temperatures for July’; RUN;
  • 13. Reading external raw data file into SAS system  Reading raw data arranged in columns INPUT FILEID $ 1-5 RECTYP $ 6-9 SUMRYLVL $ 10-11 URBARURL $ 12-13 SMSACOM $ 14-15;  Reading raw data mixed in columns INPUT FILEID $ 1-5 @10 SUMRYLVL $ 2. @253 TABA1 9.0 @271 TABA1 9.0; /* The @n is the column pointer, where n is the number of the column SAS should move to. The $w. reads standard character data, and w.d reads standard numeric data, where w is the total width and d is the number of decimal places. */
  • 14. Reading Delimited or PC Database Files with the IMPORT Procedure  If your data file has the proper extension, use the simplest form of the IMPORT procedure: PROC IMPORT DATA FILE = ‘filename’ OUT = data-set Type of File Extension DBMS Identifier Comma-delimited .csv CSV Tab-delimited .txt TAB Excel .xls EXCEL Lotus Files .wk1, .wk3, .wk4 WK1,WK3,WK4 Delimiters other than commas or tabs DLM  Examples: 1. PROC IMPORT DATAFILE=‘c:tempsale.csv’ OUT=readData.money; RUN; 2. PROC IMPORT DATAFILE=‘c:tempbands.xls’ OUT=readData.music; RUN;
  • 15. Reading Files with the IMPORT Procedure  If your file does not have the proper extension, or your file is of type with delimiters other than commas or tabs, then you must use the DBMS= and DELIMITER= option PROC IMPORT DATAFILE = ‘filename’ OUT = data-set DBMS = identifier; DELIMITER = ‘delimiter-character’; RUN;  Example: PROC IMPORT DATAFILE = ‘C:sas classreadDataimport2.txt’ OUT =readData.sasfile DBMS =DLM; DELIMITER = ‘&’; RUN;
  • 16. Format in SAS data set  Standard Formats (selected):  Character: $w.  Date, Time and Datetime: DATEw., MMDDYYw., TIMEw.d, ……  Numeric: COMMAw.d, DOLLARw.d, ……  Use FORMAT statement PROC PRINT DATA=sales; VAR Name DateReturned CandyType Profit; FORMAT DateReturned DATE9. Profit DOLLAR 6.2; RUN;
  • 17. Format in SAS data set  Create your own custom formats with two steps:  Create the format using PROC FORMAT and VALUE statement.  Assign the format to the variable using FORMAT statement.  General form of a simple PROC FORMAT steps: PROC FORMAT; VALUE name range-1=‘formatted-text-1’ range-2=‘formatted-text-2’ ……; RUN;  The name in VALUE statement is the name of the format you are creating, which can’t be longer than eight characters, must not start or end with a number. If the format is for character data, it must start with a $.
  • 18. Format in SAS data set Exmaple: /* Step1: Create the format for certain variables */ PROC FORMAT; VALUE genFmt 1 = 'Male' 2 = 'Female'; VALUE money low-<25000='Less than 25,000' 25000-50000='25,000 to 50,000' 50000<-high='More than 50,000'; VALUE $codeFmt 'FLTA1'-'FLTA3'='Flight Attendant' 'PILOT1'-'PILOT3'='Pilot'; RUN; /* Step2: Assign the variables */ DATA fmtData.crew1; SET fmtData.crew; FORMAT Gender genFmt. Salary money. JobCode $codeFmt.; RUN;
  • 19. Format in SAS data set  Permanently store formats in a SAS catalog by  Creating a format catalog file with LIB in PROC FORMAT statement  Setting the format search options  Example: LIBNAME class ‘C:sas classFormat’; OPTIONS FMTSEARCH=(fmtData.fmtvalue); RUN; PROC FORMAT LIB=fmtData.fmtvalue; VALUE genFmt 1 = ‘Male’ 2=‘Female’; RUN;
  • 20. Combining SAS Data Sets: Concatenating and Interleaving  Use the SET statement in a DATA step to concatenate SAS data sets.  Use the SET and BY statements in a DATA step to interleave SAS data sets.
  • 21. Combining SAS Data Sets: Concatenating and Interleaving  General form of a DATA step concatenation:  DATA SAS-data-set; SET SAS-data-set1 SAS-data-set2 …; RUN;  Example: DATA stack.allEmp; SET stack.emp1 stack.emp2 stack.emp3; RUN;
  • 22. Combining SAS Data Sets: Concatenating and Interleaving  General form of a DATA step interleave:  DATA SAS-data-set; SET SAS-data-set1 SAS-data-set2 …; BY BY-variable; RUN;  Sort all SAS data set first by using PROC SORT  Example: PROC SORT data=stack.emp2 OUT=stack.emp2_sorted; BY Salary; RUN; DATA stack.allEmp; SET stack.emp1 stack.emp2 stack.emp3; BY salary; RUN;
  • 23. Match-Merging SAS Data Sets  One-to-one match merge One-to-many match merge Many-to-many match merge  The SAS statements for all three types of match merge are identical in the following form: DATA new-data-set; MERGE data-set-1 data-set-2 data-set-3 …; BY by-variable(s); /* indicates the variable(s) that control which observations to match */ RUN;
  • 24. Merging SAS Data Sets: A More Complex Example  Example: Merge two data sets acquire the names of the group team that is scheduled to fly next week. combData.employee combData.groupsched EmpID LastName E00632 Strauss E01483 Lee E01996 Nick E04064 Waschk /* To match-merge the data sets by common variables - EmpID, the data sets must be ordered by EmpID */ PROC SORT data=combData.Groupsched; BY EmpID; RUN; EmpID FlightNum E04064 5105 E0632 5250 E01996 5501
  • 25. Merging SAS Data Sets: A More Complex Example /* simply merge two data sets */ DATA combData.nextweek; MERGE combData.employee combData.groupsched; BY EmpID; RUN; EmpID LastJName FlightNum E00632 Strauss 5250 E01483 Lee E01996 Nick 5501 E04064 Waschk 5105
  • 26. Merging SAS Data Sets: A More Complex Example  Eliminating Nonmatches Use the IN= data set option to determine which dataset(s) contributed to the current observation.  General form of the IN=data set option: SAS-data-set (IN=variable)  Variable is a temporary numeric variable that has two possible values:  0 indicates that the data set did not contribute to the current observation.  1 indicates that the data set did contribute to the current observation.
  • 27. Merging SAS Data Sets: A More Complex Example /*Exclude from the data set employee who are scheduled to fly next week. */ LIBNAME combData “K:sas classmerge”; DATA combData.nextweek; MERGE combData.employee combData.groupsched (in=InSched); BY EmpID; IF InSched=1; True RUN; EmpID LastJName FlightNum E00632 Strauss 5250 E01996 Nick 5501 E04064 Waschk 5105
  • 28. Merging SAS Data Sets: A More Complex Example /* Find employees who are not in the flight scheduled group. */ LIBNAME combData “K:sas classmerge”; DATA combData .nextweek; MERGE combData .employee (in=InEmp) combData.groupsched (in=InSched); BY EmpID; IF InEmp=1; True IF InSched=0; False RUN; EmpID LastJName FlightNum E01483 Lee
  • 29. Different Types of Merges in SAS  One-to-Many Merging DATA work.three; MERGE work.one work.two; BY X; RUN; X Y 1 A 2 B 3 C Work.two X E 1 A1 1 A2 2 B1 3 C1 3 C2 Work.three X Y Z 1 A A1 1 A A2 2 B B1 3 C C1 3 C C2 Work.one
  • 30. Different Types of Merges in SAS  Many-to-Many Merging DATA work.three; MERGE work.one work.two; BY X; RUN; X Y 1 A1 1 A2 2 B1 2 B2 Work.two X Z 1 AA1 1 AA2 1 AA3 2 BB1 2 BB2 Work.three X Y Z 1 A1 AA1 1 A2 AA2 1 A2 AA3 2 B1 BB1 2 B2 BB2 Work.one
  • 31. Some simple regression analysis procedures
    - The REG Procedure
    - The LOGISTIC Procedure
  • 32. The REG procedure
    - The REG procedure is one of many regression procedures in the SAS System.
    - The REG procedure allows several MODEL statements and gives additional regression diagnostics, especially for detection of collinearity. It also creates plots of model summary statistics and regression diagnostics.
    - General form:
      PROC REG <options>;
        MODEL dependents=independents </options>;
        PLOT <yvariable*xvariable>;
      RUN;
  • 33. An example
    PROC REG DATA=water;
      MODEL Water = Temperature Days Persons / VIF;
      MODEL Water = Temperature Production Days / VIF;
    RUN;

    PROC REG DATA=water;
      MODEL Water = Temperature Production Days;
      PLOT STUDENT.*PREDICTED.;
      PLOT STUDENT.*NPP.;
      PLOT NPP.*r.;
      PLOT r.*NQQ.;
    RUN;
  • 34. The LOGISTIC procedure
    - For binary or ordinal responses with continuous independent variables:
      PROC LOGISTIC <options>;
        MODEL dependents=independents </options>;
      RUN;
    - For binary or ordinal responses with categorical independent variables:
      PROC LOGISTIC <options>;
        CLASS categorical-variables </options>;
        MODEL dependents=independents </options>;
      RUN;
  • 35. Example
    PROC LOGISTIC data=Neuralgia;
      CLASS Treatment Sex;
      MODEL Pain = Treatment Sex Treatment*Sex Age Duration;
    RUN;
  • 36. Overview: Summary Report Procedures
    - PROC FREQ: produces frequency counts
    - PROC TABULATE: produces one- and two-dimensional tabular reports
    - PROC REPORT: produces flexible detail and summary reports
  • 37. The FREQ Procedure
    - The FREQ procedure displays frequency counts of the data values in a SAS data set.
    - General form of a simple PROC FREQ step:
      PROC FREQ DATA=SAS-data-set;
        TABLE SAS-variables </options>;
      RUN;
  • 38. The FREQ Procedure
    Example:
    PROC FREQ DATA=class.crew;
      FORMAT JobCode $codefmt. Salary money.;
      TABLE JobCode*Salary / NOCOL NOROW OUT=freqTable;
    RUN;
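The FREQ example above applies two formats, $codefmt. and money., which appear to be user-defined and must exist before the step runs. A minimal sketch of how such formats could be defined with PROC FORMAT follows. The job codes, labels, and salary ranges are illustrative assumptions, not values taken from the slides.

```sas
/* Hypothetical format definitions: the codes, labels, and salary ranges
   below are assumptions made for illustration only. */
PROC FORMAT;
   /* Character format: the name must start with a dollar sign. */
   VALUE $codefmt
      'PILOT' = 'Pilot'
      'FLTAT' = 'Flight Attendant'
      OTHER   = 'Other';
   /* Numeric format: groups salaries into three bands. */
   VALUE money
      LOW -< 25000   = 'Less than 25,000'
      25000 -< 50000 = '25,000 to under 50,000'
      50000 - HIGH   = '50,000 or more';
RUN;
```

Because PROC FREQ groups observations by their formatted values, applying a banded format like this to Salary yields one table column per salary band rather than one per distinct salary.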
  • 39. The TABULATE Procedure
    - PROC TABULATE displays descriptive statistics in tabular format.
    - General form of a simple PROC TABULATE step:
      PROC TABULATE DATA=SAS-data-set;
        CLASS class-variables;
        VAR analysis-variables;
        TABLE row-expression, column-expression </options>;
      RUN;
  • 40. The TABULATE Procedure
    Example:
    TITLE 'Average Salary for Cary and Frankfurt';
    PROC TABULATE DATA=class.crew FORMAT=dollar12.;
      WHERE Location IN ('Cary','Frankfurt');
      CLASS Location JobCode;
      VAR Salary;
      TABLE JobCode, Location*Salary*mean;
    RUN;
  • 41. The REPORT procedure
    - The REPORT procedure combines features of the PRINT, MEANS, and TABULATE procedures.
    - It enables you to
      - create listing reports
      - create summary reports
      - enhance reports
      - request separate subtotals and grand totals
  • 42. The REPORT procedure
    Example:
    PROC REPORT DATA=class.crew nowd HEADLINE HEADSKIP;
      COLUMN JobCode Location Salary;
      DEFINE JobCode / GROUP WIDTH=8 'Job Code';
      DEFINE Location / GROUP 'Home Base';
      DEFINE Salary / FORMAT=dollar10. 'Average Salary' MEAN;
      RBREAK AFTER / SUMMARIZE DOL;
    RUN;

Editor's Notes

  • #7: - SAS uses data libraries to store data sets. - You can think of a SAS data library as a drawer in a filing cabinet and a SAS data set as one of the file folders in the drawer. - The Work library is temporary. When SAS is closed, all the data sets in the Work library are deleted. If you want to save a data set to continue working with it later, create a permanent SAS data set via a library.
  • #10: You identify SAS data libraries by assigning each a library reference name (libref). The name must be 8 characters or less, must begin with a letter or underscore and the remaining characters must be letters, numbers, or underscores.
  • #13: When you have multiple observations per line of raw data, you can use double trailing at signs (@@) at the end of your INPUT statement.
  • #18: Create your own custom formats when you use a lot of coded data. Formats can remind you of the meaning behind each category. Note that formats do not change the actual value of the variable, just how it is displayed.
  • #19: If the format is for character data, its name must start with a $.
  • #34: The keywords NPP. and NQQ. can be used with any of the preceding variables to construct normal P-P or Q-Q plots.
  • #35: The LOGISTIC procedure handles binary responses (for example, success and failure) and ordinal responses (for example, normal, mild, and severe).
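Note #13 above describes the double trailing at sign. A minimal sketch of how it reads several observations from each raw data line follows; the data set name work.scores and the values are made up for illustration.

```sas
/* Each raw line holds several (Name, Score) pairs; the values are made up.
   The trailing @@ holds the current line across iterations of the DATA
   step, so each pair on the line becomes its own observation. */
DATA work.scores;
   INPUT Name $ Score @@;
   DATALINES;
Ann 90 Bob 85 Cara 78
Dev 92 Eli 88
;
RUN;

PROC PRINT DATA=work.scores;
RUN;
```

Without the @@, SAS would read only the first pair on each line and then move to the next line, giving two observations instead of five.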