SAS Tutorial — Session 6

Objectives

  1. Use the SET statement to read observations from an existing data set

  2. Use the IF statement to subset the data set

  3. Use the PROC SORT statement to reorder observations

SET statement

    Use: To read observations from a data set
    Syntax: SET dataset_name;
    Result: The data is read for further processing

The SET statement has several uses, but the most common use is to read observations from one or more SAS data sets for further processing in the DATA step. For example:

    DATA NEW;
    SET OLD;
    RUN;

reads all of the observations from data set OLD and copies them into data set NEW. This copy can then be manipulated or subset without changing the original data set.
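
As a minimal sketch of this pattern, the program below builds a small data set inline and then copies it with SET, adding one computed variable to the copy. The data set names OLD and NEW come from the example above; the variables ID, HEIGHT_IN, and HEIGHT_CM and the data values are made up for illustration.

```sas
/* Create a small data set to copy (values are illustrative only) */
DATA OLD;
    INPUT ID HEIGHT_IN;
    DATALINES;
1 60
2 72
3 66
;
RUN;

/* Copy OLD into NEW and add a computed variable. OLD is unchanged. */
DATA NEW;
    SET OLD;
    HEIGHT_CM = HEIGHT_IN * 2.54;
RUN;

/* List the copy to confirm the new variable is present */
PROC PRINT DATA=NEW;
RUN;
```

Printing OLD afterwards would show only ID and HEIGHT_IN, since the DATA step writes to NEW and never modifies its input.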

IF statement

    Use: To subset, or take a portion of the data set
    Syntax: IF var_name = somevalue;
    Result: The data set is subset according to the specified criteria.

The subsetting IF statement enables the user to run a procedure on a selected portion of their data. The general format of the IF statement is:

    IF var_name = value;

Example:

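    DATA MALE;
    SET SURVEY;
    IF SEX= 1;
    IF ETHNICITY='B';
    RUN;

The above example keeps only the observations where SEX equals 1 and ETHNICITY equals 'B'. The SET statement supplies the observations to test (here from a data set assumed to be named SURVEY); without it, the DATA step has no input to subset.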

Notice that values for character variables must be enclosed in quotes and they must match exactly. Case matters.
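
Two subsetting IF statements in a row can also be written as one condition joined with AND. The sketch below assumes an existing data set, here called SURVEY (the name used in the exercises below), that contains SEX and ETHNICITY; the output names BOTH and ANYCASE are hypothetical. The second step uses the UPCASE function so that 'b' and 'B' are treated alike.

```sas
/* One combined condition, equivalent to two separate subsetting IFs */
DATA BOTH;
    SET SURVEY;
    IF SEX = 1 AND ETHNICITY = 'B';
RUN;

/* Case-insensitive match: compare the upper-cased value */
DATA ANYCASE;
    SET SURVEY;
    IF UPCASE(ETHNICITY) = 'B';
RUN;
```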

PROC SORT statement

    Use: To reorder cases according to the values of one or more variables
    Syntax: PROC SORT; BY var_name;
    Result: The data is sorted in order of the values of var_name.

PROC SORT reorders the cases in a SAS data set based on the values of one or more variables. You can sort in ascending or descending order. By default PROC SORT sorts in ascending order. BY variables can be numeric or character. For example,

    PROC SORT; BY GENDER DESCENDING AGE;

sorts cases first in ascending order of GENDER; within each GENDER category, 1 (Male) and 2 (Female), the cases are then sorted in descending order of AGE. The DESCENDING keyword applies only to the variable that immediately follows it.
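
A slightly fuller sketch of the same sort, assuming a data set named SURVEY that contains GENDER and AGE. The OUT= option, which writes the sorted observations to a separate data set and leaves the input in its original order, and the output name SORTED are additions for illustration.

```sas
/* Sort by GENDER (ascending), then by AGE (descending) within GENDER */
PROC SORT DATA=SURVEY OUT=SORTED;
    BY GENDER DESCENDING AGE;
RUN;

/* List the two sort variables to check the new order */
PROC PRINT DATA=SORTED;
    VAR GENDER AGE;
RUN;
```

Without OUT=, PROC SORT replaces the input data set with the sorted version.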

PROC SORT is used most often for sorting a data set so that other SAS procedures can process that data set in subsets using BY statements. Data sets must also be sorted before they can be merged or updated.
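
The sort-before-merge requirement might look like the sketch below. It is only an illustration: the second data set SCORES, the key variable ID, and the output name COMBINED are hypothetical and do not appear elsewhere in this tutorial.

```sas
/* Sort both data sets by the key before a match-merge */
PROC SORT DATA=SURVEY;
    BY ID;
RUN;

PROC SORT DATA=SCORES;
    BY ID;
RUN;

/* Match-merge: observations with the same ID are combined */
DATA COMBINED;
    MERGE SURVEY SCORES;
    BY ID;
RUN;
```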

Example:

    PROC SORT DATA=SURVEY; BY GENDER;
    PROC FREQ; BY GENDER;
    TABLES Q1;
    RUN;

This example first sorts the SAS data set called SURVEY in ascending order of the values of the variable GENDER, 1 for males and 2 for females. Then a frequency distribution of the variable Q1 is performed for males and females separately.

Note: By default, procedures work on the most recently created dataset, unless a DATA= option specifies another dataset.
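
Because the default depends on the order in which data sets were created, naming the data set explicitly can make a program easier to follow. A sketch of the preceding example with DATA= on both steps (TABLES is the PROC FREQ statement that names the variables to tabulate):

```sas
/* Sort SURVEY by the BY variable first */
PROC SORT DATA=SURVEY;
    BY GENDER;
RUN;

/* One frequency table of Q1 per value of GENDER */
PROC FREQ DATA=SURVEY;
    BY GENDER;
    TABLES Q1;
RUN;
```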

Session 6 Exercises

Exercise 6.1

  1. Edit (w/Pico) your SAS program, survey.sas.

  2. First we will create a new SAS data set containing only the males from the original dataset. So delete the PROC FREQ from the bottom of your program below the data and add a DATA statement in its place by typing:

      DATA MALES;

  3. Copy the original data set, SURVEY, into your new data set, MALES, by typing:

      SET SURVEY;

  4. Add a subsetting IF statement after the SET statement to select only the males in the data set (SEX=1) by typing:

      IF SEX= 1;

  5. Add a PRINT procedure to check your new all-male data set by typing:

      PROC PRINT DATA=MALES;

  6. Be sure to save your work. At this point your program should look like this:

  7. Now, submit your SAS program file at the Linux ($) prompt. When it has finished (and the $ prompt returns), check the .log file for errors and warnings and the .lst file for the output.
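
Steps 2 through 5 add the following lines to the end of survey.sas. This is a sketch of the new lines only; the earlier part of the program, which reads the data and creates SURVEY, is not shown here, and the RUN statements are an addition that marks the end of each step.

```sas
/* New data set containing only the males from SURVEY */
DATA MALES;
    SET SURVEY;
    IF SEX = 1;
RUN;

/* List the observations to confirm that every row has SEX = 1 */
PROC PRINT DATA=MALES;
RUN;
```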

Exercise 6.2

This time we will do a separate PROC PRINT for males and females without first creating separate data sets with a subsetting IF statement as we did in Exercise 6.1.

  1. Edit (w/Pico) your SAS program, survey.sas.

  2. Delete the entire DATA Step at the end of your program, where you created the MALES data set. Leave the PROC PRINT, but remove the DATA= option.

  3. Sort your SURVEY data set by the variable SEX by typing the following above the PROC PRINT:

      PROC SORT; BY SEX;

  4. Add a BY statement to your PROC PRINT telling it to do the procedure once for each value of the BY variable by typing:

      PROC PRINT; BY SEX;

  5. Be sure to save your work. At this point your program should look like this:

  6. Now, submit your SAS program file at the Linux ($) prompt. When it has finished (and the $ prompt returns), check the .log file for errors and warnings and the .lst file for the output.
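
After steps 2 through 4, the end of survey.sas might look like the sketch below. Only the closing lines are shown; the earlier part of the program that creates SURVEY is assumed to be unchanged, and the RUN statements are an addition that marks the end of each step.

```sas
/* Sort SURVEY (the most recently created data set) by SEX */
PROC SORT;
    BY SEX;
RUN;

/* Print one listing for each value of SEX */
PROC PRINT;
    BY SEX;
RUN;
```

The sort has to come first: PROC PRINT with a BY statement expects the observations to be in BY-variable order already.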

© University of New Mexico -- last updated -- comments to: docs@unm.edu