Saturday, July 13, 2013

Different Uses of SUM in SAS

There are at least four different types of SUM I have run across in SAS:

  • In PROC PRINT
    • The SUM statement, which causes PROC PRINT to display totals for the specified variable.
  • In the DATA step
    • The sum statement, which adds a specified expression to an accumulator variable.  The value of the accumulator variable is retained through each iteration of the DATA step.  The sum statement treats missing values as 0.
    • The + operator.  This adds two numeric values, but if either value is missing the expression evaluates to missing.
    • The SUM function.  This takes the form SUM(val1, val2, ...) and returns the sum of the values, ignoring missing values (in effect treating them as zero).  If every argument is missing, the result is missing.
In PROC PRINT, the statement SUM variable-name prints the total of that variable below the listing, for example:

  proc print data=groceries;  /* dataset name is hypothetical */
    var item;
    sum price;
  run;

Obs    Item     Price
  1    Milk         3
  2    Tofu         4
  3    Bread        5
                =====
                   12
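
To try this yourself, the three rows in the listing can be loaded with a short DATA step first. This is only a minimal sketch: the dataset name groceries and the input layout are assumptions, not something taken from the original program.

```sas
/* Hypothetical dataset name and layout, matching the listing above */
data groceries;
  input Item $ Price;
  datalines;
Milk 3
Tofu 4
Bread 5
;
run;

proc print data=groceries;
  var Item;
  sum Price;   /* adds a total line for Price under the listing */
run;
```

With these three rows, the total printed under Price should be 12 (3 + 4 + 5).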

On the other hand, in a DATA step, you have the sum statement, which has the form variable + expression.  This initializes variable to 0, retains its value across iterations, and increments it by expression in each iteration of the DATA step.
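
As a rough illustration of the sum statement, here is a small sketch that keeps a running total. The dataset names prices and running and the variable Total are made up for this example; one Price is deliberately missing to show that the accumulator skips it.

```sas
/* Hypothetical input: the second Price is missing */
data prices;
  input Price;
  datalines;
3
.
5
;
run;

data running;
  set prices;
  Total + Price;   /* sum statement: Total starts at 0 and is retained */
run;

proc print data=running;
run;
```

Total should come out as 3, 3 and 8 on the three rows: the missing Price leaves the running total unchanged instead of turning it into a missing value.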

The SUM function and the + operator give the same result, except when a value is missing.  The statements

  MyVar = SUM(Var1,Var2);
  MyVar = Var1 + Var2;

produce the same result unless Var1 or Var2 is missing.  If, for example, Var2 is missing, then the SUM function returns the value of Var1, while the + operator returns a missing value.  This is particularly confusing because the sum statement also uses the + symbol, yet the sum statement treats missing values as zeros.
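
The difference is easiest to see side by side. The following is a minimal sketch; the dataset name compare and the variable names FromFunction, FromOperator, AllMissing and Accumulator are invented for the illustration.

```sas
data compare;
  Var1 = 3;
  Var2 = .;                         /* missing value */
  FromFunction = sum(Var1, Var2);   /* 3: the missing value is ignored */
  FromOperator = Var1 + Var2;       /* . : missing propagates */
  AllMissing   = sum(Var2, .);      /* . : no nonmissing argument to add */
  Accumulator + Var2;               /* sum statement: stays at 0 */
  Accumulator + Var1;               /* sum statement: now 3 */
  put FromFunction= FromOperator= AllMissing= Accumulator=;
run;
```

The log line written by PUT should read FromFunction=3 FromOperator=. AllMissing=. Accumulator=3. SAS also writes a note to the log saying that missing values were generated, which is a useful hint that a + somewhere met a missing value.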