There are at least four different types of SUM I have run across in SAS:
- In PROC PRINT
  - The SUM statement, which causes PROC PRINT to display totals for the specified variable.
- In the DATA step
  - The sum statement, which adds a specified expression to an accumulator variable. The value of the accumulator variable is retained across iterations of the DATA step. The sum statement treats missing values as 0.
  - The + operator. This adds two numeric values, but if either value is missing the expression evaluates to missing.
  - The SUM function. This takes the form SUM(val1, val2, ...) and returns the sum of the values, treating missing values as zero.
In PROC PRINT, SUM variable-name prints the total of that variable at the bottom of its column, for example:
PROC PRINT DATA=work.grocery;
var item;
sum price;
run;
| Obs | Item  | Price |
|-----|-------|-------|
| 1   | Milk  | 3     |
| 2   | Tofu  | 4     |
| 3   | Bread | 5     |
|     |       | 12    |
On the other hand, in a DATA step, you have the sum statement variable+expression. This initializes variable to 0 and increments it by expression in each iteration of the DATA step.
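As a minimal sketch of the sum statement, the step below builds a running total of Price. The dataset work.running and the variable RunningTotal are names made up for this illustration, and the DATALINES step simply recreates a grocery dataset like the one printed earlier.

```sas
/* Hypothetical input: recreates the grocery data from the PROC PRINT example */
data work.grocery;
    input Item $ Price;
    datalines;
Milk 3
Tofu 4
Bread 5
;
run;

/* RunningTotal is an illustrative name, not from the original example */
data work.running;
    set work.grocery;
    RunningTotal + Price;   /* sum statement: starts at 0, retained across iterations */
run;

proc print data=work.running;
run;
```

With this input, RunningTotal should be 3, 7, and 12 on the three observations, because the accumulator keeps its value from one iteration to the next.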
The SUM function and + operator have the same results, except in the case of a missing value.
MyVar = SUM(Var1,Var2);
MyVar = Var1 + Var2;
These two statements produce the same result unless Var1 or Var2 is missing. If, for example, Var2 is missing, then the SUM function returns the value of Var1, while the + operator returns a missing value. This is particularly confusing because the sum statement also uses the + symbol, but the sum statement treats missing values as zeros.
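To see the three behaviors side by side, here is a small sketch that writes its results to the log. The variable names FromFunction, FromOperator, and Accumulator are invented for this example.

```sas
data _null_;
    Var1 = 5;
    Var2 = .;                          /* missing value */

    FromFunction = sum(Var1, Var2);    /* SUM function ignores the missing value */
    FromOperator = Var1 + Var2;        /* + operator propagates the missing value */

    Accumulator + Var2;                /* sum statement: missing is treated as 0 */
    Accumulator + Var1;                /* sum statement: adds 5 */

    put FromFunction= FromOperator= Accumulator=;
run;
```

The log should show FromFunction=5 FromOperator=. Accumulator=5. SAS also writes a note to the log that missing values were generated by the + operator line.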