9 SASmarkdown

This section shows how to write SAS code inside an R Markdown workflow using the SASmarkdown package. The goal is to keep one reproducible document (HTML/PDF) while allowing SAS to do what SAS is good at: data steps, PROC routines, reporting, and macros.

Two ideas drive the design:

  1. knitr engines: a code chunk can be executed by an external program (SAS) instead of R.
  2. chunk isolation: in most SASmarkdown workflows, each SAS chunk runs as its own SAS “session script,” so objects created in one chunk may not automatically exist in another chunk (unless you explicitly write/read datasets).

This is why SASmarkdown feels different from pure SAS Studio and also different from pure R Markdown.


9.1 How to install SASmarkdown

The SASmarkdown package and documentation are here:
sasmarkdown

The typical setup steps are:

  • Install and load SASmarkdown in R.
  • Tell knitr where SAS is installed (the sas.exe path).
  • Set SAS options (log, line size, batch mode).
  • Set the chunk engine to a SAS engine such as sas, saslog, sashtml, etc.

9.1.1 Use an engine

The first chunk loads the package. You can also set global knitr options here, but the key is that SASmarkdown must be available.

# knitr::opts_chunk$set(echo = TRUE)
require(SASmarkdown)
## Loading required package: SASmarkdown
## Warning: package 'SASmarkdown' was built under R version 4.4.3
## SAS found at C:/Program Files/SASHome/SASFoundation/9.4/sas.exe
## SAS engines are now ready to use.

Next, we define the SAS executable path and options, then tell knitr to use sashtml as the default engine.

What this means operationally:

  • knitr writes your SAS code to a temporary .sas file,
  • calls sas.exe in batch mode,
  • captures the HTML output produced by SAS,
  • embeds that output into your knitted document.
saspath <- "C:/Program Files/SASHome/SASFoundation/9.4/sas.exe"
sasopts <- "-nosplash -ls 75"
knitr::opts_chunk$set(engine='sashtml', engine.path=saspath,
                      engine.opts=sasopts, comment="")

Common troubleshooting notes (Windows): - If the path contains spaces, it must still be a valid Windows path string. - If you have multiple SAS versions, confirm the correct sas.exe. - Some corporate environments block batch execution; SASmarkdown may fail unless permissions allow it.


9.2 Common statements

Two practical rules matter most when writing SAS inside R Markdown:

  1. Do not use R syntax inside SAS chunks
    SAS chunks are sent to SAS verbatim. R objects like mydata, paste(), or %>% have no meaning there.

  2. Data typically cannot be used across chunks unless you persist it
    Many SASmarkdown setups run each SAS chunk independently. If you create data test; ... run; in one chunk, the next chunk may not “see” work.test unless the session is preserved or you write it out to a permanent library.

A safe habit is: - treat each SAS chunk as self-contained, or - write datasets to a permanent location/library and read them back later.


9.2.1 Read in data using informats (date data)

9.2.1.1 Example 1: a minimal DATA step with inline data

This example uses cards; ... ; to create a dataset directly.
It is the fastest way to create toy data for teaching, testing, or demonstrations.

/*import and export raw data using infile*/
data test;

/*infile "c:/document/data.text";*/
input name $ height weight;
length height 4;
cards 
;
Daniel 173 150
;
run;
proc print;run;
                     Obs     name     height    weight

                      1     Daniel      173       150 

Key points: - input name $ height weight; declares name as character ($) and the others numeric. - length height 4; sets storage length (bytes) for numeric variables (mainly useful for character variables; numeric is typically 8 bytes by default in SAS). - proc print; confirms what was read.


9.2.1.2 Example 2: reading mixed date strings using inputn()

Real-world data often stores dates as text in inconsistent formats.
Here we read two strings: one containing the date text and one containing the informat name. Then we convert dynamically using inputn().

data have;
input char_date : $20. fmt : $20.;
/* set a new variable format*/
num_char=inputn(char_date,fmt);
format num_char date9. ;
cards;
12-08-2015  DDMMYY12.
12/8/2016   MMDDYY12.
05/25/2015  MMDDYY12.
;
run;
proc print;run;
                Obs    char_date        fmt        num_char

                 1     12-08-2015    DDMMYY12.    12AUG2015
                 2     12/8/2016     MMDDYY12.    08DEC2016
                 3     05/25/2015    MMDDYY12.    25MAY2015

What’s happening: - char_date is a character date string. - fmt is a character variable holding an informat name (like DDMMYY12.). - inputn(char_date, fmt) converts using the informat stored in fmt. - format num_char date9.; controls display (e.g., 25MAY2015).

This pattern is extremely useful in messy ingestion pipelines.


9.2.1.3 Example 3: date informat directly in INPUT

If the date format is consistent, you can apply the informat in the input statement directly:

data have;
input char_date MMDDYY10.  ;
/* output format */
format char_date date9. ;
cards;
12-08-2015   
12/8/2016   
05/25/2015   
;
run;
proc print;run;
                             Obs    char_date

                              1     08DEC2015
                              2     08DEC2016
                              3     25MAY2015

Here: - SAS reads a date value using MMDDYY10. regardless of separator style. - The stored value is numeric (SAS date = days since 01JAN1960), but displayed via date9..


9.2.1.4 Example 4: setting both input and output format consistently

Sometimes you want to keep the same “look” as the input:

data have;
/* set input and output format*/
input char_date MMDDYY10.  ;
format char_date MMDDYY10. ;
cards;
12-08-2015   
12/8/2016   
05/25/2015   
;
run;
proc print;run;
                             Obs     char_date

                              1     12/08/2015
                              2     12/08/2016
                              3     05/25/2015

This helps when your deliverable expects a specific format (for example, when exporting to CSV to be read by another system).


9.2.2 Compute mean and frequency

A typical exploratory workflow in SAS uses proc means, proc freq, and proc contents.

proc means data=sashelp.class maxdec=5 ;
run;

proc freq data=sashelp.class  ;
table sex;
run;

proc contents data=sashelp.class varnum  ;
run;
                            The MEANS Procedure

 Variable    N           Mean        Std Dev        Minimum        Maximum
 -------------------------------------------------------------------------
 Age        19       13.31579        1.49267       11.00000       16.00000
 Height     19       62.33684        5.12708       51.30000       72.00000
 Weight     19      100.02632       22.77393       50.50000      150.00000
 -------------------------------------------------------------------------
 
                                                                           
 
                            The FREQ Procedure

                                         Cumulative    Cumulative
         Sex    Frequency     Percent     Frequency      Percent
         --------------------------------------------------------
         F             9       47.37             9        47.37  
         M            10       52.63            19       100.00  
 
                                                                           
 
                          The CONTENTS Procedure

  Data Set Name        SASHELP.CLASS             Observations          19
  Member Type          DATA                      Variables             5 
  Engine               V9                        Indexes               0 
  Created              09/06/2017 21:55:32       Observation Length    40
  Last Modified        09/06/2017 21:55:32       Deleted Observations  0 
  Protection                                     Compressed            NO
  Data Set Type                                  Sorted                NO
  Label                Student Data                                      
  Data Representation  WINDOWS_64                                        
  Encoding             us-ascii  ASCII (ANSI)                            

                     Engine/Host Dependent Information

Data Set Page Size          65536                                          
Number of Data Set Pages    1                                              
First Data Page             1                                              
Max Obs per Page            1632                                           
Obs in First Data Page      19                                             
Number of Data Set Repairs  0                                              
ExtendObsCounter            YES                                            
Filename                    C:\Program                                     
                            Files\SASHome\SASFoundation\9.                 
                            4\core\sashelp\class.sas7bdat                  
Release Created             9.0401M5                                       
Host Created                X64_SR12R2                                     
Owner Name                  BUILTIN\Administrators                         
File Size                   128KB                                          
File Size (bytes)           131072                                         

                        Variables in Creation Order
 
                       #    Variable    Type    Len

                       1    Name        Char      8
                       2    Sex         Char      1
                       3    Age         Num       8
                       4    Height      Num       8
                       5    Weight      Num       8

Interpretation: - proc means summarizes numeric variables (mean, std, min, max, etc.). - proc freq tabulates categorical variables. - proc contents varnum shows dataset metadata ordered by variable position.

These are often the first three procs you run when you open a new dataset.


9.2.3 Sort a dataset

Sorting is required for many BY-group operations (including merges and BY processing).

proc sort data=sashelp.class Out= name ;
by name;
Run;

proc print data= name (obs=10) ;
run;
             Obs    Name       Sex    Age    Height    Weight

               1    Alfred      M      14     69.0      112.5
               2    Alice       F      13     56.5       84.0
               3    Barbara     F      13     65.3       98.0
               4    Carol       F      14     62.8      102.5
               5    Henry       M      14     63.5      102.5
               6    James       M      12     57.3       83.0
               7    Jane        F      12     59.8       84.5
               8    Janet       F      15     62.5      112.5
               9    Jeffrey     M      13     62.5       84.0
              10    John        M      12     59.0       99.5

Note: - The output dataset name name is legal but can be confusing because name is also a variable in sashelp.class. - A safer convention is out=class_sorted, but we keep your code unchanged.


9.2.4 Transpose or reshape

Reshaping appears frequently in reporting and longitudinal data workflows.

9.2.4.1 Wide to long

proc transpose converts columns into rows.

proc transpose data= sashelp.class out= Field;
by name;
Run;

proc print data= field (obs=10) ;
run;
                     Obs     Name      _NAME_     COL1

                       1    Alfred     Age        14.0
                       2    Alfred     Height     69.0
                       3    Alfred     Weight    112.5
                       4    Alice      Age        13.0
                       5    Alice      Height     56.5
                       6    Alice      Weight     84.0
                       7    Barbara    Age        13.0
                       8    Barbara    Height     65.3
                       9    Barbara    Weight     98.0
                      10    Carol      Age        14.0

A critical detail: - BY processing requires sorted data by name (or SAS will warn/error depending on settings).


9.2.4.2 Long to wide

A second transpose can rebuild a wide layout, typically using: - id to name new columns, - var to specify the values to spread.

proc transpose data= sashelp.class out= Field;
by name;
Run;

proc transpose data=Field out=Field_wide  ;
    by name ;
    id _name_;
    var col1;
run;

proc print data= field_wide (obs=10) ;
run;
            Obs    Name       _NAME_    Age    Height    Weight

              1    Alfred      COL1      14     69.0      112.5
              2    Alice       COL1      13     56.5       84.0
              3    Barbara     COL1      13     65.3       98.0
              4    Carol       COL1      14     62.8      102.5
              5    Henry       COL1      14     63.5      102.5
              6    James       COL1      12     57.3       83.0
              7    Jane        COL1      12     59.8       84.5
              8    Janet       COL1      15     62.5      112.5
              9    Jeffrey     COL1      13     62.5       84.0
             10    John        COL1      12     59.0       99.5

Conceptually: - The first transpose creates a “longer” dataset with _name_ and col1. - The second transpose pivots _name_ levels into columns.


9.2.5 Conditional statements

Conditional logic is fundamental in data steps: creating flags, categories, derived variables, and messages.

9.2.5.1 Simple IF/ELSE

DATA ab;
set sashelp.class;
IF sex="F" then message='A is greater';
Else message='B is greater';
Run; 

proc print data=ab (obs=10);
run;
     Obs    Name       Sex    Age    Height    Weight      message

       1    Alfred      M      14     69.0      112.5    B is greater
       2    Alice       F      13     56.5       84.0    A is greater
       3    Barbara     F      13     65.3       98.0    A is greater
       4    Carol       F      14     62.8      102.5    A is greater
       5    Henry       M      14     63.5      102.5    B is greater
       6    James       M      12     57.3       83.0    B is greater
       7    Jane        F      12     59.8       84.5    A is greater
       8    Janet       F      15     62.5      112.5    A is greater
       9    Jeffrey     M      13     62.5       84.0    B is greater
      10    John        M      12     59.0       99.5    B is greater

Even though the message text is arbitrary, the example demonstrates: - IF/ELSE structure, - character literals in quotes, - adding derived variables in a data step.


9.2.5.2 Multiple conditions (ELSE IF)

Data class;
set sashelp.class;
if weight >=100 then weight_cat =3;
else if weight >=90 and weight <100 then weight_cat =2 ;
else weight_cat =1;
attrib weight_cat label = "weight category,  1=Inad, 2=Adeq, 3=Exces"; 
run;

proc print data=class (obs=3);
run;
                                                            weight_
        Obs     Name      Sex    Age    Height    Weight      cat

          1    Alfred      M      14     69.0      112.5       3   
          2    Alice       F      13     56.5       84.0       1   
          3    Barbara     F      13     65.3       98.0       2   

This is a standard pattern for category creation: - ordered thresholds, - mutually exclusive categories, - labeling for readable tables.


9.2.5.3 index() for pattern detection

data test;
set sashelp.class;
length gender $10 ;
if index(sex,"M")>0 THEN gender="male";
if index(sex,"F")>0 THEN gender="female";
RUN;

proc print data=test (obs=5);
run;
        Obs     Name      Sex    Age    Height    Weight    gender

          1    Alfred      M      14     69.0      112.5    male  
          2    Alice       F      13     56.5       84.0    female
          3    Barbara     F      13     65.3       98.0    female
          4    Carol       F      14     62.8      102.5    female
          5    Henry       M      14     63.5      102.5    male  

Note: - index() returns the position of a substring (0 if not found). - Because you used two separate IF statements (not IF/ELSE), the logic is still fine here because sex can’t be both “M” and “F”.


9.2.6 LIKE operation to select rows containing a pattern

SAS where supports SQL-like pattern matching using like with % wildcards.

proc print data =sashelp.class;
where upcase(name) like '%A%';
run;

proc print data =sashelp.class;
where  (name) like '%A%';
run;
             Obs    Name       Sex    Age    Height    Weight

               1    Alfred      M      14     69.0      112.5
               2    Alice       F      13     56.5       84.0
               3    Barbara     F      13     65.3       98.0
               4    Carol       F      14     62.8      102.5
               6    James       M      12     57.3       83.0
               7    Jane        F      12     59.8       84.5
               8    Janet       F      15     62.5      112.5
              14    Mary        F      15     66.5      112.0
              17    Ronald      M      15     67.0      133.0
              18    Thomas      M      11     57.5       85.0
              19    William     M      15     66.5      112.0
 
                                                                           
 
              Obs     Name     Sex    Age    Height    Weight

                1    Alfred     M      14     69.0      112.5
                2    Alice      F      13     56.5       84.0

The first query is case-insensitive by forcing uppercase.
The second is case-sensitive, depending on your SAS session settings.


9.2.7 Change format of a variable

User-defined formats map raw values into labeled categories (often for tables).

proc format;
   value AGEnew
        11 = '1: NEW' 
        12 = '2: NEW'
        13 = '3: NEW' 
        14 = '4: NEW'
        15 = '5: NEW'
        16 = '6: NEW'        
        ;
        run;
        
DATA ab;
set sashelp.class;
Format AGE AGEnew.;
Run;
        
proc freq data=ab;
        table AGE;
        run;
        
                            The FREQ Procedure

                                           Cumulative    Cumulative
           Age    Frequency     Percent     Frequency      Percent
        -----------------------------------------------------------
        1: NEW           2       10.53             2        10.53  
        2: NEW           5       26.32             7        36.84  
        3: NEW           3       15.79            10        52.63  
        4: NEW           4       21.05            14        73.68  
        5: NEW           4       21.05            18        94.74  
        6: NEW           1        5.26            19       100.00  

This workflow is typical: 1) define format in proc format,
2) apply format in a data step,
3) display with proc freq.


9.2.8 Basic operations

SAS provides many numeric functions directly in the data step.

Data Mathdata;
A= 10.12345;
B=20;
C= mean (a,b);
D= Min(a,b);
E= Max(a,b); 
F = log(a);
G= round(a,0.02);
H= floor(a );

Run;

proc print data=Mathdata;
run;

Data mathdata;
Set  sashelp.Iris;
Sum = sum (of SepalLength, SepalWIDTH);
Diff = SepalLength- SepalWIDTH;
Mult = SepalLength* SepalWIDTH;
Div= SepalLength/ SepalWIDTH;
Run;

proc print data=mathdata (obs=10);
run;
Obs       A        B       C          D        E       F         G       H

 1     10.1235    20    15.0617    10.1235    20    2.31485    10.12    10
 
                                                                           
 
                  Sepal  Sepal   Petal  Petal
   Obs  Species  Length  Width  Length  Width  Sum  Diff  Mult    Div

     1  Setosa     50      33     14      2     83   17   1650  1.51515
     2  Setosa     46      34     14      3     80   12   1564  1.35294
     3  Setosa     46      36     10      2     82   10   1656  1.27778
     4  Setosa     51      33     17      5     84   18   1683  1.54545
     5  Setosa     55      35     13      2     90   20   1925  1.57143
     6  Setosa     48      31     16      2     79   17   1488  1.54839
     7  Setosa     52      34     14      2     86   18   1768  1.52941
     8  Setosa     49      36     14      1     85   13   1764  1.36111
     9  Setosa     44      32     13      2     76   12   1408  1.37500
    10  Setosa     50      35     16      6     85   15   1750  1.42857

Two common teaching points: - SAS functions like mean() handle missing values differently than plain arithmetic. - sum(of ...) is a very SAS-specific idiom and scales well when many variables are involved.


9.2.9 Rename variables

9.2.9.1 Rename one variable

 Data AB;
set sashelp.class; Rename AGE= AGENEW;
Run;

proc print data=AB;
run;
            Obs    Name       Sex    AGENEW    Height    Weight

              1    Alfred      M       14       69.0      112.5
              2    Alice       F       13       56.5       84.0
              3    Barbara     F       13       65.3       98.0
              4    Carol       F       14       62.8      102.5
              5    Henry       M       14       63.5      102.5
              6    James       M       12       57.3       83.0
              7    Jane        F       12       59.8       84.5
              8    Janet       F       15       62.5      112.5
              9    Jeffrey     M       13       62.5       84.0
             10    John        M       12       59.0       99.5
             11    Joyce       F       11       51.3       50.5
             12    Judy        F       14       64.3       90.0
             13    Louise      F       12       56.3       77.0
             14    Mary        F       15       66.5      112.0
             15    Philip      M       16       72.0      150.0
             16    Robert      M       12       64.8      128.0
             17    Ronald      M       15       67.0      133.0
             18    Thomas      M       11       57.5       85.0
             19    William     M       15       66.5      112.0

9.2.9.2 Rename multiple variables programmatically

This advanced example shows how to build a rename list from dictionary.columns and apply it.

2          /*rename*/
3          data AAA;
4           set sashelp.class;
5          run;

NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.AAA has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.00 seconds
      

6          
7          /*create a list*/
8          proc sql noprint;
9             select cats(name,'=',name,'_new')
10                   into : lis
11                   separated by ' '
12                   from dictionary.columns
13                   where libname = 'WORK' and memname = 'AAA';
14         quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds
      

15         %put &lis;
Name=Name_new Sex=Sex_new Age=Age_new Height=Height_new Weight=Weight_new
16         
17         proc datasets library = work nolist;
18            modify AAA;
19            rename &lis;
NOTE: Renaming variable Name to Name_new.
NOTE: Renaming variable Sex to Sex_new.
NOTE: Renaming variable Age to Age_new.
NOTE: Renaming variable Height to Height_new.
NOTE: Renaming variable Weight to Weight_new.
20         quit;

NOTE: MODIFY was successful for WORK.AAA.DATA.
NOTE: PROCEDURE DATASETS used (Total process time):
      real time           0.03 seconds
      cpu time            0.00 seconds
      

21         
22         proc print data=aaa (obs=2);
23         run;

NOTE: There were 2 observations read from the data set WORK.AAA.
NOTE: The PROCEDURE PRINT printed page 1.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
      

24         
25         /*function to deal with text*/
26         %scan(&lis,2);
NOTE: Line generated by the macro function "SCAN".
26                          Sex=Sex_new
                            ___
                            180

ERROR 180-322: Statement is not valid or it is used out of proper order.

27         /*#scan variable*/
28         %substr(NAME, %length(NAME),1);

NOTE: Line generated by the macro function "SUBSTR".
28              E
                _
                180
ERROR 180-322: Statement is not valid or it is used out of proper order.
29         /*#extract character*/
30         %substr(NAME, 2,1);

NOTE: Line generated by the macro function "SUBSTR".
30             A
               _
               180
ERROR 180-322: Statement is not valid or it is used out of proper order.
31         %index(&lis,"Age=Age_new") ;

NOTE: Line generated by the macro function "INDEX".
31              0
                _
                180
ERROR 180-322: Statement is not valid or it is used out of proper order.
32         /*#identify this exists  */

ERROR: Errors printed on page 1.
                                                 Height_    Weight_
        Obs    Name_new    Sex_new    Age_new      new        new

          1     Alfred        M          14        69.0      112.5 
          2     Alice         F          13        56.5       84.0 

Why this matters: - In real clinical trial pipelines, you often receive datasets with inconsistent naming conventions. - Generating rename rules programmatically reduces manual error and improves reproducibility.


9.2.10 Text manipulation

9.2.10.1 Extract text from a character value

Data Mathdata;
Text = "Hello World";
Text1= substr(Text, 6, 2);
Run;

proc print data=mathdata  ;
run;
                        Obs       Text        Text1

                         1     Hello World      W  

substr() is one of the most-used string functions in clinical reporting.


9.2.10.2 Convert character to numeric and reverse

Character → numeric:

Data ABC;
set sashelp.class;
agenew= input (age, best.); 
Run;

proc print data=abc  ;
run;
        Obs    Name       Sex    Age    Height    Weight    agenew

          1    Alfred      M      14     69.0      112.5      14  
          2    Alice       F      13     56.5       84.0      13  
          3    Barbara     F      13     65.3       98.0      13  
          4    Carol       F      14     62.8      102.5      14  
          5    Henry       M      14     63.5      102.5      14  
          6    James       M      12     57.3       83.0      12  
          7    Jane        F      12     59.8       84.5      12  
          8    Janet       F      15     62.5      112.5      15  
          9    Jeffrey     M      13     62.5       84.0      13  
         10    John        M      12     59.0       99.5      12  
         11    Joyce       F      11     51.3       50.5      11  
         12    Judy        F      14     64.3       90.0      14  
         13    Louise      F      12     56.3       77.0      12  
         14    Mary        F      15     66.5      112.0      15  
         15    Philip      M      16     72.0      150.0      16  
         16    Robert      M      12     64.8      128.0      12  
         17    Ronald      M      15     67.0      133.0      15  
         18    Thomas      M      11     57.5       85.0      11  
         19    William     M      15     66.5      112.0      15  

Numeric → character:

Data ABC;
set sashelp.class;
agenew= put (age, best.); 
Run;

proc contents data=abc  ;
run;
                          The CONTENTS Procedure

Data Set Name        WORK.ABC                      Observations          19
Member Type          DATA                          Variables             6 
Engine               V9                            Indexes               0 
Created              02/20/2026 16:44:41           Observation Length    48
Last Modified        02/20/2026 16:44:41           Deleted Observations  0 
Protection                                         Compressed            NO
Data Set Type                                      Sorted                NO
Label                                                                      
Data Representation  WINDOWS_64                                            
Encoding             wlatin1  Western (Windows)                            

                     Engine/Host Dependent Information

Data Set Page Size          65536                                          
Number of Data Set Pages    1                                              
First Data Page             1                                              
Max Obs per Page            1361                                           
Obs in First Data Page      19                                             
Number of Data Set Repairs  0                                              
ExtendObsCounter            YES                                            
Filename                    C:\Users\hed2\AppData\Local\Temp\SAS Temporary 
                            Files\_TD19308_HDW02234493_\abc.sas7bdat       
Release Created             9.0401M5                                       
Host Created                X64_10PRO                                      
Owner Name                  NIH\hed2                                       
File Size                   128KB                                          
File Size (bytes)           131072                                         

                Alphabetic List of Variables and Attributes
 
                       #    Variable    Type    Len

                       3    Age         Num       8
                       4    Height      Num       8
                       1    Name        Char      8
                       2    Sex         Char      1
                       5    Weight      Num       8
                       6    agenew      Char     12

In practice: - input() reads a character string into numeric. - put() formats numeric into character.


9.2.10.3 Change the length of a variable

Data ABC;
set sashelp.class;
Length agenew $10.;
Label agenew=“New age”;
Run;

proc contents data=abc  ;
run;
 
                          The CONTENTS Procedure

Data Set Name        WORK.ABC                      Observations          19
Member Type          DATA                          Variables             6 
Engine               V9                            Indexes               0 
Created              02/20/2026 16:44:42           Observation Length    48
Last Modified        02/20/2026 16:44:42           Deleted Observations  0 
Protection                                         Compressed            NO
Data Set Type                                      Sorted                NO
Label                                                                      
Data Representation  WINDOWS_64                                            
Encoding             wlatin1  Western (Windows)                            

                     Engine/Host Dependent Information

Data Set Page Size          65536                                          
Number of Data Set Pages    1                                              
First Data Page             1                                              
Max Obs per Page            1361                                           
Obs in First Data Page      19                                             
Number of Data Set Repairs  0                                              
ExtendObsCounter            YES                                            
Filename                    C:\Users\hed2\AppData\Local\Temp\SAS Temporary 
                            Files\_TD24572_HDW02234493_\abc.sas7bdat       
Release Created             9.0401M5                                       
Host Created                X64_10PRO                                      
Owner Name                  NIH\hed2                                       
File Size                   128KB                                          
File Size (bytes)           131072                                         

                Alphabetic List of Variables and Attributes
 
               #    Variable    Type    Len    Label

               3    Age         Num       8                 
               4    Height      Num       8                 
               1    Name        Char      8                 
               2    Sex         Char      1                 
               5    Weight      Num       8                 
               6    agenew      Char     10    “New age”

A key habit: define length before assignment when creating new character variables to avoid truncation.


9.2.11 Create a report

proc report is a flexible reporting tool and can replace many proc print / proc tabulate workflows.

proc report data=sashelp.class;
Column age;
Define age / display;
Run;
                                       Age
                                        14
                                        13
                                        13
                                        14
                                        14
                                        12
                                        12
                                        15
                                        13
                                        12
                                        11
                                        14
                                        12
                                        15
                                        16
                                        12
                                        15
                                        11
                                        15

Even this minimal example shows the structure: - column defines the layout, - define controls how variables appear.


9.2.12 Random variables

SAS provides pseudo-random generators via rand().

data ab;
set sashelp.class;
num=rand("normal");
run;

proc print data=ab (obs=10); 
run;
       Obs    Name       Sex    Age    Height    Weight       num

         1    Alfred      M      14     69.0      112.5    -0.44832
         2    Alice       F      13     56.5       84.0     1.55149
         3    Barbara     F      13     65.3       98.0    -0.95175
         4    Carol       F      14     62.8      102.5    -0.66781
         5    Henry       M      14     63.5      102.5     0.25197
         6    James       M      12     57.3       83.0    -1.64294
         7    Jane        F      12     59.8       84.5     0.96083
         8    Janet       F      15     62.5      112.5    -0.45714
         9    Jeffrey     M      13     62.5       84.0    -0.63351
        10    John        M      12     59.0       99.5    -1.00366

This is useful for simulation, multiple imputation diagnostics, and resampling.


9.2.13 Combine two texts, compress spaces, locate a substring, change case

These are common in cleaning messy character fields.

Combine:

Data Mathdata;
Text = "Hello";
Text1= "World";
Text2= text || " " ||text1;
Run;

proc print data=Mathdata  ; 
run;
                   Obs    Text     Text1       Text2

                    1     Hello    World    Hello World

Compress:

Data Mathdata;
Text = "Hello  World  ";
Text1= trim(text);
Text2= compress(text);
Run;

proc print data=Mathdata  ; 
run;
             Obs        Text           Text1          Text2

              1     Hello  World    Hello  World    HelloWorld

Index:

Data Mathdata;
Text = "Hello World";
indextext=  index(text, "or");
Run;

proc print data=Mathdata  ; 
run;
                      Obs       Text        indextext

                       1     Hello World        8    

Case functions:

Data Mathdata;
Text = "Hello World";
upcase=  upcase(text );
lowcase=  lowcase(text );
propcase=  propcase(text );
Run;

proc print data=Mathdata  ; 
run;
      Obs       Text          upcase         lowcase       propcase

       1     Hello World    HELLO WORLD    hello world    Hello World

9.2.14 Deduplication

Deduplication is a frequent preprocessing step, especially when identifying unique subjects, visits, or records.

9.2.14.1 Method 1: nodupkeys

proc sort data = sashelp.class out =  dedup
nodupkeys;
by height; 
run;

proc print data= dedup;
run;
             Obs    Name       Sex    Age    Height    Weight

               1    Joyce       F      11     51.3       50.5
               2    Louise      F      12     56.3       77.0
               3    Alice       F      13     56.5       84.0
               4    James       M      12     57.3       83.0
               5    Thomas      M      11     57.5       85.0
               6    John        M      12     59.0       99.5
               7    Jane        F      12     59.8       84.5
               8    Janet       F      15     62.5      112.5
               9    Carol       F      14     62.8      102.5
              10    Henry       M      14     63.5      102.5
              11    Judy        F      14     64.3       90.0
              12    Robert      M      12     64.8      128.0
              13    Barbara     F      13     65.3       98.0
              14    Mary        F      15     66.5      112.0
              15    Ronald      M      15     67.0      133.0
              16    Alfred      M      14     69.0      112.5
              17    Philip      M      16     72.0      150.0

This keeps the first record for each unique height.

9.2.14.2 Method 2: BY-group with first. / last.

proc sort data=sashelp.class out=data_rank;  by height  ; run;

data outdata dropdata;
 set data_rank;
 by height ;
 if last.height   then output  outdata;
else output dropdata; 
run;
proc print data=dropdata;run; 
              Obs    Name     Sex    Age    Height    Weight

               1     Janet     F      15     62.5      112.5
               2     Mary      F      15     66.5      112.0

This version keeps the last record per height.
Notice that the output depends on the current sort order.


9.2.15 Select a subset of rows

data where;
set sashelp.class;
where sex ne "F";
/*if */
run;

proc print data= where;
run;
             Obs    Name       Sex    Age    Height    Weight

               1    Alfred      M      14     69.0      112.5
               2    Henry       M      14     63.5      102.5
               3    James       M      12     57.3       83.0
               4    Jeffrey     M      13     62.5       84.0
               5    John        M      12     59.0       99.5
               6    Philip      M      16     72.0      150.0
               7    Robert      M      12     64.8      128.0
               8    Ronald      M      15     67.0      133.0
               9    Thomas      M      11     57.5       85.0
              10    William     M      15     66.5      112.0

The where statement filters rows during reading, which is typically more efficient than filtering after reading.


9.2.16 Create macros with DO loops

Macros provide automation, especially when the same PROC is repeated across multiple variables.

9.2.16.1 A macro to calculate descriptive stats across multiple variables

This example uses: - proc means to compute summaries by sex, - ODS output to capture results, - a macro loop to run across variables.

2          %macro means(var_avg) ;
3          
4          /*calculate means*/
5          proc means data=sashelp.class StackODSOutput n mean std  min p5
5        ! p95 max nmiss;
6          var  &var_avg;
7          class sex;
8          ods output summary=result2;
9          run;
10         
11         /*append then output*/
12         data masterresult2;                  * combine results;
13         set masterresult2 result2;
14         run;
15         
16         %mend means;
17         
18         /*use macro to merge all descriptive stats */
19         data masterresult2 ;
20         set _null_;
21         run;

NOTE: The data set WORK.MASTERRESULT2 has 0 observations and 0 variables.
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      cpu time            0.00 seconds
      

22         
23         %let vars=
24         age
25         height
26         weight
27         ;
28         
29         %macro model ;
30         %do i=1 %to %sysfunc(countw(&vars));
31         
32         %let x=%scan(&vars,&i);
33          %means( &x )
34         
35         %end;
36         %mend model;
37         
38         %model;
NOTE: The data set WORK.RESULT2 has 2 observations and 12 variables.
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The PROCEDURE MEANS printed page 1.
NOTE: PROCEDURE MEANS used (Total process time):
      real time           0.07 seconds
      cpu time            0.00 seconds
      

NOTE: There were 0 observations read from the data set WORK.MASTERRESULT2.
NOTE: There were 2 observations read from the data set WORK.RESULT2.
NOTE: The data set WORK.MASTERRESULT2 has 2 observations and 12 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds
      

NOTE: The data set WORK.RESULT2 has 2 observations and 12 variables.
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The PROCEDURE MEANS printed page 2.
NOTE: PROCEDURE MEANS used (Total process time):
      real time           0.03 seconds
      cpu time            0.00 seconds
      

WARNING: Multiple lengths were specified for the variable Variable by 
         input data set(s). This can cause truncation of data.
NOTE: There were 2 observations read from the data set WORK.MASTERRESULT2.
NOTE: There were 2 observations read from the data set WORK.RESULT2.
NOTE: The data set WORK.MASTERRESULT2 has 4 observations and 12 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
      

NOTE: The data set WORK.RESULT2 has 2 observations and 12 variables.
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The PROCEDURE MEANS printed page 3.
NOTE: PROCEDURE MEANS used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
      

WARNING: Multiple lengths were specified for the variable Variable by 
         input data set(s). This can cause truncation of data.
NOTE: There were 4 observations read from the data set WORK.MASTERRESULT2.
NOTE: There were 2 observations read from the data set WORK.RESULT2.
NOTE: The data set WORK.MASTERRESULT2 has 6 observations and 12 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds
      

39         
40         proc print data= masterresult2;
41         run;

NOTE: There were 6 observations read from the data set WORK.MASTERRESULT2.
NOTE: The PROCEDURE PRINT printed page 4.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
      

42         
                            The MEANS Procedure

                         Analysis Variable : Age 
 
        N
Sex   Obs    N           Mean        Std Dev        Minimum       5th Pctl
--------------------------------------------------------------------------
F       9    9      13.222222       1.394433      11.000000      11.000000

M      10   10      13.400000       1.646545      11.000000      11.000000
--------------------------------------------------------------------------

                         Analysis Variable : Age 
 
                     N                                      N
             Sex   Obs      95th Pctl         Maximum    Miss
             ------------------------------------------------
             F       9      15.000000       15.000000       0

             M      10      16.000000       16.000000       0
             ------------------------------------------------
 
                                                                           
 
                            The MEANS Procedure

                       Analysis Variable : Height 
 
        N
Sex   Obs    N           Mean        Std Dev        Minimum       5th Pctl
--------------------------------------------------------------------------
F       9    9      60.588889       5.018328      51.300000      51.300000

M      10   10      63.910000       4.937937      57.300000      57.300000
--------------------------------------------------------------------------

                       Analysis Variable : Height 
 
                     N                                      N
             Sex   Obs      95th Pctl         Maximum    Miss
             ------------------------------------------------
             F       9      66.500000       66.500000       0

             M      10      72.000000       72.000000       0
             ------------------------------------------------
 
                                                                           
 
                            The MEANS Procedure

                       Analysis Variable : Weight 
 
        N
Sex   Obs    N           Mean        Std Dev        Minimum       5th Pctl
--------------------------------------------------------------------------
F       9    9      90.111111      19.383914      50.500000      50.500000

M      10   10     108.950000      22.727186      83.000000      83.000000
--------------------------------------------------------------------------

                       Analysis Variable : Weight 
 
                     N                                      N
             Sex   Obs      95th Pctl         Maximum    Miss
             ------------------------------------------------
             F       9     112.500000      112.500000       0

             M      10     150.000000      150.000000       0
             ------------------------------------------------
 
                                                                           
 
Obs   Sex   NObs   _control_   Variable    N           Mean         StdDev

 1     F      9                  Age       9      13.222222       1.394433
 2     M     10        1         Age      10      13.400000       1.646545
 3     F      9                  Hei       9      60.588889       5.018328
 4     M     10        1         Hei      10      63.910000       4.937937
 5     F      9                  Wei       9      90.111111      19.383914
 6     M     10        1         Wei      10     108.950000      22.727186

Obs            Min             P5            P95            Max   NMiss

 1       11.000000      11.000000      15.000000      15.000000     0  
 2       11.000000      11.000000      16.000000      16.000000     0  
 3       51.300000      51.300000      66.500000      66.500000     0  
 4       57.300000      57.300000      72.000000      72.000000     0  
 5       50.500000      50.500000     112.500000     112.500000     0  
 6       83.000000      83.000000     150.000000     150.000000     0  

Practical interpretation: - &vars defines the batch of variables. - %do iterates through them. - Each iteration runs proc means and appends results into a single dataset.

This is exactly the type of pattern used in clinical reporting pipelines when building shells.


9.2.17 Output intermediate tables with ODS

ODS is the bridge from SAS procedures to reusable datasets.

9.2.17.1 Discover table names: ods trace

ods trace on;
proc freq data=sashelp.class  ;
table  sex;
run;
ods trace off;
                            The FREQ Procedure

                                         Cumulative    Cumulative
         Sex    Frequency     Percent     Frequency      Percent
         --------------------------------------------------------
         F             9       47.37             9        47.37  
         M            10       52.63            19       100.00  

ods trace tells you which internal table objects SAS produces.

9.2.17.2 Capture a specific output table

ods   listing close;
ods trace on;
ods output ParameterEstimates= ParameterEstimates  ;
proc glm data=sashelp.class;
model height=age;
run;
ods trace off;
ods listing;

proc print data=ParameterEstimates;
run;
  Obs  Dependent  Parameter      Estimate        StdErr   tValue   Probt

   1    Height    Intercept   25.22388451    6.52168912     3.87  0.0012
   2    Height    Age          2.78713911    0.48688163     5.72  <.0001

This pattern is essential when you want to: - compute results in SAS, - feed them into later steps, - export them, or - compare them against R outputs.


9.2.18 Create sequence numbers

Sequence variables are common in SDTM/ADaM creation and tracking.

9.2.18.1 Grouped sequence numbers

/* grouped*/
proc sort data=sashelp.class out=class ; by name; run; 
data temp; set class; 
    by name; 
    count + 1;
    if first.name then count = 1; 
run; 
proc print data=temp (obs=3);run; 


proc sort data=class out=class2 ; by height; run; 
data class3;
set class2;
by height;
retain count;
if first.height then count=1;
else count=count+1;
run;
proc print data=class3 (obs=3);run; 

 /*nogrouped*/
data new;
 set class;
 seqno = _n_;
run;

proc print data=new (obs=3);run; 
         Obs     Name      Sex    Age    Height    Weight    count

           1    Alfred      M      14     69.0      112.5      1  
           2    Alice       F      13     56.5       84.0      1  
           3    Barbara     F      13     65.3       98.0      1  
 
                                                                           
 
         Obs     Name     Sex    Age    Height    Weight    count

           1    Joyce      F      11     51.3      50.5       1  
           2    Louise     F      12     56.3      77.0       1  
           3    Alice      F      13     56.5      84.0       1  
 
                                                                           
 
         Obs     Name      Sex    Age    Height    Weight    seqno

           1    Alfred      M      14     69.0      112.5      1  
           2    Alice       F      13     56.5       84.0      2  
           3    Barbara     F      13     65.3       98.0      3  

Key idea: - _n_ is the observation index within the data step. - first. and last. require sorted data and allow per-group logic.


9.2.19 Merge datasets

SAS has two main approaches: - SQL joins (proc sql) — flexible and readable, - DATA step merge — fast and common in production, but requires sorting and careful logic.

9.2.19.1 Left and right join

2          data class2;
3          input name $   score;
4          datalines;

NOTE: The data set WORK.CLASS2 has 3 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.04 seconds
      cpu time            0.00 seconds
      

8          ;
9          run;
10         
11         proc sql;
12         create table class_c as
13           select *
14             from sashelp.class as a
15            left join class2 as b
16              on a.name = b.name;
WARNING: Variable name already exists on file WORK.CLASS_C.
NOTE: Table WORK.CLASS_C created, with 19 rows and 6 columns.

17         quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.02 seconds
      cpu time            0.00 seconds
      

18         
19         proc print data=class_c;
20         run;

NOTE: There were 19 observations read from the data set WORK.CLASS_C.
NOTE: The PROCEDURE PRINT printed page 1.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.04 seconds
      cpu time            0.00 seconds
      

21          
22         proc sql;
23         create table class_d as
24           select *
25             from class2 as a
26            right join sashelp.class as b
27              on a.name = b.name;
WARNING: Variable Name already exists on file WORK.CLASS_D.
NOTE: Table WORK.CLASS_D created, with 19 rows and 6 columns.

28         quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
      

29         
30         proc print data=class_d;
31         run;

NOTE: There were 19 observations read from the data set WORK.CLASS_D.
NOTE: The PROCEDURE PRINT printed page 2.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
      

32         
         Obs    Name       Sex    Age    Height    Weight    score

           1    Alfred      M      14     69.0      112.5      85 
           2    Alice       F      13     56.5       84.0      89 
           3    Barbara     F      13     65.3       98.0       . 
           4    Carol       F      14     62.8      102.5       . 
           5    Henry       M      14     63.5      102.5       . 
           6    James       M      12     57.3       83.0       . 
           7    Jane        F      12     59.8       84.5       . 
           8    Janet       F      15     62.5      112.5       . 
           9    Jeffrey     M      13     62.5       84.0       . 
          10    John        M      12     59.0       99.5       . 
          11    Joyce       F      11     51.3       50.5       . 
          12    Judy        F      14     64.3       90.0       . 
          13    Louise      F      12     56.3       77.0       . 
          14    Mary        F      15     66.5      112.0       . 
          15    Philip      M      16     72.0      150.0       . 
          16    Robert      M      12     64.8      128.0       . 
          17    Ronald      M      15     67.0      133.0       . 
          18    Thomas      M      11     57.5       85.0       . 
          19    William     M      15     66.5      112.0       . 
 
                                                                           
 
         Obs     name     score    Sex    Age    Height    Weight

           1    Alfred      85      M      14     69.0      112.5
           2    Alice       89      F      13     56.5       84.0
           3                 .      F      13     65.3       98.0
           4                 .      F      14     62.8      102.5
           5                 .      M      14     63.5      102.5
           6                 .      M      12     57.3       83.0
           7                 .      F      12     59.8       84.5
           8                 .      F      15     62.5      112.5
           9                 .      M      13     62.5       84.0
          10                 .      M      12     59.0       99.5
          11                 .      F      11     51.3       50.5
          12                 .      F      14     64.3       90.0
          13                 .      F      12     56.3       77.0
          14                 .      F      15     66.5      112.0
          15                 .      M      16     72.0      150.0
          16                 .      M      12     64.8      128.0
          17                 .      M      15     67.0      133.0
          18                 .      M      11     57.5       85.0
          19                 .      M      15     66.5      112.0

9.2.19.2 Full join

2          data class2;
3          input name $   score;
4          datalines;

NOTE: The data set WORK.CLASS2 has 3 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.00 seconds
      

8          ;
9          run;
10         
11         proc sql;
12         create table class_e as
13           select *
14             from sashelp.class as a
15            full   join class2 as b
16              on a.name = b.name;
WARNING: Variable name already exists on file WORK.CLASS_E.
NOTE: Table WORK.CLASS_E created, with 20 rows and 6 columns.

17         quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.04 seconds
      cpu time            0.01 seconds
      

18         
19         proc print data=class_e;
20         run;

NOTE: There were 20 observations read from the data set WORK.CLASS_E.
NOTE: The PROCEDURE PRINT printed page 1.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.06 seconds
      cpu time            0.06 seconds
      

21         
         Obs    Name       Sex    Age    Height    Weight    score

           1    Alfred      M      14     69.0      112.5      85 
           2    Alice       F      13     56.5       84.0      89 
           3    Barbara     F      13     65.3       98.0       . 
           4    Carol       F      14     62.8      102.5       . 
           5                        .       .          .       99 
           6    Henry       M      14     63.5      102.5       . 
           7    James       M      12     57.3       83.0       . 
           8    Jane        F      12     59.8       84.5       . 
           9    Janet       F      15     62.5      112.5       . 
          10    Jeffrey     M      13     62.5       84.0       . 
          11    John        M      12     59.0       99.5       . 
          12    Joyce       F      11     51.3       50.5       . 
          13    Judy        F      14     64.3       90.0       . 
          14    Louise      F      12     56.3       77.0       . 
          15    Mary        F      15     66.5      112.0       . 
          16    Philip      M      16     72.0      150.0       . 
          17    Robert      M      12     64.8      128.0       . 
          18    Ronald      M      15     67.0      133.0       . 
          19    Thomas      M      11     57.5       85.0       . 
          20    William     M      15     66.5      112.0       . 

9.2.19.3 DATA step merge (append-style merge by key)

data class2;
input name $   score;
datalines;
Alfred  85
Alice 89
Daniel 99
;
run;
 
data class_f;
merge  sashelp.class class2;
by name;
/*if A and B;  */
run;

proc print data=class_f;
run;
         Obs    Name       Sex    Age    Height    Weight    score

           1    Alfred      M      14     69.0      112.5      85 
           2    Alice       F      13     56.5       84.0      89 
           3    Barbara     F      13     65.3       98.0       . 
           4    Carol       F      14     62.8      102.5       . 
           5    Daniel              .       .          .       99 
           6    Henry       M      14     63.5      102.5       . 
           7    James       M      12     57.3       83.0       . 
           8    Jane        F      12     59.8       84.5       . 
           9    Janet       F      15     62.5      112.5       . 
          10    Jeffrey     M      13     62.5       84.0       . 
          11    John        M      12     59.0       99.5       . 
          12    Joyce       F      11     51.3       50.5       . 
          13    Judy        F      14     64.3       90.0       . 
          14    Louise      F      12     56.3       77.0       . 
          15    Mary        F      15     66.5      112.0       . 
          16    Philip      M      16     72.0      150.0       . 
          17    Robert      M      12     64.8      128.0       . 
          18    Ronald      M      15     67.0      133.0       . 
          19    Thomas      M      11     57.5       85.0       . 
          20    William     M      15     66.5      112.0       . 

9.2.19.4 Inner join

2          data class2;
3          input name $   score;
4          datalines;

NOTE: The data set WORK.CLASS2 has 3 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.06 seconds
      

8          ;
9          run;
10         
11         proc sql;
12         create table class_e as
13           select *
14             from sashelp.class as a
15            inner join class2 as b
16              on a.name = b.name;
WARNING: Variable name already exists on file WORK.CLASS_E.
NOTE: Table WORK.CLASS_E created, with 2 rows and 6 columns.

17         quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.06 seconds
      cpu time            0.01 seconds
      

18         
19         proc print data=class_e;
20         run;

NOTE: There were 2 observations read from the data set WORK.CLASS_E.
NOTE: The PROCEDURE PRINT printed page 1.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.05 seconds
      cpu time            0.04 seconds
      

21         
         Obs     Name     Sex    Age    Height    Weight    score

          1     Alfred     M      14     69.0      112.5      85 
          2     Alice      F      13     56.5       84.0      89 

9.2.19.5 Minus join (anti-join)

2          data class2;
3          input name $   score;
4          datalines;

NOTE: The data set WORK.CLASS2 has 3 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.00 seconds
      

8          ;
9          run;
10         
11         proc sql;
12         create table class_e as
13           select *
14             from sashelp.class as a
15            left join class2 as b
16              on a.name = b.name
17              where b.name is NULL;
WARNING: Variable name already exists on file WORK.CLASS_E.
NOTE: Table WORK.CLASS_E created, with 17 rows and 6 columns.

18         quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.03 seconds
      cpu time            0.03 seconds
      

19         
20         proc print data=class_e;
21         run;

NOTE: There were 17 observations read from the data set WORK.CLASS_E.
NOTE: The PROCEDURE PRINT printed page 1.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.03 seconds
      cpu time            0.00 seconds
      
         Obs    Name       Sex    Age    Height    Weight    score

           1    Barbara     F      13     65.3       98.0      .  
           2    Carol       F      14     62.8      102.5      .  
           3    Henry       M      14     63.5      102.5      .  
           4    James       M      12     57.3       83.0      .  
           5    Jane        F      12     59.8       84.5      .  
           6    Janet       F      15     62.5      112.5      .  
           7    Jeffrey     M      13     62.5       84.0      .  
           8    John        M      12     59.0       99.5      .  
           9    Joyce       F      11     51.3       50.5      .  
          10    Judy        F      14     64.3       90.0      .  
          11    Louise      F      12     56.3       77.0      .  
          12    Mary        F      15     66.5      112.0      .  
          13    Philip      M      16     72.0      150.0      .  
          14    Robert      M      12     64.8      128.0      .  
          15    Ronald      M      15     67.0      133.0      .  
          16    Thomas      M      11     57.5       85.0      .  
          17    William     M      15     66.5      112.0      .  

This is the SAS SQL equivalent of “keep rows in A that do not match B”.


9.2.20 Create Table 1

In clinical reporting, “Table 1” typically summarizes baseline demographics and characteristics by treatment group.

Your code calls external macro files via %include.
This is a realistic production pattern: statistical programming teams maintain shared macro libraries, and analyses import them.

2            *** Load utility macros;
3          %include
3        ! "C:\Users\hed2\Downloads\mybook2\mybook2\sasmacro\create_table1.
3        ! sas";
WARNING: Physical file does not exist, 
         C:\Users\hed2\Downloads\mybook2\mybook2\sasmacro\create_table1.sas
         .
ERROR: Cannot open %INCLUDE file 
       C:\Users\hed2\Downloads\mybook2\mybook2\sasmacro\create_table1.sas.
4          
5            *** Specify input and output data sets, and the column
5        ! variable.;
6          %let INPUT_DATA  = sashelp.class;
7          %let OUTPUT_DATA = Table1 ;
8          %let COLVAR      = sex;
9          
10           /*chort*/
11          *** %AddText(text=Height);
12          *** %CategoricalRowVar2(rowVar=);
13         %ContinuousRowVar2(rowVar=height );
           _
           180
WARNING: Apparent invocation of macro CONTINUOUSROWVAR2 not resolved.

ERROR 180-322: Statement is not valid or it is used out of proper order.

WARNING: Apparent invocation of macro CONTINUOUSROWVAR2 not resolved.
14          *** %AddText(text=);
15         
16         %ContinuousRowVar2(rowVar=weight );
           _
           180

ERROR 180-322: Statement is not valid or it is used out of proper order.

WARNING: Apparent invocation of macro CONTINUOUSROWVAR2 not resolved.
17         %ContinuousRowVar2(rowVar=Age );
           _
           180

ERROR 180-322: Statement is not valid or it is used out of proper order.

18         
19         proc print data= table1;
ERROR: File WORK.TABLE1.DATA does not exist.
20         run;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: SAS set option OBS=0 and will continue to check statements. 
      This might cause NOTE: No observations in data set.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.04 seconds
      cpu time            0.00 seconds
      
21         
22         /* Export Table 1 as a CSV file*/


NOTE: PROCEDURE EXPORT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
      
NOTE: The SAS System stopped processing this step because of errors.
23         proc export data=Table1 replace label
24             outfile="C:\Users\hed2\Downloads\mybook2\mybook2\sasmacro\ta
24       ! ble1.csv"
25             dbms=csv;
26         run;
27         


ERROR: Errors printed on page 1.

Interpretation: - Macro calls like %ContinuousRowVar2(...) likely append rows into the Table 1 dataset. - proc export creates a portable deliverable (CSV) for review, QA, or publication workflows.


9.3 Using macros

Macros are the “automation layer” of SAS. They help you: - avoid copy/paste, - standardize reports, - scale analyses to many endpoints/variables.

9.3.0.1 Create a macro variable from a statistic

2          proc sql noprint;
3          select  std(age) format=best32.
4             into :age_mean
5             from sashelp.class;
6          quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.02 seconds
      cpu time            0.00 seconds
      

7          
8          %put Mean of age: &age_mean;
Mean of age:                 1.49267215939689
9          %putlog Mean of age: &age_mean;
           _
           180
WARNING: Apparent invocation of macro PUTLOG not resolved.

ERROR 180-322: Statement is not valid or it is used out of proper order.

ERROR: Errors printed on page 1.

This is a standard trick: - compute something with PROC SQL, - store into a macro variable, - reuse it later in titles, footnotes, conditions, or other macros.


9.3.0.2 Select variable names from metadata

SAS metadata tables (dictionary.columns) are extremely powerful for programmatic workflows.

proc sql;
     select name
           into :vars separated by ' '
            from dictionary.columns
              where libname="SASHELP" and
                memname="CLASS" and varnum=4;
   ;
quit;
%put &vars.;
                     Column Name
                     --------------------------------
                     Height                          

This pattern is the foundation of “dynamic programming” in SAS: - detect variables automatically, - loop over them, - generate consistent outputs.


9.3.0.3 Iterative DO loops

DATA step loops are for row-by-row computations and simulation-like constructions.

data do_to;
   x=10;
   y="yes";
   do i=1 to 10;
      x=x+1;
      output;
   end;
run;
proc print data=do_to;
run;
                          Obs     x     y      i

                            1    11    yes     1
                            2    12    yes     2
                            3    13    yes     3
                            4    14    yes     4
                            5    15    yes     5
                            6    16    yes     6
                            7    17    yes     7
                            8    18    yes     8
                            9    19    yes     9
                           10    20    yes    10

do while loops are useful for process-like simulations:

data loan;
   balance=1000;
   payment=0;
   do while (balance>0);
      balance=balance-100;
      payment=payment+1;
      output;
   end;
run;
proc print data=loan;
run;
                         Obs    balance    payment

                           1      900          1  
                           2      800          2  
                           3      700          3  
                           4      600          4  
                           5      500          5  
                           6      400          6  
                           7      300          7  
                           8      200          8  
                           9      100          9  
                          10        0         10  

9.3.0.4 DO loop inside a macro

This example illustrates how macros generate SAS code and then run it.

%macro run_calculation(amt, t, r);
    data customer_value;
        i=&r./10.;
        do n=0 to &t.;
            S=&amt.*((1+i)*n - 1)/i;  /*Power operator */
            output; /*output s*/
        end;
   file print;
   putlog  s @@;
   put  s @@;
    run;
proc print data=customer_value;
run;
%mend;

%run_calculation(amt=100, t=10, r=7);
2285.7142857
 
                                                                           
 
                       Obs     i      n        S

                         1    0.7     0     -142.86
                         2    0.7     1      100.00
                         3    0.7     2      342.86
                         4    0.7     3      585.71
                         5    0.7     4      828.57
                         6    0.7     5     1071.43
                         7    0.7     6     1314.29
                         8    0.7     7     1557.14
                         9    0.7     8     1800.00
                        10    0.7     9     2042.86
                        11    0.7    10     2285.71

Even when the formula is toy-like, the structure mirrors production macro design: - macro parameters define the scenario, - data step creates a dataset, - output is printed or reported.


9.3.1 PROC REPORT for flexible summaries

Here we generate a summary dataset with proc means, then format it with proc report.

proc means data=sashelp.class;
var _NUMERIC_;  /*_CHARACTER_*/
output out=want mean= sum= max= /autoname;
run;

proc report data= want;
Column _FREQ_;
Column Age_Mean;
Column Age_Sum;
Column Age_max;
Define _FREQ_  /"The total number" display;
Define Age_mean  /"Mean of age" display;
Define Age_sum  /"Sum of age" display;
Define Age_max  /"Max of age" display;
Run;
                            The MEANS Procedure

 Variable    N           Mean        Std Dev        Minimum        Maximum
 -------------------------------------------------------------------------
 Age        19     13.3157895      1.4926722     11.0000000     16.0000000
 Height     19     62.3368421      5.1270752     51.3000000     72.0000000
 Weight     19    100.0263158     22.7739335     50.5000000    150.0000000
 -------------------------------------------------------------------------
 
                                                                           
 
                      The                                 
                    total    Mean of     Sum of     Max of
                   number        age        age        age
                       19  13.315789        253         16

Teaching point: - proc means creates the statistics, - proc report controls presentation.

This split is clean and scalable.


9.4 Use SAS formats

Formats provide consistent labeling, grouping, and display standards—especially important in regulated reporting.

9.4.0.1 Creating labels for different values

PROC FORMAT;
    VALUE LIKERT7_A
      1,2,3   = "Disagree"
      4       = "Neither Agree nor Disagree"
      5,6,7   = "Agree"
RUN;

PROC FORMAT;
    VALUE INCOME
      LOW   -< 20000 = "Low"
      20000 -< 60000 = "Middle"
      60000 - HIGH   = "High";
RUN;

PROC FORMAT;
    VALUE RACE
       1     = "White"
       2     = "Black"
       OTHER = "Other";
RUN;

PROC FORMAT;
    VALUE $GENDERLABEL
      "M"   = "Male"
      "F"   = "Female";
RUN;

DATA sample;
    SET sashelp.class;
    FORMAT sex GENDERLABEL. ;
RUN;

proc freq data=sample;
table sex;
run;
                            The FREQ Procedure

                                           Cumulative    Cumulative
        Sex       Frequency     Percent     Frequency      Percent
        -----------------------------------------------------------
        Female           9       47.37             9        47.37  
        Male            10       52.63            19       100.00  

Formats are not only cosmetic—they also define category behavior in many procedures.


9.4.1 Storing formats to a library

When you create formats in work, they disappear after the session ends.
A permanent format library allows reuse across projects and reports.

9.4.1.1 Way 1: store in a dedicated format library and use FMTSEARCH

2          LIBNAME format
2        ! "C:\Users\hed2\Downloads\code-storage\code\format";
NOTE: Library FORMAT does not exist.
3          
4          PROC FORMAT LIBRARY=format ;
ERROR: Library FORMAT does not exist.
5              VALUE $GENDER
6             "M"   = "Male_new"
7             "F"   = "Female_new";
NOTE: The previous statement has been deleted.
8          RUN;

NOTE: PROCEDURE FORMAT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
      
NOTE: The SAS System stopped processing this step because of errors.
9          
10         OPTIONS FMTSEARCH=(format);


11         DATA sample;
12             SET sashelp.class;
13             FORMAT sex GENDER. ;
                          _______
                          48
ERROR 48-59: The format $GENDER was not found or could not be loaded.

14         RUN;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: SAS set option OBS=0 and will continue to check statements. 
      This might cause NOTE: No observations in data set.
WARNING: The data set WORK.SAMPLE may be incomplete.  When this step was 
         stopped there were 0 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.04 seconds
      cpu time            0.04 seconds
      

15         
16         proc freq data=sample;
17         table sex;
18         run;

NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
      

ERROR: Errors printed on page 1.

Interpretation: - LIBNAME format points to a permanent folder. - PROC FORMAT LIBRARY=format stores formats there. - FMTSEARCH tells SAS where to look for formats.


9.4.1.2 Way 2: include a SAS file that defines formats

2          %INCLUDE
2        ! 'C:\Users\hed2\Downloads\code-storage\code\format\format_test.sa
2        ! s';
WARNING: Physical file does not exist, 
         C:\Users\hed2\Downloads\code-storage\code\format\format_test.sas.
ERROR: Cannot open %INCLUDE file 
       C:\Users\hed2\Downloads\code-storage\code\format\format_test.sas.
3          DATA sample;
4              SET sashelp.class;
5              FORMAT sex GENDERsex. ;
                          __________
                          48
ERROR 48-59: The format $GENDERSEX was not found or could not be loaded.

6          RUN;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: SAS set option OBS=0 and will continue to check statements. 
      This might cause NOTE: No observations in data set.
WARNING: The data set WORK.SAMPLE may be incomplete.  When this step was 
         stopped there were 0 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      cpu time            0.00 seconds
      

7          
8          proc freq data=sample;
9          table sex;
10         run;

NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
      

ERROR: Errors printed on page 1.

This is common in shared programming environments: - formats are maintained in version-controlled .sas files, - projects include them as needed.


9.4.1.3 Way 3: add formats to an existing format library

2           LIBNAME format
2        ! "C:\Users\hed2\Downloads\code-storage\code\format";
NOTE: Library FORMAT does not exist.
3          
4          PROC FORMAT LIBRARY=format ;
ERROR: Library FORMAT does not exist.
5              VALUE $GENDER
6             "M"   = "Male_new"
7             "F"   = "Female_new";
NOTE: The previous statement has been deleted.
8          RUN;

NOTE: PROCEDURE FORMAT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
      
NOTE: The SAS System stopped processing this step because of errors.
9          


10         PROC FORMAT LIBRARY=format ;
ERROR: Library FORMAT does not exist.
11           VALUE $gendersextwo
12            "M"   = "Male_new2_two"
13            "F"   = "Female_new2_two";
NOTE: The previous statement has been deleted.
14         RUN;

NOTE: PROCEDURE FORMAT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
      
NOTE: The SAS System stopped processing this step because of errors.
15          
16         OPTIONS FMTSEARCH=(format);


17         DATA sample;
18             SET sashelp.class;
19             FORMAT sex GENDERsextwo. ;
                          _____________
                          48
ERROR 48-59: The format $GENDERSEXTWO was not found or could not be loaded.

20          RUN;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: SAS set option OBS=0 and will continue to check statements. 
      This might cause NOTE: No observations in data set.
WARNING: The data set WORK.SAMPLE may be incomplete.  When this step was 
         stopped there were 0 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.01 seconds
      

21         
22         proc freq data=sample;
23         table sex;
24         run;

NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
      

ERROR: Errors printed on page 1.

This demonstrates that format libraries are extendable: you can keep adding definitions over time.


9.4.1.4 Way 4: apply formats in batch via %include

2          LIBNAME format
2        ! "C:\Users\hed2\Downloads\code-storage\code\format";
NOTE: Library FORMAT does not exist.
3          
4          PROC FORMAT LIBRARY=format ;
ERROR: Library FORMAT does not exist.
5              VALUE $GENDER
6             "M"   = "Male_new"
7             "F"   = "Female_new";
NOTE: The previous statement has been deleted.
8          RUN;

NOTE: PROCEDURE FORMAT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
      
NOTE: The SAS System stopped processing this step because of errors.
9          
10         OPTIONS FMTSEARCH=(format);


11         DATA sample;
12             SET sashelp.class;
13         %INCLUDE
13       ! 'C:\Users\hed2\Downloads\code-storage\code\format\use_format.sas
13       ! ';
WARNING: Physical file does not exist, 
         C:\Users\hed2\Downloads\code-storage\code\format\use_format.sas.
ERROR: Cannot open %INCLUDE file 
       C:\Users\hed2\Downloads\code-storage\code\format\use_format.sas.
14         RUN;

NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.SAMPLE has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      cpu time            0.01 seconds
      

15         
16         proc freq data=sample;
17         table sex;
18         run;

NOTE: There were 19 observations read from the data set WORK.SAMPLE.
NOTE: The PROCEDURE FREQ printed page 1.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.06 seconds
      cpu time            0.00 seconds
      

ERROR: Errors printed on page 1.
                            The FREQ Procedure

                                         Cumulative    Cumulative
         Sex    Frequency     Percent     Frequency      Percent
         --------------------------------------------------------
         F             9       47.37             9        47.37  
         M            10       52.63            19       100.00  

This pattern is helpful when: - many variables require formatting, - format assignment rules are long, - you want to keep the main analysis code clean.


9.4.2 Modify formats

User-defined formats cannot be edited “in place” like a database table; you typically redefine them.
Your example also shows using other= to handle all remaining values.

/*User-defined formats cannot be edited. to create a macro by myself ; or proc freq then use excel*/; 
PROC FORMAT ;
    VALUE $GENDER
      "M"   = "Male_new"
      "F"   = "Female_new";
RUN;
proc format;
      value $sex_f  "F"="Female_f" other=[5.1]; 
run;

DATA sample;
    SET sashelp.class;
    FORMAT sex $sex_f. ;
 RUN;

proc freq data=sample;
table sex;
run;
                            The FREQ Procedure

                                            Cumulative    Cumulative
       Sex         Frequency     Percent     Frequency      Percent
       -------------------------------------------------------------
       Female_f           9       47.37             9        47.37  
       M                 10       52.63            19       100.00  

9.4.3 Transform missing to character then to numeric again

This technique appears in recoding tasks where you need numeric categories but start from character labels.

PROC FORMAT ;
    VALUE $num
      "M"   = 1
      "F"   = 2;
RUN;

DATA sample_2; 
  SET sashelp.class;
  sex_2 = input(put(sex ,num.),best.);  /*Converting variable types and keep*/
RUN;
proc freq data=sample_2;
table sex_2;
run;
                            The FREQ Procedure

                                          Cumulative    Cumulative
        sex_2    Frequency     Percent     Frequency      Percent
        ----------------------------------------------------------
            1          10       52.63            10        52.63  
            2           9       47.37            19       100.00  

Mechanism: - put(sex, num.) maps “M”/“F” to “1”/“2” as character, - input(..., best.) converts that character into numeric.


9.4.4 Output format definition details

When you need to inspect what a format contains, you can output it.

PROC FORMAT ;
    VALUE $GENDER
      "M"   = "Male_new"
      "F"   = "Female_new";
RUN;

proc format library = work.formats FMTLIB cntlout = cntlout;
select $GENDER;
run;
---------------------------------------------------------------------------
-
|       FORMAT NAME: $GENDER  LENGTH:   10   NUMBER OF VALUES:    2
|
|   MIN LENGTH:   1  MAX LENGTH:  40  DEFAULT LENGTH:  10  FUZZ:        0
|
|--------------------------------------------------------------------------
|
|START           |END             |LABEL  (VER. V7|V8
20FEB2026:16:45:16)|
|----------------+----------------+----------------------------------------
|
|F               |F               |Female_new
|
|M               |M               |Male_new
|
---------------------------------------------------------------------------
-

This creates cntlout, which is a dataset representation of the format—useful for QA and documentation.


9.4.5 A macro to copy an existing PROC FORMAT

This longer macro demonstrates a common real need: reconstruct a format definition programmatically (for reuse, reporting, or migration).

%macro formatd(data, var);
data temp;
set &data;
keep &var;
run;

/*original freq*/
ods output  OneWayFreqs=sample_b  ;
proc freq data=temp  ;
table  &var /missing ;
run;
ods output close;
 
/*delete format freq*/
data sample2;
set  temp;
Format _all_;
run;
proc freq data=sample2   ;
table  &var /missing out=sample_c ;
run;

/*select original variable code*/
proc sql noprint;
          select name into :firstvar_b from dictionary.columns where libname='WORK' and memname='SAMPLE_B'
          and varnum=2;
quit;
data sample_b2;
set sample_b;
Keep &firstvar_b ;
run;

/*select original variable label*/
proc sql noprint;
          select name into :firstvar_c from dictionary.columns where libname='WORK' and memname='SAMPLE_C'
          and varnum=1;
quit;
data sample_c2;
set sample_c;
Keep &firstvar_c  ;
run;

/*merge variable code and label*/
data sample_bc;
set sample_b2 (RENAME=(&firstvar_b=new_b));
set sample_c2 (RENAME=(&firstvar_c=new_c));
run;

/*create format format and output*/
data sample_bc;
set sample_bc;
Original_format = CATS(new_c,"=","'",new_b,"'");
run;

proc print data=sample_bc noobs;
   var Original_format;
run;

%mend formatd;


PROC FORMAT ;
    VALUE $GENDER
      "M"   = "Male_new"
      "F"   = "Female_new";
RUN;
DATA sample;
    SET sashelp.class;
    FORMAT sex $GENDER. ;
 RUN;

%formatd(sample, sex   );
                            The FREQ Procedure

                                             Cumulative    Cumulative
      Sex           Frequency     Percent     Frequency      Percent
      ---------------------------------------------------------------
      Female_new           9       47.37             9        47.37  
      Male_new            10       52.63            19       100.00  
 
                                                                           
 
                            The FREQ Procedure

                                         Cumulative    Cumulative
         Sex    Frequency     Percent     Frequency      Percent
         --------------------------------------------------------
         F             9       47.37             9        47.37  
         M            10       52.63            19       100.00  
 
                                                                           
 
                                Original_
                                  format

                              F='Female_new'
                              M='Male_new'  

Reading the intent: - sample_b captures formatted labels (via ODS output from PROC FREQ). - sample_c captures raw unformatted codes. - The macro merges them to reconstruct mapping rules like "Male_new"='M'.

This is a clever QA/automation trick when you inherit legacy formats but need documentation.


9.4.6 A macro to view the list of variables

Finally, a compact utility macro to list variable names in a dataset.

%macro varnames (dat);
proc contents
     data = &dat
          noprint
          out = data_info
               (keep = name );
run;
proc print
     data = data_info
          noobs;
run;
%mend;
%varnames (sashelp.class) 
                                  NAME

                                  Age   
                                  Height
                                  Name  
                                  Sex   
                                  Weight

This is particularly helpful inside SASmarkdown documents because it provides quick visibility when you cannot “click around” like in SAS Studio.


9.4.7 Practical workflow recommendations for SASmarkdown

When you write SAS inside R Markdown, the biggest productivity gains come from being explicit:

  1. Make each SAS chunk self-contained, or write intermediate datasets to a permanent library.
  2. Use ODS OUTPUT whenever you need a procedure result as a dataset for downstream steps.
  3. Prefer PROC SQL joins for clarity unless you specifically need DATA step merge behavior.
  4. Store formats in a permanent library if the report will be knitted repeatedly.

With these habits, SASmarkdown becomes a reliable way to produce reproducible, publication-ready analysis documents that combine narrative, code, and output in one place.