The PROC SORT statement supports the SORTSIZE= option, which limits the amount
of memory available for PROC SORT to use.
If you do not use the SORTSIZE= option in the PROC SORT statement, PROC SORT
uses the value of the SORTSIZE system option. If the SORTSIZE system option is not
set, PROC SORT uses the amount of memory specified by the REALMEMSIZE system
option. If PROC SORT needs more memory than you specify, it creates a temporary
utility file in your SAS Work directory to complete the sort.
The default value of SORTSIZE is 1G.
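The precedence described above can be sketched as follows. This is a minimal illustration, not a tuning recommendation: the memory values 256M and 64M are arbitrary, and the SASHELP.CLASS sample data set and the WORK.CLASS_SORTED output name are assumptions chosen only to make the example self-contained.

```sas
/* Show the current system-level settings that PROC SORT falls back on. */
proc options option=(sortsize realmemsize);
run;

/* Set the session-wide default (256M is an arbitrary illustrative value). */
options sortsize=256M;

/* Override the system option for this one step only (64M is illustrative). */
proc sort data=sashelp.class out=work.class_sorted sortsize=64M;
   by name;
run;
```

The SORTSIZE= value in the PROC SORT statement applies only to that step. Other sorts in the session continue to use the system option.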
The TAGSORT option is useful in single-threaded situations where there might not be
enough disk space to sort a large SAS data set. The TAGSORT option is not supported
for multi-threaded sorts.
When you specify the TAGSORT option, only sort keys (that is, the variables specified
in the BY statement) and the observation number for each observation are stored in the
temporary files. The sort keys, together with the observation number, are referred to as
tags. At the completion of the sorting process, the tags are used to retrieve the records
from the input data set in sorted order. Thus, in cases where the total number of bytes of
the sort keys is small compared with the length of the record, temporary disk use is
reduced considerably. However, you should have enough disk space to hold another
copy of the data (the output data set) or two copies of the tags, whichever is greater. Note
that although using the TAGSORT option can reduce temporary disk use, the processing
time might be much higher.
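The disk space rule above can be turned into a rough estimate before you choose TAGSORT. In the sketch below, every figure in the DATA step is hypothetical, and the 8-byte size for a stored observation number is an assumption, not a documented value. The SASHELP.CLASS sample data set and the WORK.CLASS_TAGSORTED output name are also illustrative.

```sas
/* Rough disk estimate from the rule above; all figures are hypothetical. */
data _null_;
   nobs         = 10000000;  /* hypothetical number of observations */
   obs_bytes    = 400;       /* hypothetical observation length in bytes */
   key_bytes    = 12;        /* hypothetical total length of the BY variables */
   obsnum_bytes = 8;         /* assumed size of a stored observation number */
   copy_bytes = nobs * obs_bytes;                  /* one copy of the data */
   tag_bytes  = nobs * (key_bytes + obsnum_bytes); /* one copy of the tags */
   need_bytes = max(copy_bytes, 2 * tag_bytes);    /* whichever is greater */
   put 'Estimated disk space needed (bytes): ' need_bytes comma20.;
run;

/* Single-threaded tag sort; NOTHREADS is spelled out for clarity. */
proc sort data=sashelp.class out=work.class_tagsorted tagsort nothreads;
   by name;
run;
```

With short sort keys and long observations, the copy of the data dominates the estimate, which is the case where TAGSORT saves the most temporary space.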
Choosing a Location for the Sorted File
When you sort a SAS data set, SAS creates a temporary utility file. If the sort uses
multiple threads, you can specify the location of the utility file by using the UTILLOC
system option. The default location for utility files is the Work data library for both
single-threaded and multi-threaded sorts. If two or more locations are specified for the
UTILLOC system option, the next available location is used as the location for the utility
file. For sorts that use a single thread, the temporary utility file is opened in the Work
data library if there is not enough memory to hold the data set during the sort. The utility
file has a .sas7butl file extension. Before you sort, ensure that your Work data library has
room for this temporary utility file.
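Before a large sort, it can help to confirm where utility files will be written. The sketch below only inspects the current settings. It assumes that UTILLOC was set at SAS startup (in the configuration file or at invocation) and does not try to change it.

```sas
/* Report where utility files and the Work library currently point. */
proc options option=(utilloc work) value;
run;

/* The same information, written to the log as macro text. */
%put UTILLOC = %sysfunc(getoption(utilloc));
%put WORK    = %sysfunc(pathname(work));
```

If UTILLOC reports WORK, the utility file shares disk space with the Work data library, so check free space on that drive.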
The sorted data set replaces the input data set whenever the OUT= option is omitted or
specifies the same name as the input (DATA=) data set. The OVERWRITE option allows
the input data set to be deleted before the output data set is created, which decreases
the peak disk space requirements.
The output data set is populated from the contents of the utility file. The original data set
is deleted after the sort is complete if the output data set is replacing the input data set.
Before you sort a data set, make sure that you have space for the .sas7butl file.
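A minimal sketch of an in-place sort with the OVERWRITE option follows. WORK.REPORT is a scratch copy of the SASHELP.CLASS sample data set, created here only so that the example does not touch real data. Both names are illustrative assumptions.

```sas
/* Make a scratch copy so the demonstration does not touch real data. */
data work.report;
   set sashelp.class;
run;

/* No OUT= option: the sorted result replaces WORK.REPORT.           */
/* OVERWRITE lets SAS delete the input before writing the output,    */
/* so keep a backup elsewhere if the data cannot be re-created.      */
proc sort data=work.report overwrite;
   by name;
run;
```

Because the input can be deleted before the output exists, this approach is best reserved for data sets that are backed up or easy to rebuild.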
Use the following rules to determine where the .sas7butl file and the resulting sorted data
set are created:
• If you omit the OUT= option in the PROC SORT statement, the data set is sorted on
the drive and in the directory or subdirectory where it is located. For example, if you
submit the following statements (note the two-level data set name), the .sas7butl file
is created in the subdirectory that was created for the SAS session within the
specified WORK directory.
libname mylib 'c:\sas\mydata';
proc sort data=mylib.report;