COBOL SORT Statement

The COBOL SORT statement is used in the PROCEDURE DIVISION to place records from an input file in a temporary work file where their order will be rearranged. This work file is referred to as the sort work file. Once rearranged, the records can be placed back in the original input file, placed in a separate output file, or processed by the program directly from the sort work file.

ENVIRONMENT DIVISION entries

The input file, the sort work file, and, if used, the output file must each be defined by a SELECT statement in the FILE-CONTROL paragraph of the INPUT-OUTPUT SECTION. As with any other file, the SELECT statement for the sort work file specifies both the internal filename used by the program and the external filename used by the operating system.

 

            SELECT SORT-FILE ASSIGN TO DISK "SORTWORK.TMP".

 

In this example, the internal filename is SORT-FILE and the external filename is SORTWORK.TMP. These are programmer-supplied words, as with any file definition, and are chosen because they are descriptive. The DOS file extension TMP is chosen to remind you that this is a temporary file.
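
For context, a complete FILE-CONTROL paragraph for a sort with separate input and output files might look like the sketch below. INPUT-FILE and OUTPUT-FILE are the names used later in the SORT example; the external names INPUT.DAT and OUTPUT.DAT are hypothetical, and the exact form of the ASSIGN clause varies from compiler to compiler.

```cobol
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *    Unsorted records read by the SORT (hypothetical name).
           SELECT INPUT-FILE  ASSIGN TO DISK "INPUT.DAT".
      *    Sorted records written by the SORT (hypothetical name).
           SELECT OUTPUT-FILE ASSIGN TO DISK "OUTPUT.DAT".
      *    Temporary work file used by the SORT itself.
           SELECT SORT-FILE   ASSIGN TO DISK "SORTWORK.TMP".
```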

 

DATA DIVISION entries

In the FILE SECTION, the input and output files are described as usual, using FD and record description entries. The sort work file is described with an SD entry, which stands for Sort Description, in place of an FD entry. The record definition following the SD entry is called the sort record, and it must be broken down far enough to show the position and size of the sort key field(s). Multiple keys can be defined. Any other field can be defined as a filler entry if it is not referenced in the Procedure Division. The total number of characters (the sum of the PIC sizes) in this record must match the number of characters in the records to be sorted.

 

            SD  SORT-FILE.
            01  SORT-RECORD.
                02                  PIC X(20).
                02  SORT-KEY        PIC X(5).
                02                  PIC X(30).
                02  SR-UNITS        PIC 999.
                02  SR-COST         PIC 9(5)V99.
                02                  PIC X(15).

 

In this example, SD is used instead of FD to describe the sort file in this division. The file name must be the same as the internal file name in the Environment Division. The record is described to define the name, position, data type, and size of the key field for use with the SORT verb in the Procedure Division. Other fields that may be referenced by the program’s procedures are also defined.
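
The sort record above totals 20 + 5 + 30 + 3 + 7 + 15 = 80 characters, so the records in the input and output files would also need to be 80 characters long. A minimal sketch of matching FD entries follows; the record names INPUT-RECORD and OUTPUT-RECORD are hypothetical, and the records are left undivided on the assumption that the program does not refer to their individual fields.

```cobol
       DATA DIVISION.
       FILE SECTION.
      *    80-character records, the same length as SORT-RECORD.
       FD  INPUT-FILE.
       01  INPUT-RECORD            PIC X(80).

       FD  OUTPUT-FILE.
       01  OUTPUT-RECORD           PIC X(80).
```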

PROCEDURE DIVISION statement

The sort work file and the sort key field(s) are among the entries named in a SORT statement. The name of the work file must follow the verb SORT, and it must be the same name used in the SELECT and SD entries. Records can be arranged in either ASCENDING or DESCENDING sequence by the key field(s) specified. When multiple key fields are used, they must be listed from major (most significant) to minor (least significant). The USING and GIVING clauses specify the name of the input data file and the name of the output data file, respectively.

 

            SORT SORT-FILE
                ASCENDING SORT-KEY,
                USING INPUT-FILE,
                GIVING OUTPUT-FILE.

 

In this example, the SORT verb would perform these procedures:

1.      The file named INPUT-FILE would automatically be opened for input. Its records would be read, rearranged in ascending sequence according to the value of the key field in each record, and stored in the sort work file. INPUT-FILE would then be closed.

2.      The file named OUTPUT-FILE would automatically be opened for output, the sorted records written to the file, and the file closed.
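
A variation on the statement above with two keys can be sketched as follows. SORT-KEY is listed first and so acts as the major key; SR-COST is the minor key. Using SR-COST as the second key, in descending sequence, is an assumption made only for illustration. The optional words ON and KEY are written out for readability.

```cobol
      *    Records are ordered by SORT-KEY; records with equal
      *    SORT-KEY values are ordered from highest to lowest SR-COST.
           SORT SORT-FILE
               ON ASCENDING  KEY SORT-KEY
               ON DESCENDING KEY SR-COST
               USING INPUT-FILE
               GIVING OUTPUT-FILE.
```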

 

To sort the records, the programmer does not code any OPEN, CLOSE, READ, or WRITE statements; the SORT statement performs all of these operations, and the files named in USING and GIVING must not already be open when it executes. After the SORT statement completes, the programmer may open the resulting sorted file for input and process the records normally (using the verbs OPEN, READ, and CLOSE). This may be done in the same program or in a separate one.
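
The following is a minimal sketch of a complete program that sorts a file and then processes the sorted records in the same run. The program name SORTDEMO, the external filenames INPUT.DAT and OUTPUT.DAT, the field OR-KEY, and the switch WS-EOF-SWITCH are all hypothetical. The fields after the key are collapsed into one 55-character filler for brevity. The ASSIGN clause follows the PC-style form used earlier and may need to be adapted for a particular compiler.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SORTDEMO.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT INPUT-FILE  ASSIGN TO DISK "INPUT.DAT".
           SELECT OUTPUT-FILE ASSIGN TO DISK "OUTPUT.DAT".
           SELECT SORT-FILE   ASSIGN TO DISK "SORTWORK.TMP".

       DATA DIVISION.
       FILE SECTION.
       FD  INPUT-FILE.
       01  INPUT-RECORD            PIC X(80).

       FD  OUTPUT-FILE.
       01  OUTPUT-RECORD.
           02                      PIC X(20).
           02  OR-KEY              PIC X(5).
           02                      PIC X(55).

       SD  SORT-FILE.
       01  SORT-RECORD.
           02                      PIC X(20).
           02  SORT-KEY            PIC X(5).
           02                      PIC X(55).

       WORKING-STORAGE SECTION.
       01  WS-EOF-SWITCH           PIC X VALUE "N".
           88  NO-MORE-RECORDS     VALUE "Y".

       PROCEDURE DIVISION.
       MAIN-PARA.
      *    The SORT opens, reads, writes, and closes both files.
           SORT SORT-FILE
               ON ASCENDING KEY SORT-KEY
               USING INPUT-FILE
               GIVING OUTPUT-FILE.

      *    Only now does the program open the sorted file itself.
           OPEN INPUT OUTPUT-FILE.
           PERFORM UNTIL NO-MORE-RECORDS
               READ OUTPUT-FILE
                   AT END
                       SET NO-MORE-RECORDS TO TRUE
                   NOT AT END
                       DISPLAY OR-KEY
               END-READ
           END-PERFORM.
           CLOSE OUTPUT-FILE.
           STOP RUN.
```

Here the program simply displays the key of each sorted record; any normal record processing could take the place of the DISPLAY statement.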

 

Collating Sequence

Sorting is based on the relative binary values of the characters in the key field(s) of the sort record. Two major code sets are in use: ASCII (American Standard Code for Information Interchange) and EBCDIC (Extended Binary Coded Decimal Interchange Code). ASCII is used on personal computers. EBCDIC is used mainly on IBM mainframes and midrange systems. Because the two sets assign different codes to the same characters, a sort on a PC can order records differently than a sort of the same records on a mainframe.

 

Character Set       Collating Sequence

ASCII               Space, 0-9, A-Z, a-z

EBCDIC              Space, a-z, A-Z, 0-9

 

Although the COBOL language is standard, collating sequences are not. This would be a factor to consider only if a particular installation does some processing on a PC and other processing on a mainframe.
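
Where the same ordering is needed on both kinds of machine, one possible approach is to name a collating sequence explicitly rather than rely on the machine's native one. The sketch below assumes a compiler that supports the ALPHABET clause in SPECIAL-NAMES and the COLLATING SEQUENCE phrase of SORT; ASCII-ORDER is a hypothetical programmer-supplied name, and the compiler's documentation should be checked before relying on this.

```cobol
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
      *    ASCII-ORDER is a hypothetical name chosen for this sketch;
      *    STANDARD-1 denotes the ASCII collating sequence.
           ALPHABET ASCII-ORDER IS STANDARD-1.

      *    ... later, in the PROCEDURE DIVISION:
           SORT SORT-FILE
               ON ASCENDING KEY SORT-KEY
               COLLATING SEQUENCE IS ASCII-ORDER
               USING INPUT-FILE
               GIVING OUTPUT-FILE.
```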