Beginning COBOL for Programmers, by Michael Coughlan (Apress, 2014)

SORT WorkFile
     ON ASCENDING BookId-WF
                  AuthorName-WF
     USING BookSalesFileUS, BookSalesFileEU
     GIVING SortedBookSales

SORT WorkFile
     ON DESCENDING NCAP-Result-WF
        ASCENDING ManfName-WF, VehicleName-WF
     USING NCAP-TestResultsFile
     GIVING Sorted-NCAP-TestResultsFile

Simple Sorting Notes

Consider the following:

• SDWorkFileName identifies a temporary work file that the sort process uses as a kind of scratch

pad for sorting. The file is defined in the FILE SECTION using a sort description (SD) rather

than a file description (FD) entry. Even though the work file is a temporary file, it must still

have associated SELECT and ASSIGN clauses in the ENVIRONMENT DIVISION. You can give this

file any name you like; I usually call it WorkFile as I did in Example 14-1.

• The SDWorkFileName file is a sequential file with an organization of RECORD SEQUENTIAL. Because

this is the default organization, it is usually omitted (see Listing 14-1).

• Each WorkSortKey#$i identifies a field in the record of the work file. The sorted file will be

ordered on the key field or fields named.

• When more than one WorkSortKey#$i is specified, the keys decrease in significance from left

to right (the leftmost key is the most significant, and the rightmost is the least significant).

• InFileName and OutFileName are the names of the input and output files, respectively.

• If more than one InFileName is specified, the files are combined (OutFileSize = InFile1Size

+ InFile2Size) and then sorted.

• If more than one OutFileName is specified, then each file receives a copy of the sorted records.

Chapter 14 ■ Sorting and Merging

• If the DUPLICATES clause is used, then when the file has been sorted, the final order of records

with duplicate keys (keys with the same value) is the same as that in the unsorted file. If no

DUPLICATES clause is used, the order of records with duplicate keys is undefined.

• AlphabetName is an alphabet name defined in the SPECIAL-NAMES paragraph of the

ENVIRONMENT DIVISION. This clause is used to select the character set the SORT verb uses for

collating the records in the file. The character set may be STANDARD-1 (ASCII), STANDARD-2

(ISO 646), NATIVE (may be defined by the system to be ASCII or EBCDIC; see your

implementer manual), or user defined.

• SORT can be used anywhere in the PROCEDURE DIVISION except in an INPUT PROCEDURE (SORT)

or OUTPUT PROCEDURE (SORT or MERGE) or in the DECLARATIVES SECTION. The purpose of the

INPUT PROCEDURE and OUTPUT PROCEDURE is explained later in this chapter, but an explanation

of the DECLARATIVES SECTION has to wait until Chapter 18.

• The records described for the input file (USING) must be able to fit into the records described

for SDWorkFileName.

• The records described for SDWorkFileName must be able to fit into the records described for

the output file (GIVING).

• The description of WorkSortKey#$i cannot contain an OCCURS clause (it cannot be a table), nor

can it be subordinate to an entry that contains one.

• The InFileName and OutFileName files are automatically opened by the SORT. When the SORT

executes, they must not already be open.
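
Several of the notes above can be seen together in one statement. The following is only a sketch: the physical file names, the 35-character record layout, and the alphabet name AsciiOrder are assumptions made for illustration, and the source is written in free format. It combines two input files, keeps records with equal keys in their original order, and selects the ASCII collating sequence.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. SortNotesSketch.
*> Hypothetical sketch: file names, record sizes, and the alphabet
*> name AsciiOrder are assumptions, not taken from the book.
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SPECIAL-NAMES.
    ALPHABET AsciiOrder IS STANDARD-1.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT WorkFile ASSIGN TO "WORK.TMP".
    SELECT BookSalesFileUS ASSIGN TO "SalesUS.dat"
           ORGANIZATION IS LINE SEQUENTIAL.
    SELECT BookSalesFileEU ASSIGN TO "SalesEU.dat"
           ORGANIZATION IS LINE SEQUENTIAL.
    SELECT SortedBookSales ASSIGN TO "Sales.srt"
           ORGANIZATION IS LINE SEQUENTIAL.
DATA DIVISION.
FILE SECTION.
*> The input and output records only need the right size here,
*> because the program never refers to their individual fields.
FD BookSalesFileUS.
01 SalesRecUS            PIC X(35).
FD BookSalesFileEU.
01 SalesRecEU            PIC X(35).
*> Only the sort keys have to be named in the work record.
SD WorkFile.
01 WorkRec.
   02 BookId-WF          PIC 9(5).
   02 AuthorName-WF      PIC X(25).
   02 FILLER             PIC X(5).
FD SortedBookSales.
01 SortedSalesRec        PIC X(35).
PROCEDURE DIVISION.
Begin.
    *> The program does not OPEN or CLOSE any of these files:
    *> the SORT opens and closes them itself.
    SORT WorkFile
         ON ASCENDING KEY BookId-WF AuthorName-WF
         WITH DUPLICATES IN ORDER
         COLLATING SEQUENCE IS AsciiOrder
         USING BookSalesFileUS, BookSalesFileEU
         GIVING SortedBookSales
    STOP RUN.
```

Both input files must exist when the program runs. BookId-WF is the major key because it is named first, and the collating sequence only affects the alphanumeric key AuthorName-WF.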

How the Simple SORT Works

Figure 14-2 shows how the simple version of SORT works. In this case, the diagram uses the example in Listing 14-1 to illustrate the point. The sort process takes records from the unsorted BillableServicesFile, sorts them using WorkFile (the temporary work area), and, when the records have been sorted, sends them to SortedBillablesFile. After sorting, the records in the SortedBillablesFile will be ordered on ascending SubscriberId.

Figure 14-2. Diagram showing how the simple SORT works

Simple Sorting Program

Universal Telecoms has subscribers all over the United States. Each month, the billable activities of these subscribers are gathered into a file. BillableServicesFile is an unordered sequential file. Each record has the following description:

Field         Type  Length  Value
SubscriberId  9     10      –
ServiceType   9     1       1(text)/2(voice)
ServiceCost   9     6       0.10–9999.99

A program is required to produce a report that shows the value of the billable services for each subscriber (see Listing 14-1). In the report, BillableValue is the sum of the ServiceCost fields for each subscriber. The report must be printed on ascending SubscriberId and have the following format:

Universal Telecoms Monthly Report

SubscriberId BillableValue

XXXXXXXXXX XXXXXXXXXXX

XXXXXXXXXX XXXXXXXXXXX

XXXXXXXXXX XXXXXXXXXXX

Listing 14-1. A simple SORT applied to the BillableServicesFile

IDENTIFICATION DIVISION.
PROGRAM-ID. Listing14-1.
AUTHOR. Michael Coughlan.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT WorkFile ASSIGN TO "WORK.TMP".

    SELECT BillableServicesFile ASSIGN TO "Listing14-1.dat"
           ORGANIZATION LINE SEQUENTIAL.

    SELECT SortedBillablesFile ASSIGN TO "Listing14-1.Srt"
           ORGANIZATION LINE SEQUENTIAL.

DATA DIVISION.
FILE SECTION.
FD BillableServicesFile.
01 SubscriberRec-BSF       PIC X(17).

SD WorkFile.
01 WorkRec.
   02 SubscriberId-WF      PIC 9(10).
   02 FILLER               PIC X(7).

FD SortedBillablesFile.
01 SubscriberRec.
   88 EndOfBillablesFile   VALUE HIGH-VALUES.
   02 SubscriberId         PIC 9(10).
   02 ServiceType          PIC 9.
   02 ServiceCost          PIC 9(4)V99.

WORKING-STORAGE SECTION.
01 SubscriberTotal         PIC 9(5)V99.

01 ReportHeader            PIC X(33) VALUE "Universal Telecoms Monthly Report".

01 SubjectHeader           PIC X(31) VALUE "SubscriberId      BillableValue".

01 SubscriberLine.
   02 PrnSubscriberId      PIC 9(10).
   02 FILLER               PIC X(8) VALUE SPACES.
   02 PrnSubscriberTotal   PIC $$$,$$9.99.

01 PrevSubscriberId        PIC 9(10).

PROCEDURE DIVISION.
Begin.
    SORT WorkFile ON ASCENDING KEY SubscriberId-WF
         USING BillableServicesFile
         GIVING SortedBillablesFile
    DISPLAY ReportHeader
    DISPLAY SubjectHeader
    OPEN INPUT SortedBillablesFile
    READ SortedBillablesFile
       AT END SET EndOfBillablesFile TO TRUE
    END-READ
    PERFORM UNTIL EndOfBillablesFile
       MOVE SubscriberId TO PrevSubscriberId, PrnSubscriberId
       MOVE ZEROS TO SubscriberTotal
       PERFORM UNTIL SubscriberId NOT EQUAL TO PrevSubscriberId
          ADD ServiceCost TO SubscriberTotal
          READ SortedBillablesFile
             AT END SET EndOfBillablesFile TO TRUE
          END-READ
       END-PERFORM
       MOVE SubscriberTotal TO PrnSubscriberTotal
       DISPLAY SubscriberLine
    END-PERFORM
    CLOSE SortedBillablesFile
    STOP RUN.

Program Notes

I have kept this program simple for reasons of clarity and space, and because you will meet a more fully worked version of the program when I explore advanced versions of the SORT. Because the SORT uses a disk-based WorkFile, it is slower than purely RAM-bound operations. You should be aware of this whenever you are considering using SORT.

You should probably use SORT only when no practical RAM-based solution is available; and even then, you should ensure that only the data items required in the sorted file are sorted. This may involve leaving out some of the records or changing the record size.

In this instance, sorting the file does seem to be the only viable option. There are millions of telephone subscribers, and, in the course of a month, they make many calls and send hundreds of texts. So BillableServicesFile contains tens of millions, or hundreds of millions, of records. In COBOL, the only possible RAM-based solution (you can't create dynamic structures like trees or linked lists pre–ISO 2002) would be to use a table (one element per subscriber) to sum the subscribers’ ServiceCost fields. That solution has many problems. The table would have to contain millions of elements, you would have to ensure that the elements were in SubscriberId order, and, because new subscribers are constantly joining, the table would have to be redimensioned every time the program ran.

You may wonder why the example uses different record descriptions for the three files when the records are identical. The reason is that although the records are identical, they are used in different ways in the program, and the granularity of each data description reflects the way the record is used.

The input file is used only by the SORT, so while you have to define how much storage a record will occupy, you never need to refer to the individual fields. You could fully define the record as follows:

01 UnsortedSubscriberRec.
   02 SubscriberId         PIC 9(10).
   02 ServiceType          PIC 9.
   02 ServiceCost          PIC 9(4)V99.

But then you would either have to use slightly different field names for the sorted file or qualify them using references such as SubscriberId OF SubscriberRec.

In WorkFile, only the data items on which the file is to be sorted (mentioned in the KEY phrase) need to be explicitly defined. In this case, the only item that must be explicitly identified is SubscriberId-WF.

The sorted file is normally the file that the program uses to do whatever work is required. This generally means that all, or nearly all, of the data items are mentioned by name in the program; and, hence, they have to be declared.

Normally, the record description for this file fully defines the record.

Using Multiple Keys

If you examine the SORT metalanguage in Figure 14-1, you will see not only that a file can be sorted on a number of keys but also that one key can be ascending while another is descending. This is illustrated in Table 14-1 and Example 14-2.

The table contains student results that have been sorted into ascending StudentId order within descending GPA order. Notice that GPA is the major key and that StudentId is only in ascending sequence within GPA. This is because the first key named in a SORT statement is the major key, and keys become less significant with each successive declaration.

Example 14-2. SORT with One Key Descending and Another Ascending

SORT WorkFile ON DESCENDING GPA
                 ASCENDING StudentId
     USING StudentResultsFile
     GIVING SortedStudentsResultsFile

Table 14-1. Ascending StudentId within Descending GPA

SORT with Procedures

The simple version of SORT takes the records from InFileName, sorts them, and then outputs them to OutFileName.

Sometimes, however, not all the records in the unsorted file are required in the sorted file, or not all the data items in the unsorted file record are required in the record of the sorted file. For instance, suppose the specification for the Universal Telecoms Monthly Report changes so that you are only required to show the value of the voice calls made by subscribers. In that situation, the text records (ServiceType = 1) are not required in the sorted file. Similarly, if the specification changes so that the number of texts and phone calls is required rather than their value, you do not need the ServiceCost data item in sorted file records. In both cases, processing must be applied, to eliminate unwanted records or alter their format, before the records are submitted to the sort process. This processing is achieved by specifying INPUT PROCEDURE with SORT.

Sometimes, to reduce the number of files that have to be declared, you may find it useful to process the records directly from the sort process instead of creating a sorted file and then processing that. For instance, you could create the Universal Telecoms Monthly Report directly instead of creating a sorted file and then processing the sorted file to create the report. Such processing is accomplished by using OUTPUT PROCEDURE with SORT.

An INPUT PROCEDURE is a block of code that consists of one or more sections or paragraphs that execute, having been passed control by SORT. When the block of code has finished, control reverts to SORT. An OUTPUT PROCEDURE works in a similar way.

Figure 14-3 gives the metalanguage for the full SORT including the INPUT PROCEDURE and the OUTPUT PROCEDURE.

Figure 14-3. Metalanguage for the full version of the SORT verb

INPUT PROCEDURE Notes

You should consider the following when using an INPUT PROCEDURE:

• The block of code specified by the INPUT PROCEDURE allows you to select which records,

and what format of records, are submitted to the sort process. Because an INPUT PROCEDURE

executes before the SORT sorts the records, only the data that is actually required in the sorted

file is sorted.

• When you use an INPUT PROCEDURE, it replaces the USING phrase. The ProcedureName in

the INPUT PROCEDURE phrase identifies a block of code that uses the RELEASE verb to supply

records to the sort process. The INPUT PROCEDURE must contain at least one RELEASE statement

to transfer the records to the work file (identified by SDWorkFileName).

• The INPUT PROCEDURE finishes before the sort process sorts the records supplied to it by the

procedure. That's why the records are RELEASEd to the work file. They are stored there until the

INPUT PROCEDURE finishes, and then they are sorted.

• Neither an INPUT PROCEDURE nor an OUTPUT PROCEDURE can contain a SORT or MERGE

statement.

• The pre–ANS 85 COBOL rules for the SORT verb stated that the INPUT PROCEDURE and OUTPUT

PROCEDURE had to be self-contained sections of code and could not be entered from elsewhere

in the program.

• In the ANS 85 version of COBOL, the INPUT PROCEDURE and OUTPUT PROCEDURE can be

any contiguous group of paragraphs or sections. The only restriction is that the range of

paragraphs or sections used must not overlap.

OUTPUT PROCEDURE Notes

You should consider the following when using an OUTPUT PROCEDURE:

• An OUTPUT PROCEDURE retrieves sorted records from the work file using the RETURN verb. An

OUTPUT PROCEDURE must contain at least one RETURN statement to get the records from the

work file.

• An OUTPUT PROCEDURE only executes after the file has been sorted.

• If you use an OUTPUT PROCEDURE, the GIVING phrase cannot be used in the same SORT.
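
The INPUT PROCEDURE and OUTPUT PROCEDURE notes above can be sketched in one small program. This is an illustration only, not a listing from the book: the three subscriber IDs, the table and flag names, and the paragraph names SupplyRecords and ShowRecords are all invented, and the source is written in free format. Because the records come from a table in WORKING-STORAGE and are displayed rather than written to a file, the work file is the only file the program needs.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. ProcSketch.
*> Hypothetical sketch: the data and every name except WorkFile
*> are invented for illustration.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT WorkFile ASSIGN TO "WORK.TMP".
DATA DIVISION.
FILE SECTION.
SD WorkFile.
01 WorkRec.
   02 SubscriberId-WF    PIC 9(10).
WORKING-STORAGE SECTION.
01 UnsortedValues.
   02 FILLER             PIC 9(10) VALUE 3000000003.
   02 FILLER             PIC 9(10) VALUE 1000000001.
   02 FILLER             PIC 9(10) VALUE 2000000002.
01 UnsortedTable REDEFINES UnsortedValues.
   02 UnsortedId         PIC 9(10) OCCURS 3 TIMES.
01 Idx                   PIC 9.
01 SortFlag              PIC X VALUE "N".
   88 NoMoreSortedRecs   VALUE "Y".
PROCEDURE DIVISION.
Begin.
    *> No USING and no GIVING: both are replaced by procedures.
    SORT WorkFile ON ASCENDING KEY SubscriberId-WF
         INPUT PROCEDURE IS SupplyRecords
         OUTPUT PROCEDURE IS ShowRecords
    STOP RUN.

SupplyRecords.
    *> Runs first. Each RELEASE hands one record to the sort;
    *> sorting starts only when this paragraph has finished.
    PERFORM VARYING Idx FROM 1 BY 1 UNTIL Idx > 3
       MOVE UnsortedId(Idx) TO SubscriberId-WF
       RELEASE WorkRec
    END-PERFORM.

ShowRecords.
    *> Runs only after sorting. Each RETURN fetches the next
    *> sorted record from the work file.
    RETURN WorkFile
       AT END SET NoMoreSortedRecs TO TRUE
    END-RETURN
    PERFORM UNTIL NoMoreSortedRecs
       DISPLAY SubscriberId-WF
       RETURN WorkFile
          AT END SET NoMoreSortedRecs TO TRUE
       END-RETURN
    END-PERFORM.
```

Note that neither paragraph opens or closes WorkFile, and that the end-of-data flag is kept in WORKING-STORAGE here simply to avoid relying on the contents of the work record after the last RETURN.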

How an INPUT PROCEDURE Works

A simple SORT works by taking records from the USING file, sorting them, and then writing them to the GIVING file. When an INPUT PROCEDURE is used, there is no USING file, so the sort process has to get its records from the INPUT PROCEDURE. The INPUT PROCEDURE uses the RELEASE verb to supply the records to the work file of the SORT, one at a time.

Although an INPUT PROCEDURE usually gets the records it supplies to the sort process from an input file, the records can originate from anywhere. For instance, if you wanted to sort the elements of a table, you could use INPUT PROCEDURE to send the elements, one at a time, to the sort process (see Listing 14-7, in the section “Sorting Tables Program”). Or, if you wanted to sort the records as they were entered by the user, you could use INPUT PROCEDURE to get the records from the user and supply them to the sort process (see Listing 14-3, later in this section). When an INPUT PROCEDURE gets its records from an input file, it can select which records to send to the sort process and can even alter the structure of the records before they are sent.

Creating an INPUT PROCEDURE

When you use an INPUT PROCEDURE, a RELEASE verb must be used to send records to the work file associated with SORT. The work file is declared in an SD entry in the FILE SECTION. RELEASE is a special verb used only in INPUT PROCEDUREs to send records to the work file. It is the equivalent of a WRITE command and works in a similar way. The metalanguage for the RELEASE verb is given in Figure 14-4.

Figure 14-4. Metalanguage for the RELEASE verb

A template for an INPUT PROCEDURE that gets records from an input file and releases them to the SORT work file is given in Example 14-3. Notice that the work file is not opened in the INPUT PROCEDURE. The work file is automatically opened by the SORT.

Example 14-3. INPUT PROCEDURE File-Processing Template

OPEN INPUT InFileName
READ InFileName RECORD
PERFORM UNTIL TerminatingCondition
   RELEASE SDWorkRec
   READ InFileName RECORD
END-PERFORM
CLOSE InFileName

Using an INPUT PROCEDURE to Select Records

Suppose that the specification for the Universal Telecoms Monthly Report is changed so that only the value of the voice calls made by subscribers is required. Figure 14-5 shows how you can use an INPUT PROCEDURE between the input file and the sort process to filter out the unwanted text (ServiceType = 1) records. Listing 14-2 implements the specification change and also produces a more fully worked version. In this program, the report is written to a print file rather than just displayed on the computer screen.
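
Before turning to the full listing, the filtering idea can be sketched on its own. What follows is not Listing 14-2 but a cut-down illustration under stated assumptions: the output file name "VoiceCalls.srt", the names SortedCallsFile, SortedCallRec, VoiceCall, EndOfBillableServicesFile, and SelectVoiceCalls are invented here, the input file is the one used by Listing 14-1, and the source is written in free format. The INPUT PROCEDURE follows the template in Example 14-3 but releases a record only when it is a voice call.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. SelectVoiceSketch.
*> Hypothetical sketch, not the book's Listing 14-2. Names that do
*> not appear in Listing 14-1 are assumptions.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT WorkFile ASSIGN TO "WORK.TMP".
    SELECT BillableServicesFile ASSIGN TO "Listing14-1.dat"
           ORGANIZATION IS LINE SEQUENTIAL.
    SELECT SortedCallsFile ASSIGN TO "VoiceCalls.srt"
           ORGANIZATION IS LINE SEQUENTIAL.
DATA DIVISION.
FILE SECTION.
*> The input record now needs enough detail to test ServiceType.
FD BillableServicesFile.
01 SubscriberRec-BSF.
   88 EndOfBillableServicesFile VALUE HIGH-VALUES.
   02 FILLER             PIC X(10).
   02 FILLER             PIC 9.
      88 VoiceCall       VALUE 2.
   02 FILLER             PIC X(6).
SD WorkFile.
01 WorkRec.
   02 SubscriberId-WF    PIC 9(10).
   02 FILLER             PIC X(7).
FD SortedCallsFile.
01 SortedCallRec         PIC X(17).
PROCEDURE DIVISION.
Begin.
    *> INPUT PROCEDURE replaces USING; GIVING is still allowed.
    SORT WorkFile ON ASCENDING KEY SubscriberId-WF
         INPUT PROCEDURE IS SelectVoiceCalls
         GIVING SortedCallsFile
    STOP RUN.

SelectVoiceCalls.
    *> The input file is opened here; the work file is not.
    OPEN INPUT BillableServicesFile
    READ BillableServicesFile
       AT END SET EndOfBillableServicesFile TO TRUE
    END-READ
    PERFORM UNTIL EndOfBillableServicesFile
       *> Text records (ServiceType = 1) never reach the sort.
       IF VoiceCall
          RELEASE WorkRec FROM SubscriberRec-BSF
       END-IF
       READ BillableServicesFile
          AT END SET EndOfBillableServicesFile TO TRUE
       END-READ
    END-PERFORM
    CLOSE BillableServicesFile.
```

Because the unwanted records are dropped before they are released, the sort only has to handle the voice records.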
