Michael Coughlan
Page 43
Chapter 14 ■ Sorting and Merging
PROCEDURE DIVISION.
Begin.
SORT WorkFile ON ASCENDING KEY SubscriberId-WF
USING BillableServicesFile
OUTPUT PROCEDURE IS CreateSummaryFile
STOP RUN.
CreateSummaryFile.
OPEN OUTPUT SortedSummaryFile
RETURN WorkFile
AT END SET EndOfWorkFile TO TRUE
END-RETURN
PERFORM UNTIL EndOfWorkFile
MOVE ZEROS TO CostOfTexts, CostOfCalls
MOVE SubscriberId-WF TO SubscriberId
PERFORM UNTIL SubscriberId-WF NOT EQUAL TO SubscriberId
IF VoiceCall
ADD ServiceCost-WF TO CostOfCalls
ELSE
ADD ServiceCost-WF TO CostOfTexts
END-IF
RETURN WorkFile
AT END SET EndOfWorkFile TO TRUE
END-RETURN
END-PERFORM
WRITE SummaryRec
END-PERFORM
CLOSE SortedSummaryFile.
Figure 14-8 illustrates the process of producing the summary file. The SORT takes records from BillableServicesFile and sorts them, and then the OUTPUT PROCEDURE summarizes them and writes the summary
records to SortedSummaryFile.
The data items in BillableServicesFile are not referred to in the program and so are not explicitly defined, although the storage they require is reserved (PIC X(17)). For reasons of brevity, and because it would obscure the core logic, the program does not check the data for validity.
Some Interesting Programs
You have seen how you can use an INPUT PROCEDURE to process records before they are sent to a SORT and how you can use an OUTPUT PROCEDURE to process the sorted records. But each was used in isolation. You can achieve some interesting results by using them in concert.
346
Chapter 14 ■ Sorting and Merging
Sorting Student Records into Date-of-Entry Order
Suppose there exists an unordered sequential file of student records, and each record in the file has the following description:
Field
Type Length
Value
StudentId
9
7
YYxxxxx
CourseCode
9
5
LMxxx
StudentId is a number that consists of two digits representing the year of entry followed by six other digits. Write a program to sort StudentFile on the “real” ascending StudentId.
This specification presents an interesting issue. It says that the file should be ordered on the “real” ascending StudentId. This means the IDs of students who entered the university after the year 2000 should appear after those of students who entered the university before 2000. This is a problem because you can't just sort the records in ascending StudentId order, as is demonstrated in Figure 14-10.
Figure 14-10. Showing the real StudentId sort order
How can this be done? Listing 14-6 solves the problem by using an INPUT PROCEDURE to alter StudentId to add the millennium to the date-of-entry part. Then the altered records are sorted, and the OUTPUT PROCEDURE strips off the millennium digits.
Listing 14-6. Using INPUT PROCEDURE and OUTPUT PROCEDURE in Concert
IDENTIFICATION DIVISION.
PROGRAM-ID. Listing14-6.
AUTHOR. Michael Coughlan.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT UnsortedStudentsFile ASSIGN TO "Listing14-6.DAT"
ORGANIZATION IS LINE SEQUENTIAL.
SELECT WorkFile ASSIGN TO "Workfile.tmp".
SELECT SortedStudentsFile ASSIGN TO "Listing14-6.srt"
ORGANIZATION IS LINE SEQUENTIAL.
DATA DIVISION.
FILE SECTION.
FD UnsortedStudentsFile.
347
Chapter 14 ■ Sorting and Merging
01 StudentRecUF.
88 EndOfUnsortedFile VALUE HIGH-VALUES.
02 StudentIdUF.
03 MillenniumUF PIC 99.
03 FILLER PIC 9(5).
02 RecBodyUF PIC X(14).
SD WorkFile.
01 StudentRecWF.
88 EndOfWorkFile VALUE HIGH-VALUES.
02 FullStudentIdWF.
03 MillenniumWF PIC 99.
03 StudentIdWF PIC 9(7).
02 RecBodyWF PIC X(14).
FD SortedStudentsFile.
01 StudentRecSF.
02 StudentIdSF PIC 9(7).
02 RecBodySF PIC X(14).
PROCEDURE DIVISION.
Begin.
SORT WorkFile ON ASCENDING KEY FullStudentIdWF
INPUT PROCEDURE IS AddInMillennium
OUTPUT PROCEDURE IS RemoveMillennium
STOP RUN.
AddInMillennium.
OPEN INPUT UnsortedStudentsFile
READ UnsortedStudentsFile
AT END SET EndOfUnsortedFile TO TRUE
END-READ
PERFORM UNTIL EndOfUnsortedFile
MOVE RecBodyUF TO RecBodyWF
MOVE StudentIDUF TO StudentIdWF
IF MillenniumUF < 70
MOVE 20 TO MillenniumWF
ELSE
MOVE 19 TO MillenniumWF
END-IF
RELEASE StudentRecWF
READ UnsortedStudentsFile
AT END SET EndOfUnsortedFile TO TRUE
END-READ
END-PERFORM
CLOSE UnsortedStudentsFile.
RemoveMillennium.
OPEN OUTPUT SortedStudentsFile
RETURN WorkFile
AT END SET EndOfWorkFile TO TRUE
348
Chapter 14 ■ Sorting and Merging
END-RETURN
PERFORM UNTIL EndOfWorkFile
MOVE RecBodyWF TO RecBodySF
MOVE StudentIdWF TO StudentIdSF
WRITE StudentRecSF
RETURN WorkFile
AT END SET EndOfWorkFile TO TRUE
END-RETURN
END-PERFORM
CLOSE SortedStudentsFile.
Sorting Tables
Versions of COBOL before ISO 2002 did not allow you to apply a SORT to a table. But it was possible to work around this restriction by using an INPUT PROCEDURE to release table elements to the work file and an OUTPUT PROCEDURE
to get the sorted element-records from the work file and put them back into the table. The process is illustrated in Figure 14-11; see Listing 14-7.
Figure 14-11. Using INPUT PROCEDURE and OUTPUT PROCEDURE to sort a table
Listing 14-7. Sorting a Table Using INPUT PROCEDURE and OUTPUT PROCEDURE
IDENTIFICATION DIVISION.
PROGRAM-ID. Listing14-7.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 CountyTable.
02 TableValues.
03 FILLER PIC X(16) VALUE "kilkenny 0080421".
03 FILLER PIC X(16) VALUE "laois 0058732".
03 FILLER PIC X(16) VALUE "leitrim 0025815".
03 FILLER PIC X(16) VALUE "tipperary0140281".
03 FILLER PIC X(16) VALUE "waterford0101518".
349
Chapter 14 ■ Sorting and Merging
03 FILLER PIC X(16) VALUE "westmeath0072027".
03 FILLER PIC X(16) VALUE "carlow 0045845".
03 FILLER PIC X(16) VALUE "wicklow 0114719".
03 FILLER PIC X(16) VALUE "cavan 0056416".
03 FILLER PIC X(16) VALUE "clare 0103333".
03 FILLER PIC X(16) VALUE "meath 0133936".
03 FILLER PIC X(16) VALUE "monaghan 0052772".
03 FILLER PIC X(16) VALUE "offaly 0063702".
03 FILLER PIC X(16) VALUE "roscommon0053803".
03 FILLER PIC X(16) VALUE "sligo 0058178".
03 FILLER PIC X(16) VALUE "cork 0448181".
03 FILLER PIC X(16) VALUE "donegal 0137383".
03 FILLER PIC X(16) VALUE "dublin 1122600".
03 FILLER PIC X(16) VALUE "galway 0208826".
03 FILLER PIC X(16) VALUE "wexford 0116543".
03 FILLER PIC X(16) VALUE "kerry 0132424".
03 FILLER PIC X(16) VALUE "kildare 0163995".
03 FILLER PIC X(16) VALUE "limerick 0175529".
03 FILLER PIC X(16) VALUE "longford 0031127".
03 FILLER PIC X(16) VALUE "louth 0101802".
03 FILLER PIC X(16) VALUE "mayo 0117428".
02 FILLER REDEFINES TableValues.
03 CountyDetails OCCURS 26 TIMES
INDEXED BY Cidx.
04 CountyName PIC X(9).
04 CountyPop PIC 9(7).
01 PrnCountyPop PIC Z,ZZZ,ZZ9.
PROCEDURE DIVISION.
Begin.
DISPLAY "County name order"
SORT CountyDetails ON ASCENDING KEY CountyName
PERFORM DisplayCountyTotals
VARYING Cidx FROM 1 BY 1 UNTIL Cidx GREATER
THAN 26.
DISPLAY SPACES
DISPLAY "County population order"
SORT CountyDetails ON DESCENDING KEY CountyPop
PERFORM DisplayCountyTotals
VARYING Cidx FROM 1 BY 1 UNTIL Cidx GREATER
THAN 26.
STOP RUN.
DisplayCountyTotals.
MOVE CountyPop(Cidx) TO PrnCountyPop
DISPLAY CountyName(Cidx) " is " PrnCountyPop
350
Chapter 14 ■ Sorting and Merging
Sorting Tables: ISO 2002 Changes
Listing 14-7 shows how to sort a table using an INPUT PROCEDURE and an OUTPUT PROCEDURE. The problem with this solution is the work file. The sort operation, being bound to a file on backing storage, is comparatively slow. Sorting the table would be faster if it could be done wholly in memory.
Sorting a table directly in memory is exactly what the ISO 2002 version of COBOL now allows. The metalanguage for this SORT format is given in Figure 14-12, and Listing 14-8 shows how you can use this format to sort the County table from Listing 14-7.
Figure 14-12. Metalanguage for the ISO 2002 version of SORT
Listing 14-8. Applying SORT Directly to a Table
IDENTIFICATION DIVISION.
PROGRAM-ID. Listing14-8.
*> ISO 2002 Applying the SORT to a table
DATA DIVISION.
WORKING-STORAGE SECTION.
01 CountyTable.
02 TableValues.
03 FILLER PIC X(16) VALUE "kilkenny 0080421".
03 FILLER PIC X(16) VALUE "laois 0058732".
03 FILLER PIC X(16) VALUE "leitrim 0025815".
03 FILLER PIC X(16) VALUE "tipperary0140281".
03 FILLER PIC X(16) VALUE "waterford0101518".
03 FILLER PIC X(16) VALUE "westmeath0072027".
03 FILLER PIC X(16) VALUE "carlow 0045845".
03 FILLER PIC X(16) VALUE "wicklow 0114719".
03 FILLER PIC X(16) VALUE "cavan 0056416".
03 FILLER PIC X(16) VALUE "clare 0103333".
03 FILLER PIC X(16) VALUE "meath 0133936".
03 FILLER PIC X(16) VALUE "monaghan 0052772".
03 FILLER PIC X(16) VALUE "offaly 0063702".
03 FILLER PIC X(16) VALUE "roscommon0053803".
03 FILLER PIC X(16) VALUE "sligo 0058178".
03 FILLER PIC X(16) VALUE "cork 0448181".
03 FILLER PIC X(16) VALUE "donegal 0137383".
03 FILLER PIC X(16) VALUE "dublin 1122600".
03 FILLER PIC X(16) VALUE "galway 0208826".
03 FILLER PIC X(16) VALUE "wexford 0116543".
03 FILLER PIC X(16) VALUE "kerry 0132424".
03 FILLER PIC X(16) VALUE "kildare 0163995".
03 FILLER PIC X(16) VALUE "limerick 0175529".
351
Chapter 14 ■ Sorting and Merging
03 FILLER PIC X(16) VALUE "longford 0031127".
03 FILLER PIC X(16) VALUE "louth 0101802".
03 FILLER PIC X(16) VALUE "mayo 0117428".
02 FILLER REDEFINES TableValues.
03 CountyDetails OCCURS 26 TIMES
INDEXED BY Cidx.
04 CountyName PIC X(9).
04 CountyPop PIC 9(7).
01 PrnCountyPop PIC Z,ZZZ,ZZ9.
PROCEDURE DIVISION.
Begin.
DISPLAY "County name order"
SORT CountyDetails ON ASCENDING KEY CountyName
PERFORM DisplayCountyTotals
VARYING Cidx FROM 1 BY 1 UNTIL Cidx GREATER THAN 26.
DISPLAY SPACES
DISPLAY "County population order"
SORT CountyDetails ON ASCENDING KEY CountyPop
PERFORM DisplayCountyTotals
VARYING Cidx FROM 1 BY 1 UNTIL Cidx GREATER THAN 26.
STOP RUN.
DisplayCountyTotals.
MOVE CountyPop(Cidx) TO PrnCountyPop
DISPLAY CountyName(Cidx) " is " PrnCountyPop.
■ Note For full details, read your implementer manual.
Merging Files
It is often useful to combine two or more files into a single large file. If the files are unordered, this is easy to accomplish because you can simply append the records in one file to the end of the other. But if the files are ordered, the task is somewhat more complicated—especially if there are more than two files—because you must preserve the ordering in the combined file.
In COBOL, instead of having to write special code every time you want to merge files, you can use the MERGE verb.
MERGE takes a number of files, all ordered on the same key values, and combines them based on those key values. The combined file is then sent to an output file or an OUTPUT PROCEDURE.
352
Chapter 14 ■ Sorting and Merging
MERGE Verb
The metalanguage for the MERGE verb is given in Figure 14-13. It should be obvious from the metalanguage that MERGE
shares many of same declarations required for SORT. Just like SORT, MERGE uses a temporary work file that must be defined using an SD entry in the FILE SECTION. Also just as with SORT, the KEY field (on which the files are merged) must be a data item declared in the work file. And just as with SORT, you can use an OUTPUT PROCEDURE to get records from the work file before sending them to their ultimate destination. Unlike with SORT, however, no INPUT PROCEDURE is permitted.
Figure 14-13. Metalanguage for the MERGE verb
MERGE Notes
You should consider the following when using MERGE:
• The results of the MERGE verb are predictable only when the records in the USING files are
ordered as described in the KEY clause associated with the MERGE. For instance, if the MERGE
statement has an ON DESCENDING KEY StudentId clause, then all the USING files must be
ordered on descending StudentId.
• As with SORT, SDWorkFileName is the name of a temporary file, with an SD entry in the FILE
SECTION, SELECT and ASSIGN entries in the INPUT-OUTPUT SECTION, and an organization of
RECORD SEQUENTIAL.
• Each MergeKeyIdentifier identifies a field in the record of the work file. The merged files are
ordered on this key field(s).
• When more than one MergeKeyIdentifier is specified, the keys decrease in significance from
left to right (the leftmost key is most significant, and the rightmost is least significant).
• InFileName and MergedFileName are the names of the input file to be merged and the
resulting combined file produced by the MERGE, respectively. These files are automatically
opened by the MERGE. When the MERGE executes, they must not be already open.
• AlphabetName is an alphabet name defined in the SPECIAL-NAMES paragraph of the
ENVIRONMENT DIVISION. This clause is used to select the character set the SORT verb uses for
collating the records in the file. The character set may be STANDARD-1 (ASCII), STANDARD-2
(ISO 646), NATIVE (may be defined by the system to be A
SCII or EBCDIC; see your
implementer manual), or user defined.
• MERGE can use an OUTPUT PROCEDURE and the RETURN verb to get merged records from
SDWorkFileName.
• The OUTPUT PROCEDURE executes only after the files have been merged and must contain at
least one RETURN statement to get the records from SortFile.
353
Chapter 14 ■ Sorting and Merging
Merging Province Sales Files
Listing 14-9 is an example program that uses MERGE to combine four sequential files, each ordered on ascending ProductCode. The program is based on the following specification.
Listing 14-9. Merging ProvinceSales Files and Producing a Sales Summary File
IDENTIFICATION DIVISION.
PROGRAM-ID. Listing14-9.
AUTHOR. Michael Coughlan.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT UlsterSales ASSIGN TO "Listing14-9ulster.dat"
ORGANIZATION IS LINE SEQUENTIAL.
SELECT ConnachtSales ASSIGN TO "Listing14-9connacht.dat"
ORGANIZATION IS LINE SEQUENTIAL.
SELECT MunsterSales ASSIGN TO "Listing14-9munster.dat"
ORGANIZATION IS LINE SEQUENTIAL.
SELECT LeinsterSales ASSIGN TO "Listing14-9leinster.dat"
ORGANIZATION IS LINE SEQUENTIAL.
SELECT SummaryFile ASSIGN TO "Listing14-9.sum"
ORGANIZATION IS LINE SEQUENTIAL.
SELECT WorkFile ASSIGN TO "WORK.TMP".
DATA DIVISION.
FILE SECTION.
FD UlsterSales.
01 FILLER PIC X(12).
FD ConnachtSales.
01 FILLER PIC X(12).
FD MunsterSales.
01 FILLER PIC X(12).
FD LeinsterSales.
01 FILLER PIC X(12).
FD SummaryFile.
01 SummaryRec.
02 ProductCode-SF PIC X(6).
02 TotalSalesValue PIC 9(6)V99.
354
Chapter 14 ■ Sorting and Merging
SD WorkFile.
01 WorkRec.
88 EndOfWorkfile VALUE HIGH-VALUES.
02 ProductCode-WF PIC X(6).
02 ValueOfSale-WF PIC 9999V99.