0000
*****
Chapter 9 ■ Edited Pictures
2. Show the formatted result that will be produced when the data value is moved to the edited picture.
Sending     Data        Result         Edited Picture
9(6)        000321      ****321        PIC ZZZ,999
9(6)        004321      **4,321        PIC ZZZ,999
9(6)        000004      ****004        PIC ZZZ,999
9(6)        654321      654,321.00     PIC ZZZ,ZZZ.00
9999V99     654321      **6,543.21     PIC ZZZ,ZZZ.ZZ
9999V99     004321      ***$43.21      PIC $$,$$9.99
9999V99     000078      ****$0.78      PIC $$,$$9.99
9999V99     000078      $****0.78      PIC $Z,ZZ9.99
S9999V99    000078      $****0.78      PIC $Z,ZZ9.99CR
S9999V99    -045678     $**456.78CR    PIC $Z,ZZ9.99CR
S9(6)       -123456     -123,456       PIC -999,999
S9(6)       123456      *123,456       PIC -999,999
S9(6)       123456      +123,456       PIC +999,999
S9(6)       -123456     -123,456       PIC +999,999
S9(6)       001234      **+1,234       PIC ++++,++9
9(6)        123456      12*34*56       PIC 99B99B99
9(6)        001234      **1234.00      PIC Z(6).00
9(6)        000092      ****9200       PIC ZZZZZZ00
X(5)        123GO       1*2*3**GO      PIC XBXBXBBXX
9999V99     000123      $******1.23    PIC $***,**9.99
99999V99    24123.45    $4,123.45      PIC $$,$$9.99
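If you want to check any of these answers for yourself, a few lines of COBOL are enough. The following is only a minimal sketch: the program name and the data names SendingItem and EditedItem are my own inventions, and it tests just one row (the value 0043.21 moved to PIC $$,$$9.99), assuming the asterisks shown in that row of the table stand for spaces. The square brackets are there to make the leading spaces visible.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EditCheck.
      *> Illustrative sketch only. The data names are hypothetical
      *> and are not taken from the exercise.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 SendingItem        PIC 9999V99 VALUE 43.21.
       01 EditedItem         PIC $$,$$9.99.
       PROCEDURE DIVISION.
       Begin.
      *> The MOVE applies the zero suppression and floating dollar
      *> editing described by the receiving picture.
           MOVE SendingItem TO EditedItem
           DISPLAY "[" EditedItem "]"
           STOP RUN.
```

When run, this should display [   $43.21], which agrees with the table row if you read each asterisk as a space. To try a different row, change the two PICTURE clauses and the VALUE. With GnuCOBOL, for instance, the program can be compiled using cobc -x.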
Programming Exercise 1: Answer
The answer to this exercise is found in the next chapter, where it appears as an example.
Chapter 10
Processing Sequential Files
Previous chapters introduced the mechanics of creating and reading sequential files. This chapter introduces the two most important sequential-file processing problems: control breaks and the file update problem.
Both control breaks and the file update problem involve manipulating ordered sequential files, so the chapter begins with a discussion of how sequential files are organized and the difference between ordered and unordered sequential files.
The next section discusses control-break problems. These normally occur when a hierarchically structured printed report has to be produced. But control breaks are not limited to printed reports. Any problem that processes a stream of ordered data, and requires action to be taken when one of the items on which the stream is ordered changes, is a control-break problem.
The final section introduces the file-update problem. This involves the thorny difficulty of how to apply a sequential file of ordered transaction records to an ordered sequential master file. This section starts gently by showing how transaction files containing updates of only a single type may be applied to a master file. I then discuss the record buffer implications of transaction files that contain different types of records and introduce a simplified version of the file-update problem. Finally, I discuss and demonstrate an algorithm, based on academic research, which addresses the full complexity of the file-update problem.
File Organization vs. Method of Access
Two important characteristics of files are data organization and method of access. Data organization refers to the way the file’s records are organized on the backing storage device. COBOL recognizes three main types of file organization:
• Sequential: Records are organized serially.
• Relative: A direct-access file is used and is organized on relative record number.
• Indexed: A direct-access file is used and has an index-based organization.
Method of access refers to the way in which records are accessed. Some approaches to organization are more versatile than others. A file with indexed or relative organization may still have its records accessed sequentially; but the records in a sequential file can only be accessed sequentially.
To understand the difference between file organization and method of access, consider them in the context of a library with a large book collection. Most of the books in the library are organized by Dewey Decimal number; but some, awaiting shelving, are organized in the order in which they were purchased. A reader looking for a book in the main part of the library might find it by looking up its Dewey Decimal number in the library index or might just go to the particular section and browse through the books on the shelves. Because the books are organized by Dewey Decimal number, the reader has a choice regarding the method of access. But if the desired book is in the newly acquired section, the reader has no choice. They have to browse through all the titles to find the one they want. This is the difference between direct-access files and sequential files. Direct-access files offer a choice of access methods. Sequential files can only be processed sequentially.
Sequential Organization
Sequential organization is the simplest type of file organization. In a sequential file, the records are arranged serially, one after another, like cards in a dealing shoe. The only way to access a particular record is to start at the first record and read all the succeeding records until the required record is found or until the end of the file is reached.
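To make the serial search concrete, here is a minimal sketch of it. Everything in it is an assumption made for illustration: the file name SerialSearch.dat, the record layout, the three customer records, and the data names are all invented. The program first writes a small test file and then looks for one record by starting at the first record and reading forward.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SerialSearch.
      *> Illustrative sketch only. File name, record layout, and
      *> data values are hypothetical.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT CustomerFile ASSIGN TO "SerialSearch.dat"
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD CustomerFile.
       01 CustomerRec.
          88 EndOfCustomerFile VALUE HIGH-VALUES.
          02 CustomerId      PIC X(4).
          02 CustomerName    PIC X(10).
       WORKING-STORAGE SECTION.
       01 WantedId           PIC X(4) VALUE "C003".
       01 RecordsSkipped     PIC 99 VALUE ZERO.
       PROCEDURE DIVISION.
       Begin.
      *> Build a three-record test file (invented data).
           OPEN OUTPUT CustomerFile
           MOVE "C001Adams" TO CustomerRec
           WRITE CustomerRec
           MOVE "C002Byrne" TO CustomerRec
           WRITE CustomerRec
           MOVE "C003Clarke" TO CustomerRec
           WRITE CustomerRec
           CLOSE CustomerFile
      *> Search serially: read from the first record until the
      *> wanted record is found or the end of file is reached.
           OPEN INPUT CustomerFile
           PERFORM ReadCustomer
           PERFORM UNTIL EndOfCustomerFile OR CustomerId = WantedId
               ADD 1 TO RecordsSkipped
               PERFORM ReadCustomer
           END-PERFORM
           IF EndOfCustomerFile
               DISPLAY WantedId " not found"
           ELSE
               DISPLAY "Found " CustomerName
               DISPLAY "Records read past: " RecordsSkipped
           END-IF
           CLOSE CustomerFile
           STOP RUN.

       ReadCustomer.
           READ CustomerFile
               AT END SET EndOfCustomerFile TO TRUE
           END-READ.
```

With this test data, the wanted record is the third one, so two records have to be read and discarded before it is reached. There is no way to jump straight to it.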
Ordered and Unordered Files
Sequential files may be ordered or unordered (unordered files should really be called serial files). In an ordered file, the records are sequenced (see Table 10-1) on a particular field in the record, such as CustomerId or CustomerName. In an unordered file, the records are not in any particular order.
Table 10-1. Ordered and Unordered Files
Ordered File      Unordered File
Record-KeyA       Record-KeyM
Record-KeyB       Record-KeyH
Record-KeyD       Record-KeyO
Record-KeyG       Record-KeyB
Record-KeyH       Record-KeyN
Record-KeyK       Record-KeyA
Record-KeyM       Record-KeyT
Record-KeyO       Record-KeyK
Record-KeyT       Record-KeyG
The ordering of the records in a file has a significant impact on the way in which it is processed and the processing that can be applied to it.
Control-Break Processing
Control-break processing is a technique generally applied to an ordered sequential file in order to create a printed report. But it can also be used for other purposes such as creating a summary file. For control-break processing to work, the input file must be sorted in the same order as the output to be produced.
A control-break program works by monitoring one or more control items (fields in the record) and taking action when the value in one of the control items changes (the control break). In a control-break program with multiple control-break items, the control breaks are usually hierarchical, such that a break in a major control item automatically causes a break in the minor controls even if the actual value of the minor item does not change. For instance, Figure 10-1 partially models a file that holds details of magazine sales. When the major control item changes from England to Ireland, this also causes the minor control item to break even though its value is unchanged. You can see the logic behind this: it is unlikely that the same individual (Maxwell) lives in both countries.
Figure 10-1. Partial model of a file containing details of magazine sales. A major control break also causes a break of the minor control item
Specifications that Require Control Breaks
To get a feel for the kinds of problems that require a control-break solution, consider the following specifications.
Specification Requiring a Single Control Break
Write a program to process the UnemploymentPayments file to produce a report showing the annual Social Welfare unemployment payments made in each county in Ireland. The report must be printed and sequenced on ascending CountyName. The UnemploymentPayments file is a sequential file ordered on ascending CountyName.
In this specification, the control-break item is the CountyName. The processing required is to sum the payments for a particular county and then, when the county name changes, to print the county name and the total unemployment payments for that county.
Specification Requiring Two Control Breaks
A program is required to process the MagazineSales file to produce a report showing the total spent by customers in each country on magazines. The report must be printed on ascending CustomerName within ascending CountryName.
The MagazineSales file is a sequential file ordered on ascending CustomerName within ascending CountryName.
Figure 10-1 models the MagazineSales file and shows what is meant by “ordered on ascending CustomerName within ascending CountryName.” Notice how the records are in order of ascending country name, but all the records for a particular country are in order of ascending customer name.
In this specification, the control-break items are the CountryName (major) and the CustomerName (minor).
Specification Requiring Three Control Breaks
Electronics2Go has branches in a number of American states. A program is required to produce a report showing the total sales made by each salesperson, the total sales for each branch, the total sales for each state, and a final total of sales for the entire United States. The report must be printed on ascending SalespersonId within ascending BranchId within ascending StateName.
The report is based on the CompanySales file. This file holds details of sales made in all the branches of the company. It is a sequential file, ordered on ascending SalespersonId, within ascending BranchId, within ascending StateName.
In this specification, the control-break items are the StateName (major), the BranchId (minor), and the SalespersonId (most minor).
Detecting the Control Break
A major consideration in a control-break program is how to detect the control break. If you examine the data in Figure 10-1, you can see the control breaks quite clearly. When the country name changes from England to Ireland, a major control break has occurred. When the customer surname changes from Molloy to Power, a minor control break has occurred. It is easy for you to see the control breaks in the data file, but how can you detect these control breaks programmatically?
The way you do this is to compare the value of the control field in the record against the previous value of the control field. How do you know the previous value of the control field? You must store it in a data item specifically set up for the purpose. For instance, if you were writing a control-break program for the data in Figure 10-1, you might create the data items PrevCountryName and PrevCustomerName to store the control-break values. Detecting the control break then simply becomes a matter of comparing the values in these fields with the values in the fields of the current record.
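The following sketch shows this comparison at work for the single-control-break specification given earlier. Treat it as an illustration under stated assumptions rather than a finished solution: the file name CountyBreak.dat, the record layout, the test data, and the names PrevCountyName and CountyTotal are all invented, and the totals are simply displayed instead of being written to a printed report.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CountyBreak.
      *> Illustrative sketch only. File name, record layout, and
      *> test data are hypothetical.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT PaymentsFile ASSIGN TO "CountyBreak.dat"
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD PaymentsFile.
       01 PaymentRec.
          88 EndOfPaymentsFile VALUE HIGH-VALUES.
          02 CountyName      PIC X(9).
          02 Payment         PIC 9(4).
       WORKING-STORAGE SECTION.
       01 PrevCountyName     PIC X(9).
       01 CountyTotal        PIC 9(6).
       01 PrnCountyTotal     PIC ZZZ,ZZ9.
       PROCEDURE DIVISION.
       Begin.
           PERFORM CreateTestFile
           OPEN INPUT PaymentsFile
           PERFORM ReadPayment
           PERFORM UNTIL EndOfPaymentsFile
      *>       Remember the control value for this group of records.
               MOVE ZEROS TO CountyTotal
               MOVE CountyName TO PrevCountyName
      *>       The control break is detected by comparing the
      *>       current value with the stored previous value.
               PERFORM UNTIL CountyName NOT = PrevCountyName
                          OR EndOfPaymentsFile
                   ADD Payment TO CountyTotal
                   PERFORM ReadPayment
               END-PERFORM
               MOVE CountyTotal TO PrnCountyTotal
               DISPLAY PrevCountyName " " PrnCountyTotal
           END-PERFORM
           CLOSE PaymentsFile
           STOP RUN.

       ReadPayment.
           READ PaymentsFile
               AT END SET EndOfPaymentsFile TO TRUE
           END-READ.

       CreateTestFile.
      *> Invented records, ordered on ascending CountyName.
           OPEN OUTPUT PaymentsFile
           MOVE "Clare    0150" TO PaymentRec
           WRITE PaymentRec
           MOVE "Clare    0200" TO PaymentRec
           WRITE PaymentRec
           MOVE "Kerry    0100" TO PaymentRec
           WRITE PaymentRec
           MOVE "Mayo     0300" TO PaymentRec
           WRITE PaymentRec
           MOVE "Mayo     0050" TO PaymentRec
           WRITE PaymentRec
           CLOSE PaymentsFile.
```

With this test data the program should display one line per county, with totals of 350 for Clare, 100 for Kerry, and 350 for Mayo. Notice that the total is displayed alongside PrevCountyName, not CountyName, because by the time the break is detected the record buffer already holds the first record of the next county.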
Writing a Control-Break Program
The first instinct programmers seem to have when writing a control-break program is to code the solution as a single loop and to use IF statements (often nested IF statements) to handle the control breaks. This approach results in a cumbersome solution. A better technique is to recognize the structure of the data in the data file and in the report and to create a program that echoes that structure. This echoed structure uses a hierarchy of loops to process the control breaks. This idea is not original; it is essentially that advocated by Michael Jackson in Jackson Structured Programming (JSP).1
When you use this approach, the code for processing each control item becomes:
   Initialize control items (Totals and PrevControlItems)
   Loop Until control break
   Finalize control items (Print Totals)
1. Michael Jackson. Principles of Program Design. Academic Press, 1975.
Control-Break Program Template
Example 10-1 gives a template for writing a control-break program. The program structure echoes the structure of the input and output data. The control breaks are processed by a hierarchy of loops, where the inner loop processes the most minor control break.
Example 10-1. Template for Control-Break Programs
OPEN File
Read next record from file
PERFORM UNTIL EndOfFile
   MOVE ZEROS TO totals of ControlItem1
   MOVE ControlItem1 TO PrevControlItem1
   PERFORM UNTIL ControlItem1 NOT EQUAL TO PrevControlItem1
              OR EndOfFile
      MOVE ZEROS TO totals of ControlItem2
      MOVE ControlItem2 TO PrevControlItem2
      PERFORM UNTIL ControlItem2 NOT EQUAL TO PrevControlItem2
                 OR ControlItem1 NOT EQUAL TO PrevControlItem1
                 OR EndOfFile
         Process record
         Read next record from file
      END-PERFORM
      Process totals of ControlItem2
   END-PERFORM
   Process totals of ControlItem1
END-PERFORM
Process final totals
CLOSE file
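As a rough illustration of how the template might be filled in, the sketch below applies it to a cut-down version of the magazine sales problem, with CountryName as the major control item and CustomerName as the minor one. The file name TwoLevelBreak.dat, the record layout, and the test data are assumptions of mine, and DISPLAY stands in for the printed report. The test data deliberately ends England and starts Ireland with a customer called Maxwell, so you can see the major break forcing a minor break.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TwoLevelBreak.
      *> Illustrative sketch only. File name, record layout, and
      *> test data are hypothetical.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT MagazineSales ASSIGN TO "TwoLevelBreak.dat"
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD MagazineSales.
       01 SaleRec.
          88 EndOfSales      VALUE HIGH-VALUES.
          02 CountryName     PIC X(8).
          02 CustomerName    PIC X(8).
          02 SaleValue       PIC 9(3).
       WORKING-STORAGE SECTION.
       01 PrevCountryName    PIC X(8).
       01 PrevCustomerName   PIC X(8).
       01 CountryTotal       PIC 9(5).
       01 CustomerTotal      PIC 9(5).
       01 PrnTotal           PIC ZZ,ZZ9.
       PROCEDURE DIVISION.
       Begin.
           PERFORM CreateTestFile
           OPEN INPUT MagazineSales
           PERFORM ReadSale
           PERFORM UNTIL EndOfSales
      *>       Major control item: one pass per country.
               MOVE ZEROS TO CountryTotal
               MOVE CountryName TO PrevCountryName
               PERFORM UNTIL CountryName NOT = PrevCountryName
                          OR EndOfSales
      *>           Minor control item: one pass per customer.
                   MOVE ZEROS TO CustomerTotal
                   MOVE CustomerName TO PrevCustomerName
                   PERFORM UNTIL CustomerName NOT = PrevCustomerName
                              OR CountryName NOT = PrevCountryName
                              OR EndOfSales
                       ADD SaleValue TO CustomerTotal CountryTotal
                       PERFORM ReadSale
                   END-PERFORM
                   MOVE CustomerTotal TO PrnTotal
                   DISPLAY "  " PrevCustomerName " " PrnTotal
               END-PERFORM
               MOVE CountryTotal TO PrnTotal
               DISPLAY PrevCountryName " total " PrnTotal
           END-PERFORM
           CLOSE MagazineSales
           STOP RUN.

       ReadSale.
           READ MagazineSales
               AT END SET EndOfSales TO TRUE
           END-READ.

       CreateTestFile.
      *> Invented records, ordered on ascending CustomerName
      *> within ascending CountryName.
           OPEN OUTPUT MagazineSales
           MOVE "England Adams   005" TO SaleRec
           WRITE SaleRec
           MOVE "England Maxwell 010" TO SaleRec
           WRITE SaleRec
           MOVE "England Maxwell 020" TO SaleRec
           WRITE SaleRec
           MOVE "Ireland Maxwell 030" TO SaleRec
           WRITE SaleRec
           MOVE "Ireland Power   015" TO SaleRec
           WRITE SaleRec
           CLOSE MagazineSales.
```

With this data the program should report Maxwell twice, with a total of 30 under England and a separate total of 30 under Ireland, and country totals of 35 and 45. It is the test on CountryName in the innermost loop that keeps the two Maxwells apart. If you remove that condition, the three Maxwell records are summed as one customer and the England total is wrong.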
Three-Level Control Break
Let’s see how all this works in an actual example. As the basis for the example, let’s use a modified version of the three-control-break specification given earlier.
Electronics2Go has branches in a number of American states. A program is required to produce a summary report showing the total sales made by each salesperson, the total sales for each branch, the total sales for each state, and a final total of sales for the entire United States. The report must be printed on ascending SalespersonId within ascending BranchId within ascending StateName.
The report is based on the Electronics2Go sales file. This file holds details of sales made in all the branches of the company. It is a sequential file, ordered on ascending SalespersonId, within ascending BranchId, within ascending StateName. Each record in the sales file has the following description:
Field            Type    Length    Value
StateName        X       14        -
BranchId         X       5         -
SalespersonId    X       6         99999X (M/F)
ValueOfSale      9       6         0000.00–9999.99
The report format should follow the template in Figure 10-2. In the report template, the SalesTotal field is the sum of the sales made by this salesperson. The Branch Total is the sum of the sales made by each branch. The State Total is the sum of the sales made by all the branches in the state. The Final Total is the sum of the sales made in the United States.
Figure 10-2. Template for the Electronics2Go sales report
In all sales value fields, leading zeros should be suppressed and the dollar symbol should float against the value.
The State Name and the Branch should be suppressed after their first occurrence. For simplicity, the headings are only printed once, so no page count or line numbers need be tracked.
■ Note  The full state name is used in every record of the sales file. This is a waste of space. Normally a code representing the state would be used, and the program would convert this code into a state name by means of a lookup table. Because you have not yet encountered lookup tables, I have decided to use the full state name in the file.
Three-Level Control-Break Program
Listing 10-1 shows a program that implements the Electronics2Go Sales Report specification.
Listing 10-1. Three-Control-Break Electronics2Go Sales Report
IDENTIFICATION DIVISION.
PROGRAM-ID. Listing10-1.
AUTHOR. Michael Coughlan.
* A three level Control Break program to process the Electronics2Go
* Sales file and produce a report that shows the value of sales for
* each Salesperson, each Branch, each State, and for the Country.
* The SalesFile is sorted on ascending SalespersonId within BranchId
* within Statename.
* The report must be printed in the same order

ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT SalesFile ASSIGN TO "Listing10-1TestData.Dat"
           ORGANIZATION IS LINE SEQUENTIAL.

    SELECT SalesReport ASSIGN TO "Listing10-1.RPT"
           ORGANIZATION IS LINE SEQUENTIAL.

DATA DIVISION.
FILE SECTION.
FD  SalesFile.
01  SalesRecord.
    88 EndOfSalesFile   VALUE HIGH-VALUES.
    02 StateName        PIC X(14).
    02 BranchId         PIC X(5).
    02 SalesPersonId    PIC X(6).
    02 ValueOfSale      PIC 9(4)V99.

FD  SalesReport.
01  PrintLine           PIC X(55).

WORKING-STORAGE SECTION.
01  ReportHeading.
    02 FILLER           PIC X(35)
       VALUE " Electronics2Go Sales Report".

01  SubjectHeading.
    02 FILLER           PIC X(43)
       VALUE "State Name Branch SalesId SalesTotal".