Michael Coughlan
some language elements remain that, if used, make it difficult and in some cases impossible to write good programs.
ALTER verb, I’m thinking of you.
COBOL Syntax Metalanguage
COBOL syntax is defined using a notation sometimes called the COBOL metalanguage. In this notation:
• Words in uppercase are reserved words. When underlined, they are mandatory. When not
underlined, they are noise words, used for readability only, and are optional.
• Words in mixed case represent names that must be devised by the programmer (such as the
names of data items).
• When material is enclosed in curly braces { }, a choice must be made from the options within
the braces. If there is only one option, then that item is mandatory.
• When material is enclosed in square brackets [ ], the material is optional and may be
included or omitted as required.
• When the ellipsis symbol ... (three dots) is used, it indicates that the preceding syntactic
element may be repeated at your discretion.
• To assist readability, the comma, semicolon, and space characters may be used as separators
in a COBOL statement, but they have no semantic effect. For instance, the following
statements are semantically identical:
ADD Num1 Num2 Num3 TO Result
ADD Num1, Num2, Num3 TO Result
ADD Num1; Num2; Num3 TO Result
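As a quick check on the separator rule, the three statements above can be wrapped in a small program. This is only a sketch: the program name, the initial values, and the DISPLAY text are assumptions added for illustration. The source lines start in column 8, with the asterisk of each comment in column 7, so the text should be acceptable to a compiler such as GnuCOBOL in either fixed or free format.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SeparatorSketch.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> Illustrative values only; they are not taken from the text
       01 Num1   PIC 99 VALUE 1.
       01 Num2   PIC 99 VALUE 2.
       01 Num3   PIC 99 VALUE 3.
       01 Result PIC 99 VALUE ZEROS.
       PROCEDURE DIVISION.
       Begin.
      *> Operands separated by spaces
           ADD Num1 Num2 Num3 TO Result
           DISPLAY "Spaces:     " Result
      *> Operands separated by commas
           MOVE ZEROS TO Result
           ADD Num1, Num2, Num3 TO Result
           DISPLAY "Commas:     " Result
      *> Operands separated by semicolons
           MOVE ZEROS TO Result
           ADD Num1; Num2; Num3 TO Result
           DISPLAY "Semicolons: " Result
           STOP RUN.
```

With these values, each DISPLAY should show 06, because the separators change only how the statement reads, not what it does.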
In addition to the metalanguage diagrams, syntax rules govern the interpretation of metalanguage. For instance, the metalanguage for PERFORM..VARYING (see Figure 2-3) implies that you can have as many AFTER phrases as desired.
In fact, as you will discover when I discuss this construct in Chapter 6, only two are allowed.
Figure 2-3. PERFORM..VARYING metalanguage
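A minimal sketch of a PERFORM..VARYING with a single AFTER phrase may make the construct easier to picture ahead of Chapter 6. The program name, data names, paragraph name, and loop limits are all assumptions chosen for illustration, and the out-of-line form of PERFORM is used.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PerformVaryingSketch.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 RowNum PIC 9 VALUE ZEROS.
       01 ColNum PIC 9 VALUE ZEROS.
       PROCEDURE DIVISION.
       Begin.
      *> ColNum (the AFTER phrase) runs through all its values
      *> for each value of RowNum, like an inner loop
           PERFORM ShowCell
               VARYING RowNum FROM 1 BY 1 UNTIL RowNum > 2
               AFTER ColNum FROM 1 BY 1 UNTIL ColNum > 3
           STOP RUN.
       ShowCell.
           DISPLAY "Row " RowNum " Column " ColNum.
```

The counter in the AFTER phrase varies fastest, so ShowCell is executed for columns 1 to 3 of row 1 before RowNum moves on to 2.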
Chapter 2 ■ COBOL Foundation
Some Notes on Syntax Diagrams
As mentioned in the previous section, the interpretation of the COBOL metalanguage is modified by syntax rules.
Because it can be tedious to wade through all the rules for each COBOL construct, this book uses a modified form of the syntax diagram. In this modified diagram, special operand suffixes indicate the type of the operand; these are shown in Table 2-1.
Table 2-1. Special Metalanguage Operand Suffixes
Suffix   Meaning
$i       Uses an alphanumeric data item
$il      Uses an alphanumeric data item or a string literal
#i       Uses a numeric data item
#il      Uses a numeric data item or numeric literal
$#i      Uses a numeric or an alphanumeric data item
Example Metalanguage
As an example of how the metalanguage for a COBOL verb is interpreted, the syntax for the COMPUTE verb is shown in Figure 2-4. I’m presenting COMPUTE here because, as the COBOL arithmetic verb (the others are ADD, SUBTRACT, MULTIPLY, DIVIDE) that’s closest to the way things are done in many other languages, it will be a point of familiarity.
The operation of COMPUTE is discussed in more detail in Chapter 4.
Figure 2-4. COMPUTE metalanguage syntax diagram
The COMPUTE verb assigns the result of an arithmetic expression to a variable or variables. The interpretation of the COMPUTE metalanguage is as follows:
• A COMPUTE statement must start with the keyword COMPUTE.
• The keyword must be followed by the name of a numeric data item that receives the result of
the calculation (the suffix #i indicates that the operand must be the name of a numeric data
item [variable]).
• The equals sign (=) must be used.
• An arithmetic expression must follow the equals sign.
• The square brackets [ ] around the word ROUNDED indicate that rounding is optional. Because
the word ROUNDED is underlined, the word must be used if rounding is required.
• The ellipsis symbol (...) indicates that there can be more than one Result#i data item.
• The ellipsis occurs outside the curly braces {}, which means each result field can have its own
ROUNDED phrase.
In other words, you could have a COMPUTE statement like
COMPUTE Result1 ROUNDED, Result2 = ((9 * 9) + 8) / 5
where Result1 would be assigned a value of 18 (rounded 17.8) and Result2 would be
assigned a value of 17 (truncated 17.8), assuming both Result1 and Result2 were defined
as PIC 99.
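The statement above can be tried out in a short program along the following lines. Only the COMPUTE statement and the PIC 99 definitions come from the text; the program name and the DISPLAY statements are assumptions added so the results can be seen.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ComputeRoundedSketch.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 Result1 PIC 99 VALUE ZEROS.
       01 Result2 PIC 99 VALUE ZEROS.
       PROCEDURE DIVISION.
       Begin.
      *> ((9 * 9) + 8) / 5 = 89 / 5 = 17.8
      *> Result1 has its own ROUNDED phrase; Result2 does not
           COMPUTE Result1 ROUNDED, Result2 = ((9 * 9) + 8) / 5
           DISPLAY "Result1 = " Result1
           DISPLAY "Result2 = " Result2
           STOP RUN.
```

Because neither receiving field has decimal places, Result1 should display as 18 (17.8 rounded) and Result2 as 17 (17.8 truncated).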
Structure of COBOL Programs
COBOL is much more rigidly structured than most other programming languages. COBOL programs are hierarchical in structure. Each element of the hierarchy consists of one or more subordinate elements. The program hierarchy consists of divisions, sections, paragraphs, sentences, and statements (see Figure 2-5).
Figure 2-5. Hierarchical COBOL program structure
A COBOL program is divided into distinct parts called divisions. A division may contain one or more sections.
A section may contain one or more paragraphs. A paragraph may contain one or more sentences, and a sentence one or more statements.
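The hierarchy can be seen in even a very small program. The following is a sketch only; the program name, section name, paragraph names, data items, and values are all assumptions made up for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. HierarchySketch.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 GrossPay PIC 9(4) VALUE 1000.
       01 Tax      PIC 9(4) VALUE 200.
       01 NetPay   PIC 9(4) VALUE ZEROS.
       PROCEDURE DIVISION.
      *> A section (the name is devised by the programmer)
       CalculatePay SECTION.
      *> A paragraph within the section
       Begin.
      *> A sentence made up of a single statement
           SUBTRACT Tax FROM GrossPay GIVING NetPay.
      *> A sentence made up of three statements
           DISPLAY "Net pay is " NetPay
           PERFORM ShowFarewell
           STOP RUN.
      *> A second paragraph in the same section
       ShowFarewell.
           DISPLAY "Calculation complete".
```

Reading from the top down, the divisions contain sections, the CalculatePay section contains two paragraphs, and each paragraph contains sentences that end with a period.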
■ Note  Programmers unused to this sort of rigidity may find it irksome or onerous, but this layout offers some practical advantages. Many of the programmatic items that might need to be modified as a result of an environmental change are defined in the ENVIRONMENT DIVISION. External references, such as to devices, files, collating sequences, the currency symbol, and the decimal point symbol, are all defined in the ENVIRONMENT DIVISION.
Divisions
The division is the major structural element in COBOL. Later in this chapter, I discuss the purpose of each division.
For now, you can note that there are four divisions: the IDENTIFICATION DIVISION, the ENVIRONMENT DIVISION, the DATA DIVISION, and the PROCEDURE DIVISION.
Sections
A section is made up of one or more paragraphs. A section begins with the section name and ends where the next section name is encountered or where the program text ends.
A section name consists of a name devised by the programmer or defined by the language, followed by the word SECTION, followed by a period (full stop). Some examples of section names are given in Example 2-1.
In the first three divisions, sections are an organizational structure defined by the language. But in the PROCEDURE DIVISION, where you write the program’s executable statements, sections and paragraphs are used to identify blocks of code that can be executed using PERFORM or GO TO.
Example 2-1. Example Section Names
SelectTexasRecords SECTION.
FILE SECTION.
CONFIGURATION SECTION.
INPUT-OUTPUT SECTION.
Paragraphs
A paragraph consists of one or more sentences. A paragraph begins with a paragraph name and ends where the next section name or paragraph name is encountered or where the program text ends.
In the first three divisions, paragraphs are an organizational structure defined by the language (see Example 2-2).
But in the PROCEDURE DIVISION, paragraphs are used to identify blocks of code that can be executed using PERFORM or GO TO (see Example 2-3).
Example 2-2. ENVIRONMENT DIVISION Entries Required for a File Declaration
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT ExampleFile ASSIGN TO "Example.Dat"
ORGANIZATION IS SEQUENTIAL.
Example 2-3. PROCEDURE DIVISION with Two Paragraphs (Begin and DisplayGreeting)
PROCEDURE DIVISION.
Begin.
PERFORM DisplayGreeting 10 TIMES.
STOP RUN.
DisplayGreeting.
DISPLAY "Greetings from COBOL".
Sentences
A sentence consists of one or more statements and is terminated by a period. There must be at least one sentence, and hence one period, in a paragraph. Example 2-4 shows two sentences. The first sentence also happens to be a statement; the second consists of three statements.
Example 2-4. Two Sentences
SUBTRACT Tax FROM GrossPay GIVING NetPay.
MOVE .21 TO VatRate
COMPUTE VatAmount = ProductCost * VatRate
DISPLAY "The VAT amount is - " VatAmount.
Statements
In COBOL, language statements are referred to as verbs. A statement starts with the name of the verb and is followed by the operand or operands on which the verb acts. Example 2-5 shows three statements.
Example 2-5. Three Statements
DISPLAY "Enter name " WITH NO ADVANCING
ACCEPT StudentName
DISPLAY "Name entered was " StudentName
In Table 2-2, the major COBOL verbs are categorized by type. The arithmetic verbs are used in computations, the file-handling verbs are used to manipulate files, the flow-of-control verbs are used to alter the normal sequential execution of program statements, the table-handling verbs are used to manipulate tables (arrays), and the string-handling verbs allow such operations as character counting, string splitting, and string concatenation.
Table 2-2. Major COBOL Verbs, Categorized by Type
Arithmetic  File Handling  Flow of Control  Assignment & I-O  Table Handling  String Handling
COMPUTE     OPEN           IF               MOVE              SEARCH          INSPECT
ADD         CLOSE          EVALUATE         SET               SEARCH ALL      STRING
SUBTRACT    READ           PERFORM          INITIALIZE        SET             UNSTRING
MULTIPLY    WRITE          GO TO            ACCEPT
DIVIDE      DELETE         CALL             DISPLAY
            REWRITE        STOP RUN
            START          EXIT PROGRAM
            SORT
            RETURN
            RELEASE
The Four Divisions
At the top of the COBOL hierarchy are the four divisions. These divide the program into distinct structural elements.
Although some of the divisions may be omitted, the sequence in which they are specified is fixed and must be as follows. Just like section names and paragraph names, division names must be followed by a period:
IDENTIFICATION DIVISION.   Contains information about the program
ENVIRONMENT DIVISION.      Contains environment information
DATA DIVISION.             Contains data descriptions
PROCEDURE DIVISION.        Contains the program algorithms
IDENTIFICATION DIVISION
The purpose of the IDENTIFICATION DIVISION is to provide information about the program to you, the compiler, and the linker. The PROGRAM-ID paragraph is the only entry required. In fact, this entry is required in every program.
Nowadays all the other entries have the status of comments (which are not processed when the program runs), but you may still find it useful to include paragraphs such as AUTHOR and DATE-WRITTEN.
The PROGRAM-ID is followed by a user-devised name that is used to identify the program internally. This name may be different from the file name given to the program when it was saved to backing storage. The metalanguage for the PROGRAM-ID is
PROGRAM-ID. UserAssignedProgramName
[IS [COMMON] [INITIAL] PROGRAM].
The metalanguage items in square brackets apply only to subprograms, so I will reserve discussion of these items until later in the book.
When a number of independently compiled programs are combined by the linker into a single executable run-unit, each program is identified by the name given in its PROGRAM-ID. When control is passed to a particular program by means of a CALL verb, the target of the CALL invocation is the name given in the subprogram’s PROGRAM-ID; for instance:
CALL "PrintSummaryReport".
Example 2-6 shows an example IDENTIFICATION DIVISION. Pay particular attention to the periods; they are required.
Example 2-6. Sample IDENTIFICATION DIVISION
IDENTIFICATION DIVISION.
PROGRAM-ID. PrintSummaryReport.
AUTHOR. Michael Coughlan.
DATE-WRITTEN. 20th June 2013.
ENVIRONMENT DIVISION
The ENVIRONMENT DIVISION is used to describe the environment in which the program works. It isolates in one place all aspects of the program that are dependent on items in the environment in which the program runs. The idea is to make it easy to change the program when it has to run on a different computer or one with different peripheral devices or when the program is being used in a different country.
The ENVIRONMENT DIVISION consists of two sections: the CONFIGURATION SECTION and the INPUT-OUTPUT
SECTION. In the CONFIGURATION SECTION, the SPECIAL-NAMES paragraph allows you to specify such environmental details as what alphabet to use, what currency symbol to use, and what decimal point symbol to use. In the INPUT-OUTPUT SECTION, the FILE-CONTROL paragraph lets you connect internal file names with external devices and files.
Example 2-7 shows some example CONFIGURATION SECTION entries. A few notes about the listing:
• In some countries the meaning of the decimal point and the comma are reversed. For
instance, the number 1,234.56 is sometimes written 1.234,56. The DECIMAL-POINT IS COMMA
clause specifies that the program conforms to this scheme.
• The SYMBOLIC CHARACTERS clause lets you assign a name to one of the unprintable characters.
In this example, names for the escape, carriage return, and line-feed characters have been
defined by specifying their ordinal position (not value) in the character set.
• The SELECT and ASSIGN clauses let you connect the name you use for a file in the program with
its actual name and location on disk.
Example 2-7. CONFIGURATION SECTION Examples
IDENTIFICATION DIVISION.
PROGRAM-ID. ConfigurationSectionExamples.
AUTHOR. Michael Coughlan.
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SPECIAL-NAMES.
DECIMAL-POINT IS COMMA.
SYMBOLIC CHARACTERS ESC CR LF
ARE 28 14 11.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT StockFile ASSIGN TO "D:\DataFiles\Stock.dat"
ORGANIZATION IS SEQUENTIAL.
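To see what DECIMAL-POINT IS COMMA changes in practice, a sketch along the following lines may help. The program name, data names, value, and edited picture are assumptions made up for illustration. Once the clause is in force, the comma acts as the decimal point in numeric literals and in edited pictures, and the period takes over the comma's role as the thousands separator.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DecimalCommaSketch.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
           DECIMAL-POINT IS COMMA.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> The literal 1234,56 uses the comma as its decimal point
       01 Amount      PIC 9(4)V99 VALUE 1234,56.
      *> In the edited picture the period separates thousands
       01 PrintAmount PIC Z.ZZ9,99.
       PROCEDURE DIVISION.
       Begin.
           MOVE Amount TO PrintAmount
           DISPLAY "Amount is " PrintAmount
           STOP RUN.
```

With the clause in place, the amount should be displayed as 1.234,56 rather than 1,234.56.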
DATA DIVISION
The DATA DIVISION is used to describe most of the data that a program processes. The obvious exception to this is literal data, which is defined in situ as a string or numeric literal such as “Freddy Ryan” or -345.74.
The DATA DIVISION is divided into four sections:
• The FILE SECTION
• The WORKING-STORAGE SECTION
• The LINKAGE SECTION
• The REPORT SECTION
The first two are the main sections. The LINKAGE SECTION is used only in subprograms, and the REPORT SECTION
is used only when generating reports. The LINKAGE and REPORT sections are discussed more fully when you encounter the elements that require them later in the book. For now, only the first two sections need concern you.
File Section
The FILE SECTION describes the data that is sent to, or comes from, the computer’s data storage peripherals. These include such devices as card readers, magnetic tape drives, hard disks, CDs, and DVDs.
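As a rough sketch of how a FILE SECTION entry ties in with the SELECT and ASSIGN clauses of Example 2-2, the following program declares a record for ExampleFile and writes one record to it. The program name, the record layout (ExampleRecord, CustomerId, CustomerName), and the values written are assumptions chosen only for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FileSectionSketch.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT ExampleFile ASSIGN TO "Example.Dat"
               ORGANIZATION IS SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
      *> The FD entry uses the internal file name from the SELECT
       FD ExampleFile.
      *> Hypothetical record layout for the data sent to the file
       01 ExampleRecord.
          02 CustomerId   PIC 9(5).
          02 CustomerName PIC X(20).
       PROCEDURE DIVISION.
       Begin.
           OPEN OUTPUT ExampleFile
           MOVE 12345 TO CustomerId
           MOVE "Freddy Ryan" TO CustomerName
           WRITE ExampleRecord
           CLOSE ExampleFile
           STOP RUN.
```

The record description in the FILE SECTION is the only place the program says what the data in the file looks like; the SELECT and ASSIGN clauses say only where the file is.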
Working-Storage Section
The WORKING-STORAGE SECTION describes the general variables used in the program. The COBOL metalanguage
showing the general structure and syntax of the DATA DIVISION is given in Figure 2-6 and is followed by a fragment of an example COBOL program in Example 2-8.
Figure 2-6. DATA DIVISION metalanguage
Example 2-8. Simple Data Declarations
IDENTIFICATION DIVISION.
PROGRAM-ID. SimpleDataDeclarations.
AUTHOR. Michael Coughlan.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 CardinalNumber PIC 99 VALUE ZEROS.
01 IntegerNumber PIC S99 VALUE -14.
01 DecimalNumber PIC 999V99 VALUE 543.21.
01 ShopName PIC X(30) VALUE SPACES.
01 ReportHeading PIC X(25) VALUE "=== Employment Report ===".
Data Hierarchy
All the data items in Example 2-8 are independent, elementary items. Although data hierarchy is too complicated a topic to deal with at this point, a preview of hierarchical data declaration is given in BirthDate (see Example 2-9).
Example 2-9. Example of a Hierarchical Data Declaration
01 BirthDate.
02 YearOfBirth.
03 CenturyOB PIC 99.
03 YearOB PIC 99.
02 MonthOfBirth PIC 99.
02 DayOfBirth PIC 99.
In this declaration, the data hierarchy indicated by the level numbers tells you that the data item BirthDate consists of (is made up of) a number of subordinate data items. The immediate subordinate items (indicated by the 02 level numbers) are YearOfBirth, MonthOfBirth, and DayOfBirth. MonthOfBirth and DayOfBirth are elementary (atomic) items that are not further subdivided. However, YearOfBirth is a data item that is further subdivided (indicated by the 03 level numbers) into CenturyOB and YearOB.
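One way to get a feel for this hierarchy is to fill the whole of BirthDate in a single MOVE and then display its parts. The sketch below reuses the declaration from Example 2-9; the program name, the date value, and the DISPLAY text are assumptions added for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BirthDateSketch.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 BirthDate.
          02 YearOfBirth.
             03 CenturyOB PIC 99.
             03 YearOB    PIC 99.
          02 MonthOfBirth PIC 99.
          02 DayOfBirth   PIC 99.
       PROCEDURE DIVISION.
       Begin.
      *> One MOVE fills all eight character positions of the group
           MOVE "19750215" TO BirthDate
           DISPLAY "Whole date : " BirthDate
           DISPLAY "Year       : " YearOfBirth
           DISPLAY "Century    : " CenturyOB
           DISPLAY "Year in c. : " YearOB
           DISPLAY "Month      : " MonthOfBirth
           DISPLAY "Day        : " DayOfBirth
           STOP RUN.
```

Each subordinate item is simply a named slice of the same eight characters, so YearOfBirth should display as 1975, CenturyOB as 19, YearOB as 75, MonthOfBirth as 02, and DayOfBirth as 15.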
In typed languages such as Pascal and Java, understanding what is happening to data in memory is not important.