z/VM Pipeline Filters, continued...
We have already seen that SPECS lets you reformat records. SPECS has many more useful options that we have not discussed yet.
One of the input operands that we didn't discuss yet is the recno operand. It lets you put record numbers in the output records. recno causes SPECS to generate a record number in a 10-character field. The number is right-justified in the field and is padded on the left with blanks.
The figure shows an example where the 10-character record number is positioned at column 1 of the output record (recno 1). The record number is padded on the left with blanks. The input record is put immediately after the record number (1-* 11).
SPECS with RECNO
> pipe literal line2!literal line1!specs recno 1 1-* 11!console
.........1line1
.........2line2
Ready;
Blanks are indicated with dots here.
Recno can be used to remember the original order of the records before shuffling them. For example, number the records before sorting them and eliminating the duplicate records. Then sort again on the record number field to put them back in the original sequence.
The extra options from and by let you specify the starting record number and the increment. For example, to number the records starting at 1000 with an increment of 100, issue :
pipe ... ! specs recno from 1000 by 100 1 1-* Next ! ...
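The numbering rule can be modelled outside CMS. The following is only a Python sketch of what the text describes (a 10-character, right-justified, blank-padded number placed in front of the record); the function name specs_recno and its parameters are made up for this illustration and are not part of CMS Pipelines.

```python
def specs_recno(records, start=1, step=1):
    """Model of 'specs recno from START by STEP 1 1-* next' (illustrative only)."""
    numbered = []
    number = start
    for record in records:
        # The record number is right-justified in a 10-character field,
        # padded on the left with blanks; the record follows at column 11.
        numbered.append(f"{number:>10}{record}")
        number += step
    return numbered


if __name__ == "__main__":
    for line in specs_recno(["line1", "line2"]):
        # Show blanks as dots, as in the listing above.
        print(line.replace(" ", "."))
    for line in specs_recno(["first", "second"], start=1000, step=100):
        print(line.replace(" ", "."))
```

Sorting such records on columns 1 to 10 puts them back in their original order, which is the use described above.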
We already know the next and the nextword output operands to position the results. Alignment operands let you align data within the output record. The alignment operand follows the input and output operand pair, as follows :
input output [alignment]
In the following example, the string "My Summer Vacation" is centered in a field 80 columns wide.
Aligning data with SPECS
> pipe literal My Summer Vacation!specs 1-* 1-80 center!console
                               My Summer Vacation
Ready;
When aligning data, SPECS strips off the leading and trailing blanks and aligns the remaining data in the output field, truncating or padding as necessary. A separate STRIP stage is therefore not needed. For example :
> pipe literal hello!specs 1-* 1 left!console
hello
Ready;
The next example shows how to align lines on the right. The output field starts at column 1 and is 50 columns wide.
SPECS - aligning data to the right
> pipe literal shorter!literal a long line!specs 1-* 1.50 right!console
                                       a long line
                                           shorter
Ready;
The left operand aligns data on the left of the output field, stripping any leading blanks. Without it, leading blanks in the input are copied as they are. For example, compare these 2 pipelines :
> pipe Literal Hello ! specs 1-* 1.30 left ! console
Hello
Ready;
> pipe Literal Hello ! specs 1-* 1.30 ! console
Hello
Ready;
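The alignment rules described above can be sketched in Python. This is a model of the documented behaviour only, not of the SPECS implementation: the name specs_align is hypothetical, and truncating an over-long item on the right is an assumption, since the text does not say which side is cut.

```python
def specs_align(data, width, alignment=None):
    """Place data in an output field of the given width (illustrative model)."""
    if alignment is None:
        # No alignment operand: the data is copied as it is, blanks included.
        return data.ljust(width)[:width]
    # With an alignment operand, leading and trailing blanks are stripped first.
    stripped = data.strip(" ")
    if alignment == "left":
        placed = stripped.ljust(width)
    elif alignment == "right":
        placed = stripped.rjust(width)
    elif alignment == "center":
        placed = stripped.center(width)
    else:
        raise ValueError(f"unknown alignment: {alignment}")
    # Assumption: data longer than the field is truncated on the right.
    return placed[:width]


if __name__ == "__main__":
    print("[" + specs_align("   Hello", 30, "left") + "]")
    print("[" + specs_align("   Hello", 30) + "]")
    print("[" + specs_align("shorter", 50, "right") + "]")
    print("[" + specs_align("My Summer Vacation", 80, "center") + "]")
```

The brackets in the printed lines only make the blanks at both ends of the field visible.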
Another kind of SPECS operand is the conversion operand. The conversion operand causes SPECS to convert data from one format to another. You can, for example, convert a character input item to hexadecimal, and have the resulting hexadecimal value placed in the output record.
A conversion operand for a data item is specified between the input and output operands for that item. Thus, we now have four kinds of operands that can be specified for a single data item. The order of operands for a given item must be as follows :
input [conversion] output [alignment]
The input and output operands must always be specified. The conversion and alignment operands are optional. If desired, a conversion operand and an alignment operand can be specified for each data item, as shown in the next example, where the first eight bytes of a record are displayed in hexadecimal.
Two output data items are specified. The group of operands for the first item is 1.4 c2x 1. The group of operands for the second item is 5.4 c2x 10.
Converting data with SPECS
> pipe literal 1234ABCD89!take!specs 1.4 c2x 1 5.4 c2x 10!console
F1F2F3F4 C1C2C3C4
Ready;
In SPECS, the operands 1.4 c2x 1 indicate that the first four bytes of the input should be copied to column 1 of the output after being converted from character to hexadecimal (c2x). The string 5.4 c2x 10 converts the next four bytes and positions them at column 10 of the output record.
The next figure shows several other conversions. The conversion operand C2B converts data from character to binary. B2C reverses the conversion. X2C converts from hexadecimal to character; it requires an even number of hexadecimal characters.
Additional conversion examples
> pipe literal 911!specs 1-* c2b 1!console!specs 1-* b2c 1!console
111110011111000111110001
911
Ready;
> pipe literal F0F0F740D1819485A240C2969584!specs 1-* x2c 1!console
007 James Bond
Ready;
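To check the conversion results shown above away from a CMS system, the conversions can be imitated in Python. This is only a sketch: it assumes that the standard Python codec cp037 is an acceptable stand-in for the EBCDIC code page of your system, and the function names simply mirror the SPECS operands.

```python
EBCDIC = "cp037"  # assumption: code page 037 stands in for the host code page


def c2x(text):
    """Character to hexadecimal, like the C2X conversion operand."""
    return text.encode(EBCDIC).hex().upper()


def x2c(hex_digits):
    """Hexadecimal to character; needs an even number of hex digits."""
    return bytes.fromhex(hex_digits).decode(EBCDIC)


def c2b(text):
    """Character to a string of bits, eight per input byte."""
    return "".join(f"{byte:08b}" for byte in text.encode(EBCDIC))


def b2c(bits):
    """String of bits back to characters, eight bits per byte."""
    chunks = (bits[i:i + 8] for i in range(0, len(bits), 8))
    return bytes(int(chunk, 2) for chunk in chunks).decode(EBCDIC)


if __name__ == "__main__":
    print(c2x("1234"), c2x("ABCD"))
    print(c2b("911"))
    print(b2c(c2b("911")))
    print(x2c("F0F0F740D1819485A240C2969584"))
```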
Both REXX and CMS Pipelines have conversion functions, but CMS Pipelines offers more of them, and in at least one case (C2D) the same conversion function does not produce the same result in both environments, which can be confusing. For example :
> pipe literal A!specs 1.1 c2d 1!console
-63
This differs from REXX :
> say c2d('A')
193
CMS Pipelines treats the byte as a signed two's-complement integer. The letter A is X'C1', whose first bit is 1, so the result is negative.
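The two results come from reading the same byte in two ways. The Python lines below show the arithmetic, again assuming code page 037 for the EBCDIC encoding of the letter A; the variable names are only illustrative.

```python
# The letter A in EBCDIC (code page 037 assumed) is the single byte X'C1'.
byte = "A".encode("cp037")

# Read as an unsigned integer, as the REXX C2D function does by default.
unsigned_value = int.from_bytes(byte, "big", signed=False)

# Read as a signed two's-complement integer, as SPECS C2D does.
signed_value = int.from_bytes(byte, "big", signed=True)

if __name__ == "__main__":
    print(byte.hex().upper(), unsigned_value, signed_value)
    # For one byte the two readings differ by exactly 256 when the first bit is 1.
    print(unsigned_value - 256 == signed_value)
```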
These are the conversions supported by SPECS (you can look up the specific details in the online HELP) :
C2B
Converts characters to their binary (bit) equivalent.
C2D
Converts internal binary integers (in two's-complement notation) to a decimal character string.
C2F
Converts internal long floating point numbers to scientific notation.
C2I
Converts an MVS Julian date to ISO (sorted) date.
C2P
Converts a packed decimal number to a printable character format.
C2V
Converts a character string prefaced by the length of the string to the string only.
C2X
Converts character bytes to hexadecimal notation.
B2C
Reverse function of C2B.
D2C
Converts a signed or unsigned decimal integer to a fullword binary number.
F2C
Reverse function of C2F.
I2C
Reverse function of C2I.
P2C
Reverse function of C2P.
V2C
Reverse function of C2V.
X2C
Reverse function of C2X.
It is also possible to convert from one printable format directly into another printable format. The table shows the possible conversions.
         To D  To X  To B  To F  To V  To P  To I
From D    -    D2X   D2B    -     -     -     -
From X   X2D    -    X2B   X2F   X2V   X2P   X2I
From B   B2D   B2X    -    B2F   B2V   B2P   B2I
From F    -    F2X   F2B    -     -     -     -
From V    -    V2X   V2B    -     -     -     -
From P    -    P2X   P2B    -     -     -     -
From I    -    I2X   I2B    -     -     -     -

Advanced uses of SPECS
This section describes how to combine several input records with SPECS, how to write multiple output records, and how to use relative column references.
Combining input records
SPECS lets you process several input records at a time. This is often useful when you want to process groups of related input records. For example, suppose you are processing input records that are consistently grouped as follows :
Record 1 of the group contains a name
Record 2 of the group contains a street address
Record 3 of the group contains a city, state, and zip code
Now suppose that for each group of records you want to write one output record that contains the state followed by the name. To do it, you would need to get the name from the first record, skip the second record, and get the state from the third. You can do it with the read operand of SPECS.
The READ operand causes SPECS to read the next record from the input stream without writing a record to the output stream. Look at this example :
Using the READ operand of SPECS
> pipe < address data ! console
Smith, Joseph
3211 Titan Drive
Lake Town          NY 11011
Jones, Susan
525 Main Street
Scranton           PA 20192
Ready;
> pipe < address data ! specs 1-* 4 read read 20.2 1 ! console
NY Smith, Joseph
PA Jones, Susan
Ready;
The first PIPE command displays the contents of the file ADDRESS DATA. We see that there are two addresses. Each address takes three records. The second PIPE command displays the results we want.
Let's analyze it. The first group of operands 1-* 4 takes the entire input record, which contains the name, and puts it in the output record starting at column 4. That is all we need to do with the first input record, so we specify a read operand to consume the input record, and to get to the record that contains the street address. We don't want to do anything with the street address, so we specify yet another read.
The operands following the second read in SPECS now refer to the third input record. From this third input record, we select the state. The state always starts in column 20 of the input record. The operand group 20.2 1 puts the state abbreviation into the first and second columns of the output record.
Notice that we are still building the same output record even though we have read three input records. After the state is put in the output record, SPECS writes the single output record to its output stream. Then the whole process repeats for the next three input records.
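The effect of the two read operands can be imitated in Python to make the grouping explicit. This is a sketch of the behaviour only: the function name is hypothetical, the sample data is built so that the state sits in columns 20 and 21 as the text states, and an incomplete group at the end of the input is simply ignored, which is an assumption.

```python
def state_and_name(records):
    """Model of 'specs 1-* 4 read read 20.2 1' on groups of three records."""
    output = []
    # Step through the input three records at a time: name, street, city line.
    for i in range(0, len(records) - 2, 3):
        name, _street, city_line = records[i:i + 3]
        state = city_line[19:21]      # input columns 20-21 (20.2)
        # State goes to columns 1-2, the name starts at column 4.
        output.append(f"{state:<3}{name}")
    return output


if __name__ == "__main__":
    address_data = [
        "Smith, Joseph",
        "3211 Titan Drive",
        "Lake Town".ljust(19) + "NY 11011",
        "Jones, Susan",
        "525 Main Street",
        "Scranton".ljust(19) + "PA 20192",
    ]
    for line in state_and_name(address_data):
        print(line)
```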
Writing multiple output records
The write operand causes SPECS to write an output record without reading a new input record. It is the converse of read, and lets you split an input record into several output records based on a specification template.
The figure shows an example that produces two REXX variables from the results of the IDENTIFY command of CMS :
Using the WRITE operand of SPECS
pipe command IDENTIFY!specs w1 1 write w3 1!var userid!drop 1!var nodeid
IDENTIFY puts a record into the pipeline that looks like :
DECEULAE AT VMSYS VIA RSCS 07/14/99 14:30:19 CET THURSDAY
The SPECS stage now takes the first word and writes it as a first output record. It then takes word 3 for next output record. The result is that there are now 2 records going to subsequent stages, one containing the user-id and one containing the node-id.
VAR stores the first record into a REXX variable. As VAR passes all records (including the one it just assigned to a REXX variable) to the next stage, we have to DROP the first record coming along, in order to be able to store the second record in the nodeid variable.
This is a good example of the write operand of SPECS, but it is overkill compared to :
'IDENTIFY (LIFO'
parse pull userid . nodeid .
A benchmark shows that the PIPE solution is 2.5 times slower.
parse value diag(8,'Q USERID') with userid . nodeid . '15'x
is another alternative if you don't need the userid of the remote spooling machine.
You can use both the read operand and the write operand in a SPECS stage command.
Using relative column references
SPECS lets you refer to input columns by relative position. For example, when you specify ranges (such as 1-7), the numbers are relative to the beginning of the record. You can also use negative numbers to refer to columns relative to the end of the record.
For example, suppose the pipeline contains records of varying lengths. How can you have SPECS write only the last column to the output record ? It is not possible with what we know already. Everything so far has been relative to the beginning of the record. Because the lengths of the records differ, no single column number will give the last column for all input records.
Instead, we need to refer to the last column by giving a number relative to the end of the record. To do so, use a negative column number. When negative column numbers are used in a column range, they must be separated by a semicolon (;). The usual hyphen (-) or period (.) cannot be used, as they would be confused with the minus signs. The example shows a SPECS stage command that displays the last column of each record.
Using negative relative column numbers
> pipe literal ABCDE! literal abc! specs -1;-1 1 ! console
c
E
Ready;
The argument pair -1;-1 1 means that the first column relative to the end of the input record should be copied to column 1 of the output record.
The input range -1;-1 is a range that refers to a single column. Think of the columns as being numbered backward :
ABCDE <--record
54321 <--column numbers relative to the end of the record
abc <--record
321 <--column numbers relative to the end of the record
The next figure shows a similar example. The third column relative to the end of the input record is put in the output record at column 5.
Using negative relative column numbers
> pipe literal ABCDE!literal abc!specs -3;-3 5!console
    a
    C
Ready;
Suppose you want to see the last two columns. The input range should then be -2;-1...
Using negative relative column numbers
> pipe literal ABCDE! literal abc! specs -2;-1 1 ! console
bc
DE
Ready;
You cannot reverse the order of the numbers in the preceding example. This would make the beginning column of the range to the right of the ending column.
The next pipeline shows what happens when you use a negative column number that reaches beyond the beginning of the record. The range is limited to the columns that exist, so here the entire record is returned. The same happens when the column number in a positive column range is too high.
Specifying a range beyond the input record
> pipe literal ABCDE!literal abc!specs -600;-1 1!console
abc
ABCDE
Ready;
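The rules for negative column numbers can be summarised in a short Python model. It is only a sketch of the behaviour shown in the examples above: the function name column_range is made up, and raising an error for a reversed range is an assumption about how to model "you cannot reverse the order", not a description of the message SPECS issues.

```python
def column_range(record, first, last):
    """Return columns first..last of record; negative numbers count from the end."""
    length = len(record)

    def to_index(column):
        # Positive columns count from the start (1 = first column),
        # negative columns count from the end (-1 = last column).
        return column - 1 if column > 0 else length + column

    start, stop = to_index(first), to_index(last)
    if start > stop:
        raise ValueError("the range begins to the right of its end")
    # A range reaching beyond the record is limited to the columns that exist.
    start = max(start, 0)
    stop = min(stop, length - 1)
    return record[start:stop + 1]


if __name__ == "__main__":
    for record in ("abc", "ABCDE"):
        print(column_range(record, -1, -1),
              column_range(record, -3, -3),
              column_range(record, -2, -1),
              column_range(record, -600, -1))
```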
When we discussed the LOCATE stage command, we promised to show you how to filter records by looking at the ends of the records. Here it comes. The example finds all records ending with the character x. It is assumed that the file INPUT FILE contains variable-length records.
Looking at the end of a record
> pipe < input file !specs -1;-1 1 1-* next!find x!specs 2-* 1!cons
The first SPECS stage command copies the last column of each input record to column one of the output record. It also copies the entire input record to the same output record. Once the end of the record is moved to the beginning, we can use FIND to select those beginning with x. The second SPECS stage removes the first column of the selected lines, restoring the original contents.
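The three stages of this pipeline can be followed step by step in a Python sketch. The function name and the sample records are invented for the illustration; each line of the function corresponds to one stage of the pipeline above.

```python
def records_ending_with(records, char):
    """Model of: specs -1;-1 1 1-* next ! find x ! specs 2-* 1"""
    # Stage 1: put the last column in front of a copy of the whole record.
    tagged = [record[-1:] + record for record in records]
    # Stage 2: keep only the records that now begin with the wanted character.
    selected = [record for record in tagged if record.startswith(char)]
    # Stage 3: drop the first column again to restore the original contents.
    return [record[1:] for record in selected]


if __name__ == "__main__":
    sample = ["box", "boxes", "x", "fix it", ""]
    print(records_ending_with(sample, "x"))
```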
It is also possible to use negative word numbers in the template. So to select the last word of a record, use w-1;w-1.
FIELDS and FIELDSEParator
With fieldsep, one can define the separator character that separates fields in records (the default is the x'05' tab character). Once this is done, it is possible to work with fields instead of column numbers or words.
Field separated files are typical in the PC world. Database or spreadsheet programs can export or import so-called DIF Files (Data Interchange Files) where fields are separated from each other by specific characters, such as a comma or semi-colon.
Field separators
> pipe literal 1,23,4 6,7,9!specs fieldsep , fields 3-4 1!console
4 6,7
Ready;
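The field selection in this example can be reproduced with a few lines of Python. This is only a model of the result shown above: select_fields is a hypothetical helper, and it assumes that the separators lying between the selected fields are kept in the output, as the listing suggests.

```python
def select_fields(record, separator, first, last):
    """Return fields first..last (numbered from 1), separators in between kept."""
    fields = record.split(separator)
    return separator.join(fields[first - 1:last])


if __name__ == "__main__":
    # Fields of "1,23,4 6,7,9" are: "1", "23", "4 6", "7" and "9".
    print(select_fields("1,23,4 6,7,9", ",", 3, 4))
```

Note that the blank inside the third field is ordinary data here; only the comma separates fields.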
DUPLICATE makes copies of input records. It reads an input record and writes that record one or more times to its output stream. For DUPLICATE's operand, specify the number of additional copies desired.
The example makes 2 additional copies of each input record.
DUPLICATE example
> pipe literal Are we almost there?!literal Dad!duplicate 2!console
Dad
Dad
Dad
Are we almost there?
Are we almost there?
Are we almost there?
Ready;
The SORT stage command buffers records in the course of its processing. There are other times you might want to buffer records yourself. To do so, use the BUFFER stage command.
BUFFER holds all the records until it has read the last input record. Then BUFFER writes the records to the next stage. Use BUFFER any time the records must be delayed until all input is read.
One such time is when you want to give records to XEDIT that an earlier stage is reading. Look at this example, where we would like to reorganize the columns in a file during an XEDIT session :
pipe xedit ! specs 6-10 1 1-5 n 11-* n ! xedit
The problem here is that once the first XEDIT stage has read a record, the record pointer advances to the next record. Because the pipeline does not delay the record, the SPECS stage and the last XEDIT stage may process it immediately and replace the current (by now the second) record before the first XEDIT stage gets the chance to read it. The results are therefore unpredictable.
If we include a BUFFER stage, however, the problem is solved, at least in part :
pipe xedit ! buffer ! specs 6-10 1 1-5 n 11-* n ! xedit
Now all records up to the end of the file are read and buffered, and only then can SPECS start working. The modified records are appended to the end of the file, so you have to remove the 'old format' records manually. Try these 2 PIPE commands on a test file and see the differences.
Another case where buffering is needed is when you want to put records typed at the console into the CMS stack :
Using BUFFER for stacking records
/* Read user's input into the stack */
address command
say 'Please enter your input now'
'PIPE console ! buffer ! stack'
/* do something with the stack */
exit
If we did not buffer the records, CMS Pipelines would put each line you type at the terminal in the stack immediately. As the CONSOLE stage is still running, it would read that record back from the stack, the record would be stacked again, read again, and so on.
This is an example of a loop with CMS Pipelines !
You can pad (expand) or chop (truncate) records so they have a desired length. Often PAD and CHOP are combined to create a particular output format.
CHOP stage command
CHOP truncates each record after a given column. Specify the column number after CHOP.
CHOP example
> pipe literal She loves me; she loves me not. ! chop 12 ! console
She loves me
Ready;
PAD stage command
PAD fills each record to the specified length with a pad character (the default is a blank). You can request the pad character to be filled on the right or on the left.
The next figure chops the record at column 12 and then pads the record to column 20 with question marks (?). The pad character must follow the column number as shown.
PAD example
> pipe literal She loves me; she loves me not. ! chop 12 ! pad 20 ? ! console
She loves me????????
Ready;
By default, PAD adds pad characters to the right side of the string. To add them to the left, type left after pad, as shown here :
PAD example - padding on the left
> pipe literal She loves me; she loves me not.! chop 12 ! pad left 20 . !console
........She loves me
Ready;
And now we combine CHOP and PAD to create records with fixed lengths (here with a length of 10).
PAD and CHOP together
> pipe < test data ! console
Short
This record will be truncated
Ready;
> pipe < test data ! pad 10 ? ! chop 10 ! console
Short?????
This recor
Ready;
In this example, PAD extends short records to 10 characters with question marks (?) on the right. CHOP truncates records that are longer than 10 characters.
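The combined effect of PAD and CHOP is easy to model in Python. The two functions below are only a sketch of the behaviour described in this section; their names and parameters are chosen for the illustration and are not CMS Pipelines syntax.

```python
def pad(record, length, char=" ", left=False):
    """Extend a short record to the given length; longer records are unchanged."""
    if left:
        return record.rjust(length, char)   # pad characters added on the left
    return record.ljust(length, char)       # default: added on the right


def chop(record, column):
    """Truncate the record after the given column."""
    return record[:column]


if __name__ == "__main__":
    for record in ("Short", "This record will be truncated"):
        # PAD first, then CHOP: every record ends up exactly 10 characters long.
        print(chop(pad(record, 10, "?"), 10))
```

The order of the two steps does not matter for the result: a record is either shorter than 10 characters (only PAD changes it), longer (only CHOP changes it), or already the right length.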
Combining PAD and CHOP to create fixed records is useful when you want to create an F-format file.
16. Can you give an alternative to a PAD/CHOP combination, using the SPECS stage ?
This concludes the chapter on filters. Chapter 6 will explain how REXX and CMS Pipelines can be combined, just enough to let you practice this lesson in the exercises.