Hello TrID users,
some days ago I handled WordPerfect files with the file name extension CBT, so I
looked for other WordPerfect samples. There are samples with the WPK file name
extension and names like SHORTCUT.WPK, MACROS.WPK, EQUATION.WPK, ENHANCED.WPK,
ALTRNAT.WPK and SYMBOLS.WPK.
I found no official file format specification for such
WordPerfect files, but luckily some basic information is given in the unofficial
WordPerfect file format description WPFF_DocumentStructure.htm, so I chose
that page as reference. That is expressed by lines like:
<RefURL>
https://github.com/OneWingedShark/WordPerfect/blob/master/doc/SDK_Help/FileFormats/WPFF_DocumentStructure.htm
</RefURL>
When I run TrID on such examples, they are described correctly but
unspecifically as "WordPerfect (generic)" by the definition wp-generic.trid.xml
(see appended output/trid-v-old.txt).
For comparison I also ran the file utility (version 5.42). It
describes the examples as "WordPerfect" with the sub-classification "keyboard
file" and version "v1.1" (see appended output/file-5.42.txt).
So I ran tridscan on my examples to generate wpk-wp.trid.xml. It shows that
not only the first 4 bytes, \xFFWPC, which are generic for all
WordPerfect samples, are the same, but also the bytes starting at offset 6. That is
expressed by XML constructs like:
<Bytes>FF575043</Bytes>
<ASCII> . W P C</ASCII>
<Bytes>00000103010100000000FBFF05003200000000000100CE02000042000000</Bytes>
<ASCII> . . . . . . . . . . . . . . 2 . . . . . . . . . . . B</ASCII>
<Pos>6</Pos>
Unfortunately the definition is based on only 6 samples.
At offset 4 the pointer to the document area is stored as a 4-byte little-endian
integer. In my examples this value was "low" (23381, 2978, 32835, 3355,
3775 and 919). The keyboard files have no document area, so the pointer contains
the end-of-file position, which means the file size is stored here. The 2 upper bytes
are nil only by lucky circumstance, because all my examples are smaller than 65536 bytes.
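The following small Python sketch illustrates why the two upper bytes happen to be nil. It only uses the six file sizes listed above; the function name is made up for this illustration, and the size 70000 is a hypothetical value, not one of my samples.

```python
import struct

# File sizes of the six WPK samples, as listed in this post.
SAMPLE_SIZES = [23381, 2978, 32835, 3355, 3775, 919]


def document_pointer_bytes(size: int) -> bytes:
    """Encode a value as the 4-byte little-endian field stored at offset 4."""
    return struct.pack("<I", size)


for size in SAMPLE_SIZES:
    field = document_pointer_bytes(size)
    # Bytes 2 and 3 of the field are the bytes at file offsets 6 and 7.
    print(size, field.hex(), "upper bytes nil:", field[2:] == b"\x00\x00")

# A hypothetical keyboard file of 65536 bytes or more (size invented here)
# would no longer have nil bytes at offsets 6 and 7.
print(70000, document_pointer_bytes(70000).hex())
```

So a pattern that includes offsets 6 and 7 would silently exclude any keyboard file of 64 KiB or more.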
At offsets 8 and 9 the product type and file type are stored. In my examples
these values were always 1 and 3. According to the documentation that is
characteristic for WordPerfect keyboard WPK samples.
At offsets 10 and 11 the major version and minor version are stored as byte
values. In my examples this was always 1.1. At offset 12 the encryption
field is stored. In my examples this was always zero. At offset 14 the pointer
to the index area is stored. In my examples this value was always zero. At
offset 16 the extended file header starts with the byte sequence FBFF05.
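As a rough illustration, the first 16 bytes described above could be read with a Python sketch like this. The class, field and function names are invented for this example, and the 2-byte widths of the encryption and index pointer fields are only my assumption, inferred from the offsets 12, 14 and 16. The test header is assembled from the byte values shown in this post with the smallest sample size (919).

```python
import struct
from typing import NamedTuple


class WPHeader(NamedTuple):
    magic: bytes            # offset 0, 4 bytes, \xFFWPC
    document_pointer: int   # offset 4, 4 bytes little endian
    product_type: int       # offset 8
    file_type: int          # offset 9
    major_version: int      # offset 10
    minor_version: int      # offset 11
    encryption: int         # offset 12 (2 bytes assumed)
    index_pointer: int      # offset 14 (2 bytes assumed)


def parse_wp_header(data: bytes) -> WPHeader:
    """Unpack the first 16 bytes of a WordPerfect file."""
    if len(data) < 16:
        raise ValueError("need at least 16 bytes")
    return WPHeader(*struct.unpack_from("<4sIBBBBHH", data, 0))


def looks_like_keyboard_file(header: WPHeader) -> bool:
    """Product type 1 and file type 3, as seen in all six WPK samples."""
    return (header.magic == b"\xffWPC"
            and header.product_type == 1
            and header.file_type == 3)


# Header rebuilt from the values in this post (file size 919).
sample = (bytes.fromhex("FF575043") + struct.pack("<I", 919)
          + bytes.fromhex("0103010100000000"))
header = parse_wp_header(sample)
print(header)
print("keyboard file:", looks_like_keyboard_file(header))
```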
Assuming that there exist examples with bigger file sizes and other version
numbers, the pattern becomes, according to the documentation, like:
<Pattern>
<Bytes>0103</Bytes>
<ASCII> . .</ASCII>
<Pos>8</Pos>
</Pattern>
<Pattern>
<Bytes>00000000FBFF05003200000000000100CE02000042000000</Bytes>
<ASCII> . . . . . . . . 2 . . . . . . . . . . . B</ASCII>
<Pos>12</Pos>
</Pattern>
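To show the effect of this change, here is a simplified Python sketch that compares the pattern generated by tridscan with the relaxed patterns. It is not TrID's real matching or scoring logic, just a plain "all byte sequences must match at their position" check. The helper names are invented, and the second header (size 70000, version 2.0) is a hypothetical file, not a real sample.

```python
MAGIC = (0, bytes.fromhex("FF575043"))

# Pattern as generated by tridscan from the six samples (Pos 6).
RAW_PATTERNS = [
    MAGIC,
    (6, bytes.fromhex(
        "00000103010100000000FBFF05003200000000000100CE02000042000000")),
]

# Relaxed patterns: file size and version bytes are left out.
RELAXED_PATTERNS = [
    MAGIC,
    (8, bytes.fromhex("0103")),
    (12, bytes.fromhex("00000000FBFF05003200000000000100CE02000042000000")),
]


def matches_all(data: bytes, patterns) -> bool:
    """True if every byte sequence is found at its stated position."""
    return all(data[pos:pos + len(expected)] == expected
               for pos, expected in patterns)


def make_header(size: int, major: int, minor: int) -> bytes:
    """Build the first 36 bytes of a keyboard file for testing."""
    tail = bytes.fromhex("00000000FBFF05003200000000000100CE02000042000000")
    return (bytes.fromhex("FF575043") + size.to_bytes(4, "little")
            + bytes([1, 3, major, minor]) + tail)


like_my_samples = make_header(919, 1, 1)
hypothetical = make_header(70000, 2, 0)  # invented size and version

for name, data in [("sample-like", like_my_samples),
                   ("hypothetical", hypothetical)]:
    print(name,
          "raw:", matches_all(data, RAW_PATTERNS),
          "relaxed:", matches_all(data, RELAXED_PATTERNS))
```

The sample-like header satisfies both pattern sets, while the hypothetical bigger or newer file only satisfies the relaxed ones.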
The definition contains short nil sequences like:
<Pattern>
<Bytes>00</Bytes>
<Pos>37</Pos>
</Pattern>
Assuming that these are caused by lucky circumstances, I deleted them.
The definition contains short non-nil sequences like:
<Pattern>
<Bytes>000010030000</Bytes>
<Pos>40</Pos>
</Pattern>
<Pattern>
<Bytes>59FE55FE</Bytes>
<ASCII> Y . U</ASCII>
<Pos>80</Pos>
</Pattern>
<Pattern>
<Bytes>0B80</Bytes>
<Pos>394</Pos>
</Pattern>
Assuming that these are also caused by lucky circumstances, I deleted them.
Then only the alphabet pattern survived, like:
<Bytes>5E005F0060006100620063006400650066006700680069006A006B006C
<ASCII> ^ . _ . ` . a . b . c . d . e . f . g . h . i . j . k . l
<Pos>560</Pos>
And in the global string section I deleted garbage-looking strings and just kept
alphabet-looking strings like:
<String>A'B'C'D'E'F'G'H'I'J'K'L'M'N'O'P'Q'R'S'T'U'V'W'X'Y'Z</String>
<String>0'1'2'3'4'5'6'7'8'9</String>
With the new definition, WordPerfect keyboard WPK examples are now described
more precisely (see appended output/trid-v-new.txt). The TrID definition and
the outputs are stored in the archive wpk.zip. I hope that the XML file can be used
in a future version of triddefs.
Unfortunately there exist other keyboard files for VAX with a different file
format, so I mentioned this in a remark line.
With best wishes
Jörg Jenderek