Hello,
when I run TrID on a few Windows CE installation cabinets (*.CAB), all of
them are identified too generally as "Microsoft Cabinet Archive" (see
appended output/trid-old.txt).
So I started a variant definition by running tridscan and finally generated
ark-cab-wince.trid.xml.
After searching the net I found "Windows CE installation cabinet (.CAB)
file format" on the website of the cabextract tool. So I added this site as
reference with the line:
<RefURL>https://www.cabextract.org.uk/wince_cab_format/</RefURL>
While dealing with the different cabinet formats I also looked at
fileformats.archiveteam.org for CAB files. It mentions a URL with test files
pointing to
http://libxad.cvs.sourceforge.net/viewvc/libxad/testfiles/CAB/ .
There, the subdirectory WinCE contains PocketSCUMM.PPC_ARM.CAB .
So I knew this must also be some specific CAB format.
On the Raspbian distribution I found "pocketpc-cab", which builds more
installable Pocket PC cabinet files with the help of the "lcab" program.
I also added a line with the MIME type for such cabinet archives:
<Mime>application/vnd.ms-cab-compressed</Mime>
At this point it might be useful to look at the output of the 7-Zip console
tool with the list command (see output/7z-l.txt).
The first archive member is a file like "manifest.000" or
"POCKE~QN.000". According to the reference, this file contains the
installation instructions for Windows CE and has a DOS 8.3 filename with the
".000" extension.
This gives an XML pattern like:
<Bytes>2E30303000</Bytes>
<ASCII> . 0 0 0</ASCII>
According to the reference this manifest starts with the ASCII signature
"MSCE", which is expressed in the global strings section by a line like:
<String>MSCE</String>
At offset 30 the cabinet archive flags are stored as a little-endian short
value. Values 1 and 2 mark additional header bytes used for building
cabinet chains (for example PRECOPY1.CAB -> PRECOPY2.CAB -> PRECOPY3.CAB).
Apparently this is not used for CE cabinets. Value 4 reserves additional
bytes in the header for something like signatures. Today we know about the
importance of code checksums, but in the days of Windows CE security was less
of a concern. So this feature was implemented but apparently never used in
such cabinets. So for Windows CE cabinets the flag value is apparently
always 0.
This is expressed by the third XML construct:
<Bytes>0000</Bytes>
<Pos>30</Pos>
That also means there are no optional bytes; in other words, the header is
minimal (36 bytes), the CFFOLDER structure is minimal (8 bytes) and the
CFFILE structure is minimal (16 bytes + name bytes).
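To illustrate the flag test, here is a minimal Python sketch that reads the short at offset 30 and names the bits that are set. The function names and the labels in FLAG_BITS are chosen for this example only; the bit values 1, 2 and 4 are the ones described above.

```python
import struct

# Bit values as described above; the labels are illustrative names only.
FLAG_BITS = (
    (0x0001, "PREV_CABINET"),     # header carries data about a previous cabinet
    (0x0002, "NEXT_CABINET"),     # header carries data about a next cabinet
    (0x0004, "RESERVE_PRESENT"),  # header reserves extra bytes
)


def read_flags(data):
    """Return the little-endian short stored at offset 30."""
    if len(data) < 36:
        raise ValueError("need at least the 36-byte minimal header")
    return struct.unpack_from("<H", data, 30)[0]


def flag_names(flags):
    """List the labels of all flag bits that are set."""
    return [name for bit, name in FLAG_BITS if flags & bit]
```

For a Windows CE cabinet as described here, read_flags() should return 0 and flag_names() an empty list, which is what the pattern at position 30 encodes.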
CE cabinets seem to have only 1 CFFOLDER (short cFolders at offset 26). So
the offset of the first CFFILE entry (long coffFiles at offset 16) should be
equal to the sum of the header size (36) and the size of 1 folder entry (8).
Yes, this is true (44 = 2Ch).
Reserved areas have 0 values. The cabinet file format version is stored at
offset 24. Currently only versionMinor = 3 and versionMajor = 1 occur.
So the second pattern with reserved2, the 2C offset, reserved3, version and
CFFOLDER count (cFolders) is expressed by the second XML construct:
<Bytes>000000002C0000000000000003010100</Bytes>
<ASCII> . . . . ,</ASCII>
<Pos>12</Pos>
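The offset arithmetic can be checked with a short Python sketch. It unpacks the 36-byte minimal header and tests whether coffFiles equals 36 + 8 * cFolders. The class and function names are made up for this example, and the sample header is synthetic, not taken from one of the inspected files.

```python
import struct
from typing import NamedTuple

HEADER_FORMAT = "<4sIIIIIBBHHHHH"             # little endian, no padding
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 36 bytes
FOLDER_SIZE = 8                               # minimal CFFOLDER entry


class CabHeader(NamedTuple):
    signature: bytes
    reserved1: int
    cb_cabinet: int
    reserved2: int
    coff_files: int
    reserved3: int
    version_minor: int
    version_major: int
    c_folders: int
    c_files: int
    flags: int
    set_id: int
    i_cabinet: int


def parse_header(data):
    """Unpack the fixed part of the cabinet header."""
    if len(data) < HEADER_SIZE:
        raise ValueError("too short for a cabinet header")
    header = CabHeader(*struct.unpack_from(HEADER_FORMAT, data, 0))
    if header.signature != b"MSCF":
        raise ValueError("missing MSCF signature")
    return header


def has_minimal_layout(header):
    """True if no optional bytes precede the first CFFILE entry."""
    expected = HEADER_SIZE + FOLDER_SIZE * header.c_folders
    return header.flags == 0 and header.coff_files == expected


# Synthetic header: 1 folder, 2 files, version 1.3, coffFiles = 0x2C.
sample = struct.pack(HEADER_FORMAT, b"MSCF", 0, 1000, 0, 0x2C, 0,
                     3, 1, 1, 2, 0, 0, 0)
```

Bytes 12 to 27 of this sample are exactly the 16 bytes of the pattern at position 12.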
iCabinet at offset 34 is the number of this cabinet file in a set, where 0 is
used for the first cabinet. For CE cabinets this seems to be always 0,
expressed by the fourth XML construct:
<Bytes>0000</Bytes>
<Pos>34</Pos>
At position 36 the offset of the 1st CFDATA block (following the file
entries) is stored as long "coffCabStart". If the archive contains only a few
members, the number of CFFILE structures is low and the offset stays small.
This was expressed by:
<Bytes>0000</Bytes>
<Pos>38</Pos>
This becomes false if the archive contains many members. So I removed that
pattern.
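A small worked calculation shows when the two upper bytes of coffCabStart stop being zero. The sketch assumes a minimal header, exactly one CFFOLDER and no reserved areas, so each CFFILE entry takes 16 bytes plus the name and its terminating NUL. The member names below are invented for the example.

```python
HEADER_SIZE = 36    # minimal cabinet header
FOLDER_SIZE = 8     # one minimal CFFOLDER entry
CFFILE_FIXED = 16   # fixed part of each CFFILE entry


def coff_cab_start(names):
    """Offset of the first CFDATA block for a minimal one-folder cabinet."""
    entries = sum(CFFILE_FIXED + len(n.encode("ascii")) + 1 for n in names)
    return HEADER_SIZE + FOLDER_SIZE + entries


def upper_bytes_are_zero(offset):
    """True if bytes 38-39 (high half of the long at 36) would be 0000."""
    return offset <= 0xFFFF


# Invented member names: a small archive and one with 3000 members.
few = ["manifest.000", "SETUP.001"]
many = ["FILE%04d.%03d" % (i, i % 1000) for i in range(3000)]
```

With the two members in `few` the offset is 99, so the pattern at position 38 holds. With 3000 members of 12 characters each it is 44 + 3000 * 29 = 87044, which no longer fits in two bytes.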
Next comes the number of CFDATA blocks in the folder, stored as short
"cCFData" at position 40. Often this is low when the archive contains only a
few small members. So I removed that pattern part. At position 42 the
compression type indicator is stored as short "typeCompress". 0 means no
compression. According to the reference, Windows CE installation cabinets
typically use NONE compression. This is now expressed by the XML construct:
<Bytes>0000</Bytes>
<Pos>42</Pos>
This must be changed for CE cabinets with MSZIP compression.
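To see which compression a given cabinet uses before relaxing this pattern, a sketch like the following may help. It reads the 8-byte CFFOLDER entry at offset 36 and maps the low four bits of typeCompress to the method names used in Microsoft's cabinet format description (0 NONE, 1 MSZIP, 2 QUANTUM, 3 LZX). The function name and the default offset of 36 are assumptions that only hold for a minimal header.

```python
import struct

# The low four bits of typeCompress select the method.
COMPRESSION_NAMES = {0: "NONE", 1: "MSZIP", 2: "QUANTUM", 3: "LZX"}


def folder_compression(data, folder_offset=36):
    """Name the compression method of the CFFOLDER entry at folder_offset."""
    coff_cab_start, c_cf_data, type_compress = struct.unpack_from(
        "<IHH", data, folder_offset)
    return COMPRESSION_NAMES.get(type_compress & 0x000F, "UNKNOWN")
```

A CE cabinet that returns "MSZIP" here would have bytes 0100 at position 42 and would not match the pattern above.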
At position 44 the CFFILE structure starts with the uncompressed size of the
file as long "cbFile". If the member is small, this value is low. But if the
member is big, the value grows. So you cannot rely on such null bytes, and I
removed that part.
At position 48 the uncompressed byte offset of the start of the data is
stored as long "uoffFolderStart". For the first file in each folder this
value will usually be zero. So I kept this null pattern.
At position 52 the index of the folder is stored as short "iFolder". A value
of zero indicates the first folder in this cabinet file. So I kept this null
pattern too. This and the previous value are expressed by the XML construct:
<Bytes>000000000000</Bytes>
<Pos>48</Pos>
At position 54 short values for date and time are stored. These of course
differ from file to file.
At position 58 the member attributes are stored as short "attribs". If we
trust Microsoft's CAB specification, where the highest bit is given by
_A_NAME_IS_UTF with value 0x80, the high byte of the attributes is never
used. This is expressed by the XML construct:
<Bytes>00</Bytes>
<Pos>59</Pos>
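At position 60 the name of the member is stored. According to the reference a
DOS 8.3 filename with extension ".000" is used for the first member. For an
8-character base name such as "manifest" the extension starts at position 68.
This is expressed by: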
<Bytes>2E30303000</Bytes>
<ASCII> . 0 0 0</ASCII>
<Pos>68</Pos>
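The CFFILE checks from the last paragraphs can be combined in one Python sketch. It assumes the minimal layout described above, so the first CFFILE entry starts at offset 44 and its name at offset 60. The function names and the sample bytes are invented for this example.

```python
import struct

CFFILE_OFFSET = 44         # 36-byte header + one 8-byte CFFOLDER
CFFILE_FORMAT = "<IIHHHH"  # cbFile, uoffFolderStart, iFolder, date, time, attribs


def first_member(data):
    """Return (fields, name) of the first CFFILE entry of a minimal cabinet."""
    fields = struct.unpack_from(CFFILE_FORMAT, data, CFFILE_OFFSET)
    name_start = CFFILE_OFFSET + struct.calcsize(CFFILE_FORMAT)  # 60
    name_end = data.index(b"\x00", name_start)  # name is NUL-terminated
    return fields, data[name_start:name_end].decode("ascii", "replace")


def looks_like_ce_manifest(data):
    """Apply the checks kept in the definition to the first member."""
    (cb_file, uoff, i_folder, date, time, attribs), name = first_member(data)
    return (uoff == 0 and i_folder == 0 and attribs >> 8 == 0
            and name.endswith(".000"))


# Synthetic example: 44 zero bytes stand in for header and folder entry.
sample = (bytes(44)
          + struct.pack(CFFILE_FORMAT, 1234, 0, 0, 0x2A21, 0x6000, 0x20)
          + b"manifest.000\x00")
```

Note that looks_like_ce_manifest() tests the extension wherever it falls, while the fixed pattern at position 68 only matches base names of exactly 8 characters.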
Tridscan produces more null patterns at the end, like:
<Pattern>
<Bytes>00</Bytes>
<Pos>88</Pos>
</Pattern>
<Pattern>
<Bytes>00</Bytes>
<Pos>117</Pos>
</Pattern>
This seems to be an unlucky coincidence. So I removed these patterns.
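For a quick check outside of TrID, the patterns kept above can be applied to a file with a few lines of Python. This is only a plain "all patterns must match" test and not TrID's real matching or scoring; treating the global string as an exact byte search is also an assumption. The names PATTERNS, GLOBAL_STRINGS and matches_definition are made up for this sketch.

```python
# (position, hex bytes) pairs of the patterns kept in the definition
PATTERNS = [
    (12, "000000002C0000000000000003010100"),
    (30, "0000"),
    (34, "0000"),
    (42, "0000"),
    (48, "000000000000"),
    (59, "00"),
    (68, "2E30303000"),
]
GLOBAL_STRINGS = [b"MSCE"]


def matches_definition(data):
    """True if every positional pattern and every global string is found."""
    for pos, hex_bytes in PATTERNS:
        expected = bytes.fromhex(hex_bytes)
        if data[pos:pos + len(expected)] != expected:
            return False
    return all(s in data for s in GLOBAL_STRINGS)
```

Reading the first few hundred bytes of a cabinet and passing them to matches_definition() should be enough for the positional patterns, since the highest position used is 72.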
With the new definition file all inspected Windows CE installer CABs are now
described more precisely (see appended output/trid-new.txt).
The TrID definition, some examples and the outputs are stored in the archive
cab_pocket.zip.
I hope that my XML file can be used in a future version of triddefs.
With best wishes
Joerg Jenderek