Hello trid users,
Some days ago I ran the cleaning tool czkawka, found at
https://qarmin.github.io/czkawka/. One of its menu items concerns bad
extensions. After running the tool I looked in the saved file list
results_bad_extensions.txt for bad extension examples.
One listed extension is CUB. I found such examples as part of the WiX
toolset and the Orca software on Windows 8 and 10 systems.
So I ran the trid utility on my CUB examples. All are described generically
as "Generic OLE2 / Multistream Compound" with a low rate by docfile.trid.xml.
The examples are also described with a higher rate as "Windows Installer
Patch" with MSP suffix by msp.trid.xml and as "Windows SDK Setup Transform
script" with MST suffix by mst.trid.xml.
For comparison I checked these examples with the file command utility
(version 5.43). Here all examples are also described generically as
"OLE 2 Compound Document" with sub type "Microsoft Windows Installer
Package" (see appended output/file-soft-5.43.txt), with mime type
application/x-msi (see appended output/file-i-4.53.txt) and with msi suffix
(see appended output/file-ext-4.53.txt).
For comparison I also ran the file format identification utility DROID (see
https://sourceforge.net/projects/droid/). This identifies all
examples only generically as "OLE2 Compound Document Format" by PUID fmt/111.
Because CUB files are OLE2 Compound containers, we can inspect such examples
with suitable tools, for example Michal Mutl's Structured Storage Viewer.
There we see that such examples contain at least 2 streams: one with the name
"Summary Information" and another with a name like SummaryInformation.
Unfortunately I found not even a small hint about the file format. So I was
not able to add a CUB specific reference URL to the TrID definition.
Instead I use the Wikipedia URL about Windows Installer, which contains a few
sentences about CUB. This is expressed by a line like:
<RefURL>http://en.wikipedia.org/wiki/Windows_Installer</RefURL>
After running tridscan to generate the definition cub-ms.trid.xml I looked
at which XML constructs were created and tried to understand them. The first
XML construct looks like:
<Bytes>D0CF11E0A1B11AE1000000000000000000000000000000003E00</Bytes>
<Pos>0</Pos>
This looks like the starting magic of Generic OLE2 / Multistream Compound
files as defined by docfile.trid.xml. There it looks like:
<Bytes>D0CF11E0A1B11AE1</Bytes>
<Pos>0</Pos>
I would have liked to reduce the XML construct, but I was not able to do
this. The trailing byte sequence 3E00 means minor version 62, as reported by
the file command.
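To make the layout of this first construct easier to follow, here is a small Python sketch (not part of the TrID definition) that splits the 26 bytes into the leading OLE2 header fields. It assumes the usual compound file header layout: an 8-byte signature, a 16-byte CLSID and a 2-byte little-endian minor version. The function name split_first_pattern is only illustrative.

```python
import struct

# First pattern emitted by tridscan for the CUB samples:
# OLE2 magic, 16 zero bytes, then 3E00.
PATTERN = bytes.fromhex("D0CF11E0A1B11AE1" + "00" * 16 + "3E00")


def split_first_pattern(pattern: bytes) -> dict:
    """Split the 26-byte pattern into the leading OLE2 header fields."""
    signature = pattern[0:8]   # offsets 0-7: compound file magic
    clsid = pattern[8:24]      # offsets 8-23: header CLSID (all zero here)
    # offsets 24-25: minor version, stored as little-endian 16-bit integer
    (minor_version,) = struct.unpack("<H", pattern[24:26])
    return {
        "signature": signature.hex().upper(),
        "clsid_is_zero": clsid == bytes(16),
        "minor_version": minor_version,
    }


fields = split_first_pattern(PATTERN)
print(fields["minor_version"])  # 62
```

So the longer construct is the docfile.trid.xml magic plus two further header fields, which is why it is more specific than the generic one.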
The second XML construct looked like:
<Bytes>00FEFF</Bytes>
<Pos>27</Pos>
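The FEFF sequence is the OLE2 byte order mark and means little-endian; the
leading 00 at position 27 is the high byte of the major version.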
But I have only a few such CUB examples and found no information about the
file format. So I do not know if this is always true or just caused by lucky
circumstances. So I keep the first 2 XML constructs. The same consideration
applies to the other XML constructs.
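As a rough illustration of what the first two constructs accept, the following Python sketch checks both byte patterns against a header. It is not how TrID itself is implemented. The helper make_header builds a synthetic header for demonstration only, and the assumption is that the byte left open at position 26 is the low byte of the OLE2 major version (3 or 4).

```python
import struct

# (position, bytes) pairs copied from the two constructs quoted above.
PATTERNS = [
    (0, bytes.fromhex("D0CF11E0A1B11AE1" + "00" * 16 + "3E00")),
    (27, bytes.fromhex("00FEFF")),
]


def matches_all(data: bytes, patterns=PATTERNS) -> bool:
    """Return True if every byte pattern is found at its fixed position."""
    return all(data[pos:pos + len(seq)] == seq for pos, seq in patterns)


def make_header(major_version: int, byte_order: int = 0xFFFE) -> bytes:
    """Build a synthetic 30-byte OLE2 header prefix (demonstration only)."""
    return (
        bytes.fromhex("D0CF11E0A1B11AE1")   # magic, offsets 0-7
        + bytes(16)                          # zero CLSID, offsets 8-23
        # minor version, major version, byte order mark (little-endian)
        + struct.pack("<HHH", 0x003E, major_version, byte_order)
    )


print(matches_all(make_header(3)))          # True
print(matches_all(make_header(4)))          # True
print(matches_all(make_header(3, 0xFEFF)))  # False: byte order mark differs
```

Because position 26 is not part of either pattern, headers with major version 3 and 4 both match, while a different minor version or byte order mark would not.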
According to Wikipedia the characteristic feature of CUB files is the
Internal Consistency Evaluators (ICE). So the relevant parts are probably
expressed in the global strings section by lines like:
<String>OPTIONAL EXPRESSION WHICH SKIPS THE ICE ACTION IF EVALUATES TO EXPFALSE</String>
<String>TABLE WITH MERGE CONFLICTSNAME OF ICE ACTION TO INVOKE</String>
<String>ICE08 - CHECKS FOR DUPLICATE GUIDS IN COMPONENT TABLE</String>
<String>TABLE MISSING. ICE08 CANNOT CONTINUE ITS VALIDATION.</String>
<String>ICE51'ICE52'ICE53'ICE54'ICE55</String>
<String>ICE44'ICE45'ICE46'ICE4</String>
<String>ICE02'ICE03'ICE0</String>
<String>ICE33'ICE34'ICE3</String>
<String>ICE48'ICE49'ICE5</String>
<String>ICE17'ICE18'ICE</String>
<String>01'ICE</String>
<String>04'ICE</String>
<String>05'ICE</String>
<String>10ICE</String>
<String>12ICE</String>
<String>13ICE</String>
<String>14ICE</String>
<String>ICE.DLL'ICE0</String>
<String>ICE08.VBS</String>
<String>_ICESEQUENCEACTION</String>
<String>FUNCTION ICE08()</String>
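To try out whether such strings really occur in another candidate file, a quick check like the following Python sketch may help. It only takes three of the strings listed above and looks for them anywhere in the data. It compares in upper case, on the assumption that TrID stores the global strings upper-cased and matches them without regard to case. The function name contains_all_strings and the sample bytes are made up for illustration.

```python
# A few of the global strings from the definition, as ASCII bytes.
ICE_STRINGS = [
    b"ICE08.VBS",
    b"_ICESEQUENCEACTION",
    b"FUNCTION ICE08()",
]


def contains_all_strings(data: bytes, needles=ICE_STRINGS) -> bool:
    """Return True if every needle occurs somewhere in data (case-insensitive)."""
    upper = data.upper()  # bytes.upper() only changes ASCII letters
    return all(needle in upper for needle in needles)


# Made-up sample data, not taken from a real CUB file.
sample = b"\x00_ICESequenceAction\x00ice08.vbs\x00Function ICE08()\r\n"
print(contains_all_strings(sample))           # True
print(contains_all_strings(b"no ICE here"))   # False
```

For a real file one would pass in the bytes read from disk, for example with open(path, "rb").read().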
The definition contains no mime type. Because CUB files are OLE2 documents I
could add the generic mime type application/x-ole-storage. But I chose a
user defined one. This is expressed by a line like:
<Mime>application/x-ms-cub</Mime>
With the new TrID definition all my CUB examples are now described more
precisely (see appended output/trid-v-new.txt). The TrID definition and
output are stored in the archive cub_.zip. I hope that my XML file can be
used in a future version of triddefs and that other users improve the
definition.
With best wishes
Jörg Jenderek