Topic: updated ark-cab-ishield-hdr.trid.xml for InstallShield setup header + 2 variants

jenderek

Hello trid users,

Some days ago I installed an old Windows program. Its installation directory
contains files with the extensions CAB and HDR that obviously belong to
InstallShield.

When I run TrID on these examples, only some of the HDR files are described
correctly as "InstallShield setup header" by ark-cab-ishield-hdr.trid.xml,
while all examples are described as "InstallShield compressed Archive" by
ark-cab-ishield.trid.xml (see appended output/trid-v-old.txt).

For comparison I also ran the file utility (newest version, >5.41). It
identifies most CAB examples as "InstallShield CAB" by the 4-byte string
"ISc(" at the start, and all HDR examples are now described as
"InstallShield setup header" (see appended output/file-new.txt). For
documentation I also ran a patched file command that displays more
information (see appended output/file.tmp).

So first I updated the TrID definition ark-cab-ishield-hdr.trid.xml by
running tridscan. The 4-byte start pattern is still expressed by the first
XML construct inside both TrID definitions. It still looks like:
   <Bytes>49536328</Bytes>
   <ASCII> I S c (</ASCII>
   <Pos>0</Pos>
The second XML construct looked like:
   <Pattern>
      <Bytes>00010000000000020000</Bytes>
      <Pos>6</Pos>
   </Pattern>

After running tridscan it becomes:
   <Pattern>
      <Bytes>00</Bytes>
      <Pos>6</Pos>
   </Pattern>
   <Pattern>
      <Bytes>0000000000020000</Bytes>
      <Pos>8</Pos>
   </Pattern>
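
As an illustration of what the change means, here is a minimal Python sketch. It assumes that each Pattern is simply a byte sequence that must be present at the given position; this is not TrID's actual code. The helper names and the two sample headers are made up for the example and differ only in the byte at offset 7.

```python
def matches(data, patterns):
    """Return True if every (position, hex bytes) pattern is present in data."""
    for pos, hex_bytes in patterns:
        expected = bytes.fromhex(hex_bytes)
        if data[pos:pos + len(expected)] != expected:
            return False
    return True

SIGNATURE = (0, "49536328")  # the bytes "ISc(" at offset 0

# Second construct before and after running tridscan, as quoted above.
OLD_PATTERNS = [SIGNATURE, (6, "00010000000000020000")]
NEW_PATTERNS = [SIGNATURE, (6, "00"), (8, "0000000000020000")]

def make_header(bytes_4_to_7):
    # Hypothetical 16-byte header: signature, 4 variable bytes,
    # 4 zero bytes, then 00 02 00 00.
    return b"ISc(" + bytes_4_to_7 + bytes(4) + bytes.fromhex("00020000")

sample_a = make_header(bytes.fromhex("01520001"))  # byte at offset 7 is 0x01
sample_b = make_header(bytes.fromhex("78050002"))  # byte at offset 7 is 0x02

print(matches(sample_a, OLD_PATTERNS), matches(sample_a, NEW_PATTERNS))
print(matches(sample_b, OLD_PATTERNS), matches(sample_b, NEW_PATTERNS))
```

Under this assumption the old construct accepts only the first sample, while the new pair of constructs accepts both, because the byte at offset 7 is no longer fixed.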

Luckily there is a free program, unshield, that can handle such
InstallShield Cabinet archives. The relevant information is found in the
header file cabfile.h and the C source file helper.c on the unshield page at
github.com.

According to the C source, the 4-byte CAB_SIGNATURE (0x28635349) is followed
at offset 4 by the version information, stored as a 4-byte value. That
information is also shown by the file command. As in many software products,
the minor version number apparently increases rapidly, and from time to time
the major version number increases from 1 to 2 and so on. By lucky
circumstance the middle part of the version never changes or is zero. I
believe this is what the second XML construct expresses. Since other
versions may exist or may appear in the future, I deleted the second XML
construct.
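
To make the byte layout concrete, here is a small Python sketch. It assumes the 4-byte values are stored little-endian, which fits the signature 0x28635349 showing up as "ISc(" at the start of the file. The version values are the ones quoted further down in this post from the file command output; the helper name is made up for the example.

```python
import struct

CAB_SIGNATURE = 0x28635349

def le_bytes(value):
    """Encode a 32-bit value as it would appear in the file (little-endian)."""
    return struct.pack("<I", value)

# The signature gives the 4 ASCII bytes at offset 0.
print(le_bytes(CAB_SIGNATURE))  # b'ISc('

# Version values quoted later in this post.
for version in (0x1005201, 0x2000578, 0x40007D0, 0x100000):
    raw = le_bytes(version)  # bytes at file offsets 4..7
    print(f"{version:#010x}  offsets 4-7: {raw.hex(' ')}  "
          f"offset 6: {raw[2]:02x}  offset 7: {raw[3]:02x}")
```

For the "high" versions the byte at offset 6 is 00 and the byte at offset 7 is 01, 02 or 04, so a fixed 00 01 at offset 6 would only cover the versions starting with 0x1.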

At offset 8 the volume info is stored as a 4-byte value. In all my examples
this value was 0, but I do not know whether that is always true, so I
mentioned this fact in a remark line.

At offset 12 the descriptor offset is stored as a 4-byte value. In all my
examples this value was 0x200, but again I do not know whether that is
always true, so I mentioned this fact in a remark line. These 2 facts are
expressed by the third XML construct:
   <Pattern>
      <Bytes>0000000000020000</Bytes>
      <Pos>8</Pos>
   </Pattern>

At offset 16 the descriptor size is stored as a 4-byte value. After
inspecting hundreds of InstallShield files, I found this value to be zero in
nearly all my CAB examples and non-zero in my HDR examples, so I mentioned
this fact in a remark line. The file command uses this criterion to
distinguish HDR from CAB examples.
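
The five fields described above can be read with a few lines of Python. This is only a sketch under the assumptions of this post: little-endian 4-byte values at offsets 0, 4, 8, 12 and 16. The class, field and function names are chosen for illustration and are not copied from cabfile.h, and the HDR/CAB guess only mirrors the descriptor-size criterion described above.

```python
import struct
from typing import NamedTuple

CAB_SIGNATURE = 0x28635349  # "ISc(" when stored little-endian

class Header(NamedTuple):
    signature: int          # offset 0
    version: int            # offset 4
    volume_info: int        # offset 8
    descriptor_offset: int  # offset 12
    descriptor_size: int    # offset 16

def parse_header(data):
    """Parse the first 20 bytes of an InstallShield CAB/HDR file."""
    if len(data) < 20:
        raise ValueError("need at least 20 bytes")
    header = Header(*struct.unpack_from("<5I", data, 0))
    if header.signature != CAB_SIGNATURE:
        raise ValueError("not an InstallShield 'ISc(' file")
    return header

def guess_kind(header):
    """Non-zero descriptor size suggests HDR, zero suggests CAB."""
    return "HDR" if header.descriptor_size != 0 else "CAB"

# Synthetic headers for demonstration, not taken from real samples.
cab_like = struct.pack("<5I", CAB_SIGNATURE, 0x1005201, 0, 0x200, 0)
hdr_like = struct.pack("<5I", CAB_SIGNATURE, 0x1005201, 0, 0x200, 0x1234)

print(guess_kind(parse_header(cab_like)))
print(guess_kind(parse_header(hdr_like)))
```

For a real file, the first 20 bytes read in binary mode can be passed to parse_header in the same way.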

I then got some nil patterns in the definition. I do not know whether these
are relevant, so I kept them. The interesting patterns are those after the
descriptor offset mentioned above, that is 0x200 or 512 decimal. For the CAB
examples all values here differ, whereas for the HDR examples I found some
nil sequences here. So these are relevant, especially when comparing CAB
with HDR. These patterns look like:
      <Pattern>
         <Bytes>0000000000</Bytes>
         <Pos>515</Pos>
      </Pattern>
      <Pattern>
         <Bytes>020000</Bytes>
         <Pos>521</Pos>
      </Pattern>

Instead of the generic MIME type application/octet-stream I use the
user-defined one that the file command uses (see output/file-i-new.txt).
That is now expressed by a line like:
      <Mime>application/x-installshield</Mime>

With the knowledge described above I generated via tridscan a new definition
ark-cab-ishield-cab.trid.xml for the CAB examples, following the same steps
as for HDR.

To distinguish the new definition from ark-cab-ishield.trid.xml I chose
another description. This is expressed by a line like:

   <FileType>InstallShield compressed Archive (CAB)</FileType>

But maybe it is advisable to rename the old definition to something
containing the word "generic" and use its description for the new
definition.

In the old definitions the company web site installshield.com was used as
the reference URL. But there you find only "nice" HDR-like photos and no
relevant information. So it is only 'Blah, blah, blah', as Greta Thunberg
would say. More useful information is found on the File Formats wiki of
Archive Team. This is now expressed by a line like:
   <RefURL>http://fileformats.archiveteam.org/wiki/InstallShield_CAB/</RefURL>

Unfortunately I found 6 CAB examples that the file command identifies as
HDR, because their descriptor size is non-zero. Looking at what is
different, I see that these examples have the "low" version 0x100000 and
are dated about 1999, whereas the "good" CABs have high version numbers
(like 0x1005201 0x100600c 0x1007000 0x1009500 0x2000578 0x20005dc 0x2000640
0x40007d0 0x4000834, see appended cab_old/output/file-new.txt). So I ran
tridscan to generate ark-cab-ishield-cab-old.trid.xml.

Interesting, besides the "low" version, is the appearance of lines like
these in the global strings section:
   <String>PROGRAMFILES</String>
   <String>COMMONFILES</String>
   <String>SUPPORTDIR</String>
   <String>TARGETDIR</String>
   <String>WINSYSDIR</String>
   <String>SRCDIR</String>
   <String>WINDIR</String>
   <String>HKCC</String>
   <String>HKCR</String>
   <String>HKCU</String>
   <String>HKDD</String>
   <String>HKLM</String>
   <String>HKUS</String>
   <String>ISC(</String>
   <String>LANG</String>

Apparently some refer to variables (like LANG SRCDIR TARGETDIR WINDIR
WINSYSDIR) and others to registry key names (like HKLM HKCU), whereas in
"modern" CABs no strings appear at all. So I mentioned this fact in a remark
line. Unfortunately these strings also appear in many HDR examples, so I do
not know whether old CABs can be clearly distinguished from HDR, but it
works for all the examples I inspected.
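
A quick way to check a sample for these strings outside of TrID is sketched below in Python. It assumes a case-insensitive search over the whole file, which the uppercase ISC( entry for the signature "ISc(" suggests but which I have not verified against TrID itself. ISC( is left out of the list because the signature is present in every sample. The function name and the two byte strings are made up for the example.

```python
MARKERS = [
    b"PROGRAMFILES", b"COMMONFILES", b"SUPPORTDIR", b"TARGETDIR",
    b"WINSYSDIR", b"SRCDIR", b"WINDIR",
    b"HKCC", b"HKCR", b"HKCU", b"HKDD", b"HKLM", b"HKUS", b"LANG",
]

def find_markers(data):
    """Return the marker strings found anywhere in data, ignoring case."""
    upper = data.upper()  # only ASCII letters are affected
    return [m.decode("ascii") for m in MARKERS if m in upper]

# Synthetic data for demonstration, not real InstallShield content.
old_like = b"ISc(" + bytes(16) + b"<TARGETDIR>\\app\x00HKLM\x00"
modern_like = b"ISc(" + bytes(16) + bytes(range(0x80, 0x90))

print(find_markers(old_like))
print(find_markers(modern_like))
```

Since many HDR files contain the same strings, a hit here only hints at an old-style file and does not by itself separate old CAB from HDR.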

With the updated HDR variant and the 2 new CAB variant definitions, all of
my inspected InstallShield examples are now described correctly, matching
their file name extension (see appended cab_old/output/trid-v.txt and
output/trid-v.txt). The TrID definitions, some samples and the output are
stored in the archive hdr_cab.zip. I hope that the 3 XML files can be used
in a future version of triddefs.

With best wishes
Jörg Jenderek

Mark0

Thanks for the updated and new defs.
I'm not entirely sure about the 2 new ones, maybe the generic one is enough.