Topic: TrID variants of ark-dz*.trid.xml replacements for Dzip compressed archive (*.dz

jenderek

Hello,

When handling some Dzip compressed archives, only the newest version is
recognized by ark-dz.trid.xml (see appended output/trid-old.txt).

First I looked for the web site of the Dzip archiver and added it as the
reference URL with the line:
   <RefURL>http://speeddemosarchive.com/dzip/</RefURL>

The C program main.c, found in the dz29src.zip source archive, shows that the
major version is stored at offset 2 and the minor version at offset 3 of
dz files. So I mention this fact in the remark line.

The latest version 2.9 of dzip archives is described in ark-dz.trid.xml by the
pattern:
   <Pattern>
      <Bytes>445A0209</Bytes>
      <ASCII> D Z</ASCII>
      <Pos>0</Pos>
   </Pattern>

But older versions, such as 2.6 in the example ark-dz-2.6.dz, are not recognised.

So a generic TrID definition would contain an XML construct like:
   <Pattern>
      <Bytes>445A</Bytes>
      <ASCII> D Z</ASCII>
      <Pos>0</Pos>
   </Pattern>

The file(1) command takes that approach. Its output for the inspected samples
(see appended output/file-5.32.txt) shows that every ASCII text file starting
with the string "DZ" is misidentified as a Dzip archive. One example is the
help file of Doszip commander, which is named DZ.TXT in the newest version.
So this method cannot be used.

The version history in the download area shows that only a few versions of
the dzip archiver exist. The newest version is dated 7 May 2003, so we can
assume that development of this archive type has settled down and that higher
versions are quite unlikely to appear. So I created 3 variants,
ark-dz-v*.trid.xml, which match major versions 2 down to 0 with an XML
construct like:
   <Pattern>
      <Bytes>445A02</Bytes>
      <ASCII> D Z</ASCII>
      <Pos>0</Pos>
   </Pattern>
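
As a rough illustration of why the third byte matters, here is a minimal
Python sketch. It is not part of TrID, which works from the XML patterns; the
function names and the text sample are made up for this example, and the list
of major versions is simply the 0 to 2 range covered by the three variants.

```python
# Hypothetical helpers for illustration only; not how TrID is implemented.
KNOWN_MAJOR_VERSIONS = (0, 1, 2)  # assumption: majors covered by the 3 variants


def matches_generic(data: bytes) -> bool:
    """Two-byte check: only the 'DZ' magic at offset 0."""
    return data[:2] == b"DZ"


def matches_variant(data: bytes) -> bool:
    """Three-byte check: 'DZ' magic plus a known major version at offset 2."""
    return (
        len(data) >= 3
        and data[:2] == b"DZ"
        and data[2] in KNOWN_MAJOR_VERSIONS
    )


archive_29 = bytes.fromhex("445A0209")  # start of a version 2.9 archive
archive_26 = bytes.fromhex("445A0206")  # start of a version 2.6 archive
text_file = b"DZ is a text file"        # made-up text that begins with "DZ"

for name, sample in [("2.9", archive_29), ("2.6", archive_26), ("text", text_file)]:
    print(name, matches_generic(sample), matches_variant(sample))
```

The generic check accepts all three samples, while the three-byte check
accepts both archives and rejects the text, because the third byte of printable
text is never 0, 1 or 2.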

Some web servers seem to misinterpret the archives as text files. This is
described in the document dzip.txt on the web site, which proposes a dedicated
MIME type for dz archives. So I use that for the TrID definition as well, with
the line:
   <Mime>application/x-dzip</Mime>

According to the source file, the number of files inside the archive is stored
as a little-endian long value at archive offset 8. This can be verified against
the output of the dzip program with the -l option (see appended
output/dzip-l.txt). So I added this fact to the remark as well, which now reads:
   <Rem>
A file compressor by Nolan Pflug, Stefan Schwoon and others.
Version is stored by 2 bytes at offset 2 and number of files by 4 byte integer at 8.
   </Rem>
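
The layout described in the remark can be sketched in Python as below. This is
only an illustration under assumptions: the function name is hypothetical, the
count is read as an unsigned 32-bit value (the C source may treat it as
signed), and bytes 4 to 7 are skipped because this post does not describe them.
The sample header is synthetic, not taken from a real archive.

```python
import struct


def read_dzip_header(data: bytes):
    """Return (major, minor, file_count), or None if the 'DZ' magic is missing."""
    if len(data) < 12 or data[:2] != b"DZ":
        return None
    major, minor = data[2], data[3]  # version bytes at offsets 2 and 3
    # Bytes 4..7 are not covered in this post, so they are ignored here.
    (file_count,) = struct.unpack_from("<I", data, 8)  # 4-byte little endian
    return major, minor, file_count


# Synthetic 12-byte header: version 2.9, 3 files.
sample = b"DZ" + bytes([2, 9]) + bytes(4) + struct.pack("<I", 3)
print(read_dzip_header(sample))  # (2, 9, 3)
```

For a real archive, the first 12 bytes read with open(path, "rb") could be
passed in, and the count compared with what dzip -l reports.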

With the variant definition files, all inspected examples are now detected as
Dzip compressed archives (see appended output/trid-new.txt).

The TrID definitions, some examples and the output are stored in the archive
dz.zip. I hope that my 3 XML files can be used in a future version of triddefs.

With best wishes
Jörg Jenderek

Mark0

Nice. I'll put the 2.x version as an update of the current one, and the other 2 as new definitions.
Thanks, as always!