Topic: updated deb.trid.xml for *.udeb *.ipk

jenderek

« on: January 04, 2020, 12:16:07 AM »
Hello TrID users,

a few days ago I ran TrID on hundreds of Debian packages, which deb.trid.xml
describes as "Debian Linux Package". That definition mentions only the file
name extension deb (see appended deb/output/trid-old.txt).

Other file name extensions exist as well, so I updated the TrID definition.

The current TrID definition contains no reference, so I added the Wikipedia
page about the Debian package format. This is expressed by the additional line:

   <RefURL>https://en.wikipedia.org/wiki/Deb_(file_format)</RefURL>

Based on that page I also added the MIME type. This is now shown by the
additional line:

   <Mime>application/vnd.debian.binary-package</Mime>

According to the reference URL, some core Debian packages are available as
"micro debs". Such packages, for example libc6-udeb_2.27-8_armhf.udeb, have
the file name extension udeb.

The Debian file format is also used by other package managers such as ipkg
and opkg. Their packages have the file name extension ipk, as in the example
opkg_0.2.4-r0-vuplus0-vti005_armv7ahf-vfp-neon.ipk.

When I run TrID on such samples, they are described only generically as "ar
archive" by ark-ar-archiver.trid.xml (see appended
ipk/output/trid-v-old.txt).

On the other hand, the newest file command (see
https://en.wikipedia.org/wiki/File_(command)) correctly describes the
inspected examples as "Debian binary package" (see appended
ipk/output/file.txt).

In the current deb.trid.xml, recognition happens by a byte pattern via the
XML construct:

        <Bytes>213C617263683E0A64656269616E2D62696E617279202020</Bytes>
        <Pos>0</Pos>

As a string this reads "!<arch>", a line feed (0x0A), and then
"debian-binary" followed by three space characters. According to the
Wikipedia page about the Unix ar archiver, the BSD variant stores file names
right-padded with ASCII spaces. That is the case covered by the current TrID
definition.
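
To see exactly which bytes the current pattern contains, the hex string from
the <Bytes> element can be decoded. This is only a small sketch; the variable
names are my own and not part of TrID.

```python
# Hex pattern copied from the <Bytes> element of the current deb.trid.xml.
OLD_PATTERN_HEX = "213C617263683E0A64656269616E2D62696E617279202020"

old_pattern = bytes.fromhex(OLD_PATTERN_HEX)

# repr() makes the line feed (0x0A) and the trailing spaces (0x20) visible.
print(repr(old_pattern))
print(len(old_pattern), "bytes")
```

The first 8 bytes are the global ar magic, and the remaining 16 bytes are the
complete name field of the first archive member.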

But the System V or GNU variant can also be used. There the slash character
(/ = 0x2F) marks the end of the file name. That format is used by the
inspected ipk packages.
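
The difference between the two naming variants can be sketched as follows.
This is a minimal illustration under my own assumptions: the function name
and the two hand-built sample headers are hypothetical, and GNU special
members ("/" and "//") as well as long file names are ignored.

```python
AR_MAGIC = b"!<arch>\n"   # 8-byte global header of an ar archive
NAME_FIELD_LEN = 16       # width of the member name field in an ar header


def first_member_name(data: bytes) -> str:
    """Return the name of the first ar member for either naming variant."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    field = data[len(AR_MAGIC):len(AR_MAGIC) + NAME_FIELD_LEN]
    name = field.rstrip(b" ")      # the name field is padded with spaces
    if name.endswith(b"/"):        # System V / GNU: '/' terminates the name
        name = name[:-1]
    return name.decode("ascii")


# Hypothetical sample headers, built by hand for illustration only.
bsd_style = AR_MAGIC + b"debian-binary".ljust(NAME_FIELD_LEN, b" ")
gnu_style = AR_MAGIC + b"debian-binary/".ljust(NAME_FIELD_LEN, b" ")

print(first_member_name(bsd_style))
print(first_member_name(gnu_style))
```

Both calls give the same member name, although the raw bytes after
"debian-binary" differ (0x20 in the first sample, 0x2F in the second).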

So in general the search pattern must be shortened to:

        <Bytes>213C617263683E0A64656269616E2D62696E617279</Bytes>
        <Pos>0</Pos>
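
The effect of dropping the three trailing 0x20 bytes can be checked with a
simple prefix comparison at position 0. This is only a sketch of the idea,
not of how TrID matches internally; the helper name and both sample byte
strings are hypothetical.

```python
OLD_PATTERN = bytes.fromhex("213C617263683E0A64656269616E2D62696E617279202020")
NEW_PATTERN = bytes.fromhex("213C617263683E0A64656269616E2D62696E617279")


def matches_at_start(pattern: bytes, data: bytes) -> bool:
    """Mimic a pattern with <Pos>0</Pos>: the data must begin with it."""
    return data.startswith(pattern)


# Hand-built beginnings of two archives, for illustration only.
bsd_sample = b"!<arch>\n" + b"debian-binary".ljust(16, b" ")
gnu_sample = b"!<arch>\n" + b"debian-binary/".ljust(16, b" ")

for label, sample in (("BSD", bsd_sample), ("GNU", gnu_sample)):
    print(label,
          "old:", matches_at_start(OLD_PATTERN, sample),
          "new:", matches_at_start(NEW_PATTERN, sample))
```

The old pattern only fits the space-padded sample, while the shortened one
fits both.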

All possible file name extensions are now expressed by the line:

   <Ext>DEB/UDEB/IPK</Ext>

With the updated TrID definition file all micro debs are now described
correctly (see appended deb/output/trid-old.txt), and the ipk packages are
now recognized as well (see appended ipk/output/trid-new.txt).


The TrID definition and the output are stored in the archive udeb.zip. I hope
that the XML file can be used in a future version of triddefs.

There exist more deb variants. I am working on these and will generate more
TrID definitions.

With best wishes
Jörg Jenderek

Mark0

Re: updated deb.trid.xml for *.udeb *.ipk
« Reply #1 on: January 04, 2020, 03:16:26 AM »
Many thanks for the updated def, as usual!