Author Topic: aria2.trid.xml for aria2 control file (v1)  (Read 1255 times)

jenderek

aria2.trid.xml for aria2 control file (v1)
« on: October 08, 2021, 06:58:33 PM »
Hello trid users,

A few days ago I had to download the tool aria2. When downloading executables
or archives, it creates control files with the same base name and the file
name extension ARIA2.

Unfortunately, when I run TrID on such examples, they are misidentified as
"GEM bitmap (v1)" by bitmap-img-gempaint.trid.xml and "DEGAS med-res bitmap"
by bitmap-pi2-degas.trid.xml. Often a misidentification as "Adobe PhotoShop
Brush" by abr.trid.xml also occurs (see appended output/trid-v-old.txt).

For comparison I also ran the file utility (version 5.40). It does not
recognize these control files either (see appended output/file-5.40.txt).
For documentation purposes I also ran a patched file command that displays
more information (see appended output/file.tmp).

Luckily aria2 is an open source program, and there exists an explicit English
technical note about the control file (*.aria2) format in the software manual
on GitHub. This is expressed by a line like:
 <RefURL>
 https://github.com/aria2/aria2/blob/master/doc/manual-src/en/technical-notes.rst
 </RefURL>

After generating aria2.trid.xml by running tridscan, I cleaned up the TrID
definition file. The first 2 bytes contain the version. In my examples this
value was always 1. That value alone is not distinctive, which is what leads
to the misidentifications.

At offset 2 the extension field is stored as a 4-byte big-endian integer. If
this field value is 1 then infoHashCheck is true. In other cases this value
is zero. So the 3 upper bytes are always zero. Together with the 2 version
bytes, these facts are described by an XML construct like:
   <Bytes>0001000000</Bytes>
   <Pos>0</Pos>
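
To make the layout behind this 5-byte pattern concrete, here is a minimal
Python sketch that reads the 2-byte version and the 4-byte extension field
that directly follows it. The function name and the sample bytes are made up
for illustration, and treating the value 1 as "infoHashCheck is true" simply
follows the description above.

```python
import struct


def read_version_and_extension(data: bytes):
    """Return (version, info_hash_check) from the first 6 bytes.

    Hypothetical helper: assumes a v1 control file, where both fields
    are stored big-endian.
    """
    if len(data) < 6:
        raise ValueError("need at least 6 bytes")
    # ">HI" = big-endian unsigned 16-bit version, then unsigned 32-bit extension
    version, extension = struct.unpack_from(">HI", data, 0)
    return version, extension == 1


# Made-up header: version 1, extension field 1
sample = bytes.fromhex("0001" "00000001")
print(read_version_and_extension(sample))  # (1, True)
```

In both cases (extension 0 or 1) the first five bytes are 00 01 00 00 00,
which is why only the lowest extension byte is left out of the pattern.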

At offset 6 the info hash length is stored as a 4-byte big-endian integer. So
the theoretical maximum value is hexadecimal FFFFFFFF, but in real examples I
found only the "low" values 0 or 14h. So I assume that this value is below
256. That means the 3 upper bytes are zero. This is described by an XML
construct like:
   <Bytes>000000</Bytes>
   <Pos>6</Pos>
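
As a rough illustration of how the two patterns (at Pos 0 and Pos 6) act
together, the following Python sketch checks a byte string against them. It
is not how TrID itself is implemented; the helper name and the sample data
are assumptions made only for this example.

```python
# (position, expected bytes) pairs taken from the two XML constructs
PATTERNS = [
    (0, bytes.fromhex("0001000000")),  # version 1 + 3 upper extension bytes
    (6, bytes.fromhex("000000")),      # 3 upper bytes of info hash length
]


def matches_patterns(data: bytes, patterns=PATTERNS) -> bool:
    """Return True if every pattern is found at its position."""
    return all(data[pos:pos + len(pat)] == pat for pos, pat in patterns)


# Made-up header: version 1, extension 0, hash length 14h, 20 dummy hash bytes
sample = bytes.fromhex("0001" "00000000" "00000014") + bytes(20)
print(matches_patterns(sample))  # True
```

A file with an info hash length of 256 or more would no longer match, which
is the price of assuming that this value stays below 256.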

At offset 10 comes the optional hash value, whose length varies. So the
following fields appear at different offsets. By inspecting only a few
examples I therefore got short zero patterns at higher offsets like:
   <Bytes>00</Bytes>
   <Pos>25</Pos>
   ...
   <Bytes>0000</Bytes>
   <Pos>30</Pos>
These should vanish when inspecting more examples, so I deleted these 2
patterns.
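
The shifting offsets can be sketched in Python as follows. The 4-byte length
field sits at offset 6, so the hash starts right behind it at offset 10, and
everything after the hash moves by the hash length. The function name and
sample bytes are invented for illustration, and the fields after the hash
are deliberately not interpreted here.

```python
import struct

HASH_LENGTH_OFFSET = 6                    # 4-byte big-endian info hash length
HASH_OFFSET = HASH_LENGTH_OFFSET + 4      # hash starts directly after it


def split_info_hash(data: bytes):
    """Return (info_hash, offset_of_next_field) for a v1 control file."""
    (hash_len,) = struct.unpack_from(">I", data, HASH_LENGTH_OFFSET)
    end = HASH_OFFSET + hash_len
    if len(data) < end:
        raise ValueError("file is shorter than the declared hash length")
    return data[HASH_OFFSET:end], end


# Made-up headers: one without a hash, one with a 20-byte (14h) hash
no_hash = bytes.fromhex("0001" "00000000" "00000000")
with_hash = bytes.fromhex("0001" "00000001" "00000014") + bytes(range(20))

print(split_info_hash(no_hash)[1])    # 10
print(split_info_hash(with_hash)[1])  # 30
```

With the two observed hash lengths the next field starts at offset 10 or 30,
so a fixed-position pattern beyond offset 10 only fits one kind of example.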

Instead of the generic MIME type application/octet-stream, a user-defined one
is applied. This is expressed by a line like:
   <Mime>application/x-aria</Mime>

With my TrID definition all of my inspected ARIA2 control examples are now
correctly described first as "aria2 control file (v1)" (see appended
output/trid-v.txt). The TrID definition, some examples and the output are
stored in archive aria_.zip. I hope that the XML file can be used in a future
version of triddefs.

There exists an older version 0 variant of such control files, but at the
moment I have too few examples to handle that variant. I will try to do this
in a future session.

With best wishes
Jörg Jenderek


Mark0

Re: aria2.trid.xml for aria2 control file (v1)
« Reply #1 on: October 08, 2021, 09:01:40 PM »
Thanks for the new def, as usual.
At a quick glance it seems a very loosely characterized format, with no strong pattern. Will check!