Topic: iso-udf.trid.xml for ISO images with UDF file system

jenderek

« on: March 19, 2023, 12:19:21 AM »
Hello TrID users,

A few days ago I worked through a set of CD-ROM images (401 samples,
including duplicates). The usual suffix for this format is iso.

In the next step I narrowed the selection. I only considered ISO
samples which are not identified by TrID but are described by DROID
(see https://sourceforge.net/projects/droid/) as "UDF Disc Image" with
PUID fmt/1738 (see appended output/droid-udf.csv). That leaves about a
dozen samples.
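
The sub-selection by PUID can be sketched in Python. This is only an
illustration, not what I actually ran: it assumes the DROID CSV export
has columns named NAME and PUID, and the two rows in the sample text
are made up to show the shape of the data.

```python
import csv
import io

def names_with_puid(csv_text, puid="fmt/1738"):
    """Return the NAME value of every row whose PUID column equals `puid`.

    Assumes a DROID-style CSV export with NAME and PUID columns.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["NAME"] for row in reader if row.get("PUID") == puid]

# Made-up two-row export (hypothetical file names).
sample = (
    '"NAME","PUID","FORMAT_NAME"\n'
    '"udf-only.iso","fmt/1738","UDF Disc Image"\n'
    '"bridge.iso","fmt/1739","UDF-ISO 9660 Bridge Disc"\n'
)

print(names_with_puid(sample))
```

For a real export the text would be read from output/droid-udf.csv
instead of the inline sample.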

A similar example such as TEST-imgburn.iso is described by DROID as
"UDF-ISO 9660 Bridge Disc" with PUID fmt/1739. This variant is
recognized by TrID, which describes it as "ISO 9660 CD image" via
iso-9660-image.trid.xml (see appended output/trid-v-old.txt).
The file command (version 5.44) describes this example as "UDF
filesystem data" (see appended output/file-5.44.txt). It lists the two
suffixes iso/udf (see appended output/file-ext-5.44.txt) and uses
application/x-iso9660-image as the MIME type (see appended
output/file-i-5.44.txt).

It took me nearly a day to understand what is going on here. Such an
image contains two parts: one for the ISO 9660 CD-ROM file system and
one for UDF. These hybrid images are therefore called "UDF-ISO 9660
Bridge Disc". Because many ISO samples are hybrids of this kind, they
are already identified through their ISO 9660 part, but samples with
only a UDF part are not recognized. This can be verified with the
udfinfo command from udftools and/or the 7-Zip command line, for
example:
   udfinfo nero-UDFv26.iso
   7z l -tUdf nero-UDF1.iso
   7z l -tIso TEST-imgburn.iso
The -tIso option makes 7-Zip handle the image as an ISO 9660 image,
whereas the -tUdf option makes it handle the image as a UDF image. With
the t command only the integrity of the image is tested ("Everything is
Ok"). You can do this verification with the 7-Zip command line tool via
commands like:
   7z t -tUdf "TEST-imgburn.iso"   >"output\TEST-imgburn.iso-tUdf.txt"
   7z t -tIso "TEST-imgburn.iso"   >"output\Test-imgburn.iso-tIso.txt"
If the image cannot be opened as the requested type, the output
contains a line like:
   Can't open as archive:
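
The same check can be scripted. The following is a minimal sketch,
assuming a 7z executable is on the PATH. The helper names looks_ok and
check_image are my own, and the only thing it relies on are the two
output lines quoted above.

```python
import subprocess

def looks_ok(output):
    """True if 7-Zip reported success and no 'Can't open as archive' line."""
    return "Everything is Ok" in output and "Can't open as archive" not in output

def check_image(path, kind):
    """Run `7z t -t<kind> <path>` (kind is 'Udf' or 'Iso') and judge the output.

    Assumes `7z` is installed and on PATH.
    """
    result = subprocess.run(
        ["7z", "t", "-t" + kind, path],
        capture_output=True,
        text=True,
    )
    return looks_ok(result.stdout + result.stderr)

# The judging step alone, applied to the two lines quoted in this post.
print(looks_ok("Everything is Ok"))
print(looks_ok("Can't open as archive:"))
```

A call like check_image("TEST-imgburn.iso", "Udf") then gives one
True/False answer per image and type instead of one text file each.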

So I ran tridscan on the undetected ISO samples to generate the TrID
definition iso-udf.trid.xml. Then I looked inside the definition for
patterns to understand why things happen. In the front block only one
pattern appears. It looks like:
 <Bytes>000000000000000000000000000000000000000000000000000000000000000
 <Pos>0</Pos>
That is not surprising, because the samples contain only a UDF part,
neither an MBR nor an APM part, and no ISO 9660 part. So the beginning
is empty.
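
The "empty beginning" can be checked directly. This is a small sketch
under the assumption that the empty region is the whole area in front
of block 16, that is 16 blocks of 2048 bytes. The function name is
hypothetical, and the demo runs on synthetic bytes, not on my samples.

```python
SECTOR = 2048
SYSTEM_AREA = 16 * SECTOR  # 32768 bytes in front of block 16

def system_area_is_empty(data):
    """True if the first 32768 bytes exist and are all zero."""
    head = data[:SYSTEM_AREA]
    return len(head) == SYSTEM_AREA and not any(head)

# Synthetic data: an all-zero start, and one with a non-zero first byte
# (as an MBR or APM part would cause).
blank = bytes(SYSTEM_AREA + SECTOR)
with_boot_code = b"\x33" + bytes(SYSTEM_AREA + SECTOR - 1)

print(system_area_is_empty(blank))
print(system_area_is_empty(with_boot_code))
```

For a real image it is enough to read the first 32768 bytes of the file
and pass them in.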

In the global strings section I get six lines that are characteristic
of UDF. They look like:
 <String>OSTA COMPRESSED UNICODE''''''''''''''''''''
 <String>*OSTA UDF COMPLIANT</String>
 <String>*UDF LV INFO</String>
 <String>+NSR0</String>
 <String>BEA01</String>
 <String>TEA01</String>

Information about the Universal Disk Format (UDF) can be found on the
Archive Team file formats web site and on Wikipedia. Because at first I
found no more relevant information, I chose the Wikipedia page. That is
expressed by a line like:
 <RefURL>https://en.wikipedia.org/wiki/Universal_Disk_Format</RefURL>

For CD-ROM/DVD images with the ISO 9660 CD-ROM file system the suffix
ISO is used. It is also used for images with a UDF part. To distinguish
these from the older format, the suffix UDF is apparently also used.
This is not mentioned or described officially, but it is essential for
Windows systems, which rely on the suffix of the file name. That is
expressed by a line like:
   <Ext>ISO/UDF</Ext>

Furthermore I found in the shared MIME database a user-defined MIME
type for the hybrid variant, so I use it. That is expressed by a line
like:
   <Mime>application/x-udf-image</Mime>

UDF samples with more parts should also be recognized by the new
definition. But when I try to update the definition with such samples,
the front block vanishes and the definition no longer works. The same
problem occurs here as for iso-9660-image.trid.xml.

Instead of the string CD001, I find here an extended descriptor section
(indicated by the string BEA01) at relative offset 1 of block 16, with
a block size of 2048 (that is offset 32769). In the next block (block
17, which starts at offset 34816) I find the string NSR0. This
descriptor type is an indicator for UDF. In principle that is what
DROID uses for recognition.
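
The offset arithmetic and the walk over the descriptor blocks can be
sketched as follows. This is not how TrID or DROID do it, only an
illustration: it assumes 2048-byte blocks, a 5-byte identifier at
relative offset 1 of each block starting with block 16, and the
identifier list as I understand it from ECMA-167. All function names
are made up, and the images are built synthetically.

```python
SECTOR = 2048
FIRST_BLOCK = 16
# Identifiers assumed from ECMA-167 / ISO 9660.
KNOWN = {b"BEA01", b"BOOT2", b"CD001", b"CDW02", b"NSR02", b"NSR03", b"TEA01"}

def recognition_ids(data, max_blocks=16):
    """Return [(offset, identifier)] for consecutive blocks from block 16 on,
    stopping at the first block without a known identifier."""
    found = []
    for block in range(FIRST_BLOCK, FIRST_BLOCK + max_blocks):
        offset = block * SECTOR + 1          # relative offset 1 in the block
        ident = data[offset:offset + 5]
        if ident not in KNOWN:
            break
        found.append((offset, ident.decode("ascii")))
    return found

def classify(data):
    """Rough label based on which identifiers are present."""
    ids = [ident for _, ident in recognition_ids(data)]
    has_iso = "CD001" in ids
    has_udf = "BEA01" in ids and any(i.startswith("NSR0") for i in ids)
    if has_iso and has_udf:
        return "UDF-ISO 9660 bridge"
    if has_udf:
        return "UDF only"
    if has_iso:
        return "ISO 9660 only"
    return "unknown"

def fake_image(idents):
    """Build a zero-filled image with the given identifiers from block 16 on."""
    buf = bytearray((FIRST_BLOCK + len(idents) + 1) * SECTOR)
    for n, ident in enumerate(idents):
        offset = (FIRST_BLOCK + n) * SECTOR + 1
        buf[offset:offset + 5] = ident
    return bytes(buf)

udf_only = fake_image([b"BEA01", b"NSR02", b"TEA01"])
bridge = fake_image([b"CD001", b"CD001", b"BEA01", b"NSR02", b"TEA01"])

print(recognition_ids(udf_only))
print(classify(udf_only), "/", classify(bridge))
```

With the identifier at relative offset 1, block 16 gives 16 * 2048 + 1
= 32769 and block 17 gives 17 * 2048 + 1 = 34817 (the block itself
starts at 34816). In a bridge image the BEA01 block comes after the
CD001 blocks, so its offset is no longer fixed. That may be one reason
a pattern at a fixed position does not survive when such samples are
added.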

With the new TrID definition such ISO examples with UDF are now
described (see appended output/trid-v-new.txt). The TrID definition
and the output are stored in the archive iso_udf.zip. I hope that my
definition can be used in a future version of triddefs.

Some ISO samples are still not recognized. These also seem to have a
UDF file system. I also found 2 samples, "Shareware Grab Bag.iso" and
BOOKSHELF.ISO, which are described by the file command as "High Sierra
CD-ROM filesystem". I was not able to create a TrID definition for
them. The file command looks here for the string CD001 at offset 32769.

With best wishes
Jörg Jenderek

Mark0

Re: iso-udf.trid.xml for ISO images with UDF file system
« Reply #1 on: March 21, 2023, 11:29:01 PM »
Thanks!