Author Topic: variant thm-canonDigitalIxus300.trid.xml for Thumbnail JPEG Bitmap (*.thm)  (Read 1021 times)

jenderek

  • Sr. Member
  • ****
  • Posts: 375
Hello TrID users,

some days ago I looked at the media on the memory card of my digital camera,
a Canon Digital Ixus 300. The recorded media are stored inside the DCIM
directory. The pictures are stored as JPEG images with names like
IMG_0401.JPG, IMG_0402.JPG, etc. The movies are stored as AVI videos with
names like MVI_0441.AVI, MVI_0442.AVI. Now comes the strange thing: for
every video there exists a file with the same base name but with the
3-character "THM" file name extension.

So I ran the TrID command on such THM examples and related files. They are
described correctly as "JFIF-EXIF JPEG Bitmap" by bitmap-jfif-exif.trid.xml
or, more generically, as "JPEG bitmap" by bitmap-jpeg.trid.xml (see appended
output/trid-v-old.txt). For both definitions a 3-byte sequence at the
beginning of the file is characteristic. That is expressed by an XML
construct like:

   <Bytes>FFD8FF</Bytes>
   <Pos>0</Pos>

In the first definition the examples are also recognized by the word Exif
somewhere in the image. That is expressed inside the global strings section
by a line like:

   <String>EXIF</String>

But both definitions list only two possible file name extensions. That is
expressed by a line like:

   <Ext>JPG/JPEG</Ext>
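
The three constructs above can be imitated in a few lines of Python. This is
only a rough sketch to show what each check looks at, not how TrID itself
works. The function name describe_jpeg and the sample header are made up for
illustration, and the case-insensitive search for EXIF anywhere in the file
is an assumption about how the global string is matched.

```python
JPEG_MAGIC = bytes.fromhex("FFD8FF")   # same bytes as <Bytes>FFD8FF</Bytes> at <Pos>0</Pos>
LISTED_EXTENSIONS = {"JPG", "JPEG"}    # from <Ext>JPG/JPEG</Ext>


def describe_jpeg(data: bytes, filename: str) -> dict:
    """Apply the three checks from the definitions to raw file content."""
    ext = filename.rsplit(".", 1)[-1].upper() if "." in filename else ""
    return {
        # byte pattern at position 0
        "magic_ok": data.startswith(JPEG_MAGIC),
        # assumption: the string may occur anywhere, compared without case
        "has_exif_string": b"EXIF" in data.upper(),
        # extension list of the definition
        "ext_listed": ext in LISTED_EXTENSIONS,
    }


# Hand-made header for illustration only, not a real camera file.
sample = b"\xff\xd8\xff\xe1\x07\xfcExif\x00\x00"

print(describe_jpeg(sample, "MVI_0441.THM"))
```

For a THM file the content checks succeed while the extension check fails,
which is the mismatch described above.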

For comparison I also ran other identification tools on such examples.

The file command (version 5.40) identifies the THM examples correctly as
"JPEG image data" with "Exif standard" (see appended output/file-5.40.txt).
Here four extensions are listed, but THM is still missing (see appended
output/file-extension-5.40.txt).

I also tried DROID (Digital Record Object Identification) found at
https://sourceforge.net/projects/droid/ . This also identifies these examples
as "Exchangeable Image File Format (Compressed)" with version 2.1 by PUID
x-fmt/390 (see appended output/THM_JPEG-DROID.csv), but it also complains
about the "THM" file name extension with a yellow exclamation mark in the GUI.

Furthermore I tried some common graphics software like GIMP, ImageMagick,
XnView and IrfanView. All of these are able to open and display these images,
but IrfanView also complains about the wrong extension.

I also tried some web browsers like Opera, Firefox, Microsoft Edge and
Internet Explorer. This works only partially. Under Windows most browsers
rely on system functions, and Windows always uses the file name extension to
determine the file type. Unfortunately the THM extension is also used by
other software. On my system this extension is used by LibreOffice and its
derivatives as part of the Gallery. So Firefox on Windows does not open THM
examples, but on Linux opening works.

So I ran tridscan on the THM images and generated a TrID definition. When
looking in the TrID DEFS subdirectory I saw a similar definition named
thm-canong3.trid.xml. It describes "Thumbnail File from Digital Camera
Canon G3" with file name extension THM by an XML construct like:
 <Bytes>FFD8FFE107FC45786966000049492A000800000009000F010200060000007A</Bytes>
 <Pos>0</Pos>

So to me this also seems to be a JPEG variant. If this is true, then the
MIME type image/jpeg is missing in that definition. I tried to handle my new
definition in a similar way. I named my definition
thm-canonDigitalIxus300.trid.xml, and the description text is expressed by
lines like:

   <FileType>
   Thumbnail JPEG Bitmap from Digital Camera Canon DIGITAL IXUS 300
   </FileType>

I searched for a specification, but I found no precise one. At least I
found a Wikipedia page about the Design rule for Camera File system
(DCF). It says that "THM" is used for thumbnail images. It does not become
clear to me whether the JPEG format is always used for THM or whether other
formats can be used. So I used the Wikipedia page as reference. That is
expressed by lines like:

   <RefURL>
   https://en.wikipedia.org/wiki/Design_rule_for_Camera_File_system
   </RefURL>

So I looked for a way to identify the thumbnail format. In the output of the
XnView command line tool nconvert with the fullinfo option
(see appended output/nconvert-fullinfo.txt) I see lines like:

  IOP:
    IOP index            (0x0001): THM
    IOP version          (0x0002): 0100

Similar information is obtained from the ImageMagick command line tool
identify with the verbose option (see appended output/identify-verbose.txt)
in lines like:

    exif:InteroperabilityOffset: 1328
    exif:thumbnail:InteroperabilityIndex: THM
    exif:thumbnail:InteroperabilityVersion: 48, 49, 48, 48

That information is expressed inside my TrID definition by XML construct
like:

 <Bytes>009B0000000400010002000400000054484D000200070004000000303130300110030001000000</Bytes>
 <ASCII> . . . . . . . . . . . . . . . T H M . . . . . . . . . 0 1 0 0</ASCII>
 <Pos>1335</Pos>

After some painful study of the EXIF structure documentation I was able to
reduce this to the relevant part:

 <Bytes>00010002000400000054484D00</Bytes>
 <ASCII> . . . . . . . . . T H M .</ASCII>

Here tag ID 1 means InteropIndex, 2 means the data type is a NULL-terminated
ASCII string, and 4 means the length of the string THM\0, if I understand the
specification right.
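
To double-check this reading, the full pattern at Pos 1335 can be decoded as
a TIFF/Exif directory (IFD). The sketch below assumes little-endian ("II")
byte order, as the Canon G3 header bytes 49492A00 suggest, and assumes that
the first five bytes of the pattern belong to the preceding data. Under
these assumptions the leading 00 of the reduced pattern is the high byte of
the entry count, and each entry is a 2-byte tag, a 2-byte type, a 4-byte
count and a 4-byte value.

```python
import struct

# Byte pattern copied from the definition (<Pos>1335</Pos>).
PATTERN = bytes.fromhex(
    "009B0000000400010002000400000054484D00"
    "0200070004000000303130300110030001000000"
)

# Assumption: the first 5 bytes are the tail of the previous structure,
# so the Interoperability IFD starts at pattern offset 5.
ifd = PATTERN[5:]

# The IFD starts with a 2-byte entry count (little-endian assumed).
(count,) = struct.unpack_from("<H", ifd, 0)

# Each entry has 12 bytes: tag (2), type (2), count (4), value/offset (4).
entries = []
offset = 2
while offset + 12 <= len(ifd):          # skip the entry cut off by the pattern
    tag, typ, n = struct.unpack_from("<HHI", ifd, offset)
    value = ifd[offset + 8:offset + 12]
    entries.append((tag, typ, n, value))
    offset += 12

print("entries announced:", count)
for tag, typ, n, value in entries:
    print(f"tag=0x{tag:04X} type={typ} count={n} value={value!r}")
```

The first decoded entry is tag 0x0001 with type 2 (ASCII), count 4 and value
THM\0. The second is tag 0x0002 with type 7 (undefined), count 4 and value
0100, which corresponds to the InteroperabilityVersion bytes 48, 49, 48, 48
shown by identify.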
   
So the above construct, together with the other constructs for JPEG files,
should be sufficient to identify such THM images uniquely. Such a reduced
variant is provided as thm-jfif.trid.xml.

In the definition for the Canon G3 camera I found a similar XML construct:

 <Bytes>00AB22001801000040001A00D2000000E0080000E00800000400010002000400000054484D000200070004000000303130300110030001000000</Bytes>
 <Pos>1560</Pos>

I also found one thumbnail, MOV00020.THM, from a SONY DSC-T100 camera, but
when I tried to add this example I got an unusable definition,
thm-sonyTEST.trid.xml.tmp, because here the EXIF structure is placed at
another offset (see appended sony/output/identify-verbose.txt,
exif:InteroperabilityOffset: 328).
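
The offset problem can be illustrated with a small Python sketch. It assumes
that the TIFF header starts at file offset 12 (SOI, APP1 marker, segment
length and "Exif\0\0", as in the Canon G3 pattern) and that the byte order
is little-endian. With these assumptions the Canon value 1328 gives
12 + 1328 + 2 = 1342 for the first IFD entry, which agrees with
Pos 1335 + 5 + 2. The Sony value 328 would give 342. The helper names and
the synthetic test files are hypothetical, and whether TrID can express such
an offset-independent match is not covered here.

```python
# Little-endian IFD entry: tag 1 (InteropIndex), type 2 (ASCII), count 4, "THM\0"
INTEROP_THM = bytes.fromhex("010002000400000054484D00")

TIFF_HEADER_OFFSET = 12  # assumption: FFD8 + FFE1 + 2 length bytes + "Exif\0\0"


def expected_entry_offset(interop_offset: int) -> int:
    """File offset of the first entry of the Interoperability IFD."""
    return TIFF_HEADER_OFFSET + interop_offset + 2   # +2 skips the entry count


def find_thm_entry(data: bytes) -> int:
    """Return the file offset of the InteropIndex=THM entry, or -1."""
    if not data.startswith(b"\xff\xd8\xff"):
        return -1
    return data.find(INTEROP_THM)


def fake_thm(interop_offset: int) -> bytes:
    """Build a zero-filled stand-in file (not a valid JPEG) for testing."""
    start = expected_entry_offset(interop_offset)
    data = bytearray(start + len(INTEROP_THM))
    data[0:3] = b"\xff\xd8\xff"
    data[start:] = INTEROP_THM
    return bytes(data)


for name, interop_offset in (("Canon Ixus 300", 1328), ("Sony DSC-T100", 328)):
    found = find_thm_entry(fake_thm(interop_offset))
    print(name, "entry found at file offset", found)
```

Searching for the entry instead of fixing its position finds it in both
stand-in files, so the same 12 bytes could serve for both cameras if the
real files follow the assumed layout.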

With the new definition my THM examples are now described more precisely and
with the right file name extension (see appended output/trid-v-new.txt).

The TrID definition, some examples and the output are stored in the archive
thm_jpeg.zip. I hope that my XML file can be used in a future version of
triddefs.

With best wishes
Jörg Jenderek

Mark0

  • Administrator
  • Hero Member
  • *****
  • Posts: 2743
    • Mark0's Home Page
Thanks!