Topic: dsk-ext.trid.xml for Linux extended file system image

jenderek

dsk-ext.trid.xml for Linux extended file system image
« on: April 13, 2020, 02:01:25 AM »
Hello TrID users,

some days ago I ran TrID on ext2/ext3/ext4 file system images and on saved
start sectors of such file systems.

These file system images are misidentified as "MacBinary 1" by
macbinary-1.trid.xml, "Adobe PhotoShop Brush" by abr.trid.xml, "Memo File
Apollo Database Engine" by smt-apollo.trid.xml or "Sybase iAnywhere database
files" by sybase-ianywhere-dbf.trid.xml (see appended
output/trid-v-old.txt).

On the other hand, the file(1) command identifies such images correctly, for
example as "Linux rev 1.0 ext4 filesystem data" (see appended
output/file-5.38.txt).

So I ran tridscan on the samples to generate the TrID definition
dsk-ext.trid.xml.

Some information about the Linux extended file system can be found on
Wikipedia. That is expressed by a reference URL line like:
   <RefURL>
   https://en.wikipedia.org/wiki/Extended_file_system
   </RefURL>

With the help of the C header file ext2_fs.h, used for example in the
e2fsprogs tool collection found at SourceForge, I began to refine the
definition. First I deleted short null patterns, which are probably just
triggered by fields that hold small values, like:
   <Pattern>
      <Bytes>0000</Bytes>
      <Pos>65</Pos>
   </Pattern>

Apparently some blocks are padded with nulls. So 512 null bytes are found at
offset 512. That is expressed by an XML construct like:
   <Pattern>
      <Bytes>000000000000000</Bytes>
      <Pos>512</Pos>
   </Pattern>
But I do not know if this is always true, or if it is different for unusual
block sizes.
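
To test the null padding assumption against more samples, a minimal Python
sketch could look like this. The function name has_null_padding is only
made up for illustration, and the default range (512 bytes at offset 512) is
simply the one described above, not something taken from a specification:

```python
def has_null_padding(image: bytes, start: int = 512, length: int = 512) -> bool:
    """Return True if the bytes start..start+length-1 are all present and all zero."""
    chunk = image[start:start + length]
    # A short read means the image is too small to judge, so report False.
    return len(chunk) == length and not any(chunk)
```

Running it over a directory of images with different block sizes would show
quickly whether the pattern at offset 512 is safe to keep.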

According to the documentation, the EXT2 magic pattern 0xEF53 is stored in the
superblock (s_magic at 038h), followed by a 2 byte state (s_state at 03Ah),
which is 1 for cleanly unmounted file systems or 2 when errors were detected.
This is expressed by patterns like:
   <Bytes>53EF</Bytes>
   <Pos>1080</Pos>
   <Bytes>00</Bytes>
   <Pos>1083</Pos>
I only inspected file systems on little endian systems. So I do not know if
this byte order is also found on big endian machines.
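
The magic and state check can be sketched in Python as below. It assumes the
superblock starts 1024 bytes into the image (which is what puts s_magic at
file offset 1080) and decodes both fields as little endian; the function
names are hypothetical:

```python
import struct

SUPERBLOCK_OFFSET = 1024  # assumed start of the superblock within the image
S_MAGIC_OFFSET = 0x38     # s_magic, 2 bytes -> file offset 1080
S_STATE_OFFSET = 0x3A     # s_state, 2 bytes -> file offset 1082
EXT2_MAGIC = 0xEF53       # stored on disk as the bytes 53 EF


def read_magic_and_state(image: bytes):
    """Return (s_magic, s_state) decoded as little endian 16 bit integers."""
    return struct.unpack_from("<HH", image, SUPERBLOCK_OFFSET + S_MAGIC_OFFSET)


def looks_like_ext(image: bytes) -> bool:
    """Return True if the image is long enough and carries the EXT2 magic."""
    if len(image) < SUPERBLOCK_OFFSET + S_STATE_OFFSET + 2:
        return False
    magic, _state = read_magic_and_state(image)
    return magic == EXT2_MAGIC
```

The second value returned by read_magic_and_state is the state word, whose
high byte sits at file offset 1083 and is the byte the 00 pattern matches.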

The revision level is 1 for most systems, but it can be 0 for some old file
systems. That value is stored as a 4 byte integer (s_rev_level at 04Ch), so its
upper three bytes are null. This is expressed by the XML construct:
   <Bytes>000000</Bytes>
   <Pos>1101</Pos>
The minor revision level is 0. That value is stored as a 2 byte integer
(s_minor_rev_level at 03Eh). This is expressed by the XML construct:
   <Bytes>0000</Bytes>
   <Pos>1086</Pos>
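
The Pos values used so far follow from simple arithmetic: superblock start
plus field offset, plus a byte index when only the upper bytes of a little
endian value are matched. A small sketch, assuming the superblock starts at
byte 1024 and using a helper name (file_offset) that is only illustrative:

```python
SUPERBLOCK_OFFSET = 1024  # assumed start of the superblock within the image

# Field offsets inside the superblock, as given in the text above.
FIELDS = {
    "s_magic": 0x38,
    "s_state": 0x3A,
    "s_minor_rev_level": 0x3E,
    "s_rev_level": 0x4C,
}


def file_offset(field: str, byte_index: int = 0) -> int:
    """Return the absolute file position of one byte of a superblock field."""
    return SUPERBLOCK_OFFSET + FIELDS[field] + byte_index


# Print the position of the first byte of every field.
for name in FIELDS:
    print(f"{name:18} -> {file_offset(name)}")
```

For example file_offset("s_rev_level", 1) is the position of the three null
bytes, because the low byte at index 0 holds the varying value 0 or 1.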

The directory where the file system was last mounted is stored as a 64 byte
string (s_last_mounted at 088h). Since a path on Linux starts with a slash,
this is expressed by the XML construct:
   <Bytes>2F</Bytes>
   <ASCII> /</ASCII>
   <Pos>1160</Pos>
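
To see what the samples really contain in that field, the string can be read
with a few lines of Python. This is only a sketch: the function name
last_mounted is made up, the superblock is assumed to start at byte 1024, and
the string is assumed to be null terminated or null padded:

```python
SUPERBLOCK_OFFSET = 1024       # assumed start of the superblock within the image
S_LAST_MOUNTED_OFFSET = 0x88   # s_last_mounted -> file offset 1160
S_LAST_MOUNTED_SIZE = 64       # field length in bytes


def last_mounted(image: bytes) -> str:
    """Return the last mount path, cut at the first null byte."""
    start = SUPERBLOCK_OFFSET + S_LAST_MOUNTED_OFFSET
    raw = image[start:start + S_LAST_MOUNTED_SIZE]
    return raw.split(b"\x00", 1)[0].decode("utf-8", errors="replace")
```

An empty result would mean the slash pattern does not match that sample. I
would expect that for an image that was never mounted, but this is an
assumption and should be checked against real samples.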

Disk utilities like SUSE Image Writer and the GNOME disk utility use a
specific MIME type for disc images. That is now expressed by the
additional line:
   <Mime>application/x-raw-disk-image</Mime>

These utilities use "img" as the file name extension. That extension is also
used for raw disc images with other file system types. According to web
sites like filesuffix.com and dotwhat.net, special extensions like ext4 are
apparently also used to emphasize the specific Linux file system type. This is
now expressed by a line like:
   <Ext>IMG/EXT2/EXT3/EXT4</Ext>

Web sites like reposcope.com also mention other file name extensions like
raw-disk-image, luks, tc or vc. LUKS seems to be used for LUKS file
containers. TrueCrypt and VeraCrypt seem to use the TC and VC extensions. So I
left these extensions out.

With the new definition file, my inspected Linux extended file system
images are now recognized correctly (see appended output/trid-new.txt).

The TrID definition and output are stored in the archive HGST.zip. I hope that
my new XML file can be used in a future version of triddefs.

With best wishes
Jörg Jenderek

Mark0

Re: dsk-ext.trid.xml for Linux extended file system image
« Reply #1 on: April 13, 2020, 03:52:27 AM »
Thanks!