Author Topic: bundle-cafe.trid.xml for Mac OS X Mach-O universal bundle  (Read 1679 times)

jenderek

« on: August 25, 2020, 09:51:43 AM »
Hello trid users,

Some days ago I ran TrID on hundreds of Mac OS X Mach-O universal bundles
(*.bundle). Some inspected samples, like Bzip2.bundle, are misidentified as
"Mac OS X Mach-O universal Dynamically linked shared Library" by
dylib-cafe.trid.xml, and all are also described generically by
exe-ub.trid.xml as "Mac OS X Universal Binary executable" (see appended
output/trid-v.txt).

The file command {see https://en.wikipedia.org/wiki/File_(command)}
describes most of my inspected examples correctly as "Mach-O universal
binary" with the sub-type classification "bundle" (see appended
output/file-5.39.txt), because the file command uses a different method to
detect such files.
So I ran tridscan on these bundle files to create the bundle-cafe.trid.xml
definition file.

As before, I added the Wikipedia page about the Mach-O file format as a
reference. This is now expressed by a line like:
   <RefURL>https://en.wikipedia.org/wiki/Mach-O</RefURL>

Instead of the generic application/octet-stream, the file command shows a
user-defined type (see appended output/file-i-5.39.txt). So I changed the
MIME type in the TrID definition file. This is now shown by the updated line:
   <Mime>application/x-mach-binary</Mime>

When looking into bundle-cafe.trid.xml, I see lines in the global strings
section that were obviously generated by coincidence, like:
   <String>8'''__LINKEDIT</String>
I was not able to eliminate such strings, although the definition file is
based on 186 bundles. The reason is probably that many bundles belong to the
same package, like Python or Perl, so the same strings apparently occur often.

All my inspected samples are binaries with 2 architectures, with the i386 CPU
binary first. Together with the CAFEBABE magic, this is expressed by a pattern
like:
   <Bytes>CAFEBABE0000000200000007000000030000100000</Bytes>
So I hope that other users can improve the definition file by running
tridscan on bundles with other and more CPU architectures.
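
To make the pattern easier to read, here is a small Python sketch that splits
it into the fields of the fat header and the start of the first fat_arch
entry. The field layout (big-endian 32-bit values: magic, nfat_arch, cputype,
cpusubtype, offset) follows Apple's mach-o/fat.h as far as I know; the function
name and the CPU name table are only my own illustration.

```python
import struct

# Pattern from bundle-cafe.trid.xml (21 bytes, hex-encoded).
PATTERN = bytes.fromhex("CAFEBABE0000000200000007000000030000100000")

# Illustrative subset of CPU type values (assumed from mach/machine.h).
CPU_NAMES = {7: "i386", 0x01000007: "x86_64", 18: "ppc", 0x01000012: "ppc64"}


def decode_fat_prefix(data):
    """Decode fat_header plus the first three fields of the first fat_arch."""
    # All fields in the universal (fat) header are stored big-endian.
    magic, nfat_arch, cputype, cpusubtype, offset = struct.unpack(
        ">IIiiI", data[:20]
    )
    return {
        "magic": magic,            # 0xCAFEBABE for a universal binary
        "nfat_arch": nfat_arch,    # number of architecture slices
        "cputype": cputype,        # CPU type of the first slice
        "cpu_name": CPU_NAMES.get(cputype, "unknown"),
        "cpusubtype": cpusubtype,
        "offset": offset,          # file offset of the first slice
    }


fields = decode_fat_prefix(PATTERN)
```

For the pattern above this gives 2 architectures, CPU type 7 (i386), subtype 3
and a first slice at offset 0x1000. The last byte of the pattern (00) is just
the first byte of the following size field.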

With the new TrID definition file, my Mac OS X Mach-O universal bundles are
now described correctly (see appended output/trid-new.txt). The TrID
definition, some examples and the output are stored in the archive
bundle.zip. I hope that the new XML file can be used in a future version of
triddefs.

In the mach_header structure, the file type value 8 is declared as MH_BUNDLE
(value 6 is MH_DYLIB), which is used for dynamically bound bundle files. The
file command uses this field for recognition. I had hoped that I could adopt
this method for TrID, but the mach_header structure seems to appear at
varying offsets. In many cases this offset was 0x1000.
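
The following Python sketch shows roughly how such a check can follow the
offsets instead of relying on a fixed position. It is not the code of the file
command, only my own illustration. It assumes the layout from mach-o/fat.h and
mach-o/loader.h: a big-endian fat_header, 20-byte fat_arch entries, the
filetype field 12 bytes into each mach_header, and MH_BUNDLE = 0x8. The
function names are hypothetical.

```python
import struct

FAT_MAGIC = 0xCAFEBABE
MH_MAGICS = (0xFEEDFACE, 0xFEEDFACF)  # 32-bit and 64-bit mach_header magic
MH_BUNDLE = 0x8                       # assumed from mach-o/loader.h


def slice_filetypes(data):
    """Return [(offset, filetype), ...] for each architecture slice."""
    magic, nfat_arch = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("no universal binary header")
    result = []
    for i in range(nfat_arch):
        # fat_arch: cputype, cpusubtype, offset, size, align (big-endian)
        _, _, offset, _, _ = struct.unpack_from(">iiIII", data, 8 + 20 * i)
        # The slice itself may be little-endian (i386) or big-endian (ppc).
        for endian in ("<", ">"):
            (mh_magic,) = struct.unpack_from(endian + "I", data, offset)
            if mh_magic in MH_MAGICS:
                # mach_header: magic, cputype, cpusubtype, filetype, ...
                (filetype,) = struct.unpack_from(endian + "I", data, offset + 12)
                result.append((offset, filetype))
                break
    return result


def is_universal_bundle(data):
    """True if at least one slice was found and every slice is MH_BUNDLE."""
    types = slice_filetypes(data)
    return bool(types) and all(ft == MH_BUNDLE for _, ft in types)
```

It could be called like is_universal_bundle(open("Bzip2.bundle", "rb").read()).
Because the file type sits at "slice offset + 12" and the slice offset is read
from the header, a pattern at a fixed file position can only match those
bundles that happen to use the same offset, such as 0x1000.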

With best wishes
Jörg Jenderek