Topic: updated mif.trid.xml for Maker Interchange Format

jenderek

« on: November 28, 2023, 12:43:12 AM »
Hello trid users,

Some days ago I handled files created or used by Adobe FrameMaker. This time
I look at FrameMaker samples with the file name suffix MIF.

So I ran the trid utility on such MIF examples. Some samples are recognized
and described as "Maker Interchange Format" by mif.trid.xml with mime type
application/vnd.mif (see appended trid-v-old.txt in ok/output). But a dozen
samples are not recognized and are described as "Unknown!" (see appended
trid-v-old.txt in output).

For comparison I also ran the file format identification utility DROID
(see https://sourceforge.net/projects/droid/). Here more samples are
recognized, but not all. The samples are described as "Adobe FrameMaker
Interchange Format" with mime type application/vnd.mif via PUID x-fmt/162 (see
appended droid-mif.csv in output). How does DROID decide? It looks for the
9-byte string "<MIFFile " at the beginning. Then it assumes that the next 4
bytes hold the version string. Then it checks for the 3-byte string "> #",
which is the terminator of the MIFFile command followed by the start of a
comment.
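
The check described above can be sketched in a few lines of Python. This is
only my reading of the description, not DROID's actual code, and the function
name matches_droid_style is made up for illustration:

```python
def matches_droid_style(data: bytes) -> bool:
    """Strict check: '<MIFFile ' + exactly 4 version bytes + '> #'."""
    prefix = b"<MIFFile "   # 9 bytes at offset 0
    suffix = b"> #"         # 3 bytes expected right after the 4-byte version
    # Offsets 9..12 hold the version, offsets 13..15 must be the suffix.
    return data.startswith(prefix) and data[13:16] == suffix


# A 4-byte version followed by a comment passes the strict check.
print(matches_droid_style(b"<MIFFile 8.00> # Generated by FrameMaker"))
# A 3-byte version shifts the suffix by one byte, so the check fails.
print(matches_droid_style(b"<MIFFile 4.0> # Hand generated"))
```

A fixed offset for "> #" would explain why samples with a 3-byte version or
without a comment on the first line are missed.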

As a check you can look at the first line of each MIF sample with a command
like:
   head -1 *.mif
There we see (output\head-1.txt) that some samples (12: charfmt.mif,
columlay.mif, dblpage.mif, defpage.mif, footers.mif, frstpage.mif, hello.mif,
pgfcat.mif, pgffmt.mif, snglpage.mif, table.mif, tablecat.mif) contain only a
3-byte version part like 4.0.

Furthermore I found a few samples (3: example2.mif, hello.mif, tablecat.mif)
that do not contain a comment on the first line.

For comparison I also ran the file command (version 5.45) on these
samples. Here they are recognized and described as "FrameMaker MIF
(ASCII) file" (see appended file-5.45.txt in output and ok/output). The mime
type here is application/vnd.mif (see appended file-i-5.45.txt in output and
ok/output). Two file name suffixes, mif and framemif, are listed (see appended
file-ext-5.45.txt in output and ok/output). The file command shows the version
number in parentheses (no matter whether the version string is a 3- or 4-byte
sequence). Furthermore it shows an optional comment from the first line inside
double quotes.
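
A more tolerant reading of the first line, one that accepts a 3- or 4-byte
version and treats the comment as optional, could look like the following
sketch. The regular expression and the name parse_mif_header are my own
assumptions for illustration; this is not the magic rule that the file
command really uses:

```python
import re

# '<MIFFile', a version made of digits and dots, '>', then an optional
# '#' comment that runs to the end of the first line.
FIRST_LINE = re.compile(
    rb"^<MIFFile\s+([0-9.]+)\s*>[ \t]*(?:#[ \t]*([^\r\n]*))?"
)


def parse_mif_header(data: bytes):
    """Return (version, comment) or None if data has no MIFFile header."""
    m = FIRST_LINE.match(data)
    if m is None:
        return None
    version = m.group(1).decode("ascii")
    raw_comment = m.group(2)
    comment = None if raw_comment is None else raw_comment.decode("latin-1").rstrip()
    return version, comment


print(parse_mif_header(b"<MIFFile 4.0> # Hand generated\n"))
print(parse_mif_header(b"<MIFFile 2015>\n<Para"))
print(parse_mif_header(b"# no MIFFile directive here\n"))
```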

On Linux, according to the shared MIME-info database, such samples are called
"Adobe FrameMaker MIF document". Here application/x-mif is used as mime type.
The samples are recognized just by looking for the mif file name suffix. That
information can be found in the source freedesktop.org.xml.in, available for
example on gitlab.freedesktop.org.

In principle, what all tools use for recognition is a characteristic byte
sequence at the beginning. In TrID that is expressed inside the front block by
an XML construct like:
   <Bytes>3C4D494646696C6520</Bytes>
   <ASCII> . M I F F i l e</ASCII>
   <Pos>0</Pos>
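
To see what the hex string in <Bytes> stands for, it can be decoded with a
few lines of Python. The helper name matches_at is hypothetical and only
mimics the idea of a byte pattern at a fixed position:

```python
# Decode the hex pattern from the <Bytes> element.
pattern = bytes.fromhex("3C4D494646696C6520")
print(pattern)       # b'<MIFFile '
print(len(pattern))  # 9


def matches_at(data: bytes, pattern: bytes, pos: int) -> bool:
    """True if pattern occurs in data exactly at offset pos."""
    return data[pos:pos + len(pattern)] == pattern


print(matches_at(b"<MIFFile 4.0>", pattern, 0))  # True
```

The first byte 0x3C is the "<" character and the last byte 0x20 is the space
after "MIFFile".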

But TrID also looks for some other keywords. That is done by lines inside the
global strings section. In the old definition these look like:
   <String>TLORIGIN</String>
   <String>MIFFILE</String>
   <String>STRING</String>
   <String>PAGES</String>
   <String>SIZE</String>
After running tridscan to update the definition only one line survives. That
looks like:
   <String>MIFFILE</String>

That is not surprising, because the smallest hello-world samples (like
hello.mif, example4.mif or example2.mif) contain only a few directives (like
<Para <ParaLine <String). Some short examples (like defpage.mif) contain no
String directive and some short samples (like piechart.mif) contain no Para
directive.
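
The effect can be illustrated with a small sketch. It assumes that a global
string is kept only if it occurs in every sample, and that matching ignores
case (the upper-case MIFFILE suggests this). Both are my assumptions about
the behaviour, and the two sample contents below are made up, not taken from
my real files:

```python
def common_strings(samples, candidates):
    """Keep only candidate strings that occur in every sample (ignoring case)."""
    upper = [s.upper() for s in samples]
    return [c for c in candidates if all(c in u for u in upper)]


# Strings from the old definition.
old_strings = [b"TLORIGIN", b"MIFFILE", b"STRING", b"PAGES", b"SIZE"]

# Two made-up minimal samples: one with a String directive, one without.
samples = [
    b"<MIFFile 4.0>\n<Para <ParaLine <String `hello'>>>\n",
    b"<MIFFile 4.0>\n<Page <PageType BodyPage>>\n",
]

print(common_strings(samples, old_strings))  # [b'MIFFILE']
```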

With this updated trid definition mif.trid.xml most of my FrameMaker MIF
samples are now recognized and described (see appended trid-v-new.txt in
output and ok/output).

A few samples (3: x-fmt-162-signature-id-396.mif, bookTOC.framemif and
coffee.mif) are still left (see appended trid-v-new.txt in else/output). The
second one uses the file name suffix framemif instead of mif. It is the only
sample with that suffix on my systems, so I am not sure whether the suffix is
valid or came about by accident. The first one is used by DROID to recognize
MIF samples. So it contains just a few characteristic bytes at the beginning
and no content. Instead of a 4-byte version part (like 4.00, 8.00 or 2015) it
contains the dummy value bytes AB. It also contains a hash character (#),
which starts a comment, but no comment text follows. So the file command
describes this example with strange values for the version and comment parts.

The last sample, coffee.mif, starts with a comment character (#) and contains
no MIFFile directive. Similar to source code, MIF samples can include other
files, and coffee.mif is such an included file. According to the information
embedded in this sample, that mechanism is described in the manual "MIF
Reference", and the inclusion is done by "include (mytemplate.mif)".
Unfortunately I found no exact file format specification or easily
understandable part in its hundreds of pages. I also found only this one
example, so I was not able to generate a TrID definition for included MIF
samples.

The TrID definition and the outputs are stored in the archive mif_.zip. I hope
that my definition can be used in a future version of triddefs.

With best wishes
Jörg Jenderek