Author Topic: updated mbox.trid.xml for Standard Unix Mailbox *.mbox  (Read 723 times)

jenderek

updated mbox.trid.xml for Standard Unix Mailbox *.mbox
« on: October 09, 2023, 11:01:38 PM »
Hello trid users,

Some months ago I updated to Windows 10, so I also had to transfer my mail
data handled by Thunderbird. I had some problems, so I looked at my
Thunderbird email files with the file name suffix MBOX.

I ran the trid utility on my MBOX examples (about 500). Many of my mail
samples are described with highest priority as "Standard Unix Mailbox" by
mbox.trid.xml, with the correct file name suffix MBOX and mime type
application/mbox. All samples are described with low priority as "E-Mail
message (Var. 2)" by eml-var2.trid.xml, with mime type message/rfc822 and the
wrong file suffix EML (see appended output/trid-v-old.txt).

For comparison I also ran the file format identification utility DROID
(see https://sourceforge.net/projects/droid/). Here all examples are
described as "MIME Email" with mime type message/rfc822 by PUID fmt/950. For
samples with the suffix mbox and for samples without a file name suffix, the
names are considered invalid (EXTENSION_MISMATCH true).

According to the shared-mime-info database, the samples are called "Mailbox
file" with mime type application/mbox and file name suffix mbox.

For comparison I also ran the file command (version 5.45) on the undetected
samples. Here such samples are only described generically as "text" (see
appended output/file-5.45.txt). The mime type is therefore also the generic
text/plain (see appended file-i-5.45.txt in output), and the file name suffix
is not recognized either (see appended file-ext-5.45.txt in output).

With the newest version of the file command (mail.news,v 1.31 2023/10/09) the
samples are now described as "Mailbox text" (see appended output/file.txt).
The mime type is now application/mbox (see appended file-i.txt in output),
and "/mbox" is shown as the file name extension (see appended file-ext.txt in
output).

So I updated mbox.trid.xml by running tridscan on a few undetected samples
(like Stromanbieter.mbox and Verivox.mbox).

Many of my mbox samples contain a line with the phrase Message-ID or
Message-Id (see grep.txt in output). This can be seen by running a command
like:
    grep Message *

But a few old messages (like Stromanbieter.mbox and Verivox.mbox) do not
contain such a line with the Message phrase.
So in the global strings section one line vanished:
   <String>MESSAGE</String>
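
To find the samples that are responsible for this, the grep check can be
turned around so that it lists only the files without a Message-ID line. The
following is a minimal sketch, not the exact command I used: the function name
list_without_message_id is made up for illustration, and it assumes that the
header starts at the beginning of a line, which is stricter than the plain
"grep Message *" above.

```shell
#!/bin/sh
# Hypothetical helper: print every regular file in a directory that has
# no Message-ID header line (Message-ID and Message-Id are both accepted).
# Assumption: the header name starts at the beginning of a line.
list_without_message_id() {
    dir=${1:-.}
    for f in "$dir"/*; do
        # Skip directories and anything else that is not a regular file.
        [ -f "$f" ] || continue
        # -q: no output, only the exit status; -i: ignore case.
        if ! grep -q -i '^Message-ID:' "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

Calling "list_without_message_id ." in the sample directory should then print
only the old messages that lack the header.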

With the updated trid definition all of my hundreds of MBOX samples are now
described. The TrID definitions and output are stored in the archive
mbox_.zip. I hope that my definition can be used in a future version of
triddefs.

With best wishes
Jörg Jenderek

Mark0

Re: updated mbox.trid.xml for Standard Unix Mailbox *.mbox
« Reply #1 on: October 15, 2023, 05:06:44 PM »
Thanks!