Topic: variant/replacement lxt-v1.trid.xml for interLaced eXtensible Trace (v1)

jenderek

Hello trid users,

a few days ago I looked at the content of an exotic CD-ROM. Some of the
samples stored there are misidentified. These samples have the TFM file name
suffix.

Unfortunately, some TFM samples are misidentified as other file formats. Some
of them are misidentified as "interLaced eXtensible Trace" by lxt.trid.xml,
which has no reference URL and only the generic mime type
application/octet-stream. The file name suffix given is LXT (see appended
trid-v-old.txt in output). The recognition relies on a single XML construct,
which looks like:
   <Bytes>0138</Bytes>
   <ASCII> . 8</ASCII>
   <Pos>0</Pos>
So only 16 bits are used for recognition. Apparently this is often too
weak. According to the file command recommendations, at least 32 bits should
be used for recognition.
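
To illustrate why a 16-bit pattern is weak, here is a minimal sketch in
Python. It is not how TrID works internally. The two sample byte strings and
the longer 32-bit pattern are made up for illustration only.

```python
# Sketch: compare a 2-byte and a 4-byte prefix test on made-up data.
OLD_PATTERN = bytes.fromhex("0138")         # the 16-bit pattern quoted above
LONGER_PATTERN = bytes.fromhex("01380001")  # hypothetical 32-bit pattern


def matches(data: bytes, pattern: bytes, pos: int = 0) -> bool:
    """Return True if `pattern` occurs in `data` at offset `pos`."""
    return data[pos:pos + len(pattern)] == pattern


# Hypothetical file starts: both begin with 01 38 but differ in bytes 3-4.
sample_a = bytes.fromhex("0138000100000000")
sample_b = bytes.fromhex("0138001100000000")

for name, data in (("sample_a", sample_a), ("sample_b", sample_b)):
    # The short pattern cannot tell the two apart, the longer one can.
    print(name, matches(data, OLD_PATTERN), matches(data, LONGER_PATTERN))
```

Both made-up samples pass the 2-byte test, but only the first passes the
4-byte test. Any file that happens to start with 01 38 matches the short
pattern.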

For comparison I also ran the file format identification utility DROID
(see https://sourceforge.net/projects/droid/). It does not recognize the
samples either.

For comparison I also ran the file command (version 5.45) on these
samples. Here all samples are "recognized". All of them are first described
as "interLaced eXtensible Trace (LXT) file", followed by some more details.
The version number is shown in parentheses. For the LXT sample I get 1,
whereas for the TFM samples I get the "high" values 17 and 18 (see appended
file-k-5.45.txt in output). For these samples the generic mime type
application/octet-stream is shown as well (see appended file-i-5.45.txt in
output), and no file name suffix is shown (see appended file-ext-5.45.txt in
output). With the keep-going option -k I get a second, correct description
for the TFM samples: they are described as "TeX font metric data" (see
appended file-k-5.45.txt in output). With -k I also get the mime type
application/x-tex-tfm for the TFM samples instead of the generic one (see
appended file-k-i-5.45.txt in output).

Luckily there is a "Wave Analyzer User's Guide" for GTKWave with some useful
information. That PDF document can be found on the GTKWave page on
SourceForge, so I use it as the reference in the new definition. That is
expressed by a line like:
   <RefURL>https://gtkwave.sourceforge.net/gtkwave.pdf</RefURL>

Now comes the interesting part. Appendix D "LXT File Format" of the user
guide contains some useful information. An LXT file starts with a two-byte
LT_HDRID, which is defined as the constant value 0x0138. This characteristic
is used as the pattern by the current TrID definition and by the file command.
Next comes the two-byte version number LT_VERSION. That is what the file
command shows as the version. In the current guide (dated Nov 14, 2020 for
GTKWave 3.3.108 and higher versions) it is defined as the constant value
0x0001. The last byte in the file is the LT_TRLID, which is defined as the
constant value 0xB4. So these five bytes are the only "absolutes" in an LXT
file, and the file content looks like:
     01 38 00 01 ...file body... B4
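
The layout above can be sketched as a small check in Python. This is only an
illustration, not the code used by TrID or the file command. The function name
looks_like_lxt_v1 is made up, and the sketch assumes the two-byte fields are
stored big-endian, as the byte sequence 01 38 00 01 suggests.

```python
import struct

# Constants as quoted from Appendix D of the GTKWave user guide.
LT_HDRID = 0x0138
LT_VERSION = 0x0001
LT_TRLID = 0xB4


def looks_like_lxt_v1(data: bytes) -> bool:
    """Check the five fixed bytes: 2-byte header id, 2-byte version, 1-byte trailer.

    Hypothetical helper; assumes big-endian 16-bit fields.
    """
    if len(data) < 5:
        return False  # too short to hold header id, version and trailer
    hdrid, version = struct.unpack(">HH", data[:4])
    return hdrid == LT_HDRID and version == LT_VERSION and data[-1] == LT_TRLID


# Made-up byte strings for illustration.
lxt_like = bytes.fromhex("01380001") + b"...file body..." + bytes([0xB4])
other = bytes.fromhex("01380011") + b"...file body..." + bytes([0x00])

print(looks_like_lxt_v1(lxt_like))
print(looks_like_lxt_v1(other))
```

A TrID pattern at position 0 covers only the first four bytes. The trailer
test on the last byte is an extra check that this sketch adds.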

So I created a variant lxt-v1.trid.xml for version one. The characteristics
described above are expressed by an XML construct that looks like:
   <Bytes>01380001</Bytes>
   <ASCII> . 8</ASCII>
   <Pos>0</Pos>

The guide also says that LXT2 files use a completely different file format
as well as different constant values. My interpretation is that the version
is 1 at the moment and apparently will never change (never increase to 2,
for example), because LXT2 exists as a separate file format. So my conclusion
is that my new definition lxt-v1.trid.xml can be used as a replacement for
lxt.trid.xml.

With the new definition in place of the old one, the wrong descriptions
vanish. The LXT samples are described by lxt-v1.trid.xml. The TFM samples are
no longer described as LXT samples and are often described as "TeX Font
Metric" (see appended trid-v-new.txt and trid-new.txt in output).

The TrID definitions, some samples and the output are stored in the archive
tfm_lxt.zip. I hope that my definition can be used in a future version of
triddefs.

With best wishes
Jörg Jenderek

Mark0

Thanks! I will rename it as the older def, so it will take its place.