Author Topic: tlog.trid.xml for MSBuild file Tracker LOG  (Read 870 times)

jenderek

tlog.trid.xml for MSBuild file Tracker LOG
« on: July 31, 2022, 09:59:22 PM »
Hello TrID users,

A few days ago I had to use Microsoft Visual Studio. It creates some files
with the extension TLOG.

Out of curiosity I checked these Microsoft files. When running TrID on such
files, all examples are wrongly described, with a low rate, as "MP3 audio" by
audio-mp3.trid.xml. That is triggered because the samples start with the
hexadecimal value FF. The examples are listed first as "Text - UTF-16 (LE)
encoded" with MIME type text/plain by txt-utf-16-le.trid.xml (see attached
output/trid-v-old.txt). That is triggered by the byte order mark (BOM), which
is the 2-byte sequence FFFE for UTF-16 in little-endian format. That is
expressed by an XML construct like:
   <Bytes>FFFE</Bytes>
   <Pos>0</Pos>
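
To illustrate the BOM check described above, here is a minimal Python sketch.
It is not part of TrID; the function name and the sample content are only
assumptions for demonstration.

```python
import codecs

def starts_with_utf16le_bom(data: bytes) -> bool:
    """Return True if data begins with the UTF-16 LE byte order mark FF FE."""
    return data[:2] == codecs.BOM_UTF16_LE

# Hypothetical sample content: BOM followed by UTF-16 LE encoded text.
sample = codecs.BOM_UTF16_LE + "^C:\\".encode("utf-16-le")

print(starts_with_utf16le_bom(sample))   # BOM FF FE at offset 0
print(hex(sample[0]))                    # the leading FF that the MP3 definition reacts to
```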

For comparison I checked these examples with the file command utility. When
running the file command (version 5.42), all examples are described similarly
as "Unicode text" and "UTF-16, little-endian text" (see attached
output/file-5.42.txt).

Luckily I found a page about Visual Studio C++ project system extensibility
and toolset integration on the Microsoft web site. This information is now
shown inside the new definition by a line like:
 <RefURL>https://docs.microsoft.com/en-us/visualstudio/extensibility/visual-cpp-project-extensibility</RefURL>

After running tridscan to generate the new definition tlog.trid.xml, we can
refine the definition. The first XML construct looks like:
   <Bytes>FFFE5E00</Bytes>
   <ASCII> . . ^</ASCII>
   <Pos>0</Pos>

That is the BOM followed by a caret character, encoded in UTF-16 little
endian. That is always true because the caret character is the indicator for
one or more source files. The source names are stored as full names with a
DOS drive letter followed by a colon and a backslash character. The DOS drive
letter is often C, but you also find samples with D, E, F or another
uppercase letter. That is expressed by the second XML construct like:
   <Bytes>003A005C00</Bytes>
   <ASCII> . : . \</ASCII>
   <Pos>5</Pos>
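
The two patterns above can be tried out with a small Python sketch. It only
mimics the "bytes at position" idea and is not how TrID itself is
implemented; the function name and the sample paths are hypothetical.

```python
import codecs

# (offset, bytes) pairs copied from the two patterns quoted above.
PATTERNS = [
    (0, bytes.fromhex("FFFE5E00")),     # BOM + caret in UTF-16 LE
    (5, bytes.fromhex("003A005C00")),   # colon + backslash after the drive letter
]

def looks_like_tlog(data: bytes) -> bool:
    """Return True if every pattern is found at its fixed offset."""
    return all(data[pos:pos + len(pat)] == pat for pos, pat in PATTERNS)

def make_sample(text: str) -> bytes:
    """Build hypothetical file content: BOM + UTF-16 LE encoded text."""
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")

print(looks_like_tlog(make_sample("^C:\\USERS\\ME\\MAIN.CPP")))
print(looks_like_tlog(make_sample("^D:\\SRC\\MAIN.CPP")))
print(looks_like_tlog(make_sample("C:\\USERS\\ME\\MAIN.CPP")))  # no leading caret
```

The drive letter itself sits at offset 4 and is not part of any pattern, so
C, D, E and so on all match.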

In my first attempts I got many 1-byte nil patterns at odd offsets like:
   <Pattern>
      <Bytes>00</Bytes>
      <Pos>11</Pos>
   </Pattern>
   <Pattern>
      <Bytes>00</Bytes>
      <Pos>13</Pos>
   </Pattern>
   ...
   <Pattern>
      <Bytes>00</Bytes>
      <Pos>393</Pos>
   </Pattern>
That is triggered by the fact that projects are stored in directories with
English names, starting like "C:\PROGRAM FILES". Stored as UTF-16 LE, each
such character results in an ASCII byte followed by a nil byte. When the
project directories are shorter and contain fewer dependencies, then the last
of these nil patterns vanish. Typical build projects use English names, but
it is not forbidden to use exotic non-English names. Then one character may
occupy 2 non-nil bytes. So, to also match such examples, I deleted these nil
patterns from the definition.
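
The effect can be reproduced with a short Python sketch. The two paths are
made-up examples, and the helper name is hypothetical; it only lists the
offsets where a nil byte appears.

```python
import codecs

def nil_offsets(text: str) -> list[int]:
    """Offsets of 00 bytes in BOM + UTF-16 LE encoded text."""
    data = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
    return [i for i, b in enumerate(data) if b == 0]

# ASCII-only path: every character is an ASCII byte followed by a nil byte.
ascii_path = "^C:\\PROGRAM FILES"
# Path with a Cyrillic directory name (written with escapes): the high bytes
# of these characters are 04, not 00.
cyrillic_path = "^C:\\\u041f\u0440\u043e\u0435\u043a\u0442\u044b"

print(nil_offsets(ascii_path))      # nil bytes at every odd offset from 3 on
print(nil_offsets(cyrillic_path))   # nil bytes only for "^C:\"
```

A definition that demands 00 at offsets such as 11 or 13 would therefore
reject the second path.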

Instead of the generic MIME type text/plain I display a user-defined one.
That is now expressed by a line like:
   <Mime>text/x-ms-tlog</Mime>

With the new TrID definition all TLOG examples are now described (see
attached output/trid-v-new.txt). The TrID definition, a few examples and the
output are stored in the archive tlog_10.zip. I hope that my XML file can be
used in a future version of triddefs.

With best wishes
Jörg Jenderek

Mark0

Re: tlog.trid.xml for MSBuild file Tracker LOG
« Reply #1 on: August 02, 2022, 11:35:35 AM »
Thanks!