Topic: updated fts.trid.xml for Windows Help Full-Text Search index file + ftg.trid.xml

jenderek

Hello trid users,

A few days ago I worked with files from the old Windows Help system.
In this session I handle files with the suffixes FTS and FTG.

These files are typically found in the same directory as the corresponding
HLP file. The samples were created by the Microsoft Help tool winhlp32.exe.

So I ran the trid utility on my examples. The FTS samples are recognized and
correctly described as "Windows Help Full-Text Search index file" by
fts.trid.xml, but without a mime type or reference. The few FTG samples are
not recognized and are described as "Unknown!" (see the appended
trid-v-old.txt in output).

For comparison I also ran the file format identification utility DROID
(see https://sourceforge.net/projects/droid/). It does not recognize the
samples.

For comparison I also ran the file command (version 5.45) on these samples.
Here the FTS samples are also recognized and correctly described as
"MS Windows help Full Text Search index". The full name of the corresponding
HLP file is shown as well (see appended output/file-5.45.txt). The mime type
here is application/x-winhelp-fts (see appended file-i-5.45.txt in output).
The correct file name suffix is also shown for FTS samples (see appended
file-ext-5.45.txt in output). The FTG samples (like winhlp32.FTG.GID) are not
recognized and are therefore described as "data" with the generic
application/octet-stream mime type.

On Linux, according to the shared MIME-info database, files with the FTS
suffix are called "FITS document", where FITS stands for "Flexible Image
Transport System". But that is a different file format.

Luckily I found some information about Windows Help on the net, though of
course nothing official from Microsoft. The same applies to the related search
files with the suffixes FTS and FTG. So I chose the Wikipedia page as
reference. That is expressed inside the updated definition by a line like:
   <RefURL>https://en.wikipedia.org/wiki/WinHelp</RefURL>

The current definition lists no mime type. So I chose the user-defined type
reported by the file command. That is expressed by a line like:
      <Mime>application/x-winhelp-fts</Mime>

The file command also lists the full name of the corresponding HLP file (like
"C:\TMP.TMP\hlp\htmhlp98.hlp"). Apparently this is stored at offset 16. I
mention this observation in a remark line because it becomes relevant when
considering the FTG samples.
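
The offset 16 observation can be checked with a few lines of Python. This is
only a sketch: that the name is NUL-terminated and decodable as cp1252 are
assumptions, not documented facts, and read_name_at_16 is a made-up helper
name. The sample bytes are synthetic; the filler after the first 4 bytes does
not reflect the real header layout.

```python
def read_name_at_16(data: bytes, offset: int = 16) -> str:
    """Return the string starting at `offset`, up to the first NUL byte.

    Assumption: the stored file name is NUL-terminated and uses a
    single-byte Windows code page (cp1252 is a guess).
    """
    end = data.find(b"\x00", offset)
    if end == -1:  # no terminator found: take the rest of the data
        end = len(data)
    return data[offset:end].decode("cp1252", errors="replace")


# Synthetic header for illustration only: 4 signature bytes, 12 filler
# bytes, then the path mentioned in the post followed by a NUL byte.
sample = b"tfMR" + bytes(12) + b"C:\\TMP.TMP\\hlp\\htmhlp98.hlp\x00rest"
print(read_name_at_16(sample))
```

On a real sample one would pass the first few hundred bytes of the file, for
example open(path, "rb").read(512).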

FTS files are mainly identified by a characteristic 4 byte pattern at the
beginning. That is expressed by an XML construct that looks like:

   <Bytes>74664D52</Bytes>
   <ASCII> t f M R</ASCII>

On Wikipedia, besides the FTS suffix, FTG is also listed for Full Text Search
in WinHelp. So I looked on my systems for such files. Unfortunately I found
only a few samples. Several of them (like CTRLREF.FTG and SETUPWIZ.FTG) are
either empty, so the file size is 0, or contain just an empty line (Carriage
Return Line-Feed), so the file size is 2. So in the end I got only one real
sample (winhlp32.FTG).
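
Sorting such a pile of *.FTG files into empty, blank-line and real samples can
be sketched in Python as follows. The function name classify_ftg and the
category labels are invented for illustration; the "gfMR" check uses the
signature described further down in this post.

```python
def classify_ftg(data: bytes) -> str:
    """Classify the raw content of a *.FTG file into rough categories."""
    if len(data) == 0:
        return "empty"          # file size 0
    if data == b"\r\n":
        return "blank line"     # file size 2, just CR LF
    if data[:4] == b"gfMR":
        return "group file"     # starts with the FTG signature
    return "unknown"


for content in (b"", b"\r\n", b"gfMR" + bytes(12)):
    print(len(content), classify_ftg(content))
```

Only files in the "group file" category are useful as input when building a
definition; the other two carry no format information.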

So I created ftg.trid.xml manually. Here a full file name is also stored at
offset 16, but it references the FTS file instead of the HLP file. I mention
this fact in a remark line. It is also expressed inside the global strings
section by a line like:
   <String>.FTS</String>

When searching the net for the difference, the word "group" is mentioned. So,
in contrast to fts.trid.xml, this is expressed by a line like:

   <FileType>Windows Help Full-Text search Group file</FileType>

Also in contrast to fts.trid.xml, I chose a different user-defined mime type.
That is expressed by a line like:
   <Mime>application/x-winhelp-ftg</Mime>

In the leading 4 byte pattern the letter g is now used instead of the t in
fts.trid.xml. This is expressed by an XML construct like:
   <Bytes>67664D52</Bytes>
   <ASCII> g f M R</ASCII>
   <Pos>0</Pos>
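
The two byte patterns can be tried outside of TrID with a small table-driven
check in Python. This is a rough sketch and not how TrID itself matches files:
it only compares the first 4 bytes and ignores the global strings. The names
SIGNATURES and identify are made up; the descriptions and mime types are the
ones quoted in this post.

```python
# Hex patterns from the two definitions, mapped to (description, mime type).
SIGNATURES = {
    bytes.fromhex("74664D52"): (  # "tfMR"
        "Windows Help Full-Text Search index file",
        "application/x-winhelp-fts",
    ),
    bytes.fromhex("67664D52"): (  # "gfMR"
        "Windows Help Full-Text search Group file",
        "application/x-winhelp-ftg",
    ),
}


def identify(data: bytes):
    """Return (description, mime) if the first 4 bytes match, else None."""
    return SIGNATURES.get(data[:4])


print(identify(b"tfMR" + bytes(12)))
print(identify(b"gfMR" + bytes(12)))
print(identify(b"\r\n"))
```

The empty and CR LF only FTG files fall through to None here, which matches
the fact that they contain nothing to identify.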

With this new trid definition all my real Windows Help Full-Text search
samples are now described, including the Group samples (*.FTG). And more
details are now shown.

The TrID definitions, some samples and the output are stored in the archive
fts_ftg.zip. I hope that my definitions can be used in a future version of
triddefs.

With best wishes
Jörg Jenderek

Mark0

Thanks Jörg!