Author Topic: TrID update + variant for True Type Font  (Read 3644 times)

jenderek

  • Sr. Member
  • ****
  • Posts: 361
TrID update + variant for True Type Font
« on: June 15, 2017, 12:33:11 AM »
Hello,

when I run TrID on hundreds of TrueType fonts, some are misidentified
only as "DEGAS med-res bitmap" ( see appended output\trid-old.txt ).
Fonts like jpn_boot.ttf can be found, for example, in the Fonts subdirectory
on EFI boot partitions.

The newest file(1) command version (http://darwinsys.com/file/) identifies such
examples correctly as "TrueType Font" ( see appended output/file-new.txt ).

After running tridscan to update the TrID definition file, I manually made some
additional changes in ttf.trid.xml.

A good starting point for TrueType fonts is the Wikipedia page,
so I added its URL to the XML file with the construct
   <RefURL>https://en.wikipedia.org/wiki/TrueType</RefURL>

The file name extension for a TrueType font is "ttf", but there is one
exception. Microsoft's private character editor EUDCEDIT.EXE, which is
part of some Windows versions (found in %WINDIR%\system32 on Windows 8.1),
creates the TrueType font EUDC.tte. So the extension is described by the line
   <Ext>TTF/TTE</Ext>

According to the documentation, the MIME type is given by the line
   <Mime>application/font-sfnt</Mime>
but this should become "font/ttf" in the near future.

According to the specifications, such TrueType fonts start with the scalable
font (sfnt) version 0x00010000. After that, the number of tables is stored as a
16-bit value. At first glance a font with up to 65535 tables could exist, but
real fonts contain only dozens of tables. The highest number I found is 27 in
the Skia.ttf example and the lowest number is 9, as in NISC18030.ttf. So the
high byte of the big-endian 16-bit value is zero. This is expressed by the XML
construct
   <Pattern>
      <Bytes>0001000000</Bytes>
      <Pos>0</Pos>
   </Pattern>
So I mention these facts in the remark line:
<Rem>
variant with sfnt version 0x00010000 at offset 0 and number of tables &lt; 28
stored at 4.  Windows private character editor EUDCEDIT.EXE creates
EUDC.TTE
</Rem>
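
To make the header layout concrete, here is a minimal Python sketch that reads
the sfnt version and the number of tables and applies the same 5-byte test as
the pattern above. The function names and the sample header are only
illustrations made up for this post; the header bytes are synthetic, not taken
from a real font.

```python
import struct

SFNT_TRUETYPE = 0x00010000  # sfnt version named in the post


def read_sfnt_header(data: bytes):
    """Return (sfnt_version, num_tables) from the first 6 bytes, big endian."""
    if len(data) < 6:
        raise ValueError("too short for an sfnt header")
    return struct.unpack(">IH", data[:6])


def matches_ttf_pattern(data: bytes) -> bool:
    """Same test as the pattern 00 01 00 00 00 at position 0."""
    return data[:5] == bytes.fromhex("0001000000")


# Synthetic 12-byte header (assumption, not a real font): version,
# numTables=10, then searchRange, entrySelector and rangeShift.
header = struct.pack(">IHHHH", SFNT_TRUETYPE, 10, 128, 3, 32)

version, num_tables = read_sfnt_header(header)
print(hex(version), num_tables, matches_ttf_pattern(header))
```

A header that starts with 'true' or that has 256 or more tables would fail the
5-byte test, which is exactly the limitation the remark line describes.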

The table directory entries start at byte 12. Each 16-byte table entry consists
of tag name, checksum, offset and length. So the size of the whole directory
lies in the range given by
160 = 10*16 = minimal tables * 16 <= tables * 16 <= maximal tables * 16 = 27*16 = 432
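
The arithmetic can be checked with a few lines of Python. This is only a
sketch: the constant and function names are my own, and the "first table
offset" assumes that the tables follow the directory directly, which the
format does not strictly require.

```python
HEADER_SIZE = 12  # the table directory starts at byte 12
ENTRY_SIZE = 16   # tag + checksum + offset + length, 4 bytes each


def directory_size(num_tables: int) -> int:
    """Size in bytes of the table directory for a given table count."""
    return num_tables * ENTRY_SIZE


def first_table_offset(num_tables: int) -> int:
    """Earliest position at which table data can start (assumption: no gap)."""
    return HEADER_SIZE + directory_size(num_tables)


for count in (10, 27):
    print(count, directory_size(count), first_table_offset(count))
```

For 10 tables this gives a directory of 160 bytes ending at position 172, and
for 27 tables a directory of 432 bytes.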

The structure of the table directory looks similar for all fonts, so the TrID
definition file gets patterns that describe what the sampled table directories
have in common. When a font with a new lowest table count is scanned, the
directory shrinks and the last patterns vanish. For a font like
segmono_boot.ttf with 10 tables, the directory size is now 160 and the first
table starts at 172. This means that in the updated XML file the last pattern
disappears:
   <Pattern>
      <Bytes>00</Bytes>
      <Pos>180</Pos>
   </Pattern>

Tag names usually consist of four characters, like "NAME", "HMTX" or "OS/2".
Names with fewer than four letters are allowed if they are padded with trailing
spaces, as in "cvt ". The Apple specification mentions 46 different table
names, but not all possible tables are used at the same time; only 8 tables are
found in all fonts. This is described by the GlobalStrings section. The string
"$HMTX" means that in all sampled fonts the directory entry before the
horizontal metrics table "HMTX" has a length whose last byte is 24h="$". This
is probably a coincidence, so I reduced all strings in the GlobalStrings
section to the existing 4-byte table names.
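
One way to find the table names shared by all fonts is to read the tags from
each table directory and intersect the sets. The sketch below does that on
made-up directories; make_fake_font is a hypothetical helper that only builds a
header plus directory with zeroed fields, so it is test data and not a valid
font. The tags in the font files themselves are lower case ("hmtx", "name"),
and the sketch uses that spelling.

```python
import struct


def table_tags(data: bytes) -> set:
    """Return the set of 4-byte tags listed in the table directory."""
    num_tables = struct.unpack(">H", data[4:6])[0]
    tags = set()
    for i in range(num_tables):
        start = 12 + 16 * i  # each entry is 16 bytes, the tag comes first
        tags.add(data[start:start + 4].decode("latin-1"))
    return tags


def make_fake_font(tags) -> bytes:
    """Hypothetical helper: header plus directory only, all other fields 0."""
    header = struct.pack(">IHHHH", 0x00010000, len(tags), 0, 0, 0)
    entries = b"".join(t.encode("latin-1") + bytes(12) for t in tags)
    return header + entries


fonts = [
    make_fake_font(["cmap", "cvt ", "glyf", "head"]),
    make_fake_font(["cmap", "glyf", "head", "kern"]),
]

# Tags present in every sample; with real fonts, pass the file contents.
common = set.intersection(*(table_tags(f) for f in fonts))
print(sorted(common))
```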

After the table directory the tables themselves start. That means the offset
values are low compared with the maximal value FFFFFFFFh. The table sizes are
also never maximal. So the upper bytes of the offset and size values are often
zero. This is what is expressed by a null pattern like

      <Pattern>
         <Bytes>0000</Bytes>
         <Pos>24</Pos>
      </Pattern>

But when fonts with many tables are inspected, it is very likely that these
null patterns shrink, and the updated TrID definition file then contains the
XML construct
      <Pattern>
         <Bytes>00</Bytes>
         <Pos>24</Pos>
      </Pattern>
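
Why a null pattern shrinks can be illustrated with a small Python sketch that
keeps only the byte positions on which all samples agree. This is not
tridscan's actual algorithm, only a simplified model of the idea, and the
sample length fields are invented. Position 0 in the sketch corresponds to
file position 24.

```python
def common_bytes(samples):
    """Map position -> byte value for positions where every sample agrees."""
    shortest = min(len(s) for s in samples)
    return {
        pos: samples[0][pos]
        for pos in range(shortest)
        if all(s[pos] == samples[0][pos] for s in samples)
    }


# Hypothetical 4-byte length fields as they would appear at positions 24..27.
few = [bytes.fromhex("00001a2c"), bytes.fromhex("000004f0")]
more = few + [bytes.fromhex("0003e1f0")]  # a sample with a larger first table

print(sorted(common_bytes(few)))   # two leading null bytes are shared
print(sorted(common_bytes(more)))  # only the first null byte is left
```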

From the above pattern it can be seen that the length of the first table is
less than 1000000h. Adding 16 to the position shows that this is also true for
the second table at position 40, and so on.
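
The positions follow from the entry layout: the length is the last of the four
4-byte fields, so its high byte sits at 12 + 16*i + 12 for entry i. Below is a
short Python sketch with invented directory entries; the names and the sample
values are assumptions for illustration only.

```python
import struct

HEADER_SIZE = 12
ENTRY_SIZE = 16
LENGTH_FIELD = 12  # tag, checksum and offset (4 bytes each) precede the length


def length_pos(index: int) -> int:
    """Position of the high byte of the length field of entry `index`."""
    return HEADER_SIZE + ENTRY_SIZE * index + LENGTH_FIELD


def high_length_byte_is_zero(data: bytes, index: int) -> bool:
    """True if the table length is below 1000000h, as the null pattern says."""
    return data[length_pos(index)] == 0


# Synthetic header and two directory entries (not from a real font).
header = struct.pack(">IHHHH", 0x00010000, 2, 32, 1, 0)
entries = (
    struct.pack(">4sIII", b"cmap", 0, 44, 0x000001F4)
    + struct.pack(">4sIII", b"glyf", 0, 544, 0x01000000)
)
data = header + entries

print(length_pos(0), length_pos(1))
print(high_length_byte_is_zero(data, 0), high_length_byte_is_zero(data, 1))
```

The second entry deliberately has a length of exactly 1000000h, so its high
byte is not zero and a null pattern at position 40 would not match it.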

With the updated definition file containing the above modifications, the
misidentified fonts are now recognized ( see appended output\trid-new.txt ).

But on Mac OS X systems I found 14 fonts like "Apple Chancery.ttf" which are
not identified ( see appended apple\output\trid-old.txt ).

The newest file(1) command version identifies such examples correctly as
"TrueType Font" ( see appended apple\output\file-new.txt ).

Instead of sfnt version 0x00010000, the value 'true' (0x74727565) also occurs
in some Apple fonts.
So I mention these facts in the remark line:
<Rem>Apple variant with sfnt version 0x74727565='true' at offset 0 and
number of tables &lt; 28 stored at 4.</Rem>

This information can be found in Apple's TrueType Reference Manual. So I add
this URL as reference by
<RefURL>
https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6.html
</RefURL>

After running tridscan on these font variants and proceeding as described
above, I got a second definition file, ttf-true.trid.xml. It contains null
patterns at high offsets like:
      <Pattern>
         <Bytes>00</Bytes>
         <Pos>372</Pos>
      </Pattern>

Maybe such patterns vanish if more fonts with fewer tables are inspected. But
for the moment I keep the patterns found.
With the second definition file, the unrecognized fonts are now described as
"TrueType Font (true)" ( see appended apple\output\trid-new.txt ).

The above considerations and tests also apply to other scalable fonts, not
only TrueType. OpenType fonts start with the sfnt version 'OTTO'
(0x4F54544F), and font collections contain such fonts after a header. So
whenever ttf.trid.xml is changed you should also look at otf.trid.xml and
ttc.trid.xml.

There should also exist a third font variant. According to Apple's manual,
instead of sfnt version 0x00010000 the value 'typ1' (0x74797031) can also
occur. But I found no examples of that variant. So maybe other users can look
for such samples and create the next TrID definition file.
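
The sfnt versions mentioned in this post can be told apart by the first four
bytes. Here is a minimal Python sketch; the dictionary and the description
strings are my own wording, not TrID output, and font collections are left out
because their header is not covered here.

```python
# First four bytes of the file -> description (labels are illustrative only).
SFNT_VERSIONS = {
    bytes.fromhex("00010000"): "TrueType (sfnt version 0x00010000)",
    b"true": "TrueType, Apple variant ('true')",
    b"typ1": "Apple 'typ1' variant (no sample found so far)",
    b"OTTO": "OpenType ('OTTO')",
}


def classify(data: bytes) -> str:
    """Describe a font by its sfnt version, or 'unknown' if not listed."""
    return SFNT_VERSIONS.get(data[:4], "unknown")


print(classify(bytes.fromhex("74727565") + bytes(8)))
print(classify(bytes.fromhex("4F54544F") + bytes(8)))
print(classify(b"\x00\x01\x00\x00" + bytes(8)))
```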

The TrID definitions and output are stored in the archive ttf_trid.zip.
I hope that my XML files can be used in a future version of triddefs.

With best wishes
Jörg Jenderek

Mark0

  • Administrator
  • Hero Member
  • *****
  • Posts: 2667
    • Mark0's Home Page
Re: TrID update + variant for True Type Font
« Reply #1 on: June 15, 2017, 02:44:46 AM »
Thanks, as usual!