Hello TrID users,
some weeks ago I sent updates for aup3.trid.xml. While working on them I
noticed a useful feature, the "application id", that is also used by some
other SQLite 3.x databases.
The standard file name suffix for SQLite 3.x databases is SQLITE3, SQLITE or
DB. Nowadays many companies and developers use this file format to store
their data.
Some use a different file name suffix. I assume they do not want "normal"
users to open the database manually with tools for handling SQLite databases.
Worse, some do not explain in a transparent way why they take such steps and
what they changed compared with a standard database. If all goes well (the
file name suffix is intact or known, and the samples are found in well-known
subdirectories) this does no harm, but in the real world you must also
consider all other points of view. Some behave like Putin claiming that the
world belongs to him; for such software developers the whole disk belongs to
them. But what if the hard disk crashes, or extracting or packing of software
archives fails? Then you often get hundreds of "unknown" samples lying
somewhere on your disk, and undoing the chaos is nearly impossible. Luckily
the current SQLite database file format has a 4-byte "application id" field
at offset 68 that makes sub-classification possible, and some people
actually use this feature.
In this session I will only consider the TeXnicard card database.
Some information can be found, for example, on the TeXnicard card database
page of the Archive Team file formats web site. That reference is expressed
inside the new db-texnicard.trid.xml by a line like:
<RefURL>http://fileformats.archiveteam.org/wiki/TeXnicard_card_database</RefURL>
So I ran the trid utility on such TeXnicard database samples. Unfortunately I
found no real samples myself, so I created an artificial sample,
test-TeXnicard.db, by applying the described features. My sample is described
correctly but only generically as "SQLite 3.x database" with mime type
application/x-sqlite3 by sqlite-3x.trid.xml. No sub-classification is done.
That means the file name suffix is not correctly shown (see appended
output/trid-v-old.txt).
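
An artificial sample of this kind could be produced roughly as sketched
below. This is only an illustration using Python's standard sqlite3 module,
not necessarily how test-TeXnicard.db was actually made; the function names
and the dummy table "placeholder" are hypothetical, and a real TeXnicard
database will of course contain other tables.

```python
import os
import sqlite3
import struct
import tempfile

# Application ID reported by file 5.45 for TeXnicard samples.
TEXNICARD_APP_ID = 1778603844


def make_sample(path, app_id=TEXNICARD_APP_ID):
    """Create a small SQLite 3 database whose header carries app_id."""
    conn = sqlite3.connect(path)
    try:
        # PRAGMA values cannot be bound as parameters, so format the integer.
        conn.execute(f"PRAGMA application_id = {int(app_id)}")
        # Hypothetical dummy table, so that the first page is surely written.
        conn.execute("CREATE TABLE IF NOT EXISTS placeholder (id INTEGER)")
        conn.commit()
    finally:
        conn.close()


def read_application_id(path):
    """Return the 4-byte big-endian integer stored at offset 68."""
    with open(path, "rb") as handle:
        header = handle.read(100)  # the SQLite header is 100 bytes long
    return struct.unpack(">I", header[68:72])[0]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "test-TeXnicard.db")
        make_sample(sample)
        print(read_application_id(sample))
```

Reading the value back from offset 68 is a quick check that the sample really
carries the intended application id before running trid on it.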
For comparison I also ran the file format identification utility DROID (see
https://sourceforge.net/projects/droid/). Here the samples are also
recognized, but again only generically, as "SQLite Database File Format" with
version "3" and mime type application/x-sqlite3 by PUID fmt/729.
For comparison I also ran the file command (version 5.45) on such samples.
Here the TeXnicard samples are also described as "SQLite 3.x database", but
with additional information (application id 1778603844, see appended
output/file-5.45.txt) and mime type application/vnd.sqlite3 (see appended
output/file-i-5.44.txt). The correct file name suffix is also not recognized
(see appended output/file-ext-5.45.txt).
There is an officially registered mime type application/vnd.sqlite3 at
iana.org for SQLite 3.x databases. For the inspected TeXnicard samples I found
no specific mime type. Because the TeXnicard samples are just SQLite 3
databases, they should at least get that mime type instead of the generic
application/octet-stream or the deprecated application/x-sqlite3. That is now
expressed inside the TrID definition by a line like:
<Mime>application/vnd.sqlite3</Mime>
At offset 68 the "Application ID" set by PRAGMA application_id is stored as a
4-byte big-endian integer. That is the most important sub-classification
feature to distinguish the TeXnicard samples from others. For the TeXnicard
card database this is decimal 1778603844, or the hexadecimal byte sequence
6A035744.
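
The relation between the decimal value and the byte sequence can be checked
with a few lines of Python. This is just a small sketch of the arithmetic; the
variable names are made up for the illustration.

```python
APP_ID = 1778603844  # decimal application id of the TeXnicard samples

# Encode as a 4-byte big-endian integer, as stored at offset 68.
raw = APP_ID.to_bytes(4, byteorder="big")

# Upper-case hex string, in the style used inside a <Bytes> element.
hex_pattern = raw.hex().upper()

# Printable view: non-printable bytes are shown as a dot.
ascii_view = " ".join(chr(b) if 32 <= b < 127 else "." for b in raw)

print(hex_pattern)  # 6A035744
print(ascii_view)   # j . W D

# The round trip gives the decimal value back.
assert int.from_bytes(raw, byteorder="big") == APP_ID
```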
So now I can create db-texnicard.trid.xml manually without running tridscan on
dozens of examples. The sub-classification done by the "Application ID" is
expressed by an XML construct like:
<Bytes>6A035744</Bytes>
<ASCII> j . W D</ASCII>
<Pos>68</Pos>
The main classification, as in sqlite-3x.trid.xml, is expressed by an XML
construct like:
<Bytes>53514C69746520666F726D61742033</Bytes>
<ASCII> S Q L i t e f o r m a t 3</ASCII>
<Pos>0</Pos>
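
What these two patterns mean together can be sketched in Python as follows.
This only illustrates the byte comparison at the two positions; it is not how
TrID works internally, and the names PATTERNS, matches_all and classify are
hypothetical.

```python
# Byte patterns taken from the two XML constructs: (position, bytes).
SQLITE_MAGIC = bytes.fromhex("53514C69746520666F726D61742033")  # "SQLite format 3"
TEXNICARD_ID = bytes.fromhex("6A035744")                        # application id
PATTERNS = [(0, SQLITE_MAGIC), (68, TEXNICARD_ID)]


def matches_all(header, patterns=PATTERNS):
    """Return True if every pattern is found at its position in header."""
    return all(header[pos:pos + len(data)] == data for pos, data in patterns)


def classify(header):
    """Give a coarse description based on the first 100 bytes of a file."""
    if not header.startswith(SQLITE_MAGIC):
        return "unknown"
    if matches_all(header):
        return "TeXnicard card database"
    return "SQLite 3.x database"


if __name__ == "__main__":
    # Build a fake 100-byte header instead of reading a real file.
    fake = bytearray(100)
    fake[0:16] = b"SQLite format 3\x00"
    fake[68:72] = TEXNICARD_ID
    print(classify(bytes(fake)))
```

A file that only matches the pattern at position 0 stays a generic SQLite 3.x
database; only the additional match at position 68 gives the sub-classification.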
No extension is required or suggested by the software, but the author is
currently using the .db extension. So the correct file name suffix is now
given by a line like:
<Ext>DB</Ext>
With the new TrID definition my TeXnicard card database example is now
described in more detail (with the correct file name suffix; see appended
output/trid-v-new.txt). The TrID definition and the output are stored in the
archive TeXnicard_.zip. I hope that my definition can be used in a future
version of triddefs.
There are more database samples using the application id feature. I will try
to handle such samples in a future session.
With best wishes
Jörg Jenderek