Hello,
When I run trid on some old DOS backup files (see backup.lst), they
are misidentified as "MP3 audio", "MacBinary 1 header", "MacBinary 2
header", "DEGAS med-res bitmap", "TTComp archive compressed", or
other formats (see appended output/trid-old.txt).
The newest version of the file(1) command (
http://darwinsys.com/file/)
identifies such files correctly as "DOS 2.0-3.2 backed up" (see
appended output/file-new.txt).
The format of such files is described in the BACKUP & RESTORE document
of the FreeDOS project, found for example at
http://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/dos/restore/brtecdoc.htm
So I added this URL as a reference to the new trid definition files.
According to the FreeDOS documentation, the header contains 44 unused
padding bytes, which seem to be nulls. I test for them with an XML
construct like
<Pattern>
<Bytes>0000000000000000000000000000000000000000000000</Bytes>
<Pos>84</Pos>
</Pattern>
At offset 5 a string field of size 78 starts, containing the full
pathname of the backed-up file. This string is zero-terminated, which
means the last byte of this field (offset 82) is null. This is
described by the construct
<Pattern>
<Bytes>00</Bytes>
<Pos>82</Pos>
</Pattern>
The pathname of the backed-up file is stored without the drive letter
and colon (like C:). That means the file name string starts with a
path separator.
I found only real-world examples that start with the DOS path separator
(5Ch = backslash = "\"). But according to V32SLASH.TXT in the archive
PD0315.EXE, the UNIX variant (2Fh = slash = "/") can also occur.
Together with a test of some other unknown bytes (which are also null
bytes), the test for the DOS path separator is expressed by the construct
<Pattern>
<Bytes>0000005C</Bytes>
<ASCII> . . . \</ASCII>
<Pos>2</Pos>
</Pattern>
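To illustrate the pathname field described above, here is a rough
Python sketch, not part of the trid definitions. The function name
backed_up_path is hypothetical, and decoding the name as code page 437
is my assumption. Unlike the XML pattern, the sketch also accepts the
UNIX separator.

```python
from typing import Optional

PATH_OFFSET = 5   # start of the pathname field, as described in the post
PATH_SIZE = 78    # field size, so the field ends at offset 82


def backed_up_path(header: bytes) -> Optional[str]:
    """Return the stored pathname, or None if the field looks invalid."""
    field = header[PATH_OFFSET:PATH_OFFSET + PATH_SIZE]
    if len(field) < PATH_SIZE or field[-1] != 0:
        return None  # too short, or the last byte (offset 82) is not null
    if field[0] not in (0x5C, 0x2F):
        return None  # must start with "\" (5Ch) or "/" (2Fh)
    name = field.split(b"\x00", 1)[0]  # cut at the first null terminator
    return name.decode("cp437")        # assumption: DOS code page 437
```

Called on the first 128 bytes of a backup file, this would return
something like \DOCS\LETTER.DOC, without drive letter and colon.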
The header begins with a flag byte, where FFh means a complete file
and 00h means a split file. So I created 2 variant definition files.
The first is msbackup-v2.trid.xml, with the additional XML construct
<Pattern>
<Bytes>FF</Bytes>
<Pos>0</Pos>
</Pattern>
The second, msbackup-v2part.trid.xml, describes backup parts, where a
sequence number is stored at offset 1. It uses the additional XML
construct
<Pattern>
<Bytes>00</Bytes>
<Pos>0</Pos>
</Pattern>
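The combined effect of the patterns can be sketched in Python roughly
as follows. This is only an illustration of the matching logic, not
how trid works internally. The names COMMON and classify are
hypothetical, and the sketch assumes the 44 unused bytes run from
offset 84 to 127 and checks all of them.

```python
from typing import Optional

# (offset, expected bytes) pairs shared by both definitions
COMMON = [
    (2, bytes.fromhex("0000005C")),  # unknown null bytes, then "\" at offset 5
    (82, b"\x00"),                   # last byte of the pathname field
    (84, bytes(44)),                 # assumption: all 44 padding bytes are null
]


def classify(header: bytes) -> Optional[str]:
    """Return "complete", "part", or None for a non-matching header."""
    if len(header) < 128:
        return None
    for pos, expected in COMMON:
        if header[pos:pos + len(expected)] != expected:
            return None
    if header[0] == 0xFF:
        return "complete"  # corresponds to msbackup-v2.trid.xml
    if header[0] == 0x00:
        return "part"      # corresponds to msbackup-v2part.trid.xml
    return None
```

The byte at offset 1 is deliberately not tested, so a part with any
low sequence byte still matches.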
According to the FreeDOS documentation, Microsoft and IBM used different
backup file formats in different DOS versions. The format described by
the above patterns is used in versions 2.0 through 3.2, so such files
are described as "DOS 2.0-3.2 backup file".
The backup filename is the same as the original filename. So if you
back up a DOS executable, the file name extension is "EXE"; backups of
Word documents have the "DOC" extension, and so on. So any allowed DOS
extension can occur. I do not know if this is 100% valid, but I
describe this fact by the construct:
<Ext>*</Ext>
With these 2 new definition files, all my DOS BACKUP files are finally
recognized (see appended output\trid-new.txt).
The trid definitions and output are stored in the archive backup.zip.
I hope that my XML files can be used in a future version of triddefs.
With best wishes
Jörg Jenderek