Marco Pontello's Home Page

(Last updated: 01/02/15)

TrIDScan - Patterns scanner

TrIDScan creates new definitions to be used with TrID. You can use it to help collect new unique definitions. Here's how.

Let's say you want to create a definition for Java Class files and you have a collection of them. Put your file collection into a directory (folder) of its own. The more varied your collection in size and compression, the better the results. Run TrIDScan against the folder. That's all there is to it; the program does the rest.

 D:\Trid>tridscan f:\test\*.class

 TrID/32 - Scan Module v1.56 - (C) 2003-04 By M.Pontello       

 Checking files...
 Found 97 matching files
 Header Block Size: 222
 Scanning for patterns...
 Pattern(s) found: 3
 Last Pattern end at offset: 11
 Scanning for raw strings...
 Raw string(s) found: 64
 Pre-filtering strings...
 Phase 1... 100%
 Phase 2... 100%
 Erasing substrings... 100%
 String(s) found: 3
 Writing XML file ...
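
To give an idea of what the "Scanning for patterns" step in the output above is looking for, here is a small Python sketch. It is not TrIDScan's actual code, only an illustration, and it assumes that a pattern is a run of bytes that is identical at the same offset in every file. The function name common_patterns and the three sample headers are made up for the example; only the CA FE BA BE magic number at the start is what real Java Class files begin with.

```python
from typing import List, Tuple


def common_patterns(headers: List[bytes]) -> List[Tuple[int, bytes]]:
    """Return (offset, bytes) runs that are identical in every header.

    Illustration only: an assumed model of pattern scanning, not TrIDScan's code.
    """
    if not headers:
        return []
    block = min(len(h) for h in headers)  # compare only the shared prefix
    first = headers[0]
    patterns = []
    start = None  # offset where the current matching run began
    for pos in range(block):
        same = all(h[pos] == first[pos] for h in headers)
        if same and start is None:
            start = pos
        elif not same and start is not None:
            patterns.append((start, first[start:pos]))
            start = None
    if start is not None:  # a run that reaches the end of the block
        patterns.append((start, first[start:block]))
    return patterns


# Three made-up headers sharing the Java Class magic number and a few other bytes.
samples = [
    b"\xca\xfe\xba\xbe\x00\x03\x00\x2d\x00\x10",
    b"\xca\xfe\xba\xbe\x00\x00\x00\x2e\x00\x20",
    b"\xca\xfe\xba\xbe\x00\x00\x00\x31\x00\x30",
]

for offset, data in common_patterns(samples):
    print(offset, data.hex())
```

Under this model, a varied collection helps because bytes that match only by coincidence in a few similar files drop out as soon as a file that differs at that offset is added.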

Scanning is generally fast, even for many files. It can be slow if the collection doesn't include at least one small file (under roughly 300-400KB) and the file contents are virtually random (ZIP files, MP3, JPEG, etc.). Just in case, it's possible to disable the strings scanning (the slow part of the process) using the switch "/NS". With strings scanning disabled, the scan will be very fast even for a thousand files.
When finished, TrIDScan will create a file named "newtype.trid.xml" that contains the identifying details for the files you just scanned.
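
The strings phases listed in the output (raw strings, pre-filtering, erasing substrings) can be pictured with another small Python sketch. Again, this is a guess at the general idea and not the program's real algorithm. It assumes a "raw string" is a run of printable ASCII characters, compares strings without regard to case, drops strings contained in a longer surviving string, and sorts the result longest first (the direction of the sort is an assumption). The names raw_strings and common_strings, the min_len parameter, and the sample data are all invented for the example.

```python
import re
from typing import List, Set


def raw_strings(data: bytes, min_len: int = 4) -> Set[bytes]:
    """Printable ASCII runs of at least min_len bytes, lower-cased."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return {m.group().lower() for m in re.finditer(pattern, data)}


def common_strings(files: List[bytes], min_len: int = 4) -> List[bytes]:
    """Strings from the first file that occur in every file, ignoring case."""
    if not files:
        return []
    lowered = [f.lower() for f in files]
    candidates = raw_strings(files[0], min_len)
    # Keep only candidates that appear somewhere in every file.
    kept = [s for s in candidates if all(s in f for f in lowered)]
    # Erase strings that are contained in a longer kept string.
    unique = [s for s in kept if not any(s != t and s in t for t in kept)]
    # Longest first; ties broken alphabetically for a stable result.
    return sorted(unique, key=lambda s: (-len(s), s))


# Made-up file contents for illustration.
files = [
    b"\x01\x02java/lang/Object\x00\x07Code\x00\x01xyzw",
    b"\x09JAVA/LANG/OBJECT\x00\x00\x00code\x05abcd",
    b"Code\x00\x00\x03java/lang/Object\xff",
]

print(common_strings(files))
```

Checking every candidate string against every file is the kind of work that grows quickly with large files, which fits the remark above that strings scanning is the slow part.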

You have two steps left at this point: rename the file and edit its header. In the example given you might rename the file to "java-class.trid.xml" to indicate it applies to that kind of file. Then, open the file in a text editor and make the necessary changes to the header information. The file "java-class.trid.xml" would have a form similar to:

<TrID ver="2.00">
        <FileType>Java Bytecode</FileType>
        <User>Marco Pontello</User>

Edit the information between the <FileType> tags to indicate what type of file it is; the information between the <Ext> tags if TrIDScan has not guessed correctly; and your contact information in the <User>, <E-mail>, and <Home> tags. That's it.
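
If you create many definitions, the rename-and-edit steps can also be scripted. The following Python sketch is one possible way to do it, not part of TrIDScan. The function name finish_definition is made up, and SAMPLE is a shortened stand-in containing only the tags quoted above, not the real output of a scan. The search uses ".//" so the tags are found wherever they sit in the file.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Shortened, hypothetical stand-in for a generated definition file.
SAMPLE = """<TrID ver="2.00">
        <FileType>Unknown</FileType>
        <User>Nobody</User>
</TrID>
"""


def finish_definition(src: str, dst: str, file_type: str, user: str) -> None:
    """Rename a freshly generated definition and fill in two header fields."""
    os.replace(src, dst)
    tree = ET.parse(dst)
    for tag, text in (("FileType", file_type), ("User", user)):
        node = tree.getroot().find(f".//{tag}")
        if node is not None:  # skip tags that are not present in the file
            node.text = text
    tree.write(dst)


# Try it on the stand-in file inside a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "newtype.trid.xml")
    dst = os.path.join(tmp, "java-class.trid.xml")
    with open(src, "w") as fh:
        fh.write(SAMPLE)
    finish_definition(src, dst, "Java Bytecode", "Your Name")
    with open(dst) as fh:
        print(fh.read())
```

Note that ElementTree drops XML comments when it rewrites a file, so check the result in a text editor before sending it in.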

If you plan to do many scans then you can edit the contact information in the "tridscan.cfg.xml" file and it will be inserted into all further scans by default.

<?xml version="1.0"?>
    <User>Marco Pontello</User>

The tags <Rem> and <RefURL>, in the <ExtraInfo> section, are used to provide additional information about the file type: <Rem> holds remarks and <RefURL> holds a related or reference URL.

Once you create a new definition, send me a copy of the XML file so it can be included in the database (see the Contacts page for the address).

It's also possible to "refine" a definition: scan some files and tell TrIDScan to start from an already existing definition instead of starting from scratch. The result is as if you had also scanned all the files already analyzed (maybe by other users). To use the refining function, simply add the definition to use as a starting point to the command line. The new definition will replace the previous one, which is first saved as "newtype.trid.xml.bak" (this comes in handy if for some reason you want to recover the previous definition). For example:

 D:\Trid>tridscan c:\dev\programs\*.class java-class.trid.xml

It's also possible to force a rescan of all unique strings, for example when refining a definition made with an earlier version of TrIDScan that didn't yet support this feature. Simply use the switch "/FS" when refining.
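
For batch work, the command lines described on this page can be assembled from a script. This Python sketch only builds the argument list and does not run anything. The function name tridscan_command and its parameters are invented for the example. Placing the switches at the end of the command line and treating "/NS" and "/FS" as mutually exclusive are both assumptions, since this page only names the switches.

```python
from typing import List, Optional


def tridscan_command(files: str,
                     refine_from: Optional[str] = None,
                     no_strings: bool = False,
                     force_strings: bool = False) -> List[str]:
    """Build a tridscan argument list from the options described on this page."""
    if no_strings and force_strings:
        # Assumption: disabling and forcing the strings scan cannot be combined.
        raise ValueError("choose either /NS or /FS, not both")
    cmd = ["tridscan", files]
    if refine_from:
        cmd.append(refine_from)  # existing definition used as the starting point
    if no_strings:
        cmd.append("/NS")        # skip the strings scanning
    if force_strings:
        cmd.append("/FS")        # rescan all unique strings
    return cmd


print(" ".join(tridscan_command(r"f:\test\*.class")))
print(" ".join(tridscan_command(r"c:\dev\programs\*.class",
                                refine_from="java-class.trid.xml",
                                force_strings=True)))
```

The resulting list could then be handed to subprocess.run(cmd, check=True) from the folder that contains tridscan.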

Making definitions is easy. Use TrIDScan to scan all those data files you've been creating over the years. Send them all in. This data will be added to the master database so your work will help others identify unknown files they might have on their system. You'll be helping everyone!

If you find TrID a worthwhile project, please tell your friends about it and this site; this is going to work much better if many people participate and produce new or better defs!

Please be as specific as possible when you describe the file type. If you have data files from different versions of a program, try to group them by version and then create a definition for each version of the program. If, for example, you have Excel files created by a DOS version and Excel files created by a Windows version, don't just create a single Excel definition; create one for each version.


 Win32   TrIDScan v1.56, 28KB ZIP
 Win32   TrIDDefsPack v1.12, 25KB ZIP
   TrID XML defs, 1017KB RAR (archive with 5474 definitions, 01/02/15)


Change Log

TrIDDefsPack v1.12 - 24/02/11:
* Fixed an inconsistency with the Tag element.

TrIDScan/32 v1.56 - 22/11/04:
+ Unique strings are sorted by length.

TrIDScan/32 v1.55 - 20/11/03:
+ Unique strings scanning is now case insensitive.
+ Possible bug fixed in the refining function.

TrIDScan/32 v1.50 - 15/11/03:
+ New unique strings detection function. It can be forced with the switch "/FS", or disabled with "/NS".

TrIDScan/32 v1.23 - 13/08/03:
+ New definition refining function.
+ Added section <ExtraInfo> with elements <Rem> (for some remarks) and <RefURL> (for a related/reference URL).

TrIDScan/32 v1.00.1 - 13/07/03:
+ Added element <ASCII> with an ASCII dump of the pattern.