(Last updated: 04/10/24)
TrIDScan - Patterns scanner
TrIDScan creates new definitions to be used with TrID. You can use
it to help collect new unique definitions. Here's how.
Let's say you want to create a definition for Java Class files and
you have a collection of them. Put your file collection into a
directory (folder) of its own. The more varied your collection in
size and compression, the better the results. Run TrIDScan against
the folder. That's all there is to it; the program does the rest.
D:\TrID>tridscan.py \test\*.class
TrIDScan/Py v2.02 - (C) 2015-2016 By M.Pontello
File(s) to scan found: 4
Scanning for patterns...
Checking file 1/4 '\test\corewar.class'
Checking file 2/4 '\test\hellow.class'
Pattern(s) found: 15
Checking file 3/4 '\test\life.class'
Pattern(s) found: 6
Checking file 4/4 '\test\primes.class'
Pattern(s) found: 3
Last pattern end at offset: 11
Scanning for strings...
Analyzing file 1/4 '\test\corewar.class'
Raw strings: 0K
String(s) found: 162
Analyzing file 2/4 '\test\hellow.class'
Parsing...
Raw strings: 0K
Checking strings...
Filtering strings...
String(s) found: 5
Analyzing file 3/4 '\test\life.class'
String(s) found: 3
Analyzing file 4/4 '\test\primes.class'
String(s) found: 3
New TrID's definition written as 'newtype.trid.xml'.
Scanning is generally fast, even for many files. It can be slow if
there isn't at least one small file (under 300-400KB) and the file
contents are virtually random (ZIP files, MP3, JPEG, etc.). Just in case, it's
possible to disable the string scanning (the slow part of the process) using
the switch "-ns". Doing this, the scan will be blazing fast even for a thousand
files.
When finished, TrIDScan will create a file named "newtype.trid.xml" that contains
the identifying details for the files you just scanned.
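To give a rough idea of what a "pattern" is, the sketch below looks for runs of bytes that are identical, at the same offset, in every sample. This is only an illustration of the concept and not TrIDScan's actual algorithm; the function name `common_runs` and the sample bytes are made up for the example.

```python
def common_runs(samples):
    """Return (offset, bytes) runs that are identical in every sample.

    Illustrative only: a naive byte-by-byte comparison, limited to the
    length of the shortest sample.
    """
    if not samples:
        return []
    length = min(len(s) for s in samples)
    first = samples[0]
    runs = []
    start = None  # offset where the current common run began, if any
    for pos in range(length):
        same = all(s[pos] == first[pos] for s in samples)
        if same and start is None:
            start = pos
        elif not same and start is not None:
            runs.append((start, first[start:pos]))
            start = None
    if start is not None:
        runs.append((start, first[start:length]))
    return runs


# Two made-up 8-byte headers that share the Java class magic number.
samples = [
    bytes.fromhex("CAFEBABE0000002E"),
    bytes.fromhex("CAFEBABE00030031"),
]
for pos, run in common_runs(samples):
    print(pos, run.hex().upper())
# prints: 0 CAFEBABE00
#         6 00
```

The more varied the samples, the fewer accidental matches survive, which is why a varied collection gives better definitions.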
You have two steps left at this point: rename the file and edit its
header. In the example given you might rename the file to
"java-class.trid.xml" to indicate it applies to that kind of file. Then,
open the file in a text editor and make the necessary changes to the
header information. The file "java-class.trid.xml" would have a form
similar to:
<TrID ver="2.00">
<Info>
<FileType>Java Bytecode</FileType>
<Ext>CLASS</Ext>
<Mime></Mime>
<ExtraInfo>
<Rem></Rem>
<RefURL></RefURL>
</ExtraInfo>
<User>Marco Pontello</User>
<E-Mail>marcopon@nospam@gmail.com</E-Mail>
<Home>http://mark0.net</Home>
</Info>
<General>
<FileNum>4</FileNum>
<CheckStrings>True</CheckStrings>
<Date>
<Year>2003</Year>
<Month>11</Month>
<Day>14</Day>
</Date>
<Time>
<Hour>03</Hour>
<Min>10</Min>
<Sec>51</Sec>
</Time>
<Creator>TrIDScan/Py v2.02</Creator>
</General>
<FrontBlock>
<Pattern>
<Bytes>CAFEBABE00</Bytes>
<Pos>0</Pos>
</Pattern>
<Pattern>
<Bytes>00</Bytes>
<Pos>6</Pos>
</Pattern>
<Pattern>
<Bytes>00</Bytes>
<Pos>11</Pos>
</Pattern>
</FrontBlock>
<GlobalStrings>
<String>CODE</String>
<String>INIT</String>
<String>JAVA</String>
</GlobalStrings>
</TrID>
Edit the information between the <FileType> tags to indicate what
type of file it is; complete the <Mime> element if the MIME type is known;
correct the information between the <Ext> tags if
TrIDScan has not guessed correctly; and put your contact information in
the <User>, <E-Mail>, and <Home> tags. That's it.
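If you prefer a script to a text editor, the rename and header edit can be sketched with Python's standard `xml.etree.ElementTree` module. This is a minimal sketch, not part of TrIDScan: the function name `fill_header` is hypothetical, and it assumes the header fields are direct children of an <Info> element, as in the sample above. Whether TrID cares about the exact layout of the rewritten file has not been checked here.

```python
import xml.etree.ElementTree as ET


def fill_header(src_path, dst_path, fields):
    """Copy a definition from src_path to dst_path, setting <Info> fields.

    `fields` maps child tag names of <Info> (for example "FileType")
    to the text they should contain.
    """
    tree = ET.parse(src_path)
    info = tree.getroot().find("Info")
    if info is None:
        raise ValueError("no <Info> section in {}".format(src_path))
    existing = {child.tag: child for child in info}
    for tag, text in fields.items():
        element = existing.get(tag)
        if element is None:
            # Tag not present in the header: append it rather than fail.
            element = ET.SubElement(info, tag)
        element.text = text
    tree.write(dst_path, encoding="utf-8", xml_declaration=True)
```

A call such as `fill_header("newtype.trid.xml", "java-class.trid.xml", {"FileType": "Java Bytecode", "Mime": "application/java-vm"})` would write the renamed copy and leave the original untouched; the MIME value here is only an example.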
If you plan to do many scans, you can write your contact
information in a "tridscan.cfg.xml" file and it will be inserted
into all further scans by default.
<?xml version="1.0"?>
<settings>
<User>Marco Pontello</User>
<E-Mail>marcopon@nospam@gmail.com</E-Mail>
<Home>http://mark0.net</Home>
</settings>
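The settings file is small enough to write by hand, but it can also be generated. The sketch below builds the same layout as the sample above; the function name `build_config` is hypothetical, and where TrIDScan looks for "tridscan.cfg.xml" is not stated on this page, so the output location is left to you.

```python
from xml.sax.saxutils import escape


def build_config(user, email, home):
    """Return the text of a tridscan.cfg.xml file with the given contact info."""
    lines = ['<?xml version="1.0"?>', "<settings>"]
    for tag, text in (("User", user), ("E-Mail", email), ("Home", home)):
        # escape() protects characters such as & and < in the values.
        lines.append("<{0}>{1}</{0}>".format(tag, escape(text)))
    lines.append("</settings>")
    return "\n".join(lines) + "\n"
```

For example, `open("tridscan.cfg.xml", "w").write(build_config("Jane Doe", "jane@example.com", "http://example.com"))` would create the file in the current directory (the contact details are placeholders).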
The <Rem> and <RefURL> tags can hold extra information about
the file type: some remarks and a related or reference URL. For example:
<ExtraInfo>
<Rem></Rem>
<RefURL>http://java.sun.com/</RefURL>
</ExtraInfo>
Once you create a new definition, send me a copy of the XML file
to include in the database (see the Contacts page for the address).
It's also possible to "refine" a definition by scanning some files and telling TrIDScan to start
from an existing definition instead of starting from scratch. This way, it is as if
all the files already analyzed (maybe by other users) were scanned too. To use the refining function,
simply add the definition to use as a starting point to the command line. The new def will
replace the previous one, after saving it as "newtype.trid.xml.bak" (this comes in handy if for some
reason you want to recover the previous definition). For example:
D:\TrID>tridscan.py c:\dev\programs\*.class -d java-class.trid.xml
It's also possible to force a rescan of all unique strings, for example when
refining a def made with a previous version of TrIDScan that didn't yet support
this feature. Simply use the switch "-fs" when refining.
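If you drive TrIDScan from another script, the switches mentioned on this page ("-ns", "-d" and "-fs") can be assembled into a command line as sketched below. The helper `tridscan_command` is hypothetical. It assumes the switches may follow the file pattern, as "-d" does in the example above, and it treats "-ns" together with "-fs" as a contradiction, which is an assumption of this sketch rather than documented behavior.

```python
def tridscan_command(files, definition=None, no_strings=False,
                     force_strings=False, script="tridscan.py"):
    """Return the argument list for a TrIDScan run.

    files         -- file pattern to scan, for example r"\\test\\*.class"
    definition    -- existing definition to refine (adds "-d <file>")
    no_strings    -- skip the strings scan (adds "-ns")
    force_strings -- force a rescan of unique strings (adds "-fs")
    """
    if no_strings and force_strings:
        # Assumption: disabling and forcing the strings scan conflict.
        raise ValueError("no_strings and force_strings are mutually exclusive")
    cmd = [script, files]
    if definition is not None:
        cmd += ["-d", definition]
    if no_strings:
        cmd.append("-ns")
    if force_strings:
        cmd.append("-fs")
    return cmd
```

The resulting list can be handed to `subprocess.call`, prefixed with the path of a Python 2.7 interpreter, since that is what TrIDScan/Py requires.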
Making definitions is easy. Use TrIDScan to scan all those data
files you've been creating over the years. Send them all in. This
data will be added to the master database so your work will help
others identify unknown files they might have on their system.
You'll be helping everyone!
If you find TrID a worthwhile project, please tell your friends about it
and this site; this is going to work much better if many people
participate and produce new or better defs!
N.B.
Please be as specific as possible when you describe the file
type. If you have data files from different versions of a program,
try to group them by version and then create a definition for each
version of the program. If, for example, you have Excel files
created by a DOS version and Excel files created by a Windows
version, don't just create a single Excel definition; create one for
each version.
Download
Python: TrIDScan/Py v2.02, 6KB ZIP (Python 2.7.x required)
Python: TrIDDefsPack v1.26b, 3KB ZIP (Python 2.7.x required)
Win32: TrIDScan v1.56, 28KB ZIP (deprecated)
Win32: TrIDDefsPack v1.12, 25KB ZIP (deprecated)
TrID XML defs, 2132KB 7Z (archive with 18250 definitions, 04/10/24)
Change Log
TrIDDefsPack/Py v1.21 - 26/01/16:
+ Rewritten in Python, released under AGPL v3.0 license
+ Support new directory structure for file definitions (.\defs\0,a-z)
TrIDScan/Py v2.01 - 16/03/15:
+ Re-added sorting of files to analyze by size - could lead to a major speed-up
in some circumstances
TrIDScan/Py v2.00 - 25/02/15:
+ Rewritten in Python, released under AGPL v3.0 license
+ Improved accuracy and (generally) speed of strings scanning
+ Can recurse subdirs
+ Added element for MIME type
- Old Win32 version now deprecated
TrIDDefsPack v1.12 - 24/02/11:
* Fixed an inconsistency with the Tag element
TrIDScan/32 v1.56 - 22/11/04:
+ Unique strings are sorted by length
TrIDScan/32 v1.55 - 20/11/03:
+ Unique strings scanning is now case insensitive
+ Possible bug fixed in the refining function
TrIDScan/32 v1.50 - 15/11/03:
+ New unique strings detection function. It can be forced with the
switch "/FS", or disabled with "/NS"
TrIDScan/32 v1.23 - 13/08/03:
+ New definition refining function
+ Added section <ExtraInfo> with elements <Rem> (for some remarks) and
<RefURL> (for a related/reference URL)
TrIDScan/32 v1.00.1 - 13/07/03:
+ Added element <ASCII> with an ASCII dump of the pattern