Hello trid users,
a few days ago I ran Piriform CCleaner. Under its Registry option it offers to
scan for unused file extensions, and there it complained about the file name
suffix SCRIPT. So I checked my systems for such files. In this post I will
only consider SCRIPT samples belonging to the Windows rescue DVD creation tool
"c't-Notfall-Windows" from the German computer magazine c't. That tool is based
on PhoenixPE, found on GitHub. PhoenixPE uses the "next-generation" PEBakery
engine to build the Windows recovery system. The PEBakery documentation, in
chapter 2 "Scripting Reference" under Project Components, Scripts, lists a
section "File Format". I use it as the reference URL inside the new TrID
definition, with a line like:
<RefURL>https://github.com/pebakery/pebakery-docs/blob/master/Projects/ScriptFiles.md</RefURL>
According to that documentation, a .script file must contain a Main section
that defines the Title and Level of the script in order for it to appear in
PEBakery's Project Tree. In hdtune.Script or "McAfee Stinger.script", for
example, it looks like:
[Main]
Title=McAfee Stinger
Level=5
In the generated TrID definition this shows up inside the global strings block
as lines like:
<String>MAIN</String>
<String>TITLE</String>
<String>LEVEL</String>
Looking at these lines, you will notice that in my opinion they are not
distinctive enough and could also appear in other INI-like files.
The Scripting Reference explains variables under the item Variables. A variable
has a unique name that identifies it, and the name must be enclosed in
percent signs. Typical script lines look like:
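A minimal Python sketch of the [Main] requirement quoted above, for checking
samples by hand. It is not how PEBakery or TrID work internally. The function
names are hypothetical, and treating section and key names as case-insensitive
is an assumption.

```python
def main_section_keys(text):
    """Return the key/value pairs found in the [Main] section of a script text."""
    keys = {}
    in_main = False
    for raw in text.splitlines():          # splitlines() handles CRLF and LF alike
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # Assumption: section names are compared case-insensitively.
            in_main = line[1:-1].strip().lower() == "main"
            continue
        if in_main and "=" in line and not line.startswith("//"):
            key, _, value = line.partition("=")
            keys[key.strip().lower()] = value.strip()
    return keys


def has_required_main(text):
    """True if [Main] defines both Title and Level."""
    keys = main_section_keys(text)
    return "title" in keys and "level" in keys


sample = "[Main]\nTitle=McAfee Stinger\nLevel=5\n"
print(has_required_main(sample))               # True
print(has_required_main("[Main]\nTitle=x\n"))  # False, Level is missing
```

Any INI-like file with a [Main] section, a Title and a Level passes this check,
which illustrates why these three strings alone are weak evidence.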
%SetupFile%=hdtune_255.exe
Echo,"Processing %ScriptTitle%..."
WriteInterface,Value,%ScriptFile%,Interface,cb_RunFromRam,False
If,Not,ExistFile,"%ProgramsCache%\%ProgramFolder%\%SetupFile%",Run,%ScriptFile%,DownloadProgram
These observations show up inside the global strings block as a line like:
<String>%SCRIPT</String>
Unfortunately a few script samples from the magazine c't (about 1 of 105), like
202-PreLoad-ct.script, use only other variables in lines such as:
System,RefreshScript,"%ProjectDir%\Components\330-ImDisk.script"
FileCopy,"%BaseDir%\custom\PENetwork.ini","%ProgramsCache%\PENetwork\PENetwork.ini"
When tridscan is run with this sample included, the above <String> line
vanishes and recognition probably becomes too unspecific. That was a dilemma
for me. In the end I kept this line. So recognition is hopefully specific
enough, but this one script sample is then not recognized.
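The dilemma can be reproduced with a small Python sketch. It assumes that a
global string counts as found when it occurs anywhere in the upper-cased file
content; that is an assumption about the matching, and the helper name is
hypothetical.

```python
def contains_global_string(data: bytes, needle: str) -> bool:
    # Assumption: global strings are matched against upper-cased content.
    return needle.upper().encode("ascii") in data.upper()


# A typical script line from the samples quoted above.
typical = b'Echo,"Processing %ScriptTitle%..."\r\n'

# The two lines quoted from 202-PreLoad-ct.script.
preload = (
    b'System,RefreshScript,"%ProjectDir%\\Components\\330-ImDisk.script"\r\n'
    b'FileCopy,"%BaseDir%\\custom\\PENetwork.ini",'
    b'"%ProgramsCache%\\PENetwork\\PENetwork.ini"\r\n'
)

print(contains_global_string(typical, "%SCRIPT"))  # True
print(contains_global_string(preload, "%SCRIPT"))  # False
```

The second result is False because "RefreshScript" and ".script" contain
SCRIPT but not directly after a percent sign.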
All samples start with a License section, and the second line is a long
decorative comment line built from 105 slash characters. That shows up
inside the Front Block section as XML constructs like
<Bytes>5B4C6963656E73655D</Bytes>
<ASCII> [ L i c e n s e ]</ASCII>
<Pos>0</Pos>
<Bytes>2F2F2F2F2F2F2F2F2F2F2F2F2F2F2F2F2F2F2F2F2F2F2F2F2F2F
<ASCII> / / / / / / / / / / / / / / / / / / / / / / / / / /
<Pos>11</Pos>
Most samples use the Carriage Return + Line Feed (CRLF) combination as line
separator, but a few samples like hdtune.Script and "McAfee Stinger.script"
use the UNIX-style Line Feed (LF) character alone (see appended
output/file-soft-5.45.txt). That is the reason why the current file command
does not recognize these samples as INI-like: it explicitly checks for a
Carriage Return after the closing bracket. The c't samples use a mixture of
CRLF and LF.
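To see which line separators a sample actually uses, a short Python sketch
like the following can help. The function name is hypothetical; it only counts
byte sequences and makes no claim about how the file command decides.

```python
def line_ending_counts(data: bytes) -> dict:
    """Count CRLF pairs, lone LF and lone CR bytes in raw file content."""
    crlf = data.count(b"\r\n")
    lone_lf = data.count(b"\n") - crlf   # LF not preceded by CR
    lone_cr = data.count(b"\r") - crlf   # CR not followed by LF
    return {"CRLF": crlf, "LF": lone_lf, "CR": lone_cr}


mixed = b"[License]\r\n//\n// This script is distributed under the MIT License.\n"
print(line_ending_counts(mixed))  # {'CRLF': 1, 'LF': 2, 'CR': 0}
```

For a real sample, read the file in binary mode, for example
line_ending_counts(open("hdtune.Script", "rb").read()).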
These XML constructs may be distinctive enough to describe such scripts, but
the documentation says nothing about License being required or coming on the
first line. So I can only hope that all new scripts are created by using an
existing script as a template.
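The two front block patterns can be tried out with a minimal Python sketch.
The function name and the default run length of 26 slashes are assumptions;
26 is only the number of slash bytes visible in the excerpt above, not
necessarily the full length used in the definition.

```python
def matches_front_block(data: bytes, slash_run: int = 26) -> bool:
    """Check '[License]' at offset 0 and a run of slashes starting at offset 11."""
    starts_with_license = data.startswith(b"[License]")
    # slash_run is a hypothetical parameter; see the note above.
    slashes_at_11 = data[11:11 + slash_run] == b"/" * slash_run
    return starts_with_license and slashes_at_11


crlf_sample = b"[License]\r\n" + b"/" * 105 + b"\r\n"
lf_sample = b"[License]\n" + b"/" * 105 + b"\n"

print(matches_front_block(crlf_sample))            # True
print(matches_front_block(lf_sample))              # True
print(matches_front_block(b"[Main]\r\nTitle=x\r\n"))  # False
```

With LF line ends the slash line already starts at offset 10, so offset 11 is
still a slash and the pattern matches both variants.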
The next lines look like:
//
// This script is distributed under the MIT License.
// This script is part of the PhoenixPE project and distributed under the MIT License.
Because some samples use CRLF (2 characters) and some use LF (1 character),
these text fragments occur at different offsets. As a result only a few
single bytes remain common to all samples, which shows up inside the front
block section as XML constructs like:
<Pattern>
<Bytes>2F</Bytes>
<ASCII> /</ASCII>
<Pos>120</Pos>
</Pattern>
<Pattern>
<Bytes>20</Bytes>
<Pos>137</Pos>
</Pattern>
<Pattern>
<Bytes>2F</Bytes>
<ASCII> /</ASCII>
<Pos>1581</Pos>
</Pattern>
If an arbitrary other comment can occur on these lines, then these constructs
vanish as well.
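The offset shift can be illustrated with a short Python sketch. The header
below is a simplified layout assumed for illustration (License line, 105
slashes, an empty comment, one license sentence); the real samples contain
more lines, so absolute positions will differ from the <Pos> values above.

```python
# Simplified header, assumed for illustration only.
HEADER_LINES = [
    "[License]",
    "/" * 105,
    "//",
    "// This script is distributed under the MIT License.",
]


def offset_of(fragment: str, newline: str) -> int:
    """Byte offset of a text fragment when the header uses the given line end."""
    data = newline.join(HEADER_LINES).encode("ascii")
    return data.find(fragment.encode("ascii"))


crlf_pos = offset_of("This script", "\r\n")
lf_pos = offset_of("This script", "\n")
print(crlf_pos, lf_pos, crlf_pos - lf_pos)  # the difference is 3
```

Each preceding line break adds one byte in the CRLF variant, so after three
line breaks the same text sits 3 bytes later.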
In the global strings section some lines occur that are definitely there only
by coincidence (the samples share the same date range), like:
<String>COPYRIGHT (C) 20</String>
<String>2022</String>
So I shortened or deleted such lines.
The section still contains many lines like:
<String>OF THEIR RESPECTIVE AUTHORS AND MAY BE SUBJECT TO THEIR OWN LICENSE AGREEMENT.</String>
<String>SUBJECT TO THE FOLLOWING CONDITIONS</String>
<String>THIS SCRIPT IS</String>
<String>OR SELL</String>
<String>TO DEAL</String>
<String>ECHO</String>
<String>TRUE</String>
<String>FALSE</String>
Unfortunately I do not have enough time to inspect which text fragments are
really required and which were only generated by coincidence. So I keep them
for the moment.
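Such an inspection could be started with a small Python sketch like the one
below. It is not part of tridscan. The function names are hypothetical, the
upper-cased substring match is an assumption, and the sample data is made up
for the demonstration.

```python
def occurs_in(data: bytes, candidate: str) -> bool:
    # Assumption: strings are matched against upper-cased content.
    return candidate.upper().encode("ascii") in data.upper()


def split_candidates(candidates, samples, unrelated):
    """Sort candidate strings into keep / generic / not shared by all samples."""
    keep, generic, not_shared = [], [], []
    for cand in candidates:
        if not all(occurs_in(s, cand) for s in samples):
            not_shared.append(cand)      # missing from at least one sample
        elif any(occurs_in(u, cand) for u in unrelated):
            generic.append(cand)         # also found in unrelated files
        else:
            keep.append(cand)
    return keep, generic, not_shared


# Made-up stand-ins for real sample files and for unrelated INI/batch files.
samples = [
    b'// This script is distributed under the MIT License.\nEcho,"a"\nx=True\n',
    b'// This script is part of PhoenixPE\nEcho,"b"\ny=False\nz=True\n',
]
unrelated = [b"[settings]\nenabled=true\n@echo off\n"]

print(split_candidates(["THIS SCRIPT IS", "ECHO", "TRUE", "FALSE"],
                       samples, unrelated))
# (['THIS SCRIPT IS'], ['ECHO', 'TRUE'], ['FALSE'])
```

Strings that land in the generic list are candidates for removal, since they
do not help to tell these scripts apart from other text files.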
Because all script files are just simple text files, I chose the corresponding
generic MIME type. That is expressed by a line like:
<Mime>text/plain</Mime>
For comparison I also ran the file command (version 5.45) on my SCRIPT
samples. Most samples are described there as "Generic INItialization
configuration", with the additional information that one section name is
[Main] (see appended output/file-5.45.txt). For these samples a wrong suffix
(ini/inf) is also shown (see appended output/file-ext-5.45.txt), and they are
given the MIME type application/x-wine-extension-ini (see appended
output/file-i-5.45.txt). A few samples like hdtune.Script or "McAfee
Stinger.script" are described only generically as text. Because of this
classification, these samples are shown with the generic MIME type text/plain.
With the new TrID definition my SCRIPT samples are now described more
specifically instead of as generic INI (see appended trid-v-new.txt in output).
The TrID definition, some samples and the output are stored in script_.zip.
I hope that my definition can be used in a future version of triddefs.
With best wishes
Jörg Jenderek