Hi all,
I intended to use trid to search for certain file types in Opera's (browser) cache and was very happy to see that the files were identified correctly (Opera removed extensions from the filenames in the cache some time ago, which is really dumb, but nothing can be done about it until they change their mind...).
Then I came across a drawback: I simply want to search for e.g. flash files and use "grep" to further process the found files:
$ trid *
File: opr1P0VY
100.0% (.FLV) Flash Video (4000/1)
File: opr1P0ZX
100.0% (.FLV) Flash Video (4000/1)
...
$ trid * | grep FLV
100.0% (.FLV) Flash Video (4000/1)
100.0% (.FLV) Flash Video (4000/1)
...
Not good, because "grep" outputs only the lines matching FLV, which do not contain the file name.
Matching the line with the file name is impossible, because an extension is not present (that's why I need trid in the first place ;o) and the names are synthetic and do not carry any useful information.
At first I expected that using -ae and grepping for ".flv" would do the trick, until I realized that -ae permanently renames the files in place, which is not a very good idea in a cache... ;o)
Of course I could alternatively either
*) use a more complex script to catch the two-line output and parse the filename appropriately, or
*) use -ae and remove the extensions again afterwards (e.g. with rename 's/\..*$//' *)
Both solutions are far from elegant, however.
trid just does not lend itself very smoothly to automated processing of its output, as far as I can see at the moment - perhaps I have overlooked something?
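For what it's worth, the first workaround (catching the two-line output) could look roughly like the sketch below. It assumes the output has exactly the shape shown above, i.e. a "File: <name>" line followed by one or more result lines starting with a percentage; "join_trid" is just a name I made up, and any banner or other lines trid prints are simply ignored.

```shell
# join_trid: read trid-style output on stdin and print each result line
# followed by the name of the file it belongs to.
# Assumes "File: <name>" lines followed by "<percent>% (<ext>) <type> ..." lines.
join_trid() {
  awk '
    /^File: / { name = substr($0, 7); next }   # remember the current file name
    /^[ \t]*[0-9.]+% / {                       # a result line for that file
      sub(/^[ \t]+/, "")                       # drop leading indentation, if any
      print $0 " File: " name
    }
  '
}

# Usage (not run here): trid * | join_trid | grep FLV
```

It works, but having to carry a helper like this around for every script is exactly the kind of thing I would like to avoid.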
But what I think is really needed is an option to influence the output format, such as "put the output on a single line":
$ trid -1 *
100.0% (.FLV) Flash Video (4000/1) File: opr1P0VY
100.0% (.FLV) Flash Video (4000/1) File: opr1P0ZX
...
which would allow for easy grepping (BTW, "-1" is "dash one", not "dash L" ;o)
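To illustrate what a script would have to do with such single-line output, here is a small sketch. Keep in mind that "-1" is only my proposal (trid does not have it as far as I know) and that "names_matching" is a made-up helper name; the sketch only assumes records of the form "<result> File: <name>".

```shell
# names_matching PATTERN: read single-line records of the form
#   "<result> File: <name>"
# on stdin and print only the <name> part of records containing PATTERN.
# PATTERN is a fixed string and is matched against the whole record.
names_matching() {
  grep -F -- "$1" | sed 's/^.* File: //'
}

# Usage with the proposed (hypothetical) option:
#   trid -1 * | names_matching FLV
```

Since the pattern is matched against the whole record, a file whose name happens to contain "FLV" would match as well, and a name containing " File: " would be cut off.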
Still, it would be necessary to use "sed" or "cut" to get the filename only, so a more flexible approach is needed - how about a syntax similar to Unix' "ps" command to select single output fields:
-o (output) [optional] accepts a list of pre-defined column names ("ps" also supports header renaming, which is not really necessary here because there are no headers, but this could be added as well later)
-f (filter) [optional] selects which criteria shall be used to determine whether the line should be included in the output (perhaps this should allow a simple grep-like syntax?):
$ trid -o type,ext,name -f Flash *
"Flash Video" .FLV opr1P0VY
"Flash Video" .FLV opr1P0ZX
...
(see below for the necessity of "")
$ trid -o name -f .FLV *
opr1P0VY
opr1P0ZX
...
$ trid -o path -f video/x-flv *
/home/wolfy/.opera/cache4/opr1P0VY
/home/wolfy/.opera/cache4/opr1P0ZX
...
Note the use of either "Flash", ".FLV" or "video/x-flv" as filter criterion - the whole line is matched to check this.
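Until something like "-o" and "-f" exists (both are only proposals), the behaviour can be approximated outside of trid. The sketch below assumes the two-line output format shown at the top of this post and splits each result into tab-separated columns; "trid_records" is a made-up name. MIME types cannot be emulated this way, because they do not appear in that output.

```shell
# trid_records: turn trid-style two-line output (stdin) into tab-separated
# records "name<TAB>ext<TAB>type", one per result line.
trid_records() {
  awk '
    /^File: / { name = substr($0, 7); next }     # remember the current file name
    /^[ \t]*[0-9.]+% \(/ {
      rest = $0
      sub(/^[ \t]*[0-9.]+% /, "", rest)          # "(.FLV) Flash Video (4000/1)"
      ext = rest
      sub(/\).*/, "", ext); sub(/^\(/, "", ext)  # ".FLV"
      type = rest
      sub(/^\([^)]*\) /, "", type)               # drop "(.FLV) "
      sub(/ \([0-9]+\/[0-9]+\)$/, "", type)      # drop trailing "(4000/1)"
      print name "\t" ext "\t" type
    }
  '
}

# Rough equivalent of "-o name -f Flash": match the whole record, print column 1.
#   trid * | trid_records | grep -F Flash | cut -f1
```

Using a tab as separator sidesteps the quoting problem for type names like "Flash Video", at least as long as file names contain no tabs.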
Speaking of "video/x-flv": MIME types should be supported as well!
Alternatively, dedicated options could be implemented:
-t for a type name (like "Flash Video")
-e for an extension (like "FLV")
-m for a mime type
$ trid -o name,ext -m video/x-flv *
opr1P0VY .FLV
opr1P0ZX .FLV
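The difference to the whole-line filter is that a dedicated option only looks at one column. As a sketch of that idea, assuming tab-separated records with the columns name, extension and type (that layout and the helper name "filter_column" are my own assumptions, not anything trid produces):

```shell
# filter_column N VALUE: read tab-separated records on stdin and keep those
# whose Nth column equals VALUE exactly. Appending "" forces a string
# comparison, so values that look like numbers are not compared numerically.
filter_column() {
  awk -F '\t' -v col="$1" -v want="$2" '($col "") == (want "")'
}

# Rough equivalent of "-o name,ext -e FLV" on such records:
#   ... | filter_column 2 .FLV | cut -f1,2
```

An exact match on the extension column cannot be fooled by a file name that happens to contain "FLV", which is the main argument for dedicated options.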
Another idea: Using "+" instead of "," in the output list could be used for concatenation:
$ trid -o path+ext -m video/x-flv *
/home/wolfy/.opera/cache4/opr1P0VY.FLV
/home/wolfy/.opera/cache4/opr1P0ZX.FLV
Note that now there is no whitespace between the name and the extension - we have a complete canonical file name including extension, which was not present before, but without renaming the file itself as is the case with "-ae".
To allow upper and lower case extensions, two different IDs for extensions could be implemented:
name+ext -> opr1P0VY.flv
name+EXT -> opr1P0VY.FLV
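What "+" and the two extension IDs would do can be sketched as a tiny shell helper. "with_ext" and its "lower"/"upper" argument are names I made up for illustration; the point is only that the name is printed with the extension attached while the file itself is never renamed.

```shell
# with_ext PATH EXT [lower|upper]: print PATH and EXT joined without whitespace.
# The optional third argument forces the case of the extension only.
with_ext() {
  ext=$2
  case ${3:-} in
    lower) ext=$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]') ;;
    upper) ext=$(printf '%s' "$ext" | tr '[:lower:]' '[:upper:]') ;;
  esac
  printf '%s%s\n' "$1" "$ext"
}

# Example: copy a cache file elsewhere under its full name and leave the
# cache untouched ("/tmp/flv" is just a placeholder directory):
#   cp -- "$path" "/tmp/flv/$(with_ext "$(basename "$path")" .FLV lower)"
```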
Additionally, special care must be taken to escape whitespace inside e.g. filenames or type identifiers. This is not done currently, and it is not *really* necessary as long as the output format is fixed.
Still, even now it is not very elegant when trying to process the output further (e.g. with "cut" or "sed").
With -o and the possibility to change the order of columns, proper escaping is important, because whitespace separates the columns:
$ trid -o name,type *
"Some longer filename" "Flash Video"
or
Some\ longer\ filename Flash\ Video
which makes the only unescaped whitespace character the delimiter between the two strings.
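The backslash variant is easy to sketch in shell. This is only an illustration of the escaping rule I have in mind ("escape_ws" is a made-up name); note that backslashes already present in the string have to be doubled as well, otherwise the result could not be decoded unambiguously.

```shell
# escape_ws STRING: print STRING with backslashes doubled and every
# whitespace character prefixed by a backslash.
escape_ws() {
  printf '%s\n' "$1" | sed -e 's/\\/\\\\/g' -e 's/[[:space:]]/\\&/g'
}

# Example: build one "name type" line from two values
#   printf '%s %s\n' "$(escape_ws "$name")" "$(escape_ws "$type")"
```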
Speaking of which: an additional option "-d" could specify a user defined delimiter:
$ trid -d "::" -o name,type *
Some longer filename::Flash Video
This means that it is not whitespace as such that gets escaped in the output, but the current delimiter (which defaults to whitespace).
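As a sketch of that rule: the helper below joins fields with an arbitrary delimiter and escapes only the delimiter (plus backslashes) inside the fields. "join_fields" is a made-up name, and putting a single backslash in front of a multi-character delimiter such as "::" is just one possible convention.

```shell
# join_fields DELIM FIELD...: print the fields joined by DELIM. Inside each
# field, backslashes are doubled and every occurrence of DELIM gets a
# backslash in front of it, so the output can be split unambiguously.
join_fields() {
  awk '
    # Prefix every literal occurrence of d in s with a backslash (no regex).
    function esc(s, d,    out, p) {
      if (d == "") return s
      out = ""
      while ((p = index(s, d)) > 0) {
        out = out substr(s, 1, p - 1) "\\" d
        s = substr(s, p + length(d))
      }
      return out s
    }
    BEGIN {
      d = ARGV[1]
      line = ""
      for (i = 2; i < ARGC; i++) {
        field = esc(esc(ARGV[i], "\\"), d)
        line = line (i > 2 ? d : "") field
      }
      print line
    }
  ' "$@"
}

# Example: join_fields "::" "Some longer filename" "Flash Video"
```

With a single space as delimiter this gives the same result as the backslash-escaped whitespace variant, so the default behaviour falls out of the general rule.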
BTW, when none of -1, -o or -f is given, trid should behave exactly as it does now, so as not to break legacy scripts.
Sorry for the overly long post, I hope I managed to get my points across.