Mark0's Forum
Software => TrID File Identifier => Topic started by: Ken Jackson on May 19, 2010, 07:51:24 PM
-
I've been trying various PDF ebooks I've found on the web, and I've noticed a lot of them are 50% MATLAB (and 50% PDF). Has anyone else seen this? What is this MATLAB portion? Is it a false reading or is this a new attack vector of some sort?
Thanks!
K
-
Hi!
Matlab & PDF files should be different enough. I doubt that there's anything malicious going on, anyway.
If you could send me one of the ebook files that shows these results, I'll be happy to have a look at it and give a more precise reply.
You could, for example, submit one to the Online TrID file identifier (http://mark0.net/onlinetrid.aspx) and leave feedback so that it comes to my attention.
Thanks,
Bye!
-
Matlab is a VERY weak binary signature:
<Bytes>25</Bytes>
<ASCII> %</ASCII>
<Pos>0</Pos>
It only takes a percent sign at the beginning of a file to be considered Matlab. All valid PDFs start with a percent sign, so they will show as both.
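To make that concrete, here is a minimal Python sketch of checking the one-byte pattern above at position 0. It is not TrID's actual code; the helper matches_at and the sample byte strings are made up for illustration.

```python
def matches_at(data: bytes, pos: int, pattern: bytes) -> bool:
    """Return True if `pattern` occurs in `data` exactly at byte offset `pos`."""
    return data[pos:pos + len(pattern)] == pattern

# The pattern from the definition above: hex 25 at position 0, i.e. "%".
MATLAB_PATTERN = bytes.fromhex("25")

# Illustrative samples (assumptions, not taken from real files).
pdf_header = b"%PDF-1.4\n"                       # start of a PDF file
matlab_source = b"% plot a sine wave\nx = 0:0.1:10;\n"  # MATLAB comment line
zip_header = b"PK\x03\x04"                       # start of a ZIP archive

for name, sample in [("pdf", pdf_header),
                     ("matlab", matlab_source),
                     ("zip", zip_header)]:
    print(name, matches_at(sample, 0, MATLAB_PATTERN))
```

The PDF and the MATLAB sample both pass the check, while the ZIP header does not, so a one-byte pattern alone cannot tell the first two apart.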
-
"Matlab is a VERY weak binary signature"
True, but the definition also includes a string, which lowers the chance of false positives (without eliminating it).
The "MATLAB program" definition is also very broad. It would probably be possible to keep it as a general one and add some more specific definitions, perhaps one per version (I'm speculating here; I'm not really that familiar with MATLAB files).
"It only takes a percent sign at the beginning of a file to be considered Matlab. All valid PDFs start with a percent sign, so they will show as both."
Yes, but if it's a PDF, it will get a higher score, since the file will match a larger pattern specific to PDF ("%PDF").
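As a rough illustration of why a longer pattern wins, here is a toy scoring model in Python. It is only a sketch under stated assumptions: each matching definition earns one point per matched byte, and the percentage is its share of all points. TrID's real scoring also takes strings into account and uses its own weighting, so actual percentages will differ. The function rank and the two-entry definition table are hypothetical.

```python
def rank(data: bytes, definitions: dict) -> dict:
    """Toy model: one point per matched byte, reported as a percentage share."""
    points = {name: len(pattern)
              for name, pattern in definitions.items()
              if data.startswith(pattern)}      # all patterns sit at position 0
    total = sum(points.values())
    if total == 0:
        return {}
    return {name: 100.0 * p / total for name, p in points.items()}

# Hypothetical definition table: name -> byte pattern expected at position 0.
DEFINITIONS = {
    "MATLAB program": b"%",
    "PDF document": b"%PDF",
}

pdf_result = rank(b"%PDF-1.4\n", DEFINITIONS)
matlab_result = rank(b"% a comment\n", DEFINITIONS)
print(pdf_result)
print(matlab_result)
```

In this toy model a PDF matches both definitions, but the four-byte "%PDF" pattern earns four times the points of the single "%", so PDF comes out on top. A file that starts with "%" but not "%PDF" matches only the MATLAB definition.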