Topic: e-Books are 50% MATLAB?

Ken Jackson

  • Guest
e-Books are 50% MATLAB?
« on: May 19, 2010, 07:51:24 PM »
I've been trying various PDF ebooks I've found on the web, and I've noticed a lot of them are 50% MATLAB (and 50% PDF).  Has anyone else seen this?  What is this MATLAB portion?  Is it a false reading or is this a new attack vector of some sort?

Thanks!

K

Mark0

  • Administrator
  • Hero Member
  • *****
  • Posts: 2743
    • Mark0's Home Page
Re: e-Books are 50% MATLAB?
« Reply #1 on: May 20, 2010, 10:57:27 PM »
Hi!

Matlab & PDF files should be different enough. I doubt that there's anything malicious going on, anyway.
If you could send me one of the ebook files that shows this result, I'll be happy to have a look at it and give a more precise reply.
You could, for example, submit one to the Online TrID file identifier and leave feedback so that it will come to my attention.

Thanks,
Bye!

revdrmarsh

  • Newbie
  • *
  • Posts: 11
Re: e-Books are 50% MATLAB?
« Reply #2 on: July 02, 2010, 06:09:52 PM »
Matlab is a VERY weak binary signature:
<Bytes>25</Bytes>
<ASCII> %</ASCII>
<Pos>0</Pos>

It only takes a percent sign at the beginning of a file to be considered Matlab.  All valid PDFs start with a percent sign, so they will show as both.
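
To illustrate the point, here is a minimal Python sketch of a pattern check at a fixed offset. The `matches` helper and the sample byte strings are made up for illustration; this is not TrID's actual code, only the one-byte pattern at position 0 comes from the definition quoted above.

```python
def matches(data: bytes, pattern: bytes, pos: int = 0) -> bool:
    """Return True if `pattern` occurs in `data` exactly at offset `pos`."""
    return data[pos:pos + len(pattern)] == pattern


# Hypothetical sample contents, not taken from real files.
pdf_head = b"%PDF-1.4\n"
matlab_head = b"% plot a sine wave\nx = 0:0.1:10;\n"

# The single byte 0x25 ('%') at position 0, as in the definition above.
MATLAB_SIG = b"\x25"

print(matches(matlab_head, MATLAB_SIG))  # the MATLAB-style comment matches
print(matches(pdf_head, MATLAB_SIG))     # so does the PDF header
```

Both checks succeed, because a one-byte pattern cannot tell a MATLAB comment from the first byte of "%PDF".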

Mark0

  • Administrator
  • Hero Member
  • *****
  • Posts: 2743
    • Mark0's Home Page
Re: e-Books are 50% MATLAB?
« Reply #3 on: July 02, 2010, 06:27:08 PM »
Quote
Matlab is a VERY weak binary signature:

True, but there's also a string, and that lowers the possibility (still present) of false positives.
The "MATLAB program" definition is also very broad: it would probably be possible to keep it as a general one and add some more specific ones, perhaps one for each of the various versions (I'm speculating here; I'm not really that familiar with MATLAB files).

Quote
It only takes a percent sign at the beginning of a file to be considered Matlab.  All valid PDFs start with a percent sign, so they will show as both.

Yes, but if it's a PDF, it will get a higher score, since the file will match a larger pattern specific to PDF ("%PDF").
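
As a rough sketch of why a longer pattern wins, the Python below scores each definition by the number of pattern bytes it matched and turns the scores into percentages. The weighting, the `DEFINITIONS` table, and the `score` and `identify` helpers are all assumptions made for illustration; the real scoring is more involved (it also takes strings into account, for instance), so the exact numbers will differ.

```python
# Hypothetical definitions: name -> list of (offset, pattern) pairs.
DEFINITIONS = {
    "MATLAB program": [(0, b"%")],
    "Adobe PDF": [(0, b"%PDF")],
}


def score(data: bytes, patterns) -> int:
    """Sum the lengths of the matched patterns, or 0 if any pattern fails."""
    total = 0
    for pos, pat in patterns:
        if data[pos:pos + len(pat)] != pat:
            return 0
        total += len(pat)
    return total


def identify(data: bytes) -> dict:
    """Return {name: percentage} for every definition that matches."""
    scores = {name: score(data, pats) for name, pats in DEFINITIONS.items()}
    hits = {name: s for name, s in scores.items() if s > 0}
    total = sum(hits.values())
    return {name: 100.0 * s / total for name, s in hits.items()}


result = identify(b"%PDF-1.4\n")
for name, pct in sorted(result.items(), key=lambda kv: -kv[1]):
    print(f"{pct:5.1f}%  {name}")
```

With these made-up weights a PDF header still matches both definitions, but the four-byte "%PDF" pattern outweighs the single "%" (80% against 20%).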