Author Topic: arn-autoruns-v14.trid.xml for newer Sysinternals Autoruns data  (Read 849 times)

jenderek

  • Sr. Member
  • ****
  • Posts: 375
arn-autoruns-v14.trid.xml for newer Sysinternals Autoruns data
« on: September 25, 2022, 01:58:37 AM »
Hello trid users,

A few days ago I sent an updated definition, arn-autoruns.trid.xml, for
Sysinternals Autoruns data with file name extension ARN. But that definition
applies only to "older" samples. The ARN examples produced by versions 14.0
and 14.09 use a completely different file format, introduced in the middle
of 2021. So now I looked at such newer samples. To distinguish them from the
older version, these samples are now described by a line like:
   <FileType>Sysinternals Autoruns data (v14)</FileType>

So I ran the trid utility on my newer ARN examples. All are described
generically as "Generic OLE2 / Multistream Compound" by docfile.trid.xml (see
appended output/trid-v-old.txt).

For comparison I checked these examples with the file command (version
5.43). Here all examples are also described generically, as "OLE 2 Compound
Document" (see appended output/file-5.43.txt) and with MIME type
application/x-ole-storage (see appended output/file-i-5.43.txt). It was not
able to do a sub-classification, but it displays directory entry names. The
first directory entry is always "Root Entry". The second one apparently
always is Header, encoded as a UTF-16 string. The third and fourth directory
entries are named Items and 0 (see appended output/file-soft-5.42.txt).

For comparison I also ran the file format identification utility DROID (see
https://sourceforge.net/projects/droid/). It also identifies all examples
only generically, as "OLE2 Compound Document Format" with PUID fmt/111.

Because such ARN samples are OLE2 Compound containers, we can inspect them
with suitable tools, for example Michal Mutl's Structured Storage Viewer.
There we see that such examples contain 4 main streams: two (Header and
Items) mentioned by the file command and two others named LargeIcons and
SmallIcons. These observations are expressed inside the global strings
section by lines like:
      <String>H'E'A'D'E'R</String>
      <String>I'T'E'M'S</String>
      <String>L'A'R'G'E'I'C'O'N'S</String>
      <String>R'O'O'T' 'E'N'T'R'Y</String>
      <String>S'M'A'L'L'I'C'O'N'S</String>
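
The relation between the UTF-16 entry names and the global strings above can
be sketched in Python. This assumes, judging only from the strings shown in
this post, that TrID writes strings in upper case and shows every nil byte
as an apostrophe. The helper name trid_string_for_name is made up for
illustration.

```python
def trid_string_for_name(name: str) -> str:
    """Render an OLE2 entry name the way the global strings above look.

    Assumption (inferred from this post, not from TrID documentation):
    strings are upper-cased and every nil byte is shown as an apostrophe.
    """
    raw = name.encode("utf-16-le")   # entry names are stored as UTF-16LE
    raw = raw.rstrip(b"\x00")        # drop the nil byte after the last character
    text = raw.decode("latin-1")     # one character per byte
    return text.upper().replace("\x00", "'")


for entry_name in ("Header", "Items", "LargeIcons", "Root Entry", "SmallIcons"):
    print(trid_string_for_name(entry_name))
```

For the five names this prints the same five strings as listed above.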

Characteristic for OLE2 Compound is the "Root Entry". The 64-byte name field
of this entry is padded with nil bytes. In my ARN examples the directory was
always located at offset 1024. That is expressed inside the front block
section by an XML construct like:
 <Bytes>00000052006F006F007400200045006E00740072007900000000000000
 <ASCII> . . . R . o . o . t .   . E . n . t . r . y</ASCII>
 <Pos>1021</Pos>

When the root directory entry is located at offset 1024, the second entry is
located at offset 1152 (=1024+128). The second entry in all examples was
Header. So this is expressed inside the front block section by an XML
construct like:
 <Bytes>000000000048006500610064006500720000000000000000000000000000000000000
 <ASCII> . . . . . H . e . a . d . e . r</ASCII>
 <Pos>1147</Pos>

When the root directory entry is located at offset 1024, the third entry is
located at offset 1280 (=1024+128+128). The third entry in all examples was
Items. So this is expressed inside the front block section by an XML
construct like:
 <Bytes>0000FFFFFFFF0000000000000000000000000000000000000000000000000000000000000000000000000000000018000000000000004900740065006D0073
 <ASCII> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . I . t . e . m . s</ASCII>
 <Pos>1226</Pos>

When the root directory entry is located at offset 1024, the fourth entry is
located at offset 1408 (=1024+128+128+128). The fourth entry in all my
examples was 0. So this is expressed inside the front block section by an
XML construct like:
 <Bytes>0100000000000000000000000030
 <ASCII> . . . . . . . . . . . . . 0</ASCII>
 <Pos>1395</Pos>
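
The offsets used in the four patterns above follow one rule, which can be
checked with a few lines of Python. The sketch assumes, as in the examples
described here, that the directory starts at offset 1024 and that every
entry is 128 bytes long. The difference between the entry offset and the
Pos value is the number of bytes a pattern contains before the entry name.

```python
DIRECTORY_OFFSET = 1024   # where the directory started in all examples
ENTRY_SIZE = 128          # size of one OLE2 directory entry


def entry_offset(index: int) -> int:
    """Offset of the directory entry with the given index (0 = Root Entry)."""
    return DIRECTORY_OFFSET + index * ENTRY_SIZE


# (entry name, entry index, <Pos> value of the pattern quoted above)
PATTERNS = [("Root Entry", 0, 1021), ("Header", 1, 1147),
            ("Items", 2, 1226), ("0", 3, 1395)]

for name, index, pos in PATTERNS:
    offset = entry_offset(index)
    print(f"{name!r}: entry at {offset}, pattern starts {offset - pos} bytes earlier")
```

So the patterns start 3, 5, 54 and 13 bytes before the respective entry
names.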

The 0 entry is the first subdirectory of the Items entry. There are
hundreds more numbered entries (each increased by one). Maybe this is
expressed inside the global strings section by lines like:
      <String>0'''1</String>
      <String>0'''H</String>
      <String>1'5'9</String>
      <String>2'0'3</String>
      <String>3'C'2</String>
      <String>5'''6</String>
      <String>6'''0</String>
      <String>8'''T</String>
      <String>8'''{</String>

The Header entry consists mainly of the string Autoruns encoded as UTF-16.
This observation is probably expressed inside the global strings section by
a line like:
      <String>A'U'T'O'R'U'N'S</String>

Unfortunately I found not even a small hint about the file format. All
sites show nearly the same information: how to use the tool, but nothing
about the file format. So this is expressed here by lines like:
 <RefURL>
 https://learn.microsoft.com/en-us/sysinternals/downloads/autoruns
 </RefURL>

After running tridscan to generate the definition arn-autoruns-v14.trid.xml
I looked at what XML constructs were created and tried to understand them.
The first XML construct looked like:
 <Bytes>D0CF11E0A1B11AE1000000000000000000000000000000003E000300FEFF0900060000000000000000000000</Bytes>
 <Pos>0</Pos>

This looks like the starting magic of Generic OLE2 / Multistream Compound
files as used by docfile.trid.xml. There it looks like:
 <Bytes>D0CF11E0A1B11AE1</Bytes>
 <Pos>0</Pos>
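
Why the longer pattern is more restrictive than the generic magic can be
illustrated with a small Python sketch. It is not how TrID itself is
implemented: matches_at is a made-up helper, the longer pattern is cut after
its first 34 bytes, and both headers are built synthetically. The second
header uses major version 4 with 4096-byte sectors, which the OLE2 format
also allows.

```python
def matches_at(data: bytes, pos: int, pattern_hex: str) -> bool:
    """Return True if the hex pattern occurs in data exactly at pos."""
    pattern = bytes.fromhex(pattern_hex)
    return data[pos:pos + len(pattern)] == pattern


MAGIC = "D0CF11E0A1B11AE1"                          # generic pattern
LONG = MAGIC + "00" * 16 + "3E000300FEFF09000600"   # start of the long pattern

# Synthetic 512-byte headers: version 3.62 and a hypothetical version 4 file.
header_v3 = bytes.fromhex(LONG).ljust(512, b"\x00")
header_v4 = bytes.fromhex(MAGIC + "00" * 16 + "3E000400FEFF0C000600").ljust(512, b"\x00")

for label, header in (("v3", header_v3), ("v4", header_v4)):
    print(label, matches_at(header, 0, MAGIC), matches_at(header, 0, LONG))
```

The generic magic matches both headers, while the long pattern matches only
the version 3 header.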

I would like to shorten this XML construct, but I was not able to do this.
The bytes 3E000300 mean version 3.62 (minor version 0x3E = 62, major
version 3), as reported by the file command, and the bytes FEFF (the value
0xFFFE stored low byte first) mean little-endian. But I have only a dozen
such ARN examples and found no information about the file format. So I do
not know if this is always true or just caused by lucky circumstances. So I
kept the first XML construct. The same consideration applies to the other
XML constructs.
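
The meaning of those header bytes can be made visible with a short Python
sketch. It assumes the usual OLE2 header layout (8-byte signature, 16 nil
bytes, then minor version, major version, byte order, sector shift and mini
sector shift as 16-bit little-endian values) and rebuilds the start of the
pattern instead of reading a real ARN file.

```python
import struct

# Start of the first pattern: signature, 16 nil bytes, then five 16-bit fields.
header = bytes.fromhex("D0CF11E0A1B11AE1" + "00" * 16 + "3E000300FEFF09000600")

# The five fields start at offset 24 (8-byte signature + 16 nil bytes).
minor, major, byte_order, sector_shift, mini_shift = struct.unpack_from("<5H", header, 24)

print(f"version {major}.{minor}")
print(f"byte order mark 0x{byte_order:04X}")
print(f"sector size {1 << sector_shift}, mini sector size {1 << mini_shift}")
```

With 512-byte sectors, offset 1024 is the start of the second sector after
the header, which fits the directory position seen in the examples.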

The generated definition contained many short nil patterns like:
      <Pattern>
         <Bytes>000000</Bytes>
         <Pos>65</Pos>
      </Pattern>
      <Pattern>
         <Bytes>000000</Bytes>
         <Pos>1017</Pos>
      </Pattern>
I assume that these were generated by lucky circumstances. So I deleted such
constructs.
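
The cleanup was done by hand, but the idea can be sketched in Python. The
sketch assumes that the Pattern elements sit inside a FrontBlock element
(taken from the "Front Block section" wording in this post) and uses an
arbitrary length limit of 4 bytes. The function name and the limit are
made up for illustration.

```python
import xml.etree.ElementTree as ET

# Small hand-made fragment with one real pattern and two short nil patterns.
FRAGMENT = """<FrontBlock>
  <Pattern><Bytes>D0CF11E0A1B11AE1</Bytes><Pos>0</Pos></Pattern>
  <Pattern><Bytes>000000</Bytes><Pos>65</Pos></Pattern>
  <Pattern><Bytes>000000</Bytes><Pos>1017</Pos></Pattern>
</FrontBlock>"""


def drop_short_nil_patterns(front_block, max_len=4):
    """Remove patterns of at most max_len bytes that contain only nil bytes."""
    for pattern in list(front_block.findall("Pattern")):
        raw = bytes.fromhex(pattern.findtext("Bytes"))
        if len(raw) <= max_len and not any(raw):
            front_block.remove(pattern)
    return front_block


front = drop_short_nil_patterns(ET.fromstring(FRAGMENT))
remaining = [(p.findtext("Pos"), p.findtext("Bytes")) for p in front.findall("Pattern")]
print(remaining)
```

Only the pattern at position 0 remains. Longer patterns that merely contain
nil bytes, like the directory entry patterns, are left untouched.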

The generated definition also contained, inside the global strings section,
many short patterns which look like garbage. These look like:
      <String>*1JHEU</String>
      <String>09 'QO</String>
      <String>('''}</String>
      <String>I*'W</String>
      <String>J*'Y</String>
      <String>KS)%</String>
I assume that these were generated by lucky circumstances. So I deleted such
constructs.

The generated definition contains no MIME type. Because ARN files are OLE2
documents I could add the generic MIME type application/x-ole-storage. But I
chose a user-defined one. That is expressed by a line like:
      <Mime>application/x-ms-arn</Mime>

With the new TrID definition all my new ARN examples are now described more
precisely (see appended output/trid-v-new.txt). The TrID definition and the
outputs are stored in the archive arn_new_.zip. I hope that my XML file can
be used in a future version of triddefs.

With best wishes
Jörg Jenderek


Mark0

  • Administrator
  • Hero Member
  • *****
  • Posts: 2743
    • Mark0's Home Page
Re: arn-autoruns-v14.trid.xml for newer Sysinternals Autoruns data
« Reply #1 on: September 27, 2022, 02:40:27 AM »
Thanks!