Author Topic: nvram-virtualbox.trid.xml for VirtualBox Nvram File + variant  (Read 777 times)

jenderek

Hello trid users,

Some days or weeks ago I had to migrate to Windows 10. During that process I
lost some VirtualBox machines, so I looked into the file formats related to
VirtualBox. One of these formats uses the filename extension nvram.

So I ran the trid utility on my NVRAM examples. Some of my samples are
described with highest priority as "TAR - Tape ARchive (GNU)" with mime type
application/x-gtar by ark-tar-gnu.trid.xml. With lower priority these
samples are described as "Tape ARchive (file)" with mime type
application/x-tar by ark-tar-file.trid.xml (see appended
output/trid-v-old.txt). Some of my samples are wrongly described as "Adobe
PhotoShop Brush" by abr.trid.xml (see appended
NO_TAR/output/trid-v-old.txt).

For comparison I also ran the file command (version 5.44) on these samples.
Here the tar based samples are recognized. They are described generically as
"POSIX tar archive" (see appended output/file-5.44.txt). When excluding the
internal tar checks with the "-e tar" option I get the same description with
additional information: the first member name is TpmEmuTpms/permall, and the
UID and GID are 0. That is normally user and group root, but the stored names
for this member are someone and somegroup (see appended output
file-soft-5.44.txt).

For comparison I also ran the file format identification utility DROID (see
https://sourceforge.net/projects/droid/). The tar based samples are
described here as "Tape Archive Format" with mime type application/x-tar
by PUID x-fmt/265. The other variant is not recognized.

Some samples are just tar files, so a generic mime type application/x-tar is
OK in principle. On my Windows system OVA samples are associated with the
user defined type application/x-virtualbox-ova, so I chose a similar type for
my samples. This type is now expressed by a line like:
      <Mime>application/x-virtualbox-nvram</Mime>

Some nvram samples are just tar files. That can be verified by listing the
archive members (see appended 7z-l-slt.txt and 7z-l.txt in the output
directory) with commands like:
   7z l -ttar   *.nvram
   7z l -ttar -slt   *.nvram

Then we see the same information as reported by the file command. The first
member is the file TpmEmuTpms\permall, which is writable, readable and
executable only by user someone ("-rwx------") with group somegroup. Now
comes the interesting part: the second member is a file with name efi\nvram
that is readable and writable by all ("-rw-rw-rw-"). Apparently that file is
of the same kind as the non tar variant. So I extracted these members under
the same name as the VirtualBox machine, with an additional no_tar phrase
before the suffix.
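
The listing and extraction steps above can also be sketched in Python with
the standard tarfile module. This is only a minimal sketch, not how 7z or
VirtualBox does it. The member name "efi/nvram" (with a forward slash, as
the file command reports for the first member) is an assumption, and the
function names are made up for illustration.

```python
import io
import tarfile


def list_members(data: bytes):
    """Return (name, permission bits, user name, group name, size) per member."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:") as tar:
        return [
            (m.name, format(m.mode & 0o7777, "04o"), m.uname, m.gname, m.size)
            for m in tar.getmembers()
        ]


def extract_member(data: bytes, name: str = "efi/nvram") -> bytes:
    """Return the raw content of one member, e.g. the embedded nvram image.

    The default member name is an assumption based on the 7z listing.
    """
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:") as tar:
        handle = tar.extractfile(tar.getmember(name))
        if handle is None:
            raise ValueError(f"{name} is not a regular file")
        return handle.read()
```

For a real sample one would read the .nvram file with open(path, "rb") and
pass the bytes to list_members(); the extracted member can then be written
to a file with the no_tar phrase in its name.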

There exist other nvram samples. These are described as "VMware BIOS state"
by nvram.trid.xml. So my inspected samples are of a similar kind, but for
another virtualisation software, namely VirtualBox, and apparently they are
only used for a "UEFI BIOS".

Unfortunately I found no file format description for such VirtualBox nvram
samples. I and other people often complain about Microsoft's behaviour, but
open software is also not the holy grail in every field. Such nvram samples
are used and installed by VirtualBox, but the file type is not officially
registered and you find no sufficient file specification. Some people say
"may the source be with you", but when unpacking the VirtualBox source
packages I get about 1 GB of source text files. Unfortunately I have neither
enough expertise nor enough time to find the needed explanations there.

But in the VirtualBox User Manual, Chapter 8 about VBoxManage contains a
section about the modifynvram command. This command lists and modifies the
NVRAM content of a virtual machine. So I use this as reference URL, which is
expressed by lines like:
 <RefURL>
 https://www.virtualbox.org/manual/ch08.html#vboxmanage-modifynvram
 </RefURL>

The interesting sub command is listvars. It lists all UEFI variables stored
in the virtual machine along with their owner UUID. This can be done for
example by a command line like:
   VBoxManage modifynvram "Win10_22H2de" listvars

So we get the variable names and their GUIDs (see appended
NO_TAR/output/win10Okt2018.txt). I later use this to refine the
definitions. Unfortunately you can not specify the nvram file by an option.

So I first ran tridscan on my non tar based samples to generate
nvram-virtualbox.trid.xml. Then I looked at what is characteristic and what
was generated by accident (too few samples).

In the global strings we find the listed UEFI variables again, but encoded as
UTF-16. Some are obviously identifiable as UEFI variables. These are
expressed by lines like:
   <String>U'E'F'I' 'V'B'O'X' 'H'A'R'D'D'I'S'K' 'V'B</String>
   <String>U'E'F'I' 'V'B'O'X' 'C'D'-'R'O'M' 'V'B</String>
   <String>E'F'I' 'I'N'T'E'R'N'A'L' 'S'H'E'L'L</String>
   <String>B'O'O'T'O'R'D'E'R</String>
   <String>B'O'O'T'0'0'0'0</String>
   <String>B'O'O'T'0'0'0'1</String>
   <String>B'O'O'T'0'0'0'2</String>
   <String>B'O'O'T'0'0'0'3</String>
   <String>B'O'O'T'0'0'0'4</String>
Some are also such variables, but that is not obvious at first glance. These
are expressed by lines like:
   <String>A'T'T'E'M'P'T' '1</String>
   <String>A'T'T'E'M'P'T' '2</String>
   <String>A'T'T'E'M'P'T' '3</String>
   <String>A'T'T'E'M'P'T' '4</String>
   <String>A'T'T'E'M'P'T' '5</String>
   <String>A'T'T'E'M'P'T' '6</String>
   <String>A'T'T'E'M'P'T' '7</String>
   <String>A'T'T'E'M'P'T' '8</String>
   <String>M'T'C</String>
I can query the content of a given UEFI variable with the sub command
queryvar. This looks for example like:
   VBoxManage modifynvram "Win10_test" queryvar --name=Boot0003
   VBoxManage modifynvram "Win10_test" queryvar --name=MTC
I also got some short lines inside the global strings section which look like:
   <String>I'''A</String>
   <String>T'''A</String>
   <String>X'''A</String>
   <String>}'''A</String>
I assume that these are triggered by chance (too few examples). So I deleted
these lines.
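
The relation between a UEFI variable name and its line in the global strings
can be sketched in Python. This is only an illustration under assumptions
inferred from the snippets in this post: the text is upper-cased, stored as
UTF-16 little endian, and each NUL byte is shown as an apostrophe. The
function name is made up, and stripping the trailing apostrophe is a choice
made here to match the lines shown above.

```python
def trid_utf16_string(name: str) -> str:
    """Render a variable name the way it seems to appear in a <String> line."""
    raw = name.upper().encode("utf-16-le")           # 'B', 0x00, 'O', 0x00, ...
    text = raw.decode("latin-1").replace("\x00", "'")  # NUL bytes shown as '
    return text.rstrip("'")                          # drop the trailing NUL marker


if __name__ == "__main__":
    for variable in ("BootOrder", "Boot0000", "Attempt 1", "MTC"):
        print(f"<String>{trid_utf16_string(variable)}</String>")
```

Names reported by listvars can be passed through such a helper to see which
generated lines are real variable names and which are probably accidental.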

Then there are some UTF-16 based lines which are apparently neither variable
names nor contents. These look like:
   <String>-'1'A'2'B'3'C'4'D</String>
Because I am unsure about the meaning I keep that line.

Then there are some short ASCII looking lines inside the global strings
section where I am unsure how relevant they are. So I keep such lines. These
look like:
      <String>EI2YD</String>
      <String>_FVH</String>

The first 64 bytes seem to be constant. So this is expressed by an XML
construct like:
   <Bytes>000000000000000000000000000000008D2BF1FF96768B4CA985
   <ASCII> . . . . . . . . . . . . . . . . . + . . . v . L . .
   <Pos>0</Pos>
Nothing looks like a magic pattern except for the 4 byte sequence _FVH and the
2 byte sequence AA55 at the end. I generated all my samples on the same
machine running Windows 8 and 10. Maybe this pattern contains something like
a UUID which is constant in my case. So other users should try to improve
this by generating NVRAM samples on machines with non-Windows operating
systems. Unfortunately I can not do this, because my few other machines
either have no UEFI, like the Raspberry Pi, or are old i686 architectures
with classic BIOS instead of UEFI firmware.
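
Whether the leading bytes are really constant can be checked over a set of
samples with a few lines of Python. This is a minimal sketch of the idea and
not what tridscan does internally. The function names are made up, and the
byte order of the AA55 marker is taken literally from the text above.

```python
MARKERS = {"_FVH": b"_FVH", "AA55": bytes.fromhex("AA55")}


def common_prefix_length(samples) -> int:
    """Return how many leading bytes are identical in all samples."""
    if not samples:
        return 0
    shortest = min(len(sample) for sample in samples)
    for index in range(shortest):
        first = samples[0][index]
        if any(sample[index] != first for sample in samples[1:]):
            return index
    return shortest


def find_markers(head: bytes) -> dict:
    """Return the offset of each marker inside head, or -1 if it is absent."""
    return {label: head.find(pattern) for label, pattern in MARKERS.items()}
```

Running common_prefix_length() over samples created on different hosts would
show whether the constant part shrinks, and find_markers() on the first
64 bytes shows where the two candidate magic sequences are located.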
 
In the front block I get many very short nil patterns like:
   <Bytes>00</Bytes>
   <Pos>103</Pos>
   ...
   <Bytes>00</Bytes>
   <Pos>2043</Pos>
I assume that these are triggered by chance. So I deleted these lines.

In the front block I also get many short nil patterns like:
   <Bytes>000000</Bytes>
   <Pos>137</Pos>
   ...
   <Bytes>0000000000</Bytes>
   <Pos>1335</Pos>
I assume that these are triggered by chance, too. So I deleted these lines.

Some lines in Global Strings are triggered by EFI variables with
content. This can be seen when running a command like:
   VBoxManage modifynvram Win10_22H2de queryvar --name=PlatformLang
Here I get the 2 byte sequence en. I did the same for Lang. There I get the
3 byte sequence eng. So apparently this means that my examples are firmware
for the English language. Assuming that other languages are also possible
(French for example), these two lines now become:
   <String>P'L'A'T'F'O'R'M'L'A'N'G'</String>
   <String>L'A'N'G'</String>
instead of
   <String>P'L'A'T'F'O'R'M'L'A'N'G'''EN</String>
   <String>L'A'N'G'''ENG</String>      

When I list the values of the boot entries by a command like:
   VBoxManage modifynvram "Win10_22H2de" queryvar --name=Boot0000
I get values that are expressed inside the trid definition by lines like:
   <String>U'E'F'I' 'V'B'O'X' 'C'D'-'R'O'M' 'V'B</String>
   <String>U'I'A'P'P</String>
   <String>U'E'F'I' 'V'B'O'X' 'H'A'R'D'D'I'S'K' 'V'B</String>
   <String>E'F'I' 'I'N'T'E'R'N'A'L' 'S'H'E'L'L</String>

When I list the values of the Attempt entries by a command like:
   VBoxManage modifynvram "Win10_22H2de" queryvar  "--name=Attempt 1"
I get values that are expressed as ASCII inside the trid definition by lines
like:
   <String>ATTEMPT 1</String>
   <String>ATTEMPT 2</String>
   <String>ATTEMPT 3</String>
   <String>ATTEMPT 4</String>
   <String>ATTEMPT 5</String>
   <String>ATTEMPT 6</String>
   <String>ATTEMPT 7</String>
   <String>ATTEMPT 8</String>

As described before, the tar based samples contain efi\nvram as second
member. This member is described by the nvram-virtualbox.trid.xml
definition. The first member is TpmEmuTpms\permall. Unfortunately I do not
know under which conditions the tar based or the other variant is generated.
The tar based samples were generated by VirtualBox version 7.0.8.
The tar characteristics are expressed inside the Global Strings section by
lines like:
 <String>0''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
 ''''''''''''''''''''''''''''''USTAR  'SOMEONE'''''''''''''''''''''''''SOMEGROUP</String>
 <String>PERMALL''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
 ''''''''''''''''''0100700'0000000'0000000'000000</String>
 <String>NVRAM''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
 '''''''''''''''''''''''''0100</String>

The same tar specifics are also expressed inside the Front Block section by
lines like:
 <Bytes>54706D456D7554706D732F7065726D616C6C00000000000000000000000000000000000000
 <ASCII> T p m E m u T p m s / p e r m a l l . . . . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . . . . . . . . . .
 0 1 0 0 7 0 0 . 0 0 0 0 0 0 0 . 0 0 0 0 0 0 0 . 0 0 0 0 0 0</ASCII>
 <Pos>0</Pos>
 <Bytes>003000000000000000000000000000000000000000000000000000000
 <ASCII> . 0 . . . . . . . . . . . . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
 . . . . . . . u s t a r     . s o m e o n e . . . . . . . . . . .
 . . . . . . . . . . . . . . s o m e g r o u p . . . . . . . . . .
 <Pos>155</Pos>

As described before, this means that the sample belongs to user someone and
group somegroup and is writable, readable and executable by the user only
(700). Unfortunately I created all samples on the same machine running
Windows 8 and 10. Maybe such a pattern shrinks to the ustar magic with more
samples. So other users should try to improve this by generating NVRAM
samples on machines with non-Windows operating systems.
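
The fields behind these patterns can be read directly from the first 512 byte
tar header block. The following Python sketch uses the standard ustar header
offsets (name at 0, mode at 100, magic at 257, user name at 265, group name
at 297). The function name is made up for illustration.

```python
def read_owner_fields(header: bytes) -> dict:
    """Decode name, mode, magic, user and group from a 512 byte tar header."""

    def text(start: int, length: int) -> str:
        # Fields are NUL terminated ASCII strings, sometimes space padded.
        field = header[start:start + length].split(b"\0", 1)[0]
        return field.decode("ascii", "replace").strip()

    return {
        "name": text(0, 100),
        "mode": int(text(100, 8) or "0", 8),   # octal digits, e.g. 0100700
        "magic": text(257, 8),                 # "ustar" for tar archives
        "uname": text(265, 32),
        "gname": text(297, 32),
    }
```

With the header of one of the tar based samples this should report someone
and somegroup, and mode & 0o777 should give the permission bits 700.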

At offset 124 the size of the first archive member is stored as an octal
number in an ASCII string of size 12. Afterwards, at offset 136, the
modification time of the first file is stored as an octal number in an ASCII
string of size 12. In my samples the mtime fields do not differ much. So this
was expressed by an XML construct like:
   <Bytes>003134</Bytes>
   <ASCII> . 1 4</ASCII>
   <Pos>135</Pos>
Assuming that other modification times are also possible, this now becomes:
   <Bytes>00</Bytes>
   <Pos>135</Pos>

Afterwards, at offset 148, the checksum is stored as an octal number in an
ASCII string of size 8. In my samples the first member permall does not
differ much, so I get similar checksums (like 0014722, 0014736, 0014732).
This was expressed by an XML construct like:
   <Bytes>003030313437</Bytes>
   <ASCII> . 0 0 1 4 7</ASCII>
   <Pos>147</Pos>
Assuming that other checksums are also possible, this now becomes:
   <Bytes>00</Bytes>
   <Pos>147</Pos>

At higher offsets I get short nil sequences. These are described by XML
constructs like:
   <Pattern>
      <Bytes>0000000000</Bytes>
      <Pos>1030</Pos>
   </Pattern>
   ...
   <Pattern>
      <Bytes>00</Bytes>
      <Pos>1916</Pos>
   </Pattern>
I assume that these are triggered by chance (too few examples). So I deleted
such patterns.

Because the content of efi\nvram is the same as in the other variant, we
should find the same items inside the global strings section. This is true
for many lines, but not for all. Some different UTF-16 looking lines are:
      <String>U'E'F'I' 'V'B'O'X' 'C'D'-'R'O'M' 'V'B'1'-'1'A'2'B'3'C'4'D</String>
      <String>T'C'G'2'_'D'E'V'I'C'E'_'D'E'T'E'C'T'I'O'N</String>
      <String>T'C'G'2'_'C'O'N'F'I'G'U'R'A'T'I'O'N</String>
      <String>T'C'G'2'_'V'E'R'S'I'O'N'''1.3</String>
      <String>V'E'N'D'O'R'K'E'Y'S'N'V</String>
      <String>C'U'S'T'O'M'M'O'D'E</String>
      <String>B'O'O'T'0'0'0'6</String>
      <String>CZC'E'R'T'D'B</String>
      <String>0'8'0'0'2'7</String>
      <String>.'E'F'I</String>
      <String>H'D'D'P</String>
      <String>I'N'D</String>
      
Again I list all UEFI variables, for example by a command line like:
   VBoxManage modifynvram "Mint-21.1-10b" listvars
Then I get lines like:
TCG2_DEVICE_DETECTION            {6339d487-26ba-424b-9a5d-687e25d740bc}
TCG2_CONFIGURATION               {6339d487-26ba-424b-9a5d-687e25d740bc}
TCG2_VERSION                     {6339d487-26ba-424b-9a5d-687e25d740bc}
VendorKeysNv                     {9073e4e0-60ec-4b6e-9903-4c223c260f3c}
CustomMode                       {c076ec0c-7028-4399-a072-71ee5c448b9f}
Boot0005                         {8be4df61-93ca-11d2-aa0d-00e098032b8c}
Boot0006                         {8be4df61-93ca-11d2-aa0d-00e098032b8c}
0800270A4323                     {5b446ed1-e30b-4faa-871a-3654eca36080}
0800271468C8                     {5b446ed1-e30b-4faa-871a-3654eca36080}
HDDP                             {fab7e9e1-39dd-4f2b-8408-e20e906cb6de}

Typically you have a few boot entries, like one for HARDDISK, another for
CD-ROM and also an entry for INTERNAL SHELL. These are labeled starting
with BOOT0000. So for the non tar variant I get 5 entries, the highest with
name BOOT0004. For the tar based variant the highest entry has name BOOT0006.
When showing the content by a command like:
   VBoxManage modifynvram "Mint-21.1-10b" queryvar  "--name=Boot0005"
I get, encoded as UTF-16, the entry name like ubuntu and the loader name with
path like \EFI\ubuntu\shimx64.efi.
I do not know what the lower limit for the number of entries is, but assuming
5 like in the non tar variant I can delete the following lines:
   <String>B'O'O'T'0'0'0'6</String>
   <String>.'E'F'I</String>

As with the language variables, I can shrink some strings to just the
variable names or content. So such lines now become:
   <String>C'E'R'T'D'B</String>
   <String>U'E'F'I' 'V'B'O'X' 'C'D'-'R'O'M' 'V'B'</String>
   <String>T'C'G'2'_'V'E'R'S'I'O'N'</String>
      
In the tar based variant I got more EFI variables like:
VendorKeysNv           
CustomMode             
0800270A4323           
0800271468C8           
HDDP                   
TCG2_DEVICE_DETECTION   
TCG2_CONFIGURATION     
TCG2_VERSION           
I do not know if these are optional or required. So I keep the corresponding
lines.

Then some UTF-16 looking strings are left, like:
   <String>0000000'0000000'00002040000'14</String>
   <String>I'N'D</String>
I do not know what triggers these, but they do not occur in the non tar
based variant. So I deleted these lines.

Then some "short" ASCII looking strings are left. Some are also found in the
non tar based variant, like:
   <String>EI2YD</String>
   <String>M'T'C</String>
   <String>_FVH</String>
So I keep these lines. Then I get a line triggered by the first TAR member
name TpmEmuTpms\permall like:
   <String>TPMEMUTPMS</String>
So I keep this line.

Then some lines are left which I did not find in the non tar variant, like:
   <String>GBRD'0</String>
   <String>00147</String>
   <String>G'''A</String>
   <String>I'''A</String>
   <String>T'''A</String>
   <String>V'B'3</String>
   <String>X'''A</String>
   <String>Z'''A</String>
   <String>0013</String>
So I deleted such lines.

With the 2 new trid definitions all my NVRAM examples are now described (see
appended output/trid-v-new.txt). The TrID definitions and the output are
stored in the archive nvram_.zip. I hope that my definitions can be used in a
future version of triddefs.

With best wishes
Jörg Jenderek

Mark0

Re: nvram-virtualbox.trid.xml for VirtualBox Nvram File + variant
« Reply #1 on: July 03, 2023, 03:11:24 PM »
Thanks!
I refined the non-tar version with some other .nvram files, and then created a new tar one to refine that too.