Author Topic: sdg.trid.xml for StarOffice Gallery (*.sdg)  (Read 2321 times)

jenderek

sdg.trid.xml for StarOffice Gallery (*.sdg)
« on: October 18, 2019, 11:21:00 PM »
Hello trid users,

some days ago I ran TrID inside the LibreOffice directories. The sub
directory gallery stores the Office gallery files. The samples with file
name extension sdg are described as "Unknown!" (see appended
output/trid-old.txt).

So I ran tridscan to generate a definition file sdg.trid.xml. Unfortunately
I found no documentation about this gallery file format. At least there is
one note on the sub page about StarOffice binary formats on the Archive Team
file formats site. This is expressed by the reference line:
   <RefURL>
   http://fileformats.archiveteam.org/wiki/StarOffice_binary_formats
   </RefURL>

Apparently such SDG files always seem to start with an 11 byte header-like
structure beginning with the 4 byte ASCII string SGA3. This is expressed by
the XML construct:
   <Bytes>534741330400</Bytes>
   <ASCII> S G A 3</ASCII>
   <Pos>0</Pos>
I do not know whether the digit 3 means something like a version, so I also
looked at older versions. The same gallery format is used in old StarOffice
5.2, dated about May 2000, and the format is still used in the current
LibreOffice version 6.3.2.2.

After the header-like structure comes a short byte sequence starting with
the ASCII string BM. This is expressed by the XML construct:
   <Bytes>0001424D</Bytes>
   <ASCII> . . B M</ASCII>
   <Pos>9</Pos>

Some sites on the net call SDG samples the StarOffice Gallery image format.
According to the Wikipedia page about the BMP file format, the string BM is
the start magic of a Windows bitmap with file name extension bmp. So I
assume that the second structure element is a little Windows bitmap graphic.
The third element mainly consists of a file name or URL pointing to a
gallery element like a GIF picture or WAV audio file. If a gallery contains
more elements, then these 3 structures are appended for every further
element.

So I tried to extract the bitmaps from the SDG samples. But the usual
graphic tools like XnView, GIMP and ImageMagick refused to open the
extracted images. At least IrfanView could open such extracted images and
show some information, but it displayed garbage.

So with the help of the Wikipedia page I tried to understand and clean up
the observed patterns. The first image pattern looks like:
   <Bytes>000000000000</Bytes>
   <Pos>15</Pos>

The size of the BMP file in bytes is stored at offsets 13-16. If the stored
size is low, as in little images, then the 2 upper bytes are null. But there
is no guarantee that this is always true. So I removed this pattern.
Furthermore the stored values seem to be significantly higher than the
extracted image size. The next 4 bytes are reserved for variants like OS/2
pointer bitmaps. So for Windows bitmaps only null values are found here. In
the end I removed the above-mentioned pattern.

The next pattern looks like:
   <Bytes>000028000000</Bytes>
   <ASCII> . . (</ASCII>
   <Pos>23</Pos>
The offset of the bitmap image data is stored at offsets 21-24. By accident,
in the inspected examples only low values occur, where the 2 upper bytes are
null. So I removed that pattern part. The size of the DIB header is stored
at offsets 25-28. Here only the value 28h is found, which means the bitmap
is the Windows 3.x variant with a 40 byte DIB header. So the extracted
images are described by bitmap-bmp-v3.trid.xml. The refined pattern becomes:
   <Bytes>28000000</Bytes>
   <ASCII> (</ASCII>
   <Pos>25</Pos>
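
To illustrate the DIB header size check, here is a minimal Python sketch. It assumes the size field really sits at offset 25, as observed above. The function name dib_variant and the test buffer are made up for illustration; the size-to-variant table holds the well-known values from the BMP format description.

```python
import struct

# Offset of the DIB header size field, as given in the post (bytes 25-28).
DIB_SIZE_OFFSET = 25

# Well-known DIB header sizes from the BMP format description.
DIB_VARIANTS = {
    12: "BITMAPCOREHEADER (OS/2 1.x)",
    40: "BITMAPINFOHEADER (Windows 3.x)",
    108: "BITMAPV4HEADER",
    124: "BITMAPV5HEADER",
}


def dib_variant(data: bytes) -> str:
    """Return a label for the DIB header size stored at offset 25."""
    if len(data) < DIB_SIZE_OFFSET + 4:
        raise ValueError("buffer too short for the DIB header size field")
    # BMP fields are little-endian unsigned 32-bit integers.
    (size,) = struct.unpack_from("<I", data, DIB_SIZE_OFFSET)
    return DIB_VARIANTS.get(size, f"unknown ({size} bytes)")


# Synthetic buffer, NOT a real SDG sample: zeros with 28 00 00 00 at offset 25.
sample = bytearray(64)
sample[25:29] = bytes.fromhex("28000000")
print(dib_variant(bytes(sample)))
```

For a real sample one would pass the bytes read from the sdg file instead of the synthetic buffer.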

The next pattern looks like:
   <Bytes>000000</Bytes>
   <Pos>30</Pos>
The bitmap width in pixels is stored at offsets 29-32. For the inspected
images this value is very low, so the 3 upper bytes are null. But there is
no guarantee that this is always true. So I deleted that pattern.

The next pattern looks like:
   <Bytes>0000000100</Bytes>
   <Pos>34</Pos>
The bitmap height in pixels is stored at offsets 33-36. For the inspected
images this value is very low, so the 3 upper bytes are null. But there is
no guarantee that this is always true. So I deleted that pattern part. The
number of color planes is stored at offsets 37-38. This must be 1. So the
pattern now becomes:
   <Bytes>0100</Bytes>
   <Pos>37</Pos>

The next pattern looks like:
   <Bytes>00</Bytes>
   <Pos>40</Pos>
The number of bits per pixel is stored at offsets 39-40. Typical values are
1, 4, 8, 16, 24 and 32, so the upper byte is always null. So I kept the
above pattern.

The next pattern looks like
   <Bytes>00</Bytes>
   <Pos>43</Pos>
The compression method used is stored at offsets 41-44. According to
Wikipedia the highest possible value is 13 for BI_CMYKRLE4. But here I find
byte sequences like 5344. That is one reason why graphic tools cannot
display such images. IrfanView reports an unknown compression for such
images.
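
The compression check can be sketched in Python as follows. This is only an illustration: the function name compression_name is hypothetical, and the test buffer is synthetic. It assumes, purely for the example, that the bytes 53 44 sit at offsets 41-42, which may not be exactly where they occur in the real samples.

```python
import struct

COMPRESSION_OFFSET = 41  # bytes 41-44 in the numbering used above

# Compression identifiers as listed on the Wikipedia BMP page.
KNOWN_COMPRESSION = {
    0: "BI_RGB",
    1: "BI_RLE8",
    2: "BI_RLE4",
    3: "BI_BITFIELDS",
    4: "BI_JPEG",
    5: "BI_PNG",
    6: "BI_ALPHABITFIELDS",
    11: "BI_CMYK",
    12: "BI_CMYKRLE8",
    13: "BI_CMYKRLE4",
}


def compression_name(data: bytes):
    """Return the compression name at offset 41, or None if it is unknown."""
    if len(data) < COMPRESSION_OFFSET + 4:
        raise ValueError("buffer too short for the compression field")
    # Little-endian unsigned 32-bit integer, as everywhere in BMP headers.
    (value,) = struct.unpack_from("<I", data, COMPRESSION_OFFSET)
    return KNOWN_COMPRESSION.get(value)


# Synthetic buffer, NOT a real SDG sample: 53 44 placed at offsets 41-42.
sample = bytearray(64)
sample[41:43] = bytes.fromhex("5344")
print(compression_name(bytes(sample)))  # unknown value, so None
```

A value outside the table is what a viewer would have to reject or report as unknown compression.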


The next pattern looks like:
   <Bytes>0000</Bytes>
   <Pos>47</Pos>
The size of the raw bitmap data is stored at offsets 45-48. For the small
inspected images the 2 upper bytes are null. But there is no guarantee that
this is always true. So I removed this pattern.


The next pattern looks like:
   <Bytes>0000</Bytes>
   <Pos>51</Pos>
The horizontal resolution of the image is stored at offsets 49-52. By
accident, in the inspected examples only low values occur, where the 2 upper
bytes are null. So I removed that pattern part.

The next pattern looks like:
   <Bytes>0000</Bytes>
   <Pos>55</Pos>
The same holds for the vertical resolution at offsets 53-56. So I deleted
that pattern part as well.

The next pattern looks like:
   <Bytes>0000 00000000</Bytes>
   <Pos>59</Pos>
The number of colors in the palette is stored at offsets 57-60. By accident,
in the inspected examples only low values occur, where the 2 upper bytes are
null. So I removed that pattern part. The number of important colors is
stored at offsets 61-64, or 0 when every color is important. So this becomes:
   <Bytes>00000000</Bytes>
   <Pos>61</Pos>

The following null patterns seem to belong to the color table or pixel
array. So I deleted those patterns.
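
The field walk above can be summarized as one struct layout. The following Python sketch assumes that the bitmap-like structure starts at offset 11 and follows the usual 14 byte file header plus 40 byte DIB header. The names Fields and parse_embedded_bmp_header are invented for this example, and the buffer is synthetic, not a real gallery file.

```python
import struct
from collections import namedtuple

BMP_START = 11  # offset of "BM" in the inspected SDG samples

# Little-endian, no padding: 14 byte file header followed by the
# 40 byte DIB header, 54 bytes in total (offsets 11-64).
LAYOUT = struct.Struct("<2sIHHIIiiHHIIiiII")

Fields = namedtuple("Fields", [
    "magic", "file_size", "reserved1", "reserved2", "data_offset",
    "dib_size", "width", "height", "planes", "bits_per_pixel",
    "compression", "image_size", "x_res", "y_res",
    "colors_used", "colors_important",
])


def parse_embedded_bmp_header(data: bytes) -> Fields:
    """Decode the header fields that start at offset 11."""
    if len(data) < BMP_START + LAYOUT.size:
        raise ValueError("buffer too short for the embedded BMP header")
    return Fields(*LAYOUT.unpack_from(data, BMP_START))


# Synthetic buffer, NOT a real SDG sample.
sample = bytearray(80)
sample[11:13] = b"BM"
sample[25:29] = bytes.fromhex("28000000")  # DIB header size 40
sample[29] = 16                            # width
sample[33] = 16                            # height
sample[37:39] = bytes.fromhex("0100")      # one color plane
sample[39:41] = bytes.fromhex("0800")      # 8 bits per pixel

fields = parse_embedded_bmp_header(bytes(sample))
print(fields.dib_size, fields.width, fields.height, fields.bits_per_pixel)
```

Printing all fields for a few real samples makes it easier to see which values are constant and which only look constant because the images are small.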

According to the sub page about MIME content types on openoffice.org I chose
a user defined MIME type, expressed by the line:
   <Mime>application/x-stargallery-sdg</Mime>

With the new TrID definition all SDG examples are now described (see
appended output/trid-new-v.txt). The TrID definitions, some examples and the
output are stored in the archive sdg.zip. I hope that my XML file can be
used in a future version of triddefs.
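
For readers who want to test a file by hand, the patterns that were kept or refined above can be applied with a few lines of Python. This is only a sketch of the idea and not how TrID itself works internally. The name matches_all is hypothetical, the test buffer is synthetic, and the pattern at position 43 is left out because it is not stated above whether it was kept.

```python
# (position, hex bytes) pairs taken from the patterns kept or refined above.
PATTERNS = [
    (0, "534741330400"),  # "SGA3" plus two more header bytes
    (9, "0001424D"),      # ends with "BM" at offsets 11-12
    (25, "28000000"),     # DIB header size 40
    (37, "0100"),         # one color plane
    (40, "00"),           # upper byte of bits per pixel
    (61, "00000000"),     # number of important colors
]


def matches_all(data: bytes, patterns=PATTERNS) -> bool:
    """Return True if every pattern is found at its position."""
    for pos, hex_bytes in patterns:
        expected = bytes.fromhex(hex_bytes)
        # A slice past the end is simply shorter, so it fails the comparison.
        if data[pos:pos + len(expected)] != expected:
            return False
    return True


# Synthetic buffer, NOT a real SDG sample.
sample = bytearray(70)
sample[0:6] = bytes.fromhex("534741330400")
sample[9:13] = bytes.fromhex("0001424D")
sample[25:29] = bytes.fromhex("28000000")
sample[37:39] = bytes.fromhex("0100")
print(matches_all(bytes(sample)))
```

To check a real file, read it with open(path, "rb").read() and pass the result to matches_all.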

Unfortunately I do not know whether the second structure element is a
Windows bitmap with an unknown compression or something else. Hints and tips
are welcome.

Furthermore a gallery consists of several files. For every gallery, beside
the SDG file there seem to exist files with the same main name but with
extension sdv, thm and sometimes str. I will try to handle these other files
in a future session.

With best wishes
Jörg Jenderek

Mark0

Re: sdg.trid.xml for StarOffice Gallery (*.sdg)
« Reply #1 on: October 19, 2019, 12:49:54 AM »
Thanks!