Author Topic: pws-aspell.trid.xml, prepl-aspell.trid.xml for aspell personal dictionary  (Read 690 times)

jenderek

Hello trid users,
   
some days ago I handled some spell affix files. In this session I will only
consider spelling dictionary files with the PWS or PREPL suffix. These are
used/created by the aspell software (see the Wikipedia page
https://en.wikipedia.org/wiki/GNU_Aspell).

The aspell samples are typically found inside the user's home directory.
Depending on the spelling language used, the names look like:
.aspell.de_DE.prepl
.aspell.de_DE.pws
.aspell.en.prepl
.aspell.en.pws
.aspell.it.prepl
.aspell.it.pws
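
A minimal sketch of how such files could be located, assuming the naming
pattern above (.aspell.<lang>.pws and .aspell.<lang>.prepl directly in the home
directory). The function name is made up for illustration:

```python
from pathlib import Path


def find_aspell_personal_files(home: Path) -> list[Path]:
    """Return .aspell.<lang>.pws / .aspell.<lang>.prepl files directly in `home`."""
    found = []
    for suffix in ("pws", "prepl"):
        # The pattern starts with a literal dot, so the hidden files are matched.
        found.extend(home.glob(f".aspell.*.{suffix}"))
    return sorted(found)
```

Calling find_aspell_personal_files(Path.home()) would list the samples of the
current user, if any exist.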

So I ran the trid utility on my PWS/PREPL examples. All samples are described
as "Unknown!" without MIME type and file name suffix (see appended
trid-v-old.txt in output).

For comparison I also ran the file format identification utility DROID
(see https://sourceforge.net/projects/droid/). It does not recognize the
samples.

For comparison I also ran the file command (version 5.45) on these samples.
Here the samples are not recognized either; they are described generically as
"text" (see appended output/file-5.45.txt) with the generic MIME type
text/plain (see appended file-i-5.45.txt in output) and no file name suffix
(see appended file-ext-5.45.txt in output).

Luckily the aspell documentation contains an explicit file format
specification for such files, titled "Format of the Personal and Replacement
Dictionaries". So I chose this page as reference. That is expressed by lines
like:
 <RefURL>
 http://aspell.net/man-html/Format-of-the-Personal-and-Replacement-Dictionaries.html
 </RefURL>
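
Going by that page, the first line of a personal word list should have the
form "personal_ws-1.1 <lang> <num> [<encoding>]" and that of a replacement list
"personal_repl-1.1 <lang> <num> [<encoding>]". The Python sketch below splits
such a header line under that assumption. The class and function names are
hypothetical and not part of aspell or TrID:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AspellHeader:
    kind: str                # "pws" (word list) or "prepl" (replacement list)
    lang: str                # language code, e.g. "en" or "de_DE"
    count: int               # the <num> field of the header
    encoding: Optional[str]  # optional encoding field, None if absent


# Header keywords assumed from the referenced format description
_KINDS = {"personal_ws-1.1": "pws", "personal_repl-1.1": "prepl"}


def parse_header(line: str) -> AspellHeader:
    """Split the first line of a PWS/PREPL file into its fields."""
    fields = line.split()
    if len(fields) < 3 or fields[0] not in _KINDS:
        raise ValueError(f"not an aspell personal dictionary header: {line!r}")
    encoding = fields[3] if len(fields) > 3 else None
    return AspellHeader(_KINDS[fields[0]], fields[1], int(fields[2]), encoding)
```

For example, parse_header("personal_ws-1.1 en 2 utf-8") yields kind "pws",
lang "en", count 2 and encoding "utf-8".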

The personal dictionaries are not binary files like the RWS dictionaries.
They are "just" text files, so they can also be created/corrected with any
text editor. Instead of the generic MIME type text/plain I chose a
user-defined one. That is expressed by a line like:
   <Mime>text/x-aspell-dictionary</Mime>

After running tridscan on a few examples to generate pws-aspell.trid.xml, the
first XML construct looks like:
   <Bytes>706572736F6E616C5F77732D312E3120</Bytes>
   <ASCII> p e r s o n a l _ w s - 1 . 1</ASCII>
   <Pos>0</Pos>

After running tridscan on a few examples to generate prepl-aspell.trid.xml,
the first XML construct looks like:
   <Bytes>706572736F6E616C5F7265706C2D312E3120</Bytes>
   <ASCII> p e r s o n a l _ r e p l - 1 . 1</ASCII>
   <Pos>0</Pos>
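
To make the two patterns easier to read, here is a small Python sketch that
decodes the hex strings from the <Bytes> elements above and checks whether some
data starts with one of them. It only illustrates the leading pattern at
offset 0 and is not TrID's actual matching logic. The names SIGNATURES and
classify are made up for this example:

```python
# Signatures copied from the <Bytes> elements above (both at offset 0)
SIGNATURES = {
    "pws": bytes.fromhex("706572736F6E616C5F77732D312E3120"),
    "prepl": bytes.fromhex("706572736F6E616C5F7265706C2D312E3120"),
}


def classify(data: bytes):
    """Return "pws", "prepl", or None depending on the leading bytes."""
    for kind, magic in SIGNATURES.items():
        if data.startswith(magic):
            return kind
    return None


print(SIGNATURES["pws"])    # b'personal_ws-1.1 '
print(SIGNATURES["prepl"])  # b'personal_repl-1.1 '
```

Note that both patterns end with a space (hex 20), so the version string must
be followed by a blank to match.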

With the new trid definitions my PWS/PREPL samples are now described. The TrID
definitions, some samples and the output are stored in the archive
pws_prepl.zip. I hope that my definitions can be used in a future version of
triddefs.

Unfortunately there are also other word list/dictionary file formats and file
name suffixes for aspell. Other spelling software such as ispell and hunspell
also uses other dictionary file formats. I will try to handle these in a
future session.

With best wishes
Jörg Jenderek


Mark0

  • Administrator
Thanks!