Slovenian dictionary
Posted: Sun May 29, 2011 9:05 pm
by fvs114
Hi Cyril,
I have compiled a CSV file for a Slovenian (SL) dictionary from a huge corpus of the Slovenian language, sorted by frequency, descending.
The file consists of the 63,000+ most frequently used words in Slovenian.
I've attached a zip file with 3 files in it: an XLS frequency reference (CP1250) sorted descending, an XLS word list (CP1250) sorted descending, and a CSV list of words (UTF-8) sorted descending.
If you need to cut those files, do it from the bottom, as that is where the words with the lowest frequency are, but I would prefer to keep as many words as possible.
v.1
Thank you.
Re: Slovenian dictionary
Posted: Mon May 30, 2011 12:19 pm
by fvs114
Hi,
I'm a bit confused.
I saw this in the description or change log for the Portuguese dictionary:
UPDATE:
- 130000 words added!-
Is 63K words a good starting point for the Slovenian dictionary, or should I start working on a 265K project?
What is the average number of words in the other dictionaries?

Re: Slovenian dictionary
Posted: Mon May 30, 2011 12:53 pm
by cyril
Hi
The only limitation is the final size of the apk, which is difficult to predict. I think 265k words would be too much, but anyway if the list is sorted you can give me everything and I will trim it if necessary.
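The trimming described here, keeping the most frequent words and cutting from the bottom, can be sketched as below. This is only an illustration, not the tool actually used to build the dictionary: it assumes a UTF-8 file with one word per line, already sorted by descending frequency, and the names `read_words`, `trim_word_list` and the sample limit are hypothetical.

```python
def read_words(path, encoding="utf-8"):
    """Read one word per line, skipping blank lines (assumed file layout)."""
    with open(path, encoding=encoding) as handle:
        return [line.strip() for line in handle if line.strip()]


def trim_word_list(words, max_words):
    """Keep only the first max_words entries.

    Because the list is sorted by descending frequency, slicing from the
    top drops the least frequent words at the bottom.
    """
    if max_words < 0:
        raise ValueError("max_words must be non-negative")
    return words[:max_words]


# Hypothetical sample standing in for the real frequency-sorted list.
sample = ["in", "je", "se", "da", "na"]
trimmed = trim_word_list(sample, 3)
print(trimmed)  # ['in', 'je', 'se']
```

If the limit is larger than the list, the slice simply returns the whole list, which matches the case where no trimming is needed.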
Re: Slovenian dictionary
Posted: Mon May 30, 2011 8:54 pm
by fvs114
Hi Cyril,
I've made a 265K file with the words sorted descending, so you'll be able to trim it, but I hope that won't be necessary.
I think it is an excellent selection for this number of words, but if you have to trim it hard, I'll make another, slightly different one.
v.2
If you have to trim it, let me know how many lines (words) are included in the dictionary when it is finished.
Thank you

Re: Slovenian dictionary
Posted: Thu Jun 02, 2011 12:11 pm
by cyril
I just built the dictionary here.
There was no need to trim it as it has a good compression factor, so the whole list is there. Let me know if it's ok before I put it on the market.
Re: Slovenian dictionary
Posted: Thu Jun 02, 2011 4:07 pm
by fvs114
Thanks Cyril, great news!
I'm testing it now, and it's good, but I have to change a few things and apply some filters before the official launch.
I've found some errors: I have to add, change and delete a few words, check personal names, and there is a problem with some capitalization, etc. These are things I was unable to see without a beta version.

Re: Slovenian dictionary
Posted: Sun Jun 05, 2011 6:22 pm
by fvs114
Hi Cyril,
Here is the updated list with all corrections, and I think it is good enough for the official launch.
v.3
260K words! (The 267,257 rows of the previous version were reduced to 266,146 rows by filtering out unneeded values.)
The current beta version is indeed very, very good, with some minor exceptions, which will be solved with this new list.
FVS114
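The exact filters applied to the list are not stated in this thread. As a rough sketch of what filtering out unneeded rows could look like, the example below drops blank rows, rows containing characters other than letters, hyphens or apostrophes, and exact duplicates, while keeping the original frequency order. The function name `filter_word_rows` and these specific rules are assumptions made for illustration only.

```python
def filter_word_rows(rows):
    """Return rows with blanks, non-word entries and duplicates removed.

    The rules here are illustrative assumptions, not the actual filters
    used for the Slovenian list. Order is preserved so the list stays
    sorted by descending frequency.
    """
    seen = set()
    kept = []
    for row in rows:
        word = row.strip()
        if not word:
            continue  # blank row
        if not all(ch.isalpha() or ch in "-'" for ch in word):
            continue  # contains digits, punctuation, etc.
        if word in seen:
            continue  # exact duplicate; comparison is case-sensitive
        seen.add(word)
        kept.append(word)
    return kept


# Hypothetical sample rows.
rows = ["in", "je", "", "in", "abc123", "Ljubljana", " je ", "čas"]
filtered = filter_word_rows(rows)
print(len(rows), "->", len(filtered))
```

The duplicate check is case-sensitive on purpose, so a personal name such as "Ana" would not be merged with a lowercase "ana"; whether that is the right choice depends on how the keyboard handles capitalization.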
Re: Slovenian dictionary
Posted: Fri Jun 10, 2011 11:38 am
by cyril
The Slovenian dictionary is now on the Market, based on your latest list.
Let me know if things have to be changed.
Re: Slovenian dictionary
Posted: Fri Jun 10, 2011 2:01 pm
by fvs114
You're the man, Cyril! Thank you!
If anything comes up, I'll let you know.
