Hello,
I read the other forums about dictionary requests, so I started looking for a Hungarian word list.
I found one in this page: http://mokk.bme.hu/resources/webcorpus
There you can find "the first 100 000 most frequent words in order of frequency" (file: "web2.2-freq-sorted.top100k.nofreqs.txt").
I would be grateful if you could make a Hungarian dictionary for Smart Keyboard.
Thanks,
Peter
Hungarian Dictionary
-
- Posts: 1
- Joined: Fri Oct 29, 2010 9:31 am
- Phone: Samsung i9000, Froyo
Re: Hungarian Dictionary
I would like to ask the same thing. Please give us a Hungarian dictionary! Thank you very much, this keyboard is awesome.

Re: Hungarian Dictionary
Hello Cyril,
I have checked the word frequency file that was posted in the first post, and it is useless because it is full of rubbish (special characters, and the languages in the file are mixed up).
I am shocked at how much effort those guys put into processing such a huge amount of data without a simple language filter. Nonsense!
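To illustrate what a simple filter could look like, here is a minimal sketch in Python. It is not what the corpus authors or TextSTAT do; it only assumes that a usable entry consists entirely of letters of the Hungarian alphabet (a-z plus the nine accented vowels). The function names are made up for this example.

```python
# Assumption: a token is plausible Hungarian if every character is a letter
# of the Hungarian alphabet (ASCII a-z plus the nine accented vowels).
ALLOWED_LETTERS = set("abcdefghijklmnopqrstuvwxyzáéíóöőúüű")


def is_plausible_hungarian(token):
    """Return True if the token is non-empty and uses only allowed letters."""
    lowered = token.lower()
    return bool(lowered) and all(ch in ALLOWED_LETTERS for ch in lowered)


def filter_word_list(tokens):
    """Keep only the tokens that pass the character check, in original order."""
    return [t for t in tokens if is_plausible_hungarian(t)]
```

A character check like this removes entries with digits, markup debris, and foreign letters, but it cannot tell a Hungarian word from an English one written in plain ASCII, so a mixed-language list would still need a further check.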
Anyway, I have found a very useful, free tool that quickly creates a frequency list from the text of given documents, such as MS Word, OpenOffice or HTML files.
You can find it here, and you can also publish it on your forum so that others might benefit from it:
http://neon.niederlandistik.fu-berlin.de/textstat/
I used 19 documents totalling 1.665.754 words (11.513.966 bytes) and generated a word frequency list of 46.000 words. I used a threshold of at least 3 occurrences to shorten the list and filter out rare words. Without that threshold the list would have been over 150k words, which is a waste of resources.
Please generate a Hungarian dictionary and send it back to me for testing before publishing it to the Market. The file uses UTF-8 character encoding and was saved without a BOM using Notepad++.
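For anyone who wants to reproduce this without TextSTAT, the same idea can be sketched in a few lines of Python: count the words, drop everything that occurs fewer than 3 times, and save the result as UTF-8 without a BOM. This is only a rough stand-in for the tool, and the output layout (word, tab, count) is an assumption, not necessarily the format of the attached file.

```python
import re
from collections import Counter


def build_frequency_list(text, min_count=3):
    """Return (word, count) pairs occurring at least min_count times."""
    # Assumption: a word is a run of Unicode letters; the text is lowercased first.
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)
    kept = [(w, c) for w, c in counts.items() if c >= min_count]
    # Most frequent first; ties are broken alphabetically.
    kept.sort(key=lambda wc: (-wc[1], wc[0]))
    return kept


def save_frequency_list(entries, path):
    """Write one 'word<TAB>count' line per entry (hypothetical layout)."""
    # encoding="utf-8" writes no BOM; "utf-8-sig" would prepend one.
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for word, count in entries:
            f.write(f"{word}\t{count}\n")
```

Raising `min_count` shortens the list further, which is the same trade-off as the 46.000 versus 150k figures above.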
Cheers,
endrus
- Attachments
-
- Hungarian_freq_dict_UTF8_wo_BOM.zip
- (189.83 KiB) Downloaded 253 times
Last edited by endrus on Wed Dec 08, 2010 11:28 pm, edited 1 time in total.
- cyril
- Developer
- Posts: 2079
- Joined: Tue Feb 02, 2010 4:02 pm
- Phone: Nexus One 2.3
- Location: Nice, France
Re: Hungarian Dictionary
Ok you can try it here
I had to remove the duplicates when the same word is present in upper and lower case (I keep the lower case)
Cyril
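The duplicate removal described above can be sketched roughly like this in Python. It is not the actual code used for Smart Keyboard; it assumes the list is a sequence of (word, count) pairs and that the count of the dropped variant is simply discarded rather than merged. The function name is made up for this example.

```python
def merge_case_duplicates(entries):
    """Drop any word whose all-lowercase form is also in the list.

    entries: list of (word, count) pairs; the original order is preserved.
    """
    # Words that are already fully lowercase.
    lowercase_present = {w for w, _ in entries if w == w.lower()}
    result = []
    for word, count in entries:
        # Assumption: the count of the dropped variant is discarded, not merged.
        if word != word.lower() and word.lower() in lowercase_present:
            continue
        result.append((word, count))
    return result
```

Words that only ever appear capitalised, such as proper nouns, are kept as they are, because they have no lowercase entry to collapse into.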
Re: Hungarian Dictionary
cyril wrote: Ok you can try it here
I had to remove the duplicates when the same word is present in upper and lower case (I keep the lower case)
Cyril,
The dictionary seems to work fine and it is time to release it to the Market.

I will keep an eye on the comments, and future releases might follow.

Thanks for your fast action!

-Endrus
- cyril
- Developer
- Posts: 2079
- Joined: Tue Feb 02, 2010 4:02 pm
- Phone: Nexus One 2.3
- Location: Nice, France
Re: Hungarian Dictionary
OK just released it!
BTW I think you can add more words, as 500 kB is not too big
Cyril