
New CSV export/import feature

Posted: Thu Apr 07, 2011 8:14 am
by cyril
As I think it was a common request, I developed a CSV exporter / importer of the user dictionary and AutoText dictionary.

This will make it possible to edit the dictionary from your computer, to merge dictionaries, or whatever you want!

Just go to the "Backup" section in the keyboard settings, where there are several options to perform these actions. Note that at the moment the file path is hardcoded:
/sdcard/smartkeyboardpro/userdic.csv for the user dictionary, and /sdcard/smartkeyboardpro/autotext.csv for the AutoText dictionary.
The first line of the file is a header. DON'T REMOVE it before importing! The importer always skips the first line, so without the header your first entry would just be ignored.
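For anyone who prefers a script to a spreadsheet, here is a minimal Python sketch of adding entries to an exported file. The column layout of the CSV is not described in this thread, so the sketch treats each row as an opaque list of fields. It only makes sure that the header line stays first, that new rows have as many fields as the header, and that the file is read and written as UTF-8. The function name `append_rows` is hypothetical.

```python
import csv


def append_rows(path, new_rows):
    """Append rows to an exported CSV file, keeping its header line first.

    Returns the number of data rows (header excluded) after the update.
    """
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    if not rows:
        raise ValueError("file is empty; expected at least a header line")

    header, body = rows[0], rows[1:]
    seen = {tuple(r) for r in body}  # used to skip exact duplicates

    for row in new_rows:
        if len(row) != len(header):
            raise ValueError(f"expected {len(header)} fields, got {len(row)}")
        if tuple(row) not in seen:
            body.append(list(row))
            seen.add(tuple(row))

    # Note: csv.writer ends lines with "\r\n" unless lineterminator is set.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)  # the header must remain the first line
        writer.writerows(body)
    return len(body)
```

Copy the file from the SD card to your computer, run the function on the copy, then put it back at the hardcoded path before importing.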

You can test this feature in the 3.16.0 beta version.

Re: New CSV export/import feature

Posted: Sat Apr 09, 2011 3:12 pm
by Travis90
Which program do I have to use to edit the CSV files? Excel, or will Notepad do too?

Re: New CSV export/import feature

Posted: Sat Apr 09, 2011 3:21 pm
by cyril
You can use whatever you want, as long as you keep the file in CSV format (which is very basic) and don't remove the first line.

Re: New CSV export/import feature

Posted: Sat Apr 09, 2011 3:34 pm
by Travis90
Thanks! ;)
So... if I give you a CSV file with many Italian words, can you improve our dictionary? Please! Because it's very poor! =P

Re: New CSV export/import feature

Posted: Sat Apr 09, 2011 3:41 pm
by cyril
Well... finding a list of words is easy (you just need to take the OpenOffice dictionary...), but to make a dictionary, this word list must be sorted by frequency of usage.
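To illustrate what "sorted by frequency of usage" means in practice, here is a rough Python sketch that counts words in a body of text and lists them from most to least frequent. This is not the tool used to build the keyboard's dictionaries. The function name is hypothetical, and the quality of the result depends entirely on how large and representative the input text is.

```python
import re
from collections import Counter


def words_by_frequency(text):
    """Return (word, count) pairs, most frequent first.

    Words are runs of Unicode letters, lowercased; ties are broken
    alphabetically so the output is deterministic.
    """
    # [^\W\d_]+ matches letters only (no digits, no underscore),
    # including accented letters such as "è" or "ü".
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    sample = "Ciao ciao mondo, è il mondo! ciao"
    for word, count in words_by_frequency(sample):
        print(word, count)
```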

Re: New CSV export/import feature

Posted: Mon Apr 11, 2011 2:55 pm
by CApaddler
Hi Cyril,

I use SKP in US-English and in German. I exported my User Dictionary to edit it on a regular computer. I see, however, that German characters such as ä, ö, ü, ß etc. are represented incorrectly when I view the .CSV file on my Android device, or when I view it on my computer in various editors (Notepad, Wordpad, Excel 2007).

für is shown as fÃ¼r, for example.

Looks like some sort of charset problem. My Android locale is set to US-English. If I edit the .CSV file and reimport, is it going to corrupt these types of entries? How can I be sure that adding new words with these characters will be properly imported when I use the new import .CSV feature?

Thanks!

Re: New CSV export/import feature

Posted: Wed Apr 13, 2011 8:23 am
by cyril
Hello
Files must be encoded in UTF-8. You should be able to tell Excel to use UTF-8 encoding when you open the CSV file.
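The garbled characters are what you get when UTF-8 bytes are displayed using a single-byte Windows code page. The short Python sketch below reproduces the effect, assuming the viewer used Windows-1252 (the thread does not say which code page was actually in use). Both function names are hypothetical. The file on the device is most likely fine; only the way it is displayed is wrong.

```python
def misread_as_cp1252(text):
    """Show how UTF-8 text looks when a viewer decodes it as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair_mojibake(garbled):
    """Undo the misreading above: recover the original UTF-8 text."""
    return garbled.encode("cp1252").decode("utf-8")


if __name__ == "__main__":
    shown = misread_as_cp1252("für")
    print(shown)                   # the two-character garbage in place of "ü"
    print(repair_mojibake(shown))  # back to the original word
```

If a garbled word has already been saved back into the file, `repair_mojibake` can recover it, but only when the damage happened exactly once and with this code page.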

Re: New CSV export/import feature

Posted: Wed Apr 13, 2011 9:07 am
by CApaddler
Thanks Cyril.

For people having trouble editing their files in UTF-8 in Windows, there are a few workarounds:

1. Excel 2007 does not display .csv files with UTF-8 encoding. To work around this, rename the file to .txt and open it in Excel. It will then prompt you for the type of encoding to use.
2. Use OpenOffice/LibreOffice, which supposedly includes native support for UTF-8 .csv files.
3. Google Docs is reported to work as well (untested).

Anyway, just a few options for those of you who want to edit your user dictionaries on non-Linux machines.
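Another possible workaround, sketched in Python: many Windows programs, Excel included, tend to recognise UTF-8 when the file starts with a byte order mark (BOM). This was not tested with Excel 2007 for this thread, and it is unknown whether the keyboard's importer tolerates a BOM, so the sketch adds one for editing and strips it again before the file goes back to the device. The function names are hypothetical.

```python
def add_bom(src, dst):
    """Copy a UTF-8 file to dst with a BOM at the start (for editing on Windows)."""
    # "utf-8-sig" drops a BOM on read if one is present, and writes one on output.
    # newline="" keeps the original line endings untouched.
    with open(src, encoding="utf-8-sig", newline="") as f:
        data = f.read()
    with open(dst, "w", encoding="utf-8-sig", newline="") as f:
        f.write(data)


def strip_bom(src, dst):
    """Copy a UTF-8 file to dst without a BOM (before re-importing)."""
    with open(src, encoding="utf-8-sig", newline="") as f:
        data = f.read()
    with open(dst, "w", encoding="utf-8", newline="") as f:
        f.write(data)
```

Check after saving that the editor kept the header line and the UTF-8 encoding, since some programs silently convert the file on save.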

Re: New CSV export/import feature

Posted: Sun May 08, 2011 10:42 am
by Ivanhoe
Dear Cyril,

I have made a CSV file to import for the Serbian language, but it contains some 200,000 words and it took a couple of hours to import. Once that was done, however, Smart Keyboard stopped offering any suggestions except for the names from contacts. I suspect that the problem is the size of the file. I will upload the file here; could you please make a dictionary installation out of it, if that's possible?

Thanks in advance.
userdic.part01.rar
(240.01 KiB) Downloaded 436 times
userdic.part02.rar
(240.01 KiB) Downloaded 413 times
userdic.part03.rar
(37.51 KiB) Downloaded 418 times

Re: New CSV export/import feature

Posted: Sat May 14, 2011 2:37 pm
by cyril
Indeed, the user dictionary is not designed to cope with so many words. I would be happy to create a dictionary for the Serbian language, but can you provide me with word frequencies as well (or at least a sorted word list)? Without that I cannot create a dictionary.