Results 31 to 40 of 58
[Keyboard] Dictionary editor for 1.1.1 and later (Tools - Hackint0sh.org)
  1. #31
    Advanced Array

    Join Date
    Dec 2007
    Posts
    44

    Sorry for not replying for such a long time.

    This isn't as easy as I thought, and the observations made by kia make it harder:

    - More common words (an, is, to, etc.) are repeated in the file more often than less common words
    - The two-letter-words files, for instance, are always 6092 bytes long and always have 676 "records"

    If you create a file in which the words are not repeated, the keyboard will not suggest them while you type. I'll investigate this more and post if I find something else.


  2. #32
    Advanced Array

    Join Date
    Dec 2007
    Posts
    44

    I was terribly wrong about the one/two-letter-words format. The "counter" is not a counter; it's a lettermap like the one in the unigrams.dat file. The one/two-letter-words.dat files contain all possible one/two-letter combinations and the word each combination maps to. For example:

    28 28 [aa] = AS
    28 2A [ab] = AN
    28 2C [ac] = AC
    28 2E [ad] = AD
    28 30 [ae] = AS

    ...

    38 38 [ii] = OK
    38 3A [im] = I'M

    etc.
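A minimal Python sketch of the lettermap as it can be read off the rows above, not taken from any format documentation. The rows for aa to ae and for ii fit the rule code = 0x28 + 2 × (position in the alphabet), so the sketch assumes that rule holds for every letter. Note that the [im] row as posted (38 3A) does not fit it (the rule would give 0x40 for m), so treat the rule as a guess. `letter_code`, `pair_key` and `all_pair_keys` are hypothetical helper names.

```python
import string

BASE = 0x28  # code shown for 'a' in the rows above
STEP = 2     # assumed spacing between letters, inferred from a..e


def letter_code(ch):
    """Return the assumed lettermap code for one ASCII letter (case-insensitive)."""
    if len(ch) != 1:
        raise ValueError("expected a single character")
    idx = string.ascii_lowercase.find(ch.lower())
    if idx < 0:
        raise ValueError(f"not an ASCII letter: {ch!r}")
    return BASE + STEP * idx


def pair_key(pair):
    """Return the two key bytes for a two-letter combination such as 'ab'."""
    if len(pair) != 2:
        raise ValueError("expected exactly two letters")
    return bytes(letter_code(ch) for ch in pair)


def all_pair_keys():
    """Enumerate the keys for every combination from 'aa' to 'zz'."""
    letters = string.ascii_lowercase
    return [pair_key(a + b) for a in letters for b in letters]


if __name__ == "__main__":
    for pair in ("aa", "ab", "ac", "ad", "ae", "ii"):
        print(pair, pair_key(pair).hex(" ").upper())
    print(len(all_pair_keys()), "combinations")
```

Enumerating every combination gives 26 × 26 = 676 keys, which matches the 676 "records" counted earlier in the thread.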

  3. #33
    Advanced Array

    Join Date
    Sep 2007
    Posts
    47

    Thank you very much for your hard work, and don't worry about not answering, we all know this is a hobby, not a job

    I'm afraid that I've used up all my brain cells for this year, since I don't seem to be capable of understanding what you say about the letter maps in [one|two]-letter-words.dat files.

    Anyway, if you are able to create a program to make those files, I'd be really grateful.

    Thank you in advance and happy new year

  4. #34
    Advanced Array

    Join Date
    Dec 2007
    Posts
    44

    Okay, I don't have much time, so I can't make a compiled program for this. However, I managed to write a PHP script that does the job. To use it, you have to upload it to a web server and make sure the directory you place it in has 777 permissions.

    The script: http://www.artdstract.pl/iphone/two_letter.phps

    You must also place an xx_XX-two-letter-words.txt file in the same directory. It contains your two-letter words in the format:

    word [tab] FREQ

    where FREQ is the frequency of use (0 - 100): 0 is the least frequent, 100 the most.

    Example:

    as 70
    at 60
    by 80
    is 90
    ...
    Then run the script as follows:

    http://your-webserver-address/your-d...php?lang=xx_XX

    where xx_XX is your language code (it must match the name of the txt file above). The script will then generate a .dat file that you can use on your iPhone.

    I know it is complicated, and I'm sorry for that, but I don't have time to come up with something easier. I hope it helps!

    Note: It doesn't support Unicode characters. Please replace all accented characters in your words with plain Latin ones.
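For anyone preparing the input file, here is a rough Python sketch that checks a word list against the rules stated in this post (word, tab, FREQ between 0 and 100, no non-ASCII characters). It is not the PHP script and does not produce a .dat file; `parse_word_list` is a hypothetical helper, and it accepts any whitespace between the two columns as a convenience.

```python
def parse_word_list(text):
    """Parse 'word<TAB>FREQ' lines into a {word: frequency} dict.

    Raises ValueError for malformed lines, out-of-range frequencies,
    or words containing non-ASCII characters.
    """
    entries = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"line {lineno}: expected 'word<TAB>FREQ', got {raw!r}")
        word, freq_text = parts
        if not word.isascii():
            raise ValueError(f"line {lineno}: non-ASCII characters in {word!r}")
        freq = int(freq_text)  # raises ValueError if not a number
        if not 0 <= freq <= 100:
            raise ValueError(f"line {lineno}: FREQ {freq} is outside 0-100")
        entries[word] = freq
    return entries


if __name__ == "__main__":
    sample = "as\t70\nat\t60\nby\t80\nis\t90\n"
    for word, freq in sorted(parse_word_list(sample).items(), key=lambda kv: -kv[1]):
        print(word, freq)
```

Running a check like this before uploading the txt file catches accented characters and stray columns early.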

  5. #35
    kia
    Rookie Array

    Join Date
    Oct 2007
    Posts
    24

    Thanks M4v3R!!!

    I've now managed to create an sv_SE-two-letter-words.dat file that can handle Unicode. My modified script is specific to Swedish, but it should be easy to adapt it to other languages as well. I've highlighted my modifications:

    http://pastebin.com/f59e4c338

    If you've got the time, I would really like to see a similar script for creating one-letter-words.dat files.

    The Swedish dat file can be downloaded from my blog:
    http://svenskiphoneordlista.wordpress.com


  6. #36
    Advanced Array

    Join Date
    Sep 2007
    Posts
    47

    It's at times like these that I regret not having learned PHP. Or not having a web server at hand, for that matter.

    First things first... Thank you very much, M4v3R, and happy new year everyone.

    With the holidays and such I haven't been able to look at the PHP script, but I hope to be able to mess with it in the next couple of days and maybe create a Windows binary based on it (note: "based on" means "plagiarizing" in this context).

    kia, I'd love to see the changes you made to the script, since we Spanish-speaking people also need Unicode support. But when I click the link you provided, all I get is a page with the following text:
    Code:
    home/pastebin/public_html/../posts/ needs to be a writable dir to use file storage engine

    Again, thank you M4v3R and kia, and enjoy the holidays. Here in Spain we'll continue to celebrate until January 6th, when the Three Wise Men (los Tres Reyes Magos) will bring us lots of presents. Namely, the ones Santa wasn't able to buy in time for the 25th.

  7. #37
    kia
    Rookie Array

    Join Date
    Oct 2007
    Posts
    24

    Code:
    home/pastebin/public_html/../posts/ needs to be a writable dir to use file storage engine
    It seems that Pastebin is having some server trouble. Try again later.

  8. #38
    kia
    Rookie Array

    Join Date
    Oct 2007
    Posts
    24

    Quote Originally Posted by M4v3R View Post
    I was terribly wrong about one/two letter words format. The "counter" is not a counter, it's a lettermap like in unigrams.dat file.
    I still don't get suggestions for , and when using the three files generated by iPhoneshop. Even though I've looked through the source code of iPhoneshop, I haven't found the "bug". I've tried comparing the output of iPhoneshop with the original German dictionary from 1.1.2, but it is hard to do a visual comparison since the files are "scrambled" using some strange Apple algorithm.

    What I need is a way to find out which of the files generated by iPhoneshop are incorrect. Can someone suggest a method?

    My first idea was to use iPhoneshop to extract the words from the de_DE files from 1.1.2, rebuild them with iPhoneshop, and then compare the output. But that failed: iPhoneshop doesn't recognize the file format and gives me an error. I don't know why it fails, but I can verify that the 1.1.2 files are usable on a 1.1.1 iPhone.
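If a rebuilt file and a reference file can ever be produced from the same word list, the comparison itself does not have to be visual. A small Python sketch, under the assumption that both files are available locally; the file names in the usage lines are hypothetical, and since the contents are scrambled this only shows where the files differ, not why.

```python
from pathlib import Path


def differing_offsets(a, b, limit=20):
    """Return (offsets, length_difference) for two byte strings.

    offsets holds at most `limit` positions where the bytes differ within
    the common length; length_difference is len(a) - len(b).
    """
    offsets = []
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            offsets.append(i)
            if len(offsets) >= limit:
                break
    return offsets, len(a) - len(b)


def compare_files(path_a, path_b, limit=20):
    """Print the first differing offsets of two files in hex."""
    a = Path(path_a).read_bytes()
    b = Path(path_b).read_bytes()
    offsets, length_diff = differing_offsets(a, b, limit)
    print(f"{path_a} vs {path_b}: length difference {length_diff}")
    for off in offsets:
        print(f"  0x{off:06X}: {a[off]:02X} != {b[off]:02X}")


# Example (hypothetical paths):
# compare_files("original/de_DE-unigrams.dat", "rebuilt/de_DE-unigrams.dat")
```

Running it once per file pair (unigrams, one-letter, two-letter) would at least narrow down which output deviates first.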

    Another idea (and this is why I quote M4v3R) is to create a new unigrams.dat file and see if that solves the problem. If the unigrams.dat and the one/two-letter-words.dat files are similar, it might be possible to modify M4v3R's PHP script to create correct unigrams.dat files.

    What do you think?

  9. #39
    Advanced Array

    Join Date
    Dec 2007
    Posts
    44

    As I said, my script doesn't handle UTF characters like the ones you posted: , and . You could, however, remove all the accents and replace those characters with plain Latin ones like a and o; then it would work. I also have a PHP script for that (it generates unigrams and stems files). I can post it if someone wants it.
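Removing the accents does not have to be done by hand. A short Python sketch using the standard `unicodedata` module (this is not part of the PHP script; `strip_accents` is a hypothetical helper): it decomposes each character and drops the combining marks.

```python
import unicodedata


def strip_accents(text):
    """Replace accented letters with their unaccented base letters.

    Works for characters that decompose into base letter + combining mark
    (for example a-ring, a-umlaut, o-umlaut, n-tilde). Letters without such
    a decomposition, like 'ø' or 'ß', are returned unchanged.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    for word in ("\u00e5", "\u00e4r", "\u00f6", "ni\u00f1o"):
        print(word, "->", strip_accents(word))
```

Keep in mind that this changes the words themselves, so the keyboard would then suggest the unaccented spelling.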

    Oh, and the script had a bug that made it suggest the least popular words instead of the most popular ones. Please download it again from here: http://www.artdstract.pl/iphone/two_letter.phps

    Edit: You can edit the one-letter-words file by hand. There are 52 records in it: 26 for uppercase and 26 for lowercase letters. The replacement for 'A' is at offset 0x0B (11 in decimal), for 'B' at 0x11 (17 in decimal), and so on. The replacement for 'a' is at 0xA7, 'b' at 0xAD, 'z' at 0x13D. I think you get the picture.

    Uppercase letters aren't replaced in the en_US file, probably because you mostly type shortcuts and abbreviations in uppercase, so they don't need to be replaced. Lowercase letters, however, are replaced when you type the wrong one. For example, 'z' is replaced with 'a', and 'i' is replaced with 'I' (capital 'I', as in English grammar).
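The offsets quoted above are consistent with a fixed stride of 6 bytes per record: 0x11 - 0x0B = 6, 0x0B + 26 × 6 = 0xA7 for 'a', and 0xA7 + 25 × 6 = 0x13D for 'z'. A small Python sketch that computes the offset for any letter, assuming the stride really is constant for all 52 records; `replacement_offset` is a hypothetical helper, and the sketch says nothing about which byte value to write at that offset.

```python
FIRST_OFFSET = 0x0B  # offset of the replacement for 'A', as quoted above
STRIDE = 6           # assumed record size, inferred from 'A' -> 'B'


def replacement_offset(letter):
    """Return the file offset of the replacement byte for an ASCII letter.

    Records are assumed to be ordered A..Z followed by a..z.
    """
    if len(letter) != 1 or not letter.isascii() or not letter.isalpha():
        raise ValueError(f"expected one ASCII letter, got {letter!r}")
    if letter.isupper():
        index = ord(letter) - ord("A")
    else:
        index = 26 + ord(letter) - ord("a")
    return FIRST_OFFSET + STRIDE * index


if __name__ == "__main__":
    for letter in "ABabz":
        print(letter, hex(replacement_offset(letter)))
```

All five offsets given in the post come out of this formula, which is a reasonable sanity check before opening the file in a hex editor.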
    Last edited by M4v3R; 01-04-2008 at 06:11 PM.

  10. #40
    kia
    Rookie Array

    Join Date
    Oct 2007
    Posts
    24

    Quote Originally Posted by M4v3R View Post
    I have also an PHP script that does that (generates unigrams and stems files). I can post it if someone wants it.
    Yes! Please do!


 

 