Word games with phone numbers: 1-800-RESULTS (by CalvinH)


My boss forwarded me this blog which is a programming puzzle challenge: given a phone number, try to create a mnemonic, like 800-776-4726 is 800-PROGRAM


 



He knows that I’ve been writing word games on computers for years: I decoded a spelling dictionary from a commercial software product about 20 years ago, and have been using that as a source. Using Borland C about 20 years ago, my scrabble program can’t be beat<g>


 


So I wrote a solution to the challenge in about 100 lines of Visual FoxPro.


 


Today, my dictionary is a DLL COM server of size 630784, which includes 2 separate dictionaries: one with 171201 words, and the other with 53869. That’s around the same size as some pictures that my 4 Megapixel digital camera takes. So, a picture is worth 225000 words<g>


 


As a corollary, you might wonder how 225000 words can fit into 679k: that’s an average of 3 bytes per English word not including the dictionary logic itself! But that’s another subject.


 


To run the solution, you’ll need my dictionary


 


There are many optimizations that could be made, but they would add more lines and obscure the algorithm itself.


 


In any case, here’s the non-optimized solution.


 


There are 2 fundamental algorithms here. The first (EnumNum, 20 lines)  enumerates all the possible permutations of NUMDIGS digits. It loops through the digits, allowing each one to range through its possible values (“2” goes through “2,d,e,f”)


The 2nd algorithm (DoSeparators, 13 lines) just inserts a “-” in every possible position in  a NUMDIGS length string. There are NUMDIGS-1 possible places to insert a “-” (each of the gaps between each letter). The gap can be either of 2 values: a “-” or a “”. Thus there are 2^(NUMDIGS-1) locations to put a “-“. So the DoSeparators routine just loops from 0 TO 2^(NUMDIGS-1)-1 and uses the bits of the loop index represented in binary to insert the “-“.


 


Click to download Dictionary.dll


 


  


CLEAR


#define NUMDIGS 10


PUBLIC ox as dictionary.dict


PUBLIC nCnt,oBrowse


ox=CREATEOBJECT(‘dictionary.dict’)            && Instantiate the dictionary COM object


ox.DictNum=2 && 1 = 171000 word dictionary, 2= 53869 word


 


&&This solution uses 2 parallel strings: the number to permute and a pattern


cPattern=REPLICATE(“0”,NUMDIGS)                       && like “0000000000”


&& cPattern governs the pattern of letters to use for each digit’s place


&& a “0” for each digits place to indicate which letter of the digit it is (0,1,2,3). 0 indicates use the raw digit


&& At the end, it’ll be “1” (for 0,1), “3” (for 2,3,4,5,6,8), or “4” for “7,9”, like “3334433143”


&& Given a phone #, the # of patterns is 3^NUMDIGS if all digits are “234568”


 


cNumber=”642-394-6369″      && number of patterns is 4^8*5^2 = 1638400


cNumber=STRTRAN(cNumber,”-“,””) && remove dashes


 


CREATE TABLE phrases (phrase c(30))                     && a table into which we can put partial results


CLOSE DATABASES


USE phrases SHARED            && open it shared, so another instance can view/query it while we’re executing


BROWSE LAST NOWAIT NAME oBrowse && show the table


nCnt=0 && count of results


dtStart=DATETIME()


EnumNum(cNumber,cPattern,1)           && Call enumerator starting at 1st digit


dtEnd=DATETIME()


?”Completed Phrase search “,dtstart,dtEnd,dtEnd-dtStart


 


PROCEDURE EnumNum(cNumber as String,cPattern as String, nDigNum as Integer) && cNumber is raw string of digits with no separators


            LOCAL i,cDigit,cPat,cStartLet,cLastPat,cNewLet


            cDigit=SUBSTR(cNumber,nDigNum,1)           && the digit we’re working on


            cPat=SUBSTR(cPattern,nDigNum,1)               && where are we on this digit?


            cStartLet=SUBSTR(“  adgjmptw”,ASC(cDigit)-47,1)   &&   Starting letter for this digit: 0  1  2abc, 3def, 4ghi, 5jkl, 6mno, 7pqrs, 8tuv, 9wxyz


            cLastpat =SUBSTR(“0033333434”,ASC(cDigit)-47,1)            && # permutations -1 for this digit: 0  0    4,    4,    4,    4,    4,    5,     4,    5


            DO WHILE .t.


                        IF nDigNum < NUMDIGS       && we haven’t enumerated all digits yet


                                    EnumNum(cNumber,cPattern,nDigNum+1)      && recur with next digit


                        ELSE


                                    DoSeparators(cNumber)          && Gone through all digits. Now we’ve got a pattern, like “nicewinfox”.


                        ENDIF


                        IF cPat=cLastPat && this digit has reached the end (for ‘7’, we’ve done “7”,”p”,”q”,”r”,”s”)


                                    EXIT


                        ENDIF


                        cPat = CHR(ASC(cPat)+1)                 && increment the pattern


                        cPattern=LEFT(cPattern,nDigNum-1) + cPat + SUBSTR(cPattern, nDigNum+1)         && insert the new letter into the pattern


                        cNewlet = CHR(ASC(cStartLet)+ASC(cPat)-49)        && from ‘a’ to ‘b’ or from ‘n’ to ‘o’


                        cNumber=LEFT(cNumber,nDigNum-1) + cNewLet + SUBSTR(cNumber, nDigNum+1)        && insert the new letter into the number


            ENDDO


            RETURN


           


&& DoSeparators: given NUMDIGS letters like “nicewinfox”, insert “-” into all positions to create phrases, then test them


&& For NUMDIGS, (NUMDIGS-1) locations to insert a separator. It’s either there or not, so 2^(NUMDIGS-1) permutations


PROCEDURE DoSeparators(cNumber as String)       


            LOCAL i,j,k,cstr


            FOR i = 0 TO 2^(NUMDIGS-1)-1     && loop on # possible separator positions. For 10 digits, it’s 512


                        cPhrase=SUBSTR(cNumber,1,1)         && first char can’t be a separator


                        FOR j = 1 TO NUMDIGS-1               && for each of the positions, construct a multi-word phrase like “nice-win-fox”


                                    IF BITTEST(i,j-1)                                                        && add a separator (BITTEST(1024,10) is true because the 10th bit is 1)


                                                cPhrase=cPhrase+’-‘


                                    ENDIF


                                    cPhrase=cPhrase+SUBSTR(cNumber,j+1,1)   && add a letter


                        ENDFOR        && for each digit


                        TestPhrase(cPhrase+”-“)           && add a trailing “-“


            ENDFOR        && try next separator position


            RETURN


           


PROCEDURE TestPhrase(cPhrase as String)   && given a phrase like “645-fox-test” see if all alpha sequences are in dictionary


            LOCAL cStr,nDash


            nCnt=nCnt+1


            IF MOD(nCnt,100000)=0


                        ?nCnt,TRANSFORM(LOG10(nCnt),”999.9″),”phrases checked”,cPhrase


            ENDIF


            cStr=cPhrase


            DO WHILE .t.


                        nDash=AT(“-“,cStr)


                        IF nDash = 0    && we parsed all words


                                    EXIT


                        ENDIF


                        cWord=LEFT(cStr,nDash-1)   && the first word before the dash


                        cStr=SUBSTR(cStr,nDash+1)  && the rest of the string


                        IF !TestWord(cWord)                          && if it’s not a word


                                    RETURN


                        ENDIF


            ENDDO


            INSERT INTO phrases VALUES (LEFT(cPhrase,LEN(cPhrase)-1))  && Bingo! remove trailing ‘-‘


RETURN


 


PROCEDURE TestWord(cWord as String) as Boolean


            LOCAL i


            IF ISDIGIT(cWord)     && if it starts with a digit


                        FOR i = 2 TO LEN(cWord)


                                    IF !ISDIGIT(SUBSTR(cWord,i,1))      && if they’re not all digits


                                                RETURN .f.


                                    ENDIF


                        ENDFOR


            ELSE


                        FOR i = 2 TO LEN(cWord)    && starts with a char, lets ensure the rest are chars


                                    IF ISDIGIT(SUBSTR(cWord,i,1))


                                                RETURN .f.


                                    ENDIF


                        ENDFOR


                        IF !ox.isWord(cWord)


                                    RETURN .f.


                        ENDIF


            ENDIF


RETURN .t.    


 


Here are the first 20 results:


6423946369
6-423946369
64-23946369
6-4-23946369
642-3946369
6-42-3946369
64-2-3946369
6-4-2-3946369
6423-946369
6-423-946369
64-23-946369
6-4-23-946369
642-3-946369
6-42-3-946369
64-2-3-946369
6-4-2-3-946369
64239-46369
6-4239-46369
64-239-46369
6-4-239-46369
 


You can see that the dashes are being inserted into the number in a binary fashion: 0000, 0001, 0010, 0011, etc.


One optimization: when inserting the separators: each partial word must be either all digits or all letters. Also, don’t insert a separator between 2 digits.


Another optimization: if we constrain the words to be at least a certain length, we won’t get silly results like a-i-i-a-i-a-i-i-i-a


A major optimization: after we have examined all phrases that start with “642-“, we can use the same stored results for all 3 letter combinations of 642 (like “mda”) that are stored in the cursor. Permutations of the last 7 digits have already been calculated and put in the cursor.


We could also limit all digit words to 1 or 2 in length


The contents of the dictionary are also important. If it is a true spell checker dictionary, then it contains things like b, c, d, e, f.


These are not spelling errors, but they’re not words either. Many spelling dictionaries also contain abbreviations, like ny, md, etc.


nice-window


Because the results are in a cursor, it’s easy to do a  query:


SELECT * from phrases WHERE AT(“fox”,phrase)>0
nice-win-fox


 

Comments (16)

  1. the1 says:

    This is nice. Could you also post some sample results? Not everyone has Fox Pro.

    I’m busy today, but will try to digest your program later this week. In the mean while, can you provide some high-level description on your algorithm? I saw your inline comments, but there is no overview. Thanks.

  2. bertcord says:

    you have some mad skillzzz

  3. wOOdy says:

    @ The1:

    >> Not everyone has FoxPro

    Come on… It’s in your MSDN package. See DVD2488 or CD2014. Just install it (It’s one of the most wellbehavest MS apps I know: Doesn’t alter anything on your PC, doesn’t depend on anything. You could even just copy the directories to your PC and start working)

    @CalvinRH:

    Seems the download is not yet working. Using MSVCR80.dll? Is that already compiled in VFP9? Any chance you recompile that puppy with the currently released version? <g>

  4. wOOdy says:

    Ah, now the download is working.

    >> that’s an average of 2.8 bytes per English word not including the dictionary logic itself! But that’s another subject.<<

    Which I would be very interested in.. <g> Are you using an IDX file for lookup? (For the Non-FoxPro folks: IDX is an Indexfile, which internally uses a compressed storage) Together with an empty dummy DBF, built on the fly, a INDEXSEEK() could just return a "Yes/No" on a search…

  5. the1 says:

    Calvin,

    A minor thing: in your comments you said EnumNum() enumerates all the possible permutations of NUMDIGS digits. The term permutation means a *ordering* of a sequence (e.g. "1923" is a permutation of "1239"). I think you actually mean "combinations".

  6. B.Y. says:

    >>you might wonder how 225000 words can fit into 679k: that’s an average of 3 bytes per English word not including the dictionary logic itself! But that’s another subject.

    This is a good subject for a programming challenge: compress a dictionary.

  7. garrettvm says:

    Actually, they are neither permutations nor combinations. Since these terms are being used "expressively" rather than "precisely" either is probably descriptively OK here. The counting examples most likely make clear what is going on. Counting problems vary a lot and include much more than the basic permutation or combination–many are really hard.

  8. Since July 4 th is nearing, I thought it would be appropriate to start my independent blog. In this first

Skip to main content