Solution – Exprienced Challenge 6: How Do You Spell That Again?




This was a challenge a good 15 or 20 years in the making. That’s not because it took us that long to come up with the challenge; even we work faster than that. It’s just that one of the OfficePalooza organizers learned about the Soundex algorithm 15 or 20 years ago (yes, back when he was, uh, 5 or 6 years old) and has been waiting ever since to actually get a chance to use the Soundex algorithm. So does that mean OfficePalooza was created just so this organizer would finally have an excuse to use the Soundex algorithm? Well, don’t tell anyone this but: yes.


Hey, why else would you create something called OfficePalooza?


Having already waited 15 or 20 years, we can’t wait any longer. At long last, our Visual Basic for Applications subroutine that uses the Soundex algorithm to suggest replacements for a misspelled word:


Sub CheckSpelling()


    For Each objWord In ActiveDocument.Words


        If Len(objWord.Text) > 1 Then


            strWord = UCase(Trim(objWord.Text))


 


            Set objConnection = CreateObject("ADODB.Connection")


            objConnection.Open "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" _


                & ActiveDocument.Path & "\Soundex.accdb"


 


            objRecordSet.Open "Select * From SoundexValues Where Word = '" & strWord & "'", _


                objConnection


   


            If objRecordSet.RecordCount = 0 Then


                strPreviousLetter = ""


                strNewWord = Left(strWord, 1)


 


                For i = 2 To Len(strWord)


                    strLetter = Mid(strWord, i, 1)


 


                    If strLetter = "B" Or strLetter = "F" Or strLetter = "P" Or strLetter = "V" Then


                        If strLetter <> strPreviousLetter Then


                            strNewWord = strNewWord & "1"


                        End If


                    End If


 


                    If strLetter = "C" Or strLetter = "G" Or strLetter = "J" Or strLetter = "K" Then


                        If strLetter <> strPreviousLetter Then


                            strNewWord = strNewWord & "2"


                        End If


                    End If


 


                    If strLetter = "Q" Or strLetter = "S" Or strLetter = "X" Or strLetter = "Z" Then


                        If strLetter <> strPreviousLetter Then


                            strNewWord = strNewWord & "2"


                        End If


                    End If


 


                    If strLetter = "D" Or strLetter = "T" Then


                        If strLetter <> strPreviousLetter Then


                            strNewWord = strNewWord & "3"


                        End If


                    End If


 


                    If strLetter = "L" Then


                        If strLetter <> strPreviousLetter Then


                            strNewWord = strNewWord & "4"


                        End If


                    End If


  


                    If strLetter = "M" Or strLetter = "N" Then


                        If strLetter <> strPreviousLetter Then


                            strNewWord = strNewWord & "5"


                        End If


                    End If


 


                    If strLetter = "R" Then


                        If strLetter <> strPreviousLetter Then


                            strNewWord = strNewWord & "6"


                        End If


                    End If


 


                    strPreviousLetter = strLetter


                Next


 


                If Len(strNewWord) > 4 Then


                    strNewWord = Left(strNewWord, 4)


                End If


 


                Do Until Len(strNewWord) = 4


                    If Len(strNewWord) = 4 Then


                        Exit Do


                    Else


                        strNewWord = strNewWord & "0"


                    End If


                Loop


 


                objRecordSet.Close


 


                objRecordSet.Open "Select * From SoundexValues Where Value = '" & strNewWord & "'", _


                    objConnection


 


                Selection.EndKey Unit:=wdStory


                ActiveDocument.ActiveWindow.Selection.TypeParagraph


   


                Do Until objRecordSet.EOF


                    ActiveDocument.ActiveWindow.Selection.TypeText LCase(objRecordSet.Fields.Item("Word"))


                    ActiveDocument.ActiveWindow.Selection.TypeParagraph


                    objRecordSet.MoveNext


                Loop


  


            End If


 


            objRecordSet.Close


            objConnection.Close


        End If


    Next


End Sub


 


We don’t know about you, but it brings a tear to our eyes.


So what exactly did we do here? Well, we started off by setting up a For Each loop that loops through all the items in the active document’s Words collection; that’s what we do with this line of code:


For Each objWord In ActiveDocument.Words


As it turns out, there’s an interesting quirk with the words in the Words collection: you get not just the word, but the blank space that follows the word. (In fact, a blank space actually counts as a word, at least as far as the Words collection is concerned.) Because of that, we use this line of code (and the Len function) to count the number of characters in the word:


If Len(objWord.Text) > 1 Then


If the word has only one character we’re assuming it’s a blank space; therefore, we ignore it. If the word has more than one character then we use the following line of code to trim off any blank spaces (using the aptly-named Trim function) and then convert the word to all uppercase letters:


strWord = UCase(Trim(objWord.Text))


Now it’s time to determine whether or not the word in question is spelled correctly. To do that, we’ll simply query the database Soundex.accdb and see if the word can be found in the database; if it can, then it must be spelled correctly. To that end, the first thing we do is connect to the database:


Set objConnection = CreateObject("ADODB.Connection")


objConnection.Open "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" _


    & ActiveDocument.Path & "\Soundex.accdb"


 


There’s nothing too special about these two lines of code. In line one we create an instance of the ADODB.Connection object, an object that lets us work with databases. In line two we use the Open method to open the database Soundex.accdb. The only tricky thing here? We need to specify the full path to the database. That means we need the folder path as well as the file name. Problem? Heck no: because the database is in the same folder as our Word document we can determine the complete path by referencing the Path property of the active document and then tacking on \Soundex.accdb.


Which, needless to say, is exactly what we did.


As soon as we’ve made our connection we then issue the following query, which returns a recordset containing all the words that happen to match our target word:


objRecordSet.Open "Select * From SoundexValues Where Word = '" & strWord & "'", _


    objConnection


 


Our next step is to check the number of items in the recordset. If the RecordCount is equal to 0 that means that there are no items in the recordset; in turn, that means that the word is misspelled. (What if there is at least one item in the recordset? That means that the word is spelled correctly. That also means that our subroutine does nothing.)


Let’s assume, however, that our word is not spelled correctly. (A good assumption, seeing as how the word is not spelled correctly.) In that case, we need to calculate the Soundex value for the misspelled word. Before we do that, however, we set the value of the variable strPreviousLetter to an empty string; that’s what we do here:


strPreviousLetter = ""


 


Note. Why did we do that? We’ll explain that in a minute or two.


The Soundex algorithm we’re using always uses the first letter of the word as-is. Therefore, we use the Left function to grab the first letter of our misspelled word and store it in a variable named strNewWord:


strNewWord = Left(strWord, 1)


 


At the moment, that makes strNewWord equal to M.


That brings us to this line of code:


For i = 2 To Len(strWord)


What we’re doing here is setting up a For Next loop that runs from 2 to the number of characters in the misspelled word. Why do we start the loop at 2? You got it: because we’ve already snared the first letter in the word and stashed it in the variable strNewWord. Inside this loop, we use this line of code to grab the next letter in the misspelled word:


strLetter = Mid(strWord, i, 1)


Next up? A series of If-Then statements that replace the letter we just grabbed with its numeric equivalent. For example, this block of code checks to see if the letter in question is a B, an F, a P, or a V:


If strLetter = "B" Or strLetter = "F" Or strLetter = "P" Or strLetter = "V" Then


    If strLetter <> strPreviousLetter Then


        strNewWord = strNewWord & "1"


    End If


End If


 


If we do have one of these letters we then check to see if the previous letter in the word matches the current letter. (In other words, are we dealing with back-to-back instances of a letter, such as the two m’s in command?) If we have duplicate letters we don’t do anything; that’s part of the algorithm. If we don’t have duplicate letters then we tack the value 1 into the end of strNewWord.


Etc., etc.


Note. As an alternative to these If-Then statements we could have used regular expressions to replace the letters in the word with their numeric equivalents; that’s actually something a couple of people did in their scripts. We thought about that, but because regular expressions can get a little complicated, we decided to use If-Then statements. We figured that, with that approach, the logic of what we were doing – and why we were doing it – would be easier to follow.


After we’ve run through all the possible replacements, we set the value of strPreviousLetter to the current letter, then go back to the top of the loop and repeat the process with the next character in the misspelled word.


When we finish with all our number-to-letter substitutions our next task is to make sure that our final Soundex value starts with a letter and is then followed by 3 numbers (and only 3 numbers). If our Soundex value has more than four characters that means we need to get rid of the “excess” values. That can be done by using the Left function to extract just the first four characters:


If Len(strNewWord) > 4 Then


    strNewWord = Left(strNewWord, 4)


End If


 


And what if we have less than four characters? In that case, we need to add zeroes to the end of the Soundex value until we do have four characters. That’s what this block of code does:


Do Until Len(strNewWord) = 4


    If Len(strNewWord) = 4 Then


        Exit Do


    Else


        strNewWord = strNewWord & "0"


    End If


Loop


When all is said and done, we should end up with a Soundex value equal to this: M214. And now that we have a Soundex value we can use this line of code to query the database and return a recordset consisting of all the words that have a Soundex value equal to M214:


objRecordSet.Open "Select * From SoundexValues Where Value = '" & strNewWord & "'", _


    objConnection


 


After we return our recordset we use these two lines of code to move the cursor to the end of the document (that’s what the EndKey method and the wdStory constant are for) and then enter a blank line in the document:


Selection.EndKey Unit:=wdStory


ActiveDocument.ActiveWindow.Selection.TypeParagraph


 


All we have to do now is list all the possible corrections for our misspelled word. To do that we set up a Do Until loop that runs until we reach the end of the recordset. (That is, until the recordset’s EOF – end-of-file – property is true.) Inside that loop we use the TypeText method to enter the possible correction, then use the TypeParagraph method to move the cursor to the next line. After that we call the MoveNext method to move to the next record in the recordset.


The whole process looks like this:


Do Until objRecordSet.EOF


    ActiveDocument.ActiveWindow.Selection.TypeText LCase(objRecordSet.Fields.Item("Word"))


    ActiveDocument.ActiveWindow.Selection.TypeParagraph


    objRecordSet.MoveNext


Loop


 


And, on-screen, you should see the following spelling suggestions:


machiavellian


make-believe


massively


misapplication


misapply


misplace


misspell


misspelling


misspelt


 


Note. Yes, some of these corrections might a bit … unexpected …. In fairness to the Soundex algorithm, however, we didn’t give you the entire algorithm, just a subset.


At that point we close our recordset and our database connection, and we – and Challenge number 6 – are done. Was it worth waiting 15 or 20 years for this moment? What do you think?

Skip to main content