Soundex Codes Explained
by Kathi Reid
The Soundex Algorithm
-
Soundex codes begin with the first letter of the surname followed by a three-digit code that represents the first three remaining consonants. Zeros will be added to names that do not have enough letters to be coded.
-
Soundex Coding Guide (Consonants that sound alike have the same code)
1 - B,P,F,V
2 - C,S,G,J,K,Q,X,Z
3 - D,T
4 - L
5 - M,N
6 - R
-
The letters A,E,I,O,U,Y,H, and W are not coded.
-
Names with adjacent letters having the same equivalent number are coded as one letter with a single number.
-
Surname prefixes such as La, De and Van are generally not used in the soundex. Mc, Mac and O generally are not considered prefixes for soundex.
To Calculate a Soundex Code by Hand:
-
Print name.
-
Cross out spaces, punctuation, accents and other marks
-
Cross out any of the following characters A, E, I, O, U, H, W, Y (unless first letter of surname)
-
Cross out the second letter of duplicate characters
-
Cross out the second letter of adjacent characters with the same soundex number.
-
Convert the characters in positions 2 to 4 to numbers:
B, P, F, V = 1
C, S, G, J, K, Q, X, Z = 2
D, T = 3
L = 4
M, N = 5
R = 6
-
Fill any unused positions with zeros, e.g. Lee is L000, Bailey is B400. There is always one letter followed by 3 numbers.
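The hand steps above can be sketched in a few lines of Python. This is only an illustration, not an official implementation: the function name soundex_by_hand is made up for this example, and the letter groups are the standard Soundex groups. It follows the steps literally, so vowels, H, W and Y are crossed out before adjacent letters with the same number are compared. Some other descriptions of Soundex treat a vowel as a separator between two letters with the same number, and those can give a different code for some names.

```python
import unicodedata

# Standard Soundex letter groups; letters not listed are not coded.
GROUPS = {"1": "BPFV", "2": "CSGJKQXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
CODES = {letter: digit for digit, letters in GROUPS.items() for letter in letters}


def soundex_by_hand(name: str) -> str:
    """Hypothetical helper that mirrors the hand-calculation steps."""
    # Print the name, then cross out spaces, punctuation, accents and other marks.
    plain = unicodedata.normalize("NFKD", name.upper())
    letters = [ch for ch in plain if "A" <= ch <= "Z"]
    if not letters:
        return ""
    # Cross out A, E, I, O, U, H, W, Y unless it is the first letter.
    kept = letters[:1] + [ch for ch in letters[1:] if ch in CODES]
    # Cross out the second of two adjacent letters with the same number
    # (this also removes doubled letters such as the second R in Carr).
    result = kept[:1]
    for ch in kept[1:]:
        if CODES[ch] != CODES.get(result[-1]):
            result.append(ch)
    # Convert positions 2 to 4 to numbers and fill unused positions with zeros.
    digits = "".join(CODES[ch] for ch in result[1:4])
    return (result[0] + digits + "000")[:4]


for surname in ("Lee", "Bailey", "Leigh", "Powers", "Ashcraft"):
    print(surname, soundex_by_hand(surname))
```

Run as written, this should print L000, B400, L200, P620 and A261 for the five names, matching the codes quoted in this article.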
Soundex Limitations
-
Names that sound alike do not always have the same soundex code. For example, Lee (L000) and Leigh (L200) are pronounced identically, but have different soundex codes because the silent g in Leigh is given a code.
-
Names that sound alike but start with a different first letter will always have a different soundex code. Thus, names such as Carr (C600) and Karr (K600) should be calculated separately.
-
Soundex is based on English pronunciation, so European names may not be soundexed correctly. For example, some French surnames with silent last letters will not code according to pronunciation. An example is the French name Beaux, where the x is silent. Sometimes this surname is also spelled Beau (B000) and is pronounced identically to Beaux (B200), yet they have different soundex codes. This could be true of any surname that does not use English pronunciation.
-
Sometimes names that don't sound alike have the same soundex code. When I am searching for the surname Powers (P620), I have to wade through Pierce, Price, Perez and Park which all have the same soundex code. Yet Power (P600), a common way to spell Powers 100 years ago, has a different soundex code.
-
Surnames with prefixes were usually coded without the prefix, but not always. If you are searching for a surname such as DiCaprio or LaBianca, you should try the soundex both with and without the prefix.
-
US Census soundex confusion arises with names such as Ashcraft. If the original soundex coder didn't code the H and didn't consider it a separator between the S and C, which have the same code, then the S and C were treated as adjacent letters to be coded only once, and the soundex is A261. In the 1920 NY Census, Ashcraft is found under A261. To calculate a soundex code by this method, use this soundex calculator.
-
Those who coded the soundex for the 1880*, 1900 and 1910** census may or may not have used this rule. They sometimes considered the H a separator, so the S and C were not treated as adjacent letters sharing a single number; each letter was given its own number code. In this case Ashcraft would be A226, the result you receive with the calculator on this page.
-
The important thing to know is that the US Census was not consistent in using the letters H and W as separators between adjacent letters. If you are trying to calculate the soundex for a name in which W or H separates two letters with the same code, it is best to calculate the soundex using both methods to locate the name in the US census. This would be true of any name that has any of the letters C,S,G,J,K,Q,X,Z on both sides of the letter H or W, such as SHC, SHS, CHS, KHZ, SWS, KWS, CWK.
-
A surname of more than one word, or a name in which the surname commonly comes before the given name, such as those of Native Americans, Catholic nuns and Chinese families, may have been coded under the name which appears last, even though it might not be the actual surname. In the case of multi-word surnames, only the last word may have been coded.
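Several of the limitations above come down to "try more than one code". As a rough sketch, the Python below collects the candidate codes for a surname: with H and W ignored, with H and W treated as separators, with and without a leading prefix, and for the last word of a multi-word surname. The names soundex, candidate_codes and PREFIXES are made up for this example, the prefix list is only illustrative, and the letter groups are the standard Soundex groups. Vowels are never treated as separators here, which follows the hand steps in this article and is an assumption about how a given index was coded.

```python
# Standard Soundex letter groups; letters not listed are not coded.
GROUPS = {"1": "BPFV", "2": "CSGJKQXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
CODES = {letter: digit for digit, letters in GROUPS.items() for letter in letters}

# Illustrative prefix list only; extend it for the names you are researching.
PREFIXES = ("La", "De", "Van", "Di")


def soundex(name: str, separators: str = "") -> str:
    """Code a name; letters in `separators` break up same-coded neighbours."""
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    digits = []
    prev = CODES.get(letters[0])  # code of the last coded letter, if any
    for ch in letters[1:]:
        code = CODES.get(ch)
        if code is None:
            if ch in separators:
                prev = None  # the next coded letter gets its own number
            continue
        if code != prev:
            digits.append(code)
        prev = code
    return (letters[0] + "".join(digits) + "000")[:4]


def candidate_codes(surname: str) -> set:
    """Collect the codes worth searching for one surname."""
    forms = [surname]
    for prefix in PREFIXES:
        rest = surname[len(prefix):]
        # Only strip a prefix that is followed by a capital letter or a space.
        if surname.startswith(prefix) and rest[:1] and (rest[0].isupper() or rest[0] == " "):
            forms.append(rest)
    words = surname.split()
    if len(words) > 1:
        forms.append(words[-1])  # multi-word surname: the last word on its own
    return {soundex(form, sep) for form in forms for sep in ("", "HW")}


print(sorted(candidate_codes("Ashcraft")))  # ['A226', 'A261']
print(sorted(candidate_codes("DiCaprio")))  # ['C160', 'D216']
```

The prefix check is deliberately cautious: it only strips a prefix that is followed by a capital letter or a space, so a name like Dean is not split into De and an.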
Uses for the Soundex Code
Once you have a soundex code for a surname, you can order the soundex microfilm for the 1880*, 1900, 1910** and 1920 US census. The census soundex microfilm is an index to the actual census, which holds a lot more information than the soundex does. If you cannot find your ancestor with the soundex code you calculated for his surname, try a soundex variation, keeping the soundex limitations (see above) in mind. The purpose of the soundex indexing system is to keep all spellings and pronunciations of a given name together, but because of the limitations of the soundex, you may have to try different spellings of a name that may give you a different soundex code.
Do not assume that your surname was always spelled the way it is today, or that it appears that way on a census taken 100 years ago. The census taker, in a lot of cases, wrote the surname how he heard it. Try saying the surname out loud and write down as many spelling variations as you can think of. One of these may be how your surname was spelled in the census.
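Once you have a list of spelling variations, it helps to see which ones share a code, since each distinct code is a separate place to look in the index. The sketch below groups spellings by code. The names soundex and group_by_code are made up for this example, the spellings Pours and Pauers are invented variations, and the coding follows the hand steps in this article with the standard Soundex letter groups.

```python
from itertools import groupby

# Standard Soundex letter groups; letters not listed are not coded.
GROUPS = {"1": "BPFV", "2": "CSGJKQXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
CODES = {letter: digit for digit, letters in GROUPS.items() for letter in letters}


def soundex(name: str) -> str:
    """Code a name following the hand steps (hypothetical helper)."""
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    # Keep the first letter, then only the letters that have a code.
    kept = letters[:1] + [ch for ch in letters[1:] if ch in CODES]
    # Keep one letter from each run of adjacent letters with the same code.
    runs = [next(group) for _, group in groupby(kept, key=CODES.get)]
    digits = "".join(CODES[ch] for ch in runs[1:])
    return (runs[0] + digits + "000")[:4]


def group_by_code(spellings):
    """Map each soundex code to the spellings that produce it."""
    groups = {}
    for spelling in spellings:
        groups.setdefault(soundex(spelling), []).append(spelling)
    return groups


for code, names in group_by_code(["Powers", "Power", "Pours", "Pauers"]).items():
    print(code, names)
```

For these four spellings the sketch gives two codes to search: P620 for Powers, Pours and Pauers, and P600 for Power.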
* The 1880 census soundex covers only households where a child under the age of 10 was living at that address.
** The 1910 U.S. Census was indexed for only a handful of states, and it was called the Miracode instead of Soundex. The Miracode index uses the same phonetic code and abbreviations as the Soundex system, but the method of recording the census page reference is different. The Miracode index card lists county, volume, ED, and the sequential family number assigned by the census taker, while the Soundex card shows the county, volume, ED, sheet, and line numbers on the appropriate census schedule. The states indexed in the Miracode system for the 1910 U.S. Census are: Alabama, Arkansas, California, Florida, Georgia, Illinois, Kansas, Kentucky, Louisiana, Michigan, Mississippi, Missouri, North Carolina, Ohio, Oklahoma, Pennsylvania, South Carolina, Tennessee, Texas, Virginia, and West Virginia.