A Corpus of 21st Century Scots Texts


Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ahint


Similar words to houses in Corpus

Results by method, in order: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated
Levenshtein
houses (0) - 25 freq
mouses (1) - 1 freq
horses (1) - 113 freq
housed (1) - 1 freq
hooses (1) - 251 freq
hoses (1) - 1 freq
houss (1) - 63 freq
house' (1) - 4 freq
house (1) - 121 freq
hotses (1) - 1 freq
houres (1) - 8 freq
mouse- (2) - 3 freq
noises (2) - 32 freq
muses (2) - 7 freq
hose (2) - 18 freq
poses (2) - 2 freq
ouse (2) - 1 freq
uses (2) - 47 freq
howpes (2) - 1 freq
hoose's (2) - 5 freq
hosts (2) - 13 freq
husks (2) - 2 freq
hopes (2) - 37 freq
hosea (2) - 1 freq
houps (2) - 1 freq
Double Levenshtein
houses (0) - 25 freq
houss (1) - 63 freq
hoses (1) - 1 freq
hooses (1) - 251 freq
houres (2) - 8 freq
hoosis (2) - 1 freq
hoss (2) - 1 freq
hoosies (2) - 9 freq
hotses (2) - 1 freq
house (2) - 121 freq
mouses (2) - 1 freq
housed (2) - 1 freq
horses (2) - 113 freq
house' (2) - 4 freq
hoaxes (3) - 1 freq
hous (3) - 110 freq
mouss (3) - 26 freq
hause (3) - 25 freq
hoosed (3) - 6 freq
choises (3) - 1 freq
hokes (3) - 1 freq
loses (3) - 6 freq
causes (3) - 23 freq
haused (3) - 3 freq
hoose' (3) - 3 freq
SoundEx code - H220
hooses - 251 freq
hich's - 1 freq
heuchs - 3 freq
hauchs - 4 freq
hakes - 1 freq
haggis - 76 freq
hkes - 1 freq
heizes - 7 freq
hoosies - 9 freq
hochs - 12 freq
heughs - 7 freq
hogus - 2 freq
houses - 25 freq
hughock - 8 freq
hughie's - 6 freq
heezes - 5 freq
hijack - 1 freq
hoose's - 5 freq
hce's - 6 freq
hizzy's - 2 freq
hooches - 1 freq
hizzies - 4 freq
hic-hoc - 1 freq
hisses - 3 freq
hikes - 1 freq
hawkes - 2 freq
hussies - 1 freq
heichs - 2 freq
haughs - 8 freq
hezekiah - 4 freq
highways - 3 freq
hugh's - 4 freq
higgie's - 8 freq
haggis's - 1 freq
hoosie's - 1 freq
hjook - 2 freq
hoosis - 1 freq
hush-hush - 1 freq
hoaxes - 1 freq
hooziss - 1 freq
hussy's - 1 freq
hcjac - 1 freq
hecky's - 4 freq
'heckys - 1 freq
hgis - 1 freq
heges - 1 freq
houssis - 3 freq
hoses - 1 freq
hughes - 5 freq
huzzas - 1 freq
hoswick - 1 freq
hjuks - 1 freq
hjuk - 1 freq
highs - 1 freq
huggis - 7 freq
hazy-eyes - 1 freq
haosaz - 1 freq
hcycu - 1 freq
hjcuq - 1 freq
hughesie - 1 freq
hjihyx - 1 freq
hqec - 1 freq
highog - 1 freq
hegwig - 1 freq
hoswick's - 1 freq
hqwzs - 1 freq
hoagies - 1 freq
hcoj - 1 freq
hokes - 1 freq
MetaPhone code - HSS
hooses - 251 freq
heizes - 7 freq
hoosies - 9 freq
houses - 25 freq
heezes - 5 freq
hoose's - 5 freq
hizzy's - 2 freq
hizzies - 4 freq
hisses - 3 freq
hussies - 1 freq
hoosie's - 1 freq
hoosis - 1 freq
hooziss - 1 freq
hussy's - 1 freq
houssis - 3 freq
hoses - 1 freq
huzzas - 1 freq
haosaz - 1 freq
Manually curated - HOUSES
Time to execute Levenshtein function - 0.296407 milliseconds
The Levenshtein distance is the minimum number of characters you have to replace, insert or delete to transform one word into another. It is useful for detecting typos and alternative spellings.
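As an illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming calculation. It is not the code this site runs; the function name levenshtein is chosen here for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    # previous[j] is the distance between the characters of a processed
    # so far (before the current one) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning i characters of a into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # replace char_a, or keep it if equal
            ))
        previous = current
    return previous[-1]
```

With this sketch, levenshtein("houses", "hooses") gives 1 and levenshtein("houses", "noises") gives 2, which agrees with the distances shown in the Levenshtein results for houses.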
Time to execute Double Levenshtein function - 0.593088 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the words as written and once with their vowels removed, and adds the two distances together, giving double weight to consonants.
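The two-pass idea can be sketched in Python as follows. This is a guess at the approach, not the site's own code: the vowel set (a, e, i, o, u, with y treated as a consonant) and the helper names strip_vowels and double_levenshtein are assumptions made for illustration.

```python
VOWELS = set("aeiou")  # assumption: the site's exact vowel set is not stated


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop vowels, keeping consonants and any other characters."""
    return "".join(c for c in word if c.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-less words."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))
```

Under these assumptions, double_levenshtein("houses", "hooses") is 1 (the words differ only in a vowel), while double_levenshtein("houses", "horses") is 2 (the differing consonant is counted in both passes). Both values agree with the Double Levenshtein results listed for houses.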
Time to execute SoundEx function - 0.082478 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
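For readers who want to see how such a code is built, the following Python sketch applies the commonly described American Soundex rules. It is an illustration only; the site's own implementation is not shown here and may differ in edge cases. The names CODES and soundex are chosen for this example.

```python
import string

# Consonant groups and their Soundex digits; vowels, h, w and y have no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word: str) -> str:
    """Return a four-character Soundex code, or "" if word has no letters."""
    letters = [c for c in word.lower() if c in string.ascii_lowercase]
    if not letters:
        return ""
    encoded = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for letter in letters[1:]:
        if letter in "hw":
            continue  # h and w do not separate two letters with the same digit
        digit = CODES.get(letter, "")
        if digit and digit != previous:
            encoded += digit
        previous = digit  # a vowel resets this, so a repeated digit is kept
        if len(encoded) == 4:
            break
    return encoded.ljust(4, "0")  # pad short codes with zeros
```

Here soundex("houses") and soundex("haggis") both return "H220", the code shown for the SoundEx results above, which is why such different-looking words appear in the same list.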
Time to execute MetaPhone function - 0.090120 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation.[1] It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names which sound similar.
Time to execute Manually curated function - 0.000921 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
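A lookup of this kind might be sketched in Python as below. Apart from the pairing of houses with HOUSES, which appears in the results above, the lexicon entries are hypothetical examples and are not taken from the site's actual table. The names LEXICON and curated_variants are likewise invented for illustration.

```python
from typing import Dict, List

# Hypothetical hand-built lexicon mapping each spelling to its lemma.
# Only "houses" -> "HOUSES" is taken from the results above; the other
# entries are made-up examples of the kind of variant such a table could hold.
LEXICON: Dict[str, str] = {
    "houses": "HOUSES",
    "hooses": "HOUSES",
    "houss": "HOUSES",
    "horses": "HORSES",
}


def curated_variants(word: str, lexicon: Dict[str, str] = LEXICON) -> List[str]:
    """Return other spellings that share the word's lemma, sorted A to Z."""
    key = word.lower()
    lemma = lexicon.get(key)
    if lemma is None:
        return []  # not all words are covered by the table
    return sorted(w for w, l in lexicon.items() if l == lemma and w != key)
```

Because this is a plain dictionary lookup rather than a distance calculation over the whole vocabulary, it is consistent with the very short execution time reported for the manually curated function.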