A Corpus of 21st Century Scots Texts

Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ahint

Similar words to dubs in Corpus

Methods: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated

Levenshtein
dubs (0) - 84 freq
duys (1) - 1 freq
daubs (1) - 1 freq
du's (1) - 113 freq
dabs (1) - 5 freq
duds (1) - 15 freq
dubh (1) - 4 freq
duns (1) - 2 freq
dugs (1) - 228 freq
subs (1) - 5 freq
cubs (1) - 7 freq
gubs (1) - 1 freq
dub (1) - 27 freq
dus (1) - 24 freq
duis (1) - 23 freq
dues (1) - 10 freq
debs (1) - 1 freq
rubs (1) - 9 freq
hubs (1) - 3 freq
tubs (1) - 5 freq
pubs (1) - 49 freq
dunsh (2) - 1 freq
puss (2) - 102 freq
dunes (2) - 6 freq
rebs (2) - 3 freq

Double Levenshtein
dubs (0) - 84 freq
dabs (1) - 5 freq
debs (1) - 1 freq
daubs (1) - 1 freq
dues (2) - 10 freq
duis (2) - 23 freq
dus (2) - 24 freq
pubs (2) - 49 freq
dbis (2) - 1 freq
dub (2) - 27 freq
hubs (2) - 3 freq
rubs (2) - 9 freq
tubs (2) - 5 freq
duds (2) - 15 freq
gubs (2) - 1 freq
duys (2) - 1 freq
dubh (2) - 4 freq
du's (2) - 113 freq
duns (2) - 2 freq
dugs (2) - 228 freq
subs (2) - 5 freq
cubs (2) - 7 freq
dunse (3) - 1 freq
dps (3) - 1 freq
sabs (3) - 8 freq
SoundEx code - D120
dips - 11 freq
dabs - 5 freq
daubs - 1 freq
davis - 7 freq
diffuse - 2 freq
div's - 1 freq
deeps - 7 freq
devious - 4 freq
deep-sea - 4 freq
davies - 13 freq
dives - 7 freq
dowfhike - 1 freq
dabs' - 1 freq
dubs - 84 freq
dobbies - 2 freq
device - 23 freq
daffs - 3 freq
davie's - 36 freq
devise - 2 freq
dybbuk's - 1 freq
depose - 1 freq
defuse - 2 freq
dubious - 3 freq
doffs - 1 freq
deep-sey - 1 freq
dfs - 2 freq
diffs - 2 freq
davy's - 1 freq
duffy's - 4 freq
deeves - 1 freq
daffik - 2 freq
doupies - 1 freq
deips - 1 freq
daffies - 4 freq
davoo's - 1 freq
dfc - 2 freq
devyse - 2 freq
daffiks - 1 freq
dpis - 1 freq
dowps - 6 freq
daffys - 1 freq
debauch - 1 freq
dpbc - 1 freq
doves - 2 freq
deives - 1 freq
defeck - 1 freq
duffs - 2 freq
dps - 1 freq
defies - 2 freq
dpbz - 1 freq
dpz - 1 freq
dbz - 1 freq
dfpz - 1 freq
dpke - 5 freq
debs - 1 freq
dbyg - 1 freq
daveg - 1 freq
dwbaqh - 1 freq
dfkxa - 1 freq
debbie's - 1 freq
ddaps - 1 freq
dbis - 1 freq
daves - 2 freq
MetaPhone code - TBS
dabs - 5 freq
daubs - 1 freq
tubes - 20 freq
dabs' - 1 freq
dubs - 84 freq
dobbies - 2 freq
tabbies - 18 freq
tabs - 4 freq
tubs - 5 freq
dubious - 3 freq
doughboys - 2 freq
toby's - 4 freq
tibbie's - 1 freq
tabawcy - 1 freq
tibby's - 3 freq
tib's - 1 freq
tuibs - 1 freq
doughbaws - 2 freq
dbz - 1 freq
debs - 1 freq
teebs - 1 freq
tobs - 1 freq
debbie's - 1 freq
dbis - 1 freq
Manually curated - DUBS
Time to execute Levenshtein function - 0.182244 milliseconds
The Levenshtein distance is the minimum number of characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.
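The definition above can be sketched with the usual row-by-row dynamic programme. This is a minimal illustration in Python, not the code this page runs; the function name `levenshtein` is chosen here for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer word, keep rows short
    # previous[j] = distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # replace ch_a with ch_b (or keep it)
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    for word in ["dubs", "dugs", "daubs", "dunes", "puss"]:
        print(word, levenshtein("dubs", word))
```

Run against a few of the words listed above, it gives 0 for dubs, 1 for dugs and daubs, and 2 for dunes and puss, in line with the bracketed distances in the Levenshtein list.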
Time to execute Double Levenshtein function - 0.348008 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
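A rough sketch of that idea in Python follows. It assumes the vowels are a, e, i, o and u (y is treated as a consonant), which is consistent with the distances listed above but is not confirmed by the page; the names `strip_vowels` and `double_levenshtein` are made up for this example.

```python
VOWELS = set("aeiou")  # assumption: y counts as a consonant


def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance, computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop a, e, i, o, u and keep everything else, apostrophes included."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the consonant skeletons."""
    full = levenshtein(a, b)
    consonants_only = levenshtein(strip_vowels(a), strip_vowels(b))
    return full + consonants_only


if __name__ == "__main__":
    for word in ["dabs", "dues", "pubs", "dunse"]:
        print(word, double_levenshtein("dubs", word))
```

With dubs as the query this yields 1 for dabs (only a vowel changes), 2 for dues and pubs (a consonant changes, so it is counted twice) and 3 for dunse, matching the Double Levenshtein list.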
Time to execute SoundEx function - 0.027471 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
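To make the encoding concrete, here is a small Python sketch of the standard American Soundex rules: keep the first letter, map the remaining consonants to digits, collapse repeated digits, and pad to four characters. It is an illustration only and may differ in edge cases from the function this page calls. Skipping apostrophes and hyphens is an assumption, made because words such as div's and deep-sea appear under the same code above.

```python
import string

# Digit assigned to each consonant; vowels, y, h and w have no digit.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for letter in letters:
        CODES[letter] = str(digit)


def soundex(word: str) -> str:
    """Return the four-character Soundex code, or '' if word has no letters."""
    # Assumption: anything that is not a-z (apostrophes, hyphens) is ignored.
    letters = [ch for ch in word.lower() if ch in string.ascii_lowercase]
    if not letters:
        return ""
    result = letters[0].upper()
    last_code = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate two letters with the same digit
        code = CODES.get(ch, "")
        if code and code != last_code:
            result += code
        last_code = code  # a vowel resets this, so a repeated digit can recur
    return (result + "000")[:4]


if __name__ == "__main__":
    for word in ["dubs", "dips", "davis", "deep-sea", "dybbuk's"]:
        print(word, soundex(word))
```

All five sample words come out as D120, the SoundEx code shown for dubs.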
Time to execute MetaPhone function - 0.037006 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation.[1] It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names which sound similar.
Time to execute Manually curated function - 0.000834 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
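A lookup of this kind can be sketched as a plain dictionary. The Python below is hypothetical throughout: the table name `LEXICON`, the function `lookup_lemma` and every entry except the dubs to DUBS pairing (taken from the result shown above, on the assumption that DUBS is the lemma) are invented for illustration and are not the site's real lexicon.

```python
from typing import Optional

# Hypothetical lexicon: surface form -> lemma.
# Only "dubs" -> "DUBS" reflects the result above; the rest is made up.
LEXICON = {
    "dubs": "DUBS",
    "dub": "DUBS",    # invented inflected form
    "dubbs": "DUBS",  # invented misspelling
}


def lookup_lemma(word: str) -> Optional[str]:
    """Return the curated lemma for word, or None if the word is not covered."""
    return LEXICON.get(word.strip().lower())


if __name__ == "__main__":
    print(lookup_lemma("Dubs"))   # a covered word
    print(lookup_lemma("ahint"))  # not in this toy table, so None
```

Because the table is a hash map, the lookup does no string comparison against the rest of the corpus, which fits with this method having the shortest execution time listed above.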