A Corpus of 21st Century Scots Texts

Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ablow

Similar words to debut in Corpus

Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated
Levenshtein (distance in brackets)
debut (0) - 7 freq
debt (1) - 44 freq
debit (1) - 2 freq
dellt (2) - 3 freq
-but (2) - 2 freq
deputy (2) - 6 freq
reut (2) - 1 freq
debts (2) - 8 freq
dealt (2) - 33 freq
deux (2) - 1 freq
deeit (2) - 1 freq
b-but (2) - 3 freq
“but (2) - 38 freq
'-but (2) - 1 freq
–but (2) - 1 freq
…but (2) - 5 freq
demit (2) - 2 freq
dei't (2) - 2 freq
deud (2) - 1 freq
deuk (2) - 47 freq
donut (2) - 1 freq
debate (2) - 81 freq
beaut (2) - 5 freq
delft (2) - 2 freq
deb (2) - 2 freq
Double Levenshtein (distance in brackets)
debut (0) - 7 freq
debit (1) - 2 freq
debt (1) - 44 freq
debate (2) - 81 freq
debait (2) - 1 freq
depot (3) - 1 freq
devout (3) - 4 freq
delyt (3) - 3 freq
deat (3) - 1 freq
doobt (3) - 26 freq
abut (3) - 1 freq
deet (3) - 24 freq
dee't (3) - 36 freq
defaut (3) - 20 freq
deft (3) - 4 freq
det (3) - 3 freq
delt (3) - 3 freq
debs (3) - 1 freq
dent (3) - 4 freq
but (3) - 13122 freq
rebat (3) - 5 freq
dept (3) - 7 freq
depute (3) - 21 freq
dout (3) - 167 freq
doubt (3) - 86 freq
SoundEx code - D130
dippit - 12 freq
daft - 436 freq
doubt - 86 freq
daftie - 23 freq
dauvit - 95 freq
dipped - 23 freq
devoid - 7 freq
dabbed - 9 freq
debt - 44 freq
deaved - 13 freq
divvied - 2 freq
dafty - 29 freq
debate - 81 freq
'daft - 2 freq
david - 230 freq
doft - 1 freq
defeat - 30 freq
divide - 24 freq
divot - 13 freq
daavit - 25 freq
'daavit - 1 freq
defait - 7 freq
doobt - 26 freq
defaut - 20 freq
dubbed - 2 freq
dived - 21 freq
duvet - 20 freq
daffed - 3 freq
daupit - 3 freq
dabbit - 6 freq
dowpit - 26 freq
defied - 5 freq
'dauvit - 2 freq
divid - 10 freq
dvd - 11 freq
dafft - 2 freq
dayvideee - 2 freq
depth - 22 freq
deeved - 6 freq
devout - 4 freq
dabaittie - 2 freq
deputy - 6 freq
daivit - 4 freq
dappit - 1 freq
devide - 1 freq
dobbid - 1 freq
doped - 2 freq
debut - 7 freq
daubed - 3 freq
davit - 23 freq
divïd - 8 freq
'dippit - 1 freq
doffed - 3 freq
debait - 1 freq
'david - 2 freq
debit - 2 freq
deft - 4 freq
depute - 21 freq
deived - 1 freq
dept - 7 freq
divvy-oot - 1 freq
depot - 1 freq
davyth - 1 freq
doupit - 1 freq
“dauvit - 3 freq
“david - 1 freq
dowped - 1 freq
‘devout - 1 freq
deavit - 1 freq
daavid - 12 freq
daavd - 1 freq
dawpit - 1 freq
“daft - 1 freq
dpd - 1 freq
dbooth - 1 freq
dvd' - 1 freq
deepth - 1 freq
“daft - 1 freq
'dived' - 3 freq
dopyt - 1 freq
MetaPhone code - TBT
doubt - 86 freq
dabbed - 9 freq
debt - 44 freq
debate - 81 freq
doobt - 26 freq
dubbed - 2 freq
taibit - 1 freq
dabbit - 6 freq
tibet - 10 freq
dabaittie - 2 freq
dobbid - 1 freq
debut - 7 freq
daubed - 3 freq
debait - 1 freq
debit - 2 freq
“tibet - 1 freq
Manually curated - DEBUT
Time to execute Levenshtein function - 0.214142 milliseconds
The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.
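As a rough illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming calculation. It is not the code this site runs; the function name `levenshtein` is just an illustrative choice.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character replacements, insertions
    or deletions needed to turn word a into word b."""
    # Keep the shorter word in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the empty prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # replace ca with cb (or keep it)
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    for word in ("debut", "debt", "debit", "debate", "dealt"):
        print(word, levenshtein("debut", word))
```

With this sketch, `levenshtein("debut", "debt")` gives 1 and `levenshtein("debut", "debate")` gives 2, in line with the scores shown in the results list.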
Time to execute Double Levenshtein function - 0.365078 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
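The doubled calculation can be sketched in Python as follows. This is an illustration under assumptions, not the site's own code: the page does not say which letters count as vowels, so the set `aeiouy` below is a guess, and the helper names `strip_vowels` and `double_levenshtein` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance (two-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


# Assumption: y is treated as a vowel. The page does not specify this.
VOWELS = set("aeiouy")


def strip_vowels(word: str) -> str:
    """Drop every vowel, leaving the consonant skeleton of the word."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-free words."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


if __name__ == "__main__":
    for word in ("debt", "debate", "depot", "doubt", "but"):
        print(word, double_levenshtein("debut", word))
```

Under these assumptions, debt scores 1 against debut (1 + 0, since both reduce to "dbt"), while depot scores 3 (2 + 1), which agrees with the Double Levenshtein results listed on this page.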
Time to execute SoundEx function - 0.027594 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
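To show how words end up sharing a code such as D130, here is a minimal Python sketch of the common American Soundex rules (keep the first letter, map consonants to digits, collapse repeats, pad to four characters). It is an approximation for illustration, and the site's own function may differ in its edge cases. The name `soundex` and the decision to skip non-letters such as apostrophes are assumptions.

```python
# Consonant groups and their Soundex digits.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word: str) -> str:
    """Return a four-character Soundex code, e.g. 'D130'."""
    # Assumption: punctuation and digits are ignored.
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")  # vowels and uncoded letters give ""
        if code and code != previous:
            result += code
            if len(result) == 4:
                break
        # h and w do not separate two letters with the same code.
        if ch not in "hw":
            previous = code
    return result.ljust(4, "0")


if __name__ == "__main__":
    for word in ("debut", "daft", "dippit", "dvd", "depth"):
        print(word, soundex(word))
```

Each of the five sample words above comes out as D130 in this sketch, which is why they appear together in the SoundEx list.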
Time to execute MetaPhone function - 0.036663 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute Manually curated function - 0.000790 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
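A hand-built lookup of this kind can be sketched as a plain dictionary. Everything below is illustrative: the table name `LEMMA_TABLE` and the function `lookup_lemma` are hypothetical, the entry for "debut" mirrors the result shown on this page, and the entry for "debuts" is an invented example, not data from the corpus lexicon.

```python
from typing import Optional

# Hypothetical hand-curated table: surface form -> lemma.
LEMMA_TABLE = {
    "debut": "DEBUT",   # matches the result shown on this page
    "debuts": "DEBUT",  # invented example entry
}


def lookup_lemma(word: str) -> Optional[str]:
    """Return the curated lemma for a word, or None if it is not covered."""
    return LEMMA_TABLE.get(word.lower())


if __name__ == "__main__":
    print(lookup_lemma("debut"))
    print(lookup_lemma("ablow"))  # not in this toy table, so None
```

Because the lookup is a single dictionary access with no string comparison, it is consistent with this method having the shortest execution time reported on the page.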