A Corpus of 21st Century Scots Texts

Levenshtein Distance

Enter a word to find nearest neighbouring words, for example sonsie

Similar words to domesticates in Corpus

Methods: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated

Levenshtein (distance in brackets)
domesticates (0) - 1 freq
domesticated (1) - 1 freq
domesticate (1) - 1 freq
domesticatit (2) - 3 freq
'domesticated (2) - 1 freq
domestication (3) - 1 freq
domesticity (3) - 1 freq
domestics (3) - 1 freq
domestic (4) - 20 freq
estimates (4) - 1 freq
dominates (4) - 2 freq
complicates (4) - 2 freq
sometimes (5) - 362 freq
dominate (5) - 4 freq
demonstrates (5) - 1 freq
meditates (5) - 2 freq
dedicate (5) - 8 freq
testicles (5) - 1 freq
investigate (5) - 16 freq
delicate (5) - 21 freq
estates (5) - 19 freq
masticatin (5) - 1 freq
destitute (5) - 2 freq
meericales (5) - 1 freq
mediates (5) - 1 freq
Double Levenshtein (distance in brackets)
domesticates (0) - 1 freq
domesticate (2) - 1 freq
domesticated (2) - 1 freq
domesticatit (3) - 3 freq
domesticity (4) - 1 freq
domestics (4) - 1 freq
domestication (4) - 1 freq
'domesticated (4) - 1 freq
domestic (6) - 20 freq
demonstrates (7) - 1 freq
masticatin (7) - 1 freq
estimates (7) - 1 freq
dominates (7) - 2 freq
complicates (7) - 2 freq
testifycates (8) - 9 freq
detects (8) - 1 freq
hesitates (8) - 1 freq
pesticides (8) - 1 freq
dessicatit (8) - 1 freq
dipsticks (8) - 1 freq
onomastics (8) - 2 freq
digestives (8) - 2 freq
descartes (8) - 3 freq
constitutes (8) - 1 freq
onomasticians (8) - 1 freq
SoundEx code - D523
daunced - 18 freq
dynasty - 10 freq
danced - 57 freq
doonsittin - 2 freq
doon-sittin - 1 freq
damaged - 15 freq
dingit - 13 freq
dinkit - 1 freq
doonsit - 1 freq
duncht - 1 freq
doonstair - 10 freq
dunced - 10 freq
'domesticated - 1 freq
doonstairs - 41 freq
dinged - 18 freq
donside - 4 freq
dunched - 8 freq
duimster - 2 freq
dauncit - 2 freq
dumstruik - 1 freq
downstairs - 2 freq
dammeyged - 1 freq
density - 4 freq
doonside - 4 freq
domestic - 20 freq
dunked - 2 freq
doon-stairs - 1 freq
dunstan - 1 freq
dynastic - 1 freq
ding-dong - 2 freq
doonstairs' - 1 freq
demi-gods - 1 freq
ding't - 2 freq
doonstream - 1 freq
densities - 1 freq
ding-dang - 6 freq
dunch't - 1 freq
dongting - 2 freq
donnchadh - 1 freq
dounsets - 1 freq
domestics - 1 freq
deemsters - 1 freq
doun-sittin - 1 freq
doonistair - 1 freq
dang-doun - 1 freq
damnest - 1 freq
doomnstairs - 1 freq
dunstane - 1 freq
dangit - 1 freq
domestication - 1 freq
domesticatit - 3 freq
domesticate - 1 freq
domesticates - 1 freq
dinkie-dies - 1 freq
doungate - 1 freq
doomsday - 1 freq
donaghadee - 5 freq
dinghyed - 1 freq
domesticated - 1 freq
dunecht - 1 freq
danight - 9 freq
dingyed - 1 freq
dingied - 3 freq
dimished - 1 freq
danicht - 4 freq
donstalk - 5 freq
dnjtk - 1 freq
donnchadhol - 2 freq
donsdaiiy - 6 freq
downside - 2 freq
dwanged - 1 freq
douneside - 4 freq
danceathon - 1 freq
dinsdale - 1 freq
dtammcd - 1 freq
densitie - 1 freq
dunsterhouseltd - 1 freq
domesticity - 1 freq
donagha-dreich - 1 freq
daneside - 1 freq
dunged - 1 freq
MetaPhone code - TMSTKTS
domesticates - 1 freq
Manually curated
DOMESTICATES
Time to execute Levenshtein function - 0.485204 milliseconds
The Levenshtein distance is the minimum number of characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.
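The definition above can be sketched in a few lines of Python. This is a minimal illustration using the standard dynamic-programming approach, not necessarily the code behind this page; the function name `levenshtein` is chosen here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character replacements, insertions or deletions."""
    # previous[j] = distance between the prefix of a processed so far and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # replace ca with cb (or keep a match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("domesticates", "domesticated"))  # 1
print(levenshtein("domesticates", "domesticatit"))  # 2
print(levenshtein("domesticates", "domestic"))      # 4
```

The three printed values agree with the distances shown in the Levenshtein results for domesticates.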
Time to execute Double Levenshtein function - 0.606966 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
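A rough Python sketch of this idea follows. It assumes the vowels removed are a, e, i, o and u (the page does not say how y is treated), and the names `strip_vowels` and `double_levenshtein` are illustrative rather than the site's own.

```python
VOWELS = set("aeiou")  # assumption: y counts as a consonant


def levenshtein(a: str, b: str) -> int:
    """Plain edit distance (replace, insert, delete), computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop vowels so that only the consonant skeleton remains."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the consonant skeletons."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


print(double_levenshtein("domesticates", "domesticate"))   # 2
print(double_levenshtein("domesticates", "domesticatit"))  # 3
print(double_levenshtein("domesticates", "domestic"))      # 6
```

Under these assumptions the printed values match the Double Levenshtein results listed for domesticates. A consonant change is counted in both passes, while a vowel change is counted only in the first.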
Time to execute SoundEx function - 0.074267 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
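As an illustration, here is a minimal Python sketch of the common American Soundex rules: keep the first letter, map the remaining consonants to digits, collapse adjacent repeats, and pad to four characters. It assumes non-letters such as apostrophes and hyphens are simply skipped; the implementation behind this page may differ in detail.

```python
# Digit groups used by American Soundex; vowels, h, w and y have no code.
GROUPS = {"1": "bfpv", "2": "cgjkqsxz", "3": "dt", "4": "l", "5": "mn", "6": "r"}
CODES = {letter: digit for digit, letters in GROUPS.items() for letter in letters}


def soundex(word: str) -> str:
    """Return a four-character Soundex code, or '' if the word has no letters."""
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    last = CODES.get(letters[0], "")  # the first letter's code also blocks repeats
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate two letters with the same code
        code = CODES.get(ch, "")
        if code and code != last:
            result += code
            if len(result) == 4:
                break
        last = code  # a vowel resets this, so a repeated code after it counts again
    return result.ljust(4, "0")


print(soundex("domesticates"))  # D523
print(soundex("danced"))        # D523
print(soundex("ding-dong"))     # D523
```

All three words share the code D523, which is why such different words appear together in the SoundEx results: only the first letter and the first three coded consonants are kept.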
Time to execute MetaPhone function - 0.085337 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute Manually curated function - 0.001174 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
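A lookup of this kind can be sketched as a plain dictionary. The entries below are hypothetical examples made up for illustration, not the corpus's actual lexicon, and `lookup_lemma` is an assumed name.

```python
from typing import Optional

# Hypothetical hand-made entries: spelling variant -> lemma.
LEXICON = {
    "danced": "DANCE",
    "daunced": "DANCE",
    "dauncit": "DANCE",
}


def lookup_lemma(word: str) -> Optional[str]:
    """Return the curated lemma for a word, or None if the word is not covered."""
    return LEXICON.get(word.strip().lower())


print(lookup_lemma("Dauncit"))  # DANCE
print(lookup_lemma("sonsie"))   # None
```

A dictionary lookup takes roughly constant time whatever the size of the table, which fits with this method having the shortest execution time reported above.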