A Corpus of 21st Century Scots Texts


Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ahint


Similar words to entity in Corpus

Methods: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated
Levenshtein distance
entity (0) - 17 freq
enmity (1) - 4 freq
eatit (2) - 1 freq
unity (2) - 45 freq
entirety (2) - 3 freq
equity (2) - 7 freq
intit (2) - 11 freq
endite (2) - 1 freq
pentit (2) - 44 freq
entry (2) - 61 freq
ently (2) - 2 freq
antit (2) - 1 freq
identity (2) - 162 freq
fentit (2) - 3 freq
untidy (2) - 1 freq
gentily (2) - 4 freq
entire (2) - 32 freq
ectit (2) - 1 freq
tentily (2) - 22 freq
rentit (2) - 6 freq
entice (2) - 4 freq
ennit (2) - 2 freq
endit (2) - 55 freq
dentit (2) - 3 freq
etnty (2) - 1 freq
Double Levenshtein distance
entity (0) - 17 freq
enmity (2) - 4 freq
intit (2) - 11 freq
antit (2) - 1 freq
entice (3) - 4 freq
ectit (3) - 1 freq
entire (3) - 32 freq
ennit (3) - 2 freq
rentit (3) - 6 freq
dentit (3) - 3 freq
natty (3) - 2 freq
notit (3) - 12 freq
tentit (3) - 8 freq
unitit (3) - 29 freq
endit (3) - 55 freq
nutty (3) - 6 freq
nitty (3) - 1 freq
eatit (3) - 1 freq
untidy (3) - 1 freq
unity (3) - 45 freq
entirety (3) - 3 freq
endite (3) - 1 freq
pentit (3) - 44 freq
fentit (3) - 3 freq
ently (3) - 2 freq
SoundEx code - E533
een-tide - 1 freq
endit - 55 freq
ended - 70 freq
eemitate - 2 freq
endow'd - 1 freq
entitlet - 8 freq
eemitates - 1 freq
entitlement's - 1 freq
emitted - 2 freq
emediatly - 2 freq
endid - 4 freq
entitled - 26 freq
entitlement - 12 freq
end-o-term - 1 freq
entitlement-obsessit - 1 freq
entitlement-driven - 2 freq
entity - 17 freq
endet - 42 freq
entitled' - 1 freq
endyte - 2 freq
entities - 15 freq
eimitate - 1 freq
enteetled - 3 freq
endite - 1 freq
e'entide - 1 freq
eynded - 1 freq
eimitator - 1 freq
eyndit - 2 freq
entitlements - 1 freq
entitilt - 1 freq
endytit - 1 freq
enteitilt - 1 freq
entitelt - 10 freq
eemitatin - 1 freq
endowed - 1 freq
endowit - 1 freq
entitlit - 1 freq
endeth - 1 freq
endoits - 1 freq
entada - 1 freq
MetaPhone code - ENTT
een-tide - 1 freq
endit - 55 freq
ended - 70 freq
endow'd - 1 freq
endid - 4 freq
entity - 17 freq
endet - 42 freq
endyte - 2 freq
endite - 1 freq
e'entide - 1 freq
eynded - 1 freq
eyndit - 2 freq
entada - 1 freq
Manually curated - ENTITY
Time to execute Levenshtein function - 0.304915 milliseconds
The Levenshtein distance is the minimum number of single-character replacements, insertions or deletions needed to transform one word into another. It's useful for detecting typos and alternative spellings.
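As a rough illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming calculation. The function name and the choice of Python are assumptions for illustration only; the site's own implementation is not shown on this page.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    # Keep the shorter word in b so each row stays as small as possible.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the empty prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # replace char_a (or keep a match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("entity", "enmity"))
print(levenshtein("entity", "identity"))
```

With this sketch, entity and enmity come out one edit apart and entity and identity two edits apart, in line with the figures in the Levenshtein list.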
Time to execute Double Levenshtein function - 0.717382 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
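The idea can be sketched in a few lines of Python. This is an illustration under assumptions, not the site's code: the function names are made up here, and the vowel set (including y) is a guess, chosen because it reproduces the distances listed on this page, such as intit at 2.

```python
VOWELS = set("aeiouy")  # assumption: y is stripped along with a, e, i, o, u


def levenshtein(a: str, b: str) -> int:
    """Standard edit distance (replace, insert, delete)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Return the word with every assumed vowel removed."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the consonant skeletons."""
    full = levenshtein(a, b)
    consonants_only = levenshtein(strip_vowels(a), strip_vowels(b))
    return full + consonants_only


for neighbour in ("enmity", "intit", "entice", "natty"):
    print(neighbour, double_levenshtein("entity", neighbour))
```

A changed consonant is counted in both passes and a changed vowel in only one, which is why enmity moves from 1 to 2 while intit stays at 2.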
Time to execute SoundEx function - 0.028014 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
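For readers who want to see how such a code is built, here is a minimal Python sketch of the common American Soundex rules: keep the first letter, map the remaining consonants to digits, skip repeats, and pad to four characters. That the site uses exactly this variant is an assumption, although the sketch does produce E533 for entity.

```python
# Digit groups used by American Soundex; vowels, h, w and y get no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word: str) -> str:
    """Return a four-character Soundex code, or "" if there are no letters."""
    # Drop hyphens, apostrophes and anything else that is not a-z.
    letters = [ch for ch in word.lower() if ch.isascii() and ch.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate two letters with the same digit
        code = CODES.get(ch, "")
        if code and code != previous:
            result += code
            if len(result) == 4:
                break
        previous = code  # a vowel resets this, so a repeated digit can follow
    return result.ljust(4, "0")


for word in ("entity", "endit", "emitted", "een-tide"):
    print(word, soundex(word))
```

Because only the first letter and three digits are kept, long words such as entitlement collapse onto the same code as entity, which explains the length of the SoundEx list.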
Time to execute MetaPhone function - 0.146038 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation.[1] It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names which sound similar.
Time to execute Manually curated function - 0.001100 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas, including obvious typos and spelling variations. Not all words are covered.
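A lookup of this kind amounts to a dictionary from word form to lemma, which also explains why it is the fastest method timed above. The Python sketch below is only an illustration: the name `curated_lemma` is made up, and apart from the entity/ENTITY pairing shown on this page the entries are hypothetical.

```python
from typing import Optional

# Hand-built table of word form -> lemma.
LEXICON = {
    "entity": "ENTITY",    # pairing shown on this page
    "entities": "ENTITY",  # hypothetical entry (plural form)
    "etnty": "ENTITY",     # hypothetical entry (typo)
}


def curated_lemma(word: str) -> Optional[str]:
    """Return the curated lemma, or None when the word is not covered."""
    return LEXICON.get(word.lower())


print(curated_lemma("entity"))
print(curated_lemma("ahint"))
```

Returning None for a missing word keeps "not covered" distinct from any real lemma, so a caller could fall back to one of the algorithmic methods.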