A Corpus of 21st Century Scots Texts


Levenshtein Distance

Enter a word to find nearest neighbouring words, for example sonsie


Similar words to entities in Corpus

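Results are listed by method, in this order: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated. In the two Levenshtein lists, the number in brackets is the distance from the search word.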
entities (0) - 15 freq
identities (2) - 18 freq
entitilt (2) - 1 freq
entitlet (2) - 8 freq
entitled (2) - 26 freq
entries (2) - 35 freq
densities (2) - 1 freq
entitlit (3) - 1 freq
tithes (3) - 2 freq
fifities (3) - 1 freq
titties (3) - 1 freq
eichties (3) - 3 freq
identitie (3) - 17 freq
ratties (3) - 2 freq
densitie (3) - 1 freq
anxieties (3) - 2 freq
catties (3) - 3 freq
amenities (3) - 4 freq
menties (3) - 1 freq
entrails (3) - 3 freq
antiques (3) - 5 freq
ettie (3) - 4 freq
nasties (3) - 1 freq
enticin (3) - 1 freq
penalties (3) - 11 freq
entities (0) - 15 freq
nutties (3) - 1 freq
entries (3) - 35 freq
identities (3) - 18 freq
bitties (4) - 29 freq
estatis (4) - 1 freq
unites (4) - 4 freq
butties (4) - 1 freq
invites (4) - 11 freq
untiques (4) - 3 freq
estaitis (4) - 4 freq
potties (4) - 1 freq
fatties (4) - 1 freq
cutties (4) - 3 freq
noties (4) - 2 freq
inititives (4) - 1 freq
ditties (4) - 1 freq
tatties (4) - 195 freq
totties (4) - 8 freq
estates (4) - 19 freq
intries (4) - 1 freq
nasties (4) - 1 freq
titties (4) - 1 freq
inimities (4) - 1 freq
entity (4) - 17 freq
SoundEx code - E533
een-tide - 1 freq
endit - 55 freq
ended - 72 freq
eemitate - 2 freq
endow'd - 1 freq
entitlet - 8 freq
eemitates - 1 freq
entitlement's - 1 freq
emitted - 2 freq
emediatly - 2 freq
endid - 4 freq
endowed - 2 freq
entitled - 26 freq
entitlement - 12 freq
end-o-term - 1 freq
entitlement-obsessit - 1 freq
entitlement-driven - 2 freq
entity - 17 freq
endet - 42 freq
entitled' - 1 freq
endyte - 2 freq
entities - 15 freq
eimitate - 1 freq
enteetled - 3 freq
endite - 1 freq
e'entide - 1 freq
eynded - 1 freq
eimitator - 1 freq
eyndit - 2 freq
entitlements - 1 freq
entitilt - 1 freq
endytit - 1 freq
enteitilt - 1 freq
entitelt - 10 freq
eemitatin - 1 freq
endowit - 1 freq
entitlit - 1 freq
endeth - 1 freq
endoits - 1 freq
entada - 1 freq
MetaPhone code - ENTTS
entities - 15 freq
endoits - 1 freq
Manually curated - ENTITIES
Time to execute Levenshtein function - 0.174721 milliseconds
The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.
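As an illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming calculation. It is not the code this site runs (that is not shown here), and the function name `levenshtein` is simply a choice made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character edits needed to turn a into b."""
    # Keep the shorter word in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the first i-1 letters of a
    # and the first j letters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete a letter from a
                current[j - 1] + 1,      # insert a letter into a
                previous[j - 1] + cost,  # replace (or keep) a letter
            ))
        previous = current
    return previous[-1]


print(levenshtein("entities", "entities"))    # 0
print(levenshtein("entities", "identities"))  # 2
print(levenshtein("entities", "entries"))     # 2
```

The three values agree with the distances in brackets in the Levenshtein list for the same words.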
Time to execute Double Levenshtein function - 0.347311 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the whole words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
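The idea can be sketched in Python as follows. This is a guess at the approach rather than the site's own code: the names `double_levenshtein` and `strip_vowels` are invented for the example, and the vowel set (a, e, i, o, u) is an assumption, since the description does not say which letters count as vowels.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


VOWELS = set("aeiou")  # assumption: the site may use a different set


def strip_vowels(word: str) -> str:
    """Drop the vowels, leaving the consonant skeleton of the word."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the whole words plus distance on the consonants only."""
    whole = levenshtein(a, b)
    consonants = levenshtein(strip_vowels(a), strip_vowels(b))
    return whole + consonants


print(double_levenshtein("entities", "identities"))  # 2 + 1 = 3
print(double_levenshtein("entities", "entries"))     # 2 + 1 = 3
```

Both results match the figures in the Double Levenshtein list, but that does not prove the site strips exactly the same letters.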
Time to execute SoundEx function - 0.028487 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
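A short Python sketch of the usual American Soundex rules shows how a word such as entities ends up as E533. The site's implementation is not shown here, so treat this as an approximation; the function name `soundex` and the handling of hyphens and apostrophes (they are simply dropped) are assumptions.

```python
# Consonant groups that share a Soundex digit.
GROUPS = (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
          ("l", "4"), ("mn", "5"), ("r", "6"))
CODES = {ch: digit for letters, digit in GROUPS for ch in letters}


def soundex(word: str) -> str:
    """Return the four-character Soundex code, e.g. 'E533'."""
    # Assumption: anything that is not an ASCII letter is ignored.
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate repeated codes
        code = CODES.get(ch, "")
        if code and code != previous:
            result += code
            if len(result) == 4:
                break
        previous = code  # a vowel resets this, so a repeat is coded again
    return result.ljust(4, "0")


print(soundex("entities"))  # E533
print(soundex("endit"))     # E533
print(soundex("een-tide"))  # E533
```

Only the first letter and the first three consonant codes survive, which is why words as different as endit and entitlement-obsessit share the code E533.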
Time to execute MetaPhone function - 0.037251 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation.[1] It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names which sound similar.
Time to execute Manually curated function - 0.000864 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
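A lookup of this kind can be sketched in Python as a plain dictionary. Everything in the example is hypothetical: the name `lookup_lemma`, the table `LEXICON` and its entries are invented for illustration and are not taken from the site's lexicon.

```python
from typing import Optional

# Hypothetical entries: spelling variants mapped to an upper-case lemma.
LEXICON = {
    "entitlet": "ENTITLE",
    "entitilt": "ENTITLE",
    "entitlit": "ENTITLE",
    "entitled": "ENTITLE",
}


def lookup_lemma(word: str) -> Optional[str]:
    """Return the curated lemma, or None if the word is not covered."""
    return LEXICON.get(word.lower())


print(lookup_lemma("Entitlet"))  # ENTITLE
print(lookup_lemma("sonsie"))    # None, not in this toy table
```

A single dictionary lookup does no string comparison against the rest of the corpus, which fits with this method having the shortest execution time of the five.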