A Corpus of 21st Century Scots Texts


Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ahint


Similar words to collect in Corpus

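Results by method, in order: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated. Bracketed numbers are distances from the search word.

Levenshtein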
collect (0) - 28 freq
collecit (1) - 2 freq
collects (1) - 2 freq
colleck (1) - 11 freq
colleg (2) - 2 freq
collector (2) - 7 freq
collectit (2) - 8 freq
colleckit (2) - 5 freq
yollert (2) - 2 freq
collekit (2) - 1 freq
correct (2) - 39 freq
collecks (2) - 1 freq
connect (2) - 15 freq
gollert (2) - 7 freq
collectin (2) - 16 freq
collectan (2) - 2 freq
collart (2) - 1 freq
coolest (2) - 1 freq
collit (2) - 1 freq
coreect (2) - 1 freq
collected (2) - 16 freq
college (2) - 113 freq
colled (2) - 2 freq
colvend (3) - 2 freq
callt (3) - 14 freq
Double Levenshtein
collect (0) - 28 freq
collecit (1) - 2 freq
colleck (2) - 11 freq
collects (2) - 2 freq
collit (3) - 1 freq
collectin (3) - 16 freq
collectan (3) - 2 freq
collekit (3) - 1 freq
colleckit (3) - 5 freq
collectit (3) - 8 freq
collected (3) - 16 freq
collart (3) - 1 freq
collector (3) - 7 freq
cullecten (4) - 1 freq
cellic (4) - 11 freq
collective (4) - 31 freq
callant (4) - 21 freq
callit (4) - 3 freq
collatit (4) - 3 freq
callt (4) - 14 freq
cellist (4) - 1 freq
collate (4) - 1 freq
collection (4) - 81 freq
gollert (4) - 7 freq
coolest (4) - 1 freq
SoundEx code - C423
clocked - 32 freq
cleekit - 23 freq
claggit - 6 freq
clased - 2 freq
claucht - 53 freq
cloacked - 11 freq
cleeked - 11 freq
cleckit - 6 freq
clasht - 4 freq
chiels-that - 1 freq
collect - 28 freq
clicked - 12 freq
chalked - 3 freq
collected - 16 freq
closed - 126 freq
cleekt - 2 freq
calloused - 2 freq
clouston - 4 freq
celeste - 3 freq
claustrophobic - 1 freq
collection - 81 freq
cloakit - 2 freq
cloggit - 2 freq
clickt - 2 freq
claesed - 2 freq
collective - 31 freq
closet - 9 freq
collectin't - 1 freq
clookit - 2 freq
closset - 1 freq
collections - 26 freq
clauchts - 2 freq
cloister - 2 freq
claikit - 2 freq
cleikit - 12 freq
clusters - 4 freq
celestal - 1 freq
cluster - 5 freq
cullecten - 1 freq
clocket - 1 freq
collegiate - 1 freq
collectit - 8 freq
celestial - 4 freq
cloistert - 2 freq
collogued - 4 freq
colgate - 2 freq
collectin - 16 freq
cloaked - 2 freq
clauchtin - 5 freq
collectors - 9 freq
claised - 1 freq
calcutta - 5 freq
collector - 7 freq
clockit - 2 freq
claustrophobia - 2 freq
coolest - 1 freq
collects - 2 freq
clacked - 1 freq
clashed - 4 freq
'closed' - 1 freq
clickit - 7 freq
colleckit - 5 freq
collectors' - 1 freq
cailst - 3 freq
chalkt - 1 freq
coalcutting - 1 freq
clouston's - 1 freq
clestered - 1 freq
cellist - 1 freq
coal-shade - 1 freq
claught - 49 freq
claught-warkin - 2 freq
clossed - 2 freq
clustered - 1 freq
clousta - 3 freq
clagged - 1 freq
cloistered - 1 freq
clestrain - 1 freq
collectioin - 1 freq
collectin' - 2 freq
collectives - 1 freq
collectively - 4 freq
co-locate - 1 freq
clessed - 1 freq
cloisters' - 1 freq
clostridium - 1 freq
cleuk-tipt - 1 freq
class-drookit - 1 freq
'collected - 1 freq
claiked - 1 freq
clegg-iddergaits - 1 freq
claacht - 1 freq
cleik't - 1 freq
clekkit - 1 freq
clickety - 1 freq
'collectit - 1 freq
coal-cuttin - 1 freq
colloguit - 1 freq
callisto - 2 freq
collecktive - 1 freq
coalesced - 1 freq
collekit - 1 freq
closeted - 1 freq
clister - 1 freq
cleukit - 2 freq
clauchit - 1 freq
celsitud - 1 freq
collecting - 4 freq
clockt - 1 freq
collectan - 2 freq
clogged-up - 1 freq
collecit - 2 freq
chalk-stour - 1 freq
cleshed - 1 freq
clushet - 12 freq
cliched - 1 freq
collocations - 2 freq
cloased - 1 freq
classed - 4 freq
clock-tower - 1 freq
collieston - 1 freq
cleg-tipping - 1 freq
clusterbourach - 1 freq
closetaehame - 1 freq
clichéd - 1 freq
clstevenson - 1 freq
MetaPhone code - KLKT
glaikit - 144 freq
clocked - 32 freq
cleekit - 23 freq
claggit - 6 freq
cloacked - 11 freq
cleeked - 11 freq
cleckit - 6 freq
collect - 28 freq
clicked - 12 freq
cleekt - 2 freq
glaiket - 9 freq
cloakit - 2 freq
cloggit - 2 freq
glakit - 4 freq
clickt - 2 freq
clookit - 2 freq
gleekt - 2 freq
claikit - 2 freq
cleikit - 12 freq
clocket - 1 freq
collogued - 4 freq
colgate - 2 freq
cloaked - 2 freq
calcutta - 5 freq
clockit - 2 freq
glogged - 3 freq
clacked - 1 freq
glugged - 2 freq
gleckit - 1 freq
clickit - 7 freq
colleckit - 5 freq
glaikid - 1 freq
clagged - 1 freq
glekkid - 1 freq
co-locate - 1 freq
claiked - 1 freq
glackte - 1 freq
glaikit-' - 1 freq
cleik't - 1 freq
gallowgate - 6 freq
clekkit - 1 freq
clickety - 1 freq
colloguit - 1 freq
quilkit - 2 freq
glekit - 4 freq
collekit - 1 freq
cleukit - 2 freq
glekkit - 2 freq
clockt - 1 freq
gluggit - 1 freq
glaickit - 2 freq
glig-eed - 1 freq
“glaikit - 1 freq
glecket - 1 freq
Manually curated lemma - COLLECT
Time to execute Levenshtein function - 0.191279 milliseconds
The Levenshtein distance is the number of single-character replacements, insertions or deletions needed to transform one word into another. It is useful for detecting typos and alternative spellings.
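As an illustration of the definition, here is a minimal Python sketch of the distance and of ranking a word list by it. It is not the code this site runs; the names `levenshtein` and `nearest` and the tiny vocabulary are assumptions made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character edits needed to turn a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # replace ca with cb (free if equal)
            ))
        previous = current
    return previous[-1]


def nearest(word, vocabulary, limit=25):
    """Return (distance, word) pairs, closest first (hypothetical helper)."""
    return sorted((levenshtein(word, other), other) for other in vocabulary)[:limit]


if __name__ == "__main__":
    # Illustrative vocabulary taken from the listing above.
    for distance, other in nearest("collect", ["college", "colleck", "callt", "collect"]):
        print(f"{other} ({distance})")
```

Ties within the same distance are broken alphabetically here, which is a choice of the sketch and not necessarily the order used in the listing.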
Time to execute Double Levenshtein function - 0.313374 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
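The doubling can be sketched in Python as below. This is only a reading of the description, under the assumption that "vowels" means the letters a, e, i, o and u; the function names are made up for the example.

```python
VOWELS = set("aeiou")  # assumption: the source does not list which letters are dropped


def levenshtein(a: str, b: str) -> int:
    """Plain edit distance (insert, delete, replace), computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (ca != cb)))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Keep only the consonant skeleton, e.g. 'collect' becomes 'cllct'."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-less words."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


if __name__ == "__main__":
    for other in ["collecit", "colleck", "collit", "callt"]:
        print(other, double_levenshtein("collect", other))
```

Under that assumption the sketch gives collecit 1, colleck 2, collit 3 and callt 4, in line with the Double Levenshtein listing: a changed vowel costs one point, a changed consonant costs two.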
Time to execute SoundEx function - 0.027164 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
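For readers who want to see how a code such as C423 arises, here is a minimal Python sketch of the commonly described American Soundex rules. It is an assumption that the site follows exactly these rules; the function name and the handling of non-letters (they are simply skipped) are choices made for the example.

```python
# Digit assigned to each consonant group; vowels, h, w and y get no digit.
CODES = {}
for group, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                     ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in group:
        CODES[letter] = digit


def soundex(word: str) -> str:
    """Return the first letter plus three digits, e.g. 'collect' -> 'C423'."""
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]  # skip hyphens, quotes, accents
    if not letters:
        return ""
    code = letters[0].upper()
    last = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate two letters with the same digit
        digit = CODES.get(ch, "")
        if digit and digit != last:
            code += digit
            if len(code) == 4:
                break
        last = digit  # a vowel resets this, so a repeated digit after it counts again
    return code.ljust(4, "0")  # pad short codes with zeros


if __name__ == "__main__":
    for word in ["collect", "clocked", "closed", "chiels-that"]:
        print(word, soundex(word))
```

All four sample words come out as C423, which is why they share a bucket in the listing even though they are unrelated in meaning.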
Time to execute MetaPhone function - 0.036948 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names which sound similar.
Time to execute Manually curated function - 0.000835 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
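A lookup of this kind might be sketched in Python as a plain dictionary. The table contents and the function names below are hypothetical: the variant entries are invented for illustration and are not taken from the site's actual lexicon.

```python
from typing import List, Optional

# Hypothetical hand-made lexicon: surface form -> lemma.
# These entries are illustrative only, not the site's real table.
LEXICON = {
    "collect": "COLLECT",
    "colleck": "COLLECT",
    "collectit": "COLLECT",
}


def lemma_for(word: str) -> Optional[str]:
    """Return the curated lemma, or None when the word is not covered."""
    return LEXICON.get(word.lower())


def variants_of(lemma: str) -> List[str]:
    """List every surface form the lexicon links to the given lemma."""
    return sorted(form for form, target in LEXICON.items() if target == lemma)


if __name__ == "__main__":
    print(lemma_for("colleck"))
    print(lemma_for("ahint"))  # not in this toy table, so None
    print(variants_of("COLLECT"))
```

Because each query is a single dictionary lookup with no string comparison against the rest of the corpus, a table like this would be expected to answer much faster than the distance-based methods, at the cost of covering only the words someone has entered by hand.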