A Corpus of 21st Century Scots Texts

Levenshtein Distance

Similar words to factories in Corpus

Methods: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated
Levenshtein distance
factories (0) - 12 freq
factorie (1) - 2 freq
faktories (1) - 1 freq
factory's (2) - 1 freq
factors (2) - 21 freq
fectorie (2) - 1 freq
victories (2) - 5 freq
fairies (3) - 39 freq
factort (3) - 1 freq
fantasies (3) - 8 freq
tories (3) - 117 freq
faictors (3) - 3 freq
wattries (3) - 2 freq
factory' (3) - 2 freq
facies (3) - 1 freq
calories (3) - 11 freq
stories (3) - 359 freq
histories (3) - 17 freq
fasheries (3) - 1 freq
arteries (3) - 5 freq
actor's (3) - 1 freq
factory (3) - 112 freq
victorie (3) - 9 freq
faeries (3) - 14 freq
chories (3) - 1 freq
Double Levenshtein distance
factories (0) - 12 freq
factors (2) - 21 freq
faktories (2) - 1 freq
factorie (2) - 2 freq
victories (3) - 5 freq
faictors (3) - 3 freq
fectorie (3) - 1 freq
factory's (3) - 1 freq
factory' (4) - 2 freq
fautors (4) - 1 freq
victries (4) - 1 freq
actors (4) - 31 freq
factort (4) - 1 freq
factory (4) - 112 freq
fatures (4) - 1 freq
factor (4) - 31 freq
sectors (5) - 13 freq
facts (5) - 46 freq
victorie's (5) - 1 freq
fixtures (5) - 9 freq
lavatories (5) - 1 freq
doctors (5) - 33 freq
wastries (5) - 1 freq
victors (5) - 1 freq
lectures (5) - 8 freq
SoundEx code - F236
faster - 77 freq
factory - 112 freq
factory' - 2 freq
fighters - 4 freq
foster - 19 freq
fechter - 20 freq
fechters - 6 freq
fighter - 9 freq
fester - 16 freq
feuchters - 1 freq
factories - 12 freq
fisther - 1 freq
fectorie - 1 freq
fectory - 6 freq
fosterer - 3 freq
festerin - 5 freq
faister - 16 freq
factery - 1 freq
factor - 31 freq
fosterit - 1 freq
feexturs - 4 freq
foxtrot - 2 freq
factory's - 1 freq
'faster - 3 freq
foster's - 1 freq
festers - 1 freq
fig-tree - 3 freq
fixture - 4 freq
festered - 3 freq
factors - 21 freq
faaster - 6 freq
fixtures - 9 freq
fixed-term - 1 freq
faictors - 3 freq
faictor - 3 freq
fechters' - 2 freq
faktories - 1 freq
fistir - 1 freq
factorie - 2 freq
fichterin - 1 freq
faister' - 1 freq
fostirit - 1 freq
fosterin - 2 freq
factort - 1 freq
fostert - 1 freq
fister - 2 freq
fowk-dramas - 1 freq
fichter - 1 freq
fosteringnet - 1 freq
fcctrust - 1 freq
fcxdr - 1 freq
foxtrots - 1 freq
MetaPhone code - FKTRS
victor's - 2 freq
factories - 12 freq
victorious - 3 freq
victorie's - 1 freq
victries - 1 freq
factory's - 1 freq
factors - 21 freq
faictors - 3 freq
faktories - 1 freq
veectorious - 1 freq
victories - 5 freq
victors - 1 freq
Manually curated - FACTORIES
Time to execute Levenshtein function - 0.342677 milliseconds
The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.
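The definition above can be sketched in a few lines of Python. This is a minimal illustration using the standard dynamic-programming approach, not necessarily the function this site runs; the name `levenshtein` is chosen here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character replacements, insertions and deletions
    needed to turn `a` into `b`."""
    # `previous` holds the distances between the prefix of `a` processed
    # so far and every prefix of `b`.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # replace ch_a with ch_b, or keep a match
            ))
        previous = current
    return previous[-1]


print(levenshtein("factories", "factorie"))   # one deletion
print(levenshtein("factories", "victories"))  # two replacements
```

The distances this produces agree with the bracketed figures in the Levenshtein list above, for example 1 for factorie and 3 for factory.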
Time to execute Double Levenshtein function - 0.611245 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
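A rough Python sketch of this idea follows. It is an illustration under assumptions, not the site's own code: the names `levenshtein`, `strip_vowels` and `double_levenshtein` are made up here, and treating y as a vowel is an assumption, chosen because it reproduces the scores listed on this page (for example 3 for factory's).

```python
def levenshtein(a: str, b: str) -> int:
    """Plain edit distance: replacements, insertions and deletions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


# Assumption: y counts as a vowel (see the note above the code).
VOWELS = set("aeiouy")


def strip_vowels(word: str) -> str:
    """Return the word with its vowels removed, leaving the consonant skeleton."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the consonant skeletons."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


for word in ("factors", "faktories", "factory's", "factory"):
    print(word, double_levenshtein("factories", word))
```

Because a vowel change only affects the first distance while a consonant change affects both, factors (same consonants, two vowels fewer) stays at 2 while faktories (one consonant changed) rises from 1 to 2.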
Time to execute SoundEx function - 0.028596 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
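As an illustration, here is a minimal Python sketch of the common American Soundex rules: keep the first letter, encode the following consonants as digits, collapse repeats, and pad to four characters. Whether the site's SoundEx function handles every edge case the same way is an assumption; the names `SOUNDEX_CODES` and `soundex` are chosen here for illustration.

```python
# Consonant groups of American Soundex; vowels, h, w and y have no digit.
SOUNDEX_CODES = {
    **dict.fromkeys("bfpv", "1"),
    **dict.fromkeys("cgjkqsxz", "2"),
    **dict.fromkeys("dt", "3"),
    "l": "4",
    **dict.fromkeys("mn", "5"),
    "r": "6",
}


def soundex(word: str) -> str:
    """Return the four-character Soundex code of `word` ('' if it has no letters)."""
    # Only the letters a-z take part; apostrophes, hyphens etc. are ignored.
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    code = letters[0].upper()
    last = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate two letters with the same digit
        digit = SOUNDEX_CODES.get(ch, "")
        if digit and digit != last:
            code += digit
        last = digit  # a vowel resets `last`, so the same digit may follow it
        if len(code) == 4:
            break
    return code.ljust(4, "0")  # pad short codes with zeros


for word in ("factories", "fechter", "foxtrot", "fowk-dramas"):
    print(word, soundex(word))
```

All four sample words come out as F236, which is why such different-looking words share one list above: only the first letter and the first three consonant classes are compared.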
Time to execute MetaPhone function - 0.072390 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute Manually curated function - 0.000911 milliseconds
Manual Curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
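A lookup of this kind can be sketched as a plain dictionary. The table below is entirely hypothetical: the entries, the upper-case lemma convention and the names `LEMMA_TABLE`, `lookup_lemma` and `variants_of` are assumptions for illustration and are not taken from the site's actual lexicon.

```python
from typing import Optional

# Hypothetical hand-made entries mapping a spelling to its lemma.
LEMMA_TABLE = {
    "factories": "FACTORIES",
    "factorie": "FACTORIES",
    "faktories": "FACTORIES",
}


def lookup_lemma(word: str) -> Optional[str]:
    """Return the curated lemma for `word`, or None if the word is not covered."""
    return LEMMA_TABLE.get(word.lower())


def variants_of(word: str) -> list:
    """Return every spelling in the table that shares a lemma with `word`."""
    lemma = lookup_lemma(word)
    if lemma is None:
        return []
    return sorted(w for w, l in LEMMA_TABLE.items() if l == lemma)


print(lookup_lemma("Faktories"))
print(variants_of("factories"))
print(lookup_lemma("sonsie"))  # not in this toy table, so None
```

A dictionary lookup involves no string comparison across the corpus, which is consistent with this method having the shortest execution time reported above.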