A Corpus of 21st Century Scots Texts


Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ahint


Similar words to bookshop in Corpus

Methods: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated
Levenshtein (distance in parentheses)
bookshop (0) - 2 freq
beukshop (2) - 2 freq
buikshop (2) - 1 freq
workshop (2) - 21 freq
wirkshop (3) - 2 freq
bishop (3) - 15 freq
bloodshot (3) - 1 freq
bookish (3) - 1 freq
toonship (3) - 1 freq
workshap (3) - 1 freq
bootbhoy (3) - 2 freq
boohoo (3) - 1 freq
workshops (3) - 5 freq
warkshop (3) - 1 freq
books' (3) - 1 freq
toyshop (3) - 1 freq
boo-hoo (3) - 2 freq
bookshelf (3) - 8 freq
books (3) - 245 freq
coortship (4) - 1 freq
booths (4) - 5 freq
bookin (4) - 5 freq
dooshin (4) - 1 freq
shop (4) - 367 freq
pooshion (4) - 1 freq
Double Levenshtein (summed distance in parentheses)
bookshop (0) - 2 freq
beukshop (2) - 2 freq
buikshop (2) - 1 freq
bookish (4) - 1 freq
bishop (4) - 15 freq
workshop (4) - 21 freq
toyshop (5) - 1 freq
bookshelf (5) - 8 freq
warkshop (5) - 1 freq
books (5) - 245 freq
books' (5) - 1 freq
toonship (5) - 1 freq
wirkshop (5) - 2 freq
workshap (5) - 1 freq
boksir (6) - 1 freq
woarship (6) - 4 freq
tounship (6) - 1 freq
boaks (6) - 3 freq
worship (6) - 35 freq
bouks (6) - 8 freq
bolshoi' (6) - 1 freq
bhoyshh (6) - 1 freq
basho (6) - 1 freq
warkshap (6) - 6 freq
bishops (6) - 8 freq
SoundEx code - B210
bicep - 2 freq
backup - 1 freq
byzeeby - 3 freq
beeeeeeeeceeeep - 1 freq
basebaw - 4 freq
bishop - 15 freq
basebaa - 1 freq
bookshop - 2 freq
bashfu - 1 freq
back-pey - 1 freq
boxfu - 1 freq
bags-up - 1 freq
bee-skep - 2 freq
base-baa - 1 freq
beukshop - 2 freq
buikshop - 1 freq
boxfoo - 1 freq
back-shoppie - 1 freq
bak-fa - 1 freq
beaucoup - 2 freq
bvcskkebe - 1 freq
bskkfv - 1 freq
bxfb - 1 freq
bukb - 1 freq
bvjxb - 1 freq
byohxbyi - 1 freq
bsb - 1 freq
bqkv - 1 freq
bxp - 1 freq
bkxb - 1 freq
bagpie - 1 freq
bugaboo - 1 freq
bhxhv - 1 freq
big-buy - 1 freq
bxb - 1 freq
bvkufffaoo - 1 freq
bcp - 1 freq
busby - 1 freq
bjv - 1 freq
MetaPhone code - BKXP
bookshop - 2 freq
beukshop - 2 freq
buikshop - 1 freq
back-shoppie - 1 freq
Manually curated - BOOKSHOP
book - 507 freq
buik - 415 freq
beuk - 108 freq
books - 245 freq
booked - 31 freq
bookie - 25 freq
bookies - 25 freq
bookbug - 13 freq
bookshelf - 8 freq
booking - 5 freq
bookie's - 12 freq
booklet - 5 freq
bookin - 5 freq
bookit - 3 freq
bookshop - 2 freq
bookings - 2 freq
book's - 2 freq
bookbug - 13 freq
bookcase - 2 freq
buiks - 165 freq
buik's - 10 freq
buikie - 10 freq
buik-lear - 2 freq
buik-shops - 2 freq
beuks - 54 freq
beukie - 7 freq
beukstores - 2 freq
beukit - 3 freq
Time to execute Levenshtein function - 0.292305 milliseconds
The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.
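As an illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming edit distance. It is not this site's own code; the function name `levenshtein` is an assumption chosen for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the number of single-character replacements, insertions
    and deletions needed to turn `a` into `b`."""
    # Keep the shorter word in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the first i-1 chars of a and first j chars of b
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning i characters into an empty prefix costs i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # replace ca with cb (or keep it)
            ))
        previous = current
    return previous[-1]


print(levenshtein("bookshop", "beukshop"))  # 2
print(levenshtein("bookshop", "workshop"))  # 2
print(levenshtein("bookshop", "bishop"))    # 3
```

The three printed values agree with the distances shown in the Levenshtein list for bookshop.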
Time to execute Double Levenshtein function - 0.625069 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
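The idea can be sketched in Python as follows. This is an illustrative reconstruction, not the site's implementation: the names `double_levenshtein` and `strip_vowels` are hypothetical, and treating only a, e, i, o, u as vowels is an assumption (it happens to reproduce the scores listed for bookshop).

```python
def levenshtein(a: str, b: str) -> int:
    """Plain edit distance (replace, insert, delete each cost 1)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


VOWELS = set("aeiou")  # assumption: y and accented letters count as consonants


def strip_vowels(word: str) -> str:
    """Drop the assumed vowels, keeping everything else in order."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-less words."""
    full = levenshtein(a, b)
    consonants_only = levenshtein(strip_vowels(a), strip_vowels(b))
    return full + consonants_only


print(double_levenshtein("bookshop", "beukshop"))  # 2 (2 + 0)
print(double_levenshtein("bookshop", "workshop"))  # 4 (2 + 2)
print(double_levenshtein("bookshop", "bookish"))   # 4 (3 + 1)
```

Because beukshop differs from bookshop only in its vowels, the second pass adds nothing and it keeps a score of 2, while workshop, which differs in consonants, is pushed from 2 to 4.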
Time to execute SoundEx function - 0.072022 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
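To make the encoding concrete, here is a minimal Python sketch of the common American Soundex rules: keep the first letter, map the remaining consonants to digits, collapse runs of the same digit, and pad to four characters. It is an assumption that the site follows these exact rules. The choice to ignore non-letters such as hyphens is also an assumption, made so that hyphenated entries in the list above come out with the same code.

```python
# Digit groups of American Soundex; vowels, y, h and w carry no digit.
CODES = {}
for group, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                     ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in group:
        CODES[letter] = digit


def soundex(word: str) -> str:
    """Return a four-character Soundex code, or '' if there are no letters."""
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    first = letters[0]
    out = [first.upper()]
    prev = CODES.get(first, "")  # the first letter's digit also suppresses repeats
    for ch in letters[1:]:
        if ch in "hw":
            continue             # h and w do not separate equal digits
        code = CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code              # a vowel resets prev, so a repeat is coded again
        if len(out) == 4:
            break
    return "".join(out).ljust(4, "0")


print(soundex("bookshop"))  # B210
print(soundex("bishop"))    # B210
print(soundex("bicep"))     # B210
```

Since k and s share the digit 2 and the h between s and the following vowel is skipped, bookshop reduces to B, 2, 1 and is padded to B210, which is why unrelated words such as bicep and bishop land in the same bucket.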
Time to execute MetaPhone function - 0.088356 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute Manually curated function - 0.001125 milliseconds
Manual Curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
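A lookup of this kind can be sketched in Python as a plain dictionary from word form to lemma. The table contents and the names `LEXICON`, `lemma_of` and `variants_of` are hypothetical, chosen only for illustration; the site's actual lexicon and its structure are not shown here.

```python
from typing import List, Optional

# Hypothetical hand-made entries: word form -> lemma.
LEXICON = {
    "bookshop": "bookshop",
    "beukshop": "bookshop",
    "buikshop": "bookshop",
    "buik": "book",
    "beuk": "book",
    "buiks": "book",
}


def lemma_of(word: str) -> Optional[str]:
    """Return the curated lemma, or None when the word is not covered."""
    return LEXICON.get(word.lower())


def variants_of(lemma: str) -> List[str]:
    """Return every listed word form that shares the given lemma."""
    return sorted(form for form, lem in LEXICON.items() if lem == lemma)


print(lemma_of("Beukshop"))     # bookshop
print(lemma_of("ahint"))        # None (not in this toy table)
print(variants_of("bookshop"))  # ['beukshop', 'bookshop', 'buikshop']
```

A dictionary lookup does no computation over the characters of the word, which fits the very small execution time reported for this method, and returning None makes the gaps in coverage explicit.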