A Corpus of 21st Century Scots Texts

Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ahint

Similar words to begat in Corpus

Methods: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated
Levenshtein (distance in brackets)
begat (0) - 2 freq
began (1) - 296 freq
begot (1) - 1 freq
legat (1) - 1 freq
beget (1) - 2 freq
beat (1) - 207 freq
be't (2) - 3 freq
biegt (2) - 1 freq
begoot (2) - 2 freq
beggs (2) - 1 freq
beilt (2) - 1 freq
beak (2) - 36 freq
bela (2) - 1 freq
bóat (2) - 5 freq
be-an (2) - 17 freq
béat (2) - 1 freq
eat (2) - 460 freq
bedae (2) - 1 freq
legit (2) - 4 freq
heat (2) - 161 freq
beret (2) - 3 freq
bett (2) - 4 freq
bejan (2) - 1 freq
yeat (2) - 2 freq
beggar (2) - 16 freq
Double Levenshtein (distance in brackets)
begat (0) - 2 freq
beget (1) - 2 freq
begot (1) - 1 freq
bigot (2) - 4 freq
biegt (2) - 1 freq
bgt (2) - 1 freq
begoot (2) - 2 freq
began (2) - 296 freq
beat (2) - 207 freq
legat (2) - 1 freq
bogart (3) - 1 freq
berate (3) - 1 freq
begin (3) - 104 freq
beart (3) - 1 freq
bleat (3) - 5 freq
bat (3) - 50 freq
best (3) - 1574 freq
beag (3) - 1 freq
bexit (3) - 1 freq
geeat (3) - 1 freq
blat (3) - 37 freq
brat (3) - 15 freq
boat (3) - 350 freq
beest (3) - 11 freq
begets (3) - 4 freq
SoundEx code - B230
bizzed - 3 freq
basket - 62 freq
beast - 141 freq
beekit - 2 freq
biscuit - 35 freq
best - 1574 freq
baked - 28 freq
buskit - 33 freq
bakst - 1 freq
bocht - 217 freq
bashed - 8 freq
begood - 11 freq
biggit - 244 freq
begged - 30 freq
beast- - 1 freq
bossed - 6 freq
'best - 4 freq
buzzed - 1 freq
booked - 31 freq
beside - 63 freq
biscuity - 1 freq
bucket - 75 freq
based - 83 freq
begoud - 104 freq
beastie - 36 freq
bust - 8 freq
bought - 80 freq
backed - 27 freq
boakit - 3 freq
big-ee'd - 1 freq
boked - 2 freq
b'goad - 1 freq
biased - 4 freq
baist - 80 freq
beukit - 3 freq
baest - 27 freq
backside - 26 freq
begat - 2 freq
busked - 5 freq
boucht - 5 freq
backseat - 4 freq
buist - 5 freq
baukit - 12 freq
begot - 1 freq
bucht - 4 freq
bochte - 1 freq
beset - 4 freq
busied - 1 freq
beast' - 2 freq
bakt - 1 freq
bouquet - 3 freq
bakside - 2 freq
baisket - 1 freq
beachit - 6 freq
baaket - 1 freq
beestie - 5 freq
beest - 11 freq
boxed - 7 freq
boacht - 1 freq
bestow - 1 freq
bussed - 2 freq
bigged - 30 freq
bushido - 1 freq
best' - 2 freq
bassett - 1 freq
boost - 19 freq
bagged - 4 freq
backchat - 4 freq
'baked - 1 freq
boast - 7 freq
bookit - 3 freq
busty - 1 freq
bexit - 1 freq
baised - 3 freq
backid - 1 freq
baeside - 1 freq
boxt - 1 freq
baakt - 1 freq
backit - 8 freq
baistie - 3 freq
buckhead - 2 freq
besta - 2 freq
beaked - 1 freq
behest - 2 freq
beached - 8 freq
bakit - 8 freq
bst - 2 freq
beist - 6 freq
backeth - 1 freq
boakt - 1 freq
backt - 1 freq
bouquet' - 1 freq
big-shot - 2 freq
backhaud - 1 freq
back-seat - 1 freq
bockid - 1 freq
boaked - 3 freq
beskit - 3 freq
becked - 1 freq
'biggit - 1 freq
boukit - 5 freq
bakkit - 2 freq
baste - 8 freq
biggid - 4 freq
buskid - 1 freq
bosied - 7 freq
begude - 2 freq
busta - 6 freq
bukksed - 1 freq
boght - 2 freq
beget - 2 freq
baguette - 2 freq
bow-hoched - 1 freq
bigot - 4 freq
bicht - 2 freq
begoot - 2 freq
bogged - 3 freq
begouth - 2 freq
biest - 1 freq
baessed - 5 freq
biwast - 1 freq
bisooth - 1 freq
baggit - 4 freq
baestie - 1 freq
bisset - 4 freq
biegt - 1 freq
bash't - 1 freq
backet - 8 freq
bycht - 1 freq
bigget - 1 freq
bigg't - 1 freq
beckett - 3 freq
‘best - 1 freq
besty - 2 freq
biked - 1 freq
bekked - 1 freq
bauxyte - 1 freq
bekeit - 1 freq
beistie - 1 freq
bouchtie - 1 freq
bekkit - 1 freq
back-sate - 1 freq
“best - 4 freq
bookt - 3 freq
biskit - 1 freq
bisto - 1 freq
baikit - 1 freq
baesed - 1 freq
‘beukit - 1 freq
‘booked - 1 freq
bestowe - 1 freq
buckt - 1 freq
bikit - 1 freq
boggit - 1 freq
beestee - 1 freq
back-chat - 1 freq
bwisd - 1 freq
bgt - 1 freq
bbcqt - 1 freq
boycott - 2 freq
boxset - 2 freq
bbsit - 1 freq
bbctwo - 1 freq
biscotti - 1 freq
bissett - 3 freq
bsqt - 1 freq
bag'd - 1 freq
MetaPhone code - BKT
beekit - 2 freq
baked - 28 freq
begood - 11 freq
biggit - 244 freq
begged - 30 freq
booked - 31 freq
bucket - 75 freq
begoud - 104 freq
backed - 27 freq
boakit - 3 freq
big-ee'd - 1 freq
boked - 2 freq
b'goad - 1 freq
beukit - 3 freq
begat - 2 freq
baukit - 12 freq
begot - 1 freq
bakt - 1 freq
bouquet - 3 freq
baaket - 1 freq
bigged - 30 freq
bagged - 4 freq
'baked - 1 freq
bookit - 3 freq
backid - 1 freq
baakt - 1 freq
backit - 8 freq
beaked - 1 freq
bakit - 8 freq
boakt - 1 freq
backt - 1 freq
bouquet' - 1 freq
bockid - 1 freq
boaked - 3 freq
becked - 1 freq
'biggit - 1 freq
boukit - 5 freq
bakkit - 2 freq
biggid - 4 freq
begude - 2 freq
baguette - 2 freq
bigot - 4 freq
begoot - 2 freq
bogged - 3 freq
baggit - 4 freq
biegt - 1 freq
backet - 8 freq
bigget - 1 freq
bigg't - 1 freq
beckett - 3 freq
biked - 1 freq
bekked - 1 freq
bekeit - 1 freq
bekkit - 1 freq
bookt - 3 freq
baikit - 1 freq
‘beukit - 1 freq
‘booked - 1 freq
buckt - 1 freq
bikit - 1 freq
boggit - 1 freq
bgt - 1 freq
boycott - 2 freq
hbctyy - 1 freq
bag'd - 1 freq
Manually curated
BEGAT
Time to execute Levenshtein function - 0.192613 milliseconds
The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.
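The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind this page: the function names `levenshtein` and `nearest`, the `max_distance` cut-off and the alphabetical tie-break are all assumptions made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character replacements, insertions and deletions
    needed to turn `a` into `b` (row-by-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # replace ca with cb, or keep a match
            ))
        previous = current
    return previous[-1]


def nearest(word, vocabulary, max_distance=2):
    """Return (distance, word) pairs within max_distance, closest first.
    Ties are broken alphabetically, which is an assumption of this sketch."""
    scored = ((levenshtein(word, other), other) for other in vocabulary)
    return sorted(pair for pair in scored if pair[0] <= max_distance)
```

Calling `nearest("begat", ["began", "eat", "bucket"])` keeps `began` and `eat` and drops `bucket`, which is more than two edits away.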
Time to execute Double Levenshtein function - 0.379022 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
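A rough sketch of that idea follows, assuming that "vowels" means only the letters a, e, i, o and u and that both words are stripped before the second comparison. The names `strip_vowels` and `double_levenshtein` are invented for the example; the site's own implementation may differ.

```python
VOWELS = set("aeiou")  # assumption: y and accented letters count as consonants


def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance (replace, insert, delete each cost 1)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (ca != cb)))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop a, e, i, o, u so that only the consonant skeleton remains."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the consonant skeletons."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))
```

Under these assumptions, begat and bigot differ only in vowels, so the second pass adds nothing and the total stays at 2, while began changes a consonant and is counted twice, moving from 1 to 2.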
Time to execute SoundEx function - 0.028369 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
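To show how different spellings end up with the same code, here is a small sketch of the classic four-character Soundex scheme. It is an illustration only: the function name `soundex` and the choice to ignore everything except the letters a to z (hyphens, apostrophes, accented letters) are assumptions, and the site may well rely on a built-in function instead.

```python
# Consonant groups of classic Soundex; vowels, y, h and w carry no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word: str) -> str:
    """Return the first letter plus three digits, padded with zeros."""
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    code = letters[0].upper()
    last = CODES.get(letters[0], "")  # the first letter's digit is not repeated
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate two letters with the same digit
        digit = CODES.get(ch, "")
        if digit and digit != last:
            code += digit
        last = digit  # a vowel resets `last`, so a repeated digit can follow
    return (code + "000")[:4]
```

With this sketch, begat, basket and big-shot all come out as B230, the code shown in the results above.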
Time to execute MetaPhone function - 0.038305 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names which sound similar.
Time to execute Manually curated function - 0.000893 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas, including obvious typos and spelling variations. Not all words are covered.
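A lookup of this kind might be sketched as a plain dictionary. Everything here is hypothetical: the names `LEXICON`, `lemma_for` and `variants_of` are invented, and the entries are placeholders apart from the begat / BEGAT pairing that this page appears to return.

```python
from typing import Optional

# Hypothetical hand-built lexicon mapping word forms to lemmas.
LEXICON = {
    "begat": "BEGAT",   # pairing suggested by the result shown on this page
    "begatt": "BEGAT",  # made-up typo, included only to illustrate the idea
}


def lemma_for(word: str) -> Optional[str]:
    """Return the curated lemma, or None when the word is not covered."""
    return LEXICON.get(word.lower())


def variants_of(lemma: str) -> list:
    """List every word form that the lexicon links to the given lemma."""
    return sorted(form for form, target in LEXICON.items() if target == lemma)
```

Because the table is a direct lookup with no distance computation, it is the cheapest of the five methods, but it can only answer for words someone has entered by hand.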