A Corpus of 21st Century Scots Texts

Levenshtein Distance

Enter a word to find nearest neighbouring words, for example sonsie

Similar words to begun in Corpus

Levenshtein Double Levenshtein SoundEx MetaPhone Manually curated
Levenshtein distance
begun (0) - 79 freq
begunk (1) - 18 freq
behun (1) - 1 freq
begen (1) - 11 freq
began (1) - 298 freq
begin (1) - 104 freq
baegun (1) - 6 freq
beg (2) - 76 freq
becin (2) - 1 freq
begs (2) - 11 freq
blun (2) - 1 freq
bejan (2) - 1 freq
begoud (2) - 104 freq
becus (2) - 1 freq
regan (2) - 1 freq
begot (2) - 1 freq
been (2) - 5175 freq
baun (2) - 18 freq
benn (2) - 1 freq
be'in (2) - 3 freq
seun (2) - 10 freq
deun (2) - 31 freq
abeun (2) - 5 freq
baegin (2) - 2 freq
begude (2) - 2 freq
Double Levenshtein distance
begun (0) - 79 freq
baegun (1) - 6 freq
began (1) - 298 freq
begin (1) - 104 freq
begen (1) - 11 freq
begunk (2) - 18 freq
begane (2) - 1 freq
bygaun (2) - 7 freq
behun (2) - 1 freq
baegin (2) - 2 freq
ben (3) - 609 freq
megan (3) - 10 freq
boun (3) - 10 freq
behin (3) - 41 freq
beggin (3) - 33 freq
bern (3) - 10 freq
beean (3) - 1 freq
bohun (3) - 2 freq
vegan (3) - 12 freq
beinn (3) - 2 freq
egan (3) - 3 freq
begg (3) - 24 freq
be-an (3) - 17 freq
bygane (3) - 45 freq
be-in (3) - 29 freq
SoundEx code - B250
began - 298 freq
beacon - 5 freq
bosom - 12 freq
became - 134 freq
become - 197 freq
begun - 79 freq
bisom - 5 freq
becam - 97 freq
bygane - 45 freq
bashin - 7 freq
backin - 41 freq
begin - 104 freq
bossin - 4 freq
biggin - 208 freq
baskin - 10 freq
boggin - 60 freq
bacon - 61 freq
bygaun - 7 freq
buskin - 11 freq
backin' - 1 freq
buzzin - 39 freq
bakin' - 4 freq
bajan - 2 freq
bowsin - 1 freq
besom - 30 freq
bizzin - 17 freq
bissum - 2 freq
basin - 22 freq
becum - 24 freq
by-gane - 3 freq
bakin - 24 freq
bokin - 5 freq
biggan - 7 freq
bookin - 5 freq
boozin - 4 freq
back-en - 15 freq
beggin - 33 freq
buckin - 3 freq
beachin - 1 freq
buzzin' - 1 freq
beckham - 2 freq
becaim - 2 freq
bicum - 2 freq
boxen - 1 freq
bicaim - 4 freq
becin - 1 freq
baaken - 1 freq
baysin - 1 freq
backhaun - 1 freq
boakin - 10 freq
biazin - 1 freq
bizzum - 6 freq
'backin - 1 freq
'bacon - 1 freq
baggin - 2 freq
bygone - 3 freq
bowsome - 1 freq
bye-gaun - 1 freq
buchan - 65 freq
bikini - 6 freq
bachin - 2 freq
boxin - 13 freq
baegin - 2 freq
baecome - 6 freq
baegun - 6 freq
baechin - 1 freq
beekin - 10 freq
begin' - 1 freq
bowchin - 1 freq
beckon - 1 freq
bison - 2 freq
boson - 1 freq
begen - 11 freq
baken - 2 freq
bókin - 1 freq
bosnia - 3 freq
bakeen - 2 freq
bissom - 2 freq
bookan - 1 freq
baggan - 2 freq
bockan - 7 freq
'biggin - 6 freq
baak-gaein - 2 freq
baakgaein - 1 freq
backson - 12 freq
backson' - 1 freq
by-gaun - 1 freq
'boggin - 3 freq
bukksin - 1 freq
bikin - 7 freq
besseen - 1 freq
baekan - 1 freq
bekeen - 1 freq
baskan - 1 freq
boxan - 1 freq
bakan - 1 freq
biggeen - 6 freq
baecum - 3 freq
bygaein - 1 freq
bekaam - 1 freq
becom - 1 freq
bash-on - 1 freq
buchan' - 1 freq
beikin - 3 freq
becchina - 4 freq
becchin' - 1 freq
bizzen - 2 freq
back-eyn - 5 freq
begane - 1 freq
bizzan - 1 freq
bigamy - 1 freq
buxin - 2 freq
boggin' - 1 freq
boasom - 1 freq
bygaen - 2 freq
becumm - 1 freq
bizzom - 2 freq
bígin - 1 freq
bosome - 1 freq
basan - 1 freq
bak-cum - 1 freq
bak-en - 1 freq
bak-cam - 1 freq
bekkin - 1 freq
basm - 1 freq
bajen - 1 freq
‘bajen - 1 freq
baakin - 1 freq
bejan - 1 freq
bessom - 1 freq
bozen - 1 freq
backhaan - 1 freq
bekam - 1 freq
bukin - 1 freq
bizzim - 1 freq
bizm - 1 freq
bye-gane - 3 freq
beukin - 1 freq
bocken - 1 freq
“beagan - 1 freq
bgm - 1 freq
beakin - 1 freq
bbcgmu - 1 freq
bigwein - 1 freq
bqom - 1 freq
bigeene - 2 freq
bakin’ - 1 freq
beggin' - 1 freq
bbccin - 1 freq
bqquinn - 1 freq
byagm - 1 freq
bbcone - 2 freq
busin - 1 freq
besm - 1 freq
busan - 1 freq
buscemi - 1 freq
besoin - 1 freq
buckihame - 1 freq
bigsam - 1 freq
bqnwou - 1 freq
MetaPhone code - BKN
began - 298 freq
beacon - 5 freq
begun - 79 freq
bygane - 45 freq
backin - 41 freq
biggin - 208 freq
boggin - 60 freq
bacon - 61 freq
bygaun - 7 freq
backin' - 1 freq
bakin' - 4 freq
by-gane - 3 freq
bakin - 24 freq
bokin - 5 freq
biggan - 7 freq
bookin - 5 freq
back-en - 15 freq
beggin - 33 freq
buckin - 3 freq
baaken - 1 freq
boakin - 10 freq
'backin - 1 freq
'bacon - 1 freq
baggin - 2 freq
bygone - 3 freq
bikini - 6 freq
baegun - 6 freq
beekin - 10 freq
beckon - 1 freq
baken - 2 freq
bókin - 1 freq
bakeen - 2 freq
bookan - 1 freq
baggan - 2 freq
bockan - 7 freq
'biggin - 6 freq
by-gaun - 1 freq
'boggin - 3 freq
bikin - 7 freq
baekan - 1 freq
bekeen - 1 freq
bakan - 1 freq
biggeen - 6 freq
bygaein - 1 freq
beikin - 3 freq
back-eyn - 5 freq
begane - 1 freq
boggin' - 1 freq
bygaen - 2 freq
bak-en - 1 freq
bekkin - 1 freq
baakin - 1 freq
bukin - 1 freq
beukin - 1 freq
bocken - 1 freq
“beagan - 1 freq
beakin - 1 freq
bakin’ - 1 freq
beggin' - 1 freq
bqquinn - 1 freq
bbcone - 2 freq
Manually curated
BEGUN
Time to execute Levenshtein function - 0.182413 milliseconds
The Levenshtein distance is the minimum number of characters you have to replace, insert or delete to transform one word into another. It is useful for detecting typos and alternative spellings.
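As an illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming approach. It is not the code this site runs; the function name `levenshtein` is chosen for the example only.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # previous[j] = distance between the prefix of a seen so far and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # replace ca with cb (free if equal)
            ))
        previous = current
    return previous[-1]


print(levenshtein("begun", "began"))   # 1
print(levenshtein("begun", "begunk"))  # 1
print(levenshtein("begun", "beg"))     # 2
```

These values agree with the distances shown in brackets in the Levenshtein list for began, begunk and beg.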
Time to execute Double Levenshtein function - 0.342270 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
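The two-pass idea can be sketched in Python as follows. This is a guess at the method, not the site's own code: the helper names are invented for the example, and the vowel set (a, e, i, o, u and y) is an assumption, chosen because counting y as a vowel reproduces the score of 2 listed for bygaun.

```python
VOWELS = set("aeiouy")  # assumption: y is treated as a vowel


def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop every character that is in the assumed vowel set."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the consonant skeletons."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


print(double_levenshtein("begun", "baegun"))  # 1
print(double_levenshtein("begun", "begunk"))  # 2
print(double_levenshtein("begun", "bygaun"))  # 2
```

A vowel-only variant such as baegun keeps its score of 1, while a consonant change such as begunk is counted in both passes and rises from 1 to 2.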
Time to execute SoundEx function - 0.028391 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
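For readers who want to see how a code such as B250 comes about, here is a minimal Python sketch of the classic American Soundex rules. It is an illustration only and may differ in edge cases from the function this site calls; in particular, dropping non-ASCII letters and punctuation before encoding is an assumption of the sketch.

```python
# Consonant groups and their Soundex digits; vowels, h, w and y have no digit.
SOUNDEX_CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        SOUNDEX_CODES[letter] = digit


def soundex(word: str) -> str:
    """Return the four-character Soundex code, or '' if word has no letters."""
    # Assumption: keep only ASCII letters (apostrophes, hyphens etc. are dropped).
    letters = [ch for ch in word.lower() if ch.isascii() and ch.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    previous = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate consonants with the same digit
        digit = SOUNDEX_CODES.get(ch, "")
        if digit and digit != previous:
            code += digit
            if len(code) == 4:
                break
        previous = digit  # a vowel resets this, so a repeated digit counts again
    return code.ljust(4, "0")  # pad short codes with zeros


print(soundex("begun"))     # B250
print(soundex("bacon"))     # B250
print(soundex("backhaun"))  # B250
```

Because only the first letter and up to three consonant digits survive, words as different as begun, bacon and backhaun collapse to the same code, which is why the SoundEx list is so much longer than the others.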
Time to execute MetaPhone function - 0.038599 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation.[1] It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names which sound similar.
Time to execute Manually curated function - 0.000899 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
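A lookup of this kind can be sketched in Python as a plain dictionary. The table below is hypothetical: only the begun to BEGUN pairing is taken from the result shown on this page, and the other entry, the name `LEXICON` and the function `lookup_lemma` are invented for illustration.

```python
from typing import Optional

# Hypothetical hand-built lexicon: spelling variant -> lemma.
LEXICON = {
    "begun": "BEGUN",   # pairing shown on this page
    "baegun": "BEGUN",  # invented example of a spelling variant
}


def lookup_lemma(word: str) -> Optional[str]:
    """Return the curated lemma for word, or None if the word is not covered."""
    return LEXICON.get(word.strip().lower())


print(lookup_lemma("Begun"))   # BEGUN
print(lookup_lemma("sonsie"))  # None
```

A dictionary lookup does no string comparison against the rest of the corpus, which fits with this method having the shortest execution time reported above.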