A Corpus of 21st Century Scots Texts

Levenshtein Distance

Words similar to "begin" in the corpus

Methods: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated
Levenshtein (distance in brackets)
begin (0) - 104 freq
begun (1) - 78 freq
bein (1) - 1744 freq
baegin (1) - 2 freq
begen (1) - 11 freq
began (1) - 296 freq
begin' (1) - 1 freq
beein (1) - 14 freq
be-in (1) - 29 freq
behin (1) - 41 freq
beggin (1) - 33 freq
aegin (1) - 1 freq
be'in (1) - 3 freq
begins (1) - 82 freq
becin (1) - 1 freq
belgian (2) - 8 freq
bruin (2) - 1 freq
ingin (2) - 18 freq
beean (2) - 1 freq
tein (2) - 3 freq
belic (2) - 1 freq
behun (2) - 1 freq
buyin (2) - 84 freq
bulgin (2) - 6 freq
beget (2) - 2 freq
Double Levenshtein (distance in brackets)
begin (0) - 104 freq
began (1) - 296 freq
baegin (1) - 2 freq
begen (1) - 11 freq
begun (1) - 78 freq
be'in (2) - 3 freq
begins (2) - 82 freq
baegun (2) - 6 freq
aegin (2) - 1 freq
becin (2) - 1 freq
begane (2) - 1 freq
begin' (2) - 1 freq
bein (2) - 1744 freq
beein (2) - 14 freq
beggin (2) - 33 freq
behin (2) - 41 freq
be-in (2) - 29 freq
baetin (3) - 4 freq
brin (3) - 3 freq
bingin (3) - 1 freq
login (3) - 2 freq
bygaein (3) - 1 freq
besoin (3) - 1 freq
beekin (3) - 10 freq
beirin (3) - 4 freq
SoundEx code - B250
began - 296 freq
beacon - 5 freq
bosom - 12 freq
became - 133 freq
become - 192 freq
begun - 78 freq
bisom - 5 freq
becam - 97 freq
bygane - 45 freq
bashin - 7 freq
backin - 41 freq
begin - 104 freq
bossin - 4 freq
biggin - 208 freq
baskin - 9 freq
boggin - 58 freq
bacon - 61 freq
bygaun - 7 freq
buskin - 11 freq
backin' - 1 freq
buzzin - 37 freq
bakin' - 4 freq
bajan - 2 freq
bowsin - 1 freq
besom - 30 freq
bizzin - 17 freq
bissum - 2 freq
basin - 22 freq
becum - 24 freq
by-gane - 3 freq
bakin - 23 freq
bokin - 5 freq
biggan - 7 freq
bookin - 5 freq
boozin - 4 freq
back-en - 15 freq
beggin - 33 freq
buckin - 3 freq
beachin - 1 freq
buzzin' - 1 freq
beckham - 2 freq
becaim - 2 freq
bicum - 2 freq
boxen - 1 freq
bicaim - 4 freq
becin - 1 freq
baaken - 1 freq
baysin - 1 freq
backhaun - 1 freq
boakin - 10 freq
bizzum - 6 freq
'backin - 1 freq
'bacon - 1 freq
baggin - 2 freq
bygone - 3 freq
bowsome - 1 freq
bye-gaun - 1 freq
buchan - 65 freq
bikini - 6 freq
bachin - 2 freq
boxin - 13 freq
baegin - 2 freq
baecome - 6 freq
baegun - 6 freq
baechin - 1 freq
beekin - 10 freq
begin' - 1 freq
bowchin - 1 freq
beckon - 1 freq
bison - 2 freq
boson - 1 freq
begen - 11 freq
baken - 2 freq
bókin - 1 freq
bosnia - 3 freq
bakeen - 2 freq
bissom - 2 freq
bookan - 1 freq
baggan - 2 freq
bockan - 7 freq
'biggin - 6 freq
baak-gaein - 2 freq
baakgaein - 1 freq
backson - 12 freq
backson' - 1 freq
by-gaun - 1 freq
'boggin - 3 freq
bukksin - 1 freq
bikin - 7 freq
besseen - 1 freq
baekan - 1 freq
bekeen - 1 freq
baskan - 1 freq
boxan - 1 freq
bakan - 1 freq
biggeen - 6 freq
baecum - 3 freq
bygaein - 1 freq
bekaam - 1 freq
becom - 1 freq
bash-on - 1 freq
buchan' - 1 freq
beikin - 3 freq
becchina - 4 freq
becchin' - 1 freq
bizzen - 2 freq
back-eyn - 5 freq
begane - 1 freq
bizzan - 1 freq
bigamy - 1 freq
buxin - 2 freq
boggin' - 1 freq
boasom - 1 freq
bygaen - 2 freq
becumm - 1 freq
bizzom - 2 freq
bígin - 1 freq
bosome - 1 freq
basan - 1 freq
bak-cum - 1 freq
bak-en - 1 freq
bak-cam - 1 freq
bekkin - 1 freq
basm - 1 freq
bajen - 1 freq
‘bajen - 1 freq
baakin - 1 freq
bejan - 1 freq
bessom - 1 freq
bozen - 1 freq
backhaan - 1 freq
bekam - 1 freq
bukin - 1 freq
bizzim - 1 freq
bizm - 1 freq
bye-gane - 3 freq
beukin - 1 freq
bocken - 1 freq
“beagan - 1 freq
bgm - 1 freq
beakin - 1 freq
bbcgmu - 1 freq
bigwein - 1 freq
bqom - 1 freq
bigeene - 2 freq
bakin’ - 1 freq
beggin' - 1 freq
bbccin - 1 freq
bqquinn - 1 freq
byagm - 1 freq
bbcone - 2 freq
busin - 1 freq
besm - 1 freq
busan - 1 freq
buscemi - 1 freq
besoin - 1 freq
buckihame - 1 freq
bigsam - 1 freq
bqnwou - 1 freq
MetaPhone code - BJN
begin - 104 freq
budgin - 3 freq
bajan - 2 freq
baegin - 2 freq
begin' - 1 freq
begen - 11 freq
bígin - 1 freq
bajen - 1 freq
‘bajen - 1 freq
bejan - 1 freq
bigeene - 2 freq
Manually curated - BEGIN
Time to execute Levenshtein function - 0.207194 milliseconds
The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It is useful for detecting typos and alternative spellings.
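The definition above can be sketched in Python with the usual row-by-row dynamic programme. This is an illustration only, not the code this page runs; the function name `levenshtein` is an assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character replacements, insertions
    or deletions needed to turn a into b."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning i characters into "" costs i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # replace ca with cb, or keep it
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    for word in ("begin", "begun", "bein", "belgian", "beget"):
        print(word, levenshtein("begin", word))
```

Run against the words listed above, it gives the same distances as the page: 1 for begun and bein, 2 for belgian and beget.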
Time to execute Double Levenshtein function - 0.348823 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the whole words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
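A minimal sketch of that idea, assuming the vowel set is "aeiouy". Treating y as a vowel is a guess, chosen because it reproduces the listed score of 3 for bygaein. The names `double_levenshtein` and `strip_vowels` are hypothetical.

```python
VOWELS = set("aeiouy")  # assumption: y counts as a vowel


def levenshtein(a: str, b: str) -> int:
    """Plain edit distance (replace, insert, delete)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Keep everything except the assumed vowels."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the consonant skeletons."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


if __name__ == "__main__":
    for word in ("began", "bein", "login", "bygaein"):
        print(word, double_levenshtein("begin", word))
```

Under these assumptions began scores 1 (only a vowel changes), while bein scores 2 because the missing g is counted in both passes.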
Time to execute SoundEx function - 0.028282 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
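To show how so many different spellings end up under B250, here is a sketch of the classic four-character Soundex rules. It is not necessarily the implementation used here. Skipping anything that is not an ASCII letter (hyphens, apostrophes, accented vowels) is an assumption, though it matches the listed results for words such as baak-gaein.

```python
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word: str) -> str:
    """First letter plus up to three digits, padded with zeros."""
    # Assumption: characters outside A-Z are ignored entirely.
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    last = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(c, "")  # vowels and Y have no code
        if code and code != last:
            result += code
        last = code  # a vowel resets the "same code" check
        if len(result) == 4:
            break
    return result.ljust(4, "0")


if __name__ == "__main__":
    for word in ("begin", "beckham", "bikini", "baak-gaein"):
        print(word, soundex(word))
```

Because vowels are dropped and g, k, c, s and z all share the code 2, words as far apart as bikini and beckham collapse onto the same code as begin.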
Time to execute MetaPhone function - 0.040993 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation.[1] It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names which sound similar.
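Full Metaphone has far more rules than fit in a short example. The sketch below is a deliberately tiny subset, just enough to show why the words listed under BJN share a code. The function `metaphone_subset` is hypothetical; it raises an error on any letter pattern it does not cover rather than guess at a code.

```python
VOWELS = set("AEIOU")
FRONT_VOWELS = set("EIY")  # G and DG soften to J before these


def metaphone_subset(word: str) -> str:
    """Toy subset of Metaphone: B, J, N, soft G and soft DG only."""
    # Assumption: characters outside A-Z are ignored.
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    out = []
    i = 0
    while i < len(letters):
        c = letters[i]
        nxt = letters[i + 1] if i + 1 < len(letters) else ""
        nxt2 = letters[i + 2] if i + 2 < len(letters) else ""
        if c in VOWELS:
            if i == 0:
                out.append(c)  # vowels are kept only at the start
        elif c in "BJN":
            out.append(c)
        elif c == "G" and nxt in FRONT_VOWELS:
            out.append("J")  # soft G, as in "begin"
        elif c == "D" and nxt == "G" and nxt2 in FRONT_VOWELS:
            out.append("J")  # DG before a front vowel, as in "budgin"
            i += 1           # skip the G
        else:
            raise ValueError(f"pattern at {c!r} not covered by this sketch")
        i += 1
    return "".join(out)


if __name__ == "__main__":
    for word in ("begin", "budgin", "bajan", "bigeene"):
        print(word, metaphone_subset(word))
```

Unlike Soundex, the soft g is kept distinct from k and s, which is why this list is so much shorter than the B250 list.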
Time to execute Manually curated function - 0.000879 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
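A lookup of this kind amounts to a dictionary from spelling to lemma. In the sketch below only the pair begin and BEGIN comes from this page. The other entries, the name `LEXICON` and the function `lookup_lemma` are hypothetical placeholders, not the contents of the real table.

```python
from typing import Optional

# Hypothetical lexicon: spelling -> lemma. Only "begin" -> "BEGIN" is taken
# from the page; the remaining entries are invented for illustration.
LEXICON = {
    "begin": "BEGIN",
    "begun": "BEGIN",   # hypothetical entry
    "baegin": "BEGIN",  # hypothetical entry
}


def lookup_lemma(word: str) -> Optional[str]:
    """Return the curated lemma, or None when the word is not covered."""
    return LEXICON.get(word.lower())


if __name__ == "__main__":
    print(lookup_lemma("begin"))   # BEGIN
    print(lookup_lemma("sonsie"))  # None: not in this toy table
```

A plain dictionary lookup does no distance or phonetic computation at all, which fits this method having the shortest execution time reported above.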