A Corpus of 21st Century Scots Texts


Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ahint


Similar words to ait in Corpus

Methods: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated
Levenshtein (distance in brackets)
ait (0) - 138 freq
att (1) - 451 freq
aith (1) - 23 freq
oit (1) - 1 freq
ant (1) - 8 freq
wait (1) - 477 freq
aits (1) - 21 freq
hit (1) - 1222 freq
lait (1) - 4 freq
air (1) - 970 freq
ailt (1) - 3 freq
aet (1) - 102 freq
rit (1) - 13 freq
bait (1) - 32 freq
eit (1) - 644 freq
zit (1) - 2 freq
aist (1) - 21 freq
vit (1) - 3 freq
pit (1) - 2566 freq
aix (1) - 42 freq
kit (1) - 30 freq
tait (1) - 42 freq
aim (1) - 50 freq
cait (1) - 2 freq
-it (1) - 2 freq
Double Levenshtein (distance in brackets)
ait (0) - 138 freq
at (1) - 20079 freq
eit (1) - 644 freq
aite (1) - 4 freq
aat (1) - 852 freq
it (1) - 32760 freq
aet (1) - 102 freq
yit (1) - 500 freq
oit (1) - 1 freq
apt (2) - 45 freq
aik (2) - 55 freq
aip (2) - 10 freq
aic (2) - 2 freq
fit (2) - 3793 freq
cit (2) - 4 freq
ite (2) - 3 freq
ut (2) - 9 freq
yet (2) - 987 freq
gait (2) - 137 freq
et (2) - 256 freq
tit (2) - 23 freq
sait (2) - 8 freq
rait (2) - 3 freq
t (2) - 5646 freq
'it (2) - 173 freq
SoundEx code - A300
at - 20079 freq
aed - 2 freq
ate - 115 freq
aat - 852 freq
aet - 102 freq
a'd - 168 freq
ah'd - 508 freq
add - 133 freq
at- - 1 freq
aheid - 189 freq
ahd - 19 freq
ahead - 80 freq
'at - 357 freq
await - 7 freq
adae - 139 freq
'a'd - 3 freq
adee - 30 freq
ad - 126 freq
aht - 27 freq
aid - 37 freq
ado - 4 freq
'ah'd - 13 freq
at'd - 4 freq
ada - 4 freq
aat' - 1 freq
aw-day - 2 freq
aa'd - 2 freq
awtho - 5 freq
atho - 3 freq
awthou - 1 freq
adio - 1 freq
ata - 58 freq
ait - 138 freq
aith - 23 freq
atth - 1 freq
aite - 4 freq
aud - 32 freq
ahaud - 5 freq
aty - 2 freq
awta - 1 freq
ayedeea - 1 freq
aydea - 1 freq
'ahd - 1 freq
aheed - 5 freq
ata' - 6 freq
adieu - 1 freq
ahid - 2 freq
ah'da - 2 freq
ati - 16 freq
ah-ta - 1 freq
ahied - 1 freq
ataw - 7 freq
aaid - 2 freq
addi - 3 freq
awtie - 1 freq
aat'd - 5 freq
ahead' - 1 freq
att - 451 freq
att' - 2 freq
'att - 1 freq
aaud - 1 freq
a'da - 9 freq
aa'at - 1 freq
aedie - 1 freq
'aedie - 1 freq
adö - 1 freq
aet' - 1 freq
adie - 16 freq
audio - 29 freq
'add - 1 freq
aathou - 1 freq
'at'd - 1 freq
atthe - 2 freq
awte - 1 freq
atey - 1 freq
a'hæt - 1 freq
ati'aa - 1 freq
awid - 2 freq
ataa - 7 freq
-at - 1 freq
awety - 1 freq
ady - 2 freq
þat - 2 freq
aat- - 1 freq
awed - 3 freq
'ad - 1 freq
ataw' - 1 freq
ahaed - 1 freq
“at - 7 freq
‘aat - 1 freq
adaya - 1 freq
‘at - 44 freq
ae-twae - 1 freq
ðat - 1 freq
’at - 40 freq
auto - 5 freq
ahoot - 1 freq
attie - 3 freq
ae-day - 1 freq
awday - 2 freq
ayday - 1 freq
audi - 1 freq
’ad - 1 freq
atae - 1 freq
aatho - 2 freq
addie - 6 freq
—at - 1 freq
ae-twa - 2 freq
addy - 6 freq
aweet - 1 freq
aide - 1 freq
atwà - 2 freq
aaht - 1 freq
‘att - 1 freq
aidh - 1 freq
at’a - 1 freq
at' - 1 freq
a't - 1 freq
a’day - 1 freq
ah’d - 9 freq
at” - 1 freq
aday - 4 freq
awud - 1 freq
ahdh - 1 freq
‘at - 6 freq
a'day - 7 freq
a'the - 3 freq
ah't - 11 freq
a’d - 2 freq
awd - 1 freq
ade - 3 freq
“at - 1 freq
aeiot - 1 freq
adda - 4 freq
'ate - 1 freq
MetaPhone code - AT
at - 20079 freq
ate - 115 freq
aat - 852 freq
a'd - 168 freq
ah'd - 508 freq
add - 133 freq
at- - 1 freq
ahd - 19 freq
'at - 357 freq
adae - 139 freq
'a'd - 3 freq
adee - 30 freq
ad - 126 freq
aht - 27 freq
aid - 37 freq
ado - 4 freq
'ah'd - 13 freq
ada - 4 freq
aat' - 1 freq
aw-day - 2 freq
aa'd - 2 freq
adio - 1 freq
ata - 58 freq
ait - 138 freq
atth - 1 freq
aite - 4 freq
aud - 32 freq
aty - 2 freq
awta - 1 freq
aydea - 1 freq
'ahd - 1 freq
ata' - 6 freq
adieu - 1 freq
ah'da - 2 freq
ati - 16 freq
ah-ta - 1 freq
ataw - 7 freq
aaid - 2 freq
addi - 3 freq
awtie - 1 freq
att - 451 freq
att' - 2 freq
'att - 1 freq
aaud - 1 freq
a'da - 9 freq
aa'at - 1 freq
adö - 1 freq
adie - 16 freq
audio - 29 freq
'add - 1 freq
atthe - 2 freq
awte - 1 freq
atey - 1 freq
a'hæt - 1 freq
ati'aa - 1 freq
ataa - 7 freq
-at - 1 freq
ady - 2 freq
þat - 2 freq
aat- - 1 freq
'ad - 1 freq
ataw' - 1 freq
“at - 7 freq
‘aat - 1 freq
‘at - 44 freq
ðat - 1 freq
’at - 40 freq
auto - 5 freq
attie - 3 freq
awday - 2 freq
ayday - 1 freq
audi - 1 freq
’ad - 1 freq
atae - 1 freq
addie - 6 freq
—at - 1 freq
addy - 6 freq
aide - 1 freq
atwà - 2 freq
aaht - 1 freq
‘att - 1 freq
aidh - 1 freq
at’a - 1 freq
at' - 1 freq
a't - 1 freq
a’day - 1 freq
ah’d - 9 freq
at” - 1 freq
aday - 4 freq
ahdh - 1 freq
‘at - 6 freq
a'day - 7 freq
ah't - 11 freq
a’d - 2 freq
awd - 1 freq
ade - 3 freq
“at - 1 freq
adda - 4 freq
'ate - 1 freq
Manually curated - AIT
Time to execute Levenshtein function - 0.196141 milliseconds
The Levenshtein distance is the minimum number of single-character substitutions, insertions or deletions needed to transform one word into another. It's useful for detecting typos and alternative spellings.
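As an illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming approach. It is not the code this site runs; the function name `levenshtein` is chosen for the example only.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    # Keep the shorter word in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # deleting all i characters of a[:i] gives ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    for word in ("ait", "wait", "at", "hit"):
        print(word, levenshtein("ait", word))
```

With this sketch, "wait", "at" and "hit" each come out at distance 1 from "ait", which agrees with the distances listed in the results.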
Time to execute Double Levenshtein function - 0.335078 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
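A rough Python sketch of this idea follows. It is not the site's implementation: the names `double_levenshtein` and `strip_vowels` are made up for the example, and the vowel set is an assumption (treating "y" as a vowel is a guess that happens to agree with "yit" scoring 1 in the results).

```python
VOWELS = set("aeiouy")  # assumption: the site's exact vowel set is not stated


def levenshtein(a: str, b: str) -> int:
    """Plain edit distance (substitutions, insertions, deletions)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop every character in VOWELS, keeping everything else in order."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-less words."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


if __name__ == "__main__":
    for word in ("at", "apt", "gait", "t"):
        print(word, double_levenshtein("ait", word))
```

A vowel-only difference ("ait" and "at") costs 1, while a consonant difference ("ait" and "apt") is counted in both passes and costs 2.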
Time to execute SoundEx function - 0.028555 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
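For readers unfamiliar with the encoding, the following is a small Python sketch of the commonly described American Soundex rules: keep the first letter, map the following consonants to digits, collapse adjacent repeats, and pad to four characters. It is an illustration only, and the site's own function may differ in detail, for instance in how it treats apostrophes, hyphens and non-ASCII letters (this sketch simply ignores them, which is an assumption).

```python
# Digit classes used by American Soundex; vowels, h, w and y have no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word: str) -> str:
    """Return the 4-character Soundex code, or "" if there are no letters."""
    # Assumption: anything that is not an ASCII letter is ignored.
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate repeated codes
        digit = CODES.get(ch, "")
        if digit and digit != previous:
            code += digit
        previous = digit  # a vowel resets this, so repeats across vowels count
        if len(code) == 4:
            break
    return code.ljust(4, "0")


if __name__ == "__main__":
    for word in ("ait", "at", "aheid", "aw-day"):
        print(word, soundex(word))
```

All four sample words encode to A300, the code reported for "ait" in the results.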
Time to execute MetaPhone function - 0.038876 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute Manually curated function - 0.001024 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
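In code, a lookup of this kind can be as simple as a dictionary from word form to lemma. The sketch below is purely illustrative: the name `lemma_for` and every entry other than "ait" (whose result AIT appears in the results) are hypothetical and are not taken from the site's lexicon.

```python
from typing import Optional

# Hypothetical hand-made lexicon: word form -> lemma.
# Only "ait" -> "AIT" is taken from the results; the rest are invented.
LEXICON = {
    "ait": "AIT",
    "aits": "AIT",   # hypothetical plural form
    "aite": "AIT",   # hypothetical spelling variant
}


def lemma_for(word: str) -> Optional[str]:
    """Return the curated lemma, or None when the word is not covered."""
    return LEXICON.get(word.lower())


if __name__ == "__main__":
    print(lemma_for("ait"))    # AIT
    print(lemma_for("ahint"))  # None: not in this toy lexicon
```

Because the table is a plain hash lookup with no distance computation, it is much cheaper than the other methods, which fits the very short execution time reported.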