A Corpus of 21st Century Scots Texts


Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ahint


Words similar to "thay" in the corpus

Methods: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated

Levenshtein (distance in parentheses)
thay (0) - 706 freq
shay (1) - 4 freq
tray (1) - 48 freq
thai (1) - 445 freq
thaa (1) - 1 freq
thaw (1) - 11 freq
'hay (1) - 1 freq
thy (1) - 97 freq
tay (1) - 186 freq
hay (1) - 124 freq
tham (1) - 43 freq
tha (1) - 6295 freq
tha' (1) - 11 freq
thae (1) - 1233 freq
that (1) - 27092 freq
thas (1) - 2 freq
thay' (1) - 1 freq
thar (1) - 103 freq
thak (1) - 1 freq
they (1) - 11452 freq
thray (1) - 2 freq
than (1) - 2763 freq
jay (2) - 7 freq
tea- (2) - 5 freq
tram (2) - 23 freq
Double Levenshtein (combined distance in parentheses)
thay (0) - 706 freq
thae (1) - 1233 freq
they (1) - 11452 freq
thy (1) - 97 freq
tha (1) - 6295 freq
thaa (1) - 1 freq
thai (1) - 445 freq
theiy (2) - 6 freq
thu (2) - 23 freq
utha (2) - 1 freq
tah (2) - 2 freq
theaa (2) - 1 freq
thi (2) - 2576 freq
shay (2) - 4 freq
thea (2) - 1 freq
thoo (2) - 277 freq
tho (2) - 1083 freq
the (2) - 157218 freq
theo (2) - 1 freq
thie (2) - 8 freq
thoa (2) - 4 freq
thee (2) - 234 freq
th (2) - 2479 freq
thei (2) - 5 freq
thou (2) - 95 freq
SoundEx code - T000
the - 157218 freq
tae - 65006 freq
twa - 3495 freq
they - 11452 freq
tea - 573 freq
'tae - 47 freq
to - 4164 freq
two - 773 freq
thae - 1233 freq
tho - 1083 freq
'the - 355 freq
t - 5648 freq
ta - 2534 freq
tha - 6295 freq
'they - 49 freq
'thae - 1 freq
'ti - 3 freq
tt - 36 freq
tie - 88 freq
twae - 228 freq
thou - 95 freq
thay - 706 freq
towe - 21 freq
toy - 45 freq
ti - 4171 freq
th - 2479 freq
too - 1030 freq
they' - 13 freq
taw - 10 freq
tue - 9 freq
thai - 445 freq
tae- - 5 freq
the- - 2 freq
tea- - 5 freq
two' - 1 freq
tae-' - 2 freq
tia - 2 freq
'to - 7 freq
'tho - 1 freq
'twa - 13 freq
tee - 168 freq
tw - 6 freq
- 1 freq
thy - 97 freq
toi - 34 freq
tah - 2 freq
'toi - 1 freq
thow - 2 freq
toa - 2 freq
t'd - 2 freq
'tt - 4 freq
they-eh - 1 freq
the' - 572 freq
tow - 44 freq
t'ae - 1 freq
tho' - 48 freq
twaa - 32 freq
th' - 107 freq
'the' - 6 freq
tee-hee - 1 freq
toe - 22 freq
thi - 2576 freq
te - 1570 freq
thay' - 1 freq
thee - 234 freq
thu - 23 freq
tay - 186 freq
thee' - 1 freq
ta¢ - 6 freq
thei - 5 freq
thé - 1 freq
tae' - 3 freq
't - 23 freq
toue - 1 freq
thaw - 11 freq
tha' - 11 freq
thoo - 277 freq
theiy - 6 freq
tye - 9 freq
'they' - 2 freq
'ta' - 1 freq
tahhh - 1 freq
tih - 8 freq
toooo - 1 freq
'two - 3 freq
to' - 2 freq
tai - 1 freq
they'u - 1 freq
t-t-twa - 1 freq
t' - 2 freq
thowe - 4 freq
'tha - 16 freq
t'a - 17 freq
thé - 1 freq
ty - 7 freq
theyaw - 1 freq
'th - 1 freq
twae' - 1 freq
tthey - 1 freq
too-whoo - 1 freq
teu - 29 freq
taa - 3 freq
twa' - 1 freq
td - 9 freq
'th- - 1 freq
ït - 331 freq
°tha - 1 freq
twee - 5 freq
't'd - 1 freq
t'da - 5 freq
twi - 1 freq
öt - 7 freq
- 19 freq
ta-' - 1 freq
ta- - 1 freq
tu - 23 freq
'ta - 2 freq
twa-wey - 6 freq
'th' - 1 freq
'thou' - 2 freq
'thee' - 2 freq
'thy' - 2 freq
'to' - 1 freq
t-hah - 1 freq
thie - 8 freq
'tae' - 1 freq
tau - 2 freq
thoo' - 1 freq
'too - 1 freq
t'wo - 1 freq
t'tow - 1 freq
- 3 freq
'thy - 3 freq
t'die - 1 freq
thoa - 4 freq
two'why - 1 freq
taé - 1 freq
t - 2 freq
t - 6 freq
þæt - 2 freq
t - 1 freq
teh - 1 freq
tao - 2 freq
tey - 3 freq
tua - 2 freq
the - 177 freq
tae'a - 1 freq
- 1 freq
t - 693 freq
theh - 1 freq
the - 108 freq
tho - 2 freq
ti - 2 freq
they - 24 freq
tho - 1 freq
th - 2 freq
to - 17 freq
thay - 2 freq
thew - 1 freq
t - 2 freq
the - 3 freq
tae - 6 freq
twa - 1 freq
the - 6 freq
the - 8 freq
töd - 1 freq
taew - 1 freq
they - 2 freq
thai - 6 freq
thae - 2 freq
ti - 2 freq
they - 47 freq
tu - 2 freq
thae - 6 freq
thay - 2 freq
-to- - 1 freq
'tea' - 1 freq
'te - 1 freq
thoo - 2 freq
thee - 1 freq
the - 4 freq
thoo - 30 freq
too - 2 freq
to - 1 freq
tha - 1 freq
the - 1 freq
toh - 4 freq
tui - 2 freq
two - 2 freq
thai - 2 freq
twa - 2 freq
tha - 3 freq
tae - 1 freq
tei - 5 freq
theii - 1 freq
tae - 5 freq
thy - 2 freq
to - 4 freq
t - 1 freq
tae - 1 freq
t - 3 freq
tae - 1 freq
too - 1 freq
-t - 3 freq
'tooooo' - 1 freq
they - 2 freq
twa - 1 freq
thaa - 1 freq
tuyu - 1 freq
- 1 freq
‘to - 2 freq
tii - 1 freq
‘the - 2 freq
theaa - 1 freq
thea - 1 freq
the… - 1 freq
the“i” - 1 freq
“the - 4 freq
tooo - 1 freq
tuo - 1 freq
“to - 1 freq
“they - 1 freq
“tae - 2 freq
tiu - 1 freq
“tea - 1 freq
- 1 freq
tth - 1 freq
tew - 1 freq
tea’ - 1 freq
tthe - 1 freq
tuw - 1 freq
tuu - 1 freq
'teu' - 1 freq
ta” - 1 freq
theo - 1 freq
teo - 1 freq
MetaPhone code - 0
the - 157218 freq
they - 11452 freq
though - 1213 freq
thae - 1233 freq
tho - 1083 freq
'the - 355 freq
tha - 6295 freq
'they - 49 freq
'thae - 1 freq
thou - 95 freq
thay - 706 freq
th - 2479 freq
they' - 13 freq
thai - 445 freq
the- - 2 freq
'tho - 1 freq
thy - 97 freq
thow - 2 freq
they-eh - 1 freq
the' - 572 freq
tho' - 48 freq
th' - 107 freq
'the' - 6 freq
thi - 2576 freq
thay' - 1 freq
thee - 234 freq
thu - 23 freq
thee' - 1 freq
thei - 5 freq
thé - 1 freq
thaw - 11 freq
tha' - 11 freq
thoo - 277 freq
theiy - 6 freq
'they' - 2 freq
hythe - 2 freq
they'u - 1 freq
'tha - 16 freq
thé - 1 freq
'th - 1 freq
'th- - 1 freq
°tha - 1 freq
'th' - 1 freq
'thou' - 2 freq
'thee' - 2 freq
'thy' - 2 freq
thie - 8 freq
'though - 2 freq
thoo' - 1 freq
'thy - 3 freq
thoa - 4 freq
the - 177 freq
theh - 1 freq
the - 108 freq
tho - 2 freq
they - 24 freq
tho - 1 freq
th - 2 freq
thay - 2 freq
wyth - 1 freq
thew - 1 freq
the - 3 freq
the - 6 freq
the - 8 freq
they - 2 freq
thai - 6 freq
thae - 2 freq
they - 47 freq
thae - 6 freq
thay - 2 freq
thoo - 2 freq
thee - 1 freq
the - 4 freq
thoo - 30 freq
tha - 1 freq
the - 1 freq
thai - 2 freq
tha - 3 freq
though - 1 freq
theii - 1 freq
thy - 2 freq
though - 1 freq
they - 2 freq
thaa - 1 freq
‘the - 2 freq
theaa - 1 freq
thea - 1 freq
the… - 1 freq
the“i” - 1 freq
“the - 4 freq
“they - 1 freq
wth - 1 freq
hthy - 1 freq
theo - 1 freq
Manually curated - THAY
they - 11452 freq
dey - 1241 freq
they're - 749 freq
the - 157218 freq
thay - 706 freq
thay're - 20 freq
thai - 445 freq
they'd - 435 freq
Time to execute Levenshtein function - 0.215839 milliseconds
The Levenshtein distance is the minimum number of single characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.
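As an illustration of the definition above, here is a minimal sketch of the standard dynamic-programming calculation. It is not the code this site runs; the function name `levenshtein` is chosen only for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    # Keep the shorter word in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # replace ch_a with ch_b (or keep it)
            ))
        previous = current
    return previous[-1]


print(levenshtein("thay", "they"))  # one substitution: a -> e
print(levenshtein("thay", "tram"))  # two substitutions: h -> r, y -> m
```

The distances agree with the listing above, where "they" appears at distance 1 and "tram" at distance 2.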
Time to execute Double Levenshtein function - 0.359798 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
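A rough sketch of how such a combined score could be computed follows. The vowel set is an assumption: treating "y" as a vowel reproduces the distances listed above (for example "tha" at 1 and "shay" at 2), but the site's actual vowel list is not stated. The names `strip_vowels` and `double_levenshtein` are hypothetical.

```python
VOWELS = set("aeiouy")  # assumption: "y" counts as a vowel


def levenshtein(a: str, b: str) -> int:
    """Plain edit distance (insert, delete, replace), row by row."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop every vowel, leaving the consonant skeleton of the word."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Full-word distance plus the distance between consonant skeletons."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


print(double_levenshtein("thay", "they"))  # vowel change only: counted once
print(double_levenshtein("thay", "shay"))  # consonant change: counted twice
```

A vowel-only difference such as "thay" / "they" leaves the consonant skeletons identical, so it costs 1, while a consonant difference such as "thay" / "shay" shows up in both passes and costs 2.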
Time to execute SoundEx function - 0.028016 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
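The encoding can be sketched as below, following the commonly described American Soundex rules. This is an illustration, not the site's implementation. Ignoring every character outside a-z is an assumption, made because entries such as 'tae and ït appear under T000 in the listing above.

```python
# Consonant groups and their Soundex digits; vowels, h, w and y have no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word: str) -> str:
    """Return the four-character Soundex code, or "" if the word has no letters."""
    # Assumption: apostrophes, hyphens and non-ASCII characters are skipped.
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    last_code = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same digit
        code = CODES.get(ch, "")
        if code and code != last_code:
            result += code
            if len(result) == 4:
                break
        last_code = code  # a vowel resets the run, so a repeated digit counts again
    return result.ljust(4, "0")


print(soundex("thay"))  # T000
print(soundex("twa"))   # T000
```

Because only the first letter and the following consonants contribute, every word made of "t" plus vowels, "h", "w" or "y" collapses to T000, which is why the SoundEx list above is so long.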
Time to execute MetaPhone function - 0.038014 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute Manually curated function - 0.001034 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
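A lookup of this kind might be sketched as follows. The table layout and the names `LEXICON` and `curated_variants` are hypothetical; the few entries are taken from the THAY group listed above, and the real lexicon is much larger.

```python
# Hypothetical excerpt: each spelling maps to the lemma it was filed under.
LEXICON = {
    "thay": "THAY",
    "they": "THAY",
    "dey": "THAY",
    "thai": "THAY",
}


def curated_variants(word: str, lexicon: dict = LEXICON) -> list:
    """Return all spellings that share a lemma with `word`, or [] if it is not covered."""
    lemma = lexicon.get(word.lower())
    if lemma is None:
        return []  # not all words are covered
    return sorted(w for w, l in lexicon.items() if l == lemma)


print(curated_variants("dey"))    # ['dey', 'thai', 'thay', 'they']
print(curated_variants("ahint"))  # []
```

A dictionary lookup needs no distance calculation at all, which fits the very short execution time reported for this method.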