A Corpus of 21st Century Scots Texts

Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ablow

Words similar to "they" in the corpus

Levenshtein
they (0) - 11266 freq
thy (1) - 97 freq
chey (1) - 1 freq
thet (1) - 9 freq
theyd (1) - 10 freq
whey (1) - 16 freq
thed (1) - 1 freq
'hey (1) - 8 freq
thay (1) - 703 freq
tthey (1) - 1 freq
hey (1) - 157 freq
threy (1) - 3 freq
theiy (1) - 6 freq
thew (1) - 1 freq
they' (1) - 13 freq
thea (1) - 1 freq
the' (1) - 572 freq
theym (1) - 27 freq
tey (1) - 3 freq
thes (1) - 12 freq
theo (1) - 1 freq
the- (1) - 2 freq
thev (1) - 3 freq
then (1) - 4451 freq
theh (1) - 1 freq
Double Levenshtein
they (0) - 11266 freq
thee (1) - 233 freq
thea (1) - 1 freq
theo (1) - 1 freq
the (1) - 154319 freq
thei (1) - 5 freq
thay (1) - 703 freq
theiy (1) - 6 freq
thy (1) - 97 freq
thi (2) - 2576 freq
teh (2) - 1 freq
th (2) - 2472 freq
theii (2) - 1 freq
othe (2) - 4 freq
ther (2) - 111 freq
theaa (2) - 1 freq
tha (2) - 6292 freq
thou (2) - 95 freq
tho (2) - 1074 freq
thae (2) - 1219 freq
thoa (2) - 4 freq
thoo (2) - 277 freq
thaa (2) - 1 freq
ithe (2) - 14 freq
thu (2) - 23 freq
SoundEx code - T000
the - 154319 freq
tae - 64038 freq
twa - 3470 freq
they - 11266 freq
tea - 560 freq
'tae - 47 freq
to - 4049 freq
two - 717 freq
thae - 1219 freq
tho - 1074 freq
'the - 348 freq
t - 5646 freq
ta - 2534 freq
tha - 6292 freq
'they - 48 freq
'thae - 1 freq
'ti - 3 freq
tt - 36 freq
tie - 88 freq
twae - 228 freq
thou - 95 freq
thay - 703 freq
towe - 21 freq
toy - 44 freq
ti - 4160 freq
th - 2472 freq
too - 992 freq
they' - 13 freq
taw - 9 freq
tue - 9 freq
thai - 445 freq
tae- - 5 freq
the- - 2 freq
tea- - 5 freq
two' - 1 freq
tae-' - 2 freq
tia - 2 freq
'to - 6 freq
'tho - 1 freq
'twa - 13 freq
tee - 165 freq
tw - 6 freq
- 1 freq
thy - 97 freq
toi - 34 freq
tah - 2 freq
'toi - 1 freq
thow - 2 freq
toa - 2 freq
t'd - 2 freq
'tt - 4 freq
they-eh - 1 freq
the' - 572 freq
tow - 44 freq
t'ae - 1 freq
tho' - 48 freq
twaa - 30 freq
th' - 105 freq
'the' - 6 freq
tee-hee - 1 freq
toe - 21 freq
thi - 2576 freq
te - 1569 freq
thay' - 1 freq
thee - 233 freq
thu - 23 freq
tay - 185 freq
thee' - 1 freq
ta¢ - 5 freq
thei - 5 freq
thé - 1 freq
tae' - 3 freq
't - 21 freq
toue - 1 freq
thaw - 11 freq
tha' - 11 freq
thoo - 277 freq
theiy - 6 freq
tye - 8 freq
'they' - 2 freq
to' - 2 freq
tai - 1 freq
they'u - 1 freq
t-t-twa - 1 freq
t' - 2 freq
thowe - 4 freq
'tha - 16 freq
t'a - 17 freq
thé - 1 freq
ty - 7 freq
theyaw - 1 freq
'th - 1 freq
twae' - 1 freq
tthey - 1 freq
too-whoo - 1 freq
teu - 29 freq
taa - 3 freq
twa' - 1 freq
td - 9 freq
'th- - 1 freq
ït - 331 freq
°tha - 1 freq
'two - 2 freq
twee - 5 freq
't'd - 1 freq
t'da - 5 freq
twi - 1 freq
öt - 7 freq
- 19 freq
ta-' - 1 freq
ta- - 1 freq
tu - 23 freq
'ta - 2 freq
twa-wey - 6 freq
'th' - 1 freq
'thou' - 2 freq
'thee' - 2 freq
'thy' - 2 freq
'to' - 1 freq
t-hah - 1 freq
thie - 8 freq
'tae' - 1 freq
tau - 2 freq
thoo' - 1 freq
'too - 1 freq
t'wo - 1 freq
t'tow - 1 freq
- 3 freq
'thy - 3 freq
t'die - 1 freq
thoa - 4 freq
two'why - 1 freq
taé - 1 freq
t - 2 freq
t - 6 freq
þæt - 2 freq
t - 1 freq
teh - 1 freq
tao - 2 freq
tey - 3 freq
tua - 2 freq
the - 177 freq
tae'a - 1 freq
tih - 7 freq
- 1 freq
t - 688 freq
theh - 1 freq
the - 108 freq
tho - 2 freq
ti - 2 freq
they - 24 freq
tho - 1 freq
th - 2 freq
to - 17 freq
thay - 2 freq
thew - 1 freq
t - 2 freq
the - 3 freq
tae - 6 freq
twa - 1 freq
the - 6 freq
the - 8 freq
töd - 1 freq
taew - 1 freq
they - 2 freq
thai - 6 freq
thae - 2 freq
ti - 2 freq
they - 47 freq
tu - 2 freq
thae - 6 freq
thay - 2 freq
-to- - 1 freq
'tea' - 1 freq
'te - 1 freq
thoo - 2 freq
thee - 1 freq
the - 4 freq
thoo - 30 freq
too - 2 freq
to - 1 freq
tha - 1 freq
the - 1 freq
toh - 4 freq
tui - 2 freq
two - 2 freq
thai - 2 freq
twa - 2 freq
tha - 3 freq
tae - 1 freq
tei - 5 freq
theii - 1 freq
tae - 5 freq
thy - 2 freq
to - 4 freq
t - 1 freq
tae - 1 freq
t - 3 freq
tae - 1 freq
too - 1 freq
-t - 3 freq
'tooooo' - 1 freq
they - 2 freq
twa - 1 freq
thaa - 1 freq
tuyu - 1 freq
- 1 freq
‘to - 2 freq
tii - 1 freq
‘the - 2 freq
theaa - 1 freq
thea - 1 freq
the… - 1 freq
the“i” - 1 freq
“the - 4 freq
tooo - 1 freq
tuo - 1 freq
“to - 1 freq
“they - 1 freq
“tae - 2 freq
tiu - 1 freq
“tea - 1 freq
- 1 freq
tth - 1 freq
tew - 1 freq
tea’ - 1 freq
tthe - 1 freq
tuw - 1 freq
tuu - 1 freq
'teu' - 1 freq
ta” - 1 freq
theo - 1 freq
teo - 1 freq
MetaPhone code - 0
the - 154319 freq
they - 11266 freq
though - 1167 freq
thae - 1219 freq
tho - 1074 freq
'the - 348 freq
tha - 6292 freq
'they - 48 freq
'thae - 1 freq
thou - 95 freq
thay - 703 freq
th - 2472 freq
they' - 13 freq
thai - 445 freq
the- - 2 freq
'tho - 1 freq
thy - 97 freq
thow - 2 freq
they-eh - 1 freq
the' - 572 freq
tho' - 48 freq
th' - 105 freq
'the' - 6 freq
thi - 2576 freq
thay' - 1 freq
thee - 233 freq
thu - 23 freq
thee' - 1 freq
thei - 5 freq
thé - 1 freq
thaw - 11 freq
tha' - 11 freq
thoo - 277 freq
theiy - 6 freq
'they' - 2 freq
they'u - 1 freq
'tha - 16 freq
thé - 1 freq
'th - 1 freq
'th- - 1 freq
°tha - 1 freq
'th' - 1 freq
'thou' - 2 freq
'thee' - 2 freq
'thy' - 2 freq
thie - 8 freq
'though - 2 freq
thoo' - 1 freq
'thy - 3 freq
hythe - 1 freq
thoa - 4 freq
the - 177 freq
theh - 1 freq
the - 108 freq
tho - 2 freq
they - 24 freq
tho - 1 freq
th - 2 freq
thay - 2 freq
wyth - 1 freq
thew - 1 freq
the - 3 freq
the - 6 freq
the - 8 freq
they - 2 freq
thai - 6 freq
thae - 2 freq
they - 47 freq
thae - 6 freq
thay - 2 freq
thoo - 2 freq
thee - 1 freq
the - 4 freq
thoo - 30 freq
tha - 1 freq
the - 1 freq
thai - 2 freq
tha - 3 freq
though - 1 freq
theii - 1 freq
thy - 2 freq
though - 1 freq
they - 2 freq
thaa - 1 freq
‘the - 2 freq
theaa - 1 freq
thea - 1 freq
the… - 1 freq
the“i” - 1 freq
“the - 4 freq
“they - 1 freq
wth - 1 freq
hthy - 1 freq
theo - 1 freq
Manually curated lemma - THEY
they - 11266 freq
dey - 1241 freq
they're - 732 freq
the - 154319 freq
thay - 703 freq
thay're - 20 freq
thai - 445 freq
they'd - 430 freq
Time to execute Levenshtein function - 0.187436 milliseconds
The Levenshtein distance is the minimum number of single-character replacements, insertions or deletions needed to transform one word into another. It is useful for detecting typos and alternative spellings.
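As an illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming approach. It is not the code this site runs; the function name `levenshtein` and the sample words are chosen for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    # previous[j] holds the distance between the prefix of a seen so far
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning i characters into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


# Sample words taken from the listing for "they".
for word in ("they", "thy", "thay", "hey"):
    print(word, levenshtein("they", word))
```

With this sketch, thy, thay and hey each come out at distance 1 from they, which agrees with the bracketed figures in the listing.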
Time to execute Double Levenshtein function - 0.321334 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
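The idea can be sketched in Python as follows. This is a reconstruction, not the site's own code. The vowel set `aeiouy` is an assumption, chosen because it reproduces the scores in the listing (thee scores 1 and ther scores 2 only if y counts as a vowel). The names `strip_vowels` and `double_levenshtein` are hypothetical.

```python
VOWELS = "aeiouy"  # assumption: y is treated as a vowel


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j - 1] + (char_a != char_b),  # substitute or match
                previous[j] + 1,                       # delete from a
                current[j - 1] + 1,                    # insert into a
            ))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Keep only the characters that are not in VOWELS."""
    return "".join(c for c in word if c.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on their consonant skeletons."""
    full = levenshtein(a, b)
    consonants = levenshtein(strip_vowels(a), strip_vowels(b))
    return full + consonants


for word in ("thee", "the", "ther", "thou"):
    print(word, double_levenshtein("they", word))
```

Under these assumptions a vowel-only change such as they to thee costs 1, while a consonant change such as they to ther costs 2, because it is counted in both passes.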
Time to execute SoundEx function - 0.028161 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
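To show how very different spellings end up under one code such as T000, here is a minimal Python sketch of the classic Soundex rules. It is an illustration only and may differ in edge cases from whatever implementation the site uses. Skipping non-ASCII and non-letter characters is an assumption made so that entries with leading quotes or accented letters can be coded at all.

```python
def soundex(word: str) -> str:
    """Classic four-character Soundex code (first letter plus three digits)."""
    codes = {}
    for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6")):
        for letter in letters:
            codes[letter] = digit

    # Assumption: ignore anything that is not a plain ASCII letter.
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""

    result = letters[0].upper()
    previous = codes.get(letters[0], "")
    for char in letters[1:]:
        if char in "hw":
            continue  # h and w do not separate letters with the same code
        code = codes.get(char, "")  # vowels and y have no code
        if code and code != previous:
            result += code
        previous = code  # a vowel resets this, so a repeated code counts again
    return (result + "000")[:4]  # pad with zeros, keep four characters


for word in ("they", "twa", "tae", "'the", "thou"):
    print(word, soundex(word))
```

Because h, w, y and the vowels contribute no digits, every word in this sample reduces to T followed by padding, which is why the T000 group above is so large.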
Time to execute MetaPhone function - 0.036865 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute Manually curated function - 0.000825 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas, including obvious typos and spelling variations. Not all words are covered.
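A lookup of this kind can be sketched in Python as a plain dictionary. The structure and the names `LEXICON`, `lemma_of` and `variants_of` are hypothetical. The entries are a few of the forms listed under THEY above, not the site's actual table.

```python
from typing import Optional

# Hypothetical hand-built lexicon: spelling variant -> lemma.
LEXICON = {
    "they": "THEY",
    "dey": "THEY",
    "thay": "THEY",
    "thai": "THEY",
    "they're": "THEY",
    "they'd": "THEY",
}


def lemma_of(word: str) -> Optional[str]:
    """Return the lemma for a word, or None if the lexicon does not cover it."""
    return LEXICON.get(word.lower())


def variants_of(lemma: str) -> list:
    """Return every listed spelling that maps to the given lemma."""
    return sorted(w for w, target in LEXICON.items() if target == lemma)


print(lemma_of("dey"))
print(lemma_of("ablow"))  # not in this sample lexicon
print(variants_of("THEY"))
```

A dictionary lookup does no string comparison across the vocabulary, which is consistent with this method having the shortest execution time reported above. The trade-off is coverage: a word missing from the table simply returns nothing.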