A Corpus of 21st Century Scots Texts

Levenshtein Distance

Enter a word to find nearest neighbouring words, for example sonsie

Similar words to they� in Corpus

Levenshtein Double Levenshtein SoundEx MetaPhone Manually curated
Levenshtein distance (in parentheses)
they'r (3) - 35 freq
theyeir (3) - 2 freq
they'v (3) - 6 freq
theyd (3) - 8 freq
they’d (3) - 3 freq
they-eh (3) - 1 freq
they'll (3) - 322 freq
they' (3) - 13 freq
theyve (3) - 4 freq
they'u (3) - 1 freq
they've (3) - 217 freq
theyll (3) - 5 freq
they'il (3) - 2 freq
they (3) - 11452 freq
they'lt (3) - 1 freq
they'm (3) - 20 freq
theyaw (3) - 1 freq
they'ed (3) - 1 freq
theym (3) - 27 freq
they're (3) - 749 freq
theyre (3) - 6 freq
they'd (3) - 435 freq
they'rg (3) - 1 freq
theif's (4) - 1 freq
thenk (4) - 23 freq
Double Levenshtein distance (in parentheses)
theyaw (6) - 1 freq
they'm (6) - 20 freq
they'lt (6) - 1 freq
they (6) - 11452 freq
they'ed (6) - 1 freq
theym (6) - 27 freq
they'rg (6) - 1 freq
they'd (6) - 435 freq
theyre (6) - 6 freq
they're (6) - 749 freq
they'il (6) - 2 freq
they'r (6) - 35 freq
theyll (6) - 5 freq
theyd (6) - 8 freq
they'v (6) - 6 freq
theyeir (6) - 2 freq
they-eh (6) - 1 freq
they’d (6) - 3 freq
they'll (6) - 322 freq
they'u (6) - 1 freq
they've (6) - 217 freq
theyve (6) - 4 freq
they' (6) - 13 freq
thed (7) - 1 freq
theehut (7) - 1 freq
SoundEx code - T000
the - 157218 freq
tae - 65006 freq
twa - 3495 freq
they - 11452 freq
tea - 573 freq
'tae - 47 freq
to - 4164 freq
two - 773 freq
thae - 1233 freq
tho - 1083 freq
'the - 355 freq
t - 5648 freq
ta - 2534 freq
tha - 6295 freq
'they - 49 freq
'thae - 1 freq
'ti - 3 freq
tt - 36 freq
tie - 88 freq
twae - 228 freq
thou - 95 freq
thay - 706 freq
towe - 21 freq
toy - 45 freq
ti - 4171 freq
th - 2479 freq
too - 1030 freq
they' - 13 freq
taw - 10 freq
tue - 9 freq
thai - 445 freq
tae- - 5 freq
the- - 2 freq
tea- - 5 freq
two' - 1 freq
tae-' - 2 freq
tia - 2 freq
'to - 7 freq
'tho - 1 freq
'twa - 13 freq
tee - 168 freq
tw - 6 freq
- 1 freq
thy - 97 freq
toi - 34 freq
tah - 2 freq
'toi - 1 freq
thow - 2 freq
toa - 2 freq
t'd - 2 freq
'tt - 4 freq
they-eh - 1 freq
the' - 572 freq
tow - 44 freq
t'ae - 1 freq
tho' - 48 freq
twaa - 32 freq
th' - 107 freq
'the' - 6 freq
tee-hee - 1 freq
toe - 22 freq
thi - 2576 freq
te - 1570 freq
thay' - 1 freq
thee - 234 freq
thu - 23 freq
tay - 186 freq
thee' - 1 freq
ta¢ - 6 freq
thei - 5 freq
thé - 1 freq
tae' - 3 freq
't - 23 freq
toue - 1 freq
thaw - 11 freq
tha' - 11 freq
thoo - 277 freq
theiy - 6 freq
tye - 9 freq
'they' - 2 freq
'ta' - 1 freq
tahhh - 1 freq
tih - 8 freq
toooo - 1 freq
'two - 3 freq
to' - 2 freq
tai - 1 freq
they'u - 1 freq
t-t-twa - 1 freq
t' - 2 freq
thowe - 4 freq
'tha - 16 freq
t'a - 17 freq
thé - 1 freq
ty - 7 freq
theyaw - 1 freq
'th - 1 freq
twae' - 1 freq
tthey - 1 freq
too-whoo - 1 freq
teu - 29 freq
taa - 3 freq
twa' - 1 freq
td - 9 freq
'th- - 1 freq
ït - 331 freq
°tha - 1 freq
twee - 5 freq
't'd - 1 freq
t'da - 5 freq
twi - 1 freq
öt - 7 freq
- 19 freq
ta-' - 1 freq
ta- - 1 freq
tu - 23 freq
'ta - 2 freq
twa-wey - 6 freq
'th' - 1 freq
'thou' - 2 freq
'thee' - 2 freq
'thy' - 2 freq
'to' - 1 freq
t-hah - 1 freq
thie - 8 freq
'tae' - 1 freq
tau - 2 freq
thoo' - 1 freq
'too - 1 freq
t'wo - 1 freq
t'tow - 1 freq
- 3 freq
'thy - 3 freq
t'die - 1 freq
thoa - 4 freq
two'why - 1 freq
taé - 1 freq
t - 2 freq
t - 6 freq
þæt - 2 freq
t - 1 freq
teh - 1 freq
tao - 2 freq
tey - 3 freq
tua - 2 freq
the - 177 freq
tae'a - 1 freq
- 1 freq
t - 693 freq
theh - 1 freq
the - 108 freq
tho - 2 freq
ti - 2 freq
they - 24 freq
tho - 1 freq
th - 2 freq
to - 17 freq
thay - 2 freq
thew - 1 freq
t - 2 freq
the - 3 freq
tae - 6 freq
twa - 1 freq
the - 6 freq
the - 8 freq
töd - 1 freq
taew - 1 freq
they - 2 freq
thai - 6 freq
thae - 2 freq
ti - 2 freq
they - 47 freq
tu - 2 freq
thae - 6 freq
thay - 2 freq
-to- - 1 freq
'tea' - 1 freq
'te - 1 freq
thoo - 2 freq
thee - 1 freq
the - 4 freq
thoo - 30 freq
too - 2 freq
to - 1 freq
tha - 1 freq
the - 1 freq
toh - 4 freq
tui - 2 freq
two - 2 freq
thai - 2 freq
twa - 2 freq
tha - 3 freq
tae - 1 freq
tei - 5 freq
theii - 1 freq
tae - 5 freq
thy - 2 freq
to - 4 freq
t - 1 freq
tae - 1 freq
t - 3 freq
tae - 1 freq
too - 1 freq
-t - 3 freq
'tooooo' - 1 freq
they - 2 freq
twa - 1 freq
thaa - 1 freq
tuyu - 1 freq
- 1 freq
‘to - 2 freq
tii - 1 freq
‘the - 2 freq
theaa - 1 freq
thea - 1 freq
the… - 1 freq
the“i” - 1 freq
“the - 4 freq
tooo - 1 freq
tuo - 1 freq
“to - 1 freq
“they - 1 freq
“tae - 2 freq
tiu - 1 freq
“tea - 1 freq
- 1 freq
tth - 1 freq
tew - 1 freq
tea’ - 1 freq
tthe - 1 freq
tuw - 1 freq
tuu - 1 freq
'teu' - 1 freq
ta” - 1 freq
theo - 1 freq
teo - 1 freq
MetaPhone code - 0
the - 157218 freq
they - 11452 freq
though - 1213 freq
thae - 1233 freq
tho - 1083 freq
'the - 355 freq
tha - 6295 freq
'they - 49 freq
'thae - 1 freq
thou - 95 freq
thay - 706 freq
th - 2479 freq
they' - 13 freq
thai - 445 freq
the- - 2 freq
'tho - 1 freq
thy - 97 freq
thow - 2 freq
they-eh - 1 freq
the' - 572 freq
tho' - 48 freq
th' - 107 freq
'the' - 6 freq
thi - 2576 freq
thay' - 1 freq
thee - 234 freq
thu - 23 freq
thee' - 1 freq
thei - 5 freq
thé - 1 freq
thaw - 11 freq
tha' - 11 freq
thoo - 277 freq
theiy - 6 freq
'they' - 2 freq
hythe - 2 freq
they'u - 1 freq
'tha - 16 freq
thé - 1 freq
'th - 1 freq
'th- - 1 freq
°tha - 1 freq
'th' - 1 freq
'thou' - 2 freq
'thee' - 2 freq
'thy' - 2 freq
thie - 8 freq
'though - 2 freq
thoo' - 1 freq
'thy - 3 freq
thoa - 4 freq
the - 177 freq
theh - 1 freq
the - 108 freq
tho - 2 freq
they - 24 freq
tho - 1 freq
th - 2 freq
thay - 2 freq
wyth - 1 freq
thew - 1 freq
the - 3 freq
the - 6 freq
the - 8 freq
they - 2 freq
thai - 6 freq
thae - 2 freq
they - 47 freq
thae - 6 freq
thay - 2 freq
thoo - 2 freq
thee - 1 freq
the - 4 freq
thoo - 30 freq
tha - 1 freq
the - 1 freq
thai - 2 freq
tha - 3 freq
though - 1 freq
theii - 1 freq
thy - 2 freq
though - 1 freq
they - 2 freq
thaa - 1 freq
‘the - 2 freq
theaa - 1 freq
thea - 1 freq
the… - 1 freq
the“i” - 1 freq
“the - 4 freq
“they - 1 freq
wth - 1 freq
hthy - 1 freq
theo - 1 freq
Time to execute Levenshtein function - 0.233522 milliseconds
The Levenshtein distance is the minimum number of characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.
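As a rough illustration of the distance described above, here is a minimal Python sketch of the standard dynamic-programming calculation. It is not the code behind this page; the function name `levenshtein` is chosen here purely for illustration.

```python
def levenshtein(a, b):
    """Minimum number of single-item inserts, deletes or replacements
    needed to turn sequence a into sequence b."""
    # Keep the shorter sequence in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, item_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # delete item_a
                current[j - 1] + 1,      # insert item_b
                previous[j - 1] + cost,  # replace, or keep when equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("they", "thay"))     # 1: replace "e" with "a"
print(levenshtein("sonsie", "sonsy"))  # 2: replace "i" with "y", delete "e"
```

Python compares strings character by character. An implementation that compares UTF-8 bytes instead would count a curly apostrophe (three bytes) as three units, so the same pair of words could score higher.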
Time to execute Double Levenshtein function - 0.366866 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the words as written and once with their vowels removed, and adds the two distances together, giving double weight to consonants.
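The two-pass idea can be sketched in Python as follows. This is a hedged reconstruction from the description only: the names `double_levenshtein` and `strip_vowels` are hypothetical, and treating exactly `a e i o u` (either case) as the vowels is an assumption, since the page does not say which letters are removed.

```python
VOWELS = set("aeiouAEIOU")  # assumption: the page does not list its vowels


def levenshtein(a, b):
    """Plain Levenshtein distance, computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word):
    """Return the word with every assumed vowel removed."""
    return "".join(ch for ch in word if ch not in VOWELS)


def double_levenshtein(a, b):
    """Distance on the full words plus distance on their consonant skeletons."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


print(double_levenshtein("they", "thay"))  # 1: the vowel change counts once
print(double_levenshtein("they", "them"))  # 2: the consonant change counts twice
```

Under these assumptions a changed vowel costs 1 and a changed consonant costs 2, which is the double weighting the description refers to.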
Time to execute SoundEx function - 0.029867 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
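For readers who want to see how a word ends up with a code such as T000, here is a minimal Python sketch of the common American Soundex rules. The site's own implementation is not shown on this page, so details such as skipping non-ASCII characters are assumptions made for this example.

```python
# Consonant groups and their Soundex digits; vowels, h, w and y have no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word):
    """Return the four-character American Soundex code for a word."""
    # Assumption: anything outside a-z (apostrophes, accents) is ignored.
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same digit
        code = CODES.get(ch, "")
        if code and code != previous:
            result += code
        previous = code  # a vowel resets this, so a repeated digit can recur
    return (result + "000")[:4]  # pad with zeros, keep four characters


print(soundex("they"))    # T000
print(soundex("tae"))     # T000
print(soundex("Robert"))  # R163
```

Because every letter after the initial "t" in "they" is a vowel, "h", "w" or "y", nothing but padding follows the first letter, which is why so many short t-words share the code T000.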
Time to execute MetaPhone function - 0.038471 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute Manually curated function - 0.001072 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
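A lookup of this kind can be sketched in Python as a plain dictionary. The entries below are invented for illustration and are not taken from the site's actual lexicon; `LEXICON`, `lemma_for` and `variants_of` are hypothetical names.

```python
# Hypothetical hand-made lexicon: spelling variant -> lemma.
LEXICON = {
    "thay": "they",
    "thai": "they",
    "theyre": "they're",
    "theyll": "they'll",
}


def lemma_for(word):
    """Return the curated lemma for a word, or None if it is not covered."""
    return LEXICON.get(word.lower())


def variants_of(lemma):
    """Return every curated spelling that points at the given lemma."""
    return sorted(w for w, target in LEXICON.items() if target == lemma)


print(lemma_for("Thay"))    # they
print(lemma_for("sonsie"))  # None, because the word is not in the table
print(variants_of("they"))  # ['thai', 'thay']
```

A dictionary lookup takes roughly constant time regardless of corpus size, which fits the very short execution time reported for this method, though the site may store its table differently.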