A Corpus of 21st Century Scots Texts

Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ablow

Similar words to twaa in Corpus

Methods: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated

Levenshtein (distance in brackets)
twaa (0) - 30 freq
twal (1) - 120 freq
twar (1) - 2 freq
thaa (1) - 1 freq
waa (1) - 217 freq
twae (1) - 228 freq
twaal (1) - 6 freq
twa' (1) - 1 freq
twa (1) - 3470 freq
taa (1) - 3 freq
awaa (1) - 99 freq
twat (1) - 11 freq
swaa (1) - 2 freq
twad (1) - 1 freq
twas (1) - 16 freq
waq (2) - 1 freq
wah (2) - 11 freq
twix (2) - 1 freq
twain (2) - 1 freq
tka (2) - 1 freq
twun (2) - 1 freq
awta (2) - 1 freq
twos (2) - 4 freq
b'aa (2) - 1 freq
tara (2) - 16 freq
Double Levenshtein (distance in brackets)
twaa (0) - 30 freq
twa (1) - 3470 freq
twae (1) - 228 freq
tw (2) - 6 freq
twas (2) - 16 freq
swaa (2) - 2 freq
two (2) - 717 freq
taw (2) - 9 freq
twi (2) - 1 freq
twat (2) - 11 freq
twee (2) - 5 freq
twad (2) - 1 freq
awaa (2) - 99 freq
twaal (2) - 6 freq
waa (2) - 217 freq
thaa (2) - 1 freq
twa' (2) - 1 freq
twal (2) - 120 freq
taa (2) - 3 freq
twar (2) - 2 freq
tax (3) - 60 freq
toap (3) - 37 freq
twid (3) - 1 freq
tina (3) - 17 freq
tay (3) - 185 freq
SoundEx code - T000
the - 154319 freq
tae - 64038 freq
twa - 3470 freq
they - 11266 freq
tea - 560 freq
'tae - 47 freq
to - 4049 freq
two - 717 freq
thae - 1219 freq
tho - 1074 freq
'the - 348 freq
t - 5646 freq
ta - 2534 freq
tha - 6292 freq
'they - 48 freq
'thae - 1 freq
'ti - 3 freq
tt - 36 freq
tie - 88 freq
twae - 228 freq
thou - 95 freq
thay - 703 freq
towe - 21 freq
toy - 44 freq
ti - 4160 freq
th - 2472 freq
too - 992 freq
they' - 13 freq
taw - 9 freq
tue - 9 freq
thai - 445 freq
tae- - 5 freq
the- - 2 freq
tea- - 5 freq
two' - 1 freq
tae-' - 2 freq
tia - 2 freq
'to - 6 freq
'tho - 1 freq
'twa - 13 freq
tee - 165 freq
tw - 6 freq
- 1 freq
thy - 97 freq
toi - 34 freq
tah - 2 freq
'toi - 1 freq
thow - 2 freq
toa - 2 freq
t'd - 2 freq
'tt - 4 freq
they-eh - 1 freq
the' - 572 freq
tow - 44 freq
t'ae - 1 freq
tho' - 48 freq
twaa - 30 freq
th' - 105 freq
'the' - 6 freq
tee-hee - 1 freq
toe - 21 freq
thi - 2576 freq
te - 1569 freq
thay' - 1 freq
thee - 233 freq
thu - 23 freq
tay - 185 freq
thee' - 1 freq
ta¢ - 5 freq
thei - 5 freq
thé - 1 freq
tae' - 3 freq
't - 21 freq
toue - 1 freq
thaw - 11 freq
tha' - 11 freq
thoo - 277 freq
theiy - 6 freq
tye - 8 freq
'they' - 2 freq
to' - 2 freq
tai - 1 freq
they'u - 1 freq
t-t-twa - 1 freq
t' - 2 freq
thowe - 4 freq
'tha - 16 freq
t'a - 17 freq
thé - 1 freq
ty - 7 freq
theyaw - 1 freq
'th - 1 freq
twae' - 1 freq
tthey - 1 freq
too-whoo - 1 freq
teu - 29 freq
taa - 3 freq
twa' - 1 freq
td - 9 freq
'th- - 1 freq
ït - 331 freq
°tha - 1 freq
'two - 2 freq
twee - 5 freq
't'd - 1 freq
t'da - 5 freq
twi - 1 freq
öt - 7 freq
- 19 freq
ta-' - 1 freq
ta- - 1 freq
tu - 23 freq
'ta - 2 freq
twa-wey - 6 freq
'th' - 1 freq
'thou' - 2 freq
'thee' - 2 freq
'thy' - 2 freq
'to' - 1 freq
t-hah - 1 freq
thie - 8 freq
'tae' - 1 freq
tau - 2 freq
thoo' - 1 freq
'too - 1 freq
t'wo - 1 freq
t'tow - 1 freq
- 3 freq
'thy - 3 freq
t'die - 1 freq
thoa - 4 freq
two'why - 1 freq
taé - 1 freq
t - 2 freq
t - 6 freq
þæt - 2 freq
t - 1 freq
teh - 1 freq
tao - 2 freq
tey - 3 freq
tua - 2 freq
the - 177 freq
tae'a - 1 freq
tih - 7 freq
- 1 freq
t - 688 freq
theh - 1 freq
the - 108 freq
tho - 2 freq
ti - 2 freq
they - 24 freq
tho - 1 freq
th - 2 freq
to - 17 freq
thay - 2 freq
thew - 1 freq
t - 2 freq
the - 3 freq
tae - 6 freq
twa - 1 freq
the - 6 freq
the - 8 freq
töd - 1 freq
taew - 1 freq
they - 2 freq
thai - 6 freq
thae - 2 freq
ti - 2 freq
they - 47 freq
tu - 2 freq
thae - 6 freq
thay - 2 freq
-to- - 1 freq
'tea' - 1 freq
'te - 1 freq
thoo - 2 freq
thee - 1 freq
the - 4 freq
thoo - 30 freq
too - 2 freq
to - 1 freq
tha - 1 freq
the - 1 freq
toh - 4 freq
tui - 2 freq
two - 2 freq
thai - 2 freq
twa - 2 freq
tha - 3 freq
tae - 1 freq
tei - 5 freq
theii - 1 freq
tae - 5 freq
thy - 2 freq
to - 4 freq
t - 1 freq
tae - 1 freq
t - 3 freq
tae - 1 freq
too - 1 freq
-t - 3 freq
'tooooo' - 1 freq
they - 2 freq
twa - 1 freq
thaa - 1 freq
tuyu - 1 freq
- 1 freq
‘to - 2 freq
tii - 1 freq
‘the - 2 freq
theaa - 1 freq
thea - 1 freq
the… - 1 freq
the“i” - 1 freq
“the - 4 freq
tooo - 1 freq
tuo - 1 freq
“to - 1 freq
“they - 1 freq
“tae - 2 freq
tiu - 1 freq
“tea - 1 freq
- 1 freq
tth - 1 freq
tew - 1 freq
tea’ - 1 freq
tthe - 1 freq
tuw - 1 freq
tuu - 1 freq
'teu' - 1 freq
ta” - 1 freq
theo - 1 freq
teo - 1 freq
MetaPhone code - TW
twa - 3470 freq
two - 717 freq
twae - 228 freq
towe - 21 freq
two' - 1 freq
dowie - 119 freq
'twa - 13 freq
dowe - 1 freq
twaa - 30 freq
twae' - 1 freq
twa' - 1 freq
'two - 2 freq
twee - 5 freq
twi - 1 freq
t'wo - 1 freq
two'why - 1 freq
twa - 1 freq
two - 2 freq
twa - 2 freq
dewie - 1 freq
twa - 1 freq
dwa - 1 freq
Manually curated - TWAA
Time to execute Levenshtein function - 0.812547 milliseconds
The Levenshtein distance is the minimum number of single characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.
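
As an illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming calculation. It is not the code this site runs; the function name `levenshtein` is chosen for the example only.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # previous[j] is the distance between the prefix of a seen so far
    # (before the current character) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning i characters of a into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # replace char_a, or keep it if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("twaa", "twa"))    # 1 (delete one "a")
print(levenshtein("twaa", "twal"))   # 1 (replace "a" with "l")
print(levenshtein("twaa", "twain"))  # 2
print(levenshtein("twaa", "tara"))   # 2
```

These values agree with the distances shown in brackets in the Levenshtein list for twaa.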
Time to execute Double Levenshtein function - 1.073835 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the whole words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
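
The double calculation can be sketched in Python as follows. This is an illustration under stated assumptions rather than the site's own code: the vowel set (a, e, i, o, u) and the names `levenshtein`, `strip_vowels` and `double_levenshtein` are assumptions made for the example.

```python
VOWELS = set("aeiou")  # assumption: only these five letters count as vowels


def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop vowels; consonants and punctuation such as apostrophes stay."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-stripped words."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


print(double_levenshtein("twaa", "twa"))   # 1 + 0 = 1
print(double_levenshtein("twaa", "two"))   # 2 + 0 = 2
print(double_levenshtein("twaa", "twal"))  # 1 + 1 = 2
print(double_levenshtein("twaa", "tax"))   # 2 + 1 = 3
```

Under these assumptions the sketch gives the same scores as the Double Levenshtein list for twaa: a vowel-only change such as twa or two is cheap, while an extra or changed consonant such as twal or tax is counted twice.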
Time to execute SoundEx function - 0.027820 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
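
A minimal Python sketch of the classic (American) Soundex encoding shows why so many short words starting with t share the code T000. It is an illustration only: the site's own implementation may differ in detail, and dropping everything except the letters a to z is an assumption made here.

```python
# Consonant groups and their Soundex digits; vowels, h, w and y have no digit.
CODES = {}
for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")]:
    for letter in letters:
        CODES[letter] = digit


def soundex(word: str) -> str:
    """Return the four-character Soundex code, or "" if word has no letters."""
    # Assumption: anything other than a-z (apostrophes, hyphens, accents) is dropped.
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    last = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate consonants with the same digit
        digit = CODES.get(ch, "")
        if digit and digit != last:
            result += digit
        last = digit  # a vowel resets this, so a repeated consonant counts again
        if len(result) == 4:
            break
    return result.ljust(4, "0")  # pad short codes with zeros


print(soundex("twaa"))    # T000
print(soundex("the"))     # T000
print(soundex("Robert"))  # R163
```

Because vowels, h, w and y carry no digit, twaa, the, tae and two all reduce to the initial letter plus padding, which is why the SoundEx list is so long and so loosely related to twaa.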
Time to execute MetaPhone function - 0.172549 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute Manually curated function - 0.000863 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
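
A lookup of this kind can be sketched in Python as a plain dictionary. The entries below are hypothetical and for illustration only: apart from twaa giving TWAA, which is what the page shows, they are not taken from the site's lexicon, and the names `LEXICON` and `lookup_lemma` are invented for the example.

```python
from typing import Optional

# Hypothetical hand-built lexicon: spelling variant -> lemma.
# Only the "twaa" -> "TWAA" pair comes from the page; the rest is made up.
LEXICON = {
    "twaa": "TWAA",
    "twa": "TWAA",
    "twae": "TWAA",
}


def lookup_lemma(word: str) -> Optional[str]:
    """Return the curated lemma for a word, or None if it is not covered."""
    return LEXICON.get(word.strip().lower())


print(lookup_lemma("twaa"))   # TWAA
print(lookup_lemma("ablow"))  # None, because this sample table does not cover it
```

A dictionary lookup does no string comparison at all, which fits the very short execution time reported for the manually curated function.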