A Corpus of 21st Century Scots Texts


Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ahint


Similar words to hthy in Corpus

Methods: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated

Levenshtein (distance in brackets)
hthy (0) - 1 freq
'thy (1) - 3 freq
thy (1) - 97 freq
hhy (1) - 1 freq
hahn (2) - 2 freq
thyr (2) - 6 freq
thyn (2) - 1 freq
'shy (2) - 1 freq
shy (2) - 51 freq
itchy (2) - 36 freq
thc (2) - 3 freq
why (2) - 804 freq
lthv (2) - 2 freq
'they (2) - 48 freq
hly (2) - 1 freq
'th- (2) - 1 freq
hoey (2) - 1 freq
hehe (2) - 3 freq
heh (2) - 41 freq
hoh (2) - 1 freq
hilthy (2) - 1 freq
thy (2) - 2 freq
ths (2) - 5 freq
hy (2) - 5 freq
hay (2) - 124 freq
Double Levenshtein (summed distance in brackets)
hthy (0) - 1 freq
heth (2) - 25 freq
'thy (2) - 3 freq
hythe (2) - 1 freq
hhy (2) - 1 freq
thy (2) - 97 freq
hath (2) - 3 freq
atho (3) - 3 freq
thay (3) - 703 freq
kathy (3) - 4 freq
eth (3) - 8 freq
hkh (3) - 1 freq
htp (3) - 1 freq
htew (3) - 1 freq
pathy (3) - 1 freq
heath (3) - 2 freq
cathy (3) - 112 freq
th (3) - 2472 freq
'tha (3) - 16 freq
ht (3) - 8 freq
tha (3) - 6292 freq
'tho (3) - 1 freq
haha (3) - 65 freq
huh (3) - 15 freq
htt (3) - 1 freq
SoundEx code - H300
heid - 3256 freq
had - 4968 freq
hot - 203 freq
haud - 913 freq
he'd - 960 freq
head - 247 freq
het - 260 freq
hit - 1222 freq
'haud - 40 freq
heed - 348 freq
hid - 3407 freq
haed - 1597 freq
hoat - 50 freq
heid'd - 2 freq
hate - 188 freq
hide - 185 freq
heiddae - 2 freq
hidtae - 4 freq
hat - 176 freq
howd - 1 freq
hyte - 3 freq
hoyed - 8 freq
hud - 1280 freq
heat - 161 freq
hut - 85 freq
huda - 1 freq
hood - 36 freq
hed - 1101 freq
hae't - 8 freq
hawd - 11 freq
hod - 12 freq
heidie - 70 freq
'hudd - 3 freq
hudd - 3 freq
hath - 3 freq
hait - 16 freq
haad - 71 freq
hyde - 16 freq
haet - 66 freq
heady - 6 freq
howdie - 10 freq
hied - 16 freq
haut - 3 freq
haddie' - 2 freq
hidie - 1 freq
hout - 1 freq
hoot - 28 freq
'hit - 1 freq
hett - 35 freq
heidy - 16 freq
haddie - 9 freq
hoodie - 16 freq
hoo'd - 4 freq
heth - 25 freq
haetae - 7 freq
haitd - 1 freq
heid' - 4 freq
heet - 2 freq
hyed - 1 freq
'hd - 1 freq
how'd - 7 freq
howed - 2 freq
hot' - 1 freq
howdya - 1 freq
'he'd - 6 freq
hd - 4 freq
hei'd - 2 freq
heidwey - 4 freq
hid' - 1 freq
ht - 8 freq
hetty - 1 freq
hïd - 6 freq
'hoat - 1 freq
howdy - 1 freq
hae'd - 1 freq
het' - 1 freq
hi-doh - 6 freq
hadd - 70 freq
'hit' - 2 freq
hoid - 10 freq
heedtae - 1 freq
hyt - 1 freq
'het - 2 freq
'hat - 1 freq
hede - 4 freq
hatt - 1 freq
huid - 2 freq
hade - 1 freq
'hide - 1 freq
hedd - 117 freq
hoody - 1 freq
hidey - 2 freq
'head - 1 freq
hideawa - 1 freq
haid - 31 freq
heid-a - 1 freq
headie - 1 freq
heath - 2 freq
hythe - 1 freq
'hidie' - 1 freq
heid- - 1 freq
hie-heid - 3 freq
heywood - 1 freq
heute - 1 freq
huidie - 2 freq
hewitt - 3 freq
'had - 1 freq
hoidey - 1 freq
hud - 1 freq
hued - 1 freq
'hood - 1 freq
headwye - 1 freq
haud - 4 freq
hud - 1 freq
hit - 9 freq
heth - 1 freq
hid - 2 freq
hud - 1 freq
huttie - 1 freq
hid - 2 freq
huddie - 2 freq
hte - 1 freq
hyde' - 1 freq
hide - 1 freq
heedy - 3 freq
hiddae - 3 freq
howdoo - 1 freq
'heid - 2 freq
hattie - 1 freq
how-d - 1 freq
hout - 1 freq
howt - 1 freq
het - 1 freq
hyd - 7 freq
head - 1 freq
hadd - 3 freq
heyd - 1 freq
haaed - 1 freq
heidwie - 1 freq
hd - 1 freq
hi'd - 2 freq
hoad - 1 freq
'hud - 1 freq
he’d - 3 freq
haute - 1 freq
hoodoo - 1 freq
haddo - 1 freq
hudduo - 1 freq
heidi - 1 freq
“had - 1 freq
hthy - 1 freq
howty - 1 freq
hudty - 1 freq
htew - 1 freq
hudtae - 1 freq
htt - 1 freq
heyday - 1 freq
'had' - 1 freq
MetaPhone code - 0
the - 154319 freq
they - 11266 freq
though - 1167 freq
thae - 1219 freq
tho - 1074 freq
'the - 348 freq
tha - 6292 freq
'they - 48 freq
'thae - 1 freq
thou - 95 freq
thay - 703 freq
th - 2472 freq
they' - 13 freq
thai - 445 freq
the- - 2 freq
'tho - 1 freq
thy - 97 freq
thow - 2 freq
they-eh - 1 freq
the' - 572 freq
tho' - 48 freq
th' - 105 freq
'the' - 6 freq
thi - 2576 freq
thay' - 1 freq
thee - 233 freq
thu - 23 freq
thee' - 1 freq
thei - 5 freq
thé - 1 freq
thaw - 11 freq
tha' - 11 freq
thoo - 277 freq
theiy - 6 freq
'they' - 2 freq
they'u - 1 freq
'tha - 16 freq
thé - 1 freq
'th - 1 freq
'th- - 1 freq
°tha - 1 freq
'th' - 1 freq
'thou' - 2 freq
'thee' - 2 freq
'thy' - 2 freq
thie - 8 freq
'though - 2 freq
thoo' - 1 freq
'thy - 3 freq
hythe - 1 freq
thoa - 4 freq
the - 177 freq
theh - 1 freq
the - 108 freq
tho - 2 freq
they - 24 freq
tho - 1 freq
th - 2 freq
thay - 2 freq
wyth - 1 freq
thew - 1 freq
the - 3 freq
the - 6 freq
the - 8 freq
they - 2 freq
thai - 6 freq
thae - 2 freq
they - 47 freq
thae - 6 freq
thay - 2 freq
thoo - 2 freq
thee - 1 freq
the - 4 freq
thoo - 30 freq
tha - 1 freq
the - 1 freq
thai - 2 freq
tha - 3 freq
though - 1 freq
theii - 1 freq
thy - 2 freq
though - 1 freq
they - 2 freq
thaa - 1 freq
‘the - 2 freq
theaa - 1 freq
thea - 1 freq
the… - 1 freq
the“i” - 1 freq
“the - 4 freq
“they - 1 freq
wth - 1 freq
hthy - 1 freq
theo - 1 freq
Manually curated
HTHY
Time to execute Levenshtein function - 0.186173 milliseconds
The Levenshtein distance is the minimum number of single-character replacements, insertions or deletions needed to transform one word into another. It is useful for detecting typos and alternative spellings.
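As a rough illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming approach. It is not the site's own code; the function name `levenshtein` is chosen here for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Fewest single-character replacements, insertions or deletions turning a into b."""
    # previous[j] holds the distance between the prefix of a seen so far and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # replace ca with cb, or keep a match
            ))
        previous = current
    return previous[-1]


print(levenshtein("hthy", "thy"))  # one deletion
print(levenshtein("hthy", "why"))  # one deletion and one replacement
```

The two calls correspond to the entries listed as thy (1) and why (2) in the results.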
Time to execute Double Levenshtein function - 0.342368 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
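The idea can be sketched in a few lines of Python. This is an assumption-laden reconstruction, not the site's implementation: the names `double_levenshtein` and `strip_vowels` are hypothetical, and the vowel set is a guess. Treating y as a vowel is assumed here because it reproduces several of the listed scores, for example heth (2) and hythe (2).

```python
VOWELS = set("aeiouy")  # assumed vowel set; the page does not say which letters are dropped


def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance (standard dynamic-programming form)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop every character in the assumed vowel set, keeping everything else."""
    return "".join(c for c in word if c.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-less words."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


print(double_levenshtein("hthy", "heth"))   # 2 on the full words + 0 on "hth" vs "hth"
print(double_levenshtein("hthy", "kathy"))  # 2 on the full words + 1 on "hth" vs "kth"
```

Because the consonant skeleton is compared twice, words that differ only in their vowels score lower than words that differ in a consonant.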
Time to execute SoundEx function - 0.027735 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
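For illustration, a minimal Python sketch of the classic American Soundex rules follows. It is not taken from the site; the function name `soundex` and the choice to ignore apostrophes, hyphens and non-ASCII characters are assumptions made here so that forms like he'd can be encoded.

```python
# Letter groups of classic Soundex; vowels plus h, w and y carry no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word: str) -> str:
    """Return the four-character Soundex code, or "" if the word has no ASCII letters."""
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    last = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "hw":
            continue  # h and w do not separate two letters with the same digit
        code = CODES.get(c, "")
        if code and code != last:
            result += code
            if len(result) == 4:
                break
        last = code  # a vowel resets this, so a repeated digit can be emitted again
    return result.ljust(4, "0")


for w in ("hthy", "heid", "he'd", "hewitt"):
    print(w, soundex(w))
```

All four sample words share one code, which is why they land in the same SoundEx group in the results.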
Time to execute MetaPhone function - 0.037149 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute Manually curated function - 0.000820 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
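A lookup of this kind can be sketched as a plain dictionary. The entries below are invented purely for illustration and are not taken from the site's lexicon; `LEXICON` and `lookup_lemma` are hypothetical names.

```python
from typing import Optional

# Hypothetical hand-made entries: spelling variant -> lemma.
LEXICON = {
    "haud": "haud",
    "'haud": "haud",
    "hawd": "haud",
    "heid": "heid",
    "'heid": "heid",
}


def lookup_lemma(word: str, lexicon: dict = LEXICON) -> Optional[str]:
    """Return the curated lemma for a word, or None when the word is not covered."""
    return lexicon.get(word.lower())


print(lookup_lemma("hawd"))  # found in the table
print(lookup_lemma("hthy"))  # not covered, so None
```

A dictionary lookup involves no string comparison against the rest of the corpus, which fits this method being the fastest of the five in the timings.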