A Corpus of 21st Century Scots Texts

Levenshtein Distance

Similar words to cigarette in Corpus

Levenshtein
cigarette (0) - 21 freq
cigarettes (1) - 9 freq
ceegarette (2) - 1 freq
ceigarettes (2) - 1 freq
cígarette (2) - 1 freq
gazette (3) - 3 freq
ungaretti (3) - 1 freq
garotte (3) - 1 freq
ceegarettes (3) - 7 freq
claethe (4) - 1 freq
cretter (4) - 1 freq
gretter (4) - 2 freq
vignettes (4) - 1 freq
clearest (4) - 4 freq
clarence (4) - 2 freq
grete (4) - 2 freq
margarete (4) - 1 freq
concrete (4) - 60 freq
chatte (4) - 6 freq
majorette (4) - 2 freq
carte (4) - 4 freq
lafayette (4) - 3 freq
cabaret (4) - 2 freq
claudette (4) - 1 freq
discrete (4) - 3 freq
Double Levenshtein
cigarette (0) - 21 freq
ceegarette (2) - 1 freq
cigarettes (2) - 9 freq
ceigarettes (3) - 1 freq
ceegarettes (4) - 7 freq
garotte (4) - 1 freq
ungaretti (4) - 1 freq
cígarette (4) - 1 freq
grett (5) - 34 freq
gazette (5) - 3 freq
regretted (6) - 4 freq
charitie (6) - 2 freq
gavotte (6) - 1 freq
cartie (6) - 1 freq
crete (6) - 14 freq
courgette (6) - 1 freq
ciabatta (6) - 1 freq
lazaretto (6) - 1 freq
migrate (6) - 1 freq
claret (6) - 8 freq
grott (6) - 1 freq
greitt (6) - 1 freq
gritty (6) - 5 freq
grottie (6) - 1 freq
gritt (6) - 1 freq
SoundEx code - C263
cigarette - 21 freq
cigarettes - 9 freq
cowk-wirthy - 1 freq
choochert - 1 freq
chequered - 1 freq
checquered - 1 freq
ceegarettes - 7 freq
ceegarette - 1 freq
ceigarettes - 1 freq
MetaPhone code - SKRT
secret - 198 freq
scared - 45 freq
screed - 32 freq
skreid - 18 freq
scairt - 6 freq
skirt - 56 freq
skairt - 1 freq
saicret - 37 freq
skeert - 1 freq
cigarette - 21 freq
security - 55 freq
scart - 23 freq
sigurd - 40 freq
skyrit - 1 freq
sacred - 26 freq
scurrit - 2 freq
scarred - 15 freq
sugart - 1 freq
squared - 6 freq
scrat - 31 freq
scoured - 5 freq
saucrit - 6 freq
scaured - 4 freq
scourit - 2 freq
squarrt - 1 freq
sukkert - 1 freq
squirt - 4 freq
scoort - 2 freq
sacrit - 4 freq
so-cried - 1 freq
scratty - 4 freq
scoored - 3 freq
security' - 1 freq
scaredy - 1 freq
scurried - 5 freq
scored - 41 freq
scort - 1 freq
socried - 1 freq
sacret - 1 freq
secured - 4 freq
saicred - 2 freq
skrit - 6 freq
skurt - 4 freq
scrit - 6 freq
sae-cried - 3 freq
scar'd - 1 freq
skreed - 1 freq
skord - 1 freq
skart - 2 freq
scaird - 1 freq
securit - 2 freq
skurried - 1 freq
screid - 8 freq
seecrit - 1 freq
scaurt - 2 freq
securitie - 3 freq
skared - 1 freq
skaired - 6 freq
saecret - 1 freq
ceegarette - 1 freq
sugared - 1 freq
seicret - 3 freq
sikkart - 1 freq
scurred - 2 freq
skyred - 1 freq
scored- - 1 freq
scrote - 1 freq
zqqrd - 1 freq
Manually curated - CIGARETTE
Time to execute Levenshtein function - 0.186996 milliseconds
The Levenshtein distance is the number of single-character replacements, insertions or deletions needed to transform one word into another. It is useful for detecting typos and alternative spellings.
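As an illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming edit distance. It is not the code this site runs; the function name `levenshtein` is chosen for the example. It accepts either text or bytes, because the listed distance of 2 for cígarette suggests the site may compare bytes rather than characters, which is an assumption here.

```python
def levenshtein(a, b):
    """Return the edit distance between two sequences (str or bytes)."""
    # previous[j] is the distance between the prefix of `a` handled so far
    # and the first j items of `b`; start with the empty prefix of `a`.
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]  # turning i items of `a` into nothing takes i deletions
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # delete item_a
                current[j - 1] + 1,      # insert item_b
                previous[j - 1] + cost,  # replace, or keep when equal
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    for word in ("cigarettes", "ceegarette", "gazette"):
        print(word, levenshtein("cigarette", word))
    # Byte-wise comparison (assumption): the accented letter is two UTF-8 bytes.
    print(levenshtein("cigarette".encode("utf-8"), "c\u00edgarette".encode("utf-8")))
```

Run on the query word, this gives 1 for cigarettes, 2 for ceegarette and 3 for gazette, in line with the list above.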
Time to execute Double Levenshtein function - 0.346591 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
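The doubled measure can be sketched in Python as follows. This is a guess at the approach, not the site's code: the names `double_levenshtein` and `strip_vowels` are made up for the example, and treating only a, e, i, o and u as vowels is an assumption.

```python
VOWELS = set("aeiou")  # assumption: y and accented letters are not treated as vowels


def levenshtein(a, b):
    """Standard dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word):
    """Drop the vowels so that only the consonant skeleton remains."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a, b):
    """Full-word distance plus the distance between consonant skeletons."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


if __name__ == "__main__":
    for word in ("cigarettes", "ceegarette", "gazette"):
        print(word, double_levenshtein("cigarette", word))
```

Under these assumptions ceegarette scores 2 + 0 = 2, because only vowels differ, while gazette scores 3 + 2 = 5, which agrees with the Double Levenshtein list above.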
Time to execute SoundEx function - 0.027519 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
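To show how such a code is built, here is a small Python sketch of the common American Soundex rules: keep the first letter, map the remaining consonants to digits, collapse adjacent repeats, and pad or cut to four characters. The function name `soundex` and the choice to ignore hyphens and non-ASCII letters are assumptions for this example, and the site's own function may differ in edge cases.

```python
# Consonant groups that share a Soundex digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word):
    """Return a four-character Soundex code, or '' if there are no letters."""
    # Assumption: anything that is not an ASCII letter is simply ignored.
    letters = [ch for ch in word.lower() if ch.isascii() and ch.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate two letters with the same digit
        digit = CODES.get(ch, "")
        if digit and digit != previous:
            code += digit
            if len(code) == 4:
                break
        previous = digit  # a vowel resets this to "", allowing a repeat
    return code.ljust(4, "0")


if __name__ == "__main__":
    for word in ("cigarette", "cowk-wirthy", "chequered"):
        print(word, soundex(word))
```

With these rules cigarette, cowk-wirthy and chequered all come out as C263, the code shown for the SoundEx list above.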
Time to execute MetaPhone function - 0.037820 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute Manually curated function - 0.000949 milliseconds
Manual Curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
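A lookup of this kind can be pictured as a plain dictionary, as in the Python sketch below. The table contents are illustrative only: the cigarette to CIGARETTE pairing is taken from the result shown on this page, the other two entries are hypothetical, and the names `LEMMA_TABLE` and `curated_lemma` are made up for the example.

```python
# Illustrative lexicon mapping word forms to an upper-case lemma.
LEMMA_TABLE = {
    "cigarette": "CIGARETTE",   # pairing shown in the results on this page
    "cigarettes": "CIGARETTE",  # hypothetical entry
    "ceegarette": "CIGARETTE",  # hypothetical entry
}


def curated_lemma(word):
    """Return the curated lemma for a word, or None if it is not covered."""
    return LEMMA_TABLE.get(word.lower())


if __name__ == "__main__":
    print(curated_lemma("cigarette"))
    print(curated_lemma("zqqrd"))  # not in the table, so None
```

A dictionary lookup does no string comparison against the rest of the corpus, which fits with this method having the shortest execution time reported above.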