Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts


Levenshtein Distance


Words similar to "texters" in the Corpus

Levenshtein

The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It is useful for detecting typos and alternative spellings.

Time to execute Levenshtein function - 0.191123 milliseconds

texters (0) - 1 freq
tatters (2) - 5 freq
torters (2) - 1 freq
textur (2) - 1 freq
teers (2) - 4 freq
reuters (2) - 6 freq
peters (2) - 4 freq
texes (2) - 3 freq
sesters (2) - 3 freq
eaters (2) - 2 freq
teigers (2) - 2 freq
betters (2) - 6 freq
refters (2) - 1 freq
merters (2) - 1 freq
sterters (2) - 1 freq
letters (2) - 142 freq
mixters (2) - 2 freq
efters (2) - 1 freq
belters (2) - 6 freq
teeter (2) - 1 freq
beaters (2) - 2 freq
lecters (2) - 1 freq
penters (2) - 2 freq
oxters (2) - 46 freq
texture (2) - 9 freq

Double Levenshtein

In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.

Time to execute Double Levenshtein function - 0.355598 milliseconds

texters (0) - 1 freq
textures (2) - 2 freq
baxters (3) - 4 freq
texts (3) - 101 freq
mixters (3) - 2 freq
oxters (3) - 46 freq
titters (3) - 1 freq
texture (3) - 9 freq
tweeters (3) - 1 freq
textur (3) - 1 freq
text's (3) - 1 freq
torters (3) - 1 freq
tasters (3) - 1 freq
tatters (3) - 5 freq
peiters (4) - 1 freq
techers (4) - 1 freq
tempers (4) - 2 freq
meters (4) - 12 freq
heaters (4) - 5 freq
dexter (4) - 4 freq
kelters (4) - 1 freq
pelters (4) - 9 freq
extrees (4) - 1 freq
feexturs (4) - 4 freq
tartars (4) - 1 freq

SoundEx

Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.

Time to execute SoundEx function - 0.028978 milliseconds

SoundEx code - T236

thegither - 842 freq
taegither - 44 freq
thegither' - 4 freq
thigether - 5 freq
the-streen - 1 freq
tichter - 11 freq
the-gither - 7 freq
together - 55 freq
thigither - 9 freq
thegithir - 5 freq
thegaither - 11 freq
tight-arsed - 1 freq
thegether - 44 freq
tegither - 4 freq
tegether - 1 freq
thouchtr - 1 freq
thegethir - 1 freq
taegether - 32 freq
taegcther - 1 freq
teckie-drawin - 6 freq
texture - 9 freq
tighter - 4 freq
tightrope - 2 freq
thegeither - 1 freq
twuster - 4 freq
twustér - 2 freq
thegither's - 2 freq
th'gither - 13 freq
th'gither' - 1 freq
teuchters - 11 freq
toaster - 6 freq
textures - 2 freq
taciturn - 1 freq
thegaithir - 1 freq
tax-gaitherers - 1 freq
texters - 1 freq
tagidder - 58 freq
textured - 2 freq
teuchter - 42 freq
t'stoaries - 1 freq
th'geither - 1 freq
thegitherness - 2 freq
tigithir - 7 freq
tigithir' - 1 freq
thegidder - 13 freq
teuchterin - 1 freq
tagedder - 2 freq
thegither-- - 1 freq
togither - 4 freq
togydder - 1 freq
ticht-reined - 1 freq
taegethir - 1 freq
'thegither' - 1 freq
tageedir - 1 freq
taygither - 1 freq
taygether - 1 freq
tuechter - 2 freq
thessehydro - 1 freq
teuchtermusic - 1 freq
tasters - 1 freq
teuchtertoni - 15 freq
teuchterontour - 1 freq
togetherdarling - 1 freq
taijdrmhw - 1 freq
tigither - 2 freq
textur - 1 freq
thoughtyouwereinyirfavecity - 1 freq
teuchterdavid - 1 freq
thegither- - 1 freq
twister - 2 freq
thestranger - 4 freq
thuckster - 14 freq
tastier - 1 freq
together' - 1 freq
toastywarm - 1 freq
twisterfilm - 1 freq

MetaPhone

Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.

Time to execute MetaPhone function - 0.038238 milliseconds

MetaPhone code - TKSTRS

dexterous - 1 freq
textures - 2 freq
dextrous - 1 freq
texters - 1 freq

Manually curated

Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.

Time to execute Manually curated function - 0.000966 milliseconds

TEXTERS
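The Levenshtein, Double Levenshtein and SoundEx measures described above can be sketched in Python. This is a minimal illustration, not the site's actual code: the function names, the vowel set (a, e, i, o, u) used for the vowel-stripped pass, and the choice to ignore non-letters in the Soundex function are all assumptions. Under those assumptions the sketch agrees with a few of the values listed above, for example a Double Levenshtein distance of 3 between "texters" and "baxters" and the SoundEx code T236 for "texters".

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character replacements, insertions or deletions
    needed to turn a into b (row-by-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the rows as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # replacement
        previous = current
    return previous[-1]


VOWELS = set("aeiou")  # assumption: the page does not say which letters count as vowels


def strip_vowels(word: str) -> str:
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Full-word distance plus the distance between the vowel-stripped
    words, so consonant differences are counted twice."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


SOUNDEX_DIGITS = {
    **dict.fromkeys("bfpv", "1"),
    **dict.fromkeys("cgjkqsxz", "2"),
    **dict.fromkeys("dt", "3"),
    "l": "4",
    **dict.fromkeys("mn", "5"),
    "r": "6",
}


def soundex(word: str) -> str:
    """American Soundex: first letter plus three digits.
    Assumption: anything that is not an ASCII letter is ignored."""
    letters = [ch for ch in word.lower() if ch.isascii() and ch.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    last = SOUNDEX_DIGITS.get(letters[0], "")
    for ch in letters[1:]:
        digit = SOUNDEX_DIGITS.get(ch, "")
        if digit and digit != last:
            code += digit
        if ch not in "hw":  # h and w do not separate repeated codes
            last = digit
        if len(code) == 4:
            break
    return code.ljust(4, "0")


print(levenshtein("texters", "letters"))         # 2
print(double_levenshtein("texters", "baxters"))  # 3
print(soundex("thegither"))                      # T236
```

Because the second pass compares only consonants, "letters" (which differs from "texters" in two consonants) moves from distance 2 to 4, while "textures" (which differs only in vowels) stays at 2.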
