Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts


Levenshtein Distance


Words similar to "timin" in the Corpus

Levenshtein
timin (0) - 6 freq, tirin (1) - 4 freq, gimin (1) - 2 freq, timit (1) - 1 freq, tizin (1) - 3 freq, timing (1) - 1 freq, cimin (1) - 1 freq, wimin (1) - 1 freq, mimin (1) - 2 freq, aimin (1) - 10 freq, tiein (1) - 1 freq, tuimin (1) - 2 freq, teimin (1) - 1 freq, timid (1) - 16 freq, mimic (2) - 3 freq, typin (2) - 14 freq, sivin (2) - 4 freq, tistin (2) - 1 freq, mimij (2) - 1 freq, fumin (2) - 8 freq, trimlin (2) - 1 freq, tyin (2) - 12 freq, ainin (2) - 1 freq, imn (2) - 2 freq, twain (2) - 1 freq

Double Levenshtein
timin (0) - 6 freq, teimin (1) - 1 freq, tuimin (1) - 2 freq, tirin (2) - 4 freq, timit (2) - 1 freq, tomini (2) - 2 freq, tman (2) - 1 freq, teemin (2) - 26 freq, timeen (2) - 1 freq, gimin (2) - 2 freq, timid (2) - 16 freq, timing (2) - 1 freq, tizin (2) - 3 freq, cimin (2) - 1 freq, wimin (2) - 1 freq, aimin (2) - 10 freq, mimin (2) - 2 freq, tiein (2) - 1 freq, tims (3) - 8 freq, aitin (3) - 21 freq, wumin (3) - 5 freq, tooin (3) - 1 freq, tim (3) - 46 freq, amin (3) - 1 freq, tion (3) - 1 freq

SoundEx
SoundEx code - T550
them-an - 4 freq, thin-an - 2 freq, teemin - 26 freq, twynin - 6 freq, thoomin - 2 freq, thaim-an - 2 freq, tuimin - 2 freq, tinnin - 1 freq, teeman - 2 freq, tuinin - 1 freq, teemen - 2 freq, timin - 6 freq, thinnin - 1 freq, thum-an - 2 freq, tannin - 8 freq, time-in - 1 freq, tenon - 1 freq, thim-an - 2 freq, tunin - 10 freq, tin-an - 1 freq, teename - 1 freq, tynin - 20 freq, timeen - 1 freq, thinnan - 1 freq, tuinan - 1 freq, teuman - 1 freq, taen-on - 3 freq, tømin - 1 freq, tømmin - 1 freq, twinin - 6 freq, taen-in - 1 freq, twinnin - 1 freq, tomini - 2 freq, toun-en - 1 freq, ten-man - 1 freq, t-name - 1 freq, teimin - 1 freq, teenie-weeny - 1 freq, teenie-weenie - 1 freq, two-man - 1 freq, tainan - 1 freq, tunin' - 1 freq, tonimo - 18 freq, tman - 1 freq, tinman - 1 freq, tinwnnh - 1 freq

MetaPhone
MetaPhone code - TMN
domain - 18 freq, dominie - 66 freq, damn - 73 freq, daimen - 4 freq, teemin - 26 freq, demon - 22 freq, demn - 2 freq, dimmin - 3 freq, tuimin - 2 freq, teeman - 2 freq, teemen - 2 freq, timin - 6 freq, dumbin - 2 freq, damien - 2 freq, demaun - 1 freq, time-in - 1 freq, dimnae - 1 freq, 'damn - 3 freq, diamon - 2 freq, dem-an - 1 freq, deman - 1 freq, timeen - 1 freq, teuman - 1 freq, domino - 10 freq, tømin - 1 freq, tømmin - 1 freq, domini - 1 freq, tomini - 2 freq, teimin - 1 freq, ��damn - 3 freq, teamni - 1 freq, domine - 2 freq, tman - 1 freq, damian - 2 freq, demean - 1 freq, dmin - 1 freq

Manually curated
TIMIN
Time to execute Levenshtein function - 0.195648 milliseconds
The Levenshtein distance is the number of single-character replacements, insertions, or deletions needed to transform one word into another. It is useful for detecting typos and alternative spellings.

Time to execute Double Levenshtein function - 0.324358 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.

Time to execute SoundEx function - 0.026993 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.

Time to execute MetaPhone function - 0.037349 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.

Time to execute Manually curated function - 0.000793 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
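The Levenshtein, Double Levenshtein, and SoundEx descriptions above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the code behind this page: the function names are hypothetical, the vowel set (a, e, i, o, u) is a guess because the page does not say which letters are dropped, and the Soundex shown is the common American variant, which may differ from the site's own function in edge cases.

```python
VOWELS = set("aeiou")  # assumption: the page does not say which letters count as vowels


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or replacements."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # replace ca with cb, or keep it
            ))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop the assumed vowels so only the consonant skeleton remains."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Full-word distance plus vowel-less distance, so consonants count twice."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


# Letter-to-digit table of American Soundex.
SOUNDEX_CODES = {}
for group, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                     ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in group:
        SOUNDEX_CODES[letter] = digit


def soundex(word: str) -> str:
    """Four-character Soundex code; non-ASCII letters and punctuation are skipped."""
    letters = [ch for ch in word.lower() if ch.isascii() and ch.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    last = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = SOUNDEX_CODES.get(ch, "")
        if code:
            if code != last:  # adjacent letters with the same code collapse
                result += code
            last = code
        elif ch not in "hw":
            last = ""  # a vowel separates repeated codes; h and w do not
    return (result + "000")[:4]


if __name__ == "__main__":
    print(levenshtein("timin", "tirin"))          # 1
    print(double_levenshtein("timin", "teimin"))  # 1
    print(double_levenshtein("timin", "tirin"))   # 2
    print(soundex("timin"))                       # T550
```

Under these assumptions the sketch agrees with the listings above: teimin and tirin are both at Levenshtein distance 1 from timin, but tirin also differs in its consonants, so its Double Levenshtein distance rises to 2 while teimin stays at 1.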
