Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts


Levenshtein Distance


Similar words to tastin in Corpus

Levenshtein
The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings. The number in brackets is the distance from tastin.
Time to execute Levenshtein function - 0.168018 milliseconds
tastin (0) - 13 freq
tasting (1) - 5 freq
castin (1) - 48 freq
testin (1) - 15 freq
pastin (1) - 2 freq
bastin (1) - 1 freq
tistin (1) - 1 freq
tostin (1) - 1 freq
fastin (1) - 14 freq
tastit (1) - 29 freq
wastin (1) - 32 freq
lastin (1) - 9 freq
toastin (1) - 2 freq
tastie (1) - 2 freq
hoastin (2) - 18 freq
sautin (2) - 1 freq
statin (2) - 12 freq
cashin (2) - 1 freq
wantin (2) - 321 freq
cassin (2) - 4 freq
rasin (2) - 1 freq
tuttin (2) - 5 freq
oaftin (2) - 1 freq
rantin (2) - 9 freq
boastin (2) - 2 freq

Double Levenshtein
In a stroke of genius, this runs the Levenshtein function twice, once on the whole word and once with the vowels removed, and adds the two distances together, giving double weight to consonants. The number in brackets is the combined distance from tastin.
Time to execute Double Levenshtein function - 0.368926 milliseconds
tastin (0) - 13 freq
tostin (1) - 1 freq
tistin (1) - 1 freq
testin (1) - 15 freq
toastin (1) - 2 freq
tostan (2) - 1 freq
taisten (2) - 1 freq
testan (2) - 1 freq
tasting (2) - 5 freq
lastin (2) - 9 freq
tastie (2) - 2 freq
pastin (2) - 2 freq
castin (2) - 48 freq
wastin (2) - 32 freq
bastin (2) - 1 freq
fastin (2) - 14 freq
tastit (2) - 29 freq
postin (3) - 21 freq
lustin (3) - 1 freq
toastit (3) - 5 freq
ristin (3) - 3 freq
taisin (3) - 1 freq
fasten (3) - 2 freq
kistin (3) - 3 freq
tasted (3) - 20 freq

SoundEx
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
Time to execute SoundEx function - 0.028446 milliseconds
SoundEx code - T235
twistin - 14 freq
tichtened - 9 freq
testament - 42 freq
testimonials - 1 freq
tichtent - 3 freq
testin - 15 freq
textin - 8 freq
tight-mouthed - 1 freq
testimony - 5 freq
tastin - 13 freq
tightens - 2 freq
testimonie - 4 freq
toastin - 2 freq
taisten - 1 freq
ticht-and - 1 freq
twustin - 3 freq
taxation - 3 freq
tichten - 3 freq
tighten - 4 freq
testing - 12 freq
'testin - 1 freq
thocht-on - 1 freq
tightened - 2 freq
tweistin - 1 freq
testan - 1 freq
tea-stained - 1 freq
tightenin - 2 freq
tostan - 1 freq
tightnan - 2 freq
testaments - 4 freq
ticht-nailed - 1 freq
thochtiness - 1 freq
tectonic - 2 freq
testimonial - 3 freq
tichtens - 4 freq
tichtness - 2 freq
tostin - 1 freq
tichtenin - 2 freq
��testament - 1 freq
tasting - 5 freq
tistin - 1 freq
tightness - 1 freq
toastin' - 1 freq
twisting - 1 freq
toasting - 3 freq
tightening - 1 freq
texting - 1 freq
thesatinepheonixcardisgorgeous - 1 freq
tasteohame - 1 freq

MetaPhone
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation.[1] It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute MetaPhone function - 0.038387 milliseconds
MetaPhone code - TSTN
decidin - 17 freq
testin - 15 freq
destiny - 12 freq
dustin - 12 freq
tastin - 13 freq
disdain - 10 freq
toastin - 2 freq
taisten - 1 freq
distain - 1 freq
destinie - 3 freq
decidein - 1 freq
distanee - 2 freq
'testin - 1 freq
testan - 1 freq
tostan - 1 freq
desydin - 1 freq
tostin - 1 freq
tistin - 1 freq
��destiny - 1 freq
deistin - 3 freq
toastin' - 1 freq

Manually curated
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
Time to execute Manually curated function - 0.000844 milliseconds
TASTIN
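
The two edit-distance measures described above can be sketched in a few lines of Python. This is only an illustration, not the code the site runs: the function names are made up for the example, and treating a, e, i, o and u as the vowels is an assumption, since the page does not say which letters are stripped.

```python
# Assumption: the page does not say which letters count as vowels.
VOWELS = set("aeiou")


def levenshtein(a: str, b: str) -> int:
    """Count the single-character inserts, deletes or replacements turning a into b."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning the first i characters of a into "" takes i deletes
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # replace, or keep when the characters match
            ))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop the vowels so that only the consonant skeleton is compared."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the whole words plus distance on the vowel-free words."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


if __name__ == "__main__":
    for word in ("tasting", "testin", "castin", "tasted"):
        print(word, levenshtein("tastin", word), double_levenshtein("tastin", word))
```

Under these assumptions the sketch agrees with the figures listed for tastin: testin stays at 1 in both measures because only a vowel differs, while castin goes from 1 to 2 because the differing letter is a consonant and is counted twice.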

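For the SoundEx grouping, here is a minimal sketch of the classic American Soundex rules in Python. It is an assumption that the site follows these standard rules; the function name and the choice to ignore apostrophes and hyphens are illustrative only.

```python
# Standard Soundex letter groups; vowels, h, w and y carry no digit.
SOUNDEX_DIGITS = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        SOUNDEX_DIGITS[letter] = digit


def soundex(word: str) -> str:
    """Return the four-character Soundex code, or "" if the word has no letters."""
    # Assumption: anything that is not a-z (apostrophes, hyphens) is ignored.
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    code = letters[0].upper()
    last_digit = SOUNDEX_DIGITS.get(letters[0], "")
    for ch in letters[1:]:
        digit = SOUNDEX_DIGITS.get(ch, "")
        if digit and digit != last_digit:
            code += digit
            if len(code) == 4:
                break
        if ch not in "hw":
            # Vowels reset the previous digit; h and w leave it in place,
            # so equal digits separated only by h or w are coded once.
            last_digit = digit
    return code.ljust(4, "0")  # pad short codes with zeros


if __name__ == "__main__":
    for word in ("tastin", "twistin", "tichtened", "tight-mouthed"):
        print(word, soundex(word))
```

Because the code keeps only the first letter and the next three consonant classes, long words such as testimonials collapse to the same T235 as tastin, which is why the SoundEx list is so much broader than the edit-distance lists.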