Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts

Levenshtein Distance

Similar words to thickness in Corpus

Levenshtein

thickness (0) - 7 freq
thickest (2) - 1 freq
sickness (2) - 8 freq
'sickness (2) - 1 freq
thinness (2) - 1 freq
thickens (2) - 1 freq
harkness (3) - 2 freq
trackless (3) - 1 freq
highness (3) - 1 freq
rockness (3) - 1 freq
chuckneys (3) - 2 freq
dairkness (3) - 16 freq
richness (3) - 6 freq
thicket (3) - 1 freq
tidiness (3) - 1 freq
epicness (3) - 1 freq
swackness (3) - 1 freq
tuimness (3) - 6 freq
seikness (3) - 7 freq
thinnest (3) - 2 freq
bleckness (3) - 3 freq
thinkers (3) - 6 freq
likness (3) - 1 freq
whiteness (3) - 9 freq
chiefness (3) - 1 freq

Time to execute Levenshtein function - 0.180747 milliseconds

The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It is useful for detecting typos and alternative spellings.

Double Levenshtein

thickness (0) - 7 freq
thickens (3) - 1 freq
sickness (4) - 8 freq
thinness (4) - 1 freq
thickest (4) - 1 freq
'sickness (4) - 1 freq
stickiness (5) - 2 freq
teuchness (5) - 1 freq
thenkless (5) - 1 freq
chickens (5) - 31 freq
thickish (5) - 1 freq
blackness (5) - 11 freq
hackneys (5) - 1 freq
bleckness (5) - 3 freq
thickos (5) - 1 freq
thraaness (5) - 1 freq
packness (5) - 1 freq
thankless (5) - 6 freq
trackless (5) - 1 freq
rockness (5) - 1 freq
chuckneys (5) - 2 freq
swackness (5) - 1 freq
harkness (5) - 2 freq
trickles (6) - 4 freq
chucknies (6) - 1 freq

Time to execute Double Levenshtein function - 0.365664 milliseconds

In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.

SoundEx

SoundEx code - T252

thoosans - 60 freq
taking - 42 freq
taxing - 1 freq
teaching - 30 freq
touching - 7 freq
teachins - 4 freq
technical - 26 freq
thousans - 24 freq
tokens - 5 freq
tcenayger - 1 freq
tecnaiger - 1 freq
thoosin's - 2 freq
thcing - 1 freq
thickness - 7 freq
technically - 8 freq
toughness - 1 freq
technique - 12 freq
techniques - 4 freq
tossing - 2 freq
'tossing - 2 freq
taichins - 1 freq
tiggy-winkle - 1 freq
'thoosans - 1 freq
takins - 2 freq
touchan's - 1 freq
tekno-economic - 1 freq
tcm's - 1 freq
thoosan-star - 1 freq
tokenistic - 1 freq
technicians - 4 freq
takkins - 1 freq
tecumseh - 1 freq
techincally - 1 freq
taikens - 1 freq
thoos'ns - 1 freq
technicalities - 1 freq
'tecumseh' - 1 freq
technicolor - 7 freq
teachings - 2 freq
ticking - 1 freq
teknicly - 1 freq
tokenism - 1 freq
techneecian - 2 freq
teuchness - 1 freq
technician - 1 freq
taikenistic - 1 freq
touchingly - 1 freq
taegang - 1 freq
teasing - 2 freq
t-sionnaich - 1 freq
tzwmcauzbk - 1 freq
tkmesg - 1 freq
txnx - 1 freq
tacking - 3 freq
tackings - 1 freq
tockens - 1 freq
thejamhouseedin - 1 freq
tcmck - 1 freq
tejmuk - 2 freq
tighnacoille - 2 freq
tsgnq - 1 freq
thickens - 1 freq
tieganstevenson - 1 freq

Time to execute SoundEx function - 0.048191 milliseconds

Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.

MetaPhone

MetaPhone code - 0KNS

thickness - 7 freq
thickens - 1 freq

Time to execute MetaPhone function - 0.037364 milliseconds

Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.

Manually curated

THICKNESS

Time to execute Manually curated function - 0.000857 milliseconds

Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
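The Levenshtein, Double Levenshtein and SoundEx measures described above can be sketched in a few lines of Python. This is only an illustration, not the site's own code: the function names are hypothetical, the vowel set used for the second pass is assumed to be a, e, i, o, u, and the Soundex variant is the common American one, which may differ in edge cases from whatever the site runs.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character replacements, insertions or deletions
    needed to turn a into b (row-by-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter word on the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # replace or match
        previous = current
    return previous[-1]


VOWELS = set("aeiou")  # assumption: the page does not say which letters count


def strip_vowels(word: str) -> str:
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-less words."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


# Letter-to-digit table for the common American Soundex variant.
SOUNDEX_CODES = {
    letter: digit
    for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                           "4": "l", "5": "mn", "6": "r"}.items()
    for letter in letters
}


def soundex(word: str) -> str:
    """First letter plus up to three digits, padded with zeros."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    last = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = SOUNDEX_CODES.get(ch, "")
        if digit and digit != last:
            code += digit
        if ch not in "hw":  # h and w do not separate repeated codes
            last = digit
    return (code + "000")[:4]


if __name__ == "__main__":
    for other in ("thickens", "sickness", "stickiness"):
        print(other,
              levenshtein("thickness", other),
              double_levenshtein("thickness", other),
              soundex(other))
```

Under these assumptions the sketch reproduces the figures listed for thickness: thickens scores 2 and 3, sickness scores 2 and 4, and both thickness and thickens encode to T252.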
