Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts

Levenshtein Distance

Words similar to "indians" in the corpus

Levenshtein
indians (0) - 11 freq; indian's (1) - 1 freq; indiane (1) - 1 freq; indian (1) - 32 freq; indiana (1) - 2 freq; bindins (2) - 1 freq; sinians (2) - 1 freq; mindins (2) - 15 freq; ninians (2) - 3 freq; injins (2) - 1 freq; india (2) - 21 freq; findins (2) - 4 freq; endins (2) - 13 freq; ingans (2) - 9 freq; ingins (2) - 14 freq; india' (2) - 1 freq; lydians (2) - 1 freq; innins (2) - 16 freq; indies (2) - 6 freq; inions (2) - 1 freq; biddins (3) - 12 freq; ondang (3) - 3 freq; brians (3) - 1 freq; trinians (3) - 1 freq; finding (3) - 16 freq

Double Levenshtein
indians (0) - 11 freq; endins (2) - 13 freq; indiana (2) - 2 freq; indian (2) - 32 freq; indian's (2) - 1 freq; indiane (2) - 1 freq; lydians (3) - 1 freq; ingins (3) - 14 freq; inions (3) - 1 freq; endeens (3) - 4 freq; ingans (3) - 9 freq; indies (3) - 6 freq; innins (3) - 16 freq; findins (3) - 4 freq; bindins (3) - 1 freq; ninians (3) - 3 freq; mindins (3) - 15 freq; injins (3) - 1 freq; ending (4) - 17 freq; dans (4) - 3 freq; ongyans (4) - 4 freq; undies (4) - 3 freq; sidins (4) - 1 freq; injines (4) - 2 freq; mendins (4) - 1 freq

SoundEx (code I535)
in-atween - 2 freq; intent - 37 freq; intimidatin - 2 freq; intendit - 25 freq; indiana - 2 freq; intention - 24 freq; intense - 24 freq; indian - 32 freq; indian's - 1 freq; immediant - 1 freq; intend - 10 freq; intensive - 7 freq; intimmers - 25 freq; inhaudin - 1 freq; intended - 14 freq; intimations - 1 freq; indiaman - 1 freq; intonit - 1 freq; inhauden - 2 freq; intentions - 16 freq; inatween - 7 freq; indomitable - 1 freq; indians - 11 freq; intently - 9 freq; intimidate - 1 freq; intantly - 1 freq; intensity - 6 freq; intentionally - 1 freq; intimate - 10 freq; intmint - 1 freq; intimmers' - 1 freq; intments - 1 freq; indentured - 1 freq; intenshuns - 1 freq; intin - 2 freq; inhauddin - 1 freq; immediantlie - 1 freq; indiane - 1 freq; inten - 2 freq; intennin - 1 freq; indentation - 1 freq; intonation - 11 freq; intimacy - 1 freq; intendant - 1 freq; immedanthe - 1 freq; immedantlie - 4 freq; intimatit - 5 freq; intimately - 1 freq; intensitie - 1 freq; intangible - 3 freq; intimidated - 1 freq; intiimers - 1 freq; intimeitit - 1 freq; intoned - 1 freq; intensifyin - 1 freq; intensely - 4 freq; indentify - 1 freq; intimidation - 2 freq; intemperate - 1 freq; intensified - 1 freq; inaathentic - 1 freq; intonaetion - 1 freq; intents - 2 freq; intimidatit - 1 freq; intendin - 3 freq; intented - 1 freq; intimbers - 1 freq; intimmer - 1 freq; inteemasee - 1 freq; imdoium - 1 freq; imodium - 1 freq; intensifier - 2 freq; indynursebrian - 1 freq; intmastclmc - 1 freq; indieandluna - 1 freq; iamnotanornithologist - 1 freq; indyonskye - 1 freq; intamzq - 1 freq; intensifiers - 2 freq; intensifer - 1 freq; indynowsnp - 1 freq; iandunt - 1 freq

MetaPhone (code INTNS)
intense - 24 freq; indian's - 1 freq; indians - 11 freq

Manually curated
INDIANS
Levenshtein (time to execute: 0.184499 milliseconds): The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It is useful for detecting typos and alternative spellings.

Double Levenshtein (time to execute: 0.328762 milliseconds): In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.

SoundEx (time to execute: 0.027246 milliseconds): Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.

MetaPhone (time to execute: 0.036735 milliseconds): Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.

Manually curated (time to execute: 0.000790 milliseconds): Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
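
The first three measures described above can be sketched in a few lines of Python. This is only an illustration, not the code this site runs: the function names are hypothetical, the vowel set (a, e, i, o, u plus y) is an assumption chosen because it reproduces the listed Double Levenshtein distances (for example lydians at 3), and the Soundex routine follows the common American Soundex rules, which may differ in detail from the implementation used here.

```python
# Assumption: "y" counts as a vowel; this matches the distances listed for
# words such as "lydians" and "ongyans", but the site does not state it.
VOWELS = set("aeiouy")


def levenshtein(a: str, b: str) -> int:
    """Count the single-character edits needed to turn a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # replace ca with cb (or keep a match)
            ))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop every character in VOWELS, keeping consonants and punctuation."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Full-word distance plus the distance between the vowel-less forms."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


# Letter groups of American Soundex; vowels, h, w and y carry no digit.
SOUNDEX_DIGITS = {
    ch: digit
    for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6"))
    for ch in letters
}


def soundex(word: str) -> str:
    """Return the four-character Soundex code, e.g. a letter and three digits."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    last = SOUNDEX_DIGITS.get(letters[0], "")
    for ch in letters[1:]:
        digit = SOUNDEX_DIGITS.get(ch, "")
        if digit and digit != last:
            code += digit
            if len(code) == 4:
                break
        if ch not in "hw":  # h and w do not separate equal neighbouring codes
            last = digit
    return code.ljust(4, "0")  # pad short codes with zeros


print(levenshtein("indians", "endins"))         # 2
print(double_levenshtein("indians", "endins"))  # 2 (vowel-less forms are equal)
print(double_levenshtein("indians", "indian"))  # 2
print(soundex("indians"))                       # I535
```

The second and third results show the effect of the doubled consonant weight: endins and indian are at different plain Levenshtein distances from indians (2 and 1), yet both end up at 2 because endins differs only in its vowels while indian loses a consonant.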
