Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts


Levenshtein Distance


Words similar to "edgar" in the corpus

Levenshtein

The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It is useful for detecting typos and alternative spellings.

Time to execute Levenshtein function - 0.172141 milliseconds

Matches, given as word (distance) - frequency:
edgar (0) - 10 freq
ledger (2) - 1 freq
edvard (2) - 2 freq
ceegar (2) - 1 freq
edter (2) - 1 freq
edr (2) - 1 freq
edgy (2) - 7 freq
dear (2) - 425 freq
sugar (2) - 88 freq
ergan (2) - 1 freq
edam (2) - 45 freq
edged (2) - 9 freq
edges (2) - 32 freq
teegar (2) - 4 freq
edra (2) - 5 freq
ndgear (2) - 1 freq
eeger (2) - 1 freq
edg (2) - 1 freq
eynar (2) - 34 freq
ear (2) - 143 freq
edge (2) - 190 freq
edder (2) - 21 freq
edgin (2) - 3 freq
elmar (2) - 2 freq
hagar (2) - 1 freq

Double Levenshtein

In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.

Time to execute Double Levenshtein function - 0.339248 milliseconds

Matches, given as word (distance) - frequency:
edgar (0) - 10 freq
dar (3) - 86 freq
odhar (3) - 1 freq
hagar (3) - 1 freq
edder (3) - 21 freq
edge (3) - 190 freq
doar (3) - 29 freq
edgin (3) - 3 freq
cigar (3) - 19 freq
edgit (3) - 1 freq
drar (3) - 1 freq
editar (3) - 2 freq
eager (3) - 27 freq
gar (3) - 162 freq
daar (3) - 11 freq
eddir (3) - 10 freq
edg (3) - 1 freq
lugar (3) - 4 freq
edgy (3) - 7 freq
dear (3) - 425 freq
edr (3) - 1 freq
edter (3) - 1 freq
ledger (3) - 1 freq
ceegar (3) - 1 freq
edged (3) - 9 freq

SoundEx

Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.

Time to execute SoundEx function - 0.027541 milliseconds

SoundEx code - E326
edgar - 10 freq
edcrick - 5 freq

MetaPhone

Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.

Time to execute MetaPhone function - 0.037477 milliseconds

MetaPhone code - ETKR
edgar - 10 freq

Manually curated

Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.

Time to execute Manually curated function - 0.000942 milliseconds

EDGAR
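
The descriptions above of Levenshtein, Double Levenshtein and SoundEx can be sketched in Python as follows. This is an illustrative sketch only, not the code this site runs: the function names are hypothetical, the vowel set (a, e, i, o, u, with y treated as a consonant) is an assumption, and the Soundex variant shown is the common American one.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or
    replacements needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the inner row short
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # replace ca with cb, or match
            ))
        previous = current
    return previous[-1]


VOWELS = set("aeiou")  # assumption: y is treated as a consonant


def strip_vowels(word: str) -> str:
    """Remove vowels so that only the consonant skeleton remains."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-free words,
    so that consonant differences are counted twice."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


# Letter-to-digit table for American Soundex.
SOUNDEX_CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        SOUNDEX_CODES[ch] = digit


def soundex(word: str) -> str:
    """First letter followed by three digits, padded with zeros."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    last = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not break a run of identical codes
        digit = SOUNDEX_CODES.get(ch, "")
        if digit and digit != last:
            code += digit
        last = digit  # a vowel resets the run, so a repeated code counts again
    return (code + "000")[:4]
```

With these definitions, `levenshtein("edgar", "ledger")` returns 2, `double_levenshtein("edgar", "ledger")` returns 3, and both `soundex("edgar")` and `soundex("edcrick")` return `"E326"`, which agrees with the figures listed on this page.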
