Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts


Levenshtein Distance


Words similar to "oed" in the Corpus

Levenshtein

The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.

Time to execute Levenshtein function - 0.218422 milliseconds

Matches, as word (distance) - frequency:
oed (0) - 6 freq; ded (1) - 4 freq; oped (1) - 1 freq; -ed (1) - 1 freq; med (1) - 184 freq; loed (1) - 4 freq; zed (1) - 1 freq; odd (1) - 142 freq; oeg (1) - 1 freq; ted (1) - 13 freq; oecd (1) - 1 freq; ond (1) - 1 freq; red (1) - 297 freq; yed (1) - 7 freq; oad (1) - 2 freq; aed (1) - 2 freq; roed (1) - 1 freq; old (1) - 178 freq; owed (1) - 13 freq; oe (1) - 13 freq; oid (1) - 1 freq; hed (1) - 1155 freq; oeq (1) - 1 freq; led (1) - 247 freq; goed (1) - 3 freq

Double Levenshtein

In a stroke of genius, this runs the Levenshtein function twice, once on the whole word and once with the vowels removed, and adds the two distances together, giving double weight to consonants.

Time to execute Double Levenshtein function - 0.465397 milliseconds

Matches, as word (distance) - frequency:
oed (0) - 6 freq; aed (1) - 2 freq; oad (1) - 2 freq; eed (1) - 11 freq; od (1) - 8 freq; ed (1) - 53 freq; yed (1) - 7 freq; oid (1) - 1 freq; deu (2) - 40 freq; ieed (2) - 1 freq; ad (2) - 126 freq; odo (2) - 1 freq; doe (2) - 6 freq; oda (2) - 5 freq; ode (2) - 13 freq; dei (2) - 3 freq; aud (2) - 32 freq; ocd (2) - 4 freq; ovd (2) - 3 freq; ned (2) - 43 freq; uid (2) - 1 freq; doea (2) - 1 freq; yid (2) - 87 freq; ieod (2) - 1 freq; yeed (2) - 1 freq

SoundEx

Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.

Time to execute SoundEx function - 0.027783 milliseconds

SoundEx code - O300

Matches, as word - frequency:
oot - 13916 freq; out - 790 freq; othe - 4 freq; o't - 277 freq; ooty - 21 freq; oot-d'ye - 1 freq; owt - 10 freq; oath - 13 freq; odd - 142 freq; 'oot - 12 freq; ode - 13 freq; od't - 4 freq; ot - 17 freq; oat - 12 freq; o'd - 13 freq; ootae - 90 freq; -odd - 3 freq; oota - 54 freq; ootwi - 17 freq; oawthe - 1 freq; owed - 13 freq; owet - 1 freq; oit - 1 freq; ooadaa - 1 freq; owte - 2 freq; outta - 5 freq; od - 8 freq; oottae - 1 freq; ooto - 85 freq; oed - 6 freq; oot' - 4 freq; ootd - 2 freq; owd - 1 freq; o'tay - 1 freq; out' - 2 freq; 'out - 2 freq; ootdae - 5 freq; o't' - 1 freq; owid - 7 freq; oatae - 1 freq; ott - 4 freq; oda - 5 freq; oot-o'-e-way - 2 freq; oot- - 1 freq; ��oot - 5 freq; oo'd - 1 freq; ��out - 2 freq; out-the-wey - 1 freq; out-waw - 1 freq; ��out - 1 freq; ��oot - 45 freq; oot-the-wey - 1 freq; oeht - 1 freq; 'oot' - 1 freq; odo - 1 freq; ootta - 2 freq; ��oot - 1 freq; othha - 1 freq; oad - 2 freq; ��ootwi - 2 freq; ��ode - 1 freq; outty - 2 freq; oudey - 1 freq; oid - 1 freq; o'the - 7 freq; othe - 8 freq; o'dee - 1 freq; o'at - 1 freq; oodie - 1 freq; outwi - 1 freq; oto - 1 freq; ot - 2 freq; oot - 1 freq; 'ootwi' - 1 freq; outa - 6 freq

MetaPhone

Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.

Time to execute MetaPhone function - 0.037272 milliseconds

MetaPhone code - OT

Matches, as word - frequency:
oot - 13916 freq; out - 790 freq; o't - 277 freq; ooty - 21 freq; owt - 10 freq; odd - 142 freq; 'oot - 12 freq; ode - 13 freq; ot - 17 freq; oat - 12 freq; o'd - 13 freq; ootae - 90 freq; -odd - 3 freq; oota - 54 freq; oit - 1 freq; ooadaa - 1 freq; owte - 2 freq; outta - 5 freq; od - 8 freq; oottae - 1 freq; ooto - 85 freq; oed - 6 freq; oot' - 4 freq; owd - 1 freq; o'tay - 1 freq; out' - 2 freq; 'out - 2 freq; o't' - 1 freq; oatae - 1 freq; ott - 4 freq; oda - 5 freq; oot- - 1 freq; ��oot - 5 freq; oo'd - 1 freq; ��out - 2 freq; ��out - 1 freq; ��oot - 45 freq; oeht - 1 freq; 'oot' - 1 freq; odo - 1 freq; ootta - 2 freq; ��oot - 1 freq; oad - 2 freq; ��ode - 1 freq; outty - 2 freq; oudey - 1 freq; oid - 1 freq; o'dee - 1 freq; o'at - 1 freq; oodie - 1 freq; oto - 1 freq; ot - 2 freq; oot - 1 freq; outa - 6 freq

Manually curated

Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.

Time to execute Manually curated function - 0.000935 milliseconds

Result: OED
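
The Levenshtein and Double Levenshtein descriptions above can be sketched in Python as follows. This is a minimal illustration, not the site's own code: the function names `levenshtein`, `strip_vowels` and `double_levenshtein` are hypothetical, and treating "y" as a vowel is an assumption inferred from the listing, where yed scores 1 against oed.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character replacements, insertions and deletions
    needed to turn word a into word b (classic dynamic programming)."""
    # previous[j] = distance between the prefix of a handled so far and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # replace char_a, or keep it if equal
            ))
        previous = current
    return previous[-1]


# Assumption: "y" is counted as a vowel (not stated on the page).
VOWELS = set("aeiouy")


def strip_vowels(word: str) -> str:
    """Return the word with every vowel removed."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the whole words plus distance on the consonant skeletons,
    so that a consonant difference is counted twice."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


print(levenshtein("oed", "ned"))         # 1: one replacement
print(double_levenshtein("oed", "ned"))  # 2: the extra consonant counts again
print(double_levenshtein("oed", "aed"))  # 1: only a vowel differs
```

Under these assumptions the sketch reproduces the distances in the listing: ned and aed are both one edit away from oed, but only ned changes the consonant skeleton, so it scores 2 while aed stays at 1.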

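The SoundEx code O300 reported for oed can be reproduced with a short sketch of the standard American Soundex rules. This is an illustration under stated assumptions, not the function the site runs: the name `soundex` and the choice to drop apostrophes, hyphens and other non-letters before encoding are hypothetical.

```python
# Letter-to-digit table used by American Soundex; vowels, h, w and y have no code.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word: str) -> str:
    """Return the four-character Soundex code of a word, or "" if it has no letters."""
    # Assumption: anything that is not an ASCII letter is ignored.
    letters = [ch for ch in word.lower() if ch.isascii() and ch.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()          # the first letter is kept as-is
    last = CODES.get(letters[0], "")     # its code still suppresses a repeat
    for ch in letters[1:]:
        if ch in "hw":
            continue                     # h and w do not separate equal codes
        code = CODES.get(ch, "")
        if code and code != last:
            result += code
        last = code                      # a vowel resets last to ""
    return (result + "000")[:4]          # pad with zeros, cut to four characters


print(soundex("oed"))   # O300
print(soundex("oot"))   # O300
print(soundex("old"))   # O430
```

Because vowels carry no code, oed, oot and out all collapse to O300, which is why the SoundEx listing is dominated by forms of oot. A word such as old encodes to O430 and so appears in the Levenshtein matches but not in the SoundEx ones.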