Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts


Levenshtein Distance


Words similar to nottinghamshire in the corpus

Levenshtein
nottinghamshire (0) - 1 freq
nottinghame (4) - 1 freq
nottingham (5) - 5 freq
whittinghame (7) - 2 freq
roxburghshire (7) - 5 freq
thingmie (8) - 1 freq
distinguishin (8) - 3 freq
jottings (8) - 2 freq
pairthshire (8) - 1 freq
notthegramma (8) - 2 freq
nightmare (8) - 18 freq
cunninghame (8) - 4 freq
doctrinaire (8) - 1 freq
kinglassie (8) - 4 freq
continuallie (8) - 1 freq
orthographie (8) - 3 freq
lancashire (8) - 4 freq
northside (8) - 2 freq
imturningvampire (8) - 1 freq
goinghome (8) - 1 freq
tattie-masher (8) - 1 freq
perthshire (8) - 14 freq
hampshire (8) - 1 freq
forfarshire (8) - 1 freq
birminghame (8) - 1 freq

Double Levenshtein
nottinghamshire (0) - 1 freq
nottinghame (7) - 1 freq
nottingham (8) - 5 freq
tattie-masher (12) - 1 freq
whittinghame (12) - 2 freq
roxburghshire (12) - 5 freq
tyninghame (13) - 1 freq
nothingness (13) - 1 freq
perthshire (13) - 14 freq
cunninghameha (13) - 1 freq
stirlingshire (13) - 1 freq
nittenstar (13) - 15 freq
distinguishin (13) - 3 freq
pairthshire (13) - 1 freq
notthegramma (13) - 2 freq
nothings (13) - 1 freq
jottings (13) - 2 freq
englandshire (13) - 5 freq
distinguisht (14) - 1 freq
distingwished (14) - 1 freq
non-english (14) - 5 freq
inverness-shire (14) - 1 freq
cuttings (14) - 1 freq
sittinrooms (14) - 1 freq
onceuponashire (14) - 7 freq

SoundEx code - N352
notions - 40 freq
nothing - 89 freq
natioun's - 1 freq
nations - 65 freq
naething - 226 freq
'naething - 2 freq
nothin's - 3 freq
nithing - 61 freq
nation's - 10 freq
naethinness - 2 freq
nodding - 4 freq
nation-state - 2 freq
nuthins - 1 freq
noathing - 2 freq
naethin's - 4 freq
nations' - 2 freq
'nuthin's - 1 freq
newton's - 1 freq
naithmaist - 1 freq
notheen's - 1 freq
naitiouns - 9 freq
naethins - 1 freq
nutmeg - 1 freq
notions' - 1 freq
neitions - 1 freq
nathing - 1 freq
natiouns - 2 freq
naethingness - 1 freq
nottingham - 5 freq
nooatimes - 1 freq
nottinghame - 1 freq
naething's - 1 freq
nothynge - 1 freq
neathing - 2 freq
‘naething - 3 freq
naethingelaine - 1 freq
nething - 1 freq
naetion-staet - 2 freq
nation-staetes - 1 freq
naetion-staets - 1 freq
nothingness - 1 freq
needing - 6 freq
nutmegged - 2 freq
nottinghamshire - 1 freq
‘nothing - 1 freq
notnixon - 2 freq
nmddynygy - 1 freq
nutmegs - 1 freq
neediness - 1 freq
nittenstar - 15 freq
nothins - 1 freq
nytimes - 1 freq
nothings - 1 freq

MetaPhone code - NTNFMXR
nottinghamshire - 1 freq

Manually curated
NOTTINGHAMSHIRE
Time to execute Levenshtein function - 0.337641 milliseconds
The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.

Time to execute Double Levenshtein function - 0.415338 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.

Time to execute SoundEx function - 0.027900 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.

Time to execute MetaPhone function - 0.038608 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.

Time to execute Manually curated function - 0.000811 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
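The first three measures described above can be sketched in a few lines of Python. This is an illustration only, not the code behind this page: the function names (levenshtein, strip_vowels, double_levenshtein, soundex) are hypothetical, the vowel set a, e, i, o, u is an assumption because the page does not say which letters it removes, and the Soundex shown is the common American variant, which may differ in detail from the one used here.

```python
# Assumption: the page does not state which letters count as vowels.
VOWELS = "aeiou"


def levenshtein(a: str, b: str) -> int:
    """Count the single-character replacements, insertions and deletions
    needed to turn a into b (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # replace or keep
        previous = current
    return previous[-1]


def strip_vowels(word: str, vowels: str = VOWELS) -> str:
    """Lower-case the word and drop every letter in the assumed vowel set."""
    return "".join(c for c in word.lower() if c not in vowels)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-stripped
    words, so consonant differences are counted twice."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


# American Soundex digit groups; vowels, h, w and y carry no digit.
SOUNDEX_DIGITS = {
    letter: digit
    for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                           "4": "l", "5": "mn", "6": "r"}.items()
    for letter in letters
}


def soundex(word: str) -> str:
    """First letter plus up to three digits, padded with zeros."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    code = letters[0].upper()
    last = SOUNDEX_DIGITS.get(letters[0], "")
    for c in letters[1:]:
        digit = SOUNDEX_DIGITS.get(c, "")
        if digit and digit != last:
            code += digit
        if c not in "hw":  # h and w do not separate repeated digits
            last = digit
    return (code + "000")[:4]


if __name__ == "__main__":
    for candidate in ("nottinghame", "nottingham"):
        print(candidate,
              levenshtein("nottinghamshire", candidate),
              double_levenshtein("nottinghamshire", candidate))
    print(soundex("nottinghamshire"), soundex("naething"))
```

For the two closest matches in the lists above, this sketch gives 4 and 7 for nottinghame and 5 and 8 for nottingham, and it encodes both nottinghamshire and naething as N352, which agrees with the figures shown on this page.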
