Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts


Levenshtein Distance


Similar words to nation-staetes in Corpus

Levenshtein
nation-staetes (0) - 1 freq
naetion-staets (2) - 1 freq
nation-state (2) - 2 freq
naetion-staet (3) - 2 freq
non-starter (5) - 1 freq
nationals (6) - 3 freq
nationalities (6) - 5 freq
nationalists (6) - 7 freq
palin-stabs (6) - 1 freq
rationales (6) - 1 freq
marionette (7) - 1 freq
tinysteps (7) - 3 freq
nativities (7) - 1 freq
apron-tails (7) - 1 freq
hailsteenes (7) - 1 freq
kick-started (7) - 1 freq
anti-stress (7) - 1 freq
nationalistic (7) - 2 freq
doon-stairs (7) - 1 freq
causey-stanes (7) - 8 freq
maistet's (7) - 1 freq
hailstanes (7) - 2 freq
reinstate (7) - 1 freq
translates (7) - 15 freq
rattlestanes (7) - 1 freq

Double Levenshtein
nation-staetes (0) - 1 freq
naetion-staets (2) - 1 freq
nation-state (3) - 2 freq
naetion-staet (4) - 2 freq
nationalists (8) - 7 freq
non-starter (8) - 1 freq
interstates (9) - 1 freq
palin-stabs (9) - 1 freq
non-scots (9) - 3 freq
nationalities (9) - 5 freq
anti-stress (10) - 1 freq
nationalistic (10) - 2 freq
doon-stairs (10) - 1 freq
high-status (10) - 2 freq
airn-stith (10) - 1 freq
nationalist (10) - 18 freq
top-stories (10) - 1 freq
tinysteps (10) - 3 freq
nationals (10) - 3 freq
nations' (11) - 2 freq
anatomists (11) - 1 freq
pan-scots (11) - 1 freq
windin-sheets (11) - 1 freq
consates (11) - 2 freq
staetes (11) - 1 freq

SoundEx
SoundEx code - N352
notions - 40 freq
nothing - 89 freq
natioun's - 1 freq
nations - 65 freq
naething - 226 freq
'naething - 2 freq
nothin's - 3 freq
nithing - 61 freq
nation's - 10 freq
naethinness - 2 freq
nodding - 4 freq
nation-state - 2 freq
nuthins - 1 freq
noathing - 2 freq
naethin's - 4 freq
nations' - 2 freq
'nuthin's - 1 freq
newton's - 1 freq
naithmaist - 1 freq
notheen's - 1 freq
naitiouns - 9 freq
naethins - 1 freq
nutmeg - 1 freq
notions' - 1 freq
neitions - 1 freq
nathing - 1 freq
natiouns - 2 freq
naethingness - 1 freq
nottingham - 5 freq
nooatimes - 1 freq
nottinghame - 1 freq
naething's - 1 freq
nothynge - 1 freq
neathing - 2 freq
��naething - 3 freq
naethingelaine - 1 freq
nething - 1 freq
naetion-staet - 2 freq
nation-staetes - 1 freq
naetion-staets - 1 freq
nothingness - 1 freq
needing - 6 freq
nutmegged - 2 freq
nottinghamshire - 1 freq
��nothing - 1 freq
notnixon - 2 freq
nmddynygy - 1 freq
nutmegs - 1 freq
neediness - 1 freq
nittenstar - 15 freq
nothins - 1 freq
nytimes - 1 freq
nothings - 1 freq

MetaPhone
MetaPhone code - NXNSTTS
nation-staetes - 1 freq
naetion-staets - 1 freq

Manually curated
NATION-STAETES
Time to execute Levenshtein function - 0.269200 milliseconds
The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.

Time to execute Double Levenshtein function - 0.535164 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the words as written and once with the vowels removed, and adds the two distances together, giving double weight to consonants.

Time to execute SoundEx function - 0.029411 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.

Time to execute MetaPhone function - 0.037989 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.

Time to execute Manually curated function - 0.000886 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
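
The three simplest measures described above can be sketched in a few lines of Python. This is an illustrative sketch, not the code behind this page: the function names are made up for the example, the vowel set (a, e, i, o, u) used for the Double Levenshtein is an assumption, and the Soundex shown is the standard American variant, which may differ in detail from whatever the site runs. With those assumptions it reproduces the distances and the N352 code listed for nation-staetes.

```python
VOWELS = set("aeiou")  # assumption: the page does not say which letters count as vowels

def levenshtein(a, b):
    """Number of single-character inserts, deletes or replacements turning a into b."""
    prev = list(range(len(b) + 1))           # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        curr = [i]                           # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # replace (free if equal)
        prev = curr
    return prev[-1]

def strip_vowels(word):
    return "".join(ch for ch in word if ch.lower() not in VOWELS)

def double_levenshtein(a, b):
    """Full-word distance plus the distance between the vowel-less forms."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))

# Standard American Soundex digit groups.
SOUNDEX_CODES = {}
for digit, group in (("1", "bfpv"), ("2", "cgjkqsxz"), ("3", "dt"),
                     ("4", "l"), ("5", "mn"), ("6", "r")):
    for letter in group:
        SOUNDEX_CODES[letter] = digit

def soundex(word):
    """First letter plus up to three digits; non-letters are ignored."""
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue                         # h and w do not separate repeated codes
        code = SOUNDEX_CODES.get(ch, "")
        if code and code != prev:
            result += code
            if len(result) == 4:
                break
        prev = code                          # vowels reset prev, so repeats are coded again
    return result.ljust(4, "0")

target = "nation-staetes"
for other in ("naetion-staets", "nation-state", "naetion-staet"):
    print(other, levenshtein(target, other), double_levenshtein(target, other))
# naetion-staets 2 2
# nation-state 2 3
# naetion-staet 3 4
print(soundex(target), soundex("nothing"), soundex("nutmeg"))
# N352 N352 N352
```

The Soundex matches show why that column is so noisy: only the first letter and the next three coded consonants count, so nation-staetes collapses to the same code as nothing and nutmeg, whereas the Levenshtein columns compare the whole word.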
