Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts

Levenshtein Distance

Words similar to "rupert" in the Corpus

Each entry gives the matched word, its distance from "rupert" in brackets (for the two Levenshtein methods), and its frequency in the corpus.

Levenshtein
rupert (0) - 17 freq; revert (2) - 4 freq; rapet (2) - 16 freq; rubbert (2) - 2 freq; fuspert (2) - 7 freq; rumpelt (2) - 1 freq; rupert's (2) - 2 freq; ruler (2) - 10 freq; riper (2) - 4 freq; robert (2) - 237 freq; apert (2) - 12 freq; quaert (2) - 1 freq; lapert (2) - 1 freq; ripest (2) - 1 freq; rupes (2) - 1 freq; hunert (2) - 5 freq; report (2) - 224 freq; cuvert (2) - 1 freq; raport (2) - 1 freq; rulers (2) - 8 freq; super (2) - 36 freq; repent (2) - 17 freq; pert (2) - 60 freq; rypers (2) - 1 freq; expert (2) - 43 freq

Double Levenshtein
rupert (0) - 17 freq; report (2) - 224 freq; raport (2) - 1 freq; repent (3) - 17 freq; appert (3) - 1 freq; revert (3) - 4 freq; tapert (3) - 2 freq; pert (3) - 60 freq; rypers (3) - 1 freq; impert (3) - 1 freq; repeat (3) - 77 freq; expert (3) - 43 freq; reporte (3) - 1 freq; repoort (3) - 1 freq; robert (3) - 237 freq; apert (3) - 12 freq; riper (3) - 4 freq; rapet (3) - 16 freq; lapert (3) - 1 freq; ripest (3) - 1 freq; depart (4) - 7 freq; pouert (4) - 2 freq; seperat (4) - 1 freq; roberto (4) - 2 freq; speert (4) - 27 freq

SoundEx (code R163)
robert - 237 freq; reportin - 15 freq; revertit - 4 freq; repertoire - 7 freq; report - 224 freq; reporter - 23 freq; reproduction - 3 freq; reportit - 38 freq; repoort - 1 freq; robertson's - 5 freq; robertson - 75 freq; revered - 3 freq; repartee - 2 freq; reports - 41 freq; revert - 4 freq; rappered - 1 freq; reproduce - 2 freq; ripport's - 1 freq; referred - 18 freq; robert's - 4 freq; repaired - 8 freq; rupert - 17 freq; rupert's - 2 freq; 'robert - 1 freq; referrit - 11 freq; referrt - 1 freq; reappeart - 4 freq; reporters - 3 freq; rebirth - 5 freq; rhubard - 1 freq; reportage - 2 freq; reported - 14 freq; rapportit - 1 freq; rubbert - 2 freq; reportin' - 2 freq; report's - 2 freq; roberto's - 1 freq; reports- - 1 freq; reporte - 1 freq; reappeared - 1 freq; robertsons - 1 freq; raportit - 2 freq; raport - 1 freq; reproducin - 1 freq; re-appearit - 1 freq; [?]robert - 1 freq; reproduced - 2 freq; [?]roberto - 2 freq; roberto - 2 freq; reproducit - 1 freq; revertin - 2 freq; roberton - 49 freq; reproductive - 3 freq; [?]revered - 1 freq; robertburns - 4 freq; robertsonpaulc - 1 freq; rupertmurdoch - 1 freq; reporting - 2 freq; robertmiggins - 102 freq; refereed - 1 freq; robertburnsqz - 1 freq; robertburnsday - 1 freq; robertburnsnts - 39 freq; robertplant - 4 freq; roberts - 1 freq; reverted - 1 freq; robroydfarmshop - 1 freq; robertburnsfed - 9 freq; robertliddell - 1 freq; roberthmw - 1 freq; robertreidalba - 3 freq; reparations - 1 freq; robertwulb - 3 freq; robertgordonuni's - 1 freq; robertsondawn - 1 freq; reverting - 1 freq; robertjames - 1 freq; robertmdaws - 70 freq; robertglen - 1 freq

MetaPhone (code RPRT)
report - 224 freq; repoort - 1 freq; repartee - 2 freq; rappered - 1 freq; repaired - 8 freq; rupert - 17 freq; reappeart - 4 freq; reporte - 1 freq; reappeared - 1 freq; raport - 1 freq; re-appearit - 1 freq

Manually curated
RUPERT
Time to execute Levenshtein function - 0.193914 milliseconds
The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.

Time to execute Double Levenshtein function - 0.348194 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.

Time to execute SoundEx function - 0.028263 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.

Time to execute MetaPhone function - 0.038504 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.

Time to execute Manually curated function - 0.000845 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
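
The two edit-distance measures described above can be sketched in a few lines of Python. This is only an illustration, not the site's own code: the function names are hypothetical, and treating "y" as a vowel is an assumption, chosen because it reproduces the listed Double Levenshtein score for rypers (3).

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance where each insertion, deletion or replacement costs 1."""
    # previous[j] is the distance between the prefix of a seen so far and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # replace, or keep a match
            ))
        previous = current
    return previous[-1]


# Assumption: the vowel set is a, e, i, o, u plus y (not stated on the page).
VOWELS = re.compile(r"[aeiouy]")


def strip_vowels(word: str) -> str:
    """Remove vowels so that only the consonant skeleton remains."""
    return VOWELS.sub("", word.lower())


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-less words."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))
```

With these definitions, levenshtein("rupert", "report") and double_levenshtein("rupert", "report") both give 2, because the two words share the consonant skeleton "rprt", while "robert" scores 2 and 3 respectively, matching the figures in the lists above.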
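
To show how differently spelled words can end up under one Soundex code such as R163, here is a minimal Python sketch of the common American Soundex rules. It assumes the site uses an equivalent variant, which the page does not confirm; the names SOUNDEX_CODES and soundex are hypothetical.

```python
# Map each coded consonant to its Soundex digit; vowels, h, w and y get no digit.
SOUNDEX_CODES = {}
for group, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                     ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in group:
        SOUNDEX_CODES[letter] = digit


def soundex(word: str) -> str:
    """Return the four-character Soundex code, or "" if there are no letters."""
    # Ignore apostrophes, hyphens and any other non-ASCII-letter characters.
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    last_digit = SOUNDEX_CODES.get(letters[0], "")
    for letter in letters[1:]:
        digit = SOUNDEX_CODES.get(letter, "")
        # Append a digit only when it differs from the previous coded sound.
        if digit and digit != last_digit:
            code += digit
        # Vowels break a run of equal digits; h and w do not.
        if letter not in "hw":
            last_digit = digit
        if len(code) == 4:
            break
    return code.ljust(4, "0")  # pad short codes with zeros
```

Under these rules "rupert", "robert", "report" and "rebirth" all encode to R163, which is why they appear together in the SoundEx list.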
