Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts


Levenshtein Distance


Similar words to wirth in Corpus

Levenshtein
The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.
Time to execute Levenshtein function - 0.211064 milliseconds
wirth (0) - 89 freq, firth (1) - 44 freq, eirth (1) - 1 freq, width (1) - 6 freq, werth (1) - 2 freq, airth (1) - 11 freq, warth (1) - 64 freq, hirth (1) - 1 freq, girth (1) - 5 freq, wurth (1) - 7 freq, worth (1) - 248 freq, wirh (1) - 1 freq, irth (1) - 37 freq, mirth (1) - 10 freq, wirthy (1) - 7 freq, yirth (1) - 41 freq, birth (1) - 92 freq, wirt (1) - 27 freq, with (1) - 856 freq, wirey (2) - 1 freq, wyth (2) - 1 freq, pith (2) - 9 freq, wirds (2) - 868 freq, hairth (2) - 26 freq, swith (2) - 28 freq

Double Levenshtein
In a stroke of genius, this runs the Levenshtein function twice, once on the full word and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
Time to execute Double Levenshtein function - 0.362412 milliseconds
wirth (0) - 89 freq, wurth (1) - 7 freq, warth (1) - 64 freq, werth (1) - 2 freq, worth (1) - 248 freq, wirthy (1) - 7 freq, yirth (2) - 41 freq, wrath (2) - 19 freq, worthy (2) - 21 freq, with (2) - 856 freq, mirth (2) - 10 freq, birth (2) - 92 freq, wirt (2) - 27 freq, width (2) - 6 freq, eirth (2) - 1 freq, irth (2) - 37 freq, airth (2) - 11 freq, firth (2) - 44 freq, wirh (2) - 1 freq, girth (2) - 5 freq, hirth (2) - 1 freq, wersh (3) - 24 freq, wurthie (3) - 1 freq, warsh (3) - 1 freq, pairth (3) - 1 freq

SoundEx
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
Time to execute SoundEx function - 0.027685 milliseconds
SoundEx code - W630
word - 697 freq, worried - 90 freq, wreath - 14 freq, wirth - 89 freq, weariet - 19 freq, wird - 576 freq, wrote - 138 freq, wired - 15 freq, worth - 248 freq, write - 380 freq, whaur'd - 2 freq, weird - 167 freq, wirrit - 3 freq, wordy - 11 freq, writhe - 1 freq, wrath - 19 freq, weirdo - 18 freq, werth - 2 freq, wurdie - 9 freq, wearied - 9 freq, weird' - 1 freq, worthy - 21 freq, 'whaur'd - 1 freq, worrit - 22 freq, warth - 64 freq, wearit - 7 freq, wraith - 17 freq, wrut - 1 freq, wurd - 185 freq, waarty - 1 freq, waard - 2 freq, wirthy - 7 freq, writ - 51 freq, wordie - 14 freq, whurred - 1 freq, whaurt - 2 freq, ward - 89 freq, wurid - 1 freq, wurth - 7 freq, weirdy - 1 freq, wurthie - 1 freq, wyert - 1 freq, weyert - 1 freq, weired - 1 freq, wurreyt - 1 freq, word' - 2 freq, waur'd - 1 freq, wurt - 4 freq, worriet - 9 freq, warthie - 9 freq, wart - 4 freq, weert - 1 freq, warid - 5 freq, where'd - 2 freq, wirret - 1 freq, wrat - 14 freq, weared - 1 freq, wourd - 1 freq, worth' - 1 freq, wierd - 4 freq, wirt - 27 freq, wirried - 6 freq, wrot - 8 freq, whaurit - 1 freq, wrate - 47 freq, wrowt - 4 freq, weered - 3 freq, wryte - 5 freq, wared - 9 freq, weiriet - 2 freq, whar'd - 1 freq, werit - 1 freq, wirriet - 3 freq, wort - 1 freq, whereat - 1 freq, waired - 1 freq, werd - 1 freq, 'write - 1 freq, wayward - 1 freq, worthie - 5 freq, wrait - 6 freq, weyward - 1 freq, weirdie - 1 freq, wirdie - 17 freq, warit - 1 freq, wareit - 1 freq, waurrit - 1 freq, waird - 1 freq, wroeht - 2 freq, warty - 1 freq, weirit - 1 freq, worte - 1 freq, ��word - 1 freq, wirdy - 1 freq, wrout - 1 freq, whered - 1 freq, worried - 1 freq, 'word - 1 freq, weerd - 1 freq, wurrid - 1 freq, write' - 2 freq, weewrite - 3 freq

MetaPhone
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute MetaPhone function - 0.036804 milliseconds
MetaPhone code - WR0
wirth - 89 freq, worth - 248 freq, werth - 2 freq, worthy - 21 freq, warth - 64 freq, wirthy - 7 freq, wurth - 7 freq, wurthie - 1 freq, warthie - 9 freq, worth' - 1 freq, worthie - 5 freq

Manually curated
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
Time to execute Manually curated function - 0.000860 milliseconds
WIRTH
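
The Levenshtein, Double Levenshtein and SoundEx measures described above can be sketched in Python roughly as follows. This is a minimal illustration, not the code the site runs. The function names are hypothetical, and the vowel set a, e, i, o, u, y is an assumption (treating y as a vowel reproduces the Double Levenshtein distances listed for wirthy and yirth). The Soundex routine follows the standard American Soundex rules, which may differ in detail from whatever library the site uses.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character inserts, deletes or replacements turning a into b."""
    prev = list(range(len(b) + 1))          # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        curr = [i]                          # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # replace (free if equal)
        prev = curr
    return prev[-1]


VOWELS = set("aeiouy")  # assumption: y counts as a vowel


def strip_vowels(word: str) -> str:
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Levenshtein on the full words plus Levenshtein on the consonant skeletons."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


# Standard American Soundex digit groups.
_SOUNDEX = {ch: digit
            for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                                   ("l", "4"), ("mn", "5"), ("r", "6"))
            for ch in letters}


def soundex(word: str) -> str:
    """Four-character Soundex code; non-letters such as apostrophes are ignored."""
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _SOUNDEX.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue                 # h and w do not separate equal-coded consonants
        digit = _SOUNDEX.get(ch, "")  # vowels map to "" and reset the run
        if digit and digit != prev:
            code += digit
        prev = digit
    return (code + "000")[:4]        # pad or truncate to one letter plus three digits
```

With these definitions, `levenshtein("wirth", "worth")` gives 1 and `double_levenshtein("wirth", "yirth")` gives 2, matching the lists above, and `soundex("wirth")`, `soundex("word")` and `soundex("whaur'd")` all give W630. Metaphone is left out because its rule set is too long to sketch faithfully here.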
