Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts


Levenshtein Distance


Words similar to "wuman" in the Corpus

Levenshtein
wuman (0) - 40 freq; wumman (1) - 575 freq; human (1) - 315 freq; woman (1) - 101 freq; wuhan (1) - 2 freq; wumans (1) - 1 freq; wumen (1) - 7 freq; wumin (1) - 5 freq; wyman (1) - 1 freq; weman (1) - 1 freq; wumann (1) - 1 freq; cuman (1) - 2 freq; teuman (2) - 1 freq; wwhan (2) - 1 freq; wum-man (2) - 1 freq; ehman (2) - 1 freq; umman (2) - 15 freq; busan (2) - 1 freq; wurkan (2) - 2 freq; lucan (2) - 2 freq; bumin (2) - 1 freq; lunan (2) - 1 freq; gupan (2) - 1 freq; wumman' (2) - 3 freq; sumin (2) - 6 freq

Double Levenshtein
wuman (0) - 40 freq; wumin (1) - 5 freq; wyman (1) - 1 freq; wumen (1) - 7 freq; weman (1) - 1 freq; woman (1) - 101 freq; wumman (2) - 575 freq; weiman (2) - 1 freq; wimin (2) - 1 freq; weeman (2) - 9 freq; wimen (2) - 1 freq; wumans (2) - 1 freq; cuman (2) - 2 freq; wuhan (2) - 2 freq; human (2) - 315 freq; women (2) - 77 freq; wumann (2) - 1 freq; wemen (2) - 16 freq; wunn (3) - 9 freq; aman (3) - 2 freq; wsmn (3) - 1 freq; wavan (3) - 9 freq; roman (3) - 76 freq; lumn (3) - 1 freq; cumen (3) - 1 freq

SoundEx
SoundEx code - W550
wimmen - 39 freq; woman - 101 freq; wumman - 575 freq; weemin - 176 freq; wunnin - 10 freq; winnin - 88 freq; wummin - 231 freq; wuman - 40 freq; women - 77 freq; wemen - 16 freq; whinin - 8 freq; winnen - 1 freq; weimen - 12 freq; weemen - 176 freq; wanin - 2 freq; winnowin - 3 freq; wumann - 1 freq; wemeen - 1 freq; wummen - 33 freq; wumen - 7 freq; wamman - 1 freq; weemen' - 1 freq; wimmin - 31 freq; wumman' - 3 freq; wummin' - 1 freq; weeman - 9 freq; wuamman - 2 freq; wanun - 1 freq; weiman - 1 freq; womman - 1 freq; wan-man - 1 freq; weman - 1 freq; winnan - 1 freq; weimun - 1 freq; weimin - 1 freq; 'woman' - 1 freq; whinneyin - 1 freq; wiemen - 1 freq; wyman - 1 freq; waamin - 1 freq; wumin - 5 freq; weemen - 2 freq; wum-man - 1 freq; wummim - 1 freq; wimen - 1 freq; wimin - 1 freq; women - 3 freq; women' - 1 freq; weewummin - 1 freq

MetaPhone
MetaPhone code - WMN
wimmen - 39 freq; woman - 101 freq; wumman - 575 freq; weemin - 176 freq; wummin - 231 freq; wuman - 40 freq; women - 77 freq; wemen - 16 freq; weimen - 12 freq; weemen - 176 freq; wumann - 1 freq; wemeen - 1 freq; wummen - 33 freq; wumen - 7 freq; wamman - 1 freq; weemen' - 1 freq; wimmin - 31 freq; wumman' - 3 freq; wummin' - 1 freq; weeman - 9 freq; wuamman - 2 freq; weiman - 1 freq; womman - 1 freq; weman - 1 freq; weimun - 1 freq; weimin - 1 freq; 'woman' - 1 freq; wiemen - 1 freq; waamin - 1 freq; wumin - 5 freq; weemen - 2 freq; wimen - 1 freq; wimin - 1 freq; women - 3 freq; women' - 1 freq

Manually curated
WUMAN
wumman - 575 freq; woman - 101 freq; women - 77 freq; dug-wumman - 7 freq; wumman's - 28 freq; wuman - 40 freq; wummans - freq; wummin - 231 freq; wummin's - 10 freq; wummin-bodie - 4 freq; wummen - 33 freq; wumin - 5 freq
Levenshtein: Time to execute Levenshtein function - 0.209113 milliseconds. The Levenshtein distance is the number of single-character replacements, insertions or deletions needed to transform one word into another. It is useful for detecting typos and alternative spellings.

Double Levenshtein: Time to execute Double Levenshtein function - 0.385781 milliseconds. In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.

SoundEx: Time to execute SoundEx function - 0.027257 milliseconds. Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.

MetaPhone: Time to execute MetaPhone function - 0.042233 milliseconds. Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.

Manually curated: Time to execute Manually curated function - 0.001018 milliseconds. Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
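
The first three measures described above can be sketched in Python as follows. This is an illustrative sketch, not the site's own code: the function names are hypothetical, and treating "y" as a vowel is an assumption inferred from the results, where wyman scores 1 under Double Levenshtein. The soundex function follows the standard American Soundex rules, which may differ in edge cases from whatever implementation the site actually uses.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character replacements, insertions and deletions
    needed to turn a into b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # replace (or keep)
        prev = curr
    return prev[-1]


# Assumption: "y" counts as a vowel (inferred from the results, not documented).
VOWELS = set("aeiouy")


def strip_vowels(word: str) -> str:
    return "".join(c for c in word if c.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the consonant skeletons."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


# Standard American Soundex digit groups.
SOUNDEX_DIGITS = {
    c: d
    for d, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items()
    for c in letters
}


def soundex(word: str) -> str:
    """Return the four-character Soundex code (letter plus three digits)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    last = SOUNDEX_DIGITS.get(letters[0], "")
    for c in letters[1:]:
        digit = SOUNDEX_DIGITS.get(c, "")
        if digit and digit != last:
            code += digit
        if c not in "hw":  # h and w do not separate repeated digits
            last = digit
    return (code + "000")[:4]


print(levenshtein("wuman", "wumman"))        # 1
print(double_levenshtein("wuman", "human"))  # 2
print(soundex("wuman"))                      # W550
```

The printed values agree with the results listed above: wumman is at Levenshtein distance 1 from wuman, human scores 2 under Double Levenshtein, and wuman encodes to SoundEx code W550.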
