Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts


Levenshtein Distance


Similar words to bkotg in Corpus

Levenshtein (distance in brackets):
bkotg (0) - 1 freq, boag (2) - 10 freq, brote (2) - 1 freq, boot (2) - 111 freq, boots (2) - 89 freq, both (2) - 195 freq, broth (2) - 68 freq, kong (2) - 7 freq, bots (2) - 4 freq, bott (2) - 3 freq, bong (2) - 3 freq, blot (2) - 4 freq, blog (2) - 38 freq, bog (2) - 54 freq, brot (2) - 1 freq, booty (2) - 1 freq, kott (2) - 1 freq, booth (2) - 10 freq, skots (2) - 1 freq, bkg (2) - 1 freq, bot (2) - 437 freq, bpqtg (2) - 1 freq, borg (2) - 1 freq, zkots (2) - 4 freq, fowg (3) - 1 freq

Double Levenshtein (distance in brackets):
bkotg (0) - 1 freq, bkg (3) - 1 freq, bpqtg (4) - 1 freq, borg (4) - 1 freq, booth (4) - 10 freq, bookt (4) - 3 freq, zkots (4) - 4 freq, skots (4) - 1 freq, baking (4) - 9 freq, biking (4) - 2 freq, bakt (4) - 1 freq, bikit (4) - 1 freq, boking (4) - 1 freq, kott (4) - 1 freq, bakit (4) - 8 freq, btgu (4) - 1 freq, bot (4) - 437 freq, both (4) - 195 freq, booty (4) - 1 freq, kong (4) - 7 freq, boots (4) - 89 freq, boot (4) - 111 freq, boag (4) - 10 freq, brote (4) - 1 freq, bots (4) - 4 freq

SoundEx (code B232):
beasts - 144 freq, baskets - 8 freq, bissett's - 1 freq, buckets - 25 freq, besides - 35 freq, biscuits - 41 freq, 'backstage - 1 freq, buchts - 6 freq, busts - 2 freq, beastie's - 3 freq, beasties - 51 freq, baist's - 4 freq, bisides - 1 freq, 'beasts - 5 freq, bests - 2 freq, baests - 13 freq, begets - 4 freq, beastis - 1 freq, baisties - 3 freq, beest's - 1 freq, basket's - 1 freq, bestows - 1 freq, beast's - 7 freq, baists - 22 freq, best-sellin - 1 freq, bestkennt - 1 freq, best-kennt - 1 freq, backstage - 5 freq, bestest - 5 freq, boasts - 3 freq, beckwith's - 2 freq, bastes - 4 freq, bust's - 1 freq, bucket's - 1 freq, best-kent - 6 freq, bukkits - 1 freq, biests - 1 freq, beists - 1 freq, baesties - 1 freq, best-seller - 2 freq, bestseller - 1 freq, bouchts - 1 freq, boosts - 1 freq, 'biscuits' - 1 freq, besyds - 2 freq, buists - 1 freq, bukkets - 1 freq, best-keepit - 1 freq, bouquets - 1 freq, backsides - 2 freq, ��besides - 1 freq, ��besides - 1 freq, basket-swords - 1 freq, bigots - 4 freq, bbcdouglasf - 1 freq, baists' - 1 freq, bbcthesocial - 10 freq, bbcscotcomms - 1 freq, besties - 1 freq, bizquits - 1 freq, bustage - 1 freq, bgstxyfwtn - 1 freq, bigotsureejits - 1 freq, biscuits' - 1 freq, bfkthsjgf - 1 freq, bycatch - 1 freq, bpqtg - 1 freq, bbcsouthscot - 8 freq, bzggedyk - 1 freq, bkotg - 1 freq, bbceducation - 1 freq, bestcanton - 1 freq, bbckitchencafe - 2 freq, bctgb - 1 freq, biscuitsgod - 1 freq

MetaPhone (code BKTK):
bkotg - 1 freq

Manually curated:
BKOTG
Levenshtein
Time to execute Levenshtein function - 0.300016 milliseconds
The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It is useful for detecting typos and alternative spellings.

Double Levenshtein
Time to execute Double Levenshtein function - 0.573831 milliseconds
This runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.

SoundEx
Time to execute SoundEx function - 0.079152 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.

MetaPhone
Time to execute MetaPhone function - 0.038721 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.

Manually curated
Time to execute Manually curated function - 0.000791 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
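
As an illustration of the Levenshtein, Double Levenshtein and SoundEx descriptions above, here is a minimal Python sketch. It is not the code this site runs: the function names, the choice of a, e, i, o, u as the vowels to strip, and the use of the common American Soundex rules are all assumptions made for the example.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes or replacements."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # replace, or keep a match
        prev = curr
    return prev[-1]


def strip_vowels(word: str) -> str:
    # Assumption: "vowels" means a, e, i, o, u only (y is kept).
    return re.sub(r"[aeiou]", "", word.lower())


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-less words."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


# Assumed digit groups from the common American Soundex scheme.
_SOUNDEX_CODES = {
    letter: digit
    for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                           "4": "l", "5": "mn", "6": "r"}.items()
    for letter in letters
}


def soundex(word: str) -> str:
    """First letter plus up to three digits, padded with zeros."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    out = letters[0].upper()
    last = _SOUNDEX_CODES.get(letters[0], "")
    for c in letters[1:]:
        code = _SOUNDEX_CODES.get(c, "")
        if code and code != last:
            out += code
        if c not in "hw":  # h and w do not separate repeated codes
            last = code
    return (out + "000")[:4]


if __name__ == "__main__":
    for other in ("boag", "bkg", "booth"):
        print(other, levenshtein("bkotg", other), double_levenshtein("bkotg", other))
    print(soundex("bkotg"), soundex("beasts"))
```

Under these assumptions the sketch gives bkg a Levenshtein distance of 2 and a Double Levenshtein distance of 3 from bkotg, and encodes both bkotg and beasts as B232, which agrees with the figures listed on this page.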
