Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts

Levenshtein Distance

Similar words to hejit in Corpus

Levenshtein
hejit (0) - 1 freq, eejit (1) - 71 freq, ejit (1) - 1 freq, jit (2) - 2 freq, heiq (2) - 1 freq, helpit (2) - 52 freq, hewis (2) - 2 freq, heim (2) - 2 freq, heyin (2) - 1 freq, heezit (2) - 4 freq, hertit (2) - 18 freq, semit (2) - 1 freq, deit (2) - 9 freq, herin (2) - 2 freq, merit (2) - 17 freq, hikit (2) - 2 freq, hei (2) - 257 freq, heest (2) - 1 freq, telit (2) - 1 freq, jeit (2) - 2 freq, redit (2) - 2 freq, heisit (2) - 1 freq, eejits (2) - 50 freq, demit (2) - 2 freq, resit (2) - 1 freq
Time to execute Levenshtein function - 0.191267 milliseconds
The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It is useful for detecting typos and alternative spellings.

Double Levenshtein
hejit (0) - 1 freq, eejit (2) - 71 freq, ejit (2) - 1 freq, eedjit (3) - 7 freq, heavit (3) - 2 freq, holit (3) - 1 freq, heatit (3) - 2 freq, healt (3) - 11 freq, heet (3) - 2 freq, heirt (3) - 4 freq, ejjit (3) - 1 freq, heedit (3) - 6 freq, heapit (3) - 4 freq, heist (3) - 8 freq, headit (3) - 19 freq, hearit (3) - 2 freq, hent (3) - 6 freq, eeyjit (3) - 1 freq, het (3) - 262 freq, gjit (3) - 2 freq, heidit (3) - 60 freq, hert (3) - 770 freq, heat (3) - 163 freq, heilit (3) - 1 freq, heft (3) - 111 freq
Time to execute Double Levenshtein function - 0.349659 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.

SoundEx
SoundEx code - H230
heicht - 53 freq, height - 45 freq, high-doh - 4 freq, hicht - 47 freq, heichtie - 1 freq, haste - 37 freq, hecht - 47 freq, hoast - 42 freq, hyst - 5 freq, heezed - 40 freq, howkit - 29 freq, hushed - 10 freq, hoched - 1 freq, hukt - 1 freq, huikd - 1 freq, hejit - 1 freq, heized - 14 freq, hissed - 45 freq, host - 37 freq, howked - 25 freq, hackt - 1 freq, hackit - 49 freq, hookit - 1 freq, hiss't - 1 freq, haughheid - 1 freq, haughty - 3 freq, hugged - 7 freq, heuked - 3 freq, hoked - 11 freq, hight - 2 freq, hawkit - 1 freq, haistie - 1 freq, haist - 5 freq, heooket - 1 freq, hoased - 1 freq, hooked - 15 freq, heised - 1 freq, hushit - 1 freq, hagged - 2 freq, haggit - 3 freq, heched - 1 freq, hoist - 5 freq, hasty - 6 freq, hacked - 9 freq, histy - 1 freq, hoaked - 2 freq, hye-shade - 2 freq, houguid - 1 freq, hoosed - 6 freq, haughed - 1 freq, hast - 2 freq, hacket - 1 freq, 'haste - 1 freq, heist - 8 freq, hasthe - 1 freq, heuched - 1 freq, hockid - 1 freq, hcid - 1 freq, hized - 1 freq, hooshed - 1 freq, hoasty - 2 freq, hyste - 9 freq, hooched - 3 freq, hikit - 2 freq, hashed - 4 freq, hixt - 1 freq, hæst - 1 freq, house-heid - 1 freq, heest - 1 freq, hiked - 2 freq, hogget - 1 freq, hogsweed - 1 freq, hiegate - 1 freq, hosst - 1 freq, howzit - 1 freq, haikit - 1 freq, hoocht - 3 freq, hecht-aye - 1 freq, heisit - 1 freq, hoast' - 1 freq, his't - 1 freq, hoosehaud - 62 freq, heezit - 4 freq, hastie - 1 freq, hokkit - 1 freq, howest - 1 freq, hauchty - 1 freq, hhsssst - 1 freq, huggit - 1 freq, howgate - 2 freq, hoscote - 39 freq, haiked - 1 freq, haust - 3 freq, høst - 3 freq, haused - 3 freq, hocked - 1 freq, hockit - 3 freq, hicht - 1 freq, hous-heid - 1 freq, hajget - 1 freq, hayeqt - 1 freq, hxzgt - 1 freq, hst - 1 freq, 'host' - 1 freq, hosed - 1 freq, hcxiti - 1 freq, hqt - 1 freq, housed - 1 freq
Time to execute SoundEx function - 0.028753 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.

MetaPhone
MetaPhone code - HJT
hejit - 1 freq, hedged - 2 freq, hodged - 3 freq
Time to execute MetaPhone function - 0.036923 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.

Manually curated
HEJIT
Time to execute Manually curated function - 0.000895 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
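
The two edit-distance measures described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the site's own code: the function names levenshtein, strip_vowels and double_levenshtein are hypothetical, and the vowel set (a, e, i, o, u) is an assumption, since the page does not say which letters it removes.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions and
    substitutions needed to turn a into b."""
    # Keep the shorter word in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the empty prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca -> cb (or match)
            ))
        previous = current
    return previous[-1]


# Assumption: only these five letters count as vowels.
VOWELS = set("aeiou")


def strip_vowels(word: str) -> str:
    """Return word with the assumed vowels removed."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the consonant
    skeletons, so consonant changes are counted twice."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


for word in ("eejit", "eedjit", "het"):
    print(word, levenshtein("hejit", word), double_levenshtein("hejit", word))
```

Under these assumptions the sketch gives eejit a Levenshtein distance of 1 and a Double Levenshtein distance of 2, and gives eedjit and het a Double Levenshtein distance of 3, which agrees with the figures listed for hejit.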
