Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts


Levenshtein Distance


Similar words to threi-an-twuntiet in Corpus

Levenshtein

threi-an-twuntiet (0) - 1 freq; ane-an-twuntie (5) - 1 freq; echt-an-twuntie (6) - 1 freq; three-in-wan (8) - 1 freq; threatenit (8) - 1 freq; fower-an-twenty (8) - 1 freq; breid-an-butter (8) - 7 freq; heid-huntit (8) - 1 freq; breidan-butter (8) - 1 freq; threatnin (9) - 1 freq; thinkin-bunnet (9) - 1 freq; thrangitie (9) - 1 freq; mid-twinties (9) - 1 freq; three-tier (9) - 1 freq; shaidae-tinted (9) - 1 freq; threitent (9) - 1 freq; greetin-teenies (9) - 1 freq; thrawn-tonguit (9) - 1 freq; unreconstructit (9) - 2 freq; threttie (9) - 4 freq; bread-an-butter (9) - 2 freq; green-tinted (9) - 1 freq; threatening (9) - 2 freq; partisanunit (9) - 1 freq; reid-haundit (9) - 2 freq

Double Levenshtein

threi-an-twuntiet (0) - 1 freq; echt-an-twuntie (9) - 1 freq; ane-an-twuntie (9) - 1 freq; fower-an-twenty (11) - 1 freq; three-in-wan (11) - 1 freq; thrawn-tonguit (12) - 1 freq; threatenit (13) - 1 freq; threatent (14) - 3 freq; threitent (14) - 1 freq; green-tinted (14) - 1 freq; thinkin-bunnet (14) - 1 freq; heid-huntit (14) - 1 freq; thritteent (15) - 1 freq; twenty-twenty' (15) - 1 freq; threitened (15) - 10 freq; threatened (15) - 17 freq; trans-frontier (15) - 1 freq; thern-wi (15) - 1 freq; transfrontier (15) - 1 freq; threatenin (15) - 20 freq; ghaist-hauntit (15) - 1 freq; thrawn-gabbit (15) - 1 freq; throu-puttin (15) - 1 freq; thirlin-tae (15) - 1 freq; throuatween (15) - 2 freq

SoundEx

SoundEx code - T653

turnt - 622 freq; turn't - 55 freq; turnt-up - 4 freq; turned - 496 freq; thornton - 3 freq; tirnt - 21 freq; thrawn-heidit - 1 freq; trained - 27 freq; trendy - 8 freq; torrent - 4 freq; trundles - 4 freq; taranty - 1 freq; thrawn-tonguit - 1 freq; turnit - 27 freq; trend - 13 freq; turn-oot - 5 freq; thorntree - 1 freq; tyrants - 3 freq; turen't - 1 freq; trends - 3 freq; tornado - 3 freq; traumatised - 5 freq; traumatise - 1 freq; truant - 3 freq; tyrant - 7 freq; trimmed - 8 freq; trendies - 3 freq; torrents - 3 freq; traamatic' - 1 freq; trinity - 11 freq; treend - 1 freq; traumatic - 3 freq; turn-oots - 1 freq; tirned - 24 freq; tirrand - 3 freq; turroundin - 1 freq; turned-up - 3 freq; train-traivel - 1 freq; turntheirsels - 1 freq; trintlin - 2 freq; turnt-oot - 1 freq; trimmt - 1 freq; tormod - 1 freq; tharmoid - 1 freq; trintle - 1 freq; teerin't - 1 freq; termt - 1 freq; tirrandom - 1 freq; turn'd - 1 freq; turnout - 3 freq; train-tracks - 1 freq; term-time - 1 freq; tirrantie - 1 freq; tairmed - 1 freq; thrummed - 3 freq; trentino-alto - 1 freq; traint - 1 freq; threi-an-twuntiet - 1 freq; tronda - 1 freq; torrential - 1 freq; tory-awned - 1 freq; turnoot - 3 freq; trowiematics - 1 freq; trundlin - 1 freq; trinidadian - 1 freq; trending - 3 freq; termed - 1 freq; trontheatre - 1 freq; turntable - 1 freq; trendaberdeen - 1 freq; terential - 1 freq; tramadol - 1 freq; trendin - 2 freq; trendy - 1 freq; thorntonloch - 2 freq

MetaPhone

MetaPhone code - 0RNTWNTT

threi-an-twuntiet - 1 freq

Manually curated

THREI-AN-TWUNTIET
Time to execute Levenshtein function - 0.480118 milliseconds
The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.

Time to execute Double Levenshtein function - 1.091141 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once with vowels and once without, and adds the two distances together, giving double weight to consonants.

Time to execute SoundEx function - 0.028672 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.

Time to execute MetaPhone function - 0.089501 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.

Time to execute Manually curated function - 0.000859 milliseconds
Manual Curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
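
The descriptions above can be sketched in a few lines of Python. This is only an illustration, not the site's own code, which the page does not show. The function names (levenshtein, double_levenshtein, soundex) and the choice of a, e, i, o, u as the vowel set are assumptions. Under those assumptions the sketch reproduces the figures listed for ane-an-twuntie (5 and 9) and the SoundEx code T653.

```python
def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance: replacements, insertions and deletions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # replace ch_a, or keep it if equal
            ))
        previous = current
    return previous[-1]


VOWELS = set("aeiou")  # assumption: the page does not say which letters count


def strip_vowels(word: str) -> str:
    return "".join(c for c in word if c.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance with vowels plus distance without them."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


# Standard Soundex letter groups; letters not listed here carry no code.
SOUNDEX_CODES = {
    ch: digit
    for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6"))
    for ch in letters
}


def soundex(word: str) -> str:
    """First letter plus three digits; hyphens and apostrophes are ignored."""
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    last = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = SOUNDEX_CODES.get(ch, "")
        if code and code != last:
            result += code
            if len(result) == 4:
                break
        last = code  # a vowel resets this, so a repeated code counts again
    return result.ljust(4, "0")


if __name__ == "__main__":
    target = "threi-an-twuntiet"
    print(levenshtein(target, "ane-an-twuntie"))         # 5
    print(double_levenshtein(target, "ane-an-twuntie"))  # 9
    print(soundex(target), soundex("turnt"))             # T653 T653
```

Because Soundex keeps only the first letter and three consonant codes, a long word such as threi-an-twuntiet collapses onto the same code as turnt, which is why the SoundEx list is so much noisier than the two edit-distance lists.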
