Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts


Levenshtein Distance


Similar words to thrawn-heidit in Corpus

Levenshtein

thrawn-heidit (0) - 1 freq
three-leidit (4) - 2 freq
preen-heidit (4) - 1 freq
tow-heidit (4) - 1 freq
wrang-heidit (4) - 1 freq
thrawn-gabbit (4) - 1 freq
hail-heidit (4) - 2 freq
hale-heidit (4) - 5 freq
langheidit (5) - 2 freq
airy-heidit (5) - 1 freq
reed-heidit (5) - 1 freq
baa-heidit (5) - 1 freq
seeven-heidit (5) - 1 freq
thrawn-lookin (5) - 1 freq
snake-heidit (5) - 1 freq
tousie-heidit (5) - 2 freq
bawn-heided (5) - 1 freq
heich-heidit (5) - 6 freq
half-hoidit (5) - 1 freq
bawheidit (5) - 1 freq
haill-heidit (5) - 2 freq
white-heidit (5) - 1 freq
grey-heidit (5) - 1 freq
tuim-heidit (5) - 4 freq
which-heidit (5) - 1 freq

Double Levenshtein

thrawn-heidit (0) - 1 freq
thrawn-gabbit (7) - 1 freq
tow-heidit (7) - 1 freq
preen-heidit (7) - 1 freq
three-leidit (7) - 2 freq
thrawn-lookin (8) - 1 freq
thrawn-like (8) - 1 freq
haurd-heidit (8) - 1 freq
thrawn-lik (8) - 1 freq
thrawn-tonguit (8) - 1 freq
wrang-heidit (8) - 1 freq
hail-heidit (8) - 2 freq
hale-heidit (8) - 5 freq
reid-heidit (9) - 5 freq
rid-heidit (9) - 2 freq
hie-heidit (9) - 1 freq
tattie-heidit (9) - 1 freq
twa-leidit (9) - 15 freq
het-heidit (9) - 1 freq
threidit (9) - 4 freq
wan-leidit (9) - 1 freq
curly-heidit (9) - 1 freq
fair-heidit (9) - 2 freq
whyte-heidit (9) - 1 freq
grey-heidit (9) - 1 freq

SoundEx

SoundEx code - T653

turnt - 622 freq
turn't - 55 freq
turnt-up - 4 freq
turned - 496 freq
thornton - 3 freq
tirnt - 21 freq
thrawn-heidit - 1 freq
trained - 27 freq
trendy - 8 freq
torrent - 4 freq
trundles - 4 freq
taranty - 1 freq
thrawn-tonguit - 1 freq
turnit - 27 freq
trend - 13 freq
turn-oot - 5 freq
thorntree - 1 freq
tyrants - 3 freq
turen't - 1 freq
trends - 3 freq
tornado - 3 freq
traumatised - 5 freq
traumatise - 1 freq
truant - 3 freq
tyrant - 7 freq
trimmed - 8 freq
trendies - 3 freq
torrents - 3 freq
traamatic' - 1 freq
trinity - 11 freq
treend - 1 freq
traumatic - 3 freq
turn-oots - 1 freq
tirned - 24 freq
tirrand - 3 freq
turroundin - 1 freq
turned-up - 3 freq
train-traivel - 1 freq
turntheirsels - 1 freq
trintlin - 2 freq
turnt-oot - 1 freq
trimmt - 1 freq
tormod - 1 freq
tharmoid - 1 freq
trintle - 1 freq
teerin't - 1 freq
termt - 1 freq
tirrandom - 1 freq
turn'd - 1 freq
turnout - 3 freq
train-tracks - 1 freq
term-time - 1 freq
tirrantie - 1 freq
tairmed - 1 freq
thrummed - 3 freq
trentino-alto - 1 freq
traint - 1 freq
threi-an-twuntiet - 1 freq
tronda - 1 freq
torrential - 1 freq
tory-awned - 1 freq
turnoot - 3 freq
trowiematics - 1 freq
trundlin - 1 freq
trinidadian - 1 freq
trending - 3 freq
termed - 1 freq
trontheatre - 1 freq
turntable - 1 freq
trendaberdeen - 1 freq
terential - 1 freq
tramadol - 1 freq
trendin - 2 freq
trendy - 1 freq
thorntonloch - 2 freq

MetaPhone

MetaPhone code - 0RNHTT

thrawn-heidit - 1 freq

Manually curated

THRAWN-HEIDIT
Levenshtein
Time to execute Levenshtein function - 0.235093 milliseconds
The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.

Double Levenshtein
Time to execute Double Levenshtein function - 0.440217 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.

SoundEx
Time to execute SoundEx function - 0.027880 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.

MetaPhone
Time to execute MetaPhone function - 0.036986 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.

Manually curated
Time to execute Manually curated function - 0.000855 milliseconds
Manual curation uses a hand-built lookup table / lexicon that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
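
The Levenshtein and Double Levenshtein measures described above can be sketched roughly as follows. This is an illustration only, not the site's own code: the function names are made up for the example, and the vowel set (a, e, i, o, u) is an assumption, chosen because it reproduces the distances listed for thrawn-gabbit, tow-heidit and thrawn-lookin.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn a into b."""
    # Keep only two rows of the dynamic-programming table in memory.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


# Assumption: the page does not say which letters count as vowels.
VOWELS = set("aeiou")


def strip_vowels(word: str) -> str:
    """Remove the assumed vowels, leaving consonants and punctuation."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-stripped words,
    so that consonant differences are counted twice."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


if __name__ == "__main__":
    target = "thrawn-heidit"
    for word in ("thrawn-gabbit", "tow-heidit", "thrawn-lookin"):
        print(word, levenshtein(target, word), double_levenshtein(target, word))
```

Run as a script, this prints 4 and 7 for thrawn-gabbit, 4 and 7 for tow-heidit, and 5 and 8 for thrawn-lookin, matching the figures in the lists above.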
