Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts

Levenshtein Distance

Words similar to "ants" in the corpus

Levenshtein
ants (0) - 5 freq; aits (1) - 21 freq; hants (1) - 1 freq; pants (1) - 32 freq; aets (1) - 11 freq; fants (1) - 1 freq; aunts (1) - 10 freq; ante (1) - 1 freq; ahts (1) - 1 freq; an's (1) - 4 freq; anns (1) - 1 freq; sants (1) - 1 freq; cants (1) - 1 freq; acts (1) - 40 freq; ant (1) - 8 freq; wants (1) - 285 freq; bants (1) - 5 freq; nts (1) - 1 freq; ats (1) - 126 freq; rants (1) - 6 freq; ints (1) - 1 freq; anas (1) - 9 freq; ans (1) - 2 freq; antsy (1) - 1 freq; anti (1) - 11 freq

Double Levenshtein
ants (0) - 5 freq; ents (1) - 1 freq; ints (1) - 1 freq; anits (1) - 2 freq; aunts (1) - 10 freq; antsy (1) - 1 freq; nts (1) - 1 freq; arts (2) - 34 freq; anas (2) - 9 freq; ans (2) - 2 freq; anto (2) - 1 freq; anti (2) - 11 freq; anes (2) - 218 freq; nits (2) - 21 freq; units (2) - 21 freq; nuts (2) - 55 freq; nats (2) - 2 freq; nets (2) - 46 freq; nots (2) - 2 freq; aats (2) - 10 freq; gants (2) - 1 freq; ante (2) - 1 freq; rants (2) - 6 freq; an's (2) - 4 freq; fants (2) - 1 freq

SoundEx
SoundEx code - A532
anticipation - 22 freq; amidst - 6 freq; antics - 12 freq; auntics - 1 freq; aunties - 16 freq; andy's - 5 freq; antique - 9 freq; antigone - 2 freq; auntie's - 6 freq; antiquities - 3 freq; andews - 1 freq; amethyst - 2 freq; andes - 1 freq; antisyzygy' - 1 freq; antichrist - 3 freq; antiques - 5 freq; ants - 5 freq; anti-christ - 2 freq; anti-social - 3 freq; anti-scottish - 1 freq; antisyzygy - 3 freq; aunts - 10 freq; anti-climax - 1 freq; anti-stress - 1 freq; aneth's - 1 freq; antistius's - 1 freq; anti-clart - 1 freq; anits - 2 freq; amidships - 1 freq; antic - 1 freq; anticipatan - 1 freq; antechamber - 1 freq; antiek - 1 freq; an'-at's - 1 freq; aund's - 1 freq; -andz - 2 freq; aunt's - 1 freq; antiquarian - 2 freq; amids - 1 freq; anticipatit - 4 freq; anti-conscription - 1 freq; anti-cholesterol - 1 freq; antiquitie - 2 freq; antecestors - 4 freq; anti-gaelic - 1 freq; anticipates - 1 freq; antisocial - 1 freq; antsy - 1 freq; antiquity - 2 freq; anticipated - 3 freq; antisyzygetic - 1 freq; anti-establishment - 4 freq; anti-spam - 1 freq; anticipate - 1 freq; anti-clockwise - 1 freq; anticipatin - 3 freq; amethysts - 1 freq; anti-austerity - 4 freq; anti-semitism - 1 freq; anti-xenophobic - 1 freq; anti-govrenment - 1 freq; andthocht - 1 freq; andycap - 1 freq; antiquesroadshow - 1 freq; antisocialism - 1 freq; antiqueroadtrip - 1 freq; andyconsidine - 2 freq; andyclosee - 1 freq; auntiesyzygy - 1 freq; anti-catholic - 1 freq; amidgetgem - 1 freq; andyscargill - 1 freq; anoticing - 1 freq; antwegian - 1 freq; andykiko - 2 freq; antecedents - 2 freq; aintist - 1 freq; amitchellallen - 1 freq

MetaPhone
MetaPhone code - ANTS
aunties - 16 freq; andy's - 5 freq; auntie's - 6 freq; andews - 1 freq; andes - 1 freq; ants - 5 freq; aunts - 10 freq; anits - 2 freq; an'-at's - 1 freq; aund's - 1 freq; -andz - 2 freq; aunt's - 1 freq; antsy - 1 freq

Manually curated
ANTS

Time to execute Levenshtein function - 0.171093 milliseconds
The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It is useful for detecting typos and alternative spellings.

Time to execute Double Levenshtein function - 0.330356 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.

Time to execute SoundEx function - 0.027106 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.

Time to execute MetaPhone function - 0.037011 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.

Time to execute Manually curated function - 0.000787 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.