Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts

Levenshtein Distance

Similar words to tannadice in Corpus

Levenshtein

tannadice (0) - 5 freq
tantalise (3) - 1 freq
candice (3) - 1 freq
tannadicelad (3) - 1 freq
tannahill (3) - 7 freq
jaundice (3) - 3 freq
lanladie (3) - 1 freq
cannabis (4) - 1 freq
cannaee (4) - 1 freq
tantric (4) - 2 freq
canadae (4) - 6 freq
vandalise (4) - 1 freq
'annie (4) - 3 freq
avarice (4) - 1 freq
tentative (4) - 3 freq
tankie (4) - 1 freq
aareadie (4) - 16 freq
annoyince (4) - 2 freq
kennedie (4) - 1 freq
tinnie (4) - 7 freq
taenails (4) - 2 freq
teenaige (4) - 1 freq
paradise (4) - 44 freq
annoonce (4) - 4 freq
standardise (4) - 1 freq

Double Levenshtein

tannadice (0) - 5 freq
tannoid (5) - 1 freq
tanned (5) - 10 freq
jaundice (5) - 3 freq
tannadicelad (5) - 1 freq
candice (5) - 1 freq
annidir (6) - 1 freq
annandale (6) - 2 freq
tinnie (6) - 7 freq
announce (6) - 7 freq
annoonce (6) - 4 freq
tinned (6) - 8 freq
tenancie (6) - 2 freq
tenancy (6) - 12 freq
kennedie (6) - 1 freq
tannin (6) - 8 freq
canonade (6) - 1 freq
pendice (6) - 2 freq
annoyince (6) - 2 freq
lanladie (6) - 1 freq
tanning (6) - 3 freq
tinnd (6) - 1 freq
manance (6) - 2 freq
tannahill (6) - 7 freq
pinnace (6) - 3 freq

SoundEx

SoundEx code - T532

tends - 21 freq
then-it's - 1 freq
tints - 3 freq
twinties - 3 freq
tennet's - 1 freq
twenty-echt - 1 freq
twenty-sax - 1 freq
tents - 22 freq
twenties - 11 freq
thematically - 2 freq
tiends - 1 freq
twenty-saicont - 1 freq
twinty-sax - 2 freq
twinty-aicht - 2 freq
tenets - 1 freq
twinty-setven - 1 freq
tendis - 1 freq
tomataes - 10 freq
taunts - 2 freq
tomatoes - 6 freq
twenty-six - 4 freq
twentyeicht - 1 freq
twenty-eicht - 3 freq
twuntie-echt - 1 freq
taunds - 1 freq
twenty-seven - 3 freq
twenty-eight - 1 freq
time-distance-speed - 1 freq
twinty-echt - 3 freq
timattas - 1 freq
taentacles - 1 freq
tomatas - 3 freq
ttands - 1 freq
team-mates' - 1 freq
thematic - 4 freq
tomato's - 1 freq
thematicallie - 1 freq
taands - 1 freq
tamatas - 2 freq
twintyecht - 1 freq
tint-glazed - 1 freq
twinty-eight - 1 freq
tomaties - 1 freq
taints - 1 freq
tentacles - 1 freq
time-display - 1 freq
toontie-six - 1 freq
twintysax - 1 freq
tannadice - 5 freq
tent-takkin - 1 freq
tamataes - 1 freq
twentysixbux - 8 freq
tommydoc - 3 freq
teamtosh - 1 freq
tundjejbod - 1 freq
tentsmuir - 1 freq
tawnyhootcasino - 1 freq
tannadicelad - 1 freq

MetaPhone

MetaPhone code - TNTS

doo'n-oots - 1 freq
tends - 21 freq
dainties - 5 freq
tints - 3 freq
dunts - 22 freq
tennet's - 1 freq
tents - 22 freq
tiends - 1 freq
tenets - 1 freq
deinties - 2 freq
tendis - 1 freq
taunts - 2 freq
dantes - 1 freq
dundas - 5 freq
dundee's - 7 freq
taunds - 1 freq
doughnuts - 8 freq
doughnuts' - 1 freq
dante's - 22 freq
ttands - 1 freq
danadays - 10 freq
doondies - 1 freq
dainty's - 1 freq
denties - 2 freq
dints - 2 freq
taands - 1 freq
taints - 1 freq
tannadice - 5 freq
dandies - 15 freq

Manually curated

TANNADICE
Time to execute Levenshtein function - 0.238793 milliseconds. The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It's useful for detecting typos and alternative spellings.

Time to execute Double Levenshtein function - 0.405074 milliseconds. In a stroke of genius, this runs the Levenshtein function twice, once on the words as written and once with the vowels removed, and adds the two distances together, giving double weight to consonants.

Time to execute SoundEx function - 0.028299 milliseconds. Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.

Time to execute MetaPhone function - 0.037310 milliseconds. Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.

Time to execute Manually curated function - 0.000820 milliseconds. Manual Curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
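The Levenshtein, Double Levenshtein and SoundEx descriptions above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code this site runs: the function names are made up for the example, the vowel set (a, e, i, o, u, with y counted as a consonant) is an assumption, and the SoundEx version follows the standard American Soundex rules.

```python
# Illustrative sketch; names and the vowel set are assumptions, not the site's code.
VOWELS = set("aeiou")  # assumption: 'y' is treated as a consonant


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character replacements, insertions and
    deletions needed to turn a into b (two-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # replace ca with cb
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop the vowels so that only the consonant skeleton remains."""
    return "".join(ch for ch in word if ch not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-stripped words."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


# Standard American Soundex digit groups.
SOUNDEX_CODES = {}
for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for letter in letters:
        SOUNDEX_CODES[letter] = digit


def soundex(word: str) -> str:
    """First letter plus three digits; punctuation such as hyphens is ignored."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    last = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = SOUNDEX_CODES.get(ch, "")
        if digit and digit != last:
            code += digit
        if ch not in "hw":  # h and w do not separate repeated codes
            last = digit
        if len(code) == 4:
            break
    return code.ljust(4, "0")


if __name__ == "__main__":
    print(levenshtein("tannadice", "jaundice"))
    print(double_levenshtein("tannadice", "jaundice"))
    print(soundex("tannadice"))
```

Under these assumptions, tannadice and jaundice come out at 3 for Levenshtein and 5 for Double Levenshtein, and tannadice encodes to T532, which agrees with the figures listed above. Other words may differ from the site's results if it treats vowels or punctuation differently.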
