Corpus of 21st Century Scots Texts - Levenshtein

A Corpus of 21st Century Scots Texts


Levenshtein Distance


Words similar to step-sister in the Corpus

Levenshtein

The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It is useful for detecting typos and alternative spellings. Each entry below gives the word, its distance in brackets, and its frequency in the corpus.
Time to execute Levenshtein function - 0.508605 milliseconds

step-sister (0) - 5 freq
stepsister (1) - 9 freq
stepsisters (2) - 3 freq
stepsister's (3) - 2 freq
step-mither (3) - 6 freq
step-brither (4) - 6 freq
seenister (4) - 1 freq
step-ladder (4) - 1 freq
��s-sister (4) - 1 freq
step-faither (4) - 6 freq
seinister (4) - 3 freq
hauf-sister (4) - 3 freq
stepmither (4) - 1 freq
stepfaither (4) - 1 freq
steisher (4) - 3 freq
'meenister (5) - 2 freq
posster (5) - 1 freq
sneester (5) - 3 freq
mcenister (5) - 1 freq
slaister (5) - 8 freq
deesaster (5) - 1 freq
splitter (5) - 1 freq
splinter (5) - 4 freq
seistem (5) - 27 freq
slester (5) - 4 freq

Double Levenshtein

In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
Time to execute Double Levenshtein function - 0.998021 milliseconds

step-sister (0) - 5 freq
stepsister (2) - 9 freq
stepsisters (4) - 3 freq
step-mither (6) - 6 freq
stepsister's (6) - 2 freq
hauf-sister (7) - 3 freq
step-faither (7) - 6 freq
��s-sister (7) - 1 freq
step-ladder (7) - 1 freq
step-brither (7) - 6 freq
stepdaughter (8) - 1 freq
tapster (8) - 6 freq
stap-stairt (8) - 2 freq
stap-bi-stap (8) - 1 freq
spinster (8) - 4 freq
stap-ower (8) - 1 freq
transistor (8) - 1 freq
steisher (8) - 3 freq
stepmither (8) - 1 freq
seinister (8) - 3 freq
seenister (8) - 1 freq
posster (8) - 1 freq
stepfaither (8) - 1 freq
stamagaster (8) - 4 freq
tipster (8) - 1 freq

SoundEx

Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
Time to execute SoundEx function - 0.085786 milliseconds

SoundEx code - S312

steps - 207 freq
staps - 36 freq
stobs - 8 freq
stoaps - 22 freq
suithfast - 11 freq
stovies - 37 freq
stuffs - 6 freq
stoves - 3 freq
suthfast - 1 freq
sit-ups - 1 freq
stops - 44 freq
shitebag - 2 freq
suithfest - 2 freq
stuff's - 4 freq
stepek - 1 freq
stuffie's - 1 freq
south-facin - 1 freq
side-by-side - 3 freq
step's - 2 freq
sotheby's - 2 freq
'shit-face' - 2 freq
stabs - 9 freq
stevie's - 13 freq
stoops - 4 freq
stoppeq - 1 freq
stavs - 1 freq
stepsisters - 3 freq
stepsister's - 2 freq
stepsister - 9 freq
stoups - 1 freq
stap-stairt - 2 freq
seed-box - 1 freq
stapag - 1 freq
side-bi-side - 2 freq
staffs - 1 freq
steps' - 1 freq
said-haafgaits - 1 freq
suithfestness - 1 freq
stap-bi-stap - 1 freq
soothfast - 1 freq
stepson - 1 freq
scottification - 1 freq
staups - 3 freq
steppies - 2 freq
stubbs - 1 freq
steevest - 1 freq
side-face - 1 freq
stoppage - 1 freq
seed-baas - 1 freq
shitebags - 1 freq
step-sister - 5 freq
stiffs - 1 freq
stowps - 2 freq
sate-back - 1 freq
staves - 2 freq
��steps - 1 freq
syetphzbsq - 1 freq
stevekydd - 1 freq
stevieg - 2 freq
stvsport - 2 freq
stefsmith - 1 freq
staffsref - 1 freq
stop's - 1 freq
stephjohn - 1 freq
setps - 1 freq
steves - 2 freq
steves - 2 freq
stvkathryn - 1 freq
stepswizard - 1 freq
steveseagull - 1 freq
steviesouness - 1 freq
stubs - 1 freq
sthbuji - 1 freq
steveson - 2 freq

MetaPhone

Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute MetaPhone function - 0.091805 milliseconds

MetaPhone code - STPSSTR

stepsister - 9 freq
step-sister - 5 freq

Manually curated

Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
Time to execute Manually curated function - 0.001114 milliseconds

STEP-SISTER

sister - 449 freq
sisters - 130 freq
sister's - 36 freq
stepsister - 9 freq
step-sister - 5 freq
stepsisters - 3 freq
step-sisters - freq
hauf-sister - 3 freq
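
The Levenshtein, Double Levenshtein and Soundex measures described above can be sketched in Python as follows. This is only an illustration, not the code behind this page, which is not shown here. The function names, the choice of a, e, i, o, u as the vowels to drop, and the decision to ignore hyphens and apostrophes when computing Soundex are all assumptions.

```python
# Assumption: the page does not say which letters count as vowels.
VOWELS = set("aeiou")

# Standard Soundex digit groups; vowels and h, w, y carry no digit.
SOUNDEX_CODES = {
    letter: digit
    for letters, digit in (
        ("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
        ("l", "4"), ("mn", "5"), ("r", "6"),
    )
    for letter in letters
}


def levenshtein(a: str, b: str) -> int:
    """Count the single-character replacements, insertions and
    deletions needed to turn a into b (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # replace, or keep if equal
            ))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop the assumed vowels, keeping consonants and punctuation."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-less words."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


def soundex(word: str) -> str:
    """Four-character Soundex code; non-letters are ignored (assumption)."""
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    code = letters[0].upper()
    last = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = SOUNDEX_CODES.get(ch, "")
        if digit and digit != last:
            code += digit
        if ch not in "hw":  # h and w do not separate repeated codes
            last = digit
    return (code + "000")[:4]
```

Under these assumptions the sketch agrees with the figures listed on this page: step-mither is at Levenshtein distance 3 and Double Levenshtein distance 6 from step-sister, and both step-sister and scottification encode to S312.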
