A Corpus of 21st Century Scots Texts


Smirnov, Kuzma

Basic Stats

Total words by this author in corpus - 522
Total unique words used by this author in corpus - 231
Ratio of total words to unique words - 2.26
Tagged as LAL (General Central) dialect.
Top ten most common words - an, the, staundard, roushie, in, a, o, leids, scots, haes
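
As a rough illustration of how figures like those above could be derived from a tokenised text, here is a minimal Python sketch. The function name basic_stats and the tokenisation (a plain list of lowercase words) are assumptions for illustration; the corpus site does not document its own method.

```python
from collections import Counter

def basic_stats(tokens):
    """Return total words, unique words, their ratio, and the ten most common words."""
    counts = Counter(tokens)
    total = sum(counts.values())
    unique = len(counts)
    # Ratio of total words to unique words, rounded to two places as on this page.
    ratio = round(total / unique, 2) if unique else 0.0
    top_ten = [word for word, _ in counts.most_common(10)]
    return total, unique, ratio, top_ten

# Check against the figures on this page: 522 total words over 231 unique words.
print(round(522 / 231, 2))  # 2.26
```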

List of texts in corpus


Facebook (2015-05-10) in Central dialect (LAL), categorised as prose (522 words)

Author word Keyness frequencies

This section is intended to list the words that the author uses disproportionately often compared with other writers in the corpus. These may include (a) proper nouns (such as character names in the author's stories), (b) other words related to the specific subject matter, and (c) words specific to the regional dialect.
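
The page does not state which keyness statistic it uses. One common choice in corpus linguistics is the log-likelihood (G2) statistic, which compares a word's frequency in the author's text with its frequency in the rest of the corpus. The sketch below assumes that measure; the function name and the example counts are hypothetical, and it is not claimed to reproduce the Keyness values listed here.

```python
import math

def log_likelihood(count_author, total_author, count_rest, total_rest):
    """Log-likelihood (G2) keyness of one word: author's text vs. rest of corpus."""
    combined = count_author + count_rest
    total = total_author + total_rest
    # Expected counts if the word were equally frequent in both samples.
    expected_author = total_author * combined / total
    expected_rest = total_rest * combined / total
    g2 = 0.0
    for observed, expected in ((count_author, expected_author),
                               (count_rest, expected_rest)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2

# Hypothetical counts: a word seen 14 times in a 522-word text
# and 10 times in a 1,000,000-word reference sample.
score = log_likelihood(14, 522, 10, 1_000_000)
```

A word with the same relative frequency in both samples scores 0; the score grows as the two frequencies diverge.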
Word            Count  Normalised per million   Keyness
staundard          14               26,819.92   206.180
roushie            13               24,904.21   182.810
byleids             5                9,578.54    67.799
ukraine             4                7,662.84    53.294
thare's             4                7,662.84    51.372
i'                  4                7,662.84    51.372
inglis              6               11,494.25    42.012
melt                4                7,662.84    41.875
leids               7               13,409.96    39.745
b                   4                7,662.84    36.084
u                   4                7,662.84    35.615
pronoonciation      3                5,747.13    30.692
an                 36               68,965.52    26.848
iveryday            2                3,831.42    26.639
offeecial           3                5,747.13    26.007
haes                7               13,409.96    23.772
dominatin           2                3,831.42    23.730
belarusian          6               11,494.25       nan
sib                 3                5,747.13    21.439
treatit             2                3,831.42    20.930
grammar             3                5,747.13    20.571
baith               5                9,578.54    18.492
braid               3                5,747.13    17.506
nou                 4                7,662.84    16.499
anely               3                5,747.13    16.043
forms               2                3,831.42    14.717
ukrainian           4                7,662.84       nan
distinction         2                3,831.42    17.472
orthography         2                3,831.42    14.614
thair               5                9,578.54    14.271
speak               3                5,747.13    13.342
different           3                5,747.13    12.678
especially          2                3,831.42    12.361
leid                4                7,662.84    12.224
cawed               2                3,831.42    11.683
thay                3                5,747.13    11.634
daes                2                3,831.42    11.319
juist               4                7,662.84    10.627
thaim               4                7,662.84     9.745
uise                2                3,831.42     9.183
three               3                5,747.13     9.079
scots               7               13,409.96     8.794
belarus             3                5,747.13       nan
nae                 6               11,494.25     7.762
til                 3                5,747.13     7.600
thare               2                3,831.42     7.345
aw                  5                9,578.54     6.513
masel               2                3,831.42     6.494
canna               2                3,831.42     6.313
tae                 4                7,662.84     6.288
comes               2                3,831.42     6.184
been                4                7,662.84     5.436
while               2                3,831.42     5.074
sic                 2                3,831.42     4.621
mony                2                3,831.42     4.601
the                20               38,314.18     4.122
twa                 3                5,747.13     4.117
for                 7               13,409.96     4.089
life                2                3,831.42     4.051
some                3                5,747.13     4.009
whaur               2                3,831.42     3.523
by                  3                5,747.13     3.450
sae                 3                5,747.13     3.440
fae                 4                7,662.84     3.126
surzhik             2                3,831.42       nan
will                2                3,831.42     2.966
say                 2                3,831.42     2.848
wis                 2                3,831.42     2.656
that                2                3,831.42     2.545
a                  10               19,157.09     2.170
ower                3                5,747.13     2.016
as                  6               11,494.25     1.959
are                 2                3,831.42     1.813
hae                 3                5,747.13     1.799
be                  5                9,578.54     1.706
auld                2                3,831.42     1.655
in                 12               22,988.51     1.619
o                   7               13,409.96     1.551
like                3                5,747.13     1.483
wi                  2                3,831.42     1.140
whit                2                3,831.42     0.569
trasianka           2                3,831.42       nan
is                  5                9,578.54     0.848
on                  2                3,831.42     0.548
but                 3                5,747.13     0.407
at                  4                7,662.84     0.119
or                  2                3,831.42     0.042
fouaniver           2                3,831.42       nan
thing               2                3,831.42     4.238
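
The "Normalised per million" figures are consistent with dividing each word count by the author's 522 total words and scaling to one million. A minimal sketch of that arithmetic follows; the function name per_million is an assumption for illustration.

```python
def per_million(count, total_words):
    """Scale a raw word count to a frequency per million words."""
    return round(count / total_words * 1_000_000, 2)

# Counts taken from the table above, with 522 total words for this author.
print(per_million(14, 522))  # 26819.92 (staundard)
print(per_million(36, 522))  # 68965.52 (an)
print(per_million(2, 522))   # 3831.42 (e.g. iveryday)
```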