Letter / character frequencies

Earlier versions of this website were blind to accented characters. I think I've sorted it out now, but I'm not 100% certain.

To investigate the phenomenon of accented characters in Scots writing, I have written a script to count the number of occurrences of every character. It works through each piece of text, character by character, converting each character from its UTF-8 Unicode encoding with the Perl ord() function, which gives a decimal value (the Unicode code point) for each character.
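The counting step can be sketched roughly as follows. This is a minimal illustration in Python, not the Perl script itself: Python's built-in ord() stands in for Perl's ord(), the function name count_code_points is made up for the example, and the sample sentence is invented.

```python
from collections import Counter

def count_code_points(text: str) -> Counter:
    """Tally every character in `text` by its decimal Unicode code point."""
    counts = Counter()
    for ch in text:
        counts[ord(ch)] += 1
    return counts

# Invented sample. It is encoded to UTF-8 bytes and decoded again to show
# that counting happens on decoded characters, so "ö" is one character
# rather than two bytes.
sample = "Da böd wis öpen".encode("utf-8").decode("utf-8")
counts = count_code_points(sample)

print(ord("ö"))          # 246
print(counts[ord("ö")])  # 2
```

Counting raw bytes instead of decoded characters would split every accented letter into two or more byte values, which is one way a script can end up blind to accents.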

The letter characters are identified and converted to uppercase, and a table of letter frequencies is then built that combines the two cases. The occurrences of each letter in each dialect are listed.
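A rough sketch of that case-folding and per-dialect tally is given here, again in Python rather than Perl. The (dialect, text) input layout, the function names letter_table and summarise, and the sample texts are all assumptions made for the example. The percentage is taken over all letters counted, which is also an assumption, though it fits the table, where the percentages add up to roughly 100%.

```python
from collections import Counter, defaultdict

def letter_table(texts):
    """Build {uppercase letter: Counter({dialect: occurrences})}.

    `texts` is an iterable of (dialect, text) pairs (a hypothetical layout).
    """
    table = defaultdict(Counter)
    for dialect, text in texts:
        for ch in text:
            if not ch.isalpha():
                continue  # skip digits, spaces and punctuation
            upper = ch.upper()
            # A few letters uppercase to more than one character
            # (e.g. "ß" -> "SS"); those are kept as they are.
            key = upper if len(upper) == 1 else ch
            table[key][dialect] += 1
    return table

def summarise(table):
    """Return rows of (count, code point, letter, percent, per-dialect counts),
    most frequent letter first."""
    total = sum(sum(by_dialect.values()) for by_dialect in table.values())
    rows = []
    for letter, by_dialect in table.items():
        count = sum(by_dialect.values())
        rows.append((count, ord(letter), letter, 100 * count / total, dict(by_dialect)))
    rows.sort(key=lambda row: (-row[0], row[1]))
    return rows

# Invented sample text, for illustration only.
texts = [("Shetland", "Da böd"), ("Doric", "Fit like")]
rows = summarise(letter_table(texts))
for count, code, letter, percent, by_dialect in rows:
    print(count, code, letter, f"{percent:.5f}%", by_dialect)
```

Folding to uppercase before counting is what lets "ö" and "Ö" share a single row, keyed by the uppercase code point (214).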

The next step is to identify which specific writers use accents where others do not, or whether the use of accents is common in particular dialects.
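One possible way to approach that next step is sketched here in Python. It is only an illustration: the (writer, dialect, text) layout, the function name accent_usage, the writer names and the sample texts are all invented, and "accented" is approximated as any letter with a code point above 127, which also catches letters such as Ð, Þ and Æ.

```python
from collections import defaultdict

def accent_usage(pieces):
    """Return {(dialect, writer): (non_ascii_letters, all_letters)}.

    `pieces` is an iterable of (writer, dialect, text) tuples
    (a hypothetical layout, not the site's real data structure).
    """
    usage = defaultdict(lambda: [0, 0])
    for writer, dialect, text in pieces:
        tally = usage[(dialect, writer)]
        for ch in text:
            if ch.isalpha():
                tally[1] += 1
                if ord(ch) > 127:  # rough stand-in for "accented"
                    tally[0] += 1
    return {key: tuple(value) for key, value in usage.items()}

# Invented sample: two writers in one dialect, only one of whom uses accents.
pieces = [
    ("Writer A", "Shetland", "böd öpen"),
    ("Writer B", "Shetland", "bod open"),
]
usage = accent_usage(pieces)
for (dialect, writer), (accented, letters) in sorted(usage.items()):
    print(dialect, writer, accented, letters)
```

Keying the tallies on both dialect and writer makes it possible to tell a dialect-wide habit from the habit of one or two writers within that dialect.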

Occurrences  Code point (decimal)  Appears as  % of corpus  Central / Lallans  Doric / Northern  Shetland  Orkney  Southern / Borders  Ulster
548370       69                    "E"         12.63053%    325015             128249            15357     13343   31087               35319
431260       65                    "A"         9.93315%     255099             96706             14456     9491    23244               32264
400116       84                    "T"         9.21582%     243746             89229             9911      8742    21143               27345
355850       73                    "I"         8.19625%     212783             85720             11139     7212    17889               21107
309105       78                    "N"         7.11957%     183293             72497             9333      6916    16404               20662
281784       83                    "S"         6.49029%     173056             61504             8670      6401    14621               17532
260032       79                    "O"         5.98928%     154284             59799             8808      6709    13361               17071
245182       72                    "H"         5.64724%     147705             52357             5252      6007    15528               18333
240823       82                    "R"         5.54684%     142954             55852             7567      5614    12787               16049
163143       76                    "L"         3.75765%     98549              34938             5496      3808    8785                11567
148624       68                    "D"         3.42324%     84311              35443             8119      4149    8330                8272
120476       67                    "C"         2.77491%     75112              26543             3199      2182    5851                7589
112879       85                    "U"         2.59993%     69630              21828             2993      2575    6753                9100
103998       87                    "W"         2.39537%     62717              22006             3517      2350    6074                7334
101888       77                    "M"         2.34677%     62177              23137             3389      2341    5091                5753
89388        89                    "Y"         2.05886%     52266              21853             2814      1917    4484                6054
78349        70                    "F"         1.80460%     42888              23003             2594      1825    3658                4381
78079        66                    "B"         1.79838%     46395              18593             2459      1724    4093                4815
76363        71                    "G"         1.75886%     45911              16988             2327      1860    4163                5114
74957        80                    "P"         1.72647%     44707              18350             2623      1647    3510                4120
59563        75                    "K"         1.37191%     34158              14813             2173      1569    2980                3870
28536        86                    "V"         0.65727%     17509              6674              927       676     1265                1485
13753        74                    "J"         0.31677%     7757               4033              497       166     534                 766
7948         88                    "X"         0.18307%     5187               1630              277       124     292                 438
5729         90                    "Z"         0.13196%     3371               1480              204       26      269                 379
4752         81                    "Q"         0.10945%     2196               1449              173       53      255                 626

Occurrences  Code point (decimal)  Appears as  % of corpus  Dialect counts (run together in source)
222          214                   "Ö"         0.00511%     232152
116          201                   "É"         0.00267%     5691545
87           205                   "Í"         0.00200%     7611
42           192                   "À"         0.00097%     2616
29           220                   "Ü"         0.00067%     1311212
28           207                   "Ï"         0.00064%     5194
17           200                   "È"         0.00039%     1313
16           196                   "Ä"         0.00037%     11131
15           210                   "Ò"         0.00035%     123
15           211                   "Ó"         0.00035%     141
13           208                   "Ð"         0.00030%     13
10           216                   "Ø"         0.00023%     181
9            199                   "Ç"         0.00021%     81
8            198                   "Æ"         0.00018%     17
6            212                   "Ô"         0.00014%     6
6            193                   "Á"         0.00014%     51
5            217                   "Ù"         0.00012%     5
5            218                   "Ú"         0.00012%     41
4            221                   "Ý"         0.00009%     4
3            204                   "Ì"         0.00007%     12
3            222                   "Þ"         0.00007%     3
2            256                   "Ā"         0.00005%     2
2            274                   "Ē"         0.00005%     2
2            194                   "Â"         0.00005%     2
2            195                   "Ã"         0.00005%     2
2            332                   "Ō"         0.00005%     2
1            490                   "Ǫ"         0.00002%     1
1            262                   "Ć"         0.00002%     1
1            268                   "Č"         0.00002%     1
1            362                   "Ū"         0.00002%     1
1            197                   "Å"         0.00002%     1
1            209                   "Ñ"         0.00002%     1