Letter / character frequencies

Earlier versions of this website were blind to accented characters. I think I've sorted it out now, but I'm not 100% certain.

To investigate the phenomenon of accented characters in Scots writing, I have written a script to count the number of occurrences of every character. It steps through each piece of text, character by character, decoding the text from UTF-8 and passing each character to the Perl ord() function, which gives a decimal value (the Unicode code point) for each character.
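
The counting step described above can be sketched roughly as follows. This is a Python illustration rather than the Perl script itself; the function name count_code_points and the sample sentence are made up for the example. Python's built-in ord(), like Perl's, returns the decimal code point of a character.

```python
from collections import Counter

def count_code_points(text: str) -> Counter:
    """Count how often each character occurs, keyed by decimal code point."""
    counts = Counter()
    for ch in text:            # walk the text character by character
        counts[ord(ch)] += 1   # ord() gives the decimal code point
    return counts

# Hypothetical sample text. Bytes read from a file would need decoding
# from UTF-8 first, which is what the encode/decode round trip stands in for.
raw_bytes = "Böd, brö an dö".encode("utf-8")
sample = raw_bytes.decode("utf-8")
counts = count_code_points(sample)

for code_point, n in counts.most_common(3):
    print(n, code_point, repr(chr(code_point)))
```

Counting decoded characters rather than raw bytes matters here: an accented letter such as "ö" occupies two bytes in UTF-8, so a byte-by-byte count would never see it as a single character.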

The letter characters are identified and converted to uppercase, and then used to create a table of letter frequencies that combines the two cases. The occurrences of each letter in each dialect are listed below.
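
A minimal Python sketch of this tabulation step is given here, under some assumptions: the names letter_frequencies and percentage_of_corpus, the sample texts, and the dictionary layout are all hypothetical. The percentage is assumed to be each letter's share of all letters counted, which fits the table, where the percentages add up to roughly 100%.

```python
from collections import Counter, defaultdict

def letter_frequencies(texts_by_dialect):
    """Return (totals, per_dialect) counts of letters, folded to uppercase."""
    totals = Counter()
    per_dialect = defaultdict(Counter)
    for dialect, texts in texts_by_dialect.items():
        for text in texts:
            for ch in text:
                if not ch.isalpha():      # skip digits, spaces, punctuation
                    continue
                upper = ch.upper()        # combine the two cases
                if len(upper) != 1:       # assumption: skip letters such as
                    continue              # "ß" whose uppercase is two characters
                totals[upper] += 1
                per_dialect[upper][dialect] += 1
    return totals, per_dialect

def percentage_of_corpus(totals):
    """Each letter's share of all counted letters, as a percentage."""
    grand_total = sum(totals.values())
    return {letter: 100.0 * n / grand_total for letter, n in totals.items()}

# Hypothetical two-dialect sample, not taken from the real corpus.
sample = {
    "Shetland": ["Böd, dö"],
    "Ulster": ["Aye, bide"],
}
totals, per_dialect = letter_frequencies(sample)
shares = percentage_of_corpus(totals)

for letter, n in totals.most_common():
    print(n, ord(letter), letter, f"{shares[letter]:.5f}%", dict(per_dialect[letter]))
```

Each printed row mirrors the leading columns of the table: the count, the decimal code point of the uppercase letter, the letter itself, its percentage, and then the per-dialect breakdown.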

The next step is to identify which specific writers use accents where others do not, and whether the use of accents is more common in particular dialects.
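
One way that next step might be approached is sketched below in Python. It is an illustration only: the function name accented_letter_counts, the writer names and the sample texts are hypothetical, and "accented" is approximated as "any letter outside plain ASCII", which also catches letters such as Ð, Þ and Æ.

```python
from collections import Counter

def accented_letter_counts(texts_by_writer):
    """Map each writer to a Counter of the non-ASCII letters they use."""
    result = {}
    for writer, texts in texts_by_writer.items():
        counts = Counter()
        for text in texts:
            for ch in text:
                # Assumption: a letter with a code point above 127 is "accented".
                if ch.isalpha() and ord(ch) > 127:
                    counts[ch.upper()] += 1
        result[writer] = counts
    return result

# Hypothetical writers and texts.
sample = {
    "Writer A": ["dö it", "da böd"],
    "Writer B": ["do it"],
}
by_writer = accented_letter_counts(sample)

# Writers who use at least one accented letter.
accent_users = [writer for writer, counts in by_writer.items() if counts]
print(accent_users)
```

Grouping the same counts by dialect instead of by writer would show whether accent use is concentrated in particular dialects.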

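Number of occurrences | Unicode decimal | appears as | percentage of corpus | Central / Lallans | Doric / Northern | Shetland | Orkney | Southern / Borders | Ulster
555781 | 69 | "E" | 12.62089% | 326577 | 129741 | 15357 | 13343 | 31087 | 39676
438573 | 65 | "A" | 9.95928% | 256332 | 97898 | 14456 | 9491 | 23244 | 37152
405411 | 84 | "T" | 9.20623% | 244818 | 90274 | 9911 | 8742 | 21143 | 30523
360372 | 73 | "I" | 8.18346% | 213770 | 86774 | 11139 | 7212 | 17889 | 23588
313690 | 78 | "N" | 7.12339% | 184158 | 73370 | 9333 | 6916 | 16404 | 23509
285540 | 83 | "S" | 6.48415% | 173876 | 62131 | 8670 | 6401 | 14621 | 19841
263719 | 79 | "O" | 5.98863% | 154909 | 60540 | 8808 | 6708 | 13361 | 19393
249241 | 72 | "H" | 5.65986% | 148470 | 53113 | 5252 | 6007 | 15528 | 20871
244475 | 82 | "R" | 5.55163% | 143681 | 56481 | 7567 | 5614 | 12787 | 18345
165489 | 76 | "L" | 3.75799% | 98984 | 35380 | 5496 | 3808 | 8785 | 13036
150763 | 68 | "D" | 3.42358% | 84796 | 35836 | 8119 | 4149 | 8330 | 9533
121917 | 67 | "C" | 2.76854% | 75390 | 26819 | 3199 | 2182 | 5851 | 8476
114540 | 85 | "U" | 2.60102% | 69998 | 22051 | 2993 | 2575 | 6753 | 10170
105547 | 87 | "W" | 2.39680% | 63151 | 22280 | 3517 | 2350 | 6074 | 8175
103334 | 77 | "M" | 2.34655% | 62550 | 23415 | 3389 | 2341 | 5091 | 6548
90660 | 89 | "Y" | 2.05874% | 52521 | 22131 | 2814 | 1917 | 4484 | 6793
79376 | 70 | "F" | 1.80250% | 43071 | 23227 | 2594 | 1825 | 3658 | 5001
79306 | 66 | "B" | 1.80091% | 46588 | 18808 | 2459 | 1724 | 4093 | 5634
77359 | 71 | "G" | 1.75670% | 46130 | 17134 | 2327 | 1860 | 4163 | 5745
75797 | 80 | "P" | 1.72123% | 44897 | 18539 | 2623 | 1647 | 3510 | 4581
60500 | 75 | "K" | 1.37386% | 34331 | 14966 | 2173 | 1569 | 2980 | 4481
28808 | 86 | "V" | 0.65418% | 17579 | 6744 | 927 | 676 | 1265 | 1617
13970 | 74 | "J" | 0.31724% | 7785 | 4057 | 497 | 166 | 534 | 931
7985 | 88 | "X" | 0.18133% | 5198 | 1636 | 277 | 124 | 292 | 458
5800 | 90 | "Z" | 0.13171% | 3382 | 1482 | 204 | 26 | 269 | 437
4807 | 81 | "Q" | 0.10916% | 2204 | 1455 | 173 | 53 | 255 | 667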
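
Number of occurrences | Unicode decimal | appears as | percentage of corpus | counts by dialect (digits run together, dialect columns not recoverable)
226 | 207 | "Ï" | 0.00513% | 519202
222 | 214 | "Ö" | 0.00504% | 232152
117 | 201 | "É" | 0.00266% | 5791545
87 | 205 | "Í" | 0.00198% | 7611
42 | 192 | "À" | 0.00095% | 2616
39 | 200 | "È" | 0.00089% | 13125
31 | 220 | "Ü" | 0.00070% | 1311214
18 | 211 | "Ó" | 0.00041% | 144
16 | 196 | "Ä" | 0.00036% | 11131
15 | 210 | "Ò" | 0.00034% | 123
13 | 208 | "Ð" | 0.00030% | 13
10 | 216 | "Ø" | 0.00023% | 181
9 | 199 | "Ç" | 0.00020% | 81
8 | 198 | "Æ" | 0.00018% | 17
6 | 193 | "Á" | 0.00014% | 51
6 | 212 | "Ô" | 0.00014% | 6
5 | 218 | "Ú" | 0.00011% | 41
5 | 217 | "Ù" | 0.00011% | 5
4 | 221 | "Ý" | 0.00009% | 4
3 | 204 | "Ì" | 0.00007% | 12
3 | 222 | "Þ" | 0.00007% | 3
2 | 274 | "Ē" | 0.00005% | 2
2 | 256 | "Ā" | 0.00005% | 2
2 | 195 | "Ã" | 0.00005% | 2
2 | 332 | "Ō" | 0.00005% | 2
2 | 194 | "Â" | 0.00005% | 2
1 | 490 | "Ǫ" | 0.00002% | 1
1 | 268 | "Č" | 0.00002% | 1
1 | 362 | "Ū" | 0.00002% | 1
1 | 209 | "Ñ" | 0.00002% | 1
1 | 197 | "Å" | 0.00002% | 1
1 | 262 | "Ć" | 0.00002% | 1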