Letter / character frequencies

Earlier versions of this website were blind to accented characters. I think I've sorted it out now, but I'm not 100% certain.

To investigate the phenomenon of accented characters in Scots writing, I have written a script to count the number of occurrences of every character. It steps through each piece of text character by character, decoding the UTF-8 text and passing each character to the Perl ord() function, which returns the character's Unicode code point as a decimal value.
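The counting step can be sketched roughly as follows. This is a Python illustration rather than the Perl script itself; Python's built-in ord() plays the same role as Perl's ord(), and the function name count_code_points is made up for the example. The important assumption is that the bytes are decoded as UTF-8 before counting, so that an accented character is counted once rather than as two separate bytes.

```python
from collections import Counter


def count_code_points(raw_bytes: bytes) -> Counter:
    """Count every character in a UTF-8 byte string, keyed by decimal code point.

    Hypothetical helper for illustration only, not the site's actual script.
    """
    # Decode first: counting raw bytes would split "Ö" into two bytes
    # and hide it from the tally.
    text = raw_bytes.decode("utf-8")
    return Counter(ord(ch) for ch in text)


sample = "Öl ö".encode("utf-8")
counts = count_code_points(sample)
for code_point, n in sorted(counts.items()):
    print(code_point, repr(chr(code_point)), n)
```

In this sample, the uppercase and lowercase forms of the same letter are still counted separately (214 and 246), which is why the cases are folded together in the next step.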

The letter characters are identified and converted to uppercase, and the counts are then used to build a table of letter frequencies that combines the two cases. The occurrences of each letter in each dialect are listed.
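A minimal sketch of this tallying step, again in Python rather than the original Perl. It assumes the texts are already decoded and grouped by dialect in a dictionary (a hypothetical input shape), and that the percentage column is each letter's share of all letters counted. The names letter_frequencies and percentage are invented for the example.

```python
from collections import Counter


def letter_frequencies(texts_by_dialect):
    """Return (totals, per_dialect) letter counts with both cases combined.

    texts_by_dialect maps a dialect name to a list of decoded text strings
    (an assumed layout, not necessarily how the corpus is stored).
    """
    per_dialect = {}
    for dialect, texts in texts_by_dialect.items():
        counts = Counter()
        for text in texts:
            # Keep letters only and fold to uppercase so "e" and "E" share a row.
            # A few characters (such as "ß") uppercase to more than one
            # character; this sketch simply keeps them under that longer key.
            counts.update(ch.upper() for ch in text if ch.isalpha())
        per_dialect[dialect] = counts

    totals = Counter()
    for counts in per_dialect.values():
        totals.update(counts)
    return totals, per_dialect


def percentage(count, totals):
    """Share of all counted letters, as a percentage (assumed definition)."""
    return 100.0 * count / sum(totals.values())


corpus = {"Shetland": ["Da öl"], "Orkney": ["The ale"]}
totals, per_dialect = letter_frequencies(corpus)
print(totals["E"], per_dialect["Shetland"]["Ö"], percentage(totals["E"], totals))
```

Keeping one Counter per dialect and summing them afterwards means the overall column and the dialect columns always agree, which is a useful sanity check on the table.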

The next step is to identify which specific writers use accents where others do not, or whether the use of accents is common to particular dialects.
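One way this next step might be approached is sketched below in Python. It is only an assumption about how the data could be organised: each text is paired with a writer label, and any letter outside the ASCII range is treated as "accented" (which also catches letters such as Ð, Þ and Æ, as the table does). The function name accented_letters_by_writer is hypothetical.

```python
from collections import Counter


def accented_letters_by_writer(texts):
    """Tally non-ASCII letters per writer.

    texts is an iterable of (writer, text) pairs, a hypothetical input shape.
    Writers who never use such letters end up with an empty Counter.
    """
    usage = {}
    for writer, text in texts:
        counter = usage.setdefault(writer, Counter())
        for ch in text:
            # ord() above 127 means the character is outside plain ASCII.
            if ch.isalpha() and ord(ch) > 127:
                counter[ch.upper()] += 1
    return usage


usage = accented_letters_by_writer([
    ("writer_a", "öl an é"),
    ("writer_b", "plain text"),
])
for writer, counter in usage.items():
    print(writer, dict(counter))
```

Grouping the same tallies by dialect instead of by writer would address the second half of the question.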

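Number of occurrences | Unicode code point (decimal) | Appears as | Percentage of corpus | Central / Lallans | Doric / Northern | Shetland | Orkney | Southern / Borders | Ulster
653207 | 69 | "E" | 12.60688% | 390810 | 143688 | 17759 | 15058 | 31080 | 54812
518964 | 65 | "A" | 10.01599% | 307834 | 108149 | 17247 | 10977 | 23240 | 51517
474245 | 84 | "T" | 9.15292% | 290721 | 99022 | 11692 | 10024 | 21140 | 41646
420963 | 73 | "I" | 8.12458% | 251057 | 97296 | 13259 | 8453 | 17885 | 33013
368912 | 78 | "N" | 7.11999% | 219569 | 81540 | 11009 | 7982 | 16404 | 32408
334244 | 83 | "S" | 6.45090% | 205634 | 68947 | 10316 | 7363 | 14621 | 27363
311185 | 79 | "O" | 6.00586% | 187080 | 66411 | 10188 | 7621 | 13360 | 26525
296745 | 72 | "H" | 5.72717% | 180157 | 59336 | 6013 | 6660 | 15524 | 29055
286053 | 82 | "R" | 5.52082% | 170389 | 62099 | 8692 | 6441 | 12784 | 25648
196133 | 76 | "L" | 3.78536% | 119373 | 39477 | 6427 | 4353 | 8781 | 17722
177909 | 68 | "D" | 3.43364% | 103056 | 39298 | 9441 | 4642 | 8329 | 13143
141412 | 67 | "C" | 2.72925% | 89050 | 29110 | 3962 | 2560 | 5849 | 10881
135840 | 85 | "U" | 2.62171% | 85038 | 24045 | 3410 | 2945 | 6753 | 13649
124481 | 87 | "W" | 2.40248% | 76378 | 24143 | 4082 | 2693 | 6073 | 11112
122325 | 77 | "M" | 2.36087% | 75606 | 25658 | 3860 | 2676 | 5088 | 9437
108852 | 89 | "Y" | 2.10084% | 64664 | 25046 | 3163 | 2202 | 4483 | 9294
92529 | 70 | "F" | 1.78581% | 51335 | 25124 | 2974 | 2062 | 3654 | 7380
92360 | 66 | "B" | 1.78255% | 55355 | 20375 | 2778 | 1984 | 4092 | 7776
91108 | 71 | "G" | 1.75838% | 55984 | 18437 | 2745 | 2168 | 4162 | 7612
88750 | 80 | "P" | 1.71287% | 53992 | 20103 | 2979 | 1872 | 3510 | 6294
71853 | 75 | "K" | 1.38676% | 41859 | 16762 | 2431 | 1750 | 2979 | 6072
34400 | 86 | "V" | 0.66392% | 21229 | 7692 | 1064 | 755 | 1265 | 2395
16260 | 74 | "J" | 0.31382% | 9187 | 4555 | 520 | 195 | 534 | 1269
8996 | 88 | "X" | 0.17362% | 5951 | 1695 | 310 | 137 | 292 | 611
6334 | 90 | "Z" | 0.12225% | 3764 | 1522 | 210 | 33 | 269 | 536
5315 | 81 | "Q" | 0.10258% | 2598 | 1492 | 180 | 56 | 255 | 734

Number of occurrences | Unicode code point (decimal) | Appears as | Percentage of corpus | Per-dialect counts (run together; empty cells omitted, so column positions are not recoverable)
1086 | 207 | "Ï" | 0.02096% | 61191060
224 | 214 | "Ö" | 0.00432% | 332153
203 | 201 | "É" | 0.00392% | 65915123
160 | 200 | "È" | 0.00309% | 131146
87 | 205 | "Í" | 0.00168% | 7611
43 | 192 | "À" | 0.00083% | 26116
34 | 220 | "Ü" | 0.00066% | 1311217
20 | 211 | "Ó" | 0.00039% | 146
16 | 196 | "Ä" | 0.00031% | 11131
15 | 210 | "Ò" | 0.00029% | 123
13 | 208 | "Ð" | 0.00025% | 13
10 | 216 | "Ø" | 0.00019% | 181
9 | 193 | "Á" | 0.00017% | 711
9 | 199 | "Ç" | 0.00017% | 81
8 | 198 | "Æ" | 0.00015% | 17
6 | 212 | "Ô" | 0.00012% | 6
5 | 218 | "Ú" | 0.00010% | 41
5 | 217 | "Ù" | 0.00010% | 5
4 | 221 | "Ý" | 0.00008% | 4
3 | 204 | "Ì" | 0.00006% | 12
3 | 222 | "Þ" | 0.00006% | 3
2 | 274 | "Ē" | 0.00004% | 2
2 | 256 | "Ā" | 0.00004% | 2
2 | 332 | "Ō" | 0.00004% | 2
2 | 194 | "Â" | 0.00004% | 2
2 | 195 | "Ã" | 0.00004% | 2
1 | 362 | "Ū" | 0.00002% | 1
1 | 490 | "Ǫ" | 0.00002% | 1
1 | 262 | "Ć" | 0.00002% | 1
1 | 197 | "Å" | 0.00002% | 1
1 | 209 | "Ñ" | 0.00002% | 1
1 | 268 | "Č" | 0.00002% | 1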