Letter / character frequencies

Earlier versions of this website were blind to accented characters. I think I've sorted it out now, but I'm not 100% certain.

To investigate the use of accented characters in Scots writing, I have written a script that counts the number of occurrences of every character. It steps through each piece of text character by character, decoding the text from UTF-8 and passing each character to the Perl ord() function, which gives a decimal value (the Unicode code point) for each character.
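The counting step can be sketched roughly as follows. This is not the script used for this site (which is written in Perl); it is a minimal Python analogue, relying on the fact that Python's built-in ord() likewise returns the decimal code point of a character. The function name count_code_points and the sample sentence are made up for illustration.

```python
from collections import Counter


def count_code_points(text: str) -> Counter:
    """Count every character in text, keyed by its decimal code point."""
    counts = Counter()
    for ch in text:
        counts[ord(ch)] += 1  # ord("é") is 233, ord("E") is 69
    return counts


# Hypothetical sample: bytes as they might be read from a UTF-8 file.
raw = "Whaur's the wee café?".encode("utf-8")

# Decode first, so that "é" is one character rather than two bytes.
sample = raw.decode("utf-8")

counts = count_code_points(sample)
for code, n in sorted(counts.items()):
    print(code, repr(chr(code)), n)
```

Decoding before counting is the important part: counting the raw bytes instead would split each accented character into two separate values, which is one way a script can end up blind to accents.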

The letter characters are then identified and converted to uppercase, so that the resulting table of letter frequencies combines the two cases. The table lists the occurrences of each letter in each dialect.
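A rough sketch of the case-folding and per-dialect tally, again as a Python analogue rather than the actual Perl script. The names letter_frequencies and print_table, the sample texts, and the choice of denominator for the percentage (all letters counted) are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def letter_frequencies(texts_by_dialect):
    """Return {uppercase letter: Counter({dialect: count})}."""
    table = defaultdict(Counter)
    for dialect, texts in texts_by_dialect.items():
        for text in texts:
            for ch in text:
                if not ch.isalpha():
                    continue  # skip digits, spaces and punctuation
                upper = ch.upper()
                # A few letters uppercase to more than one character
                # (for example "ß"); keep those under their original form.
                key = upper if len(upper) == 1 else ch
                table[key][dialect] += 1
    return table


def print_table(table, dialects):
    """Print one row per letter, most frequent first."""
    grand_total = sum(sum(row.values()) for row in table.values())
    rows = sorted(table.items(), key=lambda kv: -sum(kv[1].values()))
    for letter, row in rows:
        total = sum(row.values())
        share = 100 * total / grand_total  # assumed denominator: all letters
        cells = " | ".join(str(row[d]) for d in dialects)
        print(f'{total} | {ord(letter)} | "{letter}" | {share:.5f}% | {cells}')


# Hypothetical two-dialect sample, not taken from the corpus.
sample = {
    "Doric / Northern": ["Fit like the day?"],
    "Shetland": ["Dü is come hame."],
}
print_table(letter_frequencies(sample), list(sample))
```

Because upper- and lowercase forms share one key, "ü" and "Ü" are tallied together, just as "e" and "E" are.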

The next step is to identify which specific writers use accents where others do not, or whether the use of accents is common in various dialects.

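Number of occurrences | code point (decimal) | appears as | percentage of corpus | Central / Lallans | Doric / Northern | Shetland | Orkney | Southern / Borders | Ulster
641827 | 69 | "E" | 12.60379% | 381796 | 143370 | 17759 | 15058 | 31080 | 52764
509534 | 65 | "A" | 10.00590% | 300464 | 107941 | 17247 | 10977 | 23240 | 49665
466112 | 84 | "T" | 9.15321% | 284258 | 98834 | 11692 | 10024 | 21140 | 40164
413790 | 73 | "I" | 8.12574% | 245424 | 97080 | 13259 | 8453 | 17885 | 31689
362580 | 78 | "N" | 7.12011% | 214603 | 81373 | 11009 | 7982 | 16404 | 31209
328631 | 83 | "S" | 6.45344% | 201177 | 68770 | 10316 | 7363 | 14621 | 26384
306388 | 79 | "O" | 6.01665% | 183333 | 66316 | 10188 | 7621 | 13360 | 25570
290684 | 72 | "H" | 5.70827% | 175171 | 59232 | 6013 | 6660 | 15524 | 28084
281422 | 82 | "R" | 5.52638% | 166884 | 61982 | 8692 | 6441 | 12784 | 24639
192989 | 76 | "L" | 3.78979% | 117078 | 39405 | 6427 | 4353 | 8781 | 16945
174920 | 68 | "D" | 3.43497% | 100700 | 39213 | 9441 | 4642 | 8329 | 12595
139266 | 67 | "C" | 2.73482% | 87367 | 29056 | 3962 | 2560 | 5849 | 10472
133238 | 85 | "U" | 2.61644% | 83026 | 23992 | 3410 | 2945 | 6753 | 13112
121880 | 87 | "W" | 2.39340% | 74252 | 24091 | 4082 | 2693 | 6073 | 10689
120374 | 77 | "M" | 2.36383% | 74061 | 25625 | 3860 | 2676 | 5088 | 9064
106986 | 89 | "Y" | 2.10092% | 63214 | 24991 | 3163 | 2202 | 4483 | 8933
91073 | 70 | "F" | 1.78843% | 50344 | 25065 | 2974 | 2062 | 3654 | 6974
90856 | 66 | "B" | 1.78417% | 54176 | 20335 | 2778 | 1984 | 4092 | 7491
89547 | 71 | "G" | 1.75847% | 54765 | 18409 | 2745 | 2168 | 4162 | 7298
87305 | 80 | "P" | 1.71444% | 52827 | 20059 | 2979 | 1872 | 3510 | 6058
70616 | 75 | "K" | 1.38671% | 40854 | 16735 | 2431 | 1750 | 2979 | 5867
33836 | 86 | "V" | 0.66445% | 20781 | 7675 | 1064 | 755 | 1265 | 2296
16060 | 74 | "J" | 0.31538% | 9023 | 4547 | 520 | 195 | 534 | 1241
8921 | 88 | "X" | 0.17518% | 5882 | 1695 | 310 | 137 | 292 | 605
6283 | 90 | "Z" | 0.12338% | 3729 | 1518 | 210 | 33 | 269 | 524
5245 | 81 | "Q" | 0.10300% | 2544 | 1488 | 180 | 56 | 255 | 722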
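Number of occurrences | code point (decimal) | appears as | percentage of corpus | dialect counts (run together)
1085 | 207 | "Ï" | 0.02131% | 61191059
224 | 214 | "Ö" | 0.00440% | 332153
197 | 201 | "É" | 0.00387% | 59915123
160 | 200 | "È" | 0.00314% | 131146
87 | 205 | "Í" | 0.00171% | 7611
43 | 192 | "À" | 0.00084% | 26116
34 | 220 | "Ü" | 0.00067% | 1311217
20 | 211 | "Ó" | 0.00039% | 146
16 | 196 | "Ä" | 0.00031% | 11131
15 | 210 | "Ò" | 0.00029% | 123
13 | 208 | "Ð" | 0.00026% | 13
10 | 216 | "Ø" | 0.00020% | 181
9 | 193 | "Á" | 0.00018% | 711
9 | 199 | "Ç" | 0.00018% | 81
8 | 198 | "Æ" | 0.00016% | 17
6 | 212 | "Ô" | 0.00012% | 6
5 | 218 | "Ú" | 0.00010% | 41
5 | 217 | "Ù" | 0.00010% | 5
4 | 221 | "Ý" | 0.00008% | 4
3 | 204 | "Ì" | 0.00006% | 12
3 | 222 | "Þ" | 0.00006% | 3
2 | 194 | "Â" | 0.00004% | 2
2 | 274 | "Ē" | 0.00004% | 2
2 | 332 | "Ō" | 0.00004% | 2
2 | 195 | "Ã" | 0.00004% | 2
2 | 256 | "Ā" | 0.00004% | 2
1 | 197 | "Å" | 0.00002% | 1
1 | 268 | "Č" | 0.00002% | 1
1 | 362 | "Ū" | 0.00002% | 1
1 | 209 | "Ñ" | 0.00002% | 1
1 | 490 | "Ǫ" | 0.00002% | 1
1 | 262 | "Ć" | 0.00002% | 1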