Letter / character frequencies

Earlier versions of this website were blind to accented characters, I think I've sorted it out now, but I'm not 100% certain.

To investigate the phenomenom of accented characters in scots writing, I have written a script to count the number of occurrences of every character. It counts through each piece of text, character by character, converting the character from the UTF-8 unicode encoding using the perl ord() function, this gives a decimal value for each character.

The letter characters are identified, converted into uppercase and then create table of letter frequencies that combine the two cases. The occurrences of letters in each dialect are listed.

The next step is to identify which specific writers use accents where others do not, or if the use of accents is common in various dialects.

Number of occurrencesutf8 decimalappears aspercentage of corpusCentral / LallansDoric / NorthernShetlandOrkneySouthern / BordersUlster
91439569"E"12.70623%51785920685241873297654689370971
73108965"A"10.15905%40710115318244407214383530769513
65420584"T"9.09069%38132913949027007191673251854560
58454673"I"8.12272%32996113547432275164352885541431
51160678"N"7.10916%28867111352126176152902499142836
46418683"S"6.45023%2702409732824582141672179335963
42903379"O"5.96175%2439889272023369143202029334225
41531472"H"5.77111%2381538582914653138932344139269
39526682"R"5.49253%2242578602019548121502044832763
26999476"L"3.75178%154858567121419985101324622392
25645668"D"3.56366%1370535489624852102931221717089
19063967"C"2.64908%1147114035784934930901813085
18631285"U"2.58895%11158133198747555801049817931
17489387"W"2.43027%10022034539100665622949814917
17076177"M"2.37286%982703616592125489861812965
14907989"Y"2.07157%837313477475464252670612037
12703870"F"1.76529%66913338056753384359349745
12630366"B"1.75508%71550283756154395862469984
12501071"G"1.73711%72698259236241409162319788
12021280"P"1.67044%69872271886323372652177841
9902175"K"1.37597%54468229675715343647207698
4678186"V"0.65006%27453106512234143019543040
2166974"J"0.30111%11366592911393588791992
1127688"X"0.15669%71452111582242434757
857190"Z"0.11910%43762644347102403699
656681"Q"0.09124%31841838275121334814
3455207"Ï"0.04801%71193428
1544214"Ö"0.02146%8415284
436200"È"0.00606%115420
236201"É"0.00328%6921325136
124205"Í"0.00172%97243
84220"Ü"0.00117%191201241
48192"À"0.00067%2911116
43211"Ó"0.00060%15226
42216"Ø"0.00058%2364
29198"Æ"0.00040%1991
20210"Ò"0.00028%173
19194"Â"0.00026%415
19196"Ä"0.00026%21133
15208"Ð"0.00021%15
15193"Á"0.00021%8412
10195"Ã"0.00014%82
10218"Ú"0.00014%451
9199"Ç"0.00013%81
7212"Ô"0.00010%61
6221"Ý"0.00008%6
6217"Ù"0.00008%51
4222"Þ"0.00006%4
4203"Ë"0.00006%4
4204"Ì"0.00006%22
3256"Ā"0.00004%21
2209"Ñ"0.00003%11
2197"Å"0.00003%11
2600"ɘ"0.00003%2
2330"Ŋ"0.00003%2
2332"Ō"0.00003%2
2274"Ē"0.00003%2
1540"Ȝ"0.00001%1
1268"Č"0.00001%1
1490"Ǫ"0.00001%1
1202"Ê"0.00001%1
1262"Ć"0.00001%1
1362"Ū"0.00001%1