The table below shows all the data on piano key usage extracted from the MIDI files, along with several other details of each piece. The titles, opus numbers, dates of composition, and harmonic keys are given for easy reference. The bulk of the table shows the total time, in milliseconds, that each key was depressed during each sonata, placed within the compass of a modern piano. Colour coding shows the size of keyboard each work was written for, with different colours representing what I suspect are different pianos. At the bottom of the chart, the two pianos that Beethoven is known to have owned during this period are shown. If the data from a sonata is presented in one of these colours, it is highly likely that the sonata was written for that instrument.
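As an illustration of how such per-key totals might be accumulated, the following is a minimal sketch rather than the procedure actually used. It assumes the note events have already been read out of each MIDI file as hypothetical `(midi_note, onset_ms, offset_ms)` tuples; the function names and the example data are invented.

```python
from collections import defaultdict


def key_durations_ms(notes):
    """Sum the time each key is held down, in milliseconds.

    `notes` is an iterable of (midi_note, onset_ms, offset_ms) tuples.
    This layout is an assumption made for the sketch, not the source format.
    """
    totals = defaultdict(int)
    for midi_note, onset_ms, offset_ms in notes:
        if offset_ms < onset_ms:
            raise ValueError("note ends before it starts")
        totals[midi_note] += offset_ms - onset_ms
    return dict(totals)


def corpus_totals(per_sonata):
    """Add per-sonata totals together to give one total per key."""
    combined = defaultdict(int)
    for totals in per_sonata:
        for midi_note, ms in totals.items():
            combined[midi_note] += ms
    return dict(combined)


# Invented example data standing in for two very short pieces.
sonata_a = key_durations_ms([(60, 0, 500), (64, 0, 250), (60, 500, 750)])
sonata_b = key_durations_ms([(60, 0, 100), (67, 100, 400)])
totals = corpus_totals([sonata_a, sonata_b])
```

Each per-sonata dictionary corresponds to one row of the table, and the combined dictionary to the totals over the whole corpus.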
Below is a bar chart showing the total amount of time each piano key was depressed over the entirety of the thirty-two sonatas.
As can be seen, the data approximate a normal distribution with some anomalies. The large spikes in the graph are caused in part by bias in harmonic key usage. Below is a table showing Beethoven's key usage over the entire corpus. Relative major and minor keys are counted together, as they share a key signature and therefore the same diatonic pitch-classes.
As the table above shows, some keys are greatly privileged over others. To establish whether the aberrant spikes in the graph above were the result of this privileging of harmonic keys rather than of some other process, the expected occurrence of each pitch-class was estimated from the use of harmonic keys across all the sonatas. This estimate was then compared to the actual prevalence of pitch-classes in the data.
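One way such an estimate could be formed is sketched below. The weighting scheme is an assumption made for illustration, since the text does not specify it: every work is taken to spread its notes evenly over the seven diatonic pitch-classes of its key, with relative minors folded into their major keys. Keys are assumed to be indexed by MIDI note number, so that the remainder after division by 12 gives the pitch-class (0 = C). The function names and the example counts are invented.

```python
MAJOR_SCALE_STEPS = (0, 2, 4, 5, 7, 9, 11)  # semitones above the tonic


def expected_pitch_class_shares(key_counts):
    """Estimate the share of each pitch-class (0 = C ... 11 = B).

    `key_counts` maps the tonic pitch-class of a major key to the number of
    works in that key. Assumption: each work contributes equally to the
    seven diatonic pitch-classes of its key.
    """
    weights = [0.0] * 12
    for tonic, count in key_counts.items():
        for step in MAJOR_SCALE_STEPS:
            weights[(tonic + step) % 12] += count
    total = sum(weights)
    return [w / total for w in weights]


def actual_pitch_class_shares(key_totals_ms):
    """Fold per-key durations (MIDI note number -> ms) into 12 pitch-classes."""
    weights = [0.0] * 12
    for midi_note, ms in key_totals_ms.items():
        weights[midi_note % 12] += ms
    total = sum(weights)
    return [w / total for w in weights]


# Invented counts, not the figures from the table: 3 works in C, 1 in G.
expected = expected_pitch_class_shares({0: 3, 7: 1})
# Invented durations: two Cs and one G.
actual = actual_pitch_class_shares({60: 300, 72: 100, 67: 100})
```

Comparing the two lists entry by entry then shows how far the observed pitch-class usage departs from what the choice of keys alone would predict.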
The estimated and actual usage of pitch-classes are extremely close, indicating that the spikes at pitch-classes C, D, and G seen in fig. 2 are highly likely to result from biases towards these pitch-classes arising from the harmonic keys used. Adjusting the results to compensate for this bias produces the following chart:
From fig. 6, it appears that the spread of notes over Beethoven's sonatas approximates a normal distribution. Calculating the mean and standard deviation of this data allows a normal curve to be superimposed onto the chart to test this suspicion. Here μ (population mean) = 61.32 and σ (population standard deviation) = 13.42.
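The calculation can be sketched as follows, under the assumption that each key is identified by a number (for instance its MIDI note number) and weighted by its total duration. This is a minimal illustration, not the original computation; the function names and the example data are invented.

```python
import math


def weighted_mean_sd(key_totals):
    """Population mean and standard deviation of key number, weighted by duration."""
    total = sum(key_totals.values())
    mean = sum(k * w for k, w in key_totals.items()) / total
    variance = sum(w * (k - mean) ** 2 for k, w in key_totals.items()) / total
    return mean, math.sqrt(variance)


def normal_curve(key, mean, sd, total):
    """Height of a normal curve scaled so that its area equals `total`.

    With one bar per key (bar width 1), this puts the curve on the same
    vertical scale as the bar chart.
    """
    z = (key - mean) / sd
    return total * math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


# Invented symmetric data: three adjacent keys.
example = {59: 1.0, 60: 2.0, 61: 1.0}
mean, sd = weighted_mean_sd(example)
peak = normal_curve(mean, mean, sd, sum(example.values()))
```

Evaluating `normal_curve` at every key number with the reported μ and σ would give the heights of the superimposed curve.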