E.A. Smith and R.J. Senter presented the Automated Readability Index (ARI) in November 1967. The formula uses two variables, WPS (words per sentence) and SPW (strokes per word), whose values are mechanically tabulated by counters attached to an electric typewriter.
The authors present all the data and explain the derivation of the ARI, which is based on a regression equation of the form a + b (word length) + c (sentence length), where a, b and c are constants. The resulting formula is ARI = 0.5 (WPS) + 4.71 (SPW) – 21.43.
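As a quick illustration, the ARI formula quoted above can be written as a small Python function. This is only a sketch: the function name and the sample values are my own choices, not taken from Smith and Senter.

```python
def automated_readability_index(wps: float, spw: float) -> float:
    """ARI = 0.5 (WPS) + 4.71 (SPW) - 21.43, as quoted in the text.

    wps: words per sentence
    spw: strokes per word
    """
    return 0.5 * wps + 4.71 * spw - 21.43


# Hypothetical sample values, chosen only to exercise the formula.
score = automated_readability_index(wps=20, spw=5)
print(round(score, 2))
```

With 20 words per sentence and 5 strokes per word, the arithmetic is 10 + 23.55 – 21.43, which comes to about 12.12.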
In 1969, G. Harry McLaughlin derived SMOG Grading using a regression equation of the form a + b (word length × sentence length), where a and b are constants. If we pursue McLaughlin’s idea, it is possible to arrive at the Simplified Automated Readability Index (SARI).
Let us begin with a lemma: WPS × SPW = SPS (strokes per sentence). This result is elementary and so, instead of a proof, I offer a hint: WPS has ‘words’ in the numerator and SPW has ‘words’ in the denominator, so the ‘words’ cancel. The regression equation now becomes a + b (SPS).
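The lemma can be checked numerically on any passage. The sketch below makes two simplifying assumptions that are mine, not Smith and Senter's: a stroke is any non-space character, and a sentence ends at a full stop, question mark or exclamation mark.

```python
import re


def count_units(text: str) -> tuple[int, int, int]:
    """Return (strokes, words, sentences) for a passage.

    Assumptions: words are whitespace-separated tokens, every
    non-space character counts as one stroke, and sentences end
    with '.', '!' or '?'.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = text.split()
    strokes = sum(len(word) for word in words)
    return strokes, len(words), len(sentences)


sample = "The cat sat on the mat. It was warm."
strokes, words, sentences = count_units(sample)

wps = words / sentences    # words per sentence
spw = strokes / words      # strokes per word
sps = strokes / sentences  # strokes per sentence

# The 'words' cancel, so the product equals SPS up to rounding error.
print(wps * spw, sps)
```

For this sample the counts are 28 strokes, 9 words and 2 sentences, so both sides of the lemma come to 14 strokes per sentence.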
M.J. Moroney, in his book Facts From Figures, describes a few procedures for calculating a and b. I easily obtained the SPS data from the WPS and SPW values that Smith and Senter present in a table. I then found a = –0.015 and b = 0.094. Since a is negligibly small, I left it out of the reckoning. So, SARI = 0.094 (SPS).
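One standard way to calculate a and b in an equation of the form a + b (SPS) is ordinary least squares. Whether this is the procedure used above is an assumption on my part. Smith and Senter's table is not reproduced here, so the sketch below uses made-up placeholder points purely to show the mechanics; `fit_line` and `sari` are hypothetical names.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b


def sari(sps: float) -> float:
    """SARI = 0.094 (SPS), with the small intercept dropped."""
    return 0.094 * sps


# Placeholder data lying exactly on y = 1 + 2x (NOT the 1967 data).
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)
```

To repeat the derivation, the x values would be the SPS figures (WPS × SPW for each row of the table) and the y values the corresponding grade levels.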
In 2005, I independently derived the Character-count Index = C1/10, where C1 is the number of characters (no spaces) in a sentence. I am happy to observe that this formula is an approximation of SARI, since the factor 1/10 = 0.1 is close to 0.094. Thumbs up!
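To see how close the two formulas are, here is a small side-by-side sketch. It assumes that the stroke count of a sentence equals its count of non-space characters, which may not match the typewriter-based count exactly; the function names are my own.

```python
def character_count_index(sentence: str) -> float:
    """Character-count Index = C1 / 10, with C1 counted without spaces."""
    c1 = sum(1 for ch in sentence if not ch.isspace())
    return c1 / 10


def sari_for_sentence(sentence: str) -> float:
    """SARI = 0.094 (SPS), taking SPS as the non-space character count."""
    sps = sum(1 for ch in sentence if not ch.isspace())
    return 0.094 * sps


sentence = "The cat sat on the mat."
cci = character_count_index(sentence)
simplified = sari_for_sentence(sentence)
print(cci, simplified)
```

Under this assumption the two scores always differ by the fixed ratio 0.094 / 0.1 = 0.94, so the Character-count Index runs about 6 per cent above SARI.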