Rogers’ Listening Formula

January 25, 2012

By Nirmaldasan

(nirmaldasan@hotmail.com)

‘A Formula for Predicting the Comprehension Level of Material to be Presented Orally’ by John R. Rogers appeared in The Journal of Educational Research (Volume 56, Number 4, December 1962). Rogers developed criterion passages based on the assumption that children understood their own language and that of children in the same age group. He writes: “Four hundred and eighty recordings of more than a hundred words each were used in developing the criterion. An exact typed reproduction of each recording was made. In effect, these transcriptions represented twelve bodies of material — one for each of the twelve school grades.”

The regression equation (0.669x + 0.4981y – 2.0625) is based on two independent variables: 1. Average idea unit length (x), obtained by dividing the number of words by the number of independent clauses; and 2. Percentage of words in the sample that do not appear on Dale’s long list of 3000 familiar words (y). Rogers says that the grade level predicted by his formula ‘would be accurate within a range of about two grades above or below the predicted one more than two-thirds of the time’.

Since the formula appears complex with its decimal points, I have found a useful approximation by choosing a sample of 50 words. If I is the number of independent clauses in the sample of 50 words; and if ND is the number of words not in the Dale list of 3000 words, then GL (grade level) = (33/I) + ND – 2. This formula grades oral texts on a scale of 1 to 12+.
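The approximation can be checked against the full equation with a minimal Python sketch. The function names and the sample counts are assumptions made for illustration only; they do not come from Rogers' paper.

```python
def rogers_grade(words, independent_clauses, unfamiliar_words):
    """Rogers' regression equation as quoted above."""
    x = words / independent_clauses        # average idea unit length
    y = 100.0 * unfamiliar_words / words   # percentage of words not on the Dale list
    return 0.669 * x + 0.4981 * y - 2.0625


def rogers_grade_50(independent_clauses, unfamiliar_words):
    """Approximation for a 50-word sample: GL = (33/I) + ND - 2."""
    return 33 / independent_clauses + unfamiliar_words - 2


# Hypothetical 50-word sample: 5 independent clauses, 4 words not on the Dale list.
exact = rogers_grade(50, 5, 4)
approx = rogers_grade_50(5, 4)
print(round(exact, 2), round(approx, 2))
```

For this made-up sample both versions come to about 8.6. The agreement follows from the arithmetic: with 50 words, 0.669 x (50/I) is 33.45/I, and 0.4981 x (2 x ND) is very nearly ND.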

The Dale list was revised in 1983. It is found in Readability Revisited: The New Dale-Chall Readability Formula by Jeanne S. Chall and Edgar Dale (1995). The earlier list may be easily found online. I think that Rogers’ listening formula should work well with either list.

However, a word of caution. Rogers writes: “From the nature of the procedure followed in deriving the formula, it is obvious that the last word has not been spoken. Furthermore, the formula is presented as untried, untested, and unproved. At present there is no graded body of material against which it may be tested. Perhaps other researchers will work out methods for testing the formula in practical situations.”

Simplified Automated Readability Index

December 23, 2011

By Nirmaldasan

(Nirmaldasan@hotmail.com)

E.A. Smith and R.J. Senter presented the Automated Readability Index (ARI) in November 1967. The formula uses two variables WPS (words per sentence) and SPW (strokes per word), whose values are mechanically tabulated by counters attached to an electric typewriter.

The authors present all the data and explain the derivation of the ARI, based on a regression equation of the type a + b (word length) + c (sentence length); a, b and c are constants. ARI = 0.5 (WPS) + 4.71 (SPW) – 21.43.

In 1969, G. Harry McLaughlin derived SMOG Grading using a regression equation of the type a + b (word length x sentence length); a and b are constants. If we pursue McLaughlin’s idea, it is possible to arrive at the Simplified Automated Readability Index (SARI).

Let us begin with a lemma. WPS x SPW = SPS (strokes per sentence). This result is elementary and so, instead of a proof, I offer a hint: WPS has ‘words’ in the numerator and SPW has ‘words’ in the denominator. Now the regression equation becomes a + b (SPS).

M.J. Moroney, in his book titled Facts From Figures, describes a few procedures for calculating a and b. I was able to easily get the data for SPS from the WPS and SPW values presented in a table by Smith and Senter. I then found a = –0.015 and b = 0.094. Since a is too small to matter, I left it out of the reckoning. So, SARI = 0.094 (SPS).

In 2005, I independently derived the Character-count Index = C1/10; C1 is the number of characters (no spaces) in a sentence. I am happy to observe that this formula is an approximation of SARI. Thumbs up!
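The three indices above can be put side by side in a short Python sketch. The function names and the sample counts are hypothetical, and the sketch assumes that the same stroke count (characters without spaces) is fed to every formula, which the post does not state.

```python
def ari(words, sentences, strokes):
    """Automated Readability Index: 0.5 (WPS) + 4.71 (SPW) - 21.43."""
    wps = words / sentences   # words per sentence
    spw = strokes / words     # strokes per word
    return 0.5 * wps + 4.71 * spw - 21.43


def sari(sentences, strokes):
    """Simplified ARI: 0.094 (SPS), where SPS is strokes per sentence."""
    return 0.094 * (strokes / sentences)


def character_count_index(sentences, strokes):
    """Character-count Index: characters per sentence divided by 10."""
    return (strokes / sentences) / 10


# Hypothetical sample: 100 words, 5 sentences, 450 strokes.
words, sentences, strokes = 100, 5, 450

# The lemma: WPS x SPW = SPS.
assert abs((words / sentences) * (strokes / words) - strokes / sentences) < 1e-9

print(ari(words, sentences, strokes))
print(sari(sentences, strokes))
print(character_count_index(sentences, strokes))
```

Since 0.094 is close to 1/10, the last two functions differ only by the factor 0.94.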

The Optimal Reading Rate

November 21, 2011

By Nirmaldasan

(nirmaldasan@hotmail.com)

“Rauding theory holds that: (a) there is an optimal rate where efficiency is a maximum, (b) the optimal rate is the same during reading and auding, and (c) at the optimal rate, reading efficiency is equal to auding efficiency,” writes Ronald P. Carver in ‘Optimal Rate of Reading Prose’ (Reading Research Quarterly, Vol. 18, No. 1 (Autumn, 1982), pp. 56-88). Rauding is typical reading, which he differentiates from skimming, scanning, studying and memorizing.

Carver disagrees with the findings of R.E. Jester and R.M.W. Travers that the optimal rates for reading and auding are 300 and 200 words per minute (wpm) respectively. He writes: “On the contrary, the present data suggest that approximately 300 wpm is the most efficient rate for typical college students when they read college-level material as well as when they read Grade 5 material. Thus, it would not seem appropriate for good readers to adjust or change their rate as material decreases in difficulty, because it would be inefficient for them to do so.”

If you skim at the rate of 1000 wpm, as many skilled readers do, there is indeed a greater chance of misreading or misunderstanding the text. I am of the view that the optimal reading rate for both reading and auding ranges from 200 wpm to 300 wpm. This is roughly 3 to 5 words per second.

While researching the average sentence length in terms of words and syllables and letters, I had discovered an approximate equation: 1 x L (letters) = 3 x S (syllables) = 5 x W (words). Using this equation, we find that the optimal reading rate of 3 to 5 words per second may also be expressed as 5 to 8 syllables per second and 15 to 25 letters per second.
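The conversion can be sketched in Python as follows. The function name is an assumption, and the ratios are simply the approximate equation above (one word is about five letters, one syllable about three letters), not measured values.

```python
def per_second_rates(words_per_minute):
    """Convert a reading rate in wpm to words, syllables and letters per second,
    using the approximate equation L = 3 x S = 5 x W."""
    words = words_per_minute / 60
    letters = 5 * words       # L = 5 x W
    syllables = letters / 3   # L = 3 x S
    return {"words": words, "syllables": syllables, "letters": letters}


for wpm in (200, 300):
    rates = per_second_rates(wpm)
    print(wpm, {unit: round(value, 2) for unit, value in rates.items()})
```

At 300 wpm this gives 5 words, about 8.3 syllables and 25 letters per second. At 200 wpm the unrounded figures are a little above the rounded lower bounds quoted in the text.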

Tailpiece: As I was explaining my position on the optimal reading rate to my son Andrew Veda, he drew my attention to a passage in Bernard Shaw’s play “In Good King Charles’s Golden Days”. One of the characters, Newton, says: “You can do, quite deliberately and intentionally, seven distinct actions in a second. How do you count seconds? Hackertybackertyone, hackertybackertytwo, hackertybackertythree and so on. You pronounce seven syllables in every second.”

 

A Question Of Sample Size

October 31, 2011

By Nirmaldasan
(nirmaldasan@hotmail.com)

Three samples are usually considered to be enough for testing the readability of a text. Whether the chosen sample is representative depends on the sample size. Interestingly, different formulae require different sample sizes.

The Winnetka formula, created by Mabel Vogel and Carleton Washburne in 1928, requires a sample of 1000 words. The Lorge Readability Index (1944), Flesch Reading Ease (1948), Gunning’s Fog Index (1952) and Fry Graph (1977) require a sample of about 100 words. Harry McLaughlin’s SMOG (1969) uses a 30-sentence sample. The FORCAST formula of FORd, CAylor and STicht (1973) needs a sample of 150 words. The new Dale-Chall formula (1995) needs an exact 100-word sample.

Since a formula is only a statistical tool that calculates the approximate grade level of a representative sample, it follows that a shorter sample is more likely to yield an inaccurate result. Testing the full text or choosing larger samples should give better results. However, Harris-Sharples found that a ‘minimum of eight samples produced readability scores similar to the largest samples’, according to Jeanne S. Chall and Edgar Dale (Appendix A of Readability Revisited: The New Dale-Chall Readability Formula).

But what is the optimal sample size? A 1000-word sample is easier to test than a full text; a 100-word sample reduces the task to a tenth; and a 10-word sample would be tempting indeed for those who want quick results. Let us try, though on a small scale, to find out the optimal sample size for calculating semantic complexity, which is usually determined by the length of the words in a text.

I picked up Roald Dahl’s Matilda and calculated the average number of syllables per word (ASW) on a sample of the first 1000 words of Chapter I (The Reader Of Books). There were 1376 syllables and, therefore, ASW = 1.376. I then divided the 1000 words into 10 samples, each of 100 words, and found that the ASW varied from 1.3 to 1.46. Next, I divided the first 100 words (ASW = 1.42) into 10 samples, each of 10 words, and found that the ASW varied from 1 to 1.6.

Samples of 10 words vary too much, but samples of 100 words appear more stable. Perhaps this explains why many readability formulae use 100-word samples. Clearly, the calculation of semantic complexity with a 10-word sample cannot be recommended.
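The experiment can be repeated on any text with a small Python sketch. It assumes the syllables have already been counted word by word; the function name and the toy counts below are hypothetical and are not the Matilda data.

```python
def asw_by_sample(syllables_per_word, sample_size):
    """Split per-word syllable counts into consecutive samples of equal size
    and return the average syllables per word (ASW) of each full sample."""
    results = []
    for start in range(0, len(syllables_per_word), sample_size):
        sample = syllables_per_word[start:start + sample_size]
        if len(sample) == sample_size:   # ignore a short leftover sample
            results.append(sum(sample) / sample_size)
    return results


# Toy data: syllable counts for ten words.
counts = [1, 2, 1, 1, 3, 1, 1, 1, 2, 1]

overall = sum(counts) / len(counts)
small = asw_by_sample(counts, 5)
print(overall, min(small), max(small))
```

Comparing the minimum and maximum ASW for each sample size shows how far small samples stray from the overall figure.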

Storybooks For Kindergarten

October 1, 2011

By Nirmaldasan

(Nirmaldasan@hotmail.com)

Glenn Doman, author of How To Teach Your Baby To Read, says: “Two years of age is the best time to begin if you want to expend the least amount of time and energy in teaching your child to read.” The child needs to be taken through seven steps, beginning not with the alphabet but with visual differentiation. Doman suggests an initial list of 59 words and explains how these words have to be taught patiently one at a time. Here’s the list:

The ‘Self’ Vocabulary

hand, knee, foot, head, nose, hair, lips, toes, leg, eye, ear, arm, teeth, belly, mouth, elbow, thumb, finger, tongue, shoulder.

The ‘Home’ Vocabulary

  1. Family:

mommy, daddy, brother, sister, dog, cat, fish, bird, baby

  2. Objects:

chair, table, door, window, wall, bedroom, bathroom, kitchen, refrigerator, TV

  3. Possessions:

plate, spoon, cup, hat, shoes, ball, orange, pants, dress, pajamas

  4. Doing:

sitting, standing, running, laughing, climbing, creeping, walking, jumping, throwing, reading.

Only in the sixth step will the child be ready to read a real book. Doman writes: “The choice of the book to be used is very important and should meet the following standards:

  1. It should have a vocabulary of not more than 150 different words.
  2. It should present no more than a total of 15 or 20 words on a single page.
  3. The printing should be no less than ¼” high.
  4. Text and illustrations should be separated as much as possible.”

“There are three distinct levels of understanding in the process of learning how to read,” says Doman. First, the child discovers that words have meaning. Second, the child notices that words combine with other words. Third, the child realizes that the book is actually talking to him/her.

Doman recommends 26 books, including some written by Dr. Seuss, which fulfilled the following criteria: “1. large enough print; 2. print not intertwined with pictures; 3. size of vocabulary; and, 4. subject matter.”

Teachers may follow Doman’s guidelines in picking suitable storybooks for their children in kindergarten.

Writing For The Average Reader

August 26, 2011

By Nirmaldasan

(nirmaldasan@hotmail.com)

Though there are several factors that make text difficult for the reader, classic readability relies on two effective variables (vocabulary and sentence length) to grade texts on a scale of 1 to 17+ years of schooling. Jeanne S. Chall and Edgar Dale, in their new Dale-Chall readability formula (Readability Revisited, 1995), present a revised list of 3000 words familiar to 80% of 4th graders in the U.S. and tables for obtaining cloze scores and reading levels.

Edgar Dale’s original word list had only 769 words known to beginning readers. It was derived from Edward L. Thorndike’s Teacher’s Word Book and Madeline Horn’s list for the International Kindergarten Union. Dale expanded the list to 3000 words for the Dale-Chall readability formula of 1948.

The new Dale-Chall formula of 1995 is better than the original in that there are no complex arithmetic tasks. One counts the number of complete sentences in exact 100-word samples as well as the number of words not in the list. There are clear instructions for identifying familiar and unfamiliar words. The authors say: “We suggest that the analyst make a first approximation as to whether a word is familiar (‘elemental’) or unfamiliar (‘educated’). The list (and guidelines) should then be consulted to confirm the analyst’s judgment. With practice this procedure becomes quite rapid.” Once the number of complete sentences and the number of unfamiliar words are determined, the reading levels may be obtained from the tables.

The average reader has about eight years of schooling. I have discussed this in an article titled ‘The Average Reading Grade’. So if writers are writing for the average readers, what guidelines can be drawn from the new Dale-Chall formula?

Martin Cutts recommends an average sentence length of about 15-20 words. This means that in an exact 100-word sample, there must be at least five complete sentences. I also looked at those parts of the Dale-Chall tables which indicated 8-9 years of schooling. It is my considered opinion that if the number of unfamiliar words is more than 20%, then the text is not appropriate for the average reader.

The revised Dale list of 3000 words is not available online. You need to get a copy of Readability Revisited published by Brookline Books.

Here follow my two guidelines for writers writing for the average reader: 1. In exact 100-word samples, let there be no fewer than five complete sentences; and 2. Let the percentage of unfamiliar (‘educated’) words be no more than 20.
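The two guidelines amount to a simple check, sketched here in Python. The function name and the sample counts are hypothetical; counting complete sentences and unfamiliar words still has to be done by hand against the Dale list.

```python
def fits_average_reader(complete_sentences, unfamiliar_words):
    """Apply the two guidelines to an exact 100-word sample.

    In a 100-word sample the count of unfamiliar words is also their percentage.
    """
    enough_sentences = complete_sentences >= 5    # guideline 1
    familiar_enough = unfamiliar_words <= 20      # guideline 2
    return enough_sentences and familiar_enough


print(fits_average_reader(6, 12))   # meets both guidelines
print(fits_average_reader(4, 12))   # too few complete sentences
print(fits_average_reader(6, 25))   # too many unfamiliar words
```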

The Fog Estimate

July 25, 2011

By Nirmaldasan

(nirmaldasan@hotmail.com)

Robert Gunning’s Fog Index is the sum of sentence fog and word fog. Sentence fog is caused by long sentences and may be measured by the expression 0.4 x AWS (average number of words per sentence). To measure the word fog, the ratio of AHS (average number of hard words per sentence) to AWS is multiplied by 40. Hard words are polysyllables other than capitalized words, easy compound words and disyllabic verbs made trisyllabic by adding ‘-es’ or ‘-ed’.

Harry McLaughlin developed his SMOG (Simple Measure Of Gobbledygook) on the assumption that sentence length and word length must be multiplied instead of added. If we apply this valid principle to Gunning’s factors, then sentence fog x word fog = (0.4 x AWS) x (AHS/AWS) x 40 = 16 x AHS = H16 (number of hard words in 16 sentences).

This is a useful result as two variables are reduced to one. No need to calculate sentence fog or word fog. Just count the number of hard words in 16 sentences to get the predictive power of both the variables. This count H16 will help us easily match text with years of schooling from 1 to 17+.

If an average sentence has 6.54 polysyllables, then SMOG is 17+. Unlike Gunning, McLaughlin counted every polysyllable in a sample of 30 sentences. We will count only the hard words in 16 sentences and fairly assume that if the average sentence has three hard words or more, then the frequency of gobbledygook is 17+.

Based on this assumption, the Fog Estimate = (H16 / 3). If a text has 15 hard words in 16 sentences, then anyone with five years of schooling may read it with ease. But if it has 51, then the Fog Estimate is 17+. This new derivation from the Fog Index, I dedicate to Robert Gunning and Harry McLaughlin.
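The derivation can be traced in a short Python sketch. The function names and the sample counts are assumptions chosen for illustration; identifying hard words is left to the reader.

```python
def sentence_fog(words, sentences):
    """0.4 x AWS (average number of words per sentence)."""
    return 0.4 * (words / sentences)


def word_fog(words, sentences, hard_words):
    """40 x (AHS / AWS), where AHS is the average number of hard words per sentence."""
    aws = words / sentences
    ahs = hard_words / sentences
    return 40 * (ahs / aws)


def fog_estimate(h16):
    """Fog Estimate = H16 / 3, where H16 is the number of hard words in 16 sentences."""
    return h16 / 3


# Hypothetical sample: 16 sentences, 240 words, 15 hard words.
product = sentence_fog(240, 16) * word_fog(240, 16, 15)
print(product)            # the product of the two fogs, which reduces to H16
print(fog_estimate(15))
print(fog_estimate(51))
```

In a 16-sentence sample the product of sentence fog and word fog equals the hard-word count itself, which is why the single count H16 can stand in for both variables.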

Special SMOG Grading Revisited

June 24, 2011

By Nirmaldasan

(nirmaldasan@hotmail.com)

In ‘Special SMOG Grading’ written in April 2010, I simplified Harry McLaughlin’s already simple readability formula of 1969. But the simplification had a limitation; it could predict years of schooling only from 5 to 12+. Recently, I discovered a fresh simplification of McLaughlin’s formula for the higher grades 13 to 17+.

The average number of polysyllables per sentence (APS) is a useful metric. SMOG Grading = [square root of (30 x APS)] + 3. Here follow my simplifications:

If APS is less than 3, Special SMOG Grading (5 to 12+) = (3 x APS) + 5

If APS is 3 or more, Special SMOG Grading (13 to 17+) = APS + 10

Suppose APS is 2.6. Then, SMOG Grading = [square root of (30 x 2.6)] + 3 = 11.83; and Special SMOG Grading (5 to 12+) = (3 x 2.6) + 5 = 12.8, that is, 12+.

Suppose APS is 5.3. Then, SMOG Grading = [square root of (30 x 5.3)] + 3 = 15.6; and Special SMOG Grading (13 to 17+) = 5.3 + 10 = 15.3.

For APS = 1 to 7, here are the corresponding scores of McLaughlin’s formula with those of my simplifications in brackets: 8.47 (8); 10.74 (11); 12.48 (13); 13.95 (14); 15.24 (15); 16.41 (16); 17.49 (17+). You can easily see how McLaughlin’s formula and my simplifications closely agree with each other.

An easy way to calculate the APS is to count P10, which is the number of polysyllables in 10 sentences. In which case, the simplifications assume these forms:

If APS is less than 3, Special SMOG Grading (5 to 12+) = 3 x (P10 / 10) + 5

If APS is 3 or more, Special SMOG Grading (13 to 17+) = (P10/10) + 10
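A minimal Python sketch of the formulas above makes the comparison easy to repeat. The function names are assumptions; the formulas are the ones given in this post.

```python
import math


def smog_grading(aps):
    """SMOG Grading as given above: square root of (30 x APS), plus 3."""
    return math.sqrt(30 * aps) + 3


def special_smog_grading(aps):
    """The two-part simplification: (3 x APS) + 5 below 3, APS + 10 otherwise."""
    if aps < 3:
        return 3 * aps + 5
    return aps + 10


def special_smog_from_p10(p10):
    """Same simplification, starting from P10 (polysyllables in 10 sentences)."""
    return special_smog_grading(p10 / 10)


for aps in range(1, 8):
    print(aps, round(smog_grading(aps), 2), special_smog_grading(aps))
```

The loop prints the two scores for APS = 1 to 7 side by side. A count of 53 polysyllables in 10 sentences, passed to special_smog_from_p10, corresponds to the APS = 5.3 example.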

However, my simplifications still have a limitation. While McLaughlin’s formula can predict years of schooling from 3 to 17+, mine can only predict from 5 to 17+.

Fry Graph Made Easy

May 23, 2011

By Nirmaldasan

(nirmaldasan@hotmail.com)

Edward Fry’s Readability Graph shows the approximate grade levels from 1 to 17+. This graph is discussed in William DuBay’s book Smart Language and may be easily found online. Its variables are the percentage of syllables on the x-axis and the percentage of sentences on the y-axis. A simple count of the number of syllables in 100 words yields the percentage of syllables, but finding out the percentage of sentences needs a little arithmetic as 100-word samples seldom end with a full stop.

Suppose a 100-word sample has 4 complete sentences of 92 words and a sentence-fragment of 8 words. Then the percentage of sentences is (4/92) x 100, or about 4.35. Now if the sample has 132 syllables, then the zone on the graph where the two coordinates meet shows a grade level of 8.

The Fry Graph is easy to use, but it can be made easier with different variables. Let us begin with the x-axis. Rudolf Flesch, in an article ‘A New Readability Yardstick’, published in the Journal of Applied Psychology (3 June 1948), suggested a simple syllable-counting procedure: “To save time, count all syllables except the first in all words of more than one syllable and add the total to the number of words tested.” So if a 100-word sample has 132 syllables, there will be only 32 exsyls (excess syllables).

Irving Fang’s Easy Listening Formula and Davis Foulger’s Simplified Flesch Reading Ease use exsyls as a variable. I followed their example and incorporated the exsyls in the Flesch-Kincaid Index. So why should not the Fry Graph take advantage of the exsyls? Replacing the percentage of syllables with the percentage of exsyls is easily done by subtracting 100 from each of the numbers marked on the x-axis. Or simply erase the hundreds digit. So the new numbers on the x-axis are 08, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64, 68, 72, 76, 80 and 82.

Now, let us move on to the y-axis. There is an exact relation between the average number of words per sentence (AWS) and the number of sentences in 100 words (S%). And here it is: AWS = 100 / (S%). If a 100-word sample ends with a full stop, then S% is easier to calculate than AWS. But this happens in the rarest of rare cases. Since 100-word samples are reluctant to end with a full stop, AWS is easier to calculate than S%. Robert Gunning’s Fog Index, Flesch-Kincaid Index, Smith and Senter’s Automated Readability Index are some formulae that use the AWS. So why should not the Fry Graph take advantage of the AWS too?

Using the formula AWS = 100 / (S%), the number of sentences per 100 words may be replaced with the AWS. For example, 4.8 shall be replaced with 20.83 [100 / 4.8]. Likewise, the other numbers on the y-axis shall also be replaced.
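Both axis conversions are one-line calculations, sketched here in Python. The function names are hypothetical; the numbers reuse the examples already given in this post.

```python
def exsyls(syllables_in_100_words):
    """Excess syllables in a 100-word sample: every syllable beyond one per word."""
    return syllables_in_100_words - 100


def sentences_per_100_words(complete_sentences, words_in_them):
    """S%: sentences per 100 words, from the complete sentences of a sample."""
    return 100 * complete_sentences / words_in_them


def aws_from_s_percent(s_percent):
    """AWS = 100 / (S%)."""
    return 100 / s_percent


print(exsyls(132))                                 # the 132-syllable sample
print(round(sentences_per_100_words(4, 92), 2))    # 4 complete sentences of 92 words
print(round(aws_from_s_percent(4.8), 2))           # relabelling the y-axis mark 4.8
print(92 / 4)                                      # AWS counted directly
```

The last line shows the shortcut: for the sample with 4 complete sentences of 92 words, the AWS is 23, found without first working out S%.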

The exsyls are easy to count and the AWS is easy to calculate. With these variables, I hope that the deservedly popular Fry Graph may become even more popular.

Low-literacy Sentence Length

April 22, 2011

By Nirmaldasan
(nirmaldasan@hotmail.com)

Low-literacy readers have problems with vocabulary and syntax. Unfamiliar words and long sentences make text difficult for them. Most of the materials prepared for the average readers have to be rewritten for those with very limited reading ability.

For average readers, keep sentences short. But for low-literacy readers, keep sentences shorter. I recommend a low-literacy sentence length of less than 15 words. Of course, this is arbitrary. You may write a longer sentence if anything shorter may compromise the sense. But let the long sentence be the exception, rather than the rule. With a little bit of practice, you’ll find that a short sentence under 15 words can best communicate to readers with low literacy.

Every word in a sentence affects the readability of a low-literacy text. Interestingly, in such texts, polysyllabic words are not a factor of reading difficulty. In Adult Literacy (Basic Skills And Libraries), Gerry Bramley writes: “Adults with literacy problems often have an extensive ‘social sight’ vocabulary which will naturally include polysyllabic words. For example, an adult with severe reading problems might still be able to walk around a supermarket and recognize such signs, labels and directions as ‘vegetables’, ‘tomatoes’, ‘sausages’ and ‘checkout’. It is monosyllabic words, particularly conjunctions, adjectives and pronouns, which cause difficulties for adult literacy students.” (Page 5, Library Association Publishing, London, 1991)

So words, be they short or long, must be familiar. Writers of low-literacy texts may rely on a useful list of commonest words, found in the third edition of Martin Cutts’ Oxford Guide To Plain English (2009). This book also has a chapter on low-literacy plain English, based ‘on the practical knowledge of Janet Pringle, a Canadian expert on the needs of low-literacy readers’.

The rule of thumb once again: Let the low-literacy sentence length be under 15 words.

