When it comes to measuring readability, there is a whole variety of readability tests. You could go for the long-established Flesch Reading Ease or the later Flesch-Kincaid Grade Level. But what about the Gunning Fog and the New Dale-Chall?
With so many tests out there, does it matter which test you use?
(If you want to see which formulas ReadablePro uses, then check out our Readability formulas page.)
With many tests available, studies measuring readability will often analyse text using a batch of mainstream readability tests. In some cases, studies report no meaningful difference between results across different readability tests, leading them to generate an average readability score rather than committing to a single measure. Indeed, at least at surface level, comparisons of scores across different tests for the same text reveal a moderate to strong correlation.
But this is not always the case: many studies emphasise the variance between scores from different readability tests rather than the similarity. In line with this variance, there is emerging evidence that, depending on the task, some readability tests may be more suitable than others.
Research exploring the appropriateness of different readability tests for specific topic areas or tasks is in its infancy, but some interesting results are already emerging. Here are a few scenarios in which you might use a readability test, and some reflections on which test might be most appropriate.
Deciding which texts are appropriate for students
An area in which readability testing has long been used is helping education professionals decide which books are appropriate for their students. By assessing the readability of texts, education professionals can be more confident that the books they set are pitched at a level appropriate for their students' reading stage.
One way to assess how well different readability formulas gauge the difficulty of reading materials is to compare readability test scores with students' actual Oral Reading Fluency (ORF). ORF refers to oral reading accuracy and speed, and is deemed a principal indicator of 4-11-year-old students' general reading ability. In a recent in-depth study exploring this issue, researchers looked at the link between readability test scores and ORF for a variety of readability tests, comparing the tests' performance across ability levels. For example, comparisons were made between 3rd and 4th grade materials, and between 4th and 5th grade materials. The study reported that while some tests showed promise for accurately measuring readability at particular ability levels, the only reliable measure of text difficulty across all ability groups was the Dale-Chall test.
The Dale-Chall test is based on a list of 763 familiar words commonly known by 4th graders: the more unfamiliar or 'hard' words a text contains (i.e. words that don't make it onto the familiar-word list), the more difficult it is deemed to read. The test has since been updated to become the New Dale-Chall test, which has an expanded word bank of 3,000 words.
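To make the mechanics concrete, the New Dale-Chall formula is conventionally described as combining the percentage of unfamiliar words with average sentence length. Here is a minimal Python sketch; note that the familiar-word list below is a tiny illustrative stand-in for the full 3,000-word list, and the word and sentence splitting is deliberately simplified:

```python
import re

# Tiny illustrative stand-in for the full New Dale-Chall familiar-word
# list of roughly 3,000 words; a real implementation would load the
# complete list.
FAMILIAR_WORDS = {
    "the", "a", "cat", "sat", "on", "mat", "and", "ran", "to", "door",
}

def new_dale_chall_score(text):
    """Approximate New Dale-Chall readability score.

    score = 0.1579 * (% difficult words) + 0.0496 * (words per sentence),
    plus an adjustment of 3.6365 when difficult words exceed 5%.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z]+", text.lower())
    difficult = [w for w in words if w not in FAMILIAR_WORDS]
    pct_difficult = 100 * len(difficult) / len(words)
    score = 0.1579 * pct_difficult + 0.0496 * (len(words) / len(sentences))
    if pct_difficult > 5:
        score += 3.6365
    return score
```

With the stand-in list, a sentence made entirely of familiar words ("The cat sat on the mat.") scores near zero, while a sentence of unfamiliar polysyllables scores far higher, which is the behaviour the formula is designed to capture.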
The findings of the study are supported by previous research and suggest that, for the task of selecting texts appropriate to students' grade level, Dale-Chall is the most suitable readability formula to use.
Using readability tests to assess health information
Another area where readability testing is increasingly common is in the assessment of written healthcare information. For the healthcare sector to ensure patients are getting the right messages about their care and treatment, those messages need to be understandable when provided in written format.
Given the range of readability formulas available, a study was conducted to investigate the appropriateness of different readability tests for assessing written health information.
The study focused on the performance of different readability tests in the assessment of written health information on depression and its treatment. Findings revealed (a) that there was disparity in performance across different formulas of up to six reading grade levels, and (b) that the SMOG Index performed with the greatest consistency across comparisons. The study also reported that the SMOG generated higher levels of expected competencies and used more recent validation criteria for determining grade level estimates. On this basis, the researchers recommended the SMOG Index as the most suitable readability test for assessing health information. This recommendation is in line with guidance from the US National Institutes of Health, who, in their Clear & Simple: Developing Effective Print Materials for Low-Literacy Audiences guidance, identify the SMOG formula for the assessment of health information for people with limited literacy skills.
So, evidence suggests that the SMOG – an acronym for the pleasingly titled 'Simple Measure of Gobbledygook' – is the most appropriate readability measure for assessing written health information. Just a word of caution on the usability of the test: the SMOG readability score is generated from 30 sentences – 10 consecutive sentences near the beginning of the text, 10 in the middle and 10 near the end. Hence, the measure isn't appropriate for short passages of text.
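The 30-sentence sampling described above feeds a simple formula: count the polysyllabic words (three or more syllables) in the sample and apply a square-root transformation. A minimal Python sketch, using a crude vowel-group heuristic for syllable counting (a simplification; real implementations handle silent 'e' and other edge cases):

```python
import math
import re

def count_syllables(word):
    """Crude syllable estimate: count groups of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def smog_grade(text):
    """SMOG grade from 30 sampled sentences: 10 from the beginning,
    10 from the middle and 10 from the end of the text.

    grade = 3.1291 + 1.0430 * sqrt(polysyllabic words in the 30 sentences)
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n = len(sentences)
    if n < 30:
        raise ValueError("SMOG needs at least 30 sentences")
    mid = n // 2
    sample = sentences[:10] + sentences[mid - 5 : mid + 5] + sentences[-10:]
    polysyllables = sum(
        1
        for s in sample
        for w in re.findall(r"[A-Za-z]+", s)
        if count_syllables(w) >= 3
    )
    return 3.1291 + 1.0430 * math.sqrt(polysyllables)
```

The guard clause makes the limitation explicit: with fewer than 30 sentences there is no valid sample, which is exactly why SMOG is unsuitable for short passages.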
Measuring readability of financial disclosures
Consistent with requirements for companies to share their information in a way that is transparent to the reader, readability measures are often used to assess financial documents such as annual reports of a company's financial performance. Here the FOG has typically been used for analysis, often run alongside the Flesch-Kincaid Grade Level or Flesch Reading Ease. However, an examination of the suitability of the FOG formula for assessing the reading difficulty of financial information raises concerns about the appropriateness of this tool. One reason for this perceived unsuitability hinges on the target audience of financial information documents. The researchers argue that those who typically read financial reports – such as annual reports of a company's financial performance – have a high level of education. Therefore, the standard recommendations about keeping readability at a level suitable for the general public are not appropriate for financial documentation.
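For reference, the Gunning Fog index combines average sentence length with the percentage of 'complex' words of three or more syllables. A minimal Python sketch, again using a crude vowel-group syllable heuristic as a stand-in for proper syllable counting:

```python
import re

def count_syllables(word):
    # Crude heuristic: count groups of consecutive vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def gunning_fog(text):
    """Gunning Fog index:
    0.4 * (words per sentence + percentage of complex words),
    where a complex word has three or more syllables.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (
        len(words) / len(sentences) + 100 * len(complex_words) / len(words)
    )
```

The index approximates the years of formal education needed to follow the text on a first reading, which is why a high Fog score on an annual report is taken as a warning sign.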
However, one reason that readability testing is used for financial information is to guard against the risk of deception. Sometimes known as impression management, this is the idea that the way a document is written can produce a positive portrayal of what might not be such a rosy picture. A recent article in Research in International Business and Finance reported an association between annual report readability and earnings management: the higher the discretionary accounting adjustments, the higher the Fog Index score.
Given findings such as these, any recommendation that FOG or other standardised readability measures should not be used for financial information, and that special readability formulas or rules should be developed for this type of information instead, should be treated with caution.
Measuring readability for people with dyslexia
Readability tests have been developed for use with the general population. But can readability formulas still be of value when it comes to measuring the reading difficulty of texts for people with dyslexia?
Dyslexia is a specific learning difficulty that primarily affects reading and writing. It has no impact on intelligence and occurs in 10-15% of the US and UK populations. Studies exploring ease of reading for people with dyslexia have found that, for this population, readability and comprehension are separate: the ease with which text can be read – its readability – is distinct from the ease with which it can be understood – its comprehension. So, what does this mean for readability testing?
Bearing in mind the separate functions of readability and comprehension for people with dyslexia, researchers conducted an eye-tracking study to explore whether people with dyslexia would benefit from text simplification. The study explored the effect of substituting words with shorter or longer, and more or less frequently used, synonyms on comprehension and readability for a sample of dyslexic participants compared to a control group. The study found that short words improved the comprehension, or understandability, of text, while more frequently used words improved readability and resulted in participants reading faster.
This research has some really interesting implications for choosing appropriate readability tests for use with people with dyslexia. For instance, if we consider the use of short words to improve comprehension, a measure such as the Flesch-Kincaid Grade Level is useful here, as the formula incorporates word length, with scores improving when fewer long words are included. When we consider the role of more familiar words in improving readability, the New Dale-Chall measure is useful, as this formula is based on a bank of commonly known words. So, for people with dyslexia, two measures performing two different functions suggest a promising role for readability testing.
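To see why word length moves the Flesch-Kincaid Grade Level, here is the standard formula in a minimal Python sketch (the same crude vowel-group syllable heuristic as before stands in for proper syllable counting):

```python
import re

def count_syllables(word):
    # Crude heuristic: count groups of consecutive vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text):
    """Flesch-Kincaid Grade Level:
    0.39 * (words per sentence) + 11.8 * (syllables per word) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )
```

Because syllables per word is weighted heavily, substituting shorter synonyms ("used the car" rather than "utilised the automobile") lowers the grade, which is exactly the simplification the eye-tracking study found helped comprehension.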
As the body of evidence on the suitability of different formulas continues to grow, we will be increasingly well informed to select and use measures of readability tailored to the aims of the task.