Our second presentation of the symposium provided an overview of the challenges that test developers often face when trying to develop shorter assessments that are reliable, valid and fair.
For a long time, much of the research literature and practice in this area has assumed that longer tests are more effective, particularly in applications that involve decision-making, such as selection. The presentation examined this view, along with the difficulties often faced when developing shorter tests.
Length of test and reliability
One of the assumptions of Classical Test Theory is that longer questionnaires provide greater reliability and that greater reliability should lead to increased validity.
However, the formula used to estimate how reliability changes with scale length (the Spearman-Brown prophecy formula) assumes that every item in a questionnaire provides equivalent shared variance and contributes equally to reliability.
In reality, this assumption rarely holds, as some question items are better than others. In chasing higher internal consistency, longer questionnaires typically end up containing many redundant items.
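To make the length-reliability relationship concrete, here is a minimal sketch of the Spearman-Brown prophecy formula in its standard form: predicted reliability = k·r / (1 + (k − 1)·r), where r is the current reliability and k is the factor by which the scale length changes. The function name `spearman_brown` and the starting reliability of 0.70 are illustrative assumptions, not figures from the presentation.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predict reliability after multiplying scale length by length_factor.

    Assumes, as the formula does, that every added or removed item is
    parallel to the existing ones (equal contribution to reliability).
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (
        1.0 + (length_factor - 1.0) * reliability
    )


if __name__ == "__main__":
    original = 0.70  # hypothetical reliability of the existing scale
    for factor in (0.5, 1.0, 2.0):
        predicted = spearman_brown(original, factor)
        print(f"{factor} x length -> predicted reliability {predicted:.2f}")
```

Under these assumptions, halving the scale lowers the predicted reliability from 0.70 to about 0.54, while doubling it raises the prediction to about 0.82. The prediction only holds if the items really are equivalent, which is exactly the assumption questioned above.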
Shorter questionnaires typically have low internal consistency, because their question items measure different things. However, some well-researched and well-developed questionnaires have been shown to have strong reliability estimates in terms of alternate-form and test-retest reliability (Rammstedt et al., 2018; Saville et al., 2012). Arguably, this is the ideal scenario: a test whose questionnaire scales are broad and short but still reliable.