Validity is the extent to which a test measures what it claims to measure.
Testing is a matter of making judgments about test-takers' competence on the basis of their performance on certain tasks.
These judgments are inferences: a test cannot observe test-takers' ability directly in its natural state, so conclusions about that ability are drawn indirectly from observed performance.
Evidence of test performance is used to draw conclusions about candidates’ ability to handle the demands of the criterion situation.
For high-stakes tests, steps need to be taken to investigate the process by which these conclusions were drawn.
Test validation is this process of investigating the quality of test-based conclusions.
The different types of validity are:
- Face validity
- Content (sampling) validity
- Concurrent validity
- Predictive validity
Face validity is the extent to which a test meets the expectations of those involved in its use – its stakeholders.
Attending to face validity is designed to decrease opposition to a test by ensuring that no stakeholder group is too unhappy with it.
An example of an instrument often cited as having high face validity is Rosenberg's Self-Esteem Scale.
When a test has content (sampling) validity, the items on the test represent the entire range of possible items the test should cover.
To ensure this, individual test questions may be drawn from a large pool of items that cover a broad range of topics.
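The sampling idea above can be sketched in code. The sketch below is illustrative only: the item bank, topic names, and item counts are invented, and it assumes equal numbers of items per topic; it simply shows a stratified draw so that every dimension of the construct is represented on the test form.

```python
import random

# Hypothetical item bank: topic -> pool of item IDs (invented for illustration)
item_bank = {
    "grammar":    [f"G{i}" for i in range(1, 41)],
    "vocabulary": [f"V{i}" for i in range(1, 41)],
    "reading":    [f"R{i}" for i in range(1, 41)],
    "listening":  [f"L{i}" for i in range(1, 41)],
}

def sample_test(bank, items_per_topic, seed=0):
    """Draw the same number of items from every topic so the resulting
    test form covers all dimensions of the construct."""
    rng = random.Random(seed)  # fixed seed for a reproducible form
    return {topic: rng.sample(pool, items_per_topic)
            for topic, pool in bank.items()}

test_form = sample_test(item_bank, items_per_topic=5)
```

Because the draw is stratified by topic rather than taken from the pooled bank as a whole, no dimension can be accidentally omitted from the form.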
Content validity establishes that the measure covers the full range of the concept's meaning, i.e. all dimensions of the concept.
When a test has content validity, it reflects the syllabus on which it is based.
A test is said to have criterion-related validity when it is demonstrated to be effective in predicting a criterion, or indicators, of a construct.
There are two different types of criterion-related validity:
- Concurrent validity
- Predictive validity
Concurrent validity occurs when the criterion measures are obtained at the same time as the test scores.
This indicates the extent to which the test scores accurately estimate an individual's current standing with regard to the criterion.
Predictive validity occurs when the criterion measures are obtained at a time after the test.
A test has construct validity if it demonstrates an association between the test scores and the theoretical trait the test is intended to measure.
Construct under-representation and construct-irrelevant variance are two further major threats to validity.
A test demonstrates construct under-representation if the tasks included in the test fail to measure important dimensions of the construct. If this happens, the results of the test are unlikely to reveal test-takers' ability within the domain the test claims to measure.
A test demonstrates construct-irrelevant variance if its tasks measure variables that are irrelevant to the domain the test claims to measure. This type of invalidity can take two forms:
- Construct-irrelevant easiness
- Construct-irrelevant difficulty