SEM

Session 6  Lecture

Standard Error of Measurement

True Scores / Estimating Errors / Confidence Interval

True Scores

Every time a student takes a test, there is a possibility that the raw score obtained (the observed score) is lower or higher than the score the student should actually have received (the true score). The difference between the true score and the observed score is called the error score:

S_true = S_observed + S_error

For example, Student A has an observed score of 82. His true score is 88, so the error score is 6. Student B has an observed score of 109. His true score is 107, so the error score is -2.
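As a rough illustration only, the same arithmetic can be written as a short Python sketch (the function name error_score is just for illustration):

    # Under the convention used here, S_true = S_observed + S_error,
    # so the error score is S_error = S_true - S_observed.
    def error_score(true_score, observed_score):
        return true_score - observed_score

    print(error_score(88, 82))    # Student A:  6
    print(error_score(107, 109))  # Student B: -2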

 


If you could add up all of the error scores and divide by the number of students, you would have the average amount of error in the test. Unfortunately, the only score we actually have is the observed score (S_observed).

The true score is hypothetical and could only be estimated by having the person take the test many times and averaging the scores (for example, seeing what range the scores fell within over 100 administrations). This is not a practical way of estimating the amount of error in a test.



Estimating Errors

Another way of estimating the amount of error in a test is to use statistics that describe the test itself. One of these is the standard deviation. The larger the standard deviation, the more variation there is in the scores; the smaller the standard deviation, the more closely the scores are grouped around the mean and the less variation there is.

Another is the reliability of the test. The reliability coefficient (r) indicates the amount of consistency in the test, so subtracting r from 1.00 gives the amount of inconsistency. For example, a test with a reliability of .88 has .88 consistency and therefore .12 inconsistency, or error.


Using the formula SEM = SDo × √(1 − r), where SDo is the standard deviation of the observed scores and r is the reliability, gives the Standard Error of Measurement (SEM). This provides an estimate of the amount of error in the test from statistics that are readily available for any test.
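As a rough sketch in Python (the function name standard_error_of_measurement is just for illustration), the formula can be computed directly:

    import math

    # SEM = SDo * sqrt(1 - r)
    def standard_error_of_measurement(sd_observed, reliability):
        return sd_observed * math.sqrt(1 - reliability)

    # SDo = 1.58 and r = .79 give an SEM of about .72,
    # as in the first row of the table below.
    print(round(standard_error_of_measurement(1.58, 0.79), 2))  # 0.72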

The relationship between these statistics can be seen in the table below. In the first row there is a low standard deviation (SDo = 1.58) and good reliability (.79), which gives a small SEM of .72. In the second row the SDo is larger, and the result is a higher SEM of 1.18. In the last row the reliability is very low and the SEM is larger still. As the SDo gets larger, the SEM gets larger; as r gets smaller, the SEM gets larger.
SEM     SDo      Reliability (r)
.72     1.58     .79
1.18    3.58     .89
2.79    3.58     .39
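As a quick check, the same formula can be applied to each row of the table above (a sketch only; small differences from the two-decimal values in the table are due to rounding):

    import math

    # SEM = SDo * sqrt(1 - r) for each row of the table above.
    for sd_o, r in [(1.58, 0.79), (3.58, 0.89), (3.58, 0.39)]:
        sem = sd_o * math.sqrt(1 - r)
        print(f"SDo = {sd_o}, r = {r}: SEM = {sem:.3f}")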

 



Confidence Interval

The most common use of the SEM is the construction of confidence intervals. The SEM is an estimate of how much error there is in a test, and it can be interpreted in the same way as a standard deviation. Sixty-eight percent of the time the true score will lie within plus or minus one SEM of the observed score, so we can be 68% sure that the student's true score falls within +/- one SEM. Within +/- two SEM the true score would be found about 95% of the time. Put another way, if the student took the test 100 times, about 68 of those times the true score would fall within +/- one SEM.
The SEM can be added to and subtracted from a student's observed score to estimate the range in which the true score would fall. The table below shows the confidence interval for a given SEM and observed score. Note how a larger SEM produces a wider range of scores in the confidence interval.

While a test as a whole will have an SEM, many tests also report a separate SEM for different parts (subtests) of the test.

 

SEM     Minus     Observed Score     Plus
.72     81.28     82                 82.72
.72     108.28    109                109.72
2.79    79.21     82                 84.79
2.79    106.21    109                111.79
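A minimal sketch in Python of how these intervals are built, using the SEM values and observed scores from the table above (the helper confidence_interval is illustrative, not from any particular package):

    # 68% confidence interval: observed score plus or minus one SEM.
    # (Using 2 * SEM instead would give the roughly 95% interval.)
    def confidence_interval(observed, sem, n_sems=1):
        return observed - n_sems * sem, observed + n_sems * sem

    for sem, observed in [(0.72, 82), (0.72, 109), (2.79, 82), (2.79, 109)]:
        low, high = confidence_interval(observed, sem)
        print(f"SEM = {sem}: {low:.2f} to {high:.2f}")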
