Latent Rank Theory (Neural Test Theory)

This website introduces the latent rank theory (LRT) / neural test theory (NTT), which is being developed by Kojiro Shojima and his colleagues. LRT is a nonparametric test theory that uses a mechanism based on the self-organizing map (SOM) or generative topographic map (GTM).

Quick Introduction PPT

Example of Can-Do Chart PDF

LRT (NTT) was first presented at the International Meeting of the Psychometric Society in 2007 (IMPS2007) in Tokyo, and many foreign and domestic researchers and test practitioners have expressed interest in it.

Conventional Test Scoring on a Continuous Scale

The academic achievements of students in classrooms around the world are evaluated on the basis of test results expressed on a continuous scale. However, is there a substantial difference in the achievements of two students if one receives, for example, 65 on a 0—100 scale and the other receives 70. It would probably be difficult to distinguish the difference because it is likely to fall within the margin of error. Can we rely on scores on a continuous scale as a measure for evaluating academic achievements or in selecting applicants taking an entrance exam?

Tests simply don’t have sufficient resolution (i.e., “reliability”) for distinguishing the difference between two people with nearly equal abilities. A weighing machine, for example can used to detect even slight differences in weight between two people, but it is practically impossible for a test to distinguish a slight difference in abilities. The reliability of an ordinary weighing machine is sufficient for us to readily believe that Student A is 2 kg heavier than Student B if the machine says that Student A weights 62 kg and Student B weighs 60 kg. However, we would likely doubt that the ability of Student C is certainly slightly higher than that of Student D if Student C scores a 62 on a test and Student D scores a 60. The most that tests can do is to rank the test takers into 5—20 levels. In fact, some teachers assert that it is practical to rank them into only five levels Excellent, Very Good, Good, Below Average, and Needs Improvements.

Although some people argue that expressing scores on a continuous scale is effective from the viewpoint of motivation because the scores generally differ a bit each time a test is taken, an effort should be made to prevent students from swinging between feeling happy and sad in response to such differences because human abilities cannot be drastically changed overnight. While “continuous scoring” is very convenient for precisely matching the number of entrance exam passers to the number of admission slots. However, graded evaluation can reduce the excessively authoritative status of testing, which would encourage the use of factors other than test scores such as school records and interviews.

Test Standardization Theory for Grading Academic Achievement

How can we grade academic achievement? Graded evaluation of academic achievement is inseparably connected to constructing a qualifying test, the result of which can be used to construct an “ability catalog” for each examinee and to certify the level of the test taker’s mastery of each item in the catalog.

Before such a qualifying test for grading academic achievement can be created, a suitable test theory must be developed. Test theories are statistical theories for standardizing tests and administrating them over the years. Item response theory (IRT) is the prevailing test theory, and many large-scale tests such as TOEFL, TOEIC, and NAEP are administered in accordance with IRT. However, since IRT is based on a continuous ability scale and locates examinees’ abilities on that scale (e.g., the paper-based version of TOEFL has a continuous scale ranging from 300 to 677), the IRT is not a satisfactory test theory for standardizing tests to become qualifying tests. A test standardization theory in which examinees’ abilities are evaluated on not a continuous scale but an ordinal scale is necessary for ability grading. Of course, the continuous scale of IRT can be used for graded evaluation by dividing it at several points though where to locate the points would be problematic.

We are developing a latent rank theory (LRT) / neural test theory (NTT) that will be satisfactory for standardizing tests to become qualifying tests. LRT/NTT is a statistical model comprising the self-organizing map (SOM) or generative topographic map (GTM) mechanism and is a test theory for grading academic achievement on an ordinal measurement scale.

Test Theory for Administering Qualifying Tests

The figure illustrates the difference between the traditional continuous ability scale and the LRT ordinal scale for graded evaluation. Every test should have a function for not only selecting passers but also for diagnosing each examinee’s abilities. Therefore, it is particularly important that a test can be used to construct an ability profile for each achievement level (latent rank) and to construct an achievement progress table (can-do chart; see Accessories). It is much easier to make a can-do chart using an ordinal ability scale than a continuous one because describing the can-do statements corresponding to each achievement level is only practical when the ability scale is ordinal. Drawing a can-do chart is equal to controlling the quality of the test. The chart secures accountability of the test by manifesting what the test measures, so only the chart can hallmark the test.

Latent Rank Theory

Neural test theory is a statistical model in which the latent variable is ordinal. Such statistical theory has not been established. Therefore, we propose a framework called “latent rank theory (LRT)”.

 

Latent Rank Theory (LRT

Data Type

LRT-SOM

LRT-GTM

Binary

RN08-01

RN08-06

Polytomous (Ordinal)

RN07-03

In preparation

Polytomous (Nominal)

RN07-21

In preparation

Continuous

In preparation

In preparation

 

The NTT is a latent rank theory that uses the self-organizing map (SOM) or generative topographic map (GTM) mechanism to estimate the NTT model. The model with the GTM mechanism (LRT-GTM) is a batch-type learning model, and computation time for estimation is much faster than that of the model with the SOM mechanism (LRT-SOM). In addition, the calculation results are invariant with the LRT-GTM model, while they vary slightly with the LRT-SOM model, although the fluctuations are negligible. However, the item reference profile (see Feature) obtained with the LRT-SOM model is a little smoother than that with the LRT-GTM model.

Go to Main Page