
The LSAT Curve | Test-Equating at LSAC

This post is Part 1 of "The LSAT Curve" series of blog posts. Click here for links to each part of the series.

There's a lot of confusion about the LSAT's curve. The LSAT is not actually scored to a curve, but most test-takers think it is.

This series is my effort to explain LSAC's process of test-equating, raw score conversions, percentiles, and why the test isn't actually curved. Because I dislike statistics (and because most of you probably do too), this blog post involves very little math. However, it might involve some thinking.

You've been warned.

LSAC's Associate Director of Psychometric Research, Lynda Reese, recently wrote the following to one test-taker who asked about the curve (I've added the links):
[T]he LSAT is not graded to a curve...Rather, for every form of the LSAT, a statistical process called test equating is carried out to adjust for minor differences in difficulty between different forms of the test. Specifically, the item response theory (IRT) true score equating method is applied to convert raw scores (the number correct) for each administration to a common 120 to 180 scale. A detailed description of this methodology can be found in...Applications of Item Response Theory to Practical Testing Problems...The equating process assures that a particular LSAT scaled score reflects the same level of ability regardless of the ability level of others who tested on the same day or any slight differences in difficulty between different forms of the test. That is, the equating process assures that LSAT scores are comparable, regardless of the administration at which they are earned.
I'm not a psychometrics expert, but I decided to go ahead and learn more about how LSAC constructs the exam and ensures different PrepTests are of relatively equal difficulty.

I looked up the book Ms. Reese referenced (and believe me, it wasn't exactly a walk in the park).

The following is my understanding of how LSAC creates each LSAT and goes about the test-equating process. Feel free to leave questions and comments, especially if you have a decent understanding of statistics, psychometrics, etc. LSAC's also welcome to leave comments. They haven't commented on the blog yet, but the door's always open.

If you're new to the LSAT, see the LSAT FAQ for more on the basics before getting into all the details.

If you're not new to the LSAT, read on, starting with these definitions of basic terms and concepts:

Conversion Chart: Chart at the end of each PrepTest that helps you translate a raw score into a scaled score on the 120-180 scale

Percentile: The percentage of test-takers whose scores fall below yours. If you score in the 50th percentile, you scored higher than half of all test-takers. If you score in the 97th percentile, you scored higher than 97% of all test-takers.

PrepTest: Previously administered and released LSAT exam

Psychometrics: The study of psychological measurements. As far as we're concerned, it's the "science" of standardized testing.

Raw Score: The number of questions you answer correctly on the LSAT

Test-equating/Pre-equating: "a statistical method used to adjust for minor fluctuations in the difficulty of different test forms so that a test taker is neither advantaged nor disadvantaged by the particular form that is given" - LSAC (PDF).

Test form: A particular LSAT exam
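
For those who like to see definitions in action, here's a minimal sketch of the Percentile and Conversion Chart definitions above. The chart values and sample scores are invented for illustration (a real chart lists every scaled score from 120 to 180, and each PrepTest prints its own), and the function names are hypothetical.

```python
from bisect import bisect_left

def percentile(score, all_scores):
    """Percentage of test-takers whose scores fall strictly below `score`."""
    ordered = sorted(all_scores)
    below = bisect_left(ordered, score)  # number of scores lower than `score`
    return 100.0 * below / len(ordered)

# Made-up, coarse conversion chart: (lowest raw score that earns it, scaled score).
# Rows are listed from the highest scaled score down.
HYPOTHETICAL_CHART = [
    (99, 180),
    (90, 170),
    (75, 160),
    (57, 150),
    (40, 140),
    (0, 120),
]

def raw_to_scaled(raw, chart=HYPOTHETICAL_CHART):
    """Look up the scaled score for a raw score (number of questions correct)."""
    for min_raw, scaled in chart:
        if raw >= min_raw:
            return scaled
    raise ValueError("raw score cannot be negative")

if __name__ == "__main__":
    sample_scores = [150, 150, 160, 170]   # invented scaled scores
    print(percentile(160, sample_scores))  # share of the sample scoring below 160
    print(raw_to_scaled(80))               # scaled score for 80 questions correct
```

Notice that the chart lookup never touches anyone else's score, while the percentile does. That distinction is the heart of this series.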

Scores have to be meaningful and consistent
The LSAT is a standardized exam. This means that a 160 on the February 2010 LSAT should be equivalent to a 160 on the June 2010 LSAT, which should be equivalent to a 160 on the October 2010 LSAT, etc. Law schools can't be bothered to look at the particular Logic Games, Logical Reasoning, and Reading Comprehension sections on various exams to see if students with identical scores actually performed at different levels. They can't be bothered to look at test-takers' raw scores, either. That's why they have equated numerical scores on the 120-180 scale, after all.

Administering the same questions over and over wouldn't work
One theoretical (and stupid) way to ensure that all scores were comparable would be to create only one LSAT and administer it over and over. This would ensure that all test-takers were treated equally and that the "raw score conversions" were always fair. However, this ignores the fact that test-takers would share information with each other.

People who took the February 2010 LSAT would give or sell information about its questions to people taking it in June 2010, and so on. Under such a system, the later one took the exam, the more inflated his/her score would be, on average. Thus, LSAC can't just keep giving the exact same questions exam after exam.

For this reason, LSAC needs to create different exams for each released test administration and make them of relatively equal difficulty. A 160 on one LSAT (aka "test form") needs to be equivalent to a 160 on any other LSAT.
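
For the statistically curious, here's a rough sketch of the idea behind the "IRT true score equating" that Ms. Reese mentions. It is an illustration under my own assumptions, not LSAC's actual procedure: I assume a three-parameter logistic (3PL) item model, a conventional scaling constant of 1.7, and five invented items per form. All function names and item parameters are hypothetical. The idea is to find the ability level at which a test-taker would be expected to earn a given raw score on the new form, then ask what raw score that same ability level would be expected to earn on a reference form.

```python
import math

def p_correct(theta, a, b, c):
    """Assumed 3PL model: chance that a test-taker of ability theta answers an item
    correctly (a = discrimination, b = difficulty, c = guessing floor)."""
    return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))

def true_score(theta, items):
    """Expected number of correct answers on a form at ability theta."""
    return sum(p_correct(theta, a, b, c) for a, b, c in items)

def theta_for_raw(raw, items, lo=-8.0, hi=8.0, tol=1e-9):
    """Find the ability whose expected raw score equals `raw`, by bisection.
    This works because the expected score rises steadily with ability."""
    if not true_score(lo, items) < raw < true_score(hi, items):
        raise ValueError("raw score is outside the range this method can equate")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if true_score(mid, items) < raw:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def equate(raw_on_new_form, new_items, base_items):
    """Raw score on the base form that reflects the same ability level."""
    theta = theta_for_raw(raw_on_new_form, new_items)
    return true_score(theta, base_items)

# Invented (a, b, c) parameters; the second form is slightly harder on every item.
BASE_FORM = [(1.0, b, 0.2) for b in (-1.5, -0.5, 0.0, 0.5, 1.5)]
HARDER_FORM = [(1.0, b + 0.3, 0.2) for b in (-1.5, -0.5, 0.0, 0.5, 1.5)]

if __name__ == "__main__":
    # 3 correct on the harder form is worth a bit more than 3 on the base form.
    print(equate(3, HARDER_FORM, BASE_FORM))
```

Nothing in this calculation depends on how other people performed on test day. It depends only on the properties of the questions, which is why a harder form can be more forgiving without anyone being graded against anyone else. Once raw scores from different forms are on the same footing, a single conversion to the 120 to 180 scale can be applied.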

***

Read on for Part 2: Why the LSAT Isn't Scored on a Curve: Myth and Fact

Photo by hname / CC BY 2.0

12 comments:

  1. 5th percentile means you only score better than 5% of test takers, right?

  2. What's the source of the Lynda Reese quote? Thanks!

  3. @Anonymous 2/25 - Yes.

    @Anonymous 2/26 - As stated in the blog post, it's from an email.

  4. From my point of view, there is no curve and no difficulty level in the LSAT. It is the test-takers' level of practice and understanding that influences their scores. For example, somebody may be more experienced in Logical Reasoning and less in Logic Games, or vice versa.

  5. Wow... That really was a lot to think about. But it is great to hear the information, and the reminder that it is not on a curve.

  6. It seems that in order to properly score an exam, the test-makers need to know how difficult each question is.

    But in order to know how difficult a question is, they have to know how test-takers performed on similar questions in the past. And that requires properly scoring an exam, which requires knowing how difficult each question on the exam is, etc.

    It seems like this is a chicken-and-egg situation. What am I missing?

  7. You're alluding to a very interesting question:

    How did the test-makers measure the difficulty of each question on PrepTest 1 (June 1991)?

    Who did they test those questions on?

    My guess would be that they tested them on a pool of people about whom they knew some kind of factor loosely correlated with LSAT performance (such as GPA).

    Perhaps they tested them on people who took the pre-June 1991 LSAT.

    I know that doesn't really answer your question (b/c how did they measure those questions, etc., etc.) It's worth looking into. Why don't you email LSAC and let us know what you find?

    I'll probably look into it at some point in the future.

    In the meantime, keep reading this series. It may address what I'm sure are some other questions you have about how the LSAT's constructed.

    Replies
    1. Perhaps the first test WAS graded on a curve?

    2. There is a section on the LSAT that is an ungraded portion. This section is referred to as "experimental." This experimental section is likely used to gauge the difficulty of test questions prior to their administration as part of a graded section.

  8. Hi.

    I was just wondering if a potentially low LSAT score harms your chances at admission if you know you can do significantly better next time? In other words, should I cancel my score or risk finding out I got something like a 144 if I know I am capable of a 160, but just wasn't prepared enough the first time? This only pertains to Canadian schools for me - and almost all take your best score, not an average.

    Replies
    1. I am curious to find out my score, and law schools will see that I cancelled anyway if I do, but I feel that seeing my score, no matter how low it is, might help me gauge my test-day performance and improve my skills in a way that prep tests cannot mimic. What should I do?

  9. So, to clarify, the performance of test takers on a particular LSAT will not affect the performance of any particular one of those test takers? Or, in other words, the scale is determined before the raw scores of the test takers are determined?
