Periodic Assessments and Diagnostic Reports
Case Studies in Mathematics and Literacy
Intervention Programs
Betsy Taleporos
Former Director of Assessment, America's Choice
Abstract
This paper discusses the formative use of periodic assessments
as they were developed and are in use by America’s Choice (Pearson) in
its mathematics and language arts intervention programs. It is a
practical case study of the use of design principles in creating
assessments that are useful for classroom teachers and, by the
nature of their design, provide diagnostic information that is
instructionally relevant. The use of these measures varies with the
program but all of them are designed to highlight misconceptions or
common error patterns. It is important to recognize that
misconceptions occur in both content domains, as they do in other
domains. Uncovering misconceptions or error patterns offers
tremendous insight into a formative use of assessments, since the
reasons behind answering a question incorrectly can directly inform
instructional practice. This approach is also underscored by some of
the suggestions in the lead article in this issue of ED.
Overview
Today’s assessment landscape is changing, but remains dominated
by large-scale testing which, as indicated by the lead article in
this issue, is fraught with problems and not always in sync with the
needs of the classroom teacher. Current state test reports give
information that is generally broader in scope than what a classroom
teacher needs to help students improve in the learning targeted by
the instruction they are given directly and specifically.
State test reports do not lend themselves to diagnosis or to
focusing on the specific needs of students in a way that lets
teachers plan to meet those needs in their day-to-day practice. The
information is not provided in a timely manner, often arriving
months after students take the tests. Even where teachers can look
at the results of their current class, rather than last year’s, the
information is generally too broad to be of practical use. Further,
the types of tasks students are given to work on in most state
testing situations rarely tap deep understanding.
While much is wrong with the current system, the new consortia
for assessing the Common Core State Standards are making attempts to
correct some of the current flaws, including enhanced item types and
an emphasis on formative assessment during the school year.
Currently, for both consortia, the formative assessments are
optional, and outside the formal accountability measurement, but
their value is clearly recognized. Whether the fact that they are
optional, and don’t count in a final accountability score, will
weaken their impact remains to be seen.
The new item types, however, are bound to make an impact on
classroom instruction, where so much time is spent preparing for
the annual accountability tests. If those tests are significantly
different from the ones currently used by most states, then the
impact will undoubtedly be positive. Nonetheless, the system will
still need a continual flow of information on how well students are
learning what they are being taught. The need for formative
assessment will remain as critical as it is now with the current
individual state testing systems.
Classroom assessments have their own set of problems as well.
Teachers receive little guidance in test construction in their
pre-service training or their continuing professional development. The
resulting assessments may not be as rigorous as needed, and the
quality of the items included may not be optimal. Nonetheless, they
are a reflection of what is valued by the teacher, a measure of the
intended curriculum as well as of the enacted curriculum.
This paper contains figures which illustrate some of the
features of both the mathematics and language arts assessments. The
figures also include screenshots of parts of the online reports,
which are at the heart of the assessment system. Because real-time
access to the reports is proprietary, only screenshots could be
shown in this paper.
Mathematics Navigator
The Program
Mathematics Navigator is an intervention program, designed for
students who need some additional time and focused teaching in
specific areas of mathematics. There are 26 modules in this
program, each focusing on a different targeted area of mathematics,
such as Place Value, Fractions, Data and Probability, Exponents,
Expressions and Equations, Rational Numbers, to name but a few.
The Assessments
Each of these modules has a pretest and a posttest, as well
as checkpoint assessments. There is also an omnibus screener for
each grade level to help determine students’ needs for particular
modules. Figure 1 shows the assessments that are part of the
Mathematics Navigator program. Figure 2 lists the reports and shows
the levels of aggregation possible for each of them. It also shows
the purpose of each of the reports.
The testing reports are essential online tools for the teacher
to use in implementing the program. The reports focus on diagnosis
and performance levels.
Diagnostic Reports
One report, called a roster report, shows the answer that
each student gave to each question, along with a listing of the
misconceptions those answers indicate the student holds. Each
question number on the report is hyperlinked so that the teacher
can click on it to see the actual question and its answer choices,
and thus the specific choice a student made. Part of the roster
report is shown in Figure 3. Each student’s choice is displayed,
and is shaded yellow if incorrect. The teacher can get a bird’s-eye
view of how well a whole class did on an assessment not only by
looking at the proportion of item choices shaded yellow, but also
by looking at the quantitative information on the roster report
itself.
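The shading logic lends itself to a brief illustration. The sketch below, in Python, shows one way a roster-report row could be assembled, with a flag marking each cell to be shaded yellow; the function and field names are hypothetical, not the proprietary implementation.

```python
# Illustrative sketch only: assembles one roster-report row with a
# shading flag on each incorrect choice. Names are hypothetical.

def roster_row(student_answers, answer_key):
    """student_answers: {item_id: chosen_option};
    answer_key: {item_id: correct_option}.
    Returns each choice plus whether its cell should be shaded yellow."""
    return {item: {"choice": choice,
                   "shade_yellow": choice != answer_key[item]}
            for item, choice in student_answers.items()}
```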
Item Analysis
The roster report also shows the percent of students getting
each item correct, and the percent choosing each answer option. The
report also highlights individual questions where the majority of
students got the same wrong answer. These pieces of information
help teachers get a broad view of the needs of the whole group of
students in the Mathematics Navigator class. The item analysis
information is shown in Figure 4.
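As an illustration of the item analysis just described, the following sketch computes the percent correct and the percent choosing each option, and flags items where a majority of students converged on the same wrong answer. The data layout and names are assumptions for the example.

```python
# Illustrative sketch of the item analysis described above; the data
# layout is assumed, and the real reporting system is proprietary.
from collections import Counter

def item_analysis(responses, answer_key):
    """responses: {student_id: {item_id: chosen_option}} for a nonempty
    class; answer_key: {item_id: correct_option}."""
    n_students = len(responses)
    stats = {}
    for item, correct in answer_key.items():
        choices = Counter(ans[item] for ans in responses.values()
                          if item in ans)
        # Percent of students getting the item correct, and percent
        # choosing each answer option.
        pct_correct = 100 * choices[correct] / n_students
        option_pcts = {opt: 100 * n / n_students
                       for opt, n in choices.items()}
        # Flag items where a majority chose the same wrong answer.
        majority_wrong = any(n > n_students / 2
                             for opt, n in choices.items()
                             if opt != correct)
        stats[item] = {"pct_correct": pct_correct,
                       "option_pcts": option_pcts,
                       "majority_wrong": majority_wrong}
    return stats
```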
Test Design – Focus on Misconceptions
The tests are designed in a very purposeful way. The items are
all multiple choice, measuring key concepts taught in the module. The
wrong answer choices are coded to common misconceptions so a
student’s pattern of answer choices can be used to describe the
misconceptions that they have. For each misconception, there are at
least four opportunities for a student to choose an option that
reflects it. If the student typically chooses the wrong answers that
reflect the misconception, the report will show that they have that
particular misconception.
As a design issue, the minimum of four opportunities was chosen
somewhat arbitrarily, based on experience and industry-standard
approaches. This number is thought to provide a stable enough
estimate, giving a chance to see a recurring pattern of errors that
reflect the given misconception. If a student selects the answer
reflecting a given misconception at least 75% of the time, we can
be fairly confident that they have that misconception. If it is
chosen between 50% and 74% of the time, we conclude that they may
have the misconception, but we are not as sure as when they
systematically select the wrong answer reflecting it. Anything less
than 50% does not permit a conclusion that the student is
systematically reflecting the given misconception.
Using the percent of times the student picks the answer
reflecting a given misconception, the report will show either that
the student definitely has the misconception, possibly has the
misconception, or that there is no evidence of a pattern indicating
that the student has the misconception. Figure 5 shows this report.
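The decision rule can be made concrete with a short sketch. The thresholds below come directly from the design described above; the function name and inputs are illustrative.

```python
# Sketch of the misconception decision rule described above. Each wrong
# option is assumed to be coded to a misconception, with at least four
# opportunities per misconception by design.

def classify_misconception(times_chosen, opportunities):
    """Status label from the share of coded distractors the student chose."""
    if opportunities < 4:              # below the design minimum
        return "insufficient data"
    rate = times_chosen / opportunities
    if rate >= 0.75:
        return "has misconception"     # systematic pattern
    if rate >= 0.50:
        return "may have misconception"
    return "no evidence of pattern"
```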
Grouping of Students
The reports also provide a listing of students by their
misconception patterns, which is often useful to teachers in setting
up small-group instruction. This information gives teachers a
diagnostic understanding of their students and can be used to guide
instruction for them. Teachers can group together students who have
similar misconceptions, or can pair a student who has a given
misconception with a student who understands the underlying concept.
Checkpoint Assessments
The checkpoint assessments are provided several times over the
course of the module. Each includes a debugging activity in which the
students are asked to review each wrong answer and determine the
thought pattern that would have led to the choice of that wrong
answer. This is an additional design feature that enhances the
diagnostic value of the checkpoint assessments as the discussion
focuses on the thought patterns that exemplify misconceptions.
Figure 6 shows a report of the checkpoint assessments. The
number of correct answers is transformed, using predetermined
cutpoints, into an indication that the student is doing well
(shaded green), may be having some difficulties (shaded yellow), or
is having a great deal of trouble (shaded red). The cutpoints vary
across the checkpoint assessments and were determined by expert
judgment for each one. While the checkpoints are not scaled
together in a psychometric analysis, the judgment methodology
simply indicates the student’s status on the given checkpoint, and
whether that relative status has changed from one checkpoint to
another.
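A minimal sketch of the cutpoint logic follows. The green/yellow/red categories come from the report design; the specific cutpoint values in the example are placeholders, since the actual cutpoints were set by expert judgment for each checkpoint.

```python
# Hypothetical sketch of the cutpoint shading. The actual cutpoints vary
# by checkpoint and were set by expert judgment; the values used in the
# example call are placeholders.

def checkpoint_status(num_correct, yellow_min, green_min):
    if num_correct >= green_min:
        return "green"    # doing well
    if num_correct >= yellow_min:
        return "yellow"   # may be having some difficulties
    return "red"          # having a great deal of trouble

# Example: with judgment-set cutpoints of 5 and 8 on a checkpoint,
# a student with 6 correct answers would be shaded yellow.
print(checkpoint_status(6, yellow_min=5, green_min=8))  # -> yellow
```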
Literacy Navigator
The Program
Literacy Navigator is also an intervention program, designed
for students who are having trouble keeping up with their regular
classroom instruction and need additional focused teaching around
informational text comprehension. It consists of a foundation module
and several follow-on modules, each providing instruction in
comprehension of informational text.
The assessments for Literacy Navigator (Figure 7) are also
very carefully designed, and the reports feature diagnostic
information similar to that just described for Mathematics
Navigator. The roster reports are organized just as they are for
Mathematics Navigator. They list what each student gave as an
answer for each question, with a hyperlink to the question itself
so that the teacher can view the question and its option choices.
The texts used are not provided online; teachers must refer back to
the actual tests themselves to view the text, but the items are
viewable through the hyperlinks.
These roster reports also show the percent of students
answering each item correctly, and the percent choosing each
option. Wrong option choices are shaded yellow. In addition, any
item where a large number of students chose the same wrong answer
is highlighted so that teachers can focus on whole-class
misunderstandings.
The assessments and reports follow the pattern established
for Mathematics Navigator, except that instead of a grade level
screener there is a test to confirm the appropriateness of the
grade level chosen for a particular group of students. Figure 8
shows the reports provided for Literacy Navigator. Please note the
similarities to the structure for Mathematics Navigator.
In Literacy Navigator, program-objective subscores are shown
on the roster report as well as total scores. The test is broader
than the Mathematics Navigator tests, where the total score relates
to only one specific strand of mathematics. The subscore
information is finer grained than a total comprehension score.
Information is given about each student’s ability to accurately
retrieve details, make inferences, link information, deal with
issues of pronoun reference, handle mid-level structures such as
cause and effect, sequence, and problem/solution, and apply word
study concepts.
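The subscore computation itself is straightforward, as the sketch below suggests. The objective labels paraphrase those listed above, and the mapping of items to objectives is assumed for illustration.

```python
# Illustrative sketch of objective subscores plus a total score. The
# objective labels paraphrase the text; the item-to-objective mapping
# is assumed.

def subscores(student_answers, answer_key, item_objective):
    """item_objective: {item_id: objective_label}. Returns percent
    correct per objective and a total comprehension score."""
    tallies = {}                      # objective -> (correct, attempted)
    for item, choice in student_answers.items():
        obj = item_objective[item]
        correct, attempted = tallies.get(obj, (0, 0))
        tallies[obj] = (correct + (choice == answer_key[item]),
                        attempted + 1)
    report = {obj: 100 * c / n for obj, (c, n) in tallies.items()}
    total_correct = sum(c for c, _ in tallies.values())
    total_items = sum(n for _, n in tallies.values())
    report["total"] = 100 * total_correct / total_items
    return report
```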
The primary diagnostic information comes from an analysis of
error patterns, analogous to the misconception analysis for
mathematics. Each option choice is coded as a non-text-based
response, a text-based misread, or a text-based response that is
accurate but not the right answer to the question posed. The
percent of wrong answers falling into each of these categories is
then reported for each student to show the kind of error being
made. This is extremely useful information for a teacher: two
students with the same number of errors pose two different
instructional challenges if one student’s errors are all
non-text-based while the other’s are text-based but simply not the
right answer. Figure 9 shows this information.
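The tally behind this report can be sketched as follows, assuming each wrong option has been coded to one of the three categories; the data structures are illustrative.

```python
# Sketch of the error-pattern tally described above; the three category
# codes come from the text, and the data structures are assumed.

CATEGORIES = ("non-text-based", "text-based misread",
              "text-based, not the answer")

def error_profile(wrong_choices, option_codes):
    """wrong_choices: [(item_id, option), ...] answered incorrectly;
    option_codes: {(item_id, option): category}.
    Returns the percent of wrong answers in each category."""
    counts = {cat: 0 for cat in CATEGORIES}
    for key in wrong_choices:
        counts[option_codes[key]] += 1
    total = len(wrong_choices)
    return {cat: (100 * n / total if total else 0.0)
            for cat, n in counts.items()}
```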
Design Issues
From a design perspective, there are at least four
opportunities for a student to choose an option that falls into one
of the three kinds of errors. This allows for stable estimation of
the pattern of errors a given student is making in response to a
specific level of text complexity.
One of the most important influences on these designs was that
the assessments were developed while the curricula for both the
mathematics and literacy programs were themselves being developed.
Working alongside the curriculum developers allowed the assessments
to be aligned with the intentions of the curriculum designers and
allowed the diagnostic approaches within the curricula themselves
to be captured. What emerged was a very carefully designed and
aligned approach in which the reports follow the design of the
curriculum and the assessments in a way that makes them maximally
useful to teachers as they proceed with instruction.
Summary
The diagnostic use of curriculum-embedded assessments is an
important ingredient in a successful formative assessment program.
The fundamental design principle that these assessments illustrate
relates primarily to the issue of validity, as discussed at some
length in the lead article in this issue. If a test is to be valid
for serving classroom teachers, it must be designed with a
carefully planned set of reports that address their needs. Teachers
need the assessments to help them plan differentiated learning,
finding the strengths and weaknesses their students have so they
can be addressed on an individual pupil basis.
It is the design elements of these reports that will make or
break the use for which the information is intended. Showing
individual student misconceptions or error patterns is the key
ingredient of the reports described here, and it is critical to the
teacher’s ability to group students appropriately for instruction,
to address identified needs, and to tailor additional formative
daily assessment activities to the underlying misconceptions or
response patterns that students are displaying.
In addition to the misconception and error patterns, the design
of the reports gives teachers a bird’s-eye view of whole-class
performance by providing overall item analysis information with
hyperlinks that let teachers view items as they examine how the
whole class performed. Highlighting places where many students
chose the same wrong answer, and viewing the item with its option
choices in a direct and immediate manner, allows the teacher to see
larger performance gaps that can be addressed.
The tests, obviously, must be carefully designed to allow for
the generation of reports that support valid inferences about the
student behavior reflected in them. Designing the wrong answer
choices in the preplanned way used for both the mathematics and
literacy assessments allows the teacher, first, to see whether
students are demonstrating reliable error patterns and, second, to
have those reliable patterns reported in a way that allows for
customizing classroom practice.
About the Author
Betsy Taleporos has recently retired as the Director of Assessment for
America’s Choice. She managed all the research, evaluation, and
assessment work for this organization, which has had a major impact
on the standards-based education movement and on national school
reform efforts. She was responsible for the development of
approximately 300 mathematics and literacy assessments, including
performance-based and multiple-choice tests, all of which are
curriculum embedded and directly linked to classroom practice.
Prior to joining America’s Choice, she managed large-scale test
development projects in English Language Arts and Mathematics for
several major national test publishers. Before that, she directed
assessment in New York City, managing test development,
psychometrics, research, analysis, administration, scoring,
reporting, and dissemination of information.
In that capacity she also served as the New York City site
coordinator for the New Standards project. Betsy brings a strong
background and expertise in practical application, in aligning
instruction, standards, and assessments, and in academic research
and teaching at the graduate and undergraduate levels at New York
University, Adelphi University, and Long Island University.