EP 300 EDUCATIONAL MEASUREMENT AND EVALUATION
Introduction
This course introduces students to the concepts of educational measurement, monitoring, assessment and evaluation as they are used in the education system. Specifically, the course intends to equip students with the basic knowledge and skills to:
i) Measure and assess educational outcomes
ii) Construct different types of test items and measurement scales
iii) Administer examinations, and score and grade examination results
iv) Judge critically the practice of examinations in Tanzania
v) Summarize examination results
vi) Analyze the use and misuse of examinations
MODULE ONE
1. BASIC CONCEPTS IN MEASUREMENT AND EVALUATION
In obtaining basic information with regard to classroom instruction and instructional decisions, the following basic terms are used: testing, measurement, evaluation and assessment.
Test
A test is an instrument or device used to gather information about students’ achievements or other cognitive skills. It is a set of questions set by the teacher to measure a sample of students’ behavior. It is like a balance used to obtain weight, or a foot rule used to obtain the height of an object or person. A test determines the degree to which an individual student performs in comparison with other students.
A test can be one of the following:
i. Diagnostic test: used to determine areas of difficulty encountered by the learner, to enable the teacher to take corrective measures.
ii. Aptitude test: a test of a person’s ability to learn or perform a task.
iii. Achievement test: a test given to the learner to determine how much the learner has learned.
An effective teacher will make use of test results to make instructional decisions. Some of these instructional decisions are:
i. Determining the appropriateness of the teaching plans
ii. Grouping students for effective learning
iii. Determining learning difficulties that students face
iv. Identifying students who are underachieving
v. Determining the effectiveness of instruction
vi. Identifying students who have a poor self-understanding
vii. Identifying students who are in need of special assistance
viii. Guiding and counseling students to choose a career
ix. Selecting and promoting students from a lower level to a higher level, e.g. from primary school to secondary school.
Types of tests
Information that teachers collect and use in their classrooms comes from assessment procedures that are either standardized or nonstandardized.
A standardized test is a type of test which is administered, scored and interpreted in the same way for all students across schools, districts, states or the nation. Such tests are administered to students in many different classrooms but always under identical conditions of administration, scoring and interpretation. The main reason for standardizing testing is to ensure that the testing conditions and scoring procedures have a similar effect on the performance of students in different schools and states.
Examples of Standardized Tests
Aptitude tests. The aim of these tests is to measure general mental capacity or efficiency to learn a given content. They are used to predict future performance. These tests include language proficiency, mathematics and comprehension tests. Examples include the Potential Ability Test (PAT), the matriculation examination and the mature age entry examination.
Achievement tests. These measure students’ achievement of the objectives of instruction in a given course.
Intelligence tests. IQ (Intelligence Quotient) tests assess areas such as reading and reasoning. An important example is the Stanford-Binet intelligence test. For instance, if you want to be recruited by ITV, TBC or any radio station, you will be subjected to an interview that requires you to read clearly to listeners, so that your ability can be identified.
Nonstandardized (teacher-made) tests
These are developed for a single classroom with a single group of students and are not used for comparison with other groups. Nonstandardized tests are prepared by the subject teacher.
Measurement
Measurement is the process of quantifying or assigning a number to a performance or trait. It involves assigning numerical scores to performance on a quiz or test, so that numerical scores represent the individual’s performance or trait. Put differently, it is a process of assigning numbers to test performance according to a specific rule, such as counting the correct answers. When we measure, we want to answer the question: how much has a student performed?
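The rule "count the correct answers" can be illustrated with a short Python sketch. The four-item quiz, the answer key and the function name `score_test` are hypothetical and serve only to show how a number is assigned to a performance.

```python
def score_test(answer_key, responses):
    """Return the raw score: the number of responses that match the key."""
    return sum(
        1 for item, correct in answer_key.items()
        if responses.get(item) == correct
    )

# Hypothetical four-item quiz and one student's responses.
answer_key = {"Q1": "B", "Q2": "D", "Q3": "A", "Q4": "C"}
responses = {"Q1": "B", "Q2": "A", "Q3": "A", "Q4": "C"}

raw_score = score_test(answer_key, responses)
percentage = 100 * raw_score / len(answer_key)
print(raw_score, percentage)  # 3 75.0
```

The result is only a number. It says how much the student performed, not how good that performance is.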
Evaluation
Evaluation is the process of attaching a value judgment to a performance or measure. It involves judging whether the student is performing at a high or low level. For example: Alfera, who scored 70 out of 100 in mathematics, was above average, showed steady progress and is a bright student. Evaluation aims at answering the question: how good? Evaluation is more comprehensive, as it includes testing, measurement and non-formal observation. Evaluation is not always based on measurement. Sometimes it is based on information acquired from non-measurement techniques such as observation.
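As a rough illustration of how a value judgment is attached to a measure, the sketch below compares one score with the class average. The class marks and the function name `evaluate` are assumptions made for the example. A real evaluation would also draw on progress over time and on observation.

```python
from statistics import mean

def evaluate(score, class_scores):
    """Attach a simple value judgment to a score relative to the class mean."""
    average = mean(class_scores)
    if score > average:
        return "above average"
    if score < average:
        return "below average"
    return "average"

# Hypothetical marks out of 100 for a class of five students.
class_scores = [45, 52, 58, 61, 70]
print(evaluate(70, class_scores))  # above average
```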
The functional roles of evaluation procedures in the classroom context are as follows:
Placement evaluation
This is concerned with a student’s entry performance and intends to check whether the student possesses the necessary knowledge and skills to begin the programme. It can also suggest skipping some topics which are already well known to the students, placing a student in a special class, or placing a student in a more advanced course of study.
Formative evaluation
This is the continuous evaluation of classroom teaching in order to find out whether students are following the lesson well. It is carried out during instruction for the purpose of improving or monitoring learning progress. The purpose is to provide continuous feedback to parents, students and teachers with regard to success or failure.
Diagnostic evaluation
This is a highly specialized procedure aimed
at detecting learning difficulties that are left unresolved during formative
evaluation. When a student continues to perform poorly in any learning task, a
more detailed diagnosis is needed. The purpose of diagnosis is to find out the
causes of recurring learning difficulties so as to formulate a plan for
remedial actions.
Summative evaluation
Summative evaluation is done at the end of a unit of instruction or at the end of the course so as to determine whether the instructional objectives have been achieved. Summative evaluation is used for grading courses, certifying students’ mastery of intended learning outcomes, and judging the appropriateness of course objectives and the effectiveness of the school programme.
Purpose of Evaluation
Generally, the purpose of evaluation is to determine performance, i.e. the skills attained and knowledge learned in the educational program. Specifically, the purpose of evaluation is to:
i. Monitor/check students’ progress
ii. Strengthen desired outcomes and behavior
iii. Guide students to choose a career
iv. Group and place students
v. Aid in curriculum/instruction improvement
vi. Assess teachers’ effectiveness and efficiency
Effectiveness is the degree to which the teacher achieves the goals.
Efficiency refers to the achievement of the goal at the lowest cost.
Assessment
Assessment is the process of collecting, synthesizing and interpreting information in order to make a decision. Testing, measurement and evaluation often contribute to the process of assessment.
There are different meanings of educational assessment.
Assessment means sitting beside a learner in an attempt to help the learner understand what he/she knows or is able to do. Assessment is a process of gathering data by interacting with the learner in order to understand his/her needs. It also simply means talking to a learner in order to understand the learner’s needs or problems.
Assessment is judging. It is a process of determining the degree to which a learner has attained some standard or level of achievement. The assessor/teacher may require the learner to answer specific questions or perform a specific task, depending on what that standard is.
Assessment is coaching. Here assessment is there to help the learner achieve a specific objective. The assessor observes the learner and provides some direction on how to proceed. At the same time the assessor collects information about what the learner knows and can do, and also where the learner has difficulty or may need more instruction. The main argument of the coaching metaphor is that assessment occurs as part of the learning process.
How Do Teachers Gather Assessment Information?
Airasian and Russell (2008) have identified three ways through which teachers collect information for making decisions. These are student products/work, observation and oral questioning.
i. Student products include homework, written assignments, essays, science projects etc. These products provide the teacher with information about students’ cognitive skills.
ii. Observation involves watching or listening to students as they carry out an activity or respond in a given situation. Observation helps the teacher to understand student behaviors such as:
· mispronouncing words in oral reading,
· interacting in groups,
· speaking out in class,
· bullying (or annoying) other students,
· losing concentration,
· having puzzled looks on their faces,
· raising their hands in class,
· failing to sit still for more than three minutes.
Observations can be formal or informal.
iii. Oral questions are used to collect formal and informal information about students. Teachers use oral questions during instruction so as to:
· review a prior topic,
· brainstorm a new one,
· find out how well the lesson is understood by students,
· engage a student who is not paying attention.
With oral questions the teacher can gather information without breaking the lesson.
Types of assessment
There are three types of assessment.
i. Diagnostic assessment or early assessment. This is carried out before instruction to determine readiness or entry behavior. The collection of information is normally based on informal observation. The information collected covers the cognitive, affective and psychomotor domains.
ii. Formative assessment. This is carried out during instruction for the purpose of improving or monitoring progress. The collection of information is based on both formal and informal observation and on student papers such as tests, homework and projects. The information collected is largely cognitive and affective, and it is kept in writing.
iii. Summative assessment. This is carried out at the end of instruction or of the program for the purpose of grading, selection, placement, certification and evaluating whether the instructional objectives have been achieved. The information gathered is mainly based on the cognitive domain.
Purpose of assessment
1. To establish a favorable classroom environment that supports students’ learning by helping students to interact with one another, respect one another, cooperate and follow school rules and regulations.
2. To plan and conduct classroom instruction. Conducting instruction requires constant assessment and decision making. For example, when students seem not to understand the lesson or seem to be bored, the instructor has to decide how to solve the problem so that learning can proceed.
3. To place students in the groups in which they fit, e.g. higher reading group, middle reading group, fast learners and slow learners, in order to assist them.
4. To provide feedback to the students and their caregivers. Observation and feedback are intended to modify and improve students’ learning.
5. To diagnose students’ learning difficulties and disabilities so as to help them learn, by carrying out remedial teaching, making accommodations or referring them for more specialized diagnosis and intervention.
6. To summarize and grade academic learning and progress. The act of making a final decision about students’ learning at the end of instruction is termed summative assessment. Much of a teacher’s time is spent on collecting information that will be used to grade students or summarize their academic progress.
The Uses of Classroom Assessment
Classroom assessment is used to monitor student progress by various education stakeholders at various levels, namely:
National and State Policy makers
Classroom assessment assists national policy makers in:
i. Setting state and national standards of performance
ii. Developing policies based on assessment
iii. Tracking the progress of national achievement in the education sector
iv. Providing resources to improve learning
v. Providing rewards or sanctions for student, school and state achievements
School Administrators
i. To identify program strengths and weaknesses
ii. To plan and improve instruction
iii. To monitor classroom teachers
iv. To identify instructional needs and programs
v. To monitor students’ achievement over time
Teachers
i. To monitor students’ progress
ii. To judge and change classroom instruction
iii. To identify students with special needs
iv. To motivate students to do well
v. To place students in groups
vi. To provide feedback to teachers and students
Parents
i. To judge the strengths and weaknesses of students and of the program
ii. To monitor the student’s progress
iii. To meet with teachers to discuss the student’s classroom performance
iv. To judge the teacher’s quality/effectiveness
Objectives
Objectives help a teacher to focus on what is important and what the teacher wants to accomplish. They also describe the kinds of content, skills and behaviors teachers hope their students will develop through instruction. Other names for objectives are instructional objectives, learning targets, educational objectives, behavioral objectives, student outcomes and curriculum objectives. There are three levels of objectives:
i. Global objectives, also called goals. They are general and broad. Because of their breadth they are not used in planning classroom instruction and assessment.
ii. Educational objectives. They are more specific than global objectives.
iii. Instructional objectives. They are the most specific type of objectives and are used in planning classroom instruction and assessment.
Instructional Objectives and Evaluation
Instructional objectives are statements prepared by a teacher that describe a specific behavior or pattern of behavior that a learner is expected to demonstrate as a result of a lesson or series of lessons. The pupil’s behavior or achievement should be observable and measurable. An instructional objective is a precise description of the competencies the learner is expected to develop from the instruction (Ondiek, 1986).
These statements clearly define the desired learning outcomes that are expected from teaching. Learning outcomes are learning products a student is expected to demonstrate as evidence that learning has occurred. Therefore, during and after instruction, the teacher determines the extent to which instructional objectives are being achieved.
Instructional objectives are stated in terms of what we expect the learner to be able to do or express at the end of instruction.
The purpose of instructional objectives is to identify what students are expected to learn in order to help the teacher to:
i. Communicate to others the purpose of instruction
ii. Provide direction for the teaching process, e.g. select appropriate instructional methods and materials
iii. Form the basis for evaluating students’ learning achievement, i.e. help the teacher to plan assessment that will allow him/her to decide whether or not students have learned the desired content and skills that are the focus of instruction.
Characteristics of good instructional objectives
i. They are related to the intended learning outcomes of instruction.
ii. They are concerned with students, i.e. they describe students’ performance and are stated in terms of what the student is to learn from instruction.
iii. They are specific, because they describe the student’s actual actions/behavior/skills that can be observed, measured, instructed and assessed.
iv. They indicate the condition under which the behavior must be performed and the level of performance the students must show, e.g. "Given a map of Tanzania, students should be able to shade all major lakes correctly."
· Given a map of Tanzania: the condition under which the behavior is to be demonstrated
· Shade all major lakes: the behavior expected to be demonstrated
· Correctly: the standard or level of performance
v. They are derived from general objectives, which are stated using general verbs like understand, appreciate and know, which are not measurable.
vi. They are stated using action verbs that indicate an observable response that can be evaluated by another person.
vii. They are time bound.
Components of Instructional objectives
An instructional objective is made up of two parts:
i) The behavioral part, which specifies the expected student behavior or achievement as a result of instruction.
ii) The content part, which specifies the context in which the behavior in the objective is to operate.
Example. At the end of the lesson students should be able to list three types of communicable diseases.
Behavior: to list
Content: communicable diseases
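The parts of an objective can be pictured as fields of a small record. The Python sketch below is only an illustration: the class name `InstructionalObjective` and its field names are hypothetical, and the optional `condition` and `standard` fields follow the "map of Tanzania" example given under the characteristics of good objectives.

```python
from dataclasses import dataclass

@dataclass
class InstructionalObjective:
    behavior: str        # observable action verb, e.g. "list"
    content: str         # subject matter on which the behavior operates
    condition: str = ""  # optional: circumstances under which it is performed
    standard: str = ""   # optional: required level of performance

    def statement(self) -> str:
        """Assemble the parts into a single objective statement."""
        parts = []
        if self.condition:
            parts.append(f"{self.condition},")
        parts.append(f"students should be able to {self.behavior} {self.content}")
        if self.standard:
            parts.append(self.standard)
        text = " ".join(parts) + "."
        return text[0].upper() + text[1:]

diseases = InstructionalObjective(
    behavior="list",
    content="three types of communicable diseases",
    condition="at the end of the lesson",
)
print(diseases.statement())
```

Writing the parts separately makes it easy to check that the verb is an observable action and that a condition and a standard have been stated where they are needed.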
CLASSIFICATION OF EDUCATIONAL OBJECTIVES
Benjamin Bloom and his colleagues (1956) classified educational objectives into three learning (behavioral) domains in order to promote higher forms of thinking. The three major areas of behavior are:
i. Cognitive objectives: the behaviors involved deal with recalling knowledge and applying intellectual skills.
ii. Affective objectives: the behaviors involved deal with attitudes and values.
iii. Psychomotor objectives: the behaviors involved deal with motor skills.
The objectives are described using narrower, more specific verbs (action verbs which can be evaluated at the end of the lesson).
These behavioral domains are divided into sub-categories which are arranged from simple to complex. Students are supposed to demonstrate mastery of each skill at the lower order before they move to the more advanced skills.
For example, in the cognitive domain the teacher should focus on helping students to remember information before they understand it, and on helping them to understand it before they apply it to a new situation. Each skill in the taxonomy represents a building block for the next.
These categories and sub-categories of learning domains are useful in preparing lesson objectives and identifying learning outcomes.
Cognitive Domain
Cognitive skills are used by teachers to determine the level of thinking their students have achieved. The skills are ranked on a continuum from lower-order to higher-order thinking. The knowledge level of the taxonomy represents lower-level thinking because it focuses on memorization or recall of information learned, while the remaining levels represent higher-level thinking and reasoning that call for students to carry out processes more complex than memorization.
There are six major categories of cognitive processes, which are listed in order from the simplest to the most complex.
Competence | Skills demonstrated
Knowledge | Deals with recall of information or remembering previously learned information. Examples: knowledge of dates, events, specific facts, procedures, concepts, methods, principles, terminologies, names, prices of commodities, formulae, places and major ideas; mastery of subject matter. Question cues: list, define, mention, name, label, collect, tabulate, what, who, when, where.
Comprehension | Deals with understanding of information or grasp of meaning. Examples: translate knowledge into a new context; interpret facts/materials; compare, contrast, order words or items; group items; infer causes and predict consequences; paraphrase, rewrite, explain ideas; summarize information. Key words: summarize, interpret, contrast, predict, differentiate, estimate, reason, brainstorm, expand, explain, qualify, propose.
Application | Use of information in a new situation. Examples: applying methods, procedures, theories and concepts in a new situation; solving problems using the required skills or knowledge learnt; constructing charts or graphs. Key words: apply, demonstrate, calculate, complete, illustrate, show, examine, organize, modify, relate, classify, change, discover, experiment, compare and contrast, construct, exercise.
Analysis | Separating materials or concepts into component parts. Examples: understanding the organizational structure; recognizing the organizational principles involved; analyzing the elements; uncovering unique characteristics; recognizing unstated assumptions. Key words: analyze, identify facts, separate, classify, select, compare, arrange, distinguish, differentiate.
Synthesis | Deals with building a structure or pattern from diverse elements. Examples: putting different parts together to form the whole; formulating new patterns or structures; relating knowledge from several areas; predicting; writing a creative story, poem or song; integrating the learning from different areas into a plan for solving a problem. Key words: combine, integrate, modify, rearrange, substitute, plan, create, design, invent, compose, formulate, prepare, generalize, rewrite.
Evaluation | Making judgments about the value of ideas or materials. Key words: examine, assess, appraise, critique, criticize, defend, evaluate, justify, support, relate, rank, discriminate, rate, weigh, decide, determine, summarize.
Affective Domain of Learning
The affective domain of learning consists of changes in values, attitudes, ways of feeling, outlook, interests, emotions, appreciation, motivations and preferences. The affective taxonomy is based upon the degree of a person’s involvement in an activity or idea, e.g. classroom activities.
There are five levels of the affective learning domain, which are organized from simple to complex as shown in the table below:
Competence | Skills demonstrated
Receiving phenomena | Willingness to hear, listen, select and pay attention. Examples: listening to others with respect; remembering the names of newly introduced people. Key verbs: choose, describe, follow, give, hold, identify, locate, name, point, select, use, reply etc.
Responding | Active participation on the part of the learner; attending and reacting to a particular phenomenon. Examples: participating in class discussion; giving a presentation; questioning new ideas, concepts, models, theories and principles in order to fully understand them. Key verbs: answer, assist, discuss, greet, help, label, perform, practice.
Valuing | The worth or value a person attaches to a particular object, phenomenon or behavior. This ranges from simple acceptance to the more complex state of commitment. Examples: demonstrating belief in the democratic process; informing the school management on matters that one feels strongly about. Key verbs: complete, demonstrate, differentiate, explain, follow, form, initiate, invite, join, justify, propose, read, report, select, share, study, work.
Organization | Organizing values into priorities by contrasting different values, resolving conflicts between them and creating a unique value system. The emphasis is on comparing, relating and synthesizing. Key verbs: stick to, alter/adjust, arrange, combine, compare, complete, defend, explain, formulate, generalize, identify, integrate, modify, order, organize, prepare, relate, synthesize.
Internalizing values (characterization) | The newly developed values control the person’s behavior. The behavior is persistent, consistent and predictable. Examples: cooperating in group activities (displaying teamwork); displaying professional commitment to ethical practice on a daily basis; revising behavior in the light of new evidence; valuing people for what they are, not how they look. Key verbs: act, discriminate, display, influence, listen, modify, perform, practice, propose, qualify, question, revise, serve, solve, verify.
Psychomotor learning domain
The psychomotor domain of learning consists of changes in ways of acting, skill performance, physical productivity or manipulative skills, and body movement. The development of these skills requires practice, and they are measured in terms of speed, precision (accuracy/exactness), distance, procedures or techniques in execution. The organization of the taxonomy ranges from a student showing readiness to perform a psychomotor task, to the use of trial and error to learn a task, and to actually carrying out a task on his/her own.
There are seven levels of the psychomotor learning domain, which are arranged hierarchically from the simplest behavior to the most complex, as shown in the table below:
Competence | Skills demonstrated
Perception (awareness) | The ability to use sensory organs to obtain hints that guide motor activity. Examples: detecting non-verbal communication cues/signals; adjusting the heat of a stove to the correct temperature by the smell and taste of food. Key words: choose, describe, differentiate, distinguish, identify, isolate, select, relate.
Mindset (readiness to act) | Includes mental, physical and emotional sets. Example: showing desire to learn a new process (motivation). Key words: begin, display, explain, move, process, react, show, state.
Guided response | Includes imitation, trial and error, and practice. Examples: working out mathematical questions as demonstrated; following instructions to build a model. Key words: copy, trace, follow, reproduce, respond.
Mechanism | This is the intermediate stage in learning complex skills. The learner’s responses have become habitual and the movements can be performed with confidence and proficiency. Example: using a personal computer. Key words: assemble, construct, dismantle, display, fasten, fix, grind, heat, manipulate, measure, mend, mix, organize, sketch.
Complex overt response | Acts that involve complex movement, such as performing without hesitation. Key words: assemble, construct, dismantle, display, fasten, fix, grind, heat, manipulate, measure, mend, mix, organize, sketch (here the verbs are accompanied by adverbs and adjectives indicating that the performance is quicker, better and more accurate).
Adaptation | Skills are well developed and the individual can modify movement patterns to fit special requirements. Examples: responding effectively to unexpected experiences; modifying instruction to fit the needs of the learners. Key words: adapt, adjust, change, rearrange, reorganize, revise.
Origination | Creating new movement patterns to fit a particular situation or specific problem. Learning outcomes emphasize creativity based on highly developed skills. Examples: constructing a new theory; developing a new comprehensive training programme. Key words: arrange, build, combine, compose, construct, create, design, initiate, make, originate.
Table 1. Examples of action verbs used to write educational objectives for each category of the cognitive domain.
Knowledge | Comprehension | Application | Analysis | Synthesis | Evaluation
Count | Classify | Compute | Break down | Arrange | Appraise
Define | Compare | Construct | Differentiate | Combine | Conclude
Identify | Contrast | Demonstrate | Discriminate | Compile | Criticize
Label | Convert | Illustrate | Outline | Create | Critique
List | Discuss | Solve | Separate | Design | Grade
Match | Distinguish | | Subdivide | Formulate | Judge
Name | Estimate | | | Generalize | Recommend
Quote | Explain | | | Generate | Support
Recite | Generalize | | | Group |
Repeat | Give examples | | | Integrate |
Reproduce | Infer | | | Organize |
Select | Interpret | | | Relate |
State | Paraphrase | | | Summarize |
 | Rewrite | | | |
 | Summarize | | | |
 | Translate | | | |
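A verb list such as Table 1 can be used as a lookup when checking which cognitive level an objective is aiming at. The Python sketch below is a hypothetical helper: it includes only a subset of the verbs from Table 1, and the names `BLOOM_VERBS` and `levels_for` are assumptions made for illustration.

```python
# Subset of the action verbs in Table 1, grouped by cognitive category.
BLOOM_VERBS = {
    "knowledge": {"count", "define", "identify", "label", "list", "name"},
    "comprehension": {"classify", "compare", "explain", "generalize", "summarize"},
    "application": {"compute", "construct", "demonstrate", "illustrate", "solve"},
    "analysis": {"break down", "differentiate", "outline", "separate"},
    "synthesis": {"arrange", "combine", "create", "design", "generalize", "summarize"},
    "evaluation": {"appraise", "criticize", "judge", "recommend", "support"},
}

def levels_for(verb):
    """Return every cognitive category whose verb list contains the verb."""
    v = verb.strip().lower()
    return [level for level, verbs in BLOOM_VERBS.items() if v in verbs]

print(levels_for("List"))        # ['knowledge']
print(levels_for("generalize"))  # ['comprehension', 'synthesis']
print(levels_for("understand"))  # []
```

Some verbs appear under more than one category, so the verb alone does not fix the level. The content and context of the objective must also be considered. General verbs such as "understand" return no category because they are not measurable action verbs.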
Below are sample instructional objectives derived from Bloom’s taxonomy, with their taxonomic categories.
1. Knowledge: At the end of the lesson students should be able to list three types of family.
2. Comprehension: At the end of the lesson students should be able to distinguish democratic states from authoritarian states.
3. Application: At the end of an eighty (80) minute lesson students should be able to construct a bar graph by using the information available in the table.
4. Analysis: At the end of a two-hour lecture, students should be able to analyze the main features of monopoly capitalism.
5. Synthesis: At the end of a two-hour lecture, students should be able to integrate information from the science experiment into a lab report.
6. Evaluation: At the end of a two-hour lecture, students should be able to judge the quality of varied persuasive essays.
NB. Instructional objectives should be SMART:
S = Specific
M = Measurable
A = Attainable
R = Realistic
T = Time bound
Assessment of the learning domains
Different assessment approaches characterize the different behavioral domains. For example:
Cognitive Domain
The most commonly taught and assessed educational objectives are those in the cognitive domain (Airasian and Russell, 2008). Nearly all tests that students take in schools are intended to measure one or more of these cognitive activities. Teachers’ instruction is usually focused on helping students to attain cognitive mastery of some content or subject area.
The cognitive domain is measured through paper-and-pencil tests of various types as well as oral questioning (e.g. comprehensive examination, research defense).
Affective Domain
The affective domain is assessed by observation and questionnaires, e.g. rating scales.
Affective behaviors are rarely assessed formally in schools and classrooms. Teachers assess affective behavior informally, especially when grouping students, for example to determine which students can work without supervision and which cannot.
Most classroom teachers can describe their students’ affective characteristics based on their informal observations of and interactions with their students.
Psychomotor Domain
The psychomotor domain is assessed by observing students carrying out the desired physical activity.
It is measured in terms of speed, precision, distance, procedures or techniques in execution.
1.6. Table of specification of instructional objectives
What is a table of specification?
A table of specification (TOS) is a table that helps the teacher align objectives, instruction and assessment when teaching.
A table of specification has two dimensions:
i. The content dimension, which includes the main topics of instruction and assessment
ii. The process dimension, which includes the six categories of the cognitive domain related to each topic, content or objective.
Content dimension | Process dimension/objectives
 | Knowledge | Comprehension | Application | Analysis | Synthesis | Evaluation
Stages of writing | X (L): Mention the three stages of the writing process | X (M): Explain the purposes of the three stages of the writing process | | | |
Topic sentences | | | X (M): Write down the topic sentences | X (L): Differentiate topic sentences from other types of sentences | | X (H): How useful is the topic sentence in writing an essay?
Writing essays | | | | | X (H): Write an essay on the stages of the writing process |
Note: L = low amount of time; M = medium amount of time; H = high amount of time; X = objective(s)
The intersection between the process dimension and the content dimension is referred to as an objective.
For example, the X in the Knowledge column for "Stages of writing" means: students should be able to mention the three stages of writing (prewriting, writing and editing), i.e. students will remember the three stages of the writing process.
L refers to the amount of time allotted to this objective. Since it is a simple memorization task, a low amount of time is spent teaching it.
As a second example, the X in the Comprehension column means: the student should be able to explain in his or her own words the purposes of the three stages of the writing process. The topic "Stages of writing" therefore relates to two different objectives.
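One way to picture the two dimensions is as a lookup keyed by (content, process). The Python sketch below encodes the entries of the table above. The variable and function names (`tos`, `objectives_for`, `count_by_time`) are hypothetical and the structure is only one possible representation.

```python
LEVELS = ["knowledge", "comprehension", "application",
          "analysis", "synthesis", "evaluation"]

# (content, process) -> (time emphasis L/M/H, objective)
tos = {
    ("Stages of writing", "knowledge"):
        ("L", "Mention the three stages of the writing process"),
    ("Stages of writing", "comprehension"):
        ("M", "Explain the purposes of the three stages of the writing process"),
    ("Topic sentences", "application"):
        ("M", "Write down the topic sentences"),
    ("Topic sentences", "analysis"):
        ("L", "Differentiate topic sentences from other types of sentences"),
    ("Topic sentences", "evaluation"):
        ("H", "How useful is the topic sentence in writing an essay?"),
    ("Writing essays", "synthesis"):
        ("H", "Write an essay on the stages of the writing process"),
}

def objectives_for(content):
    """Return (process, time, objective) for one topic, in taxonomy order."""
    rows = [(p, t, o) for (c, p), (t, o) in tos.items() if c == content]
    return sorted(rows, key=lambda row: LEVELS.index(row[0]))

def count_by_time(table):
    """Count how many objectives fall under each time emphasis."""
    counts = {"L": 0, "M": 0, "H": 0}
    for time, _objective in table.values():
        counts[time] += 1
    return counts

for process, time, objective in objectives_for("Stages of writing"):
    print(process, time, objective)
print(count_by_time(tos))  # {'L': 2, 'M': 2, 'H': 2}
```

Listing the objectives for one topic shows at a glance which levels of thinking the lesson and the test must cover for that topic.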
Once objectives are identified and organized, the next step is to develop
lesson plans for them. In selecting activities to achieve
the objectives, the following should be taken into consideration:
i. The ability level of your students
ii. Their attention spans
iii. Suggestions made in the textbook
iv. Additional resources available to supplement and reinforce the textbook
For the topic "Stages of writing", the table reminds the teacher that he/she needs both
remembering and explaining activities to attain the first two objectives. For example, specific objectives could read as
follows:
a. At the end of the lesson students should be able to mention the three stages of the
writing process (knowledge).
b. At the end of the lesson students should be able to explain the purposes of the
three stages of the writing process (comprehension).
Instruction
Once objectives and planned activities are identified, the next step is
to instruct the students based on the objectives developed.
Assessment
When constructing a test, the teacher needs to be concerned with the class
content (the subject matter) and the kind of thinking or responses required on the
test.
A test should also align with the level of thinking required of students
during instruction. For example, if the teacher taught at the
knowledge level (recall of information), the test should require students to recall
the information learned.
Decisions in planning a test
When deciding what should be included in the assessment and the type of
tasks students should take, the teacher needs to answer the following important
questions.
1. What should I test?
In deciding what to test, it is important to focus on both the objectives
and the actual classroom instruction that took place.
2. What type of assessment items or tasks should be given?
The type of assessment item or task should be based on the learning objectives;
the assessment procedure chosen depends on the nature of the objective being assessed.
For example, from the table above:
a. Comprehension (write in one's own words)
b. Application (write a topic sentence)
c. Synthesis (integrate and write an essay)
3. How long should the test take?
a. The age of the students, the subject being tested and the length of the class
period all affect the length of a test.
b. The number of questions per objective depends on the instructional time spent on
each objective and its importance.
Stages in planning classroom test
Tests are used to evaluate students' learning. Testing involves
determining the behavior to be measured and designing test items that will
elicit the desired performance. The goal of classroom testing is to improve
teaching and learning. The following are things to consider when planning a test.
i. Determine the purpose of the test.
The purpose of the test may be:
a). to determine whether students have the prerequisite skills needed for instruction and to
find out what they know about the lesson to come (pre-testing).
b). to monitor the learning process, to detect learning problems and to provide
feedback to students and the teacher (formative testing).
c). to measure the extent to which instructional objectives have been achieved (summative
testing).
ii. Developing test specification
Test specification entails making a test a representative sample of the
instructional objectives (the content that was taught) by using a table of
specification. The table of specification is made up of the instructional/behavioral
objectives and the content.
Table 1. Specification of concepts in measurement and evaluation
| Content \ Objectives | Knowledge | Comprehension | Application | Analysis | Synthesis | Evaluation | No. of test items | % |
|---|---|---|---|---|---|---|---|---|
| Testing | 2 | 1 | 2 | 1 | 1 | 2 | 9 | 18 |
| Measurement | 4 | 2 | 2 | 2 | 3 | 2 | 15 | 30 |
| Evaluation | 2 | 2 | 4 | 3 | 1 | 1 | 13 | 26 |
| Assessment | 2 | 3 | 2 | 2 | 2 | 2 | 13 | 26 |
| Total items | 10 | 8 | 10 | 8 | 7 | 7 | 50 | |
| % of items | 20 | 16 | 20 | 16 | 14 | 14 | | 100 |
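The row and column totals and percentages in Table 1 can be checked with a short script. The sketch below is only an illustration: the cell counts are copied from Table 1, while the names LEVELS, TABLE_1 and summarize are assumptions introduced here and are not part of the course material.

```python
# Cell counts copied from Table 1: rows are content areas,
# columns are the six cognitive levels in the order given in LEVELS.
LEVELS = ["Knowledge", "Comprehension", "Application",
          "Analysis", "Synthesis", "Evaluation"]
TABLE_1 = {
    "testing":     [2, 1, 2, 1, 1, 2],
    "measurement": [4, 2, 2, 2, 3, 2],
    "evaluation":  [2, 2, 4, 3, 1, 1],
    "assessment":  [2, 3, 2, 2, 2, 2],
}


def summarize(table):
    """Return totals and percentages for each content area and cognitive level."""
    row_totals = {content: sum(cells) for content, cells in table.items()}
    col_totals = [sum(column) for column in zip(*table.values())]
    total = sum(row_totals.values())
    row_pct = {content: 100 * n / total for content, n in row_totals.items()}
    col_pct = [100 * n / total for n in col_totals]
    return row_totals, col_totals, total, row_pct, col_pct


row_totals, col_totals, total, row_pct, col_pct = summarize(TABLE_1)
for content, n in row_totals.items():
    print(f"{content}: {n} items ({row_pct[content]:.0f}%)")
for level, n, pct in zip(LEVELS, col_totals, col_pct):
    print(f"{level}: {n} items ({pct:.0f}%)")
```

A teacher drafting a new table could replace the counts in TABLE_1 and confirm that the percentages still add up to 100 before writing the items.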
iii. Selecting appropriate test items
There are two types of test items,
namely objective items (multiple choice, matching and true/false items) and subjective
items (essay items).
According to Thungu et al. (2010), the allocation of marks for cognitive skills can be as follows:
| Skills | Percentage |
|---|---|
| Knowledge | 12% |
| Comprehension | 16% |
| Application | 32% |
| Analysis | 20% |
| Synthesis | 12% |
| Evaluation | 8% |
| Total | 100% |
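To see what these percentages mean for a particular paper, the marks can be shared out in proportion to the weights. The sketch below is an illustration only: the weights are those quoted above, but the function name allocate_marks, the 50-mark paper and the rounding rule (give leftover marks to the skills with the largest fractional parts) are assumptions made here, not something the source prescribes.

```python
# Percentage weights for each cognitive skill, as quoted from Thungu et al. (2010).
WEIGHTS = {
    "Knowledge": 12,
    "Comprehension": 16,
    "Application": 32,
    "Analysis": 20,
    "Synthesis": 12,
    "Evaluation": 8,
}


def allocate_marks(total_marks, weights=WEIGHTS):
    """Split total_marks across skills in proportion to the percentage weights.

    Whole marks are assigned first; any marks left over after rounding down
    go to the skills with the largest fractional parts (an assumed rule).
    """
    exact = {skill: total_marks * w / 100 for skill, w in weights.items()}
    marks = {skill: int(value) for skill, value in exact.items()}
    leftover = total_marks - sum(marks.values())
    by_fraction = sorted(exact, key=lambda s: exact[s] - marks[s], reverse=True)
    for skill in by_fraction[:leftover]:
        marks[skill] += 1
    return marks


# Hypothetical 50-mark paper.
for skill, m in allocate_marks(50).items():
    print(f"{skill}: {m} marks")
```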
iv. Preparing a set of relevant items
The intended learning outcome will dictate the type
of items to be used. If the intended learning outcome is to mention, name or
list, supply type items will be appropriate. If the intended outcome is to
identify, selection type items will be used. An item should be included
only if it can measure a sample of the intended learning outcome.
PRINCIPLES OF TEST CONSTRUCTION
Purpose of Testing
i). To identify what students have learned after the
completion of a lesson or unit of instruction. These tests are also important
when discussing student progress at parent-teacher conferences.
ii). To identify student strengths and weaknesses.
This is effective when teachers use pretests
at the beginning of units in order to find out what students already know and
where the teacher's focus needs to be.
iii). To place students.
iv). To determine who will
receive awards and recognition.
v). To assess the effectiveness of the teacher
and/or the school.
vi). To show the depth of
understanding of an idea or mastery of a skill.
vii). To show student growth over
time in a particular area of knowledge.
viii). To compare one student's or
group's achievement to another's on the same task.
ix). To predict students' future
performance.
Construction of test items
General Rules for Writing Test Items
i). Use the examination format as a guide to item writing. The examination
format describes the scope and content coverage to be measured and the sample
of tasks to include.
ii). Write more items than needed for a particular examination so as to
allow the weaker items to be discarded during later review.
iii). Write items well in advance of the submission date. Setting items
aside for several days and then reviewing them will help reveal any lack of clarity or
ambiguity that was overlooked.
iv). Each test item should call forth the performance described in the
intended learning outcome.
v). Write each item so that the task to be performed is clearly defined.
In formulating questions, use simple and direct language and correct punctuation
and grammar, and avoid unnecessary wording.
vi). Write each item at an appropriate reading level. Pupils' responses
should be determined by the performance
being measured and not by some factor the item was not designed to measure.
vii). Write each item so that it does not provide help in answering
other items. For example, a name, date
or fact called for in a short-answer item might be unintentionally included
in the stem of a multiple-choice item in another part of the test.
viii). Write each item so that the answer is one that would be agreed
upon by experts. This rule is easy to follow when
measuring factual knowledge but harder when measuring complex outcomes that call
for the best answer, such as the best reason, the best method, the best interpretation
and the like. Be sure that experts would agree that the answer is clearly the
best.
ix). Write each item so that it is at the proper level of difficulty.
That is to say, the difficulty of the item should match the performance to be
measured and the purpose of the test.
x). Whenever an item is revised, recheck its relevance, i.e. be sure
that it still provides a relevant measure of the intended learning outcome.
Classification of Tests
There are two types of tests.
Objective tests/selection items
are those in which the examinee selects the correct answer from among a number of choices presented.
Objective items include multiple choice items, true-false items and
matching items. Objective tests are distinguished from subjective tests in that
the task is highly structured and limits
the type of response. Students are not free to redefine, organize and
present the answer in their own words.
Subjective/supply item tests require the student to supply or construct his or her own answer,
which may range from a single word to a written response of several paragraphs.
Subjective/supply or constructed-response items include restricted response
items, short answer items, completion items and essay items.
Objective/selection tests
Multiple Choice Items
Multiple choice items consist of a stem (premise), which presents the problem or question to the student,
and a set of options or choices from which students select an answer.
There are usually four or five options, and there should be only one correct answer. Incorrect but
reasonable options in multiple choice questions are called distracters. The problem may be stated
as a question or as an incomplete statement. For example:
i. Direct question
Which of the following
is not an element of weather?
a) Humidity
b) Leaching
c) Sunshine
d) Temperature
ii. Incomplete statement
…………………is one of the elements of weather
a). Rainfall
b). Erosion
c). Deforestation
d). Leaching
Multiple choice items are used to measure both simple and complex learning
outcomes, such as the following.
a). Knowledge of terminology/vocabulary/terms
Example: An organism living in or
on another organism is
a. A predator.
b. Prey.
c. A parasite.
d. A host.
b). Knowledge of specific facts. Multiple choice items can be used to assess
students' grasp of discipline-based factual knowledge; they deal with what, who,
where and when.
Example: Which of the following states does not border Oklahoma?
a. Colorado
b. Missouri
c. Nebraska
d. New Mexico
c). Knowledge of procedure
Example: The correct procedure for combining acid and water is
a. Add acid to a large amount of water
b. Add water to a large amount of acid
c. Add acid to water, cool and swirl
d. Add water to acid, cool and swirl
d). Knowledge of principles
Example: The principle of capillary action helps to explain how fluids:
a. Enter solutions of lower concentration
b. Escape through small openings
c. Pass through small semi-permeable membranes
d. Rise in fine tubes
Multiple choice questions also measure higher level
outcomes such as:
i. Application
Faraday's law can be used to explain
a).
b).
c).
d).
ii. Interpretation
The Majimaji war occurred in the
southern part of Tanzania because
a).
b).
c).
d).
iii. Justification of methods and procedures
Why do farmers rotate their crops?
a).
b).
c).
d).
Guidelines for constructing multiple choice
items
i. State the problem clearly in the stem.
For example:
The components of a multiple-choice item are
ii. Include one correct or most defensible answer.
For example:
According to …….the most serious aspect of the
energy crisis is the
iii. Select attractive distracters. Distracters should
be plausible enough to attract examinees who do not know the answer.
iv. Options should be presented in a logical, systematic order. For example, dates of events
should be arranged chronologically, numerical quantities in ascending order and
names in alphabetical order.
v. Options should be grammatically parallel and
consistent with the stem, otherwise they can provide a clue to the correct
alternative.
Example 1
A test which can be scored by an untrained person in
the content area of the test is an
a. Diagnostic test
b. Criterion-referenced test
c. Objective test
d. Reliable test
e. Subjective test
Here the article "an" at the end of the stem fits only option c grammatically.
Examinees take advantage of inconsistency between the stem and the options to get the
correct answer. They respond in terms of verbal skill, possibly quite different
from the skills the item is intended to measure.
The item might be rewritten as follows:
A test which can be scored by an untrained person in
the content area of the test is said to be
a. Diagnostic
b. Criterion-referenced
c. Objective
d. Reliable
e. Subjective
vi. Options should be mutually exclusive, i.e. the item should contain only one option which is the most
correct or the best answer.
vii. Ensure that correct responses are not consistently shorter or longer than the
distracters. The difference in length might give a clue to the correct answer.
viii. Options such as "none of these", "none of the above", "all of these" and "all of
the above" should not be used when the examinee is to select the best but not
necessarily absolutely correct answer.
ix. Correct answers in a test should appear randomly.
What are the advantages and disadvantages of multiple choice test items?
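One way to check guideline ix on a drafted test is to tally how often each option letter is the correct answer and to look for long runs of the same letter. The sketch below is only an illustration: the function name key_position_summary and the sample answer key are hypothetical, and what counts as "too unbalanced" is left to the teacher's judgment.

```python
from collections import Counter
from itertools import groupby


def key_position_summary(answer_key):
    """Count how often each option letter is the key and find the longest run.

    answer_key is a sequence of option letters, one per item, in test order.
    """
    counts = Counter(answer_key)
    longest_run = max((len(list(group)) for _, group in groupby(answer_key)),
                      default=0)
    return counts, longest_run


# Hypothetical answer key for a ten-item multiple choice test.
draft_key = list("abcdabcdaa")
counts, longest_run = key_position_summary(draft_key)
print(dict(counts))
print("longest run of the same letter:", longest_run)
```

If one letter dominates or the same letter repeats many times in a row, the teacher can reposition some correct answers while still keeping the options in a logical order where one exists.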
Matching Items
Matching items consist of two columns:
a. A column of stems/problems to be answered,
called premises (Column A)
b. A column of responses (Column B).
Normally the column of stems is placed on the left hand side and the column
of responses is placed on the right. Matching items often measure recognition
of factual knowledge based on simple associations that may include:
· Persons with the events they are associated with
· Dates with historical events
· Terms with definitions
· Rules with examples
· Symbols with concepts
· Parts with functions
· Plants/animals with classifications
Guidelines for writing matching items
i. Include homogeneous material in each exercise.
ii. Include at least three to five but no more than eight to ten items in a matching set.
Why? A long set of matching items requires the examinee to do a good deal of work in keeping track
of stems and searching for options. Furthermore, it is difficult to write long matching
sets which are homogeneous. Thus three to eight items per matching set is a
reasonable compromise.
iii. Eliminate irrelevant clues. There should be no verbal association clues or plural and
singular clues between a stem and its correct option.
iv. Place each set of matching items on a single page.
v. Reduce the influence of clues and thereby increase the difficulty of the matching item.
This can be accomplished through:
a. Using a different number of options than there are items
b. Allowing each option to be used more than once.
vi. Compose the response list of single words or very short phrases.
vii. Arrange the responses in a systematic order, e.g. alphabetical or chronological. This order
enables examinees to find correct responses more quickly.
viii. One column should have more items than the other.
ix. Items in the columns should be grouped homogeneously.
E.g.
| List A | List B |
|---|---|
| 1. Leonardo da Vinci | a. American Gothic |
| 2. Edward Hopper | b. The Thinker |
| 3. Michelangelo | c. Mona Lisa |
| 4. Auguste Rodin | d. The Last Supper |
| 5. Grant Wood | |
What are the advantages and disadvantages of matching test
items?
True-False or Alternative Response Items
These are test items with only two possible answers. Each item consists of a declarative statement that the pupil/student is asked to
mark true/false, right/wrong, correct/incorrect, yes/no, agree/disagree etc.
Because of these different response pairs they are called alternative response items.
Alternative response or true-false items are used in measuring:
a. The ability to identify the correctness of statements or definitions of terms.
b. The ability to distinguish facts from opinions.
c. The ability to recognize cause and effect relationships.
Guidelines for Constructing True-False/Alternative Items
i. Include only one idea in each item.
ii. Eliminate partly true, partly false items.
iii. Ensure that true and false items are approximately equal in length.
iv. Balance the number of true items and false items.
v. Eliminate vague terms of degree or amount. E.g. words like "frequently" and "seldom" are open to
interpretation in true-false items.
vi. Use caution in writing negative item statements.
What are the advantages and disadvantages of true-false or alternative response test
items?
Subjective/Supply Test Items
Subjective items consist of completion (fill in the blank) items, short answer items and
essay type items.
1. Short answer items present the problem as a
direct question which requires the students to answer using their own
constructed responses, e.g. What is the name of the first president of Tanzania?
What are the main parts of the human body?
2. Completion items present the problem as an incomplete sentence, e.g. The name
of the first president of Tanzania is ……………………………………..
The main parts of the human body are
i……………………….ii……………iii………………iv………….
Short answer and completion items primarily assess recall of factual knowledge (dates, places, specific persons) and
comprehension.
Guidelines for writing good short answer items
i. Construct the stem so that the answer is definite and brief.
ii. Make sure that there is only one correct answer.
iii. Avoid lifting sentences from the textbook.
iv. For completion and fill-in-the-blank formats:
- Make response blanks equal in length
- Avoid grammatical clues preceding the blank
- Do not use too many blanks in one item, usually no more than two
- Include enough information in the stem to ensure the desired response
Essay Items
Essay items allow students to communicate a unique
constructed answer to a question.
There are two categories of essay type questions.
These are:
i. The restricted-response questions
These are essay questions that limit
both the content and the response. A student is required to state
or list factors, reasons, differences, similarities, merits and demerits. Such
questions limit the student in terms of the
content of the answer and the length of the response.
Example
Limited content:
List the types of leadership style.
Limited response:
Briefly state the advantages and
disadvantages of each style.
ii. The extended-response items
These are test questions that require the students to select factual information,
organize the answer in the way they like and integrate ideas as they deem
appropriate. Extended response questions are used to measure the ability of students to select information and to
organize, integrate and evaluate ideas.
Example
Describe the influence of climate change on agricultural
development in Africa today.
Evaluate the significance of participatory teaching
techniques at primary school level in Tanzania.
Principles for construction of better
essay questions/items
· Essay questions should measure learning outcomes that cannot be satisfactorily
measured by objective test items.
· They should measure the achievement of instructional objectives.
· Each question should indicate clearly the task to be undertaken by students.
· They should indicate the time limit for each question.
What are the advantages and disadvantages of essay
test items?
ASSEMBLING, ADMINISTRATION AND ANALYSIS OF TEST RESULTS
Classroom testing process
i. Identify the learning outcomes to be tested and measured
ii. Select an appropriate test format
iii. Construct test items that are relevant to the learning outcomes specified
iv. Assemble the test questions
Assembling Classroom Test
Assembling a classroom test refers to the process of
grouping test items by type, such as multiple choice, true-false etc.
The importance of grouping test items by type is:
i. To avoid the necessity of students
shifting from one response mode to another as they move from item to item
ii. To help students cover more items in a
given time
iii. To make scoring easier
Organization of test items
One of the important
considerations in assembling the test is the order in which the item types are
presented. In most tests selection items come first and supply items come last.
Guidelines for assembling test items
1. Record the test items in an orderly way, e.g. on paper.
2. Review the test items several times so as to make them appropriate to the learning outcomes
that are intended to be measured.
3. Arrange the items in a logical manner according to the examination format:
i. Organize test items by type, with selection items
first and supply items last.
ii. Do not split multiple choice or matching
items across two pages of the test.
iii. Separate multiple choice options from the
stem by beginning the options on a new line.
iv. Number the test items.
v. Space items for easy reading and writing of
responses.
vi. Make sure that you have enough copies of the
examination.
vii. Provide enough questions to ensure
reliability.
4. Prepare instructions to be followed by students in answering the test items.
5. Each section of a test should have instructions that tell students what to do.
Test Administration and Marking
Test administration involves establishing a
conducive physical and psychological setting that allows students to demonstrate
their best performance, as well as managing time.
Guidelines for administering a test
i. Creating a quiet, comfortable physical and psychological setting.
Physical setting
The examination environment should be quiet and
comfortable. This can be achieved by minimizing interruptions of any kind.
Some of the ways to minimize interruptions and discomfort in the examination room are:
a. Posting a sign on the door indicating
that testing/examination is in progress
b. Proofreading the test items and
directions before administering the test
c. Ensuring that enough facilities such as
desks, chairs and a clock are available
d. Making sure that there is enough
ventilation and light
Psychological setting
This involves creating a
psychological setting that reduces student anxiety. Test anxiety is diminished
by informing students about the test in advance, giving students good instruction and providing a good
review of the unit.
ii. Keeping track of time by informing students of the remaining time.
While administering a test, the teacher should be alert to
cheating. Cheating is a common problem in schools. Students cheat in
examinations for various reasons such as:
a. Pressure from parents/teachers
b. Failure to prepare and study for the test
c. Internal pressure from being in a course that gives a limited number of high grades
d. Danger of losing a scholarship
Forms of Cheating
a. Copying from another student's
examination/test answers.
b. Dropping a test paper so that others can
copy from it.
c. Writing test information on an eraser or
a small piece of paper and using it or passing it to
another student.
d. Writing codes, formulae or key words on
objects for use in the test.
e. Changing answers when the teacher allows
students to grade each other.
f. Keeping test information in the toilet
room.
g. Writing test information on the arms or
thighs.
h. Using material programmed into watches or calculators
in the test room.
i. Looking at another student's paper during a test.
How can we discourage cheating?
a. Searching students as they enter the test room.
b. Providing students with good instruction and information about the test.
c. Keeping students' books and other materials away from the test
room before testing.
d. Observing students during testing.
e. Knowing the common methods students use to cheat.
f. Spreading out students' seats in the test room.
g. Discouraging students from wearing caps in the test room.
h. Using different test forms.
i. Assigning students seats for a test.
j. Giving more in-class tests and fewer take
home tests.
Scoring of Tests
The process of scoring
a test involves measurement, that is, assigning a number to represent a student's
performance. It provides a summary of the student's performance. The complexity of
scoring varies with the type of test. A selection test is easier to score than
a supply item test.
Scoring the selection test
A selection test consists
of multiple choice, matching and true-false test items. Scoring selection items
is objective because they are brief and have only one correct answer. There are
different methods of scoring objective items. One common method is to put a
tick against a correct answer and a cross against a wrong answer. However, it is
advisable to write the correct answer next to a wrongly answered item
instead of a cross, and to write the score next to a correctly answered item instead of a tick.
Scoring Short Answer Test
Short answer and
completion test items call for short responses such as a word, phrase, date or name.
Therefore scoring is not difficult and can be quite objective.
Scoring essay test items
The essay item is the most
complex item to score because essay questions allow each student to construct a
unique and lengthy answer/response
to the question posed. Therefore there is no single answer key that applies uniformly to all
responses, and the interpretation of responses is necessary.
Factors which undermine the
teacher's ability to evaluate essays fairly and reliably are:
a. Halo effect, i.e. irrelevant factors
can attract the attention of the marker,
making an essay appear better
than it really is. Such factors include:
i. Handwriting
ii. Style of writing, such as sentence
structure
iii. Spelling and grammar
iv. Neatness
b. Identity of the student
c. Location of one's paper in the pile of
test papers
d. Teacher's dislike for a student
e. Teacher's mood
How to minimize biases
a. Develop a scoring guide (rubric) or
marking scheme. A scoring guide lists the
key components of the essay that will be graded, together with a short description
that defines each level of
performance and the number of points that level will receive.
E.g. accuracy of the content; language/vocabulary; sources/citations; spelling/
grammar; organization of the essay.
b. Identify students by
number when scoring essay responses.
c. Score students on the basis of present
performance, not on their ability, interests or past performance.
d. Inform the students of the demands of
essay questions, such as good handwriting, proper punctuation, spelling,
accuracy and organization of the essay.
e. Score the first essay for all students
before moving to the next essay in order to be consistent and to do justice to
students when scoring.
f. Decide in advance how you are going to handle factors that are not relevant to the
learning outcome being measured. Such factors include spelling, handwriting,
sentence structure, punctuation and neatness.
g. Re-read each essay answer a second time after
scoring so as to check objectivity.
Approaches to scoring essays
There are two approaches to scoring essays:
a. Holistic scoring, which provides a single
overall score/grade for the complete essay. A holistic score is useful when an
overall impression of student achievement is needed.
b. Analytical scoring, which provides a
separate score for each component of the essay, e.g. a score for accuracy,
organization, supporting arguments, grammar and spelling. Analytical scoring
provides students with detailed feedback that can help them improve
different aspects of their essays. It is useful when determining the strengths
and weaknesses in a student's work or when assessing multiple objectives that are
integrated in the essay.
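The difference between the two approaches can be illustrated with a small scoring sketch. Everything specific in it is an assumption made for illustration: the rubric components are taken from the examples above, but the maximum points per component, the awarded points and the function name analytic_score are hypothetical.

```python
# Hypothetical analytic rubric: component -> maximum points (assumed values).
RUBRIC = {
    "accuracy": 10,
    "organization": 5,
    "supporting arguments": 5,
    "grammar and spelling": 5,
}


def analytic_score(awarded, rubric=RUBRIC):
    """Check the points awarded per component and return (total, maximum).

    The per-component points are the detailed feedback of analytical scoring;
    the total alone corresponds to a single overall score.
    """
    missing = set(rubric) - set(awarded)
    if missing:
        raise ValueError(f"no score given for: {sorted(missing)}")
    for component, points in awarded.items():
        if component not in rubric:
            raise KeyError(f"unknown rubric component: {component}")
        if not 0 <= points <= rubric[component]:
            raise ValueError(f"{component}: {points} is outside 0..{rubric[component]}")
    return sum(awarded.values()), sum(rubric.values())


# Hypothetical marks for one student's essay.
marks = {"accuracy": 8, "organization": 4,
         "supporting arguments": 3, "grammar and spelling": 5}
total, maximum = analytic_score(marks)
print(f"Total: {total}/{maximum}")
for component, points in marks.items():
    print(f"  {component}: {points}/{RUBRIC[component]}")
```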
ITEM ANALYSIS
Item analysis refers to the process of judging
the quality of selected-response
test items. It is a set of procedures
designed to evaluate the quality of test items used for assessment. Item
analysis is done after a test has been administered and scored so as to determine
whether:
· each item in the test functioned as it
was intended
· the item was capable of discriminating
between the best and the weak students in terms of achievement
· the item was able to measure the effect
of the teaching and learning process
· the item was of appropriate difficulty.
Individual assessment items
have their own characteristics, namely:
i. Item difficulty (how hard a test item
is)
ii. Item discrimination (how frequently an item is answered
correctly by those who perform well on the total test). Item discrimination
reflects the relationship between students' responses on the total test and
their responses to a particular test item.
Item Difficulty
Item difficulty is the
ratio or percentage of individuals who answered an item correctly.
\[
\text{Item difficulty index} = \frac{\text{number of correct answers}}{\text{total number of students who answered the item}}
\]
The easier the item, the larger the item difficulty index. If item 1 is answered
correctly by 15 out of 20 students, then the item difficulty index is 15/20, which
is 0.75 or 75%.
Item difficulty is used as a measure of how hard an item is for all students, those
who performed well overall and those who performed poorly. A good assessment is
one that balances the difficulty of items to provide information about a range
of student abilities and performance.
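The item difficulty calculation can be written as a one-line function. This is a minimal sketch of the formula above; the function name item_difficulty is an assumption introduced here, and the numbers are those of the 15-out-of-20 example.

```python
def item_difficulty(correct, answered):
    """Return the proportion of students who answered the item correctly."""
    if answered <= 0:
        raise ValueError("at least one student must have answered the item")
    if not 0 <= correct <= answered:
        raise ValueError("correct answers must be between 0 and the number answered")
    return correct / answered


# Item 1 from the text: 15 of 20 students answered correctly.
p = item_difficulty(15, 20)
print(f"Item difficulty index: {p:.2f} ({p:.0%})")
```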
Item discrimination
This is the degree to which an item differentiates those who have a higher level of
achievement from those who have a lower level of achievement. The discriminating
power of an item is a measure of the ability of the item to distinguish between
those students who performed well overall on a test and those who did not.
Procedures for analyzing test items
i. Identify the
three groups of students in the classroom: the higher, middle and lower
performing students.
ii. Rank order all of the test papers from the highest score
to the lowest score.
iii. Select about 25% of the papers from the top and
25% of the papers from the bottom.
iv. Put aside the middle papers as they will not
be used for analysis.
v. For each test item, tabulate the number of students in the upper and lower groups who
selected each alternative.
vi. Compute the difficulty index of each item for the upper and lower groups using the
following formula:
\[
\text{Item difficulty index} = \frac{\text{number of correct answers}}{\text{total number of students who answered the item}}
\]
Example. Item number 1
Tanganyika attained its independence in
| Option | High group | Low group |
|---|---|---|
| a. 1961 | 10 | 6 |
| b. 1962 | 8 | 4 |
| c. 1965 | 0 | 0 |
| d. 1967 | 2 | 10 |
| Total | 20 | 20 |
Option (a) is the correct answer to item number 1 (question 1).
Calculate the item difficulty index of each item for the high and low groups.
High group:
\[
\text{Item difficulty} = \frac{\text{No. of students who answered the item correctly}}{\text{No. of students in that group}} = \frac{10}{20} = 0.5 \text{ or } 50\%
\]
Low group:
\[
\text{Item difficulty} = \frac{\text{No. of students who answered the item correctly}}{\text{No. of students in that group}} = \frac{6}{20} = 0.3 \text{ or } 30\%
\]
Item discrimination
Item discrimination is the item difficulty for the high
group minus the item difficulty for the low group.
\[
\text{Item discrimination index} = 0.5 - 0.3 = 0.2
\]
Item discrimination values range from -1.00 to +1.00.
An item can be a positive discriminator, a negative discriminator or a non-discriminator.
A positive discriminator
is an item that is answered correctly by more of the students who did well on
the test than of those who performed poorly. The more positive the
discriminator, the better the item is functioning in differentiating among the
varying levels of achievement. Such an item is said to be a precise, useful and
effective test item.
A negative discriminator is
an item that is answered correctly by more of the poorly performing students
than of those who did well overall. Such an item is undesirable.
A non-discriminator is
an item which does not differentiate between the higher performing and the lower
performing students.
The purpose of item discrimination is to compare the response rate of the
high-performing students with that of the low-performing students on individual items.
vii. Evaluate the effectiveness of the
distracters in each item (the effectiveness of the incorrect alternatives). This is
achieved by inspecting the number of students in the upper and lower groups who
selected the distracter being evaluated.
For example, the result
for item number 1 of a test was as follows:
Example. Item number 1
Tanganyika attained its independence in
| Option | High group | Low group |
|---|---|---|
| a. 1961 | 10 | 6 |
| b. 1962 | 8 | 4 |
| c. 1965 | 0 | 0 |
| d. 1967 | 2 | 10 |
| Total | 20 | 20 |
Option (a) is the correct answer to item number 1 (question 1).
Interpretations
a. Option A is a good option and it
functions as intended because it attracted more students from the upper group.
b. Distracter B is a poor distracter
because it attracted more students from the upper group than from the
lower group.
c. Distracter C is ineffective because it
attracted no students.
d. Distracter D is a good distracter
because it functions as intended by attracting more students from the lower
group than from the upper group.
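The whole analysis of item number 1 can be reproduced in a few lines. The sketch below is illustrative: the response counts are those of the Tanganyika example, while the names HIGH, LOW, KEY and analyze_item, and the wording of the verdict strings, are assumptions introduced here. The distracter rule follows the interpretations above: a distracter is good if it attracts more low-group than high-group students, ineffective if it attracts nobody, and poor otherwise.

```python
# Number of students in each group who chose each option (from the example).
HIGH = {"a": 10, "b": 8, "c": 0, "d": 2}
LOW = {"a": 6, "b": 4, "c": 0, "d": 10}
KEY = "a"  # the correct option


def analyze_item(high, low, key):
    """Return group difficulties, the discrimination index and distracter verdicts."""
    p_high = high[key] / sum(high.values())
    p_low = low[key] / sum(low.values())
    discrimination = p_high - p_low

    verdicts = {}
    for option in high:
        if option == key:
            continue
        if high[option] + low[option] == 0:
            verdicts[option] = "ineffective"  # attracted no students
        elif low[option] > high[option]:
            verdicts[option] = "good"         # attracts mainly the low group
        else:
            verdicts[option] = "poor"         # attracts the high group as much or more
    return p_high, p_low, discrimination, verdicts


p_high, p_low, discrimination, verdicts = analyze_item(HIGH, LOW, KEY)
print(f"High group difficulty: {p_high:.2f}")
print(f"Low group difficulty: {p_low:.2f}")
print(f"Discrimination index: {discrimination:.2f}")
for option, verdict in verdicts.items():
    print(f"Distracter {option}: {verdict}")
```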
Effects of item analysis
i. Provides a basis for efficient discussion of test results
ii. Provides a basis for improving classroom
instruction by revising the parts of the curriculum that seemed to be difficult
iii. Provides a basis for improving skills in
test construction
iv. Provides a basis for carrying out
remedial teaching in areas that are difficult
Grading
Grading is the process of holistically evaluating students' performance and
assigning evaluative symbols to represent what learners know and can do, or may
not know or be able to do, as evidenced by various assessments (Airasian,
2008). Grades represent teachers' summary judgment about how well students have
mastered the content and processes taught in the subject area during a
particular term or grading period. Grades are based on two dimensions:
i. Analysis of assessment data such as quizzes, homework, tests, assignments
and others.
ii. Interpretation and communication of grades. Having gathered data from
students' assessments, the teacher needs to make a judgment about the meaning
of these data. The interpretation should be based on a set of criteria the
school has established. Thomas Guskey and Jane Bailey (2003) identified three
types of learning criteria used in grading and reporting. These are:
a) Product criteria. This is grading which is based on the final examination
report (summative evaluation).
b) Process criteria. This is grading and reporting which is based on the course
work and the final examination.
c) Progress criteria. This is grading which deals with how much students have
gained from their learning experiences (e.g. oral comprehensive examination).
Why do we grade?
The purpose of grading includes:
i) To communicate students' academic achievement to students, parents and
others. However, grades become distorted when non-academic factors such as
attendance, effort, attitudes, class participation, group work, class
discussion or behavior are included.
ii) Administratively, grades are used:
a) to determine the students' ranks in class,
b) to give credit for graduation,
c) to determine suitability for promotion, graduation or employment.
iii) They are used to determine the strengths and weaknesses of the different
teaching approaches used by teachers.
iv) They are used to motivate students and parents to improve students'
efforts.
v) They are used for guidance. They help teachers, students and counselors to
choose appropriate courses and course levels.
vi) They help teachers to identify students who are in need of special
services.
vii) They are used to sort out the best students from the rest.
How Do We Grade?
There are different forms of grading, namely:
i) Letter grades, e.g. A, B, C, D, E, F
ii) Standards-based achievement categories such as excellent, good, fair, poor
iii) Percentage or numerical grades such as 100%, 90% or 100, 80, 70
iv) Pass/fail system
v) Point system (tracking grades by adding the points received during the term,
e.g. quiz = 6/10; group work = 8/10, etc.)
vi) Teachers' written comments
Classroom grading is based on the teacher's judgment. The teacher's judgment is
based on:
i) information about the performance being judged (test scores, book reports,
performance assessments);
ii) a basis of comparison that can be used to translate that information into
grading judgments (e.g. what level of performance is worth A, B, C, D, etc.).
Approaches to Comparison for Grading
A grade is a judgment about the quality of a student's performance. Several
bases of comparison can be used to assign grades to students. The most commonly
used classroom grading approaches compare a student's performance to:
i.
The performance of other students
ii.
Predefined standards of good or poor
performance
iii.
Student’s own ability
Comparing student's performance with other students (Norm referenced grading).
It refers to the process of assigning grades by comparing the performance of
one student with the performance of other students. For example, when a teacher
says that Helen has performed better than the rest of the students in the
class, he/she is making a norm referenced judgment.
Comparing student's performance with pre-established performance standards
(Criterion referenced grading).
The performance standards define the level or score that a student must attain
so as to receive a particular grade. All students who reach a given level get
the same grade regardless of how many students reach that level. For example,
suppose students' assessment contains two parts, the course work and the final
examination, and passing the course depends on getting 50 percent of the total
marks. Thus 50% is the performance standard. Passing or failing will depend on
how your score compares with the performance standard of 50 percent.
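The contrast between the two approaches can be made concrete with a small Python sketch. The scores and the names other than Helen are hypothetical, and the sketch assumes that exactly 50 percent counts as a pass.

```python
def criterion_referenced(total_percent, pass_mark=50):
    """Compare one student's total with a fixed performance standard."""
    return "pass" if total_percent >= pass_mark else "fail"


def norm_referenced_ranks(scores):
    """Rank students against each other (1 = best); equal scores share a rank."""
    ordered = sorted(set(scores.values()), reverse=True)
    return {name: ordered.index(score) + 1 for name, score in scores.items()}


# Hypothetical totals (course work + final examination), in percent.
scores = {"Helen": 48, "Juma": 45, "Asha": 40}

ranks = norm_referenced_ranks(scores)
results = {name: criterion_referenced(score) for name, score in scores.items()}

print(ranks)
print(results)
```

In this made-up class Helen is the best student under norm referenced grading, yet she still fails under the criterion referenced standard of 50 percent, which shows that the two approaches answer different questions.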
Comparison with student's own ability (Ability based grading approach)
It involves comparing students' actual performance with the performance
expected of them based on the teacher's judgment of their ability. The terms
overachiever and underachiever describe students who do better or worse than
the teacher's expectations of what they should be doing. Many teachers assign
grades to students by comparing a student's actual performance with their
perception of the student's ability.
Disadvantages of the perception based grading system are:
a) The approach depends on the teacher having an accurate perception of each
student's ability. In reality teachers do not know each student's true ability.
b) Teachers have a difficult time differentiating a student's ability from
other characteristics such as self-assurance, motivation or responsiveness.
Several studies have revealed multiple abilities that help students learn and
perform in different modalities such as visual, oral, written, etc. Hence,
which one should a teacher focus on to judge a student's ability?
c) The perception based approach confuses parents and outsiders. For example, a
high ability student might attain 80% mastery of instruction and receive a C
grade if perceived to be underachieving, while a low ability student who
attains 60% mastery might receive an A grade for exceeding expectations. An
outsider might think that the low ability student has mastered more of the
course because he/she got the higher grade.
Grades are regarded as a prize that you receive when you study hard or a
punishment you get when you do not work hard. Some negative effects of grading
are:
i. Students getting low grades may lose their self-esteem
ii. Failure to graduate if you receive low marks
iii. Being held back (repeating a class) if you get a low grade
iv. School dropout
SUMMARIZING TEST RESULTS
Summarizing involves the synthesis of assessment information into a single
grade. The steps involved in summarization are:
i. Combine information from various assessments into a single grade
ii. Express each type of assessment information in terms of the same scale so
that all information can be combined into a composite score
iii. Compute the overall score by:
a) giving each kind of assessment the weight it deserves,
b) summing the scores,
c) dividing the total score by the number of assessments.
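The steps above can be sketched as a short Python calculation. The quiz and group work marks are the ones used earlier as a point-system example; the final examination mark and all the weights are hypothetical. The sketch divides by the sum of the weights, which is the same as dividing by the number of assessments when every weight is 1.

```python
def to_percent(raw, maximum):
    """Step ii: express a raw mark on a common 0-100 scale."""
    return 100 * raw / maximum


def composite_score(assessments):
    """Step iii: weight each assessment, sum, and divide by the total weight.

    assessments is a list of (raw_mark, maximum_mark, weight) tuples.
    """
    total_weight = sum(weight for _, _, weight in assessments)
    weighted_sum = sum(
        to_percent(raw, maximum) * weight for raw, maximum, weight in assessments
    )
    return weighted_sum / total_weight


assessments = [
    (6, 10, 1),    # quiz = 6/10
    (8, 10, 1),    # group work = 8/10
    (60, 100, 2),  # final examination (hypothetical mark, counted twice)
]

overall = composite_score(assessments)
print(overall)  # 65.0
```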
INTERPRETATION OF TEST RESULTS
Once you have scored and graded students' tasks you need to interpret the
scores in order to get meaning from them. Scores on an assessment tell only
part of the story. To be meaningful the scores must be interpreted with respect
to other variables such as:
i. The scores of other students
ii. The student's prior performance on similar assessments
iii. The contents of the items answered correctly.
Statistics provide a picture of individual and group performance as well as the
effectiveness of the instructional method. This is because statistics help us
to know the typical/average student's performance on an assessment, the overall
performance and the spread of scores, i.e. from the lowest to the highest score.
Ways of Showing the Distribution of Scores
The distribution of scores shows the pattern or organization of the data so
that meaning can be drawn from it. The distribution of scores can be shown
through:
i. Frequency table. It is developed by arranging scores from the lowest to the
highest and then tallying the number of times each score occurred. From the
table one can:
a. Compare the performance of an individual against the others
b. See the distribution of scores, i.e. the highest, the middle and the lowest
scores
c. See which students performed poorly and which performed well.
ii. Histogram. This is a pictorial representation of data in the form of a bar
graph. It is used to display a frequency distribution. It has two axes: the
X-axis (horizontal line), which displays the scores, and the Y-axis (vertical
line), which displays the frequency of each score.
iii. Frequency polygon. It is a line graph which, like the histogram, displays
the frequency of each score.
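A frequency table and a rough text histogram can be produced with a few lines of Python. This is a minimal sketch: the scores are hypothetical and the function name is chosen only for illustration.

```python
from collections import Counter


def frequency_table(scores):
    """Arrange scores from lowest to highest and tally each one."""
    counts = Counter(scores)
    return [(score, counts[score]) for score in sorted(counts)]


# Hypothetical test scores for a small class.
scores = [18, 15, 12, 18, 20, 15, 18]
table = frequency_table(scores)

# One row per score; the row of stars plays the role of a histogram bar.
for score, frequency in table:
    print(f"{score:>3} | {'*' * frequency} ({frequency})")
```

Joining the tops of the bars with a line would give the corresponding frequency polygon.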
Measures of central tendency
A measure of central tendency is a numerical summary of a set of scores. There
are three measures of central tendency: mean, median and mode. Each of these is
a different way of summarizing scores into a single number.
i. The mean ($\bar{X}$). It is the arithmetic average of a set of scores. It is
calculated by summing up the individual scores and dividing by the total number
of scores. The formula is:
\[ \bar{X} = \frac{\sum X}{N} \]
where $\sum X$ is the sum of the individual scores and $N$ is the number of
students or scores.
The mean uses all scores in the set of data. Every assessment score is used to
calculate the mean, including those of students who did extremely well and
those who did extremely poorly. Scores that are quite different from the
majority (either higher or lower) are called outliers. Outliers can distort the
mean by pulling it lower or higher than what might be the typical or average
performance on the test. A skewed distribution that is pulled lower by outliers
is called a negatively skewed distribution. A distribution that is pulled
higher by outliers is a positively skewed distribution.
What is the importance of knowing the shape of the distribution? It helps the
teacher to know whether or not students have been growing from the entry point
(a test done before instruction) to the end of instruction.
Advantages of the mean
a. It takes all of the scores into account. None of the scores is left out
b. It is simple to calculate
Disadvantages of the mean
a. It is affected by extreme values or outliers. Outliers tend to pull the mean
lower or higher than the typical score; when there is no outlier the mean is
more representative.
b. The mean may be a number that does not actually appear in the data set
ii. The median. This is the middle score in a set of scores. It is calculated
by following these steps:
a. Arrange the scores from the lowest to the highest
b. Determine the middle score/s
c. If there is an odd number of scores the median is the middle score, and if
there is an even number of scores the median is obtained by adding the two
middle scores and dividing by two. The median is best used when you are
concerned that outliers might be affecting the mean, making it less
representative of a group of scores.
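The median steps, and the way a single outlier moves the mean but not the median, can be sketched in Python. The score lists are hypothetical, and `median_by_steps` is an illustrative name for a direct translation of steps a to c.

```python
import statistics


def median_by_steps(scores):
    """Follow steps a-c: sort, find the middle, average two middles if even."""
    if not scores:
        raise ValueError("at least one score is needed")
    ordered = sorted(scores)              # step a
    middle = len(ordered) // 2            # step b
    if len(ordered) % 2 == 1:             # step c, odd number of scores
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2  # even number of scores


typical = [48, 50, 52, 54, 56]       # hypothetical scores, no outlier
with_outlier = [5, 50, 52, 54, 56]   # same class with one very low score

print(statistics.mean(typical), median_by_steps(typical))
print(statistics.mean(with_outlier), median_by_steps(with_outlier))
print(median_by_steps([40, 50, 60, 70]))
```

The outlier pulls the mean down from 52 to 43.4, giving a negatively skewed picture, while the median stays at 52.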
Advantages of the median
a. It is not affected by outliers
b. It is easy to compute and comprehend
c. It is useful when comparing sets of data
Disadvantages of the median
a. Sometimes the median is a number that is not actually present in the data
set
b. It takes a lot of time to sort the scores from the smallest to the highest
c. It does not take into account all the data in a data set, i.e. it does not
use all the information available
iii. The mode. The mode is the most frequently occurring score in a set of
scores (Popham, 2008; Musial et al, 2009). In a set of scores there can be two
most frequently occurring numbers, and we call this a bimodal distribution. In
case there are more than two modes we call it a multimodal distribution.
Example 1. Scores: 35, 56, 73, 67, 43, 62, 70, 39, 45, 51, 56, 61, 56, 71, 82,
80, 66, 58, 64, 54.
The mode is 56 (the most frequently occurring number in the set of scores).
Example 2. Scores: 35, 56, 73, 67, 43, 62, 61, 39, 45, 51, 56, 61, 56, 71, 82,
80, 61, 58, 64, 54.
The modes are 56 and 61 (bimodal distribution).
Example 3. Scores: 35, 56, 73, 67, 43, 62, 61, 39, 45, 51, 56, 61, 56, 73, 82,
80, 61, 58, 73, 54.
The modes are 56, 61 and 73 (multimodal distribution).
In a set of scores where no number occurs more often than the others there is
no mode.
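The three examples can be checked with a short Python sketch. The function name `modes` is illustrative, and the score lists are the three example sets read with a comma between every score. Following the rule stated above, the sketch reports no mode when no score is repeated.

```python
from collections import Counter


def modes(scores):
    """Return every most frequent score, or an empty list if nothing repeats."""
    if not scores:
        return []
    counts = Counter(scores)
    highest = max(counts.values())
    if highest == 1:
        return []  # no score occurs more often than the others
    return sorted(score for score, count in counts.items() if count == highest)


example1 = [35, 56, 73, 67, 43, 62, 70, 39, 45, 51,
            56, 61, 56, 71, 82, 80, 66, 58, 64, 54]
example2 = [35, 56, 73, 67, 43, 62, 61, 39, 45, 51,
            56, 61, 56, 71, 82, 80, 61, 58, 64, 54]
example3 = [35, 56, 73, 67, 43, 62, 61, 39, 45, 51,
            56, 61, 56, 73, 82, 80, 61, 58, 73, 54]

print(modes(example1))  # [56]
print(modes(example2))  # [56, 61]
print(modes(example3))  # [56, 61, 73]
```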
Advantages of the mode
a. It is simple to determine
b. It is not affected by extremely large or small values
c. It is useful for qualitative data
Disadvantages of the mode
a. It focuses only on the most frequent number in a data set, leaving out the
other scores
b.
Measures of Variability
Measures of variability tell us about:
i. The variability of student learning and the overall effectiveness of
instruction
ii. The consistency of student performance
iii. Whether the scores are spread out or bunched together
The measures of variability include: range, standard deviation and variance.
Range
Range is the difference between the highest score and the lowest score. It
indicates how consistent or diverse a set of scores is (Musial et al, 2009).
How to calculate: Range = Highest score - Lowest score
Example
Scores: 56, 67, 63, 38, 62, 66, 45, 51, 53, 43, 52, 44, 77, 58, 69.
The highest score = 77; the lowest score = 38
Range = 77 - 38 = 39
Advantage: it is easy and quick to estimate.
Disadvantage: it is greatly influenced by outliers, i.e. extremely high or low
scores.
Standard Deviation
Standard deviation (SD) is a measure of the average distance of each individual
score from the mean. It indicates how spread out the scores are around the
mean. If the standard deviation is relatively small compared to the mean then
the scores are more homogeneous, that is, they are grouped together. This means
that on average individual scores do not deviate much from the mean. When the
standard deviation is large we say individual scores are heterogeneous (they
are spread out), meaning that on average the individual scores deviate quite a
bit from the mean.
SD tells how spread out or clustered a set of scores is around the mean. This
helps teachers to see how variable student performance is in a classroom.
How to calculate the SD
\[ SD = \sqrt{\frac{\sum (X - \bar{X})^2}{N}} \]
where $(X - \bar{X})^2$ represents each individual score minus the mean,
squared, and $N$ = the number of scores.
EXAMPLE
Scores: 20, 20, 25, 25, 30, 30
i. Calculate the mean, then the deviation and the squared deviation of each
score:
Individual score (X) | Mean | Deviation (X - mean) | Squared deviation
20                   | 25   | 20 - 25 = -5         | 25
20                   | 25   | 20 - 25 = -5         | 25
25                   | 25   | 25 - 25 = 0          | 0
25                   | 25   | 25 - 25 = 0          | 0
30                   | 30 - 5 = 25 | 30 - 25 = 5   | 25
30                   | 25   | 30 - 25 = 5          | 25
Sum of scores = 20 + 20 + 25 + 25 + 30 + 30 = 150; N = 6
Mean = 150 / 6 = 25
Sum of squared deviations = 25 + 25 + 0 + 0 + 25 + 25 = 100; N = 6
Variance = sum of the squared deviations / N = 100 / 6 = 16.7
SD = square root of the variance = square root of 16.7 = 4.1
The average distance of each score from the mean of 25 is 4.1; this means that
on average the scores are approximately 4 points above or below the mean. SD is
an indicator of how spread out the scores are from the mean. The larger the SD,
the more spread out the scores.
SD uses all the scores in a set, thus it is likely to be representative of the
spread of scores. It can also be used as a unit of measurement: it can tell
which student scored two SDs higher or lower than the mean.
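The range and standard deviation examples can be reproduced with a small Python sketch. The function names are illustrative. The sketch divides by N, as the worked example does; some textbooks divide by N - 1 when the scores are a sample, which gives a slightly larger SD.

```python
import math


def score_range(scores):
    """Highest score minus lowest score."""
    return max(scores) - min(scores)


def variance(scores):
    """Average squared deviation from the mean (divides by N)."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores) / len(scores)


def standard_deviation(scores):
    """Square root of the variance."""
    return math.sqrt(variance(scores))


def sd_units_from_mean(score, scores):
    """How many SDs a score lies above (+) or below (-) the mean."""
    mean = sum(scores) / len(scores)
    return (score - mean) / standard_deviation(scores)


range_example = [56, 67, 63, 38, 62, 66, 45, 51, 53, 43, 52, 44, 77, 58, 69]
sd_example = [20, 20, 25, 25, 30, 30]

print(score_range(range_example))                    # 39
print(round(variance(sd_example), 1))                # 16.7
print(round(standard_deviation(sd_example), 1))      # 4.1
print(round(sd_units_from_mean(30, sd_example), 2))  # 1.22
```

The last line uses the SD as a unit of measurement: a score of 30 lies about 1.2 SDs above the mean of 25.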
Conclusion
1. A teacher can display test scores in a meaningful way by using a frequency
table, a histogram or a frequency polygon.
2. Measures of central tendency such as the mean, mode and median can be used
to extract meaning from test scores.
3. A teacher can determine the variability among scores by calculating the
range and the standard deviation.
Generally, measures of central tendency and variability can be used to judge
whether students met the learning objectives and how effective instruction was.
ASSESSMENT OF NON-COGNITIVE OUTCOMES AND INTELLIGENCE QUOTIENT (IQ)
Classroom observation techniques
Teaching is driven by what we observe in the classroom. Observation is the
process of gaining information by watching and listening to students. It can be
used to evaluate students' knowledge, skills, dispositions and behavior.
Through simple observation the teacher knows whether or not students follow
directions to complete an assigned task.
What do teachers actually need to observe?
Teachers observe both appropriate behaviors, so as to increase them, and
inappropriate behaviors, in order to decrease them.
Teachers' classroom observations are based on:
Academic skills such as reading, mathematics, science, social studies and
language. Academic skills are assessed and stated in observable and measurable
terms.
Psychomotor skills such as physical movement in various sports, dance,
performing arts, singing, playing and so on.
Prosocial skills, which involve attitudes, feelings, beliefs and dispositions.
Approaches to Classroom Observation
Observation tools such as anecdotal notes, observation checklists and rating
scales are widely used for observing and assessing students' learning.
i. Anecdotal notes or records
This is a technique which is used to document observations of significant
skills or behavior of students. It records factual descriptions of incidents
that the teacher has observed personally.
It provides a purposeful and detailed description of the strengths and
weaknesses of a student's performance based on pre-specified performance
criteria such as a student's ability to: transition to a new activity, follow
instructions, focus on the task at hand.
The records can be:
i. Anecdotal notes, which consist of the date, name of student, setting and
incident/s or what happened.
Anecdotal Notes
Student name: Grace Luis
Date: 2/1/2015
Setting: Group poster project
What happened/incident: Today during the group project, Grace complained about
the marker colors she was given. I reminded her of the rule but she grabbed a
marker and scribbled on the poster, ruining it.
Anecdotal A, B, C records
Date/time | Context/activity | Antecedent | Behavior | Consequences | Student reaction
2/1/2015, 12.00 noon | Students were working on a group poster project | Grace was the material manager for Grayson's group. She gave him 4 light colored markers | Grayson said these colors stink. He then grabbed a black marker from Grace and scribbled all over the poster | I stated the rule and punished him by taking him out of class for some time | Grayson returned to the classroom and joined the workgroup
NOTE:
Context = the setting
Antecedent = what happened before the behavior
Behavior = what the behavior looks like
Consequences = what happened after the challenging behavior
The usefulness of anecdotal notes/records
They are useful when writing report card comments and in parent or student
conferences.
They are useful if an intervention such as acceleration or remedial teaching is
needed for the students.
Observation Checklist
It is a list of behaviors that is used to assess a student's skills such as
academic, psychomotor and prosocial skills. The teacher observes the skills and
marks them as present or absent, correct or incorrect. Each skill should be
written in such a way that it is observable and measurable.
Observation checklist for school facilities (put a tick where appropriate)
School Name
District
Date
Rating scales
A rating scale is a form of checklist which consists of a list of qualities
that are judged according to a scale that indicates the degree to which each
quality is present. Each characteristic can be observed according to some
underlying degrees of accomplishment.
Descriptive Rating Scale
It is a rating scale which is based on a series of adjectives or thumbnail
sketches. It allows the teacher to rate the appropriateness or
inappropriateness of a student's behavior on the scale. In constructing a
descriptive rating scale:
1. Specify the observable behaviors that are important in your case.
2. Write adjectives that describe the points on the scale. The best way to
develop adjectives is to determine the best and the worst likely performances
and then choose the in-between levels to create the full scale.
Example of a rating scale designed for a student working math problems
individually.
Student………………………Date……………………..Assignment:
Math skills
Numerical rating scales
It is a rating scale which associates numbers with descriptions along the
scale. The higher the number the greater the accomplishment, and the lower the
number the lower the accomplishment. It is used when summarizing observations
across some period of time. The number of points within a rating scale could be
based on the number of times a particular behavior has been noted. This kind of
rating could look as follows:
1. Never = behavior is not observed
2. Occasionally = behavior has been performed but repeated instances of
nonperformance are observed
3. Usually = behavior is performed but a small number of instances of
nonperformance are observed
4. Always = behavior is consistently and regularly performed
EXAMPLE. Rating scale for a group project
Group work rating scale
Project……………………………………………..
Rating scale
1 = seldom or never
2 = some/only part of the time
3 = usually
4 = always
Rating scale for an individual project
Observed student……………………………………
Date…………………………………………
Activity……………………………………………
Advantages of assessing through observation
1. It allows the teacher to assess and monitor progress and behavioral skills
as part of normal teaching.
2. It allows the teacher to discover unique information, such as skills and
problems, that would be difficult to discover through other means.
3. The observation method permits the teacher to adapt other assessment methods
so as to meet the needs of students.
4. Information gathered through observation can be used together with formal
methods such as paper and pencil tests to assess students.
Disadvantages
1. Errors can occur when judgment is based on a single observation.
2. It is time consuming to obtain information through observation.
3. If the teacher is not focused on the specific skill he/she will end up
observing unrelated behavior.
Peer
Appraisal and Self Assessment
Peer
appraisal
This is one of the ways in which
students internalize the characteristics of quality work by evaluating the work of their peers. However, in order to offer
helpful feedback, students must be given instructions of what they are to look
for in their peers' work. The instructor must explain expectations clearly to
them before they begin. One way to make sure students understand this type of
evaluation is to give students a practice session with it. The instructor
provides a sample writing or speaking assignment. As a group, students
determine what should be assessed and how criteria for successful completion of
the communication task should be defined. Then the instructor gives students a
sample completed assignment. Students assess this using the criteria they have
developed, and determine how to convey feedback clearly to the fictitious
student. Students can also benefit from using rubrics or checklists to guide
their assessments. At first these can be provided by the instructor; once the
students have more experience, they can develop them themselves. The checklist
asks the peer evaluator to comment primarily on the content and organization of
the essay.
For peer evaluation to work
effectively, the learning environment in the classroom must be supportive.
Students must feel comfortable and trust one another in order to provide honest
and constructive feedback. Instructors who use group work and peer assessment
frequently can help students develop trust by forming them into small groups
early in the semester and having them work in the same groups throughout the
term. This allows them to become more comfortable with each other and leads to
better peer feedback.