Sunday, March 22, 2015

@ONE Designing Effective Online Assessments Course Week 4

The topic for this week was assessment plans: how to modify assessments, as needed, for the online format.


When in doubt, go back to the logic of assessment and review the following questions:

  • What do you want to measure?
  • Why do you want to measure it?
  • How will you measure it?
  • When and under what circumstances will you measure it?
  • What did you find?
  • What does it mean?
  • What’s next (action)?


“Timing” includes three components:

  • How long assessments will take in the online environment. For discussions, this means sequencing to allow for initial postings and follow-up responses; it may also include what other assignments students have, the duration of the course, and whether weekends and holidays fall within the time the activity is being done.
  • The window within which the assessment can be completed
  • The deliverable structure you create for students to turn in assessments

Care must be taken in deciding how much time to allocate to assessments in the online environment. This applies to a few different scenarios:

Assessments that take place at a set time: Using tools available in your CMS, you can allot a set time for completion of the task. In this case, the factors facing classroom students and online students are the same; therefore, the time allocated in the classroom will likely work online as well.

Group work: Factors such as delayed responses, varying schedules, communication challenges (e.g., the inability to read body language), and technical limitations must all be taken into consideration (see the end of the article How to Design Effective Online Group Work Activities for a long list of factors to take into account).

Effective online group activities often fall into one of three categories:

  • There's no right answer: debates or research on controversial issues.
  • There are multiple perspectives: analyzing current events, cultural comparisons, or case studies.
  • There are too many resources for one person to evaluate, so a jigsaw approach is needed, with each student responsible for one part.

Online group work checklist

  • Students understand the value of both the process and product of the collaboration.
  • Students have guidance concerning how to work in an asynchronous team.
  • Group size is small enough to allow for full participation of all members.
  • Course provides numerous opportunities for community building prior to group projects.
  • Assignment is an authentic measure of student learning.
  • Assignment will benefit from collaborative work.
  • Students have clear guidelines on the expected outcome of the collaborative assignment.
  • Assignment creates a structure of positive interdependence in which individuals perceive that they will succeed when the group succeeds.
  • Assignment is scheduled to allow adequate time for preparation and communication.
  • Assignment is designed to allow students a level of personal control.
  • Students are provided with tools and instructions to facilitate online communication.
  • Each group has a collaborative workspace within the online course.
  • Students have technology skills relevant for asynchronous communication.
  • Back-up procedures are in place to deal with technology failure.
  • Grading and/or evaluation strategies differentiate between the process and the product.
  • Strategies are in place to monitor interaction processes.
  • Clear grading rubrics are provided at the start of the assignment to guide student work.
  • Self and peer evaluations are included in the process to monitor individual involvement and accountability.

Completion Window

Factors that go into setting the completion window:

Consistency: Assessments of similar types should be given the same completion window. This helps create stability and routine for students, ultimately setting them up for success.


Academic integrity: If the assessment carries a large percentage of the grade, the likelihood of cheating increases. In these cases, consider a 24-hour window rather than a 48–72 hour one. Windows shorter than 24 hours are not recommended, given the asynchronous expectations of online students.


Deliverable Structure

This aspect of timing deals with when your assessments occur. While assessment is an ongoing activity that is performed throughout the course, there are deliverables to the assessments, and “structure” speaks to when those deliverables are due.

Unlike a classroom, where there is an inherent structure given by the days of the week the class meets, an online class has no such built-in rhythm. You can go a long way toward creating stability by thoughtfully designing a consistent deliverable structure. An example structure might be:

  • New modules/units open on Sunday at 11:59pm
  • Modules/units close on Sunday at 11:59pm
  • Weekly assignments are due on Friday at 11:59pm
  • Case studies open at the beginning of the module and close at the end of the module
  • Initial discussion board posts are due by Wednesday at 11:59pm
  • Discussion board responses (to other classmates) are due by Friday at 11:59pm
  • Quizzes open on Thursday at 11:59pm (and close with the module)
  • Exams open on Thursday at 11:59pm (and close with the module)

The dates above are just an example. When creating your structure, keep in mind when you will be available to your students and when IT staff will be available. A suggestion would be to open and close units on Wednesday, when it is likely that you will be available if a student needs support.

Each structure will look different based on your course, the assessments used, the weights of the various assessments, etc.; however, the goal is to help create a sense of structure and stability for your online students.


Once you have refined your strategy with regard to your SLOs (student learning outcomes), you need to examine the individual assessments to be sure they are functioning properly and effectively. Essentially, you need to answer the following questions for your assessments:

  • Do they measure what you intended?
  • Do they measure consistently?
  • Do they measure effectively?

So, how do you go about answering these questions? You use a combination of judgmental and empirical methods.

Peer or Expert Judgement

One way to address the issue of whether your assessment measures what it is supposed to measure is to have it reviewed by a panel of your peers or by a group of content experts. The panel can examine whether each item on an assessment, or each task in a performance assessment, is aligned with what you intend to measure.

With regard to SLOs, the panel can examine whether each item on an assessment or each task in a performance assessment is aligned with the intended SLO, and how well the result reflects competency in the SLO. They can also comment on the clarity of directions, procedures, and item stems and alternatives. In many colleges this approach is taken through departmental or division curriculum committees that have the responsibility for setting SLOs and appropriate assessment techniques.

In implementing such an approach, it is preferable that each peer or expert perform the analysis independently and then submit his or her judgments to be combined with those of the other members of the panel or group. Once the results are summarized, it is appropriate either to have a new group finalize the results by resolving any discrepancies, or to have the initial group discuss the results and arrive at a consensus.
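One simple way to summarize the panel's independent judgments before the consensus step is per-item agreement. A minimal sketch in Python; the item names and judgment labels are hypothetical:

```python
def percent_agreement(ratings):
    """For each item, the share of panelists giving the most common
    judgment. Low-agreement items are the ones to resolve in the
    follow-up discussion."""
    summary = {}
    for item, judgments in ratings.items():
        # Count of the modal (most frequent) judgment for this item
        top = max(judgments.count(j) for j in set(judgments))
        summary[item] = top / len(judgments)
    return summary

# Hypothetical alignment judgments from a four-person panel
summary = percent_agreement({
    "item_1": ["aligned", "aligned", "aligned", "aligned"],
    "item_2": ["aligned", "not aligned", "not aligned", "aligned"],
})
# item_2 splits the panel evenly and goes to the discussion step
```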

Ideally, the use of peer or expert panels is conducted prior to the actual administration of the assessments to students. It can also be done as part of a pilot test of the assessments.

Empirical Methods

Empirical methods use actual student data to examine whether the test and the items that comprise it are functioning properly. Looking first at the total assessment, the questions that you want to answer are:

  • Is there a ceiling effect?
  • Is there a floor effect?
  • Is there sufficient variability in student performance?
  • Are there any abnormalities in the shape of the score distribution (bimodal, etc.)?

Plotting the scores from the assessment and examining the shape of the distribution can easily answer these questions. A ceiling effect is indicated by a large proportion of scores at the very top of the possible score range; a floor effect is the opposite, with too many scores at the bottom. The amount of variability is simply the spread (how wide the score distribution is) over the total possible score range. Ideally this spread should be centered about the middle of the score range and extend across at least 2/3 of it. Finally, look for abnormalities such as bimodality (two humps) in the distribution; this may indicate that you have two subgroups of students performing very differently on the assessment.
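The distribution checks described above can be automated in a few lines. A minimal sketch; the scores, the 10% "top/bottom band," and the rounding are illustrative assumptions, not prescriptions:

```python
from statistics import mean, stdev

def distribution_check(scores, max_score, band=0.10):
    """Summarize a score distribution to flag ceiling/floor effects
    and insufficient spread. `band` is the fraction of the score
    range treated as the "top" or "bottom" of the distribution."""
    n = len(scores)
    return {
        # Large values here suggest a ceiling or floor effect
        "ceiling_pct": round(sum(1 for s in scores if s >= max_score * (1 - band)) / n, 2),
        "floor_pct": round(sum(1 for s in scores if s <= max_score * band) / n, 2),
        # Spread over the possible range; ideally at least 2/3
        "spread": round((max(scores) - min(scores)) / max_score, 2),
        "mean": round(mean(scores), 1),
        "stdev": round(stdev(scores), 1),
    }

# Hypothetical 20-point quiz in which most students score near the top:
# a high ceiling_pct and narrow spread point to a ceiling effect
result = distribution_check([20, 19, 20, 18, 20, 19, 17, 20, 20, 16],
                            max_score=20)
```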

In addition to examining the shape of the overall score distribution, you should also examine the results for each individual item, answering the same four questions at the item level. Calculate the proportion of students who answered each item correctly and the proportion who answered incorrectly; if an item has multiple alternatives, calculate the proportion selecting each one. Look for items with very high or very low proportions of correct answers: these items are too easy or too hard. Also look at the alternatives. Alternatives selected by no students are not serving as real distractors and need to be revised; alternatives selected by a high proportion of students may need revision themselves, or the related instruction may need to be revisited to see why students are choosing them.
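These item-level tallies are straightforward to compute. A minimal sketch, assuming multiple-choice responses exported as one answer string per student; the item key and data are hypothetical:

```python
from collections import Counter

def item_analysis(responses, correct):
    """Proportion answering correctly, plus the share choosing each
    distractor. Distractors chosen by no one are not functioning;
    very popular ones may signal a flawed item or an instruction gap."""
    n = len(responses)
    counts = Counter(responses)
    p_correct = counts.get(correct, 0) / n
    distractors = {alt: c / n for alt, c in counts.items() if alt != correct}
    return p_correct, distractors

# Hypothetical item keyed "B": half the class picks distractor "C",
# and no one picks "D" -- both patterns are worth a second look
p_correct, distractors = item_analysis(
    ["B", "C", "C", "B", "A", "C", "B", "C"], correct="B")
```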

To complement the empirical analysis of student scores and item performance, it is often useful to collect anecdotal student input on the assessment. This can be done through focus groups, interviews, or open-ended survey items. You want the students to identify any items that were confusing, whether the directions were clear, whether there was sufficient time, and whether any words caused them trouble.

You should use the insights derived from these analyses to drive improvements in your assessments and assessment process, and to identify places in the instruction that may need to be changed or modified. We will cover more about this in the next section.


Reliability and Validity (Link opens new window)

Sources of Invalidity

Validity is the characteristic of an assessment that means it is measuring what it is intended to measure. Good assessment design and construction helps ensure that the assessment has validity, and analyzing the data from the assessment and its items can also help establish it. However, there are four sources of invalidity to be aware of and guard against:

  • Cheating, Plagiarism and Authentication
  • Test Bias
  • Accessibility
  • Test Procedures and Conditions

While all four of these sources are issues in face-to-face testing, they become heightened in the online environment because of the lack of physical proximity to the student taking the assessment. Cheating and plagiarism are major concerns in the Internet era, and the abundance of online material and social networking sites compounds them. Knowing your students and their work is one of the best ways to guard against this issue. There are also services, such as TurnItIn, that review student products for plagiarism. The materials in the Student Materials section provide further information on how to address these issues.

Authentication

Authentication is about verifying that the students who show up in your course really are who they say they are. This problem also plagues traditional classrooms in the form of paid paper writers and even paid exam takers, and many will argue that large face-to-face lecture courses have similar problems; the online environment simply makes it even easier for people to assume identities not their own. As the long-time favorite cartoon of a dog at a computer puts it, "On the Internet, nobody knows you're a dog."

The issue of authentication is a major one for distance education. In fact, regulations from the federal government have recently forced online administrators to grapple seriously with this issue, and many new policies and procedures at the local college level address it. New commercial services are showing up as well, and @One has an excellent seminar series on this topic.

Given that this is an important topic that affects all aspects of a course and course design, we have already woven it in at the beginning of this course. Feel free to refer back to the materials in Week 1 for more on this topic.


Test Bias

Test bias means that the test, or items on it, performs differentially for identifiable subgroups of students. The most common forms of test bias are situations where different racial/ethnic or gender groups consistently do better or worse than members of other such groups; for example, females scoring better than males on a reasoning test. If this is due not to a true difference in the two groups' ability or achievement level but rather to the way in which the test has been developed or administered, then it is test bias. The statistical analyses used to identify possible bias are very sophisticated and beyond the scope of this course and most classroom settings. However, there are some things you can do to guard against bias in your assessments:

  • Watch your language. Language is one of the biggest sources of test bias. Students from different language groups, cultures, or ethnic groups may be confused by certain terms or attach different meanings to them, causing them to respond to an item differently than you intended. Keep language simple and direct, and ideally test the assessment with students from different groups before using it.
  • Use graphics or pictures. These should augment textual instructions or item stems if possible.
  • Keep directions and item stems short and clear.
  • Use alternative assessment techniques to allow students to express themselves in their own fashion.

A good approach to avoiding bias is to always do a clinical pretest of your assessment with a small group of students who represent different characteristics, and to get verbal feedback from them on their experience with the assessment.
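The sophisticated bias analyses mentioned above are out of reach for most classrooms, but a rough first pass, comparing average item performance across subgroups, can at least flag items worth a closer look. A minimal sketch with hypothetical groups, data, and threshold; a large gap is only a prompt to review the item's wording, never proof of bias:

```python
from statistics import mean

def flag_gap(scores_by_group, threshold=0.15):
    """Flag an item when subgroup mean scores (1 = correct, 0 = not)
    differ by more than `threshold`."""
    means = {g: mean(vals) for g, vals in scores_by_group.items()}
    gap = max(means.values()) - min(means.values())
    return gap > threshold, means

# Hypothetical per-student results on one item for two groups
flagged, means = flag_gap({
    "group_a": [1, 1, 0, 1, 1, 1, 0, 1],
    "group_b": [1, 0, 0, 1, 0, 0, 1, 0],
})
# A flagged item gets a wording review and a clinical pretest
```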

Personal Bias

One particular source of bias comes from our own beliefs and motivations, which we may not be consciously aware of, making it very hard to detect. For example, a researcher may have his or her grant funding riding on the results of a particular survey and therefore a vested interest in how the results turn out. No matter how much s/he tries to be objective in assembling the survey, it will be very hard to develop a clean one. Let's take another example from the world of sports: judging performances. Figure skating performances used to be rated on a simple 0-6 scale by a team of international judges for both artistic merit and technical performance. Audiences long suspected the judges of bias, but nothing was done until a scandal erupted over collusion between the Russian and French judges to favor their own contestants. After this event, the judging was completely overhauled into detailed rubrics that the judges could apply to a recorded version of the performance taken from several camera angles. Although not everyone will agree that the judging is now completely free of personal bias, all will agree it is much fairer than before.

To underscore: outside reviews of assessments by colleagues and test students are helpful in ferreting out personal bias, as is the continued evaluation and review of the assessments as they are used.

Halo Effects

Halo effects are a special case of personal bias that comes into play when an instructor is overly impressed by a particular student, i.e., the "teacher's pet." In this case, the student can do no wrong, and his/her performance may be seen in a more favorable light than an objective eye would justify. The opposite can also occur, in a reverse halo effect: an instructor can have such a low opinion of a particular student that s/he can do no good work, no matter what the quality of the performance actually merits. True objectivity is hard to achieve and maintain.

Accessibility

As you may already be aware, online courses can pose difficult challenges for students with disabilities. In designing your assessments, as well as your courses, it's a good idea to keep these students' needs in mind. Obviously, assessments that depend on visual cues will bias against the visually impaired, and assessments that depend on auditory cues will bias against students with hearing issues.

For online tests and quizzes:

Tips for Creating Accessible Surveys (Link opens new window)

Survey Tools and Accessibility (Link opens new window)

For authentic and performance-based assessments, the situation is less clear-cut, depending on the actual assessment and the student or students involved. In this case, special accommodations might be needed, and it's simply best to keep an open mind.

Test Procedures and Conditions

Just as a student may react to an item differently than intended, students may also react differently to differing test conditions, so it is essential that all students have the same experience while being assessed. In face-to-face settings this is fairly easy to control, as everyone is in one location with the same environmental features. Online it is more difficult to establish identical conditions; however, you can ensure that each student has the same amount of time to complete the assessment, access to the same or similar resources, and the same prompts for performance. General texts on assessment treat some of these issues in more detail; the Linda Suskie reference is a good place to start.

Summative and Formative Evaluation

Evaluating the effectiveness of your course material and assessments is an important part of the process of course development. There are two primary ways in which most educators evaluate their own courses: summative and formative evaluation.

Summative evaluation = outcomes that demonstrate effective teaching (student performance on end-of-unit assessments, including tests, quizzes, papers, etc.). This information is typically collected retrospectively, allowing an instructor to go back and see how well the students mastered the course concepts.
Formative v. Summative Evaluation

Formative evaluation:
  • Primarily prospective
  • Analyzes strengths and weaknesses with an eye toward improving
  • Develops habits
  • Shapes the direction of professional development
  • Offers an opportunity to reflect on the meaning of past achievements

Summative evaluation:
  • Primarily retrospective
  • Documents achievement
  • Documents habits
  • Shows the results of such forays
  • Provides evidence of regular formative evaluation

Formative evaluation = more subtle and embedded in the process; information gathered during the learning process

In an online class it is more challenging to informally assess students' understanding; however, there are still ways to gather these types of evaluation pieces:

  • Look at how long students took on the online exams.
  • Offer and monitor student blogs about how they feel about the course material.
  • An FAQ section in an online class may offer an opportunity for an instructor to watch for questions that have not been asked elsewhere.
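The first bullet, time spent on online exams, can be checked with a simple outlier pass. A minimal sketch, assuming you can export per-student durations in minutes from your CMS; the numbers and the 2-standard-deviation cutoff are illustrative:

```python
from statistics import mean, stdev

def unusual_durations(minutes, z=2.0):
    """Durations more than `z` standard deviations from the mean.
    Unusually fast finishes may mean the exam was too easy (or an
    integrity problem); unusually slow ones may signal confusion."""
    m, s = mean(minutes), stdev(minutes)
    return [t for t in minutes if abs(t - m) > z * s]

# Hypothetical exam durations: one student finished far faster
outliers = unusual_durations([42, 45, 40, 48, 44, 46, 43, 12, 41, 47])
```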


This link offers additional explanations and sample forms to use for both evaluation types.

Explanations and example ways to evaluate your own course using common standards.

Description of the current movement for course assessment and the strategies that are being employed.

My Final Assignment:  Assessment Plan for NCESL 45, Level 7
