News & Events: Newsletter
Oral Proficiency Testing: Should Teachers be Involved in the Testing and Rating Process?
By Dr. Thekla Fall, World Language Consultant
As proficiency testing becomes more widespread in standards-based communicative language programs, the question arises: should a school district or institute of higher learning engage a company (or another institution) to handle both the testing and the rating of student speech and writing samples? Or should the school or district control the testing and the rating process?
Having a third party take over the entire testing/rating process may, on the surface, appear to be the easiest choice. After all, although it may cost more, it simplifies the process. All the teacher needs to do is send students to the computer lab and some time later the results appear. While appealing, farming out the tests and ratings divorces the teacher from the testing process and does not serve to guide the teaching/learning process (Wiggins, 1998).
Clarifying this issue is critical because, as noted by Paul Black, professor emeritus at King's College London's School of Education, and Dylan Wiliam, head of the school and professor of educational assessment, “There is a body of firm evidence that formative assessment is an essential component of classroom work and that its development can raise standards of achievement.”
A recent Education Week article (2008), “Test Industry Split Over 'Formative' Assessment,” focuses on the current controversy. In the article, Ray Wilson, the executive director of assessment and accountability for the Poway Unified School District in Poway, Calif., is quoted as saying, “I still contend that so long as a teacher doesn't have primary control [over assessment], you will never have a truly formative assessment.” Testing expert Richard J. Stiggins, executive director of the Portland, Ore.-based Assessment Training Institute, maintains that “formative assessment isn't something you buy—it is something you practice.” Thus, in addition to end-of-year or end-of-sequence summative assessment, teachers need a way to frequently assess the efficacy of their instruction and use the resulting feedback to directly impact student learning.
In-House Assessment Means Teacher Involvement
As a longtime supervisor of the world language program in a large urban school district, I was able to observe firsthand what happens when 65-75 teachers are engaged, annually, in the testing and rating process. From the beginning, we decided that it was crucial to involve foreign language teachers in both summative and formative assessments. Starting in 2003, the district implemented large-scale summative testing using OWL Testing Software to collect and rate student speech samples, in-house, using a variation of a Simulated Oral Proficiency Interview (SOPI) type test structure (Stansfield, 1996).
For more than 10 years, district world language teachers:
- developed a large bank of speaking tasks for proficiency testing;
- rated the resulting student speech samples (tests created using OWL Testing Software can be set to prevent teachers from rating their own students, if the district so chooses);
- analyzed the resulting data and used it to improve instruction;
- developed proficiency-oriented instructional tools; and
- participated in and contributed to staff development to make instruction more proficiency-oriented.
Teachers Support Testing
From the start, most teachers were supportive of district-wide testing because they saw the nationwide push for standards and accountability, and because they were directly involved in the development process. In the mid-1990s, teacher committees developed the first cassette tape-mediated SOPI-like assessments. Most teachers agreed that although speaking is the most difficult skill to assess, it is also the most important skill to assess. By 2003, even though some non-techie teachers were taken out of their comfort zone by the need to administer the test in a computer lab using their new OWL Testing Software, no one complained. Teachers were willing to work through initial bugs and hardware/lab difficulties because they were invested in the test's development and realized the huge potential for simplifying the testing and rating process.
Rating Sessions are Invaluable Staff Development Opportunities
At the end of each school year, several weeks were set aside to rate 1,300 to 1,800 student speech samples in French, German, Italian, Japanese, and Spanish at four levels (5th grade, 8th grade, level 3 high school, and seniors). All teachers were asked to listen to and rate 20 speech samples. Rating sessions were preceded by a rubric review and calibration practice. New teachers were given a more intensive rater training session that included initial paired ratings. Most teachers learned to rate the lower level test (No Ratings to Intermediate Low). More experienced upper level teachers rated the higher level test (up to Intermediate High). This division reduced the amount of rater training needed.
We found that the rating sessions were invaluable staff development opportunities. To be successful in teaching for proficiency, teachers must have a thorough understanding of how proficiency is measured. Teachers began to understand the ACTFL Scale at a much deeper level when they used it to actually rate students' speech samples. For many teachers, this was eye-opening in terms of what students can and cannot do with real-life tasks. Teachers began to appreciate the real difference between routine classroom achievement tests and life-simulating proficiency tests. Lower level teachers were thrilled to actually hear what more advanced students could say, and they gained a better understanding of their role in preparing students to reach higher levels. Higher level teachers gained a better appreciation for the work of the lower level teachers. Inevitably, during the rating process, teachers began to talk about program articulation and instructional gaps, as well as what works and what doesn't. They shared their tried and true tips with one another.
Throughout, the goal for graduating seniors was to attain an Intermediate Low level or higher of speaking proficiency (the standard level advocated by the PA State Board of Education). We found that periodic proficiency testing and annual ratings are powerful motivators—keeping both teachers and students focused on the goal. Students know where they are on the Scale and what they have to learn to get to the next level. Likewise, teachers spend five or more hours recalibrating, listening/rating student speech samples, and talking about the data. As a result, teachers not only are focused on a common goal, but also have a common language based on the ACTFL Scale for use across K-12 levels and across languages.
Teacher Initiated Remedies
Over the years, as results were posted, teachers reviewed and analyzed the data at team meetings and district-wide in-services. Unsatisfied with some of the ratings and finding the district's older textbooks deficient, teachers decided they needed new instructional tools to help students attain higher levels of proficiency. With their deeper understanding of proficiency testing, teachers came up with the idea of developing a whole-class protocol involving situations for communication to encourage students to rev up their speech to ever higher levels. It is unlikely that there would have been this level of teacher involvement if the tests had been farmed out.
By doing the ratings in-house, the district identified the lack of vocabulary as a major stumbling block for Novice level students. This resulted in teachers also playing a major role in advocating for the vocabulary practice activities component of OWL Testing Software. This game-like feature encourages students to practice vocabulary in situational contexts with various responses. Teacher committees provide the contexts and vocabulary.
Teachers Identify Changes in Their Own Instruction
Most importantly, teachers started remarking on changes they were making in their instruction. For example, teachers stated that the test and rating process "served as a catalyst to make them more aware of the need to design more classroom experiences that engendered real-life speaking tasks and student interaction; to complement the textbook by filling in gaps regarding survival level functional language tasks and vocabulary; to explain early in the year the PPS ORALS rubric to both students and parents—in other words, to unveil the objectives and the goals of the curriculum" (Fall 2007). This is an ongoing process as teachers see what works and what doesn't.
Achievement Testing and Teacher-Based Tests
In addition to the district-wide oral proficiency testing, the district recently began using the latest iteration of OWL Testing Software to phase in annual pre- and post-achievement tests. Furthermore, the district is now giving teachers private accounts, making OWL available for individual classroom use. Teachers are encouraged, though, to share and collaborate on the development of effective formative assessments. Once this phase of the program is fully implemented, there will be additional data for teachers to make informed instructional decisions tied directly to their teaching, and students will have meaningful data to help focus their learning.
Formative Testing Increases Student Speaking Proficiency
Is in-house proficiency testing a panacea? No. Does it make a difference? Yes! This is demonstrated by the resulting test data. The district has seen definite trends over six years of online testing—the percentage of students at lower levels of proficiency is decreasing and more students are attaining the Intermediate Low goal.
There is still room for improvement. What becomes clear is that substantive change for the better takes time; ingrained traditional teaching habits don't change overnight. However, proficiency levels will rise when there is a sustained instructional/learning focus, data analysis that informs teaching, staff development, and the purchase of instructional materials—based on clearly defined needs.
Implications for New OWL Users
There are advantages for institutions that are just starting in-house proficiency testing at this time since much of the early, iterative development work has been done. Many of the staff development tools, student/parent awareness materials, and instructional tools have been disseminated, free of charge (see "additional resources" links below). Also, the newest version of OWL Testing Software includes support components: a rater calibration feature, vocabulary practice component, help buttons, and an individual teacher testing component that enables teachers to input their classroom tests for routine formative testing. All of these features will enable institutions and teachers to quickly focus on the formative and summative aspects of testing to guide teaching and learning.
In Conclusion
There certainly are times when it is appropriate and desirable to use outside, double-rated tests such as the official ACTFL OPI or CAL SOPI tests. Districts use them for validation studies and for students who demonstrate ACTFL Advanced levels of proficiency. However, when a high level of speaking proficiency is a major goal for all students, all teachers should thoroughly understand the test and the rating scale. Most importantly, when teachers help to create, administer and rate tests, and help analyze the resulting data, they become invested in the test, the process, and in seeing improved results.
Sources:
- Black, Paul and Dylan Wiliam (1998) "Inside the Black Box: Raising Standards Through Classroom Assessment", Phi Delta Kappan, 148.
- Cech, Scott J. (2008) "Test Industry Split Over 'Formative' Assessment", Education Week, Vol. 28, No. 4, 1, 15.
- Fall, T., Adair-Hauck, B., & Glisan, E. (2007) "Assessing Students' Oral Proficiency: A Case for Online Testing". Foreign Language Annals, 40, 377-406.
- Stansfield, C. W. (1996). Test Development Handbook: Simulated Oral Proficiency Interview (SOPI). Washington, DC: Center for Applied Linguistics.
- Wiggins, G. (1998). Educative Assessment: Designing Assessments to Inform and Improve Student Performance. San Francisco: Jossey-Bass Publishers.
Additional Resources:
- Pittsburgh Public Schools' World Language Teacher Resources
- Pittsburgh Public Schools' Foreign Language Assistance Program (FLAP) Dissemination
About the Author
Dr. Thekla Fall is a world language consultant and retired curriculum supervisor from Pittsburgh Public Schools.
Spotlight: Monterey Institute of International Studies Using OWL
Students attending the renowned Graduate School of Language and Educational Linguistics at Monterey Institute of International Studies in California now have a first-hand opportunity to use OWL Testing Software. An elective Computer-Assisted Assessment course examines ways in which computer technology can enhance the validity of assessments, as well as the delivery and scoring of assessments. The tests—which include oral, aural, reading, and writing assessments—are developed using a Web-based tool from OWL Testing Software.
“We use OWL because it is comprehensive, advanced, and flexible. It is user friendly, and the support we receive from the company is excellent,” says Dr. Jean Turner, a professor at MIIS's GSLEL. Turner, who has a doctorate in applied linguistics, taught language at a variety of schools from 1976 to 1989 and has been teaching applied linguistics since.
“While many schools do a good job of preparing future language instructors, most do not place enough emphasis on the importance of assessments in the classroom. At MIIS, we want our students to understand what it means to create valid and defendable assessments and why this is such an important part of teaching.”
Learn more about MIIS GSLEL at http://language.miis.edu/
Focus on Features: Calendar Landing Page
You've just finished creating a homework assignment complete with realia, such as graphics, audio, and video. It requires your students to provide oral and written responses—something they can do from their home, dorm room, student union, or while traveling abroad. You use OWL to assign the activity to your class and schedule it to be available online beginning at 8 a.m. on Friday. When your students log in, they see their Calendar Landing Page. The assignment is displayed across the dates and times it is available, and it links students directly to the activity. OWL's Calendar Landing Page helps both teachers and students by providing a powerful, useful, and familiar way to schedule and connect to assignments so that everyone makes the most of their time online.
In Perspective: Bring Your Jukebox Money
By Greg Russak, Vice President, OWL Testing Software
If you can remember The B-52s—the new wave group, not the bomber—then we probably have a lot in common. Perhaps you marvel as I do about how technology has advanced in the last 30-odd years. I see my teenagers thumbing away on their cell phones and managing half a dozen open chat windows while they update their Facebook pages. I catch myself thinking, "It wasn't all that long ago that portable cell phones were the size of a lady's handbag, used a battery you could build brick houses out of, and had a talk time of about an hour if you were lucky. It certainly didn't play music from The B-52s; we had LPs and 8-tracks for that! And a personal computer? I learned programming in college on punch cards and never encountered a 'desktop' computer—the infamous RadioShack TRS-80—until after I graduated college...as an engineer, no less!"
It truly is amazing how all manner of computing and communication technology has become something that we—and our kids—take for granted. Want some even more sobering evidence of the use, role, and expectations for technology in the life of a student, especially a college student? Peter Schilling, Director of IT at Amherst College, can tell you all about it. It's not slowing down. (Big surprise, right?) He posted his "IT Index" on AcademicCommons.org on September 20th. It's based upon 1,680 or so students, 438 of whom are enrolled as first-years. Here are some of his findings that I think are especially telling:
- The number of first-year applicants who applied online went from 33% to 89% between 2003 and 2008
- Of the 438 students who make up the class of 2012, 432 are members of the Amherst College Facebook group
- Of those same 438 first-year students, 370 of them registered 443 devices such as computers, iPhones, and game consoles on the campus network—by the end of the first day they moved into their dorms
- A mere 14 out of the 438 students in the class of 2012 brought a desktop computer to school
- Only five students on the entire campus have landline phone service
This is fascinating, but not surprising. I have two teenagers, one of whom is a sophomore dual major in Japanese and Theater at a state university. The other is a high school senior. They have never known a time in their lives when the Internet did not exist. They have never experienced circumstances when computing wasn't, for all intents and purposes, ubiquitous. Except, regrettably, in their educational experiences. Even today.
As people concerned with education—language education, in particular—and the application of technology to fostering more effective and more meaningful education for every child, we must acknowledge the ever-increasing level of expectation that both students and parents have when it comes to what schools are investing in. How much is being invested in technology that makes a difference to language learning regardless of enrollment or status as a major?
Students live on the Net—and language students are certainly no exception. They, perhaps more than other students, communicate, entertain, and engage one another from laptops and portable devices from wherever they may be and whenever they like. Can they say that about their language program at your school? Are they gaining access to reading, writing, listening, and speaking skills, activities, and assignments that you want them to have and that you believe are meaningful? Or, are they left to their own devices, leaving you without enough opportunity to truly assess and provide meaningful feedback about their proficiency? Does your current technology offer you the means to gauge what is and isn't working in the curriculum? Is your administration telling you they have no money for your programs while they build new football stadiums, leaving you and your students without the technology and tools you both deserve so that you can stop administering language tests and activities with pen and paper, cassette tape recorders, or cobbled together, incomplete, and unreliable freeware solutions?
OWL Testing Software can help you enhance the educational experience—and outcomes—for what amounts to far less than "jukebox money" relative to the tuition college kids have to spend or that new football stadium the school board approved. With OWL, you can reach your students where they live—online—and do it in ways that engage them more fully but never require you to learn a single line of programming.
Let's face it. If you and your language students are suffering through testing and assessments using 1970s-vintage technology, or some homespun program that's too hard to use but that administrators think is free, then what conclusion can your students and their parents reasonably be expected to come to about the school's actual commitment to their education?
It's probably something like the difference between listening to The B-52s on that scratchy old stereo turntable and the crystal clear digital recording captured by an MP3 player. Which do you think your students prefer? Which one do they deserve?
Contact us for more information about what OWL can do for you and your students, and to schedule a no-obligation demonstration.











