Thursday 15 May 2014

New Educator Evaluation System

As part of an Omnibus School Code bill, the Pennsylvania Legislature has passed a new law pertaining to educator evaluation. The law establishes the framework for a new evaluation system, which will be implemented for classroom teachers beginning in the 2013-14 school year and for “non-teaching professional employees” in 2014-15.

Introduction:
Well, after a long time in the making, the world has finally lost its respect for teachers… and in a way, I can’t blame them. The problem is, not every teacher is to blame. In many cases, teachers can grow too comfortable in their positions. Think about it: you teach a class like US History, and you spend the first 5 years of your career getting all of your material together: lesson plans, tests, PowerPoint presentations, and study guides. Now I ask you, what do you do for the next 40 years? How can you continue to make it better? So, is the teacher really at fault? Or what about the 10th grade math teacher who works with the same information all the time, with nothing new about math coming down the road? It’s easy to stand on the other side of the bus stop and make comments about what you think teachers are doing… right or wrong. But in the end, the negative comments always outweigh the positive ones. I think we have a long battle ahead of us; I just hope we can figure it out sooner rather than later. I believe teachers, on the whole, do what they think is best for the students, but I do believe we could be doing it better.

Overview
This summer (2012), Pennsylvania put into effect a new teacher evaluation system that is broken down into four sections. For classroom teachers, 50 percent of the overall rating is to be based on multiple student performance measures, composed of the following: 15 percent building-level data, including but not limited to:

  • student performance on assessments
  • value-added data from PDE;
  • graduation rate;
  • promotion rate;
  • attendance rate;
  • advanced placement course participation; and
  • SAT and PSAT data.
15 percent teacher-specific data; and 20 percent elective data, including student achievement measures that are locally developed and selected by the school entity from a list approved by PDE, including but not limited to:

  • school-designed measures and examinations;
  • nationally recognized standardized tests;
  • industry student projects; and
  • student portfolios.

The remaining 50 percent will be based on classroom observation and practice models related to student achievement in each of the following areas:

  • planning and preparation;
  • classroom environment;
  • instruction; and
  • professional responsibilities.
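
Just to make the weighting concrete, here’s a toy composite-rating calculation. The 0-3 point scale and all of the component scores are made up, and PDE’s actual rating form will differ in its details.

    # Toy composite rating under the 15/15/20/50 weighting described above.
    weights = {
        "building_level_data": 0.15,
        "teacher_specific_data": 0.15,
        "elective_data": 0.20,
        "observation_and_practice": 0.50,
    }
    scores = {  # hypothetical component scores on a 0-3 scale
        "building_level_data": 2.0,
        "teacher_specific_data": 2.5,
        "elective_data": 2.0,
        "observation_and_practice": 3.0,
    }
    overall = sum(weights[k] * scores[k] for k in weights)
    print(overall)  # 2.575 on this made-up scale
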
OK, so there’s the quick overview of the new evaluation system, and as someone who feels like he does a good job teaching, I say bring it on. The section that teachers really need to focus on is the last one. Teachers need to start there, because I believe that if you have a great classroom, solid planning and preparation, and a caring professional attitude, the first three sections should work themselves out. I feel that right now everyone wants to see results in high-stakes testing, and I understand why schools are killing themselves to teach to the test. Teachers are being asked to make sure they prepare their students for testing. Up until now, school funding and structure were based on how well the test scores were. Now teachers are going to be evaluated on the same things: data, test scores, and success rates after graduation. Now here are the real issues that are going to affect teachers, and the biggest one is that no one is safe… not even if you have tenure!
Specifics:

In 2013-2014, all schools in Pennsylvania will be expected to use the new teacher evaluation system. The evaluation tool should be complete by the end of the 2012-2013 school year. PDE and PSEA are working together to make sure all parties are properly supported under the new law. The only schools that might not be using the system are those with a contract that states otherwise; once the contract is up, the school will have to adopt the new system. The ratings that will be used are Distinguished, Proficient, Needs Improvement, and Failing, with Needs Improvement and Failing considered unsatisfactory. If a teacher receives two unsatisfactory reviews in a row, the teacher may be let go. Now here is one of the first problems: many school districts can’t get teachers to work there in the first place, let alone get the students to care about passing a test. I would question whether, even if a teacher is unsatisfactory, a school district would get rid of them knowing they won’t be easy to replace. It will be interesting to see what really happens in these situations.

Remember, the new evaluation tool has been designed to protect teachers from being fired over data and test scores alone; 50% of the evaluation is based on being a good teacher. This is where the teacher’s passion should show up, and I believe that if the observation section is rated Distinguished or Proficient, I would bet the test scores will stand up to the requirements. I also think that schools will need to do a better job with pre- and post-test assessments. Another issue that might become a big problem is how the information will be provided to the public. Can you imagine if parents got hold of the scores and demanded that certain teachers were not fit to work in the school? Sometimes people feel that they know why a school might be failing; it would be nice if parents put that much effort into their kids’ homework. One interesting thing I saw is that the rating tool will also cover principals and administrators. I am a strong believer that people want to be led by a passionate director. In a lot of cases, I do feel that schools go in the wrong direction because of the culture the administration sets. I have seen a few schools thrive with one director and go down a bad path with another.

Conclusion
All in all… I’m excited. I think this is a step in the right direction, and it is time that we put a system in place. This system puts a lot of pressure on administrators to enforce the evaluation, and the most important part is for the state to actually support it, not come out with a full head of steam only for nobody to support it two years later. The state needs to be the system’s biggest fan. It needs to highlight schools, teachers, and staff members for outstanding work, and it needs to develop professional plans that administrators can use to support the program. Too many times we see education initiatives just come and go. Here are some of the problems that I see coming down the road. The administrator in charge of conducting these evaluations needs to be confident and straightforward in the approach; if it is taken too lightly from the start, the school will never be able to use it as a solid tool for the promotion or dismissal of teachers. If teachers don’t come to understand that the process truly matters for the school’s growth and their own, this will go downhill fast. This process is going to need strong leadership to accomplish its goals. Administrators also need to be very careful about how they provide feedback. For example, if a teacher gets two unsatisfactory ratings in a row and loses their job, you can bet the lawsuits are coming, and the paper trail needs to be better than perfect to protect everyone involved. I wonder how this will all work out with PSEA, but it might not be so bad, because they are part of the development of the evaluation…

Online Learning in Higher Education

Higher education in the United States, especially the public sector, is increasingly short of resources. States continue to cut appropriations in response to fiscal constraints and pressures to spend more on other things, such as health care and retirement expenses. Higher tuition revenues might be an escape valve, but there is great concern about tuition levels increasing resentment among students and their families and the attendant political reverberations. President Obama has decried rising tuitions, called on colleges and universities to control costs, and proposed to withhold access to some federal programs for colleges and universities that do not address “affordability” issues.
Costs are no less a concern in K–12 education. Until the 2008 financial crisis and the subsequent slowdown in U.S. economic growth, per-pupil expenditures on elementary and secondary education had been steadily rising. The number of school personnel hired for every 100 students more than doubled between 1960 and the first decade of the 21st century. But in the past few years, local property values have stagnated and states have faced intensifying fiscal pressure. As a result, per-pupil expenditures have for the first time in decades shown a noticeable decline, and pupil-teacher ratios have begun to shift upward (see “Public Schools and Money,” features, Fall 2012). With the rising cost of teacher and administrator pensions, the squeeze on school districts is expected to continue.

A subject of intense discussion is whether advances in information technology will, under the right circumstances, permit increases in productivity and thereby reduce the cost of instruction. Greater, and smarter, use of technology in teaching is widely seen as a promising way of controlling costs while reducing achievement gaps and improving access. The exploding growth in online learning, especially in higher education, is often cited as evidence that, at last, technology may offer pathways to progress (see Figure 1).

However, there is concern that at least some kinds of online learning are of low quality and that online learning in general depersonalizes education. It is important to recognize that “online learning” comes in a dizzying variety of flavors, ranging from simply videotaping lectures and posting them online for anytime access, to uploading materials such as syllabi, homework assignments, and tests to the Internet, all the way to highly sophisticated interactive learning systems that use cognitive tutors and take advantage of multiple feedback loops. Online learning can be used to teach many kinds of subjects to different populations in diverse institutional settings.

Despite the apparent potential of online learning to deliver high-quality instruction at reduced costs, there is very little rigorous evidence on learning outcomes for students receiving instruction online. Very few studies look at the use of online learning for large introductory courses at major public universities, for example, where the great majority of undergraduate students pursue either associate or baccalaureate degrees. Even fewer use random assignment to create a true experiment that isolates the effect of learning online from other factors.
Our study overcomes many of the limitations of prior studies by using the gold standard research design, a randomized trial, to measure the effect on learning outcomes of a prototypical, interactive online college statistics course. Specifically, we randomly assigned students at six public university campuses to take the course in a hybrid format, with computer-guided instruction accompanied by one hour of face-to-face instruction each week, or a traditional format, with three to four hours of face-to-face instruction each week. We find that learning outcomes are essentially the same: students in the hybrid format pay no “price” for this mode of instruction in terms of pass rates, final-exam scores, or performance on a standardized assessment of statistical literacy. Cost simulations, although speculative, indicate that adopting hybrid models of instruction in large introductory courses has the potential to reduce instructor compensation costs quite substantially.

Research Design
Our study assesses the educational outcomes generated by what we term interactive learning online (ILO), highly sophisticated, web-based courses in which computer-guided instruction can substitute for some (though usually not all) traditional, face-to-face instruction. Course systems of this type take advantage of data collected from large numbers of students in order to offer each student customized instruction, as well as to enable instructors to track students’ progress in detail so that they can provide more targeted and effective guidance.

We worked with seven instances of a prototype ILO statistics course at six public university campuses (including two separate courses in separate departments on one campus). The individual campuses include, from the State University of New York (SUNY): the University at Albany and SUNY Institute of Technology; from the University of Maryland: the University of Maryland, Baltimore County, and Towson University; and from the City University of New York (CUNY): Baruch College and City College. We examine the learning effectiveness of a particular interactive statistics course developed at Carnegie Mellon University (CMU), considered a prototype for ILO courses. Although the CMU course can be delivered in a fully online environment, in this study most of the instruction was delivered through interactive online materials, but the online instruction was supplemented by a one-hour-per-week face-to-face session in which students could ask questions or obtain targeted assistance.

The exact research protocol varied by campus in accordance with local policies, practices, and preferences, but the general procedure followed was 1) at or before the beginning of the semester, students registered for the introductory statistics course were asked to participate in our study and offered modest incentives for doing so; 2) students who consented to participate filled out a baseline survey; 3) study participants were randomly assigned to take the class in a traditional or hybrid format; 4) study participants were asked to take a standardized test of statistical literacy at the beginning of the semester; and 5) at the end of the semester, study participants were asked to take the standardized test of statistical literacy again, as well as to complete another questionnaire.

Of the 3,046 students enrolled in these statistics courses in the fall 2011 semester, 605 agreed to participate in the study and to be randomized into either a hybrid- or traditional-format section. An even larger sample size would have been desirable, but the logistical challenges of scheduling at least two sections (one hybrid section and one traditional section) at the same time, to enable students in the study to attend the statistics course regardless of their (randomized) format assignment, restricted our prospective participant pool to the limited number of “paired” time slots available. Also, student consent was required in order for researchers to randomly assign them to the traditional or hybrid format. Not surprisingly, some students who were able to make the paired time slots elected not to participate in the study. All of these complications notwithstanding, our final sample of 605 students is in fact quite large in the context of this type of research.

The baseline survey administered to students included questions on students’ background characteristics, such as socioeconomic status, as well as their prior exposure to statistics and the reason for their interest in possibly taking the statistics course in a hybrid format. The end-of-semester survey asked questions about their experiences in the statistics course. Students in study-affiliated sections of the statistics course took a final exam that included a set of items that was identical across all the participating sections at that campus. The scores of study participants on this common portion of the exam were provided to the research team, along with background administrative data and final course grades of all students (both participants and, for comparison purposes, nonparticipants) enrolled in the course.

The participants in our study are a diverse group. Half come from families with incomes less than $50,000 and half are first-generation college students. Less than half are white, and the group is about evenly divided between students with college GPAs above and below 3.0. Most students are of traditional college-going age (younger than 24), enrolled full-time, and in their sophomore or junior year.  The data indicate that the randomization worked properly in that traditional- and hybrid-format students in fact have very similar characteristics overall. The 605 students who chose to participate in the study also have broadly similar characteristics to the other students registered for introductory statistics. The differences that do exist are quite small. For example, participants are more likely to be enrolled full-time but only by a margin of 90 versus 86 percent. Their outcomes in the statistics course are also comparable, with participants earning similar grades and being only slightly less likely to complete and pass the course than nonparticipants.

An important limitation of our study is that while we were successful in randomizing students between treatment and control groups, we could not randomize instructors in either group and thus could not control for differences in teacher quality. Instructor surveys reveal that, on average, the instructors in traditional-format sections were much more experienced than their counterparts teaching hybrid-format sections (median years of teaching experience was 20 and 5, respectively). Moreover, almost all of the instructors in the hybrid-format sections were using the CMU online course for either the first or second time, whereas many of the instructors in the traditional-format sections had taught in this mode for years. The “experience advantage,” therefore, is clearly in favor of the teachers of the traditional-format sections. The questionnaires also reveal that a number of the instructors in hybrid-format sections began with negative perceptions of online learning, which may have depressed the performance of the hybrid sections. The hybrid-format sections were somewhat smaller than the traditional-format sections, however, which may have conferred some advantage on the students randomly assigned to the hybrid format.

Learning Outcomes
Our analysis of the experimental data is straightforward. We compare the outcomes for students randomly assigned to the traditional format to the outcomes for students randomly assigned to the hybrid format. In a small number of cases—4 percent of the 605 students in the study—participants attended a different format section than the one to which they were randomly assigned. In order to preserve the randomization procedure, we associated students with the section type to which they were randomly assigned. This is sometimes called an “intent to treat” analysis, but in this case it makes little practical difference because the vast majority of students complied with their initial assignment. Our analysis controls for student characteristics, including race/ethnicity, gender, age, full-time versus part-time enrollment status, class year in college, parental education, language spoken at home, and family income. These controls are not strictly necessary, since students were randomly assigned to a course format. We obtain nearly identical results when we do not include these control variables, just as we would expect given the apparent success of our random assignment procedure.
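
As a concrete illustration of this “intent to treat” approach, the sketch below shows what such a regression looks like in code. It is ours, not the study’s: the variable names, the synthetic data (built with a true hybrid effect of zero), and the choice of Python with pandas and statsmodels are all illustrative assumptions.

    # Illustrative intent-to-treat (ITT) regression on synthetic data.
    # Variable names and data are invented; this is not the study's code.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n = 605  # same size as the study's sample
    df = pd.DataFrame({
        "assigned_hybrid": rng.integers(0, 2, n),  # 1 = assigned to hybrid
        "female": rng.integers(0, 2, n),
        "full_time": rng.integers(0, 2, n),
        "pretest": rng.normal(50, 10, n),
    })
    # Simulated final-exam score in which the true hybrid effect is zero:
    df["final_exam"] = 60 + 0.4 * df["pretest"] + rng.normal(0, 8, n)

    # ITT: regress the outcome on *assignment*, not on the format actually
    # attended.  Baseline covariates are not strictly necessary under
    # successful randomization, but they can improve precision.
    model = smf.ols(
        "final_exam ~ assigned_hybrid + female + full_time + pretest",
        data=df,
    ).fit()
    print(model.params["assigned_hybrid"])  # estimated hybrid effect, near 0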

We first examine the impact of assignment to the hybrid format, relative to the traditional format, on students’ probability of passing the course, their performance on a standardized test of statistics, and their score on a set of final-exam questions that were the same in the two formats. We find no clear differences in learning outcomes between students in the traditional- and hybrid-format sections. Hybrid-format students did perform slightly better than traditional-format students on the three outcomes, achieving pass rates that were about 3 percentage points higher, standardized-test scores about 1 percentage point higher, and final-exam scores 2 percentage points higher, but none of these differences is statistically significant (see Figure 2).

It is important to note that these non-effects are fairly precisely estimated. This precision implies that if there had been pronounced differences in outcomes between traditional-format and hybrid-format groups, it is highly likely that we would have found them. In other words, we can be quite confident that the actual effects were in fact close to zero, and therefore differ from a hypothetical finding of “no significant difference” that may result from excessively noisy data or an insufficiently large sample. We also calculate results separately for subgroups of students defined in terms of various characteristics, including race/ethnicity, gender, parental education, primary language spoken, score on the standardized pretest, hours worked for pay, and college GPA. We do not find any consistent evidence that the hybrid-format effect varies by any of these characteristics. There are no groups of students that benefited from or were harmed by the hybrid format consistently across multiple learning outcomes. In addition, we examine how much students liked the hybrid format of the course, and find that students gave the hybrid format a modestly lower overall rating than their counterparts gave the traditional-format course (the rating was about 11 percent lower). By similar margins, hybrid students report feeling that they learned less and that they found the course more difficult. But there were no notable differences in students’ reports of how much the course raised their interest in the subject matter.

We also asked students how many hours per week they spent outside of class working on the statistics class. Hybrid-format students report spending 0.3 hours more each week, on average, than traditional-format students. This difference implies that in a course where a traditional section meets for three hours each week and a hybrid section meets for one hour, the average hybrid-format student would spend 1.7 fewer hours each week in total time devoted to the course, a difference of about 25 percent. This result is consistent with nonexperimental evidence that ILO-type formats can achieve the same learning outcomes as traditional-format instruction in less time, which has potentially important implications for scheduling and the rate of course completion.
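
The time-use arithmetic in that comparison can be reconstructed in a few lines. The in-class hours come from the text; the implied weekly total in the final step is our inference from the reported 25 percent, since the article does not state the outside-of-class baseline directly.

    # Back-of-the-envelope time accounting (figures from the text above).
    in_class_traditional = 3.0   # hours/week
    in_class_hybrid = 1.0        # hours/week
    extra_outside_hybrid = 0.3   # hybrid students report 0.3 h/week more outside class

    # Net weekly difference in total time devoted to the course:
    net_difference = (in_class_traditional - in_class_hybrid) - extra_outside_hybrid
    print(net_difference)  # 1.7 hours/week, as reported

    # If 1.7 hours is about 25 percent of the traditional total, then:
    implied_total = net_difference / 0.25
    print(implied_total)  # ~6.8 hours/week (3 in class plus ~3.8 outside)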

Potential Savings
In other sectors of the economy, the use of technology has increased productivity, measured as outputs divided by inputs, and often increased output as well. Our study shows that a leading prototype hybrid-learning system did not increase outputs (student learning) but could potentially increase productivity by using fewer inputs. It would seem to be straightforward to compare the side-by-side costs of the hybrid version of the statistics course and the traditional version. The problem, however, is that contemporaneous comparisons can be nearly useless in projecting long-term costs, because the costs of doing almost anything for the first time are very different from the costs of doing the same thing numerous times. This is especially true in the case of online learning, where there are substantial start-up costs that have to be considered in the short run but are likely to decrease over time. For example, the development of sophisticated hybrid courses will be a costly effort that would only be a sensible investment if the start-up costs were either paid for by others (foundations and governments) or shared by many institutions.

There are also transition costs entailed in moving from the traditional, mostly face-to-face model to a hybrid model that takes advantage of more sophisticated ILO systems employing computer-guided instruction, cognitive tutors, embedded feedback loops, and some forms of automated grading. Instructors need to be trained to take full advantage of such systems. On unionized campuses, there may also be contractual limits on section size that were designed with the traditional model in mind but that do not make sense for a hybrid model. It is possible that these constraints would be changed in future contract negotiations, but that too will take time. We address these issues by conducting cost simulations based on data from three of the campuses in our study. Our basic approach is to start by looking, in as much detail as possible, at the actual costs of teaching a basic course in traditional format (usually, but not always, the statistics course) in a base year. Then, we simulate the prospective, steady-state costs of a hybrid version of the same course. These exploratory simulations are based on explicit assumptions, especially about staffing, which allow us to see how sensitive our results are to variations in key assumptions.

We did exploratory simulations for two types of traditional teaching models: 1) students taught in sections of roughly 40 students per section, and 2) students attending a common lecture and assigned to small discussion sections led by teaching assistants. We focus on instructor compensation because these costs comprise a substantial portion of the recurring cost of teaching and are the most straightforward to measure. We compare the current compensation costs of each of the two traditional teaching models to simulated costs of a hybrid model in which most instruction is delivered online, students attend weekly face-to-face sessions with part-time instructors, and the course is overseen by a tenure-track professor. These simulations are admittedly speculative and subject to considerable variation depending on how a particular campus organizes its teaching, but they suggest that significant cost savings are possible. In particular, we estimate savings in compensation costs for the hybrid model ranging from 36 percent to 57 percent compared to the all-section traditional model, and 19 percent compared to the lecture-section model.

These simulations confirm that hybrid learning offers opportunities for significant savings, but that the degree of cost reduction depends (of course) on exactly how hybrid learning is implemented, especially the rate at which instructors are compensated and section size. A large share of cost savings is derived from shifting away from time spent by expensive professors toward both computer-guided instruction that saves on staffing costs overall and time spent by less-expensive staff in Q and A sessions. Our simulations substantially underestimate the savings from moving toward a hybrid model in many settings because we do not account for space costs. It is difficult to put a dollar figure on space costs because capital costs are difficult to apportion accurately to specific courses, but the difference in face-to-face meeting time implies that the hybrid course requires 67 to 75 percent less classroom use than the traditional course.
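
A stylized version of the compensation comparison described above is sketched below. Only the three staffing structures follow the text; the enrollment, section sizes, and pay figures are hypothetical numbers of our own, which is why the resulting savings fall within, but do not reproduce, the reported ranges.

    # Stylized instructor-compensation comparison (all figures hypothetical).
    students = 400

    def all_sections_cost(section_size=40, cost_per_section=12000):
        # Traditional model 1: every student in a ~40-person section.
        return (students / section_size) * cost_per_section

    def lecture_section_cost(lecture_cost=15000, ta_section_size=25,
                             ta_cost_per_section=4000):
        # Traditional model 2: a common lecture plus TA-led discussion sections.
        return lecture_cost + (students / ta_section_size) * ta_cost_per_section

    def hybrid_cost(meeting_size=25, adjunct_cost_per_section=3000,
                    oversight_cost=10000):
        # Hybrid: online instruction, weekly face-to-face sessions with
        # part-time instructors, overseen by one tenure-track professor.
        return oversight_cost + (students / meeting_size) * adjunct_cost_per_section

    for name, cost in [("all-section", all_sections_cost()),
                       ("lecture-section", lecture_section_cost()),
                       ("hybrid", hybrid_cost())]:
        print(name, cost)
    # Savings depend heavily on assumed pay rates and section sizes,
    # which is why the article reports ranges rather than a single figure.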

In the short run, institutions cannot lay off tenured faculty or sell or demolish their buildings. In the long run, however, using hybrid models for some large introductory courses would allow institutions to expand enrollment without a commensurate increase in space costs, a major savings relative to what institutions would have to spend to serve the same number of students with a traditional model of instruction. In other words, the hybrid model need not just “save money”; it can also support an increase in access to higher education. It serves the access goal both by making it more affordable for the institution to enroll more students and by accommodating more students because of greater scheduling flexibility. This flexibility may be especially important for students who have to balance family and work responsibilities with course completion, as well as for students who live far from campus.

Conclusions
In the case of online learning, where millions of dollars are being invested by a wide variety of entities, we should perhaps expect that there will be inflated claims of spectacular successes. The findings in this study warn against too much hype. To the best of our knowledge, there is no compelling evidence that online learning systems available today—not even highly interactive systems, which are very few in number—can in fact deliver improved educational outcomes across the board, at scale, on campuses other than the one where the system was born, and on a sustainable basis. This is not to deny, however, that these systems have great potential. Our study demonstrates the potential of truly interactive learning systems that use technology to provide some forms of instruction, in properly chosen courses, in appropriate settings. We find that such an approach need not affect learning outcomes negatively and conceivably could, in the future, improve them, as these systems become ever more sophisticated and user-friendly. It is also entirely possible that by reducing instructor compensation costs for large introductory courses, such systems could lead to more, not less, opportunity for students to benefit from exposure to modes of instruction such as independent study with professors, if scarce faculty time can be beneficially redeployed.

What would be required to overcome the barriers to adoption of even simple online learning systems—let alone more sophisticated systems that are truly interactive? First, a system-wide approach will be needed for a sophisticated customizable platform to be developed, made widely available, maintained, and sustained in a cost-effective manner. It is unrealistic to expect individual institutions to make the up-front investments needed, and collaborative efforts among institutions are difficult to organize, especially when nimbleness is needed. In all likelihood, major foundation, government, or private-sector investments will be required to launch such a project. Second, as ILO courses are developed in different fields, it will be important to test them rigorously to see how cost-effective they are in at least sustaining and possibly improving learning outcomes for various student populations in a variety of settings. Such rigorous testing should be carried out in large public university systems, which may be willing to pilot such courses. Hard evidence will be needed to persuade other institutions, and especially leading institutions, to try out such approaches.

Finally, it is hard to exaggerate the importance of confronting the cost problems facing American public education at all levels. The public is losing confidence in the ability of the higher-education sector in particular to control costs. All of higher education has a stake in addressing this problem, including the elite institutions that are under less immediate pressure than others to alter their teaching methods. ILO systems can be helpful not only in curbing cost increases (including the costs of building new space), but also in improving retention rates, educating students who are place-bound, and increasing the throughput of higher education in cost-effective ways. We do not mean to suggest that ILO systems are a panacea for this country’s deep-seated education problems. Many claims about “online learning” (especially about simpler variants in their present state of development) are likely to be exaggerated. But it is important not to go to the other extreme and accept equally unfounded assertions that adoption of online systems invariably leads to inferior learning outcomes and puts students at risk. We are persuaded that well-designed interactive systems in higher education have the potential to achieve at least equivalent educational outcomes while opening up the possibility of freeing up significant resources that could be redeployed more productively.

Extrapolating the results of our study to K–12 education is hardly straightforward. College students are expected to have a degree of self-motivation and self-discipline that younger students may not yet have achieved. But the variation among students within any given age cohort is probably much greater than the differences from one age group to the next. At the very least, one could expect that online learning for students planning to enter the higher-education system would be an appropriate experience, especially if colleges and universities continue to expand their online offerings. It is not too soon to seek ways to test experimentally the potential of online learning in secondary schools as well.

The Educational Value of Field Trips

The school field trip has a long history in American public education. For decades, students have piled into yellow buses to visit a variety of cultural institutions, including art, natural history, and science museums, as well as theaters, zoos, and historical sites. Schools gladly endured the expense and disruption of providing field trips because they saw these experiences as central to their educational mission: schools exist not only to provide economically useful skills in numeracy and literacy, but also to produce civilized young men and women who would appreciate the arts and culture. More-advantaged families may take their children to these cultural institutions outside of school hours, but less-advantaged students are less likely to have these experiences if schools do not provide them. With field trips, public schools viewed themselves as the great equalizer in terms of access to our cultural heritage.

Today, culturally enriching field trips are in decline. Museums across the country report a steep drop in school tours. For example, the Field Museum in Chicago at one time welcomed more than 300,000 students every year. Recently the number is below 200,000. Between 2002 and 2007, Cincinnati arts organizations saw a 30 percent decrease in student attendance. A survey by the American Association of School Administrators found that more than half of schools eliminated planned field trips in 2010–11.

The decision to reduce culturally enriching field trips reflects a variety of factors. Financial pressures force schools to make difficult decisions about how to allocate scarce resources, and field trips are increasingly seen as an unnecessary frill. Greater focus on raising student performance on math and reading standardized tests may also lead schools to cut field trips. Some schools believe that student time would be better spent in the classroom preparing for the exams. When schools do organize field trips, they are increasingly choosing to take students on trips to reward them for working hard to improve their test scores rather than to provide cultural enrichment. Schools take students to amusement parks, sporting events, and movie theaters instead of to museums and historical sites. This shift from “enrichment” to “reward” field trips is reflected in a generational change among teachers about the purposes of these outings. In a 2012–13 survey we conducted of nearly 500 Arkansas teachers, those who had been teaching for at least 15 years were significantly more likely to believe that the primary purpose of a field trip is to provide a learning opportunity, while more junior teachers were more likely to see the primary purpose as “enjoyment.”

If schools are de-emphasizing culturally enriching field trips, has anything been lost as a result? Surprisingly, we have relatively little rigorous evidence about how field trips affect students. The research presented here is the first large-scale randomized-control trial designed to measure what students learn from school tours of an art museum. We find that students learn quite a lot. In particular, enriching field trips contribute to the development of students into civilized young men and women who possess more knowledge about art, have stronger critical-thinking skills, exhibit increased historical empathy, display higher levels of tolerance, and have a greater taste for consuming art and culture.

Design of the Study and School Tours
The 2011 opening of the Crystal Bridges Museum of American Art in Northwest Arkansas created the opportunity for this study. Crystal Bridges is the first major art museum to be built in the United States in the last four decades, with more than 50,000 square feet of gallery space and an endowment in excess of $800 million. Portions of the museum’s endowment are devoted to covering all of the expenses associated with school tours. Crystal Bridges reimburses schools for the cost of buses, provides free admission and lunch, and even pays for the cost of substitute teachers to cover for teachers who accompany students on the tour.

Because the tour is completely free to schools, and because Crystal Bridges was built in an area that never previously had an art museum, there was high demand for school tours. Not all school groups could be accommodated right away. So our research team worked with the staff at Crystal Bridges to assign spots for school tours by lottery. During the first two semesters of the school tour program, the museum received 525 applications from school groups representing 38,347 students in kindergarten through grade 12. We created matched pairs among the applicant groups based on similarity in grade level and other demographic factors. An ideal and common matched pair would be adjacent grades in the same school. We then randomly ordered the matched pairs to determine scheduling prioritization. Within each pair, we randomly assigned which applicant would be in the treatment group and receive a tour that semester and which would be in the control group and have its tour deferred.
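
The lottery just described is simple to express in code. In the sketch below, the pairing, the random ordering of pairs, and the within-pair coin flip follow the text; the group names and data layout are invented for illustration.

    # Minimal sketch of matched-pair random assignment (illustrative).
    import random

    random.seed(42)

    # Each pair holds two applicant groups judged similar on grade level
    # and demographics (e.g., adjacent grades in the same school).
    matched_pairs = [
        ("School A, grade 4", "School A, grade 5"),
        ("School B, grade 7", "School B, grade 8"),
        # ... one entry per matched pair of applicant groups
    ]

    # Randomly order the pairs to determine scheduling priority.
    random.shuffle(matched_pairs)

    for pair in matched_pairs:
        # Within each pair, one group tours this semester (treatment)
        # and the other has its tour deferred (control).
        treatment, control = random.sample(pair, 2)
        print("treatment:", treatment, "| control:", control)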

We administered surveys to 10,912 students and 489 teachers at 123 different schools three weeks, on average, after the treatment group received its tour. The student surveys included multiple items assessing knowledge about art as well as measures of critical thinking, historical empathy, tolerance, and sustained interest in visiting art museums. Some groups were surveyed as late as eight weeks after the tour, but it was not possible to collect data after longer periods because each control group was guaranteed a tour during the following semester as a reward for its cooperation. There is no indication that the results reported below faded for groups surveyed after longer periods.

We also assessed students’ critical-thinking skills by asking them to write a short essay in response to a painting that they had not previously seen. Finally, we collected a behavioral measure of interest in art consumption by providing all students with a coded coupon good for free family admission to a special exhibit at the museum to see whether the field trip increased the likelihood of students making future visits. All results reported below are derived from regression models that control for student grade level and gender and make comparisons within each matched pair, while taking into account the fact that students in the matched pair of applicant groups are likely to be similar in ways that we are unable to observe. Standard validity tests confirmed that the survey items employed to generate the various scales used as outcomes measured the same underlying constructs.

The intervention we studied is a modest one. Students received a one-hour tour of the museum in which they typically viewed and discussed five paintings. Some students were free to roam the museum following their formal tour, but the entire experience usually involved less than half a day. Instructional materials were sent to teachers who went on a tour, but our survey of teachers suggests that these materials received relatively little attention, on average no more than an hour of total class time. The discussion of each painting during the tour was largely student-directed, with the museum educators facilitating the discourse and providing commentary beyond the names of the work and the artist and a brief description only when students requested it. This format is now the norm in school tours of art museums. The aversion to having museum educators provide information about works of art is motivated in part by progressive education theories and by a conviction among many in museum education that students retain very little factual information from their tours.

Results
Recalling Tour Details. Our research suggests that students actually retain a great deal of factual information from their tours. Students who received a tour of the museum were able to recall details about the paintings they had seen at very high rates. For example, 88 percent of the students who saw the Eastman Johnson painting At the Camp—Spinning Yarns and Whittling knew when surveyed weeks later that the painting depicts abolitionists making maple syrup to undermine the sugar industry, which relied on slave labor. Similarly, 82 percent of those who saw Norman Rockwell’s Rosie the Riveter could recall that the painting emphasizes the importance of women entering the workforce during World War II. Among students who saw Thomas Hart Benton’s Ploughing It Under, 79 percent recollected that it is a depiction of a farmer destroying his crops as part of a Depression-era price support program. And 70 percent of the students who saw Romare Bearden’s Sacrifice could remember that it is part of the Harlem Renaissance art movement. Since there was no guarantee that these facts would be raised in student-directed discussions, and because students had no particular reason for remembering these details (there was no test or grade associated with the tours), it is impressive that they could recall historical and sociological information at such high rates. These results suggest that art could be an important tool for effectively conveying traditional academic content, but this analysis cannot prove it: control-group students performed hardly better than chance in identifying factual information about these paintings, but they never had the opportunity to learn the material. The high rate of recall of factual information by students who toured the museum demonstrates that the tours made an impression. The students could remember important details about what they saw and discussed.

Critical Thinking. Beyond recalling the details of their tour, did a visit to an art museum have a significant effect on students? Our study demonstrates that it did. For example, students randomly assigned to receive a school tour of Crystal Bridges later displayed demonstrably stronger ability to think critically about art than the control group. During the first semester of the study, we showed all 3rd- through 12th-grade students a painting they had not previously seen, Bo Bartlett’s The Box. We then asked students to write short essays in response to two questions: What do you think is going on in this painting? And, what do you see that makes you think that? These are standard prompts used by museum educators to spark discussion during school tours.

We stripped the essays of all identifying information and had two coders rate the compositions using a seven-item rubric for measuring critical thinking that was developed by researchers at the Isabella Stewart Gardner Museum in Boston. The measure is based on the number of instances that students engaged in the following in their essays: observing, interpreting, evaluating, associating, problem finding, comparing, and flexible thinking. Our measure of critical thinking is the sum of the counts of these seven items. In total, our research team blindly scored 3,811 essays. For 750 of those essays, two researchers scored them independently. The scores they assigned to the same essay were very similar, demonstrating that we were able to measure critical thinking about art with a high degree of inter-coder reliability.
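
The scoring scheme lends itself to a short sketch. The seven rubric categories come from the text; the data layout, the made-up counts, and the use of a simple correlation as the agreement check are our own assumptions (the article does not name the reliability statistic used).

    # Sketch of rubric scoring and inter-coder agreement (illustrative).
    import numpy as np

    RUBRIC = ["observing", "interpreting", "evaluating", "associating",
              "problem_finding", "comparing", "flexible_thinking"]

    def critical_thinking_score(counts):
        # Total score = sum of instances across the seven rubric items.
        return sum(counts[item] for item in RUBRIC)

    # Hypothetical counts for one essay as tallied by one coder:
    coder1 = {"observing": 4, "interpreting": 2, "evaluating": 1,
              "associating": 1, "problem_finding": 0, "comparing": 1,
              "flexible_thinking": 0}
    print(critical_thinking_score(coder1))  # 9

    # For the 750 double-scored essays, agreement can be checked by
    # correlating the two coders' totals across essays (made-up totals):
    scores1 = np.array([9, 5, 12, 7])
    scores2 = np.array([8, 5, 11, 7])
    print(np.corrcoef(scores1, scores2)[0, 1])  # close to 1 = high reliability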

We express the impact of a school tour of Crystal Bridges on critical-thinking skills in terms of standard-deviation effect sizes. Overall, we find that students assigned by lottery to a tour of the museum improve their ability to think critically about art by 9 percent of a standard deviation relative to the control group. The benefit for disadvantaged groups is considerably larger (see Figure 1). Rural students, who live in towns with fewer than 10,000 people, experience an increase in critical-thinking skills of nearly one-third of a standard deviation. Students from high-poverty schools (those where more than 50 percent of students receive free or reduced-price lunches) experience an 18 percent effect-size improvement in critical thinking about art, as do minority students.
A large amount of the gain in critical-thinking skills stems from an increase in the number of observations that students made in their essays. Students who went on a tour became more observant, noticing and describing more details in an image. Being observant and paying attention to detail is an important and highly useful skill that students learn when they study and discuss works of art. Additional research is required to determine if the gains in critical thinking when analyzing a work of art would transfer into improved critical thinking about other, non-art-related subjects.
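
For readers unfamiliar with effects expressed as a percentage of a standard deviation, the basic computation looks like the sketch below. The scores are made up, and the study’s actual estimates come from regressions within matched pairs rather than this raw comparison.

    # Standard-deviation effect size on made-up data.
    import numpy as np

    treatment = np.array([11.0, 9.0, 12.0, 10.0, 13.0])
    control = np.array([10.0, 9.0, 11.0, 9.0, 12.0])

    # Pool the two groups' variances to get a common standard deviation:
    pooled_sd = np.sqrt((treatment.var(ddof=1) + control.var(ddof=1)) / 2)

    # Difference in means, expressed in standard-deviation units:
    effect_size = (treatment.mean() - control.mean()) / pooled_sd
    print(effect_size)  # an effect of 0.09 reads as "9 percent of a standard deviation"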

Historical Empathy. Tours of art museums also affect students’ values. Visiting an art museum exposes students to a diversity of ideas, peoples, places, and time periods. That broadening experience imparts greater appreciation and understanding. We see the effects in significantly higher historical empathy and tolerance measures among students randomly assigned to a school tour of Crystal Bridges.
Historical empathy is the ability to understand and appreciate what life was like for people who lived in a different time and place. This is a central purpose of teaching history, as it provides students with a clearer perspective about their own time and place. To measure historical empathy, we included three statements on the survey with which students could express their level of agreement or disagreement: 1) I have a good understanding of how early Americans thought and felt; 2) I can imagine what life was like for people 100 years ago; and 3) When looking at a painting that shows people, I try to imagine what those people are thinking. We combined these items into a scale measuring historical empathy.

Students who went on a tour of Crystal Bridges experience a 6 percent of a standard deviation increase in historical empathy. Among rural students, the benefit is much larger, a 15 percent of a standard deviation gain. We can illustrate this benefit by focusing on one of the items in the historical empathy scale. When asked to agree or disagree with the statement, “I have a good understanding of how early Americans thought and felt,” 70 percent of the treatment-group students express agreement compared to 66 percent of the control group. Among rural participants, 69 percent of the treatment-group students agree with this statement compared to 62 percent of the control group. The fact that Crystal Bridges features art from different periods in American history may have helped produce these gains in historical empathy.

Tolerance. To measure tolerance we included four statements on the survey to which students could express their level of agreement or disagreement: 1) People who disagree with my point of view bother me; 2) Artists whose work is critical of America should not be allowed to have their work shown in art museums; 3) I appreciate hearing views different from my own; and 4) I think people can have different opinions about the same thing. We combined these items into a scale measuring the general effect of the tour on tolerance. Overall, receiving a school tour of an art museum increases student tolerance by 7 percent of a standard deviation. As with critical thinking, the benefits are much larger for students in disadvantaged groups. Rural students who visited Crystal Bridges experience a 13 percent of a standard deviation improvement in tolerance. For students at high-poverty schools, the benefit is 9 percent of a standard deviation.
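
Because the first two items are worded so that agreement signals less tolerance, building the scale requires flipping their direction before combining. The sketch below shows one plausible construction; the 1-5 response coding, the averaging, and the reverse-coding step are standard practice we are assuming, not details given in the article.

    # Sketch of building the tolerance scale from the four survey items.
    import numpy as np

    # One student's responses on an assumed 1-5 disagree/agree scale:
    responses = {
        "bothered_by_disagreement": 2,   # item 1 (negatively worded)
        "censor_critical_art": 1,        # item 2 (negatively worded)
        "appreciate_other_views": 4,     # item 3
        "people_can_differ": 5,          # item 4
    }

    def reverse(x, lo=1, hi=5):
        # Flip a response so that higher always means more tolerant.
        return hi + lo - x

    scale = np.mean([
        reverse(responses["bothered_by_disagreement"]),
        reverse(responses["censor_critical_art"]),
        responses["appreciate_other_views"],
        responses["people_can_differ"],
    ])
    print(scale)  # 4.5 on this made-up scale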

The improvement in tolerance for students who went on a tour of Crystal Bridges can be illustrated by the responses to one of the items within the tolerance scale. When asked about the statement, “Artists whose work is critical of America should not be allowed to have their work shown in art museums,” 35 percent of the control-group students express agreement. But for students randomly assigned to receive a school tour of the art museum, only 32 percent agree with censoring art critical of America. Among rural students, 34 percent of the control group would censor art compared to 30 percent for the treatment group. In high-poverty schools, 37 percent of the control-group students would censor compared to 32 percent of the treatment-group students. These differences are not huge, but neither is the intervention. These changes represent the realistic improvement in tolerance that results from a half-day experience at an art museum.

Interest in Art Museums. Perhaps the most important outcome of a school tour is whether it cultivates an interest among students in returning to cultural institutions in the future. If visiting a museum helps improve critical thinking, historical empathy, tolerance, and other outcomes not measured in this study, then those benefits would compound for students if they were more likely to frequent similar cultural institutions throughout their life. The direct effects of a single visit are necessarily modest and may not persist, but if school tours help students become regular museum visitors, they may enjoy a lifetime of enhanced critical thinking, tolerance, and historical empathy. We measured how school tours of Crystal Bridges develop in students an interest in visiting art museums in two ways: with survey items and a behavioral measure. We included a series of items in the survey designed to gauge student interest:
• I plan to visit art museums when I am an adult.
• I would tell my friends they should visit an art museum.
• Trips to art museums are interesting.
• Trips to art museums are fun.
• Would your friend like to go to an art museum on a field trip?
• Would you like more museums in your community?
• How interested are you in visiting art museums?
• If your friends or family wanted to go to an art museum, how interested would you be in going?
Interest in visiting art museums among students who toured the museum is 8 percent of a standard deviation higher than that in the randomized control group. Among rural students, the increase is much larger: 22 percent of a standard deviation. Students at high-poverty schools score 11 percent of a standard deviation higher on the cultural consumer scale if they were randomly assigned to tour the museum. And minority students gain 10 percent of a standard deviation in their desire to be art consumers.

One of the eight items in the art consumer scale asked students to express the extent to which they agreed or disagreed with the statement, “I would tell my friends they should visit an art museum.” For all students who received a tour, 70 percent agree with this statement, compared to 66 percent in the control group. Among rural participants, 73 percent of the treatment-group students agree versus 63 percent of the control group. In high-poverty schools, 74 percent would recommend art museums to their friends compared to 68 percent of the control group. And among minority students, 72 percent of those who received a tour would tell their friends to visit an art museum, relative to 67 percent of the control group. Students, particularly those from disadvantaged backgrounds, are more likely to have positive feelings about visiting museums if they receive a school tour.

We also measured whether students are more likely to visit Crystal Bridges in the future if they received a school tour. All students who participated in the study during the first semester, including those who did not receive a tour, were provided with a coupon that gave them and their families free entry to a special exhibit at Crystal Bridges. The coupons were coded so that we could determine the applicant group to which students belonged. Students had as long as six months after receipt of the coupon to use it. We collected all redeemed coupons and were able to calculate how many adults and youths were admitted. Though students in the treatment group received 49 percent of all coupons that were distributed, 58 percent of the people admitted to the special exhibit with those coupons came from the treatment group. In other words, the families of students who received a tour were 18 percent more likely to return to the museum than we would expect if their rate of coupon use was the same as their share of distributed coupons.

This is particularly impressive given that the treatment-group students had recently visited the museum. Their desire to visit a museum might have been satiated, while the control group might have been curious to visit Crystal Bridges for the first time. Despite having recently been to the museum, students who received a school tour came back at higher rates. Receiving a school tour cultivates a taste for visiting art museums, and perhaps for sharing the experience with others.
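
The 18 percent figure follows directly from the two shares reported above, as a two-line calculation shows.

    # Coupon-redemption arithmetic using the shares from the text.
    treatment_share_of_coupons = 0.49  # share of coupons given to treatment group
    treatment_share_of_admits = 0.58   # share of admits traced to treatment coupons

    # If treatment families redeemed at the same rate as everyone else, their
    # share of admits would equal their share of coupons; the ratio gives the
    # relative over-representation:
    relative_use = treatment_share_of_admits / treatment_share_of_coupons
    print(relative_use)  # ~1.18, i.e., about 18 percent more likely to return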

Disadvantaged Students
One consistent pattern in our results is that the benefits of a school tour are generally much larger for students from less-advantaged backgrounds. Students from rural areas and high-poverty schools, as well as minority students, typically show gains that are two to three times larger than those of the total sample. Disadvantaged students assigned by lottery to receive a school tour of an art museum make exceptionally large gains in critical thinking, historical empathy, tolerance, and becoming art consumers.
It appears that the less prior exposure to culturally enriching experiences students have, the larger the benefit of receiving a school tour of a museum. We have some direct measures to support this explanation. To isolate the effect of the first time visiting the museum, we truncated our sample to include only control-group students who had never visited Crystal Bridges and treatment-group students who had visited for the first time during their tour. The effect for this first visit is roughly twice as large as that for the overall sample, just as it is for disadvantaged students.

In addition, we administered a different version of our survey to students in kindergarten through 2nd grade. Very young students are less likely to have had previous exposure to culturally enriching experiences. Very young students make exceptionally large improvements in the observed outcomes, just like disadvantaged students and first-time visitors. When we examine effects for subgroups of advantaged students, we typically find much smaller or null effects. Students from large towns and low-poverty schools experience few significant gains from their school tour of an art museum. If schools do not provide culturally enriching experiences for these students, their families are likely to have the inclination and ability to provide those experiences on their own. But the families of disadvantaged students are less likely to substitute their own efforts when schools do not offer culturally enriching experiences. Disadvantaged students need their schools to take them on enriching field trips if they are likely to have these experiences at all.

Policy Implications
School field trips to cultural institutions have notable benefits. Students randomly assigned to receive a school tour of an art museum experience improvements in their knowledge of and ability to think critically about art, display stronger historical empathy, develop higher tolerance, and are more likely to visit such cultural institutions as art museums in the future. If schools cut field trips or switch to “reward” trips that visit less-enriching destinations, then these important educational opportunities are lost. It is particularly important that schools serving disadvantaged students provide culturally enriching field trip experiences.

This first-ever, large-scale, random-assignment experiment of the effects of school tours of an art museum should help inform the thinking of school administrators, educators, policymakers, and philanthropists. Policymakers should consider these results when deciding whether schools have sufficient resources and appropriate policy guidance to take their students on tours of cultural institutions. School administrators should give thought to these results when deciding whether to use their resources and time for these tours. And philanthropists should weigh these results when deciding whether to build and maintain these cultural institutions with quality educational programs. We don’t just want our children to acquire work skills from their education; we also want them to develop into civilized people who appreciate the breadth of human accomplishments. The school field trip is an important tool for meeting this goal.

Teacher “training” vs. Teacher “professional development”

My blog posts are picked up on Facebook via RSS feed, and Fred Martin commented there that he prefers “professional development” to “training” to describe in-service educational opportunities for teachers. It’s a good point. My adviser, Elliot Soloway, once appeared on PBS talking about how “Dogs are trained. Teachers aren’t trained. They’re taught.” “Professional development” sounds more like what executives and other knowledge workers do, so it’s a better, more respectful, and more descriptive term. I agree with all of that, but I want to offer an argument that “teacher training” is not a bad thing, and may be something we need more of, especially in computing education.
“Training” is defined as activity leading to skilled behavior. Firefighters, police officers, emergency medical technicians, and soldiers are “trained.” Training is associated with providing service to the community, which is certainly what teachers do. Training is about developing skill, and teaching is clearly a skill. Athletes train. I trained for three years for my black belt. In these senses, “training” is about learning to the point of automaticity, so that the learner can demonstrate the skill under stressful conditions.
CS1 teachers do learn, to the point of automaticity, how to help students. After a few years of teaching Media Computation, I could often tell what was wrong with a student’s program just by looking at the output image or listening to the output sound. Totally silent output sound? You may not be incrementing the target index, so all the source samples are being copied to the same target index. Black edge on your composed pictures? Probably an off-by-one error where you’re not changing the right- and bottom-most edges of the picture. That automaticity comes from knowledge of the domain and from seeing lots of examples of student work, so that you learn the common errors. Such automaticity makes it possible to help many students debug their programs in a brief class time or office hours.
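
To make the first of those diagnoses concrete, here is a minimal sketch of that copy-loop bug in plain Python, with a list of numbers standing in for sound samples rather than the actual Media Computation/JES sound objects.

    # Simplified sound-copy loop; lists stand in for sound samples
    # (this is not the actual Media Computation API).
    source = [3, -5, 8, -2, 7]       # made-up sample values
    target = [0] * len(source)

    # The bug: target_index is never incremented, so every source sample
    # lands in the same spot and the rest of the target stays zero,
    # which plays back as (near) silence.
    target_index = 0
    for source_index in range(len(source)):
        target[target_index] = source[source_index]
        # missing: target_index = target_index + 1

    print(target)  # [7, 0, 0, 0, 0] -- mostly zeros, i.e., silent output
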
A teacher’s job is stressful. It is hard for a teacher to manage a classroom of (sometimes unruly, always attention-demanding) students. A teacher must apply learning under stressful conditions, and reaching automaticity will help with multitasking around many students. However, in computing education especially, we barely have time to teach teachers the basics of computing, let alone time for them to become proficient, much less automatic. Without the time and “training” to develop those automatic responses, teachers have to work harder, spending more time to figure out each student problem.
Fred’s right — “professional development” is more respectful, and clearly conveys that teachers are knowledge workers. But “training” is also an appropriate term, one that recognizes the skilled service that teachers provide and the hard, stressful job that they have in responding to many students’ needs. In computing education especially, we need to give teachers more support that looks like “training,” and not just introduce the concepts in “professional development.”

Wednesday 14 May 2014

Teaching Code in the Classroom – Room for Debate – NYTimes.com

Remarkable debate on the NYTimes website about “Should coding be part of the elementary school curriculum?”  All the debaters have very short statements, and they’re disappointing.
  • Hadi Partovi claims “By high school, it can be too late” and “Students learn fast at a young age, before stereotypes suggest coding is too difficult, just for nerds, or just for boys” — I don’t agree with either statement.  We have lots of examples of women and under-represented minority students discovering CS in high school. It’s not at all clear that students learn everything quickly when they’re young — quantum physics and CS might both be beyond most second graders.
  • But John C. Dvorak’s claim that “This is just another ploy to sell machines to cash-strapped school districts” is also clearly wrong. The computer manufacturers are not playing a significant role in the effort to push computing into schools.
Take a look and see what you think. It’s exciting to have this kind of debate in the NYTimes!
Despite the rapid spread of coding instruction in grade schools, there is some concern that creative thinking and other important social and creative skills could be compromised by a growing focus on technology, particularly among younger students. Should coding be part of the elementary school curriculum?