Scalable, tech-enabled MOOC Cheating

Lately I’ve been thinking about strategies for large-scale cheating in large classes (don’t ask). MOOCs have novel needs in this regard, which are being served by remote proctoring services. But just as education can be technologically enabled at mass scale by MOOCs, so can cheating. Imagine a test-assist service (let’s call it “remote tutoring” for marketing purposes) where the cheater screen-shares the online exam with an answer-providing confederate, and the confederate delivers answers on the screen of a cellphone that is placed just behind the webcam that the remote proctoring agency uses to monitor the exam. (Ideally, the cellphone is taped to the computer screen itself, so that sight lines to the phone fall within the main screen.) With a bit of infrastructure work, the cheating company could deliver answers directly from a question database built up over time from prior exams, so that the cheating is as scalable and inexpensive as the MOOC itself. Such a service would of course be hosted in a country with weak rule of law and would provide service at a very reasonable price. It will take some time for the online cheating market to develop, but once it does, would this mode of cheating make online at-home secure exams impossible, at least without demanding root access to the user’s home computer? Fully secure MOOC exams might require a distributed network of in-person testing centers, which sadly limits the ability to reach underdeveloped areas with credit-bearing courses.

GPS weapon lock

In Syria, the U.S. is hesitant to arm the rebels, since the weapons might be redirected against our interests. I imagine of particular concern are anti-aircraft weapons, which could be misused to attack civilian aviation in other countries. Could such weapons be engineered with GPS interlocks, such that they would not function outside of a defined geographical area? The interlock would need to be engineered deep into the structure of the weapon, such that removing the interlock disables the weapon. For modern heavily electronic weapons this should be possible.
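
The geographic test at the heart of such an interlock is simple to state. Here is a minimal sketch in Python, assuming a circular permitted zone defined by a center and a radius; the function names, the zone shape, and the choice to refuse operation when there is no position fix are all illustrative assumptions, not a description of any real system. A software check like this is the easy part; the hard part raised above (making the check impossible to remove without disabling the device) is not addressed here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; adequate for a rough sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def inside_geofence(fix, center, radius_km):
    """Return True only if a position fix exists and lies within the zone.

    fix and center are (latitude, longitude) tuples in degrees.
    A missing fix (None) is treated as "outside" (hypothetical fail-safe choice).
    """
    if fix is None:
        return False
    return haversine_km(fix[0], fix[1], center[0], center[1]) <= radius_km
```

Treating a missing fix as "outside" is a design choice: it stops someone from defeating the check by shielding the antenna, at the cost of the device failing when the signal is jammed or lost.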

Lessons Learned

The story of an inadvertent controlled experiment in correlations of student performance with teacher evaluation.

Alice recently taught two out of four large lecture sections of an introductory science course on a challenging topic. Alice tries to do something different each semester to improve the course. This time, she pared back extraneous details and focussed the lectures more on understanding over formula-crunching. She integrated breaks where students worked through sample questions in class, and asked them to discuss these questions with their neighbors. She conveyed high expectations.

The other instructor, Bob, followed a traditional lecture style. He felt that some students would succeed, some would fail; there wasn’t much he could do about it, so he would deliver to the students what they wanted: sample exam problems in each lecture.

The four sections of the course were identical in all aspects other than lecture: shared homework, labs, recitations, exams, review sessions. Each lecturer contributed half of the questions on each exam. Alice asked the course administrator to track student performance on the exams by lecture section, to see if the new teaching techniques had any effect. Alice was also curious about the end-of-semester student evaluations. Let’s start with those.

Alice was hammered in the student ratings of teaching effectiveness, receiving by far the lowest ratings of any course she had taught (including this same course in the past). Bob’s ratings were fine – a full two points on a six-point scale separated them. Beyond raw hatred and anger, Alice’s students overwhelmingly expressed a single sentiment: It was grossly unfair that Bob’s students received sample exam questions in every lecture and they did not. Alice’s students felt strongly that they were not properly prepared for the exams because they were not taught how to work through each kind of problem.

In reality? Both of Alice’s lecture sections performed better than both of Bob’s sections on every exam. The difference increased as the semester proceeded. Roughly one third of Alice’s students moved up one letter grade increment as a result.

Alice considered possible confounding effects: Do better students tend to choose lectures at a particular time of day? Possibly, but colleagues who have studied the effect said that the time-of-day trend tends to be opposite to what Alice and Bob observed, and there was also no significant time-based sub-trend between the two separate sections that each taught. Did a subpopulation of Alice’s students attend Bob’s lectures? Mixing between the sections would generally suppress differences in performance, not accentuate or reverse them*. Otherwise, this was like a controlled experiment: all course elements beyond lecture were identical. The students punished Alice for lowering their exam scores, when in reality she raised them. How much better might Alice’s students have done had they recognized and engaged with what was working?

Alice’s Dean likes to say that the student evaluations are an imperfect measure of teacher performance. But an imperfect measure is one with a modest positive correlation to the target characteristic, not a negative correlation. Alice notes that the university systematically measures student opinion across course and semester, but does not systematically measure student learning. An institution will reward what it can recognize, but will drift otherwise.

The hatred and anger continue into the followup course in the next semester (with a different co-lecturer), as does the improved performance of Alice’s students. The numerical average of the student evaluations continues to be poor, but the student comments now also contain some lengthy positive statements, several of which note with dismay the attitudes and practices of other students.

What attitudes and practices? For many, this required course is perceived mainly as a barrier against the student gaining admission to his or her desired major: a mechanical hurdle, not an intellectual challenge. Internet discussion groups and commercial websites deliver homework solutions. The “response function” to homework is negative: more challenging homework means more cheating and less total student effort. Local tutoring services deliver plug-and-chug training to muddle through exams. And old sample exams.

No-one wants to think of their own actions as illegitimate. But cheating on homework can be made legitimate, in a student’s mind, if the course can be made illegitimate. The motto of the students’ private Facebook group, joined by a majority of the class: “Seriously, f— this course”.

A psychological challenge to the students’ delegitimization of a course – through an instructor’s high expectations for learning – can elicit a forceful defensive emotional response, to push the illegitimacy back onto the course.

Anti-correlation.

*Assuming that the two sets of lectures initially have similar student populations, whatever subgroup moves between lectures can be matched against an existing equivalent subgroup in the destination lecture that receives the same “dose” of lecture. The combination of both subgroups then contributes nothing to between-lecturer differences in student performance.
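
The footnote's argument can be checked with a few lines of arithmetic. The sketch below assumes a deliberately simple model that is not in the original story: a student's expected exam score depends only on which lecture they actually attend, and the two enrolled populations are otherwise equivalent. The function name and the numbers in the usage note are hypothetical.

```python
def observed_section_means(mean_a, mean_b, frac_a_to_b=0.0, frac_b_to_a=0.0):
    """Mean score by ENROLLED section when some students attend the other lecture.

    mean_a, mean_b : expected score for a student who attends lecture A or B
    frac_a_to_b    : fraction of A's enrolled students who attend B's lectures
    frac_b_to_a    : fraction of B's enrolled students who attend A's lectures
    (Simplified model: score depends only on the lecture attended.)
    """
    obs_a = (1.0 - frac_a_to_b) * mean_a + frac_a_to_b * mean_b
    obs_b = (1.0 - frac_b_to_a) * mean_b + frac_b_to_a * mean_a
    return obs_a, obs_b


def observed_gap(mean_a, mean_b, frac_a_to_b=0.0, frac_b_to_a=0.0):
    """Observed A-minus-B gap; equals (1 - f_ab - f_ba) times the true gap."""
    obs_a, obs_b = observed_section_means(mean_a, mean_b, frac_a_to_b, frac_b_to_a)
    return obs_a - obs_b
```

With hypothetical true means of 80 and 70, `observed_gap(80, 70, frac_a_to_b=0.2)` shrinks the ten-point gap to about eight. In this model the sign of the gap flips only if the two crossover fractions sum to more than one, i.e. only if most students attend the lecture they are not enrolled in.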

Hero Reviewers

Referee reporting is an under-recognized service to the community. I’d like to use this post to thank specifically those hero referees who have handled three or more manuscripts for AIP Advances in the past year:

Wu, Tom
Gambino, Jeff
Hong, Hawoong
Jiang, Hua
Liao, Albert
Mohammad, S.
Pathak, Rajeev
Pautrat, Alain
Albrecht, Joachim
Ansermet, Jean-Philippe
Ariando, Ariando
Aruta, Carmela
Bag, Swarup
Chang, Kai
Chen, Fang-Chung
Chen, Yongyao
Chibowski, Emil
Ciftja, Orion
Das, G. P.
Kan, Erjun
Kitamura, Kyoko
Lew Yan Voon, Lok C.
Li, Qiuzi
Morgan, Benjamin
Nieto, Juan
Shih, Yanhua
Smet, Philippe
Song, Juntao
Varignon, Julien
Vij, J. K.
Wen, Hai-Hu
Xiao, Haiyan
Xu, Jun
Zumer, Slobodan

Econophysics: the correct reference regime?

One classic starting point for quantitative macroeconomic analysis is the assumption of an efficient market. Perturbations can then be applied about this regime, to account for real-world complications. A relative who runs a major division of a biomedical technology firm recently commented that in business, “perfect competition sucks”. Business managers look for market segments in which they have an advantage and can obtain a higher level of profit as a result. Essentially, every business is looking for a little local monopoly, a little local market failure to serve. The competition between firms then helps to minimize the magnitudes of these market failures. This point of view brings out a striking observation: all of the market actors in an economy have as their main goal the avoidance of an efficient market, i.e. the very reference point on which much of economic analysis is based. Is there another possible reference point, perhaps a “distributed mini-monopoly” assumption, that could provide an alternative basis for analysis?

These two limits remind me in a rough sense of the nearly free electron model and tight binding model for solids: each one is appropriate in a different limit.

Phone Physics

Of course, many computer games make use of physics engines, but a few bring out salient fundamental laws of physics in a clear manner. Here are two iPhone games that do an exceptional job in this regard:

  • Osmos teaches momentum conservation, gravitation, and orbital mechanics. You are an abstract circular creature that propels by expelling mass; your ejecta engage in inelastic collisions with other objects. Many levels also host gravitational interactions.
  • Hundreds is a charming exploration in the statistical mechanics of hard-core gases, with important roles for fluctuations, PV work (when moving barriers), and free volume. It’s not entirely clear if momentum is 100% conserved in all collisions (some edge cases seem to produce accelerations) but overall the system acts very “physically”.

Both of these games take certain liberties with “the physics” for gameplay purposes, but they generally do this by extending the physics in new directions (negative masses, switchable behaviors, etc.) while preserving some underlying sense of physical law.

Any other suggestions?

Non-Newtonian baby food

Toddlers are prone to high-velocity explorations of foodstuffs. This is particularly problematic with the more liquid foodstuffs – the sort of purées typical of baby food. Why not engineer the purée to be non-Newtonian? A little cornstarch or equivalent might mitigate the big splashes while maintaining normal consistency for lower-speed manipulations.
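
To make "non-Newtonian" concrete, here is a rough sketch using the textbook power-law fluid model, in which apparent viscosity scales as the shear rate raised to the power (n - 1). A flow index n above 1 means shear thickening: the faster the spoon hits, the stiffer the purée. The parameter values are purely illustrative assumptions, and real cornstarch suspensions thicken more abruptly than a simple power law suggests.

```python
def apparent_viscosity(shear_rate, consistency=1.0, flow_index=1.5):
    """Apparent viscosity of a power-law fluid: K * shear_rate**(n - 1).

    shear_rate  : shear rate in 1/s, must be positive
    consistency : K, the consistency index (illustrative default)
    flow_index  : n; n > 1 shear-thickening, n = 1 Newtonian, n < 1 shear-thinning
    """
    if shear_rate <= 0:
        raise ValueError("shear_rate must be positive")
    return consistency * shear_rate ** (flow_index - 1.0)
```

With n = 1 the function returns the same viscosity at every shear rate, which is the ordinary Newtonian purée; with n > 1 a slow stir sees a thin fluid and a fast slap sees a much thicker one.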

Organization

It is generally acknowledged that central planning is ineffective at the level of an entire economy. A unitary central decision maker is neither effective nor efficient. But at smaller scales, individual firms and noncommercial institutions are typically governed in a central manner. (Similarly, entire countries’ political systems are unitary.)

If this hybrid unitary/multinary scheme is the most efficient form of organization, economy-wide, then what determines the size scale for the crossover?

To what extent is it an accident of organizational scale that society tends to generate a small number of highly paid leaders?