Dread pirate AI

Imagine an AI agent, released freely onto the internet, that performs crime autonomously. The beneficiary of this crime has plausible deniability and no direct responsibility for or direct control over the criminal acts. But the agent is designed with propensities intended to shape the world to the human criminal’s benefit. And perhaps they maintain a well-hidden kill switch. As autonomous artificially intelligent agents advance in capability, their ability to perform acts with ethical and criminal valence, and the thread of responsibility thereof, becomes important. The legal system is riven with assumptions about human psychological dynamics; these will have to be rethought as nonhuman actors develop partial autonomy – to what extent are their behaviors foreseeable by their creators?

Cost of free education

The cost of education is much more than just tuition, room, and board. The full cost also includes the time required to acquire the education. That opportunity cost, in terms of forgone wages, is comparable to the cost of the tuition itself. Even if an online education in a MOOC is completely free, the true cost of that education, if it takes many years of full-time effort to acquire, is still going to be comparable to the cost of a traditional education.

Let’s say that the online version of the course is half as effective as the in-person version but costs nothing. Would you be willing to go to college for eight years instead of four if it meant that the tuition was free? Online education is going to have to be comparable in quality to the in-person education, since in the end it is going to be comparable (within a factor of 2 or 3) in price.
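To make the arithmetic concrete, here is a rough sketch of the comparison. The dollar figures are placeholders I made up for illustration, chosen only so that yearly forgone wages equal yearly tuition (the "comparable" case above); they are not data.

```python
def total_cost(years, tuition_per_year, forgone_wage_per_year):
    """Full cost of an education: tuition paid plus wages not earned."""
    return years * (tuition_per_year + forgone_wage_per_year)

# Hypothetical figures, for illustration only.
TUITION_PER_YEAR = 30_000       # assumed in-person tuition, room, and board
FORGONE_WAGE_PER_YEAR = 30_000  # assumed wages a full-time student gives up

# Four years in person versus eight years of a free course that is
# half as effective (so it takes twice as long).
in_person = total_cost(4, TUITION_PER_YEAR, FORGONE_WAGE_PER_YEAR)
online = total_cost(8, 0, FORGONE_WAGE_PER_YEAR)
ratio = online / in_person

print(f"in-person: ${in_person:,}  online: ${online:,}  ratio: {ratio:.2f}")
```

With these made-up numbers the free course ends up costing the same as the paid one. Lower the assumed wage and the free course wins, which is the exception for students with low current earning potential.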

The exceptions are students with low current earning potential, for example those in less developed countries. MOOCs may well establish themselves there first. MR University is an example.

I see two critical elements missing from many current MOOCy online course offerings: commitment devices and social immersion. Mastering challenging material – the stuff that really makes college worthwhile – is hard work. Without hard deadlines and consequences, most students slack off. Prepaid tuition is a commitment device that drives sustained effort. Endpoint credit is another. Peer modeling is a third. MOOCs don’t need to replicate these specific mechanisms, but they do need to engender the same commitment somehow, if they want to extend beyond feel-good self-inspirational courses. There are ways to do this.

The technology to deliver general MOOCs has existed for almost two decades (I offered one in 1995 actually, used by tens of thousands of students, but it had a very specific and focussed motivational profile). They are catching on now partly for cultural reasons: folks nowadays are conditioned to watch online video and immerse in online social interaction. These cultural developments solve some of the problems of delivering broadly effective MOOCs, but not yet all.

Scalable, tech-enabled MOOC Cheating

Lately I’ve been thinking about strategies of large-scale cheating in large classes (don’t ask). MOOCs have novel needs in this regard, which are being served by remote proctoring services. But just as education can be technologically enabled at mass scale by MOOCs, so can cheating. Imagine a test-assist service (let’s call it “remote tutoring” for marketing purposes) where the cheater screen shares the online exam with an answer-providing confederate, and the confederate delivers answers on the screen of a cellphone that is placed behind the webcam that is being used by the remote proctoring agency to monitor the exam. (Ideally, the cellphone is taped to the computer screen itself, so that sight lines to the phone fall within the main screen). With a bit of infrastructure work by the cheating company, they could deliver answers directly from a question database built up over time from prior exams, so that the cheating is as scalable and inexpensive as the MOOC itself. Such a service could be hosted in a country with weak rule of law and provide service at a very reasonable price. It will take some time for the online cheating market to develop, but once it does, would this mode of cheating make online at-home secure exams impossible, at least without demanding root access to the user’s home computer? Fully secure MOOC exams might require a distributed network of in-person testing centers, which limits the ability to reach underdeveloped areas with credit-bearing courses. These problems could accelerate the separation of education and credential acquisition.

GPS weapon lock

In Syria, the U.S. is hesitant to arm the rebels, since the weapons might be redirected against our interests. I imagine of particular concern are anti-aircraft weapons, which could be misused to attack civilian aviation in other countries. Could such weapons be engineered with GPS interlocks, such that they would not function outside of a defined geographical area? The interlock would need to be engineered deep into the structure of the weapon, such that removing the interlock disables the weapon. For modern heavily electronic weapons this should be possible.

Lessons Learned

The story of an inadvertent controlled experiment in correlations of student performance with teacher evaluation.

Alice recently taught two out of four large lecture sections of an introductory science course on a challenging topic. Alice tries to do something different each semester to improve the course. This time, she pared back extraneous details and focussed the lectures more on understanding over formula-crunching. She integrated breaks where students worked through sample questions in class, and asked them to discuss these questions with their neighbors. She conveyed high expectations.

The other instructor, Bob, followed a traditional lecture style. He felt that some students would succeed, some would fail; there wasn’t much he could do about it, so he would deliver to the students what they wanted: sample exam problems in each lecture.

The four sections of the course were identical in all aspects other than lecture: shared homework, labs, recitations, exams, review sessions. Each lecturer contributed half of the questions on each exam. Alice asked the course administrator to track student performance on the exams by lecture section, to see if the new teaching techniques had any effect. Alice was also curious about the end-of-semester student evaluations. Let’s start with those.

Alice was hammered in the student ratings of teaching effectiveness, by far the lowest ratings of any course she had taught (including this same course in the past). Bob’s ratings were fine – a full two points on a six-point scale separated them. Beyond raw hatred and anger, Alice’s students overwhelmingly expressed a single sentiment: It was grossly unfair that Bob’s students received sample exam questions in every lecture and they did not. Alice’s students felt strongly that they were not properly prepared for the exams because they were not taught how to work through each kind of problem.

In reality? Both of Alice’s lecture sections performed better than both of Bob’s sections on every exam. The difference increased as the semester proceeded. Roughly one third of Alice’s students moved up one letter grade increment as a result.

Alice considered possible confounding effects: Do better students tend to choose lectures at a particular time of day? Possibly, but colleagues who have studied the effect said that the time-of-day trend tends to be opposite to what Alice and Bob observed, and there was also no significant time-based sub-trend between the two separate sections that each taught. Did a subpopulation of Alice’s students attend Bob’s lectures? Mixing between the sections would generally suppress differences in performance, not accentuate or reverse them*. Otherwise, this was like a controlled experiment: all course elements beyond lecture were identical. The students punished Alice for lowering their exam scores, when in reality she raised them. How much better might Alice’s students have done had they recognized and engaged with what was working?

Alice’s Dean likes to say that the student evaluations are an imperfect measure of teacher performance. But an imperfect measure is one with a modest positive correlation to the target characteristic, not a negative correlation. Alice notes that the university systematically measures student opinion across course and semester, but does not systematically measure student learning. An institution will reward what it can recognize, but will drift otherwise.

The hatred and anger continue into the followup course in the next semester (with a different co-lecturer), as does the improved performance of Alice’s students. The numerical average of the student evaluations continues to be poor, but the student comments now also contain some lengthy positive statements, several of which note with dismay the attitudes and practices of other students.

What attitudes and practices? For many, this required course is perceived mainly as a barrier against the student gaining admission to his or her desired major: a mechanical hurdle, not an intellectual challenge. Internet discussion groups and commercial websites deliver homework solutions. The “response function” to homework is negative: more challenging homework means more cheating and less total student effort. Local tutoring services deliver plug-and-chug training to muddle through exams. And old sample exams.

No-one wants to think of their own actions as illegitimate. But cheating on homework can be made legitimate, in a student’s mind, if the course can be made illegitimate. The motto of the students’ private Facebook group, joined by a majority of the class: “Seriously, f— this course”.

A psychological challenge to the students’ delegitimization of a course – through an instructor’s high expectations for learning – can elicit a forceful defensive emotional response, to push the illegitimacy back onto the course.

Anti-correlation.

*Assuming that the two sets of lectures initially have similar student populations, whatever subgroup moves between lectures can be matched against an existing equivalent subgroup in the destination lecture that receives the same “dose” of lecture. The combination of both subgroups then contributes nothing to between-lecturer differences in student performance.
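The footnote's argument can be sketched numerically. This is a toy model with my own assumptions, not anything measured in the course: each lecture confers a fixed average exam score on whoever attends it, and a section's observed average is the attendance-weighted mix. The function name and the scores are hypothetical.

```python
def observed_gap(score_a, score_b, frac_a_to_b, frac_b_to_a):
    """Observed difference in section averages (A minus B) under mixing.

    score_a, score_b: average score conferred by each lecture (assumed fixed).
    frac_a_to_b: fraction of A's enrolled students who attend B's lectures.
    frac_b_to_a: fraction of B's enrolled students who attend A's lectures.
    """
    mean_a = (1 - frac_a_to_b) * score_a + frac_a_to_b * score_b
    mean_b = (1 - frac_b_to_a) * score_b + frac_b_to_a * score_a
    return mean_a - mean_b

# Hypothetical scores: lecture A is worth 80 points, lecture B 70.
for f_ab, f_ba in [(0.0, 0.0), (0.2, 0.0), (0.2, 0.2)]:
    gap = observed_gap(80, 70, f_ab, f_ba)
    print(f"A->B {f_ab:.0%}, B->A {f_ba:.0%}: observed gap {gap:.1f}")
```

In this model the gap works out to (1 - frac_a_to_b - frac_b_to_a) times the true difference, so any mixing short of half the students swapping lectures shrinks the measured gap without flipping its sign.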

Hero Reviewers

Referee reporting is an under-recognized service towards the community. I’d like to use this post to thank specifically those hero referees who have handled three or more manuscripts for AIP Advances in the past year:

Wu, Tom
Gambino, Jeff
Hong, Hawoong
Jiang, Hua
Liao, Albert
Mohammad, S.
Pathak, Rajeev
Pautrat, Alain
Albrecht, Joachim
Ansermet, Jean-Philippe
Ariando, Ariando
Aruta, Carmela
Bag, Swarup
Chang, Kai
Chen, Fang-Chung
Chen, Yongyao
Chibowski, Emil
Ciftja, Orion
Das, G. P.
Kan, Erjun
Kitamura, Kyoko
Lew Yan Voon, Lok C.
Li, Qiuzi
Morgan, Benjamin
Nieto, Juan
Shih, Yanhua
Smet, Philippe
Song, Juntao
Varignon, Julien
Vij, J. K.
Wen, Hai-Hu
Xiao, Haiyan
Xu, Jun
Zumer, Slobodan

Econophysics: the correct reference regime?

One classic starting point for quantitative macroeconomic analysis is the assumption of an efficient market. Perturbations can then be applied about this regime, to account for real-world complications. A relative who runs a major division of a biomedical technology firm recently commented that in business, “perfect competition sucks”. Business managers look for market segments in which they have an advantage and can obtain a higher level of profit as a result. Essentially, every business is looking for a little local monopoly, a little local market failure to serve. The competition between firms then helps to minimize the magnitudes of these market failures. This point of view brings a striking observation into view: all of the market actors in an economy have as their main goal the avoidance of an efficient market, i.e. the very reference point on which much of economic analysis is based. Is there another possible reference point, perhaps a “distributed mini-monopoly” assumption, that could provide an alternative basis for analysis?

These two limits remind me in a rough sense of the nearly free electron model and tight binding model for solids: each one is appropriate in a different limit.

Phone Physics

Of course, many computer games make use of physics engines, but a few bring out salient fundamental laws of physics in a clear manner. Here are two iPhone games that do an exceptional job in this regard:

  • Osmos teaches momentum conservation, gravitation, and orbital mechanics. You are an abstract circular creature that propels by expelling mass; your ejecta engage in inelastic collisions with other objects. Many levels also host gravitational interactions.
  • Hundreds is a charming exploration in the statistical mechanics of hard-core gases, with important roles for fluctuations, PV work (when moving barriers), and free volume. It’s not entirely clear if momentum is 100% conserved in all collisions (some edge cases seem to produce accelerations) but overall the system acts very “physically”.

Both of these games take certain liberties with “the physics” for gameplay purposes, but they generally do this by extending the physics in new directions (negative masses, switchable behaviors, etc.) while preserving some underlying sense of physical law.

Any other suggestions?