Sunday, March 11, 2012

Replication issues and a solution

Most social psychologists were unaware of the journal PLoS One until John Bargh brought it to their attention a week ago.  PLoS One published an article by Doyen et al. that failed to replicate Bargh's famous priming task, in which seeing words associated with the concept 'elderly' led people to subsequently walk more slowly.  Doyen et al. provided evidence to suggest that rather than primes affecting the subjects themselves, experimenter expectancies could have altered how experimenters interacted with subjects and recorded their walking speed.  By this account it was a self-fulfilling prophecy on the part of the experimenters, not a true priming effect in the subjects.  Bargh wrote a blog post at Psychology Today that was pretty hostile to the Doyen paper as well as to PLoS One.  Some of the points Bargh made probably weren't such a good idea, especially arguing that PLoS One, at the forefront of the Open Access movement, is a journal that indiscriminately accepts all articles to make money (this is pretty much the direct opposite of the truth - and by the way, they just rejected a paper I'm a co-author on!).  As a result, the rhetoric itself is now getting a fair amount of attention in blogs (here and here).  That said, Bargh is probably right about the larger issues at stake.

In essence, Doyen et al. turned their initial failure to replicate into an opportunity.  By introducing an expectancy effect (with some experimenters expecting subjects to speed up and others expecting them to slow down) they were able to manipulate subject walking speed.  This is not surprising - my first advisor in graduate school, Robert Rosenthal, ran studies like this with rats back in the 1960s.  Rats whose experimenters expected them to perform better in a maze did perform better (objectively better).  The critical questions are (a) do expectancy effects explain the previously observed elderly-slow effects and (b) are priming-induced automatic behavior effects real.  Let's take these in turn and then I will suggest a solution for the larger issue of replication, particularly within social psychology, where the issue is especially salient due to the recent Stapel faked data debacle.

Do expectancy effects explain the elderly-slow effects?
Possibly, but my hunch is that they don't.  The point of expectancy effects is that you can take almost any effect, null or real, and move it around through subject or experimenter expectations.  There was a big difference between the Bargh and Doyen studies.  Doyen et al. went out of their way to explicitly induce expectations in their experimenters (who in a sense were now really the subjects).  In contrast, Bargh et al. went out of their way to avoid giving their experimenters any sense of the priming conditions or the expected effects.  Is there a sliver of possibility that experimenter expectancies were involved in the Bargh study?  It is incredibly difficult to rule this out 100%, but I think Bargh did a better job than most of ruling it out.  I think skeptics of scientific findings should refrain from saying 'Because there's a 0.01% chance of an effect being due to a different mechanism than the one proposed we should assume the finding can't be trusted'.  There's a proportionality that's needed but often lacking in such discussions.  Most results and accounts of such results are not 100% airtight, but this does not mean they are 0% correct either.

Years ago, I tried to replicate the elderly-slow effect and failed.  I don't pretend to know what that means.  I also tried to replicate the mere exposure effect multiple times before it worked, but once I figured it out it continued to replicate reliably (though my twist on it never worked, hence no publications on the effort).  There are many more possible reasons for failure than for success (think about the runner who wins vs. loses a race).  Even with my own failure to replicate I still believe in the effect.  Despite what some have said, there really are countless conceptual replications in which priming someone with the concept of elderly changed their behavior in some way to become more like the stereotype of elderly behavior (see reviews here and here).  There's even nice data showing that the amount of time a subject has spent with the elderly in their lifetime modulates the effect (here).  That could not be explained in terms of experimenter expectancies without really tortured logic.

Priming-induced automatic behavior
At a certain level it does not matter whether the exact primes Bargh used produce a change in walking speed over the exact distance he measured.  People say 'We need to replicate this exactly.  Conceptual replications aren't good enough'.  Who cares about this specific manipulation?  Are we about to start using it as an intervention to treat patients?  Nope.  What we care about is whether priming-induced automatic behavior in general is a real phenomenon.  Does priming a concept linguistically cause us to act as if we embody the concept within ourselves?  The answer to this question is a resounding yes.  This was a shocking finding when Bargh first discovered it in 1996.  We had long known that priming a concept can lead us to interpret another person's ambiguous behavior in terms of the prime.  However, ever since movie theaters in the 1950s failed in their attempts to use subliminal priming to get us to buy more Coca-Cola, we have assumed that primes couldn't change our behavior.  Since Bargh's initial findings, hundreds of studies focusing on a wide range of behaviors, stereotypes, and contexts have shown this general class of phenomena to be real.  It's also been extended to show that goals and motivation can be primed too.  I don't think it's fully agreed upon why these effects occur, but I think the existence proof is complete.

A way forward for social psychology
Social psychologists are notorious for producing counterintuitive findings that are enchanting to many, but headscratch-inducing to others.  It's also true that pure replications do not get published often enough.  No top journal will publish pure replications ('what's its novel contribution?') and no one can make a name for themselves running pure replications.  Additionally, because there are both good and bad reasons for the failure to replicate, it's even harder to get failures to replicate published.

Here's my solution.  Each year social psychologists nominate 10 findings (the number is arbitrary) from the previous year or two that they would like to see replicated.  These would be findings that if solidified would be extremely significant to the field (but if false, should be done away with quickly).  Perhaps this would happen over the summer.  Then all first- and second-year graduate students in social psychology would be required to choose one of these to replicate as closely as possible (or perhaps with a gradient from exact to conceptual replication).  This would be part of their training - learning how to run studies well.  They could add other conditions of interest, but the main goal would be to institutionalize the replication of the most important recent findings.  Everyone would get a first-author publication in the to-be-created online Journal of Psychology Replications (I'm making that up).  Once a study was in this replication pool for a year, an initial meta-analysis of all the replications would be written up, and this would then be updated as more studies come in during the second year.

This would be a useful exercise for new graduate students.  They would each get a publication from this.  And the field would know pretty definitively within a year or two which effects replicate and which don't.  There would be no file drawer effects because all studies would be included regardless of the p-values obtained.
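
For readers who want to see roughly what the meta-analysis step might look like, here is a minimal sketch in Python.  It assumes each replication reports a standardized effect size and its sampling variance, and it pools them with simple inverse-variance (fixed-effect) weights.  That choice is an assumption for illustration; a random-effects model may well be more appropriate when replications differ in procedure.  The function name and all of the numbers are hypothetical, not data from any real study.

```python
import math


def pooled_effect(effects, variances):
    """Inverse-variance (fixed-effect) pooled estimate and its standard error.

    effects   -- one standardized effect size per replication
    variances -- the sampling variance of each effect size
    """
    if not effects or len(effects) != len(variances):
        raise ValueError("need one variance per effect, and at least one study")
    # More precise studies (smaller variance) get larger weights.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    estimate = sum(w * d for w, d in zip(weights, effects)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return estimate, standard_error


if __name__ == "__main__":
    # Hypothetical replication pool: every study goes in, whatever its p-value.
    effects = [0.40, 0.10, -0.05, 0.25]
    variances = [0.04, 0.05, 0.04, 0.10]

    estimate, se = pooled_effect(effects, variances)
    low, high = estimate - 1.96 * se, estimate + 1.96 * se
    print(f"pooled effect = {estimate:.3f}, 95% CI [{low:.3f}, {high:.3f}]")
```

Updating the analysis in the second year would just mean appending the new studies to the two lists and rerunning it.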

Follow me on twitter: @social_brains