Single-voxel strategy for ROI analysis may cause “voodoo correlations”

(Note: this entry was once lost in the server crash of April 2009.)

Recently, a paper by Ed Vul caused great controversy in both the psychology and neuroscience communities. His argument led to a more general discussion about an inappropriate way of analyzing experimental data, so-called “circular analysis”. Regardless of whether the studies Vul criticized actually ran a “non-independent two-step analysis”, the potential danger of such circular analysis has become broadly known among researchers.

But I want to ask: is that the only factor causing “voodoo correlations”? One of Vul’s points was indeed important: his example of a correlation between the temperature reading at a certain weather station and a specific set of stocks. This example shows that even two independent variables can show a significant, high correlation because of unavoidable noise or pure chance. It’s a very important argument.

It also points to a danger in picking a single-voxel value or statistic from fMRI data. The example above suggests that a single-voxel statistic can be extremely highly correlated with a behavioral index by pure chance. This can lead to an incorrect conclusion.
To evaluate this phenomenon quantitatively, I performed a very simple simulation. I assumed that 10, 20, 30, 40 or 50 participants joined an fMRI experiment. For the fMRI data, the number of voxels was assumed to be 64 * 64 * 32 (a 64 * 64 in-plane matrix and 32 contiguous slices). The measured BOLD signals (more precisely, beta values) were assumed to be drawn entirely at random from a normal distribution and to be independent across voxels. At the same time, the participants were assumed to perform a social neuroscience task from which one behavioral index was obtained per participant.

Consequently, a set of BOLD signals and a behavioral index was obtained from each participant, and we can easily compute the correlation between the BOLD signal of each voxel and the behavioral index. But please remember: both were completely random and independent of each other. OK, let’s see the results. Below are the distributions of the correlations.

As you can see, the mean correlation was zero in each plot. But the maximum (or minimum) correlation was extremely high (for n = 10, it was r > 0.95 in absolute value!!), even though these variables were completely random and independent! The inflation did shrink as the number of participants increased, but even at n = 30 the extreme correlation was still r > 0.75.
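For readers who want to try this themselves, here is a minimal sketch of the whole-brain simulation in Python with NumPy. It is not the code behind the plots above: the function name `null_correlations`, the random seed, and the choice of a standard normal distribution for the behavioral index are my assumptions for illustration, so the exact numbers will differ from run to run.

```python
import numpy as np

def null_correlations(n_participants, n_voxels, rng):
    """Pearson r between each voxel's random 'beta' values and a random behavioral index."""
    # Assumption: betas and behavior are both standard normal and mutually independent.
    betas = rng.standard_normal((n_participants, n_voxels))
    behavior = rng.standard_normal(n_participants)
    # z-score both sides (population std); r is then the mean product of z-scores.
    zb = (betas - betas.mean(axis=0)) / betas.std(axis=0)
    zy = (behavior - behavior.mean()) / behavior.std()
    return zy @ zb / n_participants

rng = np.random.default_rng(0)      # arbitrary seed, for reproducibility only
n_voxels = 64 * 64 * 32             # whole-brain voxel count used in the text
for n in (10, 20, 30, 40, 50):
    r = null_correlations(n, n_voxels, rng)
    print(f"n={n:2d}  mean r={r.mean():+.3f}  min r={r.min():+.3f}  max r={r.max():+.3f}")
```

The mean should sit near zero for every n, while the minimum and maximum move toward zero only slowly as n grows.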

These results can be summarized as:

  • Fewer participants make it easier to obtain a spuriously inflated correlation, even when the variables are independent of each other
  • For example, to keep chance correlations below the criterion Vul proposed (r < 0.75), more than 30 participants are needed

But some readers may wonder: “This is a result from whole-brain data. With an ROI analysis, it would be different.” Sure, that may be true. As Lieberman and/or Jabbi pointed out, social neuroscience studies usually pick single-voxel statistics from ROIs that are determined not only by the single fMRI study itself but also in the context of the social neuroscience literature (human or primate studies reporting activations in the striatum, amygdala, OFC, etc.). As long as voxels are picked this way, such an inflated correlation may not be a problem.

That sounds reasonable, but it may not be correct. Why? Because such an analysis ignores an assumption proposed by Kanwisher’s group: that the data should be uniform within a single ROI and should be described by the average across all voxels in that ROI, not by a single-voxel statistic.

In fact, in the simulation above, the “average” correlation across all voxels was zero; only the single-voxel statistics with the “best” (maximum or minimum) values showed inflated correlations. That is to be expected.

Next, to evaluate the effect of the single-voxel strategy in an ROI analysis, I ran another simulation. Here, I assumed an ROI with 300 voxels. The other parameters were the same as in the simulation above.

Wow, it’s surprising! Even when the number of voxels was only 300, an inflated correlation of r > 0.80 appeared for n = 10. Needless to say, the “averaged” correlations were zero, but the maximum and minimum correlations were highly inflated. This result clearly demonstrates that even in an ROI analysis, the single-voxel strategy may lead to incorrect conclusions.

In addition, I computed one more example: with n = 18 and an ROI of 50 voxels (typical experimental parameters in social neuroscience), the single-voxel strategy returned an inflated correlation of r = 0.55. These simulations may be a good lesson for interpreting correlation analyses in social neuroscience: a correlation between single-voxel values and behavioral indices should at least exceed r = 0.55, and if it does not, it is questionable.
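The two ROI cases above (300 voxels with n = 10, and 50 voxels with n = 18) can be sketched as follows. This is only an illustration, not the original simulation code: the function name `best_voxel_r`, the seed, and the idea of repeating each case over 2000 simulated studies are my assumptions. Repeating the simulation shows how large the “best” voxel typically is, rather than relying on one run.

```python
import numpy as np

def best_voxel_r(n_participants, n_voxels, n_studies, rng):
    """Largest |r| inside a pure-noise ROI, one value per simulated study."""
    # Assumption: betas and behavior are independent standard normal draws.
    betas = rng.standard_normal((n_studies, n_participants, n_voxels))
    behavior = rng.standard_normal((n_studies, n_participants, 1))
    # z-score along the participant axis; r is the mean product of z-scores.
    zb = (betas - betas.mean(axis=1, keepdims=True)) / betas.std(axis=1, keepdims=True)
    zy = (behavior - behavior.mean(axis=1, keepdims=True)) / behavior.std(axis=1, keepdims=True)
    r = (zb * zy).mean(axis=1)              # shape: (n_studies, n_voxels)
    return np.abs(r).max(axis=1)            # best single voxel per study

rng = np.random.default_rng(0)              # arbitrary seed
for n, v in ((10, 300), (18, 50)):
    best = best_voxel_r(n, v, n_studies=2000, rng=rng)
    print(f"n={n}, {v} voxels: median best |r| = {np.median(best):.2f}, "
          f"95th percentile = {np.percentile(best, 95):.2f}")
```

The median tells you what a typical study would report from noise alone, and the 95th percentile gives a rough bar that a real single-voxel correlation would need to clear.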

In the end, my conclusions are:

  • More than 10 participants should be included in a single study
  • Averaged statistics across all voxels within a single ROI should be used, not single-voxel statistics
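To make the second point concrete, here is a rough sketch comparing the two strategies on pure noise. It assumes the n = 18, 50-voxel case and the r = 0.55 bar from the text; the helper `pearson_r`, the seed, and the 2000 simulated studies are hypothetical choices of mine. For each study it takes either the best single voxel or the correlation of the ROI-averaged signal, and counts how often each exceeds the bar.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation along the last axis; x and y broadcast against each other."""
    xc = x - x.mean(axis=-1, keepdims=True)
    yc = y - y.mean(axis=-1, keepdims=True)
    return (xc * yc).sum(axis=-1) / np.sqrt((xc**2).sum(axis=-1) * (yc**2).sum(axis=-1))

rng = np.random.default_rng(0)                       # arbitrary seed
n_studies, n_voxels, n_participants = 2000, 50, 18   # assumed setup from the text
betas = rng.standard_normal((n_studies, n_voxels, n_participants))
behavior = rng.standard_normal((n_studies, 1, n_participants))

# Strategy 1: correlate every voxel, then keep the best one per study.
r_voxels = pearson_r(betas, behavior)                # (n_studies, n_voxels)
r_peak = np.abs(r_voxels).max(axis=1)

# Strategy 2: average the signal over the ROI first, then correlate once.
r_roi_mean = np.abs(pearson_r(betas.mean(axis=1), behavior[:, 0, :]))

threshold = 0.55
print("peak-voxel rate of |r| > 0.55: ", (r_peak > threshold).mean())
print("ROI-average rate of |r| > 0.55:", (r_roi_mean > threshold).mean())
```

Because the ROI average is a single test per study, it behaves like one ordinary null correlation, while the peak voxel gets one chance per voxel to cross the bar.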

Regardless of “circular analysis”, careless data processing such as the single-voxel strategy can easily lead to incorrect conclusions. Be careful!
