Uploading your dreams to YouTube

Jack Gallant of Berkeley and colleagues recently demonstrated how they could use a computer to reconstruct a video image presented to a subject in a functional MRI machine. The work sounds like the stuff of science fiction. It is anything but. The work actually hints at how we might develop a mind-machine interface to allow people with complete paralysis or severe disabilities to communicate their thoughts and control a computer or equipment.

The perhaps more fanciful outcome of later studies in this area might be that we could one day upload our dreams or images from our imagination to YouTube…but that’s probably getting a little ahead of ourselves.

Anyway, I discuss the technical details in my MRI column on SpectroscopyNOW on 1st October.

However, I wanted to ask Gallant more about the validity of the technique, which superficially seems like a statistical trick and not actual imaging of what’s going on in a person’s head. This is what he had to say:

“Thanks for your interest in this work, and on your great article! Our models are validated to much higher standards than most other work in the field. Almost all fMRI studies focus on mere statistical significance. We focus on importance, that is, we focus on maximizing the predictive power of the models. I think it is our focus on predictive power that produces such good models. The model presented in this paper only focuses on the earliest stage of visual processing – primary visual cortex and there is lots of evidence that indicates that the results of studies in this area do not depend much on the subject, and in fact using authors is standard practice for studies in this area. This is a legitimate question, but I have no doubt that the results can be confirmed in others.”

I also probed him about what would be the next step and how others in the field had responded to the work:

“The human brain probably consists of somewhere between 200-500 distinct processing modules, each of which performs a different function. The visual system alone probably includes 50-75 distinct modules. At this time, we only have good computational models of two of these areas. Our focus in past years and in the future is to construct models of as many of these areas as we can.

Everyone has been very positive about the work, though within the field this particular paper is viewed correctly as a natural evolutionary step in computational modelling, rather than a revolutionary step that appears to have emerged de novo. We have been working on projects like this for over a decade, and there are several dozen other laboratories that also engage in quantitative modelling of the visual system. New models emerge all the time.”

Given that fMRI monitors blood flow at bulk scale and low resolution, even at the capillary level, I was curious how its measurements could be comparable to the fine-grained neural activity in the visual cortex:

“I share your skepticism about fMRI, but if it was ‘wholly incomparable’ then nothing that we discover using fMRI would be applicable to the underlying neurons. fMRI measures a complex interaction of blood volume, blood flow rate and blood oxygenation. These are all indirectly coupled to and affected by the aggregate activity of the underlying neuropil. Most of the information available in the underlying neurons is lost in this transformation, but some trace remains. Furthermore, brain activity signals measured using fMRI are contaminated by other non-neural factors, such as changes in blood pressure and the distribution of the veins. One must be very careful in interpreting the results of any fMRI study, and one must always keep in mind that the method tends toward Type II error.

The goal of this work is to build an accurate ENcoding model that accurately describes how the spatio-temporal features in natural movies are reflected in brain activity measured using fMRI, and which can accurately predict the activity elicited by new movies not used to build the model. The fact that such ENcoding models can be converted into DEcoding models and used for reconstructions is really just a byproduct of Bayes’ theorem. If we build an accurate encoding model, we get decoding essentially for free.”
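To make the encoding-to-decoding point concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the linear encoding model, the Gaussian noise assumption, the three-voxel weights and the three candidate "stimuli" are all hypothetical, chosen only to show how Bayes' theorem turns a model that predicts responses from stimuli into one that ranks stimuli given a response.

```python
import math

def encode(stimulus, weights):
    """Toy linear encoding model: predicted response of each voxel to a stimulus."""
    return [sum(w * s for w, s in zip(row, stimulus)) for row in weights]

def log_likelihood(response, predicted, noise_sd):
    """Log p(response | stimulus), assuming independent Gaussian noise per voxel."""
    return sum(
        -0.5 * ((r - p) / noise_sd) ** 2
        - math.log(noise_sd * math.sqrt(2 * math.pi))
        for r, p in zip(response, predicted)
    )

def decode(response, candidates, priors, weights, noise_sd=1.0):
    """Posterior over candidate stimuli: p(s | r) is proportional to p(r | s) * p(s)."""
    log_post = [
        log_likelihood(response, encode(c, weights), noise_sd) + math.log(p)
        for c, p in zip(candidates, priors)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_post)
    unnorm = [math.exp(lp - m) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical values for illustration only: 3 voxels, 2 stimulus features.
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
CANDIDATES = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
PRIORS = [1 / 3, 1 / 3, 1 / 3]

# A made-up noisy response, close to what the model predicts for CANDIDATES[1].
observed = [0.1, 0.9, 0.6]
posterior = decode(observed, CANDIDATES, PRIORS, WEIGHTS, noise_sd=0.5)
best = max(range(len(CANDIDATES)), key=lambda i: posterior[i])
print("most probable candidate:", CANDIDATES[best])
```

Note that `decode` contains no new modelling: it reuses `encode` to score each candidate and weights the scores by the prior, which is the sense in which decoding comes "essentially for free" once the encoding model is good.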

Nishimoto, S., Vu, A., Naselaris, T., Benjamini, Y., Yu, B., & Gallant, J. (2011). Reconstructing Visual Experiences from Brain Activity Evoked by Natural Movies. Current Biology. DOI: 10.1016/j.cub.2011.08.031