Neural responses to velocity gradients in macaque cortical area MT

Abstract Visual motion, i.e. the pattern of changes on the retinae caused by the motion of objects or the observer through the environment, contains important cues for the accurate perception of the three-dimensional layout of the visual scene. In this study, we investigate if neurons in the visual system, specifically in area MT of the macaque monkey, are able to differentiate between various velocity gradients. Our stimuli were random dot patterns designed to eliminate stimulus variables other than the orientation of a velocity gradient. We develop a stimulus space (“deformation space”) that allows us to easily parameterize our stimuli. We demonstrate that a substantial proportion of MT cells show tuned responses to our various velocity gradients, often exceeding the response evoked by an optimized flat velocity profile. This suggests that MT cells are able to represent complex aspects of the visual environment and that their properties make them well suited as building blocks for the complex receptive-field properties encountered in higher areas, such as area MST to which many cells in area MT project.


Introduction
Velocity gradients* are ubiquitous in our environment. They provide important cues about the three-dimensional (3-D) layout of the visual scene. In the large optical flow fields that result from the observer's motion through the environment, they contain information about observer heading (Regan, 1985; Warren & Hannon, 1988; Crowell & Banks, 1993). Smaller velocity gradients, caused by the self-motion of objects, determine the perceived 3-D structure of moving objects in the perception of structure from motion (Rogers & Graham, 1979; Braunstein & Andersen, 1984; Siegel & Andersen, 1986; Andersen, 1989; Husain et al., 1989; Landy et al., 1991; Treue et al., 1991; Harris et al., 1992). The importance of the information contained in velocity gradients is reflected in the high sensitivity of the human visual system to detecting the presence of such motion patterns (Nakayama, 1981; Golomb et al., 1985; Nakayama et al., 1985).
The analysis of optical flow fields has received much attention from psychophysical (Gibson, 1950; Regan & Beverly, 1981, 1985; Regan, 1986; Regan et al., 1986; Royden et al., 1994; Warren & Hannon, 1988) and computational vision research (Koenderink, 1986; Longuet-Higgins & Prazdny, 1980; Koenderink & Van Doorn, 1981; Rieger & Lawton, 1985), demonstrating the importance of extracting the local characteristics of the flow pattern when trying to recover the 3-D layout of the visual scene from the two-dimensional (2-D) projections on the retinae.
Reprint requests to: Stefan Treue, Cognitive Neuroscience Laboratory, Department of Neurology, University of Tübingen, Auf der Morgenstelle 15, 72076 Tübingen, Germany.
*We use the term velocity gradients to refer to visual motion patterns where feature velocity varies as a function of spatial location. It should be noted that since all of our stimuli (except the ones used to determine the cell's preferred direction) are moving in the preferred direction of the cell, we use the terms velocity and speed interchangeably.
More recently, electrophysiological studies have attempted to localize the cortical areas involved in this analysis (Tanaka et al., 1986; Duffy & Wurtz, 1991a,b; Orban et al., 1992; Graziano et al., 1994). Area MST in the macaque monkey has been shown to contain cells with large receptive fields that are sensitive to rotating, expanding, contracting, and spiraling patterns. While naturally occurring optical flow fields can be decomposed into these more elementary motion patterns, they are still rather complex in that not only the speed but also the direction of feature motion varies as a function of spatial position. Single objects in motion and stimuli used for structure from motion studies, on the other hand, are often simpler in that their flow patterns are unidirectional. Such stimuli might thus serve as a bridge between the studies using very impoverished stimuli (like bars, sine-wave gratings, or simple random dot patterns) to get a detailed understanding of the physiology of motion perception and the aforementioned studies using more complex displays that are closer to natural images yet are harder to characterize. The random dot patterns we developed for this study represent such intermediate stimuli in that they form the building blocks for complex optical flow patterns yet they can be readily parameterized.
Optical flow patterns often cover a large part of the visual field and represent many objects at different depths from the observer, while structure from motion stimuli are smaller in scale and numerosity. Despite these global differences, the two types of motion patterns are locally similar. The characteristic feature of both of them is the presence of local variations in direction and speed of motion, i.e. of velocity gradients. To perceive these motion patterns and to recover the 3-D shape of the underlying objects requires the accurate representation of the velocity gradients in the visual system. Here we introduce moving random dot patterns simulating the most elementary velocity gradient types (see Fig. 1 and Methods for details) to establish whether neurons in area MT of the awake behaving monkey are able to encode the orientation of such gradients in their receptive fields. Area MT was chosen for this study since its receptive fields are smaller than those in area MST and thus seem well suited to analyze the local components of larger flow fields. At the same time the receptive-field sizes in MT match well with the size of commonly occurring structure from motion (SFM) gradients. Finally, long-lasting deficits in the perception of SFM occur with MT lesions, suggesting that this area plays a role in the perception of 3-D structure from motion (Siegel & Andersen, 1986; Andersen & Siegel, 1990).
The visual environment contains an infinite number of different velocity gradients. Since this is the first physiological study using such stimuli, we limit ourselves to stimuli containing simple linear gradients.† These velocity gradients all belong to the same class and thus can be represented easily in what we call a "deformation space" (see below). The deformation space has great resemblance to the idea of a "spiral space" developed in our laboratory as a reference frame for characterizing the response properties of MST cells to rotating, expanding, and contracting stimuli (see Fig. 3; Graziano et al., 1994; for a similar approach using polar, hyperbolic, and Cartesian gratings to study V4 neurons, see Gallant et al., 1993).
Here we demonstrate that many MT neurons are tuned to velocity gradients, i.e. the response of an MT neuron to a given gradient often varies as a function of that gradient's position in the deformation space.

Methods
A detailed description of our recording methods has appeared elsewhere (Snowden et al., 1991) and this section will therefore be limited to a brief overview and a detailed description of the stimuli used.
Two male rhesus monkeys were trained to fixate a small fixation point, while ignoring any other stimuli, and to signal the dimming of the fixation point by releasing a lever. The animals' eye movements and point of fixation were monitored using a scleral search-coil technique (Robinson, 1963). Visual stimulation was provided to the receptive field of individual neurons (which were about 5-10 deg eccentric) during this 4-6 s period of fixation. Electrode penetrations were made through a chamber implanted over the parietal cortex. The electrode's position within the chamber, the depth of recording, the topography of receptive-field locations, as well as the properties of the cells encountered during the penetrations were used to determine whether encountered cells were indeed in area MT.
†Other possible stimuli are, for example, the half sinusoidal velocity profiles characteristic of rotating spherical or cylindrical shapes.
Fig. 1. Basic gradient types used in this study: Shearing (a, b) and compressing (c)/stretching (d) velocity gradients form the two basic gradient types used in this study. Each group has two mirror-symmetric members. Shearing patterns are characterized by a right angle between the direction of motion and the direction of the steepest ascent of the gradient while in compressing and stretching gradients both these vectors align. These four basic types fit into a coordinate system that we term deformation space. In this polar coordinate system, stimuli are positioned according to the angle that their direction of motion forms with the vector representing the steepest ascent of their velocity gradient. Note that the stimuli presented in this figure would be used for a cell preferring upward motion. (CW: clockwise; and CCW: counterclockwise)

Experimental protocol
Stimuli were presented on a large HP CRT screen or a video monitor at a viewing distance of 57 cm. They consisted of random patterns of bright, high-contrast dots upon a dark background. Each dot was approximately 1 mm in diameter, and thus subtended about 6 min arc. Dot density was 4 dots/cm².
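The dot size quoted above follows from the viewing geometry. The short sketch below checks that arithmetic; the function name and variable names are ours and purely illustrative.

```python
import math

def visual_angle_arcmin(size_cm: float, distance_cm: float) -> float:
    """Visual angle subtended by an object of the given size, in minutes of arc."""
    angle_rad = 2.0 * math.atan(size_cm / (2.0 * distance_cm))
    return math.degrees(angle_rad) * 60.0

# A 1 mm (0.1 cm) dot viewed from 57 cm, as stated in the text.
dot_arcmin = visual_angle_arcmin(0.1, 57.0)

# Screen extent that subtends 1 deg at 57 cm; it is close to 1 cm,
# so 4 dots/cm^2 corresponds to roughly 4 dots/deg^2.
cm_per_deg = 2.0 * 57.0 * math.tan(math.radians(0.5))

print(f"dot size: {dot_arcmin:.2f} arcmin, {cm_per_deg:.3f} cm per deg")
```

At this viewing distance 1 cm on the screen corresponds to very nearly 1 deg of visual angle, which is why a 1 mm dot subtends about 6 min arc.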
The rate of screen refresh was 50 Hz for the CRT screen and 60 Hz for the video monitor. Each trial commenced with the onset of the fixation point. After about 1 s, a stimulus appeared if the animal had pulled the key and had not broken fixation. This stimulus lasted for 1 s, and another stimulus appeared for 1 s after a 1-s delay. The fixation point dimmed 0.2-2.0 s after the disappearance of the second stimulus; thus a complete trial lasted 4.2-6 s. The response of a neuron was determined by the average firing rate during the stimulus presentation (excluding the first 100 ms after the stimulus onset).
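The response measure described above can be sketched as follows. This is only an illustration under our own assumptions: spike times are taken to be in seconds relative to stimulus onset, and the function name `mean_rate` is hypothetical rather than part of the original analysis software.

```python
def mean_rate(spike_times_s, stim_duration_s=1.0, skip_s=0.1):
    """Mean firing rate (spikes/s) during the stimulus, excluding the first skip_s seconds.

    spike_times_s: spike times in seconds relative to stimulus onset (assumed format).
    """
    window_s = stim_duration_s - skip_s
    n_spikes = sum(1 for t in spike_times_s if skip_s <= t < stim_duration_s)
    return n_spikes / window_s

# Made-up spike times: the spike at 0.05 s falls in the excluded first 100 ms,
# and the spike at 1.2 s falls after stimulus offset.
example_rate = mean_rate([0.05, 0.2, 0.5, 0.95, 1.2])
print(f"{example_rate:.2f} spikes/s")
```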

Stimuli
To determine the preferred direction of motion for a cell, we presented random dot patterns moving behind a circular virtual aperture. The preferred direction of the neuron determined in this way was used throughout the rest of the tests performed on this cell.
We next established the cell's preferred speed. To determine if a cell is gradient tuned, we presented it with various linear velocity gradients all moving in its preferred direction. The average speed of each pattern was equal to the preferred speed of the cell. The velocity of every point in the pattern varied linearly as a function of its position within the gradient. Such gradients can be divided into two groups.

Shear stimuli
In these stimuli, the velocity gradient is oriented perpendicular to the direction of motion. Thus, the velocity of a particular point will be constant while it is moving across the stimulus. We call patterns in which the velocity decreases from the left of the stimulus to the right (when facing in the direction of motion) clockwise (CW) shearing stimuli‡ (Fig. 1a). Correspondingly, in counterclockwise (CCW) shearing stimuli (Fig. 1b), the velocity increases from the left to the right.

Compressive/stretching stimuli
In these stimuli, velocity varies along the direction of motion. Thus, a point in a compressive gradient will decrease its velocity while crossing the display (Fig. 1c). Correspondingly, points in stretching gradients will accelerate while crossing the display (Fig. 1d).
We used two different gradient slopes in our experiments. The steeper velocity gradient would start at 0 deg/s on the slower end of the display and would reach twice the preferred speed of the cell under study at the opposite end. The shallower gradient would start at half the preferred speed and reach 1.5 times the preferred speed. Note that for both stimuli the average speed as well as the speed in the center of the stimulus is equal to the preferred speed of the cell.
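The two slopes can be written as a simple linear speed profile. The sketch below assumes a position normalized from 0 (slow edge) to 1 (fast edge); the function name and this normalization are ours, not part of the original stimulus code.

```python
def gradient_speed(position: float, preferred_speed: float, steep: bool) -> float:
    """Dot speed at a normalized position along the gradient (0 = slow edge, 1 = fast edge)."""
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie between 0 and 1")
    # Steep gradient: 0 to 2x preferred speed; shallow gradient: 0.5x to 1.5x.
    low, high = (0.0, 2.0) if steep else (0.5, 1.5)
    return preferred_speed * (low + (high - low) * position)

# Example for a hypothetical cell with a preferred speed of 10 deg/s.
for steep in (True, False):
    speeds = [gradient_speed(p, 10.0, steep) for p in (0.0, 0.5, 1.0)]
    print("steep" if steep else "shallow", speeds)
```

Because both profiles are linear and symmetric about the center, the speed at the center and the average speed both equal the preferred speed.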
Points that moved across the sides of the flat velocity profiles used for determining the preferred direction and speed of a cell were simply wrapped around to the opposite side of the stimulus. This simple technique is sufficient to ensure equal dot density across the stimulus for these simple patterns as well as our shear gradients.
The acceleration or deceleration of individual dots in our compression stimuli on the other hand would lead to changes in stimulus density across the display especially for those compressive gradients in which dot speeds decrease all of the way to zero at one end of the display. We therefore employed two techniques to eliminate this density cue in our displays.
Special dot wraparound. Expansive dot gradients that start with speeds of zero at one end and with even density distributions across them will remain evenly distributed. While the distribution will remain uniform, the dot density will be continuously falling because the distance between any two points will continuously increase. § We prevent this decrease in density in our displays by replotting any points that cross the stimulus boundaries back into the stimulus. Notice that this replotting has to be done randomly across the stimulus rather than using the wraparound method employed for the flat velocity profiles. Stimuli in which the lowest speed is larger than zero were generated by first generating a larger stimulus whose gradient starts at a speed of zero and then masking this stimulus to show only a smaller extent of the velocity gradient.
Compressive gradients are generated by first computing a stretching gradient moving in the opposite direction and then reversing the order of the individual frames making up the stimulus. Notice that in the resulting stimuli dots will disappear while approaching the zero speed stimulus edge. Thus there is no "piling-up" of dots at that edge.
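The frame-reversal idea can be illustrated with the track of a single dot. This is a one-dimensional sketch under our own assumptions (a speed proportional to the distance from the zero-speed edge, simple Euler steps, and hypothetical parameter values); it is not the original stimulus code.

```python
def stretching_track(d0, k, dt, n_frames):
    """Distance of one dot from the zero-speed edge in a stretching gradient.

    The dot's speed is k * distance, so it accelerates away from that edge.
    """
    track = [d0]
    for _ in range(n_frames - 1):
        track.append(track[-1] + k * track[-1] * dt)
    return track

# Forward frames: a stretching gradient (dot accelerates away from the edge).
forward = stretching_track(d0=0.5, k=2.0, dt=1.0 / 60.0, n_frames=30)

# Reversed frame order: the same dot now approaches the zero-speed edge
# and slows down as it does so, i.e. a compressive gradient.
compressive = forward[::-1]
steps = [b - a for a, b in zip(compressive, compressive[1:])]
```

In the reversed sequence every step is directed toward the zero-speed edge and each step is smaller than the one before it, which is the deceleration expected of a compressive gradient.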
Limited dot lifetimes. Replotting dots within our stretching stimuli and removing dots in the compressive stimuli generates transient events that could possibly influence the response of cells to these patterns. To ensure that this transiency does not influence our findings, we introduce it into all of our stimuli. This is achieved by using dots of limited lifetimes. The dots move along a continuous path for only a short period of time, their lifetime. After its lifetime, a dot is randomly replotted within the stimulus. We used a lifetime of 300 ms for all of our random dot patterns, which was long enough to not substantially affect the percept of motion while at the same time providing a significant amount of transiency, masking the transiency generated by the appearing and disappearing dots in the stretching and contractive gradient stimuli, respectively.
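A minimal sketch of a limited-lifetime update rule for a stretching pattern is given below. Only the 300 ms lifetime and the random replotting come from the text; the square 10 x 10 stimulus, the 400 dots, the 8 deg/s preferred speed, the 60 Hz frame step, the staggered initial ages, and all names are assumptions made for illustration.

```python
import random

LIFETIME_S = 0.3  # dot lifetime from the text (300 ms)

def step_dots(dots, speed_at, dt, size, rng):
    """Advance all dots by one frame; each dot is [x, y, age] and moves along +y.

    Dots whose lifetime has expired, or that have left the stimulus,
    are replotted at a random position with age zero.
    """
    for dot in dots:
        dot[1] += speed_at(dot[0], dot[1]) * dt
        dot[2] += dt
        expired = dot[2] >= LIFETIME_S
        outside = not (0.0 <= dot[1] <= size)
        if expired or outside:
            dot[0] = rng.uniform(0.0, size)
            dot[1] = rng.uniform(0.0, size)
            dot[2] = 0.0
    return dots

rng = random.Random(0)
size = 10.0  # hypothetical stimulus width and height
# Initial ages are staggered (our assumption) so that dots do not all expire together.
dots = [[rng.uniform(0.0, size), rng.uniform(0.0, size), rng.uniform(0.0, LIFETIME_S)]
        for _ in range(400)]

def stretch(x, y):
    """Steep stretching gradient: 0 to twice a hypothetical preferred speed of 8 deg/s."""
    return 2.0 * 8.0 * y / size

for _ in range(60):  # one second of frames at an assumed 60 Hz
    step_dots(dots, stretch, 1.0 / 60.0, size, rng)
```

Replotting at random positions, rather than wrapping dots around to the opposite edge, keeps the dot count constant without reintroducing a density cue.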
Our stretching, compressive, and shearing dot patterns are members of a continuous family of stimuli which only vary along one angular dimension.** This dimension is the angle between the direction of the vector describing the velocity gradient (the "gradient vector") and the direction of dot motion in the pattern. If the gradient vector points in the direction of stimulus motion, the stimulus is stretching. If the gradient vector points in the opposite direction, the stimulus is being compressed, while gradient vectors orthogonal to the direction of pattern motion occur in shearing stimuli. These four stimulus types form the cardinal axes of a coordinate system that we term "deformation space" (Fig. 1). Stimuli that fall between these cardinal directions combine elements of stretching or compression with shear components. Note that all of the gradients contain the same velocity vectors, only spatially rearranged.
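The family of stimuli can be summarized as one speed field with a single angular parameter. The sketch below is ours: it assumes a circular patch of radius `radius` centered on the origin, and a sign convention in which positive angles rotate the gradient vector counterclockwise from the direction of motion. The convention actually used in Fig. 1 may differ.

```python
import math

def dot_speed(x, y, theta_deg, preferred_speed, radius, steep=True, motion_deg=90.0):
    """Speed of a dot at (x, y), measured from the stimulus center.

    theta_deg: angle from the direction of motion to the gradient vector
               (0 = stretching, 180 = compressive, +/-90 = shear).
    motion_deg: direction of dot motion (90 = upward, as in Fig. 1).
    """
    g_angle = math.radians(motion_deg + theta_deg)
    gx, gy = math.cos(g_angle), math.sin(g_angle)
    # Signed distance along the gradient vector, normalized to [-1, 1] within the patch.
    s = (x * gx + y * gy) / radius
    depth = 1.0 if steep else 0.5  # steep: 0 to 2x; shallow: 0.5x to 1.5x
    return preferred_speed * (1.0 + depth * s)

# Eight gradient orientations spaced evenly in the deformation space.
orientations = [i * 45.0 for i in range(8)]
top_edge_speeds = [dot_speed(0.0, 5.0, th, 10.0, 5.0) for th in orientations]
```

With upward motion and this convention, an angle of 0 deg gives the fastest dots at the top of the patch (stretching), 180 deg gives the slowest dots there (compressive), and +90 deg gives the fastest dots on the left, which corresponds to CW shear as defined in the text.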
This deformation space gives us a convenient means of plotting the response of neurons to our gradient stimuli in the same way as responses of direction-selective units can be plotted in a direction of motion space or in the same way as area MST neurons' responses to expanding, contracting, and rotating patterns can be plotted in a spiral space (see also Graziano et al., 1994).
All our stimuli were always centered on the receptive field and their size was chosen so that they would not extend beyond the boundaries of the classical receptive field, i.e. the parts of the receptive field from which we were able to elicit a response from the cell with single handheld stimuli.
‡We chose this nomenclature since in a counterclockwise rotating dot field velocity also increases from left to right when facing in the direction of motion. Note though that all of the dots in our shearing stimuli move along a straight path.
§This phenomenon is well known in astronomy where the finding that any two stars are moving away from each other lends support to the idea of an expanding universe.
**This is true only if all of the gradients have the same steepness of slope.

Results
The aim of our study was to establish if area MT neurons are able to signal the presence and the orientation of velocity gradients in their receptive fields. In several aspects this question is similar to determining if a given cell is able to signal the direction of motion. Generally, a cell is considered direction selective if it displays a tuning curve when presented with a range of different directions. Additionally, one could require that the cell responds more strongly to at least one direction of motion than to a stationary pattern. We use both of the corresponding requirements to determine if the MT cells we recorded from are able to signal the orientation of velocity gradients. Fig. 2 shows the full set of tests we performed on every neuron considered for this study. After mapping the receptive field, we determined the cell's preferred direction (Fig. 2A) and then its preferred speed. This established the most effective flat velocity profile. To determine if the cell showed gradient tuning, we ran a block of trials presenting velocity gradients of eight different orientations spaced evenly in the deformation space. Interleaved with these trials were presentations of the best flat velocity profile (as determined in the two previous blocks of trials). Fig. 2C plots the result. The dashed circle is the level of response elicited by the presentation of the flat velocity profile. The solid line connects the responses to the eight velocity gradients. The arrow is an estimate of the preferred velocity gradient of this cell. This cell shows a clear tuning to the velocity gradients, with the response to several (neighboring) gradient orientations significantly (P < 0.0001, paired t-test) stronger than the response to the flat profile.
We were able to record long enough from 25 cells (22 from one and three from the other animal) to conduct all of these tests. We classified the type of responses to the velocity gradients according to three criteria. In those cases where the responses to the two gradient slopes used were not similar, we used the one that fell more clearly into one of the classes described below.
1. Did the cell show a tuned (i.e. single-peaked) response to the velocity gradients? (tuned cells)
2. If the cell was tuned, did the response to the most effective velocity gradient clearly exceed the response to the flat velocity profile? (excitatory tuned cells)
3. If the cell was not tuned, was the response to all velocity gradients significantly less than to the flat profile? (inhibited cells)
This division of cells into classes is intended to show the existence of certain properties and not to imply that as a population MT cells fall into separable groups. Fig. 3 shows two representative examples of each response type as well as the relative frequency of their occurrence. The response of three cells (12%) did not fit into this scheme.
About a quarter (six of 25) of the cells fell into the tuned category, while another quarter (seven of 25) were categorized as inhibited cells. More than a third (nine of 25, 36%) of the cells that could be characterized were excitatory tuned cells, our most stringent class. Thus, 60% of the cells we recorded from showed tuning to velocity gradients. The existence of excitatory tuned cells is especially important in the context of this study since it rules out an obvious possible artifact. If our stimuli are not perfectly centered on the receptive field and if the velocity tuning curve is not symmetrical around its preferred velocity, the cell would show a tuning curve when presented with our various gradient stimuli. But note that this tuning curve would never exceed the firing rate evoked by the preferred flat velocity profile. By being able to demonstrate that some cells show tuning curves whose peaks exceed the response to the preferred flat velocity profile, we have ruled out this possible artifact as a general explanation for our findings.
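The proportions quoted above can be checked from the cell counts given in the text. The dictionary keys below are simply our labels for the classes.

```python
# Cell counts per response class, as reported in the text.
counts = {"tuned": 6, "excitatory tuned": 9, "inhibited": 7, "unclassified": 3}

total = sum(counts.values())
percent = {name: 100.0 * n / total for name, n in counts.items()}

# Cells showing gradient tuning: tuned plus excitatory tuned.
tuned_percent = percent["tuned"] + percent["excitatory tuned"]

for name, value in percent.items():
    print(f"{name}: {value:.0f}%")
print(f"all tuned cells: {tuned_percent:.0f}% of {total}")
```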

Discussion
In the experiments presented here, we have investigated the response of area MT neurons in the awake behaving monkey to elementary velocity gradients. We have demonstrated that a substantial proportion (60%) of our sample of MT neurons respond to such gradients in a systematically tuned manner. This enables area MT to encode the shape and orientation of velocity gradients in the visual world. Furthermore, this property makes MT cells well suited to provide the building blocks for the more complex receptive-field properties encountered in area MST.
The single-lobed tuning that we found in our study suggests that the variable used in these experiments, i.e. the orientation of velocity gradients, is indeed encoded in the firing rate of MT neurons. At the same time, MT neurons are tuned to other stimulus parameters like direction of motion and stereoscopic disparity and are thus likely to be involved in the perception of those parameters too. This behavior allows MT cells to contribute to the analysis of a variety of motion stimuli but it also restricts the modulation that can be achieved by varying a single dimension. The modulation we observed, i.e. the difference in a response of a given cell for the best and worst velocity gradient, is small compared to the modulation that can be achieved with stimuli which vary in their direction of motion.
There are several possible reasons for this shallow gradient tuning:
1. The simple gradients we used did not adequately match the preferred gradients of the neurons. This would be analogous to taking a cell's direction tuning curve at an inadequate speed.
2. The response of MT neurons is dominated by the direction of motion of a pattern rather than by the shape of its velocity profile.
3. The neurons in our study were responding close to saturation, given that our stimuli were modulations around the most effective flat velocity profile.
4. The neurons encode surfaces in depth and our stimuli, lacking stereoscopic depth, were thus less than optimal.
The first argument is not likely to be correct since we could drive the cells we encountered very well, even exceeding the firing rate to the best classical stimulus. But given that MT neurons are selective for stimuli in two one-dimensional stimulus spaces, namely the direction of motion space and our deformation space, it might be more appropriate to describe the response of an MT cell in a multidimensional space that not only accounts for selectivity for direction of motion and velocity gradients but also stereoscopic disparity. The role of MT in 3-D shape perception, suggested by our results, fits well with the finding that individual MT neurons have both direction and stereoscopic tuning with complex interactions designed to disambiguate motion stimuli (Bradley et al., 1995). While we did not investigate stereoscopic tuning in this study, it is possible that our velocity gradient tuned cells show a depth tuning that enhances the response modulation caused by the various motion gradients. Such a combination of depth-from-motion and depth-from-binocular-disparity signals might underlie the perceptual similarity of these two cues to 3-D shape (Rogers & Graham, 1982).
Many neurons in area MT show an opponent center-surround organization (Allman et al., 1985) that might influence a cell's response to a particular gradient that straddles the border between the center and the surround of the receptive field. This possibility has been used in a model by Dobbins et al. (1990) to obtain estimates of the optic flow field and its spatial and temporal variation. In our recordings, we avoided such stimulus placements by limiting our stimuli to the classical receptive fields (the center) of all neurons we recorded from. Under natural viewing conditions, however, stimuli will often overlap the border between the center and surround of the receptive field. Depending on the respective preferred speeds and direction in the center and the surround, this could lead to a much stronger modulation of the response of the cell to various velocity gradients. In fact, nonsymmetric surrounds such as the ones described by Xiao et al. (1994) seem particularly well suited to enhance the gradient selectivity of the classical receptive field.
Our study does not address the mechanisms responsible for the gradient tuning we observed. Such tuning could be achieved by cells whose gradient selectivity is the result of a mosaic of inputs from cells with small receptive fields of systematically varying preferred velocity. Such cells would not show the property of position invariance, which has been observed in MST receptive fields (Lagae et al., 1994; Graziano et al., 1994) and requires more complicated neural wiring. The existence of such a mosaic in area MT could be tested by a study that uses small moving patches to determine the preferred speed at different locations within the receptive field of a given MT cell.