
Receptive Fields: Simple Cells and Tuning Curves

Receptive Fields

The jumping spider uses pattern recognition to distinguish prey from mate. Its eyes allow it to detect specific features or templates, in particular the bar-shaped feature resembling the legs of other spiders (Land, 1969). Frisby and Stone (2010) discuss the jumping spider because it provides an excellent paradigm for how template matching works through a series of steps.

–       First, the original image is projected on to the retina of the spider’s eye

–       Second, the spider’s eye focuses on a specific feature, in this case the leg

–       Then the leg’s image is cast onto the retina, and receptors project this information to the primary visual cortex (V1)

–       Neurons in V1 receive inhibitory and excitatory input from the receptors in the retina

–       The receptors not blocked by the leg have increased activity and send an excitatory light signal

–       The receptors blocked by the leg have decreased activity and send an inhibitory light signal

–       The neuron gathers the total input and if it exceeds its threshold, firing indicates that a bar is present
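The steps above can be sketched as a simple threshold unit. This is only an illustrative toy, not a model taken from Land (1969) or Frisby and Stone (2010): the seven-receptor retina, the +1/-1 signal coding, the template weights and the threshold of 6 are all assumptions chosen for the example.

```python
def receptor_signals(lit):
    """Turn lit (1) / blocked (0) receptors into +1 (excitatory) / -1 (inhibitory) signals."""
    return [1 if is_lit else -1 for is_lit in lit]


def template_neuron_fires(lit, template, threshold):
    """Sum the weighted receptor signals and fire only if the total exceeds the threshold."""
    signals = receptor_signals(lit)
    total = sum(weight * signal for weight, signal in zip(template, signals))
    return total > threshold


# Hypothetical template: expects a dark bar (the leg) over the three middle receptors.
LEG_TEMPLATE = [1, 1, -1, -1, -1, 1, 1]
THRESHOLD = 6  # assumed value; a perfect match gives a total of 7

centred_leg = [1, 1, 0, 0, 0, 1, 1]  # leg blocks the middle receptors
shifted_leg = [1, 0, 0, 0, 1, 1, 1]  # same leg, shifted by one receptor

print(template_neuron_fires(centred_leg, LEG_TEMPLATE, THRESHOLD))
print(template_neuron_fires(shifted_leg, LEG_TEMPLATE, THRESHOLD))
```

In this toy the neuron fires only when the input matches the template almost exactly, which is the strength and the weakness of template matching.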

Template matching, whether in a spider or human, relies on encoding of a pattern whose shape directly matches the input pattern to be detected. Striate cortex cells in V1 are found in all mammals. These cells receive input from the retinal fibres, and each cell is responsible for a limited patch of the retina (its receptive field). Accordingly, cell types in the striate cortex are classified according to their receptive fields (Hubel and Wiesel, 1981). Striate cells are divided into two types: simple cells and complex cells.

Templates can, however, be impractical because the number needed grows exponentially with the number of features, resulting in a binding problem (Frisby and Stone, 2010). For example, if we have templates for 18 orientations, each at 18 sizes, and each of those at 18 shades, you can imagine the ridiculous number of templates you would need. This problem is known as a combinatorial explosion.
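To put a number on the example above, here is a minimal sketch. It assumes every combination of feature values needs its own template. The function name is made up for illustration.

```python
def templates_needed(values_per_feature, n_features):
    """Number of templates if every combination of feature values gets its own template."""
    return values_per_feature ** n_features


# 18 orientations x 18 sizes x 18 shades
print(templates_needed(18, 3))  # 5832
# Adding a fourth feature with 18 values multiplies the count by 18 again
print(templates_needed(18, 4))  # 104976
```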

Simple Cells

Simple cells are named simple because they can be simply mapped into excitatory and inhibitory sub-regions. Simple cells are optimally excited by bar shapes, which is why they are also called slit- and line-detectors. Slit-detectors respond to a light bar on a dark surround, and line-detectors respond to the opposite (Hubel, 1988). Light on dark or vice versa is important because simple cells respond best to patterns that generate luminance differences, in other words, edges. However, because simple cells are so sensitive to edges, the orientation of a bar is important. The optimal stimulus for a simple cell, to emphasize luminance differences, is one that provides maximum excitation and minimum inhibition (Frisby and Stone, 2010). To provide maximum excitation and minimum inhibition, different orientations are dealt with by different cells (Hubel and Wiesel, 1962). The angle to which each cell is tuned is determined by the pattern of its excitatory and inhibitory regions. ‘Slit’ and ‘edge’ simple cells exist for a full range of orientations, which is reflected by the brain’s wiring; the fibres going from the retina to cortex differ depending on which orientation they represent (Frisby and Stone, 2010).

Population Code

As discussed in the previous section, bars maximally excite simple cells; however, cells still respond even when they are not maximally excited. If only part of a cell’s receptive field is activated, a partial response is produced. As such, context is vital to making sense of our visual field. For example, a strong non-vertical stimulus can excite a vertically oriented receptive field just as well as a vertical but faint edge. In order to distinguish between the two outputs, they must be considered in the context of the activities of other cells examining the same retinal patch.

Fortunately, having sensitive simple cells makes interpolation between neighboring orientation measurements possible. This “talk” or interpolation between cells is known as a “population code.” Even though there are only simple cells for 18-20 different preferred orientations, we manage discriminations of less than 0.26 degrees (Frisby and Stone, 2010). Communication between cells allows us to discriminate when orientations vary very slightly, in the grey area between defined orientations. A population of cells that share the same preferred value of a particular stimulus parameter, like orientation, is called a channel. Scientists are able to measure the preferred orientation of cells by recording the symmetric pattern of firing rates.

Unfortunately, a major consequence of having a limited number of cells tuned to a large number of orientations is that the cells taking each measurement need to be “broadly tuned” for “coarse coding.” Cognitive psychologists wanted to know how many channels are necessary to resolve the ambiguity problem. The answer is technically two, but then the cells would be so broadly tuned that you would not be establishing any type of context. In addition, the tuning curve would change far too slowly to interpret anything. Unless the input to the cell coincided with the flank of the curve, there would be very little difference between the cells’ outputs. Hence, the brain uses a large number of broadly tuned cells, arranged so that the most sensitive part of some cell’s tuning curve always covers a given orientation, with the less sensitive parts covering slight deviations from that cell’s optimal orientation (Frisby and Stone, 2010).
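One way to picture how a population code can interpolate between coarse channels is the sketch below. It is an assumption-laden toy rather than the scheme described by Frisby and Stone (2010): it uses 18 channels spaced 10 degrees apart, Gaussian tuning with an assumed width of 20 degrees, and a population-vector read-out on doubled angles (because orientation repeats every 180 degrees). All names and values are illustrative.

```python
import math

PREFERRED = [10.0 * k for k in range(18)]  # assumed preferred orientations: 0, 10, ..., 170 degrees
SIGMA = 20.0                               # assumed tuning width in degrees (broad tuning)


def wrapped_difference(a, b):
    """Smallest signed difference between two orientations, in the range [-90, 90)."""
    return (a - b + 90.0) % 180.0 - 90.0


def channel_responses(stimulus_deg, contrast=1.0):
    """Response of every channel to one edge; contrast scales all responses equally."""
    return [
        contrast * math.exp(-wrapped_difference(stimulus_deg, p) ** 2 / (2 * SIGMA ** 2))
        for p in PREFERRED
    ]


def decode_orientation(responses):
    """Population-vector read-out: each channel votes for its preferred orientation."""
    x = sum(r * math.cos(math.radians(2 * p)) for r, p in zip(responses, PREFERRED))
    y = sum(r * math.sin(math.radians(2 * p)) for r, p in zip(responses, PREFERRED))
    return (math.degrees(math.atan2(y, x)) / 2.0) % 180.0


strong = channel_responses(42.3, contrast=1.0)
faint = channel_responses(42.3, contrast=0.2)
print(decode_orientation(strong))
print(decode_orientation(faint))
```

A single channel cannot tell a faint edge at its preferred orientation from a strong edge slightly off it, but the pattern across channels can: in this sketch the faint and strong edges decode to the same orientation, and that orientation falls between the 10-degree channel spacing.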

Tuning Curve

The overall relationship between the orientation of an input edge and the output of the cell is called the tuning curve (Frisby and Stone, 2010). Tuning curves are important because they allow you to pinpoint which cells are sensitive to which orientation. The flank mentioned in the section above is where the slope of the tuning curve is greatest; it also represents the point of greatest change in firing rate. This peak in sensitivity is found roughly halfway down from the top of the curve. The top of the curve is the least sensitive part because the slope there is equal to zero. Regan and Beverley (1985) showed that humans do have peaks and troughs in their orientation sensitivity.
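The point about the flank can be checked numerically with a small sketch. The Gaussian shape, the 20-degree width and the peak rate of 50 spikes per second are assumptions made for illustration, not measured values.

```python
import math

SIGMA = 20.0      # assumed tuning width in degrees
PEAK_RATE = 50.0  # assumed peak firing rate in spikes per second


def firing_rate(offset_deg):
    """Assumed Gaussian tuning curve; offset is the edge orientation minus the preferred one."""
    return PEAK_RATE * math.exp(-offset_deg ** 2 / (2 * SIGMA ** 2))


def slope(offset_deg, step=1e-3):
    """Numerical slope of the tuning curve (central difference)."""
    return (firing_rate(offset_deg + step) - firing_rate(offset_deg - step)) / (2 * step)


# Search offsets from 0 to 90 degrees in 0.1-degree steps for the steepest point.
offsets = [0.1 * i for i in range(901)]
steepest = max(offsets, key=lambda o: abs(slope(o)))

print(slope(0.0))                         # slope at the peak of the curve
print(steepest)                           # offset where the curve changes fastest
print(firing_rate(steepest) / PEAK_RATE)  # height of the curve at that offset
```

For this assumed Gaussian the slope is zero at the peak and steepest one width (20 degrees) away from it, where the curve has dropped to about 61% of its maximum. So "halfway down" is only approximate for this particular shape; other curve shapes will put the flank at a different height.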

Seeing Maps


All maps, whether they are used in the brain or not, represent a mathematical function (v = f(u)) transforming one point in space (the domain, u) to another (the codomain, v). For a map to be accurate, it must be continuous, without any breaks. That is, every pair of nearby points in the domain must correspond to two nearby points in the codomain. For example, take the cities Sheffield and Leeds. They represent two geographical points. For a map to be accurate, Sheffield and Leeds on a map must correspond proportionally in terms of distance, direction, etc. with Sheffield and Leeds in real life. The map of Yorkshire would be the codomain and the real cities would be the domain. In the visual system, these points could be two retinal points (domain) accurately reflected onto the striate cortex (codomain). In addition, a map of the brain must specify direction. Direction here is not used in the common sense; it refers to continuity when going from domain to codomain. For example, mapping from the whole retina to the striate cortex is actually discontinuous; however, mapping from each half of the retina to the striate cortex is continuous, giving rise to the retinotopic map. If you map against the specified direction, it is called inverse mapping. An example would be mapping from the cortex to the retina, which is discontinuous. Most nearby points in the striate cortex correspond to nearby points in the retina; however, if two striate points are located on different ocularity stripes, inverse mapping is discontinuous (Blasdel, 1992).

Scientists argue that the striate cortex maintains the retinotopic map because doing so economises the length of nerve fibres. Nerve fibres are necessary for inter-hypercolumn communication, and if neighbouring points were spread out, wiring would become chaotic. Of course, mapping does not occur only in the visual system. The brain has a myriad of maps, including ones for hearing, touch and motor output, and many copies of these maps exist. All of these maps, like the retinotopic map, must be continuous, suggesting spatial organisation is key to a healthy, functioning brain. Unfortunately, because there are so many features of the visual system that need to be represented in the map, singularities arise.

Singularities are jumps in continuity (Frisby and Stone, 2010), and they are the result of the packing problem. To put it simply, the brain wants to pack all features of the visual system into the brain; to maximise efficiency, the brain wants all similar variations of a feature in one place. In other words, the brain wants continuity. However, as was discussed with the binding problem, there are far too many variations and features of our visual system to account for them all in every possible detail. Hence, some continuity must be compromised (Hubel, 1981). The brain’s map of the visual system is continuous with respect to retinal position, but the map is discontinuous with respect to orientation. However, the brain still attempts to keep similar preferred orientations close together to maximise efficiency to the limited extent it can (Frisby and Stone, 2010).

Another way of thinking of singularities and the packing problem is in terms of parameters. Two coordinates, as discussed above, define each retinal position; these are known as position parameters (x, y). Correspondingly, each retinal point must correspond to a point on the cortex (x’, y’). Even though our cortex exists in three dimensions, there is still a finite amount of space to store information, and the parameters limit us to representing information in 2D. As orientation brings its own parameter (theta), the brain has to represent x, y and theta in 2D. This cannot be done without introducing discontinuities in at least one of the parameters because we are limited to two parameters. As the cortex must maintain a smooth map of the retina, discontinuities must be introduced into the representation of orientation. Based on this information the packing problem can be redefined in terms of parameters. The packing problem arises from attempting to pack all three dimensions of a 3D parameter space into a 2D one. As a solution to the packing problem, the cortex gives the parameters different priorities. High priority is given to retinal position and low priority to orientation, hence singularities.


Figure: Representation of point singularities in the visual cortex. Each colour represents a different radial phase corresponding to an orientation column. (Own work by Rtang3, 2 December 2011.)

As singularities exist for orientation, the topological index was introduced to describe them. Specifically, the index tells us how orientation varies as we move around the centre of a singularity (Frisby and Stone, 2010). This can be done by drawing a circle around a singularity and moving clockwise. If the represented orientation rotates clockwise, the singularity is positive; however, if the orientation rotates anticlockwise, the singularity is negative. Orientation around singularities in the striate cortex rotates by no more than 180 degrees, so the index is always ±½. Hypothetically, a pinwheel is a full rotation, with an index of +1. To date, no pinwheels have been found in our visual cortex. Tal and Schwartz (1997) found that for any neighbouring singularities, you could usually draw a smooth curve between them. The remaining cells form columns along the curve with the same orientation preference (iso-orientation). In addition, Tal and Schwartz confirmed that nearby singularities have topological indices of the same size but opposite sign.
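The index can be computed by walking once around a circle and adding up how much the preferred orientation turns. The sketch below is a toy under stated assumptions: the two orientation maps are invented formulas, not cortical data, and the walk is anticlockwise with anticlockwise rotation counted as positive. That gives the same sign as walking clockwise and counting clockwise rotation as positive.

```python
import math


def wrap180(delta):
    """Wrap an orientation difference into [-90, 90), since orientation repeats every 180 degrees."""
    return (delta + 90.0) % 180.0 - 90.0


def topological_index(orientation_at, centre=(0.0, 0.0), radius=1.0, n_samples=360):
    """Total rotation of preferred orientation around a circle, as a fraction of 360 degrees."""
    cx, cy = centre
    thetas = []
    for k in range(n_samples):
        phi = 2 * math.pi * k / n_samples
        thetas.append(orientation_at(cx + radius * math.cos(phi), cy + radius * math.sin(phi)))
    total = 0.0
    for k in range(n_samples):
        total += wrap180(thetas[(k + 1) % n_samples] - thetas[k])
    return total / 360.0


# Hypothetical orientation maps (degrees, 0 to 180) around a singularity at the origin.
def positive_singularity(x, y):
    return (0.5 * math.degrees(math.atan2(y, x))) % 180.0


def negative_singularity(x, y):
    return (-0.5 * math.degrees(math.atan2(y, x))) % 180.0


def uniform_map(x, y):
    return 45.0  # same preferred orientation everywhere, so no singularity


print(topological_index(positive_singularity))
print(topological_index(negative_singularity))
print(topological_index(uniform_map))
```

In the two singular maps the orientation turns through 180 degrees in one circuit, giving indices of about +½ and -½. The uniform map gives 0.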

In addition to orientation, ocularity is another parameter that has to be represented in the striate cortex. Ocularity refers to the extent to which cells respond to each of our eyes. The brain has ocularity stripes, meaning columns in the striate cortex alternate between monocular and binocular cells. The stripes suggest that the brain wants to ensure that pairs of L and R stripes process every part of the visual field. Unfortunately, adding another feature parameter only worsens the packing problem. Researchers have found that the brain maximises economy by ensuring that each iso-orientation domain in the orientation map tends to cover a pair of L-R ocularity columns; in other words, each orientation is represented for each eye (Hubel and Wiesel, 1971). Furthermore, the brain needs to perceive lines of different widths. The brain has solved this problem by having cells tuned to the same orientation but sensitive to different widths, or spatial frequencies. As with orientation, representation of spatial frequency is continuous except for some singularities. Lastly, directionality is packed together with orientation. Except for sudden changes in direction (180 degrees), the direction map is continuous. It overlays the orientation map; however, it does not affect the continuity of orientation, as each orientation defines two possible directions (Frisby and Stone, 2010).

Fortunately, colour does not add to the packing problem! This is because colour is represented exclusively at the centre of orientation singularities. At the centre of singularities, cells have no preferred orientation, so colour does not add any parameter. To fully understand how orientation, ocularity and colour come together, the polymap was constructed to show an overlay of all the parameters. Based on observations from a polymap, in addition to the specific wiring of the brain, some scientists argue that the cortex is not really trying to solve the packing problem. All the maps try to do is minimise the amount of wiring the brain needs to employ (Frisby and Stone, 2010). In fact, Swindale et al. (2000) found that the cortex does attempt to maximise coverage, and if any small changes were made to the current mapping system, wiring efficiency would be reduced.


Seeing Objects

Binocular disparity is the subtle difference between the left and right images. The left eye sees more of the scene to the left of the centre than the right eye does, and vice versa. Neighbouring layers responding to the left and right eye can inhibit one another when necessary. The optic nerves from the two eyes join at the optic chiasm, where some of the fibres decussate. Optic nerves contain axons that emanate from retinal ganglion cells in the eye. Regardless of whether the fibres decussate, all the fibres pass through the lateral geniculate nucleus. From the lateral geniculate nucleus, fibres feed into the striate cortex. Importantly, the striate cortex preserves the neighbourhood relations between the retinal ganglion cells. In other words, the striate cortex has a retinotopic map. The map is stretched and magnified around the fovea, which is consistent with the quality of foveal vision. This quality also reflects the number of cells dedicated to this part of the retina.


Types of Retinal Ganglion Cells

First, the midget retinal ganglion cells are the most common, making up about 80%. These cells respond to static form and project to the parvocellular layers of the lateral geniculate nucleus, specifically bilayers 3 through 6. Second, the parasol retinal ganglion cells make up 8-10% of all retinal ganglion cells. These cells have on/off receptive fields and receive their light input from rods. As expected, they respond to increases and decreases in light conditions. In addition, they also respond to motion. Output from the parasol retinal ganglion cells projects to the magnocellular layers of the lateral geniculate nucleus, specifically to bilayers 1 and 2. Thirdly, the bistratified retinal ganglion cells make up less than 10% of all retinal ganglion cells. They respond to short (blue) wavelengths by increasing their firing rate and to middle (yellow) wavelengths by decreasing their firing rate. Output then projects to the konio sublayers 3 and 4. Lastly, the biplexiform retinal ganglion cells are equally rare, making up less than 10%. The exact function of these cells is unknown, but it is known that they connect directly to rods and have an on-centre receptive field. It is believed that they provide information about ambient light.


Features of the Lateral Geniculate Nucleus

The nucleus consists of six major layers; each layer has a major layer plus a konio cell sub-layer. Each layer is responsible for carrying information from one eye. All of the layers of one lateral geniculate nucleus receive input from half the visual space. Even though layers 2, 3 and 5 correspond with the left eye, they only receive half of the information of the retina in that eye; the other half corresponds to the right visual field. As with the striate cortex, cells in each layer are organised retinotopically. In addition, each layer encodes a different aspect of the retinal image. Each lateral geniculate nucleus contains twelve copies of half the visual field (2/bilayer). However, it is important to note that only 10% of inputs come from the retina. 30% of input comes from other sources, including the striate cortex and the midbrain. Thorpe (1996) proposed that the brain uses feedforward connections from retina to lateral geniculate nucleus to striate cortex to perform a “quick and dirty” analysis. Feedback then makes connections to the retinal image. Thorpe’s hypothesis is supported by his demonstration that people can make quick visual interpretations from briefly flashed images.

The Striate Cortex (V1)

The striate cortex is responsible for early feature detection representations including colour. Stimulation of the striate cortex produces hallucinations of swirling colour (Frisby and Stone, 2010). In addition, all cells in the striate cortex have orientation-tuned columns except for layer 4B. The LGN and retinal ganglion cells are not orientation-tuned like the striate cortex. Like the lateral geniculate nucleus, however, it is organised in layers: horizontal, vertical and retinotopy. The top layer of V1 contains pyramidal cells and their dendrites. On the other hand, the bottom layer of V1 contains pyramidal cells as they exit the cortical layer. Neurons in these layers are arranged into vertical columns, with each column dedicated to one retinal patch and a specific characteristic. Retinal progress decreases the further you move from the edge of V1.

In 1978 a study was carried out by Hubel et al. to illustrate the existence of orientation-tuned cells in the striate cortex. Anesthetized macaque monkeys had their eyes exposed to a pattern of vertical stripes, continuously for 45 minutes. The stripes were of irregular width, filled the entire visual field, and moved about to activate the entire striate cortex. A chemical was then injected to be taken in by any active cells. Immediately afterwards, an autopsy was performed, which showed increased chemical uptake in the vertical columns.

Now these orientation-tuned columns are called hypercolumns. Each hypercolumn contains a mass of different types of cells that together process the same retinal patch. The patch of the retinal image that each hypercolumn deals with is called the hyperfield. Hyperfields must overlap to some degree, which allows for edge features to be detected (Frisby and Stone, 2010). As well as overlap, inter-hypercolumn communication links edge features together to create one unified image. This communication is possible due to the horizontal fibres that run along the vertical columns. The area of cortex dedicated to a particular hyperfield remains quite constant; however, processing decreases the further you move from the centre, which is why feature detection becomes cruder the further you move into the periphery. In addition to orientation, these cells can also be tuned to colour, scale and ocularity (Frisby and Stone, 2010). The ice-cube model (Hubel and Wiesel, 1962) argues that each hypercolumn functions as an image-processing mechanism.

Convolution images are used to give us an idea about the activity profile of the striate cortex, representing the output of simple cells (Frisby and Stone, 2010). Each point on a convolution image represents the response of a single simple cell, which is centered over the corresponding point of the retinal image. White represents a large positive output, grey represents no output, and black represents a large negative output. Based on this scale, outputs are coded in terms of pixel grey level. Inside a hypercolumn is a pattern of activity corresponding to one area (point) of the convolution image, an area representing the hyperfield (ibid).
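A convolution image of this kind can be sketched in a few lines. Everything specific here is an assumption for illustration: the 3x3 receptive field with an excitatory centre column and inhibitory flanks, the tiny test image, and the 0 to 255 grey scale with 128 standing for zero output.

```python
def simple_cell_outputs(image, field):
    """Slide one receptive field over the image; each output is one simple cell's response.

    The field is applied without flipping (cross-correlation), which is the same as
    convolution for the left-right symmetric field used here.
    """
    fh, fw = len(field), len(field[0])
    outputs = []
    for r in range(len(image) - fh + 1):
        row = []
        for c in range(len(image[0]) - fw + 1):
            row.append(sum(field[i][j] * image[r + i][c + j]
                           for i in range(fh) for j in range(fw)))
        outputs.append(row)
    return outputs


def to_grey_levels(outputs):
    """Map outputs to grey levels: 0 is black (negative), 128 is grey (zero), 255 is white (positive)."""
    biggest = max(abs(v) for row in outputs for v in row) or 1.0
    return [[round(127.5 + 127.5 * v / biggest) for v in row] for row in outputs]


# Assumed vertical-bar receptive field: excitatory centre column, inhibitory flanks.
field = [[-1, 2, -1],
         [-1, 2, -1],
         [-1, 2, -1]]

# A thin vertical light bar (1) on a dark background (0).
image = [[0, 0, 0, 1, 0, 0, 0],
         [0, 0, 0, 1, 0, 0, 0],
         [0, 0, 0, 1, 0, 0, 0]]

outputs = simple_cell_outputs(image, field)
grey = to_grey_levels(outputs)
print(outputs)  # [[0, -3, 6, -3, 0]]
print(grey)     # [[128, 64, 255, 64, 128]]
```

The cell centered on the bar gives the largest positive output (white), its neighbours are pushed negative because the bar falls on their inhibitory flanks (darker than grey), and cells that see only uniform background give zero output (mid grey).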

Complex Cell

Unlike simple cells, complex cells cannot be mapped into positive and negative regions. An optimal stimulus does not need to fall on any particular region of the receptive field. However, a line or slit of a particular orientation is still the preferred stimulus. A theory of complex cells and receptive fields has been proposed (Frisby and Stone, 2010). This theory proposes that the receptive fields of complex cells can be predicted by supposing they receive their input from a series of suitably placed simple cells. However, this theory cannot be the whole story because some complex cells do not even receive input from the striate cortex.
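Whatever its ultimate standing, the proposed theory is easy to sketch: a complex cell pools the outputs of several simple cells that share a preferred stimulus but sit at different positions. The toy below works in one dimension for brevity, so it shows only the position tolerance and not orientation. The three-point receptive field and the choice of "take the maximum" as the pooling rule are assumptions, not details given in the source.

```python
FIELD = [-1, 2, -1]  # assumed simple-cell field: excitatory centre, inhibitory flanks


def simple_cell(image, position):
    """Response of the simple cell whose field starts at the given position."""
    return sum(w * image[position + j] for j, w in enumerate(FIELD))


def complex_cell(image):
    """Pool rectified simple-cell responses over all positions (assumed rule: maximum)."""
    positions = range(len(image) - len(FIELD) + 1)
    return max(max(0, simple_cell(image, p)) for p in positions)


bar_left = [0, 0, 1, 0, 0, 0, 0]   # light bar at index 2
bar_right = [0, 0, 0, 0, 1, 0, 0]  # same bar at index 4

# One simple cell is fussy about position...
print(simple_cell(bar_left, 1), simple_cell(bar_right, 1))  # 2 0
# ...but the pooled complex cell responds equally to both.
print(complex_cell(bar_left), complex_cell(bar_right))      # 2 2
```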

Hypercomplex or “end-stopped” cells cannot be mapped into positive or negative regions either, but unlike simple and complex cells, they prefer moving stimuli. In addition, hypercomplex cells are selective for stimulus length. Of all stimuli, the best is either a bar of defined length or a corner.

Extra-Striate Visual Areas

Due to the heavy burden of the packing problem, the brain alleviates some of the load by off-loading information to the extra-striate visual areas. These areas are specialised for particular visual data. Colour information is relayed to V4, motion information is relayed to V5, and lastly, object







Secondary visual cortex: V2

The secondary visual cortex envelops V1 and is organised into parallel stripes running perpendicular to the V1/V2 border. Stripes that respond to the same region of the retina run adjacent to each other, preserving the retinotopic map. These stripes come in three types: thick, thin and pale. Pale stripes are known as interstripes as they run between thick and thin stripes. Input is fed to the pale stripes from the hypercomplex cells of V1, and it is then forwarded to the LGN via the parvocellular layer. The main job of the pale stripes is to respond to oriented lines. Secondly, thick stripes receive input from layer 4B of V1 and respond to specific orientations as well as to binocular disparity. Output from the thick stripes is passed on to V5 via the magnocellular pathway. Lastly, thin stripes receive inputs from the colour blobs of V1, hence they are sensitive to colour or brightness. Output from the thin stripes projects to V4 via the parvocellular stream.


V4 is the colour area, and cells here respond to colour, simple shapes and objects. As colour does not have its own parameter, V4 does not have an accurate retinotopic map. As such, V4 serves as the first indication of the decline of retinal location and the rise of features as the primary index.


As mentioned above, the thick stripes of V2 project to V5. V5 is known as the motion area, as it responds to motion and stereo disparity. Non-spatial parameters begin to take precedence here, and the retinotopic map is no longer maintained. Zeki (1990) confirmed this when he found that paths across the retina become more chaotic the further they are from the striate cortex.

The Inferotemporal Cortex

Each point of the inferotemporal cortex represents a different view of a face. Despite the highly abstract parameter, nearby cells represent similar views of the face. Over the past years, there has been discussion over whether inferotemporal cortex cells are grandmother cells. The inferotemporal cortex was monitored while a patient was shown various faces; one cell seemed to respond only to pictures of Jennifer Aniston of Friends (Connor, 2005). Nearby cells did not seem to respond to different views of Jennifer Aniston, but they did respond to characters from the same show. These findings suggest that nearby cells represent many parameters that occur nearby in time. The temporal proximity of views experienced in everyday life may be reflected in the physical proximity of cells in the cortex.

Maps and Infinite Homuncular Regression

Although maps provide an excellent structural representation of what is going on in the brain, in themselves they have no intrinsic value (Deacon, 2012). Maps do not really tell us how the brain is interpreting or processing the information, only how it stores and forwards the information. To be fair, even that is only based on visual interpretation, not on concrete behaviour. Unfortunately, that brings up the issue of whether or not topological maps are of any value. The retinotopic map may just be a consequence of development and evolution, an attempt to minimise wiring of the brain. Deacon (2012) stresses that there is a massive flaw with the current use of mapping; he summarizes it as the ‘infinite homuncular regression.’ Basically, we have come to a point where other maps are just reading maps. The actual perceptual neurology has not been determined. Deacon warns scientists of the dangers of neuroimaging and maps when trying to prove the existence of neural activity and behaviour. All this really boils down to is the classic argument in psychology: correlation does not prove causation.


Fortunately, Graziano and his colleagues (2009) suggest that perhaps a homunculus does exist that can bridge perception and behaviour: the motor cortex. To put it plainly, a motor homunculus represents the sensitivity and innervation dedicated to particular muscles in our body. This homunculus can be mapped onto our motor cortex, and stimulation of these regions leads to an immediate motor response. Hence, we can bridge the gap between map and action. In other words, the motor cortex may in fact put an end to the infinite homuncular regression. A recent study carried out by Bouchard et al. (2013) found that when participants vocalised consonants and vowels, scans showed smooth trajectories in the motor cortex.