Music perception

{{Subpages}}
{{EZarticle-open-auto|CZ:Guidel 2008 summer course on Music and Brain‎}}
Processing a highly structured and complex pattern of sensory input as a unified percept of "music" is probably one of the most elaborate features of the human brain. In recent years, attempts have been made to investigate the neural substrates of music processing in the brain. Though progress has been made with the use of rather simplified musical stimuli, how music is perceived and how it can elicit intense sensations is still far from understood.
Theoretical models of music perception face the major challenge of explaining a wide variety of aspects connected to music, ranging from temporal pattern analysis (such as metre and rhythm), through syntactic analysis (for example, the processing of harmonic sequences), to more abstract concepts such as musical semantics and the interplay between listeners' expectations and suspense. Attempts to give some of these aspects a neural foundation are discussed below.
==Modularity==
Several authors have proposed a modular framework for music perception. Following Fodor, mental "modules" have to fulfil certain conditions, the most important of which are information encapsulation and domain-specificity. Information encapsulation means that a (neural) system performs a specific information-processing task and does so independently of the activities of other modules. Domain-specificity means that the module reacts only to specific aspects of a sensory modality. Fodor defines further conditions for a mental module, such as rapidity of operation, automaticity, neural specificity and innateness, whose validity for music-processing modules has been debated.
However, there is evidence from various complementary approaches that music is processed independently of, for example, language, and that there is not even a single module for music itself, but rather sub-systems for different relevant tasks.
Evidence for spatial modularity comes mainly from brain lesion studies in which patients show selective neurological impairments. Peretz and colleagues list several cases in a meta-study in which patients were unable to recognize musical tunes but were completely unaffected in recognizing spoken language. Such "amusia" can be innate or acquired, for example after a stroke. On the other hand, there are cases of verbal agnosia in which patients can still recognize tunes and seem to have an unaffected sensation of music. Brain lesion studies have also revealed selective impairments for more specialized tasks such as rhythm detection or harmonic judgements.
The idea of modularity has also been strongly supported by modern brain-imaging techniques such as PET and fMRI. In these studies, participants typically perform music-related tasks (e.g. detecting changes in rhythm or out-of-key notes). The resulting brain activations are then compared with those from a reference task, allowing one to identify brain regions that were especially active for a particular task. Using such a paradigm, Platel and colleagues found distinct brain regions for semantic, pitch, rhythm and timbre processing.
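To make the comparison logic concrete, the following is a minimal sketch of a subtraction-style contrast between a music task and a reference task, using hypothetical voxel activation maps and an arbitrary threshold; the array names, values and cutoff are illustrative assumptions and do not reproduce any study cited here.

```python
import numpy as np

# Hypothetical voxel-wise activation maps (e.g. averaged over trials),
# flattened to 1-D for simplicity; the values are illustrative only.
rng = np.random.default_rng(0)
music_task = rng.normal(loc=0.2, scale=1.0, size=10_000)      # music-related task
reference_task = rng.normal(loc=0.0, scale=1.0, size=10_000)  # control / reference task

# Subtraction paradigm: voxels more active during the music task
# than during the reference task.
contrast = music_task - reference_task

# Crude threshold to flag "especially active" voxels (assumed cutoff).
threshold = 2.0
active_voxels = np.flatnonzero(contrast > threshold)
print(f"{active_voxels.size} voxels exceed the contrast threshold")
```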
To find out the dependencies between different neural modules, brain-imaging techniques with high temporal resolution, such as EEG and MEG, are typically used; they can reveal the delay between stimulus onset and the processing of specific features. Such studies have shown, for example, that pitch height is detected within 10 to 100 ms after stimulus onset, while irregularities in harmonic sequences elicit an enhanced brain response about 200 ms after stimulus presentation.
Another method to investigate the information flow between modules in the brain is TMS. In principle, DTI or fMRI observations combined with causality analysis can also reveal these interdependencies.
==Early auditory processing==
A neural description of music perception has to start with the early auditory system, in which the raw sensory input in the form of sound waves is translated into an early neural representation.
Sound waves are described by their frequency components and the respective intensities. Both frequency (measured in Hz) and intensity (measured as the energy per area carried by the air pressure variation) translate into psychophysical quantities, namely pitch and loudness. After the sound waves have passed the pinna and the auditory canal, they hit the tympanic membrane, which is connected via the ossicles and the oval window to the fluid-filled cochlea. Here the basilar membrane responds to movements of the cochlear fluid. The place of maximal displacement of the basilar membrane depends on the frequency of the stimulus, and the size of the displacement on its intensity.
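As a toy illustration of describing a sound by its frequency components and intensities, the sketch below synthesizes a signal as a sum of sinusoids and recovers its spectrum with a discrete Fourier transform; the chosen frequencies, amplitudes and sampling rate are arbitrary assumptions for demonstration.

```python
import numpy as np

fs = 44_100                      # sampling rate in Hz (assumed)
t = np.arange(0, 0.5, 1 / fs)    # half a second of signal

# A sound described by its frequency components (Hz) and their amplitudes.
components = {220.0: 1.0, 440.0: 0.5, 660.0: 0.25}   # illustrative values
signal = sum(a * np.sin(2 * np.pi * f * t) for f, a in components.items())

# Recover the frequency content with a discrete Fourier transform.
spectrum = np.abs(np.fft.rfft(signal)) / len(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)

# The strongest peaks sit at the component frequencies defined above.
peaks = freqs[np.argsort(spectrum)[-3:]]
print(sorted(np.round(peaks, 1)))
```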
A special type of cell, the hair cell receptor, converts the mechanical deflection of the basilar membrane into an electrical signal, which is then transmitted by the auditory nerve, consisting of around 30,000 spiral ganglion cells. Because hair cells at a given location on the membrane are only excited when a particular frequency is present in the sound wave, one can think of the cochlea as performing a crude Fourier analysis of the input signal. Each ganglion cell signals by its firing rate how strongly a certain range of frequencies around a preferred characteristic frequency is contained in the original signal (the frequencies and intensities to which a neuron responds are called its receptive field). Theoretically, such a ganglion cell might be modelled by applying a bandpass filter corresponding to its receptive field.
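Along the lines of the bandpass analogy above, here is a minimal sketch of such a model: a Butterworth bandpass filter centred on an assumed characteristic frequency, with the mean rectified filter output taken as a stand-in for the cell's firing rate. The filter order, bandwidth and characteristic frequencies are illustrative assumptions, not a physiologically fitted model.

```python
import numpy as np
from scipy.signal import butter, sosfilt

fs = 44_100                                  # sampling rate in Hz (assumed)
t = np.arange(0, 0.5, 1 / fs)
signal = np.sin(2 * np.pi * 440.0 * t)       # test tone at 440 Hz

def ganglion_response(signal, cf, fs, bandwidth=0.3, order=4):
    """Crude ganglion-cell model: bandpass around the characteristic
    frequency cf, then mean rectified output as a firing-rate proxy."""
    low, high = cf * (1 - bandwidth), cf * (1 + bandwidth)
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    filtered = sosfilt(sos, signal)
    return np.mean(np.abs(filtered))

# Cells tuned near 440 Hz respond much more strongly than others.
for cf in (220.0, 440.0, 880.0):
    print(cf, round(ganglion_response(signal, cf, fs), 3))
```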
From the auditory nerve onwards, auditory information is conducted along multiple pathways. One primary pathway runs via the brainstem (the cochlear nuclei, superior olive and inferior colliculus) and the thalamus (medial geniculate nuclei) to the primary auditory cortex in the temporal lobes.
A common principle of organization of sound representations along this pathway is known as tonotopy: neighbouring neurons encode nearby frequencies. Inside the cochlea, this is simply a physical consequence of the resonance behaviour of the basilar membrane. However, this mapping is also preserved at later stages, though not in all areas of the auditory cortex. Sounds from around 200 Hz up to 20,000 Hz are represented tonotopically. For lower frequencies, from about 20 Hz to 4 kHz, an additional mechanism is used to encode the frequency of the signal. This mechanism is called phase locking: a single neuron or a whole population responds with spikes whose timing is locked to the phase of the sound wave. For low frequencies, a single neuron can fire one spike per wave period (the period being the inverse of the sound frequency).
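To illustrate the one-spike-per-period idea, the following sketch generates spike times locked to a fixed phase of a low-frequency tone; the frequency, duration and locking phase are illustrative assumptions, and real neurons lock far more noisily and may skip cycles.

```python
import numpy as np

f = 100.0          # sound frequency in Hz (assumed, low enough for phase locking)
period = 1.0 / f   # wave period in seconds: the inverse of the frequency
duration = 0.05    # 50 ms of stimulus

# One spike per wave period, locked to a fixed phase of the sine wave
# (here: a quarter period after each zero crossing, i.e. at the wave peak).
locking_phase = 0.25 * period
spike_times = np.arange(0, duration, period) + locking_phase

print(np.round(spike_times * 1000, 1))   # spike times in ms: 2.5, 12.5, 22.5, ...
```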
[...]
==Pitch encoding==
==Contour analysis==
==Syntax analysis==
==Metre analysis==
==Memory==
==Emotion==
==Motor control==
==Discussion==
