People can vary the pitch of their voices, but before the invention of musical instruments these variations could not be associated with a linear dimension such as a series of air holes in a flute or positions along a string. Rather, variations in pitch may initially have been associated with emotional meaning, as in primate calls, or with symbolic meaning in human proto-languages (Mithen, 2005). Thus, today, people can associate the pitch of a sound with both a symbolic meaning and a spatial (or musical) scale. For example, many people can recognize the pitch of their telephone dial tone (N. A. Smith & Schmuckler, 2008), and most musicians could reproduce this pitch on an instrument. Other examples of the association of a stimulus frequency with symbolic meaning are absolute (or perfect) pitch, where musicians learn to directly associate a stimulus frequency with a note name in the Western musical scale (Levitin & Rogers, 2005; McLachlan et al., 2013b; Wilson et al., 2012); the frequencies that define vowel sounds (Deterding, 1997); and tonal languages such as Mandarin, where changes in the pitch of a vowel can change the meaning of a word.

Birds, frogs and reptiles lack the neocortex and the higher auditory processing centers found in humans. Networks in the brainstem and cerebellum can learn implicitly by forming neural models of stimuli (Fiez et al., 1992; Gebhart et al., 2002; Ravizza et al., 2006). The stimulus models of one sensory modality can be coupled with stimulus models of other modalities that co-occur with high statistical reliability, that is, when several sensory inputs relate to the same object, event or behavior.

For example, people implicitly learn language and music (Mahon & Caramazza, 2008). Their learning rates are improved by combining perception and production (Kotz & Schwartze, 2010), probably because motor mapping offers additional structure for the formation of neural sensory models in the cerebellum, and because voluntary actions can be used to support perceptual learning. In particular, people who learn musical instruments have better pitch perception (McLachlan et al., 2013a, 2013c), show stronger auditory brainstem phase locking to the sounds of their instruments (Strait et al., 2012), and often map the psychological pitch dimension onto motor maps for their instrument (Rusconi et al., 2006). More generally, this phenomenon, known as embodied cognition (Mahon & Caramazza, 2008), allows musicians to automate many aspects of musical perception and production, thus freeing up brain processing resources for the analysis of higher-level features of musical expression.

What about secular music? Here our first manuscripts seem to date from the 13th century, with Adam de la Halle and his contemporaries writing motets for singers, alongside anonymous dance music, mostly monophonic. The first polyphony, music in more than one part, was generally based on a cantus firmus or tenor, often derived from a church chant, around which other, more elaborate parts were woven. Polyphony of this type seems to have been a purely European development; other cultures then preferred, and in many cases still prefer, a single line (monophony) or, when singing in a group or singing a single line with accompaniment, heterophony, in which everyone sings much, but not exactly, the same thing. The later motets may have three or four independent lines, sometimes each with its own text, intertwined.