A.I. & Music

Didakis, S. (2005) “Foundations of AI Systems for the Automatic Composition of Music”, Chapter 2, Bachelor Thesis, London, 2005

Artificial Intelligence (AI) is a branch of computer science that tries to emulate the way a human mind applies intelligence to tasks. Most AI music systems attempt to simulate the way a human mind works and to explain cognitive processes such as the perception and conscious understanding of music. A well-designed AI system should be able to understand abstract concepts like tension, anger, and brightness, and bring entirely new methodologies to music composition that may heighten the excitement music produces on various levels.

The biggest problem AI confronts in compositional tasks is the subjective nature of music. It is not easy to evaluate something that carries musical meaning by measuring quantities. If this were possible, mathematical representations could be created and tested in programming structures. The AI systems would then act as transducers between the actual experience and the simulated mathematical calculations. The hypothesis that arises here is that AI systems will find ways to explain the perception of musical experience and create techniques and methods for the construction of original musical structures. The systems should be able to generate acceptable melodies, harmonies, or orchestrations, and offer the possibility of generating new ideas for creative work. The main question asked in this work is: Will intelligent systems be able to simulate a human composer, and furthermore bring about a new Renaissance in music technology?

Subfields of AI have seen fruitful applications in music composition, interactive accompaniment, listening, and many other areas. The computational modeling of intelligence is also used in the study of music perception, which provides new techniques for text and score analysis and for human-computer interfacing. These methods attempt to open up and establish a new world of music education, understanding, and awareness.

This study focuses mainly on systems used for automatic composition. Such applications may address composition and orchestration, or the automatic harmonization of a melody and its progression within a context. Attempts to define these systems stem from the need to design “clever” compositional systems that exhibit features at a higher level of complexity, behaving as “intelligent machines, able to represent knowledge about music in such a way that it can be manipulated” [Camurri, 1993].

AI systems provide the most sophisticated way to analyze and define a musical task, because their architecture enables solutions based on intelligent computational procedures. These procedures allow the system to learn a task, understand it, reason about it, and define a solution that is valid and appropriate within a given context. A proper AI system designed for musical purposes should have the ability to “reflect about abstract notions, and the ability to discover new concepts” [Balaman, 1988].

Other algorithmic methods are also presented here, although they do not belong to the AI family. Their architecture is simpler and their results are less effective; nevertheless, they can automate the compositional process, offering insight into works and introducing elements of arbitrariness. These systems have been used extensively because they give composers the power to explore new compositional possibilities.


States of the Mind

In spite of the great potential that AI systems may have for calculating and modeling the forces involved in simulating environments (e.g., the laws of physics, environmental or meteorological phenomena), it is rather difficult to define an algorithm that can simulate mental and intellectual abstractions. If scientific explanations could be applied to the understanding of cognitive phenomena, then adapting them to an AI system would yield a well-designed and functional brain simulation. For that reason, the tasks that exhibit intelligent activity are studied through scientific experiments and further investigations, and the outcome should form new theories for implementing programs that behave analogously.

The term “intelligence” has been defined as “the ability to learn effectively, to react adaptively, to make proper decisions, to communicate in language or images in a sophisticated way, and to understand” [Kasabov, 1998]. AI systems have to exhibit aspects of intelligence in their programming architecture in order to process information, and they are to be judged intelligent according to their ability to “achieve a specified result with variations, difficulties, and complexities posed by the task environment” [Newell & Simon, 1990].

Technologists, mathematicians, and engineers constantly try to find new ways to approach and express intelligent problem solving in terms meaningful to the computer, mostly with programming languages like C, LISP (LISt-Processing), and PROLOG (PROgramming-in-LOGic). Intelligent programming methods try to replicate human reasoning on real problems, without faults or weaknesses. Up to this point, the architectural complexity of AI has enabled systems to achieve logic grounded in real situations, and even common sense, reasoning, planning, heuristics, and pattern recognition.

The purposes of a modern AI system must extend “to more global, complex, and knowledge-intensive tasks”, and these systems should become at some point our “agents”, capable of “handling on their own the full contingencies of the natural world” [Newell & Simon, 1990]. This is an ambitious aim, highly complex in its analysis and design, yet attainable. To better understand these intelligent computational structures, we must first examine the mechanisms that activate the psychological and emotional patterns in our spiritual hypostasis.


Cognitive Processes

“To understand any art, we must look below its surface into the psychological details of its creation and absorption” [Minsky, 1993]

A computational system that can process information about a specified environment with the use of intelligence might not be enough to support computational methods for understanding the arts, and music in particular. Other aspects of human nature must be examined, such as psychological profiles and cognitive processes, which provide useful information about the intellectual activity of music perception. For example, there seems to be an “alignment between the compositional mechanisms of the composer and the perceptual mechanisms of the listener” [Pearce & Wiggins, 2002; Lerdahl, 1988]. It has also been noticed that “aesthetic judgments depend not only on a common body of shared experiences” [Hume, 1965], but on a common set of cognitive schemata, like ideas, symbols, or musical patterns. These factors are able to produce a “structured representation of the object in the mind of the composer or the listener” [Kant, 1952]. Further study of this subject will enable the accurate localization of pattern recognition in the musical brain.

The establishment of functional levels of description of human psychology and cognition provides explanations for study and development, and tries to “capture a cognitive vocabulary to reveal certain fundamental patterns of musical creation” [Pylyshyn, 1986]. The observations can be formulated as symbolic representations of the subjects under investigation, which leads “to the design and construction of physical-symbol systems” [Newell & Simon, 1990]. The programming structure of the model has to express all the processes in a language the computer can understand, and must also include an extended definition of the procedures it has to follow in this space. The model should be in a position to test and evaluate the theories explicitly and to show behavior similar to the process that humans follow in these specific tasks. Once a model of music cognition has been achieved, it can be incorporated into an AI system to produce a more accurate brain simulation. The more complex the model is, the more it can tell us about the cognitive states of the composer, the performer, or the listener.


Emotional Indexes

“My soul is a hidden orchestra; I know not what instruments, what fiddle strings and harps, drums and tambours I sound and clash inside myself. All I hear is the symphony” [Fernando Pessoa]

In order for AI engineers to model and fully understand a human mind, they would have to model “emotion and motivation too” [Boden, 1990b]; these factors are central to our understanding of music. An emotion is “a psychological state or process that functions in the management of goals” [Wilson & Keil, 1999], and it evaluates the outcomes of events in relation to the goal being pursued. The output of this mental function is positive if the goal is achieved and negative if the goal is obstructed. Complexity arises when emotions have conflicting interests on a topic, which makes the choice of state rather difficult, especially when these emotions “are associated with various sorts of motivational controls” [Boden, 1990b].
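
The goal-based reading of emotion above can be illustrated with a very small sketch. It is a toy under stated assumptions, not a model proposed by any of the cited authors: the `Goal` class, the `importance` weight, and the two function names are hypothetical, and real appraisal models are far richer. It shows only the sign convention described in the text (positive when a goal is achieved, negative when it is obstructed) and how two conflicting goals can cancel out.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Goal:
    """A hypothetical goal with an assumed importance weight in [0, 1]."""
    name: str
    importance: float


def appraise(goal: Goal, achieved: bool) -> float:
    """Signed valence: positive if the goal is achieved, negative if obstructed."""
    return goal.importance if achieved else -goal.importance


def overall_valence(outcomes: Iterable[Tuple[Goal, bool]]) -> float:
    """Sum the signed valences; conflicting goals pull in opposite directions."""
    return sum(appraise(goal, achieved) for goal, achieved in outcomes)


# Two equally important goals, one achieved and one obstructed.
resolution = Goal("hear the cadence resolve", 0.5)
surprise = Goal("be surprised by the harmony", 0.5)
mixed = overall_valence([(resolution, True), (surprise, False)])
```

In this sketch the mixed case sums to zero, which is one simple way to picture why choosing a single emotional state becomes difficult when goals conflict.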

A hypothetical model of emotional reaction should have motives, thoughts, or beliefs: states that can trigger mechanisms or activate filters that “tend to disturb other ongoing activities” [Sloman, 1990]. This interaction of different processes may be very complex, especially when new states are in conflict, and the disturbance it creates can affect the overall performance of the system. In a human model, these disturbances are the continual interruptions in thinking and deciding, as well as the “influencing of the decision-making criteria and perceptions” [Sloman, 1990].

The emotions would have to be programmed as elements that affect the simulated psychological pattern of the model in such a way that its output mirrors the one found in humans. A musical acoustic stimulus appears to trigger emotional states like tension, excitement, and euphoria, and mental movements like memories, desires, perceptual patterns, or unconscious images. The musical experience then acts as a variable that continually drives these emotional states, always in relation to other activated psychological patterns. This musical stimulus arouses our psychological paths and fills us with excitement, especially when it triggers a state that is strongly rooted in some memory location in the brain.

But how do the different mechanisms of perception work together to process and shape these ongoing activities?


Associations of Activities

The processes that take place while someone listens to music require sophisticated memory mechanisms involving “both conscious manipulations of concepts and subconscious access to millions of networked neurological bonds” [Miranda, 2001]. The study of these bonds becomes even more difficult because the brain behaves holistically, distributing information simultaneously across its whole structure.

The perception of music requires groups of neurons to gather information about the sound pressure acting on the eardrum. Auditory neurons may consist of groups within groups that change the firing rate of their output, which is transmitted as electrical signals. According to the theory of Marvin Minsky [Minsky, 1998; 1993], when this firing rate reaches the human processing unit (the brain), the information is distributed to different “agents” whose aim is to decode the firing rate into a representation that is meaningful to the conscious mind. The agents categorize, organize, and process their assigned tasks autonomously, without knowing the state of any other agent unless it is absolutely needed.
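
The idea of autonomous agents that each decode the same incoming signal can be pictured with a short sketch. Everything in it is an illustrative assumption rather than part of Minsky's theory: the agent names, the list of numbers standing in for firing rates, and the threshold of 50.0 are all hypothetical. The only point it demonstrates is that each agent receives the same input and produces its own label without access to any other agent's state.

```python
from typing import Callable, Dict, List

# An agent is modeled here as a function from firing rates to a label.
Agent = Callable[[List[float]], str]


def loudness_agent(rates: List[float]) -> str:
    """Label the signal by its mean rate (the threshold is an arbitrary assumption)."""
    mean = sum(rates) / len(rates)
    return "loud" if mean > 50.0 else "soft"


def contour_agent(rates: List[float]) -> str:
    """Label the signal by comparing its last value with its first."""
    if rates[-1] > rates[0]:
        return "rising"
    if rates[-1] < rates[0]:
        return "falling"
    return "steady"


def distribute(rates: List[float], agents: Dict[str, Agent]) -> Dict[str, str]:
    """Hand the same input to every agent; no agent sees another agent's result."""
    return {name: agent(rates) for name, agent in agents.items()}


agents = {"loudness": loudness_agent, "contour": contour_agent}
percept = distribute([20.0, 60.0, 100.0], agents)
```

The dictionary returned by `distribute` plays the role of the combined representation; how such partial results become a unified conscious percept is exactly the open question the text raises.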

Although the received information is distributed in parallel, the processed information internally creates symbolic representations that let the human brain understand aspects of the external world (in our case, music) and construct the “illusion” of consciousness. As Brooks and Stein suggest [Brooks & Stein, 1994], the system responsible for the stimulus-response “abstracts away from particular inputs to treat large classes of input similarly”, and this “begins the generalization of particular stimuli into complex reactions, as well as the external appearance of categorization”.

These methods are closely related to the chemical, biological, and neurophysiological reactions inside the brain that provide meaningful evidence of how the human mind operates. To carry the aforementioned cognitive processes over into computational systems, they must be described mathematically and algorithmically, so that a similar behavior of (deep) intelligence and perception can emerge.