boii

Abstract

This project deals with the use of artificial intelligence as a creative medium for generative sound design. Further, it pursues the goal of making man-machine interaction a matter of bidirectional sign processes in a semiotic sense. To this end, the project consists of:

  1. a written thesis on the semiotic theory of “agents” in man-machine interactions;
  2. a project for generative real-time audio synthesis using artificial neural networks;
  3. this web site for presenting the findings of both previous points in conclusion.

If you would just like to hear what it sounds like, you can skip ahead to the Examples.

If you like code, please visit this project on GitHub!

I. Designer's Deputy

The term “Designer’s Deputy” was coined by Clarisse de Souza as one of the core concepts of her approach to a semiotic theory of designing user interfaces. This theory, “semiotic engineering”, suggests a directionality to the flow of information during man-machine interaction: it is the designer who sends a message to the user through a medium.

The user is, therefore, a receiver of pieces of information that are supposed to help them reach certain goals. These pieces of information are artefacts of design processes within which designers anticipate the context of their application.

The diagram below [Fig.1] illustrates the space that is drawn by the moment of interaction when viewed from the perspective of semiotic engineering. Here the computer is the deputy of the designer, communicating one-shot messages of meta-communicative value to the user through predefined channels and codes.

[Fig.1] An ontology of the HCI-design space according to the theory of semiotic engineering.

These one-shot messages consist of signs whose meaning lies in the relations between them, and their meaning communicates some of the designer’s knowledge of how to accomplish a certain task.

Therefore, each sign that gets read and understood by the user transmits some information about how to understand the other signs in the system, and each piece of knowledge derived from this set stems from the designer of the interface. Thus the instance that renders the designer’s message can be called the “Designer’s Deputy”.

In a nutshell, this means that man-machine interaction would effectively be restricted to unidirectional user-designer semiosis, even when extending to responsive and artificially intelligent interfaces as designers’ deputies.

What does this mean for the design of intelligent interface agents? On the one hand, it motivates the design of meaningful interfaces that communicate through interaction itself, easing the use of increasingly complex applications, as it gives the designer a voice during interaction time.

On the other hand, it renders the Peircean concept of “perfect semiosis”, which results from the recursive interpretation of signs as other signs and so on, impossible, as it suggests that there is, in fact, no interpretant being formulated by the computer.

In other words:
We cannot exchange meaning with a computer. Semiosis during interaction happens on the user's side only, based on the signs that result from the semiotic process of designing the interface.

To read the full paper on this topic, visit my page on academia.edu.

II. Bidirectional Interface

The bidirectionally oriented intelligent interface-agent takes a step toward designing interfaces that transcend the boundary described above. Designing an interface in which information flows bidirectionally means trying to generate a relative response to any meaningful input the user might send at the time of interaction.

Within boii this is achieved by artificial neural networks that return a prediction for arbitrary audio input, based both on relationships learned from a large set of recordings of solo piano pieces composed by Frédéric Chopin and on what has been input just before. The machine-learning algorithm in boii is composed of artificial recurrent neural networks, RNNs [Fig.2].

In machine learning and cognitive science, artificial neural networks are a family of models inspired by biological neural networks (nervous systems, brains) and are used to estimate or approximate functions that can depend on a large number of inputs and are generally unknown. Artificial neural networks are generally presented as systems of interconnected “neurons”. The connections have numeric weights that can be tuned based on optimisation, making neural nets adaptive to inputs and capable of learning.
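To make the recurrence concrete, here is a minimal sketch of a single recurrent cell in Python/NumPy. This is purely illustrative and not boii's actual implementation; all names, sizes, and weight initialisations are hypothetical. The key point is that each new hidden state depends on the current input *and* the previous hidden state, which is what lets an RNN respond to "what has been input just before".

```python
import numpy as np

# Illustrative recurrent cell -- not boii's actual code.
class SimpleRNNCell:
    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        # Connection weights: in a real system these are tuned by training.
        self.W_in = rng.normal(0, 0.1, (hidden_size, input_size))
        self.W_rec = rng.normal(0, 0.1, (hidden_size, hidden_size))
        self.b = np.zeros(hidden_size)

    def step(self, x, h):
        # The new state mixes the current input with the previous state:
        # this recurrence gives the network its short-term memory.
        return np.tanh(self.W_in @ x + self.W_rec @ h + self.b)

cell = SimpleRNNCell(input_size=4, hidden_size=8)
h = np.zeros(8)
for x in np.random.default_rng(1).normal(size=(3, 4)):  # three input frames
    h = cell.step(x, h)
# h now summarises the whole sequence seen so far.
```

The `tanh` nonlinearity keeps every component of the state in the interval [-1, 1], a common choice for simple recurrent cells.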

[Fig.2] A recurrent artificial neural network with one hidden layer.

Deep RNNs consist of many such individual networks stacked one after the other.
The three deep RNNs within boii have the following architecture:

  • Network A
    is 13 units deep with each unit consisting of 250 neurons.
  • Network B
    is 7 units deep with each unit consisting of 511 neurons.
  • Network C
    is 9 units deep with each unit consisting of 511 neurons.
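The three configurations above can be written down compactly as (depth, units-per-layer) pairs, as in the sketch below. This is only a description of the layer geometry, not boii's actual configuration code.

```python
# Layer geometry of the three deep RNNs listed above (illustrative).
architectures = {
    "A": {"depth": 13, "units": 250},
    "B": {"depth": 7,  "units": 511},
    "C": {"depth": 9,  "units": 511},
}

def layer_sizes(arch):
    # A deep RNN here is the same-size recurrent layer stacked `depth` times.
    return [arch["units"]] * arch["depth"]

for name, arch in architectures.items():
    print(name, "->", len(layer_sizes(arch)), "layers of", arch["units"], "units")
```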

Knobs

You can change the behaviour of the boii by using the knobs on the control-surface to mix/crossfade the outputs of the different networks.

Turning the right knob counterclockwise will bias the mix so that you only hear Network A, while turning it clockwise will make you hear the mix determined by the left knob.

Turning the left knob then determines whether you hear Network B or Network C.
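The two-knob routing described above amounts to a nested crossfade, which can be sketched as follows. The function and parameter names are mine, not boii's; `right` and `left` stand for knob positions in [0, 1], with 0 meaning fully counterclockwise, and `a`, `b`, `c` for output samples from Networks A, B, and C.

```python
# Hedged sketch of the two-knob crossfade (hypothetical names).
def mix(a, b, c, right, left):
    bc = (1.0 - left) * b + left * c       # left knob: Network B vs. Network C
    return (1.0 - right) * a + right * bc  # right knob: Network A vs. the B/C mix

# Right knob fully counterclockwise -> only Network A is audible.
print(mix(1.0, 2.0, 3.0, right=0.0, left=0.5))  # -> 1.0
```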

Keys

The 49 keys allow you to send notes with pitches ranging from C1 to C5 to the boii.

Pitch and Noise

Moving this lever to the left or right pitches the note(s) determined by the keyboard up or down accordingly. Note that the pitch sent to the boii is continuous, but the output is quantized, because the algorithm has been trained on solo piano music, which doesn't contain continuous pitch changes.
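The continuous-in, quantized-out behaviour can be sketched as snapping a continuously bent pitch back to the semitone grid that the piano training data implies. The helper name and MIDI-style note numbers below are hypothetical, for illustration only.

```python
# Sketch: continuous pitch bend in, semitone-quantized pitch out.
def quantize_pitch(midi_note, bend_semitones):
    continuous = midi_note + bend_semitones  # continuous pitch sent to the boii
    return round(continuous)                 # output snaps to the nearest semitone

print(quantize_pitch(60, 0.3))  # -> 60 (bend too small to leave C4)
print(quantize_pitch(60, 0.7))  # -> 61 (snaps up a semitone)
```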

Pushing the lever away from you will fade the notes played on the keys into noise.

Proximity Sensor

Holding your hand over the proximity sensor adds audio from a microphone to the inputs of the RNNs. You will then hear anything coming through the microphone, in addition to the keys, processed through the boii.

[Fig.3] Current version of the boii keyboard-interface.

Examples

Press the play button to play an example. Drag the blue dial around the play button to crossfade between the original audio and the audio generated by the boii.

[Example 1] Ambience in a gallery.

[Example 2] A musician playing with a modular synthesizer through the boii.

[Example 3] An acapella vocal track through the boii.

Footer

This website and the project around it were created by Alexander Morosow, in part toward a Bachelor of Arts degree in Interaction Design at BTK Berlin, Nov. 2016.

If you would like to contact me, please do so at alexander.morosow@btk-fh.de