Design and Implementation of an Interactive Mixing Environment for VR
The ultimate goal of immersive audio is to erase the distinction between simulation and reality by challenging and leveraging knowledge in acoustics, the physics of sound, and human anatomy. This project set out to investigate these ideals by gaining a detailed understanding of current theories and practices in immersive audio and applying that knowledge to create an interactive music experience and tool.
The first step in this project was to create a virtual environment with 360-degree video and immersive audio. This step primarily employed techniques in recording and production to capture audio signals that possessed certain qualities, like object and reverberant information. Using a combination of different software plugins, DAWs, and gaming engines, the next steps included signal processing and virtual reality (VR) implementation. The final product can be described as interactive mixing. This VR application will give the user the ability to point to any instrument or player in a 360-degree video scene and apply different audio effects. The audio effects, whose parameters will appear on a graphical user interface (GUI), will include volume control, reverb, filtering, and equalization (EQ).
For this project, a series of spot microphones, Sennheiser's AMBEO first-order ambisonic microphone, and a 360-degree camera were used.
(A recording session at the James L. Dolan studio)
A-Format to B-Format Conversion & Unity Pre-settings
After the recording session, the first task was to sync the recorded video with the multiple audio tracks. This was aided by one of the performers clapping their hands as a 'rolling' cue before each piece. The two kinds of signal were then processed differently. The spot microphone tracks were simply bounced out. For the ambisonic signals, the AMBEO plug-in was used to convert the four mono A-format signals into a single four-channel B-format file. The AMBEO plug-in settings were: the default 0-degree microphone rotation, no ambisonics correction filter, the low-cut filter applied, upright microphone position, and AmbiX output format, which is the format suggested by previous research and the official Unity documentation.
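The A-format to B-format step can be illustrated with the textbook first-order sum-and-difference matrix. This is only a minimal sketch, not the AMBEO plug-in's actual processing (which also involves correction filtering). The capsule order (front-left-up, front-right-down, back-left-down, back-right-up), the function names, and the unnormalized channel gains are all assumptions made for illustration.

```python
def a_to_b_format(flu, frd, bld, bru):
    """Combine one sample from each tetrahedral capsule into W, X, Y, Z.

    Assumed capsule order: front-left-up, front-right-down,
    back-left-down, back-right-up. Gains are left unnormalized.
    """
    w = flu + frd + bld + bru   # omnidirectional component
    x = flu + frd - bld - bru   # front minus back
    y = flu - frd + bld - bru   # left minus right
    z = flu - frd - bld + bru   # up minus down
    return w, x, y, z


def to_acn_order(w, x, y, z):
    """Reorder first-order components to ACN channel order: W, Y, Z, X."""
    return w, y, z, x


def convert_frames(frames):
    """Convert a list of 4-sample A-format frames to ACN-ordered B-format."""
    return [to_acn_order(*a_to_b_format(*frame)) for frame in frames]
```

For example, a frame in which only the two front capsules pick up signal, `(1.0, 1.0, 0.0, 0.0)`, yields energy in W and X only, which is what a source directly in front of the microphone would be expected to produce.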
Once the audio files were prepared, Unity 5.6.4 was selected as the game engine because it met our requirements for mounting 360-degree video and processing both object and ambisonic signals. A sphere-shaped game object was built as the carrier of the 360-degree video, with the camera placed at the center of the sphere and the video rendered on its inner surface. Six audio sources were placed on the circumference to carry the signals from the different spot microphones, with each audio source’s position corresponding to the matching player’s position in the video. According to the official Unity documentation, the ambisonic file should be a single B-format WAV file with ACN component ordering and SN3D normalization.
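The placement of the audio sources around the listener can be sketched as a conversion from azimuth angle to Cartesian coordinates. The sketch below assumes Unity's left-handed, y-up convention with z pointing forward, and measures azimuth clockwise from straight ahead. The six angles, the radius, and the function names are hypothetical placeholders, not the values used in our scene.

```python
import math


def source_position(azimuth_deg, radius=1.0, height=0.0):
    """Return (x, y, z) for a source on a horizontal circle around the camera.

    Assumes x = right, y = up, z = forward, with azimuth measured
    clockwise from the forward direction.
    """
    az = math.radians(azimuth_deg)
    return (radius * math.sin(az), height, radius * math.cos(az))


# Hypothetical azimuths for six players; real values would be read off the video.
PLAYER_AZIMUTHS_DEG = [-75.0, -45.0, -15.0, 15.0, 45.0, 75.0]


def layout(radius=5.0):
    """Positions for all six spot-microphone sources at a given radius."""
    return [source_position(az, radius) for az in PLAYER_AZIMUTHS_DEG]
```

Keeping the angles in one list makes it straightforward to adjust a single source when a player's position in the video does not line up with the audio.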
To run this project in a virtual reality environment, the HTC VIVE was selected as the target playback platform. SteamVR and VRTK provide the scripts and tools that helped build the connection between the VR equipment and the Unity game engine. Since live mixing is by nature an interactive operation, a virtual interface was needed to let the user modify audio values in Unity. A graphical user interface (GUI) mixing panel was built and attached to the VR scene so that the user can see the faders that control the volume of each musical instrument. VRTK's controller event scripts make it possible to assign functions to each controller button. In this project, the user holds the grip button to activate the mixing panel, aims the controller's laser pointer at the desired fader, and then moves the fader while holding the trigger button.
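One way a fader position could be turned into a volume value is a decibel taper. The sketch below is an assumption for illustration and not the mapping used in our panel: the -60 dB to 0 dB range, the hard mute at the bottom of the travel, and the function name are all hypothetical. The result is a linear gain between 0 and 1, the range that Unity's audio source volume expects.

```python
def fader_to_gain(position, min_db=-60.0, max_db=0.0):
    """Map a normalized fader position (0..1) to a linear gain (0..1).

    The position is clamped, mapped linearly onto a decibel range, and
    converted to amplitude. A fader at the very bottom mutes the source.
    """
    position = min(max(position, 0.0), 1.0)
    if position == 0.0:
        return 0.0
    db = min_db + position * (max_db - min_db)
    return 10.0 ** (db / 20.0)
```

A decibel taper is chosen here because equal fader movements then correspond to roughly equal perceived loudness steps; a plain linear mapping would also work but tends to crowd most of the audible change into the lower part of the fader.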
AmbiX Plug-in suite
The AmbiX plug-in suite is a set of VST plug-ins created by Matthias Kronlachner that can be used flexibly in digital audio workstations and signal processing software such as Reaper or Max/MSP. The plug-ins allow users to produce and edit ambisonic audio content either during a recording session or in post-production. Using a diverse set of preset virtual loudspeaker setups, the suite provides many options for decoding ambisonic signals to various playback systems, ranging from 5.1 to 22.1 multichannel loudspeaker setups as well as binaural audio over headphones. The ambisonic order, which determines the number of ambisonic channels used, can also be selected. For our project, the Sennheiser AMBEO microphone (first-order ambisonics) was used, yielding four channels (A-format). The suite uses the AmbiX ambisonic format: ACN channel ordering and SN3D normalization.
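The relationship between ambisonic order, channel count, and ACN ordering can be made concrete with a short sketch. It uses the standard definitions (a full-sphere signal of order N has (N + 1)^2 channels, and the ACN index of degree l and index m is l^2 + l + m); the function names are our own and are not part of the plug-in suite.

```python
def channel_count(order):
    """Number of channels in a full-sphere ambisonic signal of a given order."""
    return (order + 1) ** 2


def acn_index(degree, index):
    """ACN channel number for spherical-harmonic degree l and index m (-l..l)."""
    if abs(index) > degree:
        raise ValueError("index must satisfy -degree <= index <= degree")
    return degree * degree + degree + index


# Traditional B-format letters for the first-order components, keyed by (l, m).
FIRST_ORDER_LETTERS = {(0, 0): "W", (1, -1): "Y", (1, 0): "Z", (1, 1): "X"}


def first_order_acn_layout():
    """First-order channel letters sorted into ACN order."""
    ordered = sorted(FIRST_ORDER_LETTERS.items(), key=lambda kv: acn_index(*kv[0]))
    return [letter for _, letter in ordered]
```

For the first-order material recorded here, this gives four channels in the order W, Y, Z, X.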
SABRE (SOFA/AmbiX binaural rendering) is an open-source toolkit for customizing the AmbiX ambisonics-to-binaural renderer included in Kronlachner’s plug-in suite. SABRE provides a collection of MATLAB functions that enable developers and engineers to apply personalized HRTFs to the decoder, so that a binaural decoder with appropriate HRTFs can be achieved. To do so, the intended HRTF data sets must be in “SOFA” format (Spatially Oriented Format for Acoustics), which stores spatially related audio data such as HRTFs and binaural or directional room impulse responses (BRIRs and DRIRs) and has been standardized by the Audio Engineering Society (AES) as AES69-2015. During HRTF processing in MATLAB, the SABRE toolkit offers multiple options for modifying HRTFs, such as interpolation and equalization, in order to optimize them for the decoder used and the intended playback system.
To spatialize both the monophonic signal from each instrument and the ambisonic signal in Unity, the Native Spatializer for Unity offered by Oculus was used. It allows a user wearing a head-mounted display (HMD) to localize each sound source, with the rendering changing relative to head movement in the 3D environment. The Spatializer is based on a previous audio SDK, but is made especially for Unity’s VR environment. It also implements the auditory principles mentioned previously for use in the VR environment. First, it applies the concepts of interaural time difference (ITD) and interaural level difference (ILD) to different frequency ranges. If the majority of the frequency content lies above 1500 Hz, the level difference between the two ears is the primary cue for lateral localization. In contrast, lower-frequency content, especially in the 500-800 Hz range, relies more on the interaural time difference, since level differences between the ears become small at low frequencies. For front/back and elevation localization, spectral modifications are used to simulate the realistic filtering caused by the head and body. Moreover, HRTFs are applied as direction-dependent filters that are selected according to head motion. Existing HRTF databases contain sets of measurements that represent a signal being heard from different angles, so a sound source with known directional information can be filtered with the appropriate HRTF data for a realistic result. This filtering is usually carried out by convolution in the time domain or, equivalently, by multiplication in the frequency domain using the Fourier transform. Other typical techniques are also used, such as loudness modeling, high-frequency attenuation, and appropriate mixing with the dry signal, delay, and reverberation. By implementing these methods, the sound source can be spatialized more accurately.
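The time-domain convolution step can be sketched with a toy pair of head-related impulse responses (HRIRs). This is a deliberately simplified illustration and not the Oculus Spatializer's implementation: the far-ear response is just a delayed, attenuated impulse, the delay comes from Woodworth's spherical-head ITD approximation, and the head radius, the far-ear gain of 0.5, and all function names are assumptions. A real renderer would use measured HRIRs, for example from a SOFA file.

```python
import math


def woodworth_itd(azimuth_deg, head_radius=0.0875, speed_of_sound=343.0):
    """Approximate ITD in seconds for a source at 0..90 degrees azimuth."""
    theta = math.radians(azimuth_deg)
    return head_radius / speed_of_sound * (theta + math.sin(theta))


def convolve(signal, impulse_response):
    """Direct-form time-domain convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, s in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += s * h
    return out


def toy_hrir_pair(azimuth_deg, sample_rate=48000, far_ear_gain=0.5):
    """Toy (near, far) HRIRs: the far ear hears a delayed, quieter impulse.

    far_ear_gain is a hypothetical stand-in for the ILD; it is not measured data.
    """
    delay = round(woodworth_itd(azimuth_deg) * sample_rate)
    near = [1.0]
    far = [0.0] * delay + [far_ear_gain]
    return near, far


def render_binaural(mono, hrir_near, hrir_far):
    """Filter one mono signal with each ear's HRIR; returns (near, far) signals."""
    return convolve(mono, hrir_near), convolve(mono, hrir_far)
```

Calling `render_binaural(mono, *toy_hrir_pair(90))` produces a near-ear signal identical to the input and a far-ear signal that arrives later and at half the amplitude, which is the combination of ITD and ILD cues described above in its crudest form.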
Product for improvisation (Musical practice/education)
With the popularization of VR, players can do increasingly remarkable things on this platform, which points to a large potential market. Allowing users to play their own MIDI instruments takes this project beyond a simple VR experiment. Currently, users can gain an interactive instrumental experience by playing a MIDI keyboard within the project. It is therefore possible to develop this project into a product for improving musical skill. The realistic sense of presence that VR equipment brings would be helpful for a musician who wants to improve his or her live playing. The project also makes it easier to play with others and can build the user’s motivation to engage with existing works in a way that would be impossible in real life.
(First attempt at the virtual ensemble)