The original concept was to create “epic orchestral arrangements” and write sheet music for school concert bands, to draw more people into the world of orchestral music and let them experience how cool it can be. Compared to what the project has evolved into today, that concept was a lot less visually and technologically appealing.
The project as it stands today consists of three main elements: the music, which can be orchestral or any other genre; an interactive environment that can generate visuals based on the music; and a surround audio processing Max/MSP patch using Spat5, which is an object-based audio spatializer that can output to an arbitrary speaker arrangement or headphones.
THE UNITY ENGINE ENVIRONMENT
Due to the 2020 pandemic, it was possible that we wouldn’t be able to use A004 as the venue to host pre-MIM. This seemed like a very good excuse to use the Unity game engine, which I had always wanted to do: I could try to build something that would let us host the event virtually and experiment with visual elements along the way. Add to that the fact that the project could do double duty for both pre-MIM and my personal FYP, and it seemed like a good idea.
But the direction of the project wasn’t fully realised until I randomly stumbled upon a Microsoft video on YouTube, which led me to research Project Triton. There I learned about its wave-propagation algorithm: it can recreate the realistic acoustic response of a virtual environment in real time, it’s free, and it has an integration package for the Unity engine.
Up until this point, I had been thinking about different ways to combine a virtual environment with music; there were too many possibilities to choose from, and I needed some constraints. Seeing Project Triton gave me an idea: it works best when the environment has boundaries and clutter to absorb and reflect sound, which limited my options to either open ground or a room. I also remembered learning that a good way to make sci-fi or fantasy relatable is to ground it in real-life elements.
What’s more relatable than seeing A004 stuck inside your computer screen?
So I started 3D modelling in Blender. After a while, a crude reconstruction of A004 was complete, and its acoustic response probes were baked inside Unity with Project Acoustics (Project Triton’s game-engine implementation). I used the lighting plan and floor plan given to us by Allen Sir, together with actual photos of the venue, as references, so the model is reasonably accurate. I didn’t know what to expect from the baking process, so I was quite excited when the end result actually sounded quite a bit like the real thing.
At this point, I decided to move on to the audio side of things.
MAX/MSP SPAT5 SPATIALIZATION
Spat5 is a French object-based audio spatializer that can take an arbitrary number of input sources, render them dynamically in a virtual space (e.g. move them around on all axes, make them sound far away or very close), and output the result to an arbitrary speaker setup. It is crucial to give Spat5 the exact location of the speakers, down to the centimetre, so it can calibrate the volume and delay of each speaker. When it works, the speakers become transparent: the sound seems to come from behind, in front of, or in between the speakers. Or at least that’s what’s supposed to happen; I didn’t test it in real life.
I do have a virtual A004 with a total of 29 speakers though.
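To give a rough sense of why the speaker positions matter so much, here is a back-of-the-envelope example of my own (not figures taken from Spat): sound travels at roughly 343 m/s, so a speaker 10 m from the listening position is heard about 10 / 343 ≈ 29 ms after the sound leaves it, versus about 3 ms for a speaker 1 m away, a gap of roughly 26 ms, and by the inverse-distance law it also arrives about 20·log10(10 / 1) = 20 dB quieter before any room reflections. Spat uses the measured distances to apply a compensating delay and gain to each speaker so that they all appear to arrive together at the same level.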
I first confirmed that Project Acoustics really does apply a distance-based audio delay by placing two identical drum audio objects at opposite ends of the room. I then wrote a script that goes through each speaker and gives me its distance from the world origin, measured each speaker’s azimuth and elevation in Blender, and put these values into Spat to recreate the speaker setup. After that, I experimented with different parameters inside Spat, such as audio source distance, room presence, running reverberance, and reverb room size, to hear what they sound like.
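In the project the script only reported the distances (the angles came from Blender), but the same loop could compute all three values. Here is a minimal sketch of the idea in Unity C#; the ‘Speaker’ tag and the class name are my own assumptions for illustration, not the actual script:

using UnityEngine;

// Hypothetical sketch: logs each tagged speaker's distance, azimuth and elevation
// relative to the world origin so the values can be copied into Spat5.
public class SpeakerMeasure : MonoBehaviour
{
    void Start()
    {
        foreach (GameObject speaker in GameObject.FindGameObjectsWithTag("Speaker"))
        {
            Vector3 p = speaker.transform.position;

            float distance = p.magnitude;                           // metres from the world origin
            float azimuth = Mathf.Atan2(p.x, p.z) * Mathf.Rad2Deg;  // 0 deg = straight ahead (+z), positive to the right
            float horizontal = new Vector2(p.x, p.z).magnitude;
            float elevation = Mathf.Atan2(p.y, horizontal) * Mathf.Rad2Deg; // positive = above listener height

            Debug.Log($"{speaker.name}: dist {distance:F2} m, azim {azimuth:F1} deg, elev {elevation:F1} deg");
        }
    }
}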
The original plan was to create a binaural mix using Waves Abbey Road Studio 3 in Logic, use the ‘transaural’ function in Spat to decode it into a 29-channel signal, and place each channel of audio individually at each speaker object inside Unity’s A004. It then turned out that the ‘transaural’ function only supports decoding to two speakers. In the process, I discovered that rendering a 29-channel signal from simple stereo or 5.0 sources treated as object sources might even be better.
Since Unity doesn’t yet allow multichannel audio inputs, iterating on different parameters is painful: a 29-channel file has to be recorded in Max, broken up into separate files (I used Audacity for this), and imported into the Unity scene.
After a while, I decided on the following setup: place the stereo signal as two sources at plus and minus 90 degrees azimuth, each 1 metre away from the listening position, each with a running reverberance of ‘35’. This saves me the trouble of remixing every song into 5.0 while also widening the stereo image, making the listener feel like they are lying inside the original mix. Although there is definitely still more room for experimentation, this is what I settled on due to time and sanity constraints.
You might ask why I didn’t just place the stereo source directly as objects in Unity. That would mean the sound comes from only two specific points, and those points would have to sit very close to the camera object and follow it around, or everything would sound distant and very reverberant. It might be an interesting option, but I like the idea that one can walk up to a speaker and hear sound actually coming out of it. I’m also curious what 29 speakers working in unison sound like, and this approach paves the way for the actual live workflow if there’s ever a chance to use the real A004.
Headphones are necessary for the binaural output. Surround speaker output is available through Project Acoustics, but the experience can be inconsistent across different setups, and very few people have one anyway, so I focused purely on headphones.
LIGHTING AND VISUALS
I was never really experienced in designing lighting or live visuals, so the first thing that came to mind was that everything had to be generated; I wasn’t going to hand-animate every light’s angle, brightness, and colour. It would take too long, and it would all just be busywork if not done with style.
So I came up with a few control schemes that I think would work.
One idea is to group banks of lights together and control them with a master script that takes one set of inputs and translates it into lighting controls. For example, one could steer a bank of lights with a video game controller as if flying a plane: pitch tilts the bank up and down, roll is literal roll, the accelerator controls how “spread open” the bank is, like a wing, and a further control makes their movements “cascade”.
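This controller scheme never made it into the project, but a minimal Unity C# sketch of the mapping could look something like the following (the LightBank class, the custom ‘Accelerator’ axis, and the wing-style spread are all assumptions for illustration):

using UnityEngine;

// Hypothetical sketch of the "fly a bank of lights like a plane" idea.
// Assumes the bank's fixtures are children of this object, laid out in a row.
public class LightBank : MonoBehaviour
{
    public float tiltSpeed = 60f;   // degrees per second for pitch and roll
    public float maxSpread = 45f;   // how far the outer fixtures can fan out, in degrees

    void Update()
    {
        // Pitch and roll of the whole bank from the controller sticks (default Unity axes).
        float pitch = Input.GetAxis("Vertical");
        float roll  = Input.GetAxis("Horizontal");
        transform.Rotate(pitch * tiltSpeed * Time.deltaTime,
                         0f,
                         roll * tiltSpeed * Time.deltaTime,
                         Space.Self);

        // The "accelerator" (a custom axis defined in the Input Manager) fans the
        // individual fixtures open like a wing.
        float spread = Input.GetAxis("Accelerator") * maxSpread;
        int count = transform.childCount;
        for (int i = 0; i < count; i++)
        {
            // Map the child index to -1..+1 across the bank, so the outermost lights fan out the most.
            float offset = count > 1 ? (i / (float)(count - 1)) * 2f - 1f : 0f;
            transform.GetChild(i).localRotation = Quaternion.Euler(0f, 0f, offset * spread);
        }
    }
}

The cascade could then be layered on top by delaying each fixture’s copy of these values by a few frames, which is the same trick the Bloc controller described below uses for intensity.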
Another idea is to automatically generate light intensity and movement from the music itself, using the amplitude and pitch information. 
Currently, only the light intensity generation and the cascade function are implemented; unfortunately, there wasn’t enough time to experiment further before the FYP deadline, and the design had to be finalized. Depending on whether or not we get to use the real A004, further lighting design might take place either in Unity or in Max/MSP through a MIDI-to-DMX converter.
The way it works currently is as follows:
The lighting system is divided into six separate Blocs: three for low/mid/high drums and three for harmony 1, harmony 2, and lead. Each Bloc gets its own audio track (prepared separately by soloing and bouncing the instruments that should trigger that Bloc of lights; in other words, one can design the lighting response by choosing which instruments correspond to which Bloc). A BlocController script reads the amplitude information from the track to calculate the target light intensity. Since directly applying the target to the lights would result in flashy strobing, not suitable for a classy orchestral atmosphere, a buffer is added that makes the light fade away slowly at first and then lose brightness faster and faster over time, unless the target brightness rises above it. The script can also perform a cascade function, which creates the “ripple” pattern at a definable speed by pushing the buffer values onto the front of a list and having different lights read from different positions in the list.
Every Bloc controller uses an instance of the same script; each instance gets its own variables and automatically fetches the correct track and settings for light intensity generation.
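A minimal sketch of that logic in Unity C# might look like the following. This is not the project’s actual BlocController; the class name, the numbers, and the way amplitude is estimated are assumptions, and the fade is deliberately stepped per frame, which is why it depends on a stable frame rate (more on that under “Where it fell flat”):

using System.Collections.Generic;
using UnityEngine;

// Hypothetical Bloc controller sketch: follows the amplitude of its own audio track,
// fades out with increasing speed, and ripples the value across its lights.
[RequireComponent(typeof(AudioSource))]
public class BlocControllerSketch : MonoBehaviour
{
    public Light[] lights;                    // the lights belonging to this Bloc
    public float maxIntensity = 8f;
    public float fadeAcceleration = 0.0005f;  // per-frame increase of the fall speed (tuned for ~60 FPS)
    public int cascadeOffset = 3;             // frames of delay between neighbouring lights

    private AudioSource source;
    private float buffer;                     // current smoothed brightness, 0..1
    private float fadeRate;                   // grows every frame while fading
    private readonly List<float> history = new List<float>();
    private readonly float[] samples = new float[256];

    void Start()
    {
        source = GetComponent<AudioSource>();
    }

    void Update()
    {
        // Rough amplitude of this Bloc's track right now (RMS of the latest output samples).
        source.GetOutputData(samples, 0);
        float sum = 0f;
        foreach (float s in samples) sum += s * s;
        float target = Mathf.Clamp01(Mathf.Sqrt(sum / samples.Length) * 4f);

        if (target >= buffer)
        {
            buffer = target;    // jump up immediately on a new hit
            fadeRate = 0f;
        }
        else
        {
            // Fade that starts slow and falls faster and faster, until a louder target overrides it.
            fadeRate += fadeAcceleration;
            buffer = Mathf.Max(target, buffer - fadeRate);
        }

        // Cascade: the newest value goes to the front of the list,
        // and each light reads from further back, producing a ripple.
        history.Insert(0, buffer);
        if (history.Count > lights.Length * cascadeOffset + 1)
            history.RemoveAt(history.Count - 1);

        for (int i = 0; i < lights.Length; i++)
        {
            int index = Mathf.Min(i * cascadeOffset, history.Count - 1);
            lights[i].intensity = history[index] * maxIntensity;
        }
    }
}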
THE ACTUAL MUSIC
The music is five of the six one-minute piano poem collaborations with Manfred: he gave me his piano MIDI and I arranged it orchestrally, sort of like a one-poem-per-day challenge. I’m still surprised at how much I got done in those six days.
At first it seemed fun, and I was having fun doing it. But as the days progressed, my standards slowly crept up; I wasn’t happy with what I was writing, and I felt terrible. At the time it felt like the first few songs were just flukes and that I was actually very bad at arranging. It slowly got worse and worse, and by day six I wasn’t enjoying it anymore, so we stopped. The last two songs took a long time because of that, and I was nearly burned out.
Looking back after a while, I realised that the last two weren’t as bad as I thought; the first few songs looked sketchy in comparison. But that doesn’t change how unenjoyable the process was at the time. I think I might be able to create good music, but because of how much I disliked the process, I might not do it often enough. I need to find a state of mind that lets me write music without panic and distress.
I only included the first five poems in the FYP; the sixth seemed redundant, I wasn’t having fun writing it, and that’s probably audible in some way. But I’ll link to it anyway.
I also added an easter egg, as a joke and a test vehicle; it’s not part of the FYP.
The individual music files outside the Unity environment are likewise mixed with headphones as the priority.
WHERE IT FELL FLAT
There are no secondary light reflections; ray tracing would provide better visual fidelity, although at considerable GPU cost.
The current lighting generation relies on stems taken directly from the project mix; sometimes the generated result can be a bit confusing and looks desynced.
It would be possible to mix tailor-made stems just for lighting to get better results (e.g. a flatter amplitude response for the low and mid frequencies in percussion).
Audio quality in Unity is sub-optimal; further experimentation with Spat may yield better results.
The current lighting calculation relies on a stable 60 FPS frame rate to remain accurate; it would be nice to decouple the calculation from frame time (see the sketch after this list).
Overall performance could be improved by optimizing light ray distance for each light.
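On the frame-rate point above, one way to decouple the fade from frame time is to express the rates per second and scale them by Time.deltaTime each frame, so the curve looks the same at 30, 60, or 144 FPS. The sketch below is illustrative, not the project’s current code:

using UnityEngine;

// Sketch of a frame-rate independent fade: the same accelerating fall-off as the
// Bloc controller sketch above, but with rates defined per second and scaled by
// the actual frame time, so the result no longer depends on a stable 60 FPS.
public class FrameIndependentFade : MonoBehaviour
{
    public float fadeAcceleration = 6f;   // units per second squared
    private float value = 1f;             // brightness being faded, 0..1
    private float fadeRate;               // current fall speed, units per second

    public void Hit()                     // call this when a new musical hit arrives
    {
        value = 1f;
        fadeRate = 0f;
    }

    void Update()
    {
        fadeRate += fadeAcceleration * Time.deltaTime;
        value = Mathf.Max(0f, value - fadeRate * Time.deltaTime);
    }
}

The cascade delay could be decoupled the same way, by storing timestamped values and having each light look up the value from a fixed number of milliseconds ago rather than a fixed number of frames.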

