I recently made this 360 video tour of Salford University’s anechoic chamber. You’ll need headphones, and I think navigating these videos works best on a mobile.
The visuals were shot using a Ricoh Theta and the audio with a soundfield microphone. The Theta camera only records mono sound, and the quality is really poor, so I captured the sound on a better microphone. I thought it would be really easy to change the soundtrack, but I was wrong: editing 360 video with spatial audio is currently not straightforward. So I thought I’d describe how I did it, in case you want to do the same.
Let me know in the comments below if you have any tips on streamlining the process.
Note: when you upload a 360 video to YouTube, it isn’t fully ready even after YouTube says that uploading and processing are done. There is a further processing stage that takes place after the video has gone live. So if you find the video isn’t 360 initially, wait for an hour and try playing it again.
360 video editing
- First turn the video into an equirectangular projection. The Theta comes with an app that will do this for you.
- Edit the video. I used Adobe Premiere CC 2015.2. Import the video and then drop it onto the ‘new item’ icon at the bottom of the window. When exporting I used QuickTime with the H.264 codec, matching the resolution, fps etc. to the original.
- Update 9/9/2016 Adobe Premiere CC 2015.3 no longer has the H.264 codec option under QuickTime, so I exported as .mp4 and then used ffmpeg to convert to .mov (see below).
- The ambisonics mic produced 4 mono feeds in B-format: W, X, Y and Z. This needed to be turned into a single track in WYZX Furse-Malham format (note change of letter ordering). I used Reaper Digital Audio Workstation for this following these instructions from Morgan Utting.
- I then edited the audio. You’ll need to think about how to sync the audio and video when you piece them back together. My video featured a balloon pop, which I used as the sync point.
- I used Bruce Wiggins’ plug-in to export to ACN SN3D format (again using Reaper). If you’ve not edited multi-channel audio before, Bruce has some useful instructions and videos.
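To give a feel for what the format change in the last two steps involves, here is a minimal Python sketch of a first-order conversion from Furse-Malham to ACN SN3D. It is not what the plug-in does internally, just an illustration under stated assumptions: the input frames are ordered W, X, Y, Z (the order the mic feeds are listed in) with Furse-Malham weighting, and the function name is my own invention.

```python
import math


def fuma_to_acn_sn3d_first_order(frames):
    """Convert first-order B-format frames to ACN channel order with SN3D weighting.

    Assumption: each input frame is a (W, X, Y, Z) tuple with Furse-Malham
    weighting, where W is stored 3 dB down (scaled by 1/sqrt(2)).
    """
    w_gain = math.sqrt(2.0)  # undo the Furse-Malham attenuation on W
    converted = []
    for w, x, y, z in frames:
        # ACN order for first order is W, Y, Z, X; at first order the
        # X, Y, Z gains are the same in both conventions.
        converted.append((w * w_gain, y, z, x))
    return converted


# One illustrative frame of samples (made-up values, not from my recording).
example_frames = [(0.5, 0.1, 0.2, 0.3)]
print(fuma_to_acn_sn3d_first_order(example_frames))
```

In practice I’d still let Reaper and the plug-in do this on the real audio; the sketch is only meant to show why the letter ordering changes.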
Replacing 360 soundtrack with Ambisonics
For this I used FFmpeg, a tool that deals with audio and visual conversion. It is free to download, and then you need to open a command window to run it. In this case, the command to take video from one file and add audio from another is:
ffmpeg -i A.mov -i B.wav -channel_layout 4.0 -c:v copy -c:a copy C.mov
Where A.mov was the movie exported from Premiere, B.wav was the ambisonics soundtrack exported from Reaper and C.mov the output filename.
Update 9/9/2016. YouTube wants AAC audio with .mp4 files, but I didn’t have software to make 4-channel AAC. But you can use .wav with .mov. Therefore I now use ffmpeg to convert from .mp4 to .mov while adding the audio.
ffmpeg -i A.mp4 -i B.wav -channel_layout 4.0 -c:v copy -c:a copy -f mov C.mov
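If you have several clips to process, the command can be wrapped in a small script. This is only a sketch: the function names are hypothetical, the flags are exactly the ones in the command above, and it assumes ffmpeg is installed and on your path.

```python
import subprocess


def build_mux_command(video_path, audio_path, output_path):
    """Build the ffmpeg argument list that copies the video stream from one
    file and the 4-channel audio from another into a .mov container."""
    return [
        "ffmpeg",
        "-i", video_path,          # video exported from Premiere
        "-i", audio_path,          # ambisonic .wav exported from Reaper
        "-channel_layout", "4.0",
        "-c:v", "copy",            # copy video without re-encoding
        "-c:a", "copy",            # copy audio without re-encoding
        "-f", "mov",
        output_path,
    ]


def mux(video_path, audio_path, output_path):
    """Run ffmpeg; raises CalledProcessError if ffmpeg reports a failure."""
    subprocess.run(build_mux_command(video_path, audio_path, output_path), check=True)


if __name__ == "__main__":
    # Print the command rather than running it, so it can be checked first.
    print(" ".join(build_mux_command("A.mp4", "B.wav", "C.mov")))
```

Calling mux("A.mp4", "B.wav", "C.mov") would then run the same conversion as the command line version.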
Updating the metadata
The metadata on the new movie will be wrong, and so needs changing. YouTube have produced a tool to do this (see https://support.google.com/youtube/answer/6178631). When you run the tool, select Spherical and Spatial Audio. This will produce a new version of the movie with the correct metadata.
Upload to YouTube
And then be patient …
Any tips on how to make this easier?
YouTube Android App
How well does the audio on the app work?
Audio updated 13:25 5/9/2016. Sorry, I’m not sure what I did wrong with the first ones. (These are very slightly hissy because of the digital recorder I used.)
I took the output from my Sony Mobile phone while watching the video on the YouTube app and recorded it. Here are three files to allow you to hear how well YouTube spatial audio works. They have not been compressed, so some are quite large.
The first has the viewing point for my mobile always facing the door of the anechoic chamber, so in this one the talker circles the microphone. The recording ends at the balloon burst. The talker is directly behind for the phrase, ‘sound direct from me to the microphone in the middle’. I hear a curious false localisation for the balloon burst.
The second is the bit of talking after the balloon pop. The talker is stationary in the anechoic chamber. I did a slow, complete circle with my mobile phone (turning anti-clockwise).
The last has the viewing position of the mobile phone set so that the talker is directly behind.
What do you think of the audio quality?