We do that.
There is no standard yet. The Two Big Ears solution is working well, as are others (Two Big Ears offers the only Pro Tools-based workflow I know of so far).
The main thing to understand is that the sound field has to be rotated in real time according to the HMD's sensor input and then dynamically rendered to virtual binaural.
If someone in the video is standing in front of you, their voice has to come from the center. If you turn your head, that voice has to pan dynamically, and so do other positional sounds. Some, however, should stay fixed regardless of head movement (general ambience, music, narration, etc.).
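To make the head-tracking idea concrete, here is a minimal sketch of the rotation step, assuming a first-order Ambisonics (W, X, Y, Z) sound field and yaw-only tracking. The function name, the axis convention (+X front, +Y left, counter-clockwise yaw), and the example values are my own illustrations, not any particular vendor's API; a real renderer would follow this with HRTF convolution to get the binaural output.

```python
import math

def rotate_foa_yaw(w, x, y, z, yaw):
    """Rotate a first-order Ambisonic (W, X, Y, Z) sound field about the
    vertical axis by `yaw` radians, e.g. to counteract a head turn.
    Assumed convention: +X is front, +Y is left, positive yaw is
    counter-clockwise seen from above."""
    x_r = math.cos(yaw) * x - math.sin(yaw) * y
    y_r = math.sin(yaw) * x + math.cos(yaw) * y
    # W (omnidirectional) and Z (height) are unaffected by a yaw rotation.
    return w, x_r, y_r, z

# Per audio block: read the HMD yaw, rotate the field, then hand the
# rotated channels to the binaural renderer.
w, x, y, z = 0.707, 1.0, 0.0, 0.0            # a source encoded straight ahead
w, x, y, z = rotate_foa_yaw(w, x, y, z, math.radians(90))
# After a 90-degree rotation the source energy moves from X (front) to Y (side).
```

Head-locked elements (music, narration) simply skip this rotation and get mixed in after the binaural render.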
For this functionality to work, the corresponding proprietary technology has to be integrated (via its API) into the player/app.
So if your client has their own VR ecosystem in place, you can work with whatever 3D sound API is built into it.
If they don't, just stick to good old stereo. The same goes if they intend to publish their 360 videos on Facebook/YouTube.
As for Unity, it's one of the engines used most for VR experiences; quite a lot of them are actually Unity-based apps. So you can use whatever API is available as a Unity plugin (which covers almost all of them).
However, for pure 360/spherical videos, Unity is not a great player choice. I know because I built one myself in Unity for testing purposes.
As you can see, with spherical videos the main problem at this point is actually distribution.
Still, it's a super interesting topic that is changing on a daily basis.
Last edited by kosmokrator; 9th February 2016 at 09:21 PM..