"Stems" - is the word now diluted also in post?
Old 18th July 2022 | Show parent
  #31
Gear Guru
 
Brent Hahn's Avatar
 
1 Review written
🎧 15 years
Quote:
Originally Posted by mattiasnyc ➡️
The thread was about what people call it in post production though - i.e. sound to picture mostly.
The guy I was referring to has been a TV music supervisor since back before that job even had a title.
Old 11th August 2022
  #32
Here for the gear
 
Quote:
Originally Posted by mattiasnyc ➡️
Just wondering because I just had to explain to an online editor (!) what the difference was between "mixes" that he should use to create broadcast masters and "stems".. and it kind of surprised me to see that at his level.

In your experience, has there been a 'creep' in the definition of "stems" also in post, to where they just basically mean "the audio files"?
In Discovery channel’s tech requirements doc they call these “wave files” while internally they are referred to as “stems”. Before broadcast companies went all digital, the stems were written to a CD which was delivered with the physical tape.

I read the entire thread and felt I could elaborate particularly from the video world perspective. Below are additional tidbits that could be helpful in the world of stems—or mixes or whatever they’re called.

Since many production companies could not afford VTRs with more than 4 channels of audio, it used to be standard for a production company to deliver a tape with only 4 tracks. For instance the old Discovery 4 track requirement: (1) Full Mix L (2) Full Mix R (3) Dialogue & Narration Mono and (4) Music and Effects Mono. The rest of the audio files required for an expanded audio configuration were sent separately.

The tracks or stems are referred to as follows: Full Mix, Mix Minus (or Mix Minus Narr), Music & Effects, Music, Effects, Dialogue or SOT, Narration or VO. Some use Sfx instead of Efx; it’s the same thing. Some may split Efx and Ambience, one being added sound effects and the other nat sound of sorts. I have seen Dialogue called Nat Sound, but that’s generally confusing. I’ve seen Ambience called SOT. That’s confusing too. What we call in the US “Music & Effects,” the Brits call “Mix Minus.” (Some British companies rename their tracks for the US to prevent confusion, but you never know what you’re going to get.) Then there are the surround tracks, and yes, they usually can be delivered as stems.

I can’t imagine that broadcast companies are requiring production houses to send all 12 tracks. They recognize limitations, and sometimes they buy legacy programming which has older configurations.

Should you send a combined stereo track? Usually you don’t have to. For example, you can send a single Music track that contains both L/R or send each separately. Naturally, editors at the receiving end prefer that you send files with stereo pairs just because they are easier to load and line up, but when the stems are loaded into the editing software, they are split to two tracks, as if they are two distinct mono tracks.
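As a rough illustration of the split described above, here is a minimal Python sketch that deinterleaves a stereo sample stream into two mono tracks. The function name and the toy sample values are hypothetical; real editing software does this internally on import.

```python
def split_stereo(interleaved):
    """Split interleaved stereo samples [L0, R0, L1, R1, ...] into two mono lists."""
    if len(interleaved) % 2 != 0:
        raise ValueError("interleaved stereo data must hold an even number of samples")
    left = interleaved[0::2]   # even positions carry the left channel
    right = interleaved[1::2]  # odd positions carry the right channel
    return left, right


# Toy data standing in for a stereo Music stem (not real audio).
music_stereo = [0.10, -0.10, 0.20, -0.20, 0.30, -0.30]
music_left, music_right = split_stereo(music_stereo)
```

Either way the editor ends up with two mono tracks, which is why a stereo pair and two separate mono files are interchangeable at the receiving end.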

While the required video start time is typically 00:59:40;00 for 59.94i (or 10:59:40:00 for 50i), the same is often not required of the audio. However, to make a video editor’s life easy, provide the same lead time so that the editor doesn’t have to line up your tracks in the editing software.
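To put a number on that lead time, here is a minimal Python sketch that converts drop-frame timecode to a frame count and then to audio samples. It assumes the program itself starts at 01:00:00;00 and that the audio is 48 kHz; neither is stated above, and the function names are made up for this example.

```python
SAMPLE_RATE = 48_000  # assumed audio sample rate in Hz


def drop_frame_to_frames(timecode):
    """Convert 29.97 fps drop-frame timecode 'HH:MM:SS;FF' to a frame count."""
    hh, mm, rest = timecode.split(":")
    ss, ff = rest.split(";")
    hh, mm, ss, ff = int(hh), int(mm), int(ss), int(ff)
    total_minutes = 60 * hh + mm
    # Drop-frame skips frame numbers 00 and 01 every minute, except each tenth minute.
    dropped = 2 * (total_minutes - total_minutes // 10)
    return (3600 * hh + 60 * mm + ss) * 30 + ff - dropped


def frames_to_samples(frames, sample_rate=SAMPLE_RATE):
    """Convert frames at 30000/1001 fps to a whole number of audio samples."""
    return round(frames * sample_rate * 1001 / 30000)


file_start = drop_frame_to_frames("00:59:40;00")
program_start = drop_frame_to_frames("01:00:00;00")  # assumed first frame of program
lead_frames = program_start - file_start
lead_samples = frames_to_samples(lead_frames)
```

Under those assumptions the lead works out to 600 frames, or 960,960 samples of pre-roll that each audio file would need in order to line up with the video without manual slipping.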

As for the pan, it doesn’t matter. The way editors work when they have additional tracks is to use “direct out.” For instance, in today’s digital world, an editor will set Avid’s Media Composer so that all the odd tracks are output to the left channel and all the even tracks to the right (which is the default anyway). That’s for practical purposes: an editor can solo tracks to listen to the audio on specific tracks and check for quality control. All these tracks are not meant to be listened to at the same time.
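The odd/even rule is simple enough to sketch in a few lines of Python. This is only an illustration of the routing logic, not Media Composer's actual API, and the track names below are a hypothetical layout.

```python
def direct_out_channel(track_number):
    """Return the output channel for a 1-based track number: odd -> L, even -> R."""
    if track_number < 1:
        raise ValueError("track numbers start at 1")
    return "L" if track_number % 2 == 1 else "R"


# Hypothetical track layout, for illustration only.
track_names = [
    "Full Mix L", "Full Mix R",
    "Mix Minus L", "Mix Minus R",
    "Music L", "Music R",
]

routing = {
    number: (name, direct_out_channel(number))
    for number, name in enumerate(track_names, start=1)
}
```

Note that the routing depends only on the track number, so whatever pan was set on an individual stem plays no part in where it ends up.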

To be clear, if you are working on a project from scratch where you have your footage, B-roll, music, effects, narration and you’re mixing it down to full mix stereo L/R, the pan is important, whether you’re working in 2 channels or 5. But that’s not the case when you have a multi-track situation where the Full Mix has been mixed down from your original edit and then you add the various audio tracks, i.e. all the elements from which you created the full mix. In this situation, editors work in the “direct out” setting, where panning has no effect.

Then there’s the bleeping. In the past everything was bleeped. Then the rules changed: only the full mix was to be bleeped. When programs are broadcast around the world, additional bleeps are added to the Full Mix and other tracks based on regional requirements. In the streaming world, the rules changed again. The deliverables should have no bleeps, but only in some instances. I’m certain the rules will change again. If the requirement is to place the bleeps on “all tracks,” that means only on the tracks where the profanity appears. Some production companies think that a bleep is a sound effect and they will place it on the Music & Efx track. Yikes! That’s nuts! Some production companies cut out the audio instead of adding a bleep. That’s frowned upon because it is interpreted as an audio dropout. Yes, the bleeps are then inserted into those gaps before the program is aired abroad. Some producers add a bleep only for effect, where originally no cuss words were spoken. In this case, the bleep is sometimes removed when the program is going to be aired abroad. Most of the world has very strict rules regarding profanity, except for Australia, New Zealand, and parts of Europe.

Another thing worth noting: while the LKFS requirement for the Full Mix is -24 and the max peak is -2, always go a bit lower than -2, e.g. -2.5 or -3, because different equipment measures peaks slightly differently, and a small rounding difference could result in the -2 being read as -1.99, and your program could be kicked back.
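Here is a small Python sketch of why that margin matters. The 0.01 dB meter discrepancy is a made-up figure for illustration, and the function name is hypothetical; the only numbers taken from above are the -2 limit and the -2.5 target.

```python
PEAK_LIMIT_DB = -2.0  # max peak from the spec described above


def passes_peak_qc(target_peak_db, meter_offset_db, limit_db=PEAK_LIMIT_DB):
    """Return True if the peak, as read by the receiving meter, is within the limit.

    meter_offset_db models how much higher (positive) or lower (negative)
    the receiving meter reads compared with the meter used during the mix.
    """
    measured_db = target_peak_db + meter_offset_db
    return measured_db <= limit_db


# Mixed right up to the limit: a hypothetical 0.01 dB discrepancy reads as -1.99.
at_the_limit = passes_peak_qc(-2.0, 0.01)

# Mixed with half a dB of margin: the same discrepancy reads as -2.49.
with_margin = passes_peak_qc(-2.5, 0.01)
```

With these numbers the first check fails and the second passes, which is the whole argument for leaving headroom below the published limit.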

These audio track combinations are typically used for broadcasting in various languages. Some regions may use only the music and effects then record new audio for the dialogue and narration (typically used for reality shows). Sometimes the narration is recorded in a different language while the dialogue is kept very low and overdubbed in that language (typically used in documentaries). I have watched a lot of Curiosity Stream documentaries in which the music and effects is turned really low while the narrator and the overdubbing of the dialogue is too high. It is clear to me that the post house that did that work kept the entire M&E low, then added the dialogue and narration. It is either incompetence or laziness—more likely incompetence. Since many are German documentaries, most likely the deliverables were high quality (as has been my experience with German video products). I have seen crappy stems where it’s clear to me that the production house has no understanding of standards or what the stems are for. Knowing what will happen to your program helps in understanding how to create quality stems.

I hope these bits of info are helpful.
Old 11th August 2022 | Show parent
  #33
Lives for gear
 
TVPostSound's Avatar
 
🎧 10 years
Quote:
Originally Posted by GregariousOne ➡️
Since many production companies could not afford VTRs with more than 4 channels of audio, it used to be standard for a production company to deliver a tape with only 4 tracks. For instance the old Discovery 4 track requirement: (1) Full Mix L (2) Full Mix R (3) Dialogue & Narration Mono and (4) Music and Effects Mono. The rest of the audio files required for an expanded audio configuration were sent separately.

The tracks or stems are referred to as follows: Full Mix, Mix Minus (or Mix Minus Narr), Music & Effects, Music, Effects, Dialogue or SOT, Narration or VO. Some use Sfx instead of Efx; it’s the same thing. Some may split Efx and Ambience, one being added sound effects and the other nat sound of sorts. I have seen Dialogue called Nat Sound, but that’s generally confusing. I’ve seen Ambience called SOT. That’s confusing too. What we call in the US “Music & Effects,” the Brits call “Mix Minus.” (Some British companies rename their tracks for the US to prevent confusion, but you never know what you’re going to get.) Then there are the surround tracks, and yes, they usually can be delivered as stems.

I can’t imagine that broadcast companies are requiring production houses to send all 12 tracks. They recognize limitations, and sometimes they buy legacy programming which has older configurations.
In the "old days" we used to send a "CRAM" Consolidated Recorded Audio Master. 22 tracks of various stems on a 2" audio tape. Along with the 1" VTR Master.
Old 11th August 2022 | Show parent
  #34
Gear Guru
 
🎧 15 years
Quote:
Originally Posted by GregariousOne ➡️
In Discovery channel’s tech requirements doc they call these “wave files” while internally they are referred to as “stems”.
Well, not really. Almost all stems these days are wav files, not all wav files are stems. That's my point and my question was quite narrow (though I surely appreciate your input).

Discovery Global actually writes that "audio deliverables include a full program mix and up to nine different audio stems and submixes."

So they do make a clear distinction between mixes, submixes and stems.

Quote:
Originally Posted by TVPostSound ➡️
In the "old days" we used to send a "CRAM" Consolidated Recorded Audio Master. 22 tracks of various stems on a 2" audio tape. Along with the 1" VTR Master.

https://www.youtube.com/watch?v=ue7wM0QC5LE&t=1s
Old 11th August 2022 | Show parent
  #35
Here for the gear
 
What Discovery’s specs mean by the stems and submixes is unnecessarily confusing. Stems are music, effects, narration, and dialogue. A Mix Minus is a submix because it is a mix of music, effects, and dialogue. A Music & Effects is a submix because it is music and effects.
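One way to picture the distinction is a minimal Python sketch in which a submix is just the sum of a chosen set of stems. The sample values are toy numbers, the names are hypothetical, and treating the Full Mix as the sum of all four stems is an assumption for this example.

```python
# The four stems, with toy sample values (not real audio).
stems = {
    "music":     [1.0, 2.0],
    "effects":   [0.5, 0.5],
    "dialogue":  [0.25, 0.0],
    "narration": [0.0, 0.125],
}

# Which stems go into each mix, following the definitions above.
MIX_COMPONENTS = {
    "Music & Effects": {"music", "effects"},
    "Mix Minus": {"music", "effects", "dialogue"},
    "Full Mix": {"music", "effects", "dialogue", "narration"},  # assumed
}


def build_mix(stem_data, components):
    """Sum the named stems sample by sample to produce one mixed track."""
    chosen = [stem_data[name] for name in sorted(components)]
    return [sum(samples) for samples in zip(*chosen)]


music_and_effects = build_mix(stems, MIX_COMPONENTS["Music & Effects"])
mix_minus = build_mix(stems, MIX_COMPONENTS["Mix Minus"])
full_mix = build_mix(stems, MIX_COMPONENTS["Full Mix"])
```

Seen this way, playing a Full Mix together with its stems doubles up every element, which is why lumping everything under one name invites mistakes.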

The tech specification doc is heavy on the engineering side and doesn’t necessarily reflect the everyday terminology. For the departments that handle those audio files, they are all called “stems” without any differentiation between “stems” and “submixes.” In other words, if you asked a Discovery editor, “Are all the submixes available with this program?” He’d give you a funny look and might say, “What the hell are you talking about??”
Old 12th August 2022 | Show parent
  #36
Gear Guru
 
🎧 15 years
Quote:
Originally Posted by GregariousOne ➡️
What Discovery’s specs mean by the stems and submixes is unnecessarily confusing. Stems are music, effects, narration, and dialogue. A Mix Minus is a submix because it is a mix of music, effects, and dialogue. A Music & Effects is a submix because it is music and effects.
I don't find that confusing at all actually.

Quote:
Originally Posted by GregariousOne ➡️
The tech specification doc is heavy on the engineering side and doesn’t necessarily reflect the everyday terminology. For the departments that handle those audio files, they are all called “stems” without any differentiation between “stems” and “submixes.” In other words, if you asked a Discovery editor, “Are all the submixes available with this program?” He’d give you a funny look and might say, “What the hell are you talking about??”
I actually don't recall a single person I've ever worked with referring to something as a "submix".

But the problem isn't that, the problem is with referring to the entire set of audio deliverables as just "stems" including mixes, be they submixes or the full mix.

As was already mentioned, when editors and others can't understand the difference between a mix and a stem we're bound to have problems, such as dumping everything into an NLE and playing it all back at the same time to the main stereo mix output.
Old 12th August 2022 | Show parent
  #37
Here for the gear
 
Quote:
Originally Posted by mattiasnyc ➡️
I don't find that confusing at all actually.

I actually don't recall a single person I've ever worked with referring to something as a "submix".

But the problem isn't that, the problem is with referring to the entire set of audio deliverables as just "stems" including mixes, be they submixes or the full mix.

As was already mentioned, when editors and others can't understand the difference between a mix and a stem we're bound to have problems, such as dumping everything into an NLE and playing it all back at the same time to the main stereo mix output.
Not disagreeing with you. As a techie, I like accuracy. The world of buying & selling programs worldwide in a standardized format is unknown to those who never deal with the industry giants and hard to grasp without hands-on experience. It is not taught in schools and it is foreign to many instructors. It is scary what the young ones don’t know about the industry. I’ve trained editors in this biz. Young hopefuls think the video biz is about creativity and creating and editing their own hit show. But this part of the business is huge because local markets (in the respective countries) aren’t big enough to pay for the high cost of production, and there is a huge appetite to buy and sell globally. For sure, there is a huge appetite to consume what is produced in the US & Canada.
Old 12th August 2022 | Show parent
  #38
Lives for gear
 
🎧 15 years
Quote:
Originally Posted by GregariousOne ➡️

I can’t imagine that broadcast companies are requiring production houses to send all 12 tracks. They recognize limitations, and sometimes they buy legacy programming which has older configurations.
They do. Don't confuse acquisition products with original programming.
If an acquisition department buys a program, sometimes they just take what they get because they really want that program.
Original programming commissioned by the same broadcaster requires adhering to their technical specifications, and if that includes delivering 12 tracks or more, that's a mandatory requirement.

Quote:
These audio track combinations are typically used for broadcasting in various languages. Some regions may use only the music and effects then record new audio for the dialogue and narration (typically used for reality shows). Sometimes the narration is recorded in a different language while the dialogue is kept very low and overdubbed in that language (typically used in documentaries). I have watched a lot of Curiosity Stream documentaries in which the music and effects is turned really low while the narrator and the overdubbing of the dialogue is too high. It is clear to me that the post house that did that work kept the entire M&E low, then added the dialogue and narration. It is either incompetence or laziness—more likely incompetence. Since many are German documentaries, most likely the deliverables were high quality (as has been my experience with German video products). I have seen crappy stems where it’s clear to me that the production house has no understanding of standards or what the stems are for. Knowing what will happen to your program helps in understanding how to create quality stems.
Don't jump to conclusions like 'due to incompetence' too fast.
You never know what the post house had to deal with.
Very often you only get a mix minus to dub to foreign languages. If you get DME stems, these are often corrupt in one way or another (dipped stems, for example).
If you only have the mix minus, you have to ride it down during dubbed interviews, resulting in double-dipping of music and effects since they were already dipped under the interview in the original language mix.
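The double-dipping is easy to see with a little dB arithmetic, sketched here in Python. The -12 dB figures are hypothetical, chosen only to show that gain changes in dB add up when they are applied in series.

```python
def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10 ** (gain_db / 20)


def combined_gain_db(*gains_db):
    """Gains applied in series add in dB (and multiply as linear factors)."""
    return sum(gains_db)


ORIGINAL_DIP_DB = -12.0  # hypothetical dip already baked into the mix minus
DUB_RIDE_DB = -12.0      # hypothetical ride-down applied under the dubbed interview

# Only a mix minus available: the new ride lands on top of the baked-in dip.
with_mix_minus = combined_gain_db(ORIGINAL_DIP_DB, DUB_RIDE_DB)

# Clean, undipped stems available: music and effects are dipped only once.
with_clean_stems = combined_gain_db(DUB_RIDE_DB)
```

With these example numbers the music and effects end up 24 dB down instead of 12 dB, which would explain an M&E bed that sounds far too low under dubbed dialogue.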

And narration and interviews are king in the documentary world. You never get feedback that these are too low in the mix from stakeholders or consumers. Only ever that music or effects are too loud in comparison.
Old 12th August 2022 | Show parent
  #39
Here for the gear
 
Quote:
Originally Posted by kosmokrator ➡️


Don't jump to conclusions like 'due to incompetence' too fast.
You never know what the post house had to deal with.
Very often you only get a mix minus to dub to foreign languages. If you get DME stems, these are often corrupt in one way or another (dipped stems, for example).
If you only have the mix minus, you have to ride it down during dubbed interviews, resulting in double-dipping of music and effects since they were already dipped under the interview in the original language mix.

And narration and interviews are king in the documentary world. You never get feedback that these are too low in the mix from stakeholders or consumers. Only ever that music or effects are too loud in comparison.

I'd like to clarify. Since I was referencing German documentaries, I have yet to see crappy work coming out of Germany, which is why I was guessing incompetence. Same for the UK. They have VERY high standards. Australia comes close. As for the rest of the world, including Canada and the US, there's crap galore, and in that case I would never assume incompetence.

Yes, I agree you never get feedback. It is one of my pet peeves that there isn’t enough communication between the production companies who create the original content and the video people at the other end of the biz who repackage the product for global audiences. I don’t think it is the video people’s fault on either side. It’s all the gatekeepers in between, without whom we could all communicate and do a better job.
Old 12th August 2022
  #40
Lives for gear
 
gsilbers's Avatar
 
🎧 15 years
Quote:
Originally Posted by mattiasnyc ➡️
Just wondering because I just had to explain to an online editor (!) what the difference was between "mixes" that he should use to create broadcast masters and "stems".. and it kind of surprised me to see that at his level.

In your experience, has there been a 'creep' in the definition of "stems" also in post, to where they just basically mean "the audio files"?
It might be a specific scenario in which this happened, so who knows.
This is one of my guesses...

In recent years many productions have been shifting the deliverables to the encoding departments, where there might be an IMF deliverable in which the final video has all the audio files in the same package.
And sometimes that includes the international dubbing files, subtitles, etc. Many studios nowadays that deal with both streaming and broadcasting are delivering both the original mix and dubbed audio at the same time to distributors to run the "premieres" in all countries at the same time. So your first run will be global, already translated, all showing at the same time and day (of their timezone, of course).
Maybe they got a working print for translation/dubbing; once the fully filled M&E has been completed, dubbers will turn it around in 48 hours or less.
So the online editor gets your main mix (stereo) and outputs the final master as well as a pseudo textless version, a network version with lower thirds and whatnot, and then gives it to the encoding engineer along with the English/original stems plus the different languages, and they put it all together.
Maybe there are audio guys working there as well, dealing with editing your stems to be frame accurate and so on.
So it might be that some online editors are not really using your stems, or maybe just a couple, and they deliver to either another department on their premises or a different studio altogether that QCs every language and the final mix as well.
These deliverables have been getting more tech dense as they incorporate Atmos, different language audio and subtitle/forced text, versions for different countries, HDR, Dolby Vision, and a hundred file specs with a million metadata fields and naming conventions. So the encoding team is dealing more and more with the final end result that will live in movie studios' archives and broadcast/streaming files.
And there's less and less reason to do it at different stages, like waiting months for dubbers and subtitle people, if the same mega corp that owns broadcasters and streaming services (or has deals with them) is also dealing directly with dubbers around the world. In the past, local broadcasters in each country would order the dubbing, mix or edit it however they wanted, and send it years later to LA for a streaming IMF file, getting tons of QC rejections and not fitting in the IMF package. So your final stems go to an audio guy who works with the online editor, and they deal with packaging and making it pretty for the encoding engineer, who makes the IMF with everything in there; it gets QCed and then delivered.

But that's my guess, along with online editors not being taught some better audio stuff and music at school. Or it's a young assistant.

Last edited by gsilbers; 12th August 2022 at 10:03 PM..
Old 14th August 2022
  #41
Gear Nut
 
🎧 10 years
When I started out the terms were:
Units
Stem
Mix
Printmaster

Now it is quite confused and something like:
Session file
Stem
Stem
Mix (also sometimes called Stems)

Who makes these rules?