Our last technical blog post talked about color management, including considerations for maintaining color accuracy through the post process by keeping displays & projectors calibrated and understanding how each application manages the colors you see. We alluded to the fact that at face value it can seem pretty complicated to manage color between all of the different kinds of camera technologies and technical standards, especially when dealing with multiple delivery specifications.
With this post we want to go into more detail about the issue of color grading for multiple standards, and how the new paradigm for color correction simplifies the process and keeps your grades more futureproof. We’re going to examine a few things today: first, how computers process colors; second, where the problems we’re trying to fix come from; third, what the solutions are; and lastly, how to implement the solutions and why.
In previous posts (here, here, and here) we’ve mentioned the importance of using a color space agnostic color engine when doing specific color correction tasks - something like DaVinci Resolve Studio. As a quick review, a color space agnostic engine is like a glorified R-G-B or R-G-B-Y calculator: It takes a specific set of decoded R-G-B or R-G-B-Y data, applies a specific set of mathematical operations based on the user input, and outputs the new R-G-B or R-G-B-Y data.
Agnostic color engines don’t care about what the data is supposed to mean, instead simply producing the results of an operation. It’s up to the user to know if the results are right, or if the operation has put values out of spec or created unwanted distortions. This is a double-edged sword, since it places far more importance on user understanding to get things right, while being powerful enough to apply its corrections to any combination of custom situations.
As an example of how an agnostic engine works, let’s look at three of the simplest color correction operations: lift, gamma, and gain, operating strictly on the brightness (Y) component of the image.
Lift operates essentially as a global addition value: add or subtract a specific amount to each pixel’s value. Because of the way that traditional EOTFs work and our human perception of brightness changes, lift tends to have the greatest effect on the darks: quickly raising or lowering the blacks while having a much more reduced effect on the mids and lights.
Gain operates essentially as a global multiplication value: multiply or divide the value of each pixel by a specific amount. Since the operation essentially affects all tones within the image evenly, all parts of the image see a similar increase or decrease in brightness, though once again because of the EOTF considerations it has the greatest effect in the brights.
Gamma operates as an exponential value adjustment, affecting the linearity of values between the brights and the darks. Lowering the gamma value has the effect of pulling more of the middle values towards the darks, while raising the gamma value has the effect of pushing the middle values brighter. Once again, it still affects the brights and the darks, but at a much lower rate.
Notice that these operations don’t take into account what the data is supposed to mean. And with new HDR EOTFs, especially with the Perceptual Quantization EOTF, you may find extreme changes across the image with very small values, which is why I recommend adding a roll-off curve as the last adjustment to your HDR grading workflow.
The combination of lift, gamma, and gain allows the colorist to adjust the overall brightness and contrast of the image, with fairly fine granularity and image control.
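As a rough illustration of the three operations described above, here is a minimal Python sketch working on a single normalized brightness value. The function name `lift_gamma_gain` and the exact formulas are assumptions made for illustration; real grading applications define these controls in their own, usually more elaborate, ways.

```python
def lift_gamma_gain(y, lift=0.0, gamma=1.0, gain=1.0):
    """Apply simplified lift, gain, and gamma to a normalized brightness value.

    Illustrative model only: lift is an offset, gain is a multiplier, and
    gamma is an exponent of 1/gamma, so raising gamma brightens the mids.
    """
    y = y + lift                # lift: add or subtract a fixed amount
    y = y * gain                # gain: multiply every value by the same factor
    y = max(y, 0.0)             # guard: a negative base cannot take a fractional power
    return y ** (1.0 / gamma)   # gamma: bend the curve between the dark and bright ends


if __name__ == "__main__":
    for value in (0.1, 0.5, 0.9):
        print(value,
              lift_gamma_gain(value, lift=0.05),
              lift_gamma_gain(value, gain=1.2),
              lift_gamma_gain(value, gamma=1.5))
```

Note that nothing here clamps the result to the 0.0-1.0 range or asks what the number represents: a gain of 1.2 on 0.9 simply returns a value above 1.0, which is exactly the "calculator" behavior of an agnostic engine.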
Compare these functions of an agnostic engine to their equivalents in a color space dependent engine. In a color space dependent engine you’re more likely to find only two adjustments for controlling brightness and contrast: brightness and contrast.
The brightness and contrast controls tend to be far more color space dependent, since they’re designed to affect the image in a way that more evenly affects the brightness or contrast along the expected EOTFs. For the end user, this works as a far simpler and often faster approach for minor corrections, at the expense of power, precision, and adaptability. That hasn’t been too bad of a trade-off, so long as all digital video data operated in the same color space.
But adding support for new color spaces and EOTFs to a brightness and contrast operation requires rewriting the rules for how each of the new color spaces behaves as digital RGB values. That takes time to get right, and oftentimes is not done at all, meaning that color space dependent engines tend to adapt more slowly to emerging standards, and there’s no clear path for how to implement the upgrades.
Every color engine, whether we’re talking about a computer application or a chip found in a camera or display, makes assumptions about how to interpret the operations it's instructed to do. Where the engines lie on the scale from fully color managed to completely color agnostic defines how the operations work, and what effect the ‘same’ color transformation has on the image.
The overall point here is that the same color transformations applied to different color spaces have different effects on the end image. A hue rotation will accomplish something completely different in Rec. 709 than it will in Rec. 2020; standard gain affects HDR curves in ways that are somewhat unpredictable when compared with SDR curves. Color engines can either try to compensate for this, or simply assume the user knows what he or she is doing. And the more assumptions any single operation within an engine makes about the data, the more pronounced the differences if it’s applied to another color space. These seemingly small differences can create massive problems in today’s color correction and color management workflows.
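To make the gain example concrete, the sketch below applies the same 10% gain to a mid-level code value and compares the resulting change in displayed light under a simple power-law SDR curve and under the PQ EOTF. The PQ constants are the published SMPTE ST 2084 values; the 2.4 gamma, the 100 nit SDR peak, and the helper names are assumptions chosen for illustration.

```python
# PQ (SMPTE ST 2084) constants
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32


def pq_eotf(code):
    """Convert a normalized PQ code value (0.0-1.0) to luminance in nits."""
    p = code ** (1.0 / M2)
    return 10000.0 * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1.0 / M1)


def sdr_eotf(code, peak_nits=100.0, gamma=2.4):
    """Simplified SDR display curve: a pure power law (an assumption)."""
    return peak_nits * code ** gamma


def light_ratio(eotf, code, gain):
    """How much brighter the displayed light gets when `gain` hits the code value."""
    return eotf(code * gain) / eotf(code)


sdr_change = light_ratio(sdr_eotf, 0.5, 1.1)
pq_change = light_ratio(pq_eotf, 0.5, 1.1)
print(f"SDR light multiplied by {sdr_change:.2f}")
print(f"PQ light multiplied by {pq_change:.2f}")
```

The identical operation on the identical code value produces a noticeably larger jump in light under PQ than under the SDR curve, which is the kind of difference an agnostic engine leaves for the colorist to manage.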
Understanding the Problem
With that background in mind, let’s explore where these problems come from.
Here’s something that may come as a shock if you haven’t dived into color management before: every camera ‘sees’ the world differently. We’re not just talking about the effects of color temperature of light or the effects of the lenses (though those are important to keep in mind), but we’re talking about the camera sensors themselves. All things being equal, different makes and models of camera will ‘see’ the same scene with different RGB values at the sensor. In inexpensive cameras you may even see variation between individual cameras of the same make and model.
This isn’t a problem, it’s just how cameras work. Variations in manufacturing process, decisions about which microfilters, microlenses, and OLPF to use, and the design of the sensor circuitry all play a part in changing the raw values the sensor sees. But to keep things consistent, these unique camera RGB color values are almost always conformed to an output color standard using the camera’s image processor (or by the computer’s RAW interpreter) before you see them.
In the past, all video cameras conformed to analog video color spaces: NTSC/SMPTE-C, PAL, etc., and their early digital successors conformed to the digital equivalent standards: first Rec. 601 and then Rec. 709.
When it comes to conforming camera primaries to standard primaries, manufacturers had two choices: apply the look effects before or after the conforming step. If you apply color transformations before the conforming step, you often have more information available for the change. But by conforming first to the common color space, color correction operations would behave the same way between different camera makes and models.
Most camera manufacturers took a hybrid approach, applying some transformations like gain and white balance before the conforming step, and then applying look effects after the conforming step. And everything was golden, until the advent of digital cinema cameras.
Digital cinema class cameras started edging out film as the medium of choice for high quality television and feature film production a decade ago, and now vastly outnumber the quantity of film-first productions. And here’s where we run into trouble. Because digital cinema uses a different color space than digital video: DCI-P3. Oh, and recently the video broadcast standards shifted to a much wider color space, Rec. 2020, to shake off the limiting shackles of the cathode ray tube.
Color space selection suddenly became an important part of the camera selection and workflow process, one that few people talked about. Right from the get go the highest end cameras started offering multiple spaces that you could shoot and conform to, one of which was usually camera RGB. But changing the conforming space means that any corrections or effects added to the image after conforming behaved differently than they did before, and many user generated looks would be color space dependent.
To fix this, many (but not all) digital cinema camera manufacturers moved the ‘look’ elements of their color processing to before the conforming step. This way, regardless of which color space you, as the operator, choose, any looks you apply will have the same effect on the final image.
Which is fine in camera, and fine through post production. Unless your color correcting platform doesn’t understand what the primary color values mean, or if it can’t directly transform the values into your working space. Then you need to create and add additional conforming elements as correction layers, which can increase the computational complexity and reduce the overall image quality.
Oh, and if you start working with multiple cameras with different look settings available, you can get into trouble almost instantly, since there isn’t usually a simple way of conforming all of them to your working space if it’s not Rec. 709.
Oh, and you may have to deliver to all of the different color spaces: Rec. 2020 for 4K television broadcast, DCI-P3 for your digital cinema delivery, Rec. 709 for HD Blu-ray and traditional broadcast, and Rec. 601 for DVDs. And for sanity’s sake, let’s add HDR.
And don’t break the bank.
What if there was a way to make sure that a) all of your looks would move simply between cameras, regardless of make and manufacturer, and b) you could color grade once and deliver in all formats simply, without needing to manage multiple grades?
There is a way to do it: create a new RGB color space that encompasses all possible color values, and do all of your color corrections there.
Here’s the block diagram:
Camera RGB -> Very Wide Gamut Working Space (Log or Linear) -> Color Correction / Looks -> Tone Map to Standard Space
By mapping all of the camera sensor values into the same log or linear space, with very wide RGB color primaries1, you can make sure that you have access to all of the image data captured by every camera and that all operations will have the same effect on all images.
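The block diagram can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: linear Rec. 709 stands in for "camera RGB" (real camera matrices are manufacturer specific), linear Rec. 2020 is the working space, the matrix is the rounded Rec. 709 to Rec. 2020 conversion published in ITU-R BT.2087, and `grade` is a hypothetical stand-in for the colorist's look. The output transform defaults to identity, meaning delivery in the working space itself.

```python
# Linear Rec. 709 -> linear Rec. 2020 (ITU-R BT.2087, rounded to 4 decimals)
REC709_TO_REC2020 = (
    (0.6274, 0.3293, 0.0433),
    (0.0691, 0.9195, 0.0114),
    (0.0164, 0.0880, 0.8956),
)
IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


def apply_matrix(matrix, rgb):
    """Multiply a 3x3 matrix by an RGB triple."""
    return tuple(sum(c * v for c, v in zip(row, rgb)) for row in matrix)


def grade(rgb, gain=1.2):
    """Hypothetical look: a plain gain, always applied in the working space."""
    return tuple(v * gain for v in rgb)


def pipeline(source_rgb, to_working, from_working=IDENTITY):
    """Source RGB -> working space -> grade -> output space."""
    working = apply_matrix(to_working, source_rgb)
    graded = grade(working)
    return apply_matrix(from_working, graded)


print(pipeline((0.5, 0.5, 0.5), REC709_TO_REC2020))
print(pipeline((0.8, 0.2, 0.1), REC709_TO_REC2020))
```

Because `grade` only ever sees working-space values, adding another camera means supplying another input matrix rather than rebuilding the look.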
But what do I mean by a “very wide gamut RGB space”? There are two types of gamuts I’m talking about here, both of which have advantages and disadvantages.
The first kind is a color space with virtual RGB primaries: the RGB color primaries land outside of the visible region of the CIE 1931 chromaticity diagram. Remember that CIE 1931 maps the combinations of various wavelengths of light and the perceivable colors they produce onto an x-y coordinate plane, and that any color space requires at least three primary color vertices on this chart. But since the chart is bigger than all mapped colors, you can put these values outside of the actual set of real colors.
By putting the values outside of the visible color ranges you’re defining ‘red’, ‘green’, and ‘blue’ values that simply don’t and can’t exist. But they end up being quite useful, because when you map your primaries this way you can define an RGB color space that includes up to all possible color values. Yes, you could simply use CIE XYZ values to map all colors, but all of the math needed for color manipulations would need to be redefined and rebuilt from the ground up (and it always requires at least 16 bits precision). But an RGB space with virtual primaries allows you to use standard RGB math, while maintaining as many colors as possible.
Examples of color spaces using virtual RGB primaries include the color space defined by the Academy of Motion Picture Arts & Sciences, ACES AP0, and many manufacturer specific spaces like Sony S-Gamut3 / S-Gamut3.cine, ARRI Log C Wide Gamut, Canon Cinema Gamut, and the new RedWideGamutRGB found in RED’s IPP2.
The catch with these virtual primaries is that many operations that you as a colorist may be accustomed to doing won’t behave exactly the same way. The reason is that the RGB balance as it relates to hues and saturations doesn’t quite apply the same way. Without getting mired in the details, the effects of these operations are related to the relative shape of the triangle produced by the RGB color primaries, and the color space triangles using all virtual primaries tend to be more dissimilar from the traditional RGB color spaces than those traditional spaces are from each other.
So instead some wide gamut formats use all real, or mostly all real primaries to somewhat match the shape (i.e. color correction feel) of working with the smaller color gamuts. A few examples here are Rec. 2020 (called Wide Gamut on 4K televisions), Adobe Wide Gamut, and ACES AP1. While not covering all possible color values, these spaces cover very large portions of the visible color gamut, making them very useful for color correction working spaces.
Whichever very wide color space you choose to work in is up to you and your needs. If your company or workflow requires ACES, use ACES. If you’re only using one type of camera, such as a RED Weapon or an ARRI Alexa, you may find it beneficial to work in a specific manufacturer’s RGB space.
For most of the work we do here at Mystery Box that’s destined for anything other than web, I typically conform everything to Rec. 2020 and do my coloring and mastering in that space. There are a couple of reasons for this:
As a defined color space, it uses real, pure wavelength primaries. Meaning that so long as only 3 color primaries are used for image reproduction, it’s about as wide as we’ll ever go.
It encompasses 100% of Rec. 709 / sRGB and 99.98% of DCI-P3 (only losing a tiny amount of the reds)
It encompasses 99.9% of Pointer’s gamut, a gamut that maps all real-world reflectable colors (not perceivable colors, just those found in the real world) onto the CIE XYZ gamut - essentially every color producible through the subtractive primaries.
While it behaves differently than DCI-P3 and Rec. 709, they all behave fairly similarly so the learning curve is low.
It requires fewer tone mapping corrections for the final output.
Whether these reasons are convincing enough for you or not is up to you. Personally, I don’t find the 0.02% of DCI-P3 it doesn’t cover to actually matter, nor does the set of greens and blue-greens it doesn’t store (and no three color system can produce). These differences are so small that only in the absolute best side-by-sides in a lab could you hope to see a difference.
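The coverage claims above can be spot-checked with a simple point-in-triangle test on the published CIE 1931 xy chromaticities of each standard's primaries. This is only a coarse, two-dimensional sketch: it tests the primaries themselves rather than computing coverage percentages, and the helper names are made up for illustration.

```python
# CIE 1931 xy chromaticities of the red, green, and blue primaries
REC709 = ((0.640, 0.330), (0.300, 0.600), (0.150, 0.060))
DCI_P3 = ((0.680, 0.320), (0.265, 0.690), (0.150, 0.060))
REC2020 = ((0.708, 0.292), (0.170, 0.797), (0.131, 0.046))


def _cross(origin, a, point):
    """z-component of (a - origin) x (point - origin)."""
    return ((a[0] - origin[0]) * (point[1] - origin[1])
            - (a[1] - origin[1]) * (point[0] - origin[0]))


def inside(point, triangle):
    """True if an xy point lies inside (or on the edge of) a gamut triangle."""
    r, g, b = triangle
    signs = (_cross(r, g, point), _cross(g, b, point), _cross(b, r, point))
    return all(s >= 0 for s in signs) or all(s <= 0 for s in signs)


for name, gamut in (("Rec. 709", REC709), ("DCI-P3", DCI_P3)):
    for label, primary in zip(("red", "green", "blue"), gamut):
        print(name, label, "inside Rec. 2020:", inside(primary, REC2020))
```

All three Rec. 709 primaries fall inside the Rec. 2020 triangle, while the DCI-P3 red primary lands just outside its red-green edge, which is consistent with the tiny sliver of reds mentioned in the list.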
Whatever you do choose to use as a working space, it’s worth investing the time to pick one and stick with it. Since the grading transformations do behave differently in the different color spaces, it’s easiest to pick one and refine your technique there to get the best possible results.
Looking at the generalized workflow block diagram you’ll want to consider how to implement the different conversion steps for your own productions, in order to maintain the highest quality image pipeline with the lowest time and resource costs. So let’s go into the two main places that you need to make new choices in the pipeline, and how to plan for them.
Conforming Camera RGB to Very Wide Gamut Working Space
Moving from Camera RGB to a very wide gamut space is a slightly different process for each camera system, and can depend on whether you’re capturing RAW data or a compressed video image.
When you’re using RAW formats, you’ll manage this step in the color correction or DIT software, which is the preferred workflow when image quality is paramount. If you’re recording to a non-RAW intermediate format, like ProRes, DNxHR, or H.264 or any other flavor of MPEG video, you’ll need to select camera settings that best match your target wide gamut space.
Most RAW formats ignore camera looks applied by the operator and store the color decisions as metadata, but most video formats don’t. Once again, camera settings vary, so it’s important to look at your specific system and run tests to find out where looks are applied in your camera’s image pipeline, and whether you can add a separate look on the video outputs for on-set monitoring while capturing a flattened LOG or linear image.
If your camera can’t separate the looks applied to the video files and the video output, and you want to capture a flat image but need to see it normalized on set, loading LUTs into on-set monitors is the ideal choice for image monitoring. The process of creating and applying monitoring LUTs varies with your workflow, but we often find ourselves using a two-step process that uses Lattice to generate color space conversion LUTs, which we bring into DaVinci Resolve Studio to add creative looks and generate the final monitoring LUT.
Some cameras or DIT applications export their look settings as CDLs, LUTs, or other metadata for you to use later in the grading process, which you can then apply in post as the starting point for the grading process. Again, workflows vary.
Generally you’ll want to move directly from camera RGB into the working space to preserve as much sensor information as possible. That implies that you need to decide what your working space will be before capture (ACES AP0 or Rec. 2020 are recommended for the broadest future compatibility), though it’s sometimes not an option. While RAWs maintain camera primaries and allow you to jump directly into a wide working space later, if you’re forced to conform to standardized RGB for video intermediates you'll need to make that decision as early as possible. In that case, put them into the widest color space offered by the camera, whether that’s Rec. 2020, DCI-P3, or the manufacturer’s proprietary Wide Gamut space.
If RAWs aren’t an option, using a 12 bit log format video is your next best choice. 10 bit is fine too, but your corrections won’t be as clean later, and you may see some banding in fine gradients. Anything less than 10 bits per channel creates severe problems when color grading and really should only be used as a last resort. When recording using an 8 bit format, you should only use a standard SDR EOTF (never LOG) - LOG with only 8 bits of precision can create MASSIVE amounts of banding.
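A quick way to see why bit depth matters so much for log footage is to count how many distinct code values a single stop of exposure receives. The sketch below assumes an idealized log curve that spreads 14 stops evenly across the full code range; this is not any real camera's log curve, and the function names are hypothetical.

```python
import math


def log_encode(linear, stops=14.0):
    """Idealized log curve: spreads `stops` stops below 1.0 evenly over 0.0-1.0."""
    return max(0.0, 1.0 + math.log2(linear) / stops)


def distinct_codes_in_stop(bits, stops=14.0, samples=10000):
    """Count the code values used by the one stop between 0.25 and 0.5 linear."""
    max_code = 2 ** bits - 1
    codes = set()
    for i in range(samples + 1):
        linear = 0.25 + 0.25 * i / samples   # smooth ramp across one stop
        codes.add(round(log_encode(linear, stops) * max_code))
    return len(codes)


for bits in (8, 10, 12):
    print(bits, "bit:", distinct_codes_in_stop(bits), "code values in one stop")
```

Under this assumption the 8 bit encoding has only a small fraction of the steps that 12 bit provides for the same stop, and any grading operation that stretches that stop apart turns those few steps into visible bands.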
To summarize: To maintain the highest image quality with the smallest resource pain, use RAW formats when possible, convert to the working or widest color space if you have to record as video files, and use LUTs on display outputs to avoid baking camera looks into the video data.
Tone Mapping the Working Space to Output
Moving from a wide working space to a final deliverable space is generally a relatively simple process: simply convert each color value from the working space to the color value equivalent of the target space, and discard any data that lands outside of the target range. In most Rec. 2020 -> DCI-P3, or Rec. 2020 -> Rec. 709 conversions, this is completely fine. You may find minor clipping in a few of the most saturated colors, but overall you shouldn’t see too many places where the color is so bad you can’t live with it.
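The convert-and-discard step looks roughly like this for linear Rec. 2020 to linear Rec. 709. The matrix is the rounded conversion given in ITU-R BT.2407; the function name and the sample colors are assumptions for illustration only.

```python
# Linear Rec. 2020 -> linear Rec. 709 (ITU-R BT.2407, rounded to 4 decimals)
REC2020_TO_REC709 = (
    (1.6605, -0.5876, -0.0728),
    (-0.1246, 1.1329, -0.0083),
    (-0.0182, -0.1006, 1.1187),
)


def convert_and_clip(rgb2020):
    """Return the raw converted triple and the hard-clipped (0.0-1.0) triple."""
    converted = tuple(
        sum(c * v for c, v in zip(row, rgb2020)) for row in REC2020_TO_REC709
    )
    clipped = tuple(min(max(v, 0.0), 1.0) for v in converted)
    return converted, clipped


muted = convert_and_clip((0.4, 0.5, 0.3))       # a moderately saturated color
saturated = convert_and_clip((0.0, 1.0, 0.0))   # pure Rec. 2020 green
print("muted:    ", muted)
print("saturated:", saturated)
```

The muted color passes through untouched, while pure Rec. 2020 green converts to negative red and blue values and a green value above 1.0, all of which are discarded by the clip.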
Where you do run into problems is when you’ve graded using an HDR transfer function and are moving into SDR. A straight translation here results in very, very large amounts of clipping. I haven’t mentioned EOTFs much yet, simply because most color engines where you’ll be doing wide gamut work use linear internals, since that tends to offer the most dynamic range and manipulation potential.
However, displays rarely offer a linear EOTF and so you’ll have to be monitoring in some transfer function or another. Display monitoring is another reason I typically grade in BT.2020 (and usually in HDR), since displays need to be set to a specific color space and EOTF. Which means that if you’re using a very wide working space, you must apply a tone map to your monitoring output, regardless of whether you’re grading in HDR or SDR (especially when you’re working in linear light).
The first series we published here on the blog about HDR video included a section on “Grading, Mastering, and Delivering HDR”, where we presented a few bezier curves you can apply as the last element in your node structure for HDR grading in PQ or HLG. These bezier curves are essentially luminance tone maps, converting the linear light values into the specific range of digital values you use for HDR.
A full tone map typically includes considerations for converting color information as well. Just like the bezier curves control the roll off of the lights into your target range, tone maps roll off color values between color spaces, to minimize the amount of hard clipping. Here it’s important to exercise caution and experiment with your specific needs before selecting a tone map, since this step can create hue or saturation shifts you don’t expect.
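The difference between a hard clip and a roll off can be sketched with a generic soft knee. To be clear, this is not the bezier curve from the earlier post nor any standard tone map: the exponential shape, the 0.8 knee, and the function names are all assumptions chosen to keep the example short.

```python
import math


def hard_clip(x, limit=1.0):
    """Discard everything above the limit."""
    return min(x, limit)


def soft_rolloff(x, knee=0.8, limit=1.0):
    """Leave values below the knee alone; ease values above it toward the limit."""
    if x <= knee:
        return x
    span = limit - knee
    # Exponential shoulder: starts with slope 1 at the knee, approaches `limit`
    return knee + span * (1.0 - math.exp(-(x - knee) / span))


for value in (0.5, 0.9, 1.5, 2.0):
    print(value, hard_clip(value), round(soft_rolloff(value), 4))
```

With the hard clip, 1.5 and 2.0 both collapse to the same output and the detail between them is gone; with the roll off they stay distinct, just compressed. Applying a curve like this to each RGB channel independently is one way the hue and saturation shifts mentioned above can creep in.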
Tone mapping is the golden goose of simplifying multi-space color corrections. It’s what brings everything together by making it possible to very, very quickly move from your working space into your delivery space.
If you grade in ACES AP0 or AP1, the tone maps are already prepared for your conversions. Simply apply the tone map for the target system and voila, the conversion’s ready for rendering, preserving all (or rather most - they aren’t quite perfect) of the authorial intent of the grade. We did this on our Yellowstone video to generate the HDR master.2
Grading in other wide color spaces often requires custom tone maps, or on-the-fly maps generated by a program such as DaVinci Resolve Studio. RED Digital Cinema, for instance, has produced LUT based tone maps for converting their RedWideGamutRGB Log3G10 footage into various HDR and SDR color spaces. The entire Dolby Vision format is essentially a shot by shot set of tone maps for various screen brightnesses.
Or, you may find yourself doing what we’ve done - spend time to create your own tone mapped LUTs for converting HDR and SDR of various formats, and refining these maps for each individual piece of HDR content so that you end up with the optimal SDR look for that work.
Wide color space corrections and tone mapping for various output systems are the way that color correction will be treated and handled in the future. With the arrival of BT.2020 and HDR transforms, in just the last few years the number of delivery color encoding formats has increased threefold at the very least. The only way to ensure your content will be compatible in the future is to adopt the new paradigm and multi-space coloring workflow.
DaVinci Resolve Studio’s latest update (release of version 14) saw a significant overhaul of the color management engine in the last few beta versions to optimize the core functionality for this kind of color management workflow. If you’re using DaVinci Color Management or ACES color management in the latest version, DaVinci will automatically select the optimal RAW interpretation of your footage and conform it to your working space, removing the ambiguity of how to interpret your footage and maintaining the maximum image quality.
Another manufacturer who’s natively implemented a similar color pipeline is RED, with their new IPP2 color workflow. They’ve moved all of their in-camera looks so that they’re applied to the image data after the sensor RGB is converted into RedWideGamutRGB, with all outputs tone mapped to your monitoring space. With that they now allow you to select whether color adjustments in camera are burned into the ProRes files, or simply attached as a LUT or CDL. This way, regardless of what your monitoring or eventual mastering space is, the color changes you make in camera will have the exact same effect across the board.
This is the workflow of the future. Like with HDR, which we can assume will be the EOTF of the future, the efficiencies and simplicities of this particular workflow are so great that the sooner you get on board with it the better your position will be in a few years’ time. Grading in ACES AP0 offers a level of future proofing that not even BT.2020 provides. While BT.2020 still exceeds what current technologies can really do, ACES AP0 ensures that regardless of where color science heads in the future (4+ color primaries?), your footage will already be in a common format that’s simple to convert to the new standard, preserving all color data.
While there is a learning curve to this workflow, at a technical level it’s simpler to learn and apply than even understanding how HDR video works. Yes, it takes some getting used to, but it’s worth learning. Because in the end, you’ll find better quality than you can otherwise hope for.
Written by Samuel Bilodeau, Head of Technology and Post Production
1 Yes, I’m making up the term “Very Wide Gamut” or “Very Wide Gamut RGB” simply because “Wide Gamut” and “Wide Gamut RGB” can refer to many different specific spaces, depending on the circumstances. Here I’m referring to any of these typical wide gamut spaces, or any space that covers a very large portion of the perceivable gamut.
2 A caveat about ACES tone mapping: We used ACES AP0 with an ACEScc EOTF for our Yellowstone video. The tone mapping into HDR was fantastic and allowed me to skip my own range limiting map, and the ability to select different input transforms for each shot was fantastic. However, ACES failed when trying to generate an SDR version of the film: instead of tone mapping the higher dynamic range into the smaller SDR range, it clipped at the limits of SDR. This limitation makes me hesitant about recommending ACES for mixed-dynamic range work. It works wonderfully for one or the other, but don't expect it to tone map directly between the two.