YouTube launched 8K streaming back in 2015, but the lack of cameras available to content creators meant 8K uploads didn’t start in earnest until late 2016. That’s around the time when we uploaded our first 8K video to YouTube, and while we ran into some interesting problems getting it up there (which aren’t worth discussing because they’ve all been fixed), overall we're impressed with YouTube’s ability to stream in 8K.
Being naturally curious, I wanted to know more about what they were using for 8k compression, so I downloaded the mp4 version YouTube streams to see which codec it was using. Let me save you some time finding it yourself and show you what settings YouTube uses for 8K streaming on the desktop:
Does anything look weird to you? Unless you’re a compressionist, maybe not.
Here’s what’s strange: it lists the codec as AVC, otherwise known as H.264. The problem with that is the largest frame size permitted by the H.264 video codec standard is 4,096 x 2,304, and yet somehow this video has a resolution of 7,680 x 4,320. Which means that either this video, or the video standard must be lying.
Well, not exactly. The frame resolution is Full Ultra High Definition (FUHD - 7,680 x 4,320), and the video codec is H.264 / AVC. It’s just a non-standard H.264 / AVC.
Being able to make and use your own non-standard H.264 (or any other codec) video files is a really useful trick, and right now it’s an important thing to know for working with 8K video files. Specifically, it’s important to know what benefits and drawbacks working outside the standard format offers and how to make the best use out of them.
In 2014, a client asked about 5K, high frame rate footage to use on a demonstration display. Since we’d been filming all of our videos at 5K resolution, remastering the files at their native camera resolution wasn't an issue and we were happy to work with them.
But as things moved forward with their marketing team we ran into a little problem. We had no problem creating and playing 5K files on our systems, but when their team tried to play back the ProRes or DPX master files on their Windows based computer (which they were required to use for the presentation), they weren’t available to get real-time playback. Why not? The ProRes couldn’t be decoded fast enough by the 32 bit QuickTime player on Windows, and the DPX files had too high of a data rate to be read from anything but a SAN at the time.
Fortunately, we’d already been experimenting with encoding 5K files in a few different delivery formats: High Efficiency Video Coding (HEVC / H.265), VP8 and VP9, and Advanced Video Coding (AVC / H.264). The HEVC was too computationally complex to be decoded in real time for 5K HFR, since there were no hardware decoders that could handle the format (even in 8 bit precision) and FFMPEG still needed optimizations to playback HEVC beyond 1080p60 in real-time, on almost every system. The VP8 and VP9 scared the client, since they weren’t comfortable working with the Matroska video container (for reasons they never explained - quality wise, this was the best choice at the time), which left us with H.264.
Which is how we delivered the final files: AVC Video with AAC Audio in an MP4 container, at a resolution of 5,120 x 2880, though we ended up dropping the playback frame rate to only 30fps for better detail retention.
Finding a way to encode and to play back these 5K files in H.264 wasn’t easy. But once we did, we opened up the possibility of delivering files in any resolution to any client, regardless of the quality of their hardware.
So how did we do it? We cheated the standard. Just like Google does for 8K streaming on YouTube. And for delivering VR video out of Google’s Jump VR system.
And since you’re probably now asking: “how do you cheat a standard?”, let’s review exactly what standards are.
Standards like MPEG-4 Part 10, Advanced Video Coding (AVC) / ITU-T Recommendation H.264 (H.264) exist to allow different hardware and software manufacturers to exchange video files with the guarantee they’ll actually work on someone else’s system.
Because of this standards have to impose limits on things like frame size, frame rate for a given frame size, and data rate in bits per second. For AVC/H264, the different sets of limits are called Levels. At its highest level, Level 5.2, AVC/H.264 has a maximum frame size of 4,096 x 2,304 pixels @ 56 frames per second, or 4,096 x 2160 @ 60 frames per second, so that standard H.264 decoders don’t have to accommodate any frame size or frame rate larger than that.
Commercial video encoders like those paired with the common NLEs Adobe Premiere, AVID Media Composer, and Final Cut Pro X, assume that you’ll want the broadest compatibility with the video file, so the software makes most of the decisions on how to compress the file, and strictly adheres to the available limits. Which for H.264 means that you’ll never be able to create an 8K file out of one of these apps.
While standards allow for broad compatibility, sometimes codecs are needed to work in a more limited use setting. “Custom video solutions” are built for a specific purposes, and may need frame sizes, frame rates, or data rates that aren’t standard. This is where the standard commercial AVC/H.264 encoding softwares often won’t work, and you either write a new encoder yourself (time consuming and expensive) or turn to the open source community.
Open source projects for codec encoding and decoding, like the x264 encoder/decoder implementation of the H.264 standard, often write code for all parts of the standard. x264 even includes playback features beyond the AVC/H.264 standard, specifically an ‘undefined’ or ‘unlimited’ profile or level where you can apply H.264 compression to any frame size or frame rate. The catch is that it just won’t playback with hardware acceleration because it’s out of standard; it’ll need a software package that can decode it.
Spend enough time with codecs and compression and you’ll run across a term: FFMPEG. FFMPEG is an open source software package that provides a framework for encoding or decoding audio and video. It’s free, it’s fast, and it’s scriptable (meaning it can be automated by a server) so a lot of companies who don’t write audio-video software themselves can simply incorporate FFMPEG and codec libraries like x264 for handling the multimedia aspect of their programs.
Which is exactly what YouTube does.
That’s right, when you upload a video to YouTube, Google’s servers create encoding scripts for FFMPEG, which are sent off to various physical servers in Google’s data centers to encode all of the different formats that YouTube uses to optimize the video experience for desktops, televisions, phones, and tablets, and for internet connections ranging from dial-up to fiber optic.
And for 8K content streaming on the desktop, that means encoding it in 8K H.264.
Why AVC/H.264 for 8K?
Which, of course, leads us to our last two questions: Why H.264 and not something else? And How can you do it too?
For YouTube, using AVC/H.264 is a matter of convenience. At the time that YouTube launched 8K support (and even today) HEVC/H.265, which officially supports 8K resolutions, is still too new to see broad hardware acceleration support - and even then few hardware solutions support at 8K resolution. (Side note - as of the last time we tested it [Jan 2017] the open source HEVC/H265 encoder x265 struggles with 8K resolutions, so there’s that too). Google’s own VP9/VP10 codecs still weren't ready for broad deployment when 8K support was announced, and hardware VP9 support is just starting to appear.
YouTube selecting either HEVC/H.265 or the VP9/VP10 codecs would severely limit where 8K playback would be allowed. And since software decoding 8K H.264 can work in real time while H.265 doesn’t on most computers (H.264 is about 5 - 8 times less processor intensive than H.265) we have YouTube streaming in 8K in the AVC/H.264 codec, at least until VP10 or H265 streaming support is added to the platform.
Encoding 8K Video into H.264
So you want to encode your own 5K or 8K H.264? It’s easy - just download FFMPEG and run it from the command line. Just kidding, that’s a horrible experience. Use FFMPEG, but run it through a frontend instead.
An FFMPEG frontend is a piece of software that gives you a nicer user interface to decide your settings, then sends off your decisions to FFMPEG and its associated software to do the actual work. Handbrake is a good example of a user-friendly cross platform front end for simple jobs, but it doesn’t give you access to all the options available. The best that I’ve found for that is a frontend called Hybrid.
Hybrid is a little more complicated than, say, Adobe Media Encoder, but it gives you access to all of the features that are available in the x264 library (i.e. all of the AVC/H.264 standard + out of standard encoding) instead of the more limited features that other packages give you. It’s a cross-platform solution that works on Windows and MacOS, it’s updated regularly to add new features and optimizations, and it by default hides some of the complexity if you just want to do a basic encode with it.
Here are the settings we’d use for a 5K or higher H.264 video:
On the first pane of the program, select your input file, generate or select your output file name, and decide on which video codec you want to use (in this example, x264) and whether to include audio or not (set it to custom).
Now, under the x264 tab, make the following changes: Switch the encoding mode to “average bitrate (1-pass)”, and change the Bitrate (kbits/s) value to 200,000. That’ll set our target bitrate to 200Mbps, which for 8K it’s the equivalent quality as 50Mbps for 4K.
Then, under the restriction settings, change the “AVC Profile/Level” drop downs to “none” and “unrestricted”. Leave everything else the same and jump over to the Audio tab at the top.
In the audio tab, add an audio source if your main source file doesn’t have one, turn on the Audio Encoding Options pane by using the check box, choose your audio format and bit rate (in this case I’m using the default AAC with 128 kbps audio, then click the big plus sign at the top right of the audio cue to add that track of audio to your output file.
That’s it. You’re done. Jump back to the Main tab, click the “add to queue” button to add your job to the batch, and either follow the same steps to add another, or click on “start queue” to get things rendering.
When you’re done you’ll find yourself with a perfectly useable 8K file compressed into H.264!
Is this useless knowledge to have? Not if you regularly create 8K video for YouTube, or if you create VR content using the GoPro Odyssey rig with Google Jump VR. In both of those cases you’ll need to upload an 8K file. While the ProRes format works, it’s quite large (data wise) and may be problematic for upload times. Uploading AVC/H.264 is a better option in some cases, and it can always be used as a delivery file for 8K content when data rates prohibit DPX or an intermediate format.
To playback files created this way, you need a video player that also leverages lightweight playback and non-standard video, like MPC-HC on Windows or MPV on Windows or MacOS. Sometimes QuickTime will work, though it rarely works on Windows because it’s still a 32 bit core, and VLC is also a solid option in many cases. But both of those have more overhead than FFMPEG core players and can cause jittery playback.
Spending time learning new programs, especially ones that aren’t at face value user friendly like Hybrid or FFMPEG, doesn't seem like it’ll pay off. But the process of discovery, trial, and error is your friend when you’re trying to stay ahead of the game in video. Don’t be afraid to test out something new.
It’s how we were able to deliver 5K video content to a client when no one else could, and how we still stay at the forefront of video technologies today.