Mistakes to Avoid When Producing Podcasts for iTunes
- 11-14-2008
- Categorized in: H.264 production

In the fall of 2008, I gave a presentation on producing
H.264 video at StreamingMedia West in
Not surprisingly, the market was podcasts distributed via
iTunes. I say not surprisingly because iTunes is all about iPod/iPhone devices,
which play only H.264 and MPEG-4 video. It’s like finding Sox fans in
Having found the Mecca of H.264 usage, I decided to download 50 podcasts, try to load them on my iPod Nano and see what happened. Interestingly, six refused to load at all, and three had what I’ll call “compromised” displays. After analyzing all the podcasts in Inlet HD’s most excellent streaming media analysis tool Semaphore, I noticed that many others used sub-optimal encoding parameters. While producing podcasts is probably a tiny part of what we all do for a living, it’s still a useful skill, so I thought I would detail these findings for you.
First, some background. MPEG-4 is the overarching standard that includes two video codecs: the MPEG-4 codec itself and a more advanced video codec, H.264, also known as AVC. When used in an MPEG-4 “wrapper,” H.264 files typically have an mp4 or m4v extension, the first being the official designation and the latter being the extension Apple created for its devices. You can also “wrap” H.264 video in a QuickTime file with an mov extension, or encode it for Flash with an flv or f4v extension. Soon, you’ll be able to encode H.264 to Windows Media, presumably with a wmv extension.
H.264 has multiple “profiles” that specify levels of
playback compatibility. For example, the Baseline profile is typically for
devices like iPods or cell phones that have limited playback horsepower.
Accordingly, the Baseline profile doesn’t use many of H.264’s more advanced
encoding techniques that can produce higher quality but also create a stream
that’s too hard to decode. Then there’s the
Obviously, when producing for devices rather than general-purpose playback, job number one is to use the appropriate profile. Interestingly, of the six videos that wouldn’t play on my iPod Nano, five used the Main profile, which is verboten. The sixth used the Sorenson Video 3 codec, of all things, which also won’t play.
So, when producing podcasts, always use the Baseline profile
of the H.264 codec. Before encoding, however, go to Apple.com and print the video
playback specs for the latest iPod, and make sure that you’re within the
resolution and data rate requirements. Unfortunately, this is more complicated than
it sounds because the initial iPod could only play H.264, Baseline-profile
videos at 320x240 resolution, while current iPods and iPhones can play Baseline
H.264 video up to 640x480 resolution.
You can see this in the iPod preset shown in Figure 1, which is from the Adobe Media Encoder CS4. Note that if you choose the Apple iPod Video Small preset, you'll encode at 320x240, and, of course, that the preset uses the Baseline profile.

Figure 1. An iPod preset from the Adobe Media Encoder CS4.
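To make the profile and resolution rules concrete, here is a minimal sketch of a pre-flight check you could run on values reported by an analysis tool. The `VideoInfo` class and `ipod_problems` function are hypothetical names invented for this illustration, and the limits are simply the ones quoted in this article, so confirm them against Apple's published specs for your target device.

```python
from dataclasses import dataclass

# Limits as quoted in the article; verify against Apple's current specs.
CURRENT_MAX = (640, 480)   # current iPods and iPhones
INITIAL_MAX = (320, 240)   # the initial video iPod
PLAYABLE_CODECS = {"H.264", "MPEG-4"}


@dataclass
class VideoInfo:
    """Hypothetical record of values read from an analysis tool."""
    codec: str
    profile: str
    width: int
    height: int


def ipod_problems(video, initial_ipod=False):
    """Return a list of likely playback problems; an empty list means none found."""
    problems = []
    if video.codec not in PLAYABLE_CODECS:
        problems.append(f"codec is {video.codec}, not H.264 or MPEG-4")
    elif video.codec == "H.264" and video.profile != "Baseline":
        # Profiles are only checked for H.264 in this sketch.
        problems.append(f"profile is {video.profile}, not Baseline")
    max_w, max_h = INITIAL_MAX if initial_ipod else CURRENT_MAX
    if video.width > max_w or video.height > max_h:
        problems.append(f"{video.width}x{video.height} exceeds {max_w}x{max_h}")
    return problems


if __name__ == "__main__":
    print(ipod_problems(VideoInfo("H.264", "Main", 640, 360)))
    print(ipod_problems(VideoInfo("H.264", "Baseline", 640, 480), initial_ipod=True))
```

The check deliberately ignores data rate and audio settings, which you would also want to compare against the device specs.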
Your next decision is target resolution. In the sample of 44 videos that loaded on my iPod Nano, 25 went with 320x240, which is obviously the safe route, while the other 19 (and five of the six that failed to play) went 640x360 or larger.
Why go larger than 320x240 when the screen resolution of most iPods is 320x240? First, many iPods have composite output ports that let you play the video on a TV set or other analog device. Though display on the device itself is limited to 320x240, 640x480 video will look better when displayed on a TV set than 320x240. More importantly, iPhones and the iPod Touch have 480x320 displays, and six of the 19 producers using greater than 320x240 resolution produced at 16:9, which looked better on the iPhone/iPod Touch than 4:3 video.
Which leads me to the three podcasts with “compromised displays.” Briefly, if you display your 16:9 video on a 4:3 iPod, the device displays the “center-cut,” much like a 4:3 television does with 16:9 broadcasts. This means that it displays the middle section of the video and cuts off the right and left edges rather than displaying the entire video with letterboxes on the top and bottom.
Several producers of 16:9 video – including Photoshop User
TV – included screencam videos with content on the edges that wasn’t visible
when viewed on a 4:3 display. So, while the announcer was saying “click this
menu item,” the menu item was off-screen on 4:3 displays. Your viewers can
change this center-cut option to letterbox the video, but most won't unless you tell
them how and where to change that preference.

Figure 2. The outer edges of this 16:9 video won't show up when played on a 4:3 iPod using the default "center cut" video configuration.
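The center-cut arithmetic is easy to work out ahead of time. The sketch below, which assumes square pixels and uses a hypothetical helper named `center_cut`, returns the range of pixel columns that remain visible when a video is cropped to a 4:3 display.

```python
from fractions import Fraction


def center_cut(width, height, display_aspect=Fraction(4, 3)):
    """Return (left, right) pixel columns that stay visible after a center cut.

    Assumes square pixels; columns left..right-1 are shown, the rest is cropped.
    """
    visible = int(height * display_aspect)
    if visible >= width:
        # The video is no wider than the display, so nothing is cropped.
        return 0, width
    margin = (width - visible) // 2
    return margin, margin + visible


if __name__ == "__main__":
    left, right = center_cut(640, 360)
    print(f"visible columns: {left} to {right}")
```

For a 640x360 video, this gives columns 80 to 560, so anything in the outer 80 pixels on either side, such as a menu at the edge of a screencam, is hidden by default.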
Interestingly, when I examined footage converted from 16:9
broadcast to 16:9 podcasts, like Oprah, it was clear that the cameraperson was
framing for 4:3 display, so even during wide shots, the main subjects were
within the 4:3 center-cut window. If producing a 16:9 podcast, you should do
the same.


Figure 3. Because the camera operator shot with "center cut" display in mind, this video looks good on all iPods.
The other mistakes were more technical, like exceeding the recommended data rate and inserting key frames too frequently, which can degrade quality and add a pulsing effect to your video. In this regard, note that the iPod preset in Apple Compressor uses a data rate of 1.12 Mbps for 640x480 video, and inserts key frames every 150 to 300 frames, depending upon content, or one every five to ten seconds.
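The conversion between key frame spacing in frames and in seconds depends on frame rate. The five-to-ten-second figure above implies roughly 30 frames per second, which is the assumption baked into this small sketch; the function names are hypothetical.

```python
def keyframe_interval_seconds(frames, fps=30.0):
    """Seconds between key frames, given the spacing in frames (30 fps assumed)."""
    return frames / fps


def keyframe_interval_frames(seconds, fps=30.0):
    """Spacing in frames needed for a key frame every `seconds` seconds."""
    return round(seconds * fps)


if __name__ == "__main__":
    for frames in (150, 300):
        print(frames, "frames =", keyframe_interval_seconds(frames), "seconds")
```

If your source is 24 or 25 fps, pass that value as `fps` so the interval you enter in your encoder matches the spacing you intend.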


