Wednesday, April 11, 2012

Seeking in FFmpeg: Know Your Timestamp!

History

I've been working on a plugin lately that reads videos with FFmpeg. It's coming along pretty well, but by far the biggest problem I've had so far is seeking accurately. In particular, I would try to seek to the very beginning of a file, but for some reason would end up at the second keyframe, not the first.

So I tested it further, talked on IRC with some devs about it, and decided it must be a bug. So I filed a bug report.

The Lightbulb

Thankfully, my bug report got a quick response. It turned out that, yet again, the "bug" I discovered really wasn't a bug. It was me misunderstanding FFmpeg.

Here's the catch: the documentation for both av_seek_frame and avformat_seek_file talks about being able to seek by a timestamp. This whole time, I thought FFmpeg was seeking by the PTS. As Reimar points out in the bug report, this is not the case. FFmpeg seeks by the DTS, not the PTS.*

Why This is Important

The video file I was using was an mpeg2video MOV file. The first packet had a DTS of -1 and a PTS of 0, and the second packet had a DTS of 0 and a PTS of 1. This means that when I tried to seek to a timestamp of 0 (which I thought was the beginning), FFmpeg was really seeking to a DTS of 0, which corresponds to a PTS of 1. That is why it would skip over the first keyframe (at PTS=0 and DTS=-1) and stop at the second keyframe.
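To make the mismatch concrete, here is a small self-contained C sketch that models the situation. It does not call FFmpeg at all: ModelPacket, model_packets and model_seek_by_dts are names I made up for illustration, and the "seek" is just my assumption of how a backward DTS-based seek behaves (land on the last keyframe whose DTS is at or before the target). The timestamps follow the file described above, assuming every frame is a keyframe and the DTS trails the PTS by one throughout.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a demuxed packet. This is NOT FFmpeg's AVPacket. */
typedef struct {
    int64_t dts;      /* decoding timestamp */
    int64_t pts;      /* presentation timestamp */
    int     keyframe; /* nonzero if the packet is a keyframe */
} ModelPacket;

/* First few packets of the file described above, listed in DTS order.
 * Assumption: every frame is a keyframe and DTS = PTS - 1 throughout. */
const ModelPacket model_packets[] = {
    { -1, 0, 1 },
    {  0, 1, 1 },
    {  1, 2, 1 },
    {  2, 3, 1 },
};
const size_t model_packet_count =
    sizeof model_packets / sizeof model_packets[0];

/* Hypothetical model of a backward, DTS-based seek: returns the index of
 * the last keyframe whose DTS is <= target, or -1 if there is none.
 * Expects the packets to be sorted by DTS. */
int model_seek_by_dts(const ModelPacket *pkts, size_t count, int64_t target)
{
    int found = -1;
    for (size_t i = 0; i < count; i++) {
        if (pkts[i].keyframe && pkts[i].dts <= target)
            found = (int)i;
    }
    return found;
}
```

In this model, model_seek_by_dts(model_packets, model_packet_count, 0) returns index 1 (the packet with PTS=1), while a target of -1 returns index 0 (the packet with PTS=0). So to reach the true first frame, the target has to be the first packet's DTS, not its PTS.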

This also screwed me up when trying to step through the file one frame at a time. I kept track of my current position by the PTS, and when I wanted to seek forward/backward by one frame, I would just add/subtract 1 from my PTS. This meant that if I was at PTS=12 (there was a keyframe here), and I tried to go back one frame, I tried to seek to a timestamp of 12 - 1 = 11. The problem was that I was thinking in PTS and FFmpeg was thinking in DTS, so when I asked FFmpeg to seek to a timestamp of 11, it sought to a DTS of 11 (that is, PTS=12, which is where I already was!).
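The frame-stepping problem can be sketched the same way. Again, this is a toy model and not FFmpeg code: all of the names below are hypothetical, and it assumes the gap between PTS and DTS is a constant 1 for the whole file, as it was at the start of mine. In streams with B-frames the gap can differ from packet to packet, so a fixed offset like this would not be safe in general. The model also ignores the ends of the file.

```c
#include <stdint.h>

/* Assumed constant gap between PTS and DTS in this model (PTS = DTS + 1). */
enum { MODEL_PTS_DTS_OFFSET = 1 };

/* Model of a DTS-based seek where every frame is a keyframe: seeking to
 * target_dts lands on the frame with that DTS, whose PTS is one higher. */
int64_t model_landing_pts(int64_t target_dts)
{
    return target_dts + MODEL_PTS_DTS_OFFSET;
}

/* Convert the PTS we want to show into the value to hand to a DTS-based seek. */
int64_t pts_to_dts_target(int64_t wanted_pts, int64_t pts_dts_offset)
{
    return wanted_pts - pts_dts_offset;
}

/* What I was doing: pass (current PTS - 1) directly as the seek timestamp. */
int64_t step_back_naive(int64_t current_pts)
{
    return model_landing_pts(current_pts - 1);
}

/* Same step, but converting the wanted PTS into a DTS target first. */
int64_t step_back_converted(int64_t current_pts)
{
    int64_t wanted_pts = current_pts - 1;
    return model_landing_pts(pts_to_dts_target(wanted_pts, MODEL_PTS_DTS_OFFSET));
}
```

Starting from PTS=12, step_back_naive(12) ends up at PTS=12 again, which is exactly the stuck-in-place behavior I saw, while step_back_converted(12) ends up at PTS=11.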

In the end, it's important to know which timestamp you're using in seeking. At the time of writing, the FFmpeg documentation doesn't say that the DTS is the timestamp used, as it just mentions "timestamp." The docs will probably have this note added (soonish), though.

*Update! So I tried submitting a patch that changed the documentation to say that seeking was done by DTS and not by PTS. Michael Niedermayer then informed me that this isn't true for all demuxers. Apparently some use DTS and some use PTS. Just be sure to be very aware of this. I believe most demuxers, however, seek by DTS.

Double Update! It should be noted that Michael Niedermayer recently added the flag AVFMT_SEEK_TO_PTS, to be used in AVInputFormat.flags, to specify that the demuxer seeks by PTS and not by DTS. Otherwise, you can expect the demuxer to seek by DTS. Note that this is a recent change (at the time of writing), and most relevant demuxers haven't been updated with it. Until the flag is applied to more demuxers, it's possible a demuxer may be seeking by PTS without having AVFMT_SEEK_TO_PTS set.