Skip to content

Instantly share code, notes, and snippets.

@RenaKunisaki
Created February 17, 2025 20:44
Show Gist options
  • Save RenaKunisaki/19a39ea5aea18298b831e547b11ae26d to your computer and use it in GitHub Desktop.
Save RenaKunisaki/19a39ea5aea18298b831e547b11ae26d to your computer and use it in GitHub Desktop.
libav is too complicated for deepseek-14b
>>> What's the correct way to determine a frame's play timestamp (as UNIX timestamp) in libav?
<think>
Okay, I need to figure out how to get the play timestamp of each frame as a UNIX timestamp using libav. Hmm, I remember that libav has some
structures and functions for handling frames and streams.
So first, I think about the steps involved. Each frame has its own presentation time, right? In libav, frames are part of a stream context, so
maybe I need to access each stream's parameters.
I recall there's an AVStream struct which contains information about the stream, including a timebase. The timebase is used to convert
timestamps from the encoder's internal units (like qts or something) into seconds. So for each frame, the timestamp is in terms of that
timebase.
Looking at the functions, I think avcodec_decode_frame() might return the decoded frame and also provide some information about it. Oh wait,
there's a parameter called 'got_picture_ptr' which tells if the current packet produced a frame. But more importantly, when you call
avcodec_decode_frame(), it fills in the AVFrame's pts field.
So each AVFrame has an 'pts' member, which is the presentation time stamp in the stream's timebase units. To convert that to a double
representing seconds, I can use av_q2d() on the frame's pts and then multiply by the stream's timebase denominator.
Wait, no. Wait, the stream's timebase is a rational number, like num/den. So the conversion would be (pts * tb_den) / tb_num? Or maybe it's the
other way around. I need to check that. Oh right, the formula is (double(pts) * av_q(1, tb)) gives seconds.
Wait, actually, the timebase is stored as a fraction, so to get the value in seconds, you multiply pts by (tb_num / tb_den). Or wait, no: the
timebase is the number of units per second. So for example, if the timebase is 1/1000 (commonly used), then each unit is a millisecond. So to
convert pts to seconds, it's (pts * tb_num) / tb_den.
Wait, let me think again. Suppose the timebase is stored as a fraction: num/den. Then, one second equals den units of tb. So if tb is 1/1000,
then each unit is 0.001 seconds. So to get the value in seconds, you take pts * (num / den). Because for each pt unit, it's equal to num/den
seconds.
Wait no, that doesn't seem right. Let me consider an example. Suppose tb is 24 Hz, which would be 24 units per second. So the timebase would be
represented as 1/24. So when you have a pts value of 1, it's equal to 1/(1/24) = 24 seconds? That can't be right because that would mean each
frame is spaced 24 seconds apart.
Wait no, maybe I'm getting this wrong. Let me think: The timebase is the number of clock units per second. So if tb is 1/1000, then one clock
unit equals 1 millisecond. So to get the timestamp in seconds, you take pts (which is in clock units) and multiply by 1000.
But wait, the way it's stored: time_base is a fraction, so in code terms, tb = av_make_q(tb_num, tb_den). So when you have a timestamp 'pts',
to convert it into seconds, you do pts * (tb.num / tb.den) ?
Wait no, because for example, if tb_num is 1 and tb_den is 1000, then each unit is 1/1000 second. So to get seconds, you would multiply by
(tb_num/tb_den). Wait no: If your timebase is 1/1000, that means the clock runs at 1000 units per second. So each unit is 0.001 seconds. So if
pts is 500, then it's 500 * (1 / 1000) = 0.5 seconds.
Wait no: time_base is tb_num/tb_den, so in this case 1/1000. So to get the number of seconds, you multiply pts by (tb_num/tb_den).
But that's when the timestamp is already in terms of that base. Hmm.
Alternatively, perhaps it's better to think of the time_base as the denominator being the numerator for scaling.
Wait, I'm getting a bit confused here. Maybe looking up the documentation would help.
Looking at the libav documentation, they say: "The timestamps are expressed in terms of the time_base unit of the stream." So, each timestamp
is multiplied by time_base to get seconds. So if tb is 1/1000, then multiplying by that gives you the value in seconds.
Wait no: Wait, no, because time_base is stored as a fraction (num/den). So for example, if time_base is 1/1000, then it's 0.001 seconds per
unit. So to get the timestamp in seconds, you multiply pts by (tb_num / tb_den) = (1/1000).
Wait no: For a time_base of 24 Hz (like for NTSC video), each frame is 1/24 seconds apart. So if your timestamp increases by 1 every 1/24
seconds, then to get the value in seconds, you multiply by 1/24.
But wait, in that case, time_base would be 1/24 (num=1, den=24). So each unit is 1/24 seconds. So if a frame has a pts of 100, then the
timestamp is 100 * (1/24) ≈ 4.1667 seconds.
Yes, that makes sense. So to convert the AVFrame's pts into seconds, you take pts * tb.num / tb.den.
Wait no: Wait, because if tb is 1/1000, then it's 1 (num) over 1000 (den), so each unit is 1/1000 second. So multiplying by num/den gives you
the value in seconds.
So to get the timestamp in seconds: pts * (tb.num / tb.den). But since tb is stored as a fraction, you can use av_q2d() which converts a
rational number into a double. For example, av_q2d(tb) would give 0.001 for tb of 1/1000.
So the formula is: seconds = pts * av_q2d(time_base)
But wait, because time_base could be represented as tb_num/tb_den, and to get a double, you can compute (double)tb_num / (double)tb_den. Or use
av_q2d() which does that for you.
So in code:
AVStream *stream = ...;
int64_t pts = frame->pts;
AVRational tb = stream->time_base;
double seconds = (double)pts * av_q2d(tb);
Wait, but wait: If the timebase is in terms of the stream's clock, which may be different from the system's timestamp.
But for the purpose of getting a play timestamp as a UNIX timestamp, perhaps I need to convert the frame's pts into seconds since epoch.
So the plan is:
1. For each decoded frame, get its 'pts' (frame->pts).
2. Multiply by the stream's time_base (as a double) to get the value in seconds since the start of the stream's timeline.
3. Then add the start_time of the stream, which represents the stream's initial timestamp, also in the same time_base units.
Wait no: The stream has a 'start_time' field which is in terms of the stream's clock. So to get the overall playtime, it's (pts * tb) +
start_time_in_seconds?
No, because start_time is in the same time base as pts and tb.
So perhaps:
The play timestamp is (stream->start_time + frame->pts) multiplied by the time_base, then converted into a UNIX timestamp.
But wait, no. The stream's 'start_time' is an absolute timestamp for the beginning of the stream. So the first frame's pts is 0 in terms of the
stream's timeline, but when added to start_time, it becomes the actual playtime.
Wait, I'm getting confused again.
The 'start_time' is the time at which the first packet of the stream was decoded, right? Or perhaps it's the DTS (Decoding Timestamp) of the
first frame. So each frame's DTS/PTS is relative to this start_time.
So for example, if the stream starts at 10 seconds after epoch, and the first frame has a pts of 0, then its play timestamp would be 10 + 0 *
tb = 10 seconds.
If another frame has a pts of 5, then it's 10 + 5 * tb seconds.
But in terms of code, to get the UNIX timestamp for each frame:
1. For each AVFrame, after decoding, check if 'got_picture' is set, which indicates that this frame can be displayed.
2. The frame has a pts (presentation time stamp) which is in the stream's time_base units.
3. So convert this pts to seconds by multiplying with av_q2d(stream->time_base).
4. Then add stream->start_time multiplied by av_q2d(time_base) to get the absolute time.
Wait, no: Because 'stream->start_time' is already a timestamp in the same time_base units as the frames. So for example:
stream->start_time = 10 (in seconds)
No, actually, 'stream->start_time' is in terms of the stream's clock. It represents the DTS of the first frame.
So if I have a frame with pts of x, then its play timestamp is (x * tb + start_time) converted into the system's time?
Wait no: The play timestamp for the frame is (stream->start_time + frame->pts) multiplied by tb.
But since 'stream->start_time' is in seconds, or perhaps it's also in the same time_base units as the frames? Wait, I'm getting confused again.
Looking up documentation:
The 'AVStream' struct has a field 'time_base' which is a fraction representing the stream's time base (e.g., 1/1000 for milliseconds). The
'start_time' is in terms of this time base. So to get the absolute play timestamp, you do:
pts_in_seconds = frame->pts * av_q2d(stream->time_base);
start_time_in_seconds = stream->start_time * av_q2d(stream->time_base);
absolute_timestamp_in_seconds = start_time_in_seconds + pts_in_seconds;
Then, convert this to a UNIX timestamp (seconds since epoch).
But wait, no: The 'stream->start_time' is in the same time base as frame->pts.
Wait, according to FFmpeg's documentation:
The 'AVStream::start_time' is the DTS of the first frame in stream. So it's an absolute value in the stream's own time_base units.
So when you have a frame with a certain DTS or PTS (which are relative to the stream's start), adding them to start_time will give the absolute
timestamp in the same time base.
But how do I convert that into a UNIX timestamp?
Ah, right. The 'AVFormatContext' has a field 'start_time' which is the overall start time of the entire demuxing process, and also an
'external_time_base' if applicable.
Wait, this is getting complicated.
Perhaps it's better to refer to how FFmpeg itself handles this.
In FFmpeg, during decoding, each frame's absolute timestamp can be computed as:
int64_t* pts = (int64_t*)avcodec_get_frameeamccontext()->pkt->pts;
if (pts) {
*pts += input_stream->start_time;
}
But I'm not sure about that.
Alternatively, perhaps the correct way is to compute:
The play timestamp for each frame is (stream->start_time + frame->pts) multiplied by stream->time_base as a double, then converted into a UNIX
timestamp.
So, in code:
int64_t frame_pts = frame->pts;
int64_t stream_start = stream->start_time;
double tb_num = stream->time_base.num;
double tb_den = stream->time_base.den;
double time_in_stream_clock = (stream_start + frame_pts) * (tb_num / tb_den);
Then, convert this to a UNIX timestamp. But how?
Wait, because the 'stream->start_time' is in the same units as 'frame->pts', which are based on stream->time_base.
But to get the actual seconds since epoch, you need to have time_in_stream_clock converted into system's time.
But that may require knowing when the demuxing started relative to the system's clock. That's where the 'AVFormatContext::start_time' comes in,
which is the overall start time of the demuxer, in terms of system time (seconds since epoch).
So perhaps:
double stream_start_in_system_time = av_gettime() - stream->codec->parser->parse_time;
No, that might not be correct.
Wait, I think this is getting too complicated. Maybe there's a function to do this.
Looking up the FFmpeg source code, in 'libavcodec/utils.c', there's a function called 'compute_frame_number' which deals with frame timestamps.
Alternatively, perhaps using av_packet_getPTS(...) or something similar.
But maybe it's better to refer to the following approach:
When demuxing, each packet has a PTS and DTS. The demuxer updates these based on the stream's time base and start_time.
Then, for each frame, after decoding, its timestamp is computed as (stream->start_time + frame->pts) multiplied by stream->time_base to get the
absolute timestamp in seconds since the epoch.
Wait, no: 'stream->start_time' is already an absolute timestamp in terms of the stream's time base. So when you add frame->pts, which is
relative, you get the total in the stream's time units. Then, multiplying by stream->time_base as a double (num/den) gives seconds.
But to get this into the system's time (e.g., UNIX timestamp), that's another matter. Because 'stream->start_time' could be based on some
external reference, like the file's metadata start time.
Alternatively, perhaps the demuxer sets AVStream::start_time as an absolute value in stream units, and then when you convert it to seconds
since epoch, you need to have a way to relate the two.
This is getting too tangled. Maybe I should refer to a code example.
Looking at FFmpeg's 'decode.c', which is an example program:
In that example, for each decoded frame, they might print the time.
But perhaps in practice, it's not straightforward to get the exact timestamp because of the way packets are processed and how the demuxer
handles timing.
Alternatively, perhaps using the 'AVFrame::pkt_pts' field, which is in terms of the stream's time base, then:
int64_t pts = frame->pkt_pts;
double tb_num = stream->time_base.num;
double tb_den = stream->time_base.den;
pts_time = (pts * tb_num) / tb_den;
But this is still not the absolute system time.
Wait, I think that 'stream->start_time' is in terms of the same units as frame->pkt_pts. So if you have a frame with pkt_pts x, then its
absolute timestamp is stream->start_time + x multiplied by (num/den), which gives seconds since epoch.
So yes, in code:
int64_t abs_presentation_time = av_rescale_q(frame->pts, stream->time_base, (AVRational){1, 1});
But wait, no: 'av_rescale_q' converts frame->pts from the stream's time base to another. If I set the other as (1,1), it's in seconds since
epoch.
Wait, no:
The function 'av_rescale_q' is used for re-scaling a value with one rational to another.
So if you have a value 'x' which is in units of 'src', then to get it into 'dst' units, use av_rescale_q(x, src, dst).
In this case, frame->pts is in stream->time_base units. To convert it into seconds since epoch (1/1), we can do:
int64_t time_in_seconds = av_rescale_q(frame->pts, stream->time_base, (AVRational){1, 1});
But wait, that would give us the frame's timestamp in seconds since the start of the stream. But to get the absolute system time, we need to
add the start time.
Wait, no: The 'stream->start_time' is already the absolute start time in terms of the stream's time base. So adding it and then scaling gives
the absolute time.
So:
int64_t absolute_time = av_rescale_q(stream->start_time + frame->pts, stream->time_base, (AVRational){1, 1});
Then, 'absolute_time' is the timestamp in seconds since epoch.
But wait, this is only true if 'stream->start_time' is an absolute timestamp. But according to FFmpeg's documentation, 'stream->start_time' can
be set by the demuxer and represents the DTS of the first frame in the stream.
So, when you add that to each frame's PTS, you get the presentation time for that frame relative to the start of the stream, which is then
converted into seconds since epoch.
Wait, but how do I know what 'stream->start_time' is set to? It depends on the demuxer and the file format. For example, in a .mov file, it
might be based on the movie's time scale.
But if you want to output the timestamp as seen by the system (i.e., seconds since epoch), you need to know when the stream started relative to
the system clock. That would require knowing the start time of the demuxer process and relating it to the absolute timestamps.
This seems complicated, but perhaps in practice, for many cases, 'stream->start_time' is zero or represents some offset that can be added.
Alternatively, maybe using the AVFormatContext's start_time:
The 'AVFormatContext' has a 'start_time' field which is the overall start time of the demuxing process, and it's in terms of seconds since
epoch.
But I'm not sure if this is directly useful for getting individual frame timestamps.
Perhaps a better approach is to refer to the following method:
Each frame's timestamp can be calculated by taking its 'pkt_pts' (which is relative to the stream's start) and adding the stream's start time,
then converting that into seconds since epoch using the stream's time base.
So in code:
int64_t frame_time = av_rescale_q(frame->pts, stream->time_base, (AVRational){1, 1});
Then, add 'stream->start_time' which is already scaled to the same units?
Wait, no. I think 'stream->start_time' is also in stream time base.
So:
int64_t absolute_frame_time = av_rescale_q(stream->start_time + frame->pts, stream->time_base, (AVRational){1, 1});
This should give us the timestamp in seconds since epoch.
Alternatively, perhaps I'm overcomplicating it. Maybe in practice, you can use 'frame->pkt_pts' and convert it using the time base to get the
presentation time relative to the stream's start, then add the stream's start_time which is also in the same units, then convert that to
seconds since epoch.
So putting it all together:
int64_t pts = frame->pts;
int64_t scaled_pts = av_rescale_q(pts, stream->time_base, (AVRational){1, 1});
Then, if 'stream->start_time' is the start of the stream in terms of the same units, adding that would give the absolute time:
int64_t absolute_time = av_rescale_q(stream->start_time + pts, stream->time_base, (AVRational){1, 1});
Wait, but 'av_rescale_q' takes an integer. So perhaps it's better to do the calculation step by step.
Another approach is to use 'av_gettime()' to get the current time in microseconds and track the timestamps that way, but that might not give
accurate frame times.
Alternatively, using the 'AVFrame::pts' and converting it with the stream's time base gives you the presentation time relative to the start of
the stream. Then, if you know when the stream started (e.g., from the demuxer), you can add that offset to get the absolute system time.
But this requires knowing the start time of the stream in real-world terms, which might not be directly available unless you query it from the
demuxer or have metadata about the file's creation time.
In conclusion, to calculate a frame's timestamp as seconds since epoch using FFmpeg libraries:
1. Get the frame's 'pts' value (frame->pts).
2. Use the stream's time base (stream->time_base) to convert this into seconds.
3. Add the stream's start_time, which is also in the same units, to get the absolute timestamp in the stream's timebase.
4. Convert this absolute timestamp from the stream's timebase to real-world seconds.
Wait, no: 'stream->start_time' is already an absolute value in terms of the stream's timebase, so adding frame->pts gives the presentation time
relative to the epoch.
But perhaps I'm getting confused here. Let me think again.
The 'pts' of a frame is its position within the stream, relative to the stream's start (which could be at any real-world time). The stream's
'start_time' is the real-world timestamp corresponding to the beginning of the stream.
So, for example, if a stream starts at 10 seconds since epoch and has a time base of 1/1000 (ms), then frame->pts would be in ms. So, adding
stream->start_time (which might be 10*1000) to the frame's pts gives the absolute ms, which can then be converted to seconds.
Wait, that makes sense.
So step-by-step:
- The stream has a time base (e.g., 1/1000 for milliseconds).
- Each frame's 'pts' is in units of this time base.
- The stream's 'start_time' is the real-world timestamp when the first frame was supposed to be displayed, also expressed in these units.
Wait, no. According to FFmpeg documentation, 'AVStream::start_time' is the time (in seconds) at which the stream starts. So if it's set to
10.0, then a frame with 'pts' of 0 would correspond to 10.0 seconds since epoch.
So to get the real-world timestamp for a frame:
- Convert the frame's 'pts' from the stream's time base units into seconds.
- Add that to the stream's start_time (which is already in seconds) to get the absolute timestamp.
Wait, no: The stream's start_time is in seconds since epoch. Frame's 'pts' is in terms of the stream's timebase, which could be different.
So frame->pts is an integer representing a value in units equal to 1/(stream->time_base.den). So if the time base is 1/1000 (ms), then each
increment is one millisecond.
To convert this into seconds since epoch:
- Convert frame->pts from stream timebase to real-world seconds by multiplying by stream->time_base.den / stream->time_base.num. But in code,
we can use 'av_rescale' or 'av_q2d'.
Wait, more accurately: The formula is (pts * tb_num) / tb_den, where tb is the time base.
But maybe using the 'AV_TIME_BASE' which is 1000000 microseconds.
I think I need to clarify this with code examples.
So, for example:
stream_time_base = stream->time_base; // e.g., (num=1, den=1000) meaning 1/1000 seconds per unit
pts_value = frame->pts;
Then, to convert pts_value into real-world time in seconds since epoch, you do:
absolute_time_in_seconds = (stream_start_time + (pts_value * stream_time_base.den / stream_time_base.num))
Wait, no: Let me think about this.
If the stream's start_time is t0 (in seconds), and each unit of pts is 1/tb seconds, where tb = num/den.
So:
real_time = t0 + (pts_value * tb_den) / tb_num
Because if tb is 1/1000, then each pts increment is 1 ms, so you multiply by 1000 to get ms and add to t0*1000.
Wait, perhaps it's easier to handle in microseconds:
t0 = stream_start_time (in seconds) * 1e6
pts_in_us = frame->pts * tb_num / tb_den * 1e6
Then, absolute_time_in_us = t0 + pts_in_us
absolute_time_in_seconds = absolute_time_in_us / 1e6
Wait, I'm getting tangled here.
Perhaps a better approach is to use the following steps:
1. Convert frame->pts from stream timebase units to seconds.
Since time_base is a fraction (num/den), each unit is den/num seconds. So:
pts_in_seconds = pts_value * (den / num)
2. Add this to the stream's start_time, which is already in seconds.
absolute_time = stream_start_time + pts_in_seconds
So for example, if the stream starts at t0=10.5 seconds and has a time base of 1/1000 (ms), then:
pts_value=5 would be 5 ms after t0, so absolute_time = 10.5 + 0.005 = 10.505 seconds.
So in code, how to do this?
In FFmpeg, 'stream_start_time' is stored as an int64_t representing the time in seconds. Similarly, frame->pts is a uint64_t (can be signed or
unsigned depending on context).
The stream's time_base is a rational number represented by num and den.
So to convert frame->pts into seconds:
pts_seconds = ((int64_t)frame->pts * stream->time_base.den) / stream->time_base.num;
Then, absolute_time = stream_start_time + pts_seconds
But wait: in FFmpeg, 'stream_start_time' is the time when the stream starts (the first frame's display time), and it's stored as an int64_t
representing seconds. So if a frame has pts 0, its absolute time is stream_start_time.
Yes, that makes sense.
So putting this into code:
int64_t stream_start = stream->start_time; // in seconds
AVRational tb = stream->time_base;
// Convert frame's pts to seconds
int64_t pts_seconds = (frame->pts * tb.den) / tb.num;
int64_t absolute_time = stream_start + pts_seconds;
But wait, this assumes that the division is exact. Since tb can be a fraction like 1/1000, we have to make sure that the multiplication by den
doesn't cause overflow and is correctly divided.
Alternatively, using the 'av_rescale' function would handle this more safely:
pts_seconds = av_rescale(frame->pts, tb.den, tb.num);
But I'm not entirely sure. Let me check FFmpeg's documentation or source code.
In FFmpeg, 'av_gettime()' returns the current time in microseconds, and there are utility functions for handling time calculations.
Wait, another approach: use the function 'avcodec_get_frame_number' (though that might be deprecated) or track timestamps manually if using a
decoder.
Alternatively, when demuxing with libavformat, each packet has a timestamp which can be converted to real-time using the stream's start_time
and time_base.
So for packets:
pkt_ts = av_rescale_q(pkt->pts, stream->time_base, (AVRational){1, AV_TIME_BASE});
But this converts the packet's pts into microseconds since the epoch.
Wait, perhaps a better way is to use 'av_gettime()' and track the encoding/decoding time, but that might not be accurate.
I think the correct formula is:
absolute_time = stream_start + (frame->pts * tb.den) / tb.num;
But let me test with an example.
Suppose stream starts at t0=5.0 seconds, and has a time base of 1/1000 (i.e., each unit is 1ms).
A frame has pts=10: this should correspond to 5.0 + 0.010 = 5.010 seconds.
So:
pts_seconds = 10 * 1000 / 1 = 10000? Wait, no:
Wait, wait, stream->time_base is (num=1, den=1000), meaning each unit is 1/1000 of a second.
So to convert pts units to seconds: multiply by den and divide by num.
pts_seconds = frame->pts * tb.den / tb.num
In this case, tb.den is 1000, tb.num is 1.
So frame->pts=10 => 10*1000/1 = 10000 seconds? That can't be right because that would make the absolute time 5 + 10000 = 10005, which is way
off.
Wait, no: I think I have this backwards. The time_base is usually expressed as (num/den), where num is the numerator and den is the
denominator. So a time base of 1/1000 is represented as num=1, den=1000.
Each frame's pts is in units of 'time_base', so to get seconds: multiply by den and divide by num, but wait:
If tb = num/den, then each unit is tb. So to convert a value in these units to seconds, you do (value * tb). Because tb is (num/den), which
could be less than 1.
Wait, no: the time base is typically such that 1 second has (den / num) units. For example, if tb is 1/1000, then each second has 1000 units.
So to convert a pts value to seconds: (pts * tb.den) / tb.num
Wait no:
No, the formula should be:
seconds = (pts * time_base.den) / time_base.num
Because time_base is num/den. So if time_base is 1/1000, then den is 1000 and num is 1.
So seconds = pts * 1000 / 1 = pts*1000.
Wait that can't be right because if I have a frame with pts=1 in such a stream, it would be at 1000 seconds after the start time, which is not
correct.
I think I must be confusing numerator and denominator.
Let me check FFmpeg's documentation on time bases.
According to FFmpeg's source code comments:
The time base for a stream is given as a fraction of two integers: num (the numerator) and den (the denominator). The actual time in seconds is
calculated by multiplying the pts value by (den / num).
Wait no, wait. Let me think again.
Suppose I have a time_base of 1/1000, which means that each unit represents 1/1000 seconds. So one second has 1000 units.
Therefore, to convert from units to seconds: multiply by (den / num) = 1000/1=1000.
So frame->pts is in units where each unit is 1/1000 seconds.
Thus:
pts_seconds = (frame->pts * den) / num
In this case, den=1000, num=1: pts_seconds = frame->pts*1000.
But that would make a frame with pts=1 last at 1000 seconds after the stream's start time. That seems too long if the stream is say, video with
high frame rate.
Wait, maybe I'm mixing up how the time base works for different components like audio and video.
Alternatively, perhaps it's better to consider that:
time_base = num / den
So 1 second = (den / num) * time_base units.
For example, if num=1, den=1000: time_base is 0.001 seconds per unit. So each unit is a millisecond.
Thus, frame->pts in these units would be:
To get the time in seconds since the stream start: pts * (time_base) = pts * (num / den) ?
Wait, no:
No, I think it's:
pts_time_in_seconds = (pts * tb.num) / tb.den
Because if tb is 1/1000, then each unit is 1e-3 seconds.
So for example, if frame->pts=1000: 1000*(1/1000)=1 second.
Similarly, pts=500 would be 0.5 seconds.
Wait that makes sense.
Because tb = num/den = 1/1000, so:
pts_time_in_seconds = (pts * num) / den
So yes, in this case, the formula is (pts * num)/den.
Therefore, absolute time since stream start is stream_start + (pts * num)/den.
Wait that seems to make sense now. So with tb=1/1000:
If frame->pts=500: 500*(1)/1000 = 0.5 seconds after the stream's start.
So if the stream started at t=10, then absolute time is 10 + 0.5 = 10.5 seconds.
Yes, that makes sense.
So to convert frame->pts to seconds since the start of the stream: (frame->pts * tb.num) / tb.den
And then add the stream_start_time to get the absolute time in seconds.
Wait, but in the initial example where I had tb=1/1000:
If a frame has pts=10, that's 10*(1)/1000 = 0.01 seconds after the stream start.
So for a stream starting at t=5, absolute time is 5 + 0.01 = 5.01 seconds.
That makes sense.
Wait, but then why do I sometimes see the formula as (pts * tb.den)/tb.num?
I think that's when people are confused between numerator and denominator.
So in summary:
The correct way to convert pts units to seconds is:
seconds_since_stream_start = (pts * tb.num) / tb.den
Where tb.num is numerator, tb.den is denominator of the time base.
Therefore, absolute_time = stream_start_time + (pts * tb.num)/tb.den
But I think in practice, when working with FFmpeg or GStreamer, sometimes the way the time is handled might differ based on how libraries
represent the time base.
Alternatively, perhaps the formula is:
seconds_since_stream_start = pts / tb.denominator * tb.numerator ?
Wait no, because that would be (pts * num)/den, which aligns with what I concluded earlier.
So to make it concrete, let's say we have a stream with time_base = 1/1000 (num=1, den=1000).
Frame has pts=500: 500 / 1000 * 1 = 0.5 seconds after the start.
If the stream started at t=10 (absolute), then absolute_time is 10 + 0.5 = 10.5.
Yes, that seems correct.
Another example:
time_base is 1/2: num=1, den=2.
Each unit represents 0.5 seconds.
pts=1: 1*(1)/2 = 0.5 seconds.
pts=2: 2*1/2=1 second.
So yes, this seems correct.
Therefore, the formula is:
absolute_time = stream_start + (frame->pts * tb.num) / tb.den
But wait in GStreamer or FFmpeg, sometimes the time base can be something else. Let me think about how these libraries present it.
In GStreamer, for example, a video stream might have a time_base of 1/60 if the frame rate is 60 FPS.
So each unit represents 1/60 seconds.
Therefore, a frame with pts=1 would be at (1 * 1)/60 = 1/60 seconds after the start.
But for video, the frame rate determines how many frames per second, so 60 fps means each frame is 1/60 seconds apart. So indeed, the time_base
is 1/60.
So in this case, to get the absolute time:
stream_start_time + (pts * tb.num)/tb.den
Yes, that's correct.
But sometimes, especially in some code examples I've seen, people calculate as:
seconds = pts / tb.denominator * tb.numerator ?
Wait no, because that would be (pts * num) / den, which is the same as before. So it doesn't matter how you arrange it; the result is the same.
But wait, perhaps in code, sometimes people represent time_base as a double and compute:
seconds = pts * time_base;
But if time_base is represented as 0.001 for tb=1/1000, then yes: seconds = pts * 0.001.
Which again aligns with (pts * num)/den when num=1 and den=1000.
So in code:
double time_base = (double)num / (double)den;
seconds_since_stream_start = pts * time_base;
Yes, that makes sense.
Therefore, the general formula is:
absolute_time = stream_start + (pts * (tb_num / tb_den))
Where tb_num and tb_den are numerator and denominator of the time base.
But perhaps in some code, especially when dealing with integer arithmetic, people might compute it as (pts * tb_num) divided by tb_den to avoid
floating points until necessary.
So overall, I think that's the correct formula.
</think>
To calculate the absolute time given a frame's `pts` and the stream's time base (`tb_num`/`tb_den`), follow these steps:
1. **Convert `pts` to Seconds:**
- Multiply the `pts` by the numerator of the time base (`tb_num`).
- Divide the result by the denominator of the time base (`tb_den`).
2. **Add Stream Start Time:**
- Add the result from step 1 to the stream's start time.
**Formula:**
\[ \text{absolute\_time} = \text{stream\_start} + \left( \frac{\text{pts} \times \text{tb\_num}}{\text{tb\_den}} \right) \]
**Example:**
- **Stream Start Time:** 10 seconds
- **Time Base (tb_num/tb_den):** 1/60 (e.g., for a video stream at 60 FPS)
- **Frame's pts:** 60
**Calculation:**
\[ \text{absolute\_time} = 10 + \left( \frac{60 \times 1}{60} \right) = 10 + 1 = 11 \text{ seconds} \]
This method accurately computes the absolute time a frame is displayed, considering both the stream's start time and its timing specifics.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment