Improve the audio & video RTP timestamps

* Video: instead of using now() when the RTP packet is created, use the earlier packet->frame_timestamp that we're already collecting for host latency stats. This timestamp should be more accurate to when we captured the frame. I am not sure if all backends support this timestamp, so there is a fallback to the current method. * Audio: fix bug where the RTP timestamps stop advancing when no audio is being sent. Audio now uses the same timer-based source for this field, but remains as milliseconds. * Both use steady_clock and microseconds / 1000 to maintain precision. Video still uses 90khz time base per the spec, and audio is effectively at packet duration resolution of 5ms/10ms. Known issue: I had to comment out one assert in Moonlight, because this patch changes one of the FEC timestamps. The FEC RFC says "The timestamp SHALL be set to a time corresponding to the repair packet's transmission time. Note that the timestamp value has no use in the actual FEC protection process and is usually useful for jitter calculations.", so I don't think this assert is correct anyway. This probably needs to be resolved before this can be accepted, either by only sending correct timestamp packets to newer Moonlight versions, or fixing it in the FEC code. RtpAudioQueue.c:300 LC_ASSERT_VT(existingBlock->fecHeader.baseTimestamp == fecBlockBaseTs);
LizardByte · Jan 4, 2025 · b517d38 · b517d38
1 parent 151ff8f
commit b517d38
Show file tree

Hide file tree

Showing 2 changed files with 22 additions and 9 deletions.
diff --git a/.gitignore b/.gitignore
@@ -56,3 +56,7 @@ package-lock.json
 # Python
 *.pyc
 venv/
+
+# Syncthing
+.stfolder/
+.stversions/
diff --git a/src/stream.cpp b/src/stream.cpp
@@ -365,12 +365,12 @@ namespace stream {
 
     struct {
       crypto::cipher::cbc_t cipher;
-      std::string ping_payload;
 
-      std::uint16_t sequenceNumber;
+      std::string ping_payload;
       // avRiKeyId == util::endian::big(First (sizeof(avRiKeyId)) bytes of launch_session->iv)
       std::uint32_t avRiKeyId;
-      std::uint32_t timestamp;
+      std::uint16_t sequenceNumber;
+      std::chrono::steady_clock::time_point timestamp_epoch;
       udp::endpoint peer;
 
       util::buffer_t<char> shards;
@@ -1273,7 +1273,7 @@ namespace stream {
   videoBroadcastThread(udp::socket &sock) {
     auto shutdown_event = mail::man->event<bool>(mail::broadcast_shutdown);
     auto packets = mail::man->queue<video::packet_t>(mail::video_packets);
-    auto timebase = boost::posix_time::microsec_clock::universal_time();
+    auto timestamp_epoch = std::chrono::steady_clock::now();
 
     // Video traffic is sent on this thread
     platf::adjust_thread_priority(platf::thread_priority_e::high);
@@ -1478,8 +1478,13 @@ namespace stream {
             auto *inspect = (video_packet_raw_t *) shards.data(x);
 
             // RTP video timestamps use a 90 KHz clock
-            auto now = boost::posix_time::microsec_clock::universal_time();
-            auto timestamp = (now - timebase).total_microseconds() / (1000 / 90);
+            std::uint32_t timestamp = static_cast<std::uint32_t>(
+              std::chrono::duration_cast<std::chrono::microseconds>(
+                packet->frame_timestamp
+                  ? *packet->frame_timestamp - timestamp_epoch
+                  : std::chrono::steady_clock::now() - timestamp_epoch // is this fallback needed?
+              ).count() / (1000.0 / 90)
+            );
 
             inspect->packet.fecInfo =
               (x << 12 |
@@ -1628,7 +1633,12 @@ namespace stream {
       auto session = (session_t *) channel_data;
 
       auto sequenceNumber = session->audio.sequenceNumber;
-      auto timestamp = session->audio.timestamp;
+      // Audio timestamps are in milliseconds and should be AudioPacketDuration (5ms or 10ms) apart
+      std::uint32_t timestamp = static_cast<std::uint32_t>(
+        std::chrono::duration_cast<std::chrono::microseconds>(
+          std::chrono::steady_clock::now() - session->audio.timestamp_epoch
+        ).count() / 1000.0
+      );
 
       *(std::uint32_t *) iv.data() = util::endian::big<std::uint32_t>(session->audio.avRiKeyId + sequenceNumber);
 
@@ -1645,7 +1655,6 @@ namespace stream {
       audio_packet.rtp.timestamp = util::endian::big(timestamp);
 
       session->audio.sequenceNumber++;
-      session->audio.timestamp += session->config.audio.packetDuration;
 
       auto peer_address = session->audio.peer.address();
       try {
@@ -2062,7 +2071,7 @@ namespace stream {
       session->audio.ping_payload = launch_session.av_ping_payload;
       session->audio.avRiKeyId = util::endian::big(*(std::uint32_t *) launch_session.iv.data());
       session->audio.sequenceNumber = 0;
-      session->audio.timestamp = 0;
+      session->audio.timestamp_epoch = std::chrono::steady_clock::now();
 
       session->control.peer = nullptr;
       session->state.store(state_e::STOPPED, std::memory_order_relaxed);