2024-09-04
GStreamer and WebRTC HTTP Signalling

The WebRTC nerds among us will remember that the first thing we learned about WebRTC is that it is a specification for peer-to-peer communication of media and data, but that it does not specify how signalling is done.

Or, put more simply: if you want to call someone on the web, WebRTC tells you how to transfer audio, video, and data, but it leaves out how you make the call itself. How do you locate the person you're calling, let them know you'd like to call them, and complete the few remaining steps before you can see and talk to each other?

WebRTC signalling

While this allows services to provide their own mechanisms to manage how WebRTC calls work, the lack of a standard mechanism means that general-purpose applications need to individually integrate each service that they want to support. For example, GStreamer's webrtcsrc and webrtcsink elements support various signalling protocols, including Janus Video Rooms, LiveKit, and Amazon Kinesis Video Streams.

However, having a standard way for clients to do signalling would help developers focus on their application and worry less about interoperability with different services.

Standardising Signalling

With this motivation, the IETF WebRTC Ingest Signalling over HTTPS (WISH) working group has been working on two specifications:

  • WHIP, the WebRTC-HTTP ingestion protocol, and
  • WHEP, the WebRTC-HTTP egress protocol

(author's note: the puns really do write themselves :))

As the names suggest, the specifications provide a way to perform signalling using HTTP. WHIP gives us a way to send media to a server, to ingest into a WebRTC call or live stream, for example.

Conversely, WHEP gives us a way for a client to use HTTP signalling to consume a WebRTC stream -- for example to create a simple web-based consumer of a WebRTC call, or tap into a live streaming pipeline.
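
Under the hood, the signalling in both cases boils down to a simple HTTP exchange. As a rough sketch of a WHIP session using curl (the URLs here are made up for illustration): the client POSTs an SDP offer and receives an SDP answer in a 201 Created response, along with a Location header identifying the session.

    curl -i "https://my.webrtc/whip/room1" \
      -H "Content-Type: application/sdp" \
      --data-binary @offer.sdp

    # Tearing down the session is a DELETE on the returned Location URL.
    curl -X DELETE "https://my.webrtc/whip/room1/sessions/abc123"

WHEP follows the same pattern, with the client receiving media instead of sending it.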

WHIP and WHEP

With this view of the world, WHIP and WHEP can be used not only for calling applications, but also as an alternative way to ingest or play back live streams, with lower latency and a near-ubiquitous real-time communication API.

In fact, several services already support this, including Dolby Millicast, LiveKit, and Cloudflare Stream.

WHIP and WHEP with GStreamer

GStreamer already provides developers with two ways to work with WebRTC streams:

  • webrtcbin: provides a low-level API, akin to the PeerConnection API that browser-based users of WebRTC will be familiar with

  • webrtcsrc and webrtcsink: provide high-level elements that can respectively receive media from and send media to a WebRTC endpoint

At Asymptotic, my colleagues Tarun and Sanchayan have been using these building blocks to implement GStreamer elements for both the WHIP and WHEP specifications. You can find these in the GStreamer Rust plugins repository.

Our initial implementations were based on webrtcbin, but have since been moved over to the higher-level APIs to reuse common functionality (such as automatic encoding/decoding and congestion control). Tarun covered our work in a talk at last year's GStreamer Conference.

Today, we have 4 elements implementing WHIP and WHEP.

Clients

  • whipclientsink: This is a webrtcsink-based implementation of a WHIP client, which you can use to send media to a WHIP server. For example, streaming your camera to a WHIP server is as simple as:

    gst-launch-1.0 -e \
      v4l2src ! video/x-raw ! queue ! \
      whipclientsink signaller::whip-endpoint="https://my.webrtc/whip/room1"
    
  • whepclientsrc: This is work in progress and allows us to build player applications to connect to a WHEP server and consume media from it. The goal is to make playing a WHEP stream as simple as:

    gst-launch-1.0 -e \
      whepclientsrc signaller::whep-endpoint="https://my.webrtc/whep/room1" ! \
      decodebin ! autovideosink
    

The client elements fit quite neatly into how we might imagine GStreamer-based clients could work. You could stream arbitrary stored or live media to a WHIP server, and play back any media a WHEP server provides. Both pipelines implicitly benefit from GStreamer's ability to use hardware-acceleration capabilities of the platform they are running on.
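
For instance, a sketch for streaming a local file to a WHIP server might look like this (the file path is a placeholder, and we assume a video-only file for brevity):

    gst-launch-1.0 -e \
      filesrc location=talk.mp4 ! decodebin ! videoconvert ! queue ! \
      whipclientsink signaller::whip-endpoint="https://my.webrtc/whip/room1"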

GStreamer WHIP/WHEP clients

Servers

  • whipserversrc: Allows us to create a WHIP server to which clients can connect and send media; each incoming stream is exposed as a GStreamer pad that can be arbitrarily routed and combined as required. We have an example server that can play all the streams being sent to it.

  • whepserversink: Finally, we have ongoing work to publish arbitrary streams over WHEP for web-based clients to consume.

The two server elements open up a number of interesting possibilities. We can ingest arbitrary media with WHIP, and then decode and process it, or forward it, depending on what the application requires. We expect that the server API will grow over time, based on the different kinds of use cases we wish to support.
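
As a server-side sketch, something along these lines should accept a WHIP client and display its video; note that the signaller property name and the address below are assumptions, so check gst-inspect-1.0 whipserversrc for the actual API:

    gst-launch-1.0 \
      whipserversrc signaller::host-addr="http://127.0.0.1:8190" ! \
      videoconvert ! autovideosink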

GStreamer WHIP/WHEP server

This is all pretty exciting, as we have all the pieces to create flexible pipelines for routing media between WebRTC-based endpoints without having to worry about service-specific signalling.

If you're looking for help realising WHIP/WHEP based endpoints, or other media streaming pipelines, don't hesitate to reach out to us!

2024-07-30
Notes from GStreamer spring hackfest of 2024

Some time has passed since the GStreamer spring hackfest took place in Thessaloniki, Greece in May. This was my second time attending a GStreamer hackfest, and I thought I would summarize some of my thoughts this time around.

Thanks

Before getting into the details, I want to send out a thank you to:

  • The GStreamer foundation for sponsoring the event as a whole
  • Sebastian, Vivia and Jordan for making all the arrangements
  • Asymptotic, for sponsoring my presence at the event

The event

At the hackfest.

It was good to see some familiar faces at the event, folks whom I had met at the previous hackfest and conference. It is also nice when you finally meet people you have only conversed with online and get to put a face to the persona you have been conversing with.

Work

Originally the plan was to work on adding stream multiplexing support to the QUIC elements. However, I had not pushed some of that work from my desktop to GitLab, so I decided to take it up later.

HTTP Live Streaming (HLS)

A merge request adding multi-variant playlist support for HLS has been pending review for a while. One of the features missing from that merge request was support for codec string generation when using MPEG-TS with H.264 and H.265, so I decided to work on that.

H.264 and H.265 have what are known as stream-formats: the bitstream can be stream oriented or packet oriented. In the former case, the stream-format is said to be byte-stream; in the latter, it is avc (hvc1 or hev1 for H.265). With byte-stream, the required parameter sets are sent in-band with the video. With avc, GStreamer conveys the video metadata out-of-band, via an additional caps field named codec_data. In other words, codec_data is only present when the video is packet oriented, and its value represents an AVCDecoderConfigurationRecord structure.
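
The two stream-formats are easy to see in action: h264parse can convert between them, and running a pipeline like the following with -v shows codec_data appear in the negotiated caps (the input file name is a placeholder):

    gst-launch-1.0 -v filesrc location=in.ts ! tsdemux ! h264parse ! \
      'video/x-h264,stream-format=avc' ! fakesink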

GStreamer already has helper functions in its codec utilities which can provide information like the profile and level required for constructing codec strings. However, these helper functions require codec_data to be present.

When using MPEG-TS as the container with H.264 or H.265, the only possible stream-format is byte-stream. In this case, one needs to parse the in-band information to extract details like the profile and level, or other video metadata. In Rust, there is the cros-codecs crate, which has a parser module. Using this, it was easy to parse the in-band data and then generate the codec string required for the HLS playlist (for example, avc1.64001F for H.264 High profile at level 3.1).

Threadshare

Before the hackfest, I had spent some time understanding the threadshare elements. At the hackfest, I met François Laignel, who helped clear up the doubts I had about how some of the threadshare code is laid out.

If you are interested in understanding what makes the threadshare elements different, I highly recommend going through the blog post here.

Some end-of-stream handling was missing from the threadshare tcpclientsrc and udpsrc elements. I spent some time adding support for that, which has now been merged upstream.

Play

After the three days of the hackfest, a day trip was planned to the Palace of Aigai.

GStreamer hackers & co. exploring the Palace of Aigai.

Conclusion

All in all, this turned out to be a productive and fun-filled hackfest. I also have to add that Greek cuisine is excellent, and I look forward to the next hackfest and to visiting Thessaloniki and Greece again.

2024-05-07
Writing a simple PipeWire parametric equalizer module

Motivation

When using headphones or in-ear monitors (IEMs), one might want to apply EQ to them. Equalization, or EQ, is the process of adjusting the volume of different frequency bands in an audio signal. Popular EQ software includes EasyEffects on Linux and Equalizer APO on Windows. PipeWire supports EQ via the filter-chain module.

For an understanding of EQ, the following resources might help.

The basic idea is that there are some “standard” frequency response curves that might sound good to different individuals, and knowing the frequency response characteristics of a specific headphone/IEM model, you can apply a set of filters via an equalizer to achieve something close to the “standard” frequency response curve that sounds good to you.

Websites like Squig or autoeq.app generate a file for parametric equalization for a given target, but this isn't a format that can be directly given to the filter-chain module. Squig is also useful for evaluating the frequency response curves of various in-ear monitors and headphones when making buying decisions.

An example of a parametric EQ file generated by either AutoEQ or Squig looks like the following.

Preamp: -6.8 dB
Filter 1: ON PK Fc 20 Hz Gain -1.3 dB Q 2.000
Filter 2: ON PK Fc 31 Hz Gain -7.0 dB Q 0.500
Filter 3: ON PK Fc 36 Hz Gain 0.7 dB Q 2.000
Filter 4: ON PK Fc 88 Hz Gain -0.4 dB Q 2.000

Fc is the center frequency; Gain is the amount by which the signal is boosted or attenuated around that frequency. The Q factor controls the bandwidth around the center frequency. To be more precise, Q is the ratio of center frequency to bandwidth: for a fixed center frequency, the bandwidth is inversely proportional to Q, so raising the Q narrows the bandwidth. For example, a filter with Fc = 1000 Hz and Q = 2 acts on a band roughly 1000 / 2 = 500 Hz wide. Q is by far the most useful tool a parametric EQ offers, allowing one to attenuate or boost a narrow or wide range of frequencies within each EQ band.

If one wants to build better intuition for this, it helps to play around with the filter type and parameters here and see the effect on the frequency response. The linked article also goes into the basics of filters.

EasyEffects allows importing such a file via its Import APO option; however, one might want to use an EQ input like this directly in PipeWire without having to resort to additional software like EasyEffects. That said, while testing, trying out multiple EQs is definitely much easier with the EasyEffects GUI.

This file needs to be converted manually into something the filter-chain module can accept.

To simplify this, we implemented a simple PipeWire module which reads a parametric EQ text file like the one above and loads the filter-chain module, translating the inputs from the text file into what the filter-chain module expects.
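
For reference, what the module generates is conceptually similar to a hand-written filter-chain configuration. A trimmed-down sketch covering the preamp and the first two filters above, using the builtin biquad plugins, might look like the following (node names are arbitrary, and the exact properties should be checked against the module-filter-chain documentation):

    context.modules = [
      { name = libpipewire-module-filter-chain
        args = {
          node.description = "Parametric EQ sink"
          filter.graph = {
            nodes = [
              # The preamp, expressed as a high-shelf gain over the whole band
              { type = builtin label = bq_highshelf name = preamp
                control = { "Freq" = 0.0 "Q" = 1.0 "Gain" = -6.8 } }
              { type = builtin label = bq_peaking name = eq_band_1
                control = { "Freq" = 20.0 "Q" = 2.0 "Gain" = -1.3 } }
              { type = builtin label = bq_peaking name = eq_band_2
                control = { "Freq" = 31.0 "Q" = 0.5 "Gain" = -7.0 } }
            ]
            links = [
              { output = "preamp:Out"    input = "eq_band_1:In" }
              { output = "eq_band_1:Out" input = "eq_band_2:In" }
            ]
          }
          capture.props  = { node.name = "eq_input"  media.class = Audio/Sink }
          playback.props = { node.name = "eq_output" node.passive = true }
        }
      }
    ]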

Continue reading

2024-03-19
Asymptotic: A 2023 Review

It's been a busy several months, but now that we have some breathing room, I wanted to take stock of what we have done over the last year or so.

This is a good thing for most people and companies to do of course, but being a scrappy, (questionably) young organisation, it's doubly important for us to introspect. This allows us to both recognise our achievements and ensure that we are accomplishing what we have set out to do.

One thing that is clear to me is that we have been lagging in writing about some of the interesting things that we have had the opportunity to work on, so you can expect to see some more posts expanding on what you find below, as well as some of the newer work that we have begun.

(note: I write about our open source contributions below, but needless to say, none of it is possible without the collaboration, input, and reviews of members of the community)

WHIP/WHEP client and server for GStreamer

If you're in the WebRTC world, you likely have not missed the excitement around standardisation of HTTP-based signalling protocols, culminating in the WHIP and WHEP specifications.

Tarun has been driving our client and server implementations for both these protocols, and in the process has been refactoring some of the webrtcsink and webrtcsrc code to make it easier to add more signaller implementations. You can find out more about this work in his talk at GstConf 2023 and we'll be writing more about the ongoing effort here as well.

Low-latency embedded audio with PipeWire

Some of our work involves implementing a framework for very low-latency audio processing on an embedded device. PipeWire is a good fit for this sort of application, but we have had to implement a couple of features to make it work.

It turns out that doing timer-based scheduling can be more CPU intensive than ALSA period interrupts at low latencies, so we implemented an IRQ-based scheduling mode for PipeWire. This is now used by default when a pro-audio profile is selected for an ALSA device.

In addition to this, we also implemented rate adaptation for USB gadget devices using the USB Audio Class "feedback control" mechanism. This allows USB gadget devices to adapt their playback/capture rates to the graph's rate without having to perform resampling on the device, saving valuable CPU and latency.

There is likely still some room to optimise things, so expect to hear more on this front soon.

Compress offload in PipeWire

Sanchayan has written about the work we did to add support in PipeWire for offloading compressed audio. This is something we explored in PulseAudio (there's even an implementation out there), but it's a testament to the PipeWire design that we were able to get this done without any protocol changes.

This should be useful in various embedded devices that have both the hardware and firmware to make use of this power-saving feature.

GStreamer LC3 encoder and decoder

Tarun wrote a GStreamer plugin implementing the LC3 codec using the liblc3 library. This is the primary codec for next-generation wireless audio devices implementing the Bluetooth LE Audio specification. The plugin is upstream and can already be used to encode and decode LC3 data, but it will likely become more useful when the existing Bluetooth plugins that talk to Bluetooth devices gain LE Audio support.
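
A quick way to try the codec is an encode/decode round trip. A sketch like the following should work, though the element names (lc3enc and lc3dec) and the exact caps they accept are assumptions to verify with gst-inspect-1.0:

    gst-launch-1.0 audiotestsrc ! audio/x-raw,rate=48000,channels=2 ! \
      audioconvert ! lc3enc ! lc3dec ! audioconvert ! autoaudiosink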

QUIC plugins for GStreamer

Sanchayan implemented a QUIC source and sink plugin in Rust, allowing us to start experimenting with the next generation of network transports. For the curious, the plugins sit on top of the Quinn implementation of the QUIC protocol.

There is a merge request open that should land soon, and we're already seeing folks using these plugins.

AWS S3 plugins

We've been fleshing out the AWS S3 plugins over the years, and we've added a new awss3putobjectsink. This provides a better way to push small or sparse data to S3 (subtitles, for example), without potentially losing data in case of a pipeline crash.
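
Usage is what you would expect of a sink element. A sketch for uploading a subtitle file as a single object might look like this (the bucket, key, and region are placeholders, and the property names should be verified against the element documentation):

    gst-launch-1.0 -e filesrc location=captions.vtt ! \
      awss3putobjectsink bucket=my-bucket key=captions/session1.vtt region=us-east-1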

We also expect this to eventually look a little more like multifilesink, allowing us to arbitrarily split up data and write it to S3 directly as multiple objects.

Update to webrtc-audio-processing

We also updated the webrtc-audio-processing library, based on more recent upstream libwebrtc. This is one of those things that becomes surprisingly hard as you get into it -- packaging an API-unstable library correctly, while supporting a plethora of operating system and architecture combinations.

Clients

We can't always speak publicly about the work we are doing with our clients, but there have been a few interesting developments that we can speak about (and already have).

Both Sanchayan and I spoke a bit about our work with the WebRTC-as-a-service provider Daily. My talk at the GStreamer Conference summarised what we learned while building Daily's live streaming, recording, and other backend services, which I have written about previously. We worked with other clients during the year and had similar experiences.

Sanchayan spoke about the interesting approach to building SIP support that we took for Daily. This was a pretty fun project, allowing us to build a modern server-side SIP client with GStreamer and SIP.js.

An ongoing project we are working on is building AES67 support using GStreamer for FreeSWITCH, which essentially allows bridging low-latency network audio equipment with existing SIP and related infrastructure.

As you might have noticed from previous sections, we are also working on a low-latency audio appliance using PipeWire.

Retrospective

All in all, we've had a reasonably productive 2023. There are things I know we can do better in our upstream efforts to help move merge requests and issues, and I hope to address this in 2024.

We have ideas for larger projects that we would like to take on. For some of these, we might be able to find clients willing to pay for the work. For the ideas that we think are useful but may not find funding, we will continue to spend our spare time pushing them forward.

If you made it this far, thank you, and look out for more updates!

2024-03-18
Supporting ALSA compressed offload in PipeWire

Editor's note: this work was completed in late 2022 but this post was unfortunately delayed.

Modern audio hardware comes with Digital Signal Processors (DSPs) integrated into SoCs and audio codecs. Processing compressed or encoded data on such DSPs results in power savings compared to carrying out that processing on the CPU.

     +---------+      +---------+       +---------+
     |   CPU   | ---> |   DSP   | --->  |  Codec  |
     |         | <--- |         | <---  |         |
     +---------+      +---------+       +---------+

This post takes a look at how all this works.

Continue reading

2022-08-03
GStreamer for your backend services

For the last year and a half, we at Asymptotic have been working with the excellent team at Daily. I'd like to share a little bit about what we've learned.

Daily is a real-time calling platform as a service. One standard feature that users have come to expect in their calls is the ability to record them, or to stream their conversations to a larger audience. This involves mixing together all the audio/video from each participant and then storing it, or streaming it live via YouTube, Twitch, or any other third-party service.

As you might expect, GStreamer is a good fit for building this kind of functionality, where we consume a bunch of RTP streams, composite/mix them, and then send them out to one or more external services (Amazon's S3 for recordings and HLS, or a third-party RTMP server).
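
To make that concrete, here is a drastically simplified, single-participant sketch of such a pipeline; the port, caps, and RTMP URL are placeholders, and the real pipelines additionally handle audio, multiple participants, and dynamic joins and leaves:

    gst-launch-1.0 \
      udpsrc port=5004 caps="application/x-rtp,media=video,encoding-name=VP8,payload=96" ! \
      rtpvp8depay ! vp8dec ! videoconvert ! \
      compositor ! x264enc tune=zerolatency ! h264parse ! \
      flvmux streamable=true ! rtmpsink location="rtmp://live.example.com/stream"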

I've written about how we implemented this feature elsewhere, but I'll summarise briefly.

This is a slightly longer post than usual, so grab a cup of your favourite beverage, or jump straight to the summary section for the tl;dr.

Continue reading

2021-11-13
Automate debugging using GDB scripting

Recently I was working on a GStreamer plugin in Rust. The plugin basically rounds the corners of an incoming video, something akin to the border-radius property in CSS. Below is how it looks when running on a video.

The GStreamer pipeline for the same:

gst-launch-1.0 filesrc location=~/Downloads/bunny.mp4 ! \
  decodebin ! videoconvert ! \
  roundedcorners border-radius-px=100 ! \
  videoconvert ! gtksink

This was my first time working on a video plugin in GStreamer. I had a lot to learn about how to use the BaseTransform class from GStreamer, among other things. Without getting into the GStreamer-specific details here, I basically ran into a problem that required some debugging to figure out what was going on in the internals of GStreamer.

Now, while I never had problems using GDB from the command line, the way I was using it earlier was just not good enough. I would start the pipeline, attach gdb to the running process, place breakpoints by manually typing out the whole thing, and then start debugging. For one-off debugging sessions, where maybe you just want to quickly inspect the backtrace from a crash, or look into a deadlock condition where your code hung, this could be fine. However, when you have to repeat this multiple times, make a source code change, compile, and then check again, it becomes frustrating.

Let's look at how we can make this easier.
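
As a preview, the idea is to drive GDB from a script rather than typing everything out by hand. A minimal sketch could look like the following, where the breakpoint target is just an illustrative GStreamer function:

    # debug.gdb
    break gst_pad_push
    commands
      backtrace 5
      continue
    end
    continue

    # Attach to the running pipeline and execute the script:
    gdb -x debug.gdb -p $(pidof gst-launch-1.0)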

Continue reading

2021-08-02
Building a diverse team from the ground up

It's 2021, and while many things have changed for the better, they haven't changed enough, particularly in technology.

Right from the get-go, we knew we wanted to build a diverse company. Our founding team was balanced in gender terms. Our next hire was someone intimately familiar with the world we inhabited, and whom we knew through a Rust meetup that he organised.

We are determined to ensure that our hiring pipeline reflects the world we live in (rather than the world we work in). We've had a broad set of people vet our hiring page to ensure that the language is inclusive, and welcoming. We've reached out to folks we know and asked them to help us reach under-represented members in tech.

These are baby steps. We will continue doing our very best to build an inclusive, and welcoming space for everyone. We're acutely aware that gender is not the only axis of representation, and this is something we hope to address over time as well.

Here are some of the policies we've adopted to further our aim of building a diverse, and inclusive space. As we grow (slowly and sustainably), our intention is to continue to put progressive and employee-friendly policies in place.

Remote work

Offices are often designed around male-centric preferences. From the ergonomics of the workstation, to the temperature of the room, or the typical work schedule, professional spaces rarely account for differing priorities or perspectives. Remote work allows for more diverse participation.

Working hours

A full working week for us is 35 hours. We want to allow you the flexibility of scheduling your work day in a way that's most comfortable to you, while allowing for reasonable overlap with your team. We don't work weekends.

Conferences

We encourage and support both speaking at, and attending conferences, all over the world.

Menstrual leave

Taking nilenso's lead, we have adopted a no-questions-asked paid menstrual leave policy.

Paid leave

We currently operate with a flexible ("as needed") paid leave policy. We encourage people to regularly take time off. As we grow, we will likely introduce a more formal policy.

2021-07-22
Hello, world!

We started asymptotic 2½ years ago, and have been fortunate to have had a wild and busy ride so far. Nevertheless, it's probably high time we introduced ourselves! :-)

We are a small team of people who care about software freedom and take pride in their craft. Being tinkerers, we like working on low-level systems and are happiest when we're close to the metal.

We contribute to open source projects upstream, and help companies use them in products that are useful to people in the real world. This allows us to produce a virtuous cycle of improving the projects we maintain, while ensuring that they solve real world problems.

We want to live in a world where open source software is the default choice, and we aim to achieve that by making the projects we work on be the best option available.

Together, we're building a diverse, inclusive and sustainable company. This may mean choosing to be slow and deliberate while we grow. We're fine with that, because we believe that is the right path for us to achieve our goals while upholding the values we care about.

If you would like to join us on our journey, or would like to talk to us, please reach out!