Media management

Media management is a key factor in the overall performance of a real-time application. The Quobis wac media server takes an innovative approach to media management: a mixed architecture with an audio MCU (Multipoint Control Unit) and a video SFU (Selective Forwarding Unit), which allows a more flexible configuration and addresses a wide range of use cases.

Audio management

Audio streams are handled by an MCU (Multipoint Control Unit), also referred to as the “audiomixer”, which mixes the audio coming from all the participants. Mixing audio is much less expensive in terms of CPU than mixing video, which makes it possible to join external SIP calls, typically coming from the PSTN, to the conferences. It also reduces bandwidth consumption: each web or native client has a single bi-directional audio flow, which can additionally take advantage of silence suppression techniques to reduce the effective bitrate.

_images/mcu.png
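As a rough illustration of what mixing means, the sketch below sums 16-bit PCM frames sample by sample and clips the result to the valid range. It is a minimal sketch and not the Quobis audiomixer: the function name `mix_for_listener`, the frame format, and the choice to leave the listener's own audio out of the mix they receive (a common mixer convention) are all assumptions made for the example.

```python
from typing import Dict, List

# Valid range of a signed 16-bit PCM sample.
INT16_MIN, INT16_MAX = -32768, 32767


def mix_for_listener(frames: Dict[str, List[int]], listener: str) -> List[int]:
    """Return one mixed frame for `listener` (hypothetical helper).

    `frames` maps a participant name to one frame of 16-bit PCM samples.
    The listener's own frame is excluded (assumption, not from the docs).
    """
    others = [frame for name, frame in frames.items() if name != listener]
    if not others:
        return []
    mixed = []
    # zip(*others) walks all frames in parallel, one sample position at a time.
    for samples in zip(*others):
        total = sum(samples)
        # Clip to avoid overflow when several loud speakers overlap.
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed


frames = {
    "alice": [1000, -2000, 30000],
    "bob": [500, 500, 10000],
    "carol": [-1500, 0, 0],
}
alice_mix = mix_for_listener(frames, "alice")  # [-1000, 500, 10000]
carol_mix = mix_for_listener(frames, "carol")  # [1500, -1500, 32767] (last sample clipped)
```

Whatever the number of participants, each client in this sketch sends one frame and receives one mixed frame, which matches the single bi-directional audio flow described above.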

Video management

Video streams are handled in the video Selective Forwarding Unit (SFU). Each participant in a conference publishes a copy of their video stream and subscribes to the videos from the rest of the participants. These subscriptions are made directly against the SFU. This means that, in a conference with N participants, each participant publishes one video stream and receives N-1 video streams, as shown in the image below. There are a number of advantages of the SFU architecture for video:

  • This architecture is less demanding on server resources than other video conferencing architectures.

  • Since there is only one outgoing video stream, the client does not need a lot of upstream bandwidth.

  • The endpoint has control of each independent video stream in order to create its own layout.

  • Reduced CPU load in the server, as video is not manipulated.

  • Improved scalability and ability to have different video layouts.

_images/sfu.png
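The stream counts described above can be worked out with a few lines of arithmetic. The sketch below assumes that every participant publishes video and subscribes to everyone else; the names `VideoStreamCounts` and `sfu_stream_counts` are hypothetical and do not belong to any Quobis API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoStreamCounts:
    published_per_participant: int  # streams each client sends to the SFU
    received_per_participant: int   # streams each client receives from the SFU
    server_incoming: int            # total streams arriving at the SFU
    server_outgoing: int            # total streams forwarded by the SFU


def sfu_stream_counts(participants: int) -> VideoStreamCounts:
    """Count video streams in an SFU conference with N participants."""
    if participants < 1:
        raise ValueError("a conference needs at least one participant")
    return VideoStreamCounts(
        published_per_participant=1,
        received_per_participant=participants - 1,
        server_incoming=participants,
        server_outgoing=participants * (participants - 1),
    )


counts = sfu_stream_counts(4)
# counts.received_per_participant is 3 and counts.server_outgoing is 12
```

The upstream load on each client stays constant at one stream, while the number of forwarded streams grows as N × (N-1). The SFU only forwards these streams and does not decode or re-encode them.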

Note

Currently, the SDKs subscribe to all the available video flows. This needs to be taken into account when the number of participants is high.
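To get a feel for why this matters, the sketch below estimates the downstream bandwidth of one client that subscribes to every other participant's video and also receives the single mixed audio flow. The bitrates (500 kbps per video stream, 40 kbps for audio) are illustrative assumptions, not values measured on or documented for Quobis wac, and the function name is hypothetical.

```python
def estimated_downlink_kbps(
    participants: int,
    video_kbps_per_stream: int = 500,  # assumed bitrate, for illustration only
    audio_kbps: int = 40,              # assumed bitrate of the mixed audio flow
) -> int:
    """Estimate the downlink of one client subscribed to all other videos."""
    # N-1 forwarded video streams, never fewer than zero.
    video_streams = max(participants - 1, 0)
    return video_streams * video_kbps_per_stream + audio_kbps


for n in (2, 10, 25):
    print(n, "participants:", estimated_downlink_kbps(n), "kbps")
```

Under these assumptions the downlink grows linearly with the number of participants, from 540 kbps in a two-party call to 12040 kbps with 25 participants, while the audio share stays constant.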

How that fits together

As explained above, media management in the Quobis wac media server is twofold: an MCU for audio and an SFU for video. Developers do not need to deal with this complexity, as Quobis wac manages this setup automatically by creating an audio room for each video room. The audio and video flows are shown in the picture below:

_images/mixed_sfu_mcu_2.png

The green arrows represent the audio streams, while the blue, red and orange arrows represent the video streams. The logical entity formed by the combination of an audio room (MCU setup) and a video room (SFU setup) is known in Quobis terminology as a “conference room”.
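The pairing of an audio room and a video room can be pictured as a small data structure. The sketch below is only a conceptual model: the class names, the room identifiers, and the `create` and `join` methods are hypothetical and are not the Quobis wac API.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class AudioRoom:
    """Audio side (MCU setup): all participants are mixed together."""
    room_id: str
    participants: Set[str] = field(default_factory=set)


@dataclass
class VideoRoom:
    """Video side (SFU setup): each participant publishes one stream."""
    room_id: str
    publishers: Set[str] = field(default_factory=set)


@dataclass
class ConferenceRoom:
    """Logical entity combining one audio room and one video room."""
    room_id: str
    audio: AudioRoom
    video: VideoRoom

    @classmethod
    def create(cls, room_id: str) -> "ConferenceRoom":
        # One audio room is created for each video room (naming is hypothetical).
        return cls(room_id, AudioRoom(f"{room_id}-audio"), VideoRoom(f"{room_id}-video"))

    def join(self, participant: str) -> None:
        # Joining the conference room joins both underlying rooms.
        self.audio.participants.add(participant)
        self.video.publishers.add(participant)


room = ConferenceRoom.create("demo")
room.join("alice")
room.join("bob")
```

In this model the application only handles the `ConferenceRoom` object, which mirrors the idea that developers do not deal with the audio and video rooms separately.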

A conference room is managed from the Quobis Signaling Server. The section Communication setup explains how a conference room can be used to implement a wide range of use cases, from basic telephony applications (“A” calls “B”) to more complex ones such as multiparty conferencing and personal rooms.