Companies often ask us about the latency of Bluetooth audio solutions, specifically when using the SBC codec, which is mandatory for A2DP audio over classic Bluetooth.
In this article we demonstrate that the latency of the audio channel with SEARAN’s use of the SBC codec should be acceptable for most consumer Bluetooth audio products. Our assessment is that in cases where customers find the overall audio delay not acceptable (some off-the-shelf products), the culprit is not the SBC codec itself but the overall implementation of the audio.
A few articles and posts have been written on the subject of audio codec latency and SBC latency in particular. Those articles cover the subject in good detail, but we felt we could also contribute because we control all relevant parameters of our own Bluetooth stack, dotstack™, and of the audio channels.
First, here are some articles on the subject of Bluetooth audio codecs.
- A good overview of various codecs used with A2DP https://habr.com/en/post/456182/
- An explanation how quality of audio encoded with SBC can be improved by simply modifying some encoding parameters https://habr.com/en/post/456476/
- Comparison of various codecs to SBC with parameters from the previous article. SBC seems to be on par with or better than AptX HD. Unfortunately for SBC, not all A2DP speakers (audio sinks) on the market can cope with the settings of “improved” SBC. http://soundexpert.org/articles/-/blogs/audio-quality-of-sbc-xq-bluetooth-audio-codec
We first describe the audio source part of the demo, then the results of testing it with off-the-shelf Bluetooth headsets and speakers, and finally the tests with SEARAN's own audio receiver implementation.
We also demonstrate that some Bluetooth speakers have higher latency not because of the SBC codec they use but for another reason – they have been poorly designed and most likely use much larger audio buffers than they should.
Here’s how we at SEARAN demonstrate a low latency performance using an SBC codec. We send an audio signal from a microphone to a headset by having two demo boards connected via Bluetooth. One demo board has a mic working as a source of audio, and the other demo board has a headset connected to it, and is working as an audio sink (receiver). We analyze and measure the audio delay between the signal from the mic and the headset on the other side of Bluetooth connection. We use A2DP and AVRCP to stream and control sound from a mic to a headset. This demo shows how a small size audio buffer ensures low audio latency. We also demonstrate how the change in sampling frequency affects audio latency.
To demonstrate the effect of sampling frequency on latency, we sampled the mic at two rates – 16 kHz and 48 kHz. The settings of the SBC encoder were similar to those used in FastStream (a modification of the SBC codec for bidirectional operation).
A headset sends a “play” command over AVRCP. The demo starts sampling the mic. When 128 samples are collected, they are encoded and sent to the controller. The process repeats until the headset sends a “pause” AVRCP command. We take 128 samples because one SBC frame always carries 128 samples regardless of the sampling frequency.
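The capture loop described above can be sketched as follows. This is a minimal illustration in Python, not the demo's firmware: `stream_frames`, `encode` and `send` are hypothetical names, with `encode` standing in for an SBC encoder and `send` for the call that hands an encoded frame to the controller.

```python
FRAME_SAMPLES = 128  # samples collected before each encode, as in the demo


def stream_frames(samples, encode, send, frame_samples=FRAME_SAMPLES):
    """Group incoming samples into fixed-size frames, then encode and send each.

    `samples` is any iterable of PCM samples; it ends when streaming stops
    (for example on a "pause" command). A trailing partial frame is dropped.
    Returns the number of frames sent.
    """
    frame = []
    sent = 0
    for sample in samples:
        frame.append(sample)
        if len(frame) == frame_samples:
            send(encode(frame))  # hypothetical encoder and transport hooks
            frame = []
            sent += 1
    return sent
```

Because a frame is sent as soon as its 128th sample arrives, the source never holds more than one frame of audio, which is what keeps its contribution to latency small.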
Please note that we measured (not estimated) the delay introduced by sampling, encoding and sending. Here are the results of our measurements:
Sampling rate 16 kHz:
- Sampling time – 8msec
- Encoding time – ~303µsec
Total maximum delay is about 9msec.
This delay includes sampling, encoding and sending to the controller. This delay is the time from the moment the first sample from the mic was taken to the moment the complete SBC frame has been sent to the controller. This time actually defines the latency of our microphone demo. All other delays are added by the controller and the receiver of the packet.
Now, if we use the sampling rate of 48 kHz, the sampling time is shorter:
- Sampling time – 2.7msec.
- Encoding time – ~303µsec.
Total maximum delay is about 9msec.
This delay of 9msec in the second test with the 48 kHz sampling rate seems a bit odd, because one would expect the total delay to be around 3msec, but there is an explanation for this. The demo starts sampling when it receives a “play” button release AVRCP command from the headset. In response to this command the demo sends back two packets, which takes around 6msec. So even though an audio packet is ready in about 3msec, it cannot be sent because the controller is busy sending responses to the AVRCP command. All subsequent audio packets are sent with a much smaller delay, but this does not change the latency because the initial delay is set by the transmission delay of the first packet. At the 16 kHz rate, sampling takes longer than is needed to send the AVRCP responses, so by the time an audio packet is ready to be sent the AVRCP responses have already gone out and they do not add anything to the total delay.
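The sampling times quoted above follow directly from the frame size and the sampling rate. The short Python sketch below reproduces that arithmetic. The function names are ours, chosen for illustration, and the encoding time is the measured ~303µsec figure from this article rather than something the code derives.

```python
FRAME_SAMPLES = 128     # samples per SBC frame, as stated above
ENCODE_TIME_US = 303.0  # measured encoding time quoted above


def sampling_time_ms(sample_rate_hz, frame_samples=FRAME_SAMPLES):
    """Time needed to collect one frame of samples, in milliseconds."""
    return frame_samples * 1000.0 / sample_rate_hz


def frame_ready_ms(sample_rate_hz, encode_us=ENCODE_TIME_US):
    """Time from the first sample until the encoded frame is ready to send."""
    return sampling_time_ms(sample_rate_hz) + encode_us / 1000.0


for rate in (16_000, 48_000):
    print(f"{rate} Hz: sampling {sampling_time_ms(rate):.2f} ms, "
          f"frame ready after {frame_ready_ms(rate):.2f} ms")
```

At 16 kHz the frame is ready after roughly 8.3msec, and at 48 kHz after roughly 3msec. The 48 kHz case reaches 9msec only because of the AVRCP responses, which this sketch does not model.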
Packets produced by the demo are very small. Transferring them over the air takes around 1 baseband slot (0.625msec). This does not add any latency because packets are generated much slower than they are being delivered to the receiver.
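To put numbers on this, the sketch below compares the air time of one packet with the interval at which frames are produced. It assumes one packet per SBC frame and exactly one 0.625msec slot per packet, and it ignores the return slot and any retransmissions, so treat the result as a rough lower bound. `air_time_share` is a name made up for this example.

```python
SLOT_MS = 0.625  # one Bluetooth baseband slot


def air_time_share(sample_rate_hz, slots_per_packet=1, frame_samples=128):
    """Fraction of each frame period spent transmitting the frame's packet."""
    frame_period_ms = frame_samples * 1000.0 / sample_rate_hz
    return slots_per_packet * SLOT_MS / frame_period_ms


for rate in (16_000, 48_000):
    print(f"{rate} Hz: {air_time_share(rate):.1%} of the frame period on air")
```

Under these assumptions the packet takes about 8% of the frame period at 16 kHz and about 23% at 48 kHz, so each packet is delivered well before the next one is produced.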
As you can see the latency of the demo in both cases (16kHz and 48kHz) is the same. It is consistent and does not even depend on the BT controller used. The other point is that SBC encoding is taking only ~303 µsec and does not add much to the latency at all. SBC contribution to the overall audio latency is nowhere near 100msec (or higher) quoted in some sources.
Let us look here into how receivers (headsets, speakers) and specifically their buffers affect the overall audio latency. Bluetooth speakers and headsets have buffer(s) where they store decoded PCM samples before sending them to a DAC.
The size of audio buffers can theoretically be adjusted automatically by changing SBC configuration parameters. SBC parameters are set when Bluetooth connection between the source and the sink is established. The problem is that in most cases the buffer size in the speaker/headset is fixed by manufacturers. The larger the buffer the bigger the delay.
There’s also another consideration – the duration of sound accumulated in the buffer before it is played depends on the sampling frequency.
Let’s say we have a 1kB buffer. Each SBC frame, regardless of sampling frequency, carries 128 samples. Each sample is 16 bits of data, and for stereo audio there are two samples at each sampling instant, one per channel. This amounts to 512 bytes of data for 128 samples (128 samples x 2 bytes x 2 channels). A 1kB buffer would hold twice as many samples – 256 (128 x 2) samples of 16-bit data for each of the two channels.
At a sampling rate of 16 kHz (a sampling period of 1/16msec), the buffer holds 16msec of sound (1/16msec * 256 samples), and this is the audio delay it adds. If we used a 2kB buffer, it would contain 32msec of sound or, in other words, the delay would be 32msec. Therefore, the audio delay is directly proportional to the size of the buffer. Again, the larger the buffer the bigger the delay.
But let’s see what happens if we keep the buffer the same size – 1kB and increase the sampling rate.
At a sampling rate of 48 kHz the same buffer holds only 5.33msec of sound (1/48msec * 256). This means that a higher sampling rate adds a shorter delay on the receiver.
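The buffer arithmetic above can be checked with a few lines of Python. This is a sketch under the same assumptions as the text: 16-bit samples, two channels, and a buffer that is filled completely before playback starts. The function names are ours.

```python
def samples_per_channel(buffer_bytes, channels=2, bytes_per_sample=2):
    """Number of samples per channel that fit in the buffer."""
    return buffer_bytes // (channels * bytes_per_sample)


def buffer_delay_ms(buffer_bytes, sample_rate_hz, channels=2, bytes_per_sample=2):
    """Duration of audio held by a full buffer, in milliseconds."""
    n = samples_per_channel(buffer_bytes, channels, bytes_per_sample)
    return n * 1000.0 / sample_rate_hz


for size in (1024, 2048):
    for rate in (16_000, 48_000):
        print(f"{size} bytes at {rate} Hz: {buffer_delay_ms(size, rate):.2f} ms")
```

The delay grows linearly with the buffer size and shrinks as the sampling rate goes up, so a receiver with a fixed, large buffer is penalized most at low sampling rates.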
Most headsets and speakers have a much larger buffer than 1kB. This is our conclusion after streaming audio from our audio source (as described above) to 3rd party Bluetooth speakers and comparing the audio delay to the one obtained with the 512-byte buffer of our own demonstration (see below).
You can hear the effect of various sampling rates using our demo (see below) and various off-the-shelf headsets. You most likely will not be pleased with the 16 kHz version. The delay will be long and very noticeable.
The 48 kHz sampling rate should give a much better result. However, it all depends on the actual headset and the size of its buffer. In our tests we had the best results (still not perfect) when using LG headsets. All the other speakers and headsets we used, including the ones from Sony and Bose, had buffers too large to perform with low latency. In all those tests our microphone side of the demo (audio source) had only 9msec of latency. All other delays were added by the receivers.
To demonstrate low latency audio from source to sink we used our own implementation of an A2DP/AVRCP sink device, where we set the size of the buffers arbitrarily. We had the best results when the buffer was set at 512 bytes, which added only 8msec delay at 16 kHz and 2.7msec delay at 48 kHz.
For the audio source (mic) we used an STM3240G-EVAL development board with an input for a regular analog microphone, and Laird’s BT830 dual mode Bluetooth module.
For the audio receiver we used an STM32F407G-DISC1 development board with its audio channel and speaker, along with a Laird BT830 module.
If you are interested in evaluating our low latency solution please contact SEARAN to receive software and instructions.