Saturday, October 22, 2011

On demand audio streaming with icecast

The project goal was to make the impossible happen: Turn an icecast streaming server into an audio-on-demand server.

The background: I bought a NAS, which is basically a PC with an Atom CPU. After erasing the firmware and installing Ubuntu Server on a 10 TB RAID 5 system, I wondered what else I could do with the box.

Live streaming via icecast to my Wi-Fi radio has worked for some time now, but it needs a running PC with a soundcard. What I had in mind was different:
  • It should run exclusively on the NAS, no need to switch on a PC
  • It should support an arbitrary number of playlists, each one corresponding to an icecast URL.
  • The upstream mechanism should work on demand because encoding many mp3 streams in parallel overloads the Atom CPU.
  • The current song should be shown in the display of the radio
Song titles
For the last requirement, I added live metadata updating to the API for gmerlin broadcasting plugins. After learning that Vorbis streams with changing song titles make my radio reboot, I wrote an MP3 broadcasting plugin (with libshout and lame). It seems that later firmware versions for the radio fix the Vorbis problem, but the firmware update requires Windows software.

Commandline recorder
The recording and broadcast architecture for gmerlin was already working reliably, so I wrote a plugin that takes a gmerlin album (= playlist), shuffles the tracks and makes them available as if they were recorded from a soundcard. In addition, I wrote a commandline recorder that can be started from a script. There is one script for starting a broadcast:

$cat start_broadcast.sh

#!/bin/sh

BITRATE=320
NAME="NAS $1"
STATION_DIR="/nas/Stations/lists/"
PASSWORD="secret"

AUDIO_OPT='do_audio=1:plugin=i_audiofile{album='$STATION_DIR$1':shuffle=1}'
VIDEO_OPT="do_video=0"
METADATA_OPT="metadata_mode=input"
ENC_OPT='audio_encoder=b_lame{server=nas_ip:mount=/'$1':password='$PASSWORD':name='$NAME':cbr_bitrate='$BITRATE'}'

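# Quote all options so that station names with special characters survive
gmerlin-record -aud "$AUDIO_OPT" -vid "$VIDEO_OPT" -m "$METADATA_OPT" -enc "$ENC_OPT" -r > /dev/null 2>&1 &
# Remember the PID so that the broadcast can be stopped later
echo $! > "$1.pid"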


If you call the script with start_broadcast.sh foo, it will load the album /nas/Stations/lists/foo and send the stream to the icecast server, which will make it available under nas_ip:8000/foo. In addition, the PID of the process will be written to ./foo.pid so it can be stopped later.

The foo broadcast can be stopped with stop_broadcast.sh foo, where the
script looks like:
#!/bin/sh
kill -9 `cat $1.pid`
rm -f $1.pid
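
kill -9 gives the recorder no chance to close its connection to icecast cleanly. A gentler variant could send SIGTERM first and only fall back to SIGKILL after a timeout. This is a minimal sketch, assuming gmerlin-record exits on SIGTERM (not verified here); the function name and the 5 second timeout are made up for illustration:

```sh
#!/bin/sh
# Hypothetical gentler version of stop_broadcast.sh.
# $1: station name; the PID is read from $1.pid
stop_broadcast() {
    pidfile="$1.pid"
    [ -f "$pidfile" ] || return 1      # nothing to stop
    pid=$(cat "$pidfile")
    kill "$pid" 2>/dev/null            # SIGTERM: ask politely
    i=0
    # Wait up to 5 seconds (arbitrary) for the process to go away
    while kill -0 "$pid" 2>/dev/null && [ "$i" -lt 5 ]; do
        sleep 1
        i=$((i + 1))
    done
    # Still there? Then force it.
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid" 2>/dev/null
    fi
    rm -f "$pidfile"
    return 0
}
```

It would be called as stop_broadcast foo from the directory that contains foo.pid.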

Icecast configuration
No critical options had to be changed in the icecast configuration, except queue-size, which was doubled to 1048576 because it's better for 320 kbps streams.
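
To get a feeling for these numbers, one can estimate roughly how many seconds of audio fit into the queue at a constant bitrate. A small sketch (the function name is made up; 320 kbps is taken as 320000 bits per second, and the result is rounded down):

```sh
#!/bin/sh
# Seconds of audio held by a queue of $1 bytes at $2 kbit/s
queue_seconds() {
    echo $(( $1 * 8 / ($2 * 1000) ))
}

queue_seconds 524288 320     # value before doubling: prints 13
queue_seconds 1048576 320    # doubled value: prints 26
```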

Icecast stats in awk friendly format
For the on-demand mechanism described below, we also need to get the running channels and the number of connected clients from the server, ideally in an awk friendly format. This is done by fetching the server statistics in XML format and processing them with xsltproc, a small commandline tool that comes with libxslt:
$cat get_stats.sh

#!/bin/sh
wget --user=admin --password=secret -O - http://127.0.0.1:8000/admin/stats.xml 2> /dev/null | \
xsltproc stats.xsl - | cut -b 2-
If you have two channels foo (1 listener) and bar (2 listeners) it will output

foo 1
bar 2

The transformation file stats.xsl looks like:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:for-each select="icestats/source">
<xsl:value-of select="@mount"/>
<xsl:text> </xsl:text>
<xsl:value-of select="listeners"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
On demand mechanism
Now that we have commands for starting, stopping and querying channels, we can start a channel when the first listener connects and stop it after the last listener has disconnected. Since icecast doesn't support on-demand streaming, we must trick it into doing so. The idea is to put a second HTTP server in front of the icecast server, which handles the connection requests, starts the channel (if necessary) and then sends an HTTP redirect to the real icecast URL. The icecast server runs on port 8000, the redirection server (to which the listeners connect) runs on port 8001. The redirection server can be built with nothing but shell scripts and the netcat (traditional) utility. The server script is simple:
$cat server.sh
#!/bin/sh

cd /nas/mmedia/Stations

while true; do
    nc.traditional -l -p 8001 -c ./handle.sh
done
Whenever a TCP connection on port 8001 arrives, the following handler script is executed:
$cat handle.sh
#!/bin/bash

# Read request, path and protocol
read REQ URLPATH PROTO
# Read header variables
while true; do
    read VAR VAL
    if test "x$VAL" = "x"; then
        break
    fi
done

# Reject anything but GET requests
if test "x$REQ" != "xGET"; then
    echo -e "HTTP/1.1 400 Bad Request\r\n\r\n"
    exit
fi

# Remove leading "/"
FILE=`echo $URLPATH | cut -b 2-`

# Close unused streams
./clean.sh $FILE

# Check if we are broadcasting already
RESULT=`./query_station.sh $FILE`
if test "x$RESULT" = "x"; then
    ./start_broadcast.sh $FILE 2>> /dev/null &
    sleep 1
fi

# Send redirection header
URL="http://nas_ip:8000/$FILE"
echo -e "HTTP/1.1 307 Temporary Redirect\r\nLocation: $URL\r\n\r\n"
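
Note that the handler passes the requested path straight to other scripts. On a home network that may be acceptable, but a check of the station name before using it is cheap. The following is a sketch of such a check and not part of the original handler; the function name and the set of allowed characters are assumptions:

```sh
#!/bin/sh
# Accept only plain station names: letters, digits, "_" and "-".
# Empty names and anything containing other characters (slashes,
# dots, spaces, shell metacharacters) are rejected.
valid_station_name() {
    case "$1" in
        ""|*[!A-Za-z0-9_-]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Possible use in handle.sh, right after FILE has been set:
#   if ! valid_station_name "$FILE"; then
#       printf 'HTTP/1.1 404 Not Found\r\n\r\n'
#       exit
#   fi
```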

Here we use two additional scripts. clean.sh stops all streams with zero listeners except the one given as commandline argument:
#!/bin/sh
./get_stats.sh | awk -v NAME=$1 '($1 != NAME) && ($2 == 0) { system("./stop_broadcast.sh " $1) }'
query_station.sh lists just the number of listeners of the given station:
#!/bin/sh
./get_stats.sh | awk -v NAME=$1 '$1 == NAME { print $2 }'
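
Both awk filters can be tried without a running icecast server by feeding them canned statistics. In this sketch the sample data and the function names are made up, and the clean.sh condition prints the station names instead of stopping them:

```sh
#!/bin/sh
# Canned get_stats.sh output: one "name listeners" pair per line
sample_stats() {
    printf 'foo 1\nbar 0\nbaz 0\n'
}

# Same condition as clean.sh, but only printing the candidates
idle_stations() {
    awk -v NAME="$1" '($1 != NAME) && ($2 == 0) { print $1 }'
}

# Same filter as query_station.sh
listeners_of() {
    awk -v NAME="$1" '$1 == NAME { print $2 }'
}

sample_stats | idle_stations bar    # prints "baz"
sample_stats | listeners_of foo     # prints "1"
sample_stats | listeners_of nope    # prints nothing
```

The last call shows why handle.sh compares the result with the empty string: a running station without listeners yields "0", while a station that is not running yields no output at all.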
Energy saving mode
When we just use the radio, the NAS must be switched on manually. The PCs do that automatically with wake-on-lan. The NAS detects when it is no longer needed and then switches itself off. This is done by counting the TCP connections to IP addresses other than localhost. If there are no external connections for more than 30 minutes, we switch off. The following script may be useful for other applications as well. Simply start it at boot time:
#!/bin/sh
# Switch off after this time
THRESHOLD=1800
# Delay between 2 checks
DELAY=60

DATE_START=`date +%s`

while :
do
    CONNECTIONS=`netstat -tn | grep tcp | grep -v " 127\." | wc -l`
    DATE_NOW=`date +%s`

    if test "x$CONNECTIONS" = "x0"; then
        DATE_DIFF=`echo "$DATE_NOW - $DATE_START - $THRESHOLD" | bc`
        if test $DATE_DIFF -gt "0"; then
            poweroff
            exit
        fi
    else
        DATE_START=$DATE_NOW
    fi
    sleep $DELAY
done
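
The grep pipeline counts every TCP line, including connections that are already closing (e.g. TIME_WAIT). If only established connections should keep the box awake, the counting could be done in awk instead. This is a sketch of that variation, not the original behavior; the function names and the sample addresses are made up, and the usual six-column netstat -tn layout is assumed:

```sh
#!/bin/sh
# Count established TCP connections whose local address is not loopback.
# Expects "netstat -tn" style lines on stdin:
#   Proto Recv-Q Send-Q Local-Address Foreign-Address State
count_external() {
    awk '$1 ~ /^tcp/ && $6 == "ESTABLISHED" &&
         $4 !~ /^127\./ && $4 !~ /^::1:/ { n++ }
         END { print n + 0 }'
}

# Canned netstat output for trying the filter
sample_netstat() {
    printf '%s\n' \
        'Active Internet connections (w/o servers)' \
        'Proto Recv-Q Send-Q Local Address     Foreign Address    State' \
        'tcp   0 0 127.0.0.1:8000    127.0.0.1:51234    ESTABLISHED' \
        'tcp   0 0 192.168.1.10:8000 192.168.1.20:40312 ESTABLISHED' \
        'tcp   0 0 192.168.1.10:22   192.168.1.30:50100 TIME_WAIT'
}

sample_netstat | count_external    # prints 1
```

In the script above, the assignment would then read CONNECTIONS=`netstat -tn | count_external`.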



Mission accomplished.

Thursday, December 9, 2010

New prereleases

Lots of bugs have been fixed after the last prereleases, so here are
new ones:

http://gmerlin.sourceforge.net/gmerlin-dependencies-20101209.tar.bz2
http://gmerlin.sourceforge.net/gmerlin-all-in-one-20101209.tar.bz2

The good news is that no new features were added, so the code can stabilize better.

Please test this and report any problems.

The final gmerlin release is expected by the end of the year.

Saturday, September 18, 2010

Saturday, August 7, 2010

Gmerlin configuration improvements

Up to now, gmerlin's configuration philosophy was simple: Export as many user-settable parameters as possible to the frontends, no matter how important they are. There are 2 reasons for that:
  • As a developer, I don't like to decide which configuration options are important.
  • As a user, I (personally) want to have full control over all program and plugin settings. Nothing annoys me more in other applications than features that could easily be achieved by the backend but are not supported in the frontend.
The downside of this approach is simple: For an average user, the gmerlin applications are way too complicated. And of course, it's also annoying for me to tweak that many parameters all the time. Now that I am approaching a one-zero version, it's time to look at such usability issues.

A little look behind the GUI
A configuration dialog can consist of multiple nested sections. If you have more than one section, you see a tree structure on the right, which lets you select the section. A section contains all the configuration widgets you can see at the same time. Therefore the code must always distinguish whether an action is for a section or for the whole dialog.

Factory defaults
Most configuration sections now have a button Restore factory defaults. It does what the name suggests. You can use this if you think you messed something up.

Presets
Some configuration sections support presets. You can save all parameters into a file and load them again later. In some situations, presets are per section. In this case you see the preset menu below the parameter widgets. If the presets are global for the whole dialog window, you see the menu below the tree view. The next image shows a single-section dialog with the preset menu next to the restore button.



The next image shows a dialog with multiple sections. The preset menu is for the whole dialog, the restore button is for the section only.



The presets are designed so that multiple applications can share them. E.g. an encoding setup configured in the transcoder can be reused in the recorder etc. Presets are available for:
  • All plugins (always global for the whole plugin)
  • Whole encoding setups
  • Filter chains
There is no reason not to support presets for other configurations as well. Suggestions are welcome.

Tuesday, August 3, 2010

Gmerlin prereleases

gmerlin prereleases can be downloaded here:

http://gmerlin.sourceforge.net/gmerlin-dependencies-20100803.tar.bz2

http://gmerlin.sourceforge.net/gmerlin-all-in-one-20100803.tar.bz2

Highlights of this development iteration:
Please test this as much as possible and report any problems.

Sunday, August 1, 2010

Getting serious with sample accuracy

gmerlin-avdecoder has had a sample accurate seek API for some time now. What was missing was a test to prove that seeking really happens with sample accuracy.

Test tool
The strictest test of whether a decoder library can seek with sample accuracy is to seek to a position, decode a video frame or a bunch of audio samples, and compare these with the frame or samples you get when decoding the file from the beginning. Of course, the timestamps must also be identical. A tool that does this is in tests/seektest.c. I noticed that video streams easily pass this test, usually even if no sample accurate access was requested. That's probably because I assumed video streams to be more difficult and put more thought into them. Therefore I'll concentrate on audio streams in this post.
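
The comparison itself can be illustrated outside of the decoder with plain files. The sketch below is not tests/seektest.c (which is written in C and also checks timestamps). It assumes that the decoded samples have somehow been dumped as raw PCM, once from the beginning of the file and once after a seek; the function name and the arguments are made up:

```sh
#!/bin/sh
# Compare raw PCM decoded after a seek with the same region of a
# decode from the beginning. Succeeds only if both are bit exact.
#   $1: raw samples decoded from the start of the file
#   $2: raw samples decoded after seeking to sample position $3
#   $4: bytes per sample frame (e.g. 4 for 16 bit stereo)
same_after_seek() {
    offset=$(( $3 * $4 ))
    size=$(( $(wc -c < "$2") ))
    # tail -c +N starts output at byte N (counting from 1)
    tail -c +$(( offset + 1 )) "$1" | head -c "$size" | cmp -s - "$2"
}
```

A call like same_after_seek full.raw seek.raw 44100 4 would then check a seek to sample 44100 of a 16 bit stereo stream.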

Audio codec details
When seeking in video streams, you have keyframes, which tell you where decoding of a stream can be resumed after a seek. It's sometimes difficult to implement this, but at least you always know what to do.

The naive approach for audio streams is to assume that all blocks (e.g. 1152 samples for mp3) can be decoded independently. Unfortunately, reality is a bit more cruel:

Bit reservoir
This mechanism allows pseudo-VBR in a CBR stream. If a frame can be encoded with fewer bits than allocated for the frame, it can leave the remaining bits to a subsequent (probably more complex) frame. The downside of this trick is that after a seek, the next frame might need bits from previous frames to be fully decoded.

Overlapping transform
Most audio compression techniques work in the frequency domain, so between the audio signal and the compression stage, there is some kind of FFT-like transform.

Now, for reasons beyond this post, overlapping transforms are used by some codecs. This means, that for decoding the first samples of a compressed block, you need the last samples of the previous block. The image below shows one channel of an AAC stream for the case that the overlapping was ignored when seeking. You see that the beginning of the frame is not reconstructed properly, because the previous frame is missing.



Both the bit reservoir and the overlapping can be boiled down to a single number, which tells how many samples before the actual seek point the decoder must restart decoding. This number is set by the codec during initialization, and it's used when we seek with sample accuracy.
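
As a worked example of how such a number could be used: the decoder goes back by that many samples, restarts at the block containing the resulting position, and throws away everything decoded before the seek target. This is only a sketch of the arithmetic. The function name is made up, and the value of 1152 samples is hypothetical, not the number gmerlin-avdecoder actually uses for any codec:

```sh
#!/bin/sh
# $1: seek target, $2: codec preroll, $3: block size (all in samples)
# Prints the index of the block where decoding restarts and the number
# of decoded samples to discard before the target is reached.
seek_plan() {
    target=$1
    preroll=$2
    block=$3
    start=$(( target - preroll ))
    if [ "$start" -lt 0 ]; then
        start=0                 # cannot go before the first sample
    fi
    first_block=$(( start / block ))
    discard=$(( target - first_block * block ))
    echo "$first_block $discard"
}

seek_plan 10000 1152 1152    # prints "7 1936"
```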

Mysterious liba52 behavior
Even with sample accuracy achieved, the AC3 streams (which are on DVDs or in AVCHD files) don't achieve bit exactness. The image below shows that there is no time shift between the signals (which means that gmerlin-avdecoder seeks correctly), but the values are not exactly the same.



First I blamed the AC3 dynamic range control for this behavior. Dynamic range compressors always have some kind of memory across several frames. But even after disabling DRC, the difference was still there. I would really be curious whether AC3 decoding is inherently non-deterministic or whether it's a liba52 bug.

Conclusions
The table below lists all audio codecs, which were taken into consideration. They represent a huge percentage of all files found in the wild. The next important codecs are the uncompressed ones, but these are always sample accurate.







Compression         Library     Overlap   Bit reservoir   Bit exact
MPEG-1, layer II    libmad      -         -               +
MPEG-1, layer III   libmad      +         +               +
AAC                 faad2       +         ? (assumed -)   +
AC3                 liba52      +         ? (assumed -)   - (see image above)
Vorbis              libvorbis   +         -               +

Obtaining the information summarized here was a very painful process of web research and experiments. The documentation of the decoder libraries regarding sample accurate and bit exact seeking is extremely sparse, if not non-existent.

Saturday, May 1, 2010

Processing compressed streams with gmerlin

As I already mentioned, a main goal of this development cycle is to read compressed streams on the input side and write compressed streams on the encoding side. It's a bit of work, but it's definitely worth it because it offers enormous possibilities:
  • Lossless transmultiplexing from one container to another
  • Adding/removing streams of a file without recompressing the other streams.
  • Lossless concatenation of compressed files
  • Changing metadata of files (i.e. mp3/vorbis tagging)
  • Quicktime has some codecs that correspond to image formats (png, jpeg, tiff, tga). With support for compressed frames, single images can be converted to quicktime movies and back
  • In some cases broken files can be fixed as well
General approach
To limit the possibilities of creating broken files, we are a bit strict about the codecs we support for compressed I/O. This means that the new feature cannot automatically transfer all compressed streams. For compressed I/O the following conditions must be met:
  • The precise codec must be known to gavl. While for decoding it never matters if we have MPEG-1 or MPEG-2 video (libmpeg2 decodes both), for compressed I/O it must be known.
  • For some codecs, we need other parameters like the bitrate or if the stream contains B-frames or field pictures.
  • Each audio packet must consist of an independently decompressible frame, and we must know how many uncompressed samples it contains.
  • For each video packet, we must know the pts, how long the frame will be displayed and if it's a keyframe.
Compression support in gavl
For transferring compressed packets, we need 2 data structures:
  • An info structure, which describes the compression format (i.e. the codec). The actual codec is an enum (similar to ffmpeg's CodecID), but other parameters can be required as well (see above).
  • A structure for a data packet.
Both of these are in gavl in a new header file gavl/compression.h. Gavl itself never messes around with the contents of compressed packets, it just provides some housekeeping functions for packets and compression definitions. The definitions were moved here because gavl is the only common dependency of gmerlin and gmerlin-avdecoder and I didn't want to define them twice.

gmerlin-avdecoder
There are 2 new functions for getting the compression format of A/V streams:
int bgav_get_audio_compression_info(bgav_t * bgav, int stream,
                                    gavl_compression_info_t * info);

int bgav_get_video_compression_info(bgav_t * bgav, int stream,
                                    gavl_compression_info_t * info);
They can be called after the track was selected with bgav_select_track(). If the demuxer doesn't meet the above conditions for a stream, a parser is tried. If there is no parser for this stream, compressed output fails and the functions return 0.

If you decided to read compressed packets from a stream, pass BGAV_STREAM_READRAW to bgav_set_audio_stream() or bgav_set_video_stream(). Then you can read compressed packets with:
int bgav_read_audio_packet(bgav_t * bgav, int stream, gavl_packet_t * p);

int bgav_read_video_packet(bgav_t * bgav, int stream, gavl_packet_t * p);
There is a small commandline tool bgavdemux, which writes the compressed packets to raw files, but only if the compression supports a raw format. This is e.g. not the case for vorbis or theora.

libgmerlin
In the gmerlin library, the new feature shows up mainly in the plugin API. The input plugin (bg_input_plugin_t) got 4 new functions, which have the same meaning as their counterparts in gmerlin-avdecoder:
int (*get_audio_compression_info)(void * priv, int stream,
gavl_compression_info_t * info);

int (*get_video_compression_info)(void * priv, int stream,
gavl_compression_info_t * info);

int (*read_audio_packet)(void * priv, int stream, gavl_packet_t * p);

int (*read_video_packet)(void * priv, int stream, gavl_packet_t * p);

On the encoding side, there are 6 new functions, which are used for querying if compressed writing is possible, adding compressed A/V tracks and writing compressed A/V packets:
int (*writes_compressed_audio)(void * priv,
const gavl_audio_format_t * format,
const gavl_compression_info_t * info);

int (*writes_compressed_video)(void * priv,
const gavl_video_format_t * format,
const gavl_compression_info_t * info);

int (*add_audio_stream_compressed)(void * priv, const char * language,
const gavl_audio_format_t * format,
const gavl_compression_info_t * info);

int (*add_video_stream_compressed)(void * priv,
const gavl_video_format_t * format,
const gavl_compression_info_t * info);

int (*write_audio_packet)(void * data, gavl_packet_t * packet, int stream);

int (*write_video_packet)(void * data, gavl_packet_t * packet, int stream);
gmerlin-transcoder
In the gmerlin transcoder you have a configuration for each A/V stream:

The options for the stream can be "transcode", "copy (if possible)" or "forget". Copying of a stream is possible if the following conditions are met:
  • The source can deliver compressed packets
  • The encoder can write compressed packets of that format
  • No subtitles are blended onto video images
All filters are however completely ignored. You can configure any filters you want, but when you choose to copy the stream, none of them will be applied.

If a stream cannot be copied, it will be transcoded.
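
The decision described above can be written down as a small truth table. The following sketch only mirrors the rules from this section. It is not code from gmerlin-transcoder, and the function name and the way the arguments are passed are made up:

```sh
#!/bin/sh
# Decide what happens to one stream.
#   $1: configured action: transcode, copy or forget
#   $2: 1 if the source can deliver compressed packets
#   $3: 1 if the encoder can write compressed packets of that format
#   $4: 1 if subtitles are blended onto the video images
stream_action() {
    if [ "$1" = "copy" ]; then
        if [ "$2" = 1 ] && [ "$3" = 1 ] && [ "$4" != 1 ]; then
            echo copy
        else
            echo transcode    # fallback when copying is not possible
        fi
    else
        echo "$1"
    fi
}

stream_action copy 1 1 0    # prints "copy"
stream_action copy 1 0 0    # prints "transcode"
```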

libquicktime
Another major project was support in libquicktime. It's a bit nasty because libquicktime codecs do tasks, which should actually be done by the (de)multiplexer. In practice this means that compressed streams have to be enabled for each codec and container separately. The public API is in compression.h. It was modeled after the functions in libgmerlin, but the definition of the compression (lqt_compression_info_t) is slightly different because inside libquicktime we can't use gavl.

I made a small tool, lqtremux. If it is called with a single file as an argument, all A/V streams are exported to separate quicktime files. If you pass more than one file on the commandline, the last file is considered the output file and all tracks of all other files are multiplexed into the output file. Note that lqtremux is a pretty dumb application, which was written mainly as a demonstration and testbed for the new functionality. In particular, you cannot copy some tracks while transcoding others. For more sophisticated tasks use gmerlin-transcoder or write your own tool.

Status and TODO
Most major codecs and containers work, although not all of them are heavily tested. Therefore I cannot guarantee, that files written that way will be compatible with all other decoders. Future work will be testing, fixing and supporting more codecs in more containers. Of course any help (like bugreports or compatibility testing on windows or OSX) is highly appreciated.

With this feature my A/V pipelines are ready for a 1.x version now.