Monday, July 14, 2014

Introducing gavftools

1. Introduction
The main goal when developing the gavf format was to have a universal pipe format, which can be used to stream multimedia content from one program to another.
In apps/gavftools, there is a bunch of commandline programs I wrote to make my everyday multimedia work much easier. They allow building quite complex processing pipelines from the commandline.
The basis is the gavf multimedia format, which can contain audio, video as well as text and graphical subtitles. A/V data can be either uncompressed or compressed in a large number of formats. In some cases, especially for uncompressed video, the video frames are passed as shared memory segments between the processes.
All programs are prefixed with gavf-. All programs can be called with the -help option to show the available commandline options.

2. Simple examples

Read a media file and convert it to the gavf format:

gavf-decode -i file.avi -o file.gavf

or, equivalently, writing to standard output:

gavf-decode -i file.avi > file.gavf

Play a media file:

gavf-decode -i file.avi | gavf-play

3. I/O variants

Playing a media file can happen in many ways. Instead of

gavf-decode -i file.avi | gavf-play

you can use a Unix domain socket:

gavf-decode -i file.avi -o unixserv://socket
gavf-play -i unix://socket

(Of course the two commands should be run in different terminals, or the first command should be put into the background.) You can also use a FIFO:

mkfifo fifo
gavf-decode -i file.avi -o fifo
gavf-play -i fifo

Or a TCP socket:

gavf-decode -i file.avi -o httpserv://
gavf-play -i

Naturally, in the last example the decode and playback commands can run on different machines.

Shared memory segments are always used if the maximum packet size is known in advance and the receiver is a process (i.e. not a file) running on the same machine.
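
The underlying mechanism can be sketched with plain POSIX shared memory. This is not gavf's actual implementation; the segment name, size and function names below are made up for illustration. Only the idea (pass the segment name over the pipe or socket and let the receiver map the data into its own address space) matches the description above:

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define SEG_NAME "/gavf_demo_frame"   /* made-up segment name       */
#define SEG_SIZE 4096                 /* made-up maximum frame size */

/* Sender side: create the segment and copy a "frame" into it.
 * Only the segment name needs to travel through the pipe/socket. */
static int send_frame(const char * payload)
{
  int fd = shm_open(SEG_NAME, O_CREAT | O_RDWR, 0600);
  if(fd < 0 || ftruncate(fd, SEG_SIZE) < 0)
    return -1;
  char * buf = mmap(NULL, SEG_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if(buf == MAP_FAILED)
    return -1;
  strncpy(buf, payload, SEG_SIZE - 1);
  munmap(buf, SEG_SIZE);
  close(fd);
  return 0;
}

/* Receiver side (normally a different process): map the same name */
static int recv_frame(char * dst, size_t len)
{
  int fd = shm_open(SEG_NAME, O_RDONLY, 0);
  if(fd < 0)
    return -1;
  char * buf = mmap(NULL, SEG_SIZE, PROT_READ, MAP_SHARED, fd, 0);
  if(buf == MAP_FAILED)
    return -1;
  snprintf(dst, len, "%s", buf);
  munmap(buf, SEG_SIZE);
  close(fd);
  shm_unlink(SEG_NAME);   /* clean up the segment */
  return 0;
}
```

In a real pipeline send_frame() and recv_frame() would run in different processes; the copy is avoided entirely when the receiver reads the mapped memory directly.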

4. Other commands

Recompress the audio stream to 320 kbps mp3 (this can also be used to recompress audio and video simultaneously):

... | gavf-recompress -ac 'codec=c_lame{cbr_bitrate=320}' | ....

Split the audio and video streams into separate files:

... | gavf-demux -oa audio.gavf -ov video.gavf

Multiplex separate streams:

gavf-mux -i audio.gavf -i video.gavf | ....

Display info about the stream (and do nothing else):

... | gavf-info

Split a stream for multiple receivers (can also use more than 2 -o options):

... | gavf-tee -o saved_file.gavf -o "|gavf-play"

Record a stream from your webcam and from your soundcard (replace pulseaudio_device with something meaningful):

gavf-record -vid 'plugin=i_v4l2{device=/dev/video0}' -aud 'plugin=i_pulse{dev=pulseaudio_device}' | ...

Convert an audio-only stream to mp3. If the audio compression is already mp3, it is written as-is; otherwise it is encoded at 320 kbps:

... | gavf-encode -enc "a2v=0:ae=e_lame" -ac cbr_bitrate=320 -o file.mp3

Flip video images vertically:

... | gavf-filter -vf 'f={fv_flip{flip_v=1}}' | ....

The same works for audio filters with the -af option. Filters can also be chained.

Wednesday, April 3, 2013

gavf: A multimedia container format for gmerlin


Having programmed a lot of demultiplexers in gmerlin-avdecoder, I found out that there is no ideal container format. For my taste, an ideal container format
  • Is as codec agnostic as possible, i.e. doesn't require codec specific hacks in (de-)multiplexers. AVI is surprisingly good in this respect. Ogg and mov/mp4 fail miserably here.
  • Supports sample accurate timing. This means that all streams have timestamps in their native timescale. This is solved well in Ogg and mp4, while matroska and many other formats fail.
  • Is fully streamable. This means that a stream can be encoded from a live source and sent over a (non-seekable) channel like a socket. Ogg streams have this property but mov/mp4 doesn't.
  • Is as simple as possible.
Designing a multimedia format for gmerlin was mostly a matter of serializing the C structs already present in gavl, like A/V formats, compression descriptions and metadata. Furthermore, I used some tricks:
  • Use variable length integers like in matroska but extended for 64 bit
  • Introduce so-called synchronization headers. They typically come before video keyframes and contain the timestamps of the next packets for all elementary streams. If you seek to a sync header, you have the full timing information when you restart decoding from that point.
  • Write timestamps relative to the last sync header. This means smaller numbers (fewer bytes) but full accuracy and 64 bit resolution. A similar approach is found in matroska files.
  • Eliminate redundant fields. E.g. a video stream with constant framerate and no B-frames doesn't need per-frame timestamps at all.
  • Split global per-stream information into a header (at the beginning of a file) and a footer (at the end of the file). For decoding the file (e.g. when streaming) the header is sufficient. The footer contains e.g. the indices for seeking. In the case of a live stream, there is no footer at all. But it can be generated trivially when the stream is saved to a file.
  • Make bitstream-level parsing of the elementary streams unnecessary. This means that some fields, which might come in handy on the demuxer level, are available in the container format. Examples are the frame type (I-, P- or B-frame) and timecodes.
  • Support arbitrary global and per-stream metadata
  • Allow updating the global metadata on-the-fly. This allows wrapping webradio streams into gavf streams without losing the song titles.
  • Support chapters and subtitles (text based and graphical).
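
To illustrate the first trick, here is a minimal sketch of an EBML-style variable-length integer, extended so that a 9-byte variant can carry a full 64-bit value. This is only an illustration of the scheme; the actual gavf wire encoding may differ in detail:

```c
#include <stdint.h>

/* Write num as a variable-length integer, return the number of bytes
 * used (1..9). The position of the first set bit in the first byte
 * encodes the total length, as in Matroska/EBML; a first byte of
 * 0x00 marks a 9-byte variant carrying a full 64-bit value. */
static int varint_write(uint8_t * buf, uint64_t num)
{
  int bytes = 1, i;
  while(bytes < 9 && num >= ((uint64_t)1 << (7 * bytes)))
    bytes++;
  if(bytes == 9)
  {
    buf[0] = 0x00;
    for(i = 0; i < 8; i++)
      buf[i + 1] = (uint8_t)(num >> (8 * (7 - i)));
    return 9;
  }
  buf[0] = (uint8_t)((0x80 >> (bytes - 1)) | (num >> (8 * (bytes - 1))));
  for(i = 1; i < bytes; i++)
    buf[i] = (uint8_t)(num >> (8 * (bytes - 1 - i)));
  return bytes;
}

/* Inverse operation: decode num, return the number of bytes consumed */
static int varint_read(const uint8_t * buf, uint64_t * num)
{
  int bytes = 1, i;
  if(buf[0] == 0x00)            /* extended 9-byte variant */
  {
    *num = 0;
    for(i = 1; i <= 8; i++)
      *num = (*num << 8) | buf[i];
    return 9;
  }
  while(!(buf[0] & (0x80 >> (bytes - 1))))
    bytes++;
  *num = buf[0] & (0xff >> bytes);
  for(i = 1; i < bytes; i++)
    *num = (*num << 8) | buf[i];
  return bytes;
}
```

Small numbers (like the relative timestamps mentioned above) take one or two bytes, while the full 64-bit range stays representable.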

Now the question is: why yet another multimedia format? Well, it's true that there are way too many formats out there, as every multimedia programmer knows all too well. So let me make clear why I developed gavf. I wanted:
  • to store uncompressed A/V data in all formats supported by gavl. This is especially important for testing and debugging
  • to save a compressed stream (e.g. from an rtsp source) without depending on 3rd party libraries
  • to transfer A/V streams from one gmerlin program to another via a pipe or a socket.
  • to prove that I can design a format which is better than all the others :)
All these goals couldn't be met by any existing container format, but they are all met by gavf, so it was worth the effort.

Supported codecs

As mentioned already, gavf supports compressed and uncompressed data. In the uncompressed case the format is completely defined by the audio or video format, and the codec ID of the compression info is set to GAVL_CODEC_ID_NONE. For audio streams, the compression can be one of the following:
  • alaw
  • ulaw
  • mp2
  • mp3
  • AC3
  • AAC
  • Vorbis
  • Flac
  • Opus
  • Speex
For video, we support:
  • JPEG
  • PNG
  • TIFF
  • TGA
  • MPEG-1
  • MPEG-2
  • MPEG-4 (a.k.a. Divx)
  • H.264 (including AVCHD)
  • Theora
  • Dirac
  • DV (several variants)
  • VP8
These allow wrapping a huge number of formats into gavf streams. Adding new codecs is mostly a matter of defining them in gavl/compression.h and adding support for them, at least in gmerlin-avdecoder.

Application support

I won't promote gavf as a container format for interchanging multimedia content. In fact, the current design even makes that impossible: the gavf format can change without warning from one gavl version to another, and there are no version fields inside gavf files to ensure backward compatibility. For now, I use it exclusively for passing files between gmerlin applications of the same version.

If, however, someone likes gavf so much that he or she wants it to become more widespread, it needs some additional work. First of all, we need a formal specification document. Secondly, we need to add version fields to the internal data structures so one can write backwards-compatible (de-)muxers. None of this will be done by me, though.

The current svn version of gmerlin has some support for gavf:
  • A reference (de-)multiplexer in gavl (gavl/gavf.h)
  • gavf demultiplexing support in gmerlin-avdecoder
  • An encoder plugin for creating gavf files with gmerlin_transcoder. It supports compressing with the standalone codecs
  • A connector for reading and writing gavf streams via regular files, pipes and sockets. It's the basis of the gavftools, which will be described in another post.

Tuesday, April 2, 2013

Standalone codec plugins for gmerlin

After having implemented the A/V connectors for gmerlin, it was easy to implement standalone codec plugins, which (de-)compress an A/V stream. This means that in addition to simplified A/V processing with on-the-fly format conversion, we can also do on-the-fly (de-)compression. There is just one plugin type (for compression and decompression of audio and video): bg_codec_plugin_t, defined in gmerlin/plugin.h. In addition to the common stuff (creation, destruction, setting parameters), there are a number of functions specific to codec functionality. For decompression these are:

gavl_audio_source_t * (*connect_decode_audio)(void * priv,
                                              gavl_packet_source_t * src,
                                              const gavl_compression_info_t * ci,
                                              const gavl_audio_format_t * fmt,
                                              gavl_metadata_t * m);

gavl_video_source_t * (*connect_decode_video)(void * priv,
                                              gavl_packet_source_t * src,
                                              const gavl_compression_info_t * ci,
                                              const gavl_video_format_t * fmt,
                                              gavl_metadata_t * m);

The decompressor will get the compressed packets from the packet source. Additional arguments are the compression info, the format (which might be incomplete) and the metadata of the A/V stream. The functions return an audio or video source from which you can read the uncompressed frames.

For opening a compressor, we need to call one of:

gavl_audio_sink_t * (*open_encode_audio)(void * priv,
                                         gavl_compression_info_t * ci,
                                         gavl_audio_format_t * fmt,
                                         gavl_metadata_t * m);

gavl_video_sink_t * (*open_encode_video)(void * priv,
                                         gavl_compression_info_t * ci,
                                         gavl_video_format_t * fmt,
                                         gavl_metadata_t * m);

gavl_video_sink_t * (*open_encode_overlay)(void * priv,
                                           gavl_compression_info_t * ci,
                                           gavl_video_format_t * fmt,
                                           gavl_metadata_t * m);

These functions return the sink into which we can push the A/V frames. The other arguments are the same as when opening a decoder, but in this case they will be changed by the call. After opening the compressor and before passing the first frame, we need to set a packet sink to which the compressed packets will be written:

void (*set_packet_sink)(void * priv, gavl_packet_sink_t * s);

The decompressors work in pull mode, the compressors work in push mode. These are the most suitable modes in typical usage scenarios.

The potential delay between compressed packets and uncompressed frames is handled internally. The decompressor simply reads enough packets so that it can output one uncompressed frame. The compressor outputs compressed packets as they become available. When the compressor is destroyed, it might flush its internal state, resulting in one or more compressed packets being written. This means that at the moment you destroy a compressor, the packet sink must still be able to accept packets.
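
The pull-mode behaviour can be illustrated with a toy model: delivering one uncompressed frame may require pulling several compressed packets first. All names and numbers here are made up for illustration; a real decompressor buffers actual codec data:

```c
#define PACKETS_PER_FRAME 3   /* pretend codec delay */

typedef struct
{
  int packets_sent;   /* packets pulled from upstream  */
  int packets_total;  /* packets available upstream    */
  int buffered;       /* packets held inside the codec */
} toy_decoder_t;

/* Pull one compressed packet from upstream; 0 means end of stream */
static int pull_packet(toy_decoder_t * dec)
{
  if(dec->packets_sent >= dec->packets_total)
    return 0;
  dec->packets_sent++;
  dec->buffered++;
  return 1;
}

/* Pull one uncompressed frame: read packets until enough are buffered */
static int pull_frame(toy_decoder_t * dec)
{
  while(dec->buffered < PACKETS_PER_FRAME)
    if(!pull_packet(dec))
      return 0;
  dec->buffered -= PACKETS_PER_FRAME;
  return 1;
}
```

The caller of pull_frame() never sees the internal packet reads; that is exactly what "handled internally" means above.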

There are decompressor plugins as part of gmerlin-avdecoder, which handle most formats. The gmerlin-encoders package contains compressor plugins for most formats as well.

Software A/V connectors for gmerlin

As mentioned earlier, I programmed generic connectors for A/V frames and compressed packets. They are much more sophisticated than the old API (based on callback functions), because they also do implicit format conversion and buffer management. The result is a simplified plugin API (consisting of fewer functions) and simplified applications. The code is implemented in gavl (include gavl/connectors.h), so it can be used in gmerlin as well as in gmerlin-avdecoder without introducing new library dependencies. There are 3 types of modules:
  • Sources work in pull mode and do format conversion. They are used by input- and recording plugins
  • Sinks work in push mode and are used by output and encoding plugins
  • Connectors connect multiple sinks to a source
Example for the API usage
Assume you want to read audio samples from a media file and send them to a sink. When you have an audio source (e.g. from gmerlin-avdecoder via bgav_get_audio_source()), your application can look like:

gavl_audio_source_t * src;
gavl_audio_sink_t * sink;
gavl_audio_frame_t * f;
gavl_source_status_t st;

/* Get source */
src = bgav_get_audio_source(dec, 0);

/* Tell the source to deliver the format needed by the sink */
gavl_audio_source_set_dst(src, 0, gavl_audio_sink_get_format(sink));

/* Processing loop */
while(1)
  {
  /* Get a frame of internally allocated memory from the sink
   * (e.g. shared or mmap()ed memory). Return value can be NULL. */
  f = gavl_audio_sink_get_frame(sink);

  /* Read a frame from the source. If f == NULL we'll get a frame
   * allocated and owned by the source itself */
  st = gavl_audio_source_read_frame(src, &f);

  if(st != GAVL_SOURCE_OK)
    break;

  if(gavl_audio_sink_put_frame(sink, f) != GAVL_SINK_OK)
    break;
  }

If you want to use the gavl_audio_connector_t, things get even a bit simpler:

gavl_audio_source_t * src;
gavl_audio_sink_t * sink;
gavl_audio_connector_t * conn;

/* Get source */
src = bgav_get_audio_source(dec, 0);

/* Create connector */
conn = gavl_audio_connector_create(src);

/* Connect sink (you can connect multiple sinks) */
gavl_audio_connector_connect(conn, sink);

/* Initialize (resolves the formats and sets up conversions) */
gavl_audio_connector_start(conn);

/* Processing loop: each call processes one frame and passes it
 * to all connected sinks */
while(gavl_audio_connector_process(conn))
  ;

The gmerlin plugin API was changed to use only the sources and sinks for passing around frames. Text subtitles are transported in gavl packets, overlay subtitles are transported in video frames.

In addition to the lower level gavl converters, the sources support some more format conversions. For audio frames, we do buffering such that the number of samples per frame you read from the source can differ from what the source natively delivers. For video, we support a simple framerate conversion, which works by repeating or dropping frames.
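
The audio rebuffering can be sketched like this. Frame sizes, names and the sample-index source are made up for illustration; in gavl this buffering happens inside the audio source itself:

```c
#include <string.h>

#define SRC_FRAME 100   /* samples the source natively delivers */
#define DST_FRAME 64    /* samples the caller wants per read    */

static int src_pos = 0;  /* absolute sample position in the source */
static int buf_len = 0;  /* samples currently waiting in buffer    */
static float buffer[SRC_FRAME + DST_FRAME];

/* Simulated native source: fills f with consecutive sample indices
 * (a real source would decode audio here) */
static void source_read(float * f)
{
  int i;
  for(i = 0; i < SRC_FRAME; i++)
    f[i] = (float)(src_pos + i);
  src_pos += SRC_FRAME;
}

/* Buffered read: always returns exactly DST_FRAME samples, pulling
 * native-sized frames from the source as needed */
static void buffered_read(float * dst)
{
  while(buf_len < DST_FRAME)
  {
    source_read(buffer + buf_len);
    buf_len += SRC_FRAME;
  }
  memcpy(dst, buffer, DST_FRAME * sizeof(float));
  memmove(buffer, buffer + DST_FRAME, (buf_len - DST_FRAME) * sizeof(float));
  buf_len -= DST_FRAME;
}
```

Consecutive calls to buffered_read() deliver a gapless sample stream even though the read size and the native frame size don't match.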

The video processing API is completely analogous to the audio API described above. For compressed packets, things are slightly different because we don't do format conversion on compressed packets.

A number of gmerlin modules (e.g. the player and transcoder) are already converted to the new API. In many cases, lots of redundant code could be kicked out, so the resulting code is much simpler and easier to understand.

Wednesday, February 13, 2013

Gmerlin architecture changes

It has been a long time since I wrote something about the latest gmerlin developments. The reason is that most of the time I was too busy coding and too lazy to document things. For the latter you need a stable architecture, and the architecture changes a bit during development. I usually think a lot before I start coding, but at some point I need to flush my brain and fine-tune things later, when I have some working applications.

The gmerlin architecture was reworked dramatically with the following goals:
  1. Implement generic source and sink connectors for transporting A/V frames and (compressed) packets inside one application. These do automatic format conversion and optimized buffer handling.
  2. Change the handling of A/V streams throughout all libraries to use the new connectors. This includes gmerlin-avdecoder as well as the gmerlin plugin API.
  3. Implement standalone codec plugins for on-the-fly (de-)compression of A/V streams.
  4. Define (yet another) Multimedia container format. It can be used as an on-disk format but also (and more importantly) as a generic pipe format for connecting commandline applications. Think of it as a more generic version of the yuv4mpeg format. It is called gavf.
  5. Define an interprocess transport mechanism for gavf streams through pipes or sockets. On machine local connections it can pass A/V frames through shared memory for increased efficiency.
  6. Write a bunch of commandline tools for generating and processing gavf streams, which can be connected in every imaginable way on the Unix commandline. This was the ultimate goal I had in my mind :)
Not everything is finished yet. I'll document each of these subprojects in separate posts.

Saturday, October 22, 2011

On demand audio streaming with icecast

The project goal was to make the impossible happen: Turn an icecast streaming server into an audio-on-demand server.

The background is that I bought a NAS, which is basically a PC with an Atom CPU. After erasing the firmware and installing Ubuntu Server on a 10 TB RAID 5 system, I was wondering what else I could do with the box.

Live streaming via icecast to my WiFi radio had been working for some time, but it needs a running PC with a soundcard. What I had in mind was different:
  • It should run exclusively on the NAS, no need to switch on a PC
  • It should support an arbitrary number of playlists, each one corresponding to an icecast URL.
  • The upstream mechanism should work on demand because encoding many mp3 streams in parallel overloads the Atom CPU.
  • The current song should be shown in the display of the radio
Song titles
For the last requirement I added live metadata updating to the API for gmerlin broadcasting plugins. After learning that Vorbis streams with changing song titles make my radio reboot, I wrote an MP3 broadcasting plugin (using libshout and lame). It seems that later firmware versions for the radio fix the Vorbis problem, but the firmware update requires Windows software.

Commandline recorder
The recording and broadcast architecture for gmerlin was already working reliably, so I wrote a plugin which takes a gmerlin album (= playlist), shuffles the tracks and makes them available as if it were recording from a soundcard. In addition, I wrote a commandline recorder, which can be started from a script. There is one script for starting a broadcast:

gmerlin-record -aud $AUDIO_OPT -vid $VIDEO_OPT -m $METADATA_OPT -enc "$ENC_OPT" -r 2>> /dev/null >> /dev/null &
echo $! > $

If you call the script with foo, it will load the album /nas/Stations/lists/foo and send the stream to the icecast server, which will make it available under nas.ip:8000/foo. In addition, the PID of the process will be written to ./ so it can be stopped later.

The foo broadcast can be stopped with foo, where the script looks like:

kill -9 `cat $`
rm -f $

Icecast configuration
No critical options had to be changed in the icecast configuration, except queue-size, which was doubled to 1048576 because that works better for 320 kbps streams.

Icecast stats in awk-friendly format
For the on-demand mechanism described below, we also need to get the running channels and connected clients from the server, ideally in an awk-friendly format. This is done by fetching the server statistics in XML format and processing them with xsltproc, a small commandline tool which comes with libxml2:

wget --user=admin --password=secret -O - 2> /dev/null | \
xsltproc stats.xsl - | cut -b 2-

If you have two channels foo (1 listener) and bar (2 listeners), it will output

foo 1
bar 2

The transformation file stats.xsl looks like:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:for-each select="icestats/source">
<xsl:value-of select="@mount"/>
<xsl:text> </xsl:text>
<xsl:value-of select="listeners"/>
<xsl:text>&#10;</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>

On demand mechanism
Now since we have commands for starting, stopping and querying channels, we can start a channel when the first listener connects and stop it after the last listener has disconnected. Since icecast doesn't support on-demand streaming, we must trick it into doing so. The idea is to put a second HTTP server in front of the icecast server, which handles the connection requests, starts the channel (if necessary) and then does an HTTP redirect to the real icecast URL. The icecast server runs on port 8000; the redirection server (to which the listeners connect) runs on port 8001. The redirection server can be built with simple shell scripts using the netcat (traditional) utility. The server script is:

cd /nas/mmedia/Stations

while true; do
nc.traditional -l -p 8001 -c ./
done

Whenever a TCP connection on port 8001 arrives, the following handler script is executed:

# Read request, path and protocol
read REQ URLPATH PROTO

# Read header variables
while true; do
read VAR VAL
if test "x$VAL" = "x"; then
break
fi
done

# Reject anything but GET requests
if test "x$REQ" != "xGET"; then
echo -e "HTTP/1.1 400 Bad Request\r\n\r\n"
exit 0
fi

# Remove leading "/"
FILE=`echo $URLPATH | cut -b 2-`

# Close unused streams
./ $FILE

# Check if we are broadcasting already
RESULT=`./ $FILE`
if test "x$RESULT" = "x"; then
./ $FILE 2>> /dev/null &
sleep 1
fi

# Send redirection header
echo -e "HTTP/1.1 307 Temporary Redirect\r\nLocation: $URL\r\n\r\n"

Here we use 2 additional scripts. The first stops all streams with zero listeners, except the one given as commandline argument:

./ | awk -v NAME=$1 '($1 != NAME) && ($2 == 0) { system("./ " $1) }'

The second lists just the number of listeners of the given station:

./ | awk -v NAME=$1 '$1 == NAME { print $2 }'
Energy saving mode
When we just use the radio, the NAS must be switched on manually (the PCs do that automatically with wake-on-lan). The NAS detects when it is no longer needed and then switches off automatically. This is done by querying the TCP connections to IP addresses other than localhost: if we don't have any external connections for more than 30 minutes, we switch off. The following script can be interesting for many other applications as well. Simply start it during booting:
# Switch off after this time (30 minutes; value is an example)
TIMEOUT=1800
# Delay between 2 checks (example value)
DELAY=60

DATE_START=`date +%s`

while :; do
CONNECTIONS=`netstat -tn | grep tcp | grep -v " 127\." | wc -l`
DATE_NOW=`date +%s`

if test "x$CONNECTIONS" = "x0"; then
DATE_DIFF=`expr $DATE_NOW - $DATE_START - $TIMEOUT`
if test $DATE_DIFF -gt "0"; then
halt
fi
else
DATE_START=$DATE_NOW
fi
sleep $DELAY
done

Mission accomplished.

Thursday, December 9, 2010

New prereleases

Lots of bugs have been fixed since the last prereleases, so here are new ones:

The good news is that no new features were added, so the code can stabilize better.

Please test this and report any problems.

The final gmerlin release is expected by the end of the year.