Wednesday, April 3, 2013

gavf: A multimedia container format for gmerlin

Introduction

Having programmed a lot of demultiplexers in gmerlin-avdecoder, I found out that there is no ideal container format. For my taste, an ideal container format
  • Is as codec agnostic as possible, i.e. doesn't require codec specific hacks in (de-)multiplexers. AVI is surprisingly good in this respect. Ogg and mov/mp4 fail miserably here.
  • Supports sample accurate timing. This means that all streams have timestamps in their native timescale. This is solved well in Ogg and mp4, while matroska and many other formats fail.
  • Is fully streamable. This means that a stream can be encoded from a live source and sent over a (non-seekable) channel like a socket. Ogg streams have this property but mov/mp4 doesn't.
  • Is as simple as possible.
Designing a multimedia format for gmerlin was mostly a matter of serializing the C structs that were already present in gavl, such as A/V formats, compression descriptions and metadata. Furthermore I used some tricks:
  • Use variable length integers like in matroska, but extended to 64 bits
  • Introduce so-called synchronization headers. They typically come before video keyframes and contain the timestamps of the next packets for all elementary streams. If you seek to a sync header, you have the full timing information when you restart decoding from that point.
  • Write timestamps relative to the last sync header. This means smaller numbers (fewer bytes) but full accuracy and 64 bit resolution. A similar approach is found in matroska files.
  • Eliminate redundant fields. E.g. a video stream with constant framerate and no B-frames doesn't need per-frame timestamps at all.
  • Split global per-stream information into a header (at the beginning of a file) and a footer (at the end of the file). For decoding the file (e.g. when streaming) the header is sufficient. The footer contains e.g. the indices for seeking. In the case of a live stream, there is no footer at all. But it can be generated trivially when the stream is saved to a file.
  • Make bitstream-level parsing of the elementary streams unnecessary. This means that some fields, which might come in handy on the demuxer level, are available in the container format. Examples are the frame type (I-, P- or B-frame) and timecodes.
  • Support arbitrary global and per-stream metadata
  • Allow the global metadata to be updated on-the-fly. This makes it possible to wrap webradio streams into gavf streams without losing the song titles.
  • Support chapters and subtitles (text based and graphical).
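To make the variable length integer trick more concrete, here is a minimal sketch in C. It is not the code from gavl and the actual gavf byte layout may differ: the function names varint_encode and varint_decode are hypothetical, and the layout is simply the EBML scheme known from matroska (the number of leading zero bits in the first byte gives the number of following bytes), stretched to nine bytes so that a full 64 bit value fits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, NOT the gavl API: an EBML-style variable length
   integer, stretched to nine bytes so that every 64 bit value fits. */

#define VARINT_MAX_LEN 9

/* Encode v into buf (at least VARINT_MAX_LEN bytes).
   Returns the number of bytes written (1..9). */
size_t varint_encode(uint64_t v, uint8_t *buf)
{
  size_t extra = 0; /* number of bytes following the first one */
  size_t i;

  assert(buf != NULL);

  /* With `extra` following bytes there is room for 7 * (extra + 1) bits */
  while (extra < 8 && v >= ((uint64_t)1 << (7 * (extra + 1))))
    extra++;

  if (extra < 8) /* marker bit, then the most significant payload bits */
    buf[0] = (uint8_t)((0x80u >> extra) | (v >> (8 * extra)));
  else           /* no marker bit at all: eight full bytes follow */
    buf[0] = 0x00;

  /* Remaining payload in big endian order */
  for (i = 1; i <= extra; i++)
    buf[i] = (uint8_t)(v >> (8 * (extra - i)));

  return extra + 1;
}

/* Decode one integer from buf (len bytes available).
   Returns the number of bytes consumed, or 0 if the input is truncated. */
size_t varint_decode(const uint8_t *buf, size_t len, uint64_t *out)
{
  size_t extra = 0;
  size_t i;
  uint8_t mask = 0x80;
  uint64_t v;

  if (len == 0)
    return 0;

  /* Count leading zero bits of the first byte (at most 8) */
  while (extra < 8 && !(buf[0] & mask)) {
    extra++;
    mask >>= 1;
  }
  if (len < extra + 1)
    return 0;

  /* Payload bits of the first byte are the ones below the marker bit */
  v = (extra < 8) ? (uint64_t)(buf[0] & (mask - 1)) : 0;
  for (i = 1; i <= extra; i++)
    v = (v << 8) | buf[i];

  *out = v;
  return extra + 1;
}
```

In this sketch values below 128 take a single byte and values below 16384 take two. That is why writing timestamps relative to the last sync header pays off: an offset of, say, 3003 ticks costs two bytes, while only values of 2^56 and above need the full nine bytes.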
Motivation

Now the question is, why yet another multimedia format? Well, it's true that there are way too many formats out there, as every multimedia programmer knows all too well. So let me make clear why I developed gavf. I wanted:
  • to store uncompressed A/V data in all formats supported by gavl. This is especially important for testing and debugging
  • to save a compressed stream (e.g. from an rtsp source) without depending on 3rd party libraries
  • to transfer A/V streams from one gmerlin program to another via a pipe or a socket.
  • to prove that I can design a format that is better than all the others :)
All these goals couldn't be met with any existing container format, but they are all met by gavf, so it was worth the effort.

Supported codecs

As mentioned already, gavf supports compressed and uncompressed data. In the uncompressed case, the format is completely defined by the audio or video format; the ID of the compression info is then set to GAVL_CODEC_ID_NONE. For audio streams, the compression can be one of the following:
  • alaw
  • ulaw
  • mp2
  • mp3
  • AC3
  • AAC
  • Vorbis
  • Flac
  • Opus
  • Speex
For video, we support:
  • JPEG
  • PNG
  • TIFF
  • TGA
  • MPEG-1
  • MPEG-2
  • MPEG-4 (a.k.a. Divx)
  • H.264 (including AVCHD)
  • Theora
  • Dirac
  • DV (several variants)
  • VP8
These allow a huge number of formats to be wrapped into gavf streams. Adding a new codec is mostly a matter of defining it in gavl/compression.h and adding support for it, at least in gmerlin-avdecoder.

Application support

I won't promote gavf as a container format for interchanging multimedia content. In fact, the current design even makes that impossible: the gavf format can change without warning from one gavl version to another, and there are no version fields inside gavf files to ensure backward compatibility. For now, I use it exclusively for passing files between gmerlin applications of the same version.

If, however, someone likes gavf so much that he or she wants it to be more widespread, it needs some additional work. First of all, we need a formal specification document. Secondly, we need to add version fields to the internal data structures so one can write backwards compatible (de-)muxers. Neither of these will be done by me though.

The current svn version of gmerlin has some support for gavf:
  • A reference (de-)multiplexer in gavl (gavl/gavf.h)
  • gavf demultiplexing support in gmerlin-avdecoder
  • An encoder plugin for creating gavf files with gmerlin_transcoder. It supports compressing with the standalone codecs.
  • A connector for reading and writing gavf streams via regular files, pipes and sockets. It's the basis of the gavftools, which will be described in another post.
