When playing with my new Canon HF200 camera, I got curious where the recording time (and date) is hidden in the AVCHD format.
My first idea was the SEI picture timing message of the H.264 stream. I already parse it to find out whether pictures are frame- or field-coded. So I extended my code to parse the timecode in HH:MM:SS:FF format, only to find out that this info isn't present at all in my files :(
Googling for more information about that, I found that nobody seems to know how to get the recording time, and even professional programs fail to display it. But a very few programs do, so we know that it must be coded in the transport stream itself (and not in the other files written by the camera).
Finally I found this Perl script, which extracts the date and time from Canon MTS files. It's a pretty simple implementation: it scans the multiplexed transport stream for a particular bit pattern and then extracts the data. The script works for Canon files but fails e.g. for Panasonic files.
Then I found where exactly the information is located: An H.264 stream has SEI (supplemental enhancement information) messages, which can contain additional (e.g. timing) information. For each SEI message the parser can obtain the message type (an integer) and the size of the message in bytes. AVCHD files have SEI messages of type 5, which means "user data unregistered" (i.e. a proprietary extension). The H.264 standard says that these messages start with a 16-byte GUID (the standard calls it a UUID) followed by the payload.
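To make the "type and size" part concrete, here is a minimal Python sketch of how the SEI message headers can be walked. It is not the gmerlin-avdecoder code; the function names are made up for illustration. It assumes `rbsp` is the body of one SEI NAL unit with the NAL header byte and the emulation prevention bytes already removed, and that it ends with the single trailing byte 0x80. Type and size are each coded as a run of 0xFF bytes (worth 255 each) plus one final byte.

```python
def read_ff_coded(buf, pos):
    """Read one SEI type or size value: each 0xFF adds 255, the last byte ends it."""
    value = 0
    while buf[pos] == 0xFF:
        value += 255
        pos += 1
    value += buf[pos]
    return value, pos + 1


def read_sei_messages(rbsp):
    """Yield (payload_type, payload_bytes) for every message in one SEI RBSP."""
    pos = 0
    end = len(rbsp) - 1  # assumption: the last byte is the 0x80 trailing byte
    while pos < end:
        payload_type, pos = read_ff_coded(rbsp, pos)
        payload_size, pos = read_ff_coded(rbsp, pos)
        yield payload_type, rbsp[pos:pos + payload_size]
        pos += payload_size


# Hand-built example: one message of type 5 with a 20 byte payload
GUID = bytes.fromhex("17ee8c60f84d11d98cd60800200c9a66")
rbsp = bytes([5, 20]) + GUID + b"MDPM" + b"\x80"
messages = list(read_sei_messages(rbsp))
```

A caller would then keep only the messages with type 5 and compare the first 16 payload bytes against the GUID it is interested in.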
Now take a look at the hexdump of such an SEI message:
17 ee 8c 60 f8 4d 11 d9 8c d6 08 00 20 0c 9a 66 ...`.M...... ..f
4d 44 50 4d 09 18 02 20 09 08 19 01 01 25 45 70 MDPM... .....%Ep
c7 f2 ff ff 71 ff ff ff ff 7f 00 00 65 84 e0 10 ....q.......e...
11 30 02 e1 07 ff ff ff ee 19 19 02 00 ef 01 c0 .0..............
00 00 ..
From this I found the following structure:
- The GUID is the first 16 bytes. It's always the same for the info we want, but I found other SEI messages of type 5 with different GUIDs in AVCHD files.
- 4 characters "MDPM". They occur in all files I looked at.
- An unknown byte (0x09, other vendors have other values)
- The byte 0x18 (probably indicating that year and month follow)
- An unknown byte (0x02, other vendors have other values)
- The year and month in 3 BCD coded bytes: 0x20 0x09 0x08
- The byte 0x19 (probably indicating that day and time follow)
- The day, hour, minute and second as 4 BCD-encoded bytes (0x01 0x01 0x25 0x45). In the dump above they follow the 0x19 byte directly.
In this case, I extract the recording time "2009-08-01 01:25:45" (which is correct).
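The decoding can be sketched in a few lines of Python. This is only an illustration: the byte offsets are read off the hexdump above (where the four day/time bytes come right after 0x19), and I assume they are the same in other files, which I have not verified for other vendors. The names `bcd` and `decode_recording_time` are made up for this example.

```python
from datetime import datetime

GUID = bytes.fromhex("17ee8c60f84d11d98cd60800200c9a66")


def bcd(byte):
    """Decode one BCD byte, e.g. 0x25 -> 25."""
    return (byte >> 4) * 10 + (byte & 0x0F)


def decode_recording_time(payload):
    """Return the recording time from a type 5 SEI payload, or None."""
    if payload[:16] != GUID or payload[16:20] != b"MDPM":
        return None
    data = payload[20:]
    # Assumed fixed layout: ?? 18 ?? YY YY MM 19 DD hh mm ss
    if len(data) < 11 or data[1] != 0x18 or data[6] != 0x19:
        return None
    try:
        return datetime(
            bcd(data[3]) * 100 + bcd(data[4]),  # year, e.g. 0x20 0x09 -> 2009
            bcd(data[5]),                       # month
            bcd(data[7]), bcd(data[8]),         # day, hour
            bcd(data[9]), bcd(data[10]),        # minute, second
        )
    except ValueError:
        # Bytes that are not a valid date (e.g. 0xff fillers)
        return None


# First two lines of the hexdump above
payload = bytes.fromhex(
    "17 ee 8c 60 f8 4d 11 d9 8c d6 08 00 20 0c 9a 66"
    "4d 44 50 4d 09 18 02 20 09 08 19 01 01 25 45 70"
)
print(decode_recording_time(payload))  # 2009-08-01 01:25:45
```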
The remainder of the SEI message is completely unknown, but I'm sure that if someone figured out the complete data structure (including the unknown bytes), it would be possible to extract other interesting information.
These messages are present for almost all frames, but I plan to read them only from the first frame because the following ones are redundant.
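Stopping at the first occurrence could look roughly like the following Python sketch. It takes the brute-force route of searching the raw bytes for the GUID followed by "MDPM" instead of doing a real SEI parse, so it assumes that the marker and the 11 bytes after it are not interrupted by a transport stream packet header or by emulation prevention bytes. `find_first_mdpm` is a hypothetical helper, not part of any library.

```python
import io

MARKER = bytes.fromhex("17ee8c60f84d11d98cd60800200c9a66") + b"MDPM"
NEEDED = 11  # bytes after the marker that hold date and time in the dump above


def find_first_mdpm(stream, chunk_size=1 << 16):
    """Return the NEEDED bytes after the first marker in stream, or None."""
    tail = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return None
        buf = tail + chunk
        idx = buf.find(MARKER)
        start = idx + len(MARKER)
        if idx >= 0 and start + NEEDED <= len(buf):
            return buf[start:start + NEEDED]
        # Keep enough bytes so a match straddling two chunks is still found
        tail = buf[-(len(MARKER) + NEEDED - 1):]


# Synthetic example: padding, then the marker and the bytes from the dump
fake_file = io.BytesIO(
    b"\x00" * 50 + MARKER + bytes.fromhex("0918022009081901012545") + b"\x70" * 5
)
first = find_first_mdpm(fake_file, chunk_size=7)
```

For a real file one would pass an object opened with `open(path, "rb")`; the function returns as soon as the first message is found, so the rest of the file is never read.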
The next project will be to clean up the parsing code in gmerlin-avdecoder and make the timecode actually appear along with the first decoded frame.