It is highly recommended that you read the pages in the correct order, and not just jump somewhere. If you have trouble with some diagrams, try changing the font size in your browser. A modern graphical browser is recommended, but Links can display this correctly too, if you have a huge terminal. Sorry about the incompatibilities.
Current status of the specification: Most parts written, but not properly checked for errors. We want to know about every error in it, so please try to find them (typos, grammatical, technical, flaws in design, some parts difficult to understand - anything). If you find something or just want to ask us a question, please drop a line to tronic(ath)trn(doth)iki(doth)fi (email address encrypted to avoid spam). You may also visit #mcf at irc.freenode.net.
So, you are interested in MCF? Let's get this sorted out first...
MCF is an open (the specs are available for everybody, free of charge), free (no royalties) data storage format called Movie Container Format (previously also known as Multimedia Container Format). It is developed by experienced video enthusiasts and programmers. The format and all software we develop for it will also stay free; it won't be turned into a commercial project once it's popular.
In short it is a file format, like AVI. However, it is also much more than that, having streaming and broadcasting features in the same format. It is not a video or audio compression algorithm, but instead just a container that can hold any media inside it. This includes MPEG-4 (XviD and DivX), AC3, Ogg Vorbis, MP3 and any others you may need.
The reason this project was started is that once upon a time the only usable movie format on PC was Microsoft's AVI, which had seen its best days and couldn't really be extended any further. After an unsuccessful search for existing alternatives, the MCF project (known as TMF in its early days) was started. Since then some other similar new projects have appeared, most notably OGM ("Ogg Media").
As if things weren't complex enough otherwise, one of MCF's lead developers, Steve Lhomme (robUx4), started to experiment with a completely new way of storing data, EBML. He took the MCF format structure and replaced most of its simple binary fields with these EBML fields. After some discussion (he was proposing to replace the almost ready for release MCF with this new system), he started his own project (the date was 2002-12-06), Matroska, to experiment more with his system. Well, then Lasse Kärkkäinen (Tronic), the other of MCF's lead developers, had to go to the military, and because the remaining developers weren't comfortable enough writing software, MCF development stalled for six months.
The current status is that OGM and Matroska have working code available and can be played back on Windows and Linux. MCF is about four months behind Matroska in this matter, but we are continuing the project because this format has many technical advantages over all known competitors.
You have probably already heard about some of its features, but I'll list a few interesting ones here.
Strict standard: doesn't allow users to add extensions on their own, like AVI, OGM and Matroska do. There is no technical reason to limit this, but it increases overall compatibility a lot, and that is very important because we are really aiming for hardware support. You just can't have support for every "DivX subtitle format" or users inventing their own seeking information or file information headers. It's much better to have a single format that has the important features of all others. To make writing software easy, MCF uses fixed binary structures or otherwise simple methods whenever possible. One thing you won't find in MCF is XML (however, there might still be some stored inside MCF).
Also, because of MCF's strictly defined binary structure and the fixed offsets of a few headers, you can read them with a single read command; this is very fast and really counts when you want to make an inventory of all the files on your HDD, on a network drive, or even of those on some FTP site (just read the first 2.5 Ko of each file and you have all the data you need: no seeking and only a very small transfer).
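A minimal sketch of that inventory idea, in Python. It only shows the "one read, no seeking" pattern; the names are hypothetical, "2.5 Ko" is assumed to mean 2560 octets, and parsing the header fields themselves is left out because their layout is defined elsewhere in the spec.

```python
# Assumption: "2.5 Ko" is taken as 2560 octets; the real header layout comes from the spec.
HEADER_REGION_SIZE = 2560

def read_header_region(path, size=HEADER_REGION_SIZE):
    """Return the first `size` octets of a file with one read call and no seeking."""
    with open(path, "rb") as f:
        return f.read(size)

def inventory(paths):
    """Map each path to its raw header region (hypothetical helper, no parsing)."""
    return {p: read_header_region(p) for p in paths}
```

Over a network the same pattern applies: request only the first few Ko of each file instead of the whole thing.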
Menus, chapters, subtitles and everything in the same file/stream.
ASCII-looking binary: MCF uses a lot of human-readable ASCII as identifiers instead of some random binary. This makes it easy to debug files with any hex editor. Also, the very beginning of MCF files contains some information you can read with nearly any text editor.
Variable framerates: allows using slightly higher framerates in high motion scenes and lower framerate when the motion is slower, reducing the amount of bandwidth required and improving quality.
Seamless multi-segment: dividing a long movie into several files that can be burnt on several discs. Okay, so what's new, I hear you asking. It's that if you have all segments available at the same time, you won't even notice the crossing of boundaries (most of the dirty work is done transparently in the parser, so players or codecs don't really need to bother with it). Or of course you can just join the segments into a single big file, or split into segments of different sizes. Oh, and one more thing - every segment is playable alone too, and you can define any overlap time you wish on 'em.
Full CRC32 protection: data can be protected, and if an error occurs (broken resume on download being the most common reason), the broken part is skipped (no more frozen frames or unplayable files). The parser can also tell where exactly the error happened, so you can redownload this specific part of the file. Another related feature is the playback of incomplete downloads without slow index regeneration or a requirement for a smart player with a smart parser.
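To illustrate how per-block checksums let a parser skip damage and report where it happened, here is a rough Python sketch. The block size and the pairing of each block with its CRC32 are assumptions made for the example, not the actual MCF layout.

```python
import zlib

BLOCK_SIZE = 4096  # hypothetical block size, not taken from the MCF spec

def protect(data, block_size=BLOCK_SIZE):
    """Split data into blocks and pair each block with its CRC32."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    return [(zlib.crc32(b), b) for b in blocks]

def find_broken_blocks(protected):
    """Return the indices of blocks whose stored CRC32 no longer matches."""
    return [i for i, (crc, block) in enumerate(protected)
            if zlib.crc32(block) != crc]

def byte_ranges(indices, block_size=BLOCK_SIZE):
    """Translate block indices into (start, end) offsets that could be re-downloaded."""
    return [(i * block_size, (i + 1) * block_size) for i in indices]
```

A player would skip the blocks reported by `find_broken_blocks`, and a download tool could fetch only the ranges from `byte_ranges` again.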
Digital signatures: allow protecting your releases against changes. If anyone changes the content en route, the digital signature can tell which parts have been changed, removed or added after signing it. One movie can also be signed by several different people, signing different parts of it, with different keys. This system uses commonly known public key algorithms, so it should be virtually unbreakable. Now you're going to ask how this is different from just signing your AVI files... In MCF only the content is signed, not the construct it is inside! So, one can remux it (to better suit streaming, maybe), divide it into segments in a different way (or combine all segments), or basically do anything that doesn't change the actual contents, and the signature will stay valid. Or one can add or remove tracks (some languages that aren't needed, for instance), and the digital signature still validates the parts that haven't been removed (but also tells which are missing or added).
No digital rights management (DRM) or copy protection bits. There is not, and there never will be, support for these features that try to limit what users can do with the files they have. If you want to protect your content from being freely distributed, don't give copies of it to people you can't trust in the first place! The technical reasons for this are obvious too - if you can view something and you have specifications of the copy limiting system or sources for the player, you can break any protection in no time. The only systems that could work even for a moment (and probably not much longer) would be closed-source or hardware protection. Both of those also limit who can watch it (i.e. do I need a Fritz chip in my computer, with a complete Fritz-protected hardware/OS/software chain, or maybe Windows with some commercial player). And in the end, someone will still break it, as happened with DVD's CSS. Not to mention that one could just make an analog copy no matter how strong the digital protection is.
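The idea of signing content rather than the container can be sketched in Python as follows. This is only an illustration under stated assumptions: SHA-256 and the per-track granularity are my choices, all names are hypothetical, and the public key step (signing each digest with a private key) is omitted.

```python
import hashlib

def track_digest(frames):
    """Digest of one track's payload, fed frame by frame, ignoring container layout."""
    h = hashlib.sha256()  # assumed hash; the source only says "public key algorithms"
    for frame in frames:
        h.update(len(frame).to_bytes(4, "big"))  # length prefix keeps frame boundaries unambiguous
        h.update(frame)
    return h.hexdigest()

def digest_tracks(tracks):
    """tracks: dict of track name -> list of frame payloads (hypothetical model)."""
    return {name: track_digest(frames) for name, frames in tracks.items()}

def compare(signed, current):
    """Report which signed tracks are still valid, changed or missing, and which are new."""
    return {
        "valid":   sorted(n for n in signed if current.get(n) == signed[n]),
        "changed": sorted(n for n in signed if n in current and current[n] != signed[n]),
        "missing": sorted(n for n in signed if n not in current),
        "added":   sorted(n for n in current if n not in signed),
    }
```

Because the digests depend only on the frame payloads, remuxing or re-segmenting the file leaves them unchanged, while dropping or adding a track shows up under "missing" or "added".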
Some limits of MCF are based on the limits of human perception, which cannot change. Humans can't notice a timing error of 0.5 ms between video and audio, and they don't benefit from a framerate above 1000 FPS either. This lets us take advantage of these limits - the format can be simplified by using one millisecond precision in timecodes, because we can be absolutely sure that it won't cause trouble in the future either (except maybe for writers of some special editing software which needs 100 % accuracy).
File sizes (this also applies to the size of a single stream without any breaks, or to the total size of a movie split over several files) are limited by the 64 bit addressing, which cannot be changed without breaking compatibility. Then how much is this exactly? If we assume that the current exponential growth of 1000 times the amount every 10 years continues, we'll survive nicely for another 30 years before getting close to the hard limit. And as a physicist (studying at HUT) I'd estimate that the size growth is going to slow down and not stay exponential for much longer.
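A quick way to check the 0.5 ms figure is to round exact frame times to the nearest millisecond and look at the worst error. The sketch below does this in Python with exact fractions; the function names and the sample framerate are only for illustration.

```python
from fractions import Fraction

def ms_timecodes(fps, frame_count):
    """Exact frame start times in ms and their nearest-millisecond timecodes."""
    exact = [Fraction(1000 * n) / fps for n in range(frame_count)]
    rounded = [round(t) for t in exact]
    return exact, rounded

def max_rounding_error_ms(fps, frame_count):
    """Largest distance between an exact frame time and its integer timecode."""
    exact, rounded = ms_timecodes(fps, frame_count)
    return max(abs(r - e) for e, r in zip(exact, rounded))

# Example: NTSC-style 30000/1001 FPS, which does not divide evenly into milliseconds.
worst = max_rounding_error_ms(Fraction(30000, 1001), 10000)
```

Each timecode is rounded independently from the exact time, so the error never accumulates and stays at or below half a millisecond.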
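The arithmetic behind that estimate can be sketched in Python. The growth rate is the one assumed in the text; the starting size is not given there, so the sketch instead works out how big a file would have to be today to reach the limit after a given number of years.

```python
LIMIT_OCTETS = 2 ** 64      # what 64 bit addressing can reach
GROWTH_PER_DECADE = 1000    # growth rate assumed in the text

def size_after(start_octets, years):
    """Projected size after `years` of 1000x-per-decade growth."""
    return start_octets * GROWTH_PER_DECADE ** (years / 10)

def largest_start_size(years):
    """Largest size today that still fits within the limit after `years`."""
    return LIMIT_OCTETS / GROWTH_PER_DECADE ** (years / 10)

# Roughly 18.4e9 octets: a file of about 18 gigaoctets today would
# reach the 64 bit limit after 30 years of such growth.
start_30_years = largest_start_size(30)
```

Under the same assumption, a 40 year horizon would leave room for only about 18 megaoctets today, which is why 30 years is where the estimate stops.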
Frame sizes are limited by 32 bit addressing, so you can have single frames up to four gigaoctets in size. No problem there either.
The timecode system allows a single movie to last for almost 35 years, without breaks. This means a 40 bit integer for milliseconds, but actually a combination of 32 bit and 16 bit integers is used, to save space.
As for how many different tracks you can put in a single stream/file, the limit is 65536. You can fit a video stream in one and multi-channel audio in a second one, and you still have 65534 to spare.
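The 35 year figure follows directly from counting milliseconds in 40 bits, as this small Python check shows. How the 32 bit and 16 bit fields are combined is not covered here; the function name and the 365.25 day year are only for illustration.

```python
MS_PER_DAY = 1000 * 60 * 60 * 24
MS_PER_YEAR = MS_PER_DAY * 365.25   # average year length, an assumption

def max_duration_years(bits):
    """Longest duration a millisecond counter of `bits` bits can express."""
    return (2 ** bits - 1) / MS_PER_YEAR

years_40_bit = max_duration_years(40)           # about 34.8 years
days_32_bit = max_duration_years(32) * 365.25   # about 49.7 days
```

A 32 bit millisecond counter alone would wrap in under 50 days, so the extra bits are what stretch the range to decades.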
How about the number of segments you can divide one movie into? The number is 255.
As you can see, none of these numbers is going to limit us in practice. What gives us some real problems is the invention of completely new stuff that just cannot be supported in old containers (without losing compatibility). Because of that I'm saying that the predicted maximum lifetime of this format is 20 years. Of course it can be read after that too, but no new files should be created.
Every MCF file requires around 3 Ko of headers, so you really need to store at least 20 Ko of data per file for it to be even remotely efficient (however, file systems waste much larger amounts of space because of allocation unit slack).
The minimal possible overhead in normal situations is mostly dictated by frame overhead. This is only 7 octets in MCF! OpenDML AVI's overhead is around 40 octets (with full indices and legacy AVI compatibility) and Matroska's is around 10 octets.
When streaming and looking for the lowest possible latency, your smallest possible transfer unit is one frame (with 25 octets of overhead each, if using such minimal size units). The overhead is still only about half that of an OpenDML AVI file, and that includes checksum protection! However, our competitor Matroska can do this with just about 16 octets.
So, I'm pretty much out of ideas... Why don't you figure out some other limits and tell us about them (preferably before we release MCF version 0x01)?
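To put the overhead figures above side by side, here is a small Python sketch. The per-frame numbers are the approximate ones quoted in the text; taking "3 Ko" as 3072 octets and the sample movie (700 000 000 octets in 150 000 frames) are assumptions made for the example.

```python
HEADER_OCTETS = 3 * 1024   # "around 3 Ko"; exact value assumed

# Approximate per-frame overheads quoted in the text, in octets.
FRAME_OVERHEAD = {"MCF": 7, "Matroska": 10, "OpenDML AVI": 40}
STREAM_FRAME_OVERHEAD = {"MCF": 25, "Matroska": 16}

def overhead_fraction(payload_octets, frame_count, per_frame, header=0):
    """Share of the total size taken by container overhead."""
    overhead = header + frame_count * per_frame
    return overhead / (payload_octets + overhead)

# A 20 Ko file with no frames counted: headers alone are about 13 % of the total.
small_file = overhead_fraction(20 * 1024, 0, 0, header=HEADER_OCTETS)

# Hypothetical movie: 700 000 000 octets of payload in 150 000 frames.
movie = {name: overhead_fraction(700_000_000, 150_000, octets)
         for name, octets in FRAME_OVERHEAD.items()}
```

For the hypothetical movie the per-frame overhead stays well under one percent in all three formats, so the differences matter most for streams made of many small frames.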