Author: Emanuel Lorrain (PACKED vzw)
Publication date: March 2014
Hundreds of thousands of hours of audio-visual material are still being held by Flemish cultural heritage institutions and broadcast archives on carriers that are already obsolete or soon will be. From the end of 2013 the Flemish Institute for Archiving (VIAA)1 will operate as a service provider that will organise the digitisation and storage of audio-visual content for owners and caretakers. The digital files produced will ultimately replace old tape-based formats and become the new archiving masters2.
File-based video formats have brought a number of new terms (wrapper, codec, compression, etc.) and facets to video preservation that have to be learned by collection caretakers. Confusion regarding technologies can cause heritage institutions to be reluctant about entrusting their collections and devoting their resources to large-scale digitisation projects. In such a context choosing the destination format and specifications is always a very complex phase due to the lack of a real consensus in the archival world as to which formats and specifications should be used for the long-term preservation of video. This decision is, however, a critical step that will have consequences on the future use and accessibility of the digitised content.
In the framework of the preparation for the VIAA digitisation projects, PACKED vzw has conducted some research, looking at common practices in broadcast organisations and audio-visual archives in order to see what would be the best solution for the digitisation of audio-visual collections in Flanders' cultural heritage institutions. The following document gives an overview of the different elements that should be taken into account when choosing the destination format and related specifications, listing the various options available.
1. Video formats
1.1 Codecs and containers
Video files are composed of different data streams encapsulated in a container or wrapper. Video and audio signals are encoded using codecs. A codec is a piece of hardware or software needed to encode a data stream or signal for transmission, storage or encryption and to decode it for playback or other purposes such as editing. Codec is a 'portmanteau' term constructed from the words coding/decoding. The term 'codec' is commonly used to refer directly to a coding or compression format. Video and audio essences (the bit streams) can be encoded with different codecs, with or without compression.
Some examples of codecs for video are: H264, MPEG2, JPEG2000, IV41, Cinepak and Sorenson.
In order to create a video file readable by computer software, the encoded video and audio streams are wrapped together in a container with a number of other data streams such as descriptive metadata and subtitles. The number, type and variety of data streams that a container can hold are specific to the container format used.
Some examples of containers for video are: AVI, MOV, MP4, WMV and MXF.
1.2 Uncompressed video, lossless and lossy compression
As mentioned, audio and video can be encoded with or without compression. In an uncompressed video file, all the information of the digitised source is captured and encoded without any compression. Uncompressed video leads to very big files that require considerable storage capacity when a large amount of content needs to be digitised. In order to generate smaller file sizes and bit rates, video compression is used to re-encode the original content differently. Compression codecs can be lossless or lossy. When using a lossless codec, a bit-identical copy of the data can be achieved (as in an uncompressed file). When using a lossy codec, the entirety of the data is not maintained. Video compression can be brought about using different methods and algorithms (wavelet, motion compensation, discrete cosine transform or DCT). Compression methods are usually divided into three main categories:
With lossy compression a number of bits are removed in order to reduce the size of the video file. Most of the time, this is done by reducing the amount of colour information. This process implies that a part of the image, and details of its chrominance (the chroma sub-sampling and the colour bit depth) and also sometimes its luminance are lost permanently. MPEG-2/D10, Apple ProRes, DVCPro and H264 are examples of codecs performing lossy compression. The majority of digital cameras capture video natively with lossy compression codecs; lossy compressed formats are commonly used for production and for access (e.g., web, TV and DVD).
Manufacturers sometimes label technically 'lossy' compression schemes as 'visually lossless', because the difference between the compressed video and the original is supposed to be imperceptible to the (common) human eye. Despite its name, 'visually lossless' is a compression method in which a part of the data is permanently discarded. For this reason, 'visually lossless' is also sometimes more accurately defined as 'near-lossless compression'. In the remainder of this document, the term 'lossy' will also be used to refer to 'visually lossless' compression.
'Mathematically lossless' compression is also a method used to reduce the size of a file, but here the data remain exactly the same once they are decoded. In 'real lossless compression', no information is lost. The file size is reduced by representing exactly the same information more concisely, for instance by exploiting statistical redundancy. Lossless compression codecs can't achieve the same compression ratios as lossy (and 'visually lossless') codecs but they result in smaller files than uncompressed video while retaining the entire information. In the remainder of this document 'lossless' will be used to refer to 'mathematically lossless compression'.
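The difference between the lossless and lossy categories can be illustrated with a minimal sketch. It uses Python's general-purpose zlib compressor as a stand-in for a lossless codec and a simple bit-dropping function as a stand-in for a lossy one. Neither is a video codec, and the sample data and function names are assumptions made purely for illustration.

```python
import zlib


def lossless_round_trip(data):
    """Compress and decompress with zlib; returns (compressed, restored)."""
    compressed = zlib.compress(data)
    restored = zlib.decompress(compressed)
    return compressed, restored


def drop_low_bits(data, bits=2):
    """Toy lossy step: zero the lowest `bits` bits of every byte.

    The discarded bits cannot be recovered from the result.
    """
    mask = 0xFF & ~((1 << bits) - 1)
    return bytes(b & mask for b in data)


# Hypothetical stand-in for raw picture data (highly repetitive on purpose).
original = bytes(range(256)) * 64

compressed, restored = lossless_round_trip(original)
lossy = drop_low_bits(original)

print("original bytes:  ", len(original))
print("compressed bytes:", len(compressed))
print("lossless copy is bit-identical:", restored == original)
print("lossy copy is bit-identical:   ", lossy == original)
```

The lossless path returns exactly the bytes that went in, while the lossy path produces data from which the original can no longer be reconstructed.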
1.3 Compression ratios
Data compression ratio is the ratio between the uncompressed size of a file and its compressed size. Different compression algorithms and methods produce different compression ratios. The examples below show the differences in storage space required when lossy, lossless and uncompressed video codecs are used:
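As a rough illustration of the arithmetic involved, the sketch below computes the video-only data rate of uncompressed PAL standard definition material and a compression ratio. The frame size of 720 x 576 at 25 frames per second is an assumption, as are the function names; audio, container overhead and any padding added by a specific pixel format are ignored.

```python
# Average number of samples stored per pixel for common chroma subsampling schemes.
SAMPLES_PER_PIXEL = {"4:4:4": 3.0, "4:2:2": 2.0, "4:2:0": 1.5, "4:1:1": 1.5}


def uncompressed_bit_rate(width, height, fps, bit_depth, subsampling):
    """Video-only bit rate in bits per second, without padding or overhead."""
    return width * height * SAMPLES_PER_PIXEL[subsampling] * bit_depth * fps


def gigabytes_per_hour(bit_rate):
    """Convert bits per second to decimal gigabytes per hour."""
    return bit_rate / 8 * 3600 / 1e9


def compression_ratio(uncompressed_size, compressed_size):
    """Ratio between the uncompressed size and the compressed size."""
    return uncompressed_size / compressed_size


# Assumed example: PAL SD, 10-bit, 4:2:2.
rate = uncompressed_bit_rate(720, 576, 25, 10, "4:2:2")
print(f"{rate / 1e6:.2f} Mbit/s")
print(f"{gigabytes_per_hour(rate):.1f} GB per hour")
print(compression_ratio(100, 50))  # a file halved in size has a ratio of 2.0
```

A ratio of 2.0 (often written 2:1) means the compressed file takes half the space of the uncompressed one.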
2. Choosing a format for long-term preservation
2.1 Differences between broadcast and heritage archives
Broadcast and cultural heritage sectors often have different views on how audio-visual material should be preserved. While both are entitled to preserve and make audio-visual heritage accessible, they deal with different types and quantities of audio-visual material. This leads to different requirements, views and approaches on what preservation means and on how to do it. Within the context of VIAA, the material to be digitised comes from a wide range of mutually different institutions, with approximately seventy per cent originating from the broadcast sector (public, commercial and local televisions) and the rest from different heritage institutions (museums, cultural archives and heritage libraries).
As broadcast archives store large quantities of material, they often require speed, efficiency and a format that can satisfy their technical tool chain and workflow. Their use cases are clearly defined, i.e., the necessity to re-use material they have produced themselves in the past for their own broadcast activities or to make it available for others. Typically, the message conveyed by the content predominates over the quality of the image. On the other hand, cultural heritage organisations consider themselves as custodians rather than as owners of the audio-visual heritage. In most cases, they didn't produce the material they preserve, which makes them accountable to the donors and means they have a duty to preserve it in the best possible manner. While access is also a crucial aspect for heritage institutions, their approach is determined by conservation principles such as authenticity, integrity and long-term sustainability over short-term efficiency.
In its definition of a museum, UNESCO says: “Today they are non-profit-making, permanent institutions in the service of society and its development, and open to the public, which acquire, conserve, research, communicate and exhibit, for purposes of study, education and enjoyment, material evidence of people and their environment. […] A museum’s primary purpose is to safeguard and preserve the heritage as a whole.” What is said here about museums applies to cultural archives and heritage libraries as well. While they are also interested in providing access to audio-visual material efficiently, they also have an institutional mandate to preserve it.
The archiving community has defined sets of principles to evaluate the sustainability factors of file formats for long-term archiving. One example of these evaluation tools is the list created by the Library of Congress for its own collection3. The criteria considered by PACKED vzw during its research in the framework of VIAA are largely based on this list and on similar lists developed by, for example, the InterPares 2 project4 or the United Kingdom's National Archives5.
While the criteria are clear, a format combining all the listed requirements such as openness (important for cultural heritage institutions) and efficient handling (critical for the broadcast sector), is not yet apparent. The result of this is that the choice of the archiving format is always a compromise, whereby different types of institutions do not necessarily prioritise the same criteria. Unlike in the realm of audio digitisation for which LPCM and Broadcast Wave (BWAV) are widely considered the de-facto standards for long-term preservation, no consensus has been found amongst archivists for video content. However, digitising all remaining analogue material can't wait for the ideal video format to emerge while obsolete videotapes and playback equipment are slowly degrading on shelves. Different archives have already chosen different codecs, containers and specifications for their file-based archiving masters. Just as for videotape before, no formats, containers or codecs are expected to last forever. The archiving masters will most likely have to be migrated to another format at some point. From a cultural heritage point of view, it should be possible to migrate and transcode a file in the future without loss of information and quality.
3. Risks of lossy compression
3.1 Lossy compression threatens the quality of the content
Lossless compression can't achieve compression ratios equal to lossy compression. This is why lossy codecs have been chosen by some digitisation projects when storage costs were a main concern. However, while lossy compression results in smaller image files, the information loss can also result in severe digital artefacts, especially visible with high compression ratios and on certain types of material. Technically speaking, a compression artefact is a particular type of data error. These artefacts appear because the amount of data discarded from the original was too large. A compression algorithm such as the ones used by MPEG formats cannot always make the distinction between small variations and distortions that will be visible to the naked eye. This results in visual errors such as blurring, blockiness, shimmering and colour aberration.6
3.2 Lossy compression can threaten future uses of the content
Moreover, when specific work such as colour correction or image restoration has to be carried out, the absence of the discarded data can be a big problem. Even if the visually "lossless" compression algorithms are considered good enough for today's uses (e.g., TV and Web broadcast), they might not preserve sufficient information for future applications that we can't yet anticipate. Since the recording of moving images was made possible, display standards and technologies have kept evolving. An acceptable compressed image today might look terrible on tomorrow's end users' devices and screens. Choosing a lossy compression codec for long-term preservation is a risk: by decreasing image quality one necessarily throws away some of the potential of the digitised material.
3.3 Lossy compression increases generation loss
Generation loss refers to the loss of quality between copies. This can happen when a tape is copied to another tape or when a file is transcoded to a different format. During the conversion from an analogue to a digital format a certain amount of loss inevitably occurs and even with uncompressed digitisation the digital file is not an exact copy of the original analogue source. Even digital tape formats such as Digital Betacam – which replaced previous analogue formats like Betacam SP as the standard in audio-visual archives for long-term preservation of video – didn't make it technically possible to avoid generation loss during migration from one tape to another. However, the best practice in audio-visual archives has always been to avoid migrating the content to a poorer medium or format. At the time, a copy to Digital Betacam was the closest to a lossless copy that one could get. Today, lossless codecs make it possible to keep the entire information while using smaller storage capacity than uncompressed video. The merits of the already established good practice with tape formats should be maintained in tape-less video archiving.
Digital formats, just like tapes, are not expected to last forever. Digital files created today will also become obsolete and will need to be transcoded to another format at some point in the future. Choosing to digitise heritage material with lossy compression means that one decides to delete some information from the original. Once lossy compression has been applied, there is no way back, as the deleted data is lost forever. By contrast, a lossless encoding will ensure that the entirety of the information is there for the next migration. With lossy compressed formats the image quality will decrease during every migration/transcoding procedure. If problems appear in the future, the only option is to digitise the tapes again.
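The accumulation of loss over successive migrations can be illustrated with a toy model. In the sketch below each lossy "generation" simply rounds sample values to a coarser grid, with a different grid per generation to mimic transcoding between different lossy formats. This is an assumption made for illustration only; real codecs are far more complex, and the step sizes and function names are hypothetical.

```python
def requantise(samples, step):
    """Toy lossy encode/decode: round each sample to the nearest multiple of `step`."""
    return [((s + step // 2) // step) * step for s in samples]


def mean_abs_error(reference, samples):
    """Average absolute difference between the reference and the current samples."""
    return sum(abs(r - s) for r, s in zip(reference, samples)) / len(reference)


def generation_errors(samples, steps):
    """Apply one generation per step size and return the error after each one."""
    current = list(samples)
    errors = []
    for step in steps:
        current = requantise(current, step)
        errors.append(mean_abs_error(samples, current))
    return errors


original = list(range(1024))  # every 10-bit code value once

# Two lossy migrations with different grids versus two lossless ones (step 1).
lossy_errors = generation_errors(original, [4, 6])
lossless_errors = generation_errors(original, [1, 1])

print("lossy generations:   ", lossy_errors)
print("lossless generations:", lossless_errors)
```

In this model the error relative to the original grows with the second lossy generation, whereas the lossless chain stays at zero however many times it is repeated.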
3.4 Should different tape formats be digitised differently?
3.4.1 Quality requirements
When choosing a format to digitise video, the question of whether quality requirements should differ based on the original source material often comes up. Choosing a compressed format or a lower bit depth to digitise certain formats is often considered because of the inherent lower quality of analogue formats such as VHS, VCR, 1/2" EIAJ or even U-matic tapes compared to broadcast standards like Digital Betacam. Because these analogue formats have a smaller number of resolution lines it may seem logical to use a lower bit depth or chroma subsampling for their capture. In practice this means that the digitisation settings would use, for instance, an 8-bit depth instead of 10-bit depth or a 4:2:1 instead of a 4:2:2 chroma subsampling.
The process of defining the levels into which analogue variables are separated in order to convert them into digital data is called sampling. In the case of images, pixel resolution defines the unit of area, and bit depth defines the unit of luminance. In analogue video the range between white and black is expressed in IRE7 and is fixed between 0 and 100 IRE for PAL. All properly recorded video will contain video levels between 0 and 100 IRE. At one end of the range it is black and at the other end it is white. The higher the bit depth used to digitise video, the finer the digital sample depth, leading to a continuous, smooth transition from black to white8. For maximal quality retention of the original source a 10-bit digital sample is required. This is true for any videotape format, even U-matic, Hi8 or VHS. In fact, retaining the maximum chrominance and luminance information from these formats might be even more important than for high-quality standards such as Betacam SP or 2" Quad tapes. The same goes for an already poor analogue source, for which any type of compression will only make the low quality of the image worse.
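To make the effect of bit depth concrete, the following sketch maps the 0 to 100 IRE range onto digital code values at 8 and 10 bits. It assumes the nominal luma range of ITU-R BT.601 style coding (16 to 235 at 8 bits, 64 to 940 at 10 bits), which is not specified in this text; the function names are hypothetical.

```python
def nominal_range(bits):
    """Black and white code values, assuming BT.601-style narrow-range luma."""
    shift = bits - 8
    return 16 << shift, 235 << shift


def ire_to_code(ire, bits):
    """Map a level between 0 and 100 IRE to the nearest digital code value."""
    black, white = nominal_range(bits)
    return round(black + (ire / 100) * (white - black))


def ire_per_step(bits):
    """Size of one quantisation step, expressed in IRE."""
    black, white = nominal_range(bits)
    return 100 / (white - black)


for bits in (8, 10):
    black, white = nominal_range(bits)
    print(f"{bits}-bit: black={black}, white={white}, "
          f"steps={white - black}, {ire_per_step(bits):.3f} IRE per step")
```

Under these assumptions a 10-bit sample divides the range between black and white into four times as many steps as an 8-bit sample, which is what allows the smoother transition described above.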
3.4.2 Keeping the native encodings
Storing digitised video material in one single format makes it easier to manage a file-based archive than storing in different formats. As an example, monitoring the levels of obsolescence of formats and managing future transcoding procedures is more complicated if several formats are used. However, for some digital tape-based formats like DV or HDCAM, keeping the original encoding of the signal without further transcoding is possible. Other tape-based digital formats such as Digital Betacam don't allow one to keep the original encoding9 and should be digitised using the same codec as for analogue tapes.
4. Best practices and available options
4.1 Uncompressed video formats
4.1.1 Uncompressed MXF files
The BBC is the only broadcast archive of such importance known to have chosen uncompressed video for its digital preservation master files of a part of its collection. To digitise material which is still kept on physical carriers (mainly D3 and Digital Betacam tapes), it uses the Ingex Archive system developed by its own R&D department. Files produced by this system consist of an 8-bit uncompressed YUYV or 10-bit uncompressed v210 essence wrapped in an MXF container.
4.1.2 Uncompressed AVI and Quicktime files
Apart from the BBC, uncompressed video is almost only used by small or middle-sized collections holding very valuable works. As an example, institutions with media art collections such as LIMA10 in the Netherlands, the ZKM11 in Germany and others started to digitise their analogue videotape works several years ago already, using uncompressed video. AJA and Blackmagic Design are the most common hardware brands for capture cards used by these collections to encode video with the v210 codec in combination with an AVI or Quicktime container (MOV). They chose this combination because they absolutely wanted to avoid (lossy) compression, and it was a good and affordable option at that time. The AVI and MOV containers and the v210 codec are all very well supported by the majority of current media players and editing software (e.g., Final Cut Pro). While both AVI and MOV are proprietary formats, their specifications are made available by the manufacturers, and they're implemented in a wide variety of tools available under an open license (e.g., FFmpeg).
4.2 Lossless video codecs
For institutions and archives that can't afford to store uncompressed video, but want to keep the maximum quality from the original source, lossless compression is the only other solution. There are a number of different codecs which encode video with real mathematical lossless compression:
When removing the proprietary codecs from this list, only a few are left. Several of the remaining open source ones are still in a completely or partially experimental state and only very small communities maintain them. This is of course a threat to their long-term availability, and the low support in software tools also makes them hard to use for non-technicians or institutions without in-house developers. This basically leaves heritage institutions that want to use a lossless codec with only two options: JPEG2000 and FFV1.
JPEG2000 is an image codec and a suite of ISO/IEC standards developed by the Joint Photographic Experts Group. JPEG2000 can be used to compress images in either lossless or lossy modes and is also used to encode audio-visual content from video capture or film scanning.12 JPEG2000 encodes video material frame by frame and does not use inter-frame coding techniques13. In its lossless mode, JPEG2000 has been chosen by a number of big institutions to archive their audio-visual material in combination with an MXF container. Different codec libraries can be used to encode and decode JPEG2000 files. OpenJPEG and Kakadu are two examples of JPEG2000 implementations used by open source and proprietary software programmes. JPEG2000 supports several resolutions, sample bit depths and chroma subsampling schemes. However, unlike video codecs like DV or FFV1, JPEG2000 relies on its container (among others MXF, QuickTime and Motion JPEG 2000) for some of its technical metadata (e.g., colour space).
'FF video codec 1', known as FFV1, is the most promising open source video codec for long-term preservation. This 'mathematically lossless'-only codec is included in the libavcodec library as part of the FFmpeg project14. Version '3' of the codec was developed with the input of archivists in order to address the specific requirements of the heritage sector. It is being successfully used by the Austrian Mediathek15, the Vancouver City Archive16 and more recently by the MUMOK17 in Vienna for their long-term preservation files. FFV1 has a compression ratio similar to JPEG2000 and decreases the amount of storage space needed to almost thirty percent compared to uncompressed video. The Austrian Mediathek has been using it successfully for three years already and has been able to use it with all current colour spaces like YUV, YV12 and RGB, including different subsampling (4:4:4, 4:2:2, etc.) with PAL SD material in 4:3 and 16:9 aspect ratios as well as with HD content in 1920 x 1080 resolution.
The open source tools used by the aforementioned archives, Archivematica18 and DVA-Profession19, both support this codec, and while the adoption of FFV1 remains modest in archives, manufacturers such as NOA Audio Solutions20 are starting to incorporate the codec in their products. FFV1 is being increasingly discussed on forums (e.g., AMIA) and in expert groups (e.g., Presto4u), and presented in articles (e.g., AV Insider) and at conferences (e.g., IASA 2013) as an alternative to JPEG2000 for mathematically lossless encoding of video. Some media art collections are currently running tests to evaluate if FFV1 could offer a good alternative to uncompressed video files. When used in combination with the Matroska open source container it has the benefit of creating fully open source files. If the adoption of FFV1 increases, it might become the preferred choice of institutions wanting to digitise video without any information loss.
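As an illustration of what an FFV1 and Matroska transcode might look like with FFmpeg, the sketch below assembles a command line without running it. The options shown (-c:v ffv1, -level 3 for version 3 of the codec, -g 1 for intra-frame-only coding, -slices and -slicecrc for per-slice checksums, -c:a copy to keep the audio untouched) are existing FFmpeg options, but the particular combination and values are assumptions for illustration and not the settings of any archive named in this text. The file names and the function name are hypothetical.

```python
import shlex


def ffv1_mkv_command(source, target, slices=16):
    """Build (but do not run) an FFmpeg command for FFV1 version 3 in Matroska."""
    return [
        "ffmpeg",
        "-i", source,            # input file, e.g. an uncompressed capture
        "-c:v", "ffv1",          # encode video with FFV1
        "-level", "3",           # use FFV1 version 3
        "-g", "1",               # every frame is coded independently
        "-slices", str(slices),  # split each frame into slices
        "-slicecrc", "1",        # store a CRC per slice for error detection
        "-c:a", "copy",          # copy the audio stream without re-encoding
        target,                  # the .mkv extension selects the Matroska muxer
    ]


if __name__ == "__main__":
    # Hypothetical file names; print the command for review instead of executing it.
    print(shlex.join(ffv1_mkv_command("capture.mov", "master.mkv")))
```

Printing the command rather than executing it keeps the sketch independent of any local FFmpeg installation; an institution would need to test such settings against its own material and tools before adopting them.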
AVI stands for Audio Video Interleave. It is a video container format introduced by Microsoft in November 1992 as part of its 'Video for Windows' multimedia framework. AVI is a simple container with a limited set of features. For instance, AVI does not provide a standardised way to encode the information related to the aspect ratio of a video essence. This means that when a file is played in VLC or Quicktime players for instance, the right display aspect ratio is not selected automatically. AVI relies on the codec to express the display aspect ratio. Some formats like DV, FFV1 or MPEG2 can do this; uncompressed video and some other codecs can't. AVI is used by the Austrian Mediathek to wrap FFV1 video essence and by several media art collections to store uncompressed video. It is a proprietary container format, but as said earlier, even if its legal situation is unclear, Microsoft makes its documentation freely available.
Quicktime is a proprietary multimedia container format developed by Apple Computer. The format specifies a container file that contains one or more streams, each of which stores a particular type of data, e.g., audio, video and text (e.g., for subtitles). Quicktime files can use two different extensions: .mov or .qt. Just like AVI, the Quicktime format is a proprietary container but its documentation is made available by Apple Computer. Despite being proprietary, it is widely used and supported by the vast majority of tools on the market. MOV is used by several collections to store uncompressed video or to create access files with lossy codecs such as Apple ProRes or H264.
MXF21 is a container format used to wrap a number of different audio and video streams, subtitles and descriptive metadata. MXF is theoretically a codec-agnostic container and as seen earlier it can be used to wrap video in different encodings such as uncompressed, MPEG-2 or JPEG2000 in lossy and lossless modes.22 The specifications of its profile for use with video are still very much linked to the hardware and software used to ingest and create the video files. In recent years archivists and digitisation labs have reported a number of interoperability problems with MXF/JPEG2000 files created with different encoders. While it is a SMPTE standard, video files using MXF remain manufacturer dependent, which has led to different types of MXF resulting in compatibility issues between playback software.
The high flexibility of MXF and the lack of a standard profile for preservation make it a complex container to handle. Although the use of MXF is widespread, this does not exclude the risk of interoperability problems. While it is technically an open standard, a number of aspects of MXF are only available in documents for which a fee has to be paid. This is one of the reasons why MXF is not as widespread as AVI or MOV in open source tools. The Advanced Media Workflow Association (AMWA)23 is a group that brings together mainly broadcasters and manufacturers, but also some large American cultural heritage institutions such as the Library of Congress and the National Archives and Records Administration. It is working on solving these issues by specifying a number of MXF profiles for specific applications. The AS-07 profile in development now is one of them and is intended to be manufacturer independent and specifically designed for long-term preservation requirements. Beyond better handling of lossless JPEG2000, the AS-07 profile should include, amongst others, multiple timecodes from different sources24 as well as captions and subtitles. However, this profile is still a work in progress, with no fixed time frame as to when its final version will be ready and implemented by manufacturers.
The Matroska Multimedia Container is a free open source format that can hold an unlimited number of video, audio, picture, or subtitle tracks in one file. Matroska is similar to other containers but is entirely open in specification, with implementations consisting mostly of open source software. It is intended to serve as a universal format for storing any type of multimedia content like video. On the Matroska website it is announced that the container "can support all known audio and video compression formats by design. To make sure it will also be capable of coping with the future standards it is based on a very flexible underlying framework called EBML, allowing for more functionalities to be added to the container format without breaking backwards compatibility with older softwares and files.25" Matroska uses the .mkv extension and is known to the wider public as a container used to wrap content extracted from DVDs with open source software such as Handbrake26. The Vancouver City Archives is using Matroska as a preservation format to store FFV1 essence together with audio essence and metadata. A free, open source command line validator tool for Matroska is available27 and the container is supported by Archivematica.
4.4 Codecs and containers evaluation
[Evaluation table not recovered. Surviving row labels: Quicktime (MOV, .QT); D10 & AVC/H264.]
5. What is the best digital format for video preservation?
As seen throughout this text, there are only two options for one to be sure that the best quality is captured digitally from the original video source: an uncompressed or a lossless codec. As seen also, uncompressed video requires a lot of storage capacity and the capability to handle large amounts of data efficiently. For the same amount of storage needed for one uncompressed copy, lossless encoding with FFV1 or lossless JPEG2000 allows one to store around one and a half to two copies. Although lossless compression results in smaller files, it may require more processing power because of the encoding and decoding algorithms involved. In a digitisation project, financial means and storage capacity are often considered the crux of decision-making. Lossy compressed formats such as MPEG-2/D10 or lossy JPEG2000 allow an even greater reduction in storage space, but represent a big risk for the (future) quality of archival material.
FFV1 is the only 'real' open source codec that could be used by an audio-visual archive such as VIAA, but, as we have mentioned, FFV1 is still in the process of gaining acceptance within the archival community and is unknown to a large part of it. Not many archives are keen on being pioneers in using long-term preservation formats. When the combination of FFV1 and Matroska becomes more widely used and is further supported by software tools and manufacturers, it might be the best option for collections that can't rely on specific proprietary hardware and software to use their files. However, its limited adoption compared to JPEG2000 and MPEG-2/D10 makes it difficult to convince heritage institutions and broadcast archives to choose it as their long-term preservation format.
VIAA will provide transcoding services for the creation of derivatives for production and access for all the institutions involved. For small institutions that cannot afford to have the necessary software to edit material in MXF/JPEG2000 or to transcode it, it is crucial29. The Institut National de l'Audiovisuel, the Library of Congress, the Library and Archives of Canada, and the National Archives of Australia are amongst the biggest audio-visual archives in the world digitising their material with a lossless JPEG2000 codec wrapped in MXF. This adoption amongst big institutions constitutes a community with an amount of material and resources important enough to think that there will always be a solution to migrate and access these archives in the future. While there are still disparities in the MXF profiles they use, the work being done on developing the AS-07 specifications to make a common MXF profile for audio-visual archives is encouraging given the number of large institutions and manufacturers involved in it. However, this profile is still a work in progress that started back in 200930, and there is still no fixed time frame as to when it will be ready.
6. Financial considerations
To prepare the digital archive infrastructure properly, costs for (amongst others) storage, transcoding hardware and network requirements need to be evaluated before the digitisation process has even started. Uncompressed and lossless compressed formats generate bigger files than lossy compression and therefore require more storage space and processing power. Costs will accumulate throughout the years, but storage and processing capacity also become cheaper every year according to Moore's law. For a large digitisation project, costs will probably decrease before the digitisation is completed. From an archival and conservation point of view, storage costs shouldn't be a decisive criterion, and not one to favour over quality aspects and long-term sustainability. In the future, there is a big chance that the additional cost for storing lossless compressed files will become a lesser concern than the quality of the digitised content. Use cases that we can't even imagine today might require a very high-quality video file. Lossy compression is a risky path that could lead to serious quality problems and the safest decision is to avoid any type of information loss. If the digitisation process has to be redone because the quality standards chosen were too low, the costs would be far greater than the supplementary investment needed to store lossless video files today. More importantly, there is a big chance that renewed digitisation would no longer be possible because of the advanced deterioration of the carriers, the obsolescence of the necessary playback equipment and the lack of skilled operators.
The VIAA digitisation projects are a unique opportunity for Flemish heritage institutions to digitise the audio-visual material they still hold on obsolete formats. These obsolete master tapes should only be digitised once, and this should therefore be done with the best possible quality. As said earlier, when the necessary playback equipment is no longer available and the ageing of the carriers has worsened, digitising the tapes again will be more difficult, more expensive, achieve lesser quality and in some cases simply be impossible. An ideal video file format combining all the criteria for long-term preservation may not yet exist; however, several initiatives encourage us to think that one will soon emerge. Uncertainty as to which formats will or will not become the future standard makes it difficult to commit to one codec and one container. However, digitisation needs to take place now and it is not possible to wait for the perfect format to appear. Choosing a format should therefore be a trade-off in which the best quality requirements and long-term sustainability are ensured.
Thanks to Sue Bigelow (Vancouver City Archive), Carl Fleischhauer (Library of Congress), Hermann Lewetz (Austrian Mediathek) and Dave Rice (CUNY TV), for sharing their experience through emails and for providing some of the information used in this text. Thanks also to Peter Bubestinger (Austrian Mediathek) for his constructive and technical feedback.