Saving cost and creating efficiencies in audio broadcasts with ingest pre-processing

Sunday, July 08, 2012

Processing audio (and video) content and automatically making adjustments or correcting errors before capture by a video server saves time and labour through reduced secondary processing or re-ingest. It also reduces the amount of computer hardware required, which cuts power consumption and heat generation.
Today, in both video and audio broadcast, server-based production and play-out combined with file-based workflows dominate. The majority of programme and interstitial material broadcast worldwide is played from video server systems. However, a great deal of media (audio as well as video) that is delivered to broadcasters or resides within archives is in the form of video tapes, and this material needs to be transferred (ingested) to video servers prior to transmission.

Early designs of video servers were simplistic, and today we see a proliferation of new low-cost video servers where functionality and system sophistication are restricted in order to limit cost. In particular, the new generation of Channel-in-a-box systems is limited in scope and sophistication. One of the key areas where this is apparent is media ingest, the process by which material is transferred from tape into media files.

In these servers, the ingest process is commonly fully automated and the material is reviewed after ingest by playing the file back from the server. As a result, there is no checking of levels or signal format before the file is created. Any errors subsequently need to be corrected by reprocessing the file-based media or by re-ingesting the original with corrections made manually.

This second stage of processing adds to the ingest workflow time and to the amount of computer processing power required. A further consequence of processing file-based media is that another decode and recode cycle may be necessary to allow the media to be processed by linear devices, which may reduce audio and video quality.

So, whilst one of the key drivers for these entry-level server systems is reduced investment cost, once their operating cost is added the total cost can easily be much higher than anticipated. This is just as big a challenge in audio broadcasts as it is in video. However, Axon has shown that intelligent implementation of signal pre-processing ahead of media ingest can resolve many of these quality control issues.

To do this, the company has developed a range of signal processing modules, in the form of standard rack-mounted plug-in cards, that perform a broad range of audio processing tasks. Axon’s modular products can automatically correct signal formatting, as well as level and alignment errors, in the baseband domain whilst the ingest process is taking place.

Examples of these corrections include:
  • audio loudness
  • surround sound audio requiring conversion from Dolby-E to discrete channels (or discrete to Dolby-E)
  • Dolby-E framing to match the video frame boundaries

Axon’s Synapse cards consume less than 90W, a 75% saving when compared to an industry-standard IT server of the kind that would be required to reprocess the file-based media. For example, the Synapse GDL/HDL200 is a dual-standard legaliser for digital signals, automatically detecting HD or SD input and utilising the correct colour space standard. It legalises 3Gb/s, HD and SD SDI streams in the RGB domain within the 709 colour space (601 for SD).

It also has a preview output with a highlight function that indicates the areas that are being processed. This card is also equipped with the Quad Speed Audio bus, which makes it possible to build a complete video and audio legalisation platform by adding a card from the DLA series such as the DLA44.
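The idea of legalising in the RGB domain can be illustrated with a minimal sketch. It is not Axon's implementation: it assumes normalised component values, a simple hard clip, and the standard BT.709 luma coefficients, and the function names are hypothetical. The returned flag plays the role of the highlight on the preview output, marking pixels that were processed.

```python
# Illustrative sketch only: hard-clip legalisation of one pixel in the RGB domain.
# Assumes normalised values: Y in [0, 1], Cb and Cr in [-0.5, 0.5].
KR, KB = 0.2126, 0.0722          # BT.709 luma coefficients
KG = 1.0 - KR - KB


def ycbcr_to_rgb(y, cb, cr):
    """Convert normalised BT.709 Y, Cb, Cr to R, G, B."""
    r = y + 2.0 * (1.0 - KR) * cr
    b = y + 2.0 * (1.0 - KB) * cb
    g = (y - KR * r - KB * b) / KG
    return r, g, b


def rgb_to_ycbcr(r, g, b):
    """Convert R, G, B back to normalised BT.709 Y, Cb, Cr."""
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2.0 * (1.0 - KB))
    cr = (r - y) / (2.0 * (1.0 - KR))
    return y, cb, cr


def legalise_pixel(y, cb, cr, lo=0.0, hi=1.0):
    """Clip the pixel to [lo, hi] in RGB.

    Returns ((Y, Cb, Cr), changed), where changed is True if any
    RGB component had to be clipped.
    """
    rgb = ycbcr_to_rgb(y, cb, cr)
    clipped = tuple(min(max(c, lo), hi) for c in rgb)
    changed = clipped != rgb
    return rgb_to_ycbcr(*clipped), changed
```

A pixel such as Y = 0.9, Cr = 0.4 is a valid component-domain value but decodes to a red level above 1.0, which is why the check is made in RGB rather than on Y, Cb and Cr individually. A production legaliser would typically use soft clipping rather than the hard clip assumed here.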

All cards in the DLA series use tried-and-tested industry algorithms to adjust incoming audio levels and produce consistent audio loudness across all ingested material, protecting viewers from loudness shifts and the “loud commercial” issue. The DLA42 can process four PCM audio pairs simultaneously, and the DLA43/44 can be configured to process three PCM pairs grouped for 5.1 surround sound and a fourth PCM pair used for a stereo mix.
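The arithmetic behind a loudness correction can be sketched as follows. This is a simplified static-gain illustration, not the algorithm used in the DLA cards: it assumes the programme loudness has already been measured (for example with an ITU-R BS.1770 meter, which is not implemented here) and assumes a target of -23 LUFS, the EBU R 128 value. A station's own specification may differ.

```python
# Illustrative sketch only: static gain needed to bring measured programme
# loudness to a target. The -23 LUFS default is an assumption (EBU R 128).

def loudness_gain_db(measured_lufs, target_lufs=-23.0):
    """Gain in dB that moves the measured loudness onto the target."""
    return target_lufs - measured_lufs


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples, gain_db):
    """Scale a list of PCM sample values by the given gain."""
    g = db_to_linear(gain_db)
    return [s * g for s in samples]
```

For material measured at -17 LUFS, `loudness_gain_db(-17.0)` gives -6 dB, which is a linear multiplier of roughly 0.5. Real loudness processors adapt the gain over time rather than applying one fixed value.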

Surround Sound Processing
Cards within Axon’s DLA series can additionally (depending on the exact card type) switch from 5.1 to stereo or automatically provide an up-mix of a stereo source to Dolby 5.1, thereby ensuring that all ingested audio meets the station’s audio specification automatically.

Axon offers a range of other Dolby processing cards designed to ensure that the ingested file has the correct configuration of audio tracks. An example is the GEP/HEP100, which automatically detects the presence of Dolby-E and decodes it to three pairs of PCM audio for loudness processing by the DLA series card or, if no Dolby-E is present, allows the original PCM audio to pass through to the processor.
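The detect-then-route behaviour described above can be sketched in software terms. This is not how the GEP/HEP100 is implemented. It assumes the AES pair has been unpacked into a sequence of 16-bit words and that Dolby-E is carried as SMPTE 337 data bursts in 16-bit mode, with preamble sync words 0xF872 and 0x4E1F and data type 28 for Dolby-E. The function names are hypothetical, and the 20-bit and 24-bit modes are not handled.

```python
# Illustrative sketch only: look for a SMPTE 337 burst preamble carrying
# Dolby-E in a stream of 16-bit words, then choose a processing route.
SYNC_PA_16 = 0xF872        # first preamble sync word (16-bit mode, assumed)
SYNC_PB_16 = 0x4E1F        # second preamble sync word (16-bit mode, assumed)
DATA_TYPE_DOLBY_E = 28     # data_type carried in the low 5 bits of the burst-info word


def find_dolby_e(words):
    """Return True if a Dolby-E burst preamble is found in the word stream."""
    for i in range(len(words) - 2):
        if words[i] == SYNC_PA_16 and words[i + 1] == SYNC_PB_16:
            if words[i + 2] & 0x1F == DATA_TYPE_DOLBY_E:
                return True
    return False


def route_audio(words):
    """Decode when Dolby-E is present, otherwise pass the PCM through."""
    return "decode_dolby_e" if find_dolby_e(words) else "pass_through_pcm"
```

Ordinary PCM could in principle contain the two sync values by coincidence, so a robust detector would also check that bursts repeat at the expected frame interval.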

If the station’s standard is to ingest 5.1 surround sound as a Dolby-E data-stream on a single AES pair alongside a PCM stereo mix on another AES pair, the DBE08 Dolby-E encoder card can be used in conjunction with a GEB/HEB900 audio embedder. The embedder also provides the video delay necessary to retain lip-sync following the Dolby-E encoding process.

An integral element of Dolby-E is the correct generation and processing of metadata, such as DialNorm. At each stage in the process, metadata is extracted by Axon Synapse cards, modified as necessary and passed to the next processing stage. It is important that any Dolby-E encoded audio is correctly aligned with the video into which it is being embedded. An HES20 card automatically corrects any errors in the timing relationship between video and Dolby-E audio packets so that the Dolby-E guard-band coincides with the video frame boundary, which is essential to prevent audible clicks during video transitions.

System Configuration
The diagram below shows a system configured with all of the elements mentioned previously. The core of the system is the GDL/HDL200 video legaliser and the DLA44 audio loudness control. Other cards are then added to these, depending on the exact ingest specifications and the range of video standards in which the media is supplied.

The system shown would be capable of accepting video in any worldwide standard format and converting it into the station’s standard, then detecting whether Dolby-E was present and decoding it to 5.1 surround sound carried on three PCM audio pairs. The core of the system would legalise video levels and correct for audio loudness variations. The resultant audio would be encoded to Dolby-E with the correct metadata, and the alignment of the Dolby-E data packets would be verified and corrected if necessary.

With all options installed, an additional video delay of eight frames would be introduced into the signal path. Most ingest systems have a configuration setting that will account for this and maintain the frame accuracy of the ingested file.
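The effect of that fixed delay can be worked through with a small sketch. The function names are hypothetical and the 25 fps, non-drop-frame timecode is an assumption chosen for the example. The point is only that the ingest system has to start recording eight frames after the tape's nominal in-point for the file to stay frame-accurate.

```python
# Illustrative sketch only: compensating for a fixed processing delay.
# Assumes integer frame rates and non-drop-frame timecode.
PIPELINE_DELAY_FRAMES = 8   # delay quoted for a fully optioned system


def delay_ms(frames, fps):
    """Duration of a frame count in milliseconds."""
    return 1000.0 * frames / fps


def compensated_record_start(vtr_in_frame, delay_frames=PIPELINE_DELAY_FRAMES):
    """Frame at which the server should start recording."""
    return vtr_in_frame + delay_frames


def frames_to_timecode(frame, fps=25):
    """Format an absolute frame count as HH:MM:SS:FF."""
    ff = frame % fps
    total_s = frame // fps
    return "%02d:%02d:%02d:%02d" % (
        total_s // 3600, (total_s // 60) % 60, total_s % 60, ff)
```

At 25 fps, eight frames correspond to 320 ms, so a tape in-point of 01:00:00:00 would reach the server input at 01:00:00:08.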

Through the use of the Quad-Speed Multiplex Audio Bus and the Synapse Bus within the Synapse frame, the majority of the audio connections, which would traditionally require external cables, are handled internally. The system shown above would require only seven short BNC-to-BNC cables (not including those from the VTR and to the ingest system) and a Dolby-E metadata cable.

Synapse External Control and Monitoring
A number of Synapse cards can store multiple configurations in on-board presets, which can be recalled either by GPI or via Ethernet commands. By employing this functionality, the configuration of the pre-processing system could be dynamically controlled by the Ingest/MAM system. This could be used, for instance, to change the target video format or frame rate, or to select whether the surround sound audio is to be encoded using Dolby-E. There is also an option for Cortex to add a customised user interface, which could give the ingest operator a tailor-made screen showing all of the essential information about signal status together with only those controls that are thought necessary.
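How an Ingest/MAM system might choose a preset for each job can be sketched as a simple lookup. Everything here is hypothetical: the job fields, the format names and the preset numbers are invented for illustration, and the sketch deliberately stops short of sending anything to a card, because the actual GPI wiring and Ethernet command set are not described in this article.

```python
# Illustrative sketch only: mapping an ingest job to a stored preset number.
# Preset numbers and format names are hypothetical examples.
from dataclasses import dataclass


@dataclass(frozen=True)
class IngestJob:
    target_format: str      # e.g. "1080i50" (hypothetical label)
    dolby_e_output: bool    # True if surround is to be Dolby-E encoded


PRESETS = {
    ("1080i50", True): 1,
    ("1080i50", False): 2,
    ("625i50", True): 3,
    ("625i50", False): 4,
}


def select_preset(job):
    """Return the preset number to recall for this job."""
    key = (job.target_format, job.dolby_e_output)
    if key not in PRESETS:
        raise ValueError("no preset configured for %r" % (key,))
    return PRESETS[key]
```

Raising an error for an unconfigured combination, rather than falling back to a default preset, keeps a mislabelled job from being ingested with the wrong processing.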

Without doubt, the future of audio broadcast involves the widespread integration of server technology. When devising their implementation strategy, broadcasters would be wise to pause and reflect on the cost savings that complementary technology such as ingest pre-processing systems can offer.

In some ways, Axon’s Synapse range may be a Cinderella technology, not regarded in the same way as the latest generation of servers, but it can save operational staff considerable stress in their day-to-day operations.


BPL Broadcast Limited, 3rd Floor, Armstrong House, 38 Market Square, Uxbridge, Middlesex, UB8 1LH, United Kingdom | +44 (0) 1895 454 411 |  e: info@bpl-broadcast.com  | Copyright © 2014