Caption me if you can – WebVTT support for HbbTV terminals with dash.js

Following the German saying “Ehre, wem Ehre gebührt” (“honour to whom honour is due”), we start this blog post a bit differently: with the acknowledgements. First, we would like to thank ARTE for sponsoring and supporting the development activities and for contributing the resulting source code back to the dash.js community.

Next, we would like to thank Dan Sparacio for the inspiration for the title of this post. Dan recently gave a very interesting presentation at the Demuxed conference titled “Caption Me If You Can”, talking about challenges and solutions while implementing timed text features for a global audience at Paramount.

Background

The HbbTV Specification has officially supported the Media Source Extensions (MSE) since version 2.0.3. The Encrypted Media Extensions (EME) have been part of the HbbTV Specification since version 2.0.1. The availability of MSE and EME allows HbbTV application developers and content providers to move away from native DASH type-1 media players to type-3 MSE/EME-based implementations. dash.js is a prominent choice of library for playing back DASH content on HbbTV terminals. The official HbbTV reference application also uses dash.js as a “player for devices without a suitable native player”.

dash.js supports multiple subtitle and caption formats, including CEA608, CEA708, TTML, IMSC1, EBU-TT-D and WebVTT. However, the WebVTT implementation differs in that it requires native rendering support from the underlying device. dash.js parses the WebVTT files or segments and then uses the native addCue and removeCue methods together with the onenter event handler. Styling and rendering are handled by the platform itself.
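
To make the native path more concrete, here is a minimal sketch of how parsed cues can be handed to a platform text track. It is not the dash.js source code: the helper names `addCuesToTrack` and `removeCuesFromTrack`, the `{ start, end, text }` cue shape and the `onCueEnter` callback are assumptions made for illustration. Only `addCue`, `removeCue` and `onenter` are taken from the description above.

```javascript
// Sketch of the native rendering path (hypothetical helpers, not dash.js internals).
// `track` is expected to behave like a TextTrack and `CueCtor` like VTTCue.
function addCuesToTrack(track, CueCtor, parsedCues, onCueEnter) {
  const added = [];
  for (const { start, end, text } of parsedCues) {
    const cue = new CueCtor(start, end, text);
    // The platform fires onenter when playback reaches the cue's start time.
    cue.onenter = () => onCueEnter(cue);
    track.addCue(cue);
    added.push(cue);
  }
  return added;
}

// Remove previously added cues, for example when switching tracks.
function removeCuesFromTrack(track, cues) {
  for (const cue of cues) {
    track.removeCue(cue);
  }
}

// Browser usage (assumes a <video> element in `video` and native VTTCue support):
//   const track = video.addTextTrack("subtitles", "English", "en");
//   track.mode = "showing";
//   addCuesToTrack(track, VTTCue, [{ start: 0, end: 2, text: "Hello" }],
//                  (cue) => console.log(cue.text));
```

On a terminal without native WebVTT rendering, this path has nothing to draw the cues with, which is the gap the custom rendering closes.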

The HbbTV standard itself does not define native WebVTT support. Version 1.6.1 of the HbbTV specification states:

Terminals shall be able to correctly render TTML based subtitles

Consequently, the media player itself has to provide the rendering in order to use WebVTT tracks on HbbTV terminals.

Implementation

The goal of the custom WebVTT implementation in dash.js is to support WebVTT tracks on HbbTV terminals while guaranteeing maximum flexibility for dash.js users. For that reason, we implemented the solution so that custom rendering can be activated and deactivated with a single configuration flag:

player.updateSettings({
  streaming: {
    text: {
      webvtt: {
        customRenderingEnabled: true
      }
    }
  }
})

Under the hood we are using vtt.js for the parsing and rendering of the WebVTT tracks. In addition, dash.js exposes an interface for the application to specify a rendering container element:

player.attachVttRenderingDiv(vttRenderingDiv)
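
Putting the two calls above together, a player setup might look like the following sketch. The helper name `createPlayerWithCustomVtt` is hypothetical, and the order of the calls (attaching the rendering container after `initialize`) is an assumption modelled on how TTML rendering containers are commonly attached in dash.js samples. The `dashjs` object is passed in as a parameter so that the sketch does not assume how the library is loaded.

```javascript
// Hypothetical helper that enables custom WebVTT rendering and attaches the container.
function createPlayerWithCustomVtt(dashjs, videoElement, vttRenderingDiv, manifestUrl) {
  const player = dashjs.MediaPlayer().create();

  // Switch from native to custom WebVTT rendering.
  player.updateSettings({
    streaming: {
      text: {
        webvtt: {
          customRenderingEnabled: true
        }
      }
    }
  });

  // Attach the video element and the manifest, then start playback automatically.
  player.initialize(videoElement, manifestUrl, true);

  // Assumption: the container is attached after the video element is known to the player.
  player.attachVttRenderingDiv(vttRenderingDiv);

  return player;
}

// Browser usage (element IDs and the URL are placeholders):
//   createPlayerWithCustomVtt(
//     dashjs,
//     document.querySelector("#video"),
//     document.querySelector("#vtt-rendering-div"),
//     "https://example.com/manifest.mpd"
//   );
```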

The bootstrap logic of dash.js as well as the internal handling are optimized to only call the relevant methods for one of the rendering methods, either native rendering or custom rendering. The screenshot below shows an early version of the implementation in which both rendering methods could be enabled for debugging purposes.

Custom and native rendering of WebVTT subtitles in dash.js

The corresponding pull request has been merged into the development branch of dash.js, and a sample is available in the nightly build of dash.js.

Test results

Extensive tests carried out by ARTE and Fraunhofer FOKUS showed that the custom rendering solution works on all the tested devices:

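| Model | Manufacturer | Year | Native Rendering | Custom Rendering |
|---|---|---|---|---|
| 43UK6400PLF | LG | 2018 | | Works |
| 43UM7400PLB | LG | 2019 | | Works |
| 43UM71007LB | LG | 2019 | | Works |
| 43UM7050PLF | LG | 2020 | | Works |
| 43UO75006LF | LG | 2021 | | Works |
| 43UP75009LF | LG | 2021 | | Works |
| 43UQ75009LF | LG | 2022 | | Works |
| 574319D80 | Loewe | 2018 | | Works |
| TX-40FXW724 | Panasonic | 2018 | Works | Works |
| TX-40GXW804 | Panasonic | 2019 | Works | Works |
| TX-43HXW904 | Panasonic | 2020 | Works | Works |
| TX-40JXW854Z | Panasonic | 2021 | | Works |
| QE43Q68AAUXXC | Samsung | 2021 | | Works |
| UE43RU7400U | Samsung | 2019 | | Works |
| GU43TU7079U | Samsung | 2020 | | Works |
| GU43AU7179UXZG | Samsung | 2021 | | Works |
| GQ32LS03BBUXZG | Samsung | 2022 | | Works |
| KD-43XH8096 | Sony | 2020 | | Works |
| KD-43X80J | Sony | 2021 | | Works |
| 43A700F | Hisense | 2020 | | Works |
| 32EA5500F | Hisense | 2020 | | Works |
| 55PUS7504 | Philips | 2019 | | Works |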

Future work

Future work includes performance optimizations regarding the parsing time of large WebVTT files. HbbTV terminals and smart TVs typically do not have the same computational power as desktop devices or smartphones. As a consequence, parsing large WebVTT files can take significantly longer on HbbTV terminals than on other devices, which increases the startup time of the player.

One possible way to overcome such performance issues is to parse the WebVTT tracks on the server side and return a JSON representation of the data to the client. Moreover, moving the parsing logic to a Web Worker, where available, can lead to performance improvements.
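
As an illustration of the server-side idea, the following sketch converts a WebVTT document into a JSON-friendly array of cues, for example in a Node.js service. It is a deliberately simplified parser and not vtt.js: cue settings, identifiers, NOTE, STYLE and REGION blocks are ignored, and the output shape (`start` and `end` in seconds plus `text`) is an assumption, not a format defined by dash.js.

```javascript
// Convert "HH:MM:SS.mmm" or "MM:SS.mmm" into seconds; returns null if malformed.
function parseTimestamp(value) {
  const match = /^(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})$/.exec(value);
  if (!match) return null;
  const [, hours = "0", minutes, seconds, millis] = match;
  return Number(hours) * 3600 + Number(minutes) * 60 + Number(seconds) + Number(millis) / 1000;
}

// Simplified WebVTT-to-JSON conversion (hypothetical helper for illustration).
function webvttToJson(vttText) {
  // Normalize line endings, then split into blocks separated by blank lines.
  const blocks = vttText.replace(/\r\n?/g, "\n").split(/\n{2,}/);
  const cues = [];
  for (const block of blocks) {
    const lines = block.split("\n").filter((line) => line.length > 0);
    const timingIndex = lines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) continue; // header, NOTE, STYLE or empty block
    const [startRaw, endAndSettings] = lines[timingIndex].split("-->");
    const start = parseTimestamp(startRaw.trim());
    // Anything after the end timestamp is a cue setting and is dropped here.
    const end = parseTimestamp(endAndSettings.trim().split(/\s+/)[0]);
    if (start === null || end === null) continue; // skip malformed cues
    cues.push({ start, end, text: lines.slice(timingIndex + 1).join("\n") });
  }
  return cues;
}

const sample = [
  "WEBVTT",
  "",
  "1",
  "00:00:01.000 --> 00:00:03.500 align:center",
  "Hello",
  "world",
  "",
  "00:05.250 --> 00:07.000",
  "Second cue",
  ""
].join("\n");

console.log(JSON.stringify(webvttToJson(sample)));
```

With such a representation, the client only needs a `JSON.parse` call, which is typically far cheaper on constrained devices than parsing the WebVTT text itself. The same function could also run inside a Web Worker to keep parsing off the main thread.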

If you have any questions regarding our DASH and HbbTV activities or dash.js in particular, feel free to check out our website and contact us.
