4.8.12.11.2 Sourcing in-band text tracks
A media-resource-specific text track is a text track that corresponds to data found in the media resource.
Rules for processing and rendering such data are defined by the relevant specifications, e.g. the specification of the video format if the media resource is a video. Details for some legacy formats can be found in the Sourcing In-band Media Resource Tracks from Media Containers into HTML specification. [INBAND]
When a media resource contains data that the user agent recognises and supports as being equivalent to a text track, the user agent runs the steps to expose a media-resource-specific text track with the relevant data, as follows.
1. Associate the relevant data with a new text track and its corresponding new TextTrack object. The text track is a media-resource-specific text track.
2. Set the new text track's kind, label, and language based on the semantics of the relevant data, as defined by the relevant specification. If there is no label in that data, then the label must be set to the empty string.
3. Associate the text track list of cues with the rules for updating the text track rendering appropriate for the format in question.
4. If the new text track's kind is metadata, then set the text track in-band metadata track dispatch type as follows, based on the type of the media resource:
↪ If the media resource is an Ogg file
The text track in-band metadata track dispatch type must be set to the value of the Name header field. [OGGSKELETONHEADERS]
↪ If the media resource is a WebM file
The text track in-band metadata track dispatch type must be set to the value of the CodecID element. [WEBMCG]
↪ If the media resource is an MPEG-2 file
Let stream type be the value of the "stream_type" field describing the text track's type in the file's program map section, interpreted as an 8-bit unsigned integer. Let length be the value of the "ES_info_length" field for the track in the same part of the program map section, interpreted as an integer as defined by the MPEG-2 specification. Let descriptor bytes be the length bytes following the "ES_info_length" field. The text track in-band metadata track dispatch type must be set to the concatenation of the stream type byte and the zero or more descriptor bytes bytes, expressed in hexadecimal using uppercase ASCII hex digits. [MPEG2]
↪ If the media resource is an MPEG-4 file
Let the first stsd box of the first stbl box of the first minf box of the first mdia box of the text track's trak box in the first moov box of the file be the stsd box, if any. If the file has no stsd box, or if the stsd box has neither a mett box nor a metx box, then the text track in-band metadata track dispatch type must be set to the empty string. Otherwise, if the stsd box has a mett box then the text track in-band metadata track dispatch type must be set to the concatenation of the string "mett", a U+0020 SPACE character, and the value of the first mime_format field of the first mett box of the stsd box, or the empty string if that field is absent in that box. Otherwise, if the stsd box has no mett box but has a metx box then the text track in-band metadata track dispatch type must be set to the concatenation of the string "metx", a U+0020 SPACE character, and the value of the first namespace field of the first metx box of the stsd box, or the empty string if that field is absent in that box. [MPEG4]
5. Populate the new text track's list of cues with the cues parsed so far, following the guidelines for exposing cues, and begin updating it dynamically as necessary.
6. Set the new text track's readiness state to loaded.
7. Set the new text track's mode to the mode consistent with the user's preferences and the requirements of the relevant specification for the data.
Note: For instance, if there are no other active subtitles, and this is a forced subtitle track (a subtitle track giving subtitles in the audio track's primary language, but only for audio that is actually in another language), then those subtitles might be activated here.
8. Add the new text track to the media element's list of text tracks.
9. Fire a trusted event with the name addtrack, that does not bubble and is not cancelable, and that uses the TrackEvent interface, with the track attribute initialised to the text track's TextTrack object, at the media element's textTracks attribute's TextTrackList object.
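The following non-normative sketch illustrates how the MPEG-2 and MPEG-4 rules in step 4 compose a dispatch type string. The function names and the representation of the stsd box as a list of (box type, fields) pairs are assumptions made for illustration only; locating the program map section or the boxes in a real file is outside the scope of the sketch.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: each child of the stsd box is a
# (box_type, fields) pair, listed in file order.
StsdBox = List[Tuple[str, Dict[str, str]]]


def mpeg2_dispatch_type(stream_type: int, descriptor_bytes: bytes) -> str:
    """Concatenate the stream_type byte and the descriptor bytes,
    expressed in hexadecimal using uppercase ASCII hex digits."""
    if not 0 <= stream_type <= 0xFF:
        raise ValueError("stream_type must be an 8-bit unsigned integer")
    return (bytes([stream_type]) + descriptor_bytes).hex().upper()


def mpeg4_dispatch_type(stsd: Optional[StsdBox]) -> str:
    """Return "mett <mime_format>" or "metx <namespace>", or the empty
    string when there is no stsd box or it has neither box."""
    if stsd is None:
        return ""
    # A mett box takes precedence over a metx box, whatever their order.
    for box_type, field_name in (("mett", "mime_format"), ("metx", "namespace")):
        for entry_type, fields in stsd:
            if entry_type == box_type:
                # An absent field contributes the empty string.
                return f"{box_type} {fields.get(field_name, '')}"
    return ""
```

Note that when the relevant field is absent, the result still contains the box name and the U+0020 SPACE character, since only the field value is replaced by the empty string.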
4.8.12.11.3 Sourcing out-of-band text tracks
When a track element is created, it must be associated with a new text track (with its value set as defined below) and its corresponding new TextTrack object.
The text track kind is determined from the state of the element's kind attribute according to the following table; for a state given in a cell of the first column, the kind is the string given in the second column:
State           String
Subtitles       subtitles
Captions        captions
Descriptions    descriptions
Chapters        chapters
Metadata        metadata
The text track label is the element's track label.
The text track language is the element's track language, if any, or the empty string otherwise.
As the kind, label, and srclang attributes are set, changed, or removed, the text track must update accordingly, as per the definitions above.
The text track readiness state is initially not loaded, and the text track mode is initially disabled.
The text track list of cues is initially empty. It is dynamically modified when the referenced file is parsed. Associated with the list are the rules for updating the text track rendering appropriate for the format in question; for WebVTT, this is the rules for updating the display of WebVTT text tracks. [WEBVTT]
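As a non-normative illustration, the initial values described above can be modelled as follows. The class and function names are hypothetical, and the sketch takes the already-resolved state of the kind attribute as input rather than parsing the attribute value.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Mapping from the state of the kind attribute to the text track kind,
# as given in the table above.
KIND_BY_STATE = {
    "Subtitles": "subtitles",
    "Captions": "captions",
    "Descriptions": "descriptions",
    "Chapters": "chapters",
    "Metadata": "metadata",
}


@dataclass
class OutOfBandTextTrack:
    kind: str
    label: str
    language: str
    readiness_state: str = "not loaded"  # initial readiness state
    mode: str = "disabled"               # initial mode
    cues: List[object] = field(default_factory=list)  # initially empty


def text_track_for(kind_state: str, track_label: str,
                   track_language: Optional[str]) -> OutOfBandTextTrack:
    """Build the text track associated with a newly created track element."""
    return OutOfBandTextTrack(
        kind=KIND_BY_STATE[kind_state],
        label=track_label,
        # The language is the track language, if any, else the empty string.
        language=track_language if track_language is not None else "",
    )
```

Because the text track must update as the attributes change, an implementation would recompute these three values on every attribute mutation; the sketch shows only the initial computation.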
When a track element's parent element changes and the new parent is a media element, then the user agent must add the track element's corresponding text track to the media element's list of text tracks, and then queue a task to fire a trusted event with the name addtrack, that does not bubble and is not cancelable, and that uses the TrackEvent interface, with the track attribute initialised to the text track's TextTrack object, at the media element's textTracks attribute's TextTrackList object.
When a track element's parent element changes and the old parent was a media element, then the user agent must remove the track element's corresponding text track from the media element's list of text tracks, and then queue a task to fire a trusted event with the name removetrack, that does not bubble and is not cancelable, and that uses the TrackEvent interface, with the track attribute initialised to the text track's TextTrack object, at the media element's textTracks attribute's TextTrackList object.
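The two requirements above share a pattern: the list of text tracks is changed synchronously, while the event is fired from a queued task. The following non-normative sketch models that ordering with a plain queue. All names are hypothetical, events are recorded as tuples rather than dispatched, and handling the old parent before the new parent is an assumption of the sketch.

```python
from collections import deque

task_queue = deque()  # stands in for the event loop's task queue


class MediaElement:
    def __init__(self):
        self.text_tracks = []  # the list of text tracks
        self.events = []       # events fired at the TextTrackList, for inspection


def on_parent_changed(text_track, old_parent, new_parent):
    """Update the lists of text tracks now and queue the event tasks."""
    if isinstance(old_parent, MediaElement):
        old_parent.text_tracks.remove(text_track)
        task_queue.append(
            lambda: old_parent.events.append(("removetrack", text_track)))
    if isinstance(new_parent, MediaElement):
        new_parent.text_tracks.append(text_track)
        task_queue.append(
            lambda: new_parent.events.append(("addtrack", text_track)))


def run_tasks():
    """Run every queued task in order."""
    while task_queue:
        task_queue.popleft()()
```

A script that inspects the list of text tracks immediately after moving a track element therefore sees the updated list before any addtrack or removetrack event has been delivered.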
When a text track corresponding to a track element is added to a media element's list of text tracks, the user agent must queue a task to run the following steps for the media element:
1. If the element's blocked-on-parser flag is true, abort these steps.
2. If the element's did-perform-automatic-track-selection flag is true, abort these steps.
3. Honor user preferences for automatic text track selection for this element.
When the user agent is required to honor user preferences for automatic text track selection for a media element, the user agent must run the following steps:
1. Perform automatic text track selection for subtitles and captions.
2. Perform automatic text track selection for descriptions.
3. Perform automatic text track selection for chapters.
4. If there are any text tracks in the media element's list of text tracks whose text track kind is metadata that correspond to track elements with a default attribute set whose text track mode is set to disabled, then set the text track mode of all such tracks to hidden.
5. Set the element's did-perform-automatic-track-selection flag to true.
Note (on changes to the kind, label, and srclang attributes above): Changes to the track URL are handled in the track processing model algorithm below.
When the steps above say to perform automatic text track selection for one or more text track kinds, it means to run the following steps:
1. Let candidates be a list consisting of the text tracks in the media element's list of text tracks whose text track kind is one of the kinds that were passed to the algorithm, if any, in the order given in the list of text tracks.
2. If candidates is empty, then abort these steps.
3. If any of the text tracks in candidates have a text track mode set to showing, abort these steps.
4. If the user has expressed an interest in having a track from candidates enabled based on its text track kind, text track language, and text track label, then set its text track mode to showing.
Otherwise, if there are any text tracks in candidates that correspond to track elements with a default attribute set whose text track mode is set to disabled, then set the text track mode of the first such track to showing.
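The selection steps can be sketched as follows. This is a non-normative illustration under stated assumptions: the Track class, the has_default field, and the user_pick callback (which stands in for however the user agent records the user's expressed interest) are all hypothetical, and setting the did-perform-automatic-track-selection flag is left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Track:
    kind: str
    mode: str = "disabled"
    has_default: bool = False  # corresponds to a track element with default set
    language: str = ""
    label: str = ""


# Returns the candidate the user is interested in, or None.
UserPick = Callable[[List[Track]], Optional[Track]]


def perform_automatic_selection(tracks: List[Track], kinds: Iterable[str],
                                user_pick: Optional[UserPick] = None) -> None:
    kinds = set(kinds)
    # Step 1: candidates keep the order of the list of text tracks.
    candidates = [t for t in tracks if t.kind in kinds]
    if not candidates:                                # step 2
        return
    if any(t.mode == "showing" for t in candidates):  # step 3
        return
    # Step 4: a user preference wins over the default attribute.
    chosen = user_pick(candidates) if user_pick else None
    if chosen is not None:
        chosen.mode = "showing"
        return
    for t in candidates:
        if t.has_default and t.mode == "disabled":
            t.mode = "showing"  # only the first such track
            return


def honor_user_preferences(tracks: List[Track],
                           user_pick: Optional[UserPick] = None) -> None:
    perform_automatic_selection(tracks, {"subtitles", "captions"}, user_pick)
    perform_automatic_selection(tracks, {"descriptions"}, user_pick)
    perform_automatic_selection(tracks, {"chapters"}, user_pick)
    # Metadata tracks with default set are hidden, not shown, and all of
    # them are affected rather than only the first.
    for t in tracks:
        if t.kind == "metadata" and t.has_default and t.mode == "disabled":
            t.mode = "hidden"
```

Subtitles and captions are passed to the algorithm together, so at most one track across both kinds is set to showing by a single run.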
When a text track corresponding to a track element experiences any of the following circumstances, the user agent must start the track processing model for that text track and its track element:
• The track element is created.
• The text track has its text track mode changed.
• The track element's parent element changes and the new parent is a media element.
When a user agent is to start the track processing model for a text track and its track element, it must run the following algorithm. This algorithm interacts closely with the event loop mechanism; in particular, it has a synchronous section (which is triggered as part of the event loop algorithm). The steps in that section are marked with ⌛.
1. If another occurrence of this algorithm is already running for this text track and its track element, abort these steps, letting that other algorithm take care of this element.
2. If the text track's text track mode is not set to one of hidden or showing, abort these steps.
3. If the text track's track element does not have a media element as a parent, abort these steps.
4. Run the remainder of these steps in parallel, allowing whatever caused these steps to run to continue.
5. Top: Await a stable state. The synchronous section consists of the following steps. (The steps in the synchronous section are marked with ⌛.)
6. ⌛ Set the text track readiness state to loading.
7. ⌛ Let URL be the track URL of the track element.
8. ⌛ If the track element's parent is a media element then let corsAttributeState be the state of the parent media element's crossorigin content attribute. Otherwise, let corsAttributeState be No CORS.
9. End the synchronous section, continuing the remaining steps in parallel.
10. If URL is not the empty string, run these substeps:
1. Let request be the result of creating a potential-CORS request given URL, corsAttributeState, and with the same-origin fallback flag set.
2. Set request's client to the track element's node document's Window object's environment settings object and type to "track".
3. Fetch request.
The tasks queued by the fetching algorithm on the networking task source to process the data as it is being fetched must determine the type of the resource. If the type of the resource is not a supported text track format, the load will fail, as described below. Otherwise, the resource's data must be passed to the appropriate parser (e.g., the WebVTT parser) as it is received, with the text track list of cues being used for that parser's output. [WEBVTT]
Note (on step 4 of the automatic text track selection steps above): For example, the user could have set a browser preference to the effect of "I want French captions whenever possible", or "If there is a subtitle track with 'Commentary' in the title, enable it", or "If there are audio description tracks available, enable one, ideally in Swiss German, but failing that in Standard Swiss German or Standard German".
This specification does not currently say whether or how to check the MIME types of text tracks, or whether or how to perform file type sniffing using the actual file data. Implementors differ in their intentions on this matter and it is therefore unclear what the right solution is. In the absence of any requirement here, the HTTP specification's strict requirement to follow the Content-Type header prevails ("Content-Type specifies the media type of the underlying data." ... "If and only if the media type is not given by a Content-Type field, the recipient MAY attempt to guess the media type via inspection of its content and/or the name extension(s) of the URI used to identify the resource.").
If fetching fails for any reason (network error, the server returns an error code, CORS fails, etc.), or if URL is the empty string, then queue a task to first change the text track readiness state to failed to load and then fire a simple event named error at the track element. This task must use the DOM manipulation task source.
If fetching does not fail, but the type of the resource is not a supported text track format, or the file was not successfully processed (e.g., the format in question is an XML format and the file contained a well-formedness error that the XML specification requires be detected and reported to the application), then the task that is queued by the networking task source in which the aforementioned problem is found must change the text track readiness state to failed to load and fire a simple event named error at the track element.
If fetching does not fail, and the file was successfully processed, then the final task that is queued by the networking task source, after it has finished parsing the data, must change the text track readiness state to loaded, and fire a simple event named load at the track element.
If, while fetching is ongoing, either:
◦ the track URL changes so that it is no longer equal to URL, while the text track mode is set to hidden or showing; or
◦ the text track mode changes to hidden or showing, while the track URL is not equal to URL
...then the user agent must abort fetching, discarding any pending tasks generated by that algorithm (and in particular, not adding any cues to the text track list of cues after the moment the URL changed), and then queue a task that first changes the text track readiness state to failed to load and then fires a simple event named error at the track element. This task must use the DOM manipulation task source.
11. Wait until the text track readiness state is no longer set to loading.
12. Wait until the track URL is no longer equal to URL, at the same time as the text track mode is set to hidden or showing.
13. Jump to the step labeled top.
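The guards and outcomes of the algorithm above can be summarised in a non-normative sketch. The function names and boolean parameters are hypothetical simplifications: a real user agent discovers fetch and parse failures incrementally, inside tasks, rather than receiving them as flags.

```python
from typing import Tuple

ACTIVE_MODES = ("hidden", "showing")


def should_start(mode: str, parent_is_media_element: bool,
                 already_running: bool) -> bool:
    """Steps 1 to 3: the conditions under which the algorithm proceeds."""
    return (not already_running
            and mode in ACTIVE_MODES
            and parent_is_media_element)


def load_outcome(url: str, fetch_ok: bool, supported_format: bool,
                 parsed_ok: bool) -> Tuple[str, str]:
    """Step 10: return (readiness state, event fired at the track element)."""
    if url == "" or not fetch_ok:
        # Network error, server error code, CORS failure, or empty URL.
        return ("failed to load", "error")
    if not supported_format or not parsed_ok:
        # Fetched, but not a supported format or not successfully processed.
        return ("failed to load", "error")
    return ("loaded", "load")


def needs_restart(track_url: str, fetched_url: str, mode: str) -> bool:
    """The condition awaited in step 12; the same state, reached while
    fetching is ongoing, is what requires the fetch to be aborted."""
    return track_url != fetched_url and mode in ACTIVE_MODES
```

For example, load_outcome("", True, True, True) yields the failed to load state and an error event, because an empty URL is treated like a failed fetch.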
Whenever a track element has its src attribute set, changed, or removed, the user agent must immediately empty the element's text track's text track list of cues. (This also causes the algorithm above to stop adding cues from the resource being obtained using the previously given URL, if any.)
4.8.12.11.4 Guidelines for exposing cues in various formats as text track cues
How a specific format's text track cues are to be interpreted for the purposes of processing by an HTML user agent is defined by that format. In the
absence of such a specification, this section provides some constraints within which implementations can attempt to consistently expose such
formats.
To support the text track model of HTML, each unit of timed data is converted to a text track cue. Where the mapping of the format's features to the aspects of a text track cue as defined in this specification is not defined, implementations must ensure that the mapping is consistent with the definitions of the aspects of a text track cue as defined above, as well as with the following constraints:
Note (on passing the resource's data to the appropriate parser in the track processing model above): The appropriate parser will incrementally update the text track list of cues during these networking task source tasks, as each such task is run with whatever data has been received from the network.