Network Working Group                                        S. Pfeiffer
Request for Comments: 3533                                         CSIRO
Category: Informational                                         May 2003


                 The Ogg Encapsulation Format Version 0

Status of this Memo

This memo provides information for the Internet community. It does
not specify an Internet standard of any kind. Distribution of this
memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

This document describes the Ogg bitstream format version 0, which is
a general, freely-available encapsulation format for media streams.
It is able to encapsulate any kind and number of video and audio
encoding formats as well as other data streams in a single bitstream.

Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in BCP 14, RFC 2119 [2].

Table of Contents

1. Introduction
2. Definitions
3. Requirements for a generic encapsulation format
4. The Ogg bitstream format
5. The encapsulation process
6. The Ogg page format
7. Security Considerations
8. References
A. Glossary of terms and abbreviations
B. Acknowledgements
Author's Address
Full Copyright Statement

Pfeiffer                     Informational                      [Page 1]

RFC 3533                          OGG                           May 2003

1. Introduction

The Ogg bitstream format has been developed as a part of a larger
project aimed at creating a set of components for the coding and
decoding of multimedia content (codecs) which are to be freely
available and freely re-implementable, both in software and in
hardware for the computing community at large, including the Internet
community. It is the intention of the Ogg developers represented by
Xiph.Org that it be usable without intellectual property concerns.

This document describes the Ogg bitstream format and how to use it to
encapsulate one or several media bitstreams created by one or several
encoders. The Ogg transport bitstream is designed to provide
framing, error protection and seeking structure for higher-level
codec streams that consist of raw, unencapsulated data packets, such
as the Vorbis audio codec or the upcoming Tarkin and Theora video
codecs. It is capable of interleaving different binary media and
other time-continuous data streams that are prepared by an encoder as
a sequence of data packets. Ogg provides enough information to
properly separate data back into such encoder created data packets at
the original packet boundaries without relying on decoding to find
packet boundaries.

Please note that the MIME type application/ogg has been registered
with the IANA [1].

2. Definitions

For describing the Ogg encapsulation process, a set of terms will be
used whose meaning needs to be well understood. Therefore, some of
the most fundamental terms are defined now before we start with the
description of the requirements for a generic media stream
encapsulation format, the process of encapsulation, and the concrete
format of the Ogg bitstream. See the Appendix for a more complete
glossary.

The result of an Ogg encapsulation is called the "Physical (Ogg)
Bitstream". It encapsulates one or several encoder-created
bitstreams, which are called "Logical Bitstreams". A logical
bitstream, provided to the Ogg encapsulation process, has a
structure, i.e., it is split up into a sequence of so-called
"Packets". The packets are created by the encoder of that logical
bitstream and represent meaningful entities for that encoder only
(e.g., an uncompressed stream may use video frames as packets). They
do not contain boundary information - strung together they appear to
be streams of random bytes with no landmarks.

Please note that the term "packet" is not used in this document to
signify entities for transport over a network.

3. Requirements for a generic encapsulation format

The design idea behind Ogg was to provide a generic, linear media
transport format to enable both file-based storage and stream-based
transmission of one or several interleaved media streams independent
of the encoding format of the media data. Such an encapsulation
format needs to provide:

o framing for logical bitstreams.

o interleaving of different logical bitstreams.

o detection of corruption.

o recapture after a parsing error.

o position landmarks for direct random access of arbitrary positions
  in the bitstream.

o streaming capability (i.e., no seeking is needed to build a 100%
  complete bitstream).

o small overhead (i.e., use no more than approximately 1-2% of
  bitstream bandwidth for packet boundary marking, high-level
  framing, sync and seeking).

o simplicity to enable fast parsing.

o simple concatenation mechanism of several physical bitstreams.

Ogg takes all of these design considerations into account. It
supports framing and interleaving of logical bitstreams, seeking
landmarks, detection of corruption, and stream resynchronisation
after a parsing error with no more than approximately 1-2% overhead.
It is a generic framework to perform encapsulation of time-continuous
bitstreams. It does not know any specifics about the codec data that
it encapsulates and is thus independent of any media codec.

4. The Ogg bitstream format

A physical Ogg bitstream consists of multiple logical bitstreams
interleaved in so-called "Pages". Whole pages are taken in order
from multiple logical bitstreams multiplexed at the page level. The
logical bitstreams are identified by a unique serial number in the
header of each page of the physical bitstream. This unique serial
number is created randomly and does not have any connection to the
content or encoder of the logical bitstream it represents. Pages of
all logical bitstreams are concurrently interleaved, but they need
not be in a regular order - they are only required to be consecutive
within the logical bitstream. Ogg demultiplexing reconstructs the
original logical bitstreams from the physical bitstream by taking the
pages in order from the physical bitstream and redirecting them into
the appropriate logical decoding entity.
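
The demultiplexing step described above can be sketched in a few
lines of Python. This is only an illustration under simplifying
assumptions: pages are modelled as hypothetical (serial_number,
payload) tuples instead of parsed page headers, and the name
`demultiplex` is chosen for this example only.

```python
from collections import defaultdict


def demultiplex(pages):
    """Route pages, taken in physical order, to one list per serial number."""
    streams = defaultdict(list)
    for serial, payload in pages:
        # Pages keep their relative order within each logical bitstream.
        streams[serial].append(payload)
    return dict(streams)


# Hypothetical physical bitstream: two logical bitstreams, irregular interleave.
physical = [
    (0xA1, b"a0"), (0xB2, b"b0"),
    (0xA1, b"a1"), (0xA1, b"a2"), (0xB2, b"b1"),
]
logical = demultiplex(physical)
```

The interleaving order in `physical` is deliberately irregular; only
the order of pages within one serial number matters for the result.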

Each Ogg page contains only one type of data as it belongs to one
logical bitstream only. Pages are of variable size and have a page
header containing encapsulation and error recovery information. Each
logical bitstream in a physical Ogg bitstream starts with a special
start page (bos=beginning of stream) and ends with a special page
(eos=end of stream).

The bos page contains information to uniquely identify the codec type
and MAY contain information to set up the decoding process. The bos
page SHOULD also contain information about the encoded media - for
example, for audio, it should contain the sample rate and number of
channels. By convention, the first bytes of the bos page contain
magic data that uniquely identifies the required codec. It is the
responsibility of anyone fielding a new codec to make sure it is
possible to reliably distinguish his/her codec from all other codecs
in use. There is no fixed way to detect the end of the
codec-identifying marker. The format of the bos page is dependent on
the codec and therefore MUST be given in the encapsulation
specification of that logical bitstream type. Ogg also allows but
does not require secondary header packets after the bos page for
logical bitstreams and these must also precede any data packets in
any logical bitstream. These subsequent header packets are framed
into an integral number of pages, which will not contain any data
packets. So, a physical bitstream begins with the bos pages of all
logical bitstreams containing one initial header packet per page,
followed by the subsidiary header packets of all streams, followed by
pages containing data packets.

The encapsulation specification for one or more logical bitstreams is
called a "media mapping". An example of a media mapping is "Ogg
Vorbis", which uses the Ogg framework to encapsulate Vorbis-encoded
audio data for stream-based storage (such as files) and transport
(such as TCP streams or pipes). Ogg Vorbis provides the name and
revision of the Vorbis codec, the audio rate and the audio quality on
the Ogg Vorbis bos page. It also uses two additional header pages
per logical bitstream. The Ogg Vorbis bos page starts with the byte
0x01, followed by "vorbis" (a total of 7 bytes of identifier).
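
As a small illustration of the codec-identifying convention, the
following Python sketch checks the first packet of a bos page against
the 7-byte Ogg Vorbis identifier given above. The function name
`identify_codec` and the "unknown" fallback are assumptions made for
this example; a real demultiplexer would consult the media mappings
of all codecs it supports.

```python
# Byte 0x01 followed by the ASCII string "vorbis": 7 bytes in total.
VORBIS_ID = b"\x01vorbis"


def identify_codec(first_packet):
    """Return a codec name based on the leading magic bytes of a bos packet."""
    if first_packet.startswith(VORBIS_ID):
        return "vorbis"
    # Other codecs define their own markers in their media mappings.
    return "unknown"


vorbis_like = identify_codec(b"\x01vorbis" + bytes(23))
other = identify_codec(b"\x00something else")
```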

Ogg supports two types of multiplexing: concurrent multiplexing
(so-called "Grouping") and sequential multiplexing (so-called
"Chaining"). Grouping defines how to interleave several logical
bitstreams page-wise in the same physical bitstream. Grouping is for
example needed for interleaving a video stream with several
synchronised audio tracks using different codecs in different logical
bitstreams. Chaining, on the other hand, is defined to provide a
simple mechanism to concatenate physical Ogg bitstreams, as is often
needed for streaming applications.

In grouping, all bos pages of all logical bitstreams MUST appear
together at the beginning of the Ogg bitstream. The media mapping
specifies the order of the initial pages. For example, the grouping
of a specific Ogg video and Ogg audio bitstream may specify that the
physical bitstream MUST begin with the bos page of the logical video
bitstream, followed by the bos page of the audio bitstream. Unlike
bos pages, eos pages for the logical bitstreams need not all occur
contiguously. Eos pages may be 'nil' pages, that is, pages
containing no content but simply a page header with position
information and the eos flag set in the page header. Each grouped
logical bitstream MUST have a unique serial number within the scope
of the physical bitstream.

In chaining, complete logical bitstreams are concatenated. The
bitstreams do not overlap, i.e., the eos page of a given logical
bitstream is immediately followed by the bos page of the next. Each
chained logical bitstream MUST have a unique serial number within the
scope of the physical bitstream.

It is possible to consecutively chain groups of concurrently
multiplexed bitstreams. The groups, when unchained, MUST stand on
their own as a valid concurrently multiplexed bitstream. The
following diagram shows a schematic example of such a physical
bitstream that obeys all the rules of both grouped and chained
multiplexed bitstreams.

         physical bitstream with pages of
    different logical bitstreams grouped and chained
-------------------------------------------------------------
|*A*|*B*|*C*|A|A|C|B|A|B|#A#|C|...|B|C|#B#|#C#|*D*|D|...|#D#|
-------------------------------------------------------------
 bos bos bos             eos           eos eos bos       eos

In this example, there are two chained physical bitstreams, the first
of which is a grouped stream of three logical bitstreams A, B, and C.
The second physical bitstream is chained after the end of the grouped
bitstream, which ends after the last eos page of all its grouped
logical bitstreams. As can be seen, grouped bitstreams begin
together - all of the bos pages MUST appear before any data pages.
It can also be seen that pages of concurrently multiplexed bitstreams
need not conform to a regular order. And it can be seen that a
grouped bitstream can end long before the other bitstreams in the
group end.
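
The grouping and chaining rules can be checked mechanically. The
Python sketch below is an illustration only: it reads the diagram
notation used in this section (`*X*` for a bos page, `#X#` for an eos
page, a bare letter for a data page, `...` for elided pages) and
counts the chained groups. The helper names `parse_diagram` and
`count_chained_groups` are hypothetical, stream letters stand in for
serial numbers, and a page that is both bos and eos is not modelled.

```python
def parse_diagram(diagram):
    """Turn '|*A*|A|#A#|' into [('A', 'bos'), ('A', 'data'), ('A', 'eos')]."""
    pages = []
    for cell in diagram.strip("|").split("|"):
        if cell == "...":            # elided pages carry no information
            continue
        if cell.startswith("*"):
            kind = "bos"
        elif cell.startswith("#"):
            kind = "eos"
        else:
            kind = "data"
        pages.append((cell.strip("*#"), kind))
    return pages


def count_chained_groups(pages):
    """Return the number of chained groups; raise ValueError on a violation."""
    seen = set()            # every stream started so far (must stay unique)
    open_streams = set()    # streams of the current group without eos yet
    in_bos_run = False      # True while reading the bos pages of a group
    groups = 0
    for stream, kind in pages:
        if kind == "bos":
            if stream in seen:
                raise ValueError("serial number reused: %r" % stream)
            if not in_bos_run:
                if open_streams:
                    raise ValueError("bos page after data pages in a group")
                groups += 1
                in_bos_run = True
            seen.add(stream)
            open_streams.add(stream)
        else:
            if stream not in open_streams:
                raise ValueError("page outside bos/eos: %r" % stream)
            in_bos_run = False
            if kind == "eos":
                open_streams.remove(stream)
    if open_streams:
        raise ValueError("stream without eos page")
    return groups


DIAGRAM = "|*A*|*B*|*C*|A|A|C|B|A|B|#A#|C|...|B|C|#B#|#C#|*D*|D|...|#D#|"
groups = count_chained_groups(parse_diagram(DIAGRAM))
```

For the diagram above the sketch finds two chained groups (A, B, C
and then D); a bos page that appears after data pages of a group that
is still open is reported as an error.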

Ogg does not know any specifics about the codec data except that each
logical bitstream belongs to a different codec, the data from the
codec comes in order and has position markers (so-called "Granule
positions"). Ogg does not have a concept of 'time': it only knows
about sequentially increasing, unitless position markers. An
application can only get temporal information through higher layers
which have access to the codec APIs to assign and convert granule
positions or time.

A specific definition of a media mapping using Ogg may put further
constraints on its specific use of the Ogg bitstream format. For
example, a specific media mapping may require that all the eos pages
for all grouped bitstreams need to appear in direct sequence. An
example of a media mapping is the specification of "Ogg Vorbis".
Another example is the upcoming "Ogg Theora" specification which
encapsulates Theora-encoded video data and usually comes multiplexed
with a Vorbis stream for an Ogg containing synchronised audio and
video. As Ogg does not specify temporal relationships between the
encapsulated concurrently multiplexed bitstreams, the temporal
synchronisation between the audio and video stream will be specified
in this media mapping. To enable streaming, pages from various
logical bitstreams will typically be interleaved in chronological
order.

5. The encapsulation process

The process of multiplexing different logical bitstreams happens at
the level of pages as described above. The bitstreams provided by
encoders are however handed over to Ogg as so-called "Packets" with
packet boundaries dependent on the encoding format. The process of
encapsulating packets into pages will be described now.

From Ogg's perspective, packets can be of any arbitrary size. A
specific media mapping will define how to group or break up packets
from a specific media encoder. As Ogg pages have a maximum size of
about 64 kBytes, sometimes a packet has to be distributed over
several pages. To simplify that process, Ogg divides each packet
into 255-byte chunks plus a final shorter chunk. These chunks
are called "Ogg Segments". They are only a logical construct and do
not have a header for themselves.

A group of contiguous segments is wrapped into a variable length page
preceded by a header. A segment table in the page header lists
the "Lacing values" (sizes) of the segments included in the page. A
flag in the page header tells whether a page contains a packet
continued from a previous page. Note that a lacing value of 255
implies that a second lacing value follows in the packet, and a value
of less than 255 marks the end of the packet after that many
additional bytes. A packet of 255 bytes (or a multiple of 255 bytes)
is terminated by a lacing value of 0. Note also that a 'nil' (zero
length) packet is not an error; it consists of nothing more than a
lacing value of zero in the header.
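
The lacing rule can be written down directly. The following Python
sketch is illustrative only (the function names are chosen for this
example): `lacing_values` produces the lacing values for one packet
of a given length, and `packet_lengths` recovers packet lengths from
a sequence of lacing values, assuming that sequence starts at a
packet boundary and ignoring the split of a packet across pages.

```python
def lacing_values(packet_length):
    """Lacing values for one packet: 255 for each full segment, then the rest."""
    full_segments, remainder = divmod(packet_length, 255)
    # The final value is always below 255; it is 0 for a nil packet or for
    # a packet whose length is an exact multiple of 255.
    return [255] * full_segments + [remainder]


def packet_lengths(values):
    """Recover packet lengths from lacing values that start at a packet boundary."""
    lengths = []
    current = 0
    for value in values:
        current += value
        if value < 255:          # a value below 255 ends the packet
            lengths.append(current)
            current = 0
    return lengths


table = lacing_values(600) + lacing_values(0) + lacing_values(100) + lacing_values(255)
recovered = packet_lengths(table)
```

A packet of exactly 255 bytes yields the two lacing values 255 and 0,
which is the terminating zero described above.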

The encoding is optimized for speed and the expected case of the
majority of packets being between 50 and 200 bytes large. This is a
design justification rather than a recommendation. This encoding
both avoids imposing a maximum packet size and imposes minimal
overhead on small packets. In contrast, e.g., simply using
two bytes at the head of every packet and having a max packet size of
32 kBytes would always penalize small packets (< 255 bytes, the
typical case) with twice the segmentation overhead. Using the lacing
values as suggested, small packets see the minimum possible
byte-aligned overhead (1 byte) and large packets (>512 bytes) see a
fairly constant ~0.5% overhead on encoding space.
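
The overhead figures can be reproduced with a short calculation. The
Python sketch below is a worked example, not part of the format: it
counts only the lacing bytes (one per segment) and leaves out the
fixed page header, and both function names are made up for this
illustration.

```python
def lacing_overhead(packet_length):
    """Lacing bytes per payload byte: one lacing value per segment."""
    lacing_bytes = packet_length // 255 + 1
    return lacing_bytes / packet_length


def two_byte_overhead(packet_length):
    """Overhead of a fixed two-byte length prefix, for comparison."""
    return 2 / packet_length


small = lacing_overhead(100)        # one lacing byte for 100 payload bytes
small_alt = two_byte_overhead(100)  # two length bytes for the same packet
large = lacing_overhead(10000)      # 40 lacing bytes for 10000 payload bytes
```

For a 100-byte packet the lacing scheme costs 1% against 2% for a
two-byte prefix. For large packets the lacing overhead approaches
1/255, roughly 0.4%, in line with the approximate figure quoted
above.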

The following diagram shows a schematic example of a media mapping
using Ogg and grouped logical bitstreams:

           logical bitstream with packet boundaries
 -----------------------------------------------------------------
 > |       packet_1             | packet_2         | packet_3 |  <
 -----------------------------------------------------------------

                     |segmentation (logically only)
                     v

      packet_1 (5 segments)          packet_2 (4 segs)    p_3 (2 segs)
     ------------------------------ -------------------- ------------
 ..  |seg_1|seg_2|seg_3|seg_4|s_5 | |seg_1|seg_2|seg_3|| |seg_1|s_2 | ..
     ------------------------------ -------------------- ------------

                     | page encapsulation
                     v

 page_1 (packet_1 data)   page_2 (pket_1 data)   page_3 (packet_2 data)
------------------------  ----------------  ------------------------
|H|------------------- |  |H|----------- |  |H|------------------- |
|D||seg_1|seg_2|seg_3| |  |D|seg_4|s_5 | |  |D||seg_1|seg_2|seg_3| | ...
|R|------------------- |  |R|----------- |  |R|------------------- |
------------------------  ----------------  ------------------------

                    |
pages of            |
other    --------|  |
logical         -------
bitstreams      | MUX |
                -------
                   |
                   v

              page_1  page_2          page_3
      ------  ------  -------  -----  -------
 ...  ||   |  ||   |  ||    |  ||  |  ||    |  ...
      ------  ------  -------  -----  -------
              physical Ogg bitstream

In this example we take a snapshot of the encapsulation process of
one logical bitstream. We can see part of that bitstream's
subdivision into packets as provided by the codec. The Ogg
encapsulation process chops up the packets into segments. The
packets in this example are rather large such that packet_1 is split
into 5 segments - 4 segments with 255 bytes and a final smaller one.
Packet_2 is split into 4 segments - 3 segments with 255 bytes and a
final very small one - and packet_3 is split into two segments. The
encapsulation process then creates pages, which are quite small in
this example. Page_1 consists of the first three segments of
packet_1, page_2 contains the remaining 2 segments from packet_1, and
page_3 contains the first three segments of packet_2. Finally, this
logical bitstream is multiplexed into a physical Ogg bitstream with
pages of other logical bitstreams.
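
The example can be replayed with a toy paginator. The Python sketch
below is an illustration under loudly stated assumptions: the packet
sizes (1100, 800 and 300 bytes) are invented to give 5, 4 and 2
segments, the page limit of three segments is artificially small (a
real page holds up to 255 segments), and the policy of closing a page
at every packet end is chosen only to mirror the diagram. The name
`paginate` is hypothetical.

```python
def paginate(packet_lengths, max_segments=255):
    """Group lacing values into pages; flag pages that continue a packet."""
    pages = []
    current = []
    continued = False       # does the next page start in the middle of a packet?
    for length in packet_lengths:
        values = [255] * (length // 255) + [length % 255]
        for index, value in enumerate(values):
            current.append(value)
            is_last = index == len(values) - 1
            # Toy policy: close the page when it is full or the packet ends.
            if len(current) == max_segments or is_last:
                pages.append({"continued": continued, "lacing": current})
                continued = not is_last
                current = []
    return pages


pages = paginate([1100, 800, 300], max_segments=3)
```

The first three resulting pages correspond to page_1, page_2 and
page_3 of the diagram; the second page carries the continuation flag
because it starts with the remainder of packet_1.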

6. The Ogg page format

A physical Ogg bitstream consists of a sequence of concatenated
pages. Pages are of variable size, usually 4-8 kB, maximum 65307
bytes. A page header contains all the information needed to
demultiplex the logical bitstreams out of the physical bitstream, to
perform basic error recovery, and to provide landmarks for seeking.
Each page is a self-contained entity such that the page decode
mechanism can recognize, verify, and handle single pages at a time
without requiring the overall bitstream.

The Ogg page header has the following format:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| capture_pattern: Magic number for page start "OggS"           | 0-3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| version       | header_type   | granule_position              | 4-7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               | 8-11
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               | bitstream_serial_number       | 12-15
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               | page_sequence_number          | 16-19
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               | CRC_checksum                  | 20-23
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               |page_segments  | segment_table | 24-27
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ...                                                           | 28-
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The LSb (least significant bit) comes first in the Bytes. Fields
with more than one byte length are encoded LSB (least significant
byte) first.
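
In practical terms, multi-byte header fields are little-endian. The
short Python sketch below decodes two made-up byte strings to show
the byte order; the values are invented for illustration. The second
line reads an 8-byte field as a signed two's complement integer, the
interpretation used for the granule_position field.

```python
# Four bytes as they would appear in the stream, least significant first.
raw_serial = bytes([0x78, 0x56, 0x34, 0x12])
serial = int.from_bytes(raw_serial, "little")

# An 8-byte field with all bits set reads as -1 in two's complement.
all_ones = int.from_bytes(b"\xff" * 8, "little", signed=True)

# Encoding goes the other way round.
encoded = (0x12345678).to_bytes(4, "little")
```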

The fields in the page header have the following meaning:

1. capture_pattern: a 4 Byte field that signifies the beginning of a
   page. It contains the magic numbers:

      0x4f 'O'

      0x67 'g'

      0x67 'g'

      0x53 'S'

   It helps a decoder to find the page boundaries and regain
   synchronisation after parsing a corrupted stream. Once the
   capture pattern is found, the decoder verifies page sync and
   integrity by computing and comparing the checksum.

2. stream_structure_version: 1 Byte signifying the version number of
   the Ogg file format used in this stream (this document specifies
   version 0).

3. header_type_flag: the bits in this 1 Byte field identify the
   specific type of this page.

   * bit 0x01

     set: page contains data of a packet continued from the previous
     page

     unset: page contains a fresh packet

   * bit 0x02

     set: this is the first page of a logical bitstream (bos)

     unset: this page is not a first page

   * bit 0x04

     set: this is the last page of a logical bitstream (eos)

     unset: this page is not a last page

4. granule_position: an 8 Byte field containing position information.
   For example, for an audio stream, it MAY contain the total number
   of PCM samples encoded after including all frames finished on this
   page. For a video stream it MAY contain the total number of video
   frames encoded after this page. This is a hint for the decoder
   and gives it some timing and position information. Its meaning is
   dependent on the codec for that logical bitstream and specified in
   a specific media mapping. A special value of -1 (in two's
   complement) indicates that no packets finish on this page.

5. bitstream_serial_number: a 4 Byte field containing the unique
   serial number by which the logical bitstream is identified.

6. page_sequence_number: a 4 Byte field containing the sequence
   number of the page so the decoder can identify page loss. This
   sequence number is increasing on each logical bitstream
   separately.

7. CRC_checksum: a 4 Byte field containing a 32 bit CRC checksum of
   the page (including header with zero CRC field and page content).
   The generator polynomial is 0x04c11db7.

8. number_page_segments: 1 Byte giving the number of segment entries
   encoded in the segment table.

9. segment_table: number_page_segments Bytes containing the lacing
   values of all segments in this page. Each Byte contains one
   lacing value.
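
The checksum in field 7 can be sketched as a bitwise CRC in Python.
The text above fixes only the generator polynomial; this sketch
additionally assumes an initial value of zero, most-significant-bit
first processing, and no final XOR, which is how the author of this
example understands the Xiph.Org reference implementation to behave.
Treat those three parameters, and the function names, as assumptions
of the sketch rather than statements of this document.

```python
POLYNOMIAL = 0x04C11DB7   # generator polynomial given in field 7
CRC_OFFSET = 22           # 4 + 1 + 1 + 8 + 4 + 4 bytes precede the CRC field


def ogg_crc(data):
    """Bitwise CRC-32: initial value 0, MSB first, no final XOR (assumed)."""
    crc = 0
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ POLYNOMIAL) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc


def page_checksum(page):
    """Checksum of a whole page, computed with the CRC field set to zero."""
    zeroed = page[:CRC_OFFSET] + bytes(4) + page[CRC_OFFSET + 4:]
    return ogg_crc(zeroed)


# A hypothetical, otherwise empty page: capture pattern plus 23 zero bytes.
empty_page = b"OggS" + bytes(23)
checksum = page_checksum(empty_page)
```

A table-driven variant would be much faster; the bitwise loop is used
here because it maps directly onto the polynomial.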

The total header size in bytes is given by:
header_size = number_page_segments + 27 [Byte]

The total page size in Bytes is given by:
page_size = header_size + sum(lacing_values: 1..number_page_segments)
[Byte]
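
The field list and the two size formulas translate into a compact
parser. The Python sketch below uses the standard `struct` module; the
function name `parse_page_header`, the dictionary keys, and the
example page are all made up for this illustration, and the sketch
does not verify the checksum.

```python
import struct

# capture(4s) version(B) header_type(B) granule(q) serial(I) sequence(I)
# crc(I) number_page_segments(B): 27 bytes, little-endian, no padding.
FIXED_HEADER = struct.Struct("<4sBBqIIIB")


def parse_page_header(buf):
    """Decode the page header at the start of buf and compute the sizes."""
    (capture, version, header_type, granule,
     serial, sequence, crc, n_segments) = FIXED_HEADER.unpack_from(buf, 0)
    if capture != b"OggS":
        raise ValueError("capture pattern not found")
    segment_table = list(buf[27:27 + n_segments])
    if len(segment_table) != n_segments:
        raise ValueError("truncated segment table")
    header_size = n_segments + 27
    return {
        "version": version,
        "continued": bool(header_type & 0x01),
        "bos": bool(header_type & 0x02),
        "eos": bool(header_type & 0x04),
        "granule_position": granule,
        "serial": serial,
        "sequence": sequence,
        "crc": crc,
        "segment_table": segment_table,
        "header_size": header_size,
        "page_size": header_size + sum(segment_table),
    }


# Hypothetical bos page holding one 265-byte packet (lacing values 255, 10).
example = (FIXED_HEADER.pack(b"OggS", 0, 0x02, 0, 0x1234, 0, 0, 2)
           + bytes([255, 10]) + bytes(265))
info = parse_page_header(example)

# Largest possible page: 27 + 255 lacing values + 255 segments of 255 bytes.
max_page_size = 27 + 255 + 255 * 255
```

The last line reproduces the maximum page size of 65307 bytes: a
fixed header of 27 bytes, 255 lacing values, and 255 full segments.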

7. Security Considerations

The Ogg encapsulation format is a container format and only
encapsulates content (such as Vorbis-encoded audio). It does not
provide for any generic encryption or signing of itself or its
contained content bitstreams. However, it encapsulates any kind of
content bitstream as long as there is a codec for it, and is thus
able to contain encrypted and signed content data. It is also
possible to add an external security mechanism that encrypts or signs
an Ogg physical bitstream and thus provides content confidentiality
and authenticity.

As Ogg encapsulates binary data, it is possible to include executable
content in an Ogg bitstream. This can be an issue with applications
that are implemented using the Ogg format, especially when Ogg is
used for streaming or file transfer in a networking scenario. As
such, Ogg does not pose a threat there. However, an application
decoding Ogg and its encapsulated content bitstreams has to ensure
correct handling of manipulated bitstreams, of buffer overflows and
the like.

8. References

[1] Walleij, L., "The application/ogg Media Type", RFC 3534, May
    2003.

[2] Bradner, S., "Key words for use in RFCs to Indicate Requirement
    Levels", BCP 14, RFC 2119, March 1997.

Appendix A. Glossary of terms and abbreviations

bos page: The initial page (beginning of stream) of a logical
   bitstream which contains information to identify the codec type
   and other decoding-relevant information.

chaining (or sequential multiplexing): Concatenation of two or more
   complete physical Ogg bitstreams.

eos page: The final page (end of stream) of a logical bitstream.

granule position: An increasing position number for a specific
   logical bitstream stored in the page header. Its meaning is
   dependent on the codec for that logical bitstream and specified in
   a specific media mapping.

grouping (or concurrent multiplexing): Interleaving of pages of
   several logical bitstreams into one complete physical Ogg
   bitstream under the restriction that all bos pages of all grouped
   logical bitstreams MUST appear before any data pages.

lacing value: An entry in the segment table of a page header
   representing the size of the related segment.

logical bitstream: A sequence of bits being the result of an encoded
   media stream.

media mapping: A specific use of the Ogg encapsulation format
   together with a specific (set of) codec(s).

(Ogg) packet: A subpart of a logical bitstream that is created by the
   encoder for that bitstream and represents a meaningful entity for
   the encoder, but only a sequence of bits to the Ogg encapsulation.

(Ogg) page: A physical bitstream consists of a sequence of Ogg pages,
   each containing data of one logical bitstream only. A page usually
   contains a group of contiguous segments of one packet only, but
   sometimes packets are too large and need to be split over several
   pages.

physical (Ogg) bitstream: The sequence of bits resulting from an Ogg
   encapsulation of one or several logical bitstreams. It consists
   of a sequence of pages from the logical bitstreams with the
   restriction that the pages of one logical bitstream MUST come in
   their correct temporal order.

(Ogg) segment: The Ogg encapsulation process splits each packet into
   chunks of 255 bytes plus a last fractional chunk of less than 255
   bytes. These chunks are called segments.
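
The relation between packets, segments and lacing values defined in
this glossary can be sketched in a few lines. The function names
lacing_values and packet_lengths are hypothetical and used for
illustration only. The sketch relies on the fact that the last chunk
of a packet is always shorter than 255 bytes, which means it has
length 0 when the packet size is an exact multiple of 255.

```python
def lacing_values(packet_len: int) -> list[int]:
    """Return the lacing values that describe one packet.

    The packet is cut into 255-byte segments followed by one final
    segment of 0..254 bytes, which marks the end of the packet.
    """
    if packet_len < 0:
        raise ValueError("packet length must not be negative")
    full, rest = divmod(packet_len, 255)
    return [255] * full + [rest]


def packet_lengths(lacing: list[int]) -> tuple[list[int], int]:
    """Recover packet sizes from a sequence of lacing values.

    Returns the sizes of all completed packets and the number of
    bytes belonging to a trailing packet that has not ended yet
    (non-zero only if the sequence stops on a value of 255).
    """
    lengths, current = [], 0
    for value in lacing:
        if not 0 <= value <= 255:
            raise ValueError("lacing value out of range")
        current += value
        if value < 255:  # a value below 255 terminates the packet
            lengths.append(current)
            current = 0
    return lengths, current


print(lacing_values(600))   # [255, 255, 90]
print(lacing_values(510))   # [255, 255, 0]
print(packet_lengths([255, 255, 90, 255, 255, 0, 255]))
# ([600, 510], 255)
```

The trailing 255 in the last call belongs to a packet that is not yet
terminated, for example one that continues on the following page.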

Appendix B. Acknowledgements

The author gratefully acknowledges the work that Christopher
Montgomery and the Xiph.Org Foundation have done in defining the Ogg
multimedia project and, as part of it, the open file format described
in this document. The author hopes that providing this document to
the Internet community will help in promoting the Ogg multimedia
project at http://www.xiph.org/. Many thanks also for the numerous
technical and typographical corrections that C. Montgomery and the
Ogg community provided as feedback to this RFC.

Author's Address

Silvia Pfeiffer
CSIRO, Australia
Locked Bag 17
North Ryde, NSW 2113
Australia

Phone: +61 2 9325 3141
EMail: Silvia.Pfeiffer@csiro.au
URI:   http://www.cmis.csiro.au/Silvia.Pfeiffer/

Full Copyright Statement

Copyright (C) The Internet Society (2003). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

Funding for the RFC Editor function is currently provided by the
Internet Society.