1 | % -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
|
---|
2 | %!TEX root = Vorbis_I_spec.tex
|
---|
3 | \section{Bitpacking Convention} \label{vorbis:spec:bitpacking}
|
---|
4 |
|
---|
5 | \subsection{Overview}
|
---|
6 |
|
---|
7 | The Vorbis codec uses relatively unstructured raw packets containing
|
---|
8 | arbitrary-width binary integer fields. Logically, these packets are a
|
---|
9 | bitstream in which bits are coded one-by-one by the encoder and then
|
---|
10 | read one-by-one in the same monotonically increasing order by the
|
---|
11 | decoder. Most current binary storage arrangements group bits into a
|
---|
12 | native word size of eight bits (octets), sixteen bits, thirty-two bits
|
---|
13 | or, less commonly other fixed word sizes. The Vorbis bitpacking
|
---|
14 | convention specifies the correct mapping of the logical packet
|
---|
15 | bitstream into an actual representation in fixed-width words.
|
---|
16 |
|
---|
17 |
|
---|
18 | \subsubsection{octets, bytes and words}
|
---|
19 |
|
---|
20 | In most contemporary architectures, a 'byte' is synonymous with an
|
---|
21 | 'octet', that is, eight bits. This has not always been the case;
|
---|
22 | seven, ten, eleven and sixteen bit 'bytes' have been used. For
|
---|
23 | purposes of the bitpacking convention, a byte implies the native,
|
---|
24 | smallest integer storage representation offered by a platform. On
|
---|
25 | modern platforms, this is generally assumed to be eight bits (not
|
---|
26 | necessarily because of the processor but because of the
|
---|
27 | filesystem/memory architecture. Modern filesystems invariably offer
|
---|
28 | bytes as the fundamental atom of storage). A 'word' is an integer
|
---|
29 | size that is a grouped multiple of this smallest size.
|
---|
30 |
|
---|
31 | The most ubiquitous architectures today consider a 'byte' to be an
|
---|
32 | octet (eight bits) and a word to be a group of two, four or eight
|
---|
33 | bytes (16, 32 or 64 bits). Note however that the Vorbis bitpacking
|
---|
34 | convention is still well defined for any native byte size; Vorbis uses
|
---|
35 | the native bit-width of a given storage system. This document assumes
|
---|
36 | that a byte is one octet for purposes of example.
|
---|
37 |
|
---|
38 | \subsubsection{bit order}
|
---|
39 |
|
---|
40 | A byte has a well-defined 'least significant' bit (LSb), which is the
|
---|
41 | only bit set when the byte is storing the two's complement integer
|
---|
42 | value +1. A byte's 'most significant' bit (MSb) is at the opposite
|
---|
43 | end of the byte. Bits in a byte are numbered from zero at the LSb to
|
---|
44 | $n$ ($n=7$ in an octet) for the
|
---|
45 | MSb.
|
---|
46 |
|
---|
47 |
|
---|
48 |
|
---|
49 | \subsubsection{byte order}
|
---|
50 |
|
---|
51 | Words are native groupings of multiple bytes. Several byte orderings
|
---|
52 | are possible in a word; the common ones are 3-2-1-0 ('big endian' or
|
---|
53 | 'most significant byte first' in which the highest-valued byte comes
|
---|
54 | first), 0-1-2-3 ('little endian' or 'least significant byte first' in
|
---|
55 | which the lowest value byte comes first) and less commonly 3-1-2-0 and
|
---|
56 | 0-2-1-3 ('mixed endian').
|
---|
57 |
|
---|
58 | The Vorbis bitpacking convention specifies storage and bitstream
|
---|
59 | manipulation at the byte, not word, level, thus host word ordering is
|
---|
60 | of a concern only during optimization when writing high performance
|
---|
61 | code that operates on a word of storage at a time rather than by byte.
|
---|
62 | Logically, bytes are always coded and decoded in order from byte zero
|
---|
63 | through byte $n$.
|
---|
64 |
|
---|
65 |
|
---|
66 |
|
---|
67 | \subsubsection{coding bits into byte sequences}
|
---|
68 |
|
---|
69 | The Vorbis codec has need to code arbitrary bit-width integers, from
|
---|
70 | zero to 32 bits wide, into packets. These integer fields are not
|
---|
71 | aligned to the boundaries of the byte representation; the next field
|
---|
72 | is written at the bit position at which the previous field ends.
|
---|
73 |
|
---|
74 | The encoder logically packs integers by writing the LSb of a binary
|
---|
75 | integer to the logical bitstream first, followed by next least
|
---|
76 | significant bit, etc, until the requested number of bits have been
|
---|
77 | coded. When packing the bits into bytes, the encoder begins by
|
---|
78 | placing the LSb of the integer to be written into the least
|
---|
79 | significant unused bit position of the destination byte, followed by
|
---|
80 | the next-least significant bit of the source integer and so on up to
|
---|
81 | the requested number of bits. When all bits of the destination byte
|
---|
82 | have been filled, encoding continues by zeroing all bits of the next
|
---|
83 | byte and writing the next bit into the bit position 0 of that byte.
|
---|
84 | Decoding follows the same process as encoding, but by reading bits
|
---|
85 | from the byte stream and reassembling them into integers.
|
---|
86 |
|
---|
87 |
|
---|
88 |
|
---|
89 | \subsubsection{signedness}
|
---|
90 |
|
---|
91 | The signedness of a specific number resulting from decode is to be
|
---|
92 | interpreted by the decoder given decode context. That is, the three
|
---|
93 | bit binary pattern 'b111' can be taken to represent either 'seven' as
|
---|
94 | an unsigned integer, or '-1' as a signed, two's complement integer.
|
---|
95 | The encoder and decoder are responsible for knowing if fields are to
|
---|
96 | be treated as signed or unsigned.
|
---|
97 |
|
---|
98 |
|
---|
99 |
|
---|
100 | \subsubsection{coding example}
|
---|
101 |
|
---|
102 | Code the 4 bit integer value '12' [b1100] into an empty bytestream.
|
---|
103 | Bytestream result:
|
---|
104 |
|
---|
105 | \begin{Verbatim}[commandchars=\\\{\}]
|
---|
106 | |
|
---|
107 | V
|
---|
108 |
|
---|
109 | 7 6 5 4 3 2 1 0
|
---|
110 | byte 0 [0 0 0 0 1 1 0 0] <-
|
---|
111 | byte 1 [ ]
|
---|
112 | byte 2 [ ]
|
---|
113 | byte 3 [ ]
|
---|
114 | ...
|
---|
115 | byte n [ ] bytestream length == 1 byte
|
---|
116 |
|
---|
117 | \end{Verbatim}
|
---|
118 |
|
---|
119 |
|
---|
120 | Continue by coding the 3 bit integer value '-1' [b111]:
|
---|
121 |
|
---|
122 | \begin{Verbatim}[commandchars=\\\{\}]
|
---|
123 | |
|
---|
124 | V
|
---|
125 |
|
---|
126 | 7 6 5 4 3 2 1 0
|
---|
127 | byte 0 [0 1 1 1 1 1 0 0] <-
|
---|
128 | byte 1 [ ]
|
---|
129 | byte 2 [ ]
|
---|
130 | byte 3 [ ]
|
---|
131 | ...
|
---|
132 | byte n [ ] bytestream length == 1 byte
|
---|
133 | \end{Verbatim}
|
---|
134 |
|
---|
135 |
|
---|
136 | Continue by coding the 7 bit integer value '17' [b0010001]:
|
---|
137 |
|
---|
138 | \begin{Verbatim}[commandchars=\\\{\}]
|
---|
139 | |
|
---|
140 | V
|
---|
141 |
|
---|
142 | 7 6 5 4 3 2 1 0
|
---|
143 | byte 0 [1 1 1 1 1 1 0 0]
|
---|
144 | byte 1 [0 0 0 0 1 0 0 0] <-
|
---|
145 | byte 2 [ ]
|
---|
146 | byte 3 [ ]
|
---|
147 | ...
|
---|
148 | byte n [ ] bytestream length == 2 bytes
|
---|
149 | bit cursor == 6
|
---|
150 | \end{Verbatim}
|
---|
151 |
|
---|
152 |
|
---|
153 | Continue by coding the 13 bit integer value '6969' [b110 11001110 01]:
|
---|
154 |
|
---|
155 | \begin{Verbatim}[commandchars=\\\{\}]
|
---|
156 | |
|
---|
157 | V
|
---|
158 |
|
---|
159 | 7 6 5 4 3 2 1 0
|
---|
160 | byte 0 [1 1 1 1 1 1 0 0]
|
---|
161 | byte 1 [0 1 0 0 1 0 0 0]
|
---|
162 | byte 2 [1 1 0 0 1 1 1 0]
|
---|
163 | byte 3 [0 0 0 0 0 1 1 0] <-
|
---|
164 | ...
|
---|
165 | byte n [ ] bytestream length == 4 bytes
|
---|
166 |
|
---|
167 | \end{Verbatim}
|
---|
168 |
|
---|
169 |
|
---|
170 |
|
---|
171 |
|
---|
172 | \subsubsection{decoding example}
|
---|
173 |
|
---|
174 | Reading from the beginning of the bytestream encoded in the above example:
|
---|
175 |
|
---|
176 | \begin{Verbatim}[commandchars=\\\{\}]
|
---|
177 | |
|
---|
178 | V
|
---|
179 |
|
---|
180 | 7 6 5 4 3 2 1 0
|
---|
181 | byte 0 [1 1 1 1 1 1 0 0] <-
|
---|
182 | byte 1 [0 1 0 0 1 0 0 0]
|
---|
183 | byte 2 [1 1 0 0 1 1 1 0]
|
---|
184 | byte 3 [0 0 0 0 0 1 1 0] bytestream length == 4 bytes
|
---|
185 |
|
---|
186 | \end{Verbatim}
|
---|
187 |
|
---|
188 |
|
---|
189 | We read two, two-bit integer fields, resulting in the returned numbers
|
---|
190 | 'b00' and 'b11'. Two things are worth noting here:
|
---|
191 |
|
---|
192 | \begin{itemize}
|
---|
193 | \item Although these four bits were originally written as a single
|
---|
194 | four-bit integer, reading some other combination of bit-widths from the
|
---|
195 | bitstream is well defined. There are no artificial alignment
|
---|
196 | boundaries maintained in the bitstream.
|
---|
197 |
|
---|
198 | \item The second value is the
|
---|
199 | two-bit-wide integer 'b11'. This value may be interpreted either as
|
---|
200 | the unsigned value '3', or the signed value '-1'. Signedness is
|
---|
201 | dependent on decode context.
|
---|
202 | \end{itemize}
|
---|
203 |
|
---|
204 |
|
---|
205 |
|
---|
206 |
|
---|
207 | \subsubsection{end-of-packet alignment}
|
---|
208 |
|
---|
209 | The typical use of bitpacking is to produce many independent
|
---|
210 | byte-aligned packets which are embedded into a larger byte-aligned
|
---|
211 | container structure, such as an Ogg transport bitstream. Externally,
|
---|
212 | each bytestream (encoded bitstream) must begin and end on a byte
|
---|
213 | boundary. Often, the encoded bitstream is not an integer number of
|
---|
214 | bytes, and so there is unused (uncoded) space in the last byte of a
|
---|
215 | packet.
|
---|
216 |
|
---|
217 | Unused space in the last byte of a bytestream is always zeroed during
|
---|
218 | the coding process. Thus, should this unused space be read, it will
|
---|
219 | return binary zeroes.
|
---|
220 |
|
---|
221 | Attempting to read past the end of an encoded packet results in an
|
---|
222 | 'end-of-packet' condition. End-of-packet is not to be considered an
|
---|
223 | error; it is merely a state indicating that there is insufficient
|
---|
224 | remaining data to fulfill the desired read size. Vorbis uses truncated
|
---|
225 | packets as a normal mode of operation, and as such, decoders must
|
---|
226 | handle reading past the end of a packet as a typical mode of
|
---|
227 | operation. Any further read operations after an 'end-of-packet'
|
---|
228 | condition shall also return 'end-of-packet'.
|
---|
229 |
|
---|
230 |
|
---|
231 |
|
---|
232 | \subsubsection{reading zero bits}
|
---|
233 |
|
---|
234 | Reading a zero-bit-wide integer returns the value '0' and does not
|
---|
235 | increment the stream cursor. Reading to the end of the packet (but
|
---|
236 | not past, such that an 'end-of-packet' condition has not triggered)
|
---|
237 | and then reading a zero bit integer shall succeed, returning 0, and
|
---|
238 | not trigger an end-of-packet condition. Reading a zero-bit-wide
|
---|
239 | integer after a previous read sets 'end-of-packet' shall also fail
|
---|
240 | with 'end-of-packet'.
|
---|
241 |
|
---|
242 |
|
---|
243 |
|
---|
244 |
|
---|
245 |
|
---|
246 |
|
---|