1 | % -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
|
---|
2 | %!TEX root = Vorbis_I_spec.tex
|
---|
3 | \section{Floor type 1 setup and decode} \label{vorbis:spec:floor1}
|
---|
4 |
|
---|
5 | \subsection{Overview}
|
---|
6 |
|
---|
7 | Vorbis floor type one uses a piecewise straight-line representation to
|
---|
8 | encode a spectral envelope curve. The representation plots this curve
|
---|
9 | mechanically on a linear frequency axis and a logarithmic (dB)
|
---|
10 | amplitude axis. The integer plotting algorithm used is similar to
|
---|
11 | Bresenham's algorithm.
|
---|
12 |
|
---|
13 |
|
---|
14 |
|
---|
15 | \subsection{Floor 1 format}
|
---|
16 |
|
---|
17 | \subsubsection{model}
|
---|
18 |
|
---|
19 | Floor type one represents a spectral curve as a series of
|
---|
20 | line segments. Synthesis constructs a floor curve using iterative
|
---|
21 | prediction in a process roughly equivalent to the following simplified
|
---|
22 | description:
|
---|
23 |
|
---|
24 | \begin{itemize}
|
---|
25 | \item the first line segment (base case) is a logical line spanning
|
---|
26 | from x_0,y_0 to x_1,y_1 where in the base case x_0=0 and x_1=[n], the
|
---|
27 | full range of the spectral floor to be computed.
|
---|
28 |
|
---|
29 | \item the induction step chooses a point x_new within an existing
|
---|
30 | logical line segment and produces a y_new value at that point computed
|
---|
31 | from the existing line's y value at x_new (as plotted by the line) and
|
---|
32 | a difference value decoded from the bitstream packet.
|
---|
33 |
|
---|
34 | \item floor computation produces two new line segments, one running from
|
---|
35 | x_0,y_0 to x_new,y_new and from x_new,y_new to x_1,y_1. This step is
|
---|
36 | performed logically even if y_new represents no change to the
|
---|
37 | amplitude value at x_new so that later refinement is additionally
|
---|
38 | bounded at x_new.
|
---|
39 |
|
---|
40 | \item the induction step repeats, using a list of x values specified in
|
---|
41 | the codec setup header at floor 1 initialization time. Computation
|
---|
42 | is completed at the end of the x value list.
|
---|
43 |
|
---|
44 | \end{itemize}
|
---|
45 |
|
---|
46 |
|
---|
47 | Consider the following example, with values chosen for ease of
|
---|
48 | understanding rather than representing typical configuration:
|
---|
49 |
|
---|
50 | For the below example, we assume a floor setup with an [n] of 128.
|
---|
51 | The list of selected X values in increasing order is
|
---|
52 | 0,16,32,48,64,80,96,112 and 128. In list order, the values interleave
|
---|
53 | as 0, 128, 64, 32, 96, 16, 48, 80 and 112. The corresponding
|
---|
54 | list-order Y values as decoded from an example packet are 110, 20, -5,
|
---|
55 | -45, 0, -25, -10, 30 and -10. We compute the floor in the following
|
---|
56 | way, beginning with the first line:
|
---|
57 |
|
---|
58 | \begin{center}
|
---|
59 | \includegraphics[width=8cm]{floor1-1}
|
---|
60 | \captionof{figure}{graph of example floor}
|
---|
61 | \end{center}
|
---|
62 |
|
---|
63 | We now draw new logical lines to reflect the correction to new_Y, and
|
---|
64 | iterate for X positions 32 and 96:
|
---|
65 |
|
---|
66 | \begin{center}
|
---|
67 | \includegraphics[width=8cm]{floor1-2}
|
---|
68 | \captionof{figure}{graph of example floor}
|
---|
69 | \end{center}
|
---|
70 |
|
---|
71 | Although the new Y value at X position 96 is unchanged, it is still
|
---|
72 | used later as an endpoint for further refinement. From here on, the
|
---|
73 | pattern should be clear; we complete the floor computation as follows:
|
---|
74 |
|
---|
75 | \begin{center}
|
---|
76 | \includegraphics[width=8cm]{floor1-3}
|
---|
77 | \captionof{figure}{graph of example floor}
|
---|
78 | \end{center}
|
---|
79 |
|
---|
80 | \begin{center}
|
---|
81 | \includegraphics[width=8cm]{floor1-4}
|
---|
82 | \captionof{figure}{graph of example floor}
|
---|
83 | \end{center}
|
---|
84 |
|
---|
85 | A more efficient algorithm with carefully defined integer rounding
|
---|
86 | behavior is used for actual decode, as described later. The actual
|
---|
87 | algorithm splits Y value computation and line plotting into two steps
|
---|
88 | with modifications to the above algorithm to eliminate noise
|
---|
89 | accumulation through integer roundoff/truncation.
|
---|
90 |
|
---|
91 |
|
---|
92 |
|
---|
93 | \subsubsection{header decode}
|
---|
94 |
|
---|
95 | A list of floor X values is stored in the packet header in interleaved
|
---|
96 | format (used in list order during packet decode and synthesis). This
|
---|
97 | list is split into partitions, and each partition is assigned to a
|
---|
98 | partition class. X positions 0 and [n] are implicit and do not belong
|
---|
99 | to an explicit partition or partition class.
|
---|
100 |
|
---|
101 | A partition class consists of a representation vector width (the
|
---|
102 | number of Y values which the partition class encodes at once), a
|
---|
103 | 'subclass' value representing the number of alternate entropy books
|
---|
104 | the partition class may use in representing Y values, the list of
|
---|
105 | [subclass] books and a master book used to encode which alternate
|
---|
106 | books were chosen for representation in a given packet. The
|
---|
107 | master/subclass mechanism is meant to be used as a flexible
|
---|
108 | representation cascade while still using codebooks only in a scalar
|
---|
109 | context.
|
---|
110 |
|
---|
111 | \begin{Verbatim}[commandchars=\\\{\}]
|
---|
112 |
|
---|
113 | 1) [floor1\_partitions] = read 5 bits as unsigned integer
|
---|
114 | 2) [maximum\_class] = -1
|
---|
115 | 3) iterate [i] over the range 0 ... [floor1\_partitions]-1 \{
|
---|
116 |
|
---|
117 | 4) vector [floor1\_partition\_class\_list] element [i] = read 4 bits as unsigned integer
|
---|
118 |
|
---|
119 | \}
|
---|
120 |
|
---|
121 | 5) [maximum\_class] = largest integer scalar value in vector [floor1\_partition\_class\_list]
|
---|
122 | 6) iterate [i] over the range 0 ... [maximum\_class] \{
|
---|
123 |
|
---|
124 | 7) vector [floor1\_class\_dimensions] element [i] = read 3 bits as unsigned integer and add 1
|
---|
125 | 8) vector [floor1\_class\_subclasses] element [i] = read 2 bits as unsigned integer
|
---|
126 | 9) if ( vector [floor1\_class\_subclasses] element [i] is nonzero ) \{
|
---|
127 |
|
---|
128 | 10) vector [floor1\_class\_masterbooks] element [i] = read 8 bits as unsigned integer
|
---|
129 |
|
---|
130 | \}
|
---|
131 |
|
---|
132 | 11) iterate [j] over the range 0 ... (2 exponent [floor1\_class\_subclasses] element [i]) - 1 \{
|
---|
133 |
|
---|
134 | 12) array [floor1\_subclass\_books] element [i],[j] =
|
---|
135 | read 8 bits as unsigned integer and subtract one
|
---|
136 | \}
|
---|
137 | \}
|
---|
138 |
|
---|
139 | 13) [floor1\_multiplier] = read 2 bits as unsigned integer and add one
|
---|
140 | 14) [rangebits] = read 4 bits as unsigned integer
|
---|
141 | 15) vector [floor1\_X\_list] element [0] = 0
|
---|
142 | 16) vector [floor1\_X\_list] element [1] = 2 exponent [rangebits];
|
---|
143 | 17) [floor1\_values] = 2
|
---|
144 | 18) iterate [i] over the range 0 ... [floor1\_partitions]-1 \{
|
---|
145 |
|
---|
146 | 19) [current\_class\_number] = vector [floor1\_partition\_class\_list] element [i]
|
---|
147 | 20) iterate [j] over the range 0 ... ([floor1\_class\_dimensions] element [current\_class\_number])-1 \{
|
---|
148 | 21) vector [floor1\_X\_list] element ([floor1\_values]) =
|
---|
149 | read [rangebits] bits as unsigned integer
|
---|
150 | 22) increment [floor1\_values] by one
|
---|
151 | \}
|
---|
152 | \}
|
---|
153 |
|
---|
154 | 23) done
|
---|
155 | \end{Verbatim}
|
---|
156 |
|
---|
157 | An end-of-packet condition while reading any aspect of a floor 1
|
---|
158 | configuration during setup renders a stream undecodable. In addition,
|
---|
159 | a \varname{[floor1\_class\_masterbooks]} or
|
---|
160 | \varname{[floor1\_subclass\_books]} scalar element greater than the
|
---|
161 | highest numbered codebook configured in this stream is an error
|
---|
162 | condition that renders the stream undecodable. Vector
|
---|
163 | [floor1\_x\_list] is limited to a maximum length of 65 elements; a
|
---|
164 | setup indicating more than 65 total elements (including elements 0 and
|
---|
165 | 1 set prior to the read loop) renders the stream undecodable. All
|
---|
166 | vector [floor1\_x\_list] element values must be unique within the
|
---|
167 | vector; a non-unique value renders the stream undecodable.
|
---|
168 |
|
---|
169 | \subsubsection{packet decode} \label{vorbis:spec:floor1-decode}
|
---|
170 |
|
---|
171 | Packet decode begins by checking the \varname{[nonzero]} flag:
|
---|
172 |
|
---|
173 | \begin{Verbatim}[commandchars=\\\{\}]
|
---|
174 | 1) [nonzero] = read 1 bit as boolean
|
---|
175 | \end{Verbatim}
|
---|
176 |
|
---|
177 | If \varname{[nonzero]} is unset, that indicates this channel contained
|
---|
178 | no audio energy in this frame. Decode immediately returns a status
|
---|
179 | indicating this floor curve (and thus this channel) is unused this
|
---|
180 | frame. (A return status of 'unused' is different from decoding a
|
---|
181 | floor that has all points set to minimum representation amplitude,
|
---|
182 | which happens to be approximately -140dB).
|
---|
183 |
|
---|
184 |
|
---|
185 | Assuming \varname{[nonzero]} is set, decode proceeds as follows:
|
---|
186 |
|
---|
187 | \begin{Verbatim}[commandchars=\\\{\}]
|
---|
188 | 1) [range] = vector \{ 256, 128, 86, 64 \} element ([floor1\_multiplier]-1)
|
---|
189 | 2) vector [floor1\_Y] element [0] = read \link{vorbis:spec:ilog}{ilog}([range]-1) bits as unsigned integer
|
---|
190 | 3) vector [floor1\_Y] element [1] = read \link{vorbis:spec:ilog}{ilog}([range]-1) bits as unsigned integer
|
---|
191 | 4) [offset] = 2;
|
---|
192 | 5) iterate [i] over the range 0 ... [floor1\_partitions]-1 \{
|
---|
193 |
|
---|
194 | 6) [class] = vector [floor1\_partition\_class] element [i]
|
---|
195 | 7) [cdim] = vector [floor1\_class\_dimensions] element [class]
|
---|
196 | 8) [cbits] = vector [floor1\_class\_subclasses] element [class]
|
---|
197 | 9) [csub] = (2 exponent [cbits])-1
|
---|
198 | 10) [cval] = 0
|
---|
199 | 11) if ( [cbits] is greater than zero ) \{
|
---|
200 |
|
---|
201 | 12) [cval] = read from packet using codebook number
|
---|
202 | (vector [floor1\_class\_masterbooks] element [class]) in scalar context
|
---|
203 | \}
|
---|
204 |
|
---|
205 | 13) iterate [j] over the range 0 ... [cdim]-1 \{
|
---|
206 |
|
---|
207 | 14) [book] = array [floor1\_subclass\_books] element [class],([cval] bitwise AND [csub])
|
---|
208 | 15) [cval] = [cval] right shifted [cbits] bits
|
---|
209 | 16) if ( [book] is not less than zero ) \{
|
---|
210 |
|
---|
211 | 17) vector [floor1\_Y] element ([j]+[offset]) = read from packet using codebook
|
---|
212 | [book] in scalar context
|
---|
213 |
|
---|
214 | \} else [book] is less than zero \{
|
---|
215 |
|
---|
216 | 18) vector [floor1\_Y] element ([j]+[offset]) = 0
|
---|
217 |
|
---|
218 | \}
|
---|
219 | \}
|
---|
220 |
|
---|
221 | 19) [offset] = [offset] + [cdim]
|
---|
222 |
|
---|
223 | \}
|
---|
224 |
|
---|
225 | 20) done
|
---|
226 | \end{Verbatim}
|
---|
227 |
|
---|
228 | An end-of-packet condition during curve decode should be considered a
|
---|
229 | nominal occurrence; if end-of-packet is reached during any read
|
---|
230 | operation above, floor decode is to return 'unused' status as if the
|
---|
231 | \varname{[nonzero]} flag had been unset at the beginning of decode.
|
---|
232 |
|
---|
233 |
|
---|
234 | Vector \varname{[floor1\_Y]} contains the values from packet decode
|
---|
235 | needed for floor 1 synthesis.
|
---|
236 |
|
---|
237 |
|
---|
238 |
|
---|
239 | \subsubsection{curve computation} \label{vorbis:spec:floor1-synth}
|
---|
240 |
|
---|
241 | Curve computation is split into two logical steps; the first step
|
---|
242 | derives final Y amplitude values from the encoded, wrapped difference
|
---|
243 | values taken from the bitstream. The second step plots the curve
|
---|
244 | lines. Also, although zero-difference values are used in the
|
---|
245 | iterative prediction to find final Y values, these points are
|
---|
246 | conditionally skipped during final line computation in step two.
|
---|
247 | Skipping zero-difference values allows a smoother line fit.
|
---|
248 |
|
---|
249 | Although some aspects of the below algorithm look like inconsequential
|
---|
250 | optimizations, implementors are warned to follow the details closely.
|
---|
251 | Deviation from implementing a strictly equivalent algorithm can result
|
---|
252 | in serious decoding errors.
|
---|
253 |
|
---|
254 | {\em Additional note:} Although \varname{[floor1\_final\_Y]} values in
|
---|
255 | the prediction loop and at the end of step 1 are inherently limited by
|
---|
256 | the prediction algorithm to [0, \varname{[range]}), it is possible to
|
---|
257 | abuse the setup and codebook machinery to produce negative or
|
---|
258 | over-range results. We suggest that decoder implementations guard
|
---|
259 | the values in vector \varname{[floor1\_final\_Y]} by clamping each
|
---|
260 | element to [0, \varname{[range]}) after step 1. Variants of this
|
---|
261 | suggestion are acceptable as valid floor1 setups cannot produce
|
---|
262 | out of range values.
|
---|
263 |
|
---|
264 | \begin{description}
|
---|
265 | \item[step 1: amplitude value synthesis]
|
---|
266 |
|
---|
267 | Unwrap the always-positive-or-zero values read from the packet into
|
---|
268 | +/- difference values, then apply to line prediction.
|
---|
269 |
|
---|
270 | \begin{Verbatim}[commandchars=\\\{\}]
|
---|
271 | 1) [range] = vector \{ 256, 128, 86, 64 \} element ([floor1\_multiplier]-1)
|
---|
272 | 2) vector [floor1\_step2\_flag] element [0] = set
|
---|
273 | 3) vector [floor1\_step2\_flag] element [1] = set
|
---|
274 | 4) vector [floor1\_final\_Y] element [0] = vector [floor1\_Y] element [0]
|
---|
275 | 5) vector [floor1\_final\_Y] element [1] = vector [floor1\_Y] element [1]
|
---|
276 | 6) iterate [i] over the range 2 ... [floor1\_values]-1 \{
|
---|
277 |
|
---|
278 | 7) [low\_neighbor\_offset] = \link{vorbis:spec:low:neighbor}{low\_neighbor}([floor1\_X\_list],[i])
|
---|
279 | 8) [high\_neighbor\_offset] = \link{vorbis:spec:high:neighbor}{high\_neighbor}([floor1\_X\_list],[i])
|
---|
280 |
|
---|
281 | 9) [predicted] = \link{vorbis:spec:render:point}{render\_point}( vector [floor1\_X\_list] element [low\_neighbor\_offset],
|
---|
282 | vector [floor1\_final\_Y] element [low\_neighbor\_offset],
|
---|
283 | vector [floor1\_X\_list] element [high\_neighbor\_offset],
|
---|
284 | vector [floor1\_final\_Y] element [high\_neighbor\_offset],
|
---|
285 | vector [floor1\_X\_list] element [i] )
|
---|
286 |
|
---|
287 | 10) [val] = vector [floor1\_Y] element [i]
|
---|
288 | 11) [highroom] = [range] - [predicted]
|
---|
289 | 12) [lowroom] = [predicted]
|
---|
290 | 13) if ( [highroom] is less than [lowroom] ) \{
|
---|
291 |
|
---|
292 | 14) [room] = [highroom] * 2
|
---|
293 |
|
---|
294 | \} else [highroom] is not less than [lowroom] \{
|
---|
295 |
|
---|
296 | 15) [room] = [lowroom] * 2
|
---|
297 |
|
---|
298 | \}
|
---|
299 |
|
---|
300 | 16) if ( [val] is nonzero ) \{
|
---|
301 |
|
---|
302 | 17) vector [floor1\_step2\_flag] element [low\_neighbor\_offset] = set
|
---|
303 | 18) vector [floor1\_step2\_flag] element [high\_neighbor\_offset] = set
|
---|
304 | 19) vector [floor1\_step2\_flag] element [i] = set
|
---|
305 | 20) if ( [val] is greater than or equal to [room] ) \{
|
---|
306 |
|
---|
307 | 21) if ( [highroom] is greater than [lowroom] ) \{
|
---|
308 |
|
---|
309 | 22) vector [floor1\_final\_Y] element [i] = [val] - [lowroom] + [predicted]
|
---|
310 |
|
---|
311 | \} else [highroom] is not greater than [lowroom] \{
|
---|
312 |
|
---|
313 | 23) vector [floor1\_final\_Y] element [i] = [predicted] - [val] + [highroom] - 1
|
---|
314 |
|
---|
315 | \}
|
---|
316 |
|
---|
317 | \} else [val] is less than [room] \{
|
---|
318 |
|
---|
319 | 24) if ([val] is odd) \{
|
---|
320 |
|
---|
321 | 25) vector [floor1\_final\_Y] element [i] =
|
---|
322 | [predicted] - (([val] + 1) divided by 2 using integer division)
|
---|
323 |
|
---|
324 | \} else [val] is even \{
|
---|
325 |
|
---|
326 | 26) vector [floor1\_final\_Y] element [i] =
|
---|
327 | [predicted] + ([val] / 2 using integer division)
|
---|
328 |
|
---|
329 | \}
|
---|
330 |
|
---|
331 | \}
|
---|
332 |
|
---|
333 | \} else [val] is zero \{
|
---|
334 |
|
---|
335 | 27) vector [floor1\_step2\_flag] element [i] = unset
|
---|
336 | 28) vector [floor1\_final\_Y] element [i] = [predicted]
|
---|
337 |
|
---|
338 | \}
|
---|
339 |
|
---|
340 | \}
|
---|
341 |
|
---|
342 | 29) done
|
---|
343 |
|
---|
344 | \end{Verbatim}
|
---|
345 |
|
---|
346 |
|
---|
347 |
|
---|
348 | \item[step 2: curve synthesis]
|
---|
349 |
|
---|
350 | Curve synthesis generates a return vector \varname{[floor]} of length
|
---|
351 | \varname{[n]} (where \varname{[n]} is provided by the decode process
|
---|
352 | calling to floor decode). Floor 1 curve synthesis makes use of the
|
---|
353 | \varname{[floor1\_X\_list]}, \varname{[floor1\_final\_Y]} and
|
---|
354 | \varname{[floor1\_step2\_flag]} vectors, as well as [floor1\_multiplier]
|
---|
355 | and [floor1\_values] values.
|
---|
356 |
|
---|
357 | Decode begins by sorting the scalars from vectors
|
---|
358 | \varname{[floor1\_X\_list]}, \varname{[floor1\_final\_Y]} and
|
---|
359 | \varname{[floor1\_step2\_flag]} together into new vectors
|
---|
360 | \varname{[floor1\_X\_list]'}, \varname{[floor1\_final\_Y]'} and
|
---|
361 | \varname{[floor1\_step2\_flag]'} according to ascending sort order of the
|
---|
362 | values in \varname{[floor1\_X\_list]}. That is, sort the values of
|
---|
363 | \varname{[floor1\_X\_list]} and then apply the same permutation to
|
---|
364 | elements of the other two vectors so that the X, Y and step2\_flag
|
---|
365 | values still match.
|
---|
366 |
|
---|
367 | Then compute the final curve in one pass:
|
---|
368 |
|
---|
369 | \begin{Verbatim}[commandchars=\\\{\}]
|
---|
370 | 1) [hx] = 0
|
---|
371 | 2) [lx] = 0
|
---|
372 | 3) [ly] = vector [floor1\_final\_Y]' element [0] * [floor1\_multiplier]
|
---|
373 | 4) iterate [i] over the range 1 ... [floor1\_values]-1 \{
|
---|
374 |
|
---|
375 | 5) if ( [floor1\_step2\_flag]' element [i] is set ) \{
|
---|
376 |
|
---|
377 | 6) [hy] = [floor1\_final\_Y]' element [i] * [floor1\_multiplier]
|
---|
378 | 7) [hx] = [floor1\_X\_list]' element [i]
|
---|
379 | 8) \link{vorbis:spec:render:line}{render\_line}( [lx], [ly], [hx], [hy], [floor] )
|
---|
380 | 9) [lx] = [hx]
|
---|
381 | 10) [ly] = [hy]
|
---|
382 | \}
|
---|
383 | \}
|
---|
384 |
|
---|
385 | 11) if ( [hx] is less than [n] ) \{
|
---|
386 |
|
---|
387 | 12) \link{vorbis:spec:render:line}{render\_line}( [hx], [hy], [n], [hy], [floor] )
|
---|
388 |
|
---|
389 | \}
|
---|
390 |
|
---|
391 | 13) if ( [hx] is greater than [n] ) \{
|
---|
392 |
|
---|
393 | 14) truncate vector [floor] to [n] elements
|
---|
394 |
|
---|
395 | \}
|
---|
396 |
|
---|
397 | 15) for each scalar in vector [floor], perform a lookup substitution using
|
---|
398 | the scalar value from [floor] as an offset into the vector \link{vorbis:spec:floor1:inverse:dB:table}{[floor1\_inverse\_dB\_static\_table]}
|
---|
399 |
|
---|
400 | 16) done
|
---|
401 |
|
---|
402 | \end{Verbatim}
|
---|
403 |
|
---|
404 | \end{description}
|
---|