% -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
%!TEX root = Vorbis_I_spec.tex
\section{Floor type 0 setup and decode} \label{vorbis:spec:floor0}

\subsection{Overview}

Vorbis floor type zero uses Line Spectral Pair (LSP, also known as
Line Spectral Frequency or LSF) representation to encode a
smooth spectral envelope curve as the frequency response of the LSP
filter. This representation is equivalent to a traditional all-pole
infinite impulse response filter as would be used in linear predictive
coding; LSP representation may be converted to LPC representation and
vice versa.


\subsection{Floor 0 format}

Floor zero configuration consists of six integer fields and a list of
VQ codebooks for use in coding/decoding the LSP filter coefficient
values used by each frame.

\subsubsection{header decode}

Configuration information for instances of floor zero is decoded from the
codec setup header (third packet). Configuration decode proceeds as
follows:

\begin{Verbatim}[commandchars=\\\{\}]
  1) [floor0\_order] = read an unsigned integer of 8 bits
  2) [floor0\_rate] = read an unsigned integer of 16 bits
  3) [floor0\_bark\_map\_size] = read an unsigned integer of 16 bits
  4) [floor0\_amplitude\_bits] = read an unsigned integer of 6 bits
  5) [floor0\_amplitude\_offset] = read an unsigned integer of 8 bits
  6) [floor0\_number\_of\_books] = read an unsigned integer of 4 bits and add 1
  7) array [floor0\_book\_list] = read a list of [floor0\_number\_of\_books] unsigned integers of 8 bits each
\end{Verbatim}

An end-of-packet condition during any of these bitstream reads renders
this stream undecodable. In addition, any element of the array
\varname{[floor0\_book\_list]} that is greater than the maximum codebook
number for this bitstream is an error condition that also renders the
stream undecodable.
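
As an illustration only, the header reads above can be sketched in Python.
The \texttt{BitReader} and \texttt{pack\_bits} helpers, the dictionary keys and
the \texttt{codebook\_count} parameter are assumptions of this sketch and not
part of the specification. The reader assumes the LSb-first bit order of the
Vorbis bitpacking convention and signals end-of-packet with \texttt{EOFError}.

```python
class BitReader:
    """Minimal LSb-first bit reader (hypothetical helper for this sketch)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # absolute bit position within the packet

    def read(self, nbits: int) -> int:
        if self.pos + nbits > 8 * len(self.data):
            raise EOFError("end of packet")
        value = 0
        for k in range(nbits):
            byte = self.data[self.pos >> 3]
            value |= ((byte >> (self.pos & 7)) & 1) << k
            self.pos += 1
        return value


def pack_bits(fields):
    """Pack (value, nbits) pairs LSb-first; only used to build sample input."""
    acc = 0
    pos = 0
    for value, nbits in fields:
        acc |= (value & ((1 << nbits) - 1)) << pos
        pos += nbits
    return acc.to_bytes((pos + 7) // 8, "little")


def read_floor0_header(reader, codebook_count):
    """Read the seven floor 0 setup fields in the order listed above.

    codebook_count is the number of codebooks in the stream (assumed to be
    known from earlier setup decode).
    """
    header = {
        "order": reader.read(8),
        "rate": reader.read(16),
        "bark_map_size": reader.read(16),
        "amplitude_bits": reader.read(6),
        "amplitude_offset": reader.read(8),
        "number_of_books": reader.read(4) + 1,
    }
    header["book_list"] = [reader.read(8)
                           for _ in range(header["number_of_books"])]
    # A book number above the maximum codebook number is an error condition.
    if any(book >= codebook_count for book in header["book_list"]):
        raise ValueError("floor0_book_list refers to a nonexistent codebook")
    return header
```

In this sketch both failure modes surface as exceptions (\texttt{EOFError} for
end-of-packet, \texttt{ValueError} for a bad book number); a caller would treat
either one as an undecodable stream.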

\subsubsection{packet decode} \label{vorbis:spec:floor0-decode}

Extracting a floor0 curve from an audio packet consists of first
decoding the curve amplitude and \varname{[floor0\_order]} LSP
coefficient values from the bitstream, and then computing the floor
curve, which is defined as the frequency response of the decoded LSP
filter.

Packet decode proceeds as follows:
\begin{Verbatim}[commandchars=\\\{\}]
  1) [amplitude] = read an unsigned integer of [floor0\_amplitude\_bits] bits
  2) if ( [amplitude] is greater than zero ) \{
       3) [coefficients] is an empty, zero length vector
       4) [booknumber] = read an unsigned integer of \link{vorbis:spec:ilog}{ilog}( [floor0\_number\_of\_books] ) bits
       5) if ( [booknumber] is greater than the highest numbered decode codebook ) then packet is undecodable
       6) [last] = zero
       7) vector [temp\_vector] = read vector from bitstream using codebook number [floor0\_book\_list] element [booknumber] in VQ context
       8) add the scalar value [last] to each scalar in vector [temp\_vector]
       9) [last] = the value of the last scalar in vector [temp\_vector]
      10) concatenate [temp\_vector] onto the end of the [coefficients] vector
      11) if ( length of vector [coefficients] is less than [floor0\_order] ) continue at step 7

     \}

 12) done.

\end{Verbatim}

Take note of the following properties of decode:
\begin{itemize}
 \item An \varname{[amplitude]} value of zero must result in a return code that indicates this channel is unused in this frame (the output of the channel will be all-zeroes in synthesis). Several later stages of decode don't occur for an unused channel.
 \item An end-of-packet condition during decode should be considered a
nominal occurrence; if end-of-packet is reached during any read
operation above, floor decode is to return 'unused' status as if the
\varname{[amplitude]} value had read zero at the beginning of decode.

 \item The book number used for decode
can, in fact, be stored in the bitstream in \link{vorbis:spec:ilog}{ilog}( \varname{[floor0\_number\_of\_books]} -
1 ) bits. Nevertheless, the above specification is correct and values
greater than the maximum possible book value are reserved.

 \item The number of scalars read into the vector \varname{[coefficients]}
may be greater than \varname{[floor0\_order]}, the number actually
required for curve computation. For example, if the VQ codebook used
for the floor currently being decoded has a
\varname{[codebook\_dimensions]} value of three and
\varname{[floor0\_order]} is ten, the only way to fill all the needed
scalars in \varname{[coefficients]} is to read a total of twelve
scalars as four vectors of three scalars each. This is not an error
condition, and care must be taken not to allow a buffer overflow in
decode. The extra values are not used and may be ignored or discarded.
\end{itemize}
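
The packet decode steps and the properties listed above can be sketched in
Python as follows. This is a non-authoritative illustration: the callables
\texttt{read\_bits} and \texttt{read\_vq\_vector}, the \texttt{header} dictionary
keys, and the use of \texttt{None} for the 'unused' status are all assumptions
of the sketch. It carries \varname{[last]} over from one decoded vector to the
next, and it reads the range check on \varname{[booknumber]} as "must index
into \varname{[floor0\_book\_list]}", which is one interpretation of the
reserved-value rule.

```python
def ilog(x: int) -> int:
    """Position of the highest set bit of x; zero for x <= 0."""
    return x.bit_length() if x > 0 else 0


def decode_floor0_packet(read_bits, read_vq_vector, header):
    """Return (amplitude, coefficients), or None for 'unused'.

    read_bits(n) returns an unsigned n-bit integer and raises EOFError at
    end-of-packet; read_vq_vector(book) returns one decoded VQ vector as a
    list of floats. Both are hypothetical callables supplied by the caller.
    """
    try:
        amplitude = read_bits(header["amplitude_bits"])
        if amplitude == 0:
            return None  # channel unused in this frame
        booknumber = read_bits(ilog(header["number_of_books"]))
        if booknumber >= header["number_of_books"]:
            # Reserved value (assumed reading of the range check).
            raise ValueError("reserved book number: packet undecodable")
        book = header["book_list"][booknumber]
        coefficients = []
        last = 0.0
        while len(coefficients) < header["order"]:
            temp_vector = [v + last for v in read_vq_vector(book)]
            last = temp_vector[-1]
            coefficients.extend(temp_vector)
        # coefficients may hold more than 'order' scalars; extras are unused.
        return amplitude, coefficients
    except EOFError:
        return None  # end-of-packet is nominal: report 'unused'
```

Because the list grows dynamically, reading twelve scalars for an order of ten
cannot overflow here; an implementation with a fixed buffer would need to
reserve room for up to one extra vector.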

\subsubsection{curve computation} \label{vorbis:spec:floor0-synth}

Given an \varname{[amplitude]} integer and \varname{[coefficients]}
vector from packet decode as well as the \varname{[floor0\_order]},
\varname{[floor0\_rate]}, \varname{[floor0\_bark\_map\_size]}, \varname{[floor0\_amplitude\_bits]} and
\varname{[floor0\_amplitude\_offset]} values from floor setup, and an output
vector size \varname{[n]} specified by the decode process, we compute a
floor output vector.

If the value \varname{[amplitude]} is zero, the return value is a
length \varname{[n]} vector with all-zero scalars. Otherwise, begin by
assuming the following definitions for the given vector to be
synthesized:

\begin{displaymath}
  \mathrm{map}_i = \left\{
    \begin{array}{ll}
       \min (
         \mathtt{floor0\texttt{\_}bark\texttt{\_}map\texttt{\_}size} - 1,
         \mathrm{foobar}
       ) & \textrm{for } i \in [0,n-1] \\
       -1 & \textrm{for } i = n
     \end{array}
   \right.
\end{displaymath}

where

\begin{displaymath}
\mathrm{foobar} =
  \left\lfloor
    \mathrm{bark}\left(\frac{\mathtt{floor0\texttt{\_}rate} \cdot i}{2n}\right) \cdot \frac{\mathtt{floor0\texttt{\_}bark\texttt{\_}map\texttt{\_}size}} {\mathrm{bark}(.5 \cdot \mathtt{floor0\texttt{\_}rate})}
  \right\rfloor
\end{displaymath}

and

\begin{displaymath}
  \mathrm{bark}(x) = 13.1 \arctan (.00074x) + 2.24 \arctan (.0000000185x^2) + .0001x
\end{displaymath}
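
A minimal Python sketch of these definitions is given below, purely as an
illustration. The function names \texttt{bark} and \texttt{floor0\_map} and their
parameter names are assumptions of the sketch; the arithmetic follows the
three equations above term by term, using double-precision floating point.

```python
import math


def bark(x: float) -> float:
    """Bark-scale value of a frequency x, per the equation above."""
    return (13.1 * math.atan(0.00074 * x)
            + 2.24 * math.atan(0.0000000185 * x * x)
            + 0.0001 * x)


def floor0_map(n: int, rate: int, bark_map_size: int) -> list:
    """Return the n+1 element map vector; the final element is -1."""
    scale = bark_map_size / bark(0.5 * rate)
    result = []
    for i in range(n):
        foobar = math.floor(bark(rate * i / (2 * n)) * scale)
        result.append(min(bark_map_size - 1, foobar))
    result.append(-1)  # sentinel for i = n
    return result
```

The trailing $-1$ never matches a real map value, so a loop that compares
consecutive map elements always stops at the end of the vector.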

The map vector defined above is used to synthesize the LSP curve on a Bark-scale frequency
axis, then map the result to a linear-scale frequency axis.
Similarly, the calculation below synthesizes the output LSP curve \varname{[output]} on a log
(dB) amplitude scale, mapping it to linear amplitude in the last step:

\begin{enumerate}
 \item \varname{[i]} = 0
 \item \varname{[$\omega$]} = $\pi$ * map element \varname{[i]} / \varname{[floor0\_bark\_map\_size]}
 \item if ( \varname{[floor0\_order]} is odd ) \{
  \begin{enumerate}
   \item calculate \varname{[p]} and \varname{[q]} according to:
\begin{eqnarray*}
p & = & (1 - \cos^2\omega)\prod_{j=0}^{\frac{\mathtt{floor0\texttt{\_}order}-3}{2}} 4 (\cos([\mathtt{coefficients}]_{2j+1}) - \cos \omega)^2 \\
q & = & \frac{1}{4} \prod_{j=0}^{\frac{\mathtt{floor0\texttt{\_}order}-1}{2}} 4 (\cos([\mathtt{coefficients}]_{2j}) - \cos \omega)^2
\end{eqnarray*}

  \end{enumerate}
  \} else ( \varname{[floor0\_order]} is even ) \{
  \begin{enumerate}[resume]
   \item calculate \varname{[p]} and \varname{[q]} according to:
\begin{eqnarray*}
p & = & \frac{(1 - \cos\omega)}{2} \prod_{j=0}^{\frac{\mathtt{floor0\texttt{\_}order}-2}{2}} 4 (\cos([\mathtt{coefficients}]_{2j+1}) - \cos \omega)^2 \\
q & = & \frac{(1 + \cos\omega)}{2} \prod_{j=0}^{\frac{\mathtt{floor0\texttt{\_}order}-2}{2}} 4 (\cos([\mathtt{coefficients}]_{2j}) - \cos \omega)^2
\end{eqnarray*}

  \end{enumerate}
  \}

 \item calculate \varname{[linear\_floor\_value]} according to:
\begin{displaymath}
\exp \left( .11512925 \left(\frac{\mathtt{amplitude} \cdot \mathtt{floor0\texttt{\_}amplitude\texttt{\_}offset}}{(2^{\mathtt{floor0\texttt{\_}amplitude\texttt{\_}bits}}-1)\sqrt{p+q}}
              - \mathtt{floor0\texttt{\_}amplitude\texttt{\_}offset} \right) \right)
\end{displaymath}

 \item \varname{[iteration\_condition]} = map element \varname{[i]}
 \item \varname{[output]} element \varname{[i]} = \varname{[linear\_floor\_value]}
 \item increment \varname{[i]}
 \item if ( map element \varname{[i]} is equal to \varname{[iteration\_condition]} ) continue at step 5
 \item if ( \varname{[i]} is less than \varname{[n]} ) continue at step 2
 \item done
\end{enumerate}
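
The enumerated steps above can be sketched in Python as shown below. This is
an illustration under stated assumptions, not a reference implementation: the
function and parameter names are hypothetical, \texttt{map\_} is assumed to be
the $n+1$ element map vector whose last element is $-1$, and no guard is
included for the degenerate case $p + q = 0$.

```python
import math


def floor0_curve(amplitude, coefficients, map_, order, bark_map_size,
                 amplitude_bits, amplitude_offset, n):
    """Compute the linear-amplitude floor vector of length n."""
    if amplitude == 0:
        return [0.0] * n
    # Only the first 'order' coefficients are used; extras are ignored.
    cos_c = [math.cos(c) for c in coefficients[:order]]
    max_amplitude = (1 << amplitude_bits) - 1
    output = [0.0] * n
    i = 0
    while i < n:
        omega = math.pi * map_[i] / bark_map_size
        cos_w = math.cos(omega)
        if order % 2 == 1:
            p = 1.0 - cos_w * cos_w
            for j in range((order - 3) // 2 + 1):
                p *= 4.0 * (cos_c[2 * j + 1] - cos_w) ** 2
            q = 0.25
            for j in range((order - 1) // 2 + 1):
                q *= 4.0 * (cos_c[2 * j] - cos_w) ** 2
        else:
            p = (1.0 - cos_w) / 2.0
            q = (1.0 + cos_w) / 2.0
            for j in range((order - 2) // 2 + 1):
                p *= 4.0 * (cos_c[2 * j + 1] - cos_w) ** 2
                q *= 4.0 * (cos_c[2 * j] - cos_w) ** 2
        linear_floor_value = math.exp(0.11512925 * (
            amplitude * amplitude_offset / (max_amplitude * math.sqrt(p + q))
            - amplitude_offset))
        # Reuse the value for every consecutive i sharing this map element.
        iteration_condition = map_[i]
        while True:
            output[i] = linear_floor_value
            i += 1
            if map_[i] != iteration_condition:
                break
    return output
```

The inner loop mirrors the "continue at step 5" branch: consecutive output
bins that share a map element reuse one computed value, and the $-1$ sentinel
at index $n$ ends the final run.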

\paragraph{Errata 20150227: Bark scale computation}

Due to a typo when typesetting this version of the specification from the original HTML document, the Bark scale computation previously erroneously read:

\begin{displaymath}
\hbox{\sout{$
\mathrm{bark}(x) = 13.1 \arctan (.00074x) + 2.24 \arctan (.0000000185x^2 + .0001x)
$}}
\end{displaymath}

Note that the last parenthesis is misplaced. This document now uses the correct equation as it appeared in the original HTML spec document:

\begin{displaymath}
\mathrm{bark}(x) = 13.1 \arctan (.00074x) + 2.24 \arctan (.0000000185x^2) + .0001x
\end{displaymath}
