Stereo to 5.1 Channel Converter in Max/MSP
This is a Max patch that converts stereo audio to 5.1 channels. Three different upmixing algorithms (PSD, LMS, and ADP) are implemented, and users can select any one of them and compare the results. An EQ preset is provided so that users can optimize the sound. The algorithms are described below.
The Passive Surround Decoder (PSD)
The Passive Surround Decoder (PSD) is an early passive version of the Dolby Surround Decoder. The center channel is obtained by adding the original stereo channels, whereas the surround signal is obtained by subtracting the right channel from the left channel. To maintain constant acoustic energy, the center and surround channels are lowered by 3 dB, that is, each is multiplied by 1/√2.
Center(n) = (L(n) + R(n)) / √2
Surround(n) = (L(n) - R(n)) / √2
where L(n) and R(n) represent the left and the right samples at the time index n, respectively.
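The two PSD equations can be sketched in a few lines of NumPy. This is only an illustration of the formulas above, not the Max patch itself; the function name `passive_surround_decode` is hypothetical.

```python
import numpy as np

def passive_surround_decode(left, right):
    """Return (center, surround) from stereo samples using the PSD equations."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    gain = 1.0 / np.sqrt(2.0)          # -3 dB, keeps the acoustic energy constant
    center = (left + right) * gain     # sum of the stereo channels
    surround = (left - right) * gain   # difference of the stereo channels
    return center, surround
```

With identical left and right inputs the surround output is zero, and with opposite-phase inputs the center output is zero.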
The LMS-based Upmixing Method
The least-mean-square (LMS)-based method decorrelates the two stereo channels with an adaptive FIR filter. One of the original stereo channels is used as the desired signal d(n), and the other as the filter input x(n). The filter output y(n) represents the correlated part and is used as the center channel, whereas the error e(n), the uncorrelated part, is used as the surround channel.
w(n+1) = w(n) + 2μ e(n) x(n)
y(n) = w(n)^T x(n)
e(n) = d(n) - y(n)
where μ is a constant step size, set to 10^(-2) in this project, w(n) is the coefficient vector of the FIR filter, and x(n) is the vector of the most recent input samples.
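The LMS recursion can be sketched as a sample-by-sample loop. This is a minimal sketch under stated assumptions: the filter length `num_taps=32`, the zero initial coefficients, and the function name `lms_upmix` are illustrative choices not given in this README; only the step size 10^(-2) comes from the text.

```python
import numpy as np

def lms_upmix(desired, reference, num_taps=32, mu=1e-2):
    """Split `desired` into a correlated part (center) and an error (surround)."""
    d = np.asarray(desired, dtype=float)
    x = np.asarray(reference, dtype=float)
    w = np.zeros(num_taps)        # FIR coefficient vector w(n), assumed to start at zero
    buf = np.zeros(num_taps)      # most recent input samples, newest first
    center = np.zeros(len(d))
    surround = np.zeros(len(d))
    for n in range(len(d)):
        buf = np.roll(buf, 1)     # shift the delay line by one sample
        buf[0] = x[n]
        y = np.dot(w, buf)        # y(n): correlated part
        e = d[n] - y              # e(n): uncorrelated part
        w = w + 2.0 * mu * e * buf  # coefficient update with step size mu
        center[n] = y
        surround[n] = e
    return center, surround, w
```

By construction, `center + surround` reproduces the desired channel, so the method redistributes that channel rather than adding energy.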
The Adaptive Panning Method
This method was proposed by Irwan and Aarts (2002). The center and surround signals are obtained from a dominant signal y(n) and a remaining signal q(n). To maximize the energy of y(n), two scalar panning weights, wL(n) and wR(n), are assigned to the left and right channels, respectively. The weights are updated with the LMS algorithm, with y(n) as the input.
wL(n+1) = wL(n) + μ y(n) [xL(n) - wL(n) y(n)]
wR(n+1) = wR(n) + μ y(n) [xR(n) - wR(n) y(n)],
where μ is a constant step size which is set to 10^(-7), and the initial value of w(n) is set to 0.7 in this project.
y(n) = w(n)^T x(n), where w(n) = [wL(n), wR(n)]^T and x(n) = [xL(n), xR(n)]^T
C(n) = wL(n) * xL(n) + wR(n) * xR(n)
S(n) = wR(n) * xL(n) - wL(n) * xR(n)
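The adaptive panning recursion can be sketched as follows. One assumption is made explicit here: the surround signal is taken as wR(n) xL(n) - wL(n) xR(n), the component orthogonal to the dominant direction in the Irwan and Aarts formulation. The function name `adaptive_panning_upmix` is hypothetical; the step size and initial weight follow the values stated above.

```python
import numpy as np

def adaptive_panning_upmix(left, right, mu=1e-7, w_init=0.7):
    """Return (center, surround) using two scalar panning weights."""
    xl = np.asarray(left, dtype=float)
    xr = np.asarray(right, dtype=float)
    wl = w_init                   # wL(n), initial value 0.7
    wr = w_init                   # wR(n), initial value 0.7
    center = np.zeros(len(xl))
    surround = np.zeros(len(xl))
    for n in range(len(xl)):
        y = wl * xl[n] + wr * xr[n]          # dominant signal y(n), used as C(n)
        center[n] = y
        surround[n] = wr * xl[n] - wl * xr[n]  # assumed remaining signal, used as S(n)
        # Update both weights from the same y(n) before storing them.
        wl_next = wl + mu * y * (xl[n] - wl * y)
        wr_next = wr + mu * y * (xr[n] - wr * y)
        wl, wr = wl_next, wr_next
    return center, surround
```

With a step size as small as 10^(-7) the weights move very slowly, so for short inputs the outputs stay close to those of a fixed 0.7/0.7 mix.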
In a 5.1 channel setup for movies or music, voice and dialogue should be emphasized when they are played through the center channel. Thus, a low-pass filter with a cut-off frequency of 4 kHz is applied to the center channel after the decomposition. A low-pass filter with a cut-off frequency of 7 kHz is likewise applied to the surround channels.
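The low-pass stage can be illustrated with a simple first-order filter. The README specifies only the cut-off frequencies, so the filter type (one-pole), the 48 kHz default sample rate, and the function names below are assumptions made for this sketch.

```python
import numpy as np

def one_pole_lowpass(x, cutoff_hz, sample_rate):
    """First-order low-pass filter (assumed type; only the cut-off is given)."""
    x = np.asarray(x, dtype=float)
    a = np.exp(-2.0 * np.pi * cutoff_hz / sample_rate)  # feedback coefficient
    y = np.zeros(len(x))
    state = 0.0
    for n in range(len(x)):
        state = (1.0 - a) * x[n] + a * state  # unity gain at DC
        y[n] = state
    return y

def filter_center_and_surround(center, surround, sample_rate=48000.0):
    """Apply the 4 kHz (center) and 7 kHz (surround) low-pass filters."""
    return (one_pole_lowpass(center, 4000.0, sample_rate),
            one_pole_lowpass(surround, 7000.0, sample_rate))
```

A steeper filter would band-limit the channels more sharply; the one-pole form is used here only because it keeps the example short.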
The rear surround channels are intended to provide ambience and spaciousness, so a time delay of 15 ms is applied to them. In addition, a 90-degree phase shifter is applied to the surround channels.
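The delay and phase shift can be sketched offline in NumPy. This is an illustration under assumptions: the phase shift is done with an FFT-based Hilbert transform over the whole signal, which a real-time patch would have to replace with a causal filter, and the function names are hypothetical.

```python
import numpy as np

def delay_signal(x, delay_ms, sample_rate):
    """Delay x by delay_ms milliseconds, keeping the original length."""
    x = np.asarray(x, dtype=float)
    d = int(round(delay_ms * 1e-3 * sample_rate))  # delay in samples
    return np.concatenate([np.zeros(d), x])[:len(x)]

def phase_shift_90(x):
    """Shift every frequency component by 90 degrees (offline, FFT-based)."""
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.rfft(x)
    spectrum = spectrum * (-1j)   # rotate positive frequencies by -90 degrees
    spectrum[0] = 0.0             # DC has no defined 90-degree shift
    if len(x) % 2 == 0:
        spectrum[-1] = 0.0        # same for the Nyquist bin
    return np.fft.irfft(spectrum, n=len(x))
```

At a 48 kHz sample rate, `delay_signal(x, 15.0, 48000.0)` shifts the signal by 720 samples, and `phase_shift_90` turns a cosine into a sine of the same frequency.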
The Max patch is accessible here
Bai, M. R., & Shih, G. Y. (2007). Upmixing and Downmixing Two-channel Stereo Audio for Consumer Electronics. IEEE Transactions on Consumer Electronics, 53(3), 1011-1019.
Chun, C. J., Kim, Y. G., Yang, J. Y., & Kim, H. K. (2009). Real-Time Conversion of Stereo Audio to 5.1 Channel Audio for Providing Realistic Sounds. International Journal of Signal Processing, Image Processing and Pattern Recognition, 2(4), 85-94.
Irwan, R., & Aarts, R. M. (2002). Two-to-Five Channel Sound Processing. Journal of the Audio Engineering Society, 50(11), 914-926.