Speech Enhancement Using Hilbert Spectrum and Wavelet Packet Based Soft-Thresholding

Md. Ekramul Hamid; Md. Khademul Islam Molla; Md. Iqbal Aziz Khan; Takayoshi Nakai

doi:10.11648/j.cssp.20150401.12

Peer-Reviewed

Published in Science Journal of Circuits, Systems and Signal Processing (Volume 4, Issue 1)

Received: 11 April 2015 Accepted: 18 April 2015 Published: 29 April 2015

Abstract

A method and system for speech enhancement based on Hilbert spectrum and wavelet packet analysis is studied. We implement independent subspace analysis (ISA) to separate speech and interfering signals from a single mixture, and a wavelet packet based soft-thresholding algorithm to enhance the quality of the target speech. The mixed signal is projected onto time-frequency (TF) space using the empirical mode decomposition (EMD) based Hilbert spectrum (HS). A finite set of independent basis vectors is then derived from the TF space by applying principal component analysis (PCA) and independent component analysis (ICA) sequentially. The vectors are grouped by hierarchical clustering to represent the independent subspaces corresponding to the component sources in the mixture. However, the speech produced by the separation algorithm is not of sufficient quality and contains some residual noise. Therefore, in the next stage, the target speech is enhanced using the wavelet packet decomposition (WPD) method, in which speech activity is monitored by updating the statistics of the noise or unwanted signals. The mode mixing issue of traditional EMD is addressed using ensemble EMD. The proposed algorithm is also tested using a short-time Fourier transform (STFT) based spectrogram method. The simulation results show noticeable performance in audio source separation and speech enhancement.
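The enhancement stage described in the abstract can be illustrated with a minimal sketch of wavelet packet soft-thresholding. This is not the authors' implementation: the Haar filter, the fixed threshold `thr`, the thresholding of every band (including the lowest one), and all function names below are assumptions chosen to keep the example dependency-free. The paper instead adapts its thresholding to updated noise statistics, which is not reproduced here.

```python
import math

def soft_threshold(c, thr):
    """Shrink a coefficient toward zero by thr; zero it if |c| <= thr."""
    if abs(c) <= thr:
        return 0.0
    return math.copysign(abs(c) - thr, c)

def haar_split(x):
    """One orthonormal Haar analysis step: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_merge(approx, detail):
    """Inverse of haar_split."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / s)
        x.append((a - d) / s)
    return x

def wpd(x, levels):
    """Full wavelet packet tree: every node is split at every level.
    Returns the 2**levels leaf bands; len(x) must be divisible by 2**levels."""
    nodes = [list(x)]
    for _ in range(levels):
        nxt = []
        for node in nodes:
            a, d = haar_split(node)
            nxt.extend([a, d])
        nodes = nxt
    return nodes

def iwpd(nodes):
    """Rebuild the signal by merging sibling bands level by level."""
    nodes = [list(n) for n in nodes]
    while len(nodes) > 1:
        nodes = [haar_merge(nodes[i], nodes[i + 1])
                 for i in range(0, len(nodes), 2)]
    return nodes[0]

def denoise(x, levels, thr):
    """Decompose, soft-threshold every leaf coefficient, reconstruct.
    The fixed threshold is an assumption made for illustration only."""
    leaves = wpd(x, levels)
    shrunk = [[soft_threshold(c, thr) for c in band] for band in leaves]
    return iwpd(shrunk)

if __name__ == "__main__":
    # Hypothetical toy frame: a slow ramp plus small alternating "noise".
    frame = [0.1 * n + (0.05 if n % 2 else -0.05) for n in range(16)]
    print(denoise(frame, levels=2, thr=0.05))
```

Soft-thresholding shrinks the surviving coefficients as well as zeroing the small ones, so it avoids the discontinuity that hard thresholding introduces at the threshold. With `thr = 0` the sketch reduces to a decomposition followed by exact reconstruction.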

DOI	10.11648/j.cssp.20150401.12
Page(s)	1-8
Creative Commons	This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright	Copyright © The Author(s), 2024. Published by Science Publishing Group

Keywords

Speech Enhancement, Ensemble Empirical Mode Decomposition, Source Separation, Independent Subspace Analysis, Hilbert Spectrum, Wavelet Packet Decomposition

Cite This Article

APA Style

Md. Ekramul Hamid, Md. Khademul Islam Molla, Md. Iqbal Aziz Khan, Takayoshi Nakai. (2015). Speech Enhancement Using Hilbert Spectrum and Wavelet Packet Based Soft-Thresholding. Science Journal of Circuits, Systems and Signal Processing, 4(1), 1-8. https://doi.org/10.11648/j.cssp.20150401.12

ACS Style

Md. Ekramul Hamid; Md. Khademul Islam Molla; Md. Iqbal Aziz Khan; Takayoshi Nakai. Speech Enhancement Using Hilbert Spectrum and Wavelet Packet Based Soft-Thresholding. Sci. J. Circuits Syst. Signal Process. 2015, 4(1), 1-8. doi: 10.11648/j.cssp.20150401.12

AMA Style

Md. Ekramul Hamid, Md. Khademul Islam Molla, Md. Iqbal Aziz Khan, Takayoshi Nakai. Speech Enhancement Using Hilbert Spectrum and Wavelet Packet Based Soft-Thresholding. Sci J Circuits Syst Signal Process. 2015;4(1):1-8. doi: 10.11648/j.cssp.20150401.12

@article{10.11648/j.cssp.20150401.12,
  author = {Md. Ekramul Hamid and Md. Khademul Islam Molla and Md. Iqbal Aziz Khan and Takayoshi Nakai},
  title = {Speech Enhancement Using Hilbert Spectrum and Wavelet Packet Based Soft-Thresholding},
  journal = {Science Journal of Circuits, Systems and Signal Processing},
  volume = {4},
  number = {1},
  pages = {1-8},
  doi = {10.11648/j.cssp.20150401.12},
  url = {https://doi.org/10.11648/j.cssp.20150401.12},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.cssp.20150401.12},
  year = {2015}
}

TY - JOUR
T1 - Speech Enhancement Using Hilbert Spectrum and Wavelet Packet Based Soft-Thresholding
AU - Md. Ekramul Hamid
AU - Md. Khademul Islam Molla
AU - Md. Iqbal Aziz Khan
AU - Takayoshi Nakai
Y1 - 2015/04/29
PY - 2015
N1 - https://doi.org/10.11648/j.cssp.20150401.12
DO - 10.11648/j.cssp.20150401.12
T2 - Science Journal of Circuits, Systems and Signal Processing
JF - Science Journal of Circuits, Systems and Signal Processing
JO - Science Journal of Circuits, Systems and Signal Processing
SP - 1
EP - 8
PB - Science Publishing Group
SN - 2326-9073
UR - https://doi.org/10.11648/j.cssp.20150401.12
VL - 4
IS - 1
ER -

Author Information

Md. Ekramul Hamid

Dept. of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh; Dept. of Electric and Electronic Engineering, Shizuoka University, Hamamatsu-shi, Japan

Md. Khademul Islam Molla

Dept. of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh

Md. Iqbal Aziz Khan

Dept. of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh; Dept. of Electric and Electronic Engineering, Shizuoka University, Hamamatsu-shi, Japan

Takayoshi Nakai

Dept. of Electric and Electronic Engineering, Shizuoka University, Hamamatsu-shi, Japan
