Iran Research Information System


Signal and Data Processing, Volume 17, Issue 1, pages 99-116


Persian title	Speech enhancement using data-based dictionary learning

Persian abstract	Speech enhancement is one of the most widely used areas of speech processing. This paper examines a speech enhancement method based on the principles of sparse representation. Sparse representation makes it possible to model most of the information needed to represent a signal using far fewer dimensions than the original basis space. The learning method in this paper is a modification of the data-based greedy adaptive algorithm, in which the dictionary is trained directly on the signal data using a norm-based sparsity index, so that the atoms match the structure of the data more closely. This paper proposes a new sparsity index based on the Gini measure. In addition, the range of the sparsity parameter for noisy segments is determined from the initial frames of the speech and is used in a proposed procedure for constructing the dictionary. The enhancement results show that, under various noise conditions, selecting data frames with the introduced measure outperforms the norm-based sparsity index and other baseline algorithms in this context.

Persian keywords

English title	Speech Enhancement using Adaptive Data-Based Dictionary Learning

English abstract	This paper presents a speech enhancement method based on the sparse representation of data frames. Speech enhancement is one of the most widely applied areas of signal processing. The objective of a speech enhancement system is to improve the intelligibility or the quality of speech signals. This is done with speech signal processing techniques that attenuate the background noise without distorting the speech signal. This paper focuses on single-channel enhancement of speech corrupted by additive Gaussian noise. In recent years, there has been increasing interest in employing sparse representation techniques for speech enhancement. Sparse representation makes it possible to capture the major information in the speech signal using far fewer dimensions than the original basis space. The capability of a sparse decomposition method depends on the learned dictionary and on how well the dictionary atoms match the signal features. A sparse model over an overcomplete dictionary is obtained in two main steps: dictionary selection or learning, and sparse coding. In the dictionary step, either a predefined dictionary such as the Fourier, wavelet, or discrete cosine basis is employed, or a redundant dictionary is constructed by a learning process that is often based on alternating optimization strategies. In the sparse coding step, the dictionary is fixed and a sparse coefficient matrix with low approximation error is obtained. The goal of this paper is to investigate the role of data-based dictionary learning in speech enhancement in the presence of white Gaussian noise. The dictionary learning method in this paper is based on the greedy adaptive algorithm, a data-based technique for dictionary learning.
The dictionary atoms are learned by the proposed algorithm from data frames taken from the speech signals, so the atoms carry the structure of the input frames. In the baseline approach, the atoms are learned directly from the training data using a norm-based sparsity measure to achieve a closer match between the data frames and the dictionary atoms. The sparsity measure proposed in this paper is instead based on the Gini coefficient, and we present a new Gini-based sparsity index for the greedy adaptive dictionary learning algorithm. This index is designed to find sparser atoms than the sparsity indices defined on the norms of the speech frames. The proposed learning method iteratively extracts the speech frames with the minimum sparsity index under the chosen measure and adds the extracted atoms to the dictionary matrix. In addition, the range of the sparsity parameter is selected from the initial silent frames of the speech signal in order to build the desired dictionary. That is, a speech frame from the input data matrix can be added to the first columns of the overcomplete dictionary only when its structure is not similar to that of the noise frames. Because it is data-based, the dictionary learning process is faster than other dictionary learning methods such as K-singular value decomposition (K-SVD), the method of optimal directions (MOD), and other optimization-based strategies. The sparsity of an input frame is measured with the Gini-based index, which takes smaller values for speech frames because of their sparse content. Conversely, a frame with Gaussian-noise structure yields high values of this index. The performance of the proposed method is evaluated using several measures: improvement in signal-to-noise ratio (ISNR), the time-frequency representation of the atoms, and perceptual evaluation of speech quality (PESQ) scores.
The proposed approach significantly reduces the background noise in comparison with other dictionary learning methods, such as principal component analysis (PCA) and the norm-based learning method, which are traditional procedures in this context. The proposed speech enhancement method also gives favorable reconstruction error in the signal approximations. In addition, it keeps the computation time reasonable, which is a prominent factor in dictionary learning methods.
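The frame-selection idea described in the abstract can be illustrated with a short Python sketch. This is not the authors' implementation; several details are assumptions made for illustration. The abstract does not give the formula of the Gini-based index, so the sketch uses the standard Gini index of the sorted magnitudes and takes `1 - G` as a hypothetical index that is small for sparse frames and large for Gaussian-like frames, matching the direction the abstract describes. The residual update (removing each selected atom's component from all frames) follows the general greedy adaptive dictionary idea and is also an assumption, as are the function names and the `noise_floor` threshold rule.

```python
import numpy as np

def gini_index(x):
    """Standard Gini index of the magnitudes of x.
    0 for a constant vector, 1 - 1/n for a single nonzero entry."""
    a = np.sort(np.abs(np.asarray(x, dtype=float)))  # ascending magnitudes
    n = a.size
    total = a.sum()
    if total == 0.0:
        return 0.0
    k = np.arange(1, n + 1)
    return float(1.0 - 2.0 * np.sum((a / total) * (n - k + 0.5) / n))

def sparsity_index(x):
    """Hypothetical index (assumption): small for sparse frames,
    large for noise-like frames."""
    return 1.0 - gini_index(x)

def noise_index_floor(noise_frames):
    """Smallest index observed on noise-only frames (one frame per column)."""
    cols = np.asarray(noise_frames, dtype=float).T
    return min(sparsity_index(col) for col in cols)

def greedy_dictionary(frames, n_atoms, noise_floor=None):
    """Greedily pick the residual frame with the lowest index, normalize it
    into an atom, and remove that atom's component from every frame.
    Frames whose index reaches noise_floor are never selected."""
    residual = np.array(frames, dtype=float)  # shape: (frame_len, n_frames)
    atoms = []
    for _ in range(n_atoms):
        norms = np.linalg.norm(residual, axis=0)
        scores = np.array([
            sparsity_index(col) if nrm > 1e-10 else np.inf
            for col, nrm in zip(residual.T, norms)
        ])
        if noise_floor is not None:
            scores[scores >= noise_floor] = np.inf  # too noise-like
        j = int(np.argmin(scores))
        if not np.isfinite(scores[j]):
            break  # no admissible frame left
        atom = residual[:, j] / norms[j]
        atoms.append(atom)
        # Deflation: project the chosen atom out of all residual frames.
        residual = residual - np.outer(atom, atom @ residual)
    if not atoms:
        return np.empty((residual.shape[0], 0))
    return np.column_stack(atoms)
```

With frames stored as columns of a matrix `X` whose first few columns are speech-free, a call such as `greedy_dictionary(X, n_atoms, noise_floor=noise_index_floor(X[:, :5]))` would return a matrix of orthonormal atoms; the choice of five initial frames is arbitrary here.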

English keywords

Authors	Samira Mavaddati, University of Mazandaran; Mohammad Ahadi, Amirkabir University of Technology

URL	http://jsdp.rcisp.ac.ir/browse.php?a_code=A-10-1429-1&slc_lang=fa&sid=1
Article code (DOI)
Language of published article	fa
Topics of published article	Speech processing articles
Type of published article	Research
