@416AudioLab

VoXLab

Audio Signal and Information Processing Lab at Xinjiang University

Welcome to our lab

The lab is affiliated with the College of Information Science and Engineering at Xinjiang University. Its research interests include sound source separation, music source separation, sound event localization and detection, etc.

Research direction

  • Speech Separation

    Speech separation is the task of separating target speech from background interference. Traditionally studied as a signal processing problem, it is a fundamental task with a wide range of applications, including hearing prostheses, mobile telecommunication, and robust automatic speech and speaker recognition. The human auditory system has a remarkable ability to extract one sound source from a mixture of multiple sources. Speech separation is commonly called the “cocktail party problem”.

  • Music Source Separation

    Music source separation is the task of separating mixed audio into multiple target sources, such as vocals, drums, and bass. It is an important part of music information retrieval (MIR) and supports many downstream applications, including melody extraction, pitch estimation, music transcription, and music mixing.

  • Sound Source Localization and Detection

    Sound source localization and detection (SSLD) is a combined task of identifying the temporal boundaries of each sound event, estimating the spatial trajectory of each sound source while it is active, and classifying the sound events. SSLD helps in understanding the surrounding environment and has many applications, such as human-machine interaction, bioacoustic monitoring, smart cities, and timely warning of dangerous acoustic signals.

  • Melody Extraction & Pitch Estimation

    Singing melody extraction, which aims to estimate the fundamental frequency (F0) of the dominant melody, remains a challenging task in music information retrieval (MIR). It has become an active topic in MIR because it has many important downstream applications, such as vocal separation from monaural music and music annotation and retrieval.

  • Target Speaker Extraction

    The main research objective of speaker extraction is to simulate human selective auditory attention by extracting the target speaker's voice from a mixed scene of two or more speakers. The speaker extraction system separates the target speech from a complex acoustic environment containing a variety of interference (such as car noise, navigation tones, and car FM radio) while minimizing damage to the original speech, improving the efficiency of human-computer interaction and customer service listening.

  • Sound Event Detection

    The sound event detection (SED) task is to train an SED system using a large amount of audio data. The goal of the SED system is to provide not only the event class but also the onset and offset of each event, given that multiple events can be present in an audio recording. SED has many potential applications, such as smart-city noise monitoring, monitoring systems, urban planning, multimedia information retrieval, smart homes, health monitoring systems, and autonomous driving.

  • Speech Denoising and Dereverberation

    In daily life, speech signals are inevitably corrupted by noise and environmental reverberation during transmission. Denoising and dereverberation techniques extract clean speech from the corrupted signal by suppressing or removing the noise and reverberation.

  • Speech Emotion Recognition

    Speech emotion recognition (SER) uses unimodal or multimodal information to extract rich and salient emotional features and recognize the emotion in human speech. With the development of artificial intelligence, SER has become an indispensable part of human-computer interaction (HCI) and other advanced speech processing systems.
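
Two of the research directions above have outputs that a short sketch can make concrete: pitch estimation returns a fundamental frequency (F0), and sound event detection returns an onset and offset for each event. The Python below is only an illustration under simplifying assumptions and is not taken from the lab's repositories. The function names `estimate_f0` and `frames_to_events`, the parameters `fmin`, `fmax`, `threshold`, and `hop_seconds`, and their default values are all hypothetical. The F0 part uses a textbook autocorrelation peak on a single monophonic frame, which is far simpler than melody extraction from polyphonic music, and the SED part assumes some model has already produced per-frame activity probabilities for one event class.

```python
import math
from typing import List, Optional, Tuple


def estimate_f0(frame: List[float], sample_rate: float,
                fmin: float = 80.0, fmax: float = 1000.0) -> Optional[float]:
    """Estimate F0 (Hz) of one monophonic frame from its autocorrelation peak.

    Assumption: the frame holds a single periodic source; fmin/fmax are
    illustrative search bounds, not values from the lab's work.
    """
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        score = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    # No positive peak (e.g. silence) means no pitch estimate.
    return sample_rate / best_lag if best_lag else None


def frames_to_events(probs: List[float], threshold: float = 0.5,
                     hop_seconds: float = 0.02) -> List[Tuple[float, float]]:
    """Convert per-frame probabilities for one class into (onset, offset) times.

    Assumption: a frame is "active" when its probability reaches the threshold,
    and consecutive active frames form one event.
    """
    events: List[Tuple[float, float]] = []
    start: Optional[int] = None
    for i, p in enumerate(probs):
        active = p >= threshold
        if active and start is None:
            start = i  # event begins at this frame
        elif not active and start is not None:
            events.append((start * hop_seconds, i * hop_seconds))
            start = None
    if start is not None:
        # Event still active at the end of the recording.
        events.append((start * hop_seconds, len(probs) * hop_seconds))
    return events


# Usage: a synthetic 200 Hz tone sampled at 8 kHz, and a toy probability track.
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(800)]
print(estimate_f0(tone, 8000))
print(frames_to_events([0.1, 0.9, 0.8, 0.2, 0.7], hop_seconds=1.0))
```

Practical systems would typically add normalization, smoothing, and voicing decisions to the pitch estimate, and would post-process the event boundaries (for example with median filtering); those steps are omitted here to keep the sketch short.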

Contact us

Please feel free to contact us if you need anything.

email:[email protected]

Copyright © 2023 VoXLab

Pinned

  1. Music-Source-Separation-master Public

    Forked from YadongChen-1016/Music-Source-Separation-master

    HTCN: Hierarchic Temporal Convolutional Network with Cross-Domain Encoder for Music Source Separation. IEEE Signal Processing Letters, SPL. 2022

    Python

Repositories

Showing 8 of 8 repositories
  • IIFC-Net Public Forked from wen0320/IIFC-Net

    IIFC-Net: A Monaural Speech Enhancement Network with High-Order Information Interaction and Feature Calibration. Signal Processing Letters (SPL) 2024.

    Python 0 2 0 0 Updated Jul 29, 2024
  • Music-Source-Separation-master Public Forked from YadongChen-1016/Music-Source-Separation-master

    HTCN: Hierarchic Temporal Convolutional Network with Cross-Domain Encoder for Music Source Separation. IEEE Signal Processing Letters, SPL. 2022

    Python 0 MIT 1 0 0 Updated Jul 29, 2024
  • Speech-Resources Public Forked from ddlBoJack/Speech-Resources

    Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome.

    0 69 0 0 Updated Jun 15, 2023
  • .github Public
    0 0 0 0 Updated Jun 12, 2023
  • MTANet Public Forked from Annmixiu/MTANet

    Multi-band Time-frequency Attention Network for Singing Melody Extraction from Polyphonic Music. INTERSPEECH 2023.

    Python 2 3 0 0 Updated Jun 12, 2023
  • SESNet Public Forked from WangLiusong/SESNet

    D²Net: A Denoising and Dereverberation Network Based on Two-branch Encoder and Dual-path Transformer. 2022 APSIPA.

    0 1 0 0 Updated May 12, 2023
  • Separation Public Forked from YadongChen-1016/Separation

    A repo of the singing voice and accompaniment separation method.

    Python 0 1 0 0 Updated Dec 25, 2021
  • asteroid Public Forked from MTG/asteroid

    The PyTorch-based audio source separation toolkit for researchers

    Python 0 MIT 431 0 0 Updated Nov 26, 2020
