Jinyu Han | Projects
What's Goin On?

(homepage | publications)

Audio Imputation
Collaboration with Gautham J. Mysore from Adobe Systems
| LVA/ICA 2012 paper | Demo |
  • The problem of missing data in an audio spectrogram occurs in many scenarios. For example, the problem is common in signal transmission, where the signal quality is degraded by linear or non-linear filtering operations.
  • We present an approach that allows us to estimate missing values in the time-frequency domain of audio signals. The proposed approach, based on the Nonnegative Hidden Markov Model, enables more temporally coherent estimation for the missing data by taking into account both the spectral and temporal information of the audio signal. This approach is able to reconstruct highly corrupted audio signals with large parts of the spectrogram missing. We demonstrate this approach on real-world polyphonic music signals. The initial experimental results show that our approach has advantages over a previous missing data imputation method.
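The paper's model is the Nonnegative Hidden Markov Model; as a simpler illustration of the same family of ideas, the sketch below imputes missing spectrogram cells with a *masked* NMF: the factorization V ≈ WH is fit only on the observed time-frequency cells, and the low-rank reconstruction fills the holes. Function and parameter names are illustrative, not from the paper.

```python
import numpy as np

def impute_spectrogram(V, mask, rank=8, n_iter=200, eps=1e-9):
    """Fill missing entries of a magnitude spectrogram V (freq x time).

    mask[i, j] == 1 where V is observed, 0 where missing.  A weighted
    (masked) NMF  V ~ W @ H  is fit on the observed cells only, using
    multiplicative updates, and the model reconstruction fills the holes.
    """
    rng = np.random.default_rng(0)
    F, T = V.shape
    W = rng.random((F, rank)) + eps
    H = rng.random((rank, T)) + eps
    MV = mask * V  # observed cells only
    for _ in range(n_iter):
        WH = W @ H
        H *= (W.T @ MV) / (W.T @ (mask * WH) + eps)
        WH = W @ H
        W *= (MV @ H.T) / ((mask * WH) @ H.T + eps)
    V_hat = W @ H
    # keep the observed values exactly; use the model where data is missing
    return np.where(mask > 0, V, V_hat)
```

Because each spectrogram frame is factored independently here, this baseline has no notion of temporal continuity; the N-HMM adds exactly that, which is why it yields more temporally coherent estimates.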

  • Singing Voice Extraction
    Parts of this work were performed at Gracenote with Ching-Wei Chen
    | ICASSP 2011 paper | AAAI workshop paper | Presentation |
  • We propose a semi-supervised approach for automatic singing voice extraction from polyphonic audio, based on Probabilistic Latent Component Analysis (PLCA) and Gaussian Mixture Models. A statistical model of the accompaniment is learned adaptively from the non-vocal segments of the polyphonic music using a non-negative factorization technique. This model can then be employed to remove the accompaniment from the vocal segments, leaving mainly the singing components.
  • Experiments on a real-world audio dataset show that the proposed algorithm achieves promising results compared to two other melody extraction algorithms.
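The paper formulates this with PLCA and GMMs; a minimal NMF-flavored sketch of the same idea follows, with illustrative names. Accompaniment bases are learned on the non-vocal segments; the vocal segments are then explained by those fixed bases plus freely learned "voice" bases, and the voice's share of the model yields a soft separation mask.

```python
import numpy as np

def nmf(V, rank=4, n_iter=100, eps=1e-9):
    """Basic multiplicative-update NMF, V ~ W @ H (Frobenius objective)."""
    rng = np.random.default_rng(0)
    F, T = V.shape
    W = rng.random((F, rank)) + eps
    H = rng.random((rank, T)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
    return W, H

def extract_vocals(V_nonvocal, V_vocal, rank_acc=8, rank_voice=8, eps=1e-9):
    """Learn accompaniment bases on non-vocal segments, then explain the
    vocal segments with those (fixed) bases plus free 'voice' bases.
    Returns a soft (Wiener-style) mask for the singing voice."""
    W_acc, _ = nmf(V_nonvocal, rank=rank_acc)
    rng = np.random.default_rng(1)
    W_voice = rng.random((V_vocal.shape[0], rank_voice)) + eps
    W = np.hstack([W_acc, W_voice])
    H = rng.random((W.shape[1], V_vocal.shape[1])) + eps
    for _ in range(100):
        # update all activations, but only the voice basis columns
        H *= (W.T @ V_vocal) / (W.T @ W @ H + eps)
        WH = W @ H
        W[:, rank_acc:] *= (V_vocal @ H[rank_acc:].T) / (WH @ H[rank_acc:].T + eps)
    voice_part = W[:, rank_acc:] @ H[rank_acc:]
    return voice_part / (W @ H + eps)  # fraction of energy assigned to voice
```

Since both parts of the model are nonnegative, the returned mask lies in [0, 1] and can be applied to the vocal-segment spectrogram to suppress the accompaniment.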

  • Audio Source Separation

    1. Resolving Overlapped Notes in Monaural Music
    | ICASSP 2011 paper | Poster |

    We propose an alternate technique for harmonic envelope estimation, based on Harmonic Temporal Envelope Similarity (HTES). We learn a harmonic envelope model for each instrument from the non-overlapped harmonics of notes of the same instrument, wherever they occur in the recording. This model is used to reconstruct the harmonic envelopes for overlapped harmonics. This allows reconstruction of completely overlapped notes. Experiments show our algorithm performs better than an existing system based on Common Amplitude Modulation when the harmonics of pitched instruments are strongly overlapped.
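A toy sketch of the envelope-sharing idea (illustrative, not the paper's HTES estimator): average the normalized temporal envelopes of an instrument's non-overlapped harmonics into a model shape, then rescale that shape to a reliable observation of the corrupted harmonic.

```python
import numpy as np

def reconstruct_envelope(clean_envelopes, anchor_value, anchor_frame=0):
    """Reconstruct the temporal envelope of an overlapped harmonic.

    clean_envelopes: list of 1-D amplitude-over-time arrays for harmonics
    of the same instrument that are NOT overlapped.  Each is normalized to
    unit amplitude at anchor_frame; their mean is the model shape, which
    is rescaled to anchor_value, the one trusted observation of the
    overlapped harmonic (e.g. its value just before the collision).
    """
    shapes = [e / (e[anchor_frame] + 1e-12) for e in clean_envelopes]
    model = np.mean(shapes, axis=0)
    return anchor_value * model
```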

    2. Stereo Music Source Separation
    | detail | WASPAA 2009 paper | Poster |

    Source separation is the process of determining individual source signals, given only mixtures of the source signals.
    We introduce a method that uses binaural spatial cues (amplitude ratio and phase difference) and assumptions regarding the structure of musical source signals (Harmonicity) to effectively separate mixtures of tonal music.
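The two spatial cues are straightforward to compute from the stereo STFT. The sketch below (illustrative names; the harmonicity model from the paper is omitted) extracts the per-bin level ratio and phase difference, then builds a binary mask by assigning each time-frequency bin to the source whose expected level ratio is closest.

```python
import numpy as np

def spatial_cues(L, R, eps=1e-12):
    """Per-bin level ratio (dB) and phase difference from the STFTs of the
    left and right channels (complex arrays of equal shape)."""
    level = 20 * np.log10((np.abs(L) + eps) / (np.abs(R) + eps))
    phase = np.angle(L * np.conj(R))  # phase(L) - phase(R), wrapped
    return level, phase

def assign_bins(L, R, source_levels):
    """Binary-mask separation: give each bin to the source whose expected
    inter-channel level ratio (dB) is nearest the observed one."""
    level, _ = spatial_cues(L, R)
    dist = np.abs(level[..., None] - np.asarray(source_levels))
    return np.argmin(dist, axis=-1)  # source index per bin
```

For example, a source panned hard toward the left channel produces bins with a large positive level ratio, so those bins are claimed by the source whose expected ratio is large and positive.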

  • Multi-Pitch Tracking
    Joint work with Zhiyao Duan
    | detail | ICASSP 2010 paper | ISMIR 2009 paper | Presentation |
  • In audio research, multi-pitch estimation is the process of estimating the instantaneous fundamental frequency of each of a number of simultaneous sound sources (such as voices or musical instruments). Multi-pitch tracking connects the instantaneous estimates in adjacent time frames into tracks, each corresponding to a single sound source. Most existing multi-pitch tracking (MPT) research concentrates on modeling the evolution of pitch contours using Hidden Markov Models (HMMs).
  • We invented a new constrained clustering algorithm to automatically estimate the pitch trajectories of different musical instruments. We cast the MPT problem as a clustering problem, where pitches from the same source belong to the same cluster. The approach produces good results on real-world music recordings with up to four musical instruments. This algorithm was highly ranked in the MIREX 2009 Multiple Fundamental Frequency Estimation and Tracking Task.
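A minimal sketch of the clustering view, under simplifying assumptions (every frame reports exactly one pitch per source; names are illustrative, not from the paper): the cannot-link constraint — two pitches detected in the same frame must come from different sources — is enforced by assigning each frame's pitches to the tracks one-to-one, choosing the permutation that minimizes the pitch jump per track.

```python
import numpy as np
from itertools import permutations

def track_pitches(frames, n_sources):
    """Greedy constrained clustering of frame-wise pitch estimates.

    frames: one entry per time frame, each a sequence of n_sources pitch
    values (Hz or cents).  Pitches within a frame are forced into
    different clusters (a cannot-link constraint) by a one-to-one
    assignment; continuity is favored by minimizing the total jump from
    each track's previous pitch.
    """
    tracks = [[p] for p in frames[0]]
    for pitches in frames[1:]:
        best, best_cost = None, np.inf
        for perm in permutations(range(n_sources)):
            cost = sum(abs(pitches[perm[k]] - tracks[k][-1])
                       for k in range(n_sources))
            if cost < best_cost:
                best, best_cost = perm, cost
        for k in range(n_sources):
            tracks[k].append(pitches[best[k]])
    return tracks
```

The brute-force permutation search is fine for the handful of sources considered here (up to four instruments means at most 24 permutations per frame).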

  • Query By Humming
    | ISMIR 2008 paper | JCDL 2008 paper |
    Tunebot is a music search engine developed in the Interactive Audio Lab.
  • Melodic search engines let people find music in online collections by specifying the desired melody. Comparing the query melody to every item in a large database is prohibitively slow. If melodies can be placed in a metric space, search can be sped up by comparing the query to a limited number of vantage melodies, rather than to the entire database.
  • We describe a simple melody metric that is customizable using a small number of example queries. This metric allows use of a generalized vantage-point tree to organize the database. We show on a standard melodic database that the generalized vantage-tree approach achieves superior search results for query-by-humming compared to an existing vantage-point tree method. The vantage-point-tree search algorithm speeds up the core search by at least a factor of 10. We then show this method can be used as a preprocessor to speed up search for non-metric melodic comparison.
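The vantage idea above can be sketched as follows (a simplified flat filter rather than the paper's tree; the edit-distance melody metric here is a stand-in — any true metric works). Distances from every database melody to a few vantage melodies are precomputed; at query time, the triangle inequality prunes any melody that cannot lie within the search radius.

```python
import numpy as np

def edit_distance(a, b):
    """Levenshtein distance between two symbol strings -- a stand-in for
    the melody metric (e.g. over pitch-interval sequences)."""
    m, n = len(a), len(b)
    D = np.arange(n + 1, dtype=float)
    for i in range(1, m + 1):
        prev, D[0] = D[0], i
        for j in range(1, n + 1):
            cur = min(D[j] + 1,            # deletion
                      D[j - 1] + 1,        # insertion
                      prev + (a[i - 1] != b[j - 1]))  # substitution
            prev, D[j] = D[j], cur
    return D[n]

def vantage_filter(query, database, vantages, radius):
    """Return candidates within `radius` of the query, pruning with the
    triangle inequality: if |d(q,v) - d(x,v)| > radius for any vantage
    melody v, then d(q,x) > radius, so x cannot match."""
    d_db = [[edit_distance(x, v) for v in vantages] for x in database]
    d_q = [edit_distance(query, v) for v in vantages]
    return [x for x, dx in zip(database, d_db)
            if all(abs(qv - xv) <= radius for qv, xv in zip(d_q, dx))]
```

Only the surviving candidates need a full (possibly expensive, possibly non-metric) comparison against the query, which is what makes the scheme useful as a preprocessor.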