如何在时域计算音高基频 f( 0) )?

我是 DSP 的新手,试图为f(0)音频文件的每个分段帧计算基频( )。F0估计的方法可以分为三类:

  • 基于信号时域的时间动态;

  • 基于频率结构频域,以及

  • 混合方法。

大多数示例都是基于频率结构频域估计基频,我正在寻找基于信号时域的时间动态。

本文提供了一些信息,但我仍然不清楚如何在时域中计算它?

https://gist.github.com/endolith/255291

这是我发现的到目前为止使用的代码:

def freq_from_autocorr(sig, fs):

    """

    Estimate frequency using autocorrelation

    """

    # Calculate autocorrelation and throw away the negative lags

    corr = correlate(sig, sig, mode='full')

    corr = corr[len(corr)//2:]


    # Find the first low point

    d = diff(corr)

    start = nonzero(d > 0)[0][0]


    # Find the next peak after the low point (other than 0 lag).  This bit is

    # not reliable for long signals, due to the desired peak occurring between

    # samples, and other peaks appearing higher.

    # Should use a weighting function to de-emphasize the peaks at longer lags.

    peak = argmax(corr[start:]) + start

    px, py = parabolic(corr, peak)


    return fs / px

如何在时域进行估计?


提前致谢!


qq_花开花谢_0
浏览 160回答 1
1回答

慕容3067478

这是一个正确的实现。不是很健壮,但肯定有效。为了验证这一点,我们可以生成一个已知频率的信号,看看我们会得到什么结果:import numpy as npfrom scipy.io import wavfilefrom scipy.signal import correlate, fftconvolvefrom scipy.interpolate import interp1dfs = 44100frequency = 440length = 0.01 # in secondst = np.linspace(0, length, int(fs * length))&nbsp;y = np.sin(frequency * 2 * np.pi * t)def parabolic(f, x):&nbsp; &nbsp; xv = 1/2. * (f[x-1] - f[x+1]) / (f[x-1] - 2 * f[x] + f[x+1]) + x&nbsp; &nbsp; yv = f[x] - 1/4. * (f[x-1] - f[x+1]) * (xv - x)&nbsp; &nbsp; return (xv, yv)def freq_from_autocorr(sig, fs):&nbsp; &nbsp; """&nbsp; &nbsp; Estimate frequency using autocorrelation&nbsp; &nbsp; """&nbsp; &nbsp; corr = correlate(sig, sig, mode='full')&nbsp; &nbsp; corr = corr[len(corr)//2:]&nbsp; &nbsp; d = np.diff(corr)&nbsp; &nbsp; start = np.nonzero(d > 0)[0][0]&nbsp; &nbsp; peak = np.argmax(corr[start:]) + start&nbsp; &nbsp; px, py = parabolic(corr, peak)&nbsp; &nbsp; return fs / px结果运行freq_from_autocorr(y, fs)得到我们~442.014 Hz,大约 0.45% 的错误。奖金 - 我们可以改进我们可以通过稍微多一点的编码使它更精确和健壮:def indexes(y, thres=0.3, min_dist=1, thres_abs=False):&nbsp; &nbsp; """Peak detection routine borrowed from&nbsp;&nbsp; &nbsp; https://bitbucket.org/lucashnegri/peakutils/src/master/peakutils/peak.py&nbsp; &nbsp; """&nbsp; &nbsp; if isinstance(y, np.ndarray) and np.issubdtype(y.dtype, np.unsignedinteger):&nbsp; &nbsp; &nbsp; &nbsp; raise ValueError("y must be signed")&nbsp; &nbsp; if not thres_abs:&nbsp; &nbsp; &nbsp; &nbsp; thres = thres * (np.max(y) - np.min(y)) + np.min(y)&nbsp; &nbsp; min_dist = int(min_dist)&nbsp; &nbsp; # compute first order difference&nbsp; &nbsp; dy = np.diff(y)&nbsp; &nbsp; # propagate left and right values successively to fill all plateau pixels (0-value)&nbsp; &nbsp; zeros, = np.where(dy == 0)&nbsp; &nbsp; # check if the signal is totally flat&nbsp; &nbsp; if len(zeros) == len(y) - 1:&nbsp; &nbsp; &nbsp; &nbsp; return np.array([])&nbsp; &nbsp; if len(zeros):&nbsp; &nbsp; &nbsp; &nbsp; # compute first order difference of zero indexes&nbsp; &nbsp; &nbsp; &nbsp; zeros_diff = np.diff(zeros)&nbsp; &nbsp; &nbsp; &nbsp; # check when zeros are not chained together&nbsp; &nbsp; &nbsp; &nbsp; zeros_diff_not_one, = np.add(np.where(zeros_diff != 1), 1)&nbsp; &nbsp; &nbsp; &nbsp; # make an array of the chained zero indexes&nbsp; &nbsp; &nbsp; &nbsp; zero_plateaus = np.split(zeros, zeros_diff_not_one)&nbsp; &nbsp; &nbsp; &nbsp; # fix if leftmost value in dy is zero&nbsp; &nbsp; &nbsp; &nbsp; if zero_plateaus[0][0] == 0:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; dy[zero_plateaus[0]] = dy[zero_plateaus[0][-1] + 1]&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; zero_plateaus.pop(0)&nbsp; &nbsp; &nbsp; &nbsp; # fix if rightmost value of dy is zero&nbsp; &nbsp; &nbsp; &nbsp; if len(zero_plateaus) and zero_plateaus[-1][-1] == len(dy) - 1:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; dy[zero_plateaus[-1]] = dy[zero_plateaus[-1][0] - 1]&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; zero_plateaus.pop(-1)&nbsp; &nbsp; &nbsp; &nbsp; # for each chain of zero indexes&nbsp; &nbsp; &nbsp; &nbsp; for plateau in zero_plateaus:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; median = np.median(plateau)&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; # set leftmost values to leftmost non zero values&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; dy[plateau[plateau < median]] = dy[plateau[0] - 1]&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; # set rightmost and middle values to rightmost non zero values&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; dy[plateau[plateau >= median]] = dy[plateau[-1] + 1]&nbsp; &nbsp; # find the peaks by using the first order difference&nbsp; &nbsp; peaks = np.where(&nbsp; &nbsp; &nbsp; &nbsp; (np.hstack([dy, 0.0]) < 0.0)&nbsp; &nbsp; &nbsp; &nbsp; & (np.hstack([0.0, dy]) > 0.0)&nbsp; &nbsp; &nbsp; &nbsp; & (np.greater(y, thres))&nbsp; &nbsp; )[0]&nbsp; &nbsp; # handle multiple peaks, respecting the minimum distance&nbsp; &nbsp; if peaks.size > 1 and min_dist > 1:&nbsp; &nbsp; &nbsp; &nbsp; highest = peaks[np.argsort(y[peaks])][::-1]&nbsp; &nbsp; &nbsp; &nbsp; rem = np.ones(y.size, dtype=bool)&nbsp; &nbsp; &nbsp; &nbsp; rem[peaks] = False&nbsp; &nbsp; &nbsp; &nbsp; for peak in highest:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; if not rem[peak]:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sl = slice(max(0, peak - min_dist), peak + min_dist + 1)&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rem[sl] = True&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rem[peak] = False&nbsp; &nbsp; &nbsp; &nbsp; peaks = np.arange(y.size)[~rem]&nbsp; &nbsp; return peaksdef freq_from_autocorr_improved(signal, fs):&nbsp; &nbsp; signal -= np.mean(signal)&nbsp; # Remove DC offset&nbsp; &nbsp; corr = fftconvolve(signal, signal[::-1], mode='full')&nbsp; &nbsp; corr = corr[len(corr)//2:]&nbsp; &nbsp; # Find the first peak on the left&nbsp; &nbsp; i_peak = indexes(corr, thres=0.8, min_dist=5)[0]&nbsp; &nbsp; i_interp = parabolic(corr, i_peak)[0]&nbsp; &nbsp; return fs / i_interp, corr, i_interp运行freq_from_autocorr_improved(y, fs)产量~441.825 Hz,大约 0.41% 的误差。这种方法在更复杂的情况下会表现得更好,并且需要两倍的时间来计算。通过更长的采样时间(即设置length为例如 0.1 秒),我们将获得更准确的结果。
打开App,查看更多内容
随时随地看视频慕课网APP

相关分类

Python