Computational Audio: Compare Recordings Using DTW
Import, trim, and preprocess four recordings of the first sentence of Alice's Adventures in Wonderland.
urls = {"http://ia800503.us.archive.org/3/items/alices_adventures/\
aliceinwonderland_01_carroll.mp3",
"http://ia800306.us.archive.org/25/items/alice_wonderland_0711_\
librivox/alice_01_carroll.mp3",
"http://ia800201.us.archive.org/32/items/alices_adventures_1003/\
alices_adventures_01_carroll.mp3",
"https://ia800904.us.archive.org/15/items/alicesadventure_abridged_\
pc_librivox/alicesadventuresinwonderlandabridged_01_carroll.mp3"};
times = {{27, 33.5}, {16.5, 25}, {22.3, 28.5}, {31, 38}};
alice = ConformAudio[
  MapThread[
   AudioNormalize[
     AudioChannelMix[AudioTrim[AudioResample[Import[#1], 11025], #2],
      1]] &, {urls, times}]]
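The resample, trim, and normalize steps above can be tried without downloading anything. The following is a minimal sketch on a synthetic tone; the names `tone` and `clip`, the 440 Hz sine, and the trim interval are illustrative assumptions, not part of the original example.

```wolfram
(* Assumption: a 3-second 440 Hz sine tone stands in for a downloaded recording *)
tone = AudioGenerator[{"Sin", 440}, 3];

(* Same order of operations as above: resample to 11025 Hz,
   keep the interval from 0.5 s to 2 s, then normalize the amplitude *)
clip = AudioNormalize[AudioTrim[AudioResample[tone, 11025], {0.5, 2}]];

(* Inspect the resulting sample rate and duration *)
{AudioSampleRate[clip], Duration[clip]}
```

Resampling before trimming, as in the original pipeline, means every later step works on the smaller 11025 Hz signal.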
Plot the signals.
AudioPlot[alice, ImageSize -> Medium]
Compute and plot the MFCC features of the samples.
mfcc = AudioLocalMeasurements[#, "MFCC",
PartitionGranularity -> {.05, .01}]["Values"] & /@ alice;
Column[MatrixPlot[#, PlotTheme -> "Minimal", ImageSize -> Medium] & /@
Transpose /@ mfcc]
Use WarpingDistance to compute the dynamic time warping distance between the recordings.
DistanceMatrix[mfcc,
DistanceFunction -> WarpingDistance] // MatrixPlot
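For intuition about what a dynamic time warping distance measures, here is a minimal dynamic-programming sketch. The helper `dtwDistance` is hypothetical and written only for illustration; it assumes a Euclidean cost between elements and the basic three-way step pattern, and it is not claimed to match the built-in WarpingDistance in its defaults or options.

```wolfram
(* Hypothetical helper: textbook DTW, not the built-in implementation *)
dtwDistance[a_List, b_List] :=
 Module[{n = Length[a], m = Length[b], d},
  (* d[[i + 1, j + 1]] holds the cheapest cost of aligning a[[;; i]] with b[[;; j]] *)
  d = ConstantArray[Infinity, {n + 1, m + 1}];
  d[[1, 1]] = 0;
  Do[
   d[[i + 1, j + 1]] =
    Norm[a[[i]] - b[[j]]] +
     Min[d[[i, j + 1]], d[[i + 1, j]], d[[i, j]]],
   {i, n}, {j, m}];
  d[[n + 1, m + 1]]]

(* The repeated 2 is absorbed by the warping, so the cost is 0 *)
dtwDistance[{1, 2, 3}, {1, 2, 2, 3}]

(* Elements may also be vectors, such as MFCC frames *)
dtwDistance[{{0, 0}, {3, 4}}, {{0, 0}, {0, 0}}]
```

Because each element may be matched to several elements of the other sequence, two readings of the same sentence at different speeds can still come out close.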
Use WarpingCorrespondence to compute the dynamic time warping correspondence between two recordings.
{n, m} = WarpingCorrespondence[mfcc[[1]], mfcc[[2]]];
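The two index lists are easier to read on a toy input. This is a small sketch with made-up sequences; the names `a`, `b`, `ia`, `ib`, and `aligned` are illustrative assumptions and are kept separate from the `n` and `m` used above.

```wolfram
(* Assumption: two short toy sequences; b repeats one element of a *)
a = {1, 2, 3};
b = {1, 2, 2, 3};

(* ia and ib are index lists of equal length: a[[ia]] is matched with b[[ib]] *)
{ia, ib} = WarpingCorrespondence[a, b];

(* Pair up the matched elements to inspect the alignment *)
aligned = Transpose[{a[[ia]], b[[ib]]}]
```

In the audio example the same indexing applies to the rows of the MFCC matrices, so `mfcc[[1]][[n]]` and `mfcc[[2]][[m]]` would be the aligned frames.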
(* Duration of the first clip in seconds *)
dur = QuantityMagnitude[Duration[alice[[1]]], "s"];
(* Rescale the index pairs of the warping path so the largest index maps to dur *)
s = {n, m}\[Transpose]/Max[{n, m}] dur;
(* Plot the path with gray guide lines at every 100th pair, labeled by the two waveforms *)
Labeled[
 ListLinePlot[
  s,
  PlotRange -> {{0, dur}, {0, dur}}, AspectRatio -> 1, Axes -> False,
  PlotStyle -> Thickness[.01], ImageSize -> Medium, Frame -> True,
  FrameTicks -> None,
  Prolog -> {RGBColor[
     0.6666666666666666, 0.6666666666666666,
     0.6666666666666666], {Line[{{#[[1]], 0}, #}],
       Line[{{0, #[[2]]}, #}]} & /@ (s[[;; ;; 100]])}
  ],
 AudioPlot[#, PlotStyle -> RGBColor[0.560181, 0.691569, 0.194885],
    Frame -> False, Axes -> False, ImageSize -> Medium,
    AspectRatio -> 1/15] & /@ (alice[[;; 2]]), {Bottom, Left},
 RotateLabel -> True, Spacings -> {0, 0}]
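The rescaling of index pairs into seconds used for `s` can be checked by hand on a toy case. The values below (`nToy`, `mToy`, `durToy`) are made up for illustration and are not taken from the recordings.

```wolfram
(* Assumption: a toy warping path and a toy duration of 2 seconds *)
nToy = {1, 2, 2, 3};
mToy = {1, 2, 3, 4};
durToy = 2.;

(* Same expression as for s: both columns are divided by the single largest index *)
sToy = Transpose[{nToy, mToy}]/Max[{nToy, mToy}] durToy
(* {{0.5, 0.5}, {1., 1.}, {1., 1.5}, {1.5, 2.}} *)
```

Both axes share one scale factor, so the path reaches the top-right corner of the plot only when the two feature sequences have the same number of frames; in this toy case the first coordinate stops at 1.5.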