Software that automatically recognizes sheet music from audio (an Oxford developer's open-source gem)

By Shisan, from Aofeisi
QbitAI report | WeChat official account: QbitAI

Given a picture, can you imagine what it sounds like?

A tool called SpectroGraphic can do exactly that.


For example, given a picture of Shrek, the tool generates audio whose spectrogram displays that picture.


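Embedding an image in a spectrogram

Most sounds are a complex combination of many sound waves, each with its own frequency and intensity.

A spectrogram is a way of representing sound: the horizontal axis is time, the vertical axis is frequency, and the brightness or color at each point shows how strong that frequency is at that moment.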


△ Example spectrogram
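To make the time-versus-frequency picture concrete, here is a minimal sketch of how a spectrogram can be computed with NumPy using a short-time Fourier transform. The function name `spectrogram` and the parameters `window_size` and `hop` are illustrative assumptions, not part of SpectroGraphic.

```python
import numpy as np

def spectrogram(samples, sample_rate, window_size=1024, hop=512):
    """Return (times, freqs, magnitudes) for a mono signal.

    magnitudes[i, j] is the strength of frequency freqs[i] around time times[j].
    Hypothetical helper for illustration only.
    """
    window = np.hanning(window_size)
    n_frames = 1 + (len(samples) - window_size) // hop
    # Cut the signal into overlapping, windowed frames (one row per frame)
    frames = np.stack([
        samples[k * hop : k * hop + window_size] * window
        for k in range(n_frames)
    ])
    # The FFT of each frame is the spectrum of that short time slice;
    # transpose so that rows are frequencies and columns are time steps
    magnitudes = np.abs(np.fft.rfft(frames, axis=1)).T
    freqs = np.fft.rfftfreq(window_size, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + window_size / 2) / sample_rate
    return times, freqs, magnitudes

# Demo: one second of a 440 Hz tone mixed with a quieter 2000 Hz tone
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 2000 * t)
times, freqs, mags = spectrogram(signal, sample_rate)

# The strongest row of the spectrogram should sit near 440 Hz
peak_freq = freqs[np.argmax(mags.mean(axis=1))]
```

Plotting `mags` as an image, with `times` on the horizontal axis and `freqs` on the vertical axis, would show two horizontal lines, one for each tone.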

What SpectroGraphic does is take an image and simply interpret it as a spectrogram.

The sound it generates therefore carries the image embedded in its spectrogram.
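The article does not show how SpectroGraphic does this internally. As a rough sketch of the general idea, each image row can be assigned a frequency, each column a time slice, and the pixel brightness can set the loudness of that frequency at that time. The function `image_to_sound` below is hypothetical; its parameter names only mirror the tool's command-line options, and the default values are arbitrary.

```python
import numpy as np

def image_to_sound(image, duration=2.0, min_freq=1000.0, max_freq=3000.0,
                   sample_rate=44100):
    """Turn a 2-D brightness array (rows x cols, values in 0..1) into samples.

    Hypothetical illustration, not SpectroGraphic's actual implementation.
    Row 0 is the top of the image, so it is mapped to max_freq.
    """
    rows, cols = image.shape
    n_samples = int(duration * sample_rate)
    t = np.arange(n_samples) / sample_rate
    # Which image column is "playing" at each sample
    col_index = np.minimum((t / duration * cols).astype(int), cols - 1)
    # One sine frequency per image row, highest frequency at the top
    freqs = np.linspace(max_freq, min_freq, rows)
    sound = np.zeros(n_samples)
    for row, freq in enumerate(freqs):
        amplitude = image[row, col_index]  # brightness of this row over time
        sound += amplitude * np.sin(2 * np.pi * freq * t)
    # Normalize to the range -1..1 so the result can be written to a .wav file
    peak = np.max(np.abs(sound))
    return sound / peak if peak > 0 else sound

# Demo: a tiny 3x4 "image" whose middle row is fully bright
image = np.zeros((3, 4))
image[1, :] = 1.0
samples = image_to_sound(image, duration=0.5)
```

In this demo only the middle row is bright, so the result is a single steady tone at the middle frequency (2000 Hz). A real photograph would produce many tones that switch on and off over time.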

Pretty cool, isn't it?

The project is now open source, so anyone can try it out!

Installation

First, run the following command to get the spectrographic command-line tool:

pip install spectrographic

Alternatively, you can use the spectrographic.py file in the stand-alone folder directly as a command-line tool.


In addition, make sure that all the dependencies listed in requirements.txt are satisfied.

They can be installed with the following command:

pip install -r requirements.txt

After installing with pip, simply run spectrographic […] in the console.

When using the stand-alone script, you must run python spectrographic.py […] instead.

You can also simply import the SpectroGraphic class from SpectroGraphic.base.

Using the command-line tool

usage: spectrographic [-h] [--version] -i PATH_TO_IMAGE [-d DURATION]
                      [-m MIN_FREQ] [-M MAX_FREQ] [-r RESOLUTION]
                      [-c CONTRAST] [-p] [-s SAVE_FILE]

Turn any image into sound.

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  -i PATH_TO_IMAGE, --image PATH_TO_IMAGE
                        Path of image that we want to embed in a spectrogram.
  -d DURATION, --duration DURATION
                        Duration of generated sound.
  -m MIN_FREQ, --min_freq MIN_FREQ
                        Smallest frequency used for drawing the image.
  -M MAX_FREQ, --max_freq MAX_FREQ
                        Largest frequency used for drawing the image.
  -r RESOLUTION, --resolution RESOLUTION
                        Vertical resolution of the image in the spectrogram.
  -c CONTRAST, --contrast CONTRAST
                        Contrast of the image in the spectrogram.
  -p, --play            Directly play the resulting sound.
  -s SAVE_FILE, --save SAVE_FILE
                        Path to .wav file in which to save the resulting sound.
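For readers who want to see how an interface like this is typically declared, here is a sketch using Python's standard argparse module. It mirrors the option names and help strings from the usage text above, but it is not the project's own code: the argument types (`float`, `int`) are assumptions, and `--version` is left out because the article does not give the version number.

```python
import argparse

def build_parser():
    """Build a parser that mirrors the options in the usage text (sketch only)."""
    parser = argparse.ArgumentParser(
        prog="spectrographic", description="Turn any image into sound.")
    parser.add_argument("-i", "--image", metavar="PATH_TO_IMAGE", required=True,
                        help="Path of image that we want to embed in a spectrogram.")
    # The numeric types below are assumptions made for this illustration
    parser.add_argument("-d", "--duration", type=float,
                        help="Duration of generated sound.")
    parser.add_argument("-m", "--min_freq", type=float,
                        help="Smallest frequency used for drawing the image.")
    parser.add_argument("-M", "--max_freq", type=float,
                        help="Largest frequency used for drawing the image.")
    parser.add_argument("-r", "--resolution", type=int,
                        help="Vertical resolution of the image in the spectrogram.")
    parser.add_argument("-c", "--contrast", type=float,
                        help="Contrast of the image in the spectrogram.")
    parser.add_argument("-p", "--play", action="store_true",
                        help="Directly play the resulting sound.")
    parser.add_argument("-s", "--save", metavar="SAVE_FILE",
                        help="Path to .wav file in which to save the resulting sound.")
    return parser

# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere
args = build_parser().parse_args([
    "--image", "./source.png", "--min_freq", "10000", "--max_freq", "20000",
    "--duration", "10", "--save", "sound.wav", "--play",
])
```

Options that are not passed, such as `--resolution` here, come back as `None`, which lets a program fall back to its own defaults.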

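Suppose your source image is at ./source.png and you want to generate 10 s of audio with a frequency range of 10 kHz to 20 kHz, save it as sound.wav, and then play it.

In that case, run the following command: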

spectrographic --image ./source.png --min_freq 10000 --max_freq 20000 --duration 10 --save sound.wav --play

If you are using the stand-alone script:

python spectrographic.py --image ./source.png --min_freq 10000 --max_freq 20000 --duration 10 --save sound.wav --play