PCM/LPCM audio format #
PCM (pulse-code modulation) is a method of digitizing analog signals. A PCM stream has two basic properties that determine the stream's fidelity to the original analog signal:
- the sampling rate, which is the number of times per second that samples are taken; and
- the bit depth, which determines the number of possible digital values that can be used to represent each sample.
The compact disc (CD) brought PCM to consumer audio applications with its introduction in 1982. The CD uses a 44,100 Hz sampling frequency and 16-bit resolution and stores up to 80 minutes of stereo audio per disc. Stereo audio is delivered as two channels.
- The audio contained in a CD-DA consists of two-channel signed 16-bit LPCM sampled at 44,100 Hz and written as a little-endian interleaved stream with the left channel coming first.
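LPCM: Linear pulse-code modulation (LPCM) is a way of representing a digital signal, used mainly for audio. It works by sampling the analog signal at regular intervals and quantizing each sample to a digital value on a linear scale.
LPCM is a form of pulse-code modulation (PCM) in which the quantization is explicitly linear. Each sample of the analog signal maps directly to a proportional digital value, and the conversion involves no non-linear companding.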
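The key steps in LPCM are sampling, quantization and encoding:
- Sampling: converts the continuous analog signal into a discrete-time signal. By the Nyquist theorem, the sampling rate must be more than twice the highest frequency in the signal to avoid aliasing. CD audio, for example, is sampled at 44.1 kHz, so it can represent frequencies up to just under 22.05 kHz, which covers the range of human hearing.
- Quantization: approximates the amplitude of each sample by one of a finite set of values. In LPCM this step is linear, meaning the dynamic range of the signal is divided evenly among the quantization levels. Precision is usually given in bits: CD-quality LPCM uses 16-bit quantization, which provides 65,536 (2^16) possible amplitude levels.
- Encoding: finally, the quantized values are written out as a digital signal that can be stored or transmitted. In LPCM these values represent the signal amplitude directly, with no further compression or coding.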
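The quantization step can be sketched in a few lines of C. This is only an illustration under stated assumptions: the input is a sample already normalized to [-1.0, 1.0], and the function names `lpcm16_quantize` and `lpcm16_dequantize` are made up for this example.

```c
#include <stdint.h>

/* Quantize a normalized sample in [-1.0, 1.0] to signed 16-bit LPCM.
 * The levels are evenly spaced (linear); out-of-range input is clipped.
 * This symmetric scaling uses -32767..32767 and leaves code -32768 unused. */
int16_t lpcm16_quantize(double x)
{
    if (x > 1.0)  x = 1.0;
    if (x < -1.0) x = -1.0;
    double scaled = x * 32767.0;
    /* Round half away from zero to the nearest level. */
    return (int16_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* Map a 16-bit code back to the normalized range. */
double lpcm16_dequantize(int16_t s)
{
    return (double)s / 32767.0;
}
```

For in-range input, the round trip through both functions differs from the original value by at most half a quantization step, which is the quantization error mentioned above.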
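LPCM is a lossless audio format in the sense that no information is discarded by compression (sampling and quantization themselves still approximate the original analog signal). For this reason LPCM is widely used where high audio quality matters, such as CD audio, DVD audio, Blu-ray audio and some professional recording systems.
The main advantages of LPCM are that it is simple, direct and high in quality. Its drawback is a relatively high data rate. Uncompressed CD-quality audio (stereo LPCM at 44.1 kHz and 16 bits), for example, needs 44,100 × 16 × 2 = 1,411,200 bit/s, or about 1.4 Mbps.
By contrast, many modern audio compression schemes, such as MP3 or AAC, cut the data rate sharply by discarding audio information that the human ear can hardly perceive, but that compression is lossy.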
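The data-rate figure is simple arithmetic: sample rate × bit depth × channel count. A minimal sketch (the function names are hypothetical, chosen for this example):

```c
#include <stdint.h>

/* Uncompressed LPCM bit rate in bits per second. */
uint32_t lpcm_bit_rate(uint32_t sample_rate_hz, uint32_t bits_per_sample,
                       uint32_t channels)
{
    return sample_rate_hz * bits_per_sample * channels;
}

/* Bytes needed to store `seconds` of uncompressed LPCM audio. */
uint64_t lpcm_bytes(uint32_t sample_rate_hz, uint32_t bits_per_sample,
                    uint32_t channels, uint32_t seconds)
{
    uint64_t bytes_per_second =
        (uint64_t)lpcm_bit_rate(sample_rate_hz, bits_per_sample, channels) / 8;
    return bytes_per_second * seconds;
}
```

`lpcm_bit_rate(44100, 16, 2)` gives 1,411,200 bit/s, and one minute of CD-quality audio takes `lpcm_bytes(44100, 16, 2, 60)` = 10,584,000 bytes.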
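Stereo and multichannel LPCM:
- In a stereo LPCM stream, the left and right channel samples are usually stored interleaved. A typical sequence is L1, R1, L2, R2, …, Ln, Rn, where L and R are the left and right channel samples and n is the sample index.
- In a multichannel LPCM stream, the channel samples can be organized in different ways. The most common is interleaving, where the samples of all channels for one sampling instant are stored together, in order: L1, C1, R1, LS1, RS1, L2, C2, R2, LS2, RS2, …, where L, C, R, LS and RS are the front left, center, front right, rear left and rear right channels.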
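The interleaved layout can be illustrated with two small C helpers. This is a sketch assuming 16-bit samples; the function names are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Build an interleaved stereo stream L1, R1, L2, R2, ... from two planes.
 * `out` must have room for 2 * frames samples. */
void pcm_interleave_stereo(const int16_t *left, const int16_t *right,
                           int16_t *out, size_t frames)
{
    for (size_t i = 0; i < frames; i++) {
        out[2 * i]     = left[i];
        out[2 * i + 1] = right[i];
    }
}

/* Extract channel `ch` (0-based) from an interleaved stream that has
 * `channels` samples per frame. `out` must have room for `frames` samples. */
void pcm_extract_channel(const int16_t *interleaved, size_t frames,
                         size_t channels, size_t ch, int16_t *out)
{
    for (size_t i = 0; i < frames; i++) {
        out[i] = interleaved[i * channels + ch];
    }
}
```

With five channels in the order L, C, R, LS, RS, calling `pcm_extract_channel(buf, frames, 5, 1, out)` would pull out the center channel.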
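Storing a multichannel LPCM stream usually involves the following steps:
- Choose a format: pick an audio file format based on the number of channels needed, the quality requirements and the acceptable file size.
- Prepare the audio data: arrange the LPCM data as the chosen format requires (channel order, sample rate, bit depth and so on).
- Write the file: write the audio data together with the required metadata (such as the format header) to the file.
- Verify: check that the file conforms to the format specification and can be read correctly by the target player or editing software.
With suitable audio editing or encoding software, a multichannel LPCM stream can be saved to a file in such a format, either through a graphical user interface or programmatically.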
WAV uncompressed audio format #
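PCM is a way of encoding audio, not a file format. PCM data is usually stored in WAV (Waveform Audio File Format) or AIFF (Audio Interchange File Format) files.
WAV (Waveform Audio File Format) is an audio file format usually used to store uncompressed audio data, in most cases encoded as Linear Pulse-Code Modulation (LPCM).
The WAV format was developed by Microsoft and IBM, originally for the Windows 3.1 operating system. Because it typically holds lossless LPCM and is supported almost everywhere, WAV became a popular choice for storing high-quality audio.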
Key parameters recorded in a WAV file:
- Sample rate
- Bit depth
- Channel count and interleaving
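In the common 44-byte WAV header, these parameters sit at fixed offsets as little-endian integers. The sketch below reads them out; it assumes the simplest layout (a 16-byte `fmt ` chunk immediately followed by the `data` chunk) and rejects anything else, whereas real files may carry extra chunks. The type and function names (`wav_info_t`, `wav_parse_header`) are hypothetical.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint16_t audio_format;    /* 1 = linear PCM */
    uint16_t num_channels;
    uint32_t sample_rate;     /* Hz */
    uint16_t bits_per_sample;
    uint32_t data_size;       /* bytes of sample data after the header */
} wav_info_t;

static uint16_t rd_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse a 44-byte header. Returns 0 on success, -1 if the chunk IDs are
 * not where the simple layout puts them. */
int wav_parse_header(const uint8_t hdr[44], wav_info_t *info)
{
    if (memcmp(hdr, "RIFF", 4) != 0 || memcmp(hdr + 8, "WAVE", 4) != 0 ||
        memcmp(hdr + 12, "fmt ", 4) != 0 || memcmp(hdr + 36, "data", 4) != 0) {
        return -1;
    }
    info->audio_format    = rd_le16(hdr + 20);
    info->num_channels    = rd_le16(hdr + 22);
    info->sample_rate     = rd_le32(hdr + 24);
    info->bits_per_sample = rd_le16(hdr + 34);
    info->data_size       = rd_le32(hdr + 40);
    return 0;
}
```

A player can use the parsed values to configure the I2S clock and slot width before streaming the sample data that follows the header.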
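Summary: how WAV files relate to LPCM:
- Stores LPCM data: the WAV format is commonly used to store LPCM-encoded audio. A WAV file holds the signal exactly as it was sampled, quantized and encoded, so nothing is lost after digitization.
- Lossless: LPCM applies no lossy compression, so a WAV file containing LPCM is lossless too. This makes WAV well suited to work that needs high-quality audio, such as professional music production, audio editing and audio analysis.
- High data rate: LPCM data is uncompressed, so WAV files have a high data rate. Standard CD-quality audio (44.1 kHz, 16-bit, stereo) runs at about 1.4 Mbps, so WAV files can get quite large, especially for long recordings.
- Widely used: because it is simple, lossless and high in quality, WAV is used in many applications, especially where the original audio quality is needed, such as music production, film post-production, broadcasting and scientific research.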
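A WAV file needs no decoding: after the header, the LPCM data can be read as-is and sent over the I2S interface to the amplifier chip for playback.
Reference: https://pyjamabrah.com/posts/pcm/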
// https://github.com/espressif/esp-box/blob/master/examples/watering_demo/main/app/app_audio.c
static void audio_beep_task(void *pvParam)
{
while (true) {
xSemaphoreTake(audio_sem, portMAX_DELAY);
b_audio_playing = true;
sr_echo_play("/spiffs/echo_en_wake.wav"); // play the audio data of the WAV file directly
b_audio_playing = false;
/* It's useful if wake audio didn't finish playing when next wake word detetced */
// xSemaphoreTake(audio_sem, 0);
}
}
esp_err_t sr_echo_play(void *filepath)
{
FILE *fp = NULL;
struct stat file_stat;
esp_err_t ret = ESP_OK;
const size_t chunk_size = 4096;
uint8_t *buffer = malloc(chunk_size);
ESP_GOTO_ON_FALSE(NULL != buffer, ESP_FAIL, EXIT, TAG, "buffer malloc failed");
ESP_GOTO_ON_FALSE(-1 != stat(filepath, &file_stat), ESP_FAIL, EXIT, TAG, "Failed to stat file");
fp = fopen(filepath, "r");
ESP_GOTO_ON_FALSE(NULL != fp, ESP_FAIL, EXIT, TAG, "Failed create record file");
wav_header_t wav_head;
int len = fread(&wav_head, 1, sizeof(wav_header_t), fp);
ESP_GOTO_ON_FALSE(len > 0, ESP_FAIL, EXIT, TAG, "Read wav header failed");
if (NULL == strstr((char *)wav_head.Subchunk1ID, "fmt") &&
NULL == strstr((char *)wav_head.Subchunk2ID, "data")) {
ESP_LOGI(TAG, "PCM format");
fseek(fp, 0, SEEK_SET);
wav_head.SampleRate = 16000;
wav_head.NumChannels = 2;
wav_head.BitsPerSample = 16;
}
ESP_LOGD(TAG, "frame_rate= %" PRIi32 ", ch=%d, width=%d", wav_head.SampleRate, wav_head.NumChannels, wav_head.BitsPerSample);
bsp_codec_set_fs(wav_head.SampleRate, wav_head.BitsPerSample, I2S_SLOT_MODE_STEREO);
bsp_codec_mute_set(true);
bsp_codec_mute_set(false);
bsp_codec_volume_set(100, NULL);
size_t cnt, total_cnt = 0;
do {
/* Read file in chunks into the scratch buffer */
len = fread(buffer, 1, chunk_size, fp);
if (len <= 0) {
break;
} else if (len > 0) {
bsp_i2s_write(buffer, len, &cnt, portMAX_DELAY);
total_cnt += cnt;
}
} while (1);
ESP_LOGI(TAG, "play end, %d K", total_cnt / 1024);
EXIT:
if (fp) {
fclose(fp);
}
if (buffer) {
free(buffer);
}
return ret;
}
Three main types of audio file formats #
- Uncompressed formats, such as WAV, AIFF and raw PCM;
- Lossless compressed formats: FLAC, APE (Monkey's Audio), WV (WavPack), TTA, ATRAC Advanced Lossless, ALAC (Apple Lossless, filename extension .m4a), MPEG-4 SLS, MPEG-4 ALS, MPEG-4 DST, Windows Media Audio Lossless (WMA Lossless), Shorten (SHN);
- Lossy compressed formats: Opus, MP3, Vorbis, Musepack, AAC, ATRAC and Windows Media Audio Lossy (WMA lossy).
.m4a: An audio-only MPEG-4 file, used by Apple for unprotected music downloaded from their iTunes Music Store. Audio within the m4a file is typically encoded with AAC, although lossless ALAC may also be used.
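- WAV (Waveform Audio File Format):
  - Universal support: WAV is one of the most widely supported audio file formats. It was developed by Microsoft and IBM and stores LPCM streams natively.
  - Lossless quality: a WAV file can store LPCM data without loss, preserving the original audio quality.
  - Metadata: the format records details of the audio stream, such as sample rate, bit depth and channel count.
  - File size: WAV files are usually uncompressed, so they can be very large, especially at high sample rates, high bit depths and with many channels.
- AIFF (Audio Interchange File Format):
  - Similar to WAV: AIFF is an audio file format developed by Apple. It is very close to WAV and offers lossless audio quality and broad metadata support.
  - Cross-platform: AIFF was originally designed for the Macintosh, but it is now supported on many platforms.
  - File size: like WAV, AIFF files can be quite large, particularly when storing high-quality multichannel LPCM audio.
- FLAC (Free Lossless Audio Codec):
  - Lossless compression: FLAC reduces file size without any loss of audio quality, which suits LPCM audio data.
  - Tags and metadata: FLAC supports rich tags and metadata, which helps music management and player recognition.
  - Multichannel: although it is mostly used for stereo, FLAC supports up to 8 channels, so it can also store multichannel LPCM streams.
- Multichannel WAV/RF64:
  - Large files: to get past the 4 GB size limit of WAV files, extended formats such as RF64 were designed to support larger files, which suits long, high-quality multichannel recordings.
  - Compatibility: these formats were designed as extensions of the standard WAV layout, keeping its structure while raising the amount of data a file can hold.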
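I2S interface and audio playback #
As a rough guide, a voice-prompt file is about 5 KB in MP3 format and about 60 KB as uncompressed WAV. If 2 MB of flash is set aside for voice prompts, it therefore holds far more of them as MP3 than as WAV.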
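To put numbers on that comparison, the sketch below counts how many prompt files of a given size fit in a storage budget. It deliberately ignores filesystem and partition overhead, so the real counts would be somewhat lower; the function name is made up for this example.

```c
#include <stdint.h>

/* Number of whole files of `file_bytes` that fit in `storage_bytes`,
 * ignoring any filesystem overhead. Returns 0 for a zero file size. */
uint32_t files_that_fit(uint32_t storage_bytes, uint32_t file_bytes)
{
    if (file_bytes == 0) {
        return 0;
    }
    return storage_bytes / file_bytes;
}
```

With 2 MB (2 × 1024 × 1024 bytes) of flash, `files_that_fit` gives 409 prompts at 5 KB each versus 34 prompts at 60 KB each.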
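WAV stores uncompressed PCM data, which can be sent directly over the I2S interface to a digital audio chip for playback.
Compressed formats such as MP3 must first be decoded to PCM, in software or in hardware, before the data can be sent over the I2S digital audio interface to the amplifier chip. With a WM8978 codec module, for example, the steps are:
- Configure the WM8978 module over I2C
- Initialize the I2S interface of the ESP32
- Allocate a data buffer larger than 4096 bytes
- Read one sector (4096 bytes) from flash
- Convert it into the bitstream form the decoder expects (for example, for the open-source mad MP3 decoding library)
- Start MP3 decoding
- Once the 4096 bytes have been decoded, send the PCM data to the WM8978 module over I2S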
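In summary:
- Before an ESP32 can play an MP3 file, the file has to be decoded, and the decoder output is PCM:
  - The open-source MAD (MPEG Audio Decoder) MP3 decoding library: https://www.underbit.com/products/mad/
  - The libhelix-mp3 decoding library used by esp-audio-player on the ESP32-S3-BOX: https://github.com/ultraembedded/libhelix-mp3/tree/master
  - The open-source ESP32-audioI2S: https://github.com/schreibfaul1/ESP32-audioI2S
    - It can decode and play mp3, m4a and wav files from an SD card via I2S. HELIX mp3 and aac decoders are included. There is also an Opus decoder for fullband, a Vorbis decoder (.ogg format) and a FLAC decoder.
- The decoded PCM data is then sent over the I2S interface to the digital audio amplifier chip (codec chip);
- The amplifier chip performs the DAC conversion and drives the speaker;
- For codec chips with MIC input, the driver also uses the I2S interface to read the PCM audio data produced by the ADC and then processes it further: saving it directly as an uncompressed WAV file, compressing it into another format such as MP3 or AAC for storage on a TF card, or sending it back to the codec chip for playback.
Note: I2S is a transfer protocol for digital audio signals (not necessarily a physical connector), whereas PCM is the encoding of the digital audio itself, which a DAC can convert directly into an analog signal.
An all-in-one decoding and playback example using ESP32-audioI2S: https://github.com/schreibfaul1/ESP32-audioI2S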

// https://github.com/schreibfaul1/ESP32-audioI2S
#include "Arduino.h"
#include "WiFi.h"
#include "Audio.h"
#include "SD.h"
#include "FS.h"
// Digital I/O used
#define SD_CS 5
#define SPI_MOSI 23
#define SPI_MISO 19
#define SPI_SCK 18
#define I2S_DOUT 25
#define I2S_BCLK 27
#define I2S_LRC 26
Audio audio;
String ssid = "*******";
String password = "*******";
void setup() {
pinMode(SD_CS, OUTPUT); digitalWrite(SD_CS, HIGH);
SPI.begin(SPI_SCK, SPI_MISO, SPI_MOSI);
Serial.begin(115200);
SD.begin(SD_CS);
WiFi.disconnect();
WiFi.mode(WIFI_STA);
WiFi.begin(ssid.c_str(), password.c_str());
while (WiFi.status() != WL_CONNECTED) delay(1500);
audio.setPinout(I2S_BCLK, I2S_LRC, I2S_DOUT);
audio.setVolume(21); // default 0...21
// or alternative
// audio.setVolumeSteps(64); // max 255
// audio.setVolume(63);
//
// *** radio streams ***
audio.connecttohost("http://stream.antennethueringen.de/live/aac-64/stream.antennethueringen.de/"); // aac
// audio.connecttohost("http://mcrscast.mcr.iol.pt/cidadefm"); // mp3
// audio.connecttohost("http://www.wdr.de/wdrlive/media/einslive.m3u"); // m3u
// audio.connecttohost("https://stream.srg-ssr.ch/rsp/aacp_48.asx"); // asx
// audio.connecttohost("http://tuner.classical102.com/listen.pls"); // pls
// audio.connecttohost("http://stream.radioparadise.com/flac"); // flac
// audio.connecttohost("http://stream.sing-sing-bis.org:8000/singsingFlac"); // flac (ogg)
// audio.connecttohost("http://s1.knixx.fm:5347/dein_webradio_vbr.opus"); // opus (ogg)
// audio.connecttohost("http://stream2.dancewave.online:8080/dance.ogg"); // vorbis (ogg)
// audio.connecttohost("http://26373.live.streamtheworld.com:3690/XHQQ_FMAAC/HLSTS/playlist.m3u8"); // HLS
// audio.connecttohost("http://eldoradolive02.akamaized.net/hls/live/2043453/eldorado/master.m3u8"); // HLS (ts)
// *** web files ***
// audio.connecttohost("https://github.com/schreibfaul1/ESP32-audioI2S/raw/master/additional_info/Testfiles/Pink-Panther.wav"); // wav
// audio.connecttohost("https://github.com/schreibfaul1/ESP32-audioI2S/raw/master/additional_info/Testfiles/Santiano-Wellerman.flac"); // flac
// audio.connecttohost("https://github.com/schreibfaul1/ESP32-audioI2S/raw/master/additional_info/Testfiles/Olsen-Banden.mp3"); // mp3
// audio.connecttohost("https://github.com/schreibfaul1/ESP32-audioI2S/raw/master/additional_info/Testfiles/Miss-Marple.m4a"); // m4a (aac)
// audio.connecttohost("https://github.com/schreibfaul1/ESP32-audioI2S/raw/master/additional_info/Testfiles/Collide.ogg"); // vorbis
// audio.connecttohost("https://github.com/schreibfaul1/ESP32-audioI2S/raw/master/additional_info/Testfiles/sample.opus"); // opus
// *** local files ***
// audio.connecttoFS(SD, "/test.wav"); // SD
// audio.connecttoFS(SD_MMC, "/test.wav"); // SD_MMC
// audio.connecttoFS(SPIFFS, "/test.wav"); // SPIFFS
// audio.connecttospeech("Wenn die Hunde schlafen, kann der Wolf gut Schafe stehlen.", "de"); // Google TTS
}
void loop()
{
audio.loop();
}
// optional
void audio_info(const char *info){
Serial.print("info "); Serial.println(info);
}
void audio_id3data(const char *info){ //id3 metadata
Serial.print("id3data ");Serial.println(info);
}
void audio_eof_mp3(const char *info){ //end of file
Serial.print("eof_mp3 ");Serial.println(info);
}
void audio_showstation(const char *info){
Serial.print("station ");Serial.println(info);
}
void audio_showstreamtitle(const char *info){
Serial.print("streamtitle ");Serial.println(info);
}
void audio_bitrate(const char *info){
Serial.print("bitrate ");Serial.println(info);
}
void audio_commercial(const char *info){ //duration in sec
Serial.print("commercial ");Serial.println(info);
}
void audio_icyurl(const char *info){ //homepage
Serial.print("icyurl ");Serial.println(info);
}
void audio_lasthost(const char *info){ //stream URL played
Serial.print("lasthost ");Serial.println(info);
}
void audio_eof_speech(const char *info){
Serial.print("eof_speech ");Serial.println(info);
}
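Digital audio amplifier chips (codec) #
Digital audio amplifier chips are generally also called codec chips:
- They decode PCM digital audio and convert it through a DAC into an analog output signal;
- They convert the analog sound signal picked up by the MIC through an ADC and encode it as a PCM bitstream;
- The driver sends and receives PCM data over the I2S interface;
- The chip is usually configured over an I2C interface.
Besides playing PCM-encoded digital audio, a typical I2S audio amplifier chip such as the ES8374 also provides control functions (mute, volume and so on) and MIC input.
- The MIC path of the codec chip uses an ADC to convert sound into PCM data. The driver can read this data over the I2S interface and process it further, for example encoding it and saving it to a TF card, or playing it back.
Example: https://github.com/espressif/esp-box/blob/master/examples/usb_headset/main/src/usb_headset.c
If better audio quality and more interface options are needed, an external I2S codec can handle all analog input and output signals.
Different codec chips offer different extra features, such as preamplifiers for audio input signals, headphone output amplifiers, multiple analog inputs and outputs, and sound-effect processing.
I2S is the industry-standard interface for audio codec chips and is typically used for high-speed, continuous transfer of audio data. Optimizing audio processing performance may require extra memory. In that case, consider the ESP32-WROVER-E module, which integrates 8 MB of PSRAM with the ESP32 chip.
https://docs.espressif.com/projects/esp-adf/en/latest/design-guide/project-design.html

For the ESP32, Espressif provides the Espressif Audio Development Framework (ESP-ADF), which supports the common codec formats: https://docs.espressif.com/projects/esp-adf/en/latest/index.html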

I (397) PLAY_FLASH_MP3_CONTROL: [ 1 ] Start audio codec chip
I (427) PLAY_FLASH_MP3_CONTROL: [ 2 ] Create audio pipeline, add all elements to pipeline, and subscribe pipeline event
I (427) PLAY_FLASH_MP3_CONTROL: [2.1] Create mp3 decoder to decode mp3 file and set custom read callback
I (437) PLAY_FLASH_MP3_CONTROL: [2.2] Create i2s stream to write data to codec chip
I (467) PLAY_FLASH_MP3_CONTROL: [2.3] Register all elements to audio pipeline
I (467) PLAY_FLASH_MP3_CONTROL: [2.4] Link it together [mp3_music_read_cb]-->mp3_decoder-->i2s_stream-->[codec_chip]
I (477) PLAY_FLASH_MP3_CONTROL: [ 3 ] Set up event listener
I (477) PLAY_FLASH_MP3_CONTROL: [3.1] Listening event from all elements of pipeline
I (487) PLAY_FLASH_MP3_CONTROL: [ 4 ] Start audio_pipeline
I (507) PLAY_FLASH_MP3_CONTROL: [ * ] Receive music info from mp3 decoder, sample_rates=44100, bits=16, ch=2
I (7277) PLAY_FLASH_MP3_CONTROL: [ 5 ] Stop audio_pipeline
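Examples: https://github.com/espressif/esp-adf/tree/master/examples
Recording audio #
A microphone module such as the INMP441 converts sound into a digital signal (a PCM-encoded stream), and the ESP32 driver reads the digital audio over the I2S interface.
- INMP441 module will be acting as a mic input for capturing mono 16-bit audio signals at rate 8000 samples per second.
- Digital audio amplifier chips usually integrate a MIC input as well, and its PCM data is also read over the I2S interface, which is why they are called codec chips.
For an analog MIC, the ADC pins of the ESP32 can convert the signal to LPCM, which is then saved to a WAV file.
After reading PCM audio from the MIC over I2S, store it on the SD card as a WAV file:
- WAV file: metadata header + raw LPCM data.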
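// https://www.makerfabs.com/blog/post/how-to-make-an-esp32-sound-recorder
// Adapted: the buffer length is passed in as buffSize, because sizeof(buff)
// on a char* yields the pointer size (4 bytes on ESP32), not the buffer length.
void WM8960_Record(String filename, char *buff, size_t buffSize, int record_time)
{
    const int headerSize = 44;
    byte header[headerSize];
    // Bytes of audio: seconds * 16000 Hz * 16 bits * 2 channels / 8
    int waveDataSize = record_time * 16000 * 16 * 2 / 8;
    unsigned long start_time = millis();
    unsigned long part_time = start_time;
    File file = SD.open(filename, FILE_WRITE);
    if (!file || buffSize == 0)
        return;
    // Write the header first so the sample data starts at byte 44 instead of
    // having its first 44 bytes overwritten by the header afterwards.
    CreateWavHeader(header, waveDataSize);
    file.write(header, headerSize);
    Serial.println("Begin to record:");
    for (size_t j = 0; j < waveDataSize / buffSize; ++j)
    {
        I2S_Read(buff, buffSize);
        file.write((const byte *)buff, buffSize);
        // Print a progress dot roughly once per second
        if ((millis() - part_time) > 1000)
        {
            Serial.print(".");
            part_time = millis();
        }
    }
    Serial.println("");
    Serial.println("Finish");
    Serial.println(millis() - start_time);
    file.close();
}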
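Playing a WAV file:
// https://www.makerfabs.com/blog/post/how-to-make-an-esp32-sound-recorder
// Adapted: the buffer length is passed in as buffSize, because sizeof(buff)
// on a char* yields the pointer size, not the buffer length.
void WM8960_Play(String filename, char *buff, size_t buffSize)
{
    File file = SD.open(filename);
    if (!file)
        return;
    Serial.println("Begin to play:");
    Serial.println(filename);
    file.seek(44); // skip the 44-byte WAV header
    size_t n;
    // Send only the bytes actually read, so a short final chunk carries no stale data
    while ((n = file.readBytes(buff, buffSize)) > 0)
    {
        I2S_Write(buff, n);
    }
    Serial.println("Finish");
    file.close();
}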
Another example that reads data from a MIC over I2S and stores it in a WAV file: https://github.com/MhageGH/esp32_SoundRecorder/tree/master
#include "Arduino.h"
#include <FS.h>
#include "Wav.h"
#include "I2S.h"
#include <SD.h>
//comment the first line and uncomment the second if you use MAX9814
//#define I2S_MODE I2S_MODE_RX
#define I2S_MODE I2S_MODE_ADC_BUILT_IN
const int record_time = 10; // second
const char filename[] = "/sound.wav";
const int headerSize = 44;
const int waveDataSize = record_time * 88000;
const int numCommunicationData = 8000;
const int numPartWavData = numCommunicationData/4;
byte header[headerSize];
char communicationData[numCommunicationData];
char partWavData[numPartWavData];
File file;
void setup() {
Serial.begin(115200);
if (!SD.begin()) Serial.println("SD begin failed");
while(!SD.begin()){
Serial.print(".");
delay(500);
}
CreateWavHeader(header, waveDataSize);
SD.remove(filename);
file = SD.open(filename, FILE_WRITE);
if (!file) return;
file.write(header, headerSize);
I2S_Init(I2S_MODE, I2S_BITS_PER_SAMPLE_32BIT);
for (int j = 0; j < waveDataSize/numPartWavData; ++j) {
I2S_Read(communicationData, numCommunicationData);
for (int i = 0; i < numCommunicationData/8; ++i) {
partWavData[2*i] = communicationData[8*i + 2];
partWavData[2*i + 1] = communicationData[8*i + 3];
}
file.write((const byte*)partWavData, numPartWavData);
}
file.close();
Serial.println("finish");
}
void loop() {
}
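The inner loop of the sketch above copies bytes 2 and 3 out of every 8 bytes read from I2S. A plausible reading is that each 8-byte frame holds two 32-bit little-endian slots and the loop keeps the upper 16 bits of the first slot, turning 32-bit stereo frames into 16-bit mono samples. The stand-alone C function below restates that step so it can be checked on a PC; the function name is hypothetical, and which slot actually carries the microphone signal depends on the hardware wiring and I2S configuration.

```c
#include <stddef.h>
#include <stdint.h>

/* Reduce 8-byte I2S frames (two 32-bit little-endian slots) to 16-bit mono
 * by keeping the upper 16 bits of the first slot of each frame.
 * `out` must have room for in_bytes / 4 bytes.
 * Returns the number of bytes written to `out`. */
size_t i2s32_stereo_to_pcm16_mono(const uint8_t *in, size_t in_bytes,
                                  uint8_t *out)
{
    size_t frames = in_bytes / 8;
    for (size_t i = 0; i < frames; i++) {
        out[2 * i]     = in[8 * i + 2];  /* low byte of the upper half-word */
        out[2 * i + 1] = in[8 * i + 3];  /* high byte of the upper half-word */
    }
    return frames * 2;
}
```

This matches the buffer sizes in the sketch: 8000 bytes read per I2S transfer shrink to 2000 bytes of WAV data.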
// Builds the 44-byte WAV file header (implementation for Wav.h)
#include "Wav.h"
void CreateWavHeader(byte* header, int waveDataSize){
header[0] = 'R';
header[1] = 'I';
header[2] = 'F';
header[3] = 'F';
unsigned int fileSizeMinus8 = waveDataSize + 44 - 8;
header[4] = (byte)(fileSizeMinus8 & 0xFF);
header[5] = (byte)((fileSizeMinus8 >> 8) & 0xFF);
header[6] = (byte)((fileSizeMinus8 >> 16) & 0xFF);
header[7] = (byte)((fileSizeMinus8 >> 24) & 0xFF);
header[8] = 'W';
header[9] = 'A';
header[10] = 'V';
header[11] = 'E';
header[12] = 'f';
header[13] = 'm';
header[14] = 't';
header[15] = ' ';
header[16] = 0x10; // fmt chunk size = 16 bytes
header[17] = 0x00;
header[18] = 0x00;
header[19] = 0x00;
header[20] = 0x01; // linear PCM
header[21] = 0x00;
header[22] = 0x01; // number of channels = 1 (mono)
header[23] = 0x00;
header[24] = 0x44; // sampling rate 44100
header[25] = 0xAC;
header[26] = 0x00;
header[27] = 0x00;
header[28] = 0x88; // Byte/sec = 44100x2x1 = 88200
header[29] = 0x58;
header[30] = 0x01;
header[31] = 0x00;
header[32] = 0x02; // block align = 2 bytes per frame (16-bit mono)
header[33] = 0x00;
header[34] = 0x10; // 16bit
header[35] = 0x00;
header[36] = 'd';
header[37] = 'a';
header[38] = 't';
header[39] = 'a';
header[40] = (byte)(waveDataSize & 0xFF);
header[41] = (byte)((waveDataSize >> 8) & 0xFF);
header[42] = (byte)((waveDataSize >> 16) & 0xFF);
header[43] = (byte)((waveDataSize >> 24) & 0xFF);
}
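The header above hard-codes 44.1 kHz, mono, 16-bit. A parameterized variant could look like the following sketch, which fills the same 44 bytes from the stream parameters. The function and helper names are made up for this example, and it covers only plain LPCM (format code 1) with a single `data` chunk.

```c
#include <stdint.h>
#include <string.h>

static void wr_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

static void wr_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Fill a 44-byte WAV header for LPCM data of `data_size` bytes. */
void wav_build_header(uint8_t hdr[44], uint32_t sample_rate, uint16_t channels,
                      uint16_t bits_per_sample, uint32_t data_size)
{
    /* Bytes per frame: one sample for every channel. */
    uint16_t block_align = (uint16_t)(channels * bits_per_sample / 8);

    memcpy(hdr, "RIFF", 4);
    wr_le32(hdr + 4, data_size + 36);              /* file size minus 8 */
    memcpy(hdr + 8, "WAVE", 4);
    memcpy(hdr + 12, "fmt ", 4);
    wr_le32(hdr + 16, 16);                         /* fmt chunk size */
    wr_le16(hdr + 20, 1);                          /* format 1 = linear PCM */
    wr_le16(hdr + 22, channels);
    wr_le32(hdr + 24, sample_rate);
    wr_le32(hdr + 28, sample_rate * block_align);  /* bytes per second */
    wr_le16(hdr + 32, block_align);
    wr_le16(hdr + 34, bits_per_sample);
    memcpy(hdr + 36, "data", 4);
    wr_le32(hdr + 40, data_size);
}
```

Calling `wav_build_header(hdr, 44100, 1, 16, waveDataSize)` produces the same bytes as the fixed version, while `wav_build_header(hdr, 16000, 2, 16, size)` would describe 16 kHz stereo data.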
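Besides digital MICs with an I2S interface, MICs with analog output are also common. In that case the ADC pins of the ESP32 can do the analog-to-digital conversion, and the result can be saved as an LPCM-encoded WAV file: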


// https://github.com/AlirezaSalehy/WAVRecorder/blob/main/library/library.ino
#include <SD.h>
#include <SPI.h>
#include "src/WAVRecorder.h"
#include "src/AudioSystem.h"
#include "src/SoundActivityDetector.h"
#define SAMPLE_RATE 16000
#define SAMPLE_LEN 8
// Hardware SPI's CS pin which is different in each board
#ifdef ESP8266
#define CS_PIN 16
#elif ARDUINO_SAM_DUE
#define CS_PIN 4
#elif ESP32
#define CS_PIN 5
#endif
// The analog pins (ADC inputs) which microphone outputs are connected to.
#define MIC_PIN_1 34
#define MIC_PIN_2 35
#define NUM_CHANNELS 1
channel_t channels[] = {{MIC_PIN_1}};
char file_name[] = "/sample.wav";
File dataFile;
#if defined(ESP32) || defined(ESP8266)
AudioSystem* as;
#endif
WAVRecorder* wr;
//SoundActivityDetector* sadet;
void recordAndPlayBack();
void setup() {
for (int i = 0; i < sizeof(channels)/sizeof(channel_t); i++)
pinMode(channels[i].ADCPin, INPUT);
//analogReadResolution(12); for ESP32
pinMode(LED_BUILTIN, OUTPUT);
Serial.begin(115200);
Serial.println();
// put your setup code here, to run once:
if (!SD.begin(CS_PIN)) {
Serial.println("Failes to initialize SD!");
}
else {
Serial.println("SD opened successfuly");
}
SPI.setClockDivider(SPI_CLOCK_DIV2); // This is becuase feeding SD Card with more than 40 Mhz, leads to unstable operation.
// (Also depends on SD class) ESP8266 & ESP32 SPI clock with no division is 80 Mhz.
#if defined(ESP32) || defined(ESP8266)
as = new AudioSystem(CS_PIN);
#endif
//sadet = new SoundActivityDetector(channels[0].ADCPin, 2000, 10 * 512, 6 * 512, &Serial);
wr = new WAVRecorder(12, channels, NUM_CHANNELS, SAMPLE_RATE, SAMPLE_LEN, &Serial);
}
void loop() {
// put your main code here, to run repeatedly:
recordAndPlayBack();
}
void recordAndPlayBack() {
if (SD.exists(file_name)) {
SD.remove(file_name);
Serial.println("File removed!");
}
dataFile = SD.open(file_name, FILE_WRITE);
if (!dataFile) {
Serial.println("Failed to open the file!");
return;
}
// Setting file to store recodring
wr->setFile(&dataFile);
Serial.println("Started");
// With checks Sound power level and it exceeds a threshold recording starts and stops recording when power fall behind another threshold.
//wr->startBlocking(sadet);
// Recording for 3000 ms
wr->startBlocking(3000);
Serial.println("File Created");
Serial.println("Playing file");
#if defined(ESP32) || defined(ESP8266)
as->playAudioBlocking(file_name);
#endif
}
Another example: Broadcasting Your Voice with ESP32-S3 & INMP441
The ESP32-S3’s I2S interface is set up to handle the audio data using Direct Memory Access (DMA) buffers. DMA allows for efficient data transfer without involving the main processor, offloading the task to a dedicated DMA controller. By configuring the DMA buffer in I2S, the captured audio samples can be stored and transmitted seamlessly.
void i2s_install() {
// Set up I2S Processor configuration
const i2s_config_t i2s_config = {
.mode = i2s_mode_t(I2S_MODE_MASTER | I2S_MODE_RX),
.sample_rate = 44100,
//.sample_rate = 16000,
.bits_per_sample = i2s_bits_per_sample_t(16),
.channel_format = I2S_CHANNEL_FMT_ONLY_LEFT,
.communication_format = i2s_comm_format_t(I2S_COMM_FORMAT_STAND_I2S),
.intr_alloc_flags = 0,
.dma_buf_count = bufferCnt,
.dma_buf_len = bufferLen,
.use_apll = false
};
i2s_driver_install(I2S_PORT, &i2s_config, 0, NULL);
}
void micTask(void* parameter) {
i2s_install();
i2s_setpin();
i2s_start(I2S_PORT);
size_t bytesIn = 0;
while (1) {
esp_err_t result = i2s_read(I2S_PORT, &sBuffer, bufferLen, &bytesIn, portMAX_DELAY);
if (result == ESP_OK && isWebSocketConnected) {
client.sendBinary((const char*)sBuffer, bytesIn);
}
}
}
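The snippet does not show the values of `bufferCnt` and `bufferLen`, so the arithmetic below uses assumed figures only. It sketches how the DMA settings translate into memory use and per-buffer duration, assuming that `dma_buf_len` counts frames (one sample per channel), as the legacy ESP-IDF I2S driver documents it. The function names are invented for this example.

```c
#include <stdint.h>

/* Total DMA memory in bytes for `buf_count` buffers of `buf_len_frames`
 * frames each, with `channels` samples per frame. */
uint32_t i2s_dma_total_bytes(uint32_t buf_count, uint32_t buf_len_frames,
                             uint32_t channels, uint32_t bits_per_sample)
{
    return buf_count * buf_len_frames * channels * (bits_per_sample / 8);
}

/* Duration of audio held by one DMA buffer, in microseconds (rounded down). */
uint32_t i2s_dma_buffer_us(uint32_t buf_len_frames, uint32_t sample_rate_hz)
{
    return (uint32_t)((uint64_t)buf_len_frames * 1000000u / sample_rate_hz);
}
```

With an assumed 10 buffers of 1024 frames, 16-bit mono, the DMA memory comes to 20,480 bytes, and one buffer holds about 23.2 ms of audio at 44.1 kHz. Longer buffers mean fewer interrupts but more latency before a captured block reaches the application.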
References #
https://diyi0t.com/i2s-sound-tutorial-for-esp32/
Official chat demo #
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs$ source ~/esp/esp-idf/v5.2.1/export.sh
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs$ disable_proxy
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs$ idf.py menuconfig
[17/17] lvgl/lvgl (8.4.0)
-- use sdkconfig.defaults
-- use Kconfig.projbuild
-- PLATFORM ESP32_S3_BOX_3.
-- Project sdkconfig file /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/sdkconfig
Loading defaults file /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/sdkconfig.defaults...
-- Compiler supported targets: xtensa-esp-elf
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of time_t
-- Check size of time_t - done
-- Found Python3: /Users/alizj/.espressif/python_env/idf5.2_py3.12_env/bin/python (found version "3.12.3") found components: Interpreter
-- Performing Test C_COMPILER_SUPPORTS_WFORMAT_SIGNEDNESS
-- Performing Test C_COMPILER_SUPPORTS_WFORMAT_SIGNEDNESS - Success
-- App "factory_nvs" version: v0.5.0-63-ga9bbead
-- Adding linker script /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/build/esp-idf/esp_system/ld/memory.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_system/ld/esp32s3/sections.ld.in
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.api.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.libgcc.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.newlib.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.version.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/soc/esp32s3/ld/esp32s3.peripherals.ld
Project name: factory_nvs
Project version: v0.5.0-63-ga9bbead
-- ESP_TINYUF2: 0.2.1
-- BUTTON: 3.2.0
-- ESP_LCD_ILI9341: 1.2.0
-- use sdkconfig
-- PLATFORM ESP32_S3_BOX_3.
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/build
Running ninja in directory /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/build
Executing "ninja menuconfig"...
[0/1] cd /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/build && /Users/alizj/.espress...VERSION=5.2.1 --output config /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/sdkconfi
Loading defaults file /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/sdkconfig.defaults...
TERM environment variable is set to "xterm-256color"
Loaded configuration '/Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/sdkconfig'
No changes to save (for '/Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/sdkconfig')
Loading defaults file /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/sdkconfig.defaults...
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs$ ls -lt
total 176K
drwxr-xr-x 28 alizj 896 5 15 11:49 build/
-rw-r--r-- 1 alizj 73K 5 15 11:49 sdkconfig
-rw-r--r-- 1 alizj 66K 5 15 11:44 sdkconfig.old
drwxr-xr-x 18 alizj 576 5 15 11:44 managed_components/
-rw-r--r-- 1 alizj 3.7K 5 15 11:43 dependencies.lock
drwxr-xr-x 6 alizj 192 5 11 11:45 squareline/
-rw-r--r-- 1 alizj 1.1K 5 11 11:45 sdkconfig.defaults
-rw-r--r-- 1 alizj 43 5 11 11:45 sdkconfig.ci.box-lite
-rw-r--r-- 1 alizj 40 5 11 11:45 sdkconfig.ci.box-3
-rw-r--r-- 1 alizj 38 5 11 11:45 sdkconfig.ci.box
-rw-r--r-- 1 alizj 342 5 11 11:45 partitions.csv
drwxr-xr-x 8 alizj 256 5 11 11:45 main/
-rw-r--r-- 1 alizj 968 5 11 11:45 README.md
-rw-r--r-- 1 alizj 475 5 11 11:45 CMakeLists.txt
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs$ cat partitions.csv
# Name, Type, SubType, Offset, Size, Flags
# Note: if you have increased the bootloader size, make sure to update the offsets to avoid overlap
nvs, data, nvs, 0x9000, 0x4000,
otadata, data, ota, 0xd000, 0x2000,
phy_init, data, phy, 0xf000, 0x1000,
ota_0, app, ota_0, 0x700000, 2M,
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs$ grep zj sdkconfig
CONFIG_ESP_WIFI_SSID="zj-phone"
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs$ idf.py build
Executing action: all (aliases: build)
Running ninja in directory /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/build
Executing "ninja all"...
[0/1] Re-running CMake...
-- Building ESP-IDF components for target esp32s3
Processing 17 dependencies:
[17/17] lvgl/lvgl (8.4.0)
-- use sdkconfig
-- PLATFORM ESP32_S3_BOX_3.
-- Project sdkconfig file /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/sdkconfig
Loading defaults file /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/sdkconfig.defaults...
-- Compiler supported targets: xtensa-esp-elf
-- App "factory_nvs" version: v0.5.0-63-ga9bbead
-- Adding linker script /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/build/esp-idf/esp_system/ld/memory.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_system/ld/esp32s3/sections.ld.in
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.api.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.libgcc.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.newlib.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.version.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/soc/esp32s3/ld/esp32s3.peripherals.ld
Project name: factory_nvs
Project version: v0.5.0-63-ga9bbead
-- ESP_TINYUF2: 0.2.1
-- BUTTON: 3.2.0
-- ESP_LCD_ILI9341: 1.2.0
-- use sdkconfig
-- PLATFORM ESP32_S3_BOX_3.
-- Components: app_trace app_update bootloader bootloader_support bsp bt cmock console cxx driver efuse esp-tls esp_adc esp_app_format esp_bootloader_format esp_coex esp_common esp_eth esp_event esp_gdbstub esp_hid esp_http_client esp_http_server esp_https_ota esp_https_server esp_hw_support esp_lcd esp_local_ctrl esp_mm esp_netif esp_netif_stack esp_partition esp_phy esp_pm esp_psram esp_ringbuf esp_rom esp_system esp_timer esp_wifi espcoredump espressif__button espressif__cmake_utilities espressif__esp-box espressif__esp-box-3 espressif__esp-box-lite espressif__esp_codec_dev espressif__esp_lcd_ili9341 espressif__esp_lcd_touch espressif__esp_lcd_touch_gt911 espressif__esp_lcd_touch_tt21100 espressif__esp_lvgl_port espressif__esp_tinyuf2 espressif__icm42670 esptool_py fatfs freertos hal heap http_parser idf_test ieee802154 json leeebo__esp-inih leeebo__tinyusb_src log lvgl__lvgl lwip main mbedtls mqtt newlib nvs_flash nvs_sec_provider openthread partition_table perfmon protobuf-c protocomm pthread sdmmc soc spi_flash spiffs tcp_transport touch_element ulp unity usb vfs wear_levelling wifi_provisioning wpa_supplicant xtensa
-- Component paths: /Users/alizj/esp/esp-idf/v5.2.1/components/app_trace /Users/alizj/esp/esp-idf/v5.2.1/components/app_update /Users/alizj/esp/esp-idf/v5.2.1/components/bootloader /Users/alizj/esp/esp-idf/v5.2.1/components/bootloader_support /Users/alizj/code/esp32/esp-box/components/bsp /Users/alizj/esp/esp-idf/v5.2.1/components/bt /Users/alizj/esp/esp-idf/v5.2.1/components/cmock /Users/alizj/esp/esp-idf/v5.2.1/components/console /Users/alizj/esp/esp-idf/v5.2.1/components/cxx /Users/alizj/esp/esp-idf/v5.2.1/components/driver /Users/alizj/esp/esp-idf/v5.2.1/components/efuse /Users/alizj/esp/esp-idf/v5.2.1/components/esp-tls /Users/alizj/esp/esp-idf/v5.2.1/components/esp_adc /Users/alizj/esp/esp-idf/v5.2.1/components/esp_app_format /Users/alizj/esp/esp-idf/v5.2.1/components/esp_bootloader_format /Users/alizj/esp/esp-idf/v5.2.1/components/esp_coex /Users/alizj/esp/esp-idf/v5.2.1/components/esp_common /Users/alizj/esp/esp-idf/v5.2.1/components/esp_eth /Users/alizj/esp/esp-idf/v5.2.1/components/esp_event /Users/alizj/esp/esp-idf/v5.2.1/components/esp_gdbstub /Users/alizj/esp/esp-idf/v5.2.1/components/esp_hid /Users/alizj/esp/esp-idf/v5.2.1/components/esp_http_client /Users/alizj/esp/esp-idf/v5.2.1/components/esp_http_server /Users/alizj/esp/esp-idf/v5.2.1/components/esp_https_ota /Users/alizj/esp/esp-idf/v5.2.1/components/esp_https_server /Users/alizj/esp/esp-idf/v5.2.1/components/esp_hw_support /Users/alizj/esp/esp-idf/v5.2.1/components/esp_lcd /Users/alizj/esp/esp-idf/v5.2.1/components/esp_local_ctrl /Users/alizj/esp/esp-idf/v5.2.1/components/esp_mm /Users/alizj/esp/esp-idf/v5.2.1/components/esp_netif /Users/alizj/esp/esp-idf/v5.2.1/components/esp_netif_stack /Users/alizj/esp/esp-idf/v5.2.1/components/esp_partition /Users/alizj/esp/esp-idf/v5.2.1/components/esp_phy /Users/alizj/esp/esp-idf/v5.2.1/components/esp_pm /Users/alizj/esp/esp-idf/v5.2.1/components/esp_psram /Users/alizj/esp/esp-idf/v5.2.1/components/esp_ringbuf 
/Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom /Users/alizj/esp/esp-idf/v5.2.1/components/esp_system /Users/alizj/esp/esp-idf/v5.2.1/components/esp_timer /Users/alizj/esp/esp-idf/v5.2.1/components/esp_wifi /Users/alizj/esp/esp-idf/v5.2.1/components/espcoredump /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/managed_components/espressif__button /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/managed_components/espressif__cmake_utilities /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/managed_components/espressif__esp-box /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/managed_components/espressif__esp-box-3 /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/managed_components/espressif__esp-box-lite /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/managed_components/espressif__esp_codec_dev /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/managed_components/espressif__esp_lcd_ili9341 /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/managed_components/espressif__esp_lcd_touch /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/managed_components/espressif__esp_lcd_touch_gt911 /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/managed_components/espressif__esp_lcd_touch_tt21100 /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/managed_components/espressif__esp_lvgl_port /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/managed_components/espressif__esp_tinyuf2 /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/managed_components/espressif__icm42670 /Users/alizj/esp/esp-idf/v5.2.1/components/esptool_py /Users/alizj/esp/esp-idf/v5.2.1/components/fatfs /Users/alizj/esp/esp-idf/v5.2.1/components/freertos /Users/alizj/esp/esp-idf/v5.2.1/components/hal /Users/alizj/esp/esp-idf/v5.2.1/components/heap /Users/alizj/esp/esp-idf/v5.2.1/components/http_parser 
/Users/alizj/esp/esp-idf/v5.2.1/components/idf_test /Users/alizj/esp/esp-idf/v5.2.1/components/ieee802154 /Users/alizj/esp/esp-idf/v5.2.1/components/json /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/managed_components/leeebo__esp-inih /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/managed_components/leeebo__tinyusb_src /Users/alizj/esp/esp-idf/v5.2.1/components/log /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/managed_components/lvgl__lvgl /Users/alizj/esp/esp-idf/v5.2.1/components/lwip /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/main /Users/alizj/esp/esp-idf/v5.2.1/components/mbedtls /Users/alizj/esp/esp-idf/v5.2.1/components/mqtt /Users/alizj/esp/esp-idf/v5.2.1/components/newlib /Users/alizj/esp/esp-idf/v5.2.1/components/nvs_flash /Users/alizj/esp/esp-idf/v5.2.1/components/nvs_sec_provider /Users/alizj/esp/esp-idf/v5.2.1/components/openthread /Users/alizj/esp/esp-idf/v5.2.1/components/partition_table /Users/alizj/esp/esp-idf/v5.2.1/components/perfmon /Users/alizj/esp/esp-idf/v5.2.1/components/protobuf-c /Users/alizj/esp/esp-idf/v5.2.1/components/protocomm /Users/alizj/esp/esp-idf/v5.2.1/components/pthread /Users/alizj/esp/esp-idf/v5.2.1/components/sdmmc /Users/alizj/esp/esp-idf/v5.2.1/components/soc /Users/alizj/esp/esp-idf/v5.2.1/components/spi_flash /Users/alizj/esp/esp-idf/v5.2.1/components/spiffs /Users/alizj/esp/esp-idf/v5.2.1/components/tcp_transport /Users/alizj/esp/esp-idf/v5.2.1/components/touch_element /Users/alizj/esp/esp-idf/v5.2.1/components/ulp /Users/alizj/esp/esp-idf/v5.2.1/components/unity /Users/alizj/esp/esp-idf/v5.2.1/components/usb /Users/alizj/esp/esp-idf/v5.2.1/components/vfs /Users/alizj/esp/esp-idf/v5.2.1/components/wear_levelling /Users/alizj/esp/esp-idf/v5.2.1/components/wifi_provisioning /Users/alizj/esp/esp-idf/v5.2.1/components/wpa_supplicant /Users/alizj/esp/esp-idf/v5.2.1/components/xtensa
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/build
[4/1419] Generating ../../partition_table/partition-table.bin
Partition table binary generated. Contents:
,*******************************************************************************
# ESP-IDF Partition Table
# Name, Type, SubType, Offset, Size, Flags
nvs,data,nvs,0x9000,16K,
otadata,data,ota,0xd000,8K,
phy_init,data,phy,0xf000,4K,
ota_0,app,ota_0,0x700000,2M,
,*******************************************************************************
[667/1419] Performing configure step for 'bootloader'
-- Found Git: /opt/homebrew/bin/git (found version "2.44.0")
-- The C compiler identification is GNU 13.2.0
-- The CXX compiler identification is GNU 13.2.0
-- The ASM compiler identification is GNU
-- Found assembler: /Users/alizj/.espressif/tools/xtensa-esp-elf/esp-13.2.0_20230928/xtensa-esp-elf/bin/xtensa-esp32s3-elf-gcc
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /Users/alizj/.espressif/tools/xtensa-esp-elf/esp-13.2.0_20230928/xtensa-esp-elf/bin/xtensa-esp32s3-elf-gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /Users/alizj/.espressif/tools/xtensa-esp-elf/esp-13.2.0_20230928/xtensa-esp-elf/bin/xtensa-esp32s3-elf-g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Building ESP-IDF components for target esp32s3
-- Project sdkconfig file /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/sdkconfig
-- Compiler supported targets: xtensa-esp-elf
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of time_t
-- Check size of time_t - done
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/soc/esp32s3/ld/esp32s3.peripherals.ld
-- Bootloader project name: "bootloader" version: 1
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.api.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.libgcc.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.newlib.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/bootloader/subproject/main/ld/esp32s3/bootloader.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/bootloader/subproject/main/ld/esp32s3/bootloader.rom.ld
-- Components: bootloader bootloader_support efuse esp_app_format esp_bootloader_format esp_common esp_hw_support esp_rom esp_system esptool_py freertos hal log main micro-ecc newlib partition_table soc spi_flash xtensa
-- Component paths: /Users/alizj/esp/esp-idf/v5.2.1/components/bootloader /Users/alizj/esp/esp-idf/v5.2.1/components/bootloader_support /Users/alizj/esp/esp-idf/v5.2.1/components/efuse /Users/alizj/esp/esp-idf/v5.2.1/components/esp_app_format /Users/alizj/esp/esp-idf/v5.2.1/components/esp_bootloader_format /Users/alizj/esp/esp-idf/v5.2.1/components/esp_common /Users/alizj/esp/esp-idf/v5.2.1/components/esp_hw_support /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom /Users/alizj/esp/esp-idf/v5.2.1/components/esp_system /Users/alizj/esp/esp-idf/v5.2.1/components/esptool_py /Users/alizj/esp/esp-idf/v5.2.1/components/freertos /Users/alizj/esp/esp-idf/v5.2.1/components/hal /Users/alizj/esp/esp-idf/v5.2.1/components/log /Users/alizj/esp/esp-idf/v5.2.1/components/bootloader/subproject/main /Users/alizj/esp/esp-idf/v5.2.1/components/bootloader/subproject/components/micro-ecc /Users/alizj/esp/esp-idf/v5.2.1/components/newlib /Users/alizj/esp/esp-idf/v5.2.1/components/partition_table /Users/alizj/esp/esp-idf/v5.2.1/components/soc /Users/alizj/esp/esp-idf/v5.2.1/components/spi_flash /Users/alizj/esp/esp-idf/v5.2.1/components/xtensa
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/build/bootloader
[111/112] Generating binary image from built executable
esptool.py v4.7.0
Creating esp32s3 image...
Merged 2 ELF sections
Successfully created esp32s3 image.
Generated /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/build/bootloader/bootloader.bin
[112/112] cd /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/build/bootloader/esp-idf/e...der 0x0 /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/build/bootloader/bootloader.bi
Bootloader binary size 0x5800 bytes. 0x2800 bytes (31%) free.
[1418/1419] Generating binary image from built executable
esptool.py v4.7.0
Creating esp32s3 image...
Merged 2 ELF sections
Successfully created esp32s3 image.
Generated /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/build/factory_nvs.bin
[1419/1419] cd /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/build/esp-idf/esptool_py...rtition-table.bin /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/build/factory_nvs.bi
factory_nvs.bin binary size 0x106330 bytes. Smallest app partition is 0x200000 bytes. 0xf9cd0 bytes (49%) free.
Project build complete. To flash, run:
idf.py flash
or
idf.py -p PORT flash
or
python -m esptool --chip esp32s3 -b 460800 --before default_reset --after hard_reset write_flash --flash_mode dio --flash_size 16MB --flash_freq 80m 0x0 build/bootloader/bootloader.bin 0x8000 build/partition_table/partition-table.bin 0xd000 build/ota_data_initial.bin 0x700000 build/factory_nvs.bin
or from the "/Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/build" directory
python -m esptool --chip esp32s3 -b 460800 --before default_reset --after hard_reset write_flash "@flash_args"
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs$ cd ..
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo$ idf.py menuconfig
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo$ idf.py build
Executing action: all (aliases: build)
Running ninja in directory /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build
Executing "ninja all"...
[0/1] Re-running CMake...
-- Building ESP-IDF components for target esp32s3
Processing 20 dependencies:
[20/20] lvgl/lvgl (8.4.0)
-- use sdkconfig
-- PLATFORM ESP32_S3_BOX_3.
-- Project sdkconfig file /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/sdkconfig
Loading defaults file /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/sdkconfig.defaults...
-- Compiler supported targets: xtensa-esp-elf
-- App "chatgpt_demo" version: v0.5.0-63-ga9bbead
-- Adding linker script /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build/esp-idf/esp_system/ld/memory.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_system/ld/esp32s3/sections.ld.in
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.api.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.libgcc.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.newlib.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.version.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/soc/esp32s3/ld/esp32s3.peripherals.ld
-- OPENAI: 0.3.1
-- BUTTON: 3.2.0
-- ESP_LCD_ILI9341: 1.2.0
-- use sdkconfig
-- PLATFORM ESP32_S3_BOX_3.
-- Components: app_trace app_update bootloader bootloader_support bsp bt chmorgan__esp-audio-player chmorgan__esp-file-iterator chmorgan__esp-libhelix-mp3 cmock console cxx driver efuse esp-tls esp_adc esp_app_format esp_bootloader_format esp_coex esp_common esp_eth esp_event esp_gdbstub esp_hid esp_http_client esp_http_server esp_https_ota esp_https_server esp_hw_support esp_lcd esp_local_ctrl esp_mm esp_netif esp_netif_stack esp_partition esp_phy esp_pm esp_psram esp_ringbuf esp_rom esp_system esp_timer esp_wifi espcoredump espressif__button espressif__cmake_utilities espressif__esp-box espressif__esp-box-3 espressif__esp-box-lite espressif__esp-dsp espressif__esp-sr espressif__esp_codec_dev espressif__esp_lcd_ili9341 espressif__esp_lcd_touch espressif__esp_lcd_touch_gt911 espressif__esp_lcd_touch_tt21100 espressif__esp_lvgl_port espressif__icm42670 espressif__openai esptool_py fatfs freertos hal heap http_parser idf_test ieee802154 json log lvgl__lvgl lwip main mbedtls mqtt newlib nvs_flash nvs_sec_provider openthread partition_table perfmon protobuf-c protocomm pthread sdmmc soc spi_flash spiffs tcp_transport touch_element ulp unity usb vfs wear_levelling wifi_provisioning wpa_supplicant xtensa
-- Component paths: /Users/alizj/esp/esp-idf/v5.2.1/components/app_trace /Users/alizj/esp/esp-idf/v5.2.1/components/app_update /Users/alizj/esp/esp-idf/v5.2.1/components/bootloader /Users/alizj/esp/esp-idf/v5.2.1/components/bootloader_support /Users/alizj/code/esp32/esp-box/components/bsp /Users/alizj/esp/esp-idf/v5.2.1/components/bt /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/managed_components/chmorgan__esp-audio-player /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/managed_components/chmorgan__esp-file-iterator /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/managed_components/chmorgan__esp-libhelix-mp3 /Users/alizj/esp/esp-idf/v5.2.1/components/cmock /Users/alizj/esp/esp-idf/v5.2.1/components/console /Users/alizj/esp/esp-idf/v5.2.1/components/cxx /Users/alizj/esp/esp-idf/v5.2.1/components/driver /Users/alizj/esp/esp-idf/v5.2.1/components/efuse /Users/alizj/esp/esp-idf/v5.2.1/components/esp-tls /Users/alizj/esp/esp-idf/v5.2.1/components/esp_adc /Users/alizj/esp/esp-idf/v5.2.1/components/esp_app_format /Users/alizj/esp/esp-idf/v5.2.1/components/esp_bootloader_format /Users/alizj/esp/esp-idf/v5.2.1/components/esp_coex /Users/alizj/esp/esp-idf/v5.2.1/components/esp_common /Users/alizj/esp/esp-idf/v5.2.1/components/esp_eth /Users/alizj/esp/esp-idf/v5.2.1/components/esp_event /Users/alizj/esp/esp-idf/v5.2.1/components/esp_gdbstub /Users/alizj/esp/esp-idf/v5.2.1/components/esp_hid /Users/alizj/esp/esp-idf/v5.2.1/components/esp_http_client /Users/alizj/esp/esp-idf/v5.2.1/components/esp_http_server /Users/alizj/esp/esp-idf/v5.2.1/components/esp_https_ota /Users/alizj/esp/esp-idf/v5.2.1/components/esp_https_server /Users/alizj/esp/esp-idf/v5.2.1/components/esp_hw_support /Users/alizj/esp/esp-idf/v5.2.1/components/esp_lcd /Users/alizj/esp/esp-idf/v5.2.1/components/esp_local_ctrl /Users/alizj/esp/esp-idf/v5.2.1/components/esp_mm /Users/alizj/esp/esp-idf/v5.2.1/components/esp_netif /Users/alizj/esp/esp-idf/v5.2.1/components/esp_netif_stack 
/Users/alizj/esp/esp-idf/v5.2.1/components/esp_partition /Users/alizj/esp/esp-idf/v5.2.1/components/esp_phy /Users/alizj/esp/esp-idf/v5.2.1/components/esp_pm /Users/alizj/esp/esp-idf/v5.2.1/components/esp_psram /Users/alizj/esp/esp-idf/v5.2.1/components/esp_ringbuf /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom /Users/alizj/esp/esp-idf/v5.2.1/components/esp_system /Users/alizj/esp/esp-idf/v5.2.1/components/esp_timer /Users/alizj/esp/esp-idf/v5.2.1/components/esp_wifi /Users/alizj/esp/esp-idf/v5.2.1/components/espcoredump /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/managed_components/espressif__button /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/managed_components/espressif__cmake_utilities /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/managed_components/espressif__esp-box /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/managed_components/espressif__esp-box-3 /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/managed_components/espressif__esp-box-lite /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/managed_components/espressif__esp-dsp /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/managed_components/espressif__esp-sr /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/managed_components/espressif__esp_codec_dev /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/managed_components/espressif__esp_lcd_ili9341 /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/managed_components/espressif__esp_lcd_touch /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/managed_components/espressif__esp_lcd_touch_gt911 /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/managed_components/espressif__esp_lcd_touch_tt21100 /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/managed_components/espressif__esp_lvgl_port /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/managed_components/espressif__icm42670 /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/managed_components/espressif__openai 
/Users/alizj/esp/esp-idf/v5.2.1/components/esptool_py /Users/alizj/esp/esp-idf/v5.2.1/components/fatfs /Users/alizj/esp/esp-idf/v5.2.1/components/freertos /Users/alizj/esp/esp-idf/v5.2.1/components/hal /Users/alizj/esp/esp-idf/v5.2.1/components/heap /Users/alizj/esp/esp-idf/v5.2.1/components/http_parser /Users/alizj/esp/esp-idf/v5.2.1/components/idf_test /Users/alizj/esp/esp-idf/v5.2.1/components/ieee802154 /Users/alizj/esp/esp-idf/v5.2.1/components/json /Users/alizj/esp/esp-idf/v5.2.1/components/log /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/managed_components/lvgl__lvgl /Users/alizj/esp/esp-idf/v5.2.1/components/lwip /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/main /Users/alizj/esp/esp-idf/v5.2.1/components/mbedtls /Users/alizj/esp/esp-idf/v5.2.1/components/mqtt /Users/alizj/esp/esp-idf/v5.2.1/components/newlib /Users/alizj/esp/esp-idf/v5.2.1/components/nvs_flash /Users/alizj/esp/esp-idf/v5.2.1/components/nvs_sec_provider /Users/alizj/esp/esp-idf/v5.2.1/components/openthread /Users/alizj/esp/esp-idf/v5.2.1/components/partition_table /Users/alizj/esp/esp-idf/v5.2.1/components/perfmon /Users/alizj/esp/esp-idf/v5.2.1/components/protobuf-c /Users/alizj/esp/esp-idf/v5.2.1/components/protocomm /Users/alizj/esp/esp-idf/v5.2.1/components/pthread /Users/alizj/esp/esp-idf/v5.2.1/components/sdmmc /Users/alizj/esp/esp-idf/v5.2.1/components/soc /Users/alizj/esp/esp-idf/v5.2.1/components/spi_flash /Users/alizj/esp/esp-idf/v5.2.1/components/spiffs /Users/alizj/esp/esp-idf/v5.2.1/components/tcp_transport /Users/alizj/esp/esp-idf/v5.2.1/components/touch_element /Users/alizj/esp/esp-idf/v5.2.1/components/ulp /Users/alizj/esp/esp-idf/v5.2.1/components/unity /Users/alizj/esp/esp-idf/v5.2.1/components/usb /Users/alizj/esp/esp-idf/v5.2.1/components/vfs /Users/alizj/esp/esp-idf/v5.2.1/components/wear_levelling /Users/alizj/esp/esp-idf/v5.2.1/components/wifi_provisioning /Users/alizj/esp/esp-idf/v5.2.1/components/wpa_supplicant 
/Users/alizj/esp/esp-idf/v5.2.1/components/xtensa
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build
[3/1583] Generate factory_nvs...
File copied from /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/factory_nvs/build/factory_nvs.bin to /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build/uf2/factory_nvs.bin
[4/1583] Move and Pack models...
Recommended model partition size: 2411K
[7/1583] Generating ../../partition_table/partition-table.bin
Partition table binary generated. Contents:
,*******************************************************************************
# ESP-IDF Partition Table
# Name, Type, SubType, Offset, Size, Flags
nvs,data,nvs,0x9000,16K,
otadata,data,ota,0xd000,8K,
phy_init,data,phy,0xf000,4K,
factory,app,factory,0x10000,6M,
ota_0,app,ota_0,0x700000,2M,
storage,data,spiffs,0x900000,2M,
model,data,spiffs,0xb00000,4000K,
,*******************************************************************************
[680/1583] Performing configure step for 'bootloader'
-- Found Git: /opt/homebrew/bin/git (found version "2.44.0")
-- The C compiler identification is GNU 13.2.0
-- The CXX compiler identification is GNU 13.2.0
-- The ASM compiler identification is GNU
-- Found assembler: /Users/alizj/.espressif/tools/xtensa-esp-elf/esp-13.2.0_20230928/xtensa-esp-elf/bin/xtensa-esp32s3-elf-gcc
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /Users/alizj/.espressif/tools/xtensa-esp-elf/esp-13.2.0_20230928/xtensa-esp-elf/bin/xtensa-esp32s3-elf-gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /Users/alizj/.espressif/tools/xtensa-esp-elf/esp-13.2.0_20230928/xtensa-esp-elf/bin/xtensa-esp32s3-elf-g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Building ESP-IDF components for target esp32s3
-- Project sdkconfig file /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/sdkconfig
-- Compiler supported targets: xtensa-esp-elf
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of time_t
-- Check size of time_t - done
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/soc/esp32s3/ld/esp32s3.peripherals.ld
-- Bootloader project name: "bootloader" version: 1
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.api.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.libgcc.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom/esp32s3/ld/esp32s3.rom.newlib.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/bootloader/subproject/main/ld/esp32s3/bootloader.ld
-- Adding linker script /Users/alizj/esp/esp-idf/v5.2.1/components/bootloader/subproject/main/ld/esp32s3/bootloader.rom.ld
-- Components: bootloader bootloader_support efuse esp_app_format esp_bootloader_format esp_common esp_hw_support esp_rom esp_system esptool_py freertos hal log main micro-ecc newlib partition_table soc spi_flash xtensa
-- Component paths: /Users/alizj/esp/esp-idf/v5.2.1/components/bootloader /Users/alizj/esp/esp-idf/v5.2.1/components/bootloader_support /Users/alizj/esp/esp-idf/v5.2.1/components/efuse /Users/alizj/esp/esp-idf/v5.2.1/components/esp_app_format /Users/alizj/esp/esp-idf/v5.2.1/components/esp_bootloader_format /Users/alizj/esp/esp-idf/v5.2.1/components/esp_common /Users/alizj/esp/esp-idf/v5.2.1/components/esp_hw_support /Users/alizj/esp/esp-idf/v5.2.1/components/esp_rom /Users/alizj/esp/esp-idf/v5.2.1/components/esp_system /Users/alizj/esp/esp-idf/v5.2.1/components/esptool_py /Users/alizj/esp/esp-idf/v5.2.1/components/freertos /Users/alizj/esp/esp-idf/v5.2.1/components/hal /Users/alizj/esp/esp-idf/v5.2.1/components/log /Users/alizj/esp/esp-idf/v5.2.1/components/bootloader/subproject/main /Users/alizj/esp/esp-idf/v5.2.1/components/bootloader/subproject/components/micro-ecc /Users/alizj/esp/esp-idf/v5.2.1/components/newlib /Users/alizj/esp/esp-idf/v5.2.1/components/partition_table /Users/alizj/esp/esp-idf/v5.2.1/components/soc /Users/alizj/esp/esp-idf/v5.2.1/components/spi_flash /Users/alizj/esp/esp-idf/v5.2.1/components/xtensa
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build/bootloader
[111/112] Generating binary image from built executable
esptool.py v4.7.0
Creating esp32s3 image...
Merged 2 ELF sections
Successfully created esp32s3 image.
Generated /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build/bootloader/bootloader.bin
[112/112] cd /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build/bootloader/esp-idf/esptool_py &&...8000 bootloader 0x0 /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build/bootloader/bootloader.bi
Bootloader binary size 0x5800 bytes. 0x2800 bytes (31%) free.
[1582/1583] Generating binary image from built executable
esptool.py v4.7.0
Creating esp32s3 image...
Merged 2 ELF sections
Successfully created esp32s3 image.
Generated /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build/chatgpt_demo.bin
[1583/1583] cd /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build/esp-idf/esptool_py && /Users/a...on_table/partition-table.bin /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build/chatgpt_demo.bi
Warning: 1/2 app partitions are too small for binary chatgpt_demo.bin size 0x3761e0:
- Part 'ota_0' 0/16 @ 0x700000 size 0x200000 (overflow 0x1761e0)
Project build complete. To flash, run:
idf.py flash
or
idf.py -p PORT flash
or
python -m esptool --chip esp32s3 -b 460800 --before default_reset --after hard_reset write_flash --flash_mode dio --flash_size 16MB --flash_freq 80m 0x0 build/bootloader/bootloader.bin 0x8000 build/partition_table/partition-table.bin 0xd000 build/ota_data_initial.bin 0x10000 build/chatgpt_demo.bin 0x700000 build/uf2/factory_nvs.bin 0x900000 build/storage.bin 0xb00000 build/srmodels/srmodels.bin
or from the "/Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build" directory
python -m esptool --chip esp32s3 -b 460800 --before default_reset --after hard_reset write_flash "@flash_args"
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo$
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo$ ls -l /dev/*modem*
crw-rw-rw- 1 root 9, 9 5 15 11:41 /dev/cu.usbmodem101
crw-rw-rw- 1 root 9, 8 5 15 11:41 /dev/tty.usbmodem101
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo$ idf.py -p /dev/tty.usbmodem101 flash
Executing action: flash
Running ninja in directory /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build
Executing "ninja flash"...
[1/6] cd /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build/esp-idf/esptool_py && /Users/alizj/....on_table/partition-table.bin /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build/chatgpt_demo.bi
Warning: 1/2 app partitions are too small for binary chatgpt_demo.bin size 0x3761e0:
- Part 'ota_0' 0/16 @ 0x700000 size 0x200000 (overflow 0x1761e0)
[1/1] cd /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build/bootloader/esp-idf/esptool_py && /Us...8000 bootloader 0x0 /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build/bootloader/bootloader.bi
Bootloader binary size 0x5800 bytes. 0x2800 bytes (31%) free.
[3/4] cd /Users/alizj/esp/esp-idf/v5.2.1/components/esptool_py && /Users/alizj/.espressif/tools/cmake/3.2...xamples/chatgpt_demo/build -P /Users/alizj/esp/esp-idf/v5.2.1/components/esptool_py/run_serial_tool.cmak
esptool.py --chip esp32s3 -p /dev/tty.usbmodem101 -b 460800 --before=default_reset --after=hard_reset write_flash --flash_mode dio --flash_freq 80m --flash_size 16MB 0x0 bootloader/bootloader.bin 0x10000 chatgpt_demo.bin 0x8000 partition_table/partition-table.bin 0xd000 ota_data_initial.bin 0xb00000 srmodels/srmodels.bin 0x700000 uf2/factory_nvs.bin 0x900000 storage.bin
esptool.py v4.7.0
Serial port /dev/tty.usbmodem101
Connecting...
Chip is ESP32-S3 (QFN56) (revision v0.2)
Features: WiFi, BLE, Unknown Embedded PSRAM (AP_1v8)
Crystal is 40MHz
MAC: 3c:84:27:04:fe:18
Uploading stub...
Running stub...
Stub running...
Changing baud rate to 460800
Changed.
Configuring flash size...
Flash will be erased from 0x00000000 to 0x00005fff...
Flash will be erased from 0x00010000 to 0x00386fff...
Flash will be erased from 0x00008000 to 0x00008fff...
Flash will be erased from 0x0000d000 to 0x0000efff...
Flash will be erased from 0x00b00000 to 0x00d5afff...
Flash will be erased from 0x00700000 to 0x00806fff...
Flash will be erased from 0x00900000 to 0x00afffff...
Compressed 22528 bytes to 13935...
Writing at 0x00000000... (100 %)
Wrote 22528 bytes (13935 compressed) at 0x00000000 in 0.3 seconds (effective 549.4 kbit/s)...
Hash of data verified.
Compressed 3629536 bytes to 1917750...
Writing at 0x00385dba... (100 %)
Wrote 3629536 bytes (1917750 compressed) at 0x00010000 in 21.6 seconds (effective 1342.9 kbit/s)...
Hash of data verified.
Compressed 3072 bytes to 159...
Writing at 0x00008000... (100 %)
Wrote 3072 bytes (159 compressed) at 0x00008000 in 0.1 seconds (effective 416.7 kbit/s)...
Hash of data verified.
Compressed 8192 bytes to 31...
Writing at 0x0000d000... (100 %)
Wrote 8192 bytes (31 compressed) at 0x0000d000 in 0.1 seconds (effective 841.5 kbit/s)...
Hash of data verified.
Compressed 2468364 bytes to 1726158...
Writing at 0x00d551af... (100 %)
Wrote 2468364 bytes (1726158 compressed) at 0x00b00000 in 16.1 seconds (effective 1225.0 kbit/s)...
Hash of data verified.
Compressed 1073968 bytes to 381268...
Writing at 0x0080452c... (100 %)
Wrote 1073968 bytes (381268 compressed) at 0x00700000 in 6.3 seconds (effective 1361.4 kbit/s)...
Hash of data verified.
Compressed 2097152 bytes to 617837...
Writing at 0x00a10f2b... (100 %)
Wrote 2097152 bytes (617837 compressed) at 0x00900000 in 9.5 seconds (effective 1761.4 kbit/s)...
Hash of data verified.
Leaving...
Hard resetting via RTS pin...
Done
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo$ ls /
Applications/ Library/ System/ Users/ Volumes/ bin/ cores/ dev/ etc@ home@ opt/ private/ sbin/ tmp@ usr/ var@
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo$ ls /Volumes/
ESP-BOX/ MacintoshHD@
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo$ cat /Volumes/ESP-BOX/CONFIG.INI
[configuration]
ssid = <your-wifi-ssid>
password = <your-wifi-password>
ChatGPT_key = <your-openai-api-key>
Base_url = https://api.openai.com/v1/
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo$
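The build and flash logs above both warn that `chatgpt_demo.bin` (size 0x3761e0) does not fit into `ota_0`. The arithmetic behind that warning can be reproduced with a small sketch. This is only an illustration, not how ESP-IDF itself performs the check: the helper names `parse_size`, `parse_table` and `app_overflows` are made up for this note, and the table rows are copied from the partition table printed during the build.

```python
# Partition table rows as printed by the build above
# (Name, Type, SubType, Offset, Size, Flags).
PARTITION_CSV = """\
nvs,data,nvs,0x9000,16K,
otadata,data,ota,0xd000,8K,
phy_init,data,phy,0xf000,4K,
factory,app,factory,0x10000,6M,
ota_0,app,ota_0,0x700000,2M,
storage,data,spiffs,0x900000,2M,
model,data,spiffs,0xb00000,4000K,
"""


def parse_size(text):
    """Convert '16K', '6M' or '0x9000' into a number of bytes."""
    text = text.strip()
    if text[-1] in "Kk":
        return int(text[:-1], 0) * 1024
    if text[-1] in "Mm":
        return int(text[:-1], 0) * 1024 * 1024
    return int(text, 0)


def parse_table(csv_text):
    """Parse the CSV rows into dicts, skipping blank lines and comments."""
    rows = []
    for line in csv_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, ptype, subtype, offset, size = [f.strip() for f in line.split(",")[:5]]
        rows.append({
            "name": name,
            "type": ptype,
            "subtype": subtype,
            "offset": parse_size(offset),
            "size": parse_size(size),
        })
    return rows


def app_overflows(rows, binary_size):
    """Return {partition name: overflow in bytes} for app partitions that are too small."""
    return {
        r["name"]: binary_size - r["size"]
        for r in rows
        if r["type"] == "app" and binary_size > r["size"]
    }


rows = parse_table(PARTITION_CSV)
overflows = app_overflows(rows, 0x3761E0)  # chatgpt_demo.bin size from the warning
for name, extra in overflows.items():
    print(f"{name}: overflow {extra:#x}")
```

Only `ota_0` (2M) is reported, with the same overflow value as in the log. The 6M `factory` partition is large enough, and the flash command above writes `chatgpt_demo.bin` to 0x10000 (`factory`) and `factory_nvs.bin` to 0x700000 (`ota_0`), which is presumably why flashing still succeeds despite the warning.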
After changing the configuration, clean and rebuild the project, and erase the flash before flashing again:
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo$ idf.py -p /dev/cu.usbmodem2101 erase-flash erase-otadata
Executing action: erase-flash
Running esptool.py in directory /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build
Executing "/Users/alizj/.espressif/python_env/idf5.2_py3.12_env/bin/python /Users/alizj/esp/esp-idf/v5.2.1/components/esptool_py/esptool/esptool.py -p /dev/cu.usbmodem2101 -b 460800 --before default_reset --after hard_reset --chip esp32s3 erase_flash"...
esptool.py v4.7.0
Serial port /dev/cu.usbmodem2101
Connecting....
Chip is ESP32-S3 (QFN56) (revision v0.2)
Features: WiFi, BLE, Unknown Embedded PSRAM (AP_1v8)
Crystal is 40MHz
MAC: 3c:84:27:04:fe:18
Uploading stub...
Running stub...
Stub running...
Changing baud rate to 460800
Changed.
Erasing flash (this may take a while)...
Chip erase completed successfully in 7.8s
Hard resetting via RTS pin...
Executing action: erase-otadata
Running ninja in directory /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build
Executing "ninja erase-otadata"...
[0/1] cd /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build && /Users/alizj/.espressif/tools/cmake/3.24.0/CMake.app/Contents/bin/cmake -D IDF_PATH=/Users/alizj/esp/esp-idf/v5.2.1 -D "SERIAL_TOOL=/Users/alizj/.espressif/python_env/idf5.2_py3.12_env/bin/python;/Users/alizj/esp/esp-idf/v5.2.1/components/app_update/otatool.py" -D "SERIAL_TOOL_ARGS=--esptool-args;before=default_reset;after=hard_reset;--partition-table-file;/Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/partitions.csv;--partition-table-offset;0x8000;erase_otadata" -D WORKING_DIRECTORY=/Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build -P /Users/alizj/esp/esp-idf/v5.2.1/components/esptool_py/run_serial_tool.cmake
esptool.py v4.7.0
Serial port /dev/cu.usbmodem2101
Connecting....
Detecting chip type... ESP32-S3
Chip is ESP32-S3 (QFN56) (revision v0.2)
Features: WiFi, BLE, Unknown Embedded PSRAM (AP_1v8)
Crystal is 40MHz
MAC: 3c:84:27:04:fe:18
Uploading stub...
Running stub...
Stub running...
Changing baud rate to 460800
Changed.
8192 (100 %)
8192 (100 %)
Read 8192 bytes at 0x0000d000 in 0.9 seconds (70.6 kbit/s)...
Hard resetting via RTS pin...
esptool.py v4.7.0
Serial port /dev/cu.usbmodem2101
Connecting....
Detecting chip type... ESP32-S3
Chip is ESP32-S3 (QFN56) (revision v0.2)
Features: WiFi, BLE, Unknown Embedded PSRAM (AP_1v8)
Crystal is 40MHz
MAC: 3c:84:27:04:fe:18
Uploading stub...
Running stub...
Stub running...
Changing baud rate to 460800
Changed.
Erasing region (may be slow depending on size)...
Erase completed successfully in 0.0 seconds.
Hard resetting via RTS pin...
Running /Users/alizj/.espressif/python_env/idf5.2_py3.12_env/bin/python /Users/alizj/esp/esp-idf/v5.2.1/components/esptool_py/esptool/esptool.py --before default_reset --after hard_reset --port /dev/cu.usbmodem2101 --baud 460800 read_flash 53248 8192 /var/folders/rj/6gjh6pb50xxbmm1m8dv5w2180000gp/T/tmpctmn9o6j...
Running /Users/alizj/.espressif/python_env/idf5.2_py3.12_env/bin/python /Users/alizj/esp/esp-idf/v5.2.1/components/esptool_py/esptool/esptool.py --before default_reset --after hard_reset --port /dev/cu.usbmodem2101 --baud 460800 erase_region 53248 8192...
Erased ota_data partition contents
Done
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo$
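In the `erase-otadata` log above, `read_flash` and `erase_region` are called with the decimal arguments `53248 8192`. These appear to be simply the `otadata` row of the partition table (offset 0xd000, size 8K) written in decimal. A minimal sketch of that conversion, where `region_args` is a hypothetical helper and not part of esptool or otatool:

```python
# otadata row from the partition table printed by the build: offset 0xd000, size 8K.
OTADATA_OFFSET = 0xD000
OTADATA_SIZE = 8 * 1024


def region_args(offset, size):
    """Format an (offset, size) pair as decimal, space-separated, as seen in the log."""
    return f"{offset} {size}"


args = region_args(OTADATA_OFFSET, OTADATA_SIZE)
print(args)  # 53248 8192
```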
Normal output:
zj@a:~/code/esp32/esp-box/examples/chatgpt_demo$ idf.py -p /dev/cu.usbmodem2101 monitor
Executing action: monitor
Running idf_monitor in directory /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo
Executing "/Users/alizj/.espressif/python_env/idf5.2_py3.12_env/bin/python /Users/alizj/esp/esp-idf/v5.2.1/tools/idf_monitor.py -p /dev/cu.usbmodem2101 -b 115200 --toolchain-prefix xtensa-esp32s3-elf- --target esp32s3 --revision 0 /Users/alizj/code/esp32/esp-box/examples/chatgpt_demo/build/chatgpt_demo.elf -m '/Users/alizj/.espressif/python_env/idf5.2_py3.12_env/bin/python' '/Users/alizj/esp/esp-idf/v5.2.1/tools/idf.py' '-p' '/dev/cu.usbmodem2101'"...
--- esp-idf-monitor 1.4.0 on /dev/cu.usbmodem2101 115200 ---
--- Quit: Ctrl+] | Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H ---
ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0x15 (USB_UART_CHIP_RESET),boot:0x2a (SPI_FAST_FLASH_BOOT)
Saved PC:0x4037d1ce
0x4037d1ce: esp_cpu_wait_for_intr at /Users/alizj/esp/esp-idf/v5.2.1/components/esp_hw_support/cpu.c:145
SPIWP:0xee
mode:DIO, clock div:1
load:0x3fce3820,len:0x1918
load:0x403c9700,len:0x4
load:0x403c9704,len:0xe5c
load:0x403cc700,len:0x302c
entry 0x403c993c
I (19) boot: ESP-IDF v5.2.1 2nd stage bootloader
I (19) boot: compile time May 15 2024 12:06:28
I (19) boot: Multicore bootloader
I (22) boot: chip revision: v0.2
I (26) qio_mode: Enabling QIO for flash chip GD
I (31) boot.esp32s3: Boot SPI Speed : 80MHz
I (36) boot.esp32s3: SPI Mode : QIO
I (41) boot.esp32s3: SPI Flash Size : 16MB
I (45) boot: Enabling RNG early entropy source...
I (51) boot: Partition Table:
I (54) boot: ## Label Usage Type ST Offset Length
I (62) boot: 0 nvs WiFi data 01 02 00009000 00004000
I (69) boot: 1 otadata OTA data 01 00 0000d000 00002000
I (77) boot: 2 phy_init RF data 01 01 0000f000 00001000
I (84) boot: 3 factory factory app 00 00 00010000 00600000
I (92) boot: 4 ota_0 OTA app 00 10 00700000 00200000
I (99) boot: 5 storage Unknown data 01 82 00900000 00200000
I (106) boot: 6 model Unknown data 01 82 00b00000 003e8000
I (114) boot: End of partition table
I (118) boot: Defaulting to factory image
I (123) esp_image: segment 0: paddr=00010020 vaddr=3c120020 size=23613ch (2318652) map
I (483) esp_image: segment 1: paddr=00246164 vaddr=3fc9cc00 size=0763ch ( 30268) load
I (489) esp_image: segment 2: paddr=0024d7a8 vaddr=40374000 size=02870h ( 10352) load
I (491) esp_image: segment 3: paddr=00250020 vaddr=42000020 size=11febch (1179324) map
I (676) esp_image: segment 4: paddr=0036fee4 vaddr=40376870 size=162c8h ( 90824) load
I (694) esp_image: segment 5: paddr=003861b4 vaddr=600fe010 size=00004h ( 4) load
I (705) boot: Loaded app from partition at offset 0x10000
I (705) boot: Disabling RNG early entropy source...
I (716) cpu_start: Multicore app
I (717) octal_psram: vendor id : 0x0d (AP)
I (717) octal_psram: dev id : 0x03 (generation 4)
I (720) octal_psram: density : 0x05 (128 Mbit)
I (726) octal_psram: good-die : 0x01 (Pass)
I (731) octal_psram: Latency : 0x01 (Fixed)
I (736) octal_psram: VCC : 0x00 (1.8V)
I (741) octal_psram: SRF : 0x01 (Fast Refresh)
I (747) octal_psram: BurstType : 0x01 (Hybrid Wrap)
I (753) octal_psram: BurstLen : 0x01 (32 Byte)
I (758) octal_psram: Readlatency : 0x02 (10 cycles@Fixed)
I (765) octal_psram: DriveStrength: 0x00 (1/1)
I (770) MSPI Timing: PSRAM timing tuning index: 6
I (775) esp_psram: Found 16MB PSRAM device
I (780) esp_psram: Speed: 80MHz
I (862) mmu_psram: Instructions copied and mapped to SPIRAM
I (1386) esp_psram: SPI SRAM memory test OK
I (1395) cpu_start: Pro cpu start user code
I (1395) cpu_start: cpu freq: 240000000 Hz
I (1395) cpu_start: Application information:
I (1398) cpu_start: Project name: chatgpt_demo
I (1404) cpu_start: App version: v0.5.0-63-ga9bbead
I (1410) cpu_start: Compile time: May 15 2024 14:06:49
I (1416) cpu_start: ELF file SHA256: f768634fd...
I (1421) cpu_start: ESP-IDF: v5.2.1
I (1426) cpu_start: Min chip rev: v0.0
I (1431) cpu_start: Max chip rev: v0.99
I (1436) cpu_start: Chip rev: v0.2
I (1441) heap_init: Initializing. RAM available for dynamic allocation:
I (1448) heap_init: At 3FCA99F0 len 0003FD20 (255 KiB): RAM
I (1454) heap_init: At 3FCE9710 len 00005724 (21 KiB): RAM
I (1461) heap_init: At 600FE014 len 00001FD4 (7 KiB): RTCRAM
I (1467) esp_psram: Adding pool of 15232K of PSRAM memory to heap allocator
I (1475) spi_flash: detected chip: gd
I (1479) spi_flash: flash io: qio
W (1483) i2c: This driver is an old driver, please migrate your application code to adapt `driver/i2c_master.h`
I (1494) sleep: Configure to isolate all GPIO pins in sleep state
I (1501) sleep: Enable automatic switching of GPIO sleep configuration
I (1508) main_task: Started on CPU0
I (1513) esp_psram: Reserving pool of 8K of internal memory for DMA/internal allocations
I (1521) main_task: Calling app_main()
I (1543) settings: stored ssid:zj-phone
I (1543) settings: stored password:ho4o45678
I (1543) settings: stored OpenAI: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
I (1550) settings: stored Base URL:https://ai.opsnull.com/v1/
I (1694) ESP-BOX-3: Partition size: total: 1920401, used: 1072774
I (1695) LVGL: Starting LVGL task
I (1695) gpio: GPIO[4]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0
I (1703) gpio: GPIO[48]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0
I (1713) ili9341: LCD panel create success, version: 1.2.0
W (1839) ili9341: The 36h command has been used and will be overwritten by external initialization sequence
W (1839) ili9341: The 3Ah command has been used and will be overwritten by external initialization sequence
I (1851) gpio: GPIO[3]| InputEn: 1| OutputEn: 0| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:2
I (1858) GT911: TouchPad_ID:0x39,0x31,0x31
I (1863) GT911: TouchPad_Config_Version:65
I (1867) button: IoT Button Version: 3.2.0
I (1872) gpio: GPIO[0]| InputEn: 1| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0
I (1882) button: IoT Button Version: 3.2.0
I (1886) gpio: GPIO[1]| InputEn: 1| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0
I (1896) button: IoT Button Version: 3.2.0
I (1903) ES8311: Work in Slave mode
I (1906) gpio: GPIO[46]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0
I (1915) ES7210: Work in Slave mode
I (1921) ES7210: Enable ES7210_INPUT_MIC1
I (1924) ES7210: Enable ES7210_INPUT_MIC2
I (1931) I2S_IF: channel mode 0 bits:16/16 channel:2 mask:3
I (1934) I2S_IF: STD Mode 1 bits:16/16 channel:2 sample_rate:16000 mask:3
I (1948) Adev_Codec: Open codec device OK
I (1948) I2S_IF: channel mode 0 bits:16/16 channel:2 mask:3
I (1952) I2S_IF: STD Mode 0 bits:16/16 channel:2 sample_rate:16000 mask:3
I (1960) ES7210: Bits 16
I (1966) ES7210: Enable ES7210_INPUT_MIC1
I (1969) ES7210: Enable ES7210_INPUT_MIC2
I (1975) ES7210: Unmuted
I (1975) Adev_Codec: Open codec device OK
W (1980) bsp_sensor: This example don't support Sensor!!
I (1986) app_main: Display LVGL demo
I (1990) ESP-BOX-3: Setting LCD backlight: 100%
I (2024) app_main: speech recognition start
I (2025) MODEL_LOADER: The storage free size is 14080 KB
I (2025) MODEL_LOADER: The partition size is 4000 KB
I (2026) pp: pp rom version: e7ae62f
I (2034) net80211: net80211 rom version: e7ae62f
I (2031) MODEL_LOADER: Successfully map model partition
I (2045) AFE_SR: afe interface for speech recognition
I (2051) AFE_SR: AFE version: SR_V220727
I (2056) AFE_SR: Initial auido front-end, total channel: 3, mic num: 2, ref num: 1
I (2064) AFE_SR: aec_init: 0, se_init: 1, vad_init: 1
I (2070) AFE_SR: wakenet_init: 1
I (2041) wifi:wifi driver task: 3fcc1ff4, prio:23, stack:6656, core=0
I (2090) wifi:wifi firmware version: a9f5b59
I (2090) wifi:wifi certification version: v7.0
I (2091) wifi:config NVS flash: enabled
I (2092) wifi:config nano formating: disabled
I (2096) wifi:Init data frame dynamic rx buffer num: 32
I (2101) wifi:Init static rx mgmt buffer num: 5
I (2105) wifi:Init management short buffer num: 32
I (2110) wifi:Init static tx buffer num: 16
I (2114) wifi:Init tx cache buffer num: 32
I (2118) wifi:Init static tx FG buffer num: 2
I (2122) wifi:Init static rx buffer size: 1600
I (2126) wifi:Init static rx buffer num: 10
I (2130) wifi:Init dynamic rx buffer num: 32
I (2135) wifi_init: rx ba win: 6
I (2139) wifi_init: tcpip mbox: 32
I (2143) wifi_init: udp mbox: 6
I (2146) wifi_init: tcp mbox: 6
I (2149) wifi_init: tcp tx win: 5760
I (2155) wifi_init: tcp rx win: 5760
I (2159) wifi_init: tcp mss: 1440
I (2163) wifi_init: WiFi/LWIP prefer SPIRAM
I (2167) wifi_init: WiFi IRAM OP enabled
I (2172) wifi_init: WiFi RX IRAM OP enabled
I (2178) phy_init: phy_version 640,cd64a1a,Jan 24 2024,17:28:12
I (2219) wifi:mode : sta (3c:84:27:04:fe:18)
I (2219) wifi:enable tsf
I (2220) wifi station: start connect to the AP
I (2220) wifi station: wifi_init_sta finished.zj-phone, ho4o45678
I (2226) wifi station: NET_EVENT_POWERON_SCAN
MC Quantized wakenet9: wakeNet9_v1h24_hiesp_3_0.63_0.635, tigger:v3, mode:2, p:0, (Jul 7 2023 11:10:53)
I (2321) AFE_SR: wake num: 3, mode: 0, (Jul 7 2023 11:10:53)
I (2321) app_sr: Set language EN
I (2321) app_sr: load wakenet:wn9_hiesp
MC Quantized wakenet9: wakeNet9_v1h24_hiesp_3_0.63_0.635, tigger:v3, mode:2, p:0, (Jul 7 2023 11:10:53)
I (2356) AFE_SR: wakenet wn_frame_size = 512
I (2359) app_sr: Feed Task
I (2362) app_sr: audio_chunksize=1024, feed_channel=3
I (2368) app_sr: Detection task
sr handle task, mute:1
successfully created record_audio_buffer with a size: 256000
audio_rx_buffer with a size: 1048576
I (2543) file_iterator: File : echo_cn_end.wav
I (2556) file_iterator: File : Hi.wav
I (2575) file_iterator: File : echo_cn_wake.wav
I (2581) file_iterator: File : waitPlease.mp3
I (2582) file_iterator: File : echo_en_end.wav
I (2595) file_iterator: File : echo_en_wake.wav
I (2598) file_iterator: File : echo_en_ok.wav
I (2606) file_iterator: File : Hi.mp3
I (2614) file_iterator: File : input.wav
I (2616) file_iterator: File : tts_failed.mp3
I (2619) file_iterator: File : echo_cn_ok.wav
W (2672) app_sr: AFE Fetch Fail
W (2972) app_sr: AFE Fetch Fail
W (3272) app_sr: AFE Fetch Fail
W (3572) app_sr: AFE Fetch Fail
W (3872) app_sr: AFE Fetch Fail
W (4172) app_sr: AFE Fetch Fail
W (4472) app_sr: AFE Fetch Fail
I (4652) wifi station: Total APs scanned = 0, ret:0
W (4772) app_sr: AFE Fetch Fail
W (5072) app_sr: AFE Fetch Fail
W (5372) app_sr: AFE Fetch Fail
I (5652) wifi station: failed return
W (5672) app_sr: AFE Fetch Fail
W (5972) app_sr: AFE Fetch Fail
I (6258) wifi:new:<6,0>, old:<1,0>, ap:<255,255>, sta:<6,0>, prof:1
W (6272) app_sr: AFE Fetch Fail
W (6572) app_sr: AFE Fetch Fail
I (6605) wifi:state: init -> auth (b0)
I (6609) wifi:state: auth -> assoc (0)
I (6613) wifi:state: assoc -> run (10)
W (6872) app_sr: AFE Fetch Fail
I (7073) wifi:connected with zj-phone, aid = 1, channel 6, BW20, bssid = 72:09:9e:5a:f2:17
I (7073) wifi:security: WPA2-PSK, phy: bgn, rssi: -41
I (7080) wifi:pm start, type: 1
I (7081) wifi:dp: 1, bi: 102400, li: 3, scale listen interval from 307200 us to 307200 us
I (7085) wifi:set rx beacon pti, rx_bcn_pti: 0, bcn_timeout: 25000, mt_pti: 0, mt_time: 10000
W (7172) app_sr: AFE Fetch Fail
I (7276) wifi:AP's beacon interval = 102400 us, DTIM period = 1
W (7472) app_sr: AFE Fetch Fail
I (7598) wifi:<ba-add>idx:0 (ifx:0, 72:09:9e:5a:f2:17), tid:0, ssn:2, winSize:64
W (7772) app_sr: AFE Fetch Fail
W (8072) app_sr: AFE Fetch Fail
W (8372) app_sr: AFE Fetch Fail
I (8594) esp_netif_handlers: sta ip: 172.20.10.6, mask: 255.255.255.240, gw: 172.20.10.1
I (8594) wifi station: got ip:172.20.10.6
I (33038) ui-events: sr start once
I (33042) app_sr: AFE_FETCH_CHANNEL_VERIFIED, channel index: 2
I (33042) app_audio: ### record Start
I (33110) ui_ctrl: Swich to panel[1]
I (33196) app_audio: frame_rate= 16000, ch=2, width=16
I (33198) I2S_IF: Pending out channel for in channel running
E (33215) i2s_common: i2s_channel_disable(1021): the channel has not been enabled yet
I (33215) I2S_IF: channel mode 0 bits:16/16 channel:2 mask:3
I (33221) I2S_IF: STD Mode 1 bits:16/16 channel:2 sample_rate:16000 mask:3
I (33306) Adev_Codec: Open codec device OK
E (33306) i2s_common: i2s_channel_disable(1021): the channel has not been enabled yet
I (33309) I2S_IF: channel mode 0 bits:16/16 channel:2 mask:3
I (33316) I2S_IF: STD Mode 0 bits:16/16 channel:2 sample_rate:16000 mask:3
I (33324) ES7210: Bits 16
I (33328) ES7210: Enable ES7210_INPUT_MIC1
I (33331) ES7210: Enable ES7210_INPUT_MIC2
I (33338) ES7210: Unmuted
I (33338) Adev_Codec: Open codec device OK
I (38719) app_audio: ESP_MN_STATE_TIMEOUT
I (38719) app_audio: ### record Stop, 97185 94K
I (38876) OpenAI: OpenAI create, version: 0.3.1
I (38877) app_audio: Player PLAYING
I (38877) ui_ctrl: Swich to panel[2]
I (38896) I2S_IF: Pending out channel for in channel running
E (38914) i2s_common: i2s_channel_disable(1021): the channel has not been enabled yet
I (38914) I2S_IF: channel mode 0 bits:16/16 channel:2 mask:3
I (38918) I2S_IF: STD Mode 1 bits:16/16 channel:2 sample_rate:16000 mask:3
I (38995) Adev_Codec: Open codec device OK
E (38995) i2s_common: i2s_channel_disable(1021): the channel has not been enabled yet
I (38998) I2S_IF: channel mode 0 bits:16/16 channel:2 mask:3
I (39003) I2S_IF: STD Mode 0 bits:16/16 channel:2 sample_rate:16000 mask:3
I (39011) ES7210: Bits 16
I (39017) ES7210: Enable ES7210_INPUT_MIC1
I (39020) ES7210: Enable ES7210_INPUT_MIC2
I (39027) ES7210: Unmuted
I (39027) Adev_Codec: Open codec device OK
I (40732) esp-x509-crt-bundle: Certificate validated
I (42466) app_audio: Player IDLE
I (42468) I2S_IF: Pending out channel for in channel running
E (42486) i2s_common: i2s_channel_disable(1021): the channel has not been enabled yet
I (42486) I2S_IF: channel mode 0 bits:16/16 channel:2 mask:3
I (42490) I2S_IF: STD Mode 1 bits:16/16 channel:2 sample_rate:16000 mask:3
I (42567) Adev_Codec: Open codec device OK
E (42568) i2s_common: i2s_channel_disable(1021): the channel has not been enabled yet
I (42571) I2S_IF: channel mode 0 bits:16/16 channel:2 mask:3
I (42575) I2S_IF: STD Mode 0 bits:16/16 channel:2 sample_rate:16000 mask:3
I (42588) ES7210: Bits 16
I (42591) ES7210: Enable ES7210_INPUT_MIC1
I (42593) ES7210: Enable ES7210_INPUT_MIC2
I (42600) ES7210: Unmuted
I (42600) Adev_Codec: Open codec device OK
I (42604) app_main: replay audio end
I (49166) ui_ctrl: update reply question
I (49166) ui_ctrl: update listen speak
I (50838) esp-x509-crt-bundle: Certificate validated
I (53165) ui_ctrl: update reply question
I (53166) ui_ctrl: update listen speak
I (53166) ui_ctrl: update reply content
I (53168) ui_ctrl: decode:[34, 34] Hello! How can I assist you today?
I (53176) ui_ctrl: reply scroll timer start
I (53180) ui_ctrl: Swich to panel[3]
I (54491) esp-x509-crt-bundle: Certificate validated
E (57355) OpenAI: ./managed_components/espressif__openai/OpenAI.c:2462 (OpenAI_Speech_Request):HTTP client fetch headers failed!
E (57357) OpenAI: ./managed_components/espressif__openai/OpenAI.c:1936 (OpenAI_AudioSpeechMessage):Empty result!
W (57367) ui_ctrl: Switch panel to [0] in 10000ms
E (57421) app_main: start_openai(145): [audioSpeech]: invalid response
I (57421) app_audio: Player PLAYING
I (57433) I2S_IF: Pending out channel for in channel running
E (57452) i2s_common: i2s_channel_disable(1021): the channel has not been enabled yet
I (57453) I2S_IF: channel mode 0 bits:16/16 channel:2 mask:3
I (57457) I2S_IF: STD Mode 1 bits:16/16 channel:2 sample_rate:24000 mask:3
I (57547) Adev_Codec: Open codec device OK
E (57548) i2s_common: i2s_channel_disable(1021): the channel has not been enabled yet
I (57549) I2S_IF: channel mode 0 bits:16/16 channel:2 mask:3
I (57555) I2S_IF: STD Mode 0 bits:16/16 channel:2 sample_rate:24000 mask:3
I (57563) ES7210: Bits 16
I (57569) ES7210: Enable ES7210_INPUT_MIC1
I (57572) ES7210: Enable ES7210_INPUT_MIC2
I (57579) ES7210: Unmuted
I (57579) Adev_Codec: Open codec device OK
I (59526) app_audio: Player IDLE
I (59528) I2S_IF: Pending out channel for in channel running
E (59543) i2s_common: i2s_channel_disable(1021): the channel has not been enabled yet
I (59543) I2S_IF: channel mode 0 bits:16/16 channel:2 mask:3
I (59547) I2S_IF: STD Mode 1 bits:16/16 channel:2 sample_rate:16000 mask:3
I (59632) Adev_Codec: Open codec device OK
E (59632) i2s_common: i2s_channel_disable(1021): the channel has not been enabled yet
I (59634) I2S_IF: channel mode 0 bits:16/16 channel:2 mask:3
I (59641) I2S_IF: STD Mode 0 bits:16/16 channel:2 sample_rate:16000 mask:3
I (59649) ES7210: Bits 16
I (59654) ES7210: Enable ES7210_INPUT_MIC1
I (59657) ES7210: Enable ES7210_INPUT_MIC2
I (59664) ES7210: Unmuted
I (59664) Adev_Codec: Open codec device OK
I (59669) app_main: replay audio end
I (67367) ui_ctrl: Swich to panel[0]
Switching speech recognition to Chinese: https://github.com/espressif/esp-box/tree/master/examples/chatgpt_demo#known-issues
modified examples/chatgpt_demo/main/main.c
@@ -23,7 +23,7 @@
#include "settings.h"
#define SCROLL_START_DELAY_S (1.5)
-#define LISTEN_SPEAK_PANEL_DELAY_MS 2000
+#define LISTEN_SPEAK_PANEL_DELAY_MS 10000
#define SERVER_ERROR "server_error"
#define INVALID_REQUEST_ERROR "invalid_request_error"
#define SORRY_CANNOT_UNDERSTAND "Sorry, I can't understand."
@@ -56,7 +56,7 @@ esp_err_t start_openai(uint8_t *audio, int audio_len)
audioSpeech = openai->audioSpeechCreate(openai);
audioTranscription->setResponseFormat(audioTranscription, OPENAI_AUDIO_RESPONSE_FORMAT_JSON);
- audioTranscription->setLanguage(audioTranscription, "en");
+ audioTranscription->setLanguage(audioTranscription, "zh");
audioTranscription->setTemperature(audioTranscription, 0.2);
chatCompletion->setModel(chatCompletion, "gpt-3.5-turbo");
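Aliyun intelligent speech recognition (speech -> text) and speech synthesis (text -> speech) #
# Short-utterance recognition (audio no longer than one minute)
# https://help.aliyun.com/zh/isi/developer-reference/restful-api-2?spm=a2c4g.11186623.0.0.3a0824bcuJRdhn
# Upload a wav file; the recognition result is returned in the result field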
zj@a:~$ curl -X POST -H "X-NLS-Token: 7275bfc4a4af4fb6864d2319a85ee26b" "http://nls-gateway-cn-beijing.aliyuncs.com/stream/v1/asr?appkey=K8mSx8pOrrcJ5bS9" --data-binary @./nls-sample-16k.wav
{"task_id":"41b53a3471b44be8b2ed9c12790fd40f","result":"北京的天气","status":20000000,"message":"SUCCESS"}zj@a:~$
# Upload text; up to 300 characters can be synthesized in a single request
# https://help.aliyun.com/zh/isi/developer-reference/overview-of-speech-synthesis?spm=a2c4g.11186623.0.0.15441d63I69Yoc
zj@a:~/docs$ curl -X POST 'https://nls-gateway-cn-beijing.aliyuncs.com/stream/v1/tts' -H 'Content-Type: application/json' --data-raw '{
"appkey":"K8mSx8pOrrcJ5bS9",
"text":"今天是周一,天气挺好的。",
"token":"7275bfc4a4af4fb6864d2319a85ee26b",
"format":"mp3"
}' -o save.mp3
# Asynchronous long-text synthesis
zj@a:~$ curl -v -X POST 'https://nls-gateway-cn-shanghai.aliyuncs.com/rest/v1/tts/async' -H 'Content-Type: application/json' --data '{
"payload": {
"tts_request": {
"voice": "xiaoyun",
"sample_rate": 16000,
"format": "wav",
"text": "今天天气好晴朗",
"enable_subtitle": true
},
"enable_notify": false
},
"context": {
"device_id": "my_device_id"
},
"header": {
"appkey": "K8mSx8pOrrcJ5bS9",
"token": "7275bfc4a4af4fb6864d2319a85ee26b"
}
}
'
{"status":200,"data":{"task_id":"5f7ca13c8c9d4a209c3d00d81020ed62"},"error_code":20000000,"error_message":"SUCCESS","request_id":"bb621c59522f4b7babe33f063c3aae9a"}zj@a:~$
# Query by task_id to get the link to the synthesis result
zj@a:~/docs$ curl 'https://nls-gateway-cn-shanghai.aliyuncs.com/rest/v1/tts/async?appkey={K8mSx8pOrrcJ5bS9}&task_id={336d21fa712a469294f702b38624a305}&token={7275bfc4a4af4fb6864d2319a85ee26b}'
{"status":200,"data":{"sentences":[{"text":"今天天气好晴朗","begin_time":"0","end_time":"1985"}],"task_id":"336d21fa712a469294f702b38624a305","audio_address":"http://nls-cloud-cn-shanghai.oss-cn-shanghai.aliyuncs.com/jupiter-flow/tmp/336d21fa712a469294f702b38624a305.wav?Expires=1716365759&OSSAccessKeyId=LTAI4G588hXC7P47wauY5e2K&Signature=Qbk6aI4YZR%2F2C1f3JIE%2BPJLGmT8%3D","notify_custom":null},"error_code":20000000,"error_message":"SUCCESS","request_id":"33f3ec7598c740dbbd1db246b1daa081"}zj@a:~/docs$
# Download the synthesized audio
zj@a:~/docs$ curl -v 'http://nls-cloud-cn-shanghai.oss-cn-shanghai.aliyuncs.com/jupiter-flow/tmp/336d21fa712a469294f702b38624a305.wav?Expires=1716365759&OSSAccessKeyId=LTAI4G588hXC7P47wauY5e2K&Signature=Qbk6aI4YZR%2F2C1f3JIE%2BPJLGmT8%3D' -o test.wav
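The asynchronous flow above has three steps: submit the text, query by `task_id`, then download from `audio_address`. A minimal sketch of pulling those two fields out of the JSON bodies shown above; the helper names are hypothetical, and treating a missing or null `audio_address` as "not ready yet" is an assumption rather than something the responses above confirm.

```python
import json
from typing import Optional


def extract_task_id(submit_body: str) -> str:
    """Return data.task_id from the submit response (hypothetical helper)."""
    return json.loads(submit_body)['data']['task_id']


def extract_audio_address(query_body: str) -> Optional[str]:
    """Return data.audio_address from the query response.

    Assumption: a missing or null audio_address means the task is not finished.
    """
    data = json.loads(query_body).get('data') or {}
    return data.get('audio_address')


if __name__ == '__main__':
    # Shortened sample body with a placeholder task id.
    submit = '{"status":200,"data":{"task_id":"abc"},"error_message":"SUCCESS"}'
    print(extract_task_id(submit))
```

Only when `extract_audio_address` returns a URL is there something to download; otherwise the query would have to be repeated later.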
WAV files generated by audio-SR use a 16 kHz sample rate #
https://github.com/espressif/esp-box/blob/master/examples/chatgpt_demo/main/app/app_audio.c#L201
file_total_len += record_total_len;
ESP_LOGI(TAG, "### record Stop, %" PRIu32 " %" PRIu32 "K", \
record_total_len, \
record_total_len / 1024);
FILE *fp = fopen("/spiffs/echo_en_wake.wav", "r");
ESP_GOTO_ON_FALSE(NULL != fp, ESP_FAIL, err, TAG, "Failed create record file");
wav_header_t wav_head;
int len = fread(&wav_head, 1, sizeof(wav_header_t), fp);
ESP_GOTO_ON_FALSE(len > 0, ESP_FAIL, err, TAG, "Failed create record file");
wav_head.SampleRate = 16000; // sample rate
#if PCM_ONE_CHANNEL
wav_head.NumChannels = 1;
#else
wav_head.NumChannels = 2;
#endif
wav_head.BitsPerSample = 16; // bit depth (resolution)
wav_head.ChunkSize = file_total_len - 8;
wav_head.ByteRate = wav_head.SampleRate * wav_head.BitsPerSample * wav_head.NumChannels / 8;
wav_head.Subchunk2ID[0] = 'd';
wav_head.Subchunk2ID[1] = 'a';
wav_head.Subchunk2ID[2] = 't';
wav_head.Subchunk2ID[3] = 'a';
wav_head.Subchunk2Size = record_total_len;
memcpy((void *)record_audio_buffer, &wav_head, sizeof(wav_header_t));
Cache_WriteBack_Addr((uint32_t)record_audio_buffer, record_total_len);
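The fields set above correspond to the usual 44-byte PCM WAV header. As a rough cross-check of the arithmetic (ChunkSize, ByteRate, Subchunk2Size), the same header can be sketched in Python with `struct`. The helper name `build_wav_header` and the exact field order are my assumption of the canonical RIFF/WAVE layout, not code taken from esp-box.

```python
import struct


def build_wav_header(data_len: int, sample_rate: int = 16000,
                     num_channels: int = 2, bits_per_sample: int = 16) -> bytes:
    """Pack a canonical 44-byte PCM WAV header (hypothetical helper)."""
    block_align = num_channels * bits_per_sample // 8   # bytes per frame
    byte_rate = sample_rate * block_align               # bytes per second
    return struct.pack(
        '<4sI4s4sIHHIIHH4sI',
        b'RIFF', 36 + data_len, b'WAVE',   # ChunkSize = total file size - 8
        b'fmt ', 16, 1,                    # fmt chunk size, AudioFormat 1 = PCM
        num_channels, sample_rate,
        byte_rate, block_align, bits_per_sample,
        b'data', data_len)                 # Subchunk2Size = PCM payload length


if __name__ == '__main__':
    header = build_wav_header(97185)       # payload length taken from the log above
    print(len(header))
```

With the defaults (16 kHz, 2 channels, 16 bits), ByteRate works out to 16000 * 16 * 2 / 8 = 64000 bytes per second, the same formula the C code uses.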
Sambert speech synthesis #
Save the synthesized audio to a file:
# https://help.aliyun.com/zh/dashscope/developer-reference/api-details-13?spm=a2c4g.11186623.0.0.5b2a1ac3a7GhO3
# coding=utf-8
import dashscope
from dashscope.audio.tts import SpeechSynthesizer

dashscope.api_key = 'your-dashscope-api-key'

result = SpeechSynthesizer.call(model='sambert-zhichu-v1',
                                text='今天天气怎么样',
                                sample_rate=48000)
if result.get_audio_data() is not None:
    with open('output.wav', 'wb') as f:
        f.write(result.get_audio_data())
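The speech synthesis service provides a real-time speech synthesis API that converts text into audio. In addition to the audio data, you can optionally enable character-level and phoneme-level timestamps, which can be used to generate subtitles or drive a digital human's lip movements.
Different usage scenarios call for different models, for example customer service, live streaming, dialect, or child-voice scenarios; see the model list for details. The sample rate matters too: 8 kHz is generally recommended for customer-service scenarios and 16/24/48 kHz for other scenarios. A higher sample rate gives fuller audio that sounds better.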
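The trade-off behind the sample rate choice is data rate. A small sketch of the arithmetic for the four rates mentioned above, assuming uncompressed 16-bit mono PCM (the note does not state the bit depth or channel count of the synthesized audio, so both are assumptions):

```python
def pcm_bytes_per_second(sample_rate: int, bits_per_sample: int = 16,
                         channels: int = 1) -> int:
    """Uncompressed PCM data rate in bytes per second."""
    return sample_rate * bits_per_sample * channels // 8


if __name__ == '__main__':
    for rate in (8000, 16000, 24000, 48000):
        print(f"{rate} Hz -> {pcm_bytes_per_second(rate) / 1024:.1f} KiB/s")
```

Under these assumptions 48 kHz audio is six times larger than 8 kHz audio of the same duration, which matters on a device that buffers the reply in RAM before playback.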
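Synchronous call: submit a single speech synthesis task and synthesize the speech without a callback function (no intermediate results are streamed); the complete result is returned in one go at the end. The following example uses the synchronous interface to call the voice model Zhichu (sambert-zhichu-v1), synthesizes the text “今天天气怎么样” into WAV audio at a 48 kHz sample rate, and saves it to a file named output.wav.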
# coding=utf-8
import dashscope
from dashscope.audio.tts import SpeechSynthesizer

dashscope.api_key = 'your-dashscope-api-key'

result = SpeechSynthesizer.call(model='sambert-zhichu-v1',
                                text='今天天气怎么样',
                                sample_rate=48000,
                                format='wav')
if result.get_audio_data() is not None:
    with open('output.wav', 'wb') as f:
        f.write(result.get_audio_data())
print(' get response: %s' % (result.get_response()))
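Paraformer speech recognition #
Real-time speech recognition sample code: real-time speech recognition transcribes audio streams of unlimited length as they arrive, producing text while the speaker is still talking. It has built-in intelligent sentence segmentation and can report the start and end time of each sentence. Typical uses include live video captions, real-time meeting minutes, real-time court hearing records, and intelligent voice assistants.
Streaming speech-to-text from a microphone: the following example uses the real-time speech recognition API to run streaming recognition on microphone input and display the text as it is recognized.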
# For prerequisites running the following sample, visit https://help.aliyun.com/document_detail/611472.html
import pyaudio
import dashscope
from dashscope.audio.asr import (Recognition, RecognitionCallback,
                                 RecognitionResult)

dashscope.api_key = '<your-dashscope-api-key>'

mic = None
stream = None


class Callback(RecognitionCallback):
    def on_open(self) -> None:
        global mic
        global stream
        print('RecognitionCallback open.')
        mic = pyaudio.PyAudio()
        stream = mic.open(format=pyaudio.paInt16,
                          channels=1,
                          rate=16000,
                          input=True)

    def on_close(self) -> None:
        global mic
        global stream
        print('RecognitionCallback close.')
        stream.stop_stream()
        stream.close()
        mic.terminate()
        stream = None
        mic = None

    def on_event(self, result: RecognitionResult) -> None:
        print('RecognitionCallback sentence: ', result.get_sentence())


callback = Callback()
recognition = Recognition(model='paraformer-realtime-v1',
                          format='pcm',
                          sample_rate=16000,
                          callback=callback)
recognition.start()

while True:
    if stream:
        data = stream.read(3200, exception_on_overflow=False)
        recognition.send_audio_frame(data)
    else:
        break

recognition.stop()
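In the loop above, `stream.read(3200)` reads 3200 frames per call, which at 16 kHz, 16-bit mono is 200 ms (6400 bytes) of audio. When the PCM data comes from a buffer rather than a microphone, the same chunking could be sketched as follows. `iter_pcm_chunks` and `chunk_duration_ms` are hypothetical helpers and are not part of the dashscope SDK.

```python
from typing import Iterator


def iter_pcm_chunks(pcm: bytes, frames_per_chunk: int = 3200,
                    sample_width: int = 2, channels: int = 1) -> Iterator[bytes]:
    """Yield consecutive chunks of raw PCM; the last chunk may be shorter."""
    chunk_bytes = frames_per_chunk * sample_width * channels
    for offset in range(0, len(pcm), chunk_bytes):
        yield pcm[offset:offset + chunk_bytes]


def chunk_duration_ms(frames_per_chunk: int = 3200,
                      sample_rate: int = 16000) -> float:
    """Duration of one chunk in milliseconds."""
    return frames_per_chunk * 1000 / sample_rate


if __name__ == '__main__':
    silence = bytes(16000)  # 0.5 s of 16-bit mono silence at 16 kHz
    print([len(c) for c in iter_pcm_chunks(silence)], chunk_duration_ms())
```

Each chunk could then be passed to `recognition.send_audio_frame(chunk)` in place of the microphone read.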
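File transcription with the synchronous interface: the following example uses the synchronous speech recognition API to transcribe a file. This interface is worth considering for shorter, near-real-time recognition scenarios such as conversational chat, voice commands, voice input methods, and voice search.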
# For prerequisites running the following sample, visit https://help.aliyun.com/document_detail/611472.html
import requests
from http import HTTPStatus

import dashscope
from dashscope.audio.asr import Recognition

dashscope.api_key = '<your-dashscope-api-key>'

r = requests.get(
    'https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_female2.wav'
)
with open('asr_example.wav', 'wb') as f:
    f.write(r.content)

recognition = Recognition(model='paraformer-realtime-v1',
                          format='wav',
                          sample_rate=16000,
                          callback=None)
result = recognition.call('asr_example.wav')
if result.status_code == HTTPStatus.OK:
    with open('asr_result.txt', 'w+') as f:
        for sentence in result.get_sentence():
            f.write(str(sentence) + '\n')
    print('Recognition done!')
else:
    print('Error: ', result.message)
Result returned by the call:
{
"begin_time": 280,
"end_time": 4000,
"text": "hello word, 这里是阿里巴巴语音实验室。",
"words": [{
"begin_time": 280,
"end_time": 776,
"text": "hello ",
"punctuation": ""
}, {
"begin_time": 776,
"end_time": 1024,
"text": "word",
"punctuation": ", "
}, {
"begin_time": 1024,
"end_time": 1520,
"text": "这里",
"punctuation": ""
}, {
"begin_time": 1520,
"end_time": 1768,
"text": "是",
"punctuation": ""
}, {
"begin_time": 1768,
"end_time": 2760,
"text": "阿里巴巴",
"punctuation": ""
}, {
"begin_time": 2760,
"end_time": 3256,
"text": "语音",
"punctuation": ""
}, {
"begin_time": 3256,
"end_time": 4000,
"text": "实验室",
"punctuation": "。"
}]
}
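In the result above, each entry in `words` carries its own `text`, trailing `punctuation`, and millisecond timestamps, and concatenating them reproduces the sentence-level `text`. A minimal sketch of working with that structure; the helper names are hypothetical and the field names are the ones visible in the output above.

```python
from typing import List, Tuple


def join_words(sentence: dict) -> str:
    """Rebuild the sentence text from its word entries."""
    return ''.join(w['text'] + w['punctuation'] for w in sentence['words'])


def word_durations_ms(sentence: dict) -> List[Tuple[str, int]]:
    """Return (word, duration in ms) pairs, e.g. for timing subtitles."""
    return [(w['text'], w['end_time'] - w['begin_time'])
            for w in sentence['words']]


if __name__ == '__main__':
    # First two words of the sample result above.
    sample = {
        "words": [
            {"begin_time": 280, "end_time": 776, "text": "hello ", "punctuation": ""},
            {"begin_time": 776, "end_time": 1024, "text": "word", "punctuation": ", "},
        ]
    }
    print(join_words(sample))
    print(word_durations_ms(sample))
```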
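Asynchronous file transcription sample code: the following example calls the Paraformer asynchronous file transcription API to batch-process speech recognition for multiple audio files given by URL.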
# For prerequisites running the following sample, visit https://help.aliyun.com/document_detail/611472.html
import json
from urllib import request
from http import HTTPStatus

import dashscope

dashscope.api_key = 'your-dashscope-api-key'

task_response = dashscope.audio.asr.Transcription.async_call(
    model='paraformer-v1',
    file_urls=[
        'https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_female2.wav',
        'https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_male2.wav'
    ])

transcription_response = dashscope.audio.asr.Transcription.wait(
    task=task_response.output.task_id)

if transcription_response.status_code == HTTPStatus.OK:
    for transcription in transcription_response.output['results']:
        url = transcription['transcription_url']
        result = json.loads(request.urlopen(url).read().decode('utf8'))
        print(json.dumps(result, indent=4, ensure_ascii=False))
    print('transcription done!')
else:
    print('Error: ', transcription_response.output.message)
Result:
{
"file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_male2.wav",
"properties": {
"channels": [],
"original_sampling_rate": 16000,
"original_duration_in_milliseconds": 4726
},
"transcripts": [{
"channel_id": 0,
"content_duration_in_milliseconds": 3696,
"text": "Hello, world, 这里是阿里巴巴语音实验室。",
"sentences": [{
"begin_time": 480,
"end_time": 4176,
"sentence_id": 0,
"text": "Hello, world, 这里是阿里巴巴语音实验室。",
"words": [{
"begin_time": 480,
"end_time": 860,
"text": "Hello",
"punctuation": ", "
},
{
"begin_time": 860,
"end_time": 1320,
"text": "world",
"punctuation": ", "
},
{
"begin_time": 1320,
"end_time": 2034,
"text": "这里是",
"punctuation": ""
},
{
"begin_time": 2034,
"end_time": 2986,
"text": "阿里巴巴",
"punctuation": ""
},
{
"begin_time": 2986,
"end_time": 3462,
"text": "语音",
"punctuation": ""
},
{
"begin_time": 3462,
"end_time": 4176,
"text": "实验室",
"punctuation": "。"
}
]
}]
}]
}
{
"file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_female2.wav",
"properties": {
"channels": [],
"original_sampling_rate": 16000,
"original_duration_in_milliseconds": 3834
},
"transcripts": [{
"channel_id": 0,
"content_duration_in_milliseconds": 2816,
"text": "Hello, world, 这里是阿里巴巴语音实验室。",
"sentences": [{
"begin_time": 560,
"end_time": 3376,
"sentence_id": 0,
"text": "Hello, world, 这里是阿里巴巴语音实验室。",
"words": [{
"begin_time": 560,
"end_time": 880,
"text": "Hello",
"punctuation": ", "
},
{
"begin_time": 880,
"end_time": 1180,
"text": "world",
"punctuation": ", "
},
{
"begin_time": 1180,
"end_time": 1729,
"text": "这里是",
"punctuation": ""
},
{
"begin_time": 1729,
"end_time": 2461,
"text": "阿里巴巴",
"punctuation": ""
},
{
"begin_time": 2461,
"end_time": 2827,
"text": "语音",
"punctuation": ""
},
{
"begin_time": 2827,
"end_time": 3376,
"text": "实验室",
"punctuation": "。"
}
]
}]
}]
}
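Because every sentence in the transcription result carries `begin_time` and `end_time` in milliseconds, the output maps naturally onto subtitles. A minimal sketch that turns one result object into SRT text; `ms_to_srt` and `to_srt` are hypothetical helpers, and the field names are the ones visible in the output above.

```python
def ms_to_srt(ms: int) -> str:
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(result: dict) -> str:
    """Build SRT cues from transcripts[].sentences[] of one result object."""
    blocks = []
    index = 1
    for transcript in result['transcripts']:
        for s in transcript['sentences']:
            blocks.append(
                f"{index}\n"
                f"{ms_to_srt(s['begin_time'])} --> {ms_to_srt(s['end_time'])}\n"
                f"{s['text']}\n")
            index += 1
    return '\n'.join(blocks)


if __name__ == '__main__':
    # Timestamps taken from the first result above; text shortened.
    sample = {"transcripts": [{"sentences": [
        {"begin_time": 480, "end_time": 4176, "text": "Hello, world"}]}]}
    print(to_srt(sample))
```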
Real-time speech recognition:
recognition = Recognition(model='paraformer-realtime-v1',
                          format='pcm',
                          sample_rate=16000,
                          callback=callback)
# Return the recognition result synchronously
result = recognition.call('asr_example.wav')
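Recorded-file recognition API details: audio and video files are usually large, and both file transfer and speech recognition take time, so the file transcription API submits tasks through asynchronous calls. Developers need to use the query interface to obtain the recognition result once transcription has finished. The file transcription API supports batch processing: up to 100 file URLs can be submitted in a single call, and once all URLs have been transcribed, the full set of results can be retrieved in one go.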
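Given the limit of 100 file URLs per submission, a longer list has to be split before it is submitted. A minimal sketch of that split; `batch_urls` is a hypothetical helper, and the example URLs are placeholders.

```python
from typing import List


def batch_urls(urls: List[str], batch_size: int = 100) -> List[List[str]]:
    """Split a URL list into batches of at most batch_size entries."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [urls[i:i + batch_size] for i in range(0, len(urls), batch_size)]


if __name__ == '__main__':
    urls = [f"https://example.invalid/audio/{i}.wav" for i in range(250)]
    print([len(b) for b in batch_urls(urls)])
```

Each batch could then be passed as `file_urls` to `Transcription.async_call`, as in the asynchronous example earlier in this section.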