[obs-studio open-source project: from getting started to giving up] Understanding audio_thread, the audio encoding thread

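Contents

    • Preface
    • 1. When the audio encoding thread is created
    • 2. What the audio encoding thread does
      • input_and_output: the function that actually handles audio input and output
      • audio_callback: the callback that collects all audio data
      • do_audio_output: encoding and outputting the mixed audio
      • receive_audio: the real entry point for audio encoding
    • 3. How the audio encoding thread cooperates with the RTMP output thread
    • 4. Summary
    • Technical references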


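Preface

Entry point for this OBS article series: https://blog.csdn.net/qq_33844311/article/details/121479224

OBS handles audio on a single thread, unlike video, which uses two. Audio enters OBS mainly from the following sources (three built-in sources plus one third-party plugin):

  • Local microphone capture: wasapi_input_capture
  • Local speaker (desktop audio) capture: wasapi_output_capture
  • Media source audio output: ffmpeg_source
  • Windows per-process audio capture: win-capture-audio (an unofficial audio source; requires Windows 10 2004, released 2020-05-27, or later)

Every source feeds its captured audio into the audio buffer queues of the libobs core library through obs_source_output_audio, and all audio processing in OBS is asynchronous. Audio capture threads collect the audio data (each kind of source has its own capture thread), and the audio encoding thread encodes and sends it. The two sides communicate through the audio buffer queues.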

/** Outputs audio data (always asynchronous) */
void obs_source_output_audio(obs_source_t *source, const struct obs_source_audio *audio);
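
To make the hand-off between a capture thread and the audio thread more concrete, here is a minimal sketch of a mutex-protected sample queue. It is not libobs code: toy_audio_queue, toy_queue_push, toy_queue_pop and the capacity are all hypothetical names and values chosen for illustration, and the real buffers in libobs are more involved.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define TOY_QUEUE_CAPACITY 4096 /* hypothetical size, in samples */

/* Hypothetical stand-in for a per-source audio buffer; not a libobs type. */
struct toy_audio_queue {
	pthread_mutex_t mutex;
	float samples[TOY_QUEUE_CAPACITY];
	size_t head;  /* index of the oldest sample */
	size_t count; /* number of samples currently stored */
};

static void toy_queue_init(struct toy_audio_queue *q)
{
	pthread_mutex_init(&q->mutex, NULL);
	q->head = 0;
	q->count = 0;
}

/* Capture-thread side: append samples, returns how many were stored. */
static size_t toy_queue_push(struct toy_audio_queue *q, const float *in,
			     size_t n)
{
	assert(q && in);
	size_t stored = 0;
	pthread_mutex_lock(&q->mutex);
	while (stored < n && q->count < TOY_QUEUE_CAPACITY) {
		size_t tail = (q->head + q->count) % TOY_QUEUE_CAPACITY;
		q->samples[tail] = in[stored++];
		q->count++;
	}
	pthread_mutex_unlock(&q->mutex);
	return stored;
}

/* Audio-thread side: remove up to n samples, returns how many were read. */
static size_t toy_queue_pop(struct toy_audio_queue *q, float *out, size_t n)
{
	assert(q && out);
	size_t read = 0;
	pthread_mutex_lock(&q->mutex);
	while (read < n && q->count > 0) {
		out[read++] = q->samples[q->head];
		q->head = (q->head + 1) % TOY_QUEUE_CAPACITY;
		q->count--;
	}
	pthread_mutex_unlock(&q->mutex);
	return read;
}
```

In this sketch the capture thread would call toy_queue_push whenever the device delivers samples, and the audio thread would call toy_queue_pop once per tick. Neither side waits for the other beyond the short critical section, which is the sense in which the output is asynchronous.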

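Next, as in the article on the OBS video encoding thread, I annotate the source code to look at when the audio encoding thread is created, what it does, and how it cooperates with other threads.

1. When the audio encoding thread is created

Setting a breakpoint in the debugger gives the following call stack for the creation of the audio encoding thread.

	// Create the audio thread audio_thread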
	pthread_create(&out->thread, NULL, audio_thread, out)
>	obs.dll!audio_output_open(audio_output * * audio, audio_output_info * info)395	C
 	obs.dll!obs_init_audio(audio_output_info * ai)578	C
 	obs.dll!obs_reset_audio(const obs_audio_info * oai)1202	C
 	obs64.exe!OBSBasic::ResetAudio()4487	C++
 	obs64.exe!OBSBasic::OBSInit()1772	C++
 	obs64.exe!OBSApp::OBSInit()1474	C++
 	obs64.exe!run_program(std::basic_fstream<char,std::char_traits<char>> & logFile, int argc, char * * argv)2138	C++
 	obs64.exe!main(int argc, char * * argv)2839	C++
 	obs64.exe!WinMain(HINSTANCE__ * __formal, HINSTANCE__ * __formal, char * __formal, int __formal)97	C++

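The call stack shows that the audio encoding thread, like the video rendering thread and the video encoding thread, is created from OBSBasic::OBSInit(), the main thread's initialization function. All three threads are created together at program startup.

2. What the audio encoding thread does

The audio encoding thread fetches the audio data of every source, mixes it, and sends it off for encoding (AAC or Opus). Each encoded packet is inserted into the interleaved audio/video packet queue, and the RTMP send thread is notified to mux the packets into FLV and send them to the streaming server over RTMP. The same output can also be recorded to a local video file.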
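
The thread does this work in fixed-size blocks: each pass handles AUDIO_OUTPUT_FRAMES frames, and the audio_time and prev_time arguments of input_and_output mark the end and start of the block. The sketch below shows the arithmetic under two assumptions that should be checked against your libobs version: that AUDIO_OUTPUT_FRAMES is 1024, and that block boundaries are derived from the total frame count. toy_frames_to_ns and toy_tick_end_ns are hypothetical helpers, not libobs functions.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_OUTPUT_FRAMES 1024 /* frames per tick; assumed to mirror AUDIO_OUTPUT_FRAMES */

/* Hypothetical helper: duration of `frames` audio frames in nanoseconds. */
static uint64_t toy_frames_to_ns(uint32_t sample_rate, uint64_t frames)
{
	assert(sample_rate > 0);
	return frames * 1000000000ULL / sample_rate;
}

/* Timestamp at which tick number `tick` (1-based) ends, counted from
 * start_ns. Deriving it from the total frame count, instead of adding a
 * rounded per-tick duration each time, avoids accumulating rounding error. */
static uint64_t toy_tick_end_ns(uint64_t start_ns, uint32_t sample_rate,
				uint64_t tick)
{
	return start_ns +
	       toy_frames_to_ns(sample_rate, tick * TOY_OUTPUT_FRAMES);
}
```

At 48000 Hz one block of 1024 frames lasts a little over 21.3 ms, so under these assumptions the audio thread wakes roughly 47 times per second.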

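input_and_output: the function that actually handles audio input and output

Inside the loop of the audio_thread thread function, the function that does the real input and output work is input_and_output.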

static void input_and_output(struct audio_output *audio, uint64_t audio_time,
			     uint64_t prev_time)
{
	size_t bytes = AUDIO_OUTPUT_FRAMES * audio->block_size;
	struct audio_output_data data[MAX_AUDIO_MIXES];
	uint32_t active_mixes = 0;
	uint64_t new_ts = 0;
	bool success;

	memset(data, 0, sizeof(data));

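	/* get mixers: build a bitmask of the mixes that currently have at least one input (consumer) attached, i.e. the mixes that need to be filled */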
	pthread_mutex_lock(&audio->input_mutex);
	for (size_t i = 0; i < MAX_AUDIO_MIXES; i++) {
		if (audio->mixes[i].inputs.num)
			active_mixes |= (1 << i);
	}
	pthread_mutex_unlock(&audio->input_mutex);

	/* clear mix buffers, and point data[] at the start of each buffer */
	for (size_t mix_idx = 0; mix_idx < MAX_AUDIO_MIXES; mix_idx++) {
		struct audio_mix *mix = &audio->mixes[mix_idx];

		memset(mix->buffer, 0, sizeof(mix->buffer));

		for (size_t i = 0; i < audio->planes; i++)
			data[mix_idx].data[i] = mix->buffer[i];
	}

	/* get new audio data */
	// audio->input_cb is audio_callback (obs-audio.c), which is bound in obs_reset_audio
	success = audio->input_cb(audio->input_param, prev_time, audio_time,
				  &new_ts, active_mixes, data);
	if (!success)
		return;

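	/* clamps audio data to -1.0..1.0 */
	// Clamp every float sample of the mixed audio into [-1.0f, 1.0f].
	// Mixing adds the sources together, so the sum can overshoot this range; this looks like
	// the clamp half of a "sum and clamp" mixing algorithm. Corrections are welcome if I have this wrong.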
	clamp_audio_output(audio, bytes);

	/* do_audio_output: hand each mix on for encoding and streaming */
	for (size_t i = 0; i < MAX_AUDIO_MIXES; i++)
		do_audio_output(audio, i, new_ts, AUDIO_OUTPUT_FRAMES);
}
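
Two steps of the function above are easy to try in isolation: building the active_mixes bitmask and clamping the mixed samples. The following is a minimal sketch, not the libobs implementation; toy_active_mixes and toy_clamp are hypothetical names, and TOY_MAX_MIXES is assumed to match MAX_AUDIO_MIXES.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_MIXES 6 /* assumed to mirror MAX_AUDIO_MIXES */

/* Build a bitmask with bit i set when mix i has at least one consumer. */
static uint32_t toy_active_mixes(const size_t inputs_per_mix[TOY_MAX_MIXES])
{
	uint32_t mask = 0;
	for (size_t i = 0; i < TOY_MAX_MIXES; i++) {
		if (inputs_per_mix[i] > 0)
			mask |= (1u << i);
	}
	return mask;
}

/* Clamp every sample into [-1.0f, 1.0f], in place. */
static void toy_clamp(float *samples, size_t count)
{
	assert(samples || count == 0);
	for (size_t i = 0; i < count; i++) {
		if (samples[i] > 1.0f)
			samples[i] = 1.0f;
		else if (samples[i] < -1.0f)
			samples[i] = -1.0f;
	}
}
```

With consumers attached to mixes 0 and 2 only, the mask is binary 101, so the callback can skip the other mixes. Clamping is the simplest way to keep a sum of sources in range; it distorts loud peaks instead of scaling the whole signal.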

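audio_callback: the callback that collects all audio data

This function renders and mixes the audio of every source. It is fairly long, and there are parts I do not understand yet. I will update this section when I understand it better, and additions from readers are welcome.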

bool audio_callback(void *param, uint64_t start_ts_in, uint64_t end_ts_in,
		    uint64_t *out_ts, uint32_t mixers,
		    struct audio_output_data *mixes)
{
	struct obs_core_data *data = &obs->data;
	struct obs_core_audio *audio = &obs->audio;
	struct obs_source *source;
	// Get the audio sample rate and channel count
	size_t sample_rate = audio_output_get_sample_rate(audio->audio);
	size_t channels = audio_output_get_channels(audio->audio);
	struct ts_info ts = {start_ts_in, end_ts_in};
	size_t audio_size;
	uint64_t min_ts;

	da_resize(audio->render_order, 0);
	da_resize(audio->root_nodes, 0);

	circlebuf_push_back(&audio->buffered_timestamps, &ts, sizeof(ts));
	circlebuf_peek_front(&audio->buffered_timestamps, &ts, sizeof(ts));
	min_ts = ts.start;

	audio_size = AUDIO_OUTPUT_FRAMES * sizeof(float);

	/* ------------------------------------------------ */
	/* build audio render order
	 * NOTE: these are source channels, not audio channels */
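	// Walk the source tree attached to each output channel (the current scene and everything in it)
	// and add every source to the audio render queue audio->render_order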
	for (uint32_t i = 0; i < MAX_CHANNELS; i++) {
		obs_source_t *source = obs_get_output_source(i);
		if (source) {
			obs_source_enum_active_tree(source, push_audio_tree,
						    audio);
			push_audio_tree(NULL, source, audio);
			da_push_back(audio->root_nodes, &source);
			obs_source_release(source);
		}
	}
	
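	// Also add the sources on the global audio source list (data->first_audio_source),
	// e.g. the desktop audio and microphone capture sources, to audio->render_order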
	pthread_mutex_lock(&data->audio_sources_mutex);
	source = data->first_audio_source;
	while (source) {
		push_audio_tree(NULL, source, audio);
		source = (struct obs_source *)source->next_audio_source;
	}
	pthread_mutex_unlock(&data->audio_sources_mutex);

	/* ------------------------------------------------ */
	/* render audio data */
	// Walk the render queue built above and call obs_source_audio_render on every source in it.
	// Each source's rendered audio ends up in source->audio_output_buf
	for (size_t i = 0; i < audio->render_order.num; i++) {
		obs_source_t *source = audio->render_order.array[i];
		obs_source_audio_render(source, mixers, channels, sample_rate,
					audio_size);

		/* if a source has gone backward in time and we can no
		 * longer buffer, drop some or all of its audio */
		if (audio->total_buffering_ticks == MAX_BUFFERING_TICKS &&
		    source->audio_ts < ts.start) {
			if (source->info.audio_render) {
				blog(LOG_DEBUG,
				     "render audio source %s timestamp has "
				     "gone backwards",
				     obs_source_get_name(source));

				/* just avoid further damage */
				source->audio_pending = true;
			} else {
				pthread_mutex_lock(&source->audio_buf_mutex);
				bool rerender = ignore_audio(source, channels,
							     sample_rate,
							     ts.start);
				pthread_mutex_unlock(&source->audio_buf_mutex);

				/* if we (potentially) recovered, re-render */
				if (rerender)
					obs_source_audio_render(source, mixers,
								channels,
								sample_rate,
								audio_size);
			}
		}
	}

	/* ------------------------------------------------ */
	/* get minimum audio timestamp */
	pthread_mutex_lock(&data->audio_sources_mutex);
	const char *buffering_name = calc_min_ts(data, sample_rate, &min_ts);
	pthread_mutex_unlock(&data->audio_sources_mutex);

	/* ------------------------------------------------ */
	/* if a source has gone backward in time, buffer */
	if (min_ts < ts.start)
		add_audio_buffering(audio, sample_rate, &ts, min_ts,
				    buffering_name);

	/* ------------------------------------------------ */
	/* mix audio */
	if (!audio->buffering_wait_ticks) {
		for (size_t i = 0; i < audio->root_nodes.num; i++) {
			obs_source_t *source = audio->root_nodes.array[i];

			if (source->audio_pending)
				continue;

			pthread_mutex_lock(&source->audio_buf_mutex);
			// Mixing: mix_audio adds this source's audio_output_buf into mixes, whose data
			// pointers refer to the struct audio_mix buffers prepared in input_and_output
			if (source->audio_output_buf[0][0] && source->audio_ts)
				mix_audio(mixes, source, channels, sample_rate, &ts);
			pthread_mutex_unlock(&source->audio_buf_mutex);
		}
	}

	/* ------------------------------------------------ */
	/* discard audio */
	pthread_mutex_lock(&data->audio_sources_mutex);
	
	source = data->first_audio_source;
	while (source) {
		pthread_mutex_lock(&source->audio_buf_mutex);
		discard_audio(audio, source, channels, sample_rate, &ts);
		pthread_mutex_unlock(&source->audio_buf_mutex);

		source = (struct obs_source *)source->next_audio_source;
	}

	pthread_mutex_unlock(&data->audio_sources_mutex);

	/* ------------------------------------------------ */
	/* release audio sources */
	release_audio_sources(audio);

	circlebuf_pop_front(&audio->buffered_timestamps, NULL, sizeof(ts));

	*out_ts = ts.start;

	if (audio->buffering_wait_ticks) {
		audio->buffering_wait_ticks--;
		return false;
	}

	UNUSED_PARAMETER(param);
	return true;
}
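
The mixing step itself comes down to adding each source's samples into every mix the source is routed to. Here is a minimal sketch of that idea with mono blocks and no timestamp handling. toy_source and toy_mix are hypothetical, the block size is shrunk to 4 frames for readability, and the real mix_audio also has to align sources by timestamp, which is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_MIXES 6
#define TOY_FRAMES 4 /* tiny block size for illustration */

/* Hypothetical source: one mono block plus the set of mixes it feeds. */
struct toy_source {
	float samples[TOY_FRAMES];
	uint32_t mixers; /* bit i set: this source is routed to mix i */
};

/* Add the source's block into every mix that is both active and selected
 * by the source. Summation only; clamping is left to a later step. */
static void toy_mix(float mixes[TOY_MAX_MIXES][TOY_FRAMES],
		    const struct toy_source *src, uint32_t active_mixes)
{
	assert(mixes && src);
	for (size_t m = 0; m < TOY_MAX_MIXES; m++) {
		uint32_t bit = 1u << m;
		if (!(active_mixes & bit) || !(src->mixers & bit))
			continue;
		for (size_t f = 0; f < TOY_FRAMES; f++)
			mixes[m][f] += src->samples[f];
	}
}
```

Because the mix buffers are zeroed before each block, calling toy_mix once per source leaves the plain sum of all contributing sources in each active mix. A sum such as 0.5 + 0.75 = 1.25 is exactly the case that a later clamp step has to handle.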

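do_audio_output: encoding and outputting the mixed audio

After the audio_callback callback has collected the audio, clamp_audio_output first clamps the samples, and then do_audio_output resamples the audio and hands it on for encoding and sending.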

static inline void do_audio_output(struct audio_output *audio, size_t mix_idx,
				   uint64_t timestamp, uint32_t frames)
{
	struct audio_mix *mix = &audio->mixes[mix_idx];
	struct audio_data data;

	pthread_mutex_lock(&audio->input_mutex);

	for (size_t i = mix->inputs.num; i > 0; i--) {
		struct audio_input *input = mix->inputs.array + (i - 1);
		
		// Fill in the audio buffer pointers, frame count, and timestamp
		for (size_t i = 0; i < audio->planes; i++)
			data.data[i] = (uint8_t *)mix->buffer[i];
		data.frames = frames;
		data.timestamp = timestamp;
		
		// Resample the output audio for this input
		if (resample_audio_output(input, &data))
			// Call the bound callback, receive_audio in obs-encoder.c
			input->callback(input->param, mix_idx, &data);
	}

	pthread_mutex_unlock(&audio->input_mutex);
}
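
The pattern in do_audio_output is a plain callback fan-out: every consumer registered on a mix gets the same block, together with the opaque param it registered. The sketch below shows only that pattern and leaves out resampling and locking. All toy_ names are hypothetical simplifications of the structures in the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified versions of the structures in the listing. */
struct toy_audio_data {
	const float *data;
	uint32_t frames;
	uint64_t timestamp;
};

typedef void (*toy_audio_cb)(void *param, size_t mix_idx,
			     const struct toy_audio_data *data);

struct toy_input {
	toy_audio_cb callback;
	void *param; /* opaque pointer handed back to the callback */
};

/* Hand one mixed block to every registered consumer, last one first,
 * mirroring the reverse iteration in the listing. */
static void toy_dispatch(const struct toy_input *inputs, size_t num,
			 size_t mix_idx, const struct toy_audio_data *data)
{
	assert(data && (inputs || num == 0));
	for (size_t i = num; i > 0; i--) {
		const struct toy_input *input = &inputs[i - 1];
		input->callback(input->param, mix_idx, data);
	}
}

/* Example consumer: counts the frames it has received. */
static void toy_count_frames(void *param, size_t mix_idx,
			     const struct toy_audio_data *data)
{
	(void)mix_idx;
	*(uint64_t *)param += data->frames;
}
```

In OBS the param would be the encoder and the callback receive_audio; in this sketch the param is just a counter, which is enough to see that two consumers on the same mix each receive every block.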

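receive_audio: the real entry point for audio encoding

Once the audio encoding thread reaches receive_audio, the data is handed to the audio encoder, and from here on the handling is essentially the same as in the video encoding thread. The registered audio encoder (AAC, for example) is called; the encoded packet is inserted into the interleaved audio/video packet queue; and a semaphore is posted to wake the RTMP send thread send_thread, which muxes the packets into FLV and sends them to the streaming server over RTMP. For the finer details, read the source.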

static void receive_audio(void *param, size_t mix_idx, struct audio_data *in)
{
	profile_start(receive_audio_name);

	struct obs_encoder *encoder = param;
	struct audio_data audio = *in;
	// On the first audio data received, record its timestamp and clear encoder->audio_input_buffer
	if (!encoder->first_received) {
		encoder->first_raw_ts = audio.timestamp;
		encoder->first_received = true;
		clear_audio(encoder);
	}
	
	// Pause check: file recording supports pausing
	if (audio_pause_check(&encoder->pause, &audio, encoder->samplerate))
		goto end;

	if (!buffer_audio(encoder, &audio))
		goto end;

	while (encoder->audio_input_buffer[0].size >= encoder->framesize_bytes) {
		// Send one frame to the encoder (AAC/Opus)
		if (!send_audio_data(encoder)) {
			break;
		}
	}

	UNUSED_PARAMETER(mix_idx);

end:
	profile_end(receive_audio_name);
}
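
The while loop in receive_audio exists because the mixer's block size and the codec's frame size need not match: audio is buffered first and then drained one codec frame at a time. Below is a minimal sketch of that buffering logic. toy_encoder and its functions are hypothetical, a counter stands in for the real encode call, and the frame size used in the note after the code is only an example value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUF_CAPACITY 8192 /* samples; hypothetical */

struct toy_encoder {
	float buffered[TOY_BUF_CAPACITY];
	size_t size;           /* samples currently buffered */
	size_t framesize;      /* samples the codec consumes per frame */
	size_t frames_encoded; /* stands in for the real encode call */
};

/* Append incoming samples; returns 0 if they do not fit. */
static int toy_buffer_audio(struct toy_encoder *enc, const float *in, size_t n)
{
	assert(enc && in);
	if (n > TOY_BUF_CAPACITY - enc->size)
		return 0;
	memcpy(enc->buffered + enc->size, in, n * sizeof(float));
	enc->size += n;
	return 1;
}

/* Consume exactly one codec frame from the front of the buffer. */
static void toy_send_frame(struct toy_encoder *enc)
{
	enc->size -= enc->framesize;
	memmove(enc->buffered, enc->buffered + enc->framesize,
		enc->size * sizeof(float));
	enc->frames_encoded++;
}

/* Same shape as receive_audio: buffer, then drain whole frames. */
static void toy_receive_audio(struct toy_encoder *enc, const float *in,
			      size_t n)
{
	assert(enc->framesize > 0);
	if (!toy_buffer_audio(enc, in, n))
		return;
	while (enc->size >= enc->framesize)
		toy_send_frame(enc);
}
```

With 1024-sample blocks and an example frame size of 960 samples, the first block yields one frame and leaves 64 samples buffered. The leftover grows by 64 per block until it is large enough for an extra frame, which is why a single call can encode more than one frame.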

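3. How the audio encoding thread cooperates with the RTMP output thread

From do_encode onward, audio goes through the same logic as video, so I won't repeat it here.

Below is the complete call stack of the audio encoding thread, from capture and mixing through to notifying the send thread (read it from bottom to top).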

>	obs.dll!os_sem_post(os_sem_data * sem)139	C // post the semaphore to tell send_thread to send audio/video data
 	obs-outputs.dll!rtmp_stream_data(void * data, encoder_packet * packet)1462	C	// the callback bound as encoded_packet
 	obs.dll!send_interleaved(obs_output * output)1350	C	// take the first packet off the interleaved queue and pass it to the encoded_packet callback
 	obs.dll!interleave_packets(void * data, encoder_packet * packet)1738	C	// insert the audio packet into the interleaved audio/video packet queue
 	obs.dll!send_packet(obs_encoder * encoder, encoder_callback * cb, encoder_packet * packet)896	C
 	obs.dll!send_off_encoder_packet(obs_encoder* encoder, bool success, bool received, encoder_packet* pkt)954	C
 	obs.dll!do_encode(obs_encoder * encoder, encoder_frame * frame)989	C 	// encode the audio
 	obs.dll!send_audio_data(obs_encoder * encoder)1183	C
 	obs.dll!receive_audio(void * param, unsigned __int64 mix_idx, audio_data * in)1284	C
 	obs.dll!do_audio_output(audio_output * audio, unsigned __int64 mix_idx, unsigned __int64 timestamp, unsigned int frames)126	C
 	obs.dll!input_and_output(audio_output * audio, unsigned __int64 audio_time, unsigned __int64 prev_time)201	C
 	obs.dll!audio_thread(void * param)241	C
 	w32-pthreads.dll!ptw32_threadStart(void * vthreadParms)225	C

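4. Summary

My understanding of OBS's audio encoding thread is still not very deep, and many details are left out. I have roughly worked out when audio_thread is created, how it works, and how it cooperates with other threads; the finer details will have to wait for another post.

Everything above is my own understanding of the obs-studio open-source project from day-to-day work, so mistakes are inevitable. If you spot one, please point it out.

I hope this helps.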


Technical references

  1. Video course reference: https://ke.qq.com/course/3202131?flowToken=1040950
  2. OBS audio data flow (mixing, encoding, streaming): https://blog.csdn.net/liuhengxiao/article/details/83059314
  3. Introduction to audio mixing algorithms: https://blog.csdn.net/u010164190/article/details/117691952
