Raw Audio Data

Introduction

During the audio transmission process, you can pre- and post-process the captured audio data to achieve the desired playback effect.

Agora provides the raw data function for you to process the audio data according to your scenarios. This function enables you to pre-process the captured audio signal before sending it to the encoder, or to post-process the decoded audio signal.

Sample project

Agora provides an open-source sample project that implements processing raw audio data using Java APIs on GitHub. You can try the demo and view the source code.

Implementation

Before using the raw data functions, ensure that you have implemented the basic real-time audio functions in your project.

Process raw audio data using Java APIs

To call Java APIs in your project to implement the raw audio data functions, do the following:

Before joining a channel, create an IAudioFrameObserver object and then call registerAudioFrameObserver to register an audio frame observer.
After you successfully register the audio frame observer, the SDK triggers the getRecordAudioParams, getPlaybackAudioParams, or getMixedAudioParams callbacks. You can set the desired audio data format in the return values of these callbacks.
After you join the channel, the SDK triggers the getObservedAudioFramePosition and isMultipleChannelFrameWanted callbacks when capturing each audio frame. In the return values of these callbacks, you can set the audio observation positions and whether to receive raw audio data from multiple channels.
According to the return values of getObservedAudioFramePosition and isMultipleChannelFrameWanted, the SDK triggers the onRecordFrame, onPlaybackFrame, onPlaybackFrameBeforeMixing/onPlaybackFrameBeforeMixingEx, or onMixedFrame callbacks to send you the captured raw audio data.
Process the captured audio data according to your scenarios. You can send the processed audio data with the onRecordFrame, onPlaybackFrame, onPlaybackFrameBeforeMixing/onPlaybackFrameBeforeMixingEx or onMixedFrame callbacks according to your scenarios.

API call sequence

The following diagram shows how to implement the raw audio data function in your project:

1635734726061

Sample code

// Define the readBuffer method to read the audio buffer of the local audio file.
private byte[] readBuffer(){
    int byteSize = SAMPLES_PER_CALL * BIT_PER_SAMPLE / 8;
    byte[] buffer = new byte[byteSize];
    try {
        if(inputStream.read(buffer) < 0){
            inputStream.reset();
            return readBuffer();
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return buffer;
}
// Define the audioAggregate method to mix the audio data from the onRecordFrame callback with the audio buffer of the local audio file.
private byte[] audioAggregate(byte[] origin, byte[] buffer) {
    byte[] output = new byte[origin.length];
    for (int i = 0; i < origin.length; i++) {
        output[i] = (byte) ((int) origin[i] + (int) buffer[i] / 2);
    }
    return output;
}
// Implement an IAudioFrameObserver class.
private final IAudioFrameObserver audioFrameObserver = new IAudioFrameObserver() {
    // Implement the getObservedAudioFramePosition callback. Set the audio observation position as POSITION_RECORD in the return value of this callback, which enables the SDK to trigger the onRecordFrame callback.
    @Override
    public int getObservedAudioFramePosition() {
        return IAudioFrameObserver.POSITION_RECORD;
    }
    // Implement the getRecordAudioParams callback. Set the audio recording format in the return value of this callback for the onRecordFrame callback.
    @Override
    public AudioParams getRecordAudioParams() {
        return new AudioParams(SAMPLE_RATE, SAMPLE_NUM_OF_CHANNEL, Constants.RAW_AUDIO_FRAME_OP_MODE_READ_WRITE, SAMPLES_PER_CALL);
    }
    // Implement the onRecordFrame callback, get audio data from the callback, and send the data to the SDK after mixing it with the local audio file.
    @Override
    public boolean onRecordFrame(AudioFrame audioFrame) {
        Log.i(TAG, "onRecordAudioFrame " + isWriteBackAudio);
        if(isWriteBackAudio){
            ByteBuffer byteBuffer = audioFrame.samples;
            byte[] buffer = readBuffer();
            byte[] origin = new byte[byteBuffer.remaining()];
            byteBuffer.get(origin);
            byteBuffer.flip();
            byteBuffer.put(audioAggregate(origin, buffer), 0, byteBuffer.remaining());
        }
        return true;
    }
};
// Pass the IAudioFrameObserver and register the audio observer.
engine.registerAudioFrameObserver(audioFrameObserver);

API reference

Process raw audio data using JNI and C++ APIs

Before using the raw data function, ensure that you have implemented the basic real-time audio function in your project.

The Agora C++ SDK provides the IAudioFrameObserver class to capture and modify raw audio data. Therefore, you can use Java to call the C++ API via the JNI (Java Native Interface). Since the Video SDK for Java encapsulates the Video SDK for C++, you can include the .h file in the SDK to directly call the C++ methods.

Follow these steps to implement the raw audio data function in your project:

Use the JNI and C++ interface files to generate a shared library in the project, and use Java to call the raw audio data interface of the Agora C++ SDK.
Before joining a channel, call the registerAudioFrameObserver method to register an audio observer, and implement an IAudioFrameObserver class in this method.
After you successfully register the observer, the SDK sends the captured raw audio data via the onRecordAudioFrame, onPlaybackAudioFrame, onPlaybackAudioFrameBeforeMixing, or onMixedAudioFrame callbacks.
Process the captured raw audio data according to your needs. Then, you can either play it yourself directly or send it to the SDK via the callbacks mentioned in step 3 per your requirements.

Call the Agora C++ API in a Java project

The following diagram shows the basic flow of calling the Agora C++ API in a Java project:

1607913364400

The Java project loads the .so library built from the C++ interface file (.cpp file) via the Java interface file.
The Java interface file generates a .h file with the javac -h -jni command. The C++ interface file should include this file.
The C++ interface file calls the C++ method of the .so library in the Agora Android SDK by including the header files from the Agora Android SDK.

API call sequence

The following diagram shows how to implement the raw audio data function in your project:

The registerAudioFrameObserver, onRecordAudioFrame, onPlaybackAudioFrame, onMixedAudioFrame, and onPlaybackAudioFrameBeforeMixing are all C++ methods and callbacks.

1607913459953

Sample code

Create a JNI interface

Create a Java interface file and a C++ interface file separately via the JNI interface. Make sure to build the C++ interface file as a .so library.

Create a Java interface file to call the C++ API. The interface file should declare the relevant Java methods for calling C++. Refer to the MediaPreProcessing.java file in the sample project for the implementation.

// The Java interface file declares the relevant Java methods for calling C++.
package io.agora.advancedvideo.rawdata;
import java.nio.ByteBuffer;
public class MediaPreProcessing {
    static {
        // Loads the C++ .so library. Build the C++ interface file to generate the .so library.
        // The name of the .so library depends on the library name generated by building the C++ interface file.
        System.loadLibrary("apm-plugin-raw-data");
    }
    // Define the local method
    public interface ProgressCallback {
       ...
        // Get the recorded audio frame
        void onRecordAudioFrame(int audioFrameType, int samples, int bytesPerSample, int channels, int samplesPerSec, long renderTimeMs, int bufferLength);
        // Get the playback audio frame
        void onPlaybackAudioFrame(int audioFrameType, int samples, int bytesPerSample, int channels, int samplesPerSec, long renderTimeMs, int bufferLength);
        // Get the playback audio frame before mixing
        void onPlaybackAudioFrameBeforeMixing(int uid, int audioFrameType, int samples, int bytesPerSample, int channels, int samplesPerSec, long renderTimeMs, int bufferLength);
        // Get the mixed audio frame
        void onMixedAudioFrame(int audioFrameType, int samples, int bytesPerSample, int channels, int samplesPerSec, long renderTimeMs, int bufferLength);
    }
    public static native void setCallback(ProgressCallback callback);
    public static native void setAudioRecordByteBuffer(ByteBuffer byteBuffer);
    public static native void setAudioPlayByteBuffer(ByteBuffer byteBuffer);
    public static native void setBeforeAudioMixByteBuffer(ByteBuffer byteBuffer);
    public static native void setAudioMixByteBuffer(ByteBuffer byteBuffer);
    public static native void releasePoint();
}

Run the following command to generate a .h file from the Java interface file:

# JDK 10 or later
javac -h -jni MediaPreProcessing.java
# JDK 9 or earlier
javac MediaPreProcessing.java
javah -jni MediaPreProcessing.class

Create a C++ interface file. The C++ interface file exports the corresponding methods from the C++ SDK based on the generated .h file. Refer to the io_agora_advancedvideo_rawdata_MediaPreProcessing.cpp file in the sample project for the implementation.

The JNI defines the data structure in C++ that maps to Java. You can refer to this document for more information.

// Global variables
jobject gCallBack = nullptr;
jclass gCallbackClass = nullptr;
// Method IDs at the Java level
jmethodID recordAudioMethodId = nullptr;
jmethodID playbackAudioMethodId = nullptr;
jmethodID playBeforeMixAudioMethodId = nullptr;
jmethodID mixAudioMethodId = nullptr;
// ByteBuffer for audio frames from onRecordAudioFrame
void *_javaDirectPlayBufferRecordAudio = nullptr;
// ByteBuffer for audio frames from onPlaybackAudioFrame
void *_javaDirectPlayBufferPlayAudio = nullptr;
// ByteBuffer for audio frames from onPlaybackAudioFrameBeforeMixing
void *_javaDirectPlayBufferBeforeMixAudio = nullptr;
// ByteBuffer for audio frames from onMixedAudioFrame
void *_javaDirectPlayBufferMixAudio = nullptr;
map<int, void *> decodeBufferMap;
static JavaVM *gJVM = nullptr;
// Implement the IAudioFrameObserver class and related callbacks
class AgoraAudioFrameObserver : public agora::media::IAudioFrameObserver
{
public:
    AgoraAudioFrameObserver()
    {
        gCallBack = nullptr;
    }
    ~AgoraAudioFrameObserver()
    {
    }
    // Get audio frames from the AudioFrame object, copy to the ByteBuffer, and call the Java method by method ID
    void getAudioFrame(AudioFrame &audioFrame, _jmethodID *jmethodID, void *_byteBufferObject,
                       unsigned int uid)
    {
        if (_byteBufferObject == nullptr)
        {
            return;
        }
        AttachThreadScoped ats(gJVM);
        JNIEnv *env = ats.env();
        if (env == nullptr)
        {
            return;
        }
        int len = audioFrame.samples * audioFrame.bytesPerSample;
        memcpy(_byteBufferObject, audioFrame.buffer, (size_t) len); // * sizeof(int16_t)
        if (uid == 0)
        {
            env->CallVoidMethod(gCallBack, jmethodID, audioFrame.type, audioFrame.samples,
                                audioFrame.bytesPerSample,
                                audioFrame.channels, audioFrame.samplesPerSec,
                                audioFrame.renderTimeMs, len);
        } else
        {
            env->CallVoidMethod(gCallBack, jmethodID, uid, audioFrame.type, audioFrame.samples,
                                audioFrame.bytesPerSample,
                                audioFrame.channels, audioFrame.samplesPerSec,
                                audioFrame.renderTimeMs, len);
        }
    }
    // Copies the audio frames from the ByteBuffer to the AudioFrame object
    void writebackAudioFrame(AudioFrame &audioFrame, void *byteBuffer)
    {
        if (byteBuffer == nullptr)
        {
            return;
        }
        int len = audioFrame.samples * audioFrame.bytesPerSample;
        memcpy(audioFrame.buffer, byteBuffer, (size_t) len);
    }
public:
    // Implement the onRecordAudioFrame callback
    virtual bool onRecordAudioFrame(AudioFrame &audioFrame) override
    {
        // Gets the recorded audio frames
        getAudioFrame(audioFrame, recordAudioMethodId, _javaDirectPlayBufferRecordAudio, 0);
        // Sends the audio frames to the SDK
        writebackAudioFrame(audioFrame, _javaDirectPlayBufferRecordAudio);
        return true;
    }
    // Implement the onPlaybackAudioFrame callback
    virtual bool onPlaybackAudioFrame(AudioFrame &audioFrame) override
    {
        // Gets the playback audio frames
        getAudioFrame(audioFrame, playbackAudioMethodId, _javaDirectPlayBufferPlayAudio, 0);
        // Sends the audio frames to the SDK
        writebackAudioFrame(audioFrame, _javaDirectPlayBufferPlayAudio);
        return true;
    }
   // Implement the onPlaybackAudioFrameBeforeMixing callback
    virtual bool onPlaybackAudioFrameBeforeMixing(unsigned int uid, AudioFrame &audioFrame) override
    {
        // Gets the playback audio frames before mixing
        getAudioFrame(audioFrame, playBeforeMixAudioMethodId, _javaDirectPlayBufferBeforeMixAudio,
                      uid);
        // Sends the audio frames to the SDK
        writebackAudioFrame(audioFrame, _javaDirectPlayBufferBeforeMixAudio);
        return true;
    }
    // Implement the onMixedAudioFrame callback
    virtual bool onMixedAudioFrame(AudioFrame &audioFrame) override
    {
        // Gets the mixed audio frames
        getAudioFrame(audioFrame, mixAudioMethodId, _javaDirectPlayBufferMixAudio, 0);
        // Sends the audio frames to the SDK
        writebackAudioFrame(audioFrame, _javaDirectPlayBufferMixAudio);
        return true;
    }
};
...
// AgoraAudioFrameObserver object
static AgoraAudioFrameObserver s_audioFrameObserver;
// IRtcEngine object
static agora::rtc::IRtcEngine *rtcEngine = nullptr;
// Set up the C++ interface
#ifdef __cplusplus
extern "C" {
#endif
int __attribute__((visibility("default")))
loadAgoraRtcEnginePlugin(agora::rtc::IRtcEngine *engine)
{
    __android_log_print(ANDROID_LOG_DEBUG, "agora-raw-data-plugin", "loadAgoraRtcEnginePlugin");
    rtcEngine = engine;
    return 0;
}
void __attribute__((visibility("default")))
unloadAgoraRtcEnginePlugin(agora::rtc::IRtcEngine *engine)
{
    __android_log_print(ANDROID_LOG_DEBUG, "agora-raw-data-plugin", "unloadAgoraRtcEnginePlugin");
    rtcEngine = nullptr;
}
...
// For the Java interface file, use the JNI to export corresponding C++ methods. The Java_io_agora_advancedvideo_rawdata_MediaPreProcessing_setCallback method corresponds to the setCallback method in the Java interface file.
JNIEXPORT void JNICALL Java_io_agora_advancedvideo_rawdata_MediaPreProcessing_setCallback
        (JNIEnv *env, jclass, jobject callback)
{
    if (!rtcEngine) return;
    env->GetJavaVM(&gJVM);
    // Create an AutoPtr instance that uses the IMediaEngine class as the template
    agora::util::AutoPtr<agora::media::IMediaEngine> mediaEngine;
    // The AutoPtr instance calls the queryInterface method to get a pointer to the IMediaEngine instance from the IID.
    // The AutoPtr instance accesses the pointer to the IMediaEngine instance via the arrow operator and calls the registerVideoFrameObserver method via the IMediaEngine instance.
    mediaEngine.queryInterface(rtcEngine, agora::INTERFACE_ID_TYPE::AGORA_IID_MEDIA_ENGINE);
    if (mediaEngine)
    {
        ...
        // Register the audio frame observer
        int ret = mediaEngine->registerAudioFrameObserver(&s_audioFrameObserver);
    }
   if (gCallBack == nullptr)
    {
        gCallBack = env->NewGlobalRef(callback);
        gCallbackClass = env->GetObjectClass(gCallBack);
        // Get the MethodId of each callback function through the callback object
        recordAudioMethodId = env->GetMethodID(gCallbackClass, "onRecordAudioFrame", "(IIIIIJI)V");
        playbackAudioMethodId = env->GetMethodID(gCallbackClass, "onPlaybackAudioFrame",
                                                 "(IIIIIJI)V");
        playBeforeMixAudioMethodId = env->GetMethodID(gCallbackClass,
                                                      "onPlaybackAudioFrameBeforeMixing",
                                                      "(IIIIIIJI)V");
        mixAudioMethodId = env->GetMethodID(gCallbackClass, "onMixedAudioFrame", "(IIIIIJI)V");
        ...
        __android_log_print(ANDROID_LOG_DEBUG, "setCallback", "setCallback done successfully");
    }
}
...
// C++ implementation of setAudioRecordByteBuffer in the Java interface file
JNIEXPORT void JNICALL
Java_io_agora_advancedvideo_rawdata_MediaPreProcessing_setAudioRecordByteBuffer
        (JNIEnv *env, jclass, jobject bytebuffer)
{
    _javaDirectPlayBufferRecordAudio = env->GetDirectBufferAddress(bytebuffer);
}
// C++ implementation of setAudioPlayByteBuffer in the Java interface file
JNIEXPORT void JNICALL Java_io_agora_advancedvideo_rawdata_MediaPreProcessing_setAudioPlayByteBuffer
        (JNIEnv *env, jclass, jobject bytebuffer)
{
    _javaDirectPlayBufferPlayAudio = env->GetDirectBufferAddress(bytebuffer);
}
//  C++ implementation of setBeforeAudioMixByteBuffer in the Java interface file
JNIEXPORT void JNICALL
Java_io_agora_advancedvideo_rawdata_MediaPreProcessing_setBeforeAudioMixByteBuffer
        (JNIEnv *env, jclass, jobject bytebuffer)
{
    _javaDirectPlayBufferBeforeMixAudio = env->GetDirectBufferAddress(bytebuffer);
}
//  C++ implementation of setAudioMixByteBuffer in the Java interface file
JNIEXPORT void JNICALL Java_io_agora_advancedvideo_rawdata_MediaPreProcessing_setAudioMixByteBuffer
        (JNIEnv *env, jclass, jobject bytebuffer)
{
    _javaDirectPlayBufferMixAudio = env->GetDirectBufferAddress(bytebuffer);
}
}
...
#ifdef __cplusplus
}
#endif

Build the C++ interface file via the NDK to generate a .so library. Use the System.loadLibrary() method to load the generated .so library in the Java interface file. See the following CMake file.

cmake_minimum_required(VERSION 3.4.1)
add_library( # Sets the name of the library.
        apm-plugin-raw-data
        # Sets the library as a shared library.
        SHARED
        # Provides a relative path to your source file(s).
        src/main/cpp/io_agora_advancedvideo_rawdata_MediaPreProcessing.cpp)
find_library( # Sets the name of the path variable.
        log-lib
        # Specifies the name of the NDK library that
        # you want CMake to locate.
        log)
target_link_libraries( # Specifies the target library.
        apm-plugin-raw-data
        # Links the target library to the log library
        # included in the NDK.
        ${log-lib})

Implement the raw audio data function in a Java project

Implement an interface that maps to the C++ methods in a Java interface file.

// Implement the ProgressCallback interface in Java
public class MediaDataObserverPlugin implements MediaPreProcessing.ProgressCallback {
    ...
   // Get the recorded audio frame
   @Override
    public void onRecordAudioFrame(int audioFrameType, int samples, int bytesPerSample, int channels, int samplesPerSec, long renderTimeMs, int bufferLength) {
        byte[] buf = new byte[bufferLength];
        byteBufferAudioRecord.limit(bufferLength);
        byteBufferAudioRecord.get(buf);
        byteBufferAudioRecord.flip();
        for (MediaDataAudioObserver observer : audioObserverList) {
            observer.onRecordAudioFrame(buf, audioFrameType, samples, bytesPerSample, channels, samplesPerSec, renderTimeMs, bufferLength);
        }
        byteBufferAudioRecord.put(buf);
        byteBufferAudioRecord.flip();
    }
    // Get the playback audio frame
    @Override
    public void onPlaybackAudioFrame(int audioFrameType, int samples, int bytesPerSample, int channels, int samplesPerSec, long renderTimeMs, int bufferLength) {
        byte[] buf = new byte[bufferLength];
        byteBufferAudioPlay.limit(bufferLength);
        byteBufferAudioPlay.get(buf);
        byteBufferAudioPlay.flip();
        for (MediaDataAudioObserver observer : audioObserverList) {
            observer.onPlaybackAudioFrame(buf, audioFrameType, samples, bytesPerSample, channels, samplesPerSec, renderTimeMs, bufferLength);
        }
        byteBufferAudioPlay.put(buf);
        byteBufferAudioPlay.flip();
    }
    // Get the playback audio frame before mixing
    @Override
    public void onPlaybackAudioFrameBeforeMixing(int uid, int audioFrameType, int samples, int bytesPerSample, int channels, int samplesPerSec, long renderTimeMs, int bufferLength) {
        byte[] buf = new byte[bufferLength];
        byteBufferBeforeAudioMix.limit(bufferLength);
        byteBufferBeforeAudioMix.get(buf);
        byteBufferBeforeAudioMix.flip();
        for (MediaDataAudioObserver observer : audioObserverList) {
            observer.onPlaybackAudioFrameBeforeMixing(uid, buf, audioFrameType, samples, bytesPerSample, channels, samplesPerSec, renderTimeMs, bufferLength);
        }
        byteBufferBeforeAudioMix.put(buf);
        byteBufferBeforeAudioMix.flip();
    }
    // Get the mixed audio frame
    @Override
    public void onMixedAudioFrame(int audioFrameType, int samples, int bytesPerSample, int channels, int samplesPerSec, long renderTimeMs, int bufferLength) {
        byte[] buf = new byte[bufferLength];
        byteBufferAudioMix.limit(bufferLength);
        byteBufferAudioMix.get(buf);
        byteBufferAudioMix.flip();
        for (MediaDataAudioObserver observer : audioObserverList) {
            observer.onMixedAudioFrame(buf, audioFrameType, samples, bytesPerSample, channels, samplesPerSec, renderTimeMs, bufferLength);
        }
        byteBufferAudioMix.put(buf);
        byteBufferAudioMix.flip();
    }
 }

Call the setCallback method. The setCallback method calls the registerAudioFrameObserver C++ method via JNI to register an audio frame observer.

@Override
   public void onActivityCreated(@Nullable Bundle savedInstanceState) {
       super.onActivityCreated(savedInstanceState);
       mediaDataObserverPlugin = MediaDataObserverPlugin.the();
       // Registers the audio frame observer
       MediaPreProcessing.setCallback(mediaDataObserverPlugin);
       ...
   }

Implement the onRecordAudioFrame, onPlaybackAudioFrame, onPlaybackAudioFrameBeforeMixing, and onMixedAudioFrame callbacks. Get the audio frames from the callbacks, and process the audio frames.

// Get the recorded audio frame
@Override
 public void onRecordAudioFrame(byte[] data, int audioFrameType, int samples, int bytesPerSample, int channels, int samplesPerSec, long renderTimeMs, int bufferLength) {
 }
 // Get the playback audio frame
 @Override
 public void onPlaybackAudioFrame(byte[] data, int audioFrameType, int samples, int bytesPerSample, int channels, int samplesPerSec, long renderTimeMs, int bufferLength) {
 }
// Get the playback audio frame before mixing
 @Override
 public void onPlaybackAudioFrameBeforeMixing(int uid, byte[] data, int audioFrameType, int samples, int bytesPerSample, int channels, int samplesPerSec, long renderTimeMs, int bufferLength) {
 }
// Get the mixed audio frame
 @Override
 public void onMixedAudioFrame(byte[] data, int audioFrameType, int samples, int bytesPerSample, int channels, int samplesPerSec, long renderTimeMs, int bufferLength) {
 }