Streaming audio files

With the rise of AI, more and more applications require server-side audio streaming. In this article, we will show you how to stream an audio file into an ODIN room using our NodeJS SDK.

Use cases

There are many use cases for audio streaming. Here are some examples:

  • Talking with an Artificial Intelligence (AI) bot. You need to stream the audio from the user to the AI and the audio from the AI to the user.
  • Playing music in a room.
  • Playing a sound effect in a room.

Our NodeJS SDK is a good fit for these use cases. It allows you to receive audio streams from users and send audio streams back into the room. For example, you can transcribe the audio you receive from users into text with OpenAI's Whisper API, and you can use AWS Polly to turn text into audio and send it back to the room.

In all of these use cases, you can concentrate on building your application while ODIN handles the audio transport and networking for you.

Example

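This example takes an MP3 file and streams it into an ODIN room. It uses the following libraries:

  • @4players/odin-nodejs: the ODIN NodeJS SDK.
  • audio-decode: decodes the MP3 file into an audio buffer.
  • audio-buffer-stream: turns the decoded audio buffer into a stream of fixed-size chunks.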

Providing a UserData object is not required, but it's good practice and allows you to identify your bot in the room. The user data object is a JSON object that is understood by the Web client we use internally for testing. You can use that client to quickly test whether everything works. More info on the web client can be found here.
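
Since user data travels as raw bytes, it can help to see the encoding step together with its inverse. The following is a minimal sketch, not part of the ODIN SDK: encodeUserData and decodeUserData are hypothetical helper names, and only TextEncoder, TextDecoder and JSON are standard APIs. It assumes the other side also exchanges UTF-8 encoded JSON.

```javascript
// Hypothetical helpers (not part of the ODIN SDK) for the JSON user data convention.

// Serialize a plain object to UTF-8 bytes, as done before joining the room.
function encodeUserData(userData) {
  return new TextEncoder().encode(JSON.stringify(userData));
}

// Turn received bytes back into an object. Returns null for empty or
// non-JSON payloads, because other clients may use a different format.
function decodeUserData(bytes) {
  if (!bytes || bytes.length === 0) return null;
  try {
    return JSON.parse(new TextDecoder().decode(bytes));
  } catch {
    return null;
  }
}

const encoded = encodeUserData({ name: "Music Bot", userId: "Bot007" });
const decoded = decodeUserData(encoded);
console.log(decoded.name, decoded.userId);
```

Returning null instead of throwing keeps a bot running when it meets a peer whose user data is not JSON.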

const accessKey = "__YOUR_ACCESS_KEY__";
const roomName = "Lobby";
const userName = "My Bot";

// Load the odin module and other libs
import odin from '@4players/odin-nodejs';
const {OdinClient} = odin;
import fs from 'fs';
import decode from 'audio-decode';
import AudioBufferStream from 'audio-buffer-stream';

// Create an odin client instance using our access key and create a room
const userData = {
  name: "Music Bot",
  avatar: "https://avatars.dicebear.com/api/bottts/123.svg?backgroundColor=%23333333&textureChance=0&margin=10",
  seed: "123",
  userId: "Bot007",
  outputMuted: 1,
  inputMuted: 0,
  platform: "ODIN JS Bot SDK",
  version: "0.1"
}
const data = new TextEncoder().encode(JSON.stringify(userData));
const odinClient = new OdinClient(accessKey, 48000, 2);
const room = odinClient.createRoom(roomName, userName);

// Join the room
room.join("gateway.odin.4players.io", data);

// Send a message to the room
const message = {
  kind: 'message',
  payload: 'Hello, I am a music bot and will stream some music to you.'
}
room.sendMessage(new TextEncoder().encode(JSON.stringify(message)));

// Send music to the room
const sendMusic = async (media) => {
  // Prepare our MP3 decoder and load the sample file
  const audioBuffer = await decode(fs.readFileSync('./santa.mp3'));
  
  // ODIN requires 20ms chunks of audio data (i.e. 50 times a second). We need to calculate the chunk length based on 
  // the sample rate of the file by dividing it by 50. If sample rate is 48kHz, we need to send 960 samples per chunk.
  const chunkLength = audioBuffer.sampleRate/50;

  // Create a stream that will match the settings of the file
  const audioBufferStream = new AudioBufferStream({
    channels: audioBuffer.numberOfChannels,
    sampleRate: audioBuffer.sampleRate,
    float: true,
    bitDepth: 32,
    chunkLength: chunkLength 
  });

  // Create a queue to store the chunks of audio data
  const queue = [];

  // Whenever the stream has data, add it to the queue
  audioBufferStream.on('data', (data) => {
    const floats = new Float32Array(new Uint8Array(data).buffer);
    queue.push(floats);
  });

  // Start a timer to send audio data at regular intervals
  const interval = setInterval(() => {
    if (queue.length > 0) {
      const chunk = queue.shift();
      media.sendAudioData(chunk);
    } else {
      // If there's no more data to send, stop the timer
      clearInterval(interval);
      audioBufferStream.end();
      console.log("Audio finished");
    }
  }, 20);  // Send a chunk every 20ms

  audioBufferStream.write(audioBuffer);
}

// Create a media stream in the room - it will return an OdinMedia instance that we can use to send data to ODIN
const media = room.createAudioStream(48000, 2);
console.log(media);
console.log("MEDIA-ID:", media.id);

// Start the stream and send the music to ODIN
sendMusic(media).then(() => {
  console.log("Started sending audio");
});

// Wait until the user presses Enter to stop (stdin is line-buffered unless raw mode is enabled)
console.log("Press Enter to stop");
const stdin = process.stdin;
stdin.resume();
stdin.setEncoding('utf8');
stdin.on('data', () => {
  console.log("Shutting down");
  room.close();

  process.exit();
});
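
The 20 ms chunking used in the example can also be written out by hand, which makes the arithmetic easier to follow. The following is a minimal sketch under stated assumptions: samplesPerFrame and splitIntoFrames are hypothetical helpers, the input is assumed to be interleaved float samples, and the last frame is padded with silence. It does not call the ODIN SDK.

```javascript
// Hypothetical helpers illustrating the 20 ms framing; not part of the ODIN SDK.
const FRAME_MS = 20;

// Number of samples per channel in one frame, e.g. 48000 Hz -> 960.
function samplesPerFrame(sampleRate, frameMs = FRAME_MS) {
  const samples = (sampleRate * frameMs) / 1000;
  if (!Number.isInteger(samples)) {
    throw new RangeError(`Sample rate ${sampleRate} does not divide into ${frameMs} ms frames`);
  }
  return samples;
}

// Split interleaved samples into fixed-size frames.
// Assumption: `samples` is a Float32Array with channels interleaved.
function splitIntoFrames(samples, sampleRate, channels) {
  const frameLength = samplesPerFrame(sampleRate) * channels;
  const frames = [];
  for (let offset = 0; offset < samples.length; offset += frameLength) {
    // A new Float32Array is zero-filled, so a short last frame ends in silence.
    const frame = new Float32Array(frameLength);
    frame.set(samples.subarray(offset, offset + frameLength));
    frames.push(frame);
  }
  return frames;
}

const input = new Float32Array(2000).fill(1);
const frames = splitIntoFrames(input, 48000, 2);
console.log(frames.length, frames[0].length);
```

Each resulting frame could then be handed to the sending timer in place of the queue entries produced by the stream in the example above.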

Next steps

If you can send audio, you might also be interested in receiving audio and eventually transcribing it into text for content moderation or AI interaction. We have an example for that too. You can find it here.
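
Transcription services often expect a file format such as 16-bit PCM WAV rather than raw float samples. The following is a minimal sketch of that conversion step only. It assumes received audio is available as interleaved Float32Array samples in the range -1 to 1, which is an assumption and not something this article specifies. floatToPcm16 and encodeWav are hypothetical helper names, and whether a given service accepts the result needs to be checked against its documentation.

```javascript
// Hypothetical helpers; they use only standard typed arrays and DataView.

// Convert float samples in [-1, 1] to signed 16-bit integers, clamping out-of-range values.
function floatToPcm16(samples) {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    pcm[i] = Math.round(clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff);
  }
  return pcm;
}

// Wrap interleaved samples in a canonical 44-byte WAV header (PCM, 16 bit, little endian).
function encodeWav(samples, sampleRate, channels) {
  const pcm = floatToPcm16(samples);
  const dataBytes = pcm.length * 2;
  const view = new DataView(new ArrayBuffer(44 + dataBytes));
  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) view.setUint8(offset + i, text.charCodeAt(i));
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataBytes, true);             // remaining file size
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true);                        // size of the fmt chunk
  view.setUint16(20, 1, true);                         // audio format 1 = PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * channels * 2, true); // bytes per second
  view.setUint16(32, channels * 2, true);              // bytes per sample frame
  view.setUint16(34, 16, true);                        // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataBytes, true);
  for (let i = 0; i < pcm.length; i++) view.setInt16(44 + i * 2, pcm[i], true);

  return new Uint8Array(view.buffer);
}

const wav = encodeWav(new Float32Array([0, 1, -1, 2]), 48000, 2);
console.log(wav.length);
```

The returned bytes can be written to disk or uploaded as a file body, depending on what the transcription service expects.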

ODIN Bot SDK

This example is just a starting point that you can use to build your own audio streaming application. We have also built an ODIN Bot SDK in TypeScript on top of the ODIN NodeJS SDK. It provides simple interfaces for capturing and sending audio streams, and you can use it to build your own AI bots. We have published it as a separate NPM package. You can find it here.