Custom Voice AI Hosting

Overview

Purpose:
Host your own voice agents on phone calls placed through our Indian phone numbers. Use cases:

- Running conversational agents for outbound campaigns
- Hosting voice-enabled services directly over phone calls
- Using Indian phone numbers for compliance and reach

Setup

Before using this feature, make sure you:

Have your agent ready and running on a WebSocket endpoint.
Have an AwaazAI account with:
- A template set up for calls
- An Indian phone number assigned
- The WebSocket URL configured

How to Use

Once setup is complete, you can use Voice Hosting in these steps:

1. Upload Messages

Log in to the AwaazAI Portal UI.
Upload the messages you want to deliver.
The system pre-processes these messages before calls are triggered.

2. Outgoing Calls

Calls are triggered to end users via your Indian phone number.
During each call, AwaazAI opens a WebSocket connection to your agent.

3. Real-Time Audio Exchange

Once connected:
- User → AwaazAI → Agent (user’s audio is sent as encoded JSON messages).
- Agent → AwaazAI → User (agent’s audio is returned the same way).

Message Formats

AwaazAI uses pre-defined JSON formats to exchange audio and events.

Messages from AwaazAI

Start Message:

This message is sent immediately after the WebSocket connection is established.
It contains audio format information and any data the agent may need to handle the call.

{
  "event": "start",
  "sequence_number": 1,
  "stream_sid": "MZXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  "start": {
    "stream_sid": "MZXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    "call_sid": "MZXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    "media_format": {
      "encoding": "base64",
      "sample_rate": 8000,
      "bit_rate": ""
    },
    "custom_parameters": {
      "FirstName": "Jane",
      "LastName": "Doe",
      "RemoteParty": "Bob"
    }
  }
}
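
As a minimal sketch, assuming your agent is written in Python and receives each WebSocket frame as a JSON string, the start message could be unpacked as follows. The names `CallContext` and `parse_start` are hypothetical helpers for illustration and are not part of any AwaazAI SDK.

```python
import json
from dataclasses import dataclass, field


@dataclass
class CallContext:
    """Per-call state taken from the start message (hypothetical helper)."""
    stream_sid: str
    call_sid: str
    sample_rate: int
    custom_parameters: dict = field(default_factory=dict)


def parse_start(raw: str) -> CallContext:
    """Parse a start message and keep the fields the agent needs later."""
    msg = json.loads(raw)
    if msg.get("event") != "start":
        raise ValueError(f"expected a start event, got {msg.get('event')!r}")
    start = msg["start"]
    return CallContext(
        stream_sid=start["stream_sid"],
        call_sid=start["call_sid"],
        sample_rate=int(start["media_format"]["sample_rate"]),
        # custom_parameters is treated as optional here; that is an assumption
        custom_parameters=dict(start.get("custom_parameters", {})),
    )
```

Keeping `stream_sid` from this message is useful because every message you send back refers to it.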

Media Message:

AwaazAI continuously sends call audio in the JSON message format below.
Each payload contains 320 bytes of base64-encoded, 16-bit PCM audio sampled at 8 kHz.

{
  "event": "media",
  "sequence_number": 2,
  "stream_sid": "MZXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  "media": {
    "chunk": 1,
    "timestamp": "5",
    "payload": "<>"
  } 
}
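
To make the payload size concrete: 320 bytes of 16-bit samples is 160 samples, which is 20 ms of audio at 8 kHz. The sketch below decodes one media message and computes that duration. It assumes mono audio and that the 320-byte figure refers to the decoded PCM rather than the base64 text; neither is stated above. `decode_media` is a hypothetical helper name.

```python
import base64
import json
from typing import Tuple

BYTES_PER_SAMPLE = 2  # 16-bit PCM, as stated in the text above


def decode_media(raw: str, sample_rate: int = 8000) -> Tuple[bytes, float]:
    """Return (pcm_bytes, duration_ms) for one media message.

    Assumes mono audio; the channel count is not documented here.
    """
    msg = json.loads(raw)
    pcm = base64.b64decode(msg["media"]["payload"])
    samples = len(pcm) // BYTES_PER_SAMPLE
    duration_ms = samples * 1000 / sample_rate
    return pcm, duration_ms
```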

DTMF Message:

This message is sent when the user presses a digit during the call.

{
  "event": "dtmf",
  "stream_sid": "MZXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  "sequence_number": 3,
  "dtmf": {
    "digit": "<>",
    "duration": "<duration in ms>"
  }
}
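
One possible way to use DTMF messages is to collect digits until the user presses a terminator key. The sketch below assumes `#` arrives as a `digit` value like any other key, which is not confirmed above; `DtmfCollector` is a hypothetical helper.

```python
import json
from typing import List, Optional


class DtmfCollector:
    """Accumulate pressed digits until a terminator key arrives (hypothetical helper)."""

    def __init__(self, terminator: str = "#") -> None:
        self.terminator = terminator  # assumption: "#" is delivered as a digit
        self._digits: List[str] = []

    def feed(self, raw: str) -> Optional[str]:
        """Feed one message; return the collected digits when the terminator is pressed."""
        msg = json.loads(raw)
        if msg.get("event") != "dtmf":
            return None  # ignore non-DTMF messages
        digit = str(msg["dtmf"]["digit"])
        if digit == self.terminator:
            collected = "".join(self._digits)
            self._digits.clear()
            return collected
        self._digits.append(digit)
        return None
```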

Mark Message:

This message notifies your agent that media processing has completed.

{
  "event": "mark",
  "sequence_number": "4",
  "stream_sid": "MZXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  "mark": {
    "name": "my label"
  }
}

Stop Message:

This message is sent when the call has ended.

{
  "event": "stop",
  "sequence_number": 5,
  "stream_sid": "MZXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  "stop": {
    "call_sid": "MZXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
  }
}
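
The five message types above all carry an `event` field, so an agent can route them with a small dispatcher. This is a sketch under that assumption only; `EventRouter` is a hypothetical class, and the WebSocket transport itself is left out.

```python
import json
from typing import Callable, Dict

Handler = Callable[[dict], None]


class EventRouter:
    """Route incoming messages to handlers keyed by their "event" field."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}
        self.last_sequence = 0

    def on(self, event: str, handler: Handler) -> None:
        """Register the handler for one event name, e.g. "media" or "stop"."""
        self._handlers[event] = handler

    def dispatch(self, raw: str) -> bool:
        """Return True if a handler ran, False for unregistered events."""
        msg = json.loads(raw)
        # sequence_number appears as both a number and a string in the
        # samples above, so it is normalized to int here.
        if "sequence_number" in msg:
            self.last_sequence = int(msg["sequence_number"])
        handler = self._handlers.get(msg.get("event"))
        if handler is None:
            return False
        handler(msg)
        return True
```

A typical agent would register one handler each for `start`, `media`, `dtmf`, `mark`, and `stop`, and call `dispatch` for every frame it receives.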

Messages to AwaazAI

Messages sent to AwaazAI should follow the formats below.

Media Message:

Your WebSocket server sends this message with the audio to be played on the call.

{
  "event": "media",
  "sequence_number": 1,
  "stream_sid": "MZXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  "media": {
    "chunk": 1,
    "timestamp": "5",
    "payload": "<>"
  } 
}
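
A rough sketch of how raw PCM audio could be split into media messages of this shape is shown below. The chunk size of 320 bytes mirrors the inbound payload size and is an assumption, as is the meaning of `timestamp` (treated here as a millisecond offset from the start of the audio). `media_messages` is a hypothetical helper.

```python
import base64
import json
from typing import Iterator

CHUNK_BYTES = 320   # assumption: same size as inbound payloads
BYTES_PER_MS = 16   # 8000 samples/s * 2 bytes per sample / 1000


def media_messages(pcm: bytes, stream_sid: str, start_sequence: int = 1) -> Iterator[str]:
    """Yield JSON media messages covering pcm in CHUNK_BYTES pieces."""
    for index, offset in enumerate(range(0, len(pcm), CHUNK_BYTES)):
        chunk = pcm[offset:offset + CHUNK_BYTES]
        yield json.dumps({
            "event": "media",
            "sequence_number": start_sequence + index,
            "stream_sid": stream_sid,
            "media": {
                "chunk": index + 1,
                # assumption: timestamp is the chunk's offset in milliseconds
                "timestamp": str(offset // BYTES_PER_MS),
                "payload": base64.b64encode(chunk).decode("ascii"),
            },
        })
```

The last chunk may be shorter than `CHUNK_BYTES`; whether AwaazAI expects padding in that case is not covered here.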

Mark Message:

Your WebSocket server sends this message to mark a point in the media stream. Once the audio has been processed, you will receive a mark event message from AwaazAI with a matching name.

{
  "event": "mark",
  "sequence_number": 2,
  "stream_sid": "MZXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  "mark": {
    "name": "my label"
  }
}
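
Because the mark comes back with a matching name, an agent can keep a set of outstanding marks and tick them off as they return. The sketch below illustrates that bookkeeping; `MarkTracker` is a hypothetical helper and the choice of mark names is up to you.

```python
import json


class MarkTracker:
    """Track marks sent to AwaazAI until the matching mark event comes back."""

    def __init__(self, stream_sid: str) -> None:
        self.stream_sid = stream_sid
        self.pending = set()

    def build_mark(self, name: str, sequence_number: int) -> str:
        """Record the mark as pending and return the JSON message to send."""
        self.pending.add(name)
        return json.dumps({
            "event": "mark",
            "sequence_number": sequence_number,
            "stream_sid": self.stream_sid,
            "mark": {"name": name},
        })

    def acknowledge(self, raw: str) -> bool:
        """Return True if raw is a mark event for a mark we were waiting on."""
        msg = json.loads(raw)
        if msg.get("event") != "mark":
            return False
        name = msg["mark"]["name"]
        if name not in self.pending:
            return False
        self.pending.discard(name)
        return True
```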

Clear Message:

Send this message to clear audio that has been sent but not played yet.

{
  "event": "clear",
  "stream_sid": "MZXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
}
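
A common reason to clear audio is an interruption, where the user starts speaking while the agent is still talking. As a sketch of that pattern, the hypothetical `PlaybackQueue` below drops audio the agent has queued locally and returns the clear message to send, so that audio already delivered to AwaazAI but not yet played is discarded as well.

```python
import json
from collections import deque


class PlaybackQueue:
    """Hold outbound messages not yet sent, with support for interruption."""

    def __init__(self, stream_sid: str) -> None:
        self.stream_sid = stream_sid
        self.unsent = deque()

    def enqueue(self, message: str) -> None:
        """Queue one outbound JSON message for later sending."""
        self.unsent.append(message)

    def interrupt(self) -> str:
        """Drop locally queued audio and return the clear message to send."""
        self.unsent.clear()
        return json.dumps({"event": "clear", "stream_sid": self.stream_sid})
```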

Example Client Code for Your Application

We provide a sample client implementation in our GitHub repository. This boilerplate demonstrates how to interact with the application by:

Receiving audio chunks in JSON messages.
Accumulating a certain number of chunks.
Sending audio from an audio file in chunks.

Note: This example is intended as a reference. You should refactor it to fit the specific requirements of your agent or application.

You can explore the full sample and get started quickly by cloning the repo:
View Example Client Code
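
Independently of the repository sample, the "accumulate a certain number of chunks" step could look roughly like the sketch below. `ChunkAccumulator` and its default batch size are hypothetical; with 20 ms chunks (the mono assumption again), 50 chunks would be about one second of audio.

```python
from typing import List, Optional


class ChunkAccumulator:
    """Buffer decoded audio chunks and release them in fixed-size batches."""

    def __init__(self, chunks_per_batch: int = 50) -> None:
        if chunks_per_batch < 1:
            raise ValueError("chunks_per_batch must be at least 1")
        self.chunks_per_batch = chunks_per_batch
        self._chunks: List[bytes] = []

    def add(self, pcm: bytes) -> Optional[bytes]:
        """Add one chunk; return the joined batch once enough have arrived."""
        self._chunks.append(pcm)
        if len(self._chunks) < self.chunks_per_batch:
            return None
        batch = b"".join(self._chunks)
        self._chunks.clear()
        return batch
```

Batching like this trades latency for fewer calls into downstream processing such as speech recognition, so the batch size is worth tuning for your agent.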
