# Create Audio AI Companions

## Audio AI Companion - Configuration Guide

The Audio AI Editor lets creators build live audio AI Companions and customize their behavior through the configuration settings described below.

{% embed url="<https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fb2XHHmnGm9gfPrTW5NRc%2Fuploads%2FYvTdMajt2FvmnLJmsYCJ%2F2024-12-02-CreatorsAGI_Audio_AI_Companions.mp4?alt=media&token=ab718608-ce1c-4de5-bb0e-34e862682fc1>" %}

### Key Features and Adjustable Settings

#### 1. Voice Activity Detection (VAD)

Voice Activity Detection (VAD) determines when speech is present in an audio stream.

* Duration (Milliseconds): Configure the time window for detecting voice activity, from 200 ms to 2000 ms.
* How It Works:
  * A shorter duration makes VAD more sensitive to brief sounds.
  * A longer duration is useful for capturing sustained speech while reducing noise interference.
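
To make the trade-off concrete, here is a minimal sketch of a duration-gated detector. It assumes the audio has already been split into fixed-length frames, each flagged as voiced or not, and that speech is reported only once voiced frames persist for the configured duration. The function name `detect_speech_onset`, the 20 ms frame length, and the gating logic itself are illustrative assumptions, not the platform's actual implementation.

```python
from typing import Optional, Sequence

FRAME_MS = 20  # assumed frame length; the platform does not document this


def detect_speech_onset(voiced: Sequence[bool], duration_ms: int) -> Optional[int]:
    """Return the start time (ms) of the first voiced run that lasts at least
    duration_ms, or None if no run is long enough."""
    if not 200 <= duration_ms <= 2000:
        raise ValueError("duration_ms must be between 200 and 2000")
    frames_needed = -(-duration_ms // FRAME_MS)  # ceiling division
    run_start, run_len = 0, 0
    for i, is_voiced in enumerate(voiced):
        if is_voiced:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len >= frames_needed:
                return run_start * FRAME_MS
        else:
            run_len = 0  # silence resets the run
    return None


# A 300 ms burst (15 frames), 200 ms of silence, then 1 s of speech (50 frames).
frames = [True] * 15 + [False] * 10 + [True] * 50
short_window = detect_speech_onset(frames, 200)  # the brief burst qualifies
long_window = detect_speech_onset(frames, 800)   # only the sustained speech qualifies
```

With the 200 ms window the brief burst at the start already counts as speech, while the 800 ms window ignores it and triggers only on the sustained segment.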

<figure><img src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXcuYQHFEoFEd__Y5QliNczxMzydw8L-HRDx5ulJ8xNq9lVDAfzC4U-F2zk83N-xtbdrFphg_ndNUZzgscnuuw_KzDkShvZuIY5pHspEoVMeQeAYEF5Jqx6yOEHQTN3PYIJPHdGQ?key=Rwl6h3XPCxb03Fd9vdjC2oKb" alt=""><figcaption></figcaption></figure>

#### 2. Audio Silence Threshold

The silence threshold sets the minimum audio level required for the system to detect voice input.

* Threshold: A value between 0.2 and 1.0.
* How It Works:
  * Lower thresholds (e.g., 0.2) make the system sensitive to softer sounds.
  * Higher thresholds (e.g., 1.0) ensure that only loud or prominent sounds are captured, filtering out background noise.
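
The effect of the threshold can be sketched as a simple level gate. This example assumes samples are normalized to the range -1.0 to 1.0 and that the "audio level" of a frame is its peak absolute amplitude; both are assumptions made for illustration, since the guide does not specify how the platform measures level. The helper names are hypothetical.

```python
from typing import List, Sequence


def frame_level(samples: Sequence[float]) -> float:
    """Peak absolute amplitude of a frame (samples assumed in [-1.0, 1.0])."""
    return max((abs(s) for s in samples), default=0.0)


def frames_above_threshold(
    frames: Sequence[Sequence[float]], threshold: float
) -> List[bool]:
    """Flag each frame whose level meets or exceeds the silence threshold."""
    if not 0.2 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.2 and 1.0")
    return [frame_level(frame) >= threshold for frame in frames]


frames = [
    [0.05, -0.10, 0.08],  # background hum
    [0.30, -0.25, 0.28],  # soft speech
    [0.90, -0.85, 0.70],  # loud speech
]
sensitive = frames_above_threshold(frames, 0.2)  # soft and loud speech pass
strict = frames_above_threshold(frames, 0.8)     # only loud speech passes
```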

<figure><img src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXeLFxRhBGabtPMjYJLyd1pVDcbkKsLXWYZ6BQ3rKRPqEaT_do7B0O101SkDE8WUfLiO1hatO6llDJ222QIVmSFWLI_HGGKA0-bQ_CUzJGTv9F0QCBZywgocTQdg03I1QCA9XmH0zw?key=Rwl6h3XPCxb03Fd9vdjC2oKb" alt=""><figcaption></figcaption></figure>

### Other Configurable Options

* Creativity (Temperature): Adjusts the randomness of the generated output (range 0 to 2). Higher values (e.g., 1.7) produce more creative responses, while lower values produce more deterministic ones.
* Word Diversity (Top P): Controls how diverse or focused the generated responses are (range 0 to 1). Lower values produce more focused, predictable output.
* Voice Model: Selects one of the available AI voice models (e.g., "coral") to tailor the style of the audio output.
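
For intuition about the first two settings, the sketch below shows how temperature and top-p are commonly applied when a language model picks its next token. It is a general illustration, not a description of this platform's internals; the token names and scores are made up, and the sketch requires a temperature above 0 (a value of 0 is usually treated as "always pick the most likely token").

```python
import math
from typing import Dict


def apply_temperature(logits: Dict[str, float], temperature: float) -> Dict[str, float]:
    """Softmax over logits divided by temperature."""
    if temperature <= 0:
        raise ValueError("this sketch requires temperature > 0")
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(v - peak) for tok, v in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def top_p_filter(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalize the survivors."""
    kept: Dict[str, float] = {}
    cumulative = 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}


# Made-up scores for four candidate next tokens.
logits = {"hello": 2.0, "hi": 1.0, "greetings": 0.0, "yo": -1.0}
focused = apply_temperature(logits, 0.5)   # probability mass concentrates on "hello"
creative = apply_temperature(logits, 1.7)  # probability mass spreads out
narrow = top_p_filter(apply_temperature(logits, 1.0), 0.5)
wide = top_p_filter(apply_temperature(logits, 1.0), 0.95)
```

Lowering the temperature sharpens the distribution toward the top candidate, and lowering top-p shrinks the pool of candidates that can be chosen at all.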

***

### Usage Tips

* Optimizing VAD Settings:\
  Experiment with the duration to find the ideal balance between responsiveness and accuracy for your use case.
* Fine-Tuning Silence Threshold:\
  Use a lower threshold in quiet environments to capture all audio and a higher threshold in noisy spaces to focus on clear speech.
* Preview and Test:\
  Always test your configurations in the "Test" or "Preview" section to ensure your settings meet project requirements.
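
If you keep settings in your own notes or scripts before entering them in the editor, a small range check can catch out-of-range values early. The sketch below uses the ranges stated in this guide; the class name, field names, and example values are hypothetical and do not reflect the platform's actual configuration schema.

```python
from dataclasses import dataclass
from typing import List

# Ranges as stated in this guide; the keys are hypothetical field names.
RANGES = {
    "vad_duration_ms": (200, 2000),
    "silence_threshold": (0.2, 1.0),
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
}


@dataclass(frozen=True)
class AudioCompanionConfig:
    """Illustrative container for the settings described in this guide."""
    vad_duration_ms: int
    silence_threshold: float
    temperature: float
    top_p: float
    voice_model: str


def validation_errors(config: AudioCompanionConfig) -> List[str]:
    """Return one message per setting that falls outside its documented range."""
    errors = []
    for name, (low, high) in RANGES.items():
        value = getattr(config, name)
        if not low <= value <= high:
            errors.append(f"{name}={value} is outside {low}-{high}")
    if not config.voice_model:
        errors.append("voice_model must not be empty")
    return errors


# Arbitrary example values, chosen only to exercise the checks.
ok = AudioCompanionConfig(600, 0.4, 0.9, 0.8, "coral")
bad = AudioCompanionConfig(100, 0.4, 2.5, 0.8, "coral")
ok_errors = validation_errors(ok)
bad_errors = validation_errors(bad)
```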

