Base Classes
The base classes implement the core functionality shared by every backend, such as streaming, multiprocessing, and other parallelization.
stt.base.BaseEar
BaseEar(silence_seconds=2, not_interrupt_words=None, listener=None, stream=False, listen_interruptions=True, logger=None)
Initializes the BaseEar class.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
silence_seconds | float, optional | Number of seconds of silence to detect. Defaults to 2. | 2 |
not_interrupt_words | list, optional | List of words that should not be considered as interruptions. | None |
listener | object, optional | Listener object to receive the audio from. Defaults to None. | None |
stream | bool, optional | Flag indicating whether to stream the audio or process it as a whole. Defaults to False. | False |
listen_interruptions | bool, optional | Flag indicating whether to listen for interruptions. Defaults to True. | True |
Methods:

Name | Description |
---|---|
transcribe | Given an audio input, return the transcription. |
transcribe_stream | Transcribes audio chunks from a pyaudio stream queue. |
listen | Records audio using record_user and returns its transcription. |
interrupt_listen | Records audio with interruption and transcribes it if voice activity is detected. |
Attributes:

Name | Type | Description |
---|---|---|
silence_seconds | | |
not_interrupt_words | | |
vad | | |
listener | | |
stream | | |
listen_interruptions | | |
logger | | |
Source code in openvoicechat/stt/base.py
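For orientation, here is a minimal sketch of how a concrete ear might subclass BaseEar. The `DummyEar` name and the hard-coded transcription are placeholders, not part of the library; a real backend would call its STT model inside transcribe.

```python
import numpy as np
from openvoicechat.stt.base import BaseEar


class DummyEar(BaseEar):
    # Minimal placeholder: only transcribe() is overridden. A subclass built
    # for streaming (stream=True) would also need transcribe_stream().
    def __init__(self, silence_seconds=2):
        super().__init__(
            silence_seconds=silence_seconds,
            not_interrupt_words=["uh", "um"],
            stream=False,
            listen_interruptions=True,
        )

    def transcribe(self, input_audio: np.ndarray) -> str:
        # input_audio is an fp32 numpy array of the recorded audio;
        # return the transcription as a string.
        return "hello world"
```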
transcribe

Given an audio input, return the transcription.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
input_audio | ndarray | fp32 numpy array of the audio | required |

Returns:

Type | Description |
---|---|
str | transcription |
transcribe_stream
_sim_transcribe_stream

Simulates the transcribe stream using a single audio input.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
input_audio | ndarray | fp32 numpy array of the audio | required |

Returns:

Type | Description |
---|---|
str | transcription |
Source code in openvoicechat/stt/base.py
_log_event
_listen

Records audio using record_user and returns its transcription.

Returns:

Type | Description |
---|---|
str | transcription |
Source code in openvoicechat/stt/base.py
_listen_stream

Records audio using record_user and returns its transcription.

Returns:

Type | Description |
---|---|
str | transcription |
Source code in openvoicechat/stt/base.py
listen

Records audio using record_user and returns its transcription.

Returns:

Type | Description |
---|---|
str | transcription |
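A short usage sketch for listen, assuming a concrete subclass such as the `DummyEar` placeholder above and a working microphone:

```python
ear = DummyEar(silence_seconds=2)
user_text = ear.listen()  # records until silence is detected, then transcribes
print(user_text)
```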
interrupt_listen

Records audio with interruption. Transcribes the audio if voice activity is detected and returns True if the transcription indicates an interruption.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
record_seconds | | Max seconds to record for | 100 |

Returns:

Type | Description |
---|---|
str | boolean indicating whether an interruption occurred |
Source code in openvoicechat/stt/base.py
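And a hedged sketch of how interrupt_listen might be polled while the bot is speaking; the surrounding check is illustrative, not the library's actual orchestration:

```python
# Record for up to 5 seconds; the return value indicates whether the user
# actually interrupted (words in not_interrupt_words are ignored).
interruption = ear.interrupt_listen(record_seconds=5)
if interruption:
    print("User interrupted the bot")
```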
llm.base.BaseChatbot
Initialize the model and other things here
Methods:

Name | Description |
---|---|
run | Yields the response to the input text. |
post_process | Post process the response before returning. |
generate_response | Generates the response to the user input. |
generate_response_stream | Streams the response to the user input through an output queue. |

Attributes:

Name | Type | Description |
---|---|---|
logger | | |
Source code in openvoicechat/llm/base.py
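A rough sketch of a BaseChatbot subclass, assuming run is written as a generator (it is documented as "yields the response to the input text") and that the base constructor takes no required arguments; EchoChatbot and its canned reply are placeholders:

```python
from openvoicechat.llm.base import BaseChatbot


class EchoChatbot(BaseChatbot):
    # Placeholder chatbot: a real subclass would load its model here.
    def run(self, input_text: str):
        # Documented as "Yields the response to the input text", so this is
        # written as a generator that emits the reply chunk by chunk.
        for word in f"You said: {input_text}".split():
            yield word + " "

    def post_process(self, response: str) -> str:
        # Documented as "Post process the response before returning".
        return response.strip()
```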
run
post_process
generate_response
_log_event
generate_response_stream
generate_response_stream(input_text: str, output_queue: queue.Queue, interrupt_queue: queue.Queue) -> str
Parameters:

Name | Type | Description | Default |
---|---|---|---|
input_text | str | The user input | required |
output_queue | Queue | The text output queue where the result is accumulated. | required |
interrupt_queue | Queue | The interrupt queue which stores the transcription if an interruption occurred. Used to stop generating. | required |

Returns:

Type | Description |
---|---|
str | The chatbot's response after running self.post_process |
Source code in openvoicechat/llm/base.py
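A usage sketch under the same assumptions (EchoChatbot is the placeholder subclass above; in the real pipeline the two queues are shared with the ear and the mouth rather than created locally):

```python
import queue
import threading

chatbot = EchoChatbot()
text_queue = queue.Queue()       # consumed by the mouth, e.g. say_multiple_stream
interrupt_queue = queue.Queue()  # the ear puts the interrupting transcription here

worker = threading.Thread(
    target=chatbot.generate_response_stream,
    args=("tell me a joke", text_queue, interrupt_queue),
)
worker.start()
worker.join()
```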
tts.base.BaseMouth
Initializes the BaseMouth class.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
sample_rate | int | The sample rate of the audio. | required |
player | | The audio player object. Defaults to sounddevice. | sounddevice |
wait | | Whether to wait for the audio to finish playing. Defaults to True. | True |
Methods:

Name | Description |
---|---|
run_tts | Synthesizes speech for the given text. |
say_text | Calls run_tts and plays the audio using the player. |
say | Plays the audios in the queue using the player. Stops if an interruption occurred. |
say_multiple | Splits the text into sentences, then plays them one by one using run_tts() and say(). |
say_multiple_stream | Receives text from the text_queue; as soon as a sentence is complete, run_tts synthesizes its speech and sends it to the audio_queue to be played. |
Attributes:

Name | Type | Description |
---|---|---|
sample_rate | | |
interrupted | | |
player | | |
seg | | |
wait | | |
logger | | |
Source code in openvoicechat/tts/base.py
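As with the ear, a concrete mouth overrides run_tts. The SineMouth below is a placeholder that returns a beep instead of real speech; the constructor arguments follow the parameters documented above.

```python
import numpy as np
from openvoicechat.tts.base import BaseMouth


class SineMouth(BaseMouth):
    # Placeholder mouth: run_tts must return an audio numpy array that the
    # player (sounddevice by default) can play at self.sample_rate.
    def __init__(self, sample_rate=16_000):
        super().__init__(sample_rate=sample_rate)

    def run_tts(self, text: str) -> np.ndarray:
        duration = 0.15 * max(len(text.split()), 1)  # crude length heuristic
        t = np.linspace(0, duration, int(self.sample_rate * duration), endpoint=False)
        return (0.2 * np.sin(2 * np.pi * 440 * t)).astype(np.float32)
```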
run_tts

Parameters:

Name | Type | Description | Default |
---|---|---|---|
text | str | The text to synthesize speech for | required |

Returns:

Type | Description |
---|---|
ndarray | audio numpy array for sounddevice |
say_text

Calls run_tts and plays the audio using the player.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
text | str | The text to synthesize speech for | required |
Source code in openvoicechat/tts/base.py
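A minimal usage sketch (SineMouth is the placeholder subclass above):

```python
mouth = SineMouth()
mouth.say_text("Hello there!")  # synthesizes with run_tts and plays the result
```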
say

Plays the audios in the queue using the player. Stops if an interruption occurred.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
audio_queue | Queue | The queue where the audio is stored for it to be played | required |
listen_interruption_func | Callable | callable function from the ear class. | required |
Source code in openvoicechat/tts/base.py
say_multiple

Splits the text into sentences. Then plays the sentences one by one using run_tts() and say().

Parameters:

Name | Type | Description | Default |
---|---|---|---|
text | str | Input text to synthesize | required |
listen_interruption_func | Callable | callable function from the ear class | required |
Source code in openvoicechat/tts/base.py
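A hedged sketch of calling say_multiple. The docs only say that listen_interruption_func is "a callable function from the ear class", so a no-op placeholder is used here; in the real pipeline the ear supplies it.

```python
mouth.say_multiple(
    "First sentence. Second sentence. Third sentence.",
    listen_interruption_func=lambda *args, **kwargs: None,  # placeholder, never reports an interruption
)
```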
_handle_interruption
Source code in openvoicechat/tts/base.py
_get_all_text
_log_event
say_multiple_stream
say_multiple_stream(text_queue: queue.Queue, listen_interruption_func: Callable, interrupt_queue: queue.Queue, audio_queue: queue.Queue = None)
Receives text from the text_queue. As soon as a sentence is complete, run_tts is called to synthesize its speech, which is then sent to the audio_queue to be played.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
text_queue | Queue | The queue where the llm adds the predicted tokens | required |
listen_interruption_func | Callable | callable function from the ear class | required |
interrupt_queue | Queue | The queue where True is put when an interruption occurred. | required |
audio_queue | Queue | The queue where the audio to be played is placed | None |
Source code in openvoicechat/tts/base.py
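Finally, a hedged sketch of how generate_response_stream and say_multiple_stream might be wired together with shared queues and threads. EchoChatbot and SineMouth are the earlier placeholders, and the no-op listen_interruption_func again stands in for the ear callable:

```python
import queue
import threading

chatbot = EchoChatbot()
mouth = SineMouth()

text_queue = queue.Queue()       # the llm produces text here, the mouth consumes it
interrupt_queue = queue.Queue()  # shared channel used to signal interruptions

llm_thread = threading.Thread(
    target=chatbot.generate_response_stream,
    args=("tell me a short story", text_queue, interrupt_queue),
)
tts_thread = threading.Thread(
    target=mouth.say_multiple_stream,
    args=(text_queue, lambda *args, **kwargs: None, interrupt_queue),
)
llm_thread.start()
tts_thread.start()
llm_thread.join()
tts_thread.join()
```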