app_confbridge: Update dsp_silence_threshold and dsp_talking_threshold docs.
The dsp_talking_threshold does not represent time in milliseconds. It represents the average magnitude per sample in the audio packets. This is what the DSP uses to determine if a packet is silence or talking/noise.

Change-Id: If6f939c100eb92a5ac6c21236559018eeaf58443
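The distinction drawn in the commit message can be sketched in C. This is only an illustration under stated assumptions, not Asterisk's DSP code: the `talk_detector` struct and both function names are hypothetical, and the samples are assumed to be 16-bit signed linear audio. A frame counts as talking/noise when the average absolute sample value is at or above the talking threshold; otherwise its duration in ms is added to a silence counter, and talking is declared stopped once that counter reaches the silence threshold.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical detector state; field names are illustrative, not Asterisk's. */
struct talk_detector {
    int talking_threshold; /* minimum average magnitude per sample (not a time) */
    int silence_threshold; /* ms of silence needed to declare talking stopped */
    int total_silence_ms;  /* silence accumulated since the last talking frame */
    int talking;           /* non-zero while the user is considered talking */
};

/* Average absolute value of the 16-bit samples in one frame. */
int frame_avg_magnitude(const int16_t *samples, size_t len)
{
    int64_t sum = 0;
    size_t i;

    assert(samples != NULL);
    if (len == 0) {
        return 0;
    }
    for (i = 0; i < len; i++) {
        /* Widen before negating so that -32768 does not overflow. */
        sum += samples[i] < 0 ? -(int64_t)samples[i] : (int64_t)samples[i];
    }
    return (int)(sum / (int64_t)len);
}

/* Feed one frame; returns non-zero while the user is considered talking. */
int talk_detector_feed(struct talk_detector *td, const int16_t *samples,
                       size_t len, int sample_rate)
{
    /* Frame duration in ms; treated as 0 if the sample rate is invalid. */
    int frame_ms = sample_rate > 0 ? (int)(len * 1000 / (size_t)sample_rate) : 0;

    if (frame_avg_magnitude(samples, len) >= td->talking_threshold) {
        /* Talking/noise: reset the silence counter. */
        td->total_silence_ms = 0;
        td->talking = 1;
    } else {
        /* Silence: accumulate time and compare against the ms threshold. */
        td->total_silence_ms += frame_ms;
        if (td->total_silence_ms >= td->silence_threshold) {
            td->talking = 0;
        }
    }
    return td->talking;
}
```

With 20 ms frames at 8000 Hz (160 samples each), a talking threshold of 160 and a silence threshold of 2500, this sketch keeps reporting talking through 124 consecutive quiet frames and declares talking stopped on the 125th.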
parent 6c5e3226ec
commit b9024197ab
@@ -144,72 +144,66 @@
 </para></description>
 </configOption>
 <configOption name="dsp_silence_threshold">
-<synopsis>The number of milliseconds of detected silence necessary to trigger silence detection</synopsis>
-<description><para>
-The time in milliseconds of sound falling within the what
-the dsp has established as baseline silence before a user
-is considered be silent. This value affects several
-operations and should not be changed unless the impact
-on call quality is fully understood.</para>
-<para>What this value affects internally:</para>
-<para>
-1. When talk detection AMI events are enabled, this value
+<synopsis>The number of milliseconds of silence necessary to declare talking stopped.</synopsis>
+<description>
+<para>The time in milliseconds of sound falling below the
+<replaceable>dsp_talking_threshold</replaceable> option when
+a user is considered to stop talking. This value affects several
+operations and should not be changed unless the impact on call
+quality is fully understood.
+</para>
+<para>What this value affects internally:
+</para>
+<para>1. When talk detection AMI events are enabled, this value
 determines when the user has stopped talking after a
 period of talking. If this value is set too low
 AMI events indicating the user has stopped talking
 may get falsely sent out when the user briefly pauses
 during mid sentence.
-</para>
-<para>
-2. The <replaceable>drop_silence</replaceable> option depends on this value to
-determine when the user's audio should begin to be
-dropped from the conference bridge after the user
+</para>
+<para>2. The <replaceable>drop_silence</replaceable> option
+depends on this value to determine when the user's audio should
+begin to be dropped from the conference bridge after the user
 stops talking. If this value is set too low the user's
-audio stream may sound choppy to the other participants.
-This is caused by the user transitioning constantly from
-silence to talking during mid sentence.
-</para>
-<para>
-The best way to approach this option is to set it slightly above
-the maximum amount of ms of silence a user may generate during
-natural speech.
-</para>
-<para>By default this value is 2500ms. Valid values are 1 through 2^31.</para>
+audio stream may sound choppy to the other participants. This
+is caused by the user transitioning constantly from silence to
+talking during mid sentence.
+</para>
+<para>The best way to approach this option is to set it slightly
+above the maximum amount of milliseconds of silence a user may
+generate during natural speech.
+</para>
+<para>Valid values are 1 through 2^31.</para>
 </description>
 </configOption>
 <configOption name="dsp_talking_threshold">
-<synopsis>The number of milliseconds of detected non-silence necessary to triger talk detection</synopsis>
-<description><para>
-The time in milliseconds of sound above what the dsp has
-established as base line silence for a user before a user
-is considered to be talking. This value affects several
-operations and should not be changed unless the impact on
-call quality is fully understood.</para>
-<para>
-What this value affects internally:
+<synopsis>Average magnitude threshold to determine talking.</synopsis>
+<description>
+<para>The minimum average magnitude per sample in a frame
+for the DSP to consider talking/noise present. A value below
+this level is considered silence. This value affects several
+operations and should not be changed unless the impact on call
+quality is fully understood.
 </para>
-<para>
-1. Audio is only mixed out of a user's incoming audio stream
-if talking is detected. If this value is set too
-loose the user will hear themselves briefly each
-time they begin talking until the dsp has time to
-establish that they are in fact talking.
+<para>What this value affects internally:
 </para>
-<para>
-2. When talk detection AMI events are enabled, this value
+<para>1. Audio is only mixed out of a user's incoming audio
+stream if talking is detected. If this value is set too
+high the user will hear himself talking.
+</para>
+<para>2. When talk detection AMI events are enabled, this value
 determines when talking has begun which results in
-an AMI event to fire. If this value is set too tight
+an AMI event to fire. If this value is set too low
 AMI events may be falsely triggered by variants in
 room noise.
 </para>
-<para>
-3. The <replaceable>drop_silence</replaceable> option depends on this value to determine
-when the user's audio should be mixed into the bridge
-after periods of silence. If this value is too loose
-the beginning of a user's speech will get cut off as they
-transition from silence to talking.
+<para>3. The <replaceable>drop_silence</replaceable> option
+depends on this value to determine when the user's audio should
+be mixed into the bridge after periods of silence. If this value
+is too high the user's speech will get discarded as they will
+be considered silent.
 </para>
-<para>By default this value is 160 ms. Valid values are 1 through 2^31</para>
+<para>Valid values are 1 through 2^15.</para>
 </description>
 </configOption>
 <configOption name="jitterbuffer">
@@ -1479,7 +1473,7 @@ static char *handle_cli_confbridge_show_user_profile(struct ast_cli_entry *e, in
 "enabled" : "disabled");
 ast_cli(a->fd,"Silence Threshold: %ums\n",
 u_profile.silence_threshold);
-ast_cli(a->fd,"Talking Threshold: %ums\n",
+ast_cli(a->fd,"Talking Threshold: %u\n",
 u_profile.talking_threshold);
 ast_cli(a->fd,"Denoise: %s\n",
 u_profile.flags & USER_OPT_DENOISE ?

@@ -41,7 +41,10 @@
 #define DEFAULT_BRIDGE_PROFILE "default_bridge"
 #define DEFAULT_MENU_PROFILE "default_menu"
 
+/*! Default minimum average magnitude threshold to determine talking by the DSP. */
 #define DEFAULT_TALKING_THRESHOLD 160
+
+/*! Default time in ms of silence necessary to declare talking stopped by the bridge. */
 #define DEFAULT_SILENCE_THRESHOLD 2500
 
 enum user_profile_flags {
@@ -140,9 +143,9 @@ struct user_profile {
 char announcement[PATH_MAX];
 unsigned int flags;
 unsigned int announce_user_count_all_after;
-/*! The time in ms of talking before a user is considered to be talking by the dsp. */
+/*! Minimum average magnitude threshold to determine talking by the DSP. */
 unsigned int talking_threshold;
-/*! The time in ms of silence before a user is considered to be silent by the dsp. */
+/*! Time in ms of silence necessary to declare talking stopped by the bridge. */
 unsigned int silence_threshold;
 /*! The time in ms the user may stay in the confbridge */
 unsigned int timeout;

@@ -53,9 +53,16 @@
 /*! \brief Number of mixing iterations to perform between gathering statistics. */
 #define SOFTMIX_STAT_INTERVAL 100
 
-/* This is the threshold in ms at which a channel's own audio will stop getting
-* mixed out its own write audio stream because it is not talking. */
+/*!
+* \brief Default time in ms of silence necessary to declare talking stopped by the bridge.
+*
+* \details
+* This is the time at which a channel's own audio will stop getting
+* mixed out of its own write audio stream because it is no longer talking.
+*/
 #define DEFAULT_SOFTMIX_SILENCE_THRESHOLD 2500
+
+/*! Default minimum average magnitude threshold to determine talking by the DSP. */
 #define DEFAULT_SOFTMIX_TALKING_THRESHOLD 160
 
 #define SOFTBRIDGE_VIDEO_DEST_PREFIX "softbridge_dest"

@@ -49,59 +49,67 @@ type=user
 ; noise from the conference. Highly recommended for large conferences
 ; due to its performance enhancements.
 
-;dsp_talking_threshold=128 ; The time in milliseconds of sound above what the dsp has
-; established as base line silence for a user before a user
-; is considered to be talking. This value affects several
+;dsp_talking_threshold=128 ; Average magnitude threshold to determine talking.
+;
+; The minimum average magnitude per sample in a frame for the
+; DSP to consider talking/noise present. A value below this
+; level is considered silence. This value affects several
 ; operations and should not be changed unless the impact on
 ; call quality is fully understood.
 ;
 ; What this value affects internally:
 ;
-; 1. Audio is only mixed out of a user's incoming audio stream
-; if talking is detected. If this value is set too
-; loose the user will hear themselves briefly each
-; time they begin talking until the dsp has time to
-; establish that they are in fact talking.
-; 2. When talk detection AMI events are enabled, this value
-; determines when talking has begun which results in
-; an AMI event to fire. If this value is set too tight
-; AMI events may be falsely triggered by variants in
-; room noise.
-; 3. The drop_silence option depends on this value to determine
-; when the user's audio should be mixed into the bridge
-; after periods of silence. If this value is too loose
-; the beginning of a user's speech will get cut off as they
-; transition from silence to talking.
+; 1. Audio is only mixed out of a user's incoming audio
+; stream if talking is detected. If this value is set too
+; high the user will hear himself talking.
 ;
-; By default this value is 160 ms. Valid values are 1 through 2^31
+; 2. When talk detection AMI events are enabled, this value
+; determines when talking has begun which results in an
+; AMI event to fire. If this value is set too low AMI
+; events may be falsely triggered by variants in room
+; noise.
+;
+; 3. The 'drop_silence' option depends on this value to
+; determine when the user's audio should be mixed into the
+; bridge after periods of silence. If this value is too
+; high the user's speech will get discarded as they will
+; be considered silent.
+;
+; Valid values are 1 through 2^15.
+; By default this value is 160.
 
-;dsp_silence_threshold=2000 ; The time in milliseconds of sound falling within the what
-; the dsp has established as baseline silence before a user
-; is considered be silent. This value affects several
-; operations and should not be changed unless the impact
-; on call quality is fully understood.
+;dsp_silence_threshold=2000 ; The number of milliseconds of silence necessary to declare
+; talking stopped.
+;
+; The time in milliseconds of sound falling below the
+; 'dsp_talking_threshold' option when a user is considered to
+; stop talking. This value affects several operations and
+; should not be changed unless the impact on call quality is
+; fully understood.
 ;
 ; What this value affects internally:
 ;
 ; 1. When talk detection AMI events are enabled, this value
 ; determines when the user has stopped talking after a
-; period of talking. If this value is set too low
-; AMI events indicating the user has stopped talking
-; may get falsely sent out when the user briefly pauses
-; during mid sentence.
-; 2. The drop_silence option depends on this value to
+; period of talking. If this value is set too low AMI
+; events indicating the user has stopped talking may get
+; falsely sent out when the user briefly pauses during mid
+; sentence.
+;
+; 2. The 'drop_silence' option depends on this value to
 ; determine when the user's audio should begin to be
-; dropped from the conference bridge after the user
-; stops talking. If this value is set too low the user's
-; audio stream may sound choppy to the other participants.
-; This is caused by the user transitioning constantly from
+; dropped from the conference bridge after the user stops
+; talking. If this value is set too low the user's audio
+; stream may sound choppy to the other participants. This
+; is caused by the user transitioning constantly from
 ; silence to talking during mid sentence.
 ;
-; The best way to approach this option is to set it slightly above
-; the maximum amount of ms of silence a user may generate during
-; natural speech.
+; The best way to approach this option is to set it slightly
+; above the maximum amount of milliseconds of silence a user
+; may generate during natural speech.
 ;
-; By default this value is 2500ms. Valid values are 1 through 2^31
+; Valid values are 1 through 2^31.
+; By default this value is 2500ms.
 
 ;talk_detection_events=yes ; This option sets whether or not notifications of when a user
 ; begins and ends talking should be sent out as events over AMI.

@@ -46,11 +46,9 @@ enum ast_bridge_preference {
 * performing talking optimizations.
 */
 struct ast_bridge_tech_optimizations {
-/*! The amount of time in ms that talking must be detected before
-* the dsp determines that talking has occurred */
+/*! Minimum average magnitude threshold to determine talking by the DSP. */
 unsigned int talking_threshold;
-/*! The amount of time in ms that silence must be detected before
-* the dsp determines that talking has stopped */
+/*! Time in ms of silence necessary to declare talking stopped by the bridge. */
 unsigned int silence_threshold;
 /*! Whether or not the bridging technology should drop audio
 * detected as silence from the mix. */

@@ -87,7 +87,7 @@ void ast_dsp_free(struct ast_dsp *dsp);
 * created with */
 unsigned int ast_dsp_get_sample_rate(const struct ast_dsp *dsp);
 
-/*! \brief Set threshold value for silence */
+/*! \brief Set the minimum average magnitude threshold to determine talking by the DSP. */
 void ast_dsp_set_threshold(struct ast_dsp *dsp, int threshold);
 
 /*! \brief Set number of required cadences for busy */
@@ -106,19 +106,41 @@ int ast_dsp_set_call_progress_zone(struct ast_dsp *dsp, char *zone);
 busies, and call progress, all dependent upon which features are enabled */
 struct ast_frame *ast_dsp_process(struct ast_channel *chan, struct ast_dsp *dsp, struct ast_frame *inf);
 
-/*! \brief Return non-zero if this is silence. Updates "totalsilence" with the total
-number of seconds of silence */
+/*!
+* \brief Process the audio frame for silence.
+*
+* \param dsp DSP processing audio media.
+* \param f Audio frame to process.
+* \param totalsilence Variable to set to the total accumulated silence in ms
+* seen by the DSP since the last noise.
+*
+* \return Non-zero if the frame is silence.
+*/
 int ast_dsp_silence(struct ast_dsp *dsp, struct ast_frame *f, int *totalsilence);
 
-/*! \brief Return non-zero if this is silence. Updates "totalsilence" with the total
-number of seconds of silence. Returns the average energy of the samples in the frame
-in frames_energy variable. */
+/*!
+* \brief Process the audio frame for silence.
+*
+* \param dsp DSP processing audio media.
+* \param f Audio frame to process.
+* \param totalsilence Variable to set to the total accumulated silence in ms
+* seen by the DSP since the last noise.
+* \param frames_energy Variable to set to the average energy of the samples in the frame.
+*
+* \return Non-zero if the frame is silence.
+*/
 int ast_dsp_silence_with_energy(struct ast_dsp *dsp, struct ast_frame *f, int *totalsilence, int *frames_energy);
 
 /*!
-* \brief Return non-zero if this is noise. Updates "totalnoise" with the total
-* number of seconds of noise
+* \brief Process the audio frame for noise.
 * \since 1.6.1
+*
+* \param dsp DSP processing audio media.
+* \param f Audio frame to process.
+* \param totalnoise Variable to set to the total accumulated noise in ms
+* seen by the DSP since the last silence.
+*
+* \return Non-zero if the frame is silence.
 */
 int ast_dsp_noise(struct ast_dsp *dsp, struct ast_frame *f, int *totalnoise);
 

21
main/dsp.c
@@ -122,12 +122,19 @@ static struct progress {
 { GSAMP_SIZE_UK, { 350, 400, 440 } }, /*!< UK */
 };
 
-/*!\brief This value is the minimum threshold, calculated by averaging all
-* of the samples within a frame, for which a frame is determined to either
-* be silence (below the threshold) or noise (above the threshold). Please
-* note that while the default threshold is an even exponent of 2, there is
-* no requirement that it be so. The threshold will accept any value between
-* 0 and 32767.
+/*!
+* \brief Default minimum average magnitude threshold to determine talking/noise by the DSP.
+*
+* \details
+* The magnitude calculated for this threshold is determined by
+* averaging the absolute value of all samples within a frame.
+*
+* This value is the threshold for which a frame's average magnitude
+* is determined to either be silence (below the threshold) or
+* noise/talking (at or above the threshold). Please note that while
+* the default threshold is an even exponent of 2, there is no
+* requirement that it be so. The threshold will work for any value
+* between 1 and 2^15.
 */
 #define DEFAULT_THRESHOLD 512
 
@@ -397,7 +404,9 @@ typedef struct {
 struct ast_dsp {
 struct ast_frame f;
 int threshold;
+/*! Accumulated total silence in ms since last talking/noise. */
 int totalsilence;
+/*! Accumulated total talking/noise in ms since last silence. */
 int totalnoise;
 int features;
 int ringtimeout;