There are several strategies currently in use when playing explicit confirmation prompts. For example:

  • System: Was that $500?

  • System: That's $500, right?

  • System: That's $500, correct?

  • System: That's $500, correct? yes or no?

All of these are reasonably conversational, and all, to some extent, elicit responses of "Yes" or "No" (and when callers say "No", they also often restate their response, e.g., "No, $900" -- see Allow for one-step correction).
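As a rough illustration of handling a response such as "No, $900", the sketch below pulls out both the yes/no answer and any restated amount. It assumes the recognizer returns text with amounts already normalized to digits, which not every platform does; the function name parse_confirmation and the AMOUNT pattern are hypothetical and not taken from any particular speech platform.

```python
import re

# Assumption: the recognizer delivers text such as "no, $900" with the
# amount already written in digits. AMOUNT is an illustrative pattern only.
AMOUNT = re.compile(r"\$?\s*(\d+(?:,\d{3})*(?:\.\d{1,2})?)")

def parse_confirmation(utterance):
    """Return (answer, restated_amount) for a confirmation response.

    answer is "yes", "no", or None when neither word is heard.
    restated_amount is a float when the caller says an amount, else None.
    """
    text = utterance.lower()
    words = re.findall(r"[a-z]+", text)
    if "no" in words:
        answer = "no"
    elif "yes" in words:
        answer = "yes"
    else:
        answer = None
    match = AMOUNT.search(text)
    amount = float(match.group(1).replace(",", "")) if match else None
    return answer, amount
```

With this sketch, a "no" accompanied by an amount can be routed straight to confirmation of the new value rather than to a re-prompt for the amount.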

The first style, a simple yes/no question, can be effective. Its downside, when speech output uses recordings of a professional voice talent, is that the data to be confirmed falls at the end of the question, so each value needs an additional recording with the appropriate end-of-question prosody (rising fundamental frequency).

To avoid the need for those special recordings, an alternative is to use one of the other strategies, which keep the data to be confirmed away from the end of the question. But when using those other strategies, which is better, "right" or "correct", or does it just not matter?

It isn't clear how much the choice of word matters, but consider the likely responses to "right?" or "correct?" (other than yes/no).

For "right?", you'd expect "right" or "wrong".

For "correct?", you'd expect "correct" or "incorrect".

Note that the responses for "correct?" have substantially greater acoustic similarity than those for "right?" For this reason and because "right?" seems to be slightly more commonly used in conversation, we recommend "right?" over "correct?" if the choice is between the two alone. On the other hand, the occasional use of "correct?" to avoid robotically using the same method in a dialog that includes multiple confirmations should be fine -- just be sure to routinely examine the recognition logs to make sure you're getting high enough recognition accuracy at that prompt. If not, consider switching to a different style.
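One way to make that routine log check concrete is to compute recognition accuracy separately for each confirmation prompt. The sketch below assumes each log record is a dictionary with hypothetical fields prompt_id, recognized (what the recognizer returned), and transcribed (what a human transcriber heard). The 0.95 threshold is only a placeholder, since what counts as "high enough" depends on the application.

```python
from collections import defaultdict

def accuracy_by_prompt(log_records):
    """Return {prompt_id: fraction of responses recognized correctly}."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for record in log_records:
        totals[record["prompt_id"]] += 1
        if record["recognized"] == record["transcribed"]:
            correct[record["prompt_id"]] += 1
    return {pid: correct[pid] / totals[pid] for pid in totals}

def prompts_below(log_records, threshold=0.95):
    """List prompt ids whose accuracy falls below the placeholder threshold."""
    scores = accuracy_by_prompt(log_records)
    return sorted(pid for pid, score in scores.items() if score < threshold)
```

Any prompt that shows up in the result of prompts_below is a candidate for switching to a different confirmation style.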

A final style to consider is the use of "That's $500, correct? yes or no?" Here, the application explicitly prompts the caller for the required input ("yes" or "no") to minimize inappropriate parroting of the prompt's final word. There have been informal reports that although this style is longer, highly directive, and less conversational, it can result in improved recognition accuracy. If using this approach, be sure to have no more than 250 ms of pause between "correct?" and "yes or no?" to minimize the likelihood that the prompt extension ("yes or no?") will begin to play just as the caller responds to "correct?". Alternatively, if there is a need for this level of direction, consider starting with "yes or no", for example, "Yes or no, that's $500, right?" (and be sure that the grammar will accept "right" and "correct" as synonyms for "yes").
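The synonym handling mentioned above might look something like the following when done on the recognized text. This is a minimal sketch, not a replacement for a properly tuned recognition grammar: the word lists and the function name interpret_confirmation are assumptions, and a production grammar would be built from real caller data.

```python
import re

# Hypothetical word lists for illustration only.
YES_WORDS = {"yes", "right", "correct"}
NO_WORDS = {"no", "not", "wrong", "incorrect"}

def interpret_confirmation(utterance):
    """Map a recognized utterance to "yes", "no", or None (no match)."""
    words = re.findall(r"[a-z]+", utterance.lower())
    # Check negatives first so "that's not right" is not read as "yes".
    if any(word in NO_WORDS for word in words):
        return "no"
    if any(word in YES_WORDS for word in words):
        return "yes"
    return None
```

Negative words are checked first so that a response containing both kinds of word, such as "No, that's not correct", resolves to "no".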