Choose modality for the system based on caller and company needs
Modality is covered in great detail in Chapter 9, but you need to consider it at the beginning of your project. You can go with pure DTMF (touch-tone), pure speech, or some sort of hybrid ("press or say" throughout, or perhaps only on reprompts). Think about what tasks the system is trying to help callers accomplish, what environment they're doing it in, and what kind of budget the project has. Here are some basic considerations.

For interactive voice response (IVR) applications, consider using speech when:
  • The task becomes considerably easier with speech
    • Stocks
    • Funds transfers
    • Travel destinations
  • Callers are in a hands-busy or eyes-busy setting (Halstead-Nussloch, 1989)
    • Mobile callers (Novick et al., 1999)
    • Callers working with some job-related tools

Avoid using a speech-only IVR when:
  • The task requires graphics or other visual aids
    • How to dance
    • How to assemble something
  • The environment of use will be extremely noisy
    • Background noise
    • Simultaneous conversations
  • Users have special needs
    • Hearing impairment
    • Speech disfluency (or strong accents)
  • The environment prohibits the use of speech
    • Courtroom
    • Movie theater
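
If it helps to see the considerations above as a checklist, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the CallContext fields, the suggest_modality function, and especially the rule that combines the answers are hypothetical and are not a method from this book. Budget is left out. Treat the result as a starting point for discussion, not a verdict.

```python
from dataclasses import dataclass


@dataclass
class CallContext:
    """Hypothetical summary of the caller's task and environment."""
    task_easier_with_speech: bool = False  # e.g., stocks, funds transfers, travel destinations
    hands_or_eyes_busy: bool = False       # e.g., mobile callers, some job-related tools
    needs_visual_aids: bool = False        # e.g., how to dance, how to assemble something
    very_noisy: bool = False               # background noise, simultaneous conversations
    special_needs: bool = False            # e.g., hearing impairment, speech disfluency
    speech_prohibited: bool = False        # e.g., courtroom, movie theater


def suggest_modality(ctx: CallContext) -> str:
    """Return 'speech', 'dtmf', or 'hybrid' as a rough first guess.

    The combination rule is an illustrative assumption, not an
    established algorithm.
    """
    favors_speech = ctx.task_easier_with_speech or ctx.hands_or_eyes_busy
    against_speech_only = (
        ctx.needs_visual_aids
        or ctx.very_noisy
        or ctx.special_needs
        or ctx.speech_prohibited
    )

    if favors_speech and against_speech_only:
        # Speech helps, but some callers or settings need a fallback.
        return "hybrid"
    if favors_speech:
        return "speech"
    # Assumption: fall back to DTMF when nothing argues for speech.
    return "dtmf"


if __name__ == "__main__":
    commuter = CallContext(task_easier_with_speech=True, very_noisy=True)
    print(suggest_modality(commuter))
```

A checklist like this is most useful early on, when you are gathering answers from stakeholders. The detailed trade-offs belong to the modality discussion in Chapter 9.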


Halstead-Nussloch, R. (1989). The design of phone-based interfaces for consumers. In Proceedings of CHI 1989 (pp. 347–352). Austin, TX: ACM.

Novick, D. G., Hansen, B., Sutton, S., & Marshall, C. R. (1999). Limiting factors of automated telephone dialogues. In D. Gardner-Bonneau (Ed.), Human factors and voice interactive systems (pp. 163–186). Boston, MA: Kluwer Academic Publishers.