Aaron, A., Eide, E., & Pitrelli, J. F. (2005). Conversational computers. Scientific American, 292(6), 64–69.

Adlin, T., & Pruitt, J. (2010). The essential persona lifecycle: Your guide to building and using personas. Waltham, MA: Morgan Kaufmann.

Ahlén, S., Kaiser, L., & Olvera, E. (2004). Are you listening to your Spanish speakers? Speech Technology, 9(4), 10–15.

Ainsworth, W. A., & Pratt, S. R. (1992). Feedback strategies for error correction in speech recognition systems. International Journal of Man-Machine Studies, 36, 833–842.

Ainsworth, W. A., & Pratt, S. R. (1993). Comparing error correction strategies in speech recognition systems. In C. Baber & J. M. Noyes (Eds.), Interactive speech technology: Human factors issues in the application of speech input/output to computers (pp. 131–135). London, UK: Taylor & Francis.

Alwan, J., & Suhm, B. (2010). Beyond best practices: A data-driven approach to maximizing self-service. In W. Meisel (Ed.), Speech in the user interface: Lessons from experience (pp. 99–105). Victoria, Canada: TMA Associates.

Attwater, D. (2008). Speech and touch-tone in harmony [PowerPoint Slides]. Paper presented at SpeechTek 2008. New York, NY: SpeechTek.

Baddeley, A. D., & Hitch, G. (1974). Working memory. In G. H. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 8, pp. 47–89). New York, NY: Academic Press.

Bailey, R. W. (1989). Human performance engineering: Using human factors/ergonomics to achieve computer system usability. Englewood Cliffs, NJ: Prentice-Hall.

Bailly, G. (2003). Close shadowing natural versus synthetic speech. International Journal of Speech Technology, 6, 11–19.

Balentine, B. (1999). Re-engineering the speech menu. In D. Gardner-Bonneau (Ed.), Human factors and voice interactive systems (pp. 205–235). Boston, MA: Kluwer Academic Publishers.

Balentine, B. (2006). The power of the pause. In W. Meisel (Ed.), VUI Visions: Expert Views on Effective Voice User Interface Design (pp. 89–91). Victoria, Canada: TMA Associates.

Balentine, B. (2007). It’s better to be a good machine than a bad person. Annapolis, MD: ICMI Press.

Balentine, B. (2010). Next-generation IVR avoids first-generation user interface mistakes. In W. Meisel (Ed.), Speech in the user interface: Lessons from experience (pp. 71–74). Victoria, Canada: TMA Associates.

Balentine, B., Ayer, C. M., Miller, C. L., & Scott, B. L. (1997). Debouncing the speech button: A sliding capture window device for synchronizing turn-taking. International Journal of Speech Technology, 2, 7–19.

Balentine, B., & Morgan, D. P. (2001). How to build a speech recognition application: A style guide for telephony dialogues, 2nd edition. San Ramon, CA: EIG Press.

Barkin, E. (2009). But is it natural? Speech Technology, 14(2), 21–24.

Beattie, G. W., & Barnard, P. J. (1979). The temporal structure of natural telephone conversations (directory enquiry calls). Linguistics, 17, 213–229.

Berndt, R. S., Mitchum, C., Burton, M., & Haendiges, A. (2004). Comprehension of reversible sentences in aphasia: The effects of verb meaning. Cognitive Neuropsychology, 21, 229–245.

Bitner, M. J., Ostrom, A. L., & Meuter, M. L. (2002). Implementing successful self-service technologies. Academy of Management Executive, 16(4), 96–108.

Bloom, J., Gilbert, J. E., Houwing, T., Hura, S., Issar, S., Kaiser, L., et al. (2005). Ten criteria for measuring effective voice user interfaces. Speech Technology, 10(9), 31–35.

Bloom, R., Pick, L., Borod, J., Rorie, K., Andelman, F., Obler, L., Sliwinski, M., Campbell, A., Tweedy, J., & Welkowitz, J. (1999). Psychometric aspects of verbal pragmatic ratings. Brain and Language, 68, 553–565.

Boretz, A. (2009). VUI standards: The great debate. Speech Technology, 14(8), 14–19.

Boyce, S. J. (2008). User interface design for natural language systems: From research to reality. In D. Gardner-Bonneau & H. E. Blanchard (Eds.), Human factors and voice interactive systems (2nd ed.) (pp. 43–80). New York, NY: Springer.

Boyce, S., & Viets, M. (2010). When is it my turn to talk?: Building smart, lean menus. In W. Meisel (Ed.), Speech in the user interface: Lessons from experience (pp. 108–112). Victoria, Canada: TMA Associates.

Broadbent, D. E. (1977). Language and ergonomics. Applied Ergonomics, 8, 15–18.

Byrne, B. (2003). “Conversational” isn’t always what you think it is. Speech Technology, 8(4), 16–19.

Callejas, Z., & López-Cózar, R. (2008). Relations between de-facto criteria in the evaluation of a spoken dialogue system. Speech Communication, 50, 646–665.

Chang, C. (2006). When service fails: The role of the salesperson and the customer. Psychology & Marketing, 23(3), 203–224.

Chapanis, A. (1988). Some generalizations about generalization. Human Factors, 30, 253–267.

Clark, H. H. (1996). Using language. Cambridge, UK: Cambridge University Press.

Clark, H. H. (2004). Pragmatics of language performance. In L. R. Horn & G. Ward (Eds.), Handbook of pragmatics (pp. 365–382). Oxford, UK: Blackwell.

Cohen, M. H., Giangola, J. P., & Balogh, J. (2004). Voice user interface design. Boston, MA: Addison-Wesley.

Commarford, P. M., & Lewis, J. R. (2005). Optimizing the pause length before presentation of global navigation commands. In Proceedings of HCI International 2005: Volume 2—The management of information: E-business, the Web, and mobile computing (pp. 1–7). St. Louis, MO: Mira Digital Publication.

Commarford, P. M., Lewis, J. R., Al-Awar Smither, J., & Gentzler, M. D. (2008). A comparison of broad versus deep auditory menu structures. Human Factors, 50(1), 77–89.

Couper, M. P., Singer, E., & Tourangeau, R. (2004). Does voice matter? An interactive voice response (IVR) experiment. Journal of Official Statistics, 20(3), 551–570.

Crystal, T. H., & House, A. S. (1990). Articulation rate and the duration of syllables and stress groups in connected speech. Journal of the Acoustical Society of America, 88, 101–112.

Cunningham, L. F., Young, C. E., & Gerlach, J. H. (2008). Consumer views of self-service technologies. The Service Industries Journal, 28(6), 719–732.

Dahl, D. (2006). Point/counter point on personas. Speech Technology, 11(1), 18–21.

Damper, R. I., & Gladstone, K. (2007). Experiences of usability evaluation of the IMAGINE speech-based interaction system. International Journal of Speech Technology, 9, 41–50.

Damper, R. I., & Soonklang, T. (2007). Subjective evaluation of techniques for proper name pronunciation. IEEE Transactions on Audio, Speech, and Language Processing, 15(8), 2213–2221.

Davidson, N., McInnes, F., & Jack, M. A. (2004). Usability of dialogue design strategies for automated surname capture. Speech Communication, 43, 55–70.

Dougherty, M. (2010). What’s universally available, but rarely used? In W. Meisel (Ed.), Speech in the user interface: Lessons from experience (pp. 117–120). Victoria, Canada: TMA Associates.

Dulude, L. (2002). Automated telephone answering systems and aging. Behaviour & Information Technology, 21(3), 171–184.

Durrande-Moreau, A. (1999). Waiting for service: Ten years of empirical research. International Journal of Service Industry Management, 10(2), 171–189.

Edworthy, J., & Hellier, E. (2006). Complex nonverbal auditory signals and speech warnings. In M. S. Wogalter (Ed.), Handbook of warnings (pp. 199–220). Mahwah, NJ: Lawrence Erlbaum.

Enterprise Integration Group. (2000). Speech Recognition 1999 R&D Program: User interface design recommendations final report. San Ramon, CA: Author.

Ervin-Tripp, S. (1993). Conversational discourse. In J. B. Gleason & N. B. Ratner (Eds.), Psycholinguistics (pp. 238–270). Fort Worth, TX: Harcourt Brace Jovanovich.

Evans, D. G., Draffan, E. A., James, A., & Blenkhorn, P. (2006). Do text-to-speech synthesizers pronounce correctly? A preliminary study. In K. Miesenberger et al. (Eds.), Proceedings of ICCHP (pp. 855–862). Berlin, Germany: Springer-Verlag.

Ferreira, F. (2003). The misinterpretation of noncanonical sentences. Cognitive Psychology, 47, 164–203.

Fosler-Lussier, E., Amdal, I., & Kuo, H.-K. J. (2005). A framework for predicting speech recognition errors. Speech Communication, 46, 153–170.

Frankish, C., & Noyes, J. (1990). Sources of human error in data entry tasks using speech input. Human Factors, 32(6), 697–716.

Fried, J., & Edmondson, R. (2006). How customer perceived latency measures success in voice self-service. Business Communications Review, 36(3), 26–32.

Fröhlich, P. (2005). Dealing with system response times in interactive speech applications. In Proceedings of CHI 2005 (pp. 1379–1382). Portland, OR: ACM.

Fromkin, V., Rodman, R., & Hyams, N. (1998). An introduction to language (6th ed.). Fort Worth, TX: Harcourt Brace Jovanovich.

Gardner-Bonneau, D. J. (1992). Human factors in interactive voice response applications: “Common sense” is an uncommon commodity. Journal of the American Voice I/O Society, 12, 1–12.

Gardner-Bonneau, D. (1999). Guidelines for speech-enabled IVR application design. In D. Gardner-Bonneau (Ed.), Human factors and voice interactive systems (pp. 147–162). Boston, MA: Kluwer Academic Publishers.

Garrett, M. F. (1990). Sentence processing. In D. N. Osherson & H. Lasnik (Eds.), Language: An invitation to cognitive science (pp. 133–176). Cambridge, MA: MIT Press.

Gleason, J. B., & Ratner, N. B. (1993). Psycholinguistics. Fort Worth, TX: Harcourt Brace Jovanovich.

Gould, J. D., Boies, S. J., Levy, S., Richards, J. T., & Schoonard, J. (1987). The 1984 Olympics message system: A test of behavioral principles of system design. Communications of the ACM, 30, 758–769.

Graham, G. M. (2005). Voice branding in America. Alpharetta, GA: Vivid Voices.

Graham, G. M. (2010). Speech recognition, the brand and the voice: How to choose a voice for your application. In W. Meisel (Ed.), Speech in the user interface: Lessons from experience (pp. 93–98). Victoria, Canada: TMA Associates.

Grice, H. P. (1975). Logic and conversation. In P. Cole & J. L. Morgan (Eds.), Syntax and semantics, volume 3: Speech acts (pp. 41–58). New York, NY: Academic Press.

Guinn, I. (2010). You can’t think of everything: The importance of tuning speech applications. In W. Meisel (Ed.), Speech in the user interface: Lessons from experience (pp. 89–92). Victoria, Canada: TMA Associates.

Hafner, K. (2004, Sept. 9). A voice with personality, just trying to help. The New York Times. Retrieved from www.nytimes.com/2004/09/09/technology/circuits/09emil.html.

Halstead-Nussloch, R. (1989). The design of phone-based interfaces for consumers. In Proceedings of CHI 1989 (pp. 347–352). Austin, TX: ACM.

Harris, R. A. (2005). Voice interaction design: Crafting the new conversational speech systems. San Francisco, CA: Morgan Kaufmann.

Heins, R., Franzke, M., Durian, M., & Bayya, A. (1997). Turn-taking as a design principle for barge-in in spoken language systems. International Journal of Speech Technology, 2, 155–164.

Henton, C. (2003). The name game: Pronunciation puzzles for TTS. Speech Technology, 8(5), 32–35.

Hone, K. S., & Graham, R. (2000). Towards a tool for the subjective assessment of speech system interfaces (SASSI). Natural Language Engineering, 6(3–4), 287–303.

Huang, X., Acero, A., & Hon, H. (2001). Spoken language processing: A guide to theory, algorithm and system development. Upper Saddle River, NJ: Prentice Hall.

Huguenard, B. R., Lerch, F. J., Junker, B. W., Patz, R. J., & Kass, R. E. (1997). Working-memory failure in phone-based interaction. ACM Transactions on Computer-Human Interaction, 4(2), 67–102.

Hunter, P. (2009). More isn't better, but (help me with) something else is. From the design-outloud blog.

Hura, S. L. (2008). What counts as VUI? Speech Technology, 13(9), 7.

Hura, S. L. (2010). My big fat main menu: The case for strategically breaking the rules. In W. Meisel (Ed.), Speech in the user interface: Lessons from experience (pp. 113–116). Victoria, Canada: TMA Associates.

Jain, A. K., & Pankanti, S. (2008). Beyond fingerprinting. Scientific American, 299(3), 78–81.

Jelinek, F. (1997). Statistical methods for speech recognition. Cambridge, MA: MIT Press.

Joe, R. (2007). The elements of style. Speech Technology, 12(8), 20–24.

Johnstone, A., Berry, U., Nguyen, T., & Asper, A. (1994). There was a long pause: Influencing turn-taking behaviour in human-human and human-computer spoken dialogues. International Journal of Human-Computer Studies, 41, 383–411.

Kaiser, L., Krogh, P., Leathem, C., McTernan, F., Nelson, C., Parks, M. C., & Turney, S. (2008). Thinking outside the box: Designing for the overall user experience. From the 2008 Workshop on the Maturation of VUI.

Karray, L., & Martin, A. (2003). Toward improving speech detection robustness for speech recognition in adverse conditions. Speech Communication, 40, 261–276.

Kaushansky, K. (2006). Voice authentication – not just another speech application. In W. Meisel (Ed.), VUI Visions: Expert Views on Effective Voice User Interface Design (pp. 139-142). Victoria, Canada: TMA Associates.

Klatt, D. (1987). Review of text-to-speech conversion for English. Journal of the Acoustical Society of America, 82, 737–793. Audio samples available at <www.cs.indiana.edu/rhythmsp/ASA/Contents.html>.

Kleijnen, M., de Ruyter, K., & Wetzels, M. (2007). An assessment of value creation in mobile service delivery and the moderating role of time consciousness. Journal of Retailing, 83(1), 33–46.

Klie, L. (2010). When in Rome. Speech Technology, 15(3), 20–24.

Knott, B. A., Bushey, R. R., & Martin, J. M. (2004). Natural language prompts for an automated call router: Examples increase the clarity of user responses. In Proceedings of the Human Factors and Ergonomics Society 48th annual meeting (pp. 736–739). Santa Monica, CA: Human Factors and Ergonomics Society.

Kortum, P., & Peres, S. C. (2006). An exploration of the use of complete songs as auditory progress bars. In Proceedings of the Human Factors and Ergonomics Society 50th annual meeting (pp. 2071–2075). Santa Monica, CA: HFES.

Kortum, P., & Peres, S. C. (2007). A survey of secondary activities of telephone callers who are put on hold. In Proceedings of the Human Factors and Ergonomics Society 51st annual meeting (pp. 1153–1157). Santa Monica, CA: HFES.

Kortum, P., Peres, S. C., Knott, B. A., & Bushey, R. (2005). The effect of auditory progress bars on consumer’s estimation of telephone wait time. In Proceedings of the Human Factors and Ergonomics Society 49th annual meeting (pp. 628–632). Santa Monica, CA: HFES.

Kotan, C., & Lewis, J. R. (2006). Investigation of confirmation strategies for speech recognition applications. In Proceedings of the Human Factors and Ergonomics Society 50th annual meeting (pp. 728–732). Santa Monica, CA: Human Factors and Ergonomics Society.

Kotelly, B. (2003). The art and business of speech recognition: Creating the noble voice. Boston, MA: Pearson Education.

Kotelly, B. (2006). Six tips for better branding. In W. Meisel (Ed.), VUI Visions: Expert Views on Effective Voice User Interface Design (pp. 61-64). Victoria, Canada: TMA Associates.

Krahmer, E., Swerts, M., Theune, M., & Weegels, M. (2001). Error detection in spoken human-machine interaction. International Journal of Speech Technology, 4, 19–30.

Lai, J., Karat, C.-M., & Yankelovich, N. (2008). Conversational speech interfaces and technology. In A. Sears & J. A. Jacko (Eds.), The human-computer interaction handbook: Fundamentals, evolving technologies, and emerging applications (pp. 381–391). New York, NY: Lawrence Erlbaum.

Larson, J. A. (2005). Ten guidelines for designing a successful voice user interface. Speech Technology, 10(1), 51–53.

Leppik, P. (2005). Does forcing callers to use self-service work? Quality Times, 22, 1–3. Retrieved February 18, 2009, from http://www.vocalabs.com/resources/newsletter/newsletter22.html.

Leppik, P. (2006). Developing metrics part 1: Bad metrics. The Customer Service Survey. Retrieved from www.vocalabs.com/resources/blog/C834959743/E20061205170807/index.html.

Leppik, P. (2012). The customer frustration index. Golden Valley, MN: Vocal Laboratories. Retrieved July 23, 2012, from http://www.vocalabs.com/download-ncss-cross-industry-report-customer-frustration-index-q2-2012.

Leppik, P., & Leppik, D. (2005). Gourmet customer service: A scientific approach to improving the caller experience. Eden Prairie, MN: VocaLabs.

Lewis, J. R. (1982). Testing small system customer set-up. In Proceedings of the Human Factors Society 26th annual meeting (pp. 718–720). Santa Monica, CA: Human Factors Society.

Lewis, J. R. (2005). Frequency distributions for names and unconstrained words associated with the letters of the English alphabet. In Proceedings of HCI International 2005: Posters (pp. 1–5). St. Louis, MO: Mira Digital Publication. Available at http://drjim.0catch.com/hcii05-368-wordfrequency.pdf.

Lewis, J. R. (2006). Effectiveness of various automated readability measures for the competitive evaluation of user documentation. In Proceedings of the Human Factors and Ergonomics Society 50th annual meeting (pp. 624–628). Santa Monica, CA: Human Factors and Ergonomics Society.

Lewis, J. R. (2007). Advantages and disadvantages of press or say <x> speech user interfaces (Tech. Rep. BCR-UX-2007-0002. Retrieved from http://drjim.0catch.com/2007_AdvantagesAndDisadvantagesOfPressOrSaySpeechUserInter.pdf). Boca Raton, FL: IBM Corp.

Lewis, J. R. (2008). Usability evaluation of a speech recognition IVR. In T. Tullis & B. Albert (Eds.), Measuring the user experience, Chapter 10: Case studies (pp. 244–252). Amsterdam, Netherlands: Morgan Kaufmann.

Lewis, J. R. (2011). Practical speech user interface design. Boca Raton, FL: CRC Press, Taylor & Francis Group.

Lewis, J. R. (2012). Usability testing. In G. Salvendy (Ed.), Handbook of human factors and ergonomics, 4th ed. (pp. 1267–1312). New York, NY: John Wiley.

Lewis, J. R., & Commarford, P. M. (2003). Developing a voice-spelling alphabet for PDAs. IBM Systems Journal, 42(4), 624–638. Available at http://drjim.0catch.com/2003_DevelopingAVoiceSpellingAlphabetForPDAs.pdf.

Lewis, J. R., Commarford, P. M., Kennedy, P. J., & Sadowski, W. J. (2008). Handheld electronic devices. In C. Melody Carswell (Ed.), Reviews of Human Factors and Ergonomics, Vol. 4 (pp. 105–148). Santa Monica, CA: Human Factors and Ergonomics Society. Available at http://drjim.0catch.com/2008_HandheldElectronicDevices.pdf.

Lewis, J. R., Commarford, P. M., & Kotan, C. (2006). Web-based comparison of two styles of auditory presentation: All TTS versus rapidly mixed TTS and recordings. In Proceedings of the Human Factors and Ergonomics Society 50th annual meeting (pp. 723–727). Santa Monica, CA: Human Factors and Ergonomics Society.

Lewis, J. R., Potosnak, K. M., & Magyar, R. L. (1997). Keys and keyboards. In M. Helander, T. K. Landauer, & P. Prabhu (Eds.), Handbook of human-computer interaction (pp. 1285–1315). Amsterdam, Netherlands: Elsevier. Available at http://drjim.0catch.com/1997_KeysAndKeyboards.pdf.

Lewis, J. R., Simone, J. E., & Bogacz, M. (2000). Designing common functions for speech-only user interfaces: Rationales, sample dialogs, potential uses for event counting, and sample grammars (Tech. Rep. 29.3287, available at <http://drjim.0catch.com/always-ral.pdf>). Raleigh, NC: IBM Corp.

Liberman, A. M., Harris, K. S., Hoffman, H. S., & Griffith, B. C. (1957). The discrimination of speech sounds within and across phoneme boundaries. Journal of Experimental Psychology, 54, 358–368.

Litman, D., Hirschberg, J., & Swerts, M. (2006). Characterizing and predicting corrections in spoken dialogue systems. Computational Linguistics, 32(3), 417–438.

Lombard, E. (1911). Le signe de l’élévation de la voix. Annales des maladies de l’oreille et du larynx, 37, 101–119.

Machado, S., Duarte, E., Teles, J., Reis, L., & Rebelo, F. (2012). Selection of a voice for a speech signal for personalized warnings: The effect of speaker's gender and voice pitch. Work, 41, 3592–3598.

Margulies, E. (2005). Adventures in turn-taking: Notes on success and failure in turn cue coupling. In AVIOS 2005 proceedings (pp. 1–10). San Jose, CA: AVIOS.

Margulies, M. K. (1980). Effects of talker differences on speech intelligibility in the hearing impaired. Doctoral dissertation, City University of New York.

Marics, M. A., & Engelbeck, G. (1997). Designing voice menu applications for telephones. In M. Helander, T. K. Landauer, & P. Prabhu (Eds.), Handbook of human-computer interaction, 2nd edition (pp. 1085–1102). Amsterdam, Netherlands: Elsevier.

Markowitz, J. (2010). VUI concepts for speaker verification. In W. Meisel (Ed.), Speech in the user interface: Lessons from experience (pp. 161–166). Victoria, Canada: TMA Associates.

Massaro, D. (1975). Preperceptual images, processing time, and perceptual units in speech perception. In D. Massaro (Ed.), Understanding language: An information-processing analysis of speech perception, reading, and psycholinguistics (pp. 125–150). New York, NY: Academic Press.

McInnes, F., Attwater, D., Edgington, M. D., Schmidt, M. S., & Jack, M. A. (1999). User attitudes to concatenated natural speech and text-to-speech synthesis in an automated information service. In Proceedings of Eurospeech99 (pp. 831–834). Budapest, Hungary: ESCA.

McInnes, F. R., Nairn, I. A., Attwater, D. J., Edgington, M. D., & Jack, M. A. (1999). A comparison of confirmation strategies for fluent telephone dialogues. Edinburgh, UK: Centre for Communication Interface Research.

McKellin, W. H., Shahin, K., Hodgson, M., Jamieson, J., & Pichora-Fuller, K. (2007). Pragmatics of conversation and communication in noisy settings. Journal of Pragmatics, 39, 2159–2184.

McKienzie, J. (2009). Menu pauses: How long? [PowerPoint Slides]. Paper presented at SpeechTek 2009. New York, NY: SpeechTek.

McTear, M., O’Neill, I., Hanna, P., & Liu, X. (2005). Handling errors and determining confirmation strategies—an object based approach. Speech Communication, 45, 249–269.

Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. The Psychological Review, 63, 81–97.

Miller, G. A. (1962). Some psychological studies of grammar. American Psychologist, 17, 748–762.

Minker, W., Pittermann, J., Pittermann, A., Strauß, P.-M., & Bühler, D. (2007). Challenges in speech-based human-computer interaction. International Journal of Speech Technology, 10, 109–119.

Mościcki, E. K., Elkins, E. F., Baum, H. M., & McNamara, P. M. (1985). Hearing loss in the elderly: An epidemiologic study of the Framingham Heart Study cohort. Ear and Hearing, 6, 184–190.

Munichor, N., & Rafaeli, A. (2007). Numbers or apologies? Customer reactions to telephone waiting time fillers. Journal of Applied Psychology, 92(2), 511–518.

Nairne, J. (2002). Remembering over the short-term: The case against the standard model. Annual Review of Psychology, 53, 53–81.

Nass, C., & Brave, S. (2005). Wired for speech: How voice activates and advances the human-computer relationship. Cambridge, MA: MIT Press.

Nass, C., & Yen, C. (2010). The man who lied to his laptop: What machines teach us about human relationships. New York, NY: Penguin Group.

Németh, G., Kiss, G., Zainkó, C., Olaszy, G., & Tóth, B. (2008). Speech generation in mobile phones. In D. Gardner-Bonneau & H. E. Blanchard (Eds.), Human factors and voice interactive systems (2nd ed.) (pp. 163–191). New York, NY: Springer.

North, A. C., Hargreaves, D. J., & McKendrick, J. (1999). Music and on-hold waiting time. British Journal of Psychology, 90, 161–164.

Novick, D. G., Hansen, B., Sutton, S., & Marshall, C. R. (1999). Limiting factors of automated telephone dialogues. In D. Gardner-Bonneau (Ed.), Human factors and voice interactive systems (pp. 163–186). Boston, MA: Kluwer Academic Publishers.

Ogden, W. C., & Bernick, P. (1997). Using natural language interfaces. In M. Helander, T. K. Landauer, & P. Prabhu (Eds.), Handbook of human-computer interaction (pp. 137–161). Amsterdam, Netherlands: Elsevier.

Ostendorf, M., Kannan, A., Austin, S., Kimball, O., Schwartz, R., & Rohlicek, J. R. (1991). Integration of diverse recognition methodologies through reevaluation of n-best sentence hypotheses. In Proceedings of DARPA Workshop on Speech and Natural Language (pp. 83–87). Stroudsburg, PA: Association for Computational Linguistics. <http://acl.ldc.upenn.edu/H/H91/H91-1013.pdf>

Osuna, E. E. (1985). The psychological cost of waiting. Journal of Mathematical Psychology, 29, 82–105.

Parkinson, F. (2012). Alphanumeric confirmation & user data. Presentation at SpeechTek 2012, available at http://www.speechtek.com/2012/Presentations.aspx (search for Parkinson in Session B102).

Pieraccini, R. (2010). Continuous automated speech tuning and the return of statistical grammars. In W. Meisel (Ed.), Speech in the user interface: Lessons from experience (pp. 255–259). Victoria, Canada: TMA Associates.

Pieraccini, R. (2012). The voice in the machine: Building computers that understand speech. Cambridge, MA: MIT Press.

Polkosky, M. D. (2001). User preference for system processing tones (Tech. Rep. 29.3436). Raleigh, NC: IBM.

Polkosky, M. D. (2002). Initial psychometric evaluation of the Pragmatic Rating Scale for Dialogues (Tech. Rep. 29.3634). Boca Raton, FL: IBM.

Polkosky, M. D. (2005a). Toward a social-cognitive psychology of speech technology: Affective responses to speech-based e-service. Unpublished doctoral dissertation, University of South Florida.

Polkosky, M. D. (2005b). What is speech usability, anyway? Speech Technology, 10(9), 22–25.

Polkosky, M. D. (2006). Respect: It’s not what you say, it’s how you say it. Speech Technology, 11(5), 16–21.

Polkosky, M. D. (2008). Machines as mediators: The challenge of technology for interpersonal communication theory and research. In E. Konjin (Ed.), Mediated interpersonal communication (pp. 34–57). New York, NY: Routledge.

Polkosky, M. D., & Lewis, J. R. (2002). Effect of auditory waiting cues on time estimation in speech recognition telephony applications. International Journal of Human-Computer Interaction, 14, 423–446.

Polkosky, M. D., & Lewis, J. R. (2003). Expanding the MOS: Development and psychometric evaluation of the MOS-R and MOS-X. International Journal of Speech Technology, 6, 161–182.

Ramos, L. (1993). The effects of on-hold telephone music on the number of premature disconnections to a statewide protective services abuse hot line. Journal of Music Therapy, 30(2), 119–129.

Reeves, B., & Nass, C. (2003). The media equation: How people treat computers, television, and new media like real people and places. Chicago, IL: University of Chicago Press.

Reinders, M., Dabholkar, P. A., & Frambach, R. T. (2008). Consequences of forcing consumers to use technology-based self-service. Journal of Service Research, 11(2), 107–123.

Resnick, M., & Sanchez, J. (2004). Effects of organizational scheme and labeling on task performance in product-centered and user-centered web sites. Human Factors, 46, 104–117.

Roberts, F., Francis, A. L., & Morgan, M. (2006). The interaction of inter-turn silence with prosodic cues in listener perceptions of “trouble” in conversation. Speech Communication, 48, 1079–1093.

Rolandi, W. (2003). When you don’t know what you don’t know. Speech Technology, 8(4), 28.

Rolandi, W. (2004a). Improving customer service with speech. Speech Technology, 9(5), 14.

Rolandi, W. (2004b). Rolandi's razor. Speech Technology, 9(4), 39.

Rolandi, W. (2005). The impotence of being earnest. Speech Technology, 10(1), 22.

Rolandi, W. (2006). The alpha bail. Speech Technology, 11(1), 56.

Rolandi, W. (2007a). Aligning customer and company goals through VUI. Speech Technology, 12(2), 6.

Rolandi, W. (2007b). The pains of main are plainly VUI’s bane. Speech Technology, 12(1), 6.

Rolandi, W. (2007c). The persona craze nears an end. Speech Technology, 12(5), 9.

Rosenbaum, S. (1989). Usability evaluations versus usability testing: When and why? IEEE Transactions on Professional Communication, 32, 210–216.

Rosenfeld, R., Olsen, D., & Rudnicky, A. (2001). Universal speech interfaces. Interactions, 8(6), 34–44.

Sadowski, W. J. (2001). Capabilities and limitations of Wizard of Oz evaluations of speech user interfaces. In Proceedings of HCI International 2001: Usability evaluation and interface design (pp. 139–142). Mahwah, NJ: Lawrence Erlbaum.

Sadowski, W. J., & Lewis, J. R. (2001). Usability evaluation of the IBM WebSphere “WebVoice” demo (Tech. Rep. 29.3387, available at drjim.0catch.com/vxmllive1-ral.pdf). West Palm Beach, FL: IBM Corp.

Sauro, J. (2009). Estimating productivity: Composite operators for keystroke level modeling. In J. A. Jacko (Ed.), Proceedings of the 13th International Conference on Human–Computer Interaction, HCII 2009 (pp. 352–361). Berlin, Germany: Springer-Verlag.

Sauro, J., & Lewis, J. R. (2012). Quantifying the user experience: Practical statistics for user research. Burlington, MA: Morgan Kaufmann.

Schegloff, E. A. (2000). Overlapping talk and the organization of turn-taking for conversation. Language in Society, 29, 1–63.

Schoenborn, C. A., & Marano, M. (1988). Current estimates from the National Health Interview Survey: United States 1987. In Vital and Health Statistics, Series 10, No. 166. Washington, D.C.: Government Printing Office.

Schumacher, R. M., Jr., Hardzinski, M. L., & Schwartz, A. L. (1995). Increasing the usability of interactive voice response systems: Research and guidelines for phone-based interfaces. Human Factors, 37, 251–264.

Sheeder, T., & Balogh, J. (2003). Say it like you mean it: Priming for structure in caller responses to a spoken dialog system. International Journal of Speech Technology, 6, 103–111.

Shinn, P. (2009). Getting persona – IVR voice gender, intelligibility & the aging. In Speech Strategy News (November, pp. 37–39).

Shinn, P., Basson, S. H., & Margulies, M. (2009). The impact of IVR voice talent selection on intelligibility. Presentation at SpeechTek 2009. Available at <www.speechtek.com/2009/program.aspx?SessionID=2386>.

Shriver, S., & Rosenfeld, R. (2002). Keywords for a universal speech interface. In Proceedings of CHI 2002 (pp. 726–727). Minneapolis, MN: ACM.

Skantze, G. (2005). Exploring human error recovery strategies: Implications for spoken dialogue systems. Speech Communication, 45, 325–341.

Spiegel, M. F. (1997). Advanced database preprocessing and preparations that enable telecommunication services based on speech synthesis. Speech Communication, 23, 51–62.

Spiegel, M. F. (2003a). Proper name pronunciations for speech technology applications. International Journal of Speech Technology, 6, 419–427.

Spiegel, M. F. (2003b). The difficulties with names: Overcoming barriers to personal voice services. Speech Technology, 8(3), 12–15.

Stivers, T., Enfield, N. J., Brown, P., Englert, C., Hayashi, M., Heinemann, T., Hoymann, G., Rossano, F., de Ruiter, J. P., Yoon, K.-E., & Levinson, S. C. (2009). Universals and cultural variation in turn-taking in conversation. Proceedings of the National Academy of Sciences, 106(26), 10587–10592.

Suhm, B. (2008). IVR usability engineering using guidelines and analyses of end-to-end calls. In D. Gardner-Bonneau & H. E. Blanchard (Eds.), Human factors and voice interactive systems, 2nd edition (pp. 1–41). New York, NY: Springer.

Suhm, B., Freeman, B., & Getty, D. (2001). Curing the menu blues in touch-tone voice interfaces. In Proceedings of CHI 2001 (pp. 131–132). Seattle, WA: ACM.

Suhm, B., Bers, J., McCarthy, D., Freeman, B., Getty, D., Godfrey, K., & Peterson, P. (2002). A comparative study of speech in the call center: Natural language call routing vs. touch-tone menus. In Proceedings of CHI 2002 (pp. 283–290). Minneapolis, MN: ACM.

Toledano, D. T., Pozo, R. F., Trapote, Á. H., & Gómez, L. H. (2006). Usability evaluation of multi-modal biometric verification systems. Interacting with Computers, 18, 1101–1122.

Tomko, S., Harris, T. K., Toth, A., Sanders, J., Rudnicky, A., & Rosenfeld, R. (2005). Towards efficient human machine speech communication: The speech graffiti project. ACM Transactions on Speech and Language Processing, 2(1), 1–27.

Torres, F., Hurtado, L. F., García, F., Sanchis, E., & Segarra, E. (2005). Error handling in a stochastic dialog system through confidence measures. Speech Communication, 45, 211–229.

Turunen, M., Hakulinen, J., & Kainulainen, A. (2006). Evaluation of a spoken dialogue system with usability tests and long-term pilot studies: Similarities and differences. In Proceedings of the 9th International Conference on Spoken Language Processing (pp. 1057–1060). Pittsburgh, PA: ICSLP.

Unzicker, D. K. (1999). The psychology of being put on hold: An exploratory study of service quality. Psychology & Marketing, 16(4), 327–350.

Vacca, J. R. (2007). Biometric technologies and verification systems. Burlington, MA: Elsevier.

Virzi, R. A., & Huitema, J. S. (1997). Telephone-based menus: Evidence that broader is better than deeper. In Proceedings of the Human Factors and Ergonomics Society 41st annual meeting (pp. 315–319). Santa Monica, CA: Human Factors and Ergonomics Society.

Voice Messaging User Interface Forum. (1990). Specification document. Cedar Knolls, NJ: Probe Research.

Walker, M. A., Fromer, J., Di Fabbrizio, G., Mestel, C., & Hindle, D. (1998). What can I say?: Evaluating a spoken language interface to email. In Proceedings of CHI 1998 (pp. 582–589). Los Angeles, CA: ACM.

Watt, W. C. (1968). Habitability. American Documentation, 19(3), 338–351.

Weegels, M. F. (2000). Users’ conceptions of voice-operated information services. International Journal of Speech Technology, 3, 75–82.

Wilkie, J., McInnes, F., Jack, M. A., & Littlewood, P. (2007). Hidden menu options in automated human-computer telephone dialogues: Dissonance in the user’s mental model. Behaviour & Information Technology, 26(6), 517–534.

Williams, J. D., & Witt, S. M. (2004). A comparison of dialog strategies for call routing. International Journal of Speech Technology, 7, 9–24.

Wilson, T. P., & Zimmerman, D. H. (1986). The structure of silence between turns in two-party conversation. Discourse Processes, 9, 375–390.

Wolters, M., Georgila, K., Moore, J. D., Logie, R. H., MacPherson, S. E., & Watson, M. (2009). Reducing working memory load in spoken dialogue systems. Interacting with Computers, 21, 276–287.

Wright, L. E., Hartley, M. W., & Lewis, J. R. (2002). Conditional probabilities for IBM Voice Browser 2.0 alpha and alphanumeric recognition (Tech. Rep. 29.3498. Retrieved from http://drjim.0catch.com/alpha2-acc.pdf). West Palm Beach, FL: IBM.

Yagil, D. (2001). Ingratiation and assertiveness in the service provider-customer dyad. Journal of Service Research, 3(4), 345–353.

Yang, F., & Heeman, P. A. (2010). Initiative conflicts in task-oriented dialogue. Computer Speech and Language, 24, 175–189.

Yellin, E. (2009). Your call is (not that) important to us: Customer service and what it reveals about our world and our lives. New York, NY: Free Press.

Yudkowsky, M. (2008). The creepiness factor. Speech Technology, 13(8), 4.

Yuschik, M. (2008). Silence locations and durations in dialog management. In D. Gardner-Bonneau & H. E. Blanchard (Eds.), Human factors and voice interactive systems, 2nd edition (pp. 231–253). New York, NY: Springer.

Zoltan-Ford, E. (1991). How to get people to say and type what computers can understand. International Journal of Man-Machine Studies, 34, 527–547.

Zurif, E. B. (1990). Language and the brain. In D. N. Osherson & H. Lasnik (Eds.), Language: An invitation to cognitive science (pp. 177–198). Cambridge, MA: MIT Press.