Set and state goals for the system
When considering an application's design, it is usually best to establish one or more specific goals for the application. Consider this akin to a charter or mission statement.

Setting and documenting goals is usually an iterative process that involves both the VUI designer and the stakeholders. All too often, stakeholders do not have a solid set of goals in mind, only a vague set of purposes for the application. It is not uncommon for stakeholders, say for a banking application, to state that their goals are "to allow their customers to perform all routine banking transactions." Such a statement may be fine from a public relations perspective, but it does not suffice for a designer. A better stated goal, for this example, might be: "The XYZ IVR will allow callers to check their checking account balance."

When setting and documenting goals, be as specific as possible, lest the line between what is "within scope" and what is "out of scope" for a given application release become blurry. The result of this iterative process will be a specific set of documented goals for the target application. To bring stakeholders to a set of specific application goals, the designer must listen, synthesize what is said, and reflect back to the stakeholders what was heard. The stakeholders must agree to the stated goals, for it is these goals that the VUI design will support.

Set goals for usability tests
Goals for usability tests can be drawn from a variety of sources of varying quality (Sauro & Lewis, 2012). Some approaches to the development of criteria are:

  1. Base criteria on historical data obtained from previous tests that included similar tasks.
  2. Search the published scientific or marketing research for relevant information.
  3. Use task modeling such as GOMS or KLM to estimate expert task-time performance (Sauro, 2009).
  4. Negotiate criteria with the stakeholders who are responsible for the product.

Ideally, the goals should have an objective basis and shared acceptance among stakeholders such as marketing and development (Lewis, 1982). The best objective basis for measurement goals is data from previous usability studies of predecessor or competitive products. For maximum generalizability, the source of historical data should be studies of similar types of participants completing the same tasks under the same conditions (Chapanis, 1988). If this type of information is not available (or really, even if it is), it is important for test designers to recommend objective goals and to negotiate with the other stakeholders for the final set of shared goals.
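As an illustration of the task-modeling approach (item 3), the following is a minimal sketch of a KLM-style estimate, not a definitive implementation. The operator times are commonly cited textbook values and are treated here as assumptions, and the function name and the PIN-entry task are hypothetical. A real study would substitute times appropriate to its own users and input device.

```python
# Commonly cited KLM operator times in seconds. Treat these as assumptions
# and replace them with values measured for your own users and devices.
KLM_OPERATOR_SECONDS = {
    "K": 0.20,  # keystroke or button press (skilled typist)
    "P": 1.10,  # point to a target with a mouse
    "H": 0.40,  # move hands between keyboard and pointing device
    "M": 1.35,  # mental preparation
}


def estimate_expert_task_time(operators):
    """Sum operator times for a sequence such as ["M", "K", "K"]."""
    unknown = [op for op in operators if op not in KLM_OPERATOR_SECONDS]
    if unknown:
        raise ValueError(f"Unknown KLM operators: {unknown}")
    return sum(KLM_OPERATOR_SECONDS[op] for op in operators)


# Hypothetical task: think, then key in a four-digit PIN.
pin_entry = ["M", "K", "K", "K", "K"]
print(f"Estimated expert time: {estimate_expert_task_time(pin_entry):.2f} s")
```

An estimate like this describes error-free expert performance, so it is better read as a starting point for negotiating a task-time criterion than as the criterion itself.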

Whatever approach you take, don’t let analysis paralysis prevent you from specifying goals. “Defining usability objectives (and standards) isn’t easy, especially when you’re beginning a usability program. However, you’re not restricted to the first objective you set. The important thing is to establish some specific objectives immediately, so that you can measure improvement. If the objectives turn out to be unrealistic or inappropriate, you can revise them” (Rosenbaum, 1989, p. 211). If you find yourself needing to make these types of revisions, try to make them in the early stages of gaining experience and taking initial measurements with a product. Do not change reasonable goals to accommodate an unusable product.

Your goals for testing may vary widely from project to project. If you are testing the primary tasks in an application where no controversy or big unknowns exist, you may be looking mostly for affirmation that things work well, so a relatively high success rate is anticipated and stated as a goal. If instead the design team was divided on the approach or, worse, was overruled, and usability testing stands a good chance of highlighting a problem, then the expected success rate on tasks will be lower. And that's OK.
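Once a success-rate goal is stated, the observed results can be compared against it. The sketch below is one possible way to do that, assuming an adjusted-Wald confidence interval, which is a common choice for the small samples typical of usability tests. The function names and the three-way verdict are illustrative assumptions, not a procedure prescribed by the sources cited here.

```python
from math import sqrt
from statistics import NormalDist


def adjusted_wald_interval(successes, trials, confidence=0.95):
    """Return (low, high) bounds for a task success rate."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials and trials > 0")
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Adjust the counts before applying the usual Wald formula.
    n_adj = trials + z * z
    p_adj = (successes + z * z / 2) / n_adj
    margin = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


def compare_to_goal(successes, trials, goal_rate):
    """Classify a result as 'met', 'not met', or 'inconclusive'."""
    low, high = adjusted_wald_interval(successes, trials)
    if low >= goal_rate:
        return "met"
    if high < goal_rate:
        return "not met"
    return "inconclusive"


# Hypothetical result: 9 of 10 participants completed the task,
# measured against a stated goal of an 80% success rate.
print(compare_to_goal(9, 10, 0.80))
```

With only a handful of participants the interval is wide, so an "inconclusive" verdict is common and may simply mean that more sessions are needed before judging the goal.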

Consider the audience for the test when choosing tasks and setting goals. If the customer will be present, then a day full of watching callers struggle with the piece that you suspect is going to be problematic is probably not a good experience for anybody: testers, subjects, or customer. Mix in some tasks that have a higher probability of success.

References

Chapanis, A. (1988). Some generalizations about generalization. Human Factors, 30, 253-267.

Lewis, J. R. (1982). Testing small system customer set-up. In Proceedings of the Human Factors Society 26th Annual Meeting (pp. 718-720). Santa Monica, CA: Human Factors Society.

Rosenbaum, S. (1989). Usability evaluations versus usability testing: When and why? IEEE Transactions on Professional Communication, 32, 210-216.

Sauro, J. (2009). Estimating productivity: Composite operators for keystroke level modeling. In J. A. Jacko (Ed.), Proceedings of the 13th International Conference on Human–Computer Interaction, HCII 2009 (pp. 352-361). Berlin, Germany: Springer-Verlag.

Sauro, J., & Lewis, J. R. (2012). Quantifying the user experience: Practical statistics for user research. Waltham, MA: Morgan Kaufmann.