CSE Blog

Part 1: Background-A Case Study in Cognitive Systems Engineering Usability Evaluation

Recently we were asked by one of our partner companies to provide a Usability Evaluation of one of their new systems.  This is Part 1 of a series, to be expanded as the project unfolds.  Hopefully it will capture a sense of how ‘different’ a deeply grounded CSE Evaluation of a system really is.

We started off by explaining (during the proposal process) that our Cognitive Systems Engineering background and our experience with designing revolutionary Decision Support Systems really ‘warps’ our assessments of systems.  We look far beyond ‘knobology,’ ‘look and feel,’ and ‘style guide,’ and see deeper underlying issues that fundamentally impact the success of the system but are often much harder to fix (similar to the idea that ‘Remodeling is more difficult than new construction’).  Equally, we don’t believe that a Style Guide will cure all ills, nor do we believe one will prevent all future ills as the system continues to grow.  After a big dose of “Be careful what you wish for,” we’re on our way to evaluating a system.

The system itself is a pretty cool GIS-referenced, dynamic data repository, intended to support a team of users trying to analyze the data with a powerful set of geo-centric, direct manipulation tools.  As a trivial example, ‘find all the data inside this area’ is done by drawing a box on a map, not by typing an SQL query.  The whole project is further challenged by what Dr. David Woods calls “The Envisioned World Problem” (i.e., the system they envision would radically change operational practices once fielded, hence there is no body of expertise in working this way).  That lack of expertise makes many system development processes impractical.  In addition, the program has all the typical pressures to ‘build in an agile way’ (to give management lots of near-instant gratification) and ‘build these features next’ (to give marketing new ammunition as they look for customers).  Our planned approach is to rapidly construct a crude CSE Analysis of the cognitive work to be performed, to ground our thinking.  System design features, information presented, representational techniques, etc. will be evaluated against this rough ‘basis for support’.
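For readers who want the ‘box on a map’ example made concrete, here is a minimal sketch in Python of what such a selection might boil down to underneath.  None of this comes from the actual system: the `Record` and `Box` types, the `find_in_box` helper, and the sample coordinates are all hypothetical, and the sketch ignores real-world wrinkles such as boxes that cross the antimeridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A hypothetical geo-referenced data item (not the real system's schema)."""
    name: str
    lon: float
    lat: float


@dataclass(frozen=True)
class Box:
    """A hypothetical box drawn on the map, in degrees of longitude/latitude."""
    west: float
    south: float
    east: float
    north: float

    def contains(self, rec: Record) -> bool:
        # Simple inclusive bounds check; assumes the box does not wrap
        # around the antimeridian.
        return (self.west <= rec.lon <= self.east
                and self.south <= rec.lat <= self.north)


def find_in_box(records, box):
    """Return the records that fall inside the drawn box."""
    return [r for r in records if box.contains(r)]


# Made-up sample data, purely for illustration.
records = [
    Record("a", -83.0, 40.0),
    Record("b", -80.0, 41.5),
    Record("c", -83.1, 39.9),
]
box = Box(west=-83.5, south=39.5, east=-82.5, north=40.5)
selected = [r.name for r in find_in_box(records, box)]
```

The point of direct manipulation is that the user never sees anything like this; the drag gesture on the map supplies the four box edges, and the query is formed on the user’s behalf.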

To manage these support needs and findings, we use Woods’ categorization of Usability, Usefulness, and Understanding.  Roughly stated:

  • Usability:  the actual operation of the machinery of the system (i.e., classic knobology, style, etc.) - successful system design on this level allows individual operations to occur
  • Usefulness:  the ability to ‘execute the workflow’, the everyday processing of information through the system - successful system design on this level allows routine work to be performed
  • Understanding:  the true, deep insight into the first principles of the work domain (and also the system itself) - successful system design on this level allows the user to adapt to unexpected situations, to truly understand the world and the system

The parallelism to Rasmussen’s Skill-Rule-Knowledge levels seems readily apparent.

The project started with an all-day training session (that we saw as an all-day Knowledge Elicitation opportunity).  As the trainer described individual features, we watched for indicators.  Since the training tended to be encapsulated, feature-by-feature presentations, we expected to detect indicators related to Usability (since the presentations were intentionally out of the context of actual use).  We observed places where the instructor operated a control and didn’t get the expected result (literally “oops” or “why did it do that”).  There were many instances of this class of observation.

Being the overachievers that we are, we began asking for more ‘Use Case’ examples, in the hopes of also getting examples of Usefulness cases.  The indicators for these are different; this time they focus on ‘work processes’ and any interruptions or ‘do-overs’ along the way.  We gathered several very different indicators that triggered lots of questions and note taking, from “wait, let’s try that again” from the instructors to “now, XYZ should be in place here and we…wait, it’s not here.”  While the language was feature/case specific, the ‘pause/regroup/retry’ behavior became easy to detect.

While we are just getting going with the entire process, one very interesting meta-observation came out of ‘Day One’:  by late in the day the instructors and developers were self-detecting these indicators immediately after they occurred.  They were correctly identifying not only the indicators, but also the class of the finding, and offering some powerful insights about the underlying system design causes.  This System Remodeling Project looks like it will be interesting.

Stand by for Part 2:  Analysis by Inspection.