Original: It's usability testing, stupid!

 2011-3-10 11:31  2184 14 14 分类: 消费电子

What went wrong with the first release of Apple's iPhone 4? Apparently none of the traditional white-box, black-box or usability testing procedures caught it. Is a more free-form, ad hoc kind of usability testing necessary? The iPhone 4's wireless antenna problems have been a public relations fiasco for Apple Inc. No doubt you've followed all the press on this, but if you haven't witnessed the disaster for yourself, there are video demos on YouTube, including one from Consumer Reports demonstrating that when the iPhone 4 is held in a certain way, signal strength drops from four bars to one.

The issue underscores the importance of usability testing on any product targeted at the consumer market. Steve Jobs has said that his company subjects its iPhones, iPads and iPods to many variations of the traditional white box and black box testing procedures. Yet these tests apparently did not identify the antenna problem.

In my investigation of the iPhone 4 antenna problem I could not find any definitive statement that usability testing was done, except one anonymous rumor that the company does indeed conduct such tests on its consumer products, but using its own trained engineers. If that is the case, and if the testing had been done properly, it would have increased the odds that the problem would be identified. But Apple's iPhone 4 developers did not spot it.

So what went wrong?

Black-box testing, the most common form of testing, exercises the functionality of a device, a software program or a system, as opposed to its internal structures or workings. Test cases are built around specifications and requirements, i.e., what the application is supposed to do. It uses external descriptions of the software, the hardware and the application to derive test cases. These tests can be functional or non-functional, though usually functional. The test object's internal structure is irrelevant.

White-box testing, on the other hand, allows designers to test a device's internal hardware and software structures rather than its functionality. An internal perspective of the system, as well as programming and engineering knowledge, is required in order to exercise paths through the code and to subject the hardware to all of its various operating modes.
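To make the distinction concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `signal_bars` function and its dBm thresholds are invented for illustration and are not Apple's actual mapping. The black-box checks are written from an assumed spec alone, while the white-box checks are written by reading the code and probing both sides of every branch.

```python
def signal_bars(rssi_dbm: float) -> int:
    """Hypothetical mapping from received signal strength (dBm) to display bars.

    The thresholds are assumptions made up for this example.
    """
    if rssi_dbm >= -80:
        return 4
    if rssi_dbm >= -95:
        return 3
    if rssi_dbm >= -105:
        return 2
    if rssi_dbm >= -115:
        return 1
    return 0


def black_box_checks() -> bool:
    # Derived only from an assumed written spec: a strong signal shows
    # full bars, no usable signal shows none, and the result is always 0..4.
    return (
        signal_bars(-60) == 4
        and signal_bars(-130) == 0
        and all(0 <= signal_bars(dbm) <= 4 for dbm in range(-140, -40))
    )


def white_box_checks() -> bool:
    # Derived from reading the code: hit every branch and both sides
    # of each threshold comparison.
    cases = {
        -80: 4, -80.1: 3,
        -95: 3, -95.1: 2,
        -105: 2, -105.1: 1,
        -115: 1, -115.1: 0,
    }
    return all(signal_bars(dbm) == bars for dbm, bars in cases.items())


if __name__ == "__main__":
    print("black box passed:", black_box_checks())
    print("white box passed:", white_box_checks())
```

Both sets of checks can pass while saying nothing about how a person actually holds the phone.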

Neither of these testing modalities subjects the device to operation in a real-world environment in the hands of a consumer. For that you need usability testing with real consumers, not engineers. If Apple had done such a test, it's likely they would have spotted the fact that, as multiple videos on YouTube attest, the way a user grips the mobile phone determines how much its antenna reception is degraded.

Many companies seem to confuse market research with true usability testing. Usability testing is not gathering opinions from typical users about a device's features or whether or not it is hard or easy to use. Rather, usability testing as it is usually carried out involves watching people—usually under controlled conditions—trying to use something for its intended purpose.

For example, when testing instructions for assembling a toy, or using a cellphone, the test subjects are given the instructions to read, and then asked to proceed – that is, to build the toy or operate the cellphone. They are then observed in action, using the device correctly or incorrectly.

The trick in such usability testing is in your definition of typical users. Apple apparently thought trained engineers would be the best ones to conduct usability tests. For black-box and white-box testing, maybe.

But to get valid usability test results, who do you want? The user experienced in using your product or a similar one? The user who has taken the time to read the user manual? The user who has not read the manual? Or the mobile user from Hell, who has no experience at all and is apt to break the product when, through ignorance or frustration, he or she tries to make it work the way he or she thinks it should? And the use of a controlled environment further skews results by forcing users to work in conditions that do not necessarily reflect the real world.

The iPhone 4 is not the only example where all of these more traditional forms of testing fall short. My life is full of consumer devices, from remote TV channel changers to mobile phones, that do not seem to have been subjected to any serious usability testing. Using a new cellphone is like joining a secret society with a secret handshake. The handshake may be somewhat similar to the one in another secret society, with a common set of movements and procedures that require passwords, but the sequence, rhythm and the timing may be completely different.

The secret handshake for an Apple iPhone, for example, is very different from the ones used on any number of Android sort-of look-alikes that are becoming available. In such an ad hoc world of consumer product design, where there is no common set of procedures and protocols that everyone agrees to, maybe a more ad hoc, free-form method of usability testing is necessary as well.

What is needed is usability testing under completely uncontrolled conditions, in the field, watching users in their own environments, trying out devices they think they know how to use but do not. In the case of mobile phones, the way a consumer thinks it should be used may not be the way that particular device should be used.

I have done a lot of reading on all the different types of white box, black box and usability testing and I can't find anything in the formal technical literature that resembles this kind of freeform testing.

But that does not mean it does not exist and is not being used. I know for a fact that it does. The best practitioners of the kind of usability testing I'm suggesting were the field engineers I hung out with in the early 1990s while in Japan visiting some of the major consumer electronics companies.

A major portion of their job was to go into the consumer electronics stores that lined the streets in the downtown areas of many of the major cities. We watched how consumers of different types used the devices they were considering buying: how much time and how many steps did it take to complete basic tasks? How many mistakes did they make? Were the mistakes fatal; that is, did they break the devices or just temporarily put them out of action? Was it possible for them to easily back out of a situation and try something else? What was their emotional response: frustration? anger? confidence? stress?

When the engineers talked to the users they were observing in these totally uncontrolled situations, they did not ask their opinion of what and how to improve the device. Nor did they work from any prepared list of questions. Rather, their queries emerged out of their observations: why did you take this or that action? Were you aware of this or that function?

After a day of all of this ad hoc, informal user testing and observation, the engineers then got together and discussed what they had seen users do "in the wild" and what the users told them about what they did and why they had done it that way.

But the engineers did not then do what I expected them to do, which would have been to try to quantify their results for a report to be submitted to management, who would then try to derive some general principles.

Instead they kept their focus on the specifics of what the users had to tell them and redesigned their products to make them easier to use. Not only that, they tried to make it impossible for the consumer devices to be used in any way except the correct way. If I had closed my eyes, I would have mistaken them for field anthropologists and social psychologists observing some remote tribe in the jungle.

Nowadays, with only a few exceptions (most of them Japanese), few if any of the consumer products and mobile devices—Apple's iPhone included—seem to have been subjected to the kind of freeform usability testing I saw practiced by those engineers. My closets are full of devices that can attest to that.

The reasons probably have to do with cost, time to market, and competitive pressures. No doubt this ad hoc, free-form and individualistic approach to device usability testing is expensive and time-consuming. But as Apple's iPhone 4 experience illustrates, what you "save" early on in time, effort and money could come back to haunt you in lost reputation, lost customers and losses in company valuation on the stock market.