[Analysis] Minuum and the Quest for a Better On-Screen Keyboard

Update: It looks like they reached their $10,000 funding goal within a day. I guess it’s time for my dancing cookie-dispensing robot idea to greet the world…

Everyone’s aflutter about Minuum, the on-screen keyboard concept looking for funding on Indiegogo.  The reactions fit into the usual classifications: this sucks, this is stupid, this is amazing, this is genius, this will change the world, this has no hope, try mine instead, you’re stupid, you’re stupid but I’m smart.  Very informative.

Continuing my quest to over-analyze everything as though it were a fine wine (or decent winegum), I provide you with my initial analysis of their initial promo material, initially initialed intricately in triplicate.

The Analysis

A writer of type.

They hate typewriters.  Or, or, they badmouth typewriters but like to show them in their fundraising video.

I know, it’s just a marketing video, it’s a commercial, it doesn’t represent their intellects or their capabilities.  But it is annoying to hear the same half- or quarter-truth repeated by designers promoting their latest interface improvement.  The fractional truth in question: the influence of typewriters on modern interface design.

It’s almost obligatory for someone to mention typewriters when presenting a new interface design – especially anything keyboard-ish.  The argument goes something like this:

  • typewriters are over a century old
  • they had a big problem with keys getting stuck together
  • so they made the layout less efficient and slowed everything down
  • modern devices aren’t anything like those old wing-dingers with their cogs and cranks
  • therefore it’s stupid or at least strange to make modern interface devices work in some way similar to those old contraptions
  • it may even be treason

To which the proper response is “Yes, but…”

Yes: It’s true that the QWERTY layout isn’t optimal in terms of key location relative to letter frequency (the more commonly a letter is used in English, the closer it should be to a fingertip; the least commonly used letters should be farthest from the fingertips, kinda sorta), and it’s true that modern keyboards and on-screen/virtual keyboards don’t have the mechanical issues that called for the use of QWERTY.
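To make the frequency point concrete, here’s a quick sketch; the sample sentence is my own invention, not a proper corpus, so treat the exact counts as illustrative only:

```python
from collections import Counter

# Rough illustration: in ordinary English text, common letters like 'e'
# and 't' vastly outnumber rare ones like 'z', which is why key placement
# relative to letter frequency matters for layout efficiency.
sample = (
    "the quick brown fox jumps over the lazy dog while the eager "
    "typist tests the theory that letter frequency should drive layout"
)
freq = Counter(c for c in sample if c.isalpha())

print(freq.most_common(3))
print(freq["e"], freq["z"])
```

Run it against any reasonable chunk of English and the usual suspects float to the top while letters like z and q sink.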

But: there are oodles of good reasons to use some typewriter-related concepts, there are many ways that on-screen keyboards are fundamentally inferior to typewriters, and it’s misleading to invoke the typewriter in comparison to your product without elaborating.

The QWERTY layout is really the only thing that an onscreen keyboard takes from the typewriter. The relative size and separation of the keys on the screen is to make targeted touches easier for the user – they can easily judge whether they’re between two keys, directly on one key, or somewhere else.  Physical keyboards and typewriters give us all sorts of tactile feedback that we don’t get on a screen, so it’s hard to touch type.  We just can’t feel precisely where our fingertip is on the virtual keyboard; there are no raised edges, no valleys between keys, no concave surface to invite a fingertip in for a rest.  This loss of feedback has a much larger impact on interface efficiency than is generally recognized, and I’ll be addressing it in a future article.

So the user gets no tactile feedback cues to guide the finger placement.  That’s a negative for any on-screen keyboard, but at least they all have it in common.  What, then, separates the good screenboards from the okay, the okay from the bad?

As always: it depends.  There are all sorts of objective and subjective ways to measure and compare screenboards, but which measures really matter?  Minuum’s premise is that the default style of screenboard is usually something large with typewriter-like layout and separation between the keys, something that often covers half of the screen in a way that is distracting or otherwise negatively affects users, so it would be of benefit to have something that is functionally equivalent to a big screenboard but much smaller and less obstructive.  I agree that the large boards are obstructive and disrupt the flow of the experience, but I have some issues with their solution…

Even though the half-screen virtual keyboards eat up so much space, the user is able to trust that the keys will always be in the same locations on the screen, no matter what they do (except for switching to alt characters, number boards, etc.), and pressing a key always results in that position’s character being added to the input buffer.  The Minuum type of predictive entry starts as a sort of compressed QWERTY board which lets you choose a “first candidate” character.  A mini-board pops up above the first board and includes guesses about what character you were actually trying to hit; this can be characters to the immediate left and right, or the next letter of a word that it thinks you’re trying to spell.  It’s not obvious from the video whether a second selection is necessary if the first guess was correct; it could just wait for a delay and then push the guess onto the input buffer.
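Since the video leaves the details open, here’s a purely speculative sketch of how such a two-stage selection could generate its mini-board: the neighbours of the first-candidate key plus any letters suggested by a word list.  None of these names or rules come from Minuum; they’re my guesses at the general shape.

```python
# Speculative sketch of a Minuum-style mini-board (NOT the real
# algorithm): the first touch on the compressed one-row board yields a
# "first candidate" letter; the pop-up then offers that letter's
# immediate neighbours plus a dictionary-based next-letter guess.
ROW = "qwertyuiopasdfghjklzxcvbnm"  # flattened layout, order is illustrative

def mini_board(first_candidate, typed_so_far, dictionary):
    i = ROW.index(first_candidate)
    neighbours = ROW[max(0, i - 1):i + 2]  # letter to the left, hit, right
    candidates = list(neighbours)
    # Add the next letter of any dictionary word the prefix could start.
    prefix = typed_so_far + first_candidate
    for word in dictionary:
        if word.startswith(prefix) and len(word) > len(prefix):
            nxt = word[len(prefix)]
            if nxt not in candidates:
                candidates.append(nxt)
    return candidates

words = ["hello", "help", "helmet"]
print(mini_board("l", "he", words))  # ['k', 'l', 'z', 'p', 'm']
```

Note how the output mixes position-based guesses with prediction-based ones, which is exactly why the mini-board can’t be static.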

The point here is that flat QWERTY is the only constant part of the board; the virtual keys are lined up shoulder to shoulder in one long, thin row and it would be difficult to choose the desired key on the first click. The mini pop-up board’s contents are not static – they can change depending on tiny differences in finger position on the first pass and depending on predictions about the word or string you’re trying to type. This means that the only constant part of the board is hard to use on its own, and that you’ll have to do a two-stage selection using a board that isn’t static. 

I’m not saying that this won’t work or anything like that. I’m just saying that the way this operates goes against some UX principles at first glance. If the prediction algorithm works well, you’ll be saved a lot of extra key presses, and that’s good; after typing the first 5 letters of “antidisestablishmentarianism”, it lets you click on the finished word and saves you all that isestablishmentarianism.  If you’re typing a lot of non-dependent (non-predictable) text or strings, like alphanumeric passwords or LULZSPEAK TXTING LING0, you’ll have to more actively scan the mini-board for the correct character (since you won’t know what characters it will include) or use the “magnifier” feature (which is really a 2-stage board without the prediction feature).

In general, the more the user has to actively think about something, search through sets, make judgments, etc., the less optimal the interaction will be. If the board layout remains constant and the fingertips are moving to a fixed location each time for a specific key, the process becomes less and less a conscious task. Physical keyboards are great for this because the keys are always in the same absolute position and there are many little tactile and auditory clues and cues that feed back to the motor control, helping to make precise key presses without needing to visually track the finger’s position or do any conscious processing.

Now, I must stress that I don’t have any more information about Minuum than anyone else who has only seen the promo video, so I’m speculating about some of the details and about what manner of beast will be the final product.  Feel free to point out any glaring mistakes in my reasoning or understanding.

I wish them good luck in fundraising and good luck in the market.

Android Phone Goes Inky. E Inky, Prototypically Speaking.

Wow, what a great headline…

I read an article at Laptop Mag regarding a prototype Android phone that uses an E Ink display.  My inner critic decided to outwardly criticize, producing a rather lengthy blog comment.  I reprinted the comment here on my own blog because… well, why not?

Laptop Mag’s hands-on demo:

My response:

Notwithstanding the super-light weight and super-long battery life that E Ink affords this device, the display is a showstopper. The talk about using an older processor is a red herring; a faster processor won’t fix fundamental characteristics of the display. The currently available generations of E Ink give you a trade-off between refresh speed and power consumption; crappy refresh rates mean long battery life, fast refreshes are draining.

The E Ink screen is great for displays that don’t require rapid refresh, but this prototype demonstrates how inappropriate it is as a smartphone’s primary display.

Motofone F3

When you buy an Android phone with multi-touch, the implication is that you’ll be interacting using finger swipes and taps, and that your interactions produce feedback quickly enough to make the experience seem natural and effortless.  With E Ink’s slow refresh, what we think of as normal single- and multi-touch functions would lose much of their utility; pinch-to-zoom, for one, would become a noticeable series of zoom-in steps (instead of a fluid growing and shrinking effect), something you could achieve with a zoom-in button and a single finger.

I’m not trying to bad-mouth E Ink, here – this is just not a viable application until/unless E Ink rolls out a display that gives you imperceptible refresh without massively increasing power consumption, hopefully at a reasonable price.

It would be cool to have the option of swapping your phone’s display, either physically changing it for another one or flipping one over the other like a book cover. There are times when I wish my display was e-paper, but then I look at my Motorola F3 and all is forgotten.


LG Support Super Happy Fun Time

Who doesn’t love an easter egg hunt?

Staring at the refurb LG television on my desk, I felt the need to check its customizability, or “hackability” for those wearing rollerblades.  Before any of that could happen, I wanted to find precise specifications and descriptors for the TV to help my search.  The logical place to start was the manufacturer’s support site…  Corporate product support sites are universally craptastic, but LG has a way of making theirs even more frustrating.

Model 22LG30 LCD TV

Example: the exact model number stamped on the back of my TV isn’t listed in the product search.  I have a 22LG30-UA.  When I visit LG’s Canada support page* and enter “22lg30-ua” I get no results at all from the quick menu or drop-down menu.  Hmm.  That’s not a good sign.  Clicking the Search button brings me to a results page that purports to show close matches to known products.  But there are none.  Zero matches for product, tutorials, or frequently asked questions.

*Strangely, I initially landed at the UK support site.  I can’t say that this was LG’s doing since I performed a Google search instead of entering the basic lg.com URL, but I didn’t realize I was at the wrong site for my region for a few minutes.  A “you appear to be in Canada, would you like to visit their site instead?” message would have been appreciated.

Playing the game, I try a less specific search term, “22lg30” (case isn’t important) and I get this from the quick menu:


Notice the total lack of “22LG30-UA” results.  This time, at least, I have some leads.

This is a clear UX failure; you’ve asked me for a model number, I gave it to you verbatim, you tell me there is no such product.  One of us is lying or misinformed.  I can appreciate that they have oodles and oodles of model numbers and that running a support site doesn’t generate revenue, but somewhere in the corporate databases there must be a master list of model numbers that could be dumped to the support site.  Then, at least, a user would have the luxury of finding that his television really does exist.

So, I have two possible matches for my model, “22LG30DC” and “22LG30DC-UA”.  What do these mean?  What is the difference between a model that has “UA” and one that doesn’t?  Is there a default, generic result that I should try first?  There are many ways to help me, the frustrated user, complete my task, but I’m left to click through each link.

I clicked the results in order, looking through the first result, then back to look at the second.  The pages were, in every meaningful way, exactly the same and looked like this:


There was no information about region specifics (is this a UK model, a Canadian, a German?), no explanation of the “UA” suffix, no information about release year or years, no mention of product family or relation to other products.  There wasn’t even a picture of the TV!  All you get is a generic, slightly ghosted flat panel TV image, which is quite unhelpful when the user wants to know if it’s his television and, naturally, there is no caption or asterisk telling you that it isn’t a picture of your device.

The Help Library section, which one expects to have tips about the device, includes gems like:

  • Sharing Files & Folders – Windows Vista OS
  • Smart TV – Resetting of Netflix Premium Application
  • DLNA not supported on Macintosh Operating System

None of these is related to the product on the page.  Oh well.  Let’s check the manual and get all of the information we want:


THANK YOU!  Not even a manual to peruse for the 22LG30DC-UA model.  The 22LG30DC model does list a manual, a PDF document (sigh) which appears to match my product.  PDFs are annoying in all sorts of ways, but at least I do, eventually, get the info.

My favourite part of this whole exercise?  Finding LG’s USA support site.  It has an exact listing for “22LG30-UA” with the correct product image (top of this post, source lg.com), a spec sheet with information not found in the manual, and different Help Library information that is also unrelated to the product.  Parfait.

Why is the support database balkanized into separate regions like this?  It makes a certain logical sense for each region to list only the models actively sold (and therefore supported) in that region, and it will probably have no negative effect on most users, but there are many realistic and recorded scenarios where users find themselves unable to get what they need.  From the outside, I can’t know the real reasons for the regionalized nature of LG’s support system.  I would not be surprised, however, to learn that no real usability analysis or user testing was performed, and that support was organized according to the structure of the companies involved rather than a genuine effort to provide a service to the customers.

The whole foundation of good user experience design is knowing your users, and anyone who has really tried to know their userbase has discovered a heterogeneous group of people with different expectations and different ways of solving their problems.  Accepting that reality, a good designer must account for these different expectations and methods, finding ways to accommodate and assist.  You can’t make every task the press of a single button, nor can you make every user act according to your plans, but you can offer suggestions (“You might also check our other regional support sites”), useful information (“The sections of the model number refer to this year, this family, this region, this revision, etc.”), and more agency (“Enter this, press that button” vs “If you know X or part of X, you can search for Y here.  You may also try these other methods, or follow our tutorial, etc. etc.”).  A little consideration can build a lot of customer satisfaction.

LG Support has other ways to frustrate the consumer (not releasing updated firmware via the support page is a frequent complaint), but that’s enough for today.

My next post introduces us to the hidden world of the TV’s Service Menu.

Google Play says your username and password don’t match?

UX designers and coders take note: nothing will frustrate your users more than being asked for login credentials and being told that they’re wrong.

This is especially true when the user (me) is trying to enter a long alphanumeric password on a tablet with a stylus.  Every time the user sees “username and password don’t match”, they will naturally assume that they’ve hit an extra key or capitalized something accidentally, and will grumble to themselves as they try again.  Things get even more fun when the password field is masked with stars to prevent shoulder surfing.

It’s pretty easy to humble your user this way.  So easy, in fact, that you should spend time analyzing the user’s task to see if you’re asking them the right questions and giving them enough help…

Case in point: Google Play Store.  I have a very low cost (cheap) tablet on which I managed to load the Google Play packages.  When asked to login to my Google account, I received the very helpful response “username and password do not match”.  I attempted to login several times with my normal credentials and failed every time.  There were any number of reasons for this to have failed (including the fact that my tablet was unsupported, ahem), but the real reason was ridiculous:

I use Google’s two-factor authentication.

Logging in to Google from a new computer usually means entering my username, password, and then a 6-digit number that is sent to my cellphone over SMS.  If I enter the user/pass incorrectly, the error would be “username and password do not match.”  If I enter the 6-digit number incorrectly, the error would be something like “incorrect PIN.”  This is a straightforward proposition: enter your Google username, your Google password, and the PIN that Google sends to you; if you get something wrong, you entered the user/pass incorrectly, or you mistyped the PIN.
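The distinct-error behaviour is easy to sketch; all of the account data below is made up for illustration:

```python
# Sketch of a two-step login where a wrong password and a wrong PIN
# produce *different* error messages, so the user knows which step
# failed. Account data is invented for the example.
ACCOUNTS = {"joe": {"password": "JoeIsCool", "pin": "123456"}}

def login(user, password, pin):
    account = ACCOUNTS.get(user)
    if account is None or account["password"] != password:
        return "username and password do not match"
    if account["pin"] != pin:
        return "incorrect PIN"
    return "ok"

print(login("joe", "wrong", "123456"))      # username and password do not match
print(login("joe", "JoeIsCool", "000000"))  # incorrect PIN
print(login("joe", "JoeIsCool", "123456"))  # ok
```

The whole point is that each failure message maps to exactly one user action, which is what breaks down in the Google Play case below.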

Google Play’s device login, however, doesn’t mention anything about PINs or two-factor authentication.  A naive user, like myself, assumes that he must enter his normal Google username and his normal Google password.  But that’s wrong.  Normal username, yes, but you must enter your “application specific password”.

What’s that?  Rather than implementing the SMS PIN step, Google lets you create a sort of special password that you only use on mobile devices or desktop apps.  There are many good reasons for doing this; it’s extra security against rogue apps or compromised devices (not exposing your main Google credentials), it saves developers using Google APIs from having to rework their products, and the application specific password is only made of lower-case letters so that mobile users won’t have to fiddle with entering special characters.
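Generating such a password is trivial; here’s a sketch, with the length of 16 being my assumption rather than a documented value:

```python
import secrets
import string

# Sketch of generating an application-specific password: random
# lower-case letters only, so mobile users never have to hunt for shift
# keys or symbol boards. The length of 16 is an assumption, not a
# documented Google value.
def app_specific_password(length=16):
    return "".join(secrets.choice(string.ascii_lowercase) for _ in range(length))

pw = app_specific_password()
print(pw)
```

Restricting the alphabet costs some entropy per character, which is simply bought back by making the string longer.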

Good reasons, all of them.  But it all falls apart at the user interface.  Users are dependent on the UX designer to give them the information they need for the task.  Failing to mention that “password” could mean “application-specific password” is a big omission.  Google’s support site does mention the issue, and users of 2-factor authentication are told in advance to expect this behaviour, but that doesn’t cut the mustard.

Now, back to my under-powered plastic tablet and its slight violations of terms of service…

Nintendo’s Wii U in Paper Prototype Form

Prototyping is an indispensable tool for design.  You have an idea of what you want your product to do, who the user will be, and what the product might look like, but you need feedback.  Feedback from users, designers, and other stakeholders, will tell you if you’re on the right track.  Hand a user a prototype and ask him to perform a task; you’ll quickly learn how many of your assumptions were right.

The key to an effective and efficient design process is to prototype early and often; the earlier you produce and test prototypes, the easier it will be to implement changes in design, and, ultimately, you will get a better product.

Gamasutra has an article about prototyping an app for Nintendo’s Wii U.  The developer wanted to see the interface in the real world and be able to touch the device, putting himself into the user’s shoes.

His solution involved bits of cardboard and glue.

This kind of paper prototyping is fast, cheap, and very powerful.  Within minutes you have something you can put in your user’s hands (or just in front of him) that can be manipulated, modified, or torn up without much grief.

How to stop HP printers from grabbing drive letters

A few months ago, I purchased a used HP Officejet Pro L7780 for my parents.  It was quite an upgrade from the little Epson all-in-one that they had been using for the past few years.  But there was a problem…

The software and drivers are painful to use.

I don’t know what kind of UX work went into this stuff, but it wasn’t enough.  The drivers aren’t easy to install (especially for the scanner function), errors are cryptic and have a morbid finality to them, and a lot of the software’s behaviour isn’t user-customizable.

My biggest gripe, outside of the installation problems, is with the network mapping feature. The printer has a set of media card slots (SD, compact flash, etc.) that can be mapped to a drive letter on the user’s computer.  For some reason, known only to HP, the mapping isn’t persistent and it isn’t controlled by the user; that is to say, the mapping has to be re-established each time the system boots, and the user can’t tell it which drive letter to use.

HP’s kludgy solution to the persistence issue (which is odd since persistent mapping is a feature in many operating systems) is to run a service at boot time.  The service checks for available drive letters starting at Z and working backwards.  When it finds one, it assigns it to the printer’s card slots.  This means that no matter how you arrange your drives, the printer’s card slots will always show up somewhere in your drive list.  It also means that the card slots can bounce around the drive listing with no fixed address.
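The search itself is simple to sketch; the set of letters already taken below is made up for illustration:

```python
import string

# Sketch of the Z-to-A drive-letter search described above: scan from Z
# down to A and take the first letter not already in use. The "taken"
# sets here are invented examples.
def first_free_letter_z_to_a(taken):
    for letter in reversed(string.ascii_uppercase):
        if letter not in taken:
            return letter
    return None  # no free letters at all

print(first_free_letter_z_to_a({"C", "D"}))       # Z
print(first_free_letter_z_to_a({"C", "D", "Z"}))  # Y
```

You can see why the mapping bounces around: the answer depends entirely on what else happens to be mounted at boot time.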

For most of us, this isn’t a practical problem, just an annoyance.  I can see that this behaviour would be beneficial in some situations.  For instance, a novice user won’t be able to accidentally block access to the card slots by assigning their preferred drive letter to another device.

Personally, I want to be asked for my preference and I want to be able to change the software’s behaviour.

There is no way to use HP’s software to assign a preferred drive letter.  It will always do the search from Z to A.

Stop the HP mapping service

The network drive mapping is done by a service called “HP Network Devices Support”.  By default, the service launches when Windows boots.  The easiest thing to do is to disable the service completely.

Open up the Services management console.  In Windows 7, click on the Start button, type services.msc then press the Enter key.

Scroll through the list until you find HP Network Devices Support.

You can see that the “Startup Type” is set to Automatic (Delayed).

Right-click on “HP Network Devices Support” and left-click on Properties.

Left-click on the Startup Type drop-down box and select Disabled.  Click Apply.  Now turn off the service by clicking on Stop.  Now click OK.

When you’re done, the Services console should look like this:

Okay, you’re done!  The HP software will no longer try to map your printer’s card slots.  Please note that you will still get pop-ups from HP software telling you that your printer is disconnected.  If you want to stop those notifications completely, go back to the Services console, then Stop and Disable the following services:

  • HP Cue DeviceDiscovery Service
  • HP Service
  • hpqcxs08

If you still want access to the media slots, read on.

Making a permanent mapping

Under Windows 7, setting this up is quite easy.

Click the Start button, then click on Computer in the menu that appears.  You should see a list of drives.

There will be a set of links near the top of the window which say Organize, System properties, etc.  Click on the one that says Map network drive.

Choose your preferred drive letter from the list.

For this next part, you need to know the IP address or the network name of your printer.  The network name is best, since the printer’s IP may change if your router uses DHCP to assign addresses.  The network name will stay the same.

If you’re unsure of the IP or the network name, check your router’s setup.  It should have a list of connected devices.

Click in the text field next to Folder, then type two backslashes, followed by the IP or the network name of the printer.  Now click the Browse button.  A dialog window should open with a list of network devices.  If your printer appears in the list, click on the triangle next to it to reveal a folder named “memory_card”.  Click on “memory_card”, then OK.

To make this mapping permanent, click the checkbox next to “Reconnect at logon”.

The printer should now be listed in Computer with the drive letter you chose.
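For the terminally mouse-averse, the same mapping can be expressed as a single `net use` command.  This sketch only builds the command; “HPPRINTER” is a placeholder for your printer’s actual network name:

```python
# Builds the `net use` command equivalent to the GUI steps above.
# "HPPRINTER" is a placeholder hostname; substitute your printer's
# network name. The command itself would be run in a Windows shell.
def build_net_use_command(drive_letter, host, share="memory_card"):
    return ["net", "use", f"{drive_letter}:", rf"\\{host}\{share}", "/persistent:yes"]

cmd = build_net_use_command("P", "HPPRINTER")
print(" ".join(cmd))  # net use P: \\HPPRINTER\memory_card /persistent:yes
```

The `/persistent:yes` switch does the same job as the “Reconnect at logon” checkbox.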

Some User Experience Mistakes

  • Unexpected behaviour:  the printer’s card slots are storage devices, but they are behaving unlike other storage devices.  When the user adds a new USB stick or other memory device, Windows either asks for a preferred drive letter or it assigns the next available drive letter sorted from A to Z.  The HP software doesn’t ask the user for a preferred letter and chooses the next available letter sorted from Z to A.
  • No choice / lack of choice:  there is no way for a user to change the drive mapping behaviour by using the printer’s software.  The user is forced to either live with it or disable the mapping service entirely.  The card slots can be manually mapped to a specific drive letter, but this is an advanced procedure that most users couldn’t do.

Windows Password Hints: Big Deal?

Jonathan Claudius and Ryan Reynolds, white-hats in the security game, have discovered a registry key in Windows 7 and Windows 8 that contains password hints, the little reminders that pop up when you try to log in and make a mistake.  The hint is supposed to be an additional cue to a user’s recall, one that is meaningful to the user but not to anyone else.

There’s some debate about the significance of this discovery.  On one hand, the hint is freely available to anyone trying to log in to an account, so it’s not meant to be privileged knowledge.  Anyone can try to log in to any account, mash the keyboard to make a random guess at the password, and get that account’s hint in return.  On the other hand, the hint is crafted by the user with the goal of helping him/herself to remember something.  The amount and kind of information they can put into a hint has the potential to make guessing much, much easier, and not just for the user.

Claudius and Reynolds’ discovery shows us that anyone or any program with access to the registry can read the hint.  How is this beneficial for an attacker?

For one, no failed login attempt is needed to get the hint.  If I walk up to a machine, click on someone’s account name, and type a random password to get the hint, that login attempt can be logged.  In the aftermath of an attack, the log entry can tie me to a particular place or network connection at a particular time.  Even if there hasn’t been an attack, it’s a possible sign that someone other than the intended user has probed the system.  Reading the hint from the registry leaves no such trace.

The fact that the hint is stored essentially in plaintext is understandable given the way Microsoft intends it to work (it’s given out prior to authentication, so why bother encrypting it for storage?), but it’s a bad, bad, bad, bad, bad idea for security purposes.  A user could practically spell out their entire password in the plaintext hint and kneecap the security model altogether.
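To illustrate what “essentially in plaintext” means here: the hint is reportedly stored as hex-encoded little-endian UTF-16, which decodes straight back to readable text.  The sample bytes below are made up:

```python
# The hint is only trivially obscured: hex-encoded UTF-16LE bytes
# decode directly to the original text. This blob is an invented
# example, not data from an actual registry.
blob = bytes.fromhex("6d007900200064006f0067002700730020006e0061006d006500")
hint = blob.decode("utf-16-le")
print(hint)  # my dog's name
```

No key, no hashing, nothing standing between the bytes and the message.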

How can this be fixed?  Well, ask the user to remember something before you give them the hint.  Show them a set of images, one of which has been pre-selected by the user.  The user is asked to click their special image to reveal the hint.  Only after this soft authentication would the hint be decrypted and revealed.  The image would (or should) have absolutely no relationship to the password and no relationship to the hint.  This would be a small barrier for an attacker, but even a small barrier could profoundly reduce the incidence and success of casual or automated attacks.  The additional cognitive burden on the user should be very small, certainly much smaller than that of a password.
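A toy sketch of that flow, with XOR standing in for real encryption and all names invented:

```python
# Sketch of the image-gated hint: the hint is stored encrypted and only
# decrypted after the user clicks their pre-selected image. XOR with a
# fixed pad stands in for real encryption; do not use this for real.
PAD = b"not-a-real-key-just-a-demo-pad!!"

def xor(data, pad):
    return bytes(b ^ pad[i % len(pad)] for i, b in enumerate(data))

stored = {"chosen_image": "sailboat",
          "hint_encrypted": xor(b"first pet", PAD)}

def reveal_hint(clicked_image):
    if clicked_image != stored["chosen_image"]:
        return None  # wrong image: the hint stays sealed
    return xor(stored["hint_encrypted"], PAD).decode()

print(reveal_hint("mountain"))  # None
print(reveal_hint("sailboat"))  # first pet
```

The point is structural: under this scheme, a registry scraper gets ciphertext instead of the hint.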

Some have called this discovery a non-issue, and in many ways this changes nothing about the Windows security model, but it does highlight some bad characteristics of the model itself.

Microsoft Thinks Two-Factor Authentication Isn’t Important

Reps for Microsoft’s new Outlook.com service suggest that strong passwords and vague statements about R&D are enough to protect their users.

Mashable questioned Microsoft about Outlook.com’s security and was told that, unlike Google, two-factor authentication will not be implemented.  Google does not require use of two-factor authentication, but they do offer it to users on an opt-in basis.  Microsoft’s decree that they won’t even offer an opt-in service is disappointing to say the least, and will very likely come back to haunt them in the following months/years.

The tone of the MS rep’s comments gives the impression that two-factor-auth is a sort of anachronism or secret handshake — something only a Spock-eared nerd or snobby IT elite would encumber himself with.  Whether they take issue with the two-factor concept generally, or Google’s implementation specifically, is unclear.

The rep’s case boils down to two propositions:

  1. (Google’s?) Two-factor auth creates a bad user experience.
  2. Strong passwords and unspecified future schemes are secure enough.

Let’s examine these in more detail.

Google’s two-factor scheme works like this:

  • Joe Blow has a Google account which stores email, browser passwords, documents, and other private stuff
  • Joe tells Google that his password will be “JoeIsCool”, that his cell phone number is 867-555-5309, and that he wants to use two-factor authentication
  • If Joe accesses his Google account from his home computer or other trusted machine, he may be asked to enter his password, “JoeIsCool”, once a day or every other day; Joe enters the password and Google lets him in
  • If Joe wants to use his Google account from an internet cafe or untrusted computer, Google will ask him to enter his password, but it will also send a code to his cell phone; he must enter both tokens (password and one-time code) before Google will let him in
  • If Hacker Henry discovers Joe’s password, he isn’t able to break in to Joe’s account since he doesn’t have Joe’s phone and thus can’t receive the one-time code; what’s more, Joe is now alerted that someone is trying to break in

To me, the biggest usability barrier in this scheme is getting the initial user buy-in.  The user needs to know that the option exists, that it is important, that it is good, and that it is easy.  If the user opts in, use of the two-factor scheme is fairly straightforward; Joe attempts to log in from an untrusted computer, he enters his password correctly, he receives a text with a number in it, he enters the number, he’s in.

Think about it this way: he only needs to remember his password.  Whether he uses two-factor or not, he only needs to remember his password.  If he uses two-factor, the system itself gives him a token and asks for it back — no additional memory burden.  Compare this to the most common method of enhanced authentication: testing your memory.  When I log on to my bank’s website from an untrusted computer, the site asks for the usual stuff and then tests my memory about certain things.  What is your favourite movie?  What was your first-grade teacher’s name?  What is your pet’s name?  These seem like simple questions to answer, but there be dragons here.  You may remember the facts just fine, but must also reproduce the answer you gave when you created the account.  This gets into issues of case-sensitive tokens, ambiguous questions that can have several perfectly sensible but incorrect answers, etc.

How is that not a giant usability minefield?  Google prompts you with the exact token it wants, zero memory burden on the user (aside from the password, obviously) each and every time it asks you to authenticate.  A memory test, by comparison, is always a memory test; each time you are asked to remember something, you have to perform a search in your memory or in your little black book of things you can’t be bothered to actually remember.

One potential pitfall of Google’s two-factor is that the user must have access to his or her phone during authentication.  In 2012, I don’t see this as an unreasonable requirement, but there’s always that one day when you’ve lost your phone or its battery has died right when you need to log in.

So, what authentication scheme will Microsoft employ that is plenty secure and user-friendly?  They aren’t saying.  The rep assures us that they’re pouring money and effort into R&D on this matter, but I can’t see them inventing a totally new scheme that satisfies both ease of use and strong security.  I’m expecting a reliance on strong passwords (on pain of death, little user!), the usual memory tests, and perhaps something gimmicky tacked on… Something graphical?

Keep watching the skies…

Dragon ID’s mobile unlock by voice

A brief but interesting story on GigaOm.

Nuance, the company behind the Dragon family of voice recognition products, is promoting a mobile app called Dragon ID.  The app acts as a replacement for standard user authentication schemes like PINs or swipe patterns by matching speech characteristics of a user against a known set of characteristics.  It’s the old “My voice is my passport” idea that we (or at least I) saw in “Sneakers”; the user speaks a phrase into the device, the device checks to see if the user’s speech has the same x, y, and z as the real Mr. User, and accepts or rejects the attempt.
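The matching step can be pictured as comparing feature vectors.  Real systems extract rich acoustic features from the audio; the sketch below reduces a “voiceprint” to a plain vector and an assumed similarity threshold — the 0.95 cutoff and the tiny vectors are illustrative inventions, since Dragon ID’s actual features aren’t public.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def accept(enrolled, candidate, threshold=0.95):
    """Accept the login when the candidate sample is close enough
    to the enrolled template."""
    return cosine(enrolled, candidate) >= threshold
```

A near-identical sample clears the threshold (`accept([1.0, 0.2, 0.5], [1.0, 0.21, 0.49])` is true), while an orthogonal one is rejected.  Everything interesting in a real product lives in how the features are extracted and how the threshold trades off false accepts against false rejects.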

At first blush this looks like a UX winner.  The user doesn’t have to remember any complicated passwords, PINs, or other meaningless tokens.  And it would be impossible for the user to lose his authenticator, his voice, except by disease or injury.

But there are some security considerations that must be satisfied for this to be an acceptable gatekeeper for a mobile device.  The most obvious weakness of this system is its vulnerability to a replay attack: literally replaying a recording of the user authenticating.  What countermeasures does Dragon ID use to prevent such simple attacks?  Presumably the audio recording is analyzed by Dragon ID to ensure that the voice is coming from a point directly in front of the device or headset microphone, but this would not be a robust defense.  Can it detect artifacts of digital audio reproduction?  Audio compression schemes like MP3?  Does it emit a one-time audio watermark via the speaker during recording so that a replay would be easily detected?  I’d certainly love to know.

Pattern matching is performed against an established set of phrases recorded by the user.  This simplifies the task of matching a candidate audio sample’s characteristics against a known set of characteristics, but it presumably reduces the amount of work an attacker would need to put into making a passable authenticator.  In a perfect world, the app would compose a unique phrase for each attempted authentication, each log-in, so that an attacker would have no real template for a “good guess”.  The attacker would need to know the user’s full range of accents, inflections, cadences, etc., in order to make a passable authenticator, and he would only get one shot at each phrase.  With a known subset of authenticators (like a decent recording of one successful authentication attempt), the attacker knows what the phrase will be for any future attempt and only has to polish his copy until it passes.
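The per-login phrase idea is cheap to sketch.  Assuming some pool of distinctive words (the list below is made up, not anything from Dragon ID), the app could issue a fresh challenge for every attempt, so a replayed recording of a past session never matches the current challenge.

```python
import secrets

# Hypothetical challenge-phrase generator: a fresh random phrase per
# login attempt makes a recording of an earlier session useless.
WORDS = ["harbor", "violet", "copper", "meadow", "sparrow",
         "granite", "lantern", "orchid", "timber", "summit"]

def challenge_phrase(n_words: int = 4) -> str:
    """Compose a random phrase for the user to speak at this attempt."""
    return " ".join(secrets.choice(WORDS) for _ in range(n_words))
```

The cost is on the matching side: the system must now verify the speaker from an utterance it has never heard, rather than comparing against a pre-recorded template of that exact phrase — a genuinely harder recognition problem, which is presumably why fixed enrolled phrases are used.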

Phrases can be disabled by the user or disallowed by the device or the Dragon ID servers for too many failed attempts, but this raises a question about the resistance of the app to multiple attempts.  The app surely only allows a certain number of attempts before either locking the device entirely, disabling a specific phrase, or forcing the user to authenticate with a password or some other non-voice token.  But how does it track multiple attempts? The app is required to work even when the device is completely disconnected from voice or data networks, so there must be some form of device-resident logging.  If the device’s memory is cloned before an attack, what prevents the attacker from reflashing the device into its previous state where the counter was at 0?  There are plenty of memory locations on a device to store counter information, and more clever ways than a simple variable in a LoginAttempts.dat file.  Is it possible to completely reset the state of the device to a set point such that an attacker could indefinitely attempt authentication?
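The rollback worry can be made concrete.  If the attempt counter lives only in device-resident state, an attacker who can snapshot and restore that state never hits the lockout.  A toy model (the three-attempt limit is an assumption for illustration):

```python
import copy

class Device:
    """Toy device whose only anti-guessing defense is a local counter."""
    MAX_ATTEMPTS = 3

    def __init__(self):
        self.failed = 0
        self.locked = False

    def try_login(self, voice_matches: bool) -> bool:
        if self.locked:
            return False
        if voice_matches:
            self.failed = 0
            return True
        self.failed += 1
        self.locked = self.failed >= self.MAX_ATTEMPTS
        return False

device = Device()
snapshot = copy.deepcopy(device)        # attacker clones the device state
for _ in range(Device.MAX_ATTEMPTS):
    device.try_login(False)             # burn through the allowed attempts
assert device.locked                    # lockout triggers as designed...
device = copy.deepcopy(snapshot)        # ...until the state is "reflashed"
assert not device.locked                # counter is back at zero
```

A counter kept server-side, or in hardware that can only increment, would close this particular hole; purely software state on the device cannot, no matter how cleverly it is hidden.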

Enlighten me.  I love this stuff.


Original article on GigaOm.