Now that the "Who owns the glass rectangle" smartphone wars are thankfully fading into the background of the news cycle, competition in interaction designs is coming to the forefront. Apple arguably kicked it off in '11 by integrating Siri, introducing voice control; as we saw yesterday, Google may push into backside touch; and now Samsung is introducing a host of different interaction designs with their latest model.
Unveiled last night, Samsung's new Galaxy S4 has "Smart Pause," which stops and starts videos depending on whether your eyes are looking at the screen (they are presumably tracked by the camera). "Smart Scroll" advances screen content when the user tilts the phone to one side or the other. "Air Gesture" allows users to manipulate the phone without actually touching it, but rather by hovering a finger over the screen, or using a broader gesture like a hand wave to advance photographs. (And it works while wearing gloves.) Lastly, "S Translator" enables you to speak one language into the phone, and have the phone speak back a translation into a different language.
While none of these features are a magic bullet that will instantly win the smartphone war, that's not the relevant point, to us. What we're glad of is that heated competition is producing a range of experimental ways that we can interact with devices. Apple's steady, measured development process is very different from Samsung's "throw it at the wall and see what sticks" approach, with Google somewhere in the middle, and we can't say which methodology is superior; but either way it's an exciting time for interaction design, and it is the end user who stands to win from all of these companies duking it out.
'Dare We Do It Real Time' by body>data>space (photo by Jean-Paul Berthoin)
Over an intensive two days at the end of the month, 100 delegates at MEX 2013—the international forum for mobile user experience, in its 12th iteration this year—will gather in central London to discuss and attempt to envision the development and future impact of mobile technology.
Last year's speakers included Dale Herigstad, four-time Emmy Award-winning creator of the iconic Minority Report conceptual user interfaces, as well as connected-car experts from Car Design Research. This year's event boasts inspiring input from the likes of Facebook content strategist Melody Quintana, WhatUsersDo UX research guru Lee Duddell and Ghislaine Boddington, creative director at experimental connected performance outfit body>data>space.
Insight - How should we improve understanding of user behaviour and enhance how that drives design decisions?
Diffusion - What are the principles of multiple touch-point design and the new, diffused digital experiences?
Context - How can designers provide relevant experiences, respect privacy and adapt to preferences?
Sensation - What techniques are there for enhancing digital experience with audible and tactile elements?
Form - How can change in shapes, materials or the abandonment of physical form be used to excite users?
Sustainability - How can we enable sustainable expression in digital product choices? Can we harness digital design to promote sustainable living?
Sam Dunne, Design Strategist at Plan and Core77 UK Correspondent, will be reporting live from the event.
MEX, Mobile User Experience
Wallacespace St. Pancras
22 Duke's Road
London, WC1H 9PN
March 26–27, 2013
Way back in '07, we learned Apple had patented touchscreens with interactive backs, meaning you could perform on-screen manipulations while keeping your finger out of the way. By 2010 we were calling it "backtouch" and (incorrectly) predicting the iPad would have it. Now that we'd given up hope on this UI technology ever hitting the market, Google is bringing our hopes up once more (even though we're afraid to love again).
We thought the whole point of patents was that they're not awarded to duplicate technologies, but apparently there's something in Google's secret sauce that makes it different. From a user standpoint though, the benefits appear the same: You tap the back of your phone or tablet, and that registers a hit on-screen, enabling you to manipulate apps or perhaps type.
We're curious as to how ergonomically sound this is, as the opposable thumbs my dog always complains about not having seem more agile than the fingers we'd use to access the back of a device. I just picked up my phone and spent a few minutes pretending to type on the back versus actually typing on the front, and while the former feels a little awkward, I already suck at the latter. (One sure benefit though, backtouch would leave fewer fingerprints on the glass.) Try it yourself, assuming you're not out in public and don't want to look like a tool, and let us know if you think backtouch has got legs.
When we last looked in on interaction designer Jinha Lee, he was developing the See-Through 3D Desktop for the Microsoft Applied Sciences Group. Last week Lee, who's pursuing a doctorate at MIT Media Lab's Tangible Media Group, posted a video showing a potential retail application for the set-up: Called WYCIWYW, for "What You Click is What You Wear," the interface would allow the user to virtually try on wristwatches and jewelry.
Apple notoriously applies for tons of patents, very few of which will make it into actual products. This one is interesting from a UI perspective.
You could argue either way, but let's say it's an ergonomic necessity to have an easy-to-locate Home button, as now exists on the iPhone. That button cuts into screen real estate. Is there a way for Apple to get rid of it, growing the screen, while still somehow offering the Home button's functionality?
The answer may lie in a patent Apple has secured involving the measuring of electrical capacitance of the body of a product. As an example, if this were incorporated in an all-screen iPhone, the user could simply squeeze the phone's housing as a means of input. That doesn't mean the body of the phone itself would have to deform; it just means that the phone's body would register the change in capacitance coming from a squeeze, and would turn that into some sort of command. Software would sort out whether the squeeze was purposeful or accidental.
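How software might tell a purposeful squeeze from an accidental one is not spelled out in anything we've seen, but the general idea is easy to sketch. The snippet below is purely illustrative: the function name, the 20 percent threshold and the five-sample hold time are all hypothetical, not taken from Apple's patent. It simply treats a squeeze as deliberate when the capacitance reading stays well above its resting baseline for several consecutive samples.

```python
def is_deliberate_squeeze(samples, baseline, threshold=0.2, min_samples=5):
    """Return True if readings stay elevated long enough to count as a squeeze.

    All parameters are hypothetical illustration values:
    - samples: sequence of capacitance readings over time
    - baseline: resting reading when the device is held normally
    - threshold: fractional rise over baseline that counts as "elevated"
    - min_samples: consecutive elevated readings required
    """
    elevated_level = baseline * (1 + threshold)
    run = 0  # length of the current streak of elevated readings
    for reading in samples:
        if reading >= elevated_level:
            run += 1
            if run >= min_samples:
                return True
        else:
            run = 0  # streak broken, e.g. a brief bump in a pocket
    return False


# A brief bump versus a sustained squeeze (made-up readings).
bump = [1.0, 1.3, 1.3, 1.0, 1.0, 1.0, 1.0]
squeeze = [1.0, 1.3, 1.3, 1.3, 1.3, 1.3, 1.0]
print(is_deliberate_squeeze(bump, baseline=1.0))
print(is_deliberate_squeeze(squeeze, baseline=1.0))
```

A real implementation would presumably need per-user calibration and some way to ignore an ordinary firm grip, which this sketch does not attempt.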
Apple Insider is speculating that the technology mentioned in the patent, which was granted several years ago, may pop up in the forthcoming iWatch.
Well folks, looks like 2013 is shaping up to be the year gesture control finally becomes available to the masses.
First up, the Leap motion controller that caused such a blog stir (we covered it here and here) will start shipping on May 13th, just about a year after they began taking pre-orders.
Hot on their heels—or forearms, I should say—is the Myo controller pictured above, an arm bracelet that you wear well above your wrist but below the elbow. Why the weird position? The Myo actually reads the electrical activity in your muscles, rather than relying on a camera.
This seems like a pretty smart approach, as the Myo can decipher complex finger gestures, flicks and rotations without requiring line-of-sight. That suddenly opens up a new world of interactivity that doesn't require the user be sitting in front of a camera-equipped computer, or dancing around in front of a Kinect. Peep this:
Looks amazing, no? If it works as advertised, it will have a much broader range of applications than the stationary Leap, and the Myo's price reflects that: The Leap's going for $80, while the Myo will run you $150. It's up for pre-order now and they're claiming it will ship later this year.
Jeroen van Geel was invited to participate in the Redux at Interaction 13 in Toronto. Speakers were invited to reflect upon the conference content on the last day of the conference. This is part of his reflection, combined with some afterthoughts.
Interaction design is a young field. At least, that's what we as interaction designers keep telling ourselves. And of course, in comparison to many other fields we are relatively young. But I get the feeling that we use it more as an excuse to permit ourselves to have an unclear definition of who we are—and who we aren't.
At this year's Interaction Design Association (IxDA) conference, Interaction 13, you got a good overview of the topics that are of interest to interaction designers. And I can tell you that, as long as it has something to do with human behaviour, it seems of interest. In four days' time there were talks and discussions around data, food design, social, health, gaming, personas, storytelling, lean, business and even changing the world. The topics ranged from the very specific task of creation of attributes to having an impact on a global scale. It shows that interaction designers have a great curiosity and want to understand many aspects of life. When we think we have an understanding of how things work, we have the feeling that we can impact everything. Of course this is great and we all know that curiosity should be stimulated, but at the same time this energy and endless search for knowledge can be a curse. Before we know it we become the jack of all trades, master of none. Interaction designers already have a lot of difficulty explaining their exact value. But where does it end? I don't know the answer, because I myself understand this endless curiosity and see how it helps me to improve my skills. Maybe the question is: are we becoming more a belief than a field?
The theme of Interaction 13 was 'social innovation with impact.' From this topic there were several presentations that focused on the role of interaction designers making the world a better place. Almost all designers in general, but every interaction designer specifically, wants to have this kind of impact. Over the last few years I've seen quite a few presentations at 'User Experience' conferences where a speaker enthusiastically put his fist in the air and proclaimed that the time has come for the interaction designer to make the world more livable. Everybody cheered, interaction designers rallied up with their sharpies and thought they could solve every possible wicked problem. They enthusiastically went back to their huge corporation or agency in the hope that the next day they would finally get this world-changing assignment from their boss. But of course it didn't work that way.
As Google's Project Glass moves closer to completion, they're making a two-pronged push to draw eyeballs, both figuratively and literally. For the former, they've released a video with actual footage captured through actual Glass prototypes:
It's funny how quickly I've become accustomed to the fineness of GoPro's smooth footage; Google's comparatively primitive video quality leaves no doubt that the footage is real.
Viewing the footage, we see Google pushing several applications:
1. Exciting athletic or action-packed POV footage, à la GoPro.
2. Voyeuristic or "memory-making" POV footage, as with the ballerina about to hit the stage, and folks playing with their children and dogs.
3. Practical real-time referencing, as with the ice sculptor pulling up images of his subject.
4. Hands-free photography.
5. Real-time sharing, à la Facetime, as with the man sharing footage of a snake with (presumably) his wife and child.
6. Real-time navigation.
7. Real-time translation (though I think choosing tone-based Asian languages like Chinese and Thai will present some implementation challenges).
What's interesting is that Glass promises such a broad range of applications—quite a different tack from Apple's approach of making their devices do a few things well. For us designers, the video raises questions of interface design: Glass presumably taps into Wi-Fi, so how do we access the network and enter passwords? Will the voice control work on a crowded sidewalk or a noisy train station? How fine is the camera's voice-prompted shutter timing? How, and how often, do users charge the device? And how do users get footage off of the device?
While it doesn't appear to be an outright Apple-Chevrolet partnership, Chevy has announced that their new Sonic and Spark models will offer integration with Apple's Siri. Called "Eyes Free Integration," Chevy's system will enable iPhone-toting drivers to initiate and answer phone calls, interact with their calendars, play music, hear transcriptions of incoming text messages, and compose outgoing text messages all by voice.
As per the context in which it's meant to be used, one of the system's touches purposely violates a cardinal rule of user interface design: Visual feedback. With Eyes Free, the phone avoids lighting up when interacted with, instead remaining dark to prevent your tendency to look at things that suddenly illuminate, so that you'll keep your peepers on the road.
Two Eyes-Free-compatible apps/hacks we'd like to see:
1. The KITT voice mod, which continually refers to you as "Michael" no matter what your name is.
2. An app that enables you to call out the license plate of the car in front of you that just cut you off. It automatically dials that driver's phone, and you can tell them exactly what you think of them without needing to roll the window down to yell it.
When I grew up in the '70s, all of New York City had the same area code; I could call from Queens to Manhattan, or vice versa, without having to dial "212" first. When "718" was finally assigned to the outerboroughs, there was a sort of bizarre pride that people took in having a "212" area code, which we from the outerboroughs of course thought was silly.
Interestingly enough, the number sequence "212" wasn't chosen randomly, but was a direct result of the design of the input device of the time: The rotary dial.
Touch-tone phones may have debuted in the early '60s due to John E. Karlin, but I grew up in a house that used rotary phones all the way into the '80s. It was only after we got our first touch-tone phone that I realized how slow the dial was—numbers with a 9 or 0 in them seemed to take forever, and maybe one out of ten times you'd screw the dialing up and have to start over. But "212" was always easy to dial.
As you can guess, when the North American Numbering Plan of the 1940s went about assigning area codes, "212" was assigned to New York City because it was a center of business, and businesspeople are by definition busy, and "212" is the fastest possible area code to dial; due to the way the switching equipment worked, the first and third digits could not be a "1," and the second digit had to be a "1" or a "0." So "212," at a total of five clicks on the dial, was the fastest.
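The click-counting is easy to verify. Here is a minimal sketch that takes the rules exactly as stated above (no "1" in the first or third position, only "0" or "1" in the middle) and assumes each digit costs its own value in clicks, with "0" costing ten since it sits at the far end of the dial. The function names are ours, purely for illustration.

```python
def dial_clicks(code: str) -> int:
    """Total rotary-dial clicks for a digit string; "0" counts as ten."""
    return sum(10 if ch == "0" else int(ch) for ch in code)


def candidate_codes():
    """Yield every three-digit code allowed by the rules described in the text."""
    outer_digits = "234567890"  # first and third digits: anything but "1"
    middle_digits = "01"        # second digit: only "0" or "1"
    for first in outer_digits:
        for middle in middle_digits:
            for third in outer_digits:
                yield first + middle + third


fastest = min(candidate_codes(), key=dial_clicks)
print(fastest, dial_clicks(fastest))  # the quickest code and its click count
print("718", dial_clicks("718"))      # for comparison
```

Under those assumptions "212" comes out on top at five clicks, while "718" needs sixteen.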
Of course, after the addition of "718," it was only us in the outerboroughs that enjoyed the speed of "212"—you Manhattanites had to wait for the "7" and "8" to go all the way around the dial. Suckers!
You may not know his name, but you know his work. John E. Karlin, who passed away in late January, essentially invented the touch-tone keypad. We take that ubiquitous input device for granted—it's on everything from cell phones to alarm systems to microwave ovens—but there was a time when that interface didn't exist, and no one knew what the "correct" design for quickly inputting numbers ought to be.
An industrial psychologist, Karlin was working for Bell Labs (AT&T's R&D department) in the 1940s when he convinced them to start a dedicated human factors department. By 1951 he himself was the director of Human Factors Engineering. In the late 1950s they sought a faster alternative to rotary dialing, and Karlin and his group developed the configuration we know so well today.
During the process they examined different options, of course. Aren't you glad we didn't wind up with this?
You might think Karlin simply took the calculator keypad and placed the smaller numbers up top. Nope—take a look at what calculators looked like at the time:
We've all seen it: the teenagers with one earbud in, feigning interest in conversation; iPad users brandishing the device like a radiation barrier to snap a photo; the veritable hypnosis of the "cell trance." In fact, maybe you're reading these very words on your smartphone, killing time in line while you wait for the next express train or your double-shot skinny latte. No shame in that—we all do it.
These behaviors and over 20 other digital gestures are duly catalogued in a research project conducted at the Art Center College of Design by Nicolas Nova, Katherine Miyake, Nancy Kwon and Walton Chiu, in July and August of last year. The four published their findings on our gadget-enabled society in an ongoing blog and a book [PDF] as of last September. "Curious Rituals" is nothing short of brilliant, a comprehensive index of the gestures, tics and related epiphenomena organized into seven categories of vaguely anthropological rigor. (The authors also extrapolated their findings in a short film of several hypothetical not-so-distant future scenarios, which I found rather less compelling than the book.)
While the blog illustrates their process—along with related videos and imagery—the final report, published under a Creative Commons Attribution-NonCommercial 3.0 Unported License, offers an incisive examination of "gestures, postures and digital rituals that typically emerged with the use of digital technologies."
Regarding digital technologies, [this endeavor shows] how the use of such devices is a joint construction between designers and users. Some of the gestures we describe here indeed emerged from people's everyday practices, either from a naïve perspective (lifting up one's finger in a cell phone conversation to have better signal) or because they're simply more practical (watching a movie in bed with the laptop shifted). Even the ones that have been "created" by designers (pinching, taps, swipes, clicks) did not come out from the blue; they have been transferred from existing habits using other objects. The description of these postures, gestures and rituals can then be seen as a way to reveal the way users domesticate new technologies.
Dan Hill of City of Sound offers a number of his own observations in his fluent introductory essay. The designer/urbanist/technologist sets the stage by taking a casual inventory of gestures, from the "wake-up wiggle" (impatiently jostling a mouse to awaken a sleeping computer) to iPad photography (which "feels awkward and transitional") and the instant-classic iPhone compass calibrator (later referred to as the "angry monkey"). I'd add that this last gesture looks something like twirling an invisible baton or fire dancing—or, incidentally, 'skippable rope' from Art Hack Day.
How do you project moving images onto water? That was the challenge faced by Red Paper Heart, a Brooklyn-based collective of designers and coders. Tasked by nightlife tracker UrbanDaddy with creating an event featuring "a memorable interactive experience in water," RPH decided to "create animations that partygoers could swim through."
Sixty-five thousand ping pong balls later, they had their solution:
The rumors were true, and we finally got to see the touchscreen cafe table produced by Korean manufacturer Moneual. It's officially called the Touch Table PC MTT300, and there's a little more to it than sticking a tablet on a table.
First off, the invisible stuff: It's an Intel/Windows 7/Android/Nvidia-powered affair, and features two hidden speakers, though the model hired to flog the table couldn't say what the audio was meant to accomplish—perhaps feedback for button touches? As for the visible, the screen has a resolution of 1920 x 1080. The demo models we saw all had the menu taking up the entire screen and oriented just one way; will it be split up and oriented for two people, or even four? Or must the menu be swipe-rotated towards each person who wants to order? Again, the rep didn't know. (I'm starting to get frustrated with this aspect of CES).
As for the physical design, the side of the table features two USB ports, a mic jack and a headphone jack. They're located underneath the table, presumably to avoid spills that run over the edges, and their presence is indicated by icons:
The forthcoming Touchscreen Cafe Table we posted on has had some good follow-up, and unsurprisingly, Moneual aren't the only ones to have visualized such a thing. Fans of the seminal '90s Japanese anime Cowboy Bebop may remember Spike and Jet ordering dishes off of a touchscreen restaurant table that presented holographic images of the dishes, and Core77 readers have chimed in with more examples. SCAD grad and interaction designer Clint Rule (update your Coroflot page please!) worked up a touchscreen cafe table concept video a couple of years ago, and at least one restaurant in London has something similar currently in existence. Whereas I was thinking of the table's potential purely as a transactional device, both Rule and London's Inamo eatery have taken it further.
To start with, Rule's concept integrates social features:
Inamo, an Asian-fusion restaurant in London's Soho district, opts for projection rather than touchscreen. Their system was created by a London-based company called E-Table Interactive, and though it's projection, it still contains some type of hand-tracking mechanism that provides similar functionality to a touchscreen.
At least, that's what Korean electronics company Moneual is hoping, with the rumored forthcoming release of their touchscreen cafe table. With a touchscreen integrated into a table, restaurants could do away with paper menus, instead displaying dish descriptions and photos on demand. Diners would never have to flag a waiter down. And with the NFC technology that Moneual will reportedly integrate into the table, you could pay the bill without having to wait for the check. You'd still need a runner to dole out the chow and a busboy to clean up afterwards, but as a former waiter myself, I'd wholeheartedly vote for an object that made the waiter obsolete.
The rumor mill says Moneual will pull the wraps off of the table at this year's CES, where it just so happens Core77 will be. We'll keep you posted if we come across it.
Whether or not you're interested in videogames, this device is kind of fascinating from an industrial design/interface design point of view. The PhoneJoy Play is essentially a portable input device with a slick mechanical design: The two holdable halves can spread sideways, connected by a telescoping mechanism. Your smartphone or mini-tablet can then be "docked" in the middle, and the variety of buttons and motion pads interact with your device wirelessly.
This past weekend I had a car trip to make into unfamiliar territory, and I finally got to try out the newly-accepted Google Maps on my iPhone.
Google has dumped a lot of time, money and effort into amassing and updating the world's best consumer-targeted map database, and generously provided it for free. I don't want to be one of those people that complains about free stuff, like the whiners who moan about Facebook features—what do you want, your money back?—but I do have to point out how a single poor design decision can needlessly hamper an otherwise great product.
Nearly every unfamiliar-destination car trip I've taken in the past three years has been guided by Google Maps. I have the directions in my phone, which I prop onto the dashboard, in "map" view, so I can see at a glance where I am along the route.
Well, for this iteration the graphic designers have decided to make the route line blue. They've also decided to make the dot that represents you blue as well. The "you" dot doesn't blink, or have a strong drop shadow, or feature a reticle around it, and it's just a hair-width thicker than the route line, which makes it virtually impossible, while driving, to see where you are along the route.
What were they thinking? Why on Earth wouldn't you make the dot a different color, or provide some kind of graphic distinction? Isn't visual feedback basically UI Design 101? Does no one observe how people actually use the product in the real world? This is absolutely mind-boggling to me.
After spending millions of dollars and man-hours to get this product right, not a single person working at the company had the foresight to make a zero-cost change that would vastly improve the experience. It irritates me to no end when one of the world's more powerful companies ignores basic design common sense.
While yesterday's date of 12/12/12 was good luck for the numerically superstitious, it's today's date of 12/13 that's looking auspicious for me: Google Maps for the iPhone was finally made available today, at its good ol' price of zero dollars.
The Apple Maps debacle was a clear reminder that there are some areas where Apple can't out-design the competition, namely in raw data. Apple has my loyalty with most of the stuff they make, due to their unrelenting focus on the user experience: As I've steadily populated my parents' house, several states away, with Apple products over the years, I've experienced a steady decrease in those painful parental tech-support calls. But the Maps mess made clear that blind, across-the-board brand loyalty isn't the way to go.
So yes, no more trying to get crosstown directions and winding up with a destination in Kentucky. No more having to type every last letter of an address because Siri's silent partner is incapable of basic logic. The downside is that yes, there's no way to access Google Maps with Siri, meaning more typing; but I'd rather let my fingers do the walking than have my feet lead me in the wrong direction.
Interaction designer Ed Lea's visual metaphor for web products made rounds earlier this year, but it's definitely worth checking out if you haven't seen it yet. Thankfully, unlike cereal, digital products persist even after consumption...
Your cell phone knows where you are through triangulation. A Hungary-based company called Leonar3Do has taken that principle and applied it to a 3D mouse: by integrating several antennae into the form factor, a reading device can determine, with pinpoint accuracy, exactly where the mouse is in space. Have a look:
Remember the LIFX, the wi-fi-enabled smart LED bulb? While its Kickstarter funding period ended two weeks ago (well past its $100,000 target with $1.3 mil in pledges), there's no word on when production will begin; on November 12th the LIFX team wrote that "It's not possible to make final [production decisions] until we perform detailed thermal modeling and standardized measurements of light output, color rendering index, white balance agility, etc."
In the meantime Philips has been stumping for their own wi-fi-enabled, color changing offering, the Hue bulb. Interestingly, one of their marketing points is that you can select the output color (using an iDevice) via a method that will be familiar to Photoshop eyedrop tool users. Check it out:
Being the corporate giant that they are, Philips has adopted an interesting marketing technique: They've chosen to make the device available only through Apple Stores (both online and brick-and-mortar), taking preorders now and shipping in several months. At 200 bucks for a three-bulb starter pack the things ain't cheap, though at roughly $67 a bulb they're about the same cost as the LIFX's initial $69 Kickstarter buy-in.
Rogue retailers, by the way, are re-selling Hues through Amazon at a usurious $100 per bulb; it remains to be seen if Philips will crack down.
On LIFX's Kickstarter comments page, some expressed skepticism about this project; but internet trollage aside, if Philips has thrown their weight behind a similar concept, you can bet they've concluded there's a market. Now we'll have to see whether it's David or Goliath that wins this early battle in the smartbulb war.
Continuing from my earlier scattering of field notes, in this post, I want to turn my attentions to the rural areas of Uganda and some of the uses of technology I observed. Dubbed the "Pearl of Africa", the country has rich, fertile soil with great potential. Agriculture is a vital component of the economy, and according to Wikipedia, nearly 30% of its exports are coffee alone. Anecdotally speaking, most people I meet in Kampala, the capital, have family ties in rural areas—a reflection of the fact that most of the population is rural.
As with my previous post, my field notes often take the form of Instagram. Although I eventually type up more thorough notes, I find the practice of taking live field notes to be beneficial (http://www.ictworks.org/news/2011/12/23/avoiding-digital-divide-hype-using-mobile-phones-development) both because they allow me to capture my initial thoughts and reactions while they're fresh in my head and because they spark dialogue and conversations with social media friends who get me thinking differently about what I see.
So much of food in rural areas is experienced in bags—stored and shipped in bags, purchased in bags, even sometimes cooked with bags. Known as kaveera, plastic bags are abundant in Uganda. The Uganda High Court recently ruled in favor of banning such bags, a trend across East Africa, but it remains to be seen how the ban could be enforced. This is a story of technology but not communications technology. I couldn't help but wonder: what could technology provide that helps balance the twin needs of reducing environmental impact and providing accessible food packaging?
While spending time in Oyam, in northern Uganda, I saw a number of smart phones being used. This Nokia could play videos and music, display ebooks and of course capture photos, but it's not connected to a data plan—nor were most smart phones I encountered in the region. Rather, individuals would find opportunities to access an Internet-enabled computer (most often at a net cafe) in nearby towns that do have the Internet, and they would download materials, which could range from Nigerian comedies dubbed in Luo, the local language, to educational materials about agriculture and business. In this regard, Ugandans used the device more like an iPod... which happened to have phone capabilities.
In rural areas, I tend to rely much more often on my feature phone than on computers and my iPhone. It gives me an appreciation for the disruptive role of mobile phones. Although our driver (whose stereo you might recognize from the previous post) lives in the city, he spends much of his time in the field. But that doesn't stop his business: armed with multiple phones and phone plans, he's developed a 'cocktail' of special plans that allow him to make multiple calls at low rates. He keeps his phone charged by his car and whenever we're stopped, he's constantly making calls and conducting business.
Tom Taylor is a technologist and engineer who enjoys working "in the fuzzy space between matter and radiation," and he's got a neat Mac app (probably most fun for those who travel a lot for work): Satellite Eyes. The simple application changes your desktop wallpaper to a satellite photo of your current location as soon as you connect to the internet.
"It features a number of different map styles, ranging from aerial photography to abstract watercolors," writes Taylor. "And if you have multiple monitors, it will take advantage of the full width, spanning images across them."
Surprisingly it does not use Google Maps' images, and unsurprisingly it doesn't use Apple Maps' images either; data comes from OpenStreetMap, Bing Maps and San-Francisco-based design/technology studio Stamen Design.
Best of all, London-based Taylor has made the app's price conversion nice and easy: £0 equals $0.
As you might have noticed, we've had quite a bit of Asian design coverage lately (with a few more stories to come): between the second annual Beijing Design Week, a trip to Shanghai for Interior Lifestyle China and last week's design events in Tokyo, we're hoping to bring you the best of design from the Eastern Hemisphere this fall.
Of course, I'll be the first to admit that our coverage hasn't been quite as quick as we'd like, largely due to the speed bump of the language barrier. At least two of your friendly Core77 Editors speak passable Mandarin, but when it comes to parsing large amounts of technical information, the process becomes significantly more labor-intensive than your average blogpost... which is precisely why I was interested to learn that Microsoft Research is on the case.
In a recent talk in Tianjin, China, Chief Research Officer Rick Rashid (no relation to Karim) presented their latest breakthrough in speech recognition technology, a significant improvement over the 20–25% error rate of current software. Working with a team from the University of Toronto, Microsoft Research has "reduced the word error rate for speech by over 30% compared to previous methods. This means that rather than having one word in 4 or 5 incorrect, now the error rate is one word in 7 or 8."
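For the numerically inclined, here is a quick back-of-the-envelope check of those figures. It is only a sketch: the function names are mine, and it assumes the "over 30%" is a relative reduction applied to the 20–25% baseline. At exactly 30% the result is closer to one word in 6 or 7, so the quoted "one word in 7 or 8" suggests a reduction nearer 40%, which still fits "over 30%."

```python
def reduced_error(baseline, relative_reduction):
    """Error rate left after cutting `baseline` by a relative fraction."""
    return baseline * (1 - relative_reduction)


def one_in_n(error_rate):
    """Express an error rate as 'one word in N'."""
    return 1 / error_rate


def implied_reduction(before, after):
    """Relative reduction needed to get from `before` to `after`."""
    return 1 - after / before


# Assumed baseline: the 20-25% error rate cited above.
for baseline in (0.20, 0.25):
    after = reduced_error(baseline, 0.30)
    print(f"{baseline:.0%} -> {after:.1%} (one word in {one_in_n(after):.1f})")

# Reduction implied by going from "one in 4 or 5" to "one in 7 or 8".
print(f"{implied_reduction(1 / 4, 1 / 7):.0%}")
print(f"{implied_reduction(1 / 5, 1 / 8):.0%}")
```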
In the late 1970s a group of researchers at Carnegie Mellon University made a significant breakthrough in speech recognition using a technique called hidden Markov modeling which allowed them to use training data from many speakers to build statistical speech models that were much more robust. As a result, over the last 30 years speech systems have gotten better and better. In the last 10 years the combination of better methods, faster computers and the ability to process dramatically more data has led to many practical uses.
Just over two years ago, researchers at Microsoft Research and the University of Toronto made another breakthrough. By using a technique called Deep Neural Networks, which is patterned after human brain behavior, researchers were able to train more discriminative and better speech recognizers than previous methods.
Once Rashid has gotten the audience up to speed, he starts discussing how current technology is implemented in extant translation services (5:03). "It happens in two steps," he explains. "The first takes my words and finds the Chinese equivalents, and while non-trivial, this is the easy part. The second reorders the words to be appropriate for Chinese, an important step for correct translation between languages."
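To make the two steps concrete, here is a toy sketch of my own. It is emphatically not Microsoft's system, which is statistical and trained on large amounts of data. The four-word lexicon and the single reordering rule (time words move to just after the subject) are hand-picked assumptions for illustration only.

```python
# Hypothetical four-word lexicon, chosen by hand for this example.
LEXICON = {"i": "我", "today": "今天", "eat": "吃", "apples": "苹果"}

# Words treated as time expressions by the toy reordering rule.
TIME_WORDS = {"今天"}


def translate_words(sentence):
    """Step 1: swap each English word for its Chinese equivalent."""
    return [LEXICON[word] for word in sentence.lower().split()]


def reorder(words):
    """Step 2: move time words to just after the first word (the subject)."""
    time = [w for w in words if w in TIME_WORDS]
    rest = [w for w in words if w not in TIME_WORDS]
    return rest[:1] + time + rest[1:]


def translate(sentence):
    """Run both steps and join the result without spaces."""
    return "".join(reorder(translate_words(sentence)))


print(translate_words("I eat apples today"))
print(translate("I eat apples today"))
```

The first print shows the word-for-word output still in English order; the second shows the sentence after the time word has been moved ahead of the verb.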
Short though it may be, the talk is a slow build of relatively dry subject matter until Rashid gets to the topic at hand at 6:45: "Now the last step that I want to take is to be able to speak to you in Chinese." But listening to him talk for those first seven-and-a-half minutes is exactly the point: the software has extrapolated Rashid's voice from an hour-long speech sample, and it modulates the translated audio based on his English speech patterns.
Thus, I recommend watching (or at least listening to) the video from the beginning to get a sense for Rashid's inflection and timbre... but if you're in some kind of hurry, here's the payoff:
How do you post a YouTube video that gets nearly five million hits in 24 hours? Simple: Record a touchscreen voting machine in Pennsylvania that apparently wants to choose your candidate for you.
The Pennsylvania man who posted this video claimed that try as he might, every time he tapped Obama, it selected Romney instead:
Thinking the calibration was off, he then tapped the option below Obama, hoping that would activate his choice. It didn't.
I initially selected Obama but Romney was highlighted. I assumed it was being picky so I deselected Romney and tried Obama again, this time more carefully, and still got Romney. Being a software developer, I immediately went into troubleshoot mode. I first thought the calibration was off and tried selecting Jill Stein to actually highlight Obama. Nope. Jill Stein was selected just fine. Next I deselected her and started at the top of Romney's name and started tapping very closely together to find the 'active areas'. From the top of Romney's button down to the bottom of the black checkbox beside Obama's name was all active for Romney. From the bottom of that same checkbox to the bottom of the Obama button (basically a small white sliver) is what let me choose Obama. Stein's button was fine. All other buttons worked fine.
I asked the voters on either side of me if they had any problems and they reported they did not. I then called over a volunteer to have a look at it. She him hawed for a bit then calmly said "It's nothing to worry about, everything will be OK." and went back to what she was doing. I then recorded this video.
Faulty touchscreen, fat fingers, or something more menacing? If it was the latter, it didn't work: Obama had Pennsylvania's 20 electoral votes in his pocket by election's end.