Leader: “My universal translator doesn’t seem to be working…”

And nor will it for some time yet…

A look at any one of several well-known sci-fi classics on TV and film tends to show up speech as an area where we’re clearly about to make some big strides.

We regularly see people speaking to computers and vice versa, as well as humans speaking to other species via something called a ‘universal translator’. (OK, so we mean Star Trek - maybe only in the original series did they pretend everyone had done a crash course in English.)

While some technology first imagined in futuristic books and movies is pretty much upon us now – think of some areas of wireless communication or biotech – speech is an area that continues to disappoint.

Two announcements today highlight this fact. BT has announced it is to use a “synthesised lady” (perhaps better described as a synthesised lady’s voice) to deliver text messages to users as the spoken word.

So far, so good. This is eminently doable, allowing for the odd bit of weirdness where someone types nonsense – though it should be noted there is a claim that smiley emoticons will be cleverly translated, which is interesting.

We can only assume a female voice has been used because it just sounds better. Either that or it will pander to a legion of businessmen who wish they had a PA. Services such as Orange’s Wildfire and the speaking clock have taken the same cue in the past.

Then we also heard about a service being trialled by those clever people at NEC in Japan. One of that country’s long-time computing and electronics giants, NEC is using a room at Tokyo’s Narita Airport (where else!). Demonstrations of the Travel Interpreter – part of an ‘e-airport’ project – are being held in a special booth with the aim of commercialising the service by the end of this year.

Our bet is that won’t happen. While we have no doubt about the quality of the software algorithms NEC and others use for speech, this is just a damned hard thing to pull off.

To start with, Travel Interpreter will go from Japanese to text to a translation engine then back to speech in English. Or vice versa. All in less than a second, it promises.
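As a rough illustration of that chain, here is a minimal Python sketch of speech to text to translation to speech. Every stage is a hypothetical stand-in: the function names, the two-entry phrasebook and the fake 'audio' (really just a string) are our assumptions, not anything NEC has described.

```python
# Toy three-stage interpreter pipeline. All stages are hypothetical
# stand-ins for illustration only, not NEC's software.

# Assumed two-entry phrasebook standing in for a real translation engine.
PHRASEBOOK = {
    "konnichiwa": "hello",
    "arigatou": "thank you",
}


def recognise(audio: str) -> str:
    """Pretend speech recogniser: the 'audio' is already a transcript."""
    return audio.strip().lower()


def translate(text: str) -> str:
    """Toy translation engine: whole-phrase lookup, nothing cleverer."""
    return PHRASEBOOK.get(text, "[untranslated: " + text + "]")


def synthesise(text: str) -> str:
    """Pretend speech synthesiser: returns a marker for what would be spoken."""
    return "<spoken>" + text + "</spoken>"


def interpret(audio: str) -> str:
    """Run the stages in order: recognise, translate, synthesise."""
    result = audio
    for stage in (recognise, translate, synthesise):
        result = stage(result)
    return result


print(interpret(" Konnichiwa "))
```

Each stage has to finish before the next can begin, so any delay or error in one is passed straight down the line.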

As if that isn’t hard enough to do, imagine a fairly average conversation. Remember, this is about interpreting speech, not translation of the written word. Whereas the verb comes near the beginning of sentences in English, in Japanese it’s at the end. Something like ‘There are strange businessmen in this terminal’ in English will read in Japanese as ‘This terminal in strange businessmen there are’. Imagine the process for a really long sentence, where the word the technology needs before it can start speaking comes right at the end.
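To put a number on that wait, here is a toy Python sketch using the example sentence above. It treats ‘there are’ as a single unit and simply counts how many words must be heard before the first word of the English output is available. The function name and the word lists are our own assumptions, and real interpreting involves far more than reordering.

```python
def words_buffered_before_output(source_order, target_order):
    """Count the source words heard before the first target word arrives."""
    first_needed = target_order[0]
    return source_order.index(first_needed)


# English gloss of the Japanese word order, as given in the text.
japanese_order = ["this", "terminal", "in", "strange", "businessmen", "there are"]
# The order an English listener expects.
english_order = ["there are", "strange", "businessmen", "in", "this", "terminal"]

wait = words_buffered_before_output(japanese_order, english_order)
print("Words heard before English output can begin:", wait)
```

In this toy case the whole of the rest of the sentence has to be buffered before a single English word can be spoken, and the buffer grows with the length of the sentence.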

But even if this trial had been limited to languages with similar sentence structure, say English with French or Japanese with Korean, it wouldn’t be more than an interesting experiment for now.

The days of seamlessly integrating speech technology in our everyday lives, or even our business lives, aren’t exactly hurtling closer. But this is a start captains Kirk, Picard, Janeway et al would approve of.

Comments

There are 4 comments.

  1. anonymous

    Surely it should be 'a lady's synthesised voice' - it's the voice that's synthesised, not the female person, isn't it??!

    • 12 January 2004 08:57
  2. Dave Rose

    Sounds a bit like Yoda, "around the interpreter, a lady's voice create!"

    • 11 February 2004 14:18
  3. anonymous

    Here's the skinny on the Universal translator, how it can work fast enough to be practical.
    A speech-to-text engine algorithm (using a modified zero crossover method to convert speech to text; sorry, code only for obsolete system ddp24) feeds a syntax-insensitive language engine based on the use of a dyslexic convergent algorithm written in C. This in turn feeds a simple lookup text-to-human-voice engine (similar to an automated attendant, also written in C with some assembly language, all 86 based). The method has the advantage of translating ideas and concepts, not words. Code for each of these parts currently exists separately but has not to date been combined into a total package.

    Enjoy...

    • 12 February 2004 18:23
  4. Ian Savell

    I assume the lady is also imaginary so, in my best patent language:

    "a voice response system comprising a telecommunications system and a voice characterised by said voice being synthesised by electronic means and said synthesised voice having pitch and timbre such that a listener to said voice response system would consider said synthesised voice to be that of a female person of good character"

    Can I have a job?

    • 9 March 2004 12:14
