Banazîr the Jedi Hobbit (banazir) wrote,

Quelle vitesse: NIST Machine Translation workshop

It's hard to believe, but another year has rolled past, and once again the machine translation arena is being dominated by a few giants. We aren't supposed to talk about the BLEU scores until they've been published at the start of next month, but for your reference, the top scorers in 2005 (up to six per track) were:

  • Google (with 0.5131), ISI, IBM, UMD, JHU, and Edinburgh in the Arabic-to-English Large Data Track

  • Google (with 0.5137), SAKHR, and ARL in the Arabic-to-English Unlimited Data Track

  • Google (with 0.3531), ISI, UMD, RWTH, JHU, and IBM in the Chinese-to-English Large Data Track

  • Google (with 0.3516), ICT, and HIT in the Chinese-to-English Unlimited Data Track


Two of my grad students, Tejaswi Pydimarri (pnvtejaswi) and Waleed Al-Jandal, are at the NIST annual Machine Translation evaluation workshop. This is our first attendance at an MT or computational linguistics meeting, so we're mainly there to learn. Teja has a short presentation on our first (nominally) functioning end-to-end translation system, but we did it primarily to get our feet wet. This being our first BLEU score ever, I have strong hopes for the coming year. My goal is to "make the scoreboard" in earnest by October (with a 0.1 on the Chinese track) and get as close as we can to 0.2 by year's end. If you're interested in keeping tabs on our progress, look for my posts in comptranslation.

I can't believe they've been at the workshop a day now and are almost coming home! Time zips by.

--
Banazir
Tags: machine translation, nist, workshops