KDnuggets : News : 2009 : n18 : item20

Jobs

From: Scott, CC
Date: Thu, 17 Sep 2009
Subject: Seattle, WA: Information Retrieval/Data Mining Software Engineer at Amazon

Amazon.com's Darwin team is looking for exceptional software engineers to develop algorithms and build systems that automatically solve a variety of Information Retrieval and Data Mining problems related to the Amazon Product Catalog, one of the company's biggest assets.

Our principal challenge is to improve the shopping experience by detecting duplicate products for sale in the catalog and merging them. Merchants on Amazon.com provide information about the products they want to sell. Amazon attempts to match these product data submissions to items in its catalog so that it can display offers for the same product on a single page. Poorly structured or incomplete data makes this problem very challenging and often results in duplicate products being created in the catalog. These duplicate products appear in search results and confuse customers, leading to a bad customer experience. The Darwin team detects these duplicate products in the Amazon.com catalog using an innovative mix of Information Retrieval, Data Mining and Text Analysis algorithms, together with human intelligence harnessed via the Amazon Mechanical Turk. We then automatically merge the products detected as duplicates, improving both the customer experience and the quality of the catalog.

We are also responsible for a variety of other Catalog-related projects, such as placing Product Advertisements on pages; automatically extracting important product features from the product description, with a view to improving the discovery (search and browse) experience on the website; and detecting egregious cases of poor-quality data provided by sellers.

We are a highly motivated, cooperative and fun-loving team that thrives on solving challenging problems with innovation. As part of this team you will be analyzing data, developing new algorithms, and building large-scale distributed software systems in Java using open source technologies such as Apache Lucene and JBoss, as well as Amazon.com proprietary technologies.

Qualifications:

The ideal candidate will have the following qualifications:

  • Advanced degree in Computer Science, Math or related field with 2+ years of experience.
  • Past experience in at least one of the following areas: Information Retrieval, Data Mining, Text Analysis or Machine Learning.
  • Desire to analyze data while developing solutions to problems.
  • Strong desire to build high-performance, highly available and scalable distributed systems.
  • Strong design and coding skills in Java/C++ on Unix platforms.
  • Familiarity with Perl and a good understanding of SQL.
  • Highly innovative, flexible and self-directed.
  • Excellent written and verbal communication skills.

Contact:
C.C. Scott, ccscott at amazon.com


Copyright © 2009 KDnuggets.