Sr. Software Development Engineer – Cloud/Big Data
Build machine learning solutions that enhance the richness and quality of the world's largest product catalog utilizing cloud computing, big data analytics, machine learning algorithms, and crowdsourcing.
Location: Seattle, WA
Hiring manager is Fabian Moerchen
What would you do if you had access to the world's largest product catalog with billions of products, offers, images, reviews, searches, and much more? Amazon's Search & Discovery group is looking for an exceptional software development engineer to build machine learning solutions that enhance the richness and quality of our massive product catalog utilizing cloud computing, big data analytics, machine learning algorithms, and crowdsourcing.
An information-rich and accurate product catalog is a critical strategic asset for Amazon. It powers unrivaled product discovery, informs customers' buying decisions, offers a large selection and positions Amazon as the first stop for our customers. This is a unique position that provides an opportunity to build data-driven systems at a scale rarely available anywhere else. If you want to learn how to derive value from data using machine learning and data mining techniques by working with experts in the field and have experience in building scalable solutions, you will fit right in. You will have the ability to influence how millions of customers discover and shop for products at Amazon within weeks of developing a new idea.
We are building a knowledge base of product features relevant to product discovery and product comparison from noisy and unstructured data. We combine multiple large data sources to perform relation extraction, clustering, pattern mining, predictive modeling, and statistical testing to find out what matters most to our customers. The knowledge we gain helps us devise strategies to improve our product catalog, which in turn leads to better searches, refinements, and more informed purchase decisions. You will find challenges in:
Data analysis: We build data analysis workflows to dig into the huge amounts of data available at Amazon using data mining, machine learning, and statistics. We look for patterns, train thousands of models, and use them to build solutions that improve catalog quality. We gather knowledge through crowdsourcing and auditing and train models that generalize across the catalog.
Scalability: We process billions of records daily. We build systems and design algorithms that can handle these large amounts of data, and we make sure cloud usage scales sub-linearly with the ever-growing data size. Where traditional solutions fail, we develop approximate, distributed, and streaming algorithms.
Systems: We leverage Amazon's cloud infrastructure to scale. We create production workflows and applications utilizing AWS technologies such as EMR, SWF, Data Flow, Redshift, and SQS. Our systems must run reliably in the face of variations in the input data or local hardware failures in distributed systems.
- Write high-quality code, participate in code reviews, design and architect systems of varying complexity and scope, and create high-quality documentation supporting the design and coding tasks.
- Participate in team meetings, stand-ups, and architecture/design discussions.
- Identify areas of improvement in our frameworks, tools, and processes, and strive to make them better. Evaluate our success metrics and evolve our reporting systems.
- Dive deep into the catalog data, understand different functional areas, and use your creativity to come up with techniques to improve the quality of Amazon's product catalog.
- Participate in the roadmap definition for the team.
- Bachelor's degree in Computer Science or a related field.
- 4+ years of experience in software development and full product life cycles.
- Coding skills in Java/C++ coupled with a strong foundation in object-oriented design and development.
- Experience with relational databases, designing schemas and formulating complex SQL queries.
- 6+ years of experience in software development and the SDLC.
- Advanced degree (Master's or PhD) in Computer Science or a related field.
- Experience in machine learning, data mining, artificial intelligence, statistics.
- Experience with distributed algorithms (MapReduce, MPI).
- Experience with AWS offerings (S3, EMR, SWF).
- Ability to technically lead small to mid-size teams and mentor junior members.
- Excellent verbal and written communication skills.
- Results-oriented, with a delivery focus.
- Ability to handle multiple competing priorities in a fast-paced environment.
- Proven track record of architecting and building scalable systems.