
Quad Analytix: Extraction Architect


If you love working on internet-scale problems and enjoy implementing and tuning enterprise-class, large-scale distributed systems and understanding their production footprint, then Quad might be a great fit for you.



At: Quad Analytix
Location: San Mateo, CA
Web: www.quadanalytix.com


Quad Analytix is an exciting, early-stage, general-purpose analytics startup headquartered in the Bay Area, CA. We focus on harnessing e-commerce and other information gathered from a variety of sources, which is then cleansed, normalized, and analyzed to create visualizations that provide meaningful insights to our customers.

If you love working on internet-scale problems and enjoy implementing and tuning enterprise-class, large-scale distributed systems and understanding their production footprint, then Quad might be a great fit for you. We offer you an opportunity to learn and grow with our company and to have your work directly impact the entire business, in an exciting, fast-paced environment with smart, down-to-earth people who enjoy having fun while working together. We are scaling significantly in both data size and concurrent users, which means technical challenges abound.

JOB DESCRIPTION:
You will participate heavily in the architecture, design, and implementation of large-scale distributed crawlers that extract semantically meaningful data from the classic Web and the social Web (Pinterest, Twitter, Facebook, Instagram, etc.). You will work with the data-science team to implement semantic extraction on crawled resources, and with data engineers who provide persistence services in a polyglot persistence tier. You are expected to have good knowledge of MapReduce methodologies, AWS, REST, and NoSQL databases.
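
To give a flavor of the extraction side of the role, here is a minimal sketch of fetching a page and pulling out a couple of semantically meaningful fields, assuming the Python 2 stack listed below (urllib2 and lxml); the URL, User-Agent string, and XPath selector are hypothetical placeholders rather than anything from the posting.

    # Minimal sketch: download a page and extract a few fields with lxml.
    # Python 2 (urllib2); the URL, User-Agent and XPath below are hypothetical.
    import urllib2
    import lxml.html

    USER_AGENT = "ExampleCrawler/0.1"  # hypothetical agent string

    def fetch(url):
        """Download a page with an identifying User-Agent and a timeout."""
        req = urllib2.Request(url, headers={"User-Agent": USER_AGENT})
        return urllib2.urlopen(req, timeout=10).read()

    def extract_product(html):
        """Pull a few fields out of raw HTML; real selectors are site-specific."""
        doc = lxml.html.fromstring(html)
        return {
            "title": doc.findtext(".//title"),
            "prices": doc.xpath("//span[@class='price']/text()"),  # placeholder XPath
        }

    if __name__ == "__main__":
        page = fetch("http://example.com/product/123")  # placeholder URL
        print extract_product(page)

A production crawler would, of course, distribute this work (for example via celery workers) and persist the results to the NoSQL tier mentioned above, but those pieces are beyond a sketch.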

DESIRED SKILLS AND QUALIFICATIONS:
  • Significant experience and knowledge of the complete SDLC.
  • Intimately familiar with Agile and Scrum.
  • Bachelor's or Master's in Computer Science.
  • Python: celery, urllib2, lxml, selenium, eventlet, nltk, matplotlib, scrapbook extensions.
  • Amazon Web Services (EC2, S3/Glacier, VPC).
  • DevOps tools such as Puppet and Fabric.
  • Knowledge of Nutch and Heritrix.
  • NoSQL databases such as MongoDB and Hadoop HBase.
  • StatsD and Graphite experience is a plus.
  • Prior experience implementing an industrial-strength distributed crawler with suitable politeness built in (a minimal politeness sketch follows this list).
  • Knowledge of Intrusion Detection and Intrusion Prevention systems is a plus.
  • We're looking for someone who can think about the big picture and bring a team of engineers to consensus, but who can also handle optimization of a system's constituent components.
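
On the politeness point above, here is a minimal sketch of what "suitable politeness" typically involves, namely robots.txt compliance plus a per-domain crawl delay, again assuming the Python 2 stack listed above; the two-second delay and the helper names are illustrative, not Quad's actual implementation.

    # Minimal politeness sketch: honor robots.txt and rate-limit per domain.
    # Python 2; the delay value and function names are illustrative only.
    import time
    import urlparse
    import robotparser

    USER_AGENT = "ExampleCrawler/0.1"  # hypothetical agent string
    CRAWL_DELAY = 2.0                  # seconds between hits to the same domain
    _last_hit = {}                     # domain -> timestamp of the last request
    _robots = {}                       # domain -> cached RobotFileParser

    def allowed(url):
        """Check robots.txt (cached per domain) before fetching a URL."""
        domain = urlparse.urlparse(url).netloc
        if domain not in _robots:
            rp = robotparser.RobotFileParser()
            rp.set_url("http://%s/robots.txt" % domain)
            rp.read()
            _robots[domain] = rp
        return _robots[domain].can_fetch(USER_AGENT, url)

    def wait_for_slot(url):
        """Sleep just long enough to keep CRAWL_DELAY between requests to a domain."""
        domain = urlparse.urlparse(url).netloc
        elapsed = time.time() - _last_hit.get(domain, 0)
        if elapsed < CRAWL_DELAY:
            time.sleep(CRAWL_DELAY - elapsed)
        _last_hit[domain] = time.time()

A real system would also respect per-site Crawl-delay directives and share this state across distributed workers, which is where the queueing and coordination pieces of the stack come in.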

 
OTHER CHARACTERISTICS:
  • Strong oral and written communication skills, and the ability to work effectively both independently and in teams (local and distributed). Ability to clearly articulate a roadmap and the pros and cons of a given approach versus others.
  • Strong work ethic and a proactive approach to problem solving.
  • Intellectually curious, with a passion for learning and professional evolution.
  • Ability to multitask, prioritize, show initiative, and respond quickly in a fast-paced environment.
  • Enjoy having fun at work and desire to collaborate with smart, humble people every day.

 
Contact:
Please apply online here.
