
Health Integrated: Big Data DevOps Engineer


Prototyping, new development, and production hardening of the Big Data Platform. Work with different teams to manage the build-out and administration of the big data clusters.



Company: Health Integrated
Location: Tampa, FL
Web: www.healthintegrated.com


Contact:
Email mreed@healthintegrated.com.

JOB SUMMARY:
The Big Data Development/Ops Engineer position requires deep experience in managing Big Data platforms. Much of the activity involved in this position will be prototyping, new development, and production hardening of the Big Data Platform. You will work collaboratively with different teams to manage the build-out and administration of the big data clusters. The candidate should be able to manage moderately complex projects and direct the activities of a team related to program initiatives and daily operations.
  • Administer, troubleshoot, and maintain ELT/ETL processes.
  • Ability to work under minimal supervision, with general guidance from peers, team lead, and/or management.
  • Analyze complex problems requiring in-depth evaluation of multiple factors (e.g., data layout, job performance, supportability) to provide sustainable solutions.
  • Provide work leadership by assigning work and resolving problems. Plan, schedule, and document upgrades, patches, and maintenance to Hadoop systems within the bounds of existing change processes.
  • Triage, diagnose, and escalate to vendors issues and inquiries that arise during platform engineering and daily operational activities.

This position may not conduct any activities that require evaluation or interpretation of clinical information.

Minimum Qualifications:

Education/License/Certification: High School Diploma or GED required. A Computer Science degree, or substantially equivalent experience, is strongly preferred.

Experience: Typically 2 or more years of experience as a Hadoop administrator in a large production environment. Must have experience with SQL queries, performance tuning, and scripting and programming languages such as Ruby, Python, Perl, or Java.

Knowledge/Skills:
  • Excellent understanding of Big Data Analytics platforms and ETL in the context of Big Data.
  • Working experience with the following technologies: Hadoop, HBase, Pig and Hive.
  • Hands-on administration, configuration management, monitoring, and performance tuning of Hadoop/distributed platforms.
  • Excellent troubleshooting and problem-solving skills.
  • Experience working in an agile team environment.
  • Experience building and supporting large-scale Hadoop environments, including design, capacity planning, cluster setup, performance tuning, and monitoring.
  • Strong understanding of the Hadoop ecosystem, including HDFS, MapReduce, HBase, ZooKeeper, Pig, Hadoop Streaming, Sqoop, Oozie, and Hive.
  • Experience administering and supporting Linux operating systems (CentOS/RHEL) and hardware in an enterprise environment.
  • Expertise in typical system administration and programming tasks such as storage capacity management and performance tuning.
  • Proficiency in shell scripting (e.g., Bash, ksh).
  • Experience in the setup, configuration, and management of security for Hadoop clusters using Kerberos, with integration with LDAP/AD at an enterprise level.
  • Strong scripting experience on Linux/Unix; Java proficiency is preferable.
  • Minimum of two years of data analytics experience.
  • Basic SQL knowledge is a must.
  • Advanced data warehousing knowledge is a plus.
  • Knowledge of one or more newer database technologies (e.g., MongoDB) would be useful.
  • Good written and verbal communication skills are required.

Accountabilities:

Job Performance/Responsibilities:
  • Responsible for the overall operational management and upkeep of the Big Data Platform.
  • Working with our internal business partners to gather requirements.
  • Prototyping, architecting, and implementing/updating the Big Data Platform.
  • Developing unit tests and performing developer testing of code changes.
  • Must have the ability to be a self-starter and work independently on technical projects but also work collaboratively with project team members through an agile development process that promotes constant team communication.
  • Must have excellent communication skills to assist in conducting user interview sessions, requirements gathering, and design reviews.
  • Responsible for the day-to-day operation and support of Hadoop environments, which includes Hadoop cluster infrastructure setup, software installation, upgrading/patching, monitoring, tuning/optimizing, troubleshooting, maintenance, and working with the development team to install components (Hive, Pig, etc.).
  • Perform Hadoop capacity monitoring and short and long-term capacity planning in collaboration with development resources, system administrators and system architects.
  • Be accountable for maintaining the availability, performance, and continuity of the Hadoop environment.
  • Implement and maintain security according to best practices for Hadoop infrastructure.
  • Implement a DR strategy for Hadoop distributions in collaboration with the storage and Unix teams.
  • Stay current with open-source development in the Big Data world; evaluate different tools and make recommendations.
