a4 Media: Manager, Machine Learning Data Engineer [Long Island City, NY]
a4 Media is seeking a Manager, Machine Learning Data Engineer in Long Island City, NY, focused on managing the business understanding, data acquisition, and data understanding phases of an agile CRISP-data science process.
At: a4 Media
Location: Long Island City, NY
Position: Manager, Machine Learning Data Engineer
a4 Media is driven by a philosophy to always challenge ourselves. We question everything so that we can find the best way forward for our customers. And in a world where continuous innovation is the only way forward, we are redefining the vision we have for our customers, enterprises, advertisers - and our people.
a4 Media’s data science team develops and deploys models for smart advertising media apps and ad hoc analytics. Your role within the data science team is focused on managing the business understanding, data acquisition, and data understanding phases of an agile CRISP-data science process, which will inform the subsequent data preparation, modeling and deployment phases. You sit at the crossroads of data science and data engineering and are proficient in both.
Responsibilities of the Business Understanding phase:
- Define project objectives: by working with your client to understand and identify business problems and objectives to be solved by the project’s deliverables.
- Conduct detailed fact-finding: about all of the data, resources, constraints, assumptions, and other factors that should be considered in determining the data analysis goals and in developing the project plan.
- Produce a project plan: to achieve the data analysis goals. Collaborate with your client, data scientists, and deployment engineers, to list the project stages, together with their methodologies, duration, required resources, inputs, outputs, and dependencies.
Responsibilities of the Data Acquisition and Understanding phase:
- Extract, transfer and load: the data listed in the project plan into the analytics environment.
- Describe the data: format, quantity, fields, and any other features and quality issues that you discover.
- Explore the data: using querying, visualization, and reporting techniques to clarify or address some of the data analysis goals, and to create a clean data set that feeds into the data scientists’ transformation and feature engineering steps prior to modeling.
- Create pipelines: and determine the right infrastructure, in terms of both storage and computing at massive scale, to run scoring or predictions on new data.
This role is based in Long Island City, NY.
- 5-8 years of related professional experience
- Bachelor’s degree in a related field; Master’s or higher a plus
- Strong analyst background; generated insights and business recommendations
- Strong project consultative background; creating project requirements
- Strong experience creating ETLs and pipelines (streaming vs. batch; low vs. high-frequency pipelines), using tools such as Airflow, Apache NiFi, or Kafka
- Strong experience wrangling, exploring and cleaning structured and unstructured data
- Strong data visualization skills using Tableau or open-source tools
- Ability to code across a range of languages and frameworks, from Python/R to Java/Scala/Spark
- Experience with collaboration tools such as Bitbucket, GitHub, Teams, or Jira
- Experience with various data sources (on-premises vs. cloud; database vs. files)
- Experience with various data environments (on-premises vs. cloud; database vs. data lake; small vs. medium vs. big data), such as Google, AWS, Oracle, or Hadoop
- Friendly, fun, conscientious, curious and out-of-the-box mindset
- Ability to ask questions and determine the right data and data science solution, in a collaborative team environment that follows an agile data science process
- A hands-on subject matter expert with technical know-how who determines methods and procedures on new projects and provides leadership to other engineers
- Strong written and verbal communication skills with internal and external clients