Distributed and Scalable Machine Learning [Webinar]

Mike McCarty and Gil Forsyth work at the Capital One Center for Machine Learning, where they are building internal PyData libraries that scale with Dask and RAPIDS. In this webinar on Feb 23 at 2 pm PST / 5 pm EST, they’ll join Hugo Bowne-Anderson and Matthew Rocklin to discuss their journey to scale data science and machine learning in Python.

Sponsored Post.

In 2020, Capital One left data centers behind by completing its transition to the cloud, and its teams now use Dask in the cloud to scale data science and machine learning. We’ll take a whirlwind tour through what this looked like and dive into several key specifics, such as how to deploy Dask and RAPIDS on AWS, the ins and outs of scaling your XGBoost workflows, and how Capital One leverages the scikit-learn API to scale with custom estimators.

We’ll also hit some more cultural notes, such as how Capital One is building internal communities that know the best practices for using these OSS tools, and why it’s important for an enterprise company to contribute to the open-source community today.

After attending, you’ll know:

How Dask has grown at Capital One and some of the challenges the team faced
How (and why) to scale XGBoost training
How to leverage the scikit-learn API to build your own custom estimators that scale
Why it is important for institutions to participate in the open-source projects they use

Sign up here to join us on Tuesday, February 23rd at 5:00 PM US Eastern time, and dive into the wonderful world of all things Dask and scalable Python at Capital One!

Time: Feb 23, 2021, 2 pm PST / 5 pm EST
