-
Development & Testing of ETL Pipelines for AWS Locally
Typically, ETL pipelines are developed and tested on real environments or clusters, which are time-consuming to set up and require maintenance. This article focuses on developing and testing ETL pipelines locally with the help of Docker and LocalStack. The solution gives you the flexibility to test in a local environment without setting up any services in the cloud.
-
10 Machine Learning Model Training Mistakes
These common ML model training mistakes are easy to overlook but costly to fix.
-
Online Master’s in Data Science from Northwestern
Build statistical and analytical expertise as well as the management and leadership skills necessary to implement high-level, data-driven decisions in Northwestern's online Master of Science in Data Science program. Apply now!
-
A Brief Introduction to the Concept of Data
Every aspiring data scientist must understand the concept of data and the kinds of analysis they can run on it. This article introduces the concept of data (quantitative and qualitative) and the types of analysis.
-
An AI-Based Framework Solution to Address Email Management Challenges
Expert.ai’s Edge NL API is an on-premise API that can perform NLU tasks with no required training or extra work, offering advanced, out-of-the-box capabilities that address common use cases and can be easily customized to your specific needs.
-
The Brutal Truth About Data Science
Many organizations approach data science as though it were a marketing tool, relabeling things that they already do as ‘data science’ because they involve the use of data. That is not real data science, and it completely misses the point of engaging in data science.
-
Building Machine Learning Pipelines using Snowflake and Dask
In this post, I want to share some of the tools that I have been exploring recently, show you how I use them, and explain how they have improved the efficiency of my workflow. The two I will talk about in particular are Snowflake and Dask. They are very different tools, but they complement each other well, especially as part of the ML lifecycle.
-
How to Use Kafka Connect to Create an Open Source Data Pipeline for Processing Real-Time Data
This article shows you how to create a real-time data pipeline using only open source technologies. These include Kafka Connect, Apache Kafka, Kibana and more.
-
ColabCode: Deploying Machine Learning Models From Google Colab
New to ColabCode? Learn how to use it to start a VS Code Server, Jupyter Lab, or FastAPI.
-
Design Patterns in Machine Learning
Can we abstract best practices to real design patterns yet?