- The Best ETL Tools in 2021 - Dec 21, 2021.
If you have clear, well-defined objectives, it won’t be hard to identify the ETL technology that best meets your needs. Here are some of the best ETL tools you can use in your business.
ELT, ETL, Tools
- The Seven Best ELT Tools for Data Warehouses - Dec 1, 2021.
ELT helps to streamline the process of modern data warehousing and managing a business’ data. In this post, we’ll discuss some of the best ELT tools to help you clean and transfer important data to your data warehouse.
Data Science Tools, Data Warehouse, ELT, ETL
- KDnuggets™ News 21:n44, Nov 17: Don’t Waste Time Building Your Data Science Network; 19 Data Science Project Ideas for Beginners - Nov 17, 2021.
Don’t Waste Time Building Your Data Science Network; 19 Data Science Project Ideas for Beginners; How I Redesigned over 100 ETL into ELT Data Pipelines; Anecdotes from 11 Role Models in Machine Learning; The Ultimate Guide To Different Word Embedding Techniques In NLP
Beginners, Data Science, ETL, Machine Learning, NLP, Pipeline, Project, Word Embeddings
- How I Redesigned over 100 ETL into ELT Data Pipelines - Nov 15, 2021.
Learn how to level up your Data Pipelines!
ELT, ETL, Pipeline, SQL
- Design Patterns for Machine Learning Pipelines - Nov 2, 2021.
ML pipeline design has undergone several evolutions in the past decade with advances in memory and processor performance, storage systems, and the increasing scale of data sets. We describe how these design patterns changed, what processes they went through, and their future direction.
Data Preprocessing, ETL, Machine Learning, Pipeline
- ETL and ELT: A Guide and Market Analysis - Oct 29, 2021.
ETL and related techniques remain a powerful and foundational tool in the data industry. We explain what ETL is and how ETL and ELT processes have evolved over the years, with a close eye toward how third-generation ETL tools are about to disrupt standard data processing practices.
Data Preparation, ELT, ETL, Market Research, Pipeline
- Smart Ingestion: Using ontology-driven AI - Sep 8, 2021.
Imagine data that organizes itself to power your decision-making.
AI, ETL, Ontology
- Prefect: How to Write and Schedule Your First ETL Pipeline with Python - Aug 16, 2021.
Workflow management systems made easy — both locally and in the cloud.
Cloud, ETL, Pipeline, Python
- Development & Testing of ETL Pipelines for AWS Locally - Aug 2, 2021.
Typically, ETL pipelines are developed and tested on real environments or clusters, which are time-consuming to set up and require maintenance. This article focuses on developing and testing ETL pipelines locally with the help of Docker and LocalStack. The solution gives you the flexibility to test in a local environment without setting up any services in the cloud.
AWS, Data Engineering, ETL, Pipeline
- dbt for Data Transformation – Hands-on Tutorial - Jul 28, 2021.
The data build tool (dbt) is gaining in popularity and use, and this hands-on tutorial covers creating complex models, using variables and functions, running tests, generating docs, and many more features.
Data Engineering, Data Preparation, dbt, ETL, SQL
- How to pitch to VCs, explained: The Deck We Used to Raise Capital For Our Open-Source ELT Platform - May 21, 2021.
Winning seed funding from venture capitalists is a daunting task, and the pitch is key. Learn how one effective slide deck resulted in a successful early funding round for an open-source start-up, Airbyte.
Data Preparation, ELT, ETL, Startup, VC
- KDnuggets™ News 21:n15, Apr 21: The Most In-Demand Skills for Data Scientists in 2021; How to organize your data science project - Apr 21, 2021.
The Most In-Demand Skills for Data Scientists in 2021; How to organize your data science project; You may have heard about Simpson's paradox, but do you know the other 2? Read Top 3 Statistical Paradoxes in Data Science; ETL in the Cloud; Data Profession Job Satisfaction: Beware Of The Drop; and more.
Data Science Skills, ETL, Project, Simpson's Paradox
- ETL in the Cloud: Transforming Big Data Analytics with Data Warehouse Automation - Apr 15, 2021.
Today, organizations are increasingly implementing cloud ETL tools to handle large data sets. With data sets becoming larger by the day, unified ETL tools have become crucial for the data integration needs of enterprises.
Automation, Big Data, Big Data Analytics, Cloud, Data Analytics, Data Warehouse, ETL
- What’s ETL? - Apr 2, 2021.
Discover what ETL is, and see in what ways it’s critical for data science.
Data Processing, Data Science, ETL
- Introducing dbt, the ETL and ELT Disrupter - Mar 17, 2021.
Moving and processing data is happening 24/7/365 world-wide at massive scales that only get larger by the hour. Tools exist to introduce efficiencies in how data can be extracted from sources, transformed through calculations, and loaded into target data repositories. However, on their own, these tools can introduce some restrictions in the processing, especially for the needs of data analytics and data science.
Data Engineering, Data Preparation, dbt, ELT, ETL
- The Best Tool for Data Blending is KNIME - Jan 13, 2021.
These are the lessons and best practices I learned in many years of experience in data blending, and the software that became my most important tool in my day-to-day work.
Data Exploration, Data Management, ETL, Knime
- KDnuggets™ News 20:n46, Dec 9: Why the Future of ETL Is Not ELT, But EL(T); Introduction to Data Engineering - Dec 9, 2020.
Learn why the future of ETL is not ELT, but EL(T), and what that means; Read a great intro to Data Engineering; Get expert opinions on the main developments in 2020 and key trends in 2021 in AI, Data Science, Machine Learning; NoSQL for Beginners; and more.
2021 Predictions, Data Engineering, ETL, TensorFlow, Trends
- Why the Future of ETL Is Not ELT, But EL(T) - Dec 4, 2020.
The well-established technologies and tools around ETL (Extract, Transform, Load) are undergoing a potential paradigm shift with new approaches to data storage and expanding cloud-based compute. Decoupling the EL from T could reconcile analytics and operational data management use cases, in a new landscape where data warehouses and data lakes are merging.
Data Analysis, Data Engineering, Data Lakes, Data Preparation, ELT, ETL
- Find Your Perfect Fit: A Quick Guide for Job Roles in the Data World - Apr 23, 2020.
Data-related positions have been considered the hottest in the job market over the last couple of years. While everyone wants to join the party and enter this fascinating field, it is essential to first understand the roles involved. In this quick guide, I’ll do my best to dispel the confusion by crystallizing the essence of the different positions.
Business Analyst, Career Advice, Data Analyst, Data Architect, Data Engineer, Data Scientist, Database Management, Developers, ETL, Machine Learning Engineer
- Manual Coding or Automated Data Integration – What’s the Best Way to Integrate Your Enterprise Data? - Aug 19, 2019.
What’s the best way to execute your data integration tasks: writing manual code or using an ETL tool? Find out which approach best fits your organization’s needs and the factors that influence the choice.
Advice, Data Integration, Data Management, Data Science, Data Science Platform, ETL
- The Role of the Data Engineer is Changing - Jan 10, 2019.
The role of the data engineer in a startup data team is changing rapidly. Are you thinking about it the right way?
Data Engineer, Data Science, dbt, ETL
- UnitedHealth Group: Senior ETL Developer (Horsham, PA) - Aug 17, 2018.
Seeking a Senior ETL Developer with an advanced ETL architecture/development background to be a primary contributor in developing, testing, and deploying key data warehouses and data marts, working with cutting-edge technology.
Developer, ETL, Horsham, PA, UnitedHealth Group
- From Insights to Value in 90 Minutes – with Snowflake, July 12 Webinar - Jul 2, 2018.
Learn How to Accelerate Data Warehouse Modernization at a Low Cost.
BI, Big Data, BigData Dimension, Data Warehouse, ETL
- ETL vs ELT: Considering the Advancement of Data Warehouses - May 22, 2018.
The traditional concept of ETL is changing towards ELT – when you’re running transformations right in the data warehouse. Let’s see why it’s happening, what it means to have ETL vs ELT, and what we can expect in the future.
BigQuery, Data Warehouse, ELT, ETL, Statsbot
- Loading Terabytes of Data from Postgres into BigQuery - Apr 9, 2018.
Although an ETL task is pretty challenging when it comes to loading Big Data, there is still a scenario in which you can load terabytes of data from Postgres into BigQuery relatively easily and very efficiently.
BigQuery, ETL, NoSQL, Postgres, SQL, Statsbot
- A Beginner’s Guide to Data Engineering – Part II - Mar 15, 2018.
In this post, I share more technical details on how to build good data pipelines and highlight ETL best practices. Primarily, I will use Python, Airflow, and SQL for our discussion.
AirBnB, Data Engineering, Data Science, ETL, Pipeline, Python, SQL
- A Beginner’s Guide to Data Engineering – Part I - Jan 25, 2018.
Data Engineering: The Close Cousin of Data Science.
Data Engineer, Data Engineering, ETL, Pipeline
- Are Data Lakes Fake News? - Sep 6, 2017.
The quick answer is yes, and the biggest problem is that the term “Data Lakes” has been overloaded by vendors and analysts with different meanings, resulting in an ill-defined and blurry concept.
Data Lakes, Data Warehouse, ETL, Fake News, Hadoop
- How to Choose a Data Format - Nov 3, 2016.
In any data analytics project, after the business understanding phase, understanding the data and selecting the right data format and ETL tools are very important tasks. This article explains a useful, practical set of guidelines covering data format selection and the ETL phases of the project lifecycle.
Data Cleaning, Data Engineering, Data Preparation, ETL, Hadoop, HDFS
- Automating Data Ingestion: 3 Important Parts - Sep 9, 2016.
In this day and age of “Big Data”, data ingestion has to be automated on some level. How best to automate it?
Automation, Claudia Perlich, Data Preparation, ETL, Quora
- Choosing Tools for Data ETLs - Aug 9, 2016.
Which tool should I use for my data pipelines? Get some advice from a data scientist who recently went through this pipeline tool selection process.
AirBnB, Data Cleaning, Data Preparation, ETL
- Engineers Shouldn’t Write ETL: A Guide to Building a High Functioning Data Science Department - Mar 28, 2016.
An exploration of data science team building, with insight into why engineers should not write ETL, and other not-so-subtle pieces of advice.
Advice, Data Engineering, Data Scientist, ETL, Stitch Fix
- Data Lake Plumbers: Operationalizing the Data Lake - Feb 18, 2016.
Gain insight into data lakes, their benefits, when they are appropriate, and how to operationalize them. How do they compare to the data warehouse?
Data Lake, Data Warehouse, ETL, Hadoop
- 3 Reasons Big Data Projects Fail - Aug 24, 2015.
Download Lavastorm whitepaper: How to Overcome 3 Key Big Data Challenges - how to operationalize the results, how to enable ETL to handle complexities of Big Data, and more.
Big Data, ETL, Lavastorm, Project Fail
- Interview: Joseph Babcock, Netflix on Genie, Lipstick, and Other In-house Developed Tools - Jun 16, 2015.
We discuss the role of analytics in content acquisition, data architecture at Netflix, organizational structure, and open-source tools from Netflix.
Data Science, ETL, In-house, Interview, Joseph Babcock, Netflix, Open Source, Tools