- Data Engineering Landscape in the AI-Driven World - May 24, 2023.
Generative AI has just started to capture the imagination of data engineers, so the impact thus far has been just a fraction of what it will be a year or two from now.
Data Engineering
- Should You Consider a DataOps Career? - May 15, 2023.
Transitioning your career to DataOps could be just the change you need: not only does it offer the chance to expand your technical skills, but it also comes with a rewarding salary and many job openings.
Data Engineering
- Schedule & Run ETLs with Jupysql and GitHub Actions - May 1, 2023.
This post gives a brief introduction to ETL and JupySQL, then demonstrates how to schedule an example ETL notebook via GitHub Actions, which allows you to automate the process of executing ETLs and JupySQL from Jupyter.
Data Engineering
- 11 Best Practices of Cloud and Data Migration to AWS Cloud - Apr 14, 2023.
A list of best practices compiled from our learnings during our migration journey to the AWS cloud.
Data Engineering
- How to Build a Scalable Data Architecture with Apache Kafka - Apr 5, 2023.
Learn about Apache Kafka architecture and its implementation using a real-world use case of a taxi booking app.
Data Engineering
- ETL vs ELT: Which One is Right for Your Data Pipeline? - Mar 31, 2023.
Learn about the differences between ETL and ELT data integration techniques and determine which is right for your data pipeline.
Data Engineering
- Data Quality Dimensions: Assuring Your Data Quality with Great Expectations - Mar 23, 2023.
This article highlights the significance of ensuring high-quality data and presents six key dimensions for measuring it. These dimensions include Completeness, Consistency, Integrity, Timeliness, Uniqueness, and Validity.
Data Engineering
- A List of 7 Best Data Modeling Tools for 2023 - Mar 3, 2023.
Learn about data modeling tools to create, design and manage data models, allowing data scientists to access and use them more quickly.
Data Engineering
- Data Warehousing and ETL Best Practices - Feb 27, 2023.
How you can improve your data warehousing ETL process with these simple practices.
Data Engineering
- 5 SQL Visualization Tools for Data Engineers - Feb 24, 2023.
This article will discuss SQL visualization, its role in augmenting the modern-day data engineer, and five categories of SQL visualization tools.
Data Engineering
- Docker for Data Science Cheat Sheet - Feb 14, 2023.
Docker is dependency management on steroids, helping to ensure both reproducibility and collaboration, making it an important tool for data science. Our latest cheat sheet serves as a handy Docker reference. Check it out now!
Data Engineering
- Learn Data Engineering From These GitHub Repositories - Feb 7, 2023.
Kickstart your Data Engineering career with these curated GitHub repositories.
Data Engineering
- Tapping into the Potential of Data Products in 2023 - Jan 31, 2023.
Learn how data can be treated as a product and how it can be used to derive value.
Data Engineering
- Scaling Data Management Through Apache Gobblin - Jan 20, 2023.
Software companies can manage big data at a hyper-scale on different infrastructure stacks using Apache Gobblin.
Data Engineering
- SQL and Data Integration: ETL and ELT - Jan 19, 2023.
In this article, we will discuss use cases and methods for using ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) processes along with SQL to integrate data from various sources.
Data Engineering
- Data Lakes and SQL: A Match Made in Data Heaven - Jan 16, 2023.
In this article, we will discuss the benefits of using SQL with a data lake and how it can help organizations unlock the full potential of their data.
Data Engineering
- Overcome Your Data Quality Issues with Great Expectations - Jan 12, 2023.
Bad data costs organizations money, reputation, and time. Hence it is very important to monitor and validate data quality continuously.
Data Engineering
- Where Collaboration Fails Around Data (And 4 Tips for Fixing It) - Jan 9, 2023.
Data-driven organizations require complex collaboration between data teams and business stakeholders. Here are 4 proactive tips for reducing information asymmetries and achieving better collaboration.
Data Engineering
- 7 Essential Cheat Sheets for Data Engineering - Dec 6, 2022.
Learn about the data life cycle, PySpark, dbt, Kafka, BigQuery, Airflow, and Docker.
Data Engineering
- The Complete Data Engineering Study Roadmap - Nov 28, 2022.
Everything you need to know to start your career in Data Engineering.
Data Engineering
- Is OLAP Dead? - Oct 21, 2022.
OLAP enables citizen analysts to quickly, efficiently, and cost-effectively uncover new business insights at a reduced time-to-value.
Data Engineering
- Essential Books You Need to Become a Data Engineer - Oct 18, 2022.
In this article, I will go through the roadmap of books you need to become a Data Engineer.
Data Engineering
- 11 Questions About Data Engineers: What’s the profession about, and where’s it heading? - Oct 6, 2022.
I hope my answers will be useful to novice data engineers and anyone interested in data engineering.
Data Engineering
- The Evolution of Apache Druid - Jul 19, 2022.
And so true to the origins of its name, Apache Druid is shapeshifting - with the addition of a new multi-stage query engine.
Data Engineering
- 10 Modern Data Engineering Tools - Jul 11, 2022.
Learn about the modern tools for data orchestration, data storage, analytical engineering, batch processing, and data streaming.
Data Engineering
- Free Data Engineering Courses - May 30, 2022.
Get into the highly in-demand world of data engineering for free and earn a six-figure salary.
Data Engineering
- Deploying a Streamlit WebApp to Heroku using DAGsHub - Feb 7, 2022.
Transform your machine learning models into a web app and share them with your friends and colleagues.
Data Engineering
- What Are NVIDIA NGC Containers & How to Get Started Using Them - Nov 15, 2021.
NVIDIA, the pioneer in the GPU technologies and deep learning revolution, has come up with an excellent catalog of specialized containers that they call NGC Collections. In this article, we explore their basic usage and some variations.
Containers, Data Engineering, Deep Learning, NVIDIA
- Is the Modern Data Stack Leaving You Behind? - Nov 1, 2021.
The modern data stack narrative is largely dominated by analytics engineering. Where does that leave data engineers? Discover the difference between the MDS for data engineers & analytics engineers.
Analytics, Data Engineer, Data Engineering, Tools
- Level-Up This November with the ODSC West 2021 Keynotes and Training Sessions - Oct 20, 2021.
At ODSC West 2021 this November 16th-18th, we’ll have 80+ training sessions and workshops on essential tools and languages led by some of the best and brightest minds in data science and AI.
Data Engineering, Meetings, MLOps, ODSC, Online Education
- Data Engineering Technologies 2021 - Sep 21, 2021.
Emerging technologies supporting the field of data engineering are growing at a rapid clip. This curated list includes the most important offerings available in 2021.
Abacus.ai, Dask, Data Engineering, Databricks, Dataiku, DataRobot, dbt, Fivetran, Pachyderm

- Data Scientists Without Data Engineering Skills Will Face the Harsh Truth - Sep 14, 2021.
Although the role of the data scientist is still evolving, data remains at its core. Setting the right expectations for what you will do as a data scientist is important, and, to be sure, knowing the tools of data engineering will get you ready for the real world.
Data Engineering, Data Science Skills, Data Scientist
- The Most Important Tool for Data Engineers - Aug 26, 2021.
And it has nothing to do with Python or SQL
Career Advice, Data Engineer, Data Engineering
- Model Drift in Machine Learning – How To Handle It In Big Data - Aug 17, 2021.
Rendezvous Architecture helps you run and choose outputs from a Champion model and many Challenger models running in parallel without many overheads. The original approach works well for smaller data sets, so how can this idea adapt to big data pipelines?
Big Data, Data Engineering, Data Preparation, Machine Learning, Model Drift
- Development & Testing of ETL Pipelines for AWS Locally - Aug 2, 2021.
Typically, ETL pipelines are developed and tested on real environments or clusters, which are time-consuming to set up and require maintenance. This article focuses on the development and testing of ETL pipelines locally with the help of Docker & LocalStack. The solution gives flexibility to test in a local environment without setting up any services on the cloud.
AWS, Data Engineering, ETL, Pipeline
- dbt for Data Transformation – Hands-on Tutorial - Jul 28, 2021.
The data build tool (dbt) is gaining in popularity and use, and this hands-on tutorial covers creating complex models, using variables and functions, running tests, generating docs, and many more features.
Data Engineering, Data Preparation, dbt, ETL, SQL
- MLOps is an Engineering Discipline: A Beginner’s Overview - Jul 8, 2021.
MLOps = ML + DEV + OPS. MLOps is the idea of combining the long-established practice of DevOps with the emerging field of Machine Learning.
Data Engineering, Deployment, Machine Learning, MLOps, Modeling
- Analytics Engineering Everywhere - Jun 22, 2021.
Many new roles have appeared in the data world ever since the rise of the Data Scientist took the spotlight several years ago. Now, there is a new core player ready to take center stage, and within five years we may see nearly every organization with an Analytics Engineering team.
Analytics, Analytics Engineering, Data Engineering, dbt
- Beyond Brainless AI with a Feature Store - Jun 4, 2021.
An AI-powered product that is limited to the data available within its application is like a jellyfish: its autonomic system makes it functional, but it lacks a brain. However, you can evolve your models with data-enriched "brains" through the help of a feature store.
AI, Data Engineering, Feature Store, Machine Learning
- KDnuggets™ News 21:n20, May 26: Data Engineer, Data Scientists & Other Data Careers, Explained; Where Did You Apply Analytics & Data Science Recently? - May 26, 2021.
Data Scientist, Data Engineer & Other Data Careers, Explained; A Guide On How To Become A Data Scientist (Step By Step Approach); A checklist to track your Data Science progress; How to Determine if Your Machine Learning Model is Overtrained; Differentiable Programming from Scratch
Analytics, Careers, Data Engineer, Data Engineering, Data Science, Data Scientist, Poll, Survey
- DataOps: 5 things that you need to know - May 20, 2021.
DataOps (Data Operations) has assumed a critical role in the age of big data to drive definitive impact on business outcomes. This process-oriented and agile methodology synergizes the components of DevOps and the capabilities of data engineers and data scientists to support data-focused workloads in enterprises. Here is a detailed look at DataOps.
Data Engineer, Data Engineering, DataOps
- Why You Should Consider Being a Data Engineer Instead of a Data Scientist - Apr 27, 2021.
A new king of the jungle has emerged.
Career Advice, Data Engineer, Data Engineering, Data Science, Data Scientist
- Data careers are NOT one-size fits all! Tips for uncovering your ideal role in the data space - Apr 23, 2021.
Thriving as a data professional is about more than just making good money! It’s about FULFILLMENT & IMPACT. In this article, I will help you discover the BEST data role for you given your unique skill sets, personality & goals.
Career Advice, Careers, Data Engineering, Data Science
- Data vault: new weaponry in your data science toolkit - Mar 31, 2021.
Data Vault is a modern data modelling approach for capturing (historical) data in a structurally auditable and tractable way. While very helpful for data engineers, the Data Vault also enables Data Science in practice.
Business, Data Engineering, Data Science, Data Science Tools, Data Warehouse
- How to build a DAG Factory on Airflow - Mar 19, 2021.
A guide to building efficient DAGs with half of the code.
Data Engineering, Data Workflow, Graphs, Python, Workflow
- Wrangle Summit 2021: All the Best People, Ideas, and Technology in Data Engineering, All in One Place - Mar 18, 2021.
At Wrangle Summit 2021, Apr 7-9, you’ll get access to all the best people, ideas, and technology in data engineering, all in one place. Learn how to refine raw data and engineer unique data products, and gain insights from your data that can catalyze real, measurable business success.
Data Engineer, Data Engineering, Data Preparation, Google Cloud, Trifacta
- Introducing dbt, the ETL and ELT Disrupter - Mar 17, 2021.
Moving and processing data is happening 24/7/365 world-wide at massive scales that only get larger by the hour. Tools exist to introduce efficiencies in how data can be extracted from sources, transformed through calculations, and loaded into target data repositories. However, on their own, these tools can introduce some restrictions in the processing, especially for the needs of data analytics and data science.
Data Engineering, Data Preparation, dbt, ELT, ETL
- Data Science Learning Roadmap for 2021 - Feb 26, 2021.
Venturing into the world of Data Science is an exciting, interesting, and rewarding path to consider. There is a great deal to master, and this self-learning recommendation plan will guide you toward establishing a solid understanding of all that is foundational to data science as well as a solid portfolio to showcase your developed expertise.
Data Engineering, Data Preparation, Data Science, Data Science Education, Python, Roadmap, SQL
- KDnuggets™ News 21:n08, Feb 24: Powerful Exploratory Data Analysis in just two lines of code; Cartoon: Data Scientist vs Data Engineer - Feb 24, 2021.
Powerful Exploratory Data Analysis in just two lines of code; Cartoon: Data Scientist vs Data Engineer; Evaluating Deep Learning Models: The Confusion Matrix, Accuracy, Precision, and Recall; Feature Store as a Foundation for Machine Learning; Approaching (Almost) Any Machine Learning Problem
Cartoon, Data Analysis, Data Engineering, Data Science, Deep Learning, Machine Learning, Metrics, Python
- Data Observability, Part II: How to Build Your Own Data Quality Monitors Using SQL - Feb 23, 2021.
Using schema and lineage to understand the root cause of your data anomalies.
Data Engineering, Data Quality, Data Science, Data Science Platform, SQL
- Feature Store as a Foundation for Machine Learning - Feb 19, 2021.
With so many organizations now taking the leap into building production-level machine learning models, many lessons learned are coming to light about the supporting infrastructure. For a variety of important types of use cases, maintaining a centralized feature store is essential for higher ROI and faster delivery to market. In this review, the current feature store landscape is described, and you can learn how to architect one into your MLOps pipeline.
Data Engineering, Data Infrastructure, Data Lake, Feature Engineering, Feature Store, Machine Learning, Metadata, MLOps, Pipeline
- Data Observability: Building Data Quality Monitors Using SQL - Feb 16, 2021.
To trigger an alert when data breaks, data teams can leverage a tried and true tactic from our friends in software engineering: monitoring and observability. In this article, we walk through how you can create your own data quality monitors for freshness and distribution from scratch using SQL.
Data Engineering, Data Quality, Data Science, Data Science Platform, SQL
- KDnuggets™ News 21:n04, Jan 27: The Ultimate Scikit-Learn Machine Learning Cheatsheet; Building a Deep Learning Based Reverse Image Search - Jan 27, 2021.
The Ultimate Scikit-Learn Machine Learning Cheatsheet; Building a Deep Learning Based Reverse Image Search; Data Engineering — the Cousin of Data Science, is Troublesome; Going Beyond the Repo: GitHub for Career Growth in AI & Machine Learning; Popular Machine Learning Interview Questions
Cheat Sheet, Data Engineering, Data Science, Deep Learning, GitHub, Image Recognition, Machine Learning, scikit-learn, Search
- Data Engineering — the Cousin of Data Science, is Troublesome - Jan 22, 2021.
A Data Scientist must be a jack of many, many trades. Especially when working in broader teams, understanding the roles of others, such as data engineering, can help you validate progress and be aware of potential pitfalls. So, how can you convince your analysts to realize the importance of expanding their toolkit? Examples from real life often provide great insight.
Data Analyst, Data Engineer, Data Engineering, Data Scientist
- How to Get a Job as a Data Engineer - Jan 5, 2021.
Data engineering skills are currently in high demand. If you are looking for career prospects in this fast-growing profession, then these 10 skills and key factors will help you prepare to land an entry-level position in this field.
Career Advice, Data Engineer, Data Engineering
- The Future of Cloud is Now - Dec 22, 2020.
Our recent survey of over 130 top data engineers, data architects, and executives uncovered details and trends of the current state of data engineering and DataOps. Read our survey report to learn more about these trends as well as our predictions for future obstacles and our recommendations for avoiding them.
Cloud, Data Engineering, Data Platform, Immuta, Survey
- KDnuggets™ News 20:n46, Dec 9: Why the Future of ETL Is Not ELT, But EL(T); Introduction to Data Engineering - Dec 9, 2020.
Learn why the future of ETL is not ELT, but EL(T), and what that means; Read a great intro to Data Engineering; Get expert opinions on the main developments in 2020 and key trends in 2021 in AI, Data Science, Machine Learning; NoSQL for Beginners; and more.
2021 Predictions, Data Engineering, ETL, TensorFlow, Trends
- The Ultimate Guide to Data Engineer Interviews - Dec 7, 2020.
If you are preparing for data engineering interviews, then follow these technical recommendations regarding your resume, programming skills, SQL acumen, and system design problem-solving, as well as the non-technical aspects of your upcoming interview session.
Career Advice, Data Engineer, Data Engineering, Interview Questions, Programming, SQL
- Why the Future of ETL Is Not ELT, But EL(T) - Dec 4, 2020.
The well-established technologies and tools around ETL (Extract, Transform, Load) are undergoing a potential paradigm shift with new approaches to data storage and expanding cloud-based compute. Decoupling the EL from T could reconcile analytics and operational data management use cases, in a new landscape where data warehouses and data lakes are merging.
Data Analysis, Data Engineering, Data Lakes, Data Preparation, ELT, ETL
- Introduction to Data Engineering - Dec 3, 2020.
The Q&A for the most frequently asked questions about Data Engineering: What does a data engineer do? What is a data pipeline? What is a data warehouse? How is a data engineer different from a data scientist? What skills and programming languages do you need to learn to become a data engineer?
Analytics, Data Engineer, Data Engineering, Data Science, Skills
- The Rise of the Machine Learning Engineer - Nov 23, 2020.
The evolution of Big Data into machine learning applications ushered in an exciting era of new roles and skillsets that became necessary to implement these technologies. With the Machine Learning Engineer being such a crucial component today, where the evolution of this field will take us tomorrow should be fascinating.
Data Engineer, Data Engineering, Data Scientist, Machine Learning Engineer, Trends
- Top KDnuggets tweets, Nov 11-17: Data Engineering – the Cousin of Data Science, is Troublesome - Nov 18, 2020.
Also 6 Things About #DataScience that Employers Don't Want You to Know; NLP - Zero to Hero with #Python #NLProc; 5 Tricky SQL Queries Solved - Explaining the approach to solving a few complex #SQL queries.
Career Advice, Data Engineering, Data Science, NLP, SQL, Top tweets
- Moving from Data Science to Machine Learning Engineering - Nov 10, 2020.
The world of machine learning — and software — is changing. Read this article to find out how, and what you can do to stay ahead of it.
Career Advice, Data Engineering, Data Science, Machine Learning, Machine Learning Engineer
- The Missing Teams For Data Scientists - Nov 2, 2020.
Even today, too large a percentage of data science projects fail, and many of those failures can be attributed to how hard the absence of supporting data teams hits the data science team. Advocating for the missing data engineering and operations components to your team will make your professional life easier and more productive.
Data Engineering, Data Science Skills, Data Science Team, Data Scientist, Team
- You Don’t Have to Use Docker Anymore - Oct 29, 2020.
Docker is not the only containerization tool out there and there might just be better alternatives…
Containers, Data Engineering, DevOps, Docker
- Apache Spark Cluster on Docker - Jul 22, 2020.
Build your own Apache Spark cluster in standalone mode on Docker with a JupyterLab interface.
Apache Spark, Data Engineering, Docker, Jupyter, Python
- Skills to Build for Data Engineering - Jun 4, 2020.
This article jumps into the latest skill set observations in the Data Engineering Job Market which could definitely add a boost to your existing career or assist you in starting off your Data Engineering journey.
Career Advice, Data Engineering, Skills
- Why and How to Use Dask with Big Data - Apr 15, 2020.
The Pandas library for Python is a game-changer for data preparation. But, when the data gets big, really big, then your computer needs more help to efficiently handle all that data. Learn more about how to use Dask and follow a demo to scale up your Pandas to work with Big Data.
Big Data, Dask, Data Engineering
- Five Interesting Data Engineering Projects - Mar 17, 2020.
As the role of the data engineer continues to grow in the field of data science, so does the number of tools being developed to support wrangling all that data. Five of these tools are reviewed here (along with a few bonus tools) that you should pay attention to for your data pipeline work.
Dask, Data Engineering, dbt, DVC, Python
- 7 Data Trends for 2020 (and one non-trend) - Feb 24, 2020.
This article discusses trends that will (and won't) take shape in 2020.
2020 Predictions, Data Engineering, Data Science, Trends
- In Loving Memory of Strictly-Typed Schemas - Feb 20, 2020.
This article addresses one very peculiar manifestation of marketing propaganda in the big data industry that has crippled data engineers across the board — a resolute and methodical undermining of the sanctity of strictly-typed schemas.
Big Data, Data Engineering, Database
- Scaling the Wall Between Data Scientist and Data Engineer - Feb 17, 2020.
The educational and research focuses of machine learning tend to highlight the model building, training, testing, and optimization aspects of the data science process. To bring these models into use requires a suite of engineering feats and organization, a standard for which does not yet exist. Learn more about a framework for operating a collaborative data science and engineering team to deploy machine learning models to end-users.
Advice, Data Engineer, Data Engineering, Data Scientist, Deployment, DevOps, Machine Learning Engineer, MLflow, MLOps, Production
- Observability for Data Engineering - Feb 10, 2020.
Going beyond traditional monitoring techniques and goals, understanding if a system is working as intended requires a new concept in DevOps, called Observability. Learn more about this essential approach to bring more context to your system metrics.
Data Engineering, DevOps, Explainability, KPI, Monitoring, Time Series
- 7 Resources to Becoming a Data Engineer - Jan 7, 2020.
An estimated 8,650% growth of the volume of Data to 175 zettabytes from 2010 to 2025 has created an enormous need for Data Engineers to build an organization's big data platform to be fast, efficient and scalable.
Advice, Big Data, Cloud Computing, Data Engineering, Data Science, MOOC, SQL
- Four questions to help accurately scope analytics engineering project - Oct 9, 2019.
Being really good at scoping analytics projects is crucial for team productivity and profitability. You can consistently deliver on time if you work out the issue first, and these four questions can help you prepare.
Analytics, Data Engineering, dbt, Deployment
- The thin line between data science and data engineering - Sep 25, 2019.
Today, as companies have finally come to understand the value that data science can bring, more and more emphasis is being placed on the implementation of data science in production systems. And as these implementations have required models that can perform on larger and larger datasets in real-time, an awful lot of data science problems have become engineering problems.
Data Engineering, Data Science, Podcast
- Mongo DB Basics - Jun 5, 2019.
MongoDB is a document-oriented NoSQL database, unlike HBase, which is a wide-column store. The advantage of the document-oriented model over the relational model is that fields can be changed as and when required for each document, as opposed to having the same columns for all rows.
Big Data, Data Engineering, Data Science, MongoDB
- 7 “Gotchas” for Data Engineers New to Google BigQuery - Mar 28, 2019.
Here are some things that might take some getting used to when new to Google BigQuery, along with mitigation strategies where I’ve found them.
BigQuery, Data Engineer, Data Engineering, Google
- KDnuggets™ News 19:n10, Mar 6: What no one will tell you about data science job applications; The rise of ML Engineering - Mar 6, 2019.
Also most impactful AI trends of 2018: The rise of ML Engineering; How to do Everything in Computer Vision; GANs Need Some Attention, Too; OpenAI GPT-2.
Career, Computer Vision, Data Engineering, Data Science Team, Machine Learning, OpenAI
- On Building Effective Data Science Teams - Mar 4, 2019.
We take a look at the qualities that make a successful data team in order to help business leaders and executives create better AI strategies.
CRISP-DM, Data Analyst, Data Engineering, Data Governance, Data Science Team, Machine Learning Engineer
- UnitedHealth Group: Sr Manager, Data Engineering [Minnetonka, MN] - Nov 19, 2018.
UnitedHealth Group is seeking a Sr Manager, Data Engineering in Minnetonka, MN. The position will work with our report developers and analysts to help set the vision and deliver data assets that drive insights and opportunities for the digital product teams.
Data Engineering, Healthcare, Manager, Minnetonka, MN, UnitedHealth Group
- Things you should know when traveling via the Big Data Engineering hype-train - Oct 8, 2018.
Maybe you want to join the Big Data world? Or maybe you are already there and want to validate your knowledge? Or maybe you just want to know what Big Data Engineers do and what skills they use? If so, you may find the following article quite useful.
Big Data, Big Data Hype, Data Engineering, Hype
- Crunch Data Engineering and Analytics Conference, 29-31 October, Budapest - Sep 21, 2018.
The biggest (and anecdotally best) data engineering and analytics conference in the CEE region is back! Practical Data Engineering and Data Analytics talks will take over Budapest, 29-31 October. Best part: discounted 3-in-1 tickets for Crunch, Amuse and Impact.
Budapest, Business Analytics, Crunch Conference, Data Engineering, Data Science, Hungary, IBM
- A Winning Game Plan For Building Your Data Science Team - Sep 18, 2018.
We need to understand the responsibilities, capabilities, expectations and competencies of the Data Engineer, Data Scientist and Business Stakeholder.
Data Engineering, Data Science, Data Science Team
- Scientific debt – what does it mean for Data Science? - May 23, 2018.
This article analyses scientific debt - what it is and what it means for data science.
Business, Data Engineering, Data Science, DataCamp, Technical Debt
- DSTI: Applied MSc in Data Engineering, Advanced MSc in AI – Learn in France - May 14, 2018.
DSTI launches 2 new programmes for October 2018 entry: Applied MSc in Data Engineering and Advanced MSc in AI - Paris, Nice, and online.
AI, Data Engineering, Data Science Education, DSTI, France, Master of Science, Paris
- KDnuggets™ News 18:n12, Mar 21: Will GDPR Make Machine Learning Illegal?; 5 Things You Need to Know about Big Data - Mar 21, 2018.
Also: A Beginner's Guide to Data Engineering - Part II; Introduction to Optimization with Genetic Algorithm; Introduction to Markov Chains; Your free 70-page guide to a career in data science
Big Data, Data Engineering, Data Science, GDPR, Machine Learning, Markov Chains, Optimization
- A Beginner’s Guide to Data Engineering – Part II - Mar 15, 2018.
In this post, I share more technical details on how to build good data pipelines and highlight ETL best practices. Primarily, I will use Python, Airflow, and SQL for our discussion.
AirBnB, Data Engineering, Data Science, ETL, Pipeline, Python, SQL
- KDnuggets™ News 18:n05, Jan 31: Feynman Technique to become a Data Scientist; 4 Big Data Trends for 2018; Data Scientist – best job in America - Jan 31, 2018.
Also How To Grow As A Data Scientist; A Beginner Guide to Data Engineering; Exclusive Interview: Doug Laney on Big Data and Infonomics
Advice, Data Engineering, Data Scientist, scikit-learn, Trends
- A Beginner’s Guide to Data Engineering – Part I - Jan 25, 2018.
Data Engineering: The Close Cousin of Data Science.
Data Engineer, Data Engineering, ETL, Pipeline
- Strata Data Conference – 3 reasons to attend, Sep 25-28, NYC - Sep 7, 2017.
Data is driving business transformation. Come to Strata Data Conference and learn how to turn algorithms into business advantage, build modern data strategies, and spend quality time with experts. Use code KDNU to save.
Big Data, Data Engineering, New York City, NY, Strata
- What data has to teach us about deep learning? - Sep 4, 2017.
Budapest is calling Data Scientists and Data engineers to CRUNCH Conference, Oct 18-20. CRUNCH will feature talks from Google, Airbnb, Tesla, LinkedIn, Netflix, Uber, and more. Use code KDnuggetsAtCrunch to save.
Budapest, Data Engineering, Data Science, Google, Hungary, Netflix, Tesla
- 37 Reasons why your Neural Network is not working - Aug 22, 2017.
Over the course of many debugging sessions, I’ve compiled my experience along with the best ideas around in this handy list. I hope they will be useful to you.
Data Engineering, Data Preparation, Gradient Descent, Neural Networks
- Strata Data Conference, the reunion of data brain trust – KDnuggets Offer - Aug 8, 2017.
Strata Data Conference, the annual reunion of data brain trust, is Sept 25-28 in New York. Early price ends Aug 11 - save more with code KDNU.
Data Engineering, Data Science, Fintech, Machine Learning, New York City, NY, Strata
- Jimdo: Team Lead Data - Jul 13, 2017.
Shape a technological vision for our data department and passionately manage a great and diverse team of very skilled engineers, scientists and analysts.
Data Engineering, Germany, Hamburg, Jimdo, Manager
- 5 Career Paths in Big Data and Data Science, Explained - Feb 6, 2017.
Sexiest job... massive shortage... blah blah blah. Are you looking to get a real handle on the career paths available in "Data Science" and "Big Data?" Read this article for insight on where to look to sharpen the required entry-level skills.
Big Data, Career, Data Analyst, Data Engineering, Data Infrastructure, Data Science, Explained, Machine Learning
- Why the Data Scientist and Data Engineer Need to Understand Virtualization in the Cloud - Jan 25, 2017.
This article covers the value of understanding the virtualization constructs for the data scientist and data engineer as they deploy their analysis onto all kinds of cloud platforms. Virtualization is a key enabling layer of software for these data workers to be aware of and to achieve optimal results from.
Cloud, Data Engineer, Data Engineering, Data Science, Data Scientist, Virtualization
- How to Choose a Data Format - Nov 3, 2016.
In any data analytics project, after the business understanding phase, understanding the data and selecting the right data format and ETL tools are very important tasks. This article explains a useful and practical set of guidelines covering the data format selection and ETL phases of the project lifecycle.
Data Cleaning, Data Engineering, Data Preparation, ETL, Hadoop, HDFS
- Behind the Dream of Data Work as it Could Be - Sep 13, 2016.
This post is an insider's overview of data.world, and their attempt to build the most meaningful, collaborative, and abundant data resource in the world.
Data Analysis, Data Engineering, Data Preparation, Data Preprocessing, Data Science, Data.world
- Data Science, Data Engineering Bootcamp, Seattle, Oct 10-14 - Aug 8, 2016.
Data Science Dojo will be teaching a comprehensive five-day Data Science & Data Engineering Bootcamp in Seattle on October 10 - 14. Register today!
Bootcamp, Data Engineering, Data Science Education, Seattle, WA
- Connecting Data Systems and DevOps - Jun 17, 2016.
This post will explain why anyone transforming their company into a data-driven organization should care about software development best practices, even if they don’t consider themselves a software company.
Data Engineering, Data Science, Developers, DevOps, Software Engineering
- Building Data Systems: What Do You Need? - Jun 3, 2016.
This post shares some insight gained through years of building data-powered products, and discusses the capabilities you need to have in place in order to successfully build and maintain data systems and data infrastructure.
Big Data, Data Engineering, Deployment, Quality Control
- Engineers Shouldn’t Write ETL: A Guide to Building a High Functioning Data Science Department - Mar 28, 2016.
An exploration of data science team building, with insight into why engineers should not write ETL, and other not-so-subtle pieces of advice.
Advice, Data Engineering, Data Scientist, ETL, Stitch Fix
- ZocDoc: Engineering Manager, Data Engineering - Feb 3, 2015.
Run Data Engineering team to create a hardcore data and analytics infrastructure for our business - working with our teams of data scientists and business analysts to transform data into information enabling the next generation of ZocDoc insight and products.
Data Engineering, Manager, New York City, NY, USA, ZocDoc
- Civis Analytics: Data Scientist – Engineering (Senior and Junior roles) - Oct 7, 2014.
Founded by a team from Obama 2012, we are helping companies, non-profits, and campaigns leverage their data. Integrate, scale, and optimize our team data science methods, techniques, and best practices to run on very large datasets at high speeds.
Chicago-IL, Civis Analytics, Data Engineering, Data Scientist
- Health Integrated: Manager of Data Engineering - Sep 18, 2014.
Responsible for all aspects of data exchange processes and software in collaboration with developers, business analysts, product managers, and program managers.
Data Engineering, Health Integrated, Manager, Tampa-FL