- Advanced Statistical Concepts in Data Science, by Nagesh Singh Chauhan - Sep 30, 2021.
The article contains some of the most commonly used advanced statistical concepts along with their Python implementation.
Career Advice, Data Science, Distribution, Probability, Statistics
- Use These Unique Data Sets to Sharpen Your Data Science Skills, by U. of North Florida - Sep 29, 2021.
Want to get your hands on some real-world data sets right now? Kick off your bootcamp prep with this list of hot-button data sets curated to help you hone different data science skills.
Data Science Skills, Datasets
- GitHub Desktop for Data Scientists, by Drew Seewald - Sep 29, 2021.
Less scary than version control in the command line.
Data Science, Data Scientist, GitHub, Version Control
- Important Statistics Data Scientists Need to Know, by Lekshmi Sunil - Sep 29, 2021.
Every data scientist, from the enthusiast to the professional, needs a solid grasp of several fundamental statistical concepts. Here, we provide Python code snippets to build understanding and give you key tools for gaining early insight into your data.
Bayes Theorem, Data Science, Probability, Statistics
- How To Build A Database Using Python, by Irfan Alghani Khalid - Sep 28, 2021.
Implement your database without handling the SQL using the Flask-SQLAlchemy library.
Databases, Flask, Python, SQL
- Building a Structured Financial Newsfeed Using Python, SpaCy and Streamlit, by Harshit Tyagi - Sep 28, 2021.
Getting started with NLP by building a Named Entity Recognition(NER) application.
Finance, NLP, Python, spaCy, Streamlit
- Computer Vision in Agriculture, by Kevin Vu - Sep 27, 2021.
Deep learning isn’t just for placing ads or identifying cats anymore. Instead, a slew of young startups have started to incorporate the advances in computer vision made possible by larger and larger neural networks into real working robots in the fields.
Agriculture, AI, Computer Vision
- Path to Full Stack Data Science, by Jawwad Siddique - Sep 27, 2021.
Start your journey toward mastering all aspects of the field of Data Science with this focused list of in-depth self-learning resources. Curated with the beginner in mind, these recommendations will help you learn efficiently, and can also offer existing professionals useful highlights for review or help filling in any gaps in skills.
Career Advice, Data Science, Data Science Education, Data Visualization, Mathematics, Python, R, Roadmap
- Zero to RAPIDS in Minutes with NVIDIA GPUs + Saturn Cloud, by Schmitt & Nolis - Sep 27, 2021.
Managing large-scale data science infrastructure presents significant challenges. With Saturn Cloud, managing GPU-based infrastructure is made easier, allowing practitioners and enterprises to focus on solving their business challenges.
GPU, NVIDIA, Python, Saturn Cloud
- Data Analysis Using Scala, by Roman Zykov - Sep 24, 2021.
It is very important to choose the right tool for data analysis. On the Kaggle forums, where international Data Science competitions are held, people often ask which tool is better. R and Python are at the top of the list. In this article we will tell you about an alternative stack of data analysis technologies, based on Scala.
Data Science, Machine Learning, Scala, Spark, YARN
- Real-Time Histogram Plots on Unbounded Data, by Romain Picard - Sep 24, 2021.
Using histograms on real-time data is not possible in most of the popular data science libraries. In this article you will learn how to dynamically compute and display a histogram within a Python notebook.
Data Visualization, Histogram, Real-time, Statistics
- How To Deal With Imbalanced Classification, Without Re-balancing the Data, by David B Rosen (PhD) - Sep 23, 2021.
Before considering oversampling your skewed data, try adjusting your classification decision threshold, in Python.
Balancing Classes, Classification, Python, Unbalanced
- A Breakdown of Deep Learning Frameworks, by Kevin Vu - Sep 23, 2021.
Deep Learning continues to evolve as one of the most powerful techniques in the AI toolbox. Many software packages exist today to support the development of models, and we highlight important options available with key qualities and differentiators to help you select the most appropriate for your needs.
Deep Learning, Keras, MATLAB, MXNet, PyTorch, TensorFlow
- 9 Outstanding Reasons to Learn Python for Finance, by Zulie Rane - Sep 23, 2021.
Is Python good for learning finance and working in the financial world? The answer is not only a resounding YES, but yes for nine very good reasons. This article gets into the details behind why Python is a must-know programming language for anyone who wants to work in the financial sector.
Finance, Python
- GitHub Copilot and the Rise of AI Language Models in Programming Automation, by Kevin Vu - Sep 22, 2021.
Read on to learn more about what makes Copilot different from previous autocomplete tools (including TabNine), and why this particular tool has been generating so much controversy.
AI, Automation, GitHub, NLP, Programming
- 20 Machine Learning Projects That Will Get You Hired, by Khushbu Shah - Sep 22, 2021.
If you want to break into the machine learning and data science job market, then you will need to demonstrate your proficiency, especially if you are self-taught through online courses and bootcamps. A project portfolio is a great way to practice your new craft and offer convincing evidence that an employer should hire you over the competition.
Career, Machine Learning, Project
- 15 Must-Know Python String Methods, by Soner Yıldırım - Sep 21, 2021.
It is not always about numbers.
Data Processing, NLP, Python, Text Analytics
- Data Engineering Technologies 2021, by Tech Ninja - Sep 21, 2021.
Emerging technologies supporting the field of data engineering are growing at a rapid clip. This curated list includes the most important offerings available in 2021.
Abacus.ai, Dask, Data Engineering, Databricks, Dataiku, DataRobot, dbt, Fivetran, Pachyderm
- If You Can Write Functions, You Can Use Dask, by Hugo Shi - Sep 21, 2021.
This article is the second article of an ongoing series on using Dask in practice. Each article in this series will be simple enough for beginners, but provide useful tips for real work. The first article in the series is about using LocalCluster.
Cloud, Dask, Python, Saturn Cloud
- Don’t Touch a Dataset Without Asking These 10 Questions, by Sandeep Uttamchandani - Sep 20, 2021.
Selecting the right dataset is critical for the success of your AI project.
Datasets, Distribution, Outliers, Privacy, Standardization
- How to be a Data Scientist without a STEM degree, by Terence Shin - Sep 20, 2021.
Breaking into data science as a professional does require technical skills, a well-honed knack for problem-solving, and a willingness to swim in oceans of data. Maybe you are coming in as a career change or ready to take a new learning path in life--without having previously earned an advanced degree in a STEM field. Follow these tips to find your way into this high-demand and interesting field.
Career Advice, Data Science Education, Data Scientist, Project, Python, SQL
- How to Find Weaknesses in your Machine Learning Models, by Michael Berk - Sep 20, 2021.
FreaAI: a new method from researchers at IBM.
Interpretability, Machine Learning, Modeling, Statistics
- Paradoxes in Data Science, by Pier Paolo Ippolito - Sep 17, 2021.
Have a look at some of the main paradoxes associated with Data Science and its statistical foundations.
Data Science, Statistics
- Introducing TensorFlow Similarity, by Matthew Mayo - Sep 17, 2021.
TensorFlow Similarity is a newly-released library from Google that facilitates the training, indexing and querying of similarity models. Check out more here.
Google, Neural Networks, TensorFlow
- Adventures in MLOps with Github Actions, Iterative.ai, Label Studio and NBDEV, by Soellinger & Kunz - Sep 16, 2021.
This article documents the authors' experience building their custom MLOps approach.
GitHub, Machine Learning, MLOps, Pipeline, Python, Workflow
- The Machine & Deep Learning Compendium Open Book, by Ori Cohen - Sep 16, 2021.
After years in the making, this extensive and comprehensive ebook resource is now available and open for data scientists and ML engineers. Learn from and contribute to this tome of valuable information to support all your work in data science from engineering to strategy to management.
Deep Learning, ebook, GitHub, Machine Learning, Open Source
- Introduction to Automated Machine Learning, by Kevin Vu - Sep 15, 2021.
AutoML enables developers with limited ML expertise (and coding experience) to train high-quality models specific to their business needs. For this article, we will focus on AutoML systems which cater to everyday business and technology applications.
Automated Machine Learning, AutoML, Machine Learning, Python
- How to get Python PCAP Certification: Roadmap, Resources, Tips For Success, Based On My Experience, by Mehul Singh - Sep 15, 2021.
Follow this journey of personal experience -- with useful tips and learning resources -- to help you achieve the PCAP Certification, one of the most reputed Python Certifications, to validate your knowledge against International Standards.
Advice, Certification, Python, Tips
- 5 Must Try Awesome Python Data Visualization Libraries, by Roja Achary - Sep 15, 2021.
The goal of data visualization is to communicate data or information clearly and effectively to readers. Here are 5 must try awesome Python libraries for helping you do so, with overviews and links to quick start guides for each.
Data Visualization, Matplotlib, Plotly, Python, Seaborn
- Speeding up Neural Network Training With Multiple GPUs and Dask, by Jacqueline Nolis - Sep 14, 2021.
A common moment when training a neural network is when you realize the model isn’t training quickly enough on a CPU and you need to switch to using a GPU. It turns out multi-GPU model training across multiple machines is pretty easy with Dask. This blog post is about my first experiment in using multiple GPUs with Dask and the results.
Dask, GPU, Neural Networks, Training
- Data Scientists Without Data Engineering Skills Will Face the Harsh Truth, by Soner Yildirim - Sep 14, 2021.
Although the role of the data scientist is still evolving, data remains at its core. Setting the right expectations for what you will do as a data scientist is important, and, to be sure, knowing the tools of data engineering will get you ready for the real world.
Data Engineering, Data Science Skills, Data Scientist
- An Introduction to Reinforcement Learning with OpenAI Gym, RLlib, and Google Colab, by Galarnyk & Mika - Sep 14, 2021.
Get an Introduction to Reinforcement Learning by attempting to balance a virtual CartPole with OpenAI Gym, RLlib, and Google Colab.
Google Colab, OpenAI, Python, Reinforcement Learning
- The Prefect Way to Automate & Orchestrate Data Pipelines, by Thuwarakesh Murallie - Sep 13, 2021.
I am migrating all my ETL work from Airflow to this super-cool framework.
Airflow, Data Workflow, Pipeline, Prefect, Python
- 3 Most Important Lessons I’ve Learned 3 Years Into My Data Science Career, by Terence Shin - Sep 13, 2021.
After only 3 years of working as a data professional, many tried-and-true lessons can be learned. Here are 3 of the most important lessons learned with key takeaways and reflections shared.
Career, Career Advice, Communication, Data Science, Data Science Skills
- Working with Python APIs For Data Science Project, by Nate Rosidi - Sep 10, 2021.
In this article, we will work with the YouTube Python API to collect video statistics from our channel, using the requests Python library to make an API call and save the result as a Pandas DataFrame.
API, Data Science, Project, Python
- A Data Science Portfolio That Will Land You The Job, by Natassha Selvaraj - Sep 10, 2021.
Landing a data science job is no easy feat, especially during the COVID-19 pandemic. This article provides aspiring data scientists with advice on building a data science portfolio that stands out.
Data Science, Jobs, Portfolio
- Text Preprocessing Methods for Deep Learning, by Kevin Vu - Sep 10, 2021.
While the preprocessing pipeline we are focusing on in this post is mainly centered around Deep Learning, most of it will also be applicable to conventional machine learning models.
Data Preprocessing, Data Processing, Deep Learning, NLP, Text Analytics
- How to Create an AutoML Pipeline Optimization Sandbox, by Matthew Mayo - Sep 9, 2021.
In this article, we will implement an automated machine learning pipeline optimization sandbox web app using Streamlit and TPOT.
Automated Machine Learning, AutoML, Python, Streamlit
- 8 Deep Learning Project Ideas for Beginners, by Aqsa Zafar - Sep 9, 2021.
Have you studied Deep Learning techniques, but never worked on a useful project? Here, we highlight eight deep learning project ideas for beginners that will help you sharpen your skills and boost your resume.
Beginners, Deep Learning, Project
- 7 Differences Between a Data Analyst and a Data Scientist, by Zulie Rane - Sep 9, 2021.
This article discusses the 7 key differences between data analysts and data scientists with an aim to help potential data analysts/scientists determine which is the right one for them. I touch on day-to-day tasks, skill requirements, typical career progression, and salary and career prospects for both.
Career Advice, Data Analysis, Data Analyst, Data Science, Data Scientist
- Top 18 Low-Code and No-Code Machine Learning Platforms, by Yulia Gavrilova - Sep 8, 2021.
Machine learning becomes more accessible to companies and individuals when there is less coding involved. Especially if you are just starting your path in ML, then check out these low-code and no-code platforms to help expedite your capabilities in learning and applying AI.
AutoML, Data Science Platforms, Low-Code, Machine Learning, No-Code
- How Machine Learning Leverages Linear Algebra to Solve Data Problems, by Harshit Tyagi - Sep 7, 2021.
Why you should learn the fundamentals of linear algebra.
Data Science, Linear Algebra, Machine Learning, Mathematics
- ebook: Learn Data Science with R – free download, by Narayana Murthy - Sep 7, 2021.
Check out this new book for data science beginners, which covers statistics, R, graphing, and machine learning with many practical examples. As a source to learn the full breadth of data science foundations, "Learn Data Science with R" starts at the beginner level and gradually progresses into expert content.
Data Science, Data Science Education, ebook, R
- How to Create Stunning Web Apps for your Data Science Projects, by Murallie Thuwarakesh - Sep 7, 2021.
Data scientists do not have to learn HTML, CSS, and JavaScript to build web pages.
Apps, Data Science, Python, Streamlit
- Fast AutoML with FLAML + Ray Tune, by Wu, Wang, Baum, Liaw & Galarnyk - Sep 6, 2021.
Microsoft Researchers have developed FLAML (Fast Lightweight AutoML) which can now utilize Ray Tune for distributed hyperparameter tuning to scale up FLAML’s resource-efficient & easily parallelizable algorithms across a cluster.
Automated Machine Learning, AutoML, Hyperparameter, Machine Learning, Microsoft, Python, Ray
- Five Key Facts About Wu Dao 2.0: The Largest Transformer Model Ever Built, by Jesus Rodriguez - Sep 6, 2021.
The record-setting model combines some clever research and engineering methods.
AI, NLP, Transformer
- 6 Cool Python Libraries That I Came Across Recently, by Dhilip Subramanian - Sep 3, 2021.
Check out these awesome Python libraries for Machine Learning.
Data Science, Machine Learning, Python
- Build a synthetic data pipeline using Gretel and Apache Airflow, by Drew Newberry - Sep 2, 2021.
In this blog post, we build an ETL pipeline that generates synthetic data from a PostgreSQL database using Gretel’s Synthetic Data APIs and Apache Airflow.
Airflow, Pipeline, Postgres, SQL, Synthetic Data
- Best Resources to Learn Natural Language Processing in 2021, by Aqsa Zafar - Sep 2, 2021.
In this article, the author has listed all the best resources to learn natural language processing, including Online Courses, Tutorials, Books, and YouTube Videos.
Books, Courses, NLP, Youtube
- Do You Read Excel Files with Python? There is a 1000x Faster Way, by Nicolas Vandeput - Sep 1, 2021.
In this article, I’ll show you five ways to load data in Python, achieving a speedup of 3 orders of magnitude.
Excel, Microsoft, Pandas, Python, Scalability
- Data Science Cheat Sheet 2.0, by Aaron Wang - Sep 1, 2021.
Check out this helpful, 5-page data science cheat sheet to assist with your exam reviews, interview prep, and anything in-between.
Cheat Sheet, Data Science
- How is Machine Learning Beneficial in Mobile App Development?, by Ria Katiyar - Sep 1, 2021.
Mobile app developers have a lot to gain by implementing AI & Machine Learning from the revolutionary changes that these disruptive technologies can offer. This is due to AI and ML's potential to strengthen mobile applications, providing for smoother user experiences capable of leveraging powerful features.
App, Development, Machine Learning, Mobile