Blog / News
- KDnuggets Top Blogs Rewards for August 2021, by Gregory Piatetsky [Top ] - Sep 15, 2021.
These top blogs were winners of the KDnuggets Top Blog Rewards Program for August: Automate Microsoft Excel and Word Using Python,
- DATAcated Expo, Oct 5, Live-streamed, Explore new AI / Data Science Tech, by DATAcated [Prod] - Sep 15, 2021.
The DATAcated Expo, hosted by DATAcated founder Kate Strachnyi, is coming up on October 5, 2021 from 11am - 6pm ET. Live-streamed on LinkedIn, the free event provides the community with an opportunity to explore and discover innovative technologies in data science & analytics.
- Introduction to Automated Machine Learning, by Kevin Vu [Tuto] - Sep 15, 2021.
AutoML enables developers with limited ML expertise (and coding experience) to train high-quality models specific to their business needs. For this article, we will focus on AutoML systems which cater to everyday business and technology applications.
- How to get Python PCAP Certification: Roadmap, Resources, Tips For Success, Based On My Experience, by Mehul Singh [Tuto] - Sep 15, 2021.
Follow this journey of personal experience -- with useful tips and learning resources -- to help you achieve the PCAP Certification, one of the most reputed Python Certifications, to validate your knowledge against International Standards.
- 5 Must Try Awesome Python Data Visualization Libraries, by Roja Achary [Tuto] - Sep 15, 2021.
The goal of data visualization is to communicate data or information clearly and effectively to readers. Here are 5 must try awesome Python libraries for helping you do so, with overviews and links to quick start guides for each.
- KDnuggets™ News 21:n35, Sep 15: A Data Science Portfolio That Will Land You The Job; Top 18 Low-Code and No-Code Machine Learning Platforms, by KDnuggets - Sep 15, 2021.
Here is a Data Science Portfolio that will land you the job; Review the top 18 Low-Code and No-Code Machine Learning platforms; Try these 8 Deep Learning Project Ideas for Beginners; Very useful - working with Python APIs for data science project.
- Top August Stories: Automate Microsoft Excel and Word Using Python; The Difference Between Data Scientists and ML Engineers, by KDnuggets [Top ] - Sep 14, 2021.
Automate Microsoft Excel and Word Using Python; The Difference Between Data Scientists and ML Engineers; Most Common Data Science Interview Questions and Answers; 3 Reasons Why You Should Use Linear Regression Models Instead of Neural Networks
- Amazon Web Services Webinar: Boost customer satisfaction and sales with consumer insights data, by Roidna [Prod] - Sep 14, 2021.
Join this webinar, Sep 27, to learn how to leverage external data to understand market needs and consumer behavior – helping you build a more customer-centric business.
- Speeding up Neural Network Training With Multiple GPUs and Dask, by Jacqueline Nolis [Tuto] - Sep 14, 2021.
A common moment when training a neural network is when you realize the model isn’t training quickly enough on a CPU and you need to switch to using a GPU. It turns out multi-GPU model training across multiple machines is pretty easy with Dask. This blog post is about my first experiment in using multiple GPUs with Dask and the results.
- Data Scientists Without Data Engineering Skills Will Face the Harsh Truth, by Soner Yildirim [Tuto] - Sep 14, 2021.
Although the role of the data scientist is still evolving, data remains at its core. Setting the right expectations for what you will do as a data scientist is important, and, to be sure, knowing the tools of data engineering will get you ready for the real world.
- An Introduction to Reinforcement Learning with OpenAI Gym, RLlib, and Google Colab, by Galarnyk & Mika [Tuto] - Sep 14, 2021.
Get an Introduction to Reinforcement Learning by attempting to balance a virtual CartPole with OpenAI Gym, RLlib, and Google Colab.
- Top Stories, Sep 6-12: Do You Read Excel Files with Python? There is a 1000x Faster Way; 8 Deep Learning Project Ideas for Beginners, by KDnuggets [Top ] - Sep 13, 2021.
Also: How to Create Stunning Web Apps for your Data Science Projects; A Data Science Portfolio That Will Land You The Job; Top 18 Low-Code and No-Code Machine Learning Platforms; 8 Deep Learning Project Ideas for Beginners
- 85% of data science projects fail – here’s how to avoid it, by SparkBeyond [Prod] - Sep 13, 2021.
Here are a few common traps that data scientists can avoid so that their work is NOT among the 85% of data science projects that fail.
- The Prefect Way to Automate & Orchestrate Data Pipelines, by Thuwarakesh Murallie [Tuto] - Sep 13, 2021.
I am migrating all my ETL work from Airflow to this super-cool framework.
- 3 Most Important Lessons I’ve Learned 3 Years Into My Data Science Career, by Terence Shin [Tuto] - Sep 13, 2021.
After only 3 years of working as a data professional, many tried-and-true lessons can be learned. Here are 3 of the most important lessons learned with key takeaways and reflections shared.
- How Many AI Neurons Does It Take to Simulate a Brain Neuron?, by Jesus Rodriguez [Opin] - Sep 13, 2021.
New research shows some shocking answers to that question.
- Working with Python APIs For Data Science Project, by Nathan Rosidi [Tuto] - Sep 10, 2021.
In this article, we will work with the YouTube Python API to collect video statistics from our channel, using the requests Python library to make an API call and saving the result as a Pandas DataFrame.
- A Data Science Portfolio That Will Land You The Job, by Natassha Selvaraj [Tuto] - Sep 10, 2021.
Landing a data science job is no easy feat, especially during the COVID-19 pandemic. This article provides aspiring data scientists with advice on building a data science portfolio that stands out.
- Text Preprocessing Methods for Deep Learning, by Kevin Vu [Tuto] - Sep 10, 2021.
While the preprocessing pipeline we are focusing on in this post is mainly centered around Deep Learning, most of it will also be applicable to conventional machine learning models.
- How to Create an AutoML Pipeline Optimization Sandbox, by Matthew Mayo [Tuto] - Sep 9, 2021.
In this article, we will implement an automated machine learning pipeline optimization sandbox web app using Streamlit and TPOT.
- 8 Deep Learning Project Ideas for Beginners, by Aqsa Zafar [Tuto] - Sep 9, 2021.
Have you studied Deep Learning techniques, but never worked on a useful project? Here, we highlight eight deep learning project ideas for beginners that will help you sharpen your skills and boost your resume.
- 7 Differences Between a Data Analyst and a Data Scientist, by Zulie Rane [Tuto] - Sep 9, 2021.
This article discusses the 7 key differences between data analysts and data scientists with an aim to help potential data analysts/scientists determine which is the right one for them. I touch on day-to-day tasks, skill requirements, typical career progression, and salary and career prospects for both.
- 300 Data Science Leaders Share What’s Holding Their Teams Back, by Domino [Prod] - Sep 8, 2021.
Flawed investments in people, processes, and tools are crushing potential business impact.
- Smart Ingestion: Using ontology-driven AI, by Prad Upadrashta [Opin] - Sep 8, 2021.
Imagine data that organizes itself to power your decision-making.
- Top 18 Low-Code and No-Code Machine Learning Platforms, by Yulia Gavrilova [Tuto] - Sep 8, 2021.
Machine learning becomes more accessible to companies and individuals when there is less coding involved. If you are just starting your path in ML, check out these low-code and no-code platforms to help expedite your capabilities in learning and applying AI.
- Math 2.0: The Fundamental Importance of Machine Learning, by Dr. Claus Horn [Opin] - Sep 8, 2021.
Machine learning is not just another way to program computers; it represents a fundamental shift in the way we understand the world. It is Math 2.0.
- KDnuggets™ News 21:n34, Sep 8: Do You Read Excel Files with Python? There is a 1000x Faster Way; Hypothesis Testing Explained, by KDnuggets - Sep 8, 2021.
Do You Read Excel Files with Python? There is a 1000x Faster Way; Hypothesis Testing Explained; Data Science Cheat Sheet 2.0; 6 Cool Python Libraries That I Came Across Recently; Best Resources to Learn Natural Language Processing in 2021
- Popular Certifications to validate your data and analytics skills, by SAS [Prod] - Sep 7, 2021.
Check out the most popular certifications from SAS to see what certification you want to pursue next. Now through the end of 2021, you can save 55% on your exam!
- How Machine Learning Leverages Linear Algebra to Solve Data Problems, by Harshit Tyagi [Tuto] - Sep 7, 2021.
Why you should learn the fundamentals of linear algebra.
- ebook: Learn Data Science with R – free download, by Narayana Murthy [Tuto] - Sep 7, 2021.
Check out this new book for data science beginners with many practical examples that covers statistics, R, graphing, and machine learning. As a source to learn the full breadth of data science foundations, "Learn Data Science with R" starts at the beginner level and gradually progresses into expert content.
- How to Create Stunning Web Apps for your Data Science Projects, by Murallie Thuwarakesh [Tuto] - Sep 7, 2021.
- Top Stories, Aug 30 – Sep 5: Do You Read Excel Files with Python? There is a 1000x Faster Way; Hypothesis Testing Explained, by KDnuggets [Top ] - Sep 6, 2021.
Also: The Top Industries Hiring Data Scientists in 2021; Hypothesis Testing Explained; Automate Microsoft Excel and Word Using Python; Data Science Cheat Sheet 2.0
- Fast AutoML with FLAML + Ray Tune, by Wu, Wang, Baum, Liaw & Galarnyk [Tuto] - Sep 6, 2021.
Microsoft Researchers have developed FLAML (Fast Lightweight AutoML) which can now utilize Ray Tune for distributed hyperparameter tuning to scale up FLAML’s resource-efficient & easily parallelizable algorithms across a cluster.
- Antifragility and Machine Learning, by Prad Upadrashta [Opin] - Sep 6, 2021.
Our intuition for most products, processes, and even some models might be that they either will get worse over time, or if they fail, they will experience a cascade of further failures. But what if we could intentionally design systems and models to only get better, even as the world around them gets worse?
- Five Key Facts About Wu Dao 2.0: The Largest Transformer Model Ever Built, by Jesus Rodriguez [Tuto] - Sep 6, 2021.
The record-setting model combines some clever research and engineering methods.
- Behind OpenAI Codex: 5 Fascinating Challenges About Building Codex You Didn’t Know About, by Jesus Rodriguez [Opin] - Sep 3, 2021.
Some ML engineering and modeling challenges encountered during the construction of Codex.
- Hypothesis Testing Explained, by Angelica Lo Duca [Tuto] - Sep 3, 2021.
This brief overview of the concept of Hypothesis Testing covers its classification into parametric and non-parametric tests, and when to use the most popular ones, including means, correlation, and distribution, in the case of one sample and two samples.
- 6 Cool Python Libraries That I Came Across Recently, by Dhilip Subramanian [Tuto] - Sep 3, 2021.
Check out these awesome Python libraries for Machine Learning.
- eBook: A Practical Guide to Using Third-Party Data in the Cloud, by Roidna [Prod] - Sep 2, 2021.
Download this eBook to learn how innovative teams are shifting their focus from data-driven business intelligence to accelerating insight-driven decision-making and now are turning to third-party datasets as a differentiator.
- Build a synthetic data pipeline using Gretel and Apache Airflow, by Drew Newberry [Tuto] - Sep 2, 2021.
In this blog post, we build an ETL pipeline that generates synthetic data from a PostgreSQL database using Gretel’s Synthetic Data APIs and Apache Airflow.
- How to solve machine learning problems in the real world, by Pau Labarta Bajo [Opin] - Sep 2, 2021.
Becoming a machine learning engineer pro is your goal? Sure, online ML courses and Kaggle-style competitions are great resources to learn the basics. However, the daily job of an ML engineer requires an additional layer of skills that you won’t master through these approaches.
- Best Resources to Learn Natural Language Processing in 2021, by Aqsa Zafar [Tuto] - Sep 2, 2021.
In this article, the author has listed all the best resources to learn natural language processing, including Online Courses, Tutorials, Books, and YouTube Videos.
- Future Says Series | Discover the Future of AI, by Altair [Prod] - Sep 1, 2021.
This innovative project brings together industry thought leaders from top tech companies such as Google, PwC, King, DNB, Piab, Scania, Telefonica, and more to discuss what the future holds for data and AI. Watch Future Says Series as industry experts discuss real-life examples of how they are scaling AI successfully within their organizations.
- Do You Read Excel Files with Python? There is a 1000x Faster Way, by Nicolas Vandeput [Tuto] - Sep 1, 2021.
In this article, I’ll show you five ways to load data in Python, achieving a speedup of 3 orders of magnitude.
- Data Science Cheat Sheet 2.0, by Aaron Wang [Tuto] - Sep 1, 2021.
Check out this helpful, 5-page data science cheat sheet to assist with your exam reviews, interview prep, and anything in-between.
- How is Machine Learning Beneficial in Mobile App Development?, by Ria Katiyar [Tuto] - Sep 1, 2021.
Mobile app developers have a lot to gain by implementing AI & Machine Learning from the revolutionary changes that these disruptive technologies can offer. This is due to AI and ML's potential to strengthen mobile applications, providing for smoother user experiences capable of leveraging powerful features.
- KDnuggets™ News 21:n33, Sep 1: Top Industries Hiring Data Scientists; The Most Important Tool for Data Engineers, by KDnuggets - Sep 1, 2021.
The top industries hiring Data Scientists; The most important tool for data engineers (hint - it is not technical); How to Engineer Date Features in Python; 15 Python Snippets to Optimize your Data Science Pipeline
- NLP Insights for the Penguin Café Orchestra, by Expert.ai [Prod] - Aug 31, 2021.
We give an example of how to use Expert.ai and Python to investigate favorite music albums.
- CSV Files for Storage? No Thanks. There’s a Better Option, by Dario Radečić [Tuto] - Aug 31, 2021.
Saving data to CSVs is costing you both money and disk space. It’s time to end it.
- Multilabel Document Categorization, step by step example, by Saurabh Sharma [Tuto] - Aug 31, 2021.
This detailed guide explores an unsupervised and supervised learning two-stage approach with LDA and BERT to develop a domain-specific document categorizer on unlabeled documents.
- A Python Data Processing Script Template, by Matthew Mayo [Tuto] - Aug 31, 2021.
Here's a skeleton general purpose template for getting a Python command line script fleshed out as quickly as possible.
- Top Stories, Aug 23-29: Automate Microsoft Excel and Word Using Python, by KDnuggets [Top ] - Aug 30, 2021.
Also: Django's 9 Most Common Applications; Learning Data Science and Machine Learning: First Steps after the Roadmap; The Significance of Data-centric AI
- Beacon North America, the latest and greatest in data analytics, Sep 14-15, by Google [Prod] - Aug 30, 2021.
On Sep 14-15, Looker (Google Cloud) will be hosting BEACON, a bi-annual data thought leadership virtual event series. Find original content at the forefront of data, analytics trends, future predictions and best practices. Sign up now - tickets are free.
- Introducing Packed BERT for 2x Training Speed-up in Natural Language Processing, by Krell & Kosec [Tuto] - Aug 30, 2021.
Check out this new BERT packing algorithm for more efficient training.
- Data Science Project Infrastructure: How To Create It, by Nate Rosidi [Tuto] - Aug 30, 2021.
The intention for most data science projects is to build something that people use. Creating something purposeful requires a solid infrastructure and processes that keep problem-solving front-and-center for your audience.
- The Top Industries Hiring Data Scientists in 2021, by Devin Partida [Opin] - Aug 30, 2021.
People realize that effective uses of data can increase competitiveness, even in a challenging marketplace. Here are six industries hiring data scientists now that will likely continue doing so for the foreseeable future.
- 3 Data Acquisition, Annotation, and Augmentation Tools, by Matthew Mayo [Tuto] - Aug 27, 2021.
Check out these 3 projects found around GitHub that can help with your data acquisition, annotation, and augmentation tasks.
- How causal inference lifts augmented analytics beyond flatland, by Michael Klaput [Opin] - Aug 27, 2021.
In our quest to better understand and predict business outcomes, traditional predictive modeling tends to fall flat. However, causal inference techniques along with business analytics approaches can unravel what truly changes your KPIs.
- The Significance of Data-centric AI, by Vidhi Chugh [Opin] - Aug 27, 2021.
How a systematic way of maintaining data quality can do wonders for your model performance.
- Automated Data Labeling with Machine Learning, by Watchful [Prod] - Aug 26, 2021.
Labeling training data is the one step in the data pipeline that has resisted automation. It’s time to change that.
- Coding Ethics for AI & AIOps: Designing Responsible AI Systems, by Manisha Singh [Opin] - Aug 26, 2021.
AIOps has taken human-machine collaboration to the next level, where humans and machines are not just coexisting but are collaborating and working together like team members.
- 11 Best Data Science Education Platforms, by Zulie Rane [Tuto] - Aug 26, 2021.
We cover the 11 best Data Science Education platforms for 11 different use cases, ranging from specific languages to hands-on learners, to the best free option.
- The Most Important Tool for Data Engineers, by Leo Godin [Opin] - Aug 26, 2021.
And it has nothing to do with Python or SQL.
- Florida Hacks with IBM, by BeMyApp [Prod] - Aug 25, 2021.
Join the Florida Hacks with IBM virtual hackathon and create a project to tackle sustainability challenges. IBM will provide mentorship and data sets to help bring your ideas to life.
- 15 Python Snippets to Optimize your Data Science Pipeline, by Lucas Soares [Tuto] - Aug 25, 2021.
Quick Python solutions to help your data science cycle.
- What is Noise?, by Vasant Dhar [Opin] - Aug 25, 2021.
We might have a reasonable sense for what "noise" is as some statistically random phenomenon that occurs in Nature. But how can this same characteristic be defined--and understood--within the context of making judgements, such as in human behavior, corporate decision-making, medicine, the law, and AI systems?
- How to Engineer Date Features in Python, by Matthew Mayo [Tuto] - Aug 25, 2021.
This article discusses and demonstrates how to quickly engineer some common date features using Python.
- KDnuggets™ News 21:n32, Aug 25: Open Source Datasets for Computer Vision; Django’s 9 Most Common Applications, by KDnuggets - Aug 25, 2021.
Open Source Datasets for Computer Vision; Django’s 9 Most Common Applications; How to Select an Initial Model for your Data Science Problem; Automate Microsoft Excel and Word Using Python; Stack Overflow Survey Data Science Highlights
- Essential Features of An Efficient Data Integration Solution, by Rabia Hatim [Opin] - Aug 24, 2021.
This blog highlights the essential features of a data integration solution that help an organization generate consistent and accurate data to keep the business running smoothly.
- Learning Data Science and Machine Learning: First Steps After The Roadmap, by Harshit Tyagi [Tuto] - Aug 24, 2021.
Just getting into learning data science may seem as daunting as (if not more than) trying to land your first job in the field. With so many options and resources online and in traditional academia to consider, these pre-requisites and pre-work are recommended before diving deep into data science and AI/ML.
- Automate Microsoft Excel and Word Using Python, by Mohammad Khorasani [Tuto] - Aug 24, 2021.
Integrate Excel with Word to generate automated reports seamlessly.
- Top Stories, Aug 16-22: The Difference Between Data Scientists and ML Engineers; Prefect: How to Write and Schedule Your First ETL Pipeline with Python, by KDnuggets [Top ] - Aug 23, 2021.
Also: Open Source Datasets for Computer Vision; Prefect: How to Write and Schedule Your First ETL Pipeline with Python; Most Common Data Science Interview Questions and Answers; How to Select an Initial Model for your Data Science Problem.
- Jurassic-1 Language Models and AI21 Studio, by AI21 [Prod] - Aug 23, 2021.
AI21 Labs’ new developer platform offers instant access to our 178B-parameter language model, to help you build sophisticated text-based AI applications at scale.
- Django’s 9 Most Common Applications, by Aakash Bijwe [Opin] - Aug 23, 2021.
Django is a Python web application framework enjoying widespread adoption in the data science community. But what else can you use Django for? Read this article for 9 use cases where you can put Django to work.
- 7 reasons you should get a formal degree in Data Science, by Purvanshi Mehta [Opin] - Aug 23, 2021.
So many options are now available online to learn in the field of data science. There are several factors to consider to determine if these options or a traditional degree from an academic institution is the best approach for your personal learning style and career aspirations.
- 5 Things That Make My Job as a Data Scientist Easier, by Shree Vandana [Opin] - Aug 23, 2021.
After working as a Data Scientist for a year, I am here to share some things I learnt along the way that I feel are helpful and have increased my efficiency. Hopefully some of these tips can help you in your journey :)
- Stack Overflow Survey Data Science Highlights, by Matthew Mayo [Opin] - Aug 20, 2021.
The results of the 2021 Stack Overflow Developer Survey were recently released, which is a fascinating snapshot of today's developers and the tools they are using. Have a look at some selections from the report, particularly those which may be of interest to data professionals.
- Demystifying AI: The prejudices of Artificial Intelligence (and human beings), by Manjesh Gupta [Opin] - Aug 20, 2021.
AI models are necessarily trained on historical data from the real-world--data that is generated from the daily goings on of society. If social-based biases are inherent in the training data, then will the AI predictions highlight these same biases? If so, what should we do (or not do) about making AI fair?
- How to Select an Initial Model for your Data Science Problem, by Zachary Warnes [Tuto] - Aug 20, 2021.
Save yourself some time and headaches and start simple.
- Speeding up data understanding by interactive exploration, by Visplore [Prod] - Aug 19, 2021.
A key success factor of data science projects is to understand the data well. This blog explains why coding can be inefficient for this and how you can improve.
- 5 Data Science Career Mistakes To Avoid, by Tessa Xie [Opin] - Aug 19, 2021.
Everyone makes mistakes, which can be a good thing when they lead to learning and improvements over time. But, we can also try to first learn from others to expedite our personal growth. To get started, consider these lessons learned the hard way, so you don’t have to.
- Enhancing Machine Learning Personalization through Variety, by Raghavan Kirthivasan [Tuto] - Aug 19, 2021.
Personalization drives growth and is a touchstone of good customer experience. Personalization driven through machine learning can enable companies to improve this experience while improving ROI for marketing campaigns. However, these techniques face challenges in determining when personalization makes sense and how and when specific options are recommended.
- 15 Things I Look for in Data Science Candidates, by Mathias Gruber [Opin] - Aug 19, 2021.
This article presents advice for anyone looking or hiring for data science jobs, written by someone with practical and useful insight.
- Amazon Web Services Webinar: Accelerating clinical trial and biomedical development processes with healthcare data, by Roidna [Prod] - Aug 18, 2021.
Join this webinar on August 27 to learn how to leverage external healthcare datasets to make faster decisions with greater accuracy – accelerating biomedical development and improving patient welfare.
- When Correlation is Better than Causation, by Brittany Davis [Tuto] - Aug 18, 2021.
Identifying causality in an analysis isn't always practical. We show a heuristic approach for using correlations to inform decisions.
- Open Source Datasets for Computer Vision, by Kevin Vu [Tuto] - Aug 18, 2021.
Access to high-quality, noise-free, large-scale datasets is crucial for training complex deep neural network models for computer vision applications. Many open-source datasets are developed for use in image classification, pose estimation, image captioning, autonomous driving, and object segmentation. These datasets must be paired with the appropriate hardware and benchmarking strategies to optimize performance.
- Data Scientist’s Guide to Efficient Coding in Python, by Dr. Varshita Sher [Tuto] - Aug 18, 2021.
Read this fantastic collection of tips and tricks the author uses for writing clean code on a day-to-day basis.
- KDnuggets™ News 21:n31, Aug 18: The Difference Between Data Scientists and ML Engineers; MLOPs And Machine Learning RoadMap, by KDnuggets - Aug 18, 2021.
What is the difference between Data Scientists and ML Engineers? How does MLOPs fit into Machine Learning RoadMap? How to Train a BERT Model From Scratch? What is so great about Intro to Statistical Learning, 2nd Edition? Find the answers to these questions and more in this issue.
- Top July Stories: Data Scientists and ML Engineers Are Luxury Employees, by Gregory Piatetsky [Top ] - Aug 17, 2021.
Also: Top 6 Data Science Online Courses in 2021; Advice for Learning Data Science from Google's Director of Research; 5 Lessons McKinsey Taught Me That Will Make You a Better Data Scientist
- Leaders at Allstate, eBay & Red Bull Agree: Don’t Miss the Rev 3 Enterprise MLOps Summit, by Domino [Prod] - Aug 17, 2021.
Join data science and MLOps leaders in-person in Chicago this November.