- Data Analysis Using Scala, by Roman Zykov - Sep 24, 2021.
It is very important to choose the right tool for data analysis. On the Kaggle forums, where international Data Science competitions are held, people often ask which tool is better. R and Python are at the top of the list. In this article we will tell you about an alternative stack of data analysis technologies, based on Scala.
- Real-Time Histogram Plots on Unbounded Data, by Romain Picard - Sep 24, 2021.
Using histograms on real-time data is not possible in most of the popular data science libraries. In this article you will learn how to dynamically compute and display a histogram within a Python notebook.
- How Data Scientists Can Compete in the Global Job Market, by Devin Partida - Sep 24, 2021.
Data scientists wanting to stay competitive or break into the field will need the right approach. These techniques will help them search for and secure a new position.
- Introducing PostHog: An open-source product analytics platform, by PostHog - Sep 23, 2021.
PostHog is an open-source product analytics platform that helps you and your product team capture, analyze, and make informed decisions based on user behaviour.
- How To Deal With Imbalanced Classification, Without Re-balancing the Data, by David B Rosen (PhD) - Sep 23, 2021.
Before considering oversampling your skewed data, try adjusting your classification decision threshold, in Python.
- A Breakdown of Deep Learning Frameworks, by Kevin Vu - Sep 23, 2021.
Deep Learning continues to evolve as one of the most powerful techniques in the AI toolbox. Many software packages exist today to support the development of models, and we highlight important options available with key qualities and differentiators to help you select the most appropriate for your needs.
- 9 Outstanding Reasons to Learn Python for Finance, by Zulie Rane - Sep 23, 2021.
Is Python good for learning finance and working in the financial world? The answer is not only a resounding YES, but yes for nine very good reasons. This article gets into the details behind why Python is a must-know programming language for anyone who wants to work in the financial sector.
- Messy Data is Beautiful, by SparkBeyond - Sep 22, 2021.
Once these types of data have been cleaned, they do more than show organized data sets. They reveal unlimited possibilities, and AI analytics can reveal these possibilities faster and more efficiently than ever before.
- GitHub Copilot and the Rise of AI Language Models in Programming Automation, by Kevin Vu - Sep 22, 2021.
Read on to learn more about what makes Copilot different from previous autocomplete tools (including TabNine), and why this particular tool has been generating so much controversy.
- 20 Machine Learning Projects That Will Get You Hired, by Khushbu Shah - Sep 22, 2021.
If you want to break into the machine learning and data science job market, then you will need to demonstrate the proficiency of your skills, especially if you are self-taught through online courses and bootcamps. A project portfolio is a great way to practice your new craft and offer convincing evidence that an employer should hire you over the competition.
- Nine Tools I Wish I Mastered Before My PhD in Machine Learning, by Aliaksei Mikhailiuk - Sep 22, 2021.
Whether you are building a startup or making scientific breakthroughs, these tools will bring your ML pipeline to the next level.
- Free virtual event: Big Data and AI Toronto, by Corp Agency - Sep 21, 2021.
This year’s Big Data and AI Toronto conference and expo, held virtually Oct 13-14, will provide attendees with a 360° view of the industry through a unique 4-in-1 experience: Artificial intelligence, big data, cloud, and cybersecurity.
- 15 Must-Know Python String Methods, by Soner Yıldırım - Sep 21, 2021.
It is not always about numbers.
- Data Engineering Technologies 2021, by Tech Ninja - Sep 21, 2021.
Emerging technologies supporting the field of data engineering are growing at a rapid clip. This curated list includes the most important offerings available in 2021.
- If You Can Write Functions, You Can Use Dask, by Hugo Shi - Sep 21, 2021.
This article is the second article of an ongoing series on using Dask in practice. Each article in this series will be simple enough for beginners, but provide useful tips for real work. The first article in the series is about using LocalCluster.
- Top Stories, Sep 13-19: Data Scientists Without Data Engineering Skills Will Face the Harsh Truth; The Machine & Deep Learning Compendium Open Book, by KDnuggets - Sep 20, 2021.
Also: The Machine & Deep Learning Compendium Open Book; Easy SQL in Native Python; The Prefect Way to Automate & Orchestrate Data Pipelines; A Data Science Portfolio That Will Land You The Job
- How to label time series efficiently – and boost your AI, by Visplore - Sep 20, 2021.
Data labeling is a critical step in building high-quality AI models. This blog explains how to speed up the labeling process of time series data from sensors and IoT devices.
- Don’t Touch a Dataset Without Asking These 10 Questions, by Sandeep Uttamchandani - Sep 20, 2021.
Selecting the right dataset is critical for the success of your AI project.
- How to be a Data Scientist without a STEM degree, by Terence Shin - Sep 20, 2021.
Breaking into data science as a professional does require technical skills, a well-honed knack for problem-solving, and a willingness to swim in oceans of data. Maybe you are coming in as a career change or ready to take a new learning path in life--without having previously earned an advanced degree in a STEM field. Follow these tips to find your way into this high-demand and interesting field.
- How to Find Weaknesses in your Machine Learning Models, by Michael Berk - Sep 20, 2021.
FreaAI: a new method from researchers at IBM.
- Paradoxes in Data Science, by Pier Paolo Ippolito - Sep 17, 2021.
Have a look into some of the main paradoxes associated with Data Science and its statistical foundations.
- What 2 years of self-teaching data science taught me, by Vishnu U - Sep 17, 2021.
Many of us self-learn data science from the very beginning. While continuing to self-learn on demand is crucial, especially after you become a professional, there can be many pitfalls early on for learning the wrong way or missing out on key ideas that are important for the real-world application of data science.
- Introducing TensorFlow Similarity, by Matthew Mayo - Sep 17, 2021.
TensorFlow Similarity is a newly-released library from Google that facilitates the training, indexing and querying of similarity models. Check out more here.
- What Is The Real Difference Between Data Engineers and Data Scientists?, by Springboard - Sep 16, 2021.
To launch your data career, you’ll need both theoretical knowledge and applied skills. Bootcamp programs like Springboard’s Data Science Career Track and Data Engineering Career Track can help make you job-ready through hands-on, project-based learning and one-on-one mentorship. Wondering which data career path is right for you? Read on to find out.
- Adventures in MLOps with Github Actions, Iterative.ai, Label Studio and NBDEV, by Soellinger & Kunz - Sep 16, 2021.
This article documents the authors' experience building their custom MLOps approach.
- The Machine & Deep Learning Compendium Open Book, by Ori Cohen - Sep 16, 2021.
After years in the making, this extensive and comprehensive ebook resource is now available and open for data scientists and ML engineers. Learn from and contribute to this tome of valuable information to support all your work in data science from engineering to strategy to management.
- Easy SQL in Native Python, by Matthew Mayo - Sep 16, 2021.
If the idea of being able to link with SQL databases and define, manipulate, and query using Python sounds appealing, check out the SQLModel library.
- KDnuggets Top Blogs Rewards for August 2021, by Gregory Piatetsky - Sep 15, 2021.
These top blogs were winners of KDnuggets Top Blog Rewards Program for August: Automate Microsoft Excel and Word Using Python,
- DATAcated Expo, Oct 5, Live-streamed, Explore new AI / Data Science Tech, by DATAcated - Sep 15, 2021.
The DATAcated Expo, hosted by DATAcated founder Kate Strachnyi, is coming up on October 5, 2021 from 11am - 6pm ET. Live-streamed on LinkedIn, the free event provides the community with an opportunity to explore and discover innovative technologies in data science & analytics.
- Introduction to Automated Machine Learning, by Kevin Vu - Sep 15, 2021.
AutoML enables developers with limited ML expertise (and coding experience) to train high-quality models specific to their business needs. For this article, we will focus on AutoML systems which cater to everyday business and technology applications.
- How to get Python PCAP Certification: Roadmap, Resources, Tips For Success, Based On My Experience, by Mehul Singh - Sep 15, 2021.
Follow this journey of personal experience -- with useful tips and learning resources -- to help you achieve the PCAP Certification, one of the most reputed Python Certifications, to validate your knowledge against International Standards.
- 5 Must Try Awesome Python Data Visualization Libraries, by Roja Achary - Sep 15, 2021.
The goal of data visualization is to communicate data or information clearly and effectively to readers. Here are 5 must try awesome Python libraries for helping you do so, with overviews and links to quick start guides for each.
- Top August Stories: Automate Microsoft Excel and Word Using Python; The Difference Between Data Scientists and ML Engineers, by KDnuggets - Sep 14, 2021.
Automate Microsoft Excel and Word Using Python; The Difference Between Data Scientists and ML Engineers; Most Common Data Science Interview Questions and Answers; 3 Reasons Why You Should Use Linear Regression Models Instead of Neural Networks
- Amazon Web Services Webinar: Boost customer satisfaction and sales with consumer insights data, by Roidna - Sep 14, 2021.
Join this webinar, Sep 27, to learn how to leverage external data to understand market needs and consumer behavior – helping you build a more customer-centric business.
- Speeding up Neural Network Training With Multiple GPUs and Dask, by Jacqueline Nolis - Sep 14, 2021.
A common moment when training a neural network is when you realize the model isn’t training quickly enough on a CPU and you need to switch to using a GPU. It turns out multi-GPU model training across multiple machines is pretty easy with Dask. This blog post is about my first experiment in using multiple GPUs with Dask and the results.
- Data Scientists Without Data Engineering Skills Will Face the Harsh Truth, by Soner Yildirim - Sep 14, 2021.
Although the role of the data scientist is still evolving, data remains at its core. Setting the right expectations for what you will do as a data scientist is important, and, to be sure, knowing the tools of data engineering will get you ready for the real world.
- An Introduction to Reinforcement Learning with OpenAI Gym, RLlib, and Google Colab, by Galarnyk & Mika - Sep 14, 2021.
Get an Introduction to Reinforcement Learning by attempting to balance a virtual CartPole with OpenAI Gym, RLlib, and Google Colab.
- Top Stories, Sep 6-12: Do You Read Excel Files with Python? There is a 1000x Faster Way; 8 Deep Learning Project Ideas for Beginners, by KDnuggets - Sep 13, 2021.
Also: How to Create Stunning Web Apps for your Data Science Projects; A Data Science Portfolio That Will Land You The Job; Top 18 Low-Code and No-Code Machine Learning Platforms; 8 Deep Learning Project Ideas for Beginners
- 85% of data science projects fail – here’s how to avoid it, by SparkBeyond - Sep 13, 2021.
Here are a few common traps that data scientists can avoid so that their work is NOT among the 85% of data science projects that fail.
- The Prefect Way to Automate & Orchestrate Data Pipelines, by Thuwarakesh Murallie - Sep 13, 2021.
I am migrating all my ETL work from Airflow to this super-cool framework.
- 3 Most Important Lessons I’ve Learned 3 Years Into My Data Science Career, by Terence Shin - Sep 13, 2021.
After only 3 years of working as a data professional, many tried-and-true lessons can be learned. Here are 3 of the most important lessons learned with key takeaways and reflections shared.
- How Many AI Neurons Does It Take to Simulate a Brain Neuron?, by Jesus Rodriguez - Sep 13, 2021.
New research shows some shocking answers to that question.
- Working with Python APIs For Data Science Project, by Nathan Rosidi - Sep 10, 2021.
In this article, we will work with the YouTube Python API to collect video statistics from our channel, using the requests Python library to make an API call and saving the result as a Pandas DataFrame.
- A Data Science Portfolio That Will Land You The Job, by Natassha Selvaraj - Sep 10, 2021.
Landing a data science job is no easy feat, especially during the COVID-19 pandemic. This article provides aspiring data scientists with advice on building a data science portfolio that stands out.
- Text Preprocessing Methods for Deep Learning, by Kevin Vu - Sep 10, 2021.
While the preprocessing pipeline we are focusing on in this post is mainly centered around Deep Learning, most of it will also be applicable to conventional machine learning models.
- How to Create an AutoML Pipeline Optimization Sandbox, by Matthew Mayo - Sep 9, 2021.
In this article, we will implement an automated machine learning pipeline optimization sandbox web app using Streamlit and TPOT.
- 8 Deep Learning Project Ideas for Beginners, by Aqsa Zafar - Sep 9, 2021.
Have you studied Deep Learning techniques, but never worked on a useful project? Here, we highlight eight deep learning project ideas for beginners that will help you sharpen your skills and boost your resume.
- 7 Differences Between a Data Analyst and a Data Scientist, by Zulie Rane - Sep 9, 2021.
This article discusses the 7 key differences between data analysts and data scientists with an aim to help potential data analysts/scientists determine which is the right one for them. I touch on day-to-day tasks, skill requirements, typical career progression, and salary and career prospects for both.
- 300 Data Science Leaders Share What’s Holding Their Teams Back, by Domino - Sep 8, 2021.
Flawed investments in people, processes, and tools are crushing potential business impact.
- Smart Ingestion: Using ontology-driven AI, by Prad Upadrashta - Sep 8, 2021.
Imagine data that organizes itself to power your decision-making.
- Top 18 Low-Code and No-Code Machine Learning Platforms, by Yulia Gavrilova - Sep 8, 2021.
Machine learning becomes more accessible to companies and individuals when there is less coding involved. If you are just starting your path in ML, check out these low-code and no-code platforms to help expedite your capabilities in learning and applying AI.
- Math 2.0: The Fundamental Importance of Machine Learning, by Dr. Claus Horn - Sep 8, 2021.
Machine learning is not just another way to program computers; it represents a fundamental shift in the way we understand the world. It is Math 2.0.
- Popular Certifications to validate your data and analytics skills, by SAS - Sep 7, 2021.
Check out the most popular certifications from SAS to see what certification you want to pursue next. Now through the end of 2021, you can save 55% on your exam!
- How Machine Learning Leverages Linear Algebra to Solve Data Problems, by Harshit Tyagi - Sep 7, 2021.
Why you should learn the fundamentals of linear algebra.
- ebook: Learn Data Science with R – free download, by Narayana Murthy - Sep 7, 2021.
Check out this new book for data science beginners with many practical examples that covers statistics, R, graphing, and machine learning. As a source to learn the full breadth of data science foundations, "Learn Data Science with R" starts at the beginner level and gradually progresses into expert content.
- How to Create Stunning Web Apps for your Data Science Projects, by Murallie Thuwarakesh - Sep 7, 2021.
- Top Stories, Aug 30 – Sep 5: Do You Read Excel Files with Python? There is a 1000x Faster Way; Hypothesis Testing Explained, by KDnuggets - Sep 6, 2021.
Also: The Top Industries Hiring Data Scientists in 2021; Hypothesis Testing Explained; Automate Microsoft Excel and Word Using Python; Data Science Cheat Sheet 2.0
- Fast AutoML with FLAML + Ray Tune, by Wu, Wang, Baum, Liaw & Galarnyk - Sep 6, 2021.
Microsoft Researchers have developed FLAML (Fast Lightweight AutoML) which can now utilize Ray Tune for distributed hyperparameter tuning to scale up FLAML’s resource-efficient & easily parallelizable algorithms across a cluster.
- Antifragility and Machine Learning, by Prad Upadrashta - Sep 6, 2021.
Our intuition for most products, processes, and even some models might be that they either will get worse over time, or if they fail, they will experience a cascade of further failures. But what if we could intentionally design systems and models to only get better, even as the world around them gets worse?
- Five Key Facts About Wu Dao 2.0: The Largest Transformer Model Ever Built, by Jesus Rodriguez - Sep 6, 2021.
The record-setting model combines some clever research and engineering methods.
- Behind OpenAI Codex: 5 Fascinating Challenges About Building Codex You Didn’t Know About, by Jesus Rodriguez - Sep 3, 2021.
Some ML engineering and modeling challenges encountered during the construction of Codex.
- Hypothesis Testing Explained, by Angelica Lo Duca - Sep 3, 2021.
This brief overview of the concept of Hypothesis Testing covers its classification into parametric and non-parametric tests, and when to use the most popular ones, including means, correlation, and distribution, in the case of one sample and two samples.
- 6 Cool Python Libraries That I Came Across Recently, by Dhilip Subramanian - Sep 3, 2021.
Check out these awesome Python libraries for Machine Learning.
- eBook: A Practical Guide to Using Third-Party Data in the Cloud, by Roidna - Sep 2, 2021.
Download this eBook to learn how innovative teams are shifting their focus from data-driven business intelligence to accelerating insight-driven decision-making and now are turning to third-party datasets as a differentiator.
- Build a synthetic data pipeline using Gretel and Apache Airflow, by Drew Newberry - Sep 2, 2021.
In this blog post, we build an ETL pipeline that generates synthetic data from a PostgreSQL database using Gretel’s Synthetic Data APIs and Apache Airflow.
- How to solve machine learning problems in the real world, by Pau Labarta Bajo - Sep 2, 2021.
Is becoming a pro machine learning engineer your goal? Sure, online ML courses and Kaggle-style competitions are great resources to learn the basics. However, the daily job of an ML engineer requires an additional layer of skills that you won’t master through these approaches.
- Best Resources to Learn Natural Language Processing in 2021, by Aqsa Zafar - Sep 2, 2021.
In this article, the author has listed all the best resources to learn natural language processing including Online Courses, Tutorials, Books, and YouTube Videos.
- Future Says Series | Discover the Future of AI, by Altair - Sep 1, 2021.
This innovative project brings together industry thought leaders from top tech companies such as Google, PwC, King, DNB, Piab, Scania, Telefonica, and more to discuss what the future holds for data and AI. Watch Future Says Series as industry experts discuss real-life examples of how they are scaling AI successfully within their organizations.
- Do You Read Excel Files with Python? There is a 1000x Faster Way, by Nicolas Vandeput - Sep 1, 2021.
In this article, I’ll show you five ways to load data in Python, achieving a speedup of 3 orders of magnitude.
- Data Science Cheat Sheet 2.0, by Aaron Wang - Sep 1, 2021.
Check out this helpful, 5-page data science cheat sheet to assist with your exam reviews, interview prep, and anything in-between.
- How is Machine Learning Beneficial in Mobile App Development?, by Ria Katiyar - Sep 1, 2021.
Mobile app developers have a lot to gain by implementing AI & Machine Learning from the revolutionary changes that these disruptive technologies can offer. This is due to AI and ML's potential to strengthen mobile applications, providing for smoother user experiences capable of leveraging powerful features.