2020 Oct
All (87) | Events (1) | News, Education (12) | Opinions (19) | Top Stories, Tweets (9) | Tutorials, Overviews (46)
- How to Make Sense of the Reinforcement Learning Agents? - Oct 30, 2020.
In this blog post, you’ll learn what to keep track of to inspect/debug your agent learning trajectory. I’ll assume you are already familiar with the Reinforcement Learning (RL) agent-environment setting and you’ve heard about at least some of the most common RL algorithms and environments.
- Overcoming the Racial Bias in AI - Oct 30, 2020.
The results of any AI developed today are entirely dependent on the data on which it trains. If the data is distributed--intentionally or not--with a bias toward any category of data over another, then the AI will display that bias. What is a better way forward to handle this possibility of bias when the datasets involve human beings?
- Building Neural Networks with PyTorch in Google Colab - Oct 30, 2020.
Combining PyTorch and Google's cloud-based Colab notebook environment can be a good solution for building neural networks with free access to GPUs. This article demonstrates how to do just that.
- Seven Steps for Migrating Sensitive Data to the Cloud: A Guide for Data Teams - Oct 29, 2020.
Cloud migration requires a careful planning process to ensure all systems work as they should. Use this checklist, sponsored by Immuta and TDWI, to learn seven best practices for data teams migrating sensitive data to the cloud.
- Dealing with Imbalanced Data in Machine Learning - Oct 29, 2020.
This article presents tools & techniques for handling data when it's imbalanced.
- Explaining the Explainable AI: A 2-Stage Approach - Oct 29, 2020.
Understanding how to build AI models is one thing. Understanding why AI models provide the results they provide is another. Even more so, explaining any type of understanding of AI models to humans is yet another challenging layer that must be addressed if we are to develop a complete approach to Explainable AI.
- You Don’t Have to Use Docker Anymore - Oct 29, 2020.
Docker is not the only containerization tool out there and there might just be better alternatives…
- Top KDnuggets tweets, Oct 21-27: #MachineLearning can recover lost languages - Oct 28, 2020.
Also: Free Introductory Machine Learning Course From Amazon; Dataset Splitting Best Practices in #Python; 10 Underrated Python Skills; Computer Vision tells us how the presidential candidates really feel
- Exploring the Significance of Machine Learning for Algorithmic Trading with Stefan Jansen - Oct 28, 2020.
The immense expansion of digital data has increased the demand for proficiency in trading strategies that use machine learning (ML). Learn more from author Stefan Jansen, and get his latest book on the subject from Packt Publishing.
- Mastering Time Series Analysis with Help From the Experts - Oct 28, 2020.
Read this discussion with the “Time Series” Team at KNIME, answering such classic questions as "how much past is enough past?" and others that any practitioner of time series analysis will find useful.
- An Introduction to AI, updated - Oct 28, 2020.
We provide an introduction to key concepts and methods in AI, covering Machine Learning and Deep Learning, with an updated extensive list that includes Narrow AI, Super Intelligence, and Classic Artificial Intelligence, as well as recent ideas of NeuroSymbolic AI, Neuroevolution, and Federated Learning.
- Stop Running Jupyter Notebooks From Your Command Line - Oct 28, 2020.
Instead, run your Jupyter Notebook as a stand alone web app.
- PerceptiLabs – A GUI and Visual API for TensorFlow - Oct 27, 2020.
The recently released PerceptiLabs 0.11 is quickly becoming the GUI and visual API for TensorFlow. PerceptiLabs is built around a sophisticated visual ML modeling editor in which you drag and drop components and connect them together to form your model, automatically creating the underlying TensorFlow code. Try it now.
- Can AI Learn Human Values? - Oct 27, 2020.
OpenAI believes that the path to safe AI requires social sciences.
- Getting A Data Science Job is Harder Than Ever – How to turn that to your advantage - Oct 27, 2020.
Although many aspiring Data Scientists are finding it more difficult to land a job than in previous years, understanding what has changed in the hiring landscape can be used to your advantage in matching with the best organization for your goals and interests.
- Advice for Aspiring Data Scientists - Oct 27, 2020.
Are you a student of some type asking how to get into Data Science? You've come to the right place. Read on for both common and less basic advice on entering the field and excelling in the profession.
- How to become a Data Scientist: a step-by-step guide - Oct 26, 2020.
Data science is everywhere. But what are the best ways to learn the field well enough to enter the profession? Read on for some tips and steps on doing so, and some great courses to help you get there.
- Top Stories, Oct 19-25: How to Explain Key Machine Learning Algorithms at an Interview; Roadmap to Natural Language Processing - Oct 26, 2020.
Also: Roadmap to Natural Language Processing (NLP); 5 Must-Read Data Science Papers (and How to Use Them); DeepMind Relies on this Old Statistical Method to Build Fair Machine Learning Models; Good-bye Big Data. Hello, Massive Data!
- How Automation Is Improving the Role of Data Scientists - Oct 26, 2020.
Here is an overview of 5 ways that data automation will enhance how scientists spend their time and improve the results they get.
- Ain’t No Such a Thing as a Citizen Data Scientist - Oct 26, 2020.
With learn-it-quick courses on data science popping up nearly a dime a dozen, more people are obtaining the sense they can dive into professional work with minimal qualifications and scant experience or practice. While the notion of a 'Citizen Scientist' is intended to simply support a broader appreciation of science and the scientific process to more people, the 'Citizen Data Scientist' is being inappropriately seen as a fast track to a new career.
- Roadmap to Computer Vision - Oct 26, 2020.
Read this introduction to the main steps which compose a computer vision system, starting from how images are pre-processed, features extracted and predictions are made.
- Deploying Secure and Scalable Streamlit Apps on AWS with Docker Swarm, Traefik and Keycloak - Oct 23, 2020.
If you are a data scientist who just wants to get the work done but doesn’t necessarily want to go down the DevOps rabbit hole, this tutorial offers a relatively straightforward deployment solution leveraging Docker Swarm and Traefik, with an option of adding user authentication with Keycloak.
- Software 2.0 takes shape - Oct 23, 2020.
Software developers remain in very high demand as many organizations continue to experience workloads that far exceed available talent. AI-enhanced approaches that automate more areas of the software development lifecycle are in development with interesting potentials for how machine learning and natural language processing can significantly impact how software is designed, developed, tested, and deployed in the future.
- DeepMind Relies on this Old Statistical Method to Build Fair Machine Learning Models - Oct 23, 2020.
Causal Bayesian Networks are used to model the influence of fairness attributes in a dataset.
- Good-bye Big Data. Hello, Massive Data! - Oct 22, 2020.
Join the Massive Data Revolution with SQream. Shorten query times from days to hours or minutes, and speed up data preparation with - analyze the raw data directly.
- The unspoken difference between junior and senior data scientists - Oct 22, 2020.
The unspoken difference between junior and senior data scientists? It’s not what you think.
- Behavior Analysis with Machine Learning and R: The free eBook - Oct 22, 2020.
Check out this new free ebook to learn how to leverage the power of machine learning to analyze behavioral patterns from sensor data and electronic records using R.
- Which flavor of BERT should you use for your QA task? - Oct 22, 2020.
Check out this guide to choosing and benchmarking BERT models for question answering.
- Top KDnuggets tweets, Oct 14-20: An Introduction To Mathematics Behind #NeuralNetworks - Oct 21, 2020.
Also: 13 must-read papers from AI experts; 10 Pandas Tricks to Make My Data Analyzing Process More Efficient: Part 2 - Tricks I wish I had known earlier; An Introduction To Mathematics Behind #NeuralNetworks; How to Write Web Apps Using Simple #Python for #DataScientists
- 10 Underrated Python Skills - Oct 21, 2020.
Tips for feature analysis, hyperparameter tuning, data visualization and more.
- The Ethics of AI - Oct 21, 2020.
Marketing scientist Kevin Gray asks Dr. Anna Farzindar of the University of Southern California about a very important subject - the ethics of AI.
- Capitalize on Your Analytics Skills With the Johns Hopkins SAIS MA in Global Risk (Online) - Oct 20, 2020.
The Johns Hopkins SAIS MA in Global Risk (online) program prepares graduates to gain experience analyzing and addressing real-world scenarios, and the knowledge to fine-tune your analyses with insights from economics, history and political science. Enroll now and join a highly active, international community of more than 230,000 Johns Hopkins alumni.
- Deploying Streamlit Apps Using Streamlit Sharing - Oct 20, 2020.
Read this sneak peek into Streamlit’s new deployment platform.
- 5 Must-Read Data Science Papers (and How to Use Them) - Oct 20, 2020.
Keeping ahead of the latest developments in a field is key to advancing your skills and your career. Five foundational ideas from recent data science papers are highlighted here with tips on how to leverage these advancements in your work, and keep you on top of the machine learning game.
- Data Science in the Cloud with Dask - Oct 20, 2020.
Scaling large data analyses for data science and machine learning is growing in importance. Dask and Coiled are making it easy and fast for folks to do just that. Read on to find out how.
- Top Stories, Oct 12-18: fastcore: An Underrated Python Library; Free From MIT: Intro to Computational Thinking and Data Science - Oct 19, 2020.
Also: Software Engineering Tips and Best Practices for Data Science; 5 Best Practices for Putting Machine Learning Models Into Production; Goodhart's Law for Data Science and what happens when a measure becomes a target?; Text Mining with R: The Free eBook
- Knowledge Graphs: Connecting Your Data to Solve Real-World Problems in R&D, Business Intelligence and Strategy [Oct 28 Webinar] - Oct 19, 2020.
Learn about the development and application of knowledge graphs and how they are helping knowledge and information professionals to solve real business problems.
- Feature Ranking with Recursive Feature Elimination in Scikit-Learn - Oct 19, 2020.
This article covers using scikit-learn to obtain the optimal number of features for your machine learning project.
- How to Explain Key Machine Learning Algorithms at an Interview - Oct 19, 2020.
While preparing for interviews in Data Science, it is essential to clearly understand a range of machine learning models -- with a concise explanation for each at the ready. Here, we summarize various machine learning models by highlighting the main points to help you communicate complex models.
- Roadmap to Natural Language Processing (NLP) - Oct 19, 2020.
Check out this introduction to some of the most common techniques and models used in Natural Language Processing (NLP).
- Cartoon: Cloud Dating - Oct 17, 2020.
New KDnuggets cartoon looks at how AI can transform love and romance.
- DOE SMART Visualization Platform 1.5M Prize Challenge - Oct 16, 2020.
The U.S. Department of Energy’s (DOE) Office of Fossil Energy (FE) will award up to $1.5 million to winning innovators in a prize challenge to support FE’s SMART initiative. Registration deadline to participate in the challenge is 11:59 p.m. EDT Friday, Jan 22, 2021.
- Optimizing the Levenshtein Distance for Measuring Text Similarity - Oct 16, 2020.
To speed up the calculation of the Levenshtein distance, this tutorial computes it using a vector rather than a matrix, which saves a lot of time. We’ll be coding in Java for this implementation.
- Deep Learning for Virtual Try On Clothes – Challenges and Opportunities - Oct 16, 2020.
Learn about the experiments by MobiDev for transferring 2D clothing items onto the image of a person. As part of their efforts to bring AR and AI technologies into virtual fitting room development, they review the deep learning algorithms and architecture under development and the current state of results.
- Fast Gradient Boosting with CatBoost - Oct 16, 2020.
In this piece, we’ll take a closer look at a gradient boosting library called CatBoost.
- Machine Learning’s Greatest Omission: Business Leadership - Oct 15, 2020.
Eric Siegel's business-oriented, vendor-neutral machine learning course is designed to fulfill vital unmet learner needs, delivering material critical for both techies and business leaders.
- fastcore: An Underrated Python Library - Oct 15, 2020.
A unique python library that extends the python programming language and provides utilities that enhance productivity.
- How to ace the data science coding challenge - Oct 15, 2020.
Preparing to interview for a Data Scientist position takes preparation and practice, and then it could all boil down to a final review of your skills. Based on personal experience, these tips on how to approach such a review will help you excel in the coding challenge project for your next interview.
- Text Mining with R: The Free eBook - Oct 15, 2020.
This freely-available book will show you how to perform text analytics in R, using packages from the tidyverse.
- Top KDnuggets tweets, Oct 7-13: Every DataFrame Manipulation, Explained and Visualized Intuitively - Oct 14, 2020.
Also: Free Introductory Machine Learning Course From Amazon; A Complete Guide to Learn #DataScience in 100 Days; Top 3 Books for Every #DataEngineer.
- Deep Learning Design Patterns - Oct 14, 2020.
The new book "Deep Learning Design Patterns" presents deep learning models in a unique-but-familiar new way: as extendable design patterns you can easily plug-and-play into your software projects. Use code kdmath50 to save 50%.
- Free From MIT: Intro to Computational Thinking and Data Science - Oct 14, 2020.
This free course from MIT will help in your transition to thinking computationally, and ultimately solving complex data science problems.
- Goodhart’s Law for Data Science and what happens when a measure becomes a target? - Oct 14, 2020.
When developing analytics and algorithms to better understand a business target, unintended biases can sneak in that ensure desired outcomes are obtained. Guiding your work with multiple metrics in mind can help avoid such consequences of Goodhart's Law.
- Getting Started with PyTorch - Oct 14, 2020.
A practical walkthrough on how to use PyTorch for data analysis and inference.
- Top September Stories: Free From MIT: Intro to Computer Science and Programming in Python; Best Online MS in AI, Analytics, Data Science, Machine Learning - Oct 13, 2020.
Also: Introduction to Time Series Analysis in Python; Automating Every Aspect of Your Python Project
- SIAM launches activity group, publications for data scientists - Oct 13, 2020.
Data science community at SIAM continues to grow with a new journal, book series, and activity group.
- The Future of Fake News - Oct 13, 2020.
Let's talk about misleading communications in the digital era.
- Software Engineering Tips and Best Practices for Data Science - Oct 13, 2020.
Bringing your work as a Data Scientist into the real-world means transforming your experiments, tests, and detailed analysis into great code that can be deployed as efficient and effective software solutions. You must learn how to enable your machine learning algorithms to integrate with IT systems by taking them out of your notebooks and delivering them to the business by following software engineering standards.
- Uber Open Sources the Third Release of Ludwig, its Code-Free Machine Learning Platform - Oct 13, 2020.
The new release makes Ludwig one of the most complete open source AutoML stacks in the market.
- 5 Best Practices for Putting Machine Learning Models Into Production - Oct 12, 2020.
Our focus for this piece is to establish the best practices that make an ML project successful.
- How to be a 10x data scientist - Oct 12, 2020.
If you are a Data Scientist looking to make it to the next level, then there are many opportunities to up your game and your efficiency to stand out from the others. Some of these recommendations that you can follow are straightforward, and others are rarely followed, but they will all pay back in dividends of time and effectiveness for your career.
- Top Stories, Oct 5-11: A step-by-step guide for creating an authentic data science portfolio project; 10 Best Machine Learning Courses in 2020 - Oct 12, 2020.
Also: Free Introductory Machine Learning Course From Amazon; How LinkedIn Uses Machine Learning in its Recruiter Recommendation Systems; A step-by-step guide for creating an authentic data science portfolio project; Data Science Minimum: 10 Essential Skills You Need to Know to Start Doing Data Science
- Exploring The Brute Force K-Nearest Neighbors Algorithm - Oct 12, 2020.
This article discusses a simple approach to increasing the accuracy of k-nearest neighbors models in a particular subset of cases.
- Annotated Machine Learning Research Papers - Oct 9, 2020.
Check out this collection of annotated machine learning research papers, and no longer fear their reading.
- Algorithms of Social Manipulation - Oct 9, 2020.
As we all continuously interact with each other and our favorite businesses through apps and websites, the level at which we are being tracked and monitored is significant. While the technologies behind these capabilities provide us value, the tech companies can also influence our decisions on where to click, spend our money, and much more.
- How I Levelled Up My Data Science Skills In 8 Months - Oct 9, 2020.
Read how the author used their time to level up a variety of their data science skills over a short period of time, and learn how you could do the same.
- Strategies of Docker Images Optimization - Oct 8, 2020.
Large Docker images lengthen the time it takes to build and share images between clusters and cloud providers. When creating applications, it’s therefore worth optimizing Docker Images and Dockerfiles to help teams share smaller images, improve performance, and debug problems.
- 6 Lessons Learned in 6 Months as a Data Scientist - Oct 8, 2020.
When transitioning into a Data Science career, a new mindset toward collaboration, data, and reporting is required. Learn from these recommendations on approaches you should consider to successfully develop into your dream job.
- How LinkedIn Uses Machine Learning in its Recruiter Recommendation Systems - Oct 8, 2020.
LinkedIn uses some very innovative machine learning techniques to optimize candidate recommendations.
- Top KDnuggets tweets, Sep 30 – Oct 06: How to Explain Each Machine Learning Model at an Interview - Oct 7, 2020.
Also: The Definitive Data Scientist Environment Setup; Algorithm discovers how six simple molecules could evolve into life’s building blocks; A Concise Course in Statistical Inference: The #Free eBook; Are your coding skills good enough for a Data Science job?
- Free Introductory Machine Learning Course From Amazon - Oct 7, 2020.
Amazon's Machine Learning University offers an introductory course titled Accelerated Machine Learning, which is a good starting place for those looking for a foundation in generalized practical ML.
- A step-by-step guide for creating an authentic data science portfolio project - Oct 7, 2020.
Especially if you are starting out launching yourself as a Data Scientist, you will want to first demonstrate your skills through interesting data science project ideas that you can implement and share. This step-by-step guide shows you how to go through this process, with an original example that explores Germany’s biggest frequent flyer forum, Vielfliegertreff.
- 5 Challenges to Scaling Machine Learning Models - Oct 7, 2020.
ML models are hard to translate into active business gains. In order to understand the common pitfalls in productionizing ML models, let’s dive into the top 5 challenges that organizations face.
- Effective Visualization Techniques for Data Discovery and Analysis - Oct 6, 2020.
Learn how effective visual techniques help you better explore and understand your data, discover trends and patterns, and communicate findings.
- Here are the Most Popular Python IDEs/Editors - Oct 6, 2020.
Jupyter Notebook continues to lead as the most popular Python IDE, but its share has declined since the last poll. The top 4 contenders have remained the same, but only one has significantly improved its share. We also examine the breakdown by employment and region.
- 10 Best Machine Learning Courses in 2020 - Oct 6, 2020.
If you are ready to take your career in machine learning to the next level, then these top 10 Machine Learning Courses covering both practical and theoretical work will help you excel.
- A Guide to Preparing OpenCV for Android - Oct 6, 2020.
This tutorial guides Android developers in preparing the popular library OpenCV for use. Using a step-by-step guide, the library will be imported into Android Studio and then can be used for performing any of the operations it supports, such as object detection, segmentation, tracking, and more.
- Top Stories, Sep 28 – Oct 4: Data Science Minimum: 10 Essential Skills You Need to Know to Start Doing Data Science - Oct 5, 2020.
Also: The Best Free Data Science eBooks: 2020 Update; International alternatives to Kaggle for Data Science / Machine Learning competitions; Geographical Plots with Python; Introduction to Time Series Analysis in Python; Comparing the Top Business Intelligence Tools: Power BI vs Tableau vs Qlik vs Domo
- New U. of Chicago Machine Learning for Cybersecurity Certificate Gives Professionals Tools to Detect and Prevent Attacks - Oct 5, 2020.
Machine learning has become an essential tool for IT security professionals seeking to detect and prevent attacks and vulnerabilities. The Center for Data and Computing (CDAC) convened a trio of University of Chicago computer science faculty to produce an innovative new remote Machine Learning for Cybersecurity certificate that will be offered for the first time this autumn.
- Your Guide to Linear Regression Models - Oct 5, 2020.
This article explains linear regression and how to program linear regression models in Python.
- Key Machine Learning Technique: Nested Cross-Validation, Why and How, with Python code - Oct 5, 2020.
Selecting the best performing machine learning model with optimal hyperparameters can sometimes still end up with a poorer performance once in production. This phenomenon might be the result of tuning the model and evaluating its performance on the same sets of train and test data. So, validating your model more rigorously can be key to a successful outcome.
- Getting Started in AI Research - Oct 5, 2020.
A guide on how to contribute to confirming the reproducibility of some of the most recent papers and join open-search research.
- Data Protection Techniques Needed to Guarantee Privacy - Oct 2, 2020.
This article takes a look at the concepts of data privacy and personal data. It presents several privacy protection techniques and explains how they contribute to preserving the privacy of individuals.
- 5 Concepts Every Data Scientist Should Know - Oct 2, 2020.
Once a Data Scientist, there are certain skills you will apply each and every day of your career. Some of these might be common techniques you learned during your education, while others may develop fully only after you become more established in your organization. Continuing to hone these skills will provide you with valuable professional benefits.
- Comparing the Top Business Intelligence Tools: Power BI vs Tableau vs Qlik vs Domo - Oct 2, 2020.
How smart are your organizations’ decisions? Do you have the right information to make those decisions in the first place?
- 10 Days With “Deep Learning for Coders” - Oct 1, 2020.
Read about the author's experience with the course and the book from fast.ai.
- Understanding Transformers, the Data Science Way - Oct 1, 2020.
Read this accessible and conversational article about understanding transformers the data science way: by asking a lot of questions.