2020 Apr Tutorials, Overviews
- Outbreak Analytics: Data Science Strategies for a Novel Problem - Apr 30, 2020.
You walk down one aisle of the grocery store to get your favorite cereal. On the dairy aisle, someone sick from COVID-19 coughs. Did your decision to grab your cereal before your milk possibly keep you healthy? How can these unpredictable, near-random choices be included in complex models?
- Five Cool Python Libraries for Data Science - Apr 30, 2020.
Check out these 5 cool Python libraries that the author has come across during an NLP project, and which have made their life easier.
- Introducing Brain Simulator II: A New Platform for AGI Experimentation - Apr 29, 2020.
A growing consensus of researchers contends that new algorithms are needed to transform narrow AI into AGI. Brain Simulator II is free software for developing new algorithms targeted at AGI; you can experiment with it and participate in its development.
- Understanding the COVID-19 Pandemic Using Interactive Visualizations - Apr 29, 2020.
Interactive visualizations are an effective method for understanding the COVID-19 pandemic. This article presents a repository filled with just such insightful interactions.
- Coronavirus COVID-19 Genome Analysis using Biopython - Apr 29, 2020.
In this article, we interpret and analyze the COVID-19 DNA sequence data and try to get as many insights as possible regarding the proteins that make it up. We then compare COVID-19 DNA with MERS and SARS to understand the relationship among them.
- How Data Scientists Can Train and Update Models to Prepare for COVID-19 Recovery - Apr 28, 2020.
The COVID-19 pandemic has affected everything, and building predictions during this time is difficult. Data science teams need to update their models to prepare for the recovery, and know how to properly train 2020 data models to learn from the coronavirus anomaly.
- How AI Can Help Manage Infectious Diseases - Apr 28, 2020.
With the capability to analyze huge amounts of data, including medical information, human behavior patterns, and environmental conditions, big data tools can be invaluable in dealing with deadly outbreaks.
- 10 Best Machine Learning Textbooks that All Data Scientists Should Read, by Daniel Smith - Apr 28, 2020.
Check out these 10 books that can help data scientists and aspiring data scientists learn machine learning today.
- LSTM for time series prediction - Apr 27, 2020.
Learn how to develop an LSTM neural network with PyTorch on trading data to predict future prices by mimicking actual values of the time series data.
- A Concise Course in Statistical Inference: The Free eBook - Apr 27, 2020.
Check out this freely available book, All of Statistics: A Concise Course in Statistical Inference, and learn the probability and statistics needed for success in data science.
- Google Open Sources SimCLR, A Framework for Self-Supervised and Semi-Supervised Image Training - Apr 27, 2020.
The new framework uses contrastive learning to improve image analysis in unlabeled datasets.
- Learning during a crisis (Data Science 90-day learning challenge) - Apr 24, 2020.
How can you keep your focus and drive during a global crisis? Take on a 90-day learning challenge for data science and check out this list of books and courses to follow.
- The Super Duper NLP Repo: 100 Ready-to-Run Colab Notebooks - Apr 24, 2020.
Check out this repository of more than 100 freely-accessible NLP notebooks, curated from around the internet, and ready to launch in Colab with a single click.
- Data Transformation: Standardization vs Normalization, by Clare Liu - Apr 23, 2020.
Increasing accuracy in your models is often obtained through the first steps of data transformations. This guide explains the difference between the key feature scaling methods of standardization and normalization, and demonstrates when and how to apply each approach.
- Find Your Perfect Fit: A Quick Guide for Job Roles in the Data World - Apr 23, 2020.
Data-related positions have been considered the hottest in the job market for the last couple of years. While everyone wants to join the party and enter this fascinating field, it is essential to first understand the different roles. In this quick guide, I’ll do my best to dispel the confusion by crystallizing the essence of the different positions.
- Data context and how to get started with understanding COVID-19 data - Apr 22, 2020.
If you are already applying your Data Science skills or getting ready to contribute to analyzing COVID-19 data, then be sure to take sufficient time to appreciate the context of the numbers to focus on what's most important as we collaborate on this global battle.
- Fighting Coronavirus With AI: Improving Testing with Deep Learning and Computer Vision - Apr 22, 2020.
This post will cover how testing is done for the coronavirus, why it's important in battling the pandemic, and how deep learning tools for medical imaging can help us improve the quality of COVID-19 testing.
- Free High-Quality Machine Learning & Data Science Books & Courses: Quarantine Edition - Apr 22, 2020.
If you find yourself quarantined and looking for free learning materials in the way of books and courses to sharpen your data science and machine learning skills, this collection of articles I have previously written curating such things is for you.
- Announcing PyCaret 1.0.0 - Apr 21, 2020.
PyCaret is an open source, low-code machine learning library in Python that can replace hundreds of lines of code with only a few words, making experiments faster and more efficient.
- The Benefits & Examples of Using Apache Spark with PySpark - Apr 21, 2020.
Apache Spark runs fast, offers robust, distributed, fault-tolerant data objects, and integrates beautifully with the world of machine learning and graph analytics. Learn more here.
- 5 Papers on CNNs Every Data Scientist Should Read - Apr 20, 2020.
In this article, we introduce 5 papers on CNNs that represent both novel approaches and baselines in the field.
- The Double Descent Hypothesis: How Bigger Models and More Data Can Hurt Performance - Apr 20, 2020.
OpenAI research shows a phenomenon that challenges both traditional statistical learning theory and conventional wisdom among machine learning practitioners.
- Dockerize Jupyter with the Visual Debugger - Apr 17, 2020.
A step-by-step guide to enabling and using visual debugging in Jupyter in a Docker container.
- OpenAI Open Sources Microscope and the Lucid Library to Visualize Neurons in Deep Neural Networks - Apr 17, 2020.
The new tools show the potential of data visualizations for understanding features in a neural network.
- State of the Machine Learning and AI Industry - Apr 16, 2020.
Enterprises are struggling to launch machine learning models that encapsulate the optimization of business processes. These models are now essential components of data-driven applications and AI services that can improve legacy rule-based business processes, increase productivity, and deliver results. In the current state of the industry, many companies are turning to off-the-shelf platforms to increase their chances of success in applying machine learning.
- Dive Into Deep Learning: The Free eBook - Apr 16, 2020.
This freely available text on deep learning is fully interactive and incredibly thorough. Check out "Dive Into Deep Learning" now and improve your theoretical understanding of neural networks and your practical implementation skills.
- Better notebooks through CI: automatically testing documentation for graph machine learning - Apr 16, 2020.
In this article, we’ll walk through the detailed and helpful continuous integration (CI) that supports us in keeping StellarGraph’s demos current and informative.
- Why and How to Use Dask with Big Data - Apr 15, 2020.
The Pandas library for Python is a game-changer for data preparation. But when the data gets big, really big, your computer needs more help to efficiently handle all that data. Learn more about how to use Dask and follow a demo to scale up your Pandas to work with Big Data.
- Federated Learning: An Introduction - Apr 15, 2020.
Improving machine learning models and making them more secure by training on decentralized data.
- Visualizing Decision Trees with Python (Scikit-learn, Graphviz, Matplotlib) - Apr 15, 2020.
Learn how to visualize decision trees using Matplotlib and Graphviz.
- Top Process Mining Software Companies, Updated - Apr 14, 2020.
Understanding the real business processes of a company through analysis of its information systems can guide digital transformations. Here, we review the top 10 process mining software companies that can help businesses optimize their processes through unique insights into their business systems.
- Peer Reviewing Data Science Projects - Apr 13, 2020.
In any technical development field, having other practitioners review your work before shipping code off to production is a valuable support tool to make sure your work is error-proof. Even while preparing for the review, you might discover improvements, and outsiders can then spot other issues that escaped your awareness. This peer scrutiny can also be applied to Data Science, and this article outlines a process that you can experiment with in your team.
- How Deep Learning is Accelerating Drug Discovery in Pharmaceuticals - Apr 13, 2020.
The goal of this essay is to discuss meaningful machine learning progress in the real-world application of drug discovery. There’s even a solid chance of the deep learning approach to drug discovery changing lives for the better and doing meaningful good in the world.
- DeepMind Unveils Agent57, the First AI Agent that Outperforms Human Benchmarks in 57 Atari Games - Apr 13, 2020.
The new reinforcement learning agent innovates over previous architectures, achieving one of the most important milestones in the AI space.
- Has AI Come Full Circle? A data science journey, or why I accepted a data science job - Apr 10, 2020.
Personal journeys in Data Science can vary greatly between individuals. Some are just getting started and wading into this vast ocean of opportunity, and others have been involved during its decades-long evolution as a professional field. This review of a longer journey can provide a broader perspective of how you might fit into this interesting career.
- 3 Best Sites to Find Datasets for your Data Science Projects - Apr 9, 2020.
When first learning data science, you will inevitably find yourself looking for more datasets to practice with. Here, we recommend the 3 best sites to find datasets to spark your next data science project.
- Build PyTorch Models Easily Using torchlayers - Apr 9, 2020.
torchlayers aims to do what Keras did for TensorFlow, providing a higher-level model-building API and some handy defaults and add-ons useful for crafting PyTorch neural networks.
- 10 Must-read Machine Learning Articles (March 2020) - Apr 9, 2020.
This list will feature some of the recent work and discoveries happening in machine learning, as well as guides and resources for both beginner and intermediate data scientists.
- How to Do Hyperparameter Tuning on Any Python Script in 3 Easy Steps - Apr 8, 2020.
With your machine learning model in Python just working, it's time to optimize it for performance. Follow this guide to set up automated tuning using any optimization library in three steps.
- TensorFlow Dev Summit 2020: Top 10 Tricks for TensorFlow and Google Colab Users - Apr 8, 2020.
In this piece, we’ll highlight some of the tips and tricks mentioned during this year’s TF summit. Specifically, these tips will help you in getting the best out of Google’s Colab.
- 3 Reasons to Use Random Forest® Over a Neural Network: Comparing Machine Learning versus Deep Learning - Apr 8, 2020.
The random forest algorithm and neural networks are different techniques that learn differently but can be used in similar domains. Why would you use one over the other?
- 2 Things You Need to Know about Reinforcement Learning – Computational Efficiency and Sample Efficiency - Apr 7, 2020.
Experimenting with different strategies for a reinforcement learning model is crucial to discovering the best approach for your application. However, where you land can have a significant impact on your system's energy consumption, which could cause you to think again about the efficiency of your computations.
- Simple Question Answering (QA) Systems That Use Text Similarity Detection in Python - Apr 7, 2020.
How exactly are smart algorithms able to engage and communicate with us like humans? The answer lies in Question Answering systems that are built on a foundation of Machine Learning and Natural Language Processing. Let's build one here.
- Build an app to generate photorealistic faces using TensorFlow and Streamlit - Apr 7, 2020.
We’ll show you how to quickly build a Streamlit app to synthesize celebrity faces using GANs, TensorFlow, and st.cache.
- Uber Open Sourced Fiber, a Framework to Streamline Distributed Computing for Reinforcement Learning Models - Apr 6, 2020.
The new framework simplifies distributed and scalable training for reinforcement learning agents.
- Mathematics for Machine Learning: The Free eBook - Apr 6, 2020.
Check out this free ebook covering the fundamentals of mathematics for machine learning, as well as its companion website of exercises and Jupyter notebooks.
- More Performance Evaluation Metrics for Classification Problems You Should Know - Apr 3, 2020.
When building and optimizing your classification model, measuring how accurately it predicts your expected outcome is crucial. However, this metric alone is never the entire story, as it can still offer misleading results. That's where these additional performance evaluations come into play to help tease out more meaning from your model.
- Best Free Epidemiology Courses for Data Scientists - Apr 3, 2020.
Are you interested in knowing more about epidemiology, the field which studies the spread and distribution of diseases? This article collects some free courses which are intended to help you do just that.
- Stop Hurting Your Pandas! - Apr 3, 2020.
This post will address the issues that can arise when Pandas slicing is used improperly. If you see the warning that reads "A value is trying to be set on a copy of a slice from a DataFrame", this post is for you.
- A Layman’s Guide to Data Science. Part 2: How to Build a Data Project - Apr 2, 2020.
As Part 2 in a Guide to Data Science, we outline the steps to build your first Data Science project, including how to ask good questions to understand the data first, how to prepare the data, how to develop an MVP, how to iterate to build a good product, and, finally, how to present your project.
- Python for data analysis… is it really that simple?!? - Apr 2, 2020.
The article addresses a simple data analytics problem, comparing a Python and Pandas solution to an R solution (using plyr, dplyr, and data.table), as well as kdb+ and BigQuery solutions. Performance improvement tricks for these solutions are then covered, as are parallel/cluster computing approaches and their limitations.
- Introduction to the K-nearest Neighbour Algorithm Using Examples - Apr 1, 2020.
Read this concise summary of KNN, a supervised learning algorithm for pattern classification that determines which class a new input belongs to by choosing its k nearest neighbours according to a calculated distance.
- Introducing MIDAS: A New Baseline for Anomaly Detection in Graphs - Apr 1, 2020.
From network security to financial fraud, anomaly detection helps protect businesses, individuals, and online communities. To help improve anomaly detection, researchers have developed a new approach called MIDAS.