This post covers the emergence of game-theoretic concepts in the design of newer deep learning architectures. Deep learning systems need to adapt to imperfect knowledge and to coordinate with other systems, two areas where game theory can help.
Admittedly, there is a lot more to building a successful data team, and we would be lying if we pretended we have it all figured out. But hopefully focusing on the elements in this post is a good start.
Deep learning writer Carlos Perez gives his own classification for deep learning-based AI, which is aimed at practitioners. This classification gives us a sense of where we currently are and where we might be heading.
Climate change is one of the most pressing challenges for human society in the 21st century. We review the Big Data ecosystem for studying climate change.
Data Privacy, Security and Ethics are hot yet complex topics in the business and data science world. This important article discusses and provides guidelines for privacy, security and ethics, specifically in the context of Process Mining.
This post presents some common scenarios where a seemingly good machine learning model may still be wrong, along with a discussion of how to evaluate these issues by assessing metrics of bias vs. variance and precision vs. recall.
Power laws and other relationships between observable phenomena may not seem relevant to data science, at least not to newcomers to the field, but this post provides an overview and suggests why they are.
One of the biggest obstacles to successful projects has been getting access to interesting data. Here are some more cool public data sources you can use for your next project.
As 2016 comes to a close and we prepare for a new year, check out the final instalment in our "Main Developments in 2016 and Key Trends in 2017" series, where experts weigh in with their opinions.
This post walks through the logic behind three recent deep learning architectures: ResNet, HighwayNet, and DenseNet. Each makes it possible to successfully train deeper networks by overcoming the limitations of traditional network design.
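The idea shared by these architectures is a shortcut that lets the input bypass a stack of layers, so gradients can flow through deep networks. Below is a minimal NumPy sketch of ResNet's identity shortcut, y = relu(x + F(x)); the layer widths and weight scales are illustrative assumptions, not taken from the post.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, w1, w2):
    """A minimal residual block: the input skips past two weighted
    layers and is added back to their output (identity shortcut)."""
    f = relu(x @ w1) @ w2   # the residual function F(x)
    return relu(x + f)      # y = relu(x + F(x))

x = rng.standard_normal((4, 8))          # batch of 4, width 8
w1 = rng.standard_normal((8, 8)) * 0.1
w2 = rng.standard_normal((8, 8)) * 0.1
y = residual_block(x, w1, w2)
print(y.shape)  # (4, 8)
```

Note that if F(x) is zero (here, zero weights), the block reduces to the identity, which is what makes very deep stacks of such blocks easy to optimize.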
Bayesian inference is a powerful toolbox for modeling uncertainty, combining researcher understanding of a problem with data, and providing a quantitative measure of how plausible various facts are. This overview from Datascience.com introduces Bayesian probability and inference in an intuitive way, and provides examples in Python to help get you started.
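As a flavor of the kind of Python example the overview provides, here is a hypothetical sketch (not taken from the Datascience.com article) of the simplest Bayesian update: a beta prior over a coin's bias combined with binomial data, where the posterior mean quantifies how plausible "the coin favors heads" is.

```python
# Conjugate beta-binomial update: with prior Beta(a, b) and
# k successes observed in n trials, the posterior is
# Beta(a + k, b + n - k); its mean is our updated belief.
def posterior_mean(a, b, k, n):
    a_post, b_post = a + k, b + (n - k)
    return a_post / (a_post + b_post)

# Uniform prior Beta(1, 1); observe 7 heads in 10 coin flips.
print(posterior_mean(1, 1, 7, 10))  # 8/12 = 0.666...
```

The prior (a, b) encodes what the researcher believes before seeing data, and the data pull the estimate away from it, which is exactly the "understanding of a problem combined with data" the blurb describes.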
Also 20 Questions to Detect Fake Data Scientists; Software used for Analytics, Data Science, Machine Learning projects; Top Algorithms and Methods Used by Data Scientists
Read this engaging overview of a report from the Stanford University 100 year study of Artificial Intelligence, “a long-term investigation of the field of Artificial Intelligence (AI) and its influences on people, their communities, and society.”
Get up to speed and keep concepts and commands handy in Data Science, Data Mining, and Machine Learning with these cheat sheets covering R, Python, Django, MySQL, SQL, Hadoop, Apache Spark, Matlab, and Java.
The importance of correct classification and the hazards of misclassification are subjective, varying from case to case. Let's see how the cost of misclassification can be measured from a monetary perspective.
Why do we mine data? This post is an overview of the types of patterns that can be gleaned from data mining, and some real-world examples of said patterns.
Key themes included the polling failures in 2016 US Elections, Deep Learning, IoT, greater focus on value and ROI, and increasing adoption of predictive analytics by the "masses" of industry.
We review how key data science algorithms, such as regression, feature selection, and Monte Carlo, are used in financial instrument pricing and risk management.
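Monte Carlo is perhaps the most hands-on of these: simulate many possible futures for an asset and average the discounted payoffs. The sketch below prices a European call under geometric Brownian motion; it is a generic textbook illustration, not code from the article, and the parameter values are assumptions.

```python
import numpy as np

def mc_call_price(s0, k, r, sigma, t, n_paths=200_000, seed=0):
    """Monte Carlo price of a European call: simulate terminal prices
    S_T = S0 * exp((r - sigma^2/2) * t + sigma * sqrt(t) * Z),
    then discount the average payoff max(S_T - K, 0)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)
    s_t = s0 * np.exp((r - 0.5 * sigma**2) * t + sigma * np.sqrt(t) * z)
    payoff = np.maximum(s_t - k, 0.0)
    return np.exp(-r * t) * payoff.mean()

# At-the-money call: spot 100, strike 100, 5% rate, 20% vol, 1 year.
print(round(mc_call_price(100, 100, 0.05, 0.2, 1.0), 2))
```

The same machinery extends to path-dependent payoffs and risk measures such as VaR, where closed-form prices are unavailable.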
This recent paper approaches explaining deep learning from a different perspective, that of physics, and discusses the role of "cheap learning" (parameter reduction) and how it relates back to this innovative perspective.
Cognitive biases are inherently problematic in a variety of fields, including data science. Is this something that can be mitigated? A solid understanding of cognitive biases is the best weapon, which this overview hopes to help provide.
Matching the performance of a human brain is a difficult feat, but techniques have been developed to improve the performance of neural network algorithms, three of which are discussed in this post: distortion, mini-batch gradient descent, and dropout.
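Dropout is the easiest of the three to show in a few lines. The sketch below is a generic "inverted dropout" implementation in NumPy (an illustration of the standard technique, not code from the post): during training, each activation is zeroed with probability p and the survivors are rescaled so the expected activation is unchanged.

```python
import numpy as np

def dropout(x, p_drop, rng, train=True):
    """Inverted dropout: at training time, zero each activation with
    probability p_drop and rescale the rest by 1 / (1 - p_drop); at
    test time, pass x through unchanged."""
    if not train:
        return x
    mask = rng.random(x.shape) >= p_drop   # keep mask
    return x * mask / (1.0 - p_drop)

rng = np.random.default_rng(42)
a = np.ones((2, 5))
out = dropout(a, 0.5, rng)
print(out)  # entries are either 0.0 (dropped) or 2.0 (kept, rescaled)
```

Because each forward pass samples a different mask, the network cannot rely on any single unit, which acts as a regularizer against overfitting.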
This intro to ANNs will look at how we can train an algorithm to recognize images of handwritten digits. We will be using the images from the famous MNIST (Mixed National Institute of Standards and Technology) database.
Measuring the accuracy of a model for a classification problem (categorical output) is more complex and time-consuming than for regression problems (continuous output). Let's understand the key testing metrics for a classification problem with an example.
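The core metrics all derive from the four cells of the confusion matrix. As a hedged sketch of the standard definitions (an illustration, not the article's own example data), the pure-Python helper below computes accuracy, precision, recall, and F1 from predicted and true labels:

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 from the confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp)   # of predicted positives, how many were right
    recall = tp / (tp + fn)      # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))  # (0.75, 0.75, 0.75, 0.75)
```

Which metric matters depends on the cost of each error type, which is why a single "accuracy" number is rarely enough for categorical outputs.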
Two free ebooks: "Building Machine Learning Systems with Python" and "Practical Data Analysis" will give your skills a boost and make a great start in the New Year.
There is a lot of confusion these days about Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL), yet the distinction is very clear to practitioners in these fields. Are you able to articulate the difference?
Data processing and analytical modelling are major bottlenecks in today's big data world, due to the need for human intelligence to determine relationships in the data, the required data engineering tasks, and the analytical models and their parameters. This article discusses how a Smart Data Platform can help solve such problems.
Random forest is a highly versatile machine learning method with numerous applications ranging from marketing to healthcare and insurance. This is a post about random forests using Python.
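In Python, the go-to random forest implementation is scikit-learn's. The snippet below is a minimal sketch, assuming scikit-learn is installed and using a synthetic dataset in place of the marketing or healthcare data the post has in mind:

```python
# Minimal random forest workflow with scikit-learn on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10,
                           n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# 100 decision trees, each fit on a bootstrap sample of the rows
# and a random subset of features at each split.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_tr, y_tr)
print(round(clf.score(X_te, y_te), 2))  # held-out accuracy
```

Averaging many decorrelated trees is what gives the forest its robustness across such varied domains; `clf.feature_importances_` also gives a quick read on which inputs drive predictions.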
It’s easy to optimize simple neural networks, such as a single-layer perceptron, but as networks become deeper, optimization becomes a crucial problem. This article discusses such optimization problems in deep neural networks.
We examine the main reasons for failure in Big Data, Data Science, and Analytics projects, including lack of a clear mandate, resistance to change, and not asking the right questions, and discuss what can be done to address these problems.