2018 Jul
- Deep Learning Summit returns to Toronto – learn from Geoff Hinton
- Jul 31, 2018.
Learn from Geoff Hinton and others at Deep Learning Summit and AI for Government Summit, Oct 25-26 in Toronto. Save 20% with code KDNUGGETS.
- Weapons of Math Destruction, Ethical Matrix, Nate Silver and more Highlights from the Data Science Leaders Summit
- Jul 31, 2018.
Domino Data Lab hosted its first ever Data Science Leaders Summit at the lovely Yerba Buena Center for the Arts in San Francisco on May 30-31, 2018. Cathy O'Neil, Nate Silver, Cassie Kozyrkov and Eric Colson were some of the speakers at this event.
- Big Data a $4.7 Billion opportunity in the healthcare and pharmaceutical industry
- Jul 31, 2018.
This post contains some of the key findings from SNS Telecom & IT's latest report, which indicates that Big Data investments in the healthcare and pharmaceutical industry are expected to reach nearly $4.7 Billion by the end of 2018.
- What is Normal?
- Jul 31, 2018.
I saw an article recently that referred to the normal curve as the data scientist's best friend. We examine myths around the normal curve, including - is most data normally distributed?
- Google’s AutoML: Cutting Through the Hype
- Jul 31, 2018.
In today’s post, I want to look specifically at Google’s AutoML, a product which has received a lot of media attention, and address "What is Google's AutoML?" and more.
- Top Stories, Jul 23-29: Cookiecutter Data Science: How to Organize Your Data Science Project; Comparison of Top 6 Python NLP Libraries
- Jul 30, 2018.
Also: How to Build a Data Science Portfolio; DevOps for Data Scientists: Taming the Unicorn; Why Germany did not defeat Brazil in the final, or Data Science lessons from the World Cup
- In-Depth Training for the Future of Data – TDWI Orlando Agenda Now Live
- Jul 30, 2018.
TDWI Orlando, Nov 11-16, provides you with the skills and best practices you need to advance your data management and analytics initiatives now. The agenda is now live! Save big with code KD20.
- Best Deal in the Galaxy? Win KDnuggets Free Pass to Strata Data Conference NYC, Sep 11-13, 2018
- Jul 30, 2018.
Cutting-edge science and new business fundamentals intersect and merge at Strata Data Conference. Win KDnuggets Pass - submit your entry by Aug 9, 2018.
- 5 reasons data analytics are falling short
- Jul 30, 2018.
When it comes to big data, possession is not enough. Comprehensive intelligence is the key. But traditional data analytics paradigms simply cannot deliver on the promise of data-driven insights. Here’s why.
- Intuitive Ensemble Learning Guide with Gradient Boosting
- Jul 30, 2018.
This tutorial discusses the importance of ensemble learning, with gradient boosting as a case study.
- DevOps for Data Scientists: Taming the Unicorn
- Jul 27, 2018.
How do we version control the model and add it to an app? How will people interact with our website based on the outcome? How will it scale!?
- Remote Data Science: How to Send R and Python Execution to SQL Server from Jupyter Notebooks
- Jul 27, 2018.
Did you know that you can execute R and Python code remotely in SQL Server from Jupyter Notebooks or any IDE? Machine Learning Services in SQL Server eliminates the need to move data around.
- How to Lie with Data Science
- Jul 27, 2018.
This post is not really about how to lie with Data Science. Instead, it’s about how we may be fooled by not giving enough attention to details in different parts of the pipeline.
- AI Conference, Sep 4-7, San Francisco – KDnuggets Offer
- Jul 26, 2018.
Join the leading minds in AI, explore the latest developments, separate hype from what is really game-changing, and learn how to apply AI in your organization. Save with code PCKDNG.
- Maximize Value with Data. 100% Online MS in Applied Data Science. Enrolling Now.
- Jul 26, 2018.
At Bay Path University, we'll provide you with a framework for working together regardless of your background and experience. That is why we created two tracks to complete the MS in Applied Data Science degree. Which is right for you?
- Data Science For Business: 3 Reasons You Need To Learn The Expected Value Framework
- Jul 26, 2018.
This article highlights the importance of learning the expected value framework in data science, covering classification, maximization and testing.
- Data Retrieval with Web Scraping: A Practitioner’s Guide to NLP
- Jul 26, 2018.
Proven and tested hands-on strategies to tackle NLP tasks.
- The Industries That Can Benefit Most From Predictive Analytics
- Jul 26, 2018.
Predictive analytics are useful for doing all those things and more, and could increase the overall competitiveness of individual companies or entire sectors.
- Top KDnuggets tweets, Jul 18-24: Causation in a Nutshell
- Jul 25, 2018.
Also fast.ai Deep Learning Part 2 Complete Course Notes; Comparison of Top 6 Python #NLProc Libraries.
- Iterate Your Way to a Top Analytics Product Experience, Aug 7 webinar
- Jul 25, 2018.
Learn how Mark43 researched, prototyped, and iterated to deliver analytics and business intelligence tools to police departments, emergency call centers, and other public safety agencies.
- On Stage at Predictive Analytics World Berlin: pwc, Lidl & Vodafone – November 13-14, 2018
- Jul 25, 2018.
The agenda for Predictive Analytics World for Business Berlin, 13-14 Nov, has just been released! Get inspired by heavyweight speakers & meet the people who make a difference!
- [eBook] A Unified Approach to Analytics with Apache Spark
- Jul 25, 2018.
How your data scientists and engineers can build models and data pipelines rapidly while collaborating with the business - download the ebook now.
- 9 Reasons why your machine learning project will fail
- Jul 25, 2018.
This article explains in detail some of the issues that you may face during your machine learning project.
- How to Build a Data Science Portfolio
- Jul 25, 2018.
This post will include links to where various data science professionals (data science managers, data scientists, social media icons, or some combination thereof) and others talk about what to have in a portfolio and how to get noticed.
- ODSC Europe Schedule Launched + Premium Data Science Training + ODSC West
- Jul 24, 2018.
Attend ODSC Europe 2018, London, 19-22 Sept, and get hands-on training from leading data science experts - use code ODSC45 until Fri, July 27, 2018 to save 45%. Also, use code ODSC50 to save 50% on your pass to ODSC West 2018, Oct 31 - Nov 3, San Francisco.
- New Online MS in Business Analytics from Drexel
- Jul 24, 2018.
With Drexel University’s online MS in Business Analytics program, you’ll be able to effectively analyze this overlooked data to give your company and yourself a competitive edge.
- Why Germany did not defeat Brazil in the final, or Data Science lessons from the World Cup
- Jul 24, 2018.
We review World Cup predictions (all failed), examine what makes such events difficult to predict, and suggest 3 golden rules to determine when you can trust the predictions.
- Genetic Algorithm Implementation in Python
- Jul 24, 2018.
This tutorial will implement the genetic algorithm optimization technique in Python based on a simple example in which we are trying to maximize the output of an equation.
- Cookiecutter Data Science: How to Organize Your Data Science Project
- Jul 24, 2018.
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
- Top Stories, Jul 16-22: Cartoon: Data Scientist was the sexiest job of the 21st century until…; Causation in a Nutshell
- Jul 23, 2018.
Also: Efficient Graph-based Word Sense Induction; 5 Quick and Easy Data Visualizations in Python with Code; Explaining the 68-95-99.7 rule for a Normal Distribution; 5 Data Science Projects That Will Get You Hired in 2018
- Happy 25th Birthday, KDnuggets
- Jul 23, 2018.
Twenty five years covering Data Mining, Knowledge Discovery in Data, KDD, Predictive Analytics, Big Data, Data Science, Machine Learning, and AI - my reflections on 25 years of publishing and editing KDnuggets.
- Improve Data Science Productivity with Anaconda Enterprise
- Jul 23, 2018.
Anaconda Enterprise is the only product on the market that empowers your data science team to go from laptop to cluster to production with full reproducibility and governance.
- The Washington Post, Alibaba.com & ING – Learn from the best at Predictive Analytics World London
- Jul 23, 2018.
Predictive Analytics World London, Oct 17-18 - the leading vendor-neutral machine learning conference - is close to a finalized agenda, packed with cutting edge insights.
- Building A Data Science Product in 10 Days
- Jul 23, 2018.
At startups, you often have the chance to create products from scratch. In this article, the author will share how to quickly build valuable data science products, using his first project at Instacart as an example.
- Comparison of Top 6 Python NLP Libraries
- Jul 23, 2018.
Today, we want to outline and compare the most popular and helpful natural language processing libraries, based on our experience.
- SuperDataScience Podcast: Insights from the Founder of KDnuggets
- Jul 21, 2018.
I talk to Kirill Eremenko about my journey to data science, how KDnuggets started, why you should start honing your machine learning engineering skills at this very moment, what's the future of data science, and more.
- Ready your Skills for a Cloud-First World with Google
- Jul 20, 2018.
The Machine Learning with TensorFlow on Google Cloud Platform Specialization on Coursera will help you jumpstart your career, includes hands-on labs, and takes you from a strategic overview to practical skills in building real-world, accurate ML models.
- Chaos is needed to keep us smart with Machine Learning
- Jul 20, 2018.
This post analyses why the chaotic nature of our lives can be used to improve machine learning algorithms.
- Causation in a Nutshell
- Jul 20, 2018.
Every move we make, every breath we take, and every heartbeat is an effect that is caused. Even apparent randomness may just be something we cannot explain.
- Receiver Operating Characteristic Curves Demystified (in Python)
- Jul 20, 2018.
In this blog, I will reveal, step by step, how to plot an ROC curve using Python. After that, I will explain the characteristics of a basic ROC curve.
- Products for Product People: Best Practices in Analytics, July 24 Webinar
- Jul 19, 2018.
Learn product analytics best practices and the "meta" perspective from a practitioner who is building products that anybody, including product managers, can use to access, analyze, and act on data to make important decisions.
- Strata New York: Early Price ends July 27
- Jul 19, 2018.
Strata Data Conference is coming to New York September 11-13. Register today. Your last chance to save up to $679 on passes ends July 27.
- The ultimate list of Web Scraping tools and software
- Jul 19, 2018.
Here's your guide to pick the right web scraping tool for your specific data needs.
- Best (and Free!!) Resources to Understand Nuts and Bolts of Deep Learning
- Jul 19, 2018.
This blog, however, is not aimed at the absolute beginner. Once you have a bit of intuition about how Deep Learning algorithms work, you might want to understand how things work under the hood.
- Explaining the 68-95-99.7 rule for a Normal Distribution
- Jul 19, 2018.
This post explains how those numbers were derived in the hope that they can be more interpretable for your future endeavors.
- Math for Machine Learning: Open Doors to Data Science and Artificial Intelligence
- Jul 18, 2018.
This ebook explains the math involved and introduces you directly to the foundational topics in machine learning.
- Top KDnuggets tweets, Jul 11-17: Foundations of Machine Learning – A Bloomberg course; The 5 Clustering Algorithms Data Scientists Need to Know
- Jul 18, 2018.
Also: Bayesian Machine Learning, Explained; Is Google Tensorflow Object Detection API the Easiest Way to Implement Image Recognition?; Data Science of Variable Selection: A Review; 7 Steps to Understanding Deep Learning
- Efficient Graph-based Word Sense Induction
- Jul 18, 2018.
This paper describes a set of algorithms for Natural Language Processing (NLP) that match or exceed the state of the art on several evaluation tasks, while also being much more computationally efficient.
- 5 Quick and Easy Data Visualizations in Python with Code
- Jul 18, 2018.
This post provides an overview of a small number of widely used data visualizations, and includes code in the form of functions to implement each in Python using Matplotlib.
- Clustering Using K-means Algorithm
- Jul 18, 2018.
This article explains the K-means algorithm in an easy way. I’d like to start with an example to understand the objective of this powerful technique in machine learning before getting into the algorithm, which is quite simple.
- NYU Stern MS in Business Analytics – Apply Now
- Jul 17, 2018.
Gain expertise in emerging operations, marketing, network analytics, data modeling, data science, and visualization. Next application deadline is Aug 1.
- Choosing the Right Marketing Technology Stack, July 25 Webinar
- Jul 17, 2018.
We will discuss ways to think about your marketing stack that help you stay focused on today's goals, while planning for tomorrow's growth.
- Basic Image Processing in Python, Part 2
- Jul 17, 2018.
We explain how to easily access and manipulate the internal components of digital images using Python and give examples from satellite image processing.
- BigQuery vs Redshift: Pricing Strategy
- Jul 17, 2018.
In this blog post, we’re going to break down BigQuery vs Redshift pricing structures and see how they work in detail.
- fast.ai Deep Learning Part 2 Complete Course Notes
- Jul 17, 2018.
This post is a collection of fantastic notes on the fast.ai deep learning part 2 MOOC freely available online, as written and shared by a student. These notes are a valuable learning resource either as a supplement to the courseware or on their own.
- Learn cutting edge techniques from world-class data scientists – RapidMiner Wisdom
- Jul 16, 2018.
Learn cutting edge techniques from world-class data scientists, including keynote speaker Kirk Borne. Use code WIS199 for 20% off early bird pricing through July 31, 2018.
- Key Takeaways from the Strata San Jose 2018
- Jul 16, 2018.
By dropping 'Hadoop' from its name, the @strataconf 2018 in San Jose signaled the emphasis on machine learning, cloud, streaming and real-time applications.
- Beating the 4-Year Slump: Mid-Career Growth in Data Science
- Jul 16, 2018.
This article provides a list of resources for data scientists who are transitioning from early-career/entry-level positions to more established roles. Surveys have shown a sharp decrease in satisfaction starting around 4 years into the profession, and resources are less obvious and readily available for professionals who have a good handle on the basics of data science than they are for beginners.
- Beginners Ask “How Many Hidden Layers/Neurons to Use in Artificial Neural Networks?”
- Jul 16, 2018.
By the end of this article, you could at least get the idea of how these questions are answered and be able to test yourself based on simple examples.
- Top Stories, Jul 9-15: Cartoon: Data Scientist was the sexiest job of the 21st century until…; Analyze a Soccer (Football) Game Using Tensorflow Object Detection and OpenCV
- Jul 16, 2018.
Also: The 4 Levels of Data Usage in Data Science; fast.ai Deep Learning Part 1 Complete Course Notes; What is Minimum Viable (Data) Product?; Text Mining on the Command Line
- Cartoon: Data Scientist was the sexiest job of the 21st century until …
- Jul 14, 2018.
This Data Scientist thought that he had the sexiest job of the 21st century until the arrival of the competition ...
- Top June Stories: 5 Data Science Projects That Will Get You Hired in 2018; Data Lake – the evolution of data processing
- Jul 13, 2018.
Also: Football World Cup 2018 Predictions: Germany vs Brazil in the final, and more; The 5 Clustering Algorithms Data Scientists Need to Know.
- McKinsey Analytics Online Hackathon, July 20-22
- Jul 13, 2018.
Calling all coders and data scientists to join McKinsey's hackathon. Prize: 5,000 USD + NIPS (Montreal, Canada) + Flights + Accommodation.
- The Future of Map-Making is Open and Powered by Sensors and AI
- Jul 13, 2018.
This article investigates the future of map-making and the role of Sensors, Artificial Intelligence and Machine Learning within that.
- Text Mining on the Command Line
- Jul 13, 2018.
In this tutorial, I use raw bash commands and regex to process a raw, messy JSON file and a raw HTML page. The tutorial helps us understand the text processing mechanism under the hood.
- Dimensionality Reduction: Does PCA really improve classification outcome?
- Jul 13, 2018.
In this post, I am going to verify this statement using a Principal Component Analysis (PCA) to try to improve the classification performance of a neural network over a dataset.
- New eBook: Machine Learning for Fraud Prevention
- Jul 12, 2018.
Get best practices on incorporating machine learning to automate your fraud prevention process and optimize workflows. Download this free ebook now.
- 9+ Rising Stars of Data Science
- Jul 12, 2018.
Connect and learn from these 9 data science rock stars and over 231 more presenters at ODSC West 2018, Oct 31-Nov 3 in San Francisco. Get 60% off until Friday, July 13 - reserve your spot here.
- What is Minimum Viable (Data) Product?
- Jul 12, 2018.
This post gives a personal insight into what Minimum Viable Product means for Machine Learning and the importance of starting small and iterating.
- AI Solutionism
- Jul 12, 2018.
Machine learning has huge potential for the future of humanity — but it won’t solve all our problems.
- Top KDnuggets tweets, Jul 4-10: Fantastic notes on the freely available @fastdotai machine learning course
- Jul 11, 2018.
Also: Analyze a Soccer (Football) Game Using #Tensorflow Object Detection; 18 Inspiring Women In AI, Big Data, Data Science, Machine Learning; Timsort - the fastest #sorting #algorithm you've never heard of.
- GDPR after 2 months – What does it mean for Machine Learning?
- Jul 11, 2018.
Almost 2 months on from the GDPR introduction, how was machine learning affected? What does the future hold?
- How to Balance the Load on a Data Team
- Jul 11, 2018.
This post will help you to better understand a data team’s workflow and allocate their resources to business users.
- 10 Mistakes to Avoid When Adopting Advanced Analytics
- Jul 10, 2018.
Download this report for a list of 10 mistakes to avoid when adopting advanced analytics, learn how you can improve your own implementation, and get a taste of premium membership.
- A FREE Live Online Conference For Aspiring Data Scientists & Data-Curious Business Leaders
- Jul 10, 2018.
Demystifying Data Science - a completely free, live online conference for aspiring data scientists and data-curious business professionals, July 24-25. Experience 28 interactive data science talks from industry-leading speakers. Register now!
- Basic Image Data Analysis Using Numpy and OpenCV – Part 1
- Jul 10, 2018.
Accessing the internal components of digital images with Python packages makes it more convenient to understand their properties and nature.
- Analyze a Soccer (Football) Game Using Tensorflow Object Detection and OpenCV
- Jul 10, 2018.
For the data scientist within you, let's use this opportunity to do some analysis on soccer clips. With the use of deep learning and OpenCV, we can extract interesting insights from video clips.
- fast.ai Deep Learning Part 1 Complete Course Notes
- Jul 10, 2018.
This post is a collection of fantastic notes on the fast.ai deep learning part 1 MOOC freely available online, as written and shared by a student. These notes are a valuable learning resource either as a supplement to the courseware or on their own.
- Top Stories, Jul 2-8: 5 of Our Favorite Free Visualization Tools; SQL Cheat Sheet
- Jul 10, 2018.
Also: 5 Data Science Projects That Will Get You Hired in 2018; Automated Machine Learning vs Automated Data Science.
- Strategic Analytics Summit, Las Vegas, Sep 26-27
- Jul 9, 2018.
This Summit will bring together Big Data thought leaders, top business executives and analytics experts for two days of insights, learning and networking. Use code KDNU18 for 25% off.
- Predictive Analytics World for Government – Sept 18-19 in Washington, DC – Save big if you register now
- Jul 9, 2018.
The deadline to save with Early Bird prices to the 2018 Predictive Analytics World for Government conference in Washington DC is fast approaching. This means that when you register by Friday, August 3, you could save up to $800.00.
- Deep Learning and Challenges of Scale Webinar
- Jul 9, 2018.
Join Nvidia for an on-demand webinar to learn how to tackle the challenges of scaling and building complex deep learning systems.
- Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning: July and Beyond
- Jul 9, 2018.
Coming soon: ICDM/MLDM New York, Data Innovation Summits Las Vegas, ICML Stockholm, IJCAI/ECAI Stockholm, TDWI Anaheim, KDD-2018 London, JupyterCon NYC, and many more.
- Data science of the connected vehicle: perspectives, applications and trends
- Jul 9, 2018.
The application of data science to streaming data from vehicles is an emerging field. Here we review general trends and some specific examples of relevant data feeds and applications where data science can deliver value.
- Data Mining Book Chapter Download
- Jul 9, 2018.
Download this chapter by Gordon Linoff and Michael Berry, and learn how to create derived variables, which allow the statistical modeling process to incorporate human insights.
- The 4 Levels of Data Usage in Data Science
- Jul 9, 2018.
This is an overview of the 4 levels, or "buckets," of data usage in business, starting at monitoring and progressing to automation.
- Cartoon: How is Data Science Different From Religion?
- Jul 8, 2018.
This difference between Data Science and Religion is not what you expect ...
- Weak and Strong Bias in Machine Learning
- Jul 6, 2018.
With the arrival of the GDPR there has been increased focus on non-discrimination in machine learning. This post explores different forms of model bias and suggests some practical steps to improve fairness in machine learning.
- Introduction to Apache Spark
- Jul 6, 2018.
This is the first blog in this series to analyze Big Data using Spark. It provides an introduction to Spark and its ecosystem.
- fast.ai Machine Learning Course Notes
- Jul 6, 2018.
This post is a collection of fantastic notes on the fast.ai machine learning MOOC freely available online, as written and shared by a student. These notes are a valuable learning resource either as a supplement to the courseware or on their own.
- The AI Conference in London – Exclusive KDNuggets Offer
- Jul 5, 2018.
The AI Conference will premiere in London, 8–11 October. The Best Price expires on 13 July. Tutorials, training courses, and hotel rooms all book up quickly. Save an additional 20% on Gold, Silver and Bronze passes with the code KDN20.
- 5 of Our Favorite Free Visualization Tools
- Jul 5, 2018.
5 key free data visualization tools that can provide flexible and effective data presentation.
- Manage your Machine Learning Lifecycle with MLflow – Part 1
- Jul 5, 2018.
Reproducibility, good management, and experiment tracking are necessary to make it easy to test others’ work and analysis. In this first part, we will learn through simple examples how to record and query experiments and how to package Machine Learning models so they are reproducible and can be run on any platform using MLflow.
- Text Classification & Embeddings Visualization Using LSTMs, CNNs, and Pre-trained Word Vectors
- Jul 5, 2018.
In this tutorial, I classify Yelp round-10 review datasets. After processing the review comments, I trained three models in three different ways and obtained three word embeddings.
- Deep Learning Tips and Tricks
- Jul 4, 2018.
This post is a distilled collection of conversations, messages, and debates on how to optimize deep models. If you have tricks you’ve found impactful, please share them in the comments below!
- Overview and benchmark of traditional and deep learning models in text classification
- Jul 3, 2018.
In this post, traditional and deep learning models in text classification will be thoroughly investigated, including a discussion into both Recurrent and Convolutional neural networks.
- Deep Quantile Regression
- Jul 3, 2018.
Most Deep Learning frameworks currently focus on giving a best estimate as defined by a loss function. Occasionally something beyond a point estimate is required to make a decision. This is where a distribution would be useful. This article will purely focus on inferring quantiles.
- Data Retrieval and Cleaning: Tracking Migratory Patterns
- Jul 3, 2018.
In this post, we walk through investigating, retrieving, and cleaning a real world data set. We will also describe the cost benefits and necessary tools involved in building your own data sets.
- AI for Fraud Detection – How does Mastercard do it? Learn how global leaders use AI
- Jul 2, 2018.
At the AI in Finance Summit, Sept 6-7 in NYC, RE•WORK will be showcasing the latest breakthrough technologies & their application in the financial sector with topics including Financial Compliance, Financial Forecasting, NLP, Investment, Blockchain & more.
- From Insights to Value in 90 Minutes – with Snowflake, July 12 Webinar
- Jul 2, 2018.
Learn How to Accelerate Data Warehouse Modernization at a Low Cost.
- SQL Cheat Sheet
- Jul 2, 2018.
A good programmer or software developer should have a basic knowledge of SQL queries in order to be able to retrieve data from a database. This cheat sheet can help you get started in your learning, or provide a useful resource for those working with SQL.
- Why a Professional Association for Data Scientists is a Bad Idea
- Jul 2, 2018.
This post presents the argument against having a professional association for data scientists.
- Automated Machine Learning vs Automated Data Science
- Jul 2, 2018.
Just adding the term "automated" in front of these 2 separate, distinct concepts does not somehow make them equivalent. Machine learning and data science are not the same thing.
- Top Stories, Jun 25 – Jul 1: 5 Data Science Projects That Will Get You Hired in 2018; 30 Free Resources for Machine Learning, Deep Learning, NLP & AI
- Jul 2, 2018.
Also: Top 20 Python Libraries for Data Science in 2018; Why Data Scientists Love Gaussian; How to Execute R and Python in SQL Server with Machine Learning Services; Explaining Reinforcement Learning: Active vs Passive; What's the Difference Between Data Integration and Data Engineering?