- Mastering Advanced Analytics with Apache Spark - May 22, 2018.
Get the ebook, a collection of the most popular technical blog posts that introduce you to machine learning on Apache Spark and highlight many of the major developments around Spark MLlib and GraphX.
Advanced Analytics, Apache Spark, Databricks, Graph Analytics, Machine Learning, MLlib
- Machine Learning Model Metrics - Jan 23, 2018.
In this article we explore how to calculate machine learning model metrics, using the example of fraud detection. We'll see lots of different ways that we can try to understand just how good our learned model is.
Logistic Regression, Machine Learning, Metrics, MLlib, ROC-AUC
- Machine Learning with Optimus on Apache Spark - Nov 30, 2017.
The way most Machine Learning models work on Spark is not straightforward, and they need lots of feature engineering to work. That’s why we created the feature engineering section inside the Optimus Data Frame Transformer.
Apache Spark, Data Science, Feature Engineering, Machine Learning, MLlib, Python, Workflow
- Top 15 Frameworks for Machine Learning Experts - Apr 19, 2016.
Whether you are a researcher, a start-up, or a big organization that wants to use machine learning, you will need the right tools to make it happen. Here is a list of the most popular frameworks for machine learning.
Data Science Tools, Deep Learning, Devendra Desale, Machine Learning, MLlib
- Exclusive Interview: Matei Zaharia, creator of Apache Spark, on Spark, Hadoop, Flink, and Big Data in 2020 - May 22, 2015.
Apache Spark is one of the hottest Big Data technologies in 2015. KDnuggets talks to Matei Zaharia, creator of Apache Spark, about key things to know about it, why it is not a replacement for Hadoop, how it is better than Flink, and his vision for Big Data in 2020.
Apache Spark, Big Data, Databricks, Flink, Hadoop, Matei Zaharia, MLlib, Spark SQL
- How Big Data Pieces, Technology, and Animals fit together - Feb 5, 2015.
How Big Data Pieces and animals fit together: MapReduce, HDFS, Apache Spark, Pregel, Zookeeper, Flume, Hive, Pig, and more, explained by a Quora (and past Facebook) Data Scientist.
Apache Hive, Apache Spark, Google, Hadoop, MLlib
- Top stories in July: Cartoon: Facebook data science experiment and Cats; Data Mining/Data Science “Nobel Prize” - Aug 5, 2014.
Cartoon: Facebook data science experiment and Happy Cats; Data Mining/Data Science "Nobel Prize": ACM SIGKDD 2014 Innovation Award to Pedro Domingos; Is "Data Scientist" more than "Data Analyst"?; When Watson Meets Machine Learning.
Cartoon, Data Scientist, Facebook, Machine Learning, MLlib, Pedro Domingos, SIGKDD, Top stories, Watson
- KDnuggets 14:n19, Big Data Gap; Boundary of Effectiveness; Great interviews; MLlib machine learning - Jul 30, 2014.
KDnuggets Analytics, Data mining, and Data Science stories, including Features, Software, News, Opinions, Interviews, Reports, Webcasts, Courses, Meetings, Jobs, Academic, Publications, and Tweets.
Big Data, MLlib, Poll, Strata
- Top stories for Jul 20-26 - Jul 27, 2014.
Baby steps in Learning Python; 7 Steps for Learning Data Mining; Spotting Bad Data Visualizations; MLlib: Apache Spark component for machine learning.
Apache Spark, Data Visualization, MLlib, Python, Top stories
- Top KDnuggets tweets, Jul 23-24: 81% of retail firms gather #BigData, only 34% use analytics - Jul 25, 2014.
81% of retail firms gather #BigData, only 34% use analytics to drive pricing optimization; Google Brain project: Google is not really a search company; The Journal of Big Data has published its first articles - Hadoop, Mahout, Data; MLlib: Apache Spark component for machine learning.
Alteryx, Big Data, Google, Journal, MLlib, Retail
- MLlib: Apache Spark component for machine learning - Jul 24, 2014.
MLlib, the machine learning component of Apache Spark, has developed into a tool that supports many common machine learning algorithms and now comes with more mature documentation and a stable API.
Apache Spark, Daniel D. Gutierrez, Machine Learning, MLlib
- Top KDnuggets tweets, Apr 9-10: MLlib: Scalable Machine Learning on Spark; Ensemble methods overview - Apr 11, 2014.
MLlib: Scalable Machine Learning on Spark (free ebook); Ensemble methods usually give best results in Machine Learning - an overview; Prediction.io open source machine learning server; Maslow Hierarchy of Analytical Needs - too clever?
Apache Spark, Ensemble Methods, Maslow Hierarchy, MLlib, PredictionIO