Dataiku Data Science Studio
Data Science Studio (DSS) from Dataiku is a complete Data Science software tool for developers and analysts,
which significantly shortens the time-consuming load-clean-train-test-deploy cycles of building predictive applications.
A community edition and a free trial are available.
on Aug 26, 2014 in Data Mining Software, Data Preparation, Data Science, Dataiku, Florian Douetteau, Prediction
Deep Learning – important resources for learning and understanding
New and fundamental resources for learning about Deep Learning - the hottest machine learning method, which is approaching human-level performance.
on Aug 21, 2014 in Deep Learning, Image Recognition, Machine Learning, Yann LeCun, Yoshua Bengio
Sibyl: Google’s system for Large Scale Machine Learning
A review of a 2014 keynote talk about Sibyl, Google's system for large-scale machine learning. The Parallel Boosting algorithm and several design principles are introduced.
on Aug 20, 2014 in Algorithms, Boosting, Google, Machine Learning, Sibyl
Interview: Pedro Domingos: the Master Algorithm, new type of Deep Learning, great advice for young researchers
Top researcher Pedro Domingos on useful maxims for Data Mining, Machine Learning as the Master Algorithm, new type of Deep Learning called sum-product networks, Big Data and startups, and great advice to young researchers.
on Aug 19, 2014 in Advice, Deep Learning, KDD-2014, Machine Learning, Pedro Domingos, Startups
Four main languages for Analytics, Data Mining, Data Science
New KDnuggets Poll shows the growing dominance of four main languages for Analytics, Data Mining, and Data Science: R, SAS, Python, and SQL - used by 91% of data scientists - and a decline in the popularity of other languages, except for Julia and Scala.
on Aug 18, 2014 in Analytics Languages, Data Mining, Data Science, Julia, Poll, Python, R, SAS, Scala, SQL
Top Research Leaders in Data Mining, Data Science, and KDD
We identify the top researchers in Data Mining, Data Science, and KDD. Jiawei Han, Philip Yu, and Christos Faloutsos remain the leaders, but they are joined by many fast-rising young researchers - the leaders of tomorrow.
on Aug 16, 2014 in Christos Faloutsos, Data Mining, Hans-Peter Kriegel, Jian Pei, Jiawei Han, KDD, Philip S. Yu, Researchers, Top list
Interesting Social Media Datasets
Learn about some of the many interesting social media datasets available to you, some of which are quite new, and the different features and challenges they offer you for your next big data science project.
on Aug 13, 2014 in Challenge, Data Visualization, Datasets, Open Data, Social Media Analytics
OpenML: Share, Discover and Do Machine Learning
OpenML is designed to share, organize and reuse data, code and experiments, so that scientists can make discoveries more efficiently. It is an interesting idea to build a network of machine learning.
on Aug 11, 2014 in Kaggle, Machine Learning, OpenML, Ran Bi, Weka
Interview: Michael Berthold, President and Founder of KNIME, on Data Mining, Startups, and Visual Workflow
We discuss KNIME's key features and how it compares to the competition, the KNIME business model, Pharma, planned development, and the transition from an academic project to a company.
on Aug 9, 2014 in Knime, Konstanz University, Michael Berthold, Open Source
BAT: China’s Three Big Data Leaders
We examine the “three big mountains” in the Chinese Internet and Big Data industry: Baidu, Alibaba, and Tencent (together called BAT), and look into their different strategies and focus areas.
on Aug 5, 2014 in Alibaba, Baidu, Big Data, China, Liyang Tang, Search Infrastructure, Social Media Analytics, Tencent
Book: Data Classification: Algorithms and Applications
Learn a wide variety of data classification techniques and their methods, domains, and variations in this comprehensive survey of the area of data classification.
on Aug 2, 2014 in Algorithms, Book, Charu Aggarwal, Classification, CRC Press
18 essential Hadoop tools
Hadoop tools develop at a rapid rate, and keeping up with the latest can be difficult. Here we detail 18 of the most essential tools that work well with Hadoop.
on Aug 1, 2014 in Apache Spark, Data Infrastructure, Hadoop