Search results for

    Found 6239 documents, 5968 searched:

  • Data Science and Big Data: Two very Different Beasts

    Creating an artifact from ore requires tools, craftsmanship, and science. The same holds for big data and data science; here we present the distinguishing factors between the ore and the artifact.

    https://www.kdnuggets.com/2015/07/data-science-big-data-different-beasts.html

  • Using Ensembles in Kaggle Data Science Competitions- Part 3

    Earlier, we showed how to create stacked ensembles with stacked generalization and out-of-fold predictions. Now we'll learn how to implement various stacking techniques.

    https://www.kdnuggets.com/2015/06/ensembles-kaggle-data-science-competition-p3.html

  • Excellent Tutorial on Sequence Learning using Recurrent Neural Networks

    Excellent tutorial explaining Recurrent Neural Networks (RNNs) which hold great promise for learning general sequences, and have applications for text analysis, handwriting recognition and even machine translation.

    https://www.kdnuggets.com/2015/06/rnn-tutorial-sequence-learning-recurrent-neural-networks.html

  • Open Source Enabled Interactive Analytics: An Overview

    Explaining the aspects of creating an interactive, data-driven dashboard using open source technologies, namely MongoDB, D3.js, DC.js, and Node.js.

    https://www.kdnuggets.com/2015/06/open-source-interactive-analytics-overview.html

  • Using Ensembles in Kaggle Data Science Competitions – Part 2

    Aspiring to be a Top Kaggler? Learn more methods like Stacking & Blending. In the previous post we discussed ensembling models by weighting, averaging, and ranking. There is much more to explore in Part 2!

    https://www.kdnuggets.com/2015/06/ensembles-kaggle-data-science-competition-p2.html

  • Top 20 R packages by popularity

    Wondering which are the most popular R packages? Here's an analysis based on most downloaded R packages from Jan to May 2015 to identify the top trending packages in the R world!

    https://www.kdnuggets.com/2015/06/top-20-r-packages.html

  • Top 20 R Machine Learning and Data Science packages

    We list out the top 20 popular Machine Learning R packages by analysing the most downloaded R packages from Jan-May 2015.

    https://www.kdnuggets.com/2015/06/top-20-r-machine-learning-packages.html

  • Top 10 Machine Learning Videos on YouTube

    The top machine learning videos on YouTube include lecture series from Stanford and Caltech, Google Tech Talks on deep learning, using machine learning to play Mario and Hearthstone, and detecting NHL goals from live streams.

    https://www.kdnuggets.com/2015/06/top-10-machine-learning-videos-youtube.html

  • Popular Deep Learning Tools – a review

    Deep Learning is the hottest trend now in AI and Machine Learning. We review the popular software for Deep Learning, including Caffe, Cuda-convnet, Deeplearning4j, Pylearn2, Theano, and Torch.

    https://www.kdnuggets.com/2015/06/popular-deep-learning-tools.html

  • In Machine Learning, What is Better: More Data or better Algorithms

    The gross over-generalization that “more data gives better results” is misleading. Here we explain in which scenarios more data or more features are helpful and in which they are not, and how the choice of algorithm affects the end result.

    https://www.kdnuggets.com/2015/06/machine-learning-more-data-better-algorithms.html

  • Interview: Joseph Babcock, Netflix on Genie, Lipstick, and Other In-house Developed Tools

    We discuss the role of analytics in content acquisition, data architecture at Netflix, organizational structure, and open-source tools from Netflix.

    https://www.kdnuggets.com/2015/06/interview-joseph-babcock-netflix-in-house-developed-tools.html

  • Interview: Joseph Babcock, Netflix on Discovery and Personalization from Big Data

    We discuss the steps involved in the Discovery process at Netflix, the impact of the multitude of devices, system-generated logs, and surprising insights.

    https://www.kdnuggets.com/2015/06/interview-joseph-babcock-netflix-discovery-personalization.html

  • Cognitive Computing: Solving the Big Data Problem?

    With a shortage of data scientists, what are the alternatives for making sense of Big Data? We examine Cognitive Computing, its strengths, and how it can fit into the current Big Data landscape.

    https://www.kdnuggets.com/2015/06/cognitive-computing-solving-big-data-problem.html

  • Which Big Data, Data Mining, and Data Science Tools go together?

    We analyze the associations between the top Big Data, Data Mining, and Data Science tools based on the results of the 2015 KDnuggets Software Poll. Download anonymized data and analyze it yourself.

    https://www.kdnuggets.com/2015/06/data-mining-data-science-tools-associations.html

  • Love, Sex and Predictive Analytics

    Here we try to understand how dating sites work, the algorithms they use, and the role of predictive analytics in matchmaking. We have also gleaned some interesting analytical insights from them.

    https://www.kdnuggets.com/2015/06/love-sex-predictive-analytics.html

  • Top 30 Social Network Analysis and Visualization Tools

    We review major tools and packages for Social Network Analysis and visualization, which have wide applications including biology, finance, sociology, network theory, and many other domains.

    https://www.kdnuggets.com/2015/06/top-30-social-network-analysis-visualization-tools.html

  • Top 20 Python Machine Learning Open Source Projects

    We examine the top Python Machine Learning open source projects on GitHub, in terms of both contributors and commits, and identify the most popular and most active ones.

    https://www.kdnuggets.com/2015/06/top-20-python-machine-learning-open-source-projects.html

  • Applied Statistics Is A Way Of Thinking, Not Just A Toolbox

    The choice of tools in applied statistics is driven by the objective, the structure of the data, and the nature of the uncertainty in the numbers, whereas in academic statistics it's driven by publishing or teaching. Here we present some common statistical tools and their overlapping genealogy.

    https://www.kdnuggets.com/2015/05/applied-statistics-thinking-not-toolbox.html

  • Insights from Data Science Handbook

    Here you can find the perspectives of lead data scientists on topics ranging from the definition of data science, metrics selection while solving a problem, work ethics, and the art of storytelling to why data science is important in today's world.

    https://www.kdnuggets.com/2015/05/insights-from-data-science-handbook.html

  • Miner3D Data Visualization System Version 8

    The new software features a redesigned user interface, making it a perfect complement for Excel. The new graphics visualization engine is faster and smoother.

    https://www.kdnuggets.com/2015/05/miner3d-data-visualization-system-version-8.html

  • KDnuggets™ News 15:n17, May 27: R wins Annual Poll; Top 10 Algorithms; Interview with Spark Creator

    R leads RapidMiner, Python catches up - Annual Software Poll; Top 10 Data Mining Algorithms; Exclusive Interview: Matei Zaharia, creator of Apache Spark; 5 Not-to-be-Missed Ideas about Big Data.

    https://www.kdnuggets.com/2015/n17.html

  • Dark Knowledge Distilled from Neural Network

    Geoff Hinton never stopped generating new ideas. This post is a review of his research on “dark knowledge”. What’s that supposed to mean?

    https://www.kdnuggets.com/2015/05/dark-knowledge-neural-network.html

  • R vs Python for Data Science: The Winner is …

    In the battle of "best" data science tools, Python and R both have their pros and cons. Selecting one over the other will depend on the use cases, the cost of learning, and other common tools required.

    https://www.kdnuggets.com/2015/05/r-vs-python-data-science.html

  • R leads RapidMiner, Python catches up, Big Data tools grow, Spark ignites

    R is the most popular overall tool among data miners, although Python usage is growing faster. RapidMiner continues to be the most popular suite for data mining/data science. Hadoop/Big Data tools usage grew to 29%, propelled by 3x growth in Spark. Other tools with strong growth include H2O (0xdata), Actian, MLlib, and Alteryx.

    https://www.kdnuggets.com/2015/05/poll-r-rapidminer-python-big-data-spark.html

  • Exclusive Interview: Matei Zaharia, creator of Apache Spark, on Spark, Hadoop, Flink, and Big Data in 2020

    Apache Spark is one of the hottest Big Data technologies in 2015. KDnuggets talks to Matei Zaharia, creator of Apache Spark, about key things to know about it, why it is not a replacement for Hadoop, how it is better than Flink, and his vision for Big Data in 2020.

    https://www.kdnuggets.com/2015/05/interview-matei-zaharia-creator-apache-spark.html

  • Top 10 Data Mining Algorithms, Explained

    The top 10 data mining algorithms, selected by top researchers, are explained here, including what they do, the intuition behind each algorithm, available implementations, why to use them, and interesting applications.

    https://www.kdnuggets.com/2015/05/top-10-data-mining-algorithms-explained.html

  • I’ve Been Replaced by an Analytics Robot

    A veteran statistician reflects on the journey from the statistician of the past to the data scientist of today, how the work he used to do became automated, and what future data scientists can expect.

    https://www.kdnuggets.com/2015/05/replaced-by-analytics-robot.html

  • Most Viewed Data Mining Videos on YouTube

    The top Data Mining YouTube videos by those like Google and Revolution Analytics cover topics ranging from statistics in data mining to using R for data mining to data mining in sports.

    https://www.kdnuggets.com/2015/05/most-viewed-data-mining-videos-youtube.html

  • How to Lead a Data Science Contest without Reading the Data

    We examine a “wacky” boosting method that lets you climb the public leaderboard without even looking at the data. But there is a catch, so read on before trying to win Kaggle competitions with this approach.

    https://www.kdnuggets.com/2015/05/data-science-contest-leaderboard-without-reading-data.html

  • Data Science for Workforce Optimization: Reducing Employee Attrition

    Predictive analytics is extending its reach; see how it is affecting the workforce analytics domain. In this presentation Pasha Roberts explains what is in it for students, managers, and practitioners.

    https://www.kdnuggets.com/2015/05/data-science-workforce-optimization-reducing-employee-attrition.html

  • Surprising Random Correlations

    An interesting demo showing how easy it is to find surprising correlations in real data. Is German unemployment rate related to Apple Stock? Is 10-year Treasury rate related to price of Red Winter Wheat? You will be surprised.

    https://www.kdnuggets.com/2015/05/surprising-random-correlations.html

  • Seven Techniques for Data Dimensionality Reduction

    Performing data mining with high-dimensional data sets: a comparative study of different feature selection techniques like Missing Values Ratio, Low Variance Filter, PCA, Random Forests / Ensemble Trees, etc.

    https://www.kdnuggets.com/2015/05/7-methods-data-dimensionality-reduction.html

  • Plotly: Online Dashboards That Update Your Data and Graphs

    New online visualization option from Plot.ly allows you to have data visualizations and graphs that update dynamically.

    https://www.kdnuggets.com/2015/05/plotly-online-dashboards-update-data-graphs.html

  • Machine Learning Wars: Amazon vs Google vs BigML vs PredicSis

    Comparing 4 Machine Learning APIs (Amazon Machine Learning, BigML, Google Prediction API, and PredicSis) on real data from Kaggle, we find the most accurate, the fastest, the best tradeoff, and a surprise last place.

    https://www.kdnuggets.com/2015/05/machine-learning-wars-amazon-google-bigml-predicsis.html

  • Cartoon: Data Scientist Mother

    We revisit KDnuggets Cartoon which looks at the Mother of All Data. Enjoy and don't forget the mothers in your life - Big Data predicted that 67.53% of you would remember!

    https://www.kdnuggets.com/2015/05/cartoon-data-scientist-mother.html

  • Most Viewed Big Data Videos on YouTube

    The top Big Data YouTube videos by those like Hortonworks and Kirk D. Borne cover diverse topics including Hadoop, Big Data Trends, Deep Learning, and Big Data Leadership.

    https://www.kdnuggets.com/2015/05/most-viewed-big-data-videos-youtube.html

  • The Inconvenient Truth About Data Science

    Data is never clean, you will spend most of your time cleaning and preparing data, 95% of tasks do not require deep learning, and more inconvenient wisdom.

    https://www.kdnuggets.com/2015/05/data-science-inconvenient-truth.html

  • Data Scientists Automated and Unemployed by 2025?

    Will Data Scientists be unemployed by 2025? A majority of voters in the latest KDnuggets Poll expect expert-level Data Science to be automated in 10 years or less.

    https://www.kdnuggets.com/2015/05/data-scientists-automated-2025.html

  • Top LinkedIn Groups for Analytics, Big Data, Data Mining, and Data Science – Discussions up, Engagement down

    While discussions are growing, the comments and engagements are falling, especially since 2012. We cluster groups into 4 quadrants by activity level and identify most active and engaged groups. Open groups are twice as active as closed.

    https://www.kdnuggets.com/2015/05/top-linkedin-groups-analytics-big-data-mining-activity-engagement.html

  • WebDataCommons – the Data and Framework for Web-scale Mining

    The WebDataCommons project extracts the largest publicly available hyperlink graph, large product-, address-, recipe-, and review corpora, as well as millions of HTML tables from the Common Crawl web corpus and provides the extracted data for public download.

    https://www.kdnuggets.com/2015/05/webdatacommons-data-web-scale-mining.html

  • How To Become a Data Scientist And Get Hired

    A data scientist should be able to choose the right technology, understand the business context, and solve a wide range of problems. To hire the right data scientist, check the list of tips in the post.

    https://www.kdnuggets.com/2015/05/datafloq-become-data-scientist-get-hired.html

  • The Myth of Model Interpretability

    Deep networks are widely regarded as black boxes. But are they truly uninterpretable in any way that logistic regression is not?

    https://www.kdnuggets.com/2015/04/model-interpretability-neural-networks-deep-learning.html

  • New Hybrid Rare-Event Sampling Technique for Fraud Detection

    Proposed hybrid sampling methodology may prove useful when building and validating machine learning models for applications where target event is rare, such as fraud detection.

    https://www.kdnuggets.com/2015/04/new-hybrid-rare-event-sampling-technique-fraud-detection.html

  • Data Mining: New Comprehensive Textbook by Charu Aggarwal

    This comprehensive data mining textbook explores the different aspects of data mining, from basics to advanced, and their applications, and may be used for both introductory and advanced data mining courses.

    https://www.kdnuggets.com/2015/04/data-mining-textbook-charu-aggarwal.html

  • Top 10 R Packages to be a Kaggle Champion

    Kaggle top ranker Xavier Conort shares insights on the “10 R Packages to Win Kaggle Competitions”.

    https://www.kdnuggets.com/2015/04/top-10-r-packages-kaggle.html

  • Algorithmia Tested: Human vs Automated Tag Generation

    Algorithmia, the marketplace for algorithms, can be a platform for hosting APIs to do a plethora of text analytics and information retrieval tasks. Automatic post tagging is done in this case study to demonstrate the effectiveness and ease-of-use of the platform.

    https://www.kdnuggets.com/2015/04/algorithmia-tested-automated-tag-generation.html

  • Algorithmia: Building a web site explorer in 5 easy steps

    We show how to use Algorithmia for quickly building a functional web site explorer in 5 steps: GetLinks, PageRank, Url2text, Summarizer and AutoTag.

    https://www.kdnuggets.com/2015/04/algorithmia-building-web-site-explorer-5-easy-steps.html

  • Cartoon: A solution for Data Scientists allergies caused by Big Data

    With more and more allergies and a big trend towards gluten-free everything, a new KDnuggets cartoon envisions a possible solution for Data Scientists' allergies.

    https://www.kdnuggets.com/2015/04/cartoon-data-scientist-allergies-big-data.html

  • Data Science 101: Preventing Overfitting in Neural Networks

    Overfitting is a major problem for Predictive Analytics and especially for Neural Networks. Here is an overview of key methods to avoid overfitting, including regularization (L2 and L1), Max norm constraints and Dropout.

    https://www.kdnuggets.com/2015/04/preventing-overfitting-neural-networks.html

  • Interview: Ksenija Draskovic, Verizon on Dissecting the Anatomy of Predictive Analytics Projects

    We discuss Predictive Analytics use cases at Verizon Wireless, advantages of a unified data view, model selection and common causes of failure.

    https://www.kdnuggets.com/2015/04/interview-ksenija-draskovic-verizon-predictive-analytics.html

  • Awesome Public Datasets on GitHub

    A long, categorized list of large datasets (available for public use) to try your analytics skills on. Which one would you pick?

    https://www.kdnuggets.com/2015/04/awesome-public-datasets-github.html

  • Hadoop as a Service: 18 Cloud Options

    Hadoop as a service in the cloud makes big data applications and projects easier to approach and these 18 platforms each provide their own unique solutions.

    https://www.kdnuggets.com/2015/04/hadoop-as-service-18-cloud-options.html

  • Computing Platforms for Analytics, Data Mining, Data Science

    The poll results suggest a split between a majority of data miners and data scientists who work with growing but still "PC-size", small GB-sized data, and a smaller group of Big Data analysts who work with cloud-sized data. Cloud computing, Unix, and especially Mac gained in popularity.

    https://www.kdnuggets.com/2015/04/computing-platforms-analytics-data-mining-data-science.html

  • Interview: Bill Moreau, USOC on Evidence-based Medicine to Reduce Sports Injuries

    We discuss the success of Analytics in predicting sports injuries, recent progress in concussion management and the trends in data-driven evidence-based sports medicine.

    https://www.kdnuggets.com/2015/03/interview-bill-moreau-usoc-sports-medicine.html

  • More Free Data Mining, Data Science Books and Resources

    More free resources and online books by leading authors about data mining, data science, machine learning, predictive analytics and statistics.

    https://www.kdnuggets.com/2015/03/free-data-mining-data-science-books-resources.html

  • Talking Machine – 3 Deep Learning Gurus Talk about History and Future of Machine Learning, part 1

    A recent interview from the Talking Machine podcast with three deep learning experts. They talked about the neural network winter and its renewal.

    https://www.kdnuggets.com/2015/03/talking-machine-deep-learning-gurus-p1.html

  • Do We Need More Training Data or More Complex Models?

    Do we need more training data? Which models will suffer from performance saturation as data grows large? Do we need larger models or more complicated models, and what is the difference?

    https://www.kdnuggets.com/2015/03/more-training-data-or-complex-models.html

  • Interview: Brad Klingenberg, StitchFix on Building Analytics-powered Personal Stylist

    We discuss StitchFix, how it leverages Analytics, understanding customer preferences, and pros-and-cons of involving human judgement in the recommendation process.

    https://www.kdnuggets.com/2015/03/interview-brad-klingenberg-stitchfix-analytics.html

  • Top KDnuggets tweets, Mar 16-18: 87 Studies shown that accurate numbers aren’t more useful than the ones you make up (Dilbert)

    Also Sirius - a free, open-source version of Siri; #PI art: the first 13,689 digits of pi; Great tutorial + #Python code: 1-Layer Neural Networks.

    https://www.kdnuggets.com/2015/03/top-tweets-mar16-18.html

  • Small Data requires Specialized Deep Learning and Yann LeCun response

    For industries that have relatively small data sets (less than a petabyte), a Specialized Deep Learning approach based on unsupervised learning and domain knowledge is needed.

    https://www.kdnuggets.com/2015/03/small-data-specialized-deep-learning-yann-lecun.html

  • Interview: Vince Darley, King.com on the Serious Analytics behind Casual Gaming

    We discuss key characteristics of social gaming data, ML use cases at King, infrastructure challenges, major problems with A-B testing and recommendations to resolve them.

    https://www.kdnuggets.com/2015/03/interview-vince-darley-king-analytics-gaming.html

  • Coursera: Process Mining: Data science in Action, April 2015

    Due to the big success of the first run, this 6-week online course is being repeated on Coursera, starting April 1. This free course provides data science knowledge that can be applied directly to analyze and improve processes in a variety of domains.

    https://www.kdnuggets.com/2015/03/coursera-process-mining-data-science-action.html

  • Deep Learning for Text Understanding from Scratch

    Forget about the meaning of words, forget about grammar, forget about syntax, forget even the very concept of a word. Now let the machine learn everything by itself.

    https://www.kdnuggets.com/2015/03/deep-learning-text-understanding-from-scratch.html

  • Deep Learning, The Curse of Dimensionality, and Autoencoders

    Autoencoders are an extremely exciting new approach to unsupervised learning and for many machine learning tasks they have already surpassed the decades of progress made by researchers handpicking features.

    https://www.kdnuggets.com/2015/03/deep-learning-curse-dimensionality-autoencoders.html

  • SQL-like Query Language for Real-time Streaming Analytics

    We need a SQL-like query language for real-time streaming analytics that is expressive, short, and fast, defines core operations that cover 90% of problems, and is easy to follow and learn.

    https://www.kdnuggets.com/2015/03/sql-query-language-realtime-streaming-analytics.html

  • Machine Learning Table of Elements Decoded

    Machine learning packages for Python, Java, Big Data, Lua/JS/Clojure, Scala, C/C++, CV/NLP, and R/Julia are represented using a cute but ill-fitting metaphor of a periodic table. We extract the useful links.

    https://www.kdnuggets.com/2015/03/machine-learning-table-elements.html

  • 7 common mistakes when doing Machine Learning

    In statistical modeling, there are various algorithms to build a classifier, and each algorithm makes a different set of assumptions about the data. For Big Data, it pays off to analyze the data upfront and then design the modeling pipeline accordingly.

    https://www.kdnuggets.com/2015/03/machine-learning-data-science-common-mistakes.html

  • 10 Predictive Analytics Influencers You Need to Know

    A list of Predictive Analytics Influencers based on Twitter activity around “#PredictiveAnalytics” and “Predictive Analytics”: Gregory Piatetsky, Vineet Vashishta, Aki Kakko and more.

    https://www.kdnuggets.com/2015/03/10-predictive-analytics-influencers-dataconomy.html

  • The Elements of Data Analytic Style – checklist

    Jeff Leek's book "Elements of Data Analytic Style" had a rocket launch, thanks to the author's course on Coursera. The book includes a useful checklist that can guide beginning data analysts or serve for evaluating data analyses.

    https://www.kdnuggets.com/2015/03/jtleek-elements-data-analytic-style.html

  • IBM Big Data & Analytics Heroes

    IBM's Big Data & Analytics Heroes include leaders in the field that propel the industry in order to promote thought leadership and progress in Big Data Analytics.

    https://www.kdnuggets.com/2015/02/ibm-big-data-analytics-heroes.html

  • Interview: David Kasik, Boeing on Data Analysis vs Data Analytics

    We discuss the impact of the increasing amount of data on visualization, the difference between Data Analysis and Data Analytics, motivation, trends, desired skills, and more.

    https://www.kdnuggets.com/2015/02/interview-david-kasik-boeing-data-analytics.html

  • Google BigQuery Public Datasets

    Google BigQuery is not only a fantastic tool to analyze data, but it also has a repository of public data, including GDELT world events database, NYC Taxi rides, GitHub archive, Reddit top posts, and more.

    https://www.kdnuggets.com/2015/02/google-bigquery-public-datasets.html

  • Fun and Top! US States in 2 Words using twitteR

    Combining twitteR package with text mining techniques and visualization tools can produce interesting outputs. Find out which US state is fun and top, and which is good and crazy, according to Twitter.

    https://www.kdnuggets.com/2015/02/us-states-2-words-twitter.html

  • History of Data Science Infographic in 5 strands

    History of Data Science infographic presents key events in Data Science across 5 strands: Computer Science, Data Technology, Visualization, Mathematics/OR, and Statistics.

    https://www.kdnuggets.com/2015/02/history-data-science-infographic.html

  • Automatic Statistician and the Profoundly Desired Automation for Data Science

    The Automatic Statistician project by Univ. of Cambridge and MIT is pushing ahead the frontiers of automation for the selection and evaluation of machine learning models. In general, what does automation mean to Data Science?

    https://www.kdnuggets.com/2015/02/automated-statistician-data-science.html

  • Interview: David Kasik, Boeing on How Visual Analytics is Improving Aviation Safety

    We discuss data visualization at Boeing, the importance of Visual Analytics, Aviation Safety improvement through Analytics and augmented reality.

    https://www.kdnuggets.com/2015/02/interview-david-kasik-boeing-visual-analytics-aviation.html

  • Tinderbox: Automating Romance with Tinder and Eigenfaces

    Tinderbox is software that uses machine learning and image recognition to automate Tinder, a popular app for singles to meet. The author describes his experience and feedback until it started to work too well.

    https://www.kdnuggets.com/2015/02/tinderbox-automating-romance-tinder-eigenfaces.html

  • Data Science’s Most Used, Confused, and Abused Jargon

    As data science has spread through the mainstream, so too has a dense vocabulary of ill-defined jargon. In a split-personality post, we offer several perspectives on many of data science's most confused terms.

    https://www.kdnuggets.com/2015/02/data-science-confusing-jargon-abused.html

  • 10 things statistics taught us about big data analysis

    There are 10 ideas in applied statistics that are relevant to big data analysis, focusing on prediction accuracy, interactive analysis, and more.

    https://www.kdnuggets.com/2015/02/10-things-statistics-big-data-analysis.html

  • Top 30 people in Big Data and Analytics

    Innovation Enterprise has compiled a top 30 list for individuals in big data that have had a large impact on the development or popularity of the industry.

    https://www.kdnuggets.com/2015/02/top-30-people-big-data-analytics.html

  • Facebook Open Sources deep-learning modules for Torch

    We review Facebook's recently released Torch module for Deep Learning, which helps researchers train large-scale convolutional neural networks for image recognition, natural language processing, and other AI applications.

    https://www.kdnuggets.com/2015/02/facebook-open-source-deep-learning-torch.html

  • Interview: Eli Collins, Cloudera on Evolution and Future of Big Data Ecosystem

    We discuss the change in Big Data priorities, risks, Big Data ecosystem, rise of data culture in organizations, challenges, advice and more.

    https://www.kdnuggets.com/2015/02/interview-eli-collins-cloudera-big-data-ecosystem.html

  • (Deep Learning’s Deep Flaws)’s Deep Flaws

    Recent press has challenged the hype surrounding deep learning, trumpeting several findings which expose shortcomings of current algorithms. However, many of deep learning's reported flaws are universal, affecting nearly all machine learning algorithms.

    https://www.kdnuggets.com/2015/01/deep-learning-flaws-universal-machine-learning.html

  • Text Analysis 101: Document Classification

    Document classification is an example of Machine Learning (ML) in the form of Natural Language Processing (NLP). By classifying text, we are aiming to assign one or more classes or categories to a document, making it easier to manage and sort.

    https://www.kdnuggets.com/2015/01/text-analysis-101-document-classification.html

  • The High Cost of Maintaining Machine Learning Systems

    Google researchers warn of the massive ongoing costs for maintaining machine learning systems. We examine how to minimize the technical debt.

    https://www.kdnuggets.com/2015/01/high-cost-machine-learning-technical-debt.html

  • Can noise help separate causation from correlation?

    How to tell correlation from causation is one of the key problems in data science and Big Data. New Additive Noise Models methods can do it with over 65% accuracy, opening new breakthrough possibilities.

    https://www.kdnuggets.com/2015/01/can-noise-help-separate-causation-from-correlation.html

  • Interview: Arno Candel, H2O.ai on the Basics of Deep Learning to Get You Started

    We discuss how Deep Learning is different from the other methods of Machine Learning, unique characteristics and benefits of Deep Learning, and the key components of H2O architecture.

    https://www.kdnuggets.com/2015/01/interview-arno-candel-0xdata-deep-learning.html

  • Simple Data Science of Global Warming

    You don't have to be a climatologist to empirically confirm global warming. It is enough to have a computer, a reliable data set of historical temperatures, and software like R to do simple calculations.

    https://www.kdnuggets.com/2015/01/data-science-global-warming.html

  • Top SlideShare Presentations on Big Data, updated

    REST APIs and crawling offer two different ways to gather big data presentations from SlideShare, but they provide different results and lead to a very different view of the data. We examine why and find a useful data science lesson.

    https://www.kdnuggets.com/2015/01/top-slideshare-presentations-big-data-updated.html

  • IE Masters in Analytics and Big Data – first hand report

    A first-hand report on the Master in Business Analytics and Big Data program at IE (Madrid, Spain): why, what, how, days, and challenges.

    https://www.kdnuggets.com/2015/01/ie-data-science-education-first-hand-report.html

  • MetaMind Competes with IBM Watson Analytics and Microsoft Azure Machine Learning

    While Microsoft and IBM rush to bring data science and visualization to the masses, MetaMind follows another path, offering deep learning as a service.

    https://www.kdnuggets.com/2015/01/metamind-ibm-watson-analytics-microsoft-azure-machine-learning.html

  • Deep Learning can be easily fooled

    It is almost impossible for human eyes to label the images below as anything but abstract art. However, researchers found that a Deep Neural Network will label them as familiar objects with 99.99% confidence. The generality of DNNs is questioned again.

    https://www.kdnuggets.com/2015/01/deep-learning-can-be-easily-fooled.html

  • Exclusive: Interview with Chris Wiggins, NYTimes Chief Data Scientist

    New York Times Chief Data Scientist Chris Wiggins on the transformation of digital journalism, key Data Science skills, favorite tools, why better wrong than nice, and how Thomas Jefferson is very relevant today.

    https://www.kdnuggets.com/2015/01/exclusive-interview-chris-wiggins-nytimes-chief-data-scientist.html

  • Predictions: 2015 Analytics and Data Science Hiring Market

    Thanks to Big Data, analytics have become inescapable. Forget the C-Suite if you’re not a Data Geek, recruiting for startups gets harder, analytics salary bands get a lift, and more 2015 predictions.

    https://www.kdnuggets.com/2015/01/predictions-2015-analytics-data-science-hiring-market.html

  • Deep Learning in a Nutshell – what it is, how it works, why care?

    Deep learning and neural networks are increasingly important concepts in computer science with great strides being made by large companies like Google and startups like DeepMind.

    https://www.kdnuggets.com/2015/01/deep-learning-explanation-what-how-why.html

  • Fundamental methods of Data Science: Classification, Regression And Similarity Matching

    Data classification, regression, and similarity matching underpin many of the fundamental algorithms in data science to solve business problems like consumer response prediction and product recommendation.

    https://www.kdnuggets.com/2015/01/fundamental-methods-data-science-classification-regression-similarity-matching.html

  • Cartoon: Hello, Singularity

    New KDnuggets cartoon takes a look at what can happen when Artificial Intelligence (AI) achieves Singularity.

    https://www.kdnuggets.com/2015/01/cartoon-hello-singularity.html

  • Differential Privacy: How to make Privacy and Data Mining Compatible

    Can privacy coexist with machine learning and data mining? Differential privacy allows the learning of general characteristics of populations while guaranteeing the privacy of individual records.

    https://www.kdnuggets.com/2015/01/differential-privacy-data-mining-compatible.html

  • Research Leaders on Data Mining, Data Science, and Big Data key trends, top papers

    We asked global research leaders in Data Science and Big Data what are the most interesting research papers/advances of 2014 and what are the key trends they see in 2015. Here are their answers.

    https://www.kdnuggets.com/2015/01/research-leaders-data-science-big-data-key-trends-top-papers.html

  • PAN Competition 2015: Plagiarism Detection, Author ID, Author Profiling

    Take part in one of 3 tasks: Plagiarism Detection - given a document, is it an original? Author Identification - given a document, who wrote it? Author Profiling - given a document, what is the author's age / gender?

    https://www.kdnuggets.com/2015/01/pan-competition-2015-plagiarism-detection-author-id-author-profiling.html

  • Interview: Paul Robbins, STATS on the Potential and Challenges for Sports Analytics

    We discuss Analytics at STATS, typical daily tasks, ICE Analytics platform, key challenges, response from coaches/players, career advice and more.

    https://www.kdnuggets.com/2015/01/interview-paul-robbins-stats-sports-analytics.html

  • Causation vs Correlation: Visualization, Statistics, and Intuition

    Visualizations of correlation vs. causation and some common pitfalls and insights involving the statistics are explored in this case study involving stock price time series.

    https://www.kdnuggets.com/2015/01/causation-vs-correlation-visualization-statistics-intuition.html

  • 11 Clever Methods of Overfitting and how to avoid them

    Overfitting is the bane of Data Science in the age of Big Data. John Langford reviews "clever" methods of overfitting, including traditional, parameter tweak, brittle measures, bad statistics, human-loop overfitting, and gives suggestions and directions for avoiding overfitting.

    https://www.kdnuggets.com/2015/01/clever-methods-overfitting-avoid.html

  • KDnuggets™ News 14:n35, Dec 29

    Features | Software | Opinions | Interviews | News | Courses | Meetings | Jobs | Academic | Tweets | CFP | Quote Features 2015

    https://www.kdnuggets.com/2014/n35.html

  • Hot or Not: Data Science Trends in 2015

    CrowdFlower infographic predicts the hot trends for data science in 2015 and which trends will fade away.

    https://www.kdnuggets.com/2014/12/data-science-trends-2015.html

  • Interview: Brian Hampton, San Francisco 49ers on Playing Football the Analytics Way

    We discuss the role of analytics in football, the underrated challenges, evolution since the era of draft trade value chart and analytics-supported team selection.

    https://www.kdnuggets.com/2014/12/interview-brian-hampton-49ers-football-analytics-pt1.html

  • KDnuggets™ News 14:n34, Dec 17

    Features | Software | Opinions | Interviews | Reports | News | Webcasts | Jobs | Academic | Tweets | CFP | Quote Features New

    https://www.kdnuggets.com/2014/n34.html

  • IBM Watson Analytics vs. Microsoft Azure Machine Learning (Part 1)

    IBM Watson Analytics prototype seeks to abstract away data science, taking ordinary natural language queries and answering them based on the content of uploaded datasets. Microsoft Azure Machine Learning goes the opposite route, streamlining existing data mining methodology for fast results and integration with MS's other cloud services.

    https://www.kdnuggets.com/2014/12/ibm-watson-analytics-microsoft-azure-machine-learning-p1.html

  • 16 NoSQL, NewSQL Databases To Watch

    NoSQL and NewSQL databases have become much more important with the proliferation of big, mobile, and networked data, and these sixteen database solutions are some of the biggest up-and-comers.

    https://www.kdnuggets.com/2014/12/16-nosql-newsql-databases-to-watch.html

  • Most Demanded Data Science and Data Mining Skills

    Our analysis of the most demanded data scientist skills shows that Data Science is a team effort focused on business analytics, with the top 5 platform skills being SQL, Python, R, SAS, and Hadoop.

    https://www.kdnuggets.com/2014/12/data-science-skills-most-demand.html

  • Interview: Daqing Zhao, Macys.com on Building Effective Data Models for Marketing

    We discuss the challenges in identifying the fair price of ad media, recommendations for building effective models for online marketing, unique challenges of the Mobile channel, selection of Big Data tools, and more.

    https://www.kdnuggets.com/2014/12/interview-daqing-zhao-macys-data-models-marketing.html

  • KDnuggets™ News 14:n33, Dec 10

    Features | Software | Opinions | Interviews | News | Webcasts | Courses | Meetings | Jobs | Academic | Tweets | CFP | Quote

    https://www.kdnuggets.com/2014/n33.html

  • Geoff Hinton AMA: Neural Networks, the Brain, and Machine Learning

    In a wide-ranging Q&A, Geoff Hinton addresses the future of deep learning, its biological inspirations, and his research philosophy.

    https://www.kdnuggets.com/2014/12/geoff-hinton-ama-neural-networks-brain-machine-learning.html

  • SlamData Open Source Analytics Tool for MongoDB

    SlamData is an open source SQL-based tool designed to make accessing data in MongoDB easy for developers and non-developers alike with the goal of making application intelligence easier.

    https://www.kdnuggets.com/2014/12/slamdata-open-source-analytics-tool-mongodb.html

  • KDnuggets™ News 14:n32, Dec 3

    Features | Software | Opinions | News | Webcasts | Courses | Meetings | Jobs | Academic | Publications | Tweets | CFP | Quote

    https://www.kdnuggets.com/2014/n32.html

  • Top 10 Big Data Companies by Revenue

    IBM, HP, Dell, and SAP lead the list of Big Data companies with the most revenue from big data hardware, software, and IT services.

    https://www.kdnuggets.com/2014/12/top-10-big-data-companies-revenue.html

  • Geoffrey Hinton talks about Deep Learning, Google and Everything

    A review of Dr. Geoffrey Hinton’s Ask Me Anything on Reddit. He talked about his current research and his thoughts on some deep learning issues.

    https://www.kdnuggets.com/2014/12/geoffrey-hinton-talks-deep-learning-google-everything.html

  • Most Popular Slideshare Presentations on Big Data

    Hadoop, the cloud, and Microsoft Azure are just a few of the many topics covered by the top Big Data SlideShare presentations retrieved from the SlideShare API.

    https://www.kdnuggets.com/2014/11/most-popular-slideshare-presentations-big-data.html

  • Most Popular Slideshare Presentations on Data Science

    Top SlideShare data science presentations provide a unique view on topics like data science management, using Python and NumPy in your data science project, and leveraging data science for enterprise big data.

    https://www.kdnuggets.com/2014/11/most-popular-slideshare-presentations-data-science.html

  • 9 Must-Have Skills You Need to Become a Data Scientist

    Burtch Works details the top 9 data science skills that potential data scientists must have to be competitive in this growing marketplace from the perspective of a recruiter.

    https://www.kdnuggets.com/2014/11/9-must-have-skills-data-scientist.html

  • Top KDnuggets tweets, Nov 17-18: Keep this #Python Cheat Sheet handy; Is #BigData The Most Hyped Technology Ever?

    Keep this #Python Cheat Sheet handy when learning to code; Is #BigData The Most Hyped Technology Ever? No (at least not yet); How to become a data scientist in 8 (not so) easy steps; R and Hadoop make Machine Learning Possible for Everyone.

    https://www.kdnuggets.com/2014/11/top-tweets-nov17-18.html

  • KDnuggets™ News 14:n31, Nov 25

    Features | Opinions | Interviews | Reports | News | Webcasts | Jobs | Academic | Publications | Tweets | CFP | Quote Features Update:

    https://www.kdnuggets.com/2014/n31.html

  • Why Azure ML is the Next Big Thing for Machine Learning?

    With advanced capabilities, free access, strong support for R, cloud hosting benefits, drag-and-drop development and many more features, Azure ML is ready to take the consumerization of ML to the next level.

    https://www.kdnuggets.com/2014/11/microsoft-azure-machine-learning.html

  • R and Hadoop make Machine Learning Possible for Everyone

    R and Hadoop make machine learning approachable enough for inexperienced users to begin analyzing and visualizing interesting data to start down the path in this lucrative field.

    https://www.kdnuggets.com/2014/11/r-hadoop-make-machine-learning-possible-everyone.html

  • Most Popular Slideshare Presentations on Data Mining

    SlideShare data mining presentations cover many topics, offering a unique way of consuming data mining content and exploring a variety of slideshows, both narrow and broad in scope.

    https://www.kdnuggets.com/2014/11/most-popular-slideshare-presentations-data-mining.html

  • IBM Watson Analytics – Will it Replace Data Scientists ?

    We review the Beta version of IBM Watson Analytics, a service that aims to provide an automated data scientist and is intended for business users who want to move beyond spreadsheets for analysis.

    https://www.kdnuggets.com/2014/11/ibm-watson-analytics-replace-data-scientists.html

  • To Hire Quants, Fix Your Hiring Process

    Hiring talented quants requires an up-to-date hiring process including components like competitive salaries, special bonuses, expedient timelines, and that extra special touch to make your company stand out to quality candidates.

    https://www.kdnuggets.com/2014/11/hire-quants-fix-your-hiring-process.html

  • KDnuggets™ News 14:n30, Nov 19

    Features | Software | Opinions | Interviews | Reports | News | Webcasts | Courses | Meetings | Jobs | Academic | Publications | Tweets

    https://www.kdnuggets.com/2014/n30.html

  • DrivenData: Data Science Competitions for Social Good

    DrivenData plans to bring cutting-edge practices in data science and crowdsourcing to some of the world's biggest social challenges and the organizations taking them on.

    https://www.kdnuggets.com/2014/11/drivendata-data-science-competitions-social-good.html

  • KDnuggets™ News 14:n29, Nov 5

    Features | Software | Opinions | News | Webcasts | Courses | Meetings | Jobs | Publications | Tweets | CFP | Quote Features Big

    https://www.kdnuggets.com/2014/n29.html
