Search results for

    Found 6294 documents, 6023 searched:

  • LinkedIn Knowledge Graph – KDnuggets Interview

    We interview LinkedIn about their recently published LinkedIn Knowledge Graph which connects their many millions of members, jobs, companies, and more.

  • MLDB: The Machine Learning Database

    MLDB is an open-source database designed for machine learning. Send it commands over a RESTful API to store data, explore it using SQL, then train machine learning models and expose them as APIs.

  • Top 10 Data Science Videos on Youtube (Gold Blog)

    Learning and the future are the key topics in the recent Youtube videos on Data Science. The main questions revolve around: “how to become a Data Scientist”, “what is a data scientist”, and “where data science is going”. But why is there so little explanation of data science to the masses?

  • Artificial Intelligence, Deep Learning, and Neural Networks, Explained (Silver Blog)

    This article is meant to explain the concepts of AI, deep learning, and neural networks at a level that can be understood by most non-practitioners, and can also serve as a reference or review for technical folks as well.

  • EDISON Data Science Framework to define the Data Science Profession

    EDISON Data Science Framework provides conceptual, instructional and policy components required to establish the Data Science profession.

  • The R Graph Gallery Data Visualization Collection

    Welcome to the R graph gallery, a collection of R graph examples, organized by chart type, searchable by R function, with reproducible code and explanation.

  • Top 12 Interesting Careers to Explore in Big Data

    From data driven strategies to decision making, the true worth of Big Data has been realized, and has led to opening up of amazing career choices. Check out these 12 interesting careers to explore in Big Data.

  • KDnuggets™ News 16:n36, Oct 12: Battle of the Data Science Venn Diagrams; 9 Bizarre and Surprising Insights; ROI in Big Data Analytics

    Battle of the Data Science Venn Diagrams; Top September Stories in KDnuggets; Open Images Dataset; Still Searching for ROI in Big Data Analytics?

  • Here’s How IT Departments are Using Big Data

    The use cases for big data are clear when it comes to areas like marketing, healthcare, and retail, but IT’s use of big data is a little less clear. Recently, however, some IT departments are finding ways to use big data to improve their individual operations along with that of the entire organization.

  • Adversarial Validation, Explained

    This post proposes and outlines adversarial validation, a method for selecting training examples most similar to test examples and using them as a validation set, and provides a practical scenario for its usefulness.

  • Top /r/MachineLearning Posts, September: Open Images Dataset; Whopping Deep Learning Grant; Advanced ML Courseware

    Google Research announces the Open Images dataset; Canadian Government Deep Learning Research grant; DeepMind: WaveNet - A Generative Model for Raw Audio; Machine Learning in a Year - From total noob to using it at work; Phd-level machine learning courses; xkcd: Linear Regression

  • Battle of the Data Science Venn Diagrams (Gold Blog)

    First came Drew Conway's data science Venn diagram. Then came all the rest. Read this comparative overview of data science Venn diagrams for both the insight into the profession and the humor that comes along for free.

  • Automated Data Science & Machine Learning: An Interview with the Auto-sklearn Team (Silver Blog)

    This is an interview with the authors of the recent winning KDnuggets Automated Data Science and Machine Learning blog contest entry, which provided an overview of the Auto-sklearn project. Learn more about the authors, the project, and automated data science.

  • Embedded Analytics: The Future of Business Intelligence

    An overview of the evolution of Business Intelligence, and some insight into where its future lies: embedded analytics.

  • Deep Learning Reading Group: SqueezeNet

    This paper introduces a small CNN architecture called “SqueezeNet” that achieves AlexNet-level accuracy on ImageNet with 50x fewer parameters.

  • Data Science of Sales Calls: The Surprising Words That Signal Trouble or Success

    While not as profound a problem as uncovering the secrets of the universe, how to conduct a successful sales conversation is an age-old problem, impacting millions of people every day.

  • Top Data Scientist Claudia Perlich on Biggest Issues in Data Science (Silver Blog)

    Find out what top data scientist Claudia Perlich believes are - and are not - the biggest issues in data science today, and why spending 80% of their time with data preparation is not a problem.

  • Data Science Basics: Data Mining vs. Statistics

    As a beginner I was confused at the relationship between data mining and statistics. This is my attempt to help straighten out this connection for others who may now be in my old shoes.

  • Data Science for Internet of Things (IoT): Ten Differences From Traditional Data Science (Gold Blog)

    Connected devices (the Internet of Things) generate more than 2.5 quintillion bytes of data daily. All this data will significantly impact business processes, and Data Science for IoT will take an increasingly central role. Here we outline 10 main differences between Data Science for IoT and traditional Data Science.

  • Comparing Clustering Techniques: A Concise Technical Overview

    A wide array of clustering techniques are in use today. Given the widespread use of clustering in everyday data mining, this post provides a concise technical overview of 2 such exemplar techniques.

  • Top 16 Active Big Data, Data Science Leaders on LinkedIn

    Who are the most active Big Data, Data Science Influencers and Leaders on LinkedIn? We analyze the data and bring you the list of key people to follow.

  • Deep Learning Reading Group: Deep Residual Learning for Image Recognition

    Published in 2015, today's paper offers a new architecture for Convolution Networks, one which has since become a staple in neural network implementation. Read all about it here.

  • Data Science Basics: 3 Insights for Beginners

    For data science beginners, 3 elementary issues are given overview treatment: supervised vs. unsupervised learning, decision tree pruning, and training vs. testing datasets.

  • Support Vector Machines: A Concise Technical Overview

    Support Vector Machines remain a popular and time-tested classification algorithm. This post provides a high-level concise technical overview of their functionality.

  • 9 Key Deep Learning Papers, Explained (Gold Blog)

    If you are interested in understanding the current state of deep learning, this post outlines and thoroughly summarizes 9 of the most influential contemporary papers in the field.

  • The Great Algorithm Tutorial Roundup

    This is a collection of tutorials relating to the results of the recent KDnuggets algorithms poll. If you are interested in learning or brushing up on the most used algorithms, as per our readers, look here for suggestions on doing so!

  • Random Forest®: A Criminal Tutorial

    Get an overview of Random Forest here, one of the most used algorithms by KDnuggets readers according to a recent poll.

  • Decision Trees: A Disastrous Tutorial

    Get a concise overview of decision trees here, one of the most used KDnuggets reader algorithms as measured in a recent poll.

  • SlangSD: A Sentiment Dictionary for Slang Words

    The Slang Sentiment Dictionary (SlangSD) includes over 90,000 slang words together with their sentiment scores, facilitating sentiment analysis in user-generated contents.

  • Top Algorithms and Methods Used by Data Scientists (Gold Blog)

    Latest KDnuggets poll identifies the list of top algorithms actually used by Data Scientists, finds surprises including the most academic and most industry-oriented algorithms.

  • Urban Sound Classification with Neural Networks in Tensorflow

    This post discusses techniques for feature extraction from sound in Python using the open source library Librosa, and implements a neural network in Tensorflow to categorize urban sounds, including car horns, children playing, dogs barking, and more.

  • The (Not So) New Data Scientist Venn Diagram

    This post outlines a (relatively) new(er) Data Science-related Venn diagram, giving an update to Conway's classic, and providing further fuel for flame wars and heated disagreement.

  • Doing the Data Science That Drives Predictive Personalization

    Agile collaboration within data science teams is essential to the vision of customer analytics and personalization. Attend IBM DataFirst Launch Event on Sep 27 in New York City to engage with open-source community leaders and practitioners.

  • Deep Learning Reading Group: Deep Networks with Stochastic Depth

    A concise overview of a recent paper which introduces stochastic depth networks, a new way to perturb networks during training in order to improve their performance.

  • A Beginner’s Guide To Understanding Convolutional Neural Networks Part 2

    This is the second part of a thorough introductory treatment of convolutional neural networks. Have a look after reading the first part.

  • Introducing Dask for Parallel Programming: An Interview with Project Lead Developer

    Introducing Dask, a flexible parallel computing library for analytics. Learn more about this project built with interactive data science in mind in an interview with its lead developer.

  • KDnuggets™ News 16:n32, Sep 7: Cartoon: Data Scientist was sexiest job until…; Up to Speed on Deep Learning

    Cartoon: Data Scientist - the sexiest job of the 21st century until...; Up to Speed on Deep Learning: July Update; How Convolutional Neural Networks Work; Learning from Imbalanced Classes; What is the Role of the Activation Function in a Neural Network?

  • A Beginner’s Guide To Understanding Convolutional Neural Networks Part 1 (Gold Blog)

    Interested in better understanding convolutional neural networks? Check out this first part of a very comprehensive overview of the topic.

  • Cartoon: Labor Day in the era of Robotics

    Amidst all the discussion about robots and automation taking over human jobs, a new KDnuggets cartoon looks at how Labor Day can evolve by 2050.

  • The Evolution of IoT Edge Analytics: Strategies of Leading Players

    This article explores the significance and evolution of IoT edge analytics. Since the author believes that hardware capabilities will converge for large vendors, IoT analytics will be the key differentiator.

  • The Human Vector: Incorporate Speaker Embeddings to Make Your Bot More Powerful

    One of the many ways in which bots can fail is by their (lack of) persona. Learn how speaker embeddings can help with this problem, and can help improve the persona of your bot.

  • Data Science vs Crime: Detecting Pickpocket Suspects from Transit Records

    A team of US and Chinese researchers has creatively used massive data collected by automated fare collectors for identifying thieves in the public transit systems. The system was tested in Beijing and was able to identify 93% of known pickpockets.

  • Learning from Imbalanced Classes

    Imbalanced classes can cause trouble for classification. Not all hope is lost, however. Check out this article for methods of dealing with such a situation.

  • How Convolutional Neural Networks Work

    Get an overview of what is going on inside convolutional neural networks, and what it is that makes them so effective.

  • What is the Role of the Activation Function in a Neural Network?

    Confused as to exactly what the activation function in a neural network does? Read this overview, and check out the handy cheat sheet at the end.

  • Data Mining Tip: How to Use High-cardinality Attributes in a Predictive Model

    High-cardinality nominal attributes can pose an issue for inclusion in predictive models. There exist a few ways to accomplish this, however, which are put forward here.

  • Cartoon: Data Scientist – the sexiest job of the 21st century until … (Silver Blog)

    This Data Scientist thought that he had the sexiest job of the 21st century until the arrival of the competition ...

  • MDL Clustering: Unsupervised Attribute Ranking, Discretization, and Clustering

    MDL Clustering is a free software suite for unsupervised attribute ranking, discretization, and clustering based on the Minimum Description Length principle and built on the Weka Data Mining platform.

  • The top 5 Big Data courses to help you break into the industry

    Here is an updated and in-depth review of the top 5 providers of Big Data and Data Science courses: Simplilearn, Cloudera, Big Data University, Hortonworks, and Coursera.

  • A Tutorial on the Expectation Maximization (EM) Algorithm

    This is a short tutorial on the Expectation Maximization algorithm and how it can be used for estimating parameters for multivariate data.

  • Introduction to Local Interpretable Model-Agnostic Explanations (LIME)

    Learn about LIME, a technique to explain the predictions of any machine learning classifier.

  • A Gentle Introduction to Bloom Filter

    The Bloom Filter is a probabilistic data structure which can make a tradeoff between space and false positive rate. Read more, and see an implementation from scratch, in this post.

  • A simple approach to anomaly detection in periodic big data streams

    We describe a simple and scaling algorithm that can detect rare and potentially irregular behavior in a time series with periodic patterns. It performs similarly to Twitter's more complex approach.

  • Data Science of Reviews: ReviewMeta tool Automatically Detects Unnatural Reviews on Amazon

    ReviewMeta is a tool that analyzes millions of reviews and helps customers decide which ones to trust. As the dataset grows, so do the insights on unbiased reviews.

  • How to Become a (Type A) Data Scientist (Gold Blog)

    This post outlines the difference between a Type A and Type B data scientist, and prescribes a learning path on becoming a Type A.

  • A Neat Trick to Increase Robustness of Regression Models

    Read this take on the validity of choosing a different approach to regression modeling. Why isn't L1 norm used more often?

  • How to Become a Data Scientist – Part 1 (2016 Silver Blog)

    Check out this excellent (and exhaustive) article on becoming a data scientist, written by someone who spends their day recruiting data scientists. Do yourself a favor and read the whole way through. You won't regret it!

  • Misinformation Key Terms, Explained

    Misinformation has emerged as a key issue for social media platforms. This post will introduce the concept of misinformation and the 8 Key Terms, which provides insights into mining misinformation in social media.

  • The Gentlest Introduction to Tensorflow – Part 2

    Check out the second and final part of this introductory tutorial to TensorFlow.

  • Top Machine Learning Projects for Julia

    Julia is gaining traction as a legitimate alternative programming language for analytics tasks. Learn more about these 5 machine learning related projects.

  • The 10 Algorithms Machine Learning Engineers Need to Know (2016 Gold Blog)

    Read this introductory list of contemporary machine learning algorithms of importance that every engineer should understand.

  • Approaching (Almost) Any Machine Learning Problem

    If you're looking for an overview of how to approach (almost) any machine learning problem, this is a good place to start. Read on as a Kaggle competition veteran shares his pipelines and approach to problem-solving.

  • Does Data Scientist Mean What You Think It Means?

    Do we have an accurate idea of what "data scientist" actually means? Read this thought-provoking opinion on the topic.

  • Central Limit Theorem for Data Science – Part 2

    This post continues an explanation of Central Limit Theorem started in a previous post, with additional details... and beer.

  • Cartoon: Make Data Great Again (Silver Blog)

    This KDnuggets cartoon considers a speech that a certain presidential candidate could give on the topic of Big Data.

  • Central Limit Theorem for Data Science

    This post is an introductory explanation of the Central Limit Theorem, and why it is (or should be) of importance to data scientists.

  • Understanding the Empirical Law of Large Numbers and the Gambler’s Fallacy

    The law of large numbers is an important concept for practising data scientists. In this post, the empirical law of large numbers is demonstrated via a simple simulation approach using the Bernoulli process.

  • 5 EBooks to Read Before Getting into A Data Science or Big Data Career

    A short, carefully-curated list of 5 free ebooks to help you better understand what Data Science is all about and how you can best prepare for a career in data science, big data, and data analysis.

  • A Beginner’s Guide to Neural Networks with R!

    In this article we will learn how Neural Networks work and how to implement them with the R programming language! We will see how we can easily create Neural Networks with R and even visualize them. Basic understanding of R is necessary to understand this article.

  • Visualizing 1 Billion Points of Data: Doing It Right – Aug 18 Webinar

    Join Continuum Analytics on August 18 for a webinar on Big Data visualization with the datashader library. Save your spot today!

  • Big Data Key Terms, Explained

    Just getting started with Big Data, or looking to iron out the wrinkles in your current understanding? Check out these 20 Big Data-related terms and their concise definitions.

  • 7 Steps to Understanding Computer Vision

    A starting point for Computer Vision and how to get going deeper. Dive into this post for some overview of the right resources and a little bit of advice.

  • Short course: Statistical Learning and Data Mining IV, Washington, DC, Oct 19-20

    This new two-day course gives a detailed and modern overview of statistical models used by data scientists for prediction and inference, including sparse models and deep learning.

  • Cartoon: Facebook data science experiments and Cats

    In honor of International Cat Day, we revisit the KDnuggets cartoon that looks at the Facebook data science experiment on emotion manipulation and the importance of happy kittens.

  • Understanding the Bias-Variance Tradeoff: An Overview

    A model's ability to minimize bias and its ability to minimize variance are often thought of as 2 opposing ends of a spectrum. Being able to understand these two types of errors is critical to diagnosing model results.

  • Brain Monitoring with Kafka, OpenTSDB, and Grafana

    Interested in using open source software to monitor brain activity, and control your devices? Sure you are! Read this fantastic post for some insight and direction.

  • Contest Winner: Winning the AutoML Challenge with Auto-sklearn

    This post is the first place prize recipient in the recent KDnuggets blog contest. Auto-sklearn is an open-source Python tool that automatically determines effective machine learning pipelines for classification and regression datasets. It is built around the successful scikit-learn library and won the recent AutoML challenge.

  • Nigeria: Telling Internally Displaced Persons Stories Using Visual Data and Infographics

    Read a data-driven discussion on the plight of internally displaced persons (IDPs) in Nigeria, and see the real power of data science and data visualization.

  • Reinforcement Learning and the Internet of Things

    Gain an understanding of how reinforcement learning can be employed in the Internet of Things world.

  • Contest 2nd Place: Automated Data Science and Machine Learning in Digital Advertising

    This post is an overview of an automated machine learning system in the digital advertising realm. It is an entrant and second-place recipient in the recent KDnuggets blog contest.

  • Contest 2nd Place: Automating Data Science

    This post discusses some considerations, options, and opportunities for automating aspects of data science and machine learning. It is the second place recipient (tied) in the recent KDnuggets blog contest.

  • What Statistics Topics are Needed for Excelling at Data Science?

    Here is a list of skills and statistical concepts suggested for excelling at data science, roughly in order of increasing complexity.

  • Doing Statistics with SQL

    This post covers how to perform some basic in-database statistical analysis using SQL.

  • And the Winner is… Stepwise Regression

    This post evaluates several methods for automating the feature selection process in large-scale linear regression models and shows that for marketing applications the winner is Stepwise regression.

  • The Core of Data Science

    This post provides a simplifying framework, an ontology for Machine Learning and some important developments in dynamical machine learning. From first hand Data Science product experience, the author suggests how best to execute Data Science projects.

  • Dataiku DSS 3.1 – Now with 5 ML Backends & Scala!

    Introducing Dataiku DSS 3.1, with new visual machine learning engines that allow users to create incredibly powerful predictive applications within a code-free interface.

  • Yann LeCun Quora Session Overview

    Here is a quick overview, with excerpts, of the Yann LeCun Quora Session which took place on Thursday July 28, 2016.

  • Data Science of Visiting Famous Movie Locations in San Francisco

    Using the Google Places API and IMDb API, we selected movie locations in The Golden City which every movie fan should visit while they are in town, and optimized sightseeing by solving the travelling salesman problem.

  • Theoretical Data Discovery: Using Physics to Understand Data Science

    Data science may be a relatively recent buzzword, but the collection of tools and techniques to which it refers come from a broad range of disciplines. Physics has a wealth of concepts to learn from, as evidenced in this piece.

  • Build vs Buy – Analytics Dashboards

    Read this post on choosing between available analytics dashboard options, and designing your own. Get an informed opinion.

  • Data Science Statistics 101

    Statistics can often be the most intimidating aspect of data science for aspiring data scientists to learn. Gain some personal perspective from someone who has traveled the path.

  • 7 Steps to Understanding NoSQL Databases

    Are you a newcomer to NoSQL, interested in gaining a real understanding of the technologies and architectures it includes? This post is for you.

  • Internet of Things Key Terms, Explained

    This post will define 12 Key Terms for the Internet of Things, in a straightforward manner.

  • Would You Survive the Titanic? A Guide to Machine Learning in Python Part 2

    This is part 2 of a 3 part introductory series on machine learning in Python, using the Titanic dataset.

  • Data Science for Beginners 1: The 5 questions data science answers

    A series of videos and write-ups covering the basics of data science for beginners. This first video is about the kinds of questions that data science can answer.

  • Would You Survive the Titanic? A Guide to Machine Learning in Python Part 1

    Check out the first of a 3 part introductory series on machine learning in Python, fueled by the Titanic dataset. This is a great place to start for a machine learning newcomer.

  • 35 Open Source tools for Internet of Things

    If you have heard about the Internet of Things many times by now, it's time to join the conversation. Explore the many open source tools & projects related to Internet of Things.

  • SAS vs R vs Python: Which Tool Do Analytics Pros Prefer?

    There are lots of flame wars involving different data science and analytics tools... but this isn't one of them. Check out the quantitative results and analysis of a Burtch Works survey on the subject.

  • Building a Data Science Portfolio: Machine Learning Project Part 1

    Dataquest's founder has put together a fantastic resource on building a data science portfolio. This first of three parts lays the groundwork, with subsequent posts over the following 2 days. Very comprehensive!

  • Multi-Task Learning in Tensorflow: Part 1

    A discussion and step-by-step tutorial on how to use Tensorflow graphs for multi-task learning.

  • In Deep Learning, Architecture Engineering is the New Feature Engineering

    A discussion of architecture engineering in deep neural networks, and its relationship with feature engineering.

  • What the Next Generation of IoT Sensors Have in Store

    This post is an overview of some of the next-generation IoT sensors, and what they could mean for our future.

  • MNIST Generative Adversarial Model in Keras

    This post discusses and demonstrates the implementation of a generative adversarial network in Keras, using the MNIST dataset.

  • Statistical Data Analysis in Python

    This tutorial will introduce the use of Python for statistical data analysis, using data stored as Pandas DataFrame objects, taking the form of a set of IPython notebooks.

  • Why Big Data is in Trouble: They Forgot About Applied Statistics (2016 Silver Blog)

    This "classic" (but very topical and certainly relevant) post discusses issues that Big Data can face when it forgets, or ignores, applied statistics. As great a discussion today as it was 2 years ago.

  • Predictive Analytics Introductory Key Terms, Explained

    Here is a collection of introductory predictive analytics terms and concepts, presented for the newcomer in a straight-forward, no frills definition style.

  • America’s Next Topic Model

    Topic modeling is a great way to get a bird's eye view on a large document collection using machine learning. Here are 3 ways to use the open source Python tool Gensim to choose the best topic model.

  • 10 Algorithm Categories for AI, Big Data, and Data Science

    With a focus on leveraging algorithms and balancing human and AI capital, here are the top 10 algorithm categories used to implement A.I., Big Data, and Data Science.

  • How to Start Learning Deep Learning

    Want to get started learning deep learning? Sure you do! Check out this great overview, advice, and list of resources.

  • What Data Scientists Can Learn From Qualitative Research

    Learn what data scientists can learn from qualitative researchers when it comes to analysing text, and how this relates to writing quality code.

  • Bayesian Machine Learning, Explained (2016 Silver Blog)

    Want to know about Bayesian machine learning? Sure you do! Get a great introductory explanation here, as well as suggestions where to go for further study.

  • TalkingData Data Science Competition: understand mobile users

    Unique opportunity to solve complex real world big data challenges for the China mobile market - predict users demographic characteristics based on their app usage, geolocation, and mobile device properties.

  • 5 Deep Learning Projects You Can No Longer Overlook

    There are a number of "mainstream" deep learning projects out there, but many more niche projects flying under the radar. Have a look at 5 such projects worth checking out.

  • The Hard Problems AI Can’t (Yet) Touch

    It's tempting to consider the progress of AI as though it were a single monolithic entity, advancing towards human intelligence on all fronts. But today's machine learning only addresses problems with simple, easily quantified objectives.

  • Top Machine Learning MOOCs and Online Lectures: A Comprehensive Survey

    This post reviews Machine Learning MOOCs and online lectures for both the novice and expert audience.

  • New Book: Effective CRM using Predictive Analytics – get 20% discount

    A comprehensive step-by-step guide to designing, setting up, executing and deploying data mining techniques in marketing. Use code VBM93 for 20% discount.

  • Big Data, Bible Codes, and Bonferroni

    This discussion will focus on 2 particular statistical issues to be on the look out for in your own work and in the work of others mining and learning from Big Data, with real world examples emphasizing the importance of statistical processes in practice.

  • Streamlining Analytic Deployment: Inside the FICO Decision Management Suite 2.0

    This post explains what’s new in the 2.0 version of the FICO Decision Management Suite, and how it can be used by data scientists and others to create stronger customer relationships and provide strategic competitive advantage.

  • Support Vector Machines: A Simple Explanation

    A no-nonsense, 30,000 foot overview of Support Vector Machines, concisely explained with some great diagrams.

  • Interview: Florian Douetteau, Dataiku Founder, on Empowering Data Scientists

    Here is an interview with Florian Douetteau, founder of Dataiku, on how their tools empower data scientists, and how data science itself is evolving.

  • Deep Residual Networks for Image Classification with Python + NumPy

    This post outlines the results of an innovative Deep Residual Network implementation for Image Classification using Python and NumPy.

  • Storytelling: The Power to Influence in Data Science

    Data scientists need to share results, which is different than talking shop with other data scientists. Read about influencing people and telling stories as a data scientist.

  • Success Criteria for Process Mining

    This article provides tips about the pitfalls and advice that will help you to make your first process mining project as successful as it can be.

  • Mining Twitter Data with Python Part 7: Geolocation and Interactive Maps

    The final part of this 7 part series explores using geolocation and interactive maps with Twitter data.

  • 3 Key Ethics Principles for Big Data and Data Science

    If ethics in general are important, should ethics training be a crucial element of the data science field?

  • Mining Twitter Data with Python Part 6: Sentiment Analysis Basics

    Part 6 of this series builds on the previous installments by exploring the basics of sentiment analysis on Twitter data.

  • Data Mining History: The Invention of Support Vector Machines

    The story starts in Paris in 1989, when I benchmarked neural networks against kernel methods, but the real invention of SVMs happened when Bernhard decided to implement Vladimir Vapnik's algorithm.

  • What is Softmax Regression and How is it Related to Logistic Regression?

    An informative exploration of softmax regression and its relationship with logistic regression, and situations in which each would be applicable.

  • Text Mining 101: Topic Modeling

    We introduce the concept of topic modelling and explain two methods: Latent Dirichlet Allocation and TextRank. The techniques are ingenious in how they work – try them yourself.

  • Recursive (not Recurrent!) Neural Networks in TensorFlow

    Learn how to implement recursive neural networks in TensorFlow, which can be used to learn tree-like structures, or directed acyclic graphs.
