Top stories for May 24-30: R vs Python for Data Science; R leads RapidMiner, Python catches up, Spark ignites
R vs Python for Data Science: The Winner is ...; R leads RapidMiner, Python catches up, Big Data tools grow, Spark ignites; Top 10 Data Mining Algorithms, Explained; 150 Most Influential People in Big Data & Hadoop.
on May 31, 2015 in Top stories
Looker: Why and When to Embed Business Intelligence, June 10 Webinar
Learn about how to bring an external-facing data product to market by embedding BI software, and what that can add to your offering.
on May 30, 2015 in Business Intelligence, Looker
Spark Summit, the leading edge of Big Data, San Francisco, June 15-17
Spark is the hottest technology in Big Data. Join Spark Summit to learn how organizations such as the CIA, NASA, and Andreessen Horowitz, along with companies like Databricks, Toyota, IBM, Intel, Baidu, and Amazon, are using Spark to run their businesses more effectively. KDnuggets discount.
on May 29, 2015 in Apache Spark, CA, San Francisco
INFORMS Essential Practice Skills for Analytics Professionals + Baseball Night
Register for the "Essential Practice Skills for Analytics Professionals" Course (Chicago, Jun 23-24) and join INFORMS for a Special Networking Night and Baseball at Wrigley Field.
on May 29, 2015 in Baseball, Chicago, IL, INFORMS, Skills
Applied Statistics Is A Way Of Thinking, Not Just A Toolbox
The choice of tools in applied statistics is driven by the objective, the structure of the data, and the nature of the uncertainty in the numbers, whereas in academic statistics it is driven by publishing or teaching. Here we provide some of the common statistical tools and their overlapping genealogy.
on May 29, 2015 in Applied Statistics, Randy Bartlett, Statistics, Toolbox
Discover the WHY behind your Customer Scores, June 10 webinar with Seth Grimes
Text Analytics thought leader Seth Grimes and MeaningCloud present a special webinar on ensuring you are getting the most from your customer feedback.
on May 28, 2015 in Customer Analytics, Daedalus, MeaningCloud, Seth Grimes, Text Analytics
Insights from Data Science Handbook
Here you can find the perspectives of lead data scientists on topics ranging from the definition of data science and metrics selection while solving a problem to work ethics, the art of storytelling, and why data science is important in today's world.
on May 28, 2015 in Data Science, Data Science Fellows, Data Science Jargon, DJ Patil, Handbook, Hilary Mason
21 Essential Data Visualization Tools
We have collected leading data visualization tools, with a short overview of each tool and its strong and weak points.
on May 28, 2015 in D3.js, Data Science Tools, Data Visualization, Tableau
CRN 2015 Big Data Infrastructure Companies
CRN identifies the top 25 big data infrastructure, tools, and service companies, offering everything from hardware servers to software platforms and applications to cloud-based services. The list includes major players in the big data space like Microsoft, Amazon, and IBM.
on May 28, 2015 in Big Data, CRN, Data Infrastructure, Data Platform, Hadoop, Startups
150 Most Influential People in Big Data & Hadoop
A list of 150 Most Influential People on Twitter in Big Data & Hadoop includes Merv Adrian @merv, Alistair Croll @acroll, Ben Lorica @bigdata, Paul Zikopoulos @BigData_paulz, Mathias Herberts @herberts, and Gregory Piatetsky @kdnuggets.
on May 27, 2015 in About Gregory Piatetsky, Alistair Croll, Ben Lorica, Big Data Influencers, Gil Press, Grey Campus, Hadoop, Merv Adrian
Miner3D Data Visualization System Version 8
The new software features a redesigned user interface, making it a perfect complement for Excel. The new graphics visualization engine is faster and smoother.
on May 27, 2015 in Data Visualization, Miner3D
Top KDnuggets tweets, May 19-25: KDnuggets Poll: R leads RapidMiner, Python catches up, Spark ignites; Choosing a Learning Algorithm in Azure ML
R vs #Python, why each is better; Machine Learning predicts that a fair race between Mo Farah and Usain Bolt is 492m; How Machine Learning Is Eating the #Software World; Handy Guide: Choosing a Learning Algorithm in Azure ML.
on May 26, 2015 in Azure ML, Neural Networks, Usain Bolt
Dark Knowledge Distilled from Neural Network
Geoff Hinton never stopped generating new ideas. This post is a review of his research on “dark knowledge”. What’s that supposed to mean?
on May 26, 2015 in Dark Knowledge, Deep Learning, Geoff Hinton, Neural Networks, Ran Bi
R vs Python for Data Science: The Winner is …
In the battle of "best" data science tools, Python and R both have their pros and cons. Selecting one over the other will depend on the use cases, the cost of learning, and other common tools required.
on May 26, 2015 in Data Science Tools, DataCamp, Python, Python vs R, R
CRN 2015 Emerging Big Data Vendors
The 2015 Computer Reseller News (CRN) Big Data Emerging Vendors list features 54 companies launched since 2009, selected for their innovative tools, technology, and commitment to helping businesses manage the Big Data challenge.
on May 26, 2015 in Big Data, Big Data Vendors, CRN, Startups
Upcoming Webcasts on Analytics, Big Data, Data Science – May 26 and beyond
The Data Lake Debate, 3 Ways to Improve Targeted Marketing, Turning Machine Data Into Intelligent Action, Data Mining - Failure to Launch, Using electronic health records for better care, and more.
on May 25, 2015 in GridGain, IIA, James Taylor, Ontotext, Pentaho, Salford Systems, TMA
White House sees Data as the 21st Century Catalyst for Effective Policing
A review of the steps taken by the White House over the last six months to modernize police data systems, both to fight crime better and to build trust between communities and police.
on May 25, 2015 in Chief Data Officer, Crime, DJ Patil, Government, Obama, Open Data, Police, Standards, Twitter, White House
R leads RapidMiner, Python catches up, Big Data tools grow, Spark ignites
R is the most popular overall tool among data miners, although Python usage is growing faster. RapidMiner continues to be the most popular suite for data mining/data science. Hadoop/Big Data tools usage grew to 29%, propelled by 3x growth in Spark. Other tools with strong growth include H2O (0xdata), Actian, MLlib, and Alteryx.
on May 25, 2015 in Actian, Apache Spark, Data Mining Software, H2O, Knime, Poll, Python, R, RapidMiner, SQL
Top stories for May 17-23: 7 Methods for Data Dimensionality Reduction; Will the Real Data Scientists Please Stand Up?
Poll: What Predictive Analytics, Data Mining, Data Science software/tools used? Seven Techniques for Data Dimensionality Reduction; Will the Real Data Scientists Please Stand Up?; R vs Python, why each is better.
on May 24, 2015 in Top stories
Deep Learning and Big Data Products: Synthos Technologies
Using Deep Learning to build products on a massive scale, the virtues of transaction analytics, and which industries are likely to be disrupted in the near future: a perspective from Synthos Technologies.
on May 23, 2015 in Deep Learning, RE.WORK, Synthos Technologies, Transactions
Exclusive Interview: Matei Zaharia, creator of Apache Spark, on Spark, Hadoop, Flink, and Big Data in 2020
Apache Spark is one of the hottest Big Data technologies in 2015. KDnuggets talks to Matei Zaharia, creator of Apache Spark, about key things to know about it, why it is not a replacement for Hadoop, how it is better than Flink, and his vision for Big Data in 2020.
on May 22, 2015 in Apache Spark, Big Data, Databricks, Flink, Hadoop, Matei Zaharia, MLlib, Spark SQL
Interview: Linda Powell, Consumer Financial Protection Bureau (CFPB) on Data Governance for Finance Industry
We discuss the chief data officer role at CFPB, big data opportunities and challenges, ontology, vintage data, data governance trends, advice, and more.
on May 22, 2015 in CFPB, Data Governance, Data Management, Interview, Linda Powell, Ontology, Policies, Standards
Trifacta – Wrangling US Flight Data, part 2
This post shows how to use Trifacta to clean the data and enrich it with airport geo-locations and airline names, including filling missing values, and doing a lookup from another dataset. We also learn which is the best airline at O’Hare airport.
on May 22, 2015 in Air traffic, Data Processing, Data Wrangling, Tableau, Trifacta
Top 10 Data Mining Algorithms, Explained
The top 10 data mining algorithms, selected by top researchers, are explained here, including what they do, the intuition behind each algorithm, available implementations, why to use them, and interesting applications.
on May 21, 2015 in Algorithms, Apriori, Bayesian, Boosting, C4.5, CART, Data Mining, Explained, K-means, K-nearest neighbors, Naive Bayes, Page Rank, Support Vector Machines, Top 10
How to reduce Data Hoarding, get Better Visualizations and Decisions
Creating a hodge-podge of pretty pictures of every datapoint is a guaranteed way to destroy the value of a visualization. We examine how to reduce such data hoarding and improve decisions.
on May 21, 2015 in Alex Jones, Dashboard, Data Visualization, Linear Discriminant Analysis, PCA
5 Not-to-be-Missed Ideas about Big Data
The things we can measure are never exactly what we care about; When everything hinges on metrics, people will game the metrics to the point of losing any meaning; and more key ideas summarized by Kaiser Fung.
on May 21, 2015 in Big Data, Kaiser Fung, Numbersense
CRN 2015 Big Data Management Companies
Big Data and its ease of use play a key role in this year’s ‘CRN Big Data 100: Top 30 Data Management companies’. New additions include At Scale, Databricks, and Tamr. A majority of these companies develop open-source NoSQL database technology.
on May 21, 2015 in Big Data, CA, CRN, Data Management, Israel, MA
Simplilearn: Big Data, Analytics online courses discount, Free ebook
The role of a Data Scientist is ever evolving and a candidate with up-to-date skills will be preferred over their peers across industries. Learn SAS, Hadoop, R, Big Data, and Analytics skills with Simplilearn online courses, special discount until May 30.
on May 21, 2015 in Big Data Analytics, Certification, Free ebook, Hadoop, Online Education, R, SAS, Simplilearn
Essays On Statistics Denial
Statistics denial comes in waves as areas of application discover and rediscover the potential of data insights. We examine the statistics denial myths and where they come from.
on May 20, 2015 in Big Data Hype, Hype, Randy Bartlett, Statistics
I’ve Been Replaced by an Analytics Robot
A veteran statistician reflects on the journey from the statistician of the past to the data scientist of today, how the work he used to do became automated, and what future data scientists can expect.
on May 20, 2015 in Automation, Data Science, Future, History, Robots
Top KDnuggets tweets, May 12-18: Hadoop demand falls; Andrew Ng Machine Learning class, excellent course notes
#Hadoop demand falls, 54% of enterprises have no plans for it; Download your entire #Google search history; #DataScience for #Dummies: An interview with author Lilian @BigDataGal Pierson; Andrew Ng Machine Learning Coursera class - complete, excellent course notes.
on May 19, 2015 in Andrew Ng, Coursera, Data Science Education, Google, Hadoop, SAS
Big Data Lessons from Microsoft “how-old” Experiment
Salil Mehta examines Microsoft’s viral “How old do I look?” site, the limits of its age recognition, possible algorithms, and implications for Big Data analysis.
on May 19, 2015 in HowOldRobot, Image Recognition, Project Fail, Salil Mehta
Interview: Antonio Magnaghi, TicketMaster on Why Honesty is Key for Analytics Success
We discuss lessons from implementing lambda architecture, impact of Big Data on recommender systems, trends, advice, and more.
on May 19, 2015 in Advice, Analytics, Antonio Magnaghi, Interview, Personalization, Recommendation, Skills, Success, TicketMaster, Trends
R vs Python, why each is better
A report on a free-wheeling Australian meetup discussing "Why R is Better" and "Why Python is Better". What do you think?
on May 19, 2015 in Australia, Python, Python vs R, R
Strategies for Monetizing Big Data
In the current tsunami of “Big Data” every business wants to get value out of the data. We examine four overarching data strategies and their specific monetization strategies.
on May 19, 2015 in Big Data, Big Data Strategy, Business Value, Monetizing, Russell Walker
PAW Chicago: Five Unbeatable Analytics Workshops
Take your predictive analytics up a notch with unbeatable PAW Chicago workshops, covering R, predictive modeling, ensemble methods and more.
on May 19, 2015 in Chicago, Dean Abbott, Ensemble Methods, IL, John Elder, PAW, Predictive Analytics World
Upcoming Webcasts on Analytics, Big Data, Data Science – May 19 and beyond
Demystify Data Flows, How Analytics Might Save Your Life, 5 Predictive Analytics Lessons, 3 Ways to Improve Targeted Marketing, Building an Analytics Team, Data Mining - Failure to Launch, and more.
on May 18, 2015 in GridGain, Lavastorm, Salford Systems, Stanford, TMA
Cartoon: How Smart Do You Look?
In the follow-up to the "How-Old" demo, a new Deep Learning "How Smart You Are" IQ and face recognition algorithm accidentally discovers the smartest being on the internet.
on May 18, 2015 in Cartoon, Cats, HowOldRobot
Interview: Antonio Magnaghi, TicketMaster on Unifying Heterogeneous Analytics through Lambda Architecture
We discuss the role of the Data Science team at Ticketmaster, ecommerce data characteristics, analytics based on highly variant data flow, infrastructure challenges, and the merits of lambda architecture.
on May 18, 2015 in Antonio Magnaghi, Challenges, Data Science, Ecommerce, Lambda Architecture, Live Nation, Performance, TicketMaster
Will the Real Data Scientists Please Stand Up?
Job postings for data scientists are everywhere. But what is a data scientist? I present a few archetypes.
on May 18, 2015 in Data Science, Data Science Jargon, Data Science Skills, Machine Learning, Zachary Lipton
Most Viewed Data Mining Videos on YouTube
The top Data Mining YouTube videos by those like Google and Revolution Analytics cover topics ranging from statistics in data mining, to using R for data mining, to data mining in sports.
on May 18, 2015 in Ayasdi, Data Mining, Google, Grant Marshall, R, Rattle, Revolution Analytics, Statistica, Text Mining, Weka, Youtube
How to Lead a Data Science Contest without Reading the Data
We examine a “wacky” boosting method that lets you climb the public leaderboard without even looking at the data. But there is a catch, so read on before trying to win Kaggle competitions with this approach.
on May 17, 2015 in Accuracy, Benchmark, Competition, Kaggle, Model Performance
Top stories for May 10-16: Poll: Analytics, Data Mining software used; 3 things about Data Science not in books
Predictive Analytics, Data Mining, Data Science software used?; 3 Things About Data Science You Won't Find In Books; Most Viewed Big Data Videos on YouTube; Machine Learning Wars: Amazon vs Google vs BigML vs PredicSis.
on May 17, 2015 in Top stories
UC Analytics Summit, Cincinnati, May 29, 2015
UC Summit will feature two analytics leaders: John Elder and Stephen Few, 4 afternoon tracks focusing on descriptive / prescriptive / predictive analytics, building your analytics team, and more.
on May 15, 2015 in Business Analytics, Cincinnati, John Elder, OH
Interview: Sheridan Hitchens, Auction.com on Customer Lifetime Value as the Cornerstone for Marketing Analytics
We discuss the Customer Lifetime Value (CLV) metric, its maturity level, different models for calculating it, challenges in designing strategy based on CLV, and tackling attribution.
on May 15, 2015 in Auction, Customer Value, Interview, Marketing Analytics, Metrics, Sheridan Hitchens
ebook: Learning Apache Mahout, Big Data Analytics
Acquire practical skills in Big Data Analytics and explore data science with Apache Mahout in this comprehensive guide, which includes numerous code examples and end-to-end case studies.
on May 15, 2015 in Apache Mahout, Big Data Analytics, Packt Publishing
ebook: Learning Apache Mahout Classification
If you are a data scientist with Hadoop experience and an interest in machine learning, this book is for you. Learn about different classification methods in Apache Mahout and build your own classifiers.
on May 15, 2015 in Apache Mahout, Classification, ebook, Packt Publishing
Data Science for Workforce Optimization: Reducing Employee Attrition
Predictive analytics is extending its reach; see how it is affecting the workforce analytics domain. In this presentation Pasha Roberts explains what is in it for students, managers, and practitioners.
on May 15, 2015 in Pasha Roberts, PAW, Talent Analytics, Workforce Analytics
Stanford Webinar: Big Data + Electronic Health Records = Better Healthcare, June 18
We show how to transform unstructured patient notes into a de-identified, temporally ordered, patient-feature matrix, and examine use cases that use the resulting data to improve learning of practice-based evidence in electronic medical records.
on May 15, 2015 in Anonymized, Healthcare, Medical research, Stanford
Surprising Random Correlations
An interesting demo showing how easy it is to find surprising correlations in real data. Is German unemployment rate related to Apple Stock? Is 10-year Treasury rate related to price of Red Winter Wheat? You will be surprised.
on May 14, 2015 in Correlation, Overfitting, Quandl, Random
INFORMS Analytics courses: Discrete-Event Simulation, Essential Practice Skills for Analytics Professionals
INFORMS Continuing Education courses are designed to enhance the skills of today’s analytics professionals. SPECIAL OFFER: Register now to save $350 on registration for the upcoming course Introduction to Monte Carlo and Discrete-Event Simulation, offered on May 28-29 in Washington, DC.
on May 14, 2015 in Chicago, DC, IL, INFORMS, Simulation, Skills, Washington
Interview: Hobson Lane, SHARP Labs on How Analytics can Show You “All the Light You Cannot See”
We discuss the impact of rapid growth in magnitude of data, programming skills for data science, major trends, advice, data science skills, and more.
on May 14, 2015 in Advice, Analytics, Challenges, Hobson Lane, Interview, SHARP, Trends
Cloud Machine Learning’s Ostrich Mania & Uncanny Valley
Cloud machine learning services are popping up by the tens, providing automated data science solutions. What will the anticipated customers want? They may follow a peculiar distribution reminiscent of the uncanny valley.
on May 14, 2015 in Amazon, Azure ML, Clarifai, Cloud Analytics, Economics, IBM Watson, MetaMind, Startup, Zachary Lipton
Seven Techniques for Data Dimensionality Reduction
Performing data mining with high-dimensional data sets: a comparative study of different feature selection techniques such as Missing Values Ratio, Low Variance Filter, PCA, and Random Forests / Ensemble Trees.
By Rosaria Silipo on May 14, 2015 in Data Processing, High-dimensional, Knime, Rosaria Silipo
In-Memory Computing Summit, San Francisco, June 29-30
The In-Memory Computing Summit 2015 is the first and only industry-wide event of its kind, where Fast Data meets Big Data. Get the KDnuggets discount if you register by May 31.
on May 14, 2015 in CA, GridGain, In-Memory Computing, San Francisco
Interview: Hobson Lane, SHARP Labs on the Beauty of Simplicity in Analytics
We discuss Predictive Analytics projects at Sharp Labs of America, common myths, value of simplicity, tools and technologies, and notorious data quality issues.
on May 13, 2015 in Hobson Lane, Interview, Natural Language Processing, Predictive Analytics, Project Fail, SHARP, Tools
Plotly: Online Dashboards That Update Your Data and Graphs
New online visualization option from Plot.ly allows you to have data visualizations and graphs that update dynamically.
on May 13, 2015 in Data Visualization, Plotly
Last chance – Participate in the Rexer Analytics 2015 Data Miner Survey – before it closes May 30
Data Analysts, Predictive Modelers, Data Scientists, Data Miners, and all other types of analytic professionals, students, and academics - please participate in the Rexer Analytics 2015 Data Miner Survey before it closes May 30.
on May 13, 2015 in Data Mining, Rexer Analytics, Survey
Should Data Science Really Do That?
Data Science's amazing progress in prediction and analysis is raising important ethical questions: Should that data be collected? Should the collected data be used for that application? Should you be involved?
on May 13, 2015 in Data Science, DJ Patil, Ethics, Jeremy Howard, Kaggle, Online advertising, Privacy, Yanir Seroussi
Predictive Analytics World Workforce 2015: Highlights
PAW Workforce 2015 highlights include: analytics is now redefining Human Resources, analytics lessons are quite applicable to workforce questions, and workforce analytics poses unique challenges.
on May 13, 2015 in CA, Greta Roberts, PAW, Predictive Analytics World, San Francisco, Workforce Analytics
Top KDnuggets tweets, May 4-11: Why #HowOldRobot went viral and how does it work? 3 Things About #DataScience You Won’t Find In Books
Why #HowOldRobot went viral and how does it work? 3 Things About #DataScience You Won't Find In Books; Cartoon reminds all Data Scientists to remember their Mother; Data Scientist vs Data #Engineer.
on May 12, 2015 in Cartoon, Data Engineer, Data Science Skills, Deep Learning, Gmail, HowOldRobot
PAW Chicago: The Power of Predictive Analytics World – Why the buzz?
PAW for Business, the industry leading predictive analytics conference, will be in Chicago, June 8-11, 2015. Learn why there is such buzz about the conference and what people are saying about it. KDnuggets discount.
on May 12, 2015 in Chicago, IL, PAW, Predictive Analytics World
Interview: Mark Weiner, Temple University Health System on Addressing Healthcare Data Gaps through Advanced Simulation
We discuss dealing with current gaps in healthcare data, challenges in using real world healthcare data, desired skills for data scientists in healthcare industry, advice, and more.
on May 12, 2015 in Healthcare, Interview, Mark Weiner, Myths, Simulation, Skills, Temple University, Trends
CRN Big Data Business Analytics Companies
Data, Analytics, and Intelligence play a key role in CRN Big Data 100 in 2015. New additions include predictive analytics firms Knime, Logi Analytics, Looker, Luminoso, Predixion, RapidMiner, and Salesforce.
on May 12, 2015 in Alteryx, Ayasdi, Business Analytics, CRN, DataRPM, Domo, Google, Grant Marshall, H2O, Knime, Palantir, RapidMiner, Tableau
Machine Learning Wars: Amazon vs Google vs BigML vs PredicSis
Comparing 4 Machine Learning APIs (Amazon Machine Learning, BigML, Google Prediction API, and PredicSis) on real data from Kaggle, we find the most accurate, the fastest, the best tradeoff, and a surprise last place.
on May 12, 2015 in Amazon, BigML, Google, Louis Dorard, Machine Learning, PredicSis
Trifacta – Wrangling US Flight Data
A useful case study shows how Trifacta can clean and analyze US Flight data, including cleaning up markup, removing unrelated and redundant columns, cleaning geographic names and more.
on May 12, 2015 in Air traffic, Data Processing, Data Wrangling, Trifacta
Salford: 3 Ways to Improve your Targeted Marketing with Analytics, May 28
This webinar will show you how to optimize your targeted marketing using common analytics techniques. We will demonstrate with real-world data and give you software, data, and step-by-step instructions to enable you to do this on your own data.
on May 11, 2015 in Direct Marketing, Salford Systems
Upcoming Webcasts on Analytics, Big Data, Data Science – May 12 and beyond
Smart Apps With Deep Learning APIs, Actionable Insights from Life Sciences and Healthcare Data, Demystify Data Flows, 5 Predictive Analytics Lessons, 3 Ways to Improve Targeted Marketing, and more.
on May 11, 2015 in Angoss, Deep Learning, GridGain, James Taylor, Lavastorm, Ontotext, Salford Systems
Interview: Mark Weiner, Temple University Health System on Maturity Assessment of Healthcare Analytics
We discuss the challenges and opportunities created by increased collection of healthcare data, state of data accessibility, and the value of Analytics to the drug development process.
on May 11, 2015 in Challenges, Drugs, Healthcare, Interview, Mark Weiner, Pharma, Temple University
Gaming Analytics Summit 2015, San Francisco – Day 2 Highlights
Highlights from the presentations by Gaming Analytics leaders from Activision, Riot Games and Daybreak Game Company (formerly Sony Online Entertainment) on day 2 of Gaming Analytics Innovation Summit 2015 in San Francisco.
on May 11, 2015 in Activision, Analytics, Apache Storm, Conference, Data Lakes, Gaming Analytics, IE Group, Real-time, Sony
3 Things About Data Science You Won’t Find In Books
There are many courses on Data Science that teach the latest logistic regression or deep learning methods, but what happens in practice? A data scientist shares his main practical insights that are not taught in universities.
on May 11, 2015 in Cross-validation, Data Preparation, Data Science, Feature Engineering, Feature Extraction, Overfitting
Angoss: 5 Predictive Analytics Lessons from a Decision Management Guru, May 21 Webinar
Decision Management thought leader James Taylor will present 5 key lessons to help you move from predictive analytics to "prescriptive analytics" and maximize the ROI on analytics strategies.
on May 11, 2015 in Angoss, Decision Management, James Taylor
Top stories for May 3-9: Data Scientists Automated by 2025? The Inconvenient Truth About Data Science
Data Scientists Automated and Unemployed by 2025? The Inconvenient Truth About Data Science; Most Viewed Big Data Videos on YouTube; Poll: Predictive Analytics, Data Mining, Data Science software/tools used?
on May 10, 2015 in Top stories
Cartoon: Data Scientist Mother
We revisit KDnuggets Cartoon which looks at the Mother of All Data. Enjoy and don't forget the mothers in your life - Big Data predicted that 67.53% of you would remember!
on May 10, 2015 in Cartoon
Most Viewed Big Data Videos on YouTube
The top Big Data YouTube videos by those like Hortonworks and Kirk D. Borne cover diverse topics including Hadoop, Big Data Trends, Deep Learning, and Big Data Leadership.
on May 9, 2015 in Big Data, Cloudera, Deep Learning, Google, Grant Marshall, Hadoop, IBM, Kirk D. Borne, TED, Youtube
April 2015 Analytics, Big Data, Data Mining Acquisitions and Startups Activity
Apr 2015 acquisitions, startups, and company activity in Analytics, Big Data, Data Mining, and Data Science: Microsoft + RevolutionR, Most-Funded Startups, VC says boom not bubble - look at valuations, and more.
on May 8, 2015 in Cloudera, Datazen, Domo, IBM Watson, Microsoft, MongoDB, Palantir, Revolution Analytics, Startups
Ontotext: Creating Actionable Insights from Life Sciences and Healthcare Data, May 14
Life sciences and healthcare organizations sit on mountains of structured and unstructured information. This webinar shows how semantic technology and text analytics help get value from this information, with a focus on Data Modeling, Data Mining, and Data Fusion.
on May 8, 2015 in Healthcare, Life Science, Ontotext, Semantic Analysis, Unstructured data
Interview: Alison Burnham, Scorebig on Optimal, Real-time Pricing through Analytics
We discuss Analytics at ScoreBig, the company’s business model, unexpected insights, challenges in customer value management, advice, and more.
on May 8, 2015 in Advice, Analytics, Big Data, Career, Customer Analytics, Customer Value, Interview, Machine Learning, Pricing, Real-time
Gaming Analytics Summit 2015, San Francisco – Day 1 Highlights
Highlights from the presentations by Gaming Analytics leaders from Facebook, Turbine/Warner Bros Games, and Sega on day 1 of Gaming Analytics Innovation Summit 2015 in San Francisco.
on May 8, 2015 in Apache Spark, Conference, Facebook, Games, Gaming Analytics, Highlights, IE Group, Prediction
Top April stories: Awesome Public Datasets on GitHub; Forrester Wave Big Data Predictive Analytics – Gainers and Losers
Awesome Public Datasets on GitHub; Forrester Wave Big Data Predictive Analytics - Gainers and Losers; Cloud Machine Learning Wars; Top LinkedIn Groups for Analytics, Big Data, Data Mining, and Data Science.
on May 8, 2015 in Top stories
Interview: Michael Stonebraker, greatest living contributor to database technology
Michael Stonebraker, described as the greatest living contributor to database technology, on how he adjusts to the award and what trends he foresees in database management systems and big data.
on May 7, 2015 in ACM, Awards, Michael Stonebraker, Relational Databases
10 reasons why how-old.net went viral and how does it work?
10 reasons why how-old.net went viral and how it actually works: classic linear regression on top of amazing face recognition.
on May 7, 2015 in Face Recognition, HowOldRobot, Joseph Sirosh, Microsoft
Lavastorm Webinar: Demystify Your Data Flows for Better Regulatory Compliance, May 19
Complying with new financial regulations can be incredibly costly. Learn how Lavastorm helps firms aggregate and manage data, provide visibility into data sources, transformation and lineage, and reduce time and costs spent on compliance.
on May 7, 2015 in Compliance, Data Preparation, Lavastorm
Poll: What Predictive Analytics, Data Mining, Data Science software/tools you used in the past 12 months?
Vote in the KDnuggets 16th Annual Poll: What Analytics, Data Mining, Data Science software/tools have you used in the past 12 months for a real project? We will clean and analyze the results and publish our trend analysis afterwards.
on May 7, 2015 in Analytics Languages, Data Mining Software, Data Science Platform, Deep Learning, Hadoop, Poll
Best 5 minutes in Data Science, Season 1
Ingo Mierswa, Data Scientist and RapidMiner CEO, shares his thoughts about trends, challenges, and opportunities in analytics and explains data science concepts in an understandable and fun style, with help from a unicorn, a wizard, and other special guests.
on May 6, 2015 in Data Science Education, Ingo Mierswa, RapidMiner
KDD 2015 Innovation and Service Awards, nominations due June 5
ACM SIGKDD Innovation and Service Awards recognize outstanding technical innovations and outstanding professional contributions to the field of Big data, Data Mining, Knowledge Discovery, and Predictive Analytics.
on May 6, 2015 in Awards, KDD-2015, SIGKDD, Ted Senator
Deep Learning with Structure – a preview
A big problem with Deep Learning networks is that their internal representation lacks interpretability. At the upcoming #DeepLearning Summit, Charlie Tang, a student of Geoff Hinton, will present an approach to address this concern - here is a preview.
on May 6, 2015 in Deep Learning, Geoff Hinton, Image Recognition, NVIDIA, RE.WORK
The Inconvenient Truth About Data Science
Data is never clean, you will spend most of your time cleaning and preparing data, 95% of tasks do not require deep learning, and more inconvenient wisdom.
on May 5, 2015 in Advice, Data Cleaning, Data Science
Data Scientists Automated and Unemployed by 2025?
Will Data Scientists be unemployed by 2025? A majority of voters in the latest KDnuggets Poll expect expert-level Data Science to be automated in 10 years or less.
on May 5, 2015 in Automation, Data Scientist, Poll
Basketball Predictive Analytics: Will he take the shot?
Sports analytics has reached a new level: researchers can now predict whether and from where a basketball player will take a shot. Check out a fun online app that lets you play with predictions.
on May 5, 2015 in Basketball, Disney, Francois Petitjean, Predictive Analytics, Sports
Has your data become overwhelming? Attend PAW Chicago
Has your data become overwhelming? Let Predictive Analytics World Help! Attend our Chicago event(s) to develop the skills and strategies to use your data effectively. KDnuggets discount.
on May 5, 2015 in Chicago, PAW, PAW-mfg, Predictive Analytics World
Try JMP® free for 30 days
JMP software from SAS gives you dynamic data visualization and analytics on the desktop - speed the learning cycle and make it easier to reach breakthrough discoveries.
on May 4, 2015 in JMP, SAS
Top KDnuggets tweets, Apr 27 – May 3: Attack of the #BigData Startups; Data Science from Scratch: First Principles with Python
Data Science from Scratch: First Principles with Python; Attack of the #BigData Startups - @cbinsights maps the industries; Not accurate, but #fun! See how old Microsoft thinks you are; US Chief Data Scientist @dpatil on #DataScience #BigData past and future.
on May 4, 2015 in DJ Patil, Face Recognition, Kirk D. Borne, Microsoft, NoSQL, Python, Startups
Upcoming Webcasts on Analytics, Big Data, Data Science – May 5 and beyond
Making Sense of Hadoop, Data Mining - Failure to Launch, Demystify your Data Flows, 5 Lessons from a Decision Management Guru, 3 Ways to Improve Regression, and more.
on May 4, 2015 in Angoss, Lavastorm, Salford Systems, The Modeling Agency
Top LinkedIn Groups for Analytics, Big Data, Data Mining, and Data Science – Discussions up, Engagement down
While discussions are growing, comments and engagement are falling, especially since 2012. We cluster groups into 4 quadrants by activity level and identify the most active and engaged groups. Open groups are twice as active as closed ones.
on May 4, 2015 in About KDnuggets, LinkedIn, LinkedIn Groups
80 upcoming May – November 2015 Meetings in Analytics, Big Data, Data Mining, Data Science
Coming soon: Big Data Innovation London, Business Analytics Chicago, Smartcon Istanbul, Open Data Science Boston, PAW Chicago, ICML Lille, Sentiment Analysis NYC, and many more.
on May 4, 2015 in Boston, Chicago, London, New York City, San Francisco
Guiding Principles to Build a Demand Forecast
Demand forecasting is key for many industries, including finance, healthcare, and retail, and it is one of the most challenging tasks for predictive analytics. We review the challenges and guiding principles of demand forecasting.
on May 4, 2015 in Forecasting, Lana Klein
Strata + Hadoop World 2015, London, May 5-7, Watch Live
As a Strata media partner, KDnuggets brings you this opportunity to watch keynotes from Strata + Hadoop London live, May 5-7. Check the details inside.
on May 4, 2015 in Hadoop, London, O'Reilly, Strata
NoSQL matters, great fun and tech nerdity, Dublin, June 4
NoSQL matters provides an opportunity to network with leading NoSQL experts from all around the world. Also, enter a raffle for a free KDnuggets ticket.
on May 4, 2015 in Dublin, Ireland, NoSQL
NYC Data Science Academy 12 weeks Bootcamp, Apply by May 15
Upcoming NYC Data Science Academy education includes a 12-week bootcamp, classes on Intro to Data Science, and Data Analysis with R and with Python, meetups, and job placement assistance.
on May 3, 2015 in Bootcamp, Data Science Education, NYC Data Science Academy, Python, R
Additions to KDnuggets Directory in April
20+ new meetings, including Smartcon (Istanbul), Collab. Data Science, Boston Data Festival, SIGMOD 2016, ICDM 2016; Awesome public datasets; DecisionIQ, VisualText and more.
on May 3, 2015 in CA, Datasets, San Francisco
Top stories for Apr 26 – May 2: The Myth of Model Interpretability; How To Become a Data Scientist and Get Hired
The Myth of Model Interpretability; How To Become a Data Scientist And Get Hired; Top LinkedIn Groups for Analytics, Big Data, Data Mining; Emmanuel Letouze, Data-Pop Alliance on Big Data for Development and Future Prospects.
on May 3, 2015 in Top stories
Talking Machine – Deep Learning in Speech Recognition
A summary of an episode of the Talking Machine podcast on deep neural networks in speech recognition, featuring George Dahl, one of Geoffrey Hinton’s students, who just defended his Ph.D. last month.
on May 2, 2015 in Deep Learning, Kaggle, Microsoft, Podcast, Ran Bi, Speech Recognition
Interview: Haile Owusu, Mashable on Surviving Imprecision in Digital Media Analytics
We discuss the challenges in tracking social media sharing, advice, important trends, and more.
on May 1, 2015 in Advice, Analytics, Career, Digital Media, Haile Owusu, Interview, Mashable, Trends
Data Mining Process/Workflow Reproducibility and KNIME
What happens to analytics and data mining workflows when different components change? KNIME's approach of keeping the old versions as part of the platform guarantees reproducibility.
on May 1, 2015 in Data Processing, Knime, Michael Berthold, Reproducibility, Workflow
WebDataCommons – the Data and Framework for Web-scale Mining
The WebDataCommons project extracts the largest publicly available hyperlink graph, large product-, address-, recipe-, and review corpora, as well as millions of HTML tables from the Common Crawl web corpus and provides the extracted data for public download.
on May 1, 2015 in Big Data Analytics, Graph Databases, RDF, Web Mining
How Data Science makes Better Products
This video defines adaptive software, shows how data science realizes these applications, and discusses how these new tools are addressing real world challenges across all industries.
on May 1, 2015 in Data Science, Products, Sean McClure
DrivenData Competition: Keeping Boston Fresh
DrivenData civic innovation competition, "Keeping it Fresh," aims to help cities capitalize on their data. The task is to predict Boston restaurant violations from social media.
on May 1, 2015 in Boston, Competition, DrivenData, Yelp
How To Become a Data Scientist And Get Hired
A data scientist should be able to choose the right technology, understand the business context, and solve a wide range of problems. To hire the right data scientist, check the list of tips in the post.
on May 1, 2015 in Business, Data Scientist, Hiring, Salary