- PyCaret 2.3.5 Is Here! Learn What’s New - Nov 26, 2021.
Read about the new functionalities added in PyCaret’s recent release.
Open Source, PyCaret, Python
- 7 Top Open Source Datasets to Train Natural Language Processing (NLP) & Text Models - Nov 8, 2021.
With a lot of excitement and research around NLP, there are growing opportunities to apply these technologies to real-world scenarios. It's not trivial to become familiar with NLP, and these open-source datasets can help you increase your skills.
Dataset, NLP, Open Source
- How to Build Data Frameworks with Open Source Tools to Enhance Agility and Security - Oct 27, 2021.
Let’s take a look at how to harness open source tools to build your data frameworks.
Data Democratization, Deployment, Open Source, Security
- Introducing PostHog: An open-source product analytics platform - Sep 23, 2021.
PostHog is an open-source product analytics platform that helps you and your product team capture, analyze, and make informed decisions based on user behaviour.
Data Analytics, Data Platform, Open Source, Platform, Tools
- The Machine & Deep Learning Compendium Open Book - Sep 16, 2021.
After years in the making, this extensive and comprehensive ebook resource is now available and open for data scientists and ML engineers. Learn from and contribute to this tome of valuable information to support all your work in data science from engineering to strategy to management.
Deep Learning, ebook, GitHub, Machine Learning, Open Source
- KDnuggets™ News 21:n32, Aug 25: Open Source Datasets for Computer Vision; Django’s 9 Most Common Applications - Aug 25, 2021.
Open Source Datasets for Computer Vision; Django’s 9 Most Common Applications; How to Select an Initial Model for your Data Science Problem; Automate Microsoft Excel and Word Using Python; Stack Overflow Survey Data Science Highlights
Computer Vision, Datasets, Django, Microsoft, Modeling, Open Source, Python, StackOverflow
- Open Source Datasets for Computer Vision - Aug 18, 2021.
Access to high-quality, noise-free, large-scale datasets is crucial for training complex deep neural network models for computer vision applications. Many open-source datasets are developed for use in image classification, pose estimation, image captioning, autonomous driving, and object segmentation. These datasets must be paired with the appropriate hardware and benchmarking strategies to optimize performance.
Computer Vision, Datasets, Open Source
- Querying the Most Granular Demographics Dataset - Aug 13, 2021.
Having access to broad and detailed population data can potentially offer enormous value to any organization looking to interact with specific demographics. However, access alone is not sufficient without being able to leverage advanced techniques to explore and visualize the data.
Big Data, Data Visualization, Geolocation, Neo4j, Open Source
- KDnuggets™ News 21:n29, Aug 4: GitHub Copilot Open Source Alternatives; 3 Reasons Why You Should Use Linear Regression Models Instead of Neural Networks - Aug 4, 2021.
GitHub Copilot Open Source Alternatives; 3 Reasons Why You Should Use Linear Regression Models Instead of Neural Networks; A Brief Introduction to the Concept of Data; MLOps Best Practices; GPU-Powered Data Science (NOT Deep Learning) with RAPIDS
Generative Models, GitHub, Linear Regression, Modeling, Neural Networks, NLP, Open Source, Programming
- Facebook Open Sources a Chatbot That Can Discuss Any Topic - Jul 27, 2021.
The new version expands the capabilities of its predecessor, building a much more natural conversational experience.
Chatbot, Facebook, NLP, Open Source
- How to Use Kafka Connect to Create an Open Source Data Pipeline for Processing Real-Time Data - Jul 23, 2021.
This article shows you how to create a real-time data pipeline using only pure open source technologies. These include Kafka Connect, Apache Kafka, Kibana and more.
Data Processing, Kafka, Open Source, Pipeline, Real-time
- Overview of Albumentations: Open-source library for advanced image augmentations - Jul 22, 2021.
With code snippets on augmentations and integrations with PyTorch and TensorFlow pipelines.
Image Processing, Open Source, Python, PyTorch, TensorFlow
- 7 Open Source Libraries for Deep Learning Graphs - Jul 15, 2021.
In this article we’ll go through 7 up-and-coming open source libraries for graph deep learning, ranked in order of increasing popularity.
Deep Learning, Graphs, Open Source
- Amazing Low-Code Machine Learning Capabilities with New Ludwig Update - Jun 22, 2021.
Integrations with Ray, MLflow and TabNet are among the top features of this release.
Low-Code, Machine Learning, Open Source, Uber
- The 7 Best Open Source AI Libraries You May Not Have Heard Of - Jun 9, 2021.
AI researchers today have many exciting options for working with specialized tools. Although starting original projects from scratch is often not necessary, knowing which existing library to leverage remains a challenge. This list of generally unknown yet awesome, open-source libraries offers an interesting collection to consider for state-of-the-art research that spans from automatic machine learning to differentiable quantum circuits.
AI, Hyperparameter, Julia, Open Source, Probability, Quantum Computing
- 5 Data Science Open-source Projects You Should Consider Contributing to - Jun 7, 2021.
As you prepare to interview for a position in data science or look to jump to the next level, now is the time to enhance your skills and your resume by working on real, open-source projects. Here, we suggest a great selection of projects you can contribute to and help build something awesome, so all you need to do is choose one and tackle it head on.
Caffe, Data Science, Data Science Skills, GitHub, Google, Machine Learning, Open Source
- Binary Classification with Automated Machine Learning - May 17, 2021.
Check out how to use the open-source MLJAR auto-ML to build accurate models faster.
Automated Machine Learning, AutoML, Classification, Open Source
- Easy AutoML in Python - Apr 1, 2021.
We’re excited to announce that a new open-source project has joined the Alteryx open-source ecosystem. EvalML is a library for automated machine learning (AutoML) and model understanding, written in Python.
Automated Machine Learning, AutoML, Machine Learning, Open Source, Python
- Google’s Model Search is a New Open Source Framework that Uses Neural Networks to Build Neural Networks - Mar 1, 2021.
The new framework brings state-of-the-art neural architecture search methods to TensorFlow.
Automated Machine Learning, AutoML, Google, Neural Networks, Open Source
- Easy, Open-Source AutoML in Python with EvalML - Feb 16, 2021.
We’re excited to announce that a new open-source project has joined the Alteryx open-source ecosystem. EvalML is a library for automated machine learning (AutoML) and model understanding, written in Python.
Automated Machine Learning, AutoML, Machine Learning, Open Source, Python
- Facebook Open Sources ReBeL, a New Reinforcement Learning Agent - Dec 14, 2020.
The new model tries to recreate the reinforcement learning and search methods used by AlphaZero in imperfect information scenarios.
Agents, AI, Facebook, Open Source, Reinforcement Learning
- Facebook Open Sourced New Frameworks to Advance Deep Learning Research - Nov 17, 2020.
Polygames, PyTorch3D and HiPlot are the new additions to Facebook’s open source deep learning stack.
Deep Learning, Facebook, Open Source, PyTorch, Research
- Microsoft and Google Open Sourced These Frameworks Based on Their Work Scaling Deep Learning Training - Nov 2, 2020.
Google and Microsoft have recently released new frameworks for distributed deep learning training.
Deep Learning, Google, Microsoft, Open Source, Scalability, Training
- Uber Open Sources the Third Release of Ludwig, its Code-Free Machine Learning Platform - Oct 13, 2020.
The new release makes Ludwig one of the most complete open source AutoML stacks in the market.
Automated Machine Learning, AutoML, Machine Learning, Open Source, Uber
- Getting Started in AI Research - Oct 5, 2020.
A guide on how to contribute to confirming the reproducibility of some of the most recent papers and join open-source research.
AI, GitHub, Google, Open Source, Research
- Netflix’s Polynote is a New Open Source Framework to Build Better Data Science Notebooks - Aug 5, 2020.
The new notebook environment provides substantial improvements to streamline experimentation in machine learning workflows.
IDE, Jupyter, Netflix, Open Source, Scala
- What I learned from looking at 200 machine learning tools - Jul 21, 2020.
While hundreds of machine learning tools are available today, the ML software landscape may still be underdeveloped with more room to mature. This review considers the state of ML tools, existing challenges, and which frameworks are addressing the future of machine learning software.
Data Science Platform, Data Science Tools, Machine Learning, MLOps, Open Source, Tools
- Lynx Analytics is open-sourcing LynxKite, its Complete Graph Data Science Platform - Jun 25, 2020.
Check out this article for a brief summary on what LynxKite is, where it is coming from and how it can help with your data science projects.
Data Science Platform, Graph Analytics, Open Source
- Uber’s Ludwig is an Open Source Framework for Low-Code Machine Learning - Jun 15, 2020.
The new framework allows developers with minimal experience to create and train machine learning models.
Low-Code, Machine Learning, No-Code, Open Source, Uber
- LinkedIn Open Sources a Small Component to Simplify the TensorFlow-Spark Interoperability - May 25, 2020.
Spark-TFRecord enables the processing of TensorFlow’s TFRecord structures in Apache Spark.
LinkedIn, Open Source, Spark, TensorFlow
- Build and deploy your first machine learning web app - May 22, 2020.
A beginner’s guide to train and deploy machine learning pipelines in Python using PyCaret.
App, Flask, Heroku, Machine Learning, Modeling, Open Source, Pipeline, PyCaret, Python
- Facebook Open Sources Blender, the Largest-Ever Open Domain Chatbot - May 15, 2020.
The new conversational agent exhibits human-like behavior in conversations about almost any topic.
Chatbot, Facebook, NLP, Open Source
- Google Open Sources SimCLR, A Framework for Self-Supervised and Semi-Supervised Image Training - Apr 27, 2020.
The new framework uses contrastive learning to improve image analysis in unlabeled datasets.
Google, Image Recognition, Open Source, Self-supervised Learning
- Announcing PyCaret 1.0.0 - Apr 21, 2020.
An open source low-code machine learning library in Python. PyCaret is an alternate low-code library that can be used to replace hundreds of lines of code with just a few words. This makes experiments exponentially faster and more efficient.
Machine Learning, Modeling, Open Source, PyCaret, Python
- OpenAI Open Sources Microscope and the Lucid Library to Visualize Neurons in Deep Neural Networks - Apr 17, 2020.
The new tools show the potential of data visualizations for understanding features in a neural network.
Neural Networks, Open Source, OpenAI, Visualization
- Free Workshop Preview: Data Thinking with Martin Szugat - Apr 13, 2020.
As anticipation grows for Predictive Analytics World’s virtual conferences (PAW for Industry 4.0, PAW for Healthcare and Deep Learning World on 11-12 May 2020) and virtual workshops (13 May 2020), here is a chance to start familiarising yourself with the quality of the content and of the virtual networking. Gain an insight into how to apply design thinking for data science & analytics. Reserve your spot.
Data Science, Open Source, PAW, Predictive Analytics World
- Sharing your machine learning models through a common API - Feb 12, 2020.
DEEPaaS API is a software component developed to expose machine learning models through a REST API. In this article we describe how to do it.
API, Deep Learning, Machine Learning, Open Source, Python
- KDnuggets™ News 19:n46, Dec 4: The Future of Data Science Careers; Which Data Visualization Should I Use? - Dec 4, 2019.
This week: The Future of Careers in Data Science & Analysis; Task-based effectiveness of basic visualizations; Open Source Projects by Google, Uber and Facebook for Data Science and AI; Getting Started with Automated Text Summarization; A Non-Technical Reading List for Data Science; and much more!
Books, Careers, Data Science, Data Visualization, NLP, Open Source, Text Summarization
- Google Open Sources MobileNetV3 with New Ideas to Improve Mobile Computer Vision Models - Dec 2, 2019.
The latest release of MobileNets incorporates AutoML and other novel ideas in mobile deep learning.
Automated Machine Learning, Computer Vision, Google, Mobile, Open Source
- Open Source Projects by Google, Uber and Facebook for Data Science and AI - Nov 28, 2019.
Open source is becoming the standard for sharing and improving technology. Some of the largest organizations in the world, namely Google, Facebook and Uber, are open sourcing the technologies they use in their own workflows.
Advice, AI, Data Science, Data Scientist, Data Visualization, Deep Learning, Facebook, Google, Open Source, Python, Uber
- Contributing to PyTorch: By someone who doesn’t know a ton about PyTorch - Oct 9, 2019.
By the end of my week with the team, I managed to proudly cut two PRs on GitHub. I decided that I would write a blog post to knowledge share, not just to show that YES, you can too.
Open Source, Python, PyTorch
- What’s the Best Data Strategy for Enterprises: Build, buy, partner or acquire? - Jul 22, 2019.
Every large organization is investing heavily in building data solutions and tools. They are building data solutions from scratch when they could be taking advantage of readily available tools and solutions. Many organizations are re-inventing the wheel and wasting resources.
Acquisitions, Enterprise, Implementation, Open Source, Strategy
- 2018 Year-in-Review: Machine Learning Open Source Projects & Frameworks - Dec 17, 2018.
This post is a look at the top open source projects and major developments in machine learning frameworks over the past 12 months.
Machine Learning, Open Source, Review
- A comprehensive list of Machine Learning Resources: Open Courses, Textbooks, Tutorials, Cheat Sheets and more - Dec 7, 2018.
A thorough collection of useful resources covering statistics, classic machine learning, deep learning, probability, reinforcement learning, and more.
Cheat Sheet, Data Science Education, Deep Learning, Machine Learning, Mathematics, Open Source, Reinforcement Learning, Resources, Statistics
- Open Source Data Science Adoption: The How & Why - Dec 4, 2018.
Get the report on Enterprise Open Source Data Science Adoption which outlines the most popular open source tools for a host of jobs. Free download.
Data Science, Dataiku, Enterprise, Open Source
- The Evolution of Build Engineering in Managing Open Source [Webinar Replay] - Nov 13, 2018.
Explore how the role of build engineering is evolving to reconcile two key trends: massive wide-scale adoption of open source; the most devastating cyber-attacks in recent history tied to unpatched dependencies and other vulnerabilities.
ActiveState, Cybersecurity, DevOps, Open Source, Risks
- How to Mitigate Open Source License Risks - Oct 30, 2018.
This whitepaper from ActiveState investigates the various types of OSS licenses, common myths and risks, DIY risk management, the importance of enterprise legal indemnification, and more.
ActiveState, Open Source, Risks, White Paper
- Implementing Automated Machine Learning Systems with Open Source Tools - Oct 25, 2018.
What if you want to implement an automated machine learning pipeline of your very own, or automate particular aspects of a machine learning pipeline? Rest assured that there is no need to reinvent any wheels.
Automated Machine Learning, Feature Engineering, Feature Selection, Hyperparameter, Machine Learning, Open Source
- Datmo: the Open Source tool for tracking and reproducible Machine Learning experiments - Sep 26, 2018.
As a data scientist, managing environments and experiments is always hard and results in wasted time and effort with all the troubleshooting and lost work. With datmo, you can track your experiments using this common standard and not worry about reproduction of previous work.
Data Science, Docker, Machine Learning, Open Source, Reproducibility
- 10 Big Data Trends You Should Know - Sep 17, 2018.
A collection of Big Data trends to familiarize yourself with, covering IoT Networks, Artificial Intelligence, Predictive Analytics, Dark Data and more.
AI, Big Data, Big Data Analytics, Chatbot, Dark Data, Data Analytics, IoT, Open Source, Trends
- Analyze, engineer, design: Do it all with Dash - Aug 24, 2018.
Open-source Dash lets you wrap a GUI around that analytical code, without leaving the familiarity of Python. Explore your data with rich, interactive drop-down menus, sliders, and other components, all in the web browser.
Dash, Data Visualization, Open Source, Plotly, Python
- The 6 components of Open-Source Data Science/ Machine Learning Ecosystem; Did Python declare victory over R? - Jun 6, 2018.
We find 6 tools form the modern open source Data Science / Machine Learning ecosystem; examine whether Python declared victory over R; and review which tools are most associated with Deep Learning and Big Data.
Anaconda, Apache Spark, Data Science, Keras, Machine Learning, Open Source, Poll, Python, R, RapidMiner, Scala, scikit-learn, TensorFlow
- ioModel Machine Learning Research Platform – Open Source - Jun 5, 2018.
This article introduces ioModel, an open source research platform that ingests data and automatically generates descriptive statistics on that data.
Data Preparation, GitHub, Machine Learning, Open Source, Postgres, Python
- Torus for Docker-First Data Science - May 8, 2018.
To help data science teams adopt Docker and apply DevOps best practices to streamline machine learning delivery pipelines, we open-sourced a toolkit based on the popular cookiecutter project structure.
Data Science, DevOps, Docker, Machine Learning Engineer, Open Source, Python
- Top 16 Open Source Deep Learning Libraries and Platforms - Apr 24, 2018.
We bring to you the top 16 open source deep learning libraries and platforms. TensorFlow is out in front as the undisputed number one, with Keras and Caffe completing the top three.
Caffe, GitHub, Keras, Machine Learning, Open Source, TensorFlow
- How I Unknowingly Contributed To Open Source - Apr 24, 2018.
This article explains what is meant by the term 'open source' and why all data scientists should be a part of it.
fast.ai, GitHub, Jeremy Howard, Open Source, Python, scikit-learn
- Top KDnuggets tweets, Feb 14-20: Neural Network AI is simple. So… Stop pretending you are a genius - Feb 21, 2018.
#NeuralNetwork #AI is simple. Stop pretending you are a genius; Cartoon: #ValentinesDay or #MachineLearning Problems in 2118; #MachineLearning Top 10 Open Source Projects.
Open Source, Top tweets
- Top 20 Python AI and Machine Learning Open Source Projects - Feb 20, 2018.
We update the top AI and Machine Learning projects in Python. TensorFlow has moved to first place with triple-digit growth in contributors. Scikit-learn dropped to 2nd place, but still has a very large base of contributors.
GitHub, Machine Learning, Open Source, Python, scikit-learn, TensorFlow
- Supercharging Visualization with Apache Arrow - Jan 5, 2018.
Interactive visualization of large datasets on the web has traditionally been impractical. Apache Arrow provides a new way to exchange and visualize data at unprecedented speed and scale.
Apache Arrow, Big Data, Data Analytics, Data Visualization, Dremio, GPU, Graphistry, Open Source
- DeepSchool.io: Deep Learning Learning - Dec 22, 2017.
What I truly envision for deep school is that this will build a whole lot of Meetup nodes across the world where people will learn, mentor and network around sharing AI knowledge.
Deep Learning, Neural Networks, Online Education, Open Source
- Choosing an Open Source Machine Learning Library: TensorFlow, Theano, Torch, scikit-learn, Caffe - Nov 8, 2017.
Open source is at the heart of innovation and the rapid evolution of technologies these days. Here we discuss how to choose open source machine learning tools for different use cases.
Caffe, Machine Learning, Open Source, scikit-learn, TensorFlow, Theano, Torch
- Full Stack Data Science at ODSC - Aug 29, 2017.
Improve your skills in every layer of the Data Science stack at ODSC West 2017 and test drive the leading open source tools. Save 60% with code KD60 until Sep 1.
CA, Data Science, ODSC, Open Source, San Francisco
- Data Version Control in Analytics DevOps Paradigm - Aug 14, 2017.
DevOps and DVC tools can help reduce time data scientists spend on mundane data preparation and achieve their dream of focusing on cool machine learning algorithms and interesting data analysis.
Analytics, Data Preparation, Data Science, DevOps, DVC, Open Source, Version Control
- Why Apache Arrow is the future for open source-columnar memory analytics - Aug 7, 2017.
Apache Arrow is a de facto standard for columnar in-memory analytics. In the coming years we can expect all the big data platforms to adopt Apache Arrow as their columnar in-memory layer.
Analytics, Apache, Apache Arrow, Big Data, In-Memory Computing, Open Source
- Visualizing Convolutional Neural Networks with Open-source Picasso - Aug 1, 2017.
Toolkits for standard neural network visualizations exist, along with tools for monitoring the training process, but are often tied to the deep learning framework. Could a general, easy-to-setup tool for generating standard visualizations provide a sanity check on the learning process?
Convolutional Neural Networks, Neural Networks, Open Source, Visualization
- Data Version Control: iterative machine learning - May 11, 2017.
ML modeling is an iterative process, and it is extremely important to keep track of all the steps and dependencies between code and data. A new open-source tool helps you do that.
CRISP-DM, DVC, GitHub, Machine Learning, Open Source, Reproducibility, Version Control
- You Scored 200 Dollars Off Open Source Data Event in Boston - May 2, 2017.
Use code KDPV17 to save on Postgres Vision, June 26-28, 2017, at the Royal Sonesta Boston. Co-hosted by EnterpriseDB and MIT, the event sponsors include Amazon Web Services, Avnet, credativ, EnterpriseDB, IBM, Microsoft, MIT, NEC, Palisade Compliance, Quest, TechData, and The Executive Council.
Boston, Data Management, MA, Open Source, Postgres
- Open Source is Central to the Data Management Conversation, Boston, June 26-28 - Apr 18, 2017.
Open source dominates the data management conversation. Postgres Vision, June 26-28, Boston, explores the business value realized from innovative solutions and strategies. Use code KDPV17 to save.
Boston, Data Management, MA, Open Source, Postgres
- DataRobot Webinar, May 2: How Automated Machine Learning is Transforming the Predictive Analytics Landscape - Apr 11, 2017.
Learn how DataRobot automates predictive modeling, and how our platform can deliver these same types of insights and a substantial productivity boost to your machine learning endeavors.
Automated Data Science, Automated Machine Learning, DataRobot, Open Source
- Help Define the Future of Open Source Data Management, Boston, June 26-28 - Apr 10, 2017.
Postgres Vision, June 26-28, Boston, will be a forum for the sharpest minds in open source as organizations strive to harvest greater strategic value and actionable insight from their data. Use code KDPV17 to save.
Boston, Data Management, MA, Open Source, Postgres
- Open Source Toolkits for Speech Recognition - Mar 14, 2017.
This article reviews the main options for free speech recognition toolkits that use traditional Hidden Markov Models and n-gram language models.
C++, Java, Open Source, Python, Speech Recognition, SVDS
- Kobielus Predictions for Data Science in 2017 - Dec 5, 2016.
IBM Data Evangelist James Kobielus's predictions for 2017, including the key role of data scientists in the survival of their companies. Join industry experts for a live #MakeDataSimple Crowdchat on Thursday December 8 at 1:00pm EST.
2017 Predictions, Data Science, Data Science Skills, Data Scientist, IBM, IBM Watson, Open Source
- Top KDnuggets tweets, Nov 16-22: Top 20 #Python #MachineLearning #OpenSource Projects; Shortcomings of #DeepLearning - Nov 23, 2016.
Top 20 #Python #MachineLearning #OpenSource Projects; Shortcomings of #DeepLearning; What is the Difference Between #DeepLearning and Regular #MachineLearning?; Questions To Ask When Moving #MachineLearning From Practice to Production; How to Choose the Right #Database System
Deep Learning, Machine Learning, Open Source, Python, Top tweets
- Top 20 Python Machine Learning Open Source Projects, updated - Nov 21, 2016.
Open source is at the heart of innovation and the rapid evolution of technologies these days. This article presents the Top 20 Python Machine Learning Open Source Projects of 2016, along with very interesting insights and trends found during the analysis.
GitHub, Machine Learning, Open Source, Python, scikit-learn
- Webinar: Breaking Data Science Open, Sep 15 - Sep 12, 2016.
Learn how to drive collaboration and data science teamwork; how to mitigate legal risk through open source assurance and appropriate package selection, and how to democratize innovation through broad access to open data science tools.
Continuum Analytics, Data Science, Open Source, Python, R
- Top Machine Learning Projects for Julia - Aug 19, 2016.
Julia is gaining traction as a legitimate alternative programming language for analytics tasks. Learn more about these 5 machine learning related projects.
Deep Learning, Julia, Machine Learning, Open Source, scikit-learn
- 35 Open Source tools for Internet of Things - Jul 25, 2016.
If you have heard about the Internet of Things many times by now, it's time to join the conversation. Explore the many open source tools & projects related to the Internet of Things.
Internet of Things, IoT, Open Source, Tools
- Webinar, July 28: How Open Data Science Can Help Analytics Leaders Survive & Thrive in an Era of Accelerating Technology Disruption - Jul 22, 2016.
Continuum Analytics CTO Peter Wang will show how you, an analytics leader, and your team can continuously leverage the latest innovations in data, analytics and computation by joining the big data party in the Open Data Science tent.
Continuum Analytics, Data Science, Open Source
- Getting Started with Analytics: What’s the Upfront Investment? - Jul 5, 2016.
Everyone wants to leverage analytics, but should everyone dive into the deep end right away? Heed some sensible advice on getting started with analytics, and assessing the true upfront investment.
Analytics, Excel, Investment, Open Source
- IBM: Open Source Data Scientist - Jun 8, 2016.
IBM seeks an Open Source Data Scientist to assist the sales team with solution sales activities to address a client’s specific challenges implementing Big Data solutions; must be entrepreneurial and self-driven.
Data Scientist, IBM, Open Source, USA
- Open Source Machine Learning Degree - Jun 6, 2016.
A set of free resources for learning machine learning, inspired by similar open source degree resources. Find links to books and book-length lecture notes for study.
Free, Machine Learning, Mathematics, Open Source
- 5 Machine Learning Projects You Can No Longer Overlook - May 19, 2016.
We all know the big machine learning projects out there: Scikit-learn, TensorFlow, Theano, etc. But what about the smaller niche projects that are actively developed, providing useful services to users? Here are 5 such projects.
Data Cleaning, Deep Learning, Machine Learning, Open Source, Overlook, Pandas, Python, scikit-learn, Theano
- ODSC East 2016: 3 ways to become a better Data Scientist - Apr 7, 2016.
This year’s 2016 ODSC East brings together the most influential data scientists, practitioners, innovators, and thought leaders in data science and big data, including many open source data science pioneers.
Boston, Data Science, MA, ODSC, Open Source
- Data Science Tools – Are Proprietary Vendors Still Relevant? - Mar 25, 2016.
We examine and quantify the dramatic impact of open source tools like R and Python on SAS, IBM, Microsoft, and other proprietary Data Science vendors. We also investigate how open source tools are faring against each other, which are growing and which are falling, and look at the R versus Python debate.
Data Science Tools, IBM, Microsoft, Open Source, Python, R, SAS
- Top 10 Data Science Resources on Github - Mar 24, 2016.
The top 10 data science projects on Github are chiefly composed of a number of tutorials and educational resources for learning and doing data science. Have a look at the resources others are using and learning from.
Coursera, GitHub, IPython, Johns Hopkins, Open Source, Top 10
- Journey to Open Data Science, March 23 Webinar - Mar 15, 2016.
Learn how to drive collaboration and teamwork through open data science; mitigate legal risk through indemnification and appropriate package selection; bring advanced analytics to Excel-loving analysts with AnacondaXL.
Continuum Analytics, Data Science, Excel, Open Source, Python, R
- Top 10 Data Visualization Projects on Github - Feb 22, 2016.
Github provides a number of open source data visualization options for data scientists and application developers integrating quality visuals. This is a list and description of the top project offerings available, based on the number of stars.
D3.js, Data Visualization, GitHub, Matthew Mayo, Open Source, Top 10
- Opening Up Deep Learning For Everyone - Feb 19, 2016.
Opening deep learning up to everyone is a noble goal. But is it achievable? Should non-programmers and even non-technical people be able to implement deep neural models?
Caffe, Deep Learning, Feature Engineering, Open Source, TensorFlow
- Auto-Scaling scikit-learn with Spark - Feb 11, 2016.
Databricks gives us an overview of the spark-sklearn library, which automatically and seamlessly distributes model tuning on a Spark cluster, without impacting workflow.
Apache Spark, Databricks, Open Source, scikit-learn
- Spark 2015 Year In Review - Jan 15, 2016.
Apache Spark went through a lot in 2015. Get a solid review from Databricks, the steward organization founded by the creators of Spark and the drivers of its innovation.
Apache Spark, Databricks, Matei Zaharia, Open Source, Tungsten
- Top 10 Deep Learning Projects on Github - Jan 13, 2016.
The top 10 deep learning projects on Github include a number of libraries, frameworks, and education resources. Have a look at the tools others are using, and the resources they are learning from.
Caffe, Deep Learning, GitHub, Open Source, Top 10, Tutorials
- Top 10 Machine Learning Projects on Github - Dec 14, 2015.
The top 10 machine learning projects on Github include a number of libraries, frameworks, and education resources. Have a look at the tools others are using, and the resources they are learning from.
GitHub, Machine Learning, Matthew Mayo, Open Source, scikit-learn, Top 10
- R Style Ninjas: New Lifestyle Site for R Enthusiasts - Dec 11, 2015.
New R themed apparel site with several designs generated from R data visualizations. A portion of each purchase goes toward supporting R development.
Data Visualization, Open Source, Programming Languages, R, Startup
- Topological Data Analysis – Open Source Implementations - Nov 6, 2015.
Topological Data Analysis (TDA) is making waves in the analytics community lately, but are there open source options available?
C++, Java, Matthew Mayo, Open Source, Python, R, Topological Data Analysis
- H2O World 50% off for 24 hours only – Open Source Machine Learning - Sep 23, 2015.
Join machine learning industry leaders, H2O customers, and community in a day of H2O training and two days of talks. 50% OFF valid for 24 hours only.
CA, H2O, Hilary Mason, Machine Learning, Monica Rogati, Mountain View, Open Source, Robert Tibshirani
- YCML Machine Learning library on Github - Aug 24, 2015.
YCML is a new Machine Learning library available on Github as an Open Source (GPLv3) project. It can be used in iOS and OS X applications, and includes Machine Learning and optimization algorithms.
Backpropagation, GitHub, iOS, Machine Learning, Open Source, Optimization
- Interview: Stefan Groschupf, Datameer on Why Domain Expertise is More Important than Algorithms - Aug 6, 2015.
We discuss large-scale data architectures in 2020, career path, open source involvement, advice, and more.
Advice, Algorithms, Architecture, Career, Datameer, Domain Knowledge, Interview, Open Source, Stefan Groschupf
- Interview: Reiner Kappenberger, HP Security Voltage on How to Secure Data-in-Motion - Jul 9, 2015.
We discuss the security concerns in Big Data, the challenges of securing Big Data locally and in the cloud, and open source solutions – Knox and Ranger.
Challenges, Cloud, HP, HP Security Voltage, Interview, Open Source, Reiner Kappenberger, Security
- KDnuggets Interview: Amr Awadallah, CTO & Co-founder, Cloudera on the Secret Sauce of Open Source - Jul 2, 2015.
We discuss the critical success factor for open source projects, entrepreneurial lessons, advice, desired qualities in data scientists and more.
Amr Awadallah, Apache, Cloudera, Data Science Skills, Entrepreneur, Hadoop, Hiring, Interview, Open Source
- Interview: Joseph Babcock, Netflix on Genie, Lipstick, and Other In-house Developed Tools - Jun 16, 2015.
We discuss the role of analytics in content acquisition, the data architecture at Netflix, its organizational structure, and open-source tools from Netflix.
Data Science, ETL, In-house, Interview, Joseph Babcock, Netflix, Open Source, Tools
- Top KDnuggets tweets, Jun 2-8: Starting salaries for #DataScientists have gone north of $200,000 - Jun 9, 2015.
Starting salaries for #DataScientists have gone north of $200K; Top 20 #Python #MachineLearning #OpenSource Projects; Neural Networks and Deep Learning, free online book (draft); #Airbnb announces #Aerosolve, an #OpenSource #MachineLearning #software package.
AirBnB, Free ebook, Machine Learning, Open Source, Python, Salary
- Interview: James Taylor, Salesforce on Apache Phoenix – RDBMS for Big Data - Jun 5, 2015.
We discuss the beginning of the Phoenix project, the decision to make it open source, the relational database layer on HBase, and the key reasons for the superior performance of Apache Phoenix.
Apache Phoenix, HBase, Interview, James Taylor, Open Source, Performance, RDBMS, Salesforce
- Top 20 Python Machine Learning Open Source Projects - Jun 1, 2015.
We examine the top Python machine learning open source projects on GitHub, in terms of both contributors and commits, and identify the most popular and most active ones.
GitHub, Machine Learning, Open Source, Python, scikit-learn
- Open drives Boston Open Data Science Conference, May 30-31 - Apr 25, 2015.
Data science is built on transparency, effort, and the exchange of ideas. Join Open Data Science Conference, Boston, May 30-31, 2015.
Boston, Data Science, MA, ODSC, Open Source, Python, R, Sheamus McGovern
- Top /r/MachineLearning Posts, Apr 12-18: Andrew Ng AMA, Autoencoders, and Deep Learning Textbooks - Apr 23, 2015.
Andrew Ng's AMA, a probabilistic view of Autoencoders, open source sentiment analysis, deep learning textbooks, and Airbnb's host matching are all discussed this week on /r/MachineLearning.
AirBnB, Andrew Ng, Baidu, Deep Learning, Grant Marshall, Open Source, Reddit, Sentiment Analysis, Textbook
- Top KDnuggets tweets, Mar 19-22: Tensor methods for Machine Learning; Tibco survey: Big Data top use cases - Mar 23, 2015.
Tensor methods for #MachineLearning: fast, accurate, scalable, need open-source libs; #DataScience and Reproducibility: Explaining when the experiment does not work; Google #DeepLearning FaceNet is the best ever for recognizing faces; Tibco survey #BigData top use cases: Customer & Experience Analytics, Risk/Threat.
Big Data, Deep Learning, Face Recognition, Google, Open Source, Reproducibility, Tensor, Use Cases
- PredictionIO: Machine Learning Engineer (Evangelist) - Feb 26, 2015.
Are you passionate about machine learning and open source? Do you have the ability to engage other developers and data scientists? If yes, read on ...
API, CA, Machine Learning, Open Source, PredictionIO, San Francisco, Scala, USA
- PredictionIO: Machine Learning Evangelist - Feb 4, 2015.
Are you passionate about machine learning and open source? Do you have the ability to engage other developers and data scientists? If yes, read on ...
API, CA, Machine Learning, Open Source, PredictionIO, San Francisco, Scala, USA
- Top /r/MachineLearning posts, Jan 11-17 - Jan 18, 2015.
SVMs, open source datasets, Bayesian decision theory, game AI, and deep learning visualizations are all featured in the past week's top /r/MachineLearning posts.
AI, Bayesian, Datasets, Deep Learning, Games, Grant Marshall, Machine Learning, Open Source, Reddit, SVM, Visualization
- Top KDnuggets tweets, Dec 17-18: Why Amazon Ratings Might Mislead You; Open Source Tools for Machine Learning - Dec 19, 2014.
Why #Amazon Ratings Might Mislead You: The Story of Herding Effects; Open Source Tools for Machine Learning; #DeepLearning Intelligence Platform - Addressing AML #Terrorism #Financing; #NIPS2014 #MachineLearning Trends: Rapid progress in #DeepLearning.
Amazon, Deep Learning, NIPS, Open Source, Recommendations, Terrorism
- Open Source Tools for Machine Learning - Dec 17, 2014.
Open source machine learning software makes it easier to implement machine learning solutions on single computers and at scale, and the diversity of packages provides more options for implementers.
Free Data Mining Software, Free Software, Open Source, scikit-learn, Weka
- Open Source Big Data Analytics Platform - Dec 14, 2014.
Download IKANOW open source analytics platform for FREE and start analyzing structured and unstructured data sources. Great for cyber, social, and crisis use cases.
Big Data Analytics, Elasticsearch, IKANOW, MongoDB, Open Source
- Mode Playbook for Open Source Analytics - Dec 5, 2014.
Mode Analytics is open-sourcing their internal analysis and data visualizations which can be tailored to common data structures in SQL databases.
Churn, Mode Analytics, Open Data, Open Source, SQL
- SlamData Open Source Analytics Tool for MongoDB - Dec 4, 2014.
SlamData is an open source SQL-based tool designed to make accessing data in MongoDB easy for developers and non-developers alike with the goal of making application intelligence easier.
MongoDB, NoSQL, Open Source, SlamData, SQL
- KDnuggets Exclusive: Marten Mickos, SVP, HP on the Role of Open Source in Cloud industry - Nov 15, 2014.
In an exclusive interview with KDnuggets, Marten talks about HP’s Open Source strategy, the evolution of the Open Source production model, lessons from the success of Open Source on the Web, trends, and more.
Career, Cloud, Helion, HP, Interview, Marten Mickos, Open Source, OpenStack
- H2O World, Open Source Machine Learning Meeting, Nov 18-19, Mountain View - Oct 27, 2014.
H2O World (Nov 18-19, Mountain View) is where the users of the very popular Open Source Machine Learning Engine H2O gather to share their knowledge and know-how to build Smart Applications.
Deep Learning, H2O, Machine Learning, Mountain View-CA, Open Source, Python, R, Scala
- Book: Modern Optimization with R - Oct 10, 2014.
Learn the most relevant concepts related to modern optimization methods and how to apply them using multi-platform, open source, R tools in this new book on metaheuristics.
Book, Open Source, Optimization, Paulo Cortez, R, Springer
- Mirador, a free tool for visual exploration of complex datasets - Oct 1, 2014.
Mirador is an open-source tool for visual exploration of complex datasets, enabling users to discover correlation patterns and derive new hypotheses from the data. Download Windows and Mac OS X versions from Github.
Ben Fry, Data Visualization, GitHub, Mirador, Open Source
- Rattle package for Data Mining and Data Science in R - Sep 17, 2014.
Try the newly-released version of Rattle, the open source R package for data mining, and enjoy accessing a huge array of data mining algorithms through a convenient interface.
Data Mining Software, Free Software, Graham Williams, Open Source, R, Togaware
- Interview: Michael Berthold, President and Founder of KNIME, on Data Mining, Startups, and Visual Workflow - Aug 9, 2014.
We discuss KNIME's key features and how it compares to the competition, the KNIME business model, Pharma, planned development, and the transition from an academic project to a company.
Knime, Konstanz University, Michael Berthold, Open Source
- Interview: Sujee Maniyam, Elephant Scale on Why Open Source is So Important for Big Data - Aug 8, 2014.
We discuss the importance of contributing to Open Source, Big Data skills for business managers, Big Data predictions, key qualities sought in data engineers, career advice and more.
Advice, Big Data, Elephant Scale, Hadoop, Hiring, Open Source, Sujee Maniyam, Trends
- Interview: Sujee Maniyam, Elephant Scale on the Best Free Online Resources to Learn Hadoop - Aug 7, 2014.
We discuss the startup Elephant Scale, DIY Hadoop learning, the best free online resources for learning Hadoop, getting a good job in Big Data, and the experience of authoring the book Hadoop Illuminated (available for free).
Big Data, Certification, Elephant Scale, Free, GitHub, Hadoop, Hiring, Open Source, Skills, Sujee Maniyam
- Top KDnuggets tweets, Aug 1-3: Open Source Data Science Masters plan - Aug 4, 2014.
Open Source #DataScience Masters plan, with courses from Coursera, Stanford, edX; Book: Data Classification: Algorithms and Applications; Markov Chains, key #MachineLearning technique, nice visual explanation; Data Science with #Python: Part 1.
Classification, Data Science Education, Markov Chains, Master of Science, Open Source, Python
- BIDMach machine learning toolkit - Jul 14, 2014.
BIDMach machine learning toolkit offers "rooflined" (optimized to the limit) compute primitives and competitive performance on learning tasks like regression, clustering, classification, and matrix factorization.
Machine Learning, Open Source, Tools, UC Berkeley
- Interview: Ingo Mierswa, RapidMiner CEO on “Predaction” and Key Turning Points - Jun 27, 2014.
RapidMiner CEO Ingo Mierswa talks about "predaction", the reasons for RapidMiner's popularity, the business source model, analytics to investigate fraud, key turning points, and more.
Ajay Ohri, Fraud analytics, Ingo Mierswa, Open Source, RapidMiner
- The R User Conference, June 30 – July 3, Los Angeles - Jun 19, 2014.
The open source R language is a leading tool for data scientists. Attend useR! conference, the main annual event of the R community, June 30 - July 3, in Los Angeles.
Los Angeles-CA, Open Source, R
- DLib: Library for Machine Learning - Jun 10, 2014.
DLib is an open source C++ library implementing a variety of machine learning algorithms, including classification, regression, clustering, data transformation, and structured prediction.
C++, DLib, Machine Learning, Open Source, Tools
- OpenNN, An Open Source Library For Neural Networks - Jun 2, 2014.
OpenNN is an open source class library written in C++ which implements neural networks, and runs on Windows, Apple, or Linux.
Neural Networks, Open Source, OpenNN
- Big Data Landscape, v 3.0, analyzed - May 15, 2014.
We analyze the Big Data Landscape and identify the most popular market segments in Analytics, Infrastructure, Applications, Open Source, and Data Sources categories. It is still early - only 4.5% of companies had exits.
Big Data, Big Data Analytics, Data Platform, Infrastructure, Landscape, Open Source, Startups
- Oracle Academy – Teaching Students Around The World - Apr 15, 2014.
Oracle Academy teaches millions of students around the world and supports Oracle and open-source applications, with courses ranging from computer science for kids to Big Data education.
Bootcamp, Khan Academy, Online Education, Open Source, Oracle, Oracle Academy