- Data Engineering Technologies 2021 - Sep 21, 2021.
Emerging technologies supporting the field of data engineering are growing at a rapid clip. This curated list includes the most important offerings available in 2021.
Abacus.ai, Dask, Data Engineering, Databricks, Dataiku, DataRobot, dbt, Fivetran, Pachyderm
- Model Experiments, Tracking and Registration using MLflow on Databricks - Jan 5, 2021.
This post covers how StreamSets can help expedite operations at some of the most crucial stages of the Machine Learning Lifecycle and MLOps, and demonstrates integration with Databricks and MLflow.
Data Science, Databricks, DataOps, Experimentation, MLflow, MLOps, Modeling, StreamSets
- Working with Spark, Python or SQL on Azure Databricks - Aug 27, 2020.
Here we look at some ways to interchangeably work with Python, PySpark and SQL using Azure Databricks, an Apache Spark-based big data analytics service designed for data science and data engineering offered by Microsoft.
Apache Spark, Databricks, Microsoft Azure, Python, SQL
- Automating Security & Privacy Controls for Data Science & BI – Webinar - Jul 28, 2020.
Moving sensitive data to the Cloud introduces the possibility of exposing data teams to new levels of risk, making it challenging to manage and prepare sensitive data for data science and analytics. Join our live webinar, Automating Security & Privacy Controls for Data Science & BI, Aug 12 @ 1PM ET to learn how Immuta for Databricks enables you to maximize the value of your sensitive data.
Automation, Data Science, Databricks, Immuta, Privacy, Security, Webinar
- Leaders, Changes, and Trends in Gartner 2020 Magic Quadrant for Data Science and Machine Learning Platforms - Feb 24, 2020.
The Gartner 2020 Magic Quadrant for Data Science and Machine Learning Platforms has the largest number of leaders ever. We examine the leaders, changes, and trends vs. previous years.
Alteryx, Data Science Platform, Databricks, Dataiku, DataRobot, Domino, Gartner, Google, H2O, IBM, Knime, Machine Learning, Magic Quadrant, MathWorks, Microsoft Azure, RapidMiner, SAS, TIBCO
- How to solve 4 big problems in data science – eBook - Mar 26, 2019.
This eBook includes insights on how data scientists from 4 leading companies delivered impressive business results such as accelerating global inventory from 48 hours to 45 minutes and reducing operational cost of analytics infrastructure by 30%. Get the eBook now!
Data Science, Databricks, Deployment, ebook
- Scaling Big Data and AI – Spark + AI Summit 2019 - Mar 25, 2019.
Data and AI are all about scale. Databricks is bringing the Spark + AI Summit to San Francisco Apr 23-25. Check out the full list of sessions at Summit to see more exciting talks. Use code KDNuggets200 and get $200 off registration.
AI, Apache Spark, CA, Databricks, San Francisco
- [eBook] Standardizing the Machine Learning Lifecycle - Mar 15, 2019.
We explore what makes the machine learning lifecycle so challenging compared to regular software, and share the Databricks approach.
Databricks, ebook, Life Cycle, Machine Learning, MLflow
- [Webinar] Managing the Complete Machine Learning Lifecycle - Feb 28, 2019.
Join Databricks Mar 7, 2019, to learn how using MLflow can help you keep track of experiment runs and results across frameworks, execute projects remotely on a Databricks cluster, quickly reproduce your runs, and more. Sign up for this webinar now.
Databricks, Life Cycle, Machine Learning, Workflow
- How to solve 4 big problems in data science – eBook - Feb 15, 2019.
This eBook includes insights and learnings on how data scientists from four leading companies delivered impressive business results like accelerating global inventory from 48 hours to 45 minutes and reducing operational cost of analytics infrastructure by 30%. Get the eBook now!
Data Science, Databricks, ebook
- [Webinar] Accelerating Machine Learning on Databricks - Jan 9, 2019.
In this webinar, we will cover some of the latest innovations brought into the Databricks Unified Analytics Platform for Machine Learning.
Databricks, Deep Learning, Deployment, Machine Learning
- Spark + AI Summit: learn best practices in ML and DL, latest frameworks, and more – special KDnuggets offer - Dec 14, 2018.
Check the agenda for the Spark + AI Summit in San Francisco on April 23-25, 2019, comprising 12 technical tracks on data and AI across verticals, and get the biggest discount: $700 off until Dec 31.
AI, Apache Spark, CA, Databricks, San Francisco
- Four Real-Life Machine Learning Use Cases - Dec 13, 2018.
This ebook will walk you through four use cases for Machine Learning on Databricks, covering loan risk, advertising analytics and prediction, market basket analysis, suspicious behaviour identification in video, and more.
Databricks, ebook, Machine Learning, Use Cases
- A Machine Learning Deep Dive [Webinar, Dec 13] - Dec 11, 2018.
Learn how ShopRunner uses Databricks on AWS and Snowflake to tackle data science problems across personalization, recommendations, targeting, and analysis of text and images.
AWS, Databricks, Deployment, Machine Learning, Personalization
- [Download] Real-Life ML Examples + Notebooks - Nov 13, 2018.
In this eBook, we will walk you through four Machine Learning use cases on Databricks: Loan Risk Use Case; Advertising Analytics & Prediction Use Case; Market Basket Analysis Problem at Scale; Suspicious Behavior Identification in Video Use Case. Get your copy now!
Databricks, ebook, Jupyter, Machine Learning, Use Cases
- [ebook] Manipulating Data in Apache Spark - Oct 29, 2018.
In this ebook from Databricks, learn how DataFrames leverage the power of distributed processing through Spark, how to make big data processing easier for a wider audience, and more.
Apache Spark, Clustering, Databricks, ebook, Free ebook
- [Webinar] Neural Network Fundamentals - Oct 16, 2018.
In this webinar, Oct 25, 2018, 10:00 am PST, we will apply your convolutional neural network using the ImageNet scenario. We will also review some of the ImageNet architectures and how convolutions work.
Convolutional Neural Networks, Databricks, Neural Networks
- Struggling with AI? You’re not alone – Read the report - Sep 27, 2018.
Download this report from Databricks to understand how enterprises are adopting AI technology, the primary challenges holding enterprises back from seeing success with AI, and the benefits of taking a unified approach to data and AI.
AI, Databricks, Report
- ebook: Aggregating Data with Apache Spark™ - Sep 12, 2018.
Learn why cluster computing makes Spark the ideal processing engine for complex aggregations, the different types of aggregations that you can do with Spark, and more.
Apache Spark, Data Preparation, Databricks, ebook
- Project Hydrogen, new initiative based on Apache Spark to support AI and Data Science - Aug 16, 2018.
An introduction to Project Hydrogen: how it can assist machine learning and AI frameworks on Apache Spark and what distinguishes it from other open source projects.
AI, Apache Spark, Data Science, Databricks, Distributed Computing, Production
- ebook: Using Deep Learning to Solve Real-World Problems - Aug 14, 2018.
Read this eBook to learn how deep learning enables image classification, sentiment analysis, and other advanced analysis techniques, and get a starter workflow for building and training deep learning models.
AI, Databricks, Deep Learning, ebook
- [eBook] A Unified Approach to Analytics with Apache Spark - Jul 25, 2018.
How your data scientists and engineers can build models and data pipelines rapidly while collaborating with the business - download the ebook now.
Analytics, Apache Spark, Data Science, Databricks, ebook
- Manage your Machine Learning Lifecycle with MLflow – Part 1 - Jul 5, 2018.
Reproducibility, good management, and experiment tracking are necessary to make it easy to test others' work and analysis. In this first part, we start learning, with simple examples, how to record and query experiments and how to package Machine Learning models so they can be reproduced and run on any platform using MLflow.
Databricks, Life Cycle, MLflow, Pipeline, Python, Workflow
- [ebook] Apache Spark™ Under the Hood - Jun 27, 2018.
Learn how to install and run Spark yourself, get a summary of Spark's core architecture and concepts, and explore Spark's powerful language APIs and how you can use them.
Apache Spark, Databricks, ebook, PyTorch, R, scikit-learn, TensorFlow
- ebook: A Guide to Data Science at Scale - Jun 12, 2018.
Read our eBook to learn how easy it is to build and scale ML models with a unified analytics platform, how to collaborate across data teams to uncover insights faster, and more. Free download.
Data Science, Databricks, ebook, Scalability
- Mastering Advanced Analytics with Apache Spark - May 22, 2018.
Get this ebook, a collection of the most popular technical blog posts that introduce you to machine learning on Apache Spark and highlight many of the major developments around Spark MLlib and GraphX.
Advanced Analytics, Apache Spark, Databricks, Graph Analytics, Machine Learning, MLlib
- Spark + AI Summit, Top Speakers – Andreessen, Karpathy, Zaharia and more – KDnuggets Offer - May 11, 2018.
Join 4,000 of the top developers, data scientists, and business executives who will be tuning into the sessions and training at this year's Spark+AI Summit. Use code KDnuggets to save 30% when you register by May 18.
AI, Andrej Karpathy, CA, Databricks, Marc Andreessen, Matei Zaharia, San Francisco, Spark
- Deep Learning With Apache Spark: Part 1 - Apr 18, 2018.
The first part of a full discussion of how to do Distributed Deep Learning with Apache Spark. This part covers what Spark is, the basics of Spark+DL, and a little more.
Apache Spark, Databricks, Deep Learning, Pipeline
- [ebook] 7 Steps for a Developer to Learn Apache Spark - Apr 17, 2018.
We offer a step-by-step guide to technical content and related assets to help you learn Apache Spark, whether you're getting started with Spark or are an accomplished developer.
Apache Spark, Databricks, Developer, ebook, Spark SQL
- Solve Data Science Challenges Through Collaboration - Apr 3, 2018.
Get this eBook to learn the key issues that hamper fragmented data science teams, how to accelerate innovation via collaborative workspaces, and how top data science teams boosted productivity by up to 4x.
Collaboration, Data Science Team, Databricks, ebook
- Making Machine Learning Simple - Mar 20, 2018.
Learn how to build better models with support for multiple data sources and feature extraction at scale, simplify operations with on-demand cluster management, and more.
Apache Spark, Databricks, Feature Extraction, Machine Learning
- [eBook] Solving 4 Big Problems in Data Science - Mar 6, 2018.
Insights and tools from leading data science teams to accelerate results.
Apache Spark, Big Data, Cloud Computing, Databricks, Deployment, ebook
- The Data Scientist’s Guide to Apache Spark™ - Feb 16, 2018.
How data scientists can leverage Spark for advanced analytics.
Advanced Analytics, Apache Spark, Data Scientist, Databricks, Matei Zaharia
- Using Deep Learning to Solve Real World Problems - Jan 25, 2018.
Deep learning offers every company with large data new techniques to solve complex analytical problems. Read this ebook to learn more.
AI, Databricks, Deep Learning, ebook
- [eBook] A Gentle Introduction to Apache Spark™ - Nov 21, 2017.
If you are a developer or data scientist interested in big data, Spark is the tool for you. Download this ebook to learn why Spark is a popular choice for data analytics, what tools and features are available, and much more.
Apache Spark, Databricks, ebook, Free ebook
- Data Scientist Guide to Apache Spark - Oct 20, 2017.
Learn how data scientists can leverage Spark for advanced analytics with The Data Scientist’s Guide to Apache Spark, from Databricks!
Apache Spark, Data Science, Data Scientist, Databricks, Free ebook
- Spark – The Definitive Guide – exclusive preview - Sep 25, 2017.
Get an exclusive preview of "Spark: The Definitive Guide" from Databricks! Learn how Spark runs on a cluster, see examples in SQL, Python and Scala, learn about Structured Streaming and Machine Learning, and more.
Apache Spark, Databricks, Free ebook, Python, Scala, SQL
- Benchmarking Big Data SQL Platforms in the Cloud - Sep 21, 2017.
TPC-DS benchmarks demonstrate Databricks Runtime 3.0's superior performance. Sign up for a Databricks account to get the fastest performance.
Apache Spark, AWS, Benchmark, Cloud Computing, Databricks, Presto
- A Vision for Making Deep Learning Simple - Sep 5, 2017.
This post introduces Deep Learning Pipelines from Databricks, a new open-source library aimed at enabling everyone to easily integrate scalable deep learning into their workflows, from machine learning practitioners to business analysts.
Apache Spark, Databricks, Deep Learning, Hyperparameter
- Spark Summit Europe – Big Ideas About Big Data – KDnuggets Offer - Aug 24, 2017.
Spark Summit will bring together more than 1,200 developers, data scientists, analysts, researchers, and business pros from around the world. Register by Aug 25 to catch early bird rates and save an extra 15% with code KD824.
Apache Spark, Big Data, Databricks, Dublin, Summit
- Machine learning made simple with Apache Spark - Jun 15, 2017.
Powered by Apache Spark, Databricks provides an end-to-end platform designed to help data engineers and data scientists easily implement advanced analytics at scale. Download the Making Machine Learning Simple Whitepaper from Databricks to learn more.
Apache Spark, Databricks, White Paper
- How To Make Your Mark As A Woman In Big Data - Dec 3, 2016.
Despite the shift in big data technology innovation that is driving tremendous growth and opportunities, women still play a small role in this arena. Here are 5 thoughts for women considering a career in big data.
Advice, Big Data, Databricks, Women
- 7 Steps to Mastering Apache Spark 2.0 - Sep 16, 2016.
Looking for a comprehensive guide on going from zero to Apache Spark hero in steps? Look no further! Written by our friends at Databricks, this exclusive guide provides a solid foundation for those looking to master Apache Spark 2.0.
7 Steps, Apache Spark, Databricks
- Achieving End-to-end Security for Apache Spark with Databricks - Jun 23, 2016.
The Databricks just-in-time data platform takes a holistic approach to solving the enterprise security challenge by building all the facets of security — encryption, identity management, role-based access control, data governance, and compliance standards — natively into the data platform with DBES.
Apache Spark, Databricks, Security
- Apache Spark Key Terms, Explained - Jun 13, 2016.
An overview of 13 core Apache Spark concepts, presented with focus and clarity in mind. A great beginner's overview of essential Spark terminology.
Apache Spark, Databricks, Dataset, Explained, Key Terms, RDD, Tungsten
- Be Part of Spark Summit 2016, the Premier Big Data Event Dedicated to Apache Spark - May 25, 2016.
Whether you’re an Apache Spark newbie or a hardcore enthusiast, Spark Summit, June 6-8 in San Francisco, is the place to be to gain new insights and make valuable connections. Use promo code KDNuggets to save 15%.
Apache Spark, CA, Databricks, San Francisco
- Spark 2.0 Preview Now on Databricks Community Edition: Easier, Faster, Smarter - May 17, 2016.
The preview of Spark 2.0 is here, and it promises to be easier, faster, and smarter.
Apache Spark, Databricks, SQL
- Introducing GraphFrames, a Graph Processing Library for Apache Spark - Mar 7, 2016.
An overview of Spark's new GraphFrames, a graph processing library based on DataFrames, built in a collaboration between Databricks, UC Berkeley's AMPLab, and MIT.
Apache Spark, Databricks, Graph Analytics
- Top Spark Ecosystem Projects - Mar 2, 2016.
Apache Spark has developed a rich ecosystem, including both official and third party tools. We have a look at 5 third party projects which complement Spark in 5 different ways.
Apache Mesos, Apache Spark, Cassandra, Databricks, Distributed Systems
- Auto-Scaling scikit-learn with Spark - Feb 11, 2016.
Databricks gives us an overview of the spark-sklearn library, which automatically and seamlessly distributes model tuning on a Spark cluster, without impacting workflow.
Apache Spark, Databricks, Open Source, scikit-learn
- Spark 2015 Year In Review - Jan 15, 2016.
Apache Spark went through a lot in 2015. Get a solid review from Databricks, the steward organization founded by the creators of Spark and the drivers of its innovation.
Apache Spark, Databricks, Matei Zaharia, Open Source, Tungsten
- Spark Summit 2015 San Francisco – Day 2 Keynote Highlights - Jun 19, 2015.
Highlights from keynote speeches delivered by various eminent big data technology leaders from industry and academia at Spark Summit 2015 Conference held in San Francisco.
Apache Spark, AWS, Baidu, CIA, Cloudera, Databricks, Intel, Spark SQL, Toyota
- Spark Summit 2015 San Francisco – Day 1 Keynote Highlights - Jun 17, 2015.
Highlights from keynote speeches delivered by various eminent big data technology leaders from industry and academia at Spark Summit 2015 Conference held in San Francisco.
Apache Spark, Conference, Databricks, Highlights, Hortonworks, IBM, MapR, Matei Zaharia, NASA
- Exclusive Interview: Matei Zaharia, creator of Apache Spark, on Spark, Hadoop, Flink, and Big Data in 2020 - May 22, 2015.
Apache Spark is one of the hottest Big Data technologies in 2015. KDnuggets talks to Matei Zaharia, creator of Apache Spark, about key things to know about it, why it is not a replacement for Hadoop, how it is better than Flink, and his vision for Big Data in 2020.
Apache Spark, Big Data, Databricks, Flink, Hadoop, Matei Zaharia, MLlib, Spark SQL
- Strata + Hadoop World 2015 San Jose – Day 2 Highlights - Mar 10, 2015.
Strata + Hadoop World 2015 was a great conference, and here are key insights from some of the best sessions on day 2.
Anomaly Detection, Apache Spark, Cloudera, Databricks, Intel, Microsoft, Netflix, Strata, Trifacta
- Apache Spark: O’Reilly Certification, EU Training, University Program - Sep 26, 2014.
Recent news on Apache Spark includes developer certification from O'Reilly, upcoming training workshops in the EU by Databricks, and Spark tutorial events at major universities.
Academics, Apache Spark, Big Data, Certification, Databricks, Paco Nathan, Strata, Training
- July 2014 Analytics, Big Data, Data Mining Acquisitions and Startups Activity - Aug 7, 2014.
July 2014 acquisitions, startups, and company activity in Analytics, Big Data, Data Mining, and Data Science: Twitter buys Madbits, WalmartLabs buys Luvocracy, Zillow buys Trulia, Apple buys Booklamp, Yahoo buys Flurry, Salesforce buys RelateIQ, Couchbase, GE, Databricks, and more.
Acquisitions, Apple, Companies, Databricks, GE, Salesforce, Startups, Twitter, Yahoo, Zillow
- Top KDnuggets tweets, Jun 30 – Jul 1: Is “Data Scientist” more than “Data Analyst”? Good list of 41 Big Data Influencers - Jul 2, 2014.
Is "Data Scientist" more than "Data Analyst"?; 41 Big Data Influencers - Journalists, Public Sector, Industry, Academia; Alteryx and Databricks to lead development of Apache SparkR; Top data mining researcher @Jure Leskovec lecture on Webgraph structure.
Alteryx, Apache Spark, Big Data Influencers, Data Analyst, Data Scientist, Databricks
- Alpine Data expects faster, easier Data Science with Spark - Mar 18, 2014.
Alpine Data Labs becomes one of the first companies to be certified on Apache Spark, reportedly up to 100x faster than Hadoop. Alpine answers 3 questions from KDnuggets.
Alpine, Apache Spark, Collaborative, Databricks, Hadoop, Workflow