Search results for spark streaming

    Found 97 documents, 5944 searched:

  • Fast Big Data: Apache Flink vs Apache Spark for Streaming Data

    Real-time stream processing has been gaining momentum recently, and the major tools enabling it are Apache Spark and Apache Flink. Learn through a case study about data processing, data flow, and data management using these tools.

    https://www.kdnuggets.com/2015/11/fast-big-data-apache-flink-spark-streaming.html

  • Containerization of PySpark Using Kubernetes

    This article demonstrates the approach of how to use Spark on Kubernetes. It also includes a brief comparison between various cluster managers available for Spark.

    https://www.kdnuggets.com/2020/08/containerization-pyspark-kubernetes.html

  • Apache Spark on Dataproc vs. Google BigQuery

    This post looks at research undertaken to provide interactive business intelligence reports and visualizations for thousands of end users, in the hopes of addressing some of the challenges to architects and engineers looking at moving to Google Cloud Platform in selecting the best technology stack based on their requirements and to process large volumes of data in a cost effective yet reliable manner.

    https://www.kdnuggets.com/2020/07/apache-spark-dataproc-vs-google-bigquery.html

  • The Benefits & Examples of Using Apache Spark with PySpark

    Apache Spark runs fast, offers robust, distributed, fault-tolerant data objects, and integrates beautifully with the world of machine learning and graph analytics. Learn more here.

    https://www.kdnuggets.com/2020/04/benefits-apache-spark-pyspark.html

  • Learn how to use PySpark in under 5 minutes (Installation + Tutorial)

    Apache Spark is one of the hottest and largest open source projects in data processing, a framework with rich high-level APIs for programming languages like Scala, Python, Java and R. It realizes the potential of bringing together both Big Data and machine learning.

    https://www.kdnuggets.com/2019/08/learn-pyspark-installation-tutorial.html

  • Practical Apache Spark in 10 Minutes

    Check out this series of articles on Apache Spark. Each part is a 10 minute tutorial on a particular Apache Spark topic. Read on to get up to speed using Spark.

    https://www.kdnuggets.com/2019/01/practical-apache-spark-10-minutes.html

  • Apache Spark Introduction for Beginners

    An extensive introduction to Apache Spark, including a look at the evolution of the product, use cases, architecture, ecosystem components, core concepts and more.

    https://www.kdnuggets.com/2018/10/apache-spark-introduction-beginners.html

  • Introduction to Apache Spark

    This is the first blog in this series to analyze Big Data using Spark. It provides an introduction to Spark and its ecosystem.

    https://www.kdnuggets.com/2018/07/introduction-apache-spark.html

  • [ebook] 7 Steps for a Developer to Learn Apache Spark

    We offer a step-by-step guide to technical content and related assets to help you learn Apache Spark, whether you're getting started with Spark or are an accomplished developer.

    https://www.kdnuggets.com/2018/04/databricks-ebook-7-steps-learn-apache-spark.html

  • A powerful new IDE to build, test, and run Apache Spark applications on your desktop for free!

    Build enterprise-grade functionally rich Spark applications with the aid of an intuitive drag-and-drop user interface and a wide array of pre-built Spark operators.

    https://www.kdnuggets.com/2018/02/impetus-visual-spark-studio.html

  • Natural Language Processing Library for Apache Spark – free to use

    Introducing the Natural Language Processing Library for Apache Spark - and yes, you can actually use it for free! This post will give you a great overview of John Snow Labs NLP Library for Apache Spark.

    https://www.kdnuggets.com/2017/11/natural-language-processing-library-apache-spark.html

  • PySpark SQL Cheat Sheet: Big Data in Python

    PySpark is a Spark Python API that exposes the Spark programming model to Python - With it, you can speed up analytic applications. With Spark, you can get started with big data processing, as it has built-in modules for streaming, SQL, machine learning and graph processing.

    https://www.kdnuggets.com/2017/11/pyspark-sql-cheat-sheet-big-data-python.html

  • A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets

    In this blog, I explore three sets of APIs—RDDs, DataFrames, and Datasets—available in a pre-release preview of Apache Spark 2.0; why and when you should use each set; outline their performance and optimization benefits; and enumerate scenarios when to use DataFrames and Datasets instead of RDDs.

    https://www.kdnuggets.com/2017/08/three-apache-spark-apis-rdds-dataframes-datasets.html

  • Apache Spark Key Terms, Explained

    An overview of 13 core Apache Spark concepts, presented with focus and clarity in mind. A great beginner's overview of essential Spark terminology.

    https://www.kdnuggets.com/2016/06/spark-key-terms-explained.html

  • Top Spark Ecosystem Projects

    Apache Spark has developed a rich ecosystem, including both official and third party tools. We have a look at 5 third party projects which complement Spark in 5 different ways.

    https://www.kdnuggets.com/2016/03/top-spark-ecosystem-projects.html

  • Spark SQL for Real-Time Analytics

    Apache Spark is the hottest topic in Big Data. This tutorial discusses why Spark SQL is becoming the preferred method for real-time analytics and for the next frontier, IoT (Internet of Things).

    https://www.kdnuggets.com/2015/09/spark-sql-real-time-analytics.html

  • Exclusive Interview: Matei Zaharia, creator of Apache Spark, on Spark, Hadoop, Flink, and Big Data in 2020

    Apache Spark is one of the hottest Big Data technologies in 2015. KDnuggets talks to Matei Zaharia, creator of Apache Spark, about key things to know about it, why it is not a replacement for Hadoop, how it is better than Flink, and his vision for Big Data in 2020.

    https://www.kdnuggets.com/2015/05/interview-matei-zaharia-creator-apache-spark.html

  • Unify Batch and ML Systems with Feature/Training/Inference Pipelines

    A new way to do MLOps for your Data-ML-Product Teams.

    https://www.kdnuggets.com/2023/09/hopsworks-unify-batch-ml-systems-feature-training-inference-pipelines

  • A Tour of End-to-End Machine Learning Platforms

    An end-to-end machine learning platform needs a holistic approach. If you’re interested in learning more about a few well-known ML platforms, you’ve come to the right place!

    https://www.kdnuggets.com/2020/07/tour-end-to-end-machine-learning-platforms.html

  • The ravages of concept drift in stream learning applications and how to deal with it

    Stream data processing has gained progressive momentum with the arrival of new stream applications and big data scenarios. These data streams generally evolve over time and may occasionally be affected by a change (concept drift). Handling this change with detection and adaptation mechanisms is crucial in many real-world systems.

    https://www.kdnuggets.com/2019/12/ravages-concept-drift-stream-learning-applications.html

  • Everything a Data Scientist Should Know About Data Management

    For full-stack data science mastery, you must understand data management along with all the bells and whistles of machine learning. This high-level overview is a road map for the history and current state of the expansive options for data storage and infrastructure solutions.

    https://www.kdnuggets.com/2019/10/data-scientist-data-management.html

  • Overview of Different Approaches to Deploying Machine Learning Models in Production

    Learn the different methods for putting machine learning models into production, and to determine which method is best for which use case.

    https://www.kdnuggets.com/2019/06/approaches-deploying-machine-learning-production.html

  • Introduction to Trainspotting: Computer Vision, Caltrain, and Predictive Analytics

    We previously analyzed delays using Caltrain’s real-time API to improve arrival predictions, and we have modeled the sounds of passing trains to tell them apart. In this post we’ll start looking at the nuts and bolts of making our Caltrain work possible.

    https://www.kdnuggets.com/2016/11/introduction-trainspotting.html

  • The top 5 Big Data courses to help you break into the industry

    Here is an updated and in-depth review of top 5 providers of Big Data and Data Science courses: Simplilearn, Cloudera, Big Data University, Hortonworks, and Coursera

    https://www.kdnuggets.com/2016/08/simplilearn-5-big-data-courses.html

  • A simple approach to anomaly detection in periodic big data streams

    We describe a simple and scaling algorithm that can detect rare and potentially irregular behavior in a time series with periodic patterns. It performs similarly to Twitter's more complex approach.

    https://www.kdnuggets.com/2016/08/anomaly-detection-periodic-big-data-streams.html

  • Big Data Key Terms, Explained

    Just getting started with Big Data, or looking to iron out the wrinkles in your current understanding? Check out these 20 Big Data-related terms and their concise definitions.

    https://www.kdnuggets.com/2016/08/big-data-key-terms-explained.html

  • Deep Learning for Internet of Things Using H2O

    H2O is a feature-rich open source machine learning platform known for its R and Spark integration and its ease of use. This is an overview of using H2O deep learning for data science with the Internet of Things.

    https://www.kdnuggets.com/2016/04/deep-learning-iot-h2o.html

  • 7 Python Libraries Every Data Engineer Should Know

    Interested in switching to data engineering? Here’s a list of Python libraries you’ll find super helpful.

    https://www.kdnuggets.com/7-python-libraries-every-data-engineer-should-know

  • 7 Steps to Mastering Data Engineering

    The only data engineering roadmap you need for an introduction to concepts, tools, and techniques to collect, store, transform, analyze, and model data.

    https://www.kdnuggets.com/7-steps-to-mastering-data-engineering

  • A Data Lake, You Call It? It’s a Data Swamp

    How and why the data lake architecture often fails to meet its promises. And how better governance helps mitigate such challenges.

    https://www.kdnuggets.com/a-data-lake-you-call-it-it-a-data-swamp

  • The Only Free Course You Need To Become a Professional Data Engineer

    Data Engineering ZoomCamp offers free access to reading materials, video tutorials, assignments, homeworks, projects, and workshops.

    https://www.kdnuggets.com/the-only-free-course-you-need-to-become-a-professional-data-engineer

  • How Big Data Is Saving Lives in Real Time: IoV Data Analytics Helps Prevent Accidents

    This post talks about what needs to be taken care of in IoV data analysis, and shows the difference between a near real-time analytic platform and an actual real-time analytic platform with a real-world example.

    https://www.kdnuggets.com/how-big-data-is-saving-lives-in-real-time-iov-data-analytics-helps-prevent-accidents

  • Working with Big Data: Tools and Techniques

    Where do you start in a field as vast as big data? Which tools and techniques to use? We explore this and talk about the most common tools in big data.

    https://www.kdnuggets.com/working-with-big-data-tools-and-techniques

  • Data Engineering Landscape in the AI-Driven World

    Generative AI has just started to capture the imagination of data engineers, so the impact thus far has been just a fraction of what it will be a year or two from now.

    https://www.kdnuggets.com/2023/05/data-engineering-landscape-aidriven-world.html

  • Learn Data Engineering From These GitHub Repositories

    Kickstart your Data Engineering career with these curated GitHub repositories.

    https://www.kdnuggets.com/2023/02/learn-data-engineering-github-repositories.html

  • Top 38 Python Libraries for Data Science, Data Visualization & Machine Learning

    This article compiles the 38 top Python libraries for data science, data visualization & machine learning, as best determined by KDnuggets staff.

    https://www.kdnuggets.com/2020/11/top-python-libraries-data-science-data-visualization-machine-learning.html

  • 7 Essential Cheat Sheets for Data Engineering

    Learn about the data life cycle, PySpark, dbt, Kafka, BigQuery, Airflow, and Docker.

    https://www.kdnuggets.com/2022/12/7-essential-cheat-sheets-data-engineering.html

  • 3 Simple Ways to Speed Up Your Python Code

    The post explains three popular frameworks, PySpark, Dask, and Ray, and discusses various factors to select the most appropriate one for your project.

    https://www.kdnuggets.com/2022/10/3-simple-ways-speed-python-code.html

  • Everything You Need to Know About Data Lakehouses

    Learn everything you need to know about data lakehouses.

    https://www.kdnuggets.com/2022/09/everything-need-know-data-lakehouses.html

  • Machine Learning Metadata Store

    In this article, we will learn about metadata stores, the need for them, their components, and metadata store management.

    https://www.kdnuggets.com/2022/08/machine-learning-metadata-store.html

  • 10 Modern Data Engineering Tools

    Learn about the modern tools for data orchestration, data storage, analytical engineering, batch processing, and data streaming.

    https://www.kdnuggets.com/2022/07/10-modern-data-engineering-tools.html

  • Free Data Engineering Courses

    Get into the highly in-demand world of data engineering for free and earn a six-figure salary.

    https://www.kdnuggets.com/2022/05/free-data-engineering-courses.html

  • 15 Trending MLOps Talks You can Access for Free at ODSC East 2022

    Covering topics like workflows and full-stack machine learning, these are 15 free MLOps talks coming to #ODSCEast 2022 that you can see with a free Bronze Pass.

    https://www.kdnuggets.com/2022/04/odsc-15-trending-mlops-talks-access-free-odsc-east-2022.html

  • Feature Stores for Real-time AI & Machine Learning

    Real-time AI/ML is on the rise and feature stores are key to successfully deploying them. Read on to see how the choice of online store and the feature store architecture play important roles in determining its performance and cost.

    https://www.kdnuggets.com/2022/03/feature-stores-realtime-ai-machine-learning.html

  • Is the Modern Data Stack Leaving You Behind?

    The modern data stack narrative is largely dominated by analytics engineering. Where does that leave data engineers? Discover the difference between the MDS for data engineers & analytics engineers.

    https://www.kdnuggets.com/2021/11/modern-data-stack-leaving-behind.html

  • Data Engineering Technologies 2021

    Emerging technologies supporting the field of data engineering are growing at a rapid clip. This curated list includes the most important offerings available in 2021.

    https://www.kdnuggets.com/2021/09/data-engineering-technologies-2021.html

  • How to Use Kafka Connect to Create an Open Source Data Pipeline for Processing Real-Time Data

    This article shows you how to create a real-time data pipeline using only pure open source technologies. These include Kafka Connect, Apache Kafka, Kibana and more.

    https://www.kdnuggets.com/2021/07/kafka-open-source-data-pipeline-processing-real-time-data.html

  • Awesome list of datasets in 100+ categories

    With an estimated 44 zettabytes of data in existence in our digital world today and approximately 2.5 quintillion bytes of new data generated daily, there is a lot of data out there you could tap into for your data science projects. It's pretty hard to curate through such a massive universe of data, but this collection is a great start. Here, you can find data from cancer genomes to UFO reports, as well as years of air quality data to 200,000 jokes. Dive into this ocean of data to explore as you learn how to apply data science techniques or leverage your expertise to discover something new.

    https://www.kdnuggets.com/2021/05/awesome-list-datasets.html

  • Machine learning is going real-time

    Extracting immediate predictions from machine learning algorithms on the spot based on brand-new data can offer a next level of interaction and potential value to its consumers. The infrastructure and tech stack required to implement such real-time systems is also next level, and many organizations -- especially in the US -- seem to be resisting. But, what even is real-time ML, and how can it deliver a better experience?

    https://www.kdnuggets.com/2021/01/machine-learning-real-time.html

  • How to Get a Job as a Data Engineer

    Data engineering skills are currently in high demand. If you are looking for career prospects in this fast-growing profession, then these 10 skills and key factors will help you prepare to land an entry-level position in this field.

    https://www.kdnuggets.com/2021/01/get-job-as-data-engineer.html

  • Model Experiments, Tracking and Registration using MLflow on Databricks

    This post covers how StreamSets can help expedite operations at some of the most crucial stages of Machine Learning Lifecycle and MLOps, and demonstrates integration with Databricks and MLflow.

    https://www.kdnuggets.com/2021/01/model-experiments-tracking-registration-mlflow-databricks.html

  • Industry 2021 Predictions for AI, Analytics, Data Science, Machine Learning

    We bring you industry predictions from 12 innovative companies - what key trends they expect in 2021 in AI, Analytics, Data Science, and Machine Learning?

    https://www.kdnuggets.com/2020/12/industry-2021-predictions-ai-data-science-machine-learning.html

  • Introduction to Data Engineering

    The Q&A for the most frequently asked questions about Data Engineering: What does a data engineer do? What is a data pipeline? What is a data warehouse? How is a data engineer different from a data scientist? What skills and programming languages do you need to learn to become a data engineer?

    https://www.kdnuggets.com/2020/12/introduction-data-engineering.html

  • The Rise of the Machine Learning Engineer

    The evolution of Big Data into machine learning applications ushered in an exciting era of new roles and skillsets that became necessary to implement these technologies. With the Machine Learning Engineer being such a crucial component today, where the evolution of this field will take us tomorrow should be fascinating.

    https://www.kdnuggets.com/2020/11/rise-machine-learning-engineer.html

  • Skills to Build for Data Engineering

    This article jumps into the latest skill set observations in the Data Engineering Job Market which could definitely add a boost to your existing career or assist you in starting off your Data Engineering journey.

    https://www.kdnuggets.com/2020/06/skills-build-data-engineering.html

  • The 4 Hottest Trends in Data Science for 2020

    The field of Data Science is growing with new capabilities and reach into every industry. With digital transformations occurring in organizations around the world, 2019 included trends of more companies leveraging more data to make better decisions. Check out these next trends in Data Science expected to take off in 2020.

    https://www.kdnuggets.com/2019/12/4-hottest-trends-data-science-2020.html

  • Understanding Cloud Data Services

    Ready to move your systems to a cloud vendor or just learning more about big data services? This overview will help you understand big data system architectures, components, and offerings with an end-to-end taxonomy of what is available from the big three cloud providers.

    https://www.kdnuggets.com/2019/06/understanding-cloud-data-services.html

  • Data Science Jobs Report 2019: Python Way Up, TensorFlow Growing Rapidly, R Use Double SAS

    Data science jobs continue to grow in 2019, and this report shares the change and spread of jobs by software over recent years.

    https://www.kdnuggets.com/2019/06/data-science-jobs-report.html

  • Python leads the 11 top Data Science, Machine Learning platforms: Trends and Analysis

    Python continues to lead the top Data Science platforms, but R and RapidMiner hold their share; Almost 50% have used Deep Learning tools; SQL is steady; Consolidation continues.

    https://www.kdnuggets.com/2019/05/poll-top-data-science-machine-learning-platforms.html

  • Building Recommender systems with Azure Machine Learning service

    Microsoft has provided a GitHub repository with Python best practice examples to facilitate the building and evaluation of recommendation systems using Azure Machine Learning services.

    https://www.kdnuggets.com/2019/05/recommender-systems-azure-machine-learning.html

  • What’s Going to Happen this Year in the Data World

    "If we wish to foresee the future of mathematics, our proper course is to study the history and present condition of the science." Henri Poincaré.

    https://www.kdnuggets.com/2019/05/whats-going-happen-this-year-data-world.html

  • The Difference Between Data Scientists and Data Engineers

    ODSC East 2019 has multiple tracks for both Data Scientists and Data Engineers, including workshops, talks, and training sessions. Save 45% with code KDN45.

    https://www.kdnuggets.com/2019/03/odsc-difference-data-scientists-data-engineers.html

  • Top 10 Data Science Use Cases in Telecom

    In this article, we attempt to present the most relevant and efficient data science use cases in the field of telecommunication.

    https://www.kdnuggets.com/2019/02/top-10-data-science-use-cases-telecom.html

  • Gainers, Losers, and Trends in Gartner 2019 Magic Quadrant for Data Science and Machine Learning Platforms

    We compare Gartner 2019 MQ for Data Science, Machine Learning Platforms to its previous versions and identify notable changes for leaders and challengers, including RapidMiner, KNIME, TIBCO, Alteryx, Dataiku, SAS, and MathWorks.

    https://www.kdnuggets.com/2019/02/gartner-2019-mq-data-science-machine-learning-changes.html

  • Building AI to Build AI: The Project That Won the NeurIPS AutoML Challenge

    This is an overview of designing a computer program capable of developing predictive models without any manual intervention that are trained & evaluated in a lifelong machine learning setting in NeurIPS 2018 AutoML3 Challenge.

    https://www.kdnuggets.com/2019/01/building-ai-to-build-ai-neurips-automl-challenge.html

  • Comparison of the Top Speech Processing APIs

    There are two main tasks in speech processing. The first is to transform speech to text. The second is to convert text into human speech. We will describe the general aspects of each API and then compare their main features in a table.

    https://www.kdnuggets.com/2018/12/activewizards-comparison-speech-processing-apis.html

  • Best Machine Learning Languages, Data Visualization Tools, DL Frameworks, and Big Data Tools

    We cover a variety of topics, from machine learning to deep learning, from data visualization to data tools, with comments and explanations from experts in the relevant fields.

    https://www.kdnuggets.com/2018/12/machine-learning-data-visualization-deep-learning-tools.html

  • The Most in Demand Skills for Data Scientists

    Data scientists are expected to know a lot — machine learning, computer science, statistics, mathematics, data visualization, communication, and deep learning. How should data scientists who want to be in demand by employers spend their learning budget?

    https://www.kdnuggets.com/2018/11/most-demand-skills-data-scientists.html

  • Things you should know when traveling via the Big Data Engineering hype-train

    Maybe you want to join the Big Data world? Or maybe you are already there and want to validate your knowledge? Or maybe you just want to know what Big Data Engineers do and what skills they use? If so, you may find the following article quite useful.

    https://www.kdnuggets.com/2018/10/big-data-engineering-hype-train.html

  • DIY Deep Learning Projects

    Inspired by the great work of Akshay Bahadur, this article presents some projects applying Computer Vision and Deep Learning, with implementations and details so you can reproduce them on your computer.

    https://www.kdnuggets.com/2018/06/diy-deep-learning-projects.html

  • Event Processing: Three Important Open Problems

    This article summarizes the three most important problems to be solved in event processing. The facts in this article are supported by a recent survey and an analysis conducted on the industry trends.

    https://www.kdnuggets.com/2018/05/event-processing-important-open-problems.html

  • Python eats away at R: Top Software for Analytics, Data Science, Machine Learning in 2018: Trends and Analysis

    Python continues to eat away at R, RapidMiner gains, SQL is steady, Tensorflow advances pulling along Keras, Hadoop drops, Data Science platforms consolidate, and more.

    https://www.kdnuggets.com/2018/05/poll-tools-analytics-data-science-machine-learning-results.html

  • Ranking Popular Distributed Computing Packages for Data Science

    We examined 140 frameworks and distributed programming packages and came up with a list of the top 20 distributed computing packages useful for Data Science, based on a combination of GitHub, Stack Overflow, and Google results.

    https://www.kdnuggets.com/2018/03/top-distributed-computing-packages-data-science.html

  • Gainers and Losers in Gartner 2018 Magic Quadrant for Data Science and Machine Learning Platforms

    We compare Gartner 2018 Magic Quadrant for Data Science, Machine Learning Platforms vs its 2017 version and identify notable changes for leaders and challengers, including IBM, SAS, RapidMiner, KNIME, Alteryx, H2O.ai, and Domino.

    https://www.kdnuggets.com/2018/02/gartner-2018-mq-data-science-machine-learning-changes.html

  • Big Data: Main Developments in 2017 and Key Trends in 2018

    As we bid farewell to one year and look to ring in another, KDnuggets has solicited opinions from numerous Big Data experts as to the most important developments of 2017 and their 2018 key trend predictions.

    https://www.kdnuggets.com/2017/12/big-data-main-developments-2017-key-trends-2018.html

  • 277 Data Science Key Terms, Explained

    This is a collection of 277 data science key terms, explained with a no-nonsense, concise approach. Read on to find terminology related to Big Data, machine learning, natural language processing, descriptive statistics, and much more.

    https://www.kdnuggets.com/2017/09/data-science-key-terms-explained.html

  • Making Sense of Machine Learning

    Broadly speaking, machine learners are computer algorithms designed for pattern recognition, curve fitting, classification and clustering. The word learning in the term stems from the ability to learn from data.

    https://www.kdnuggets.com/2017/06/making-sense-machine-learning.html

  • Data Science for Newbies: An Introductory Tutorial Series for Software Engineers

    This post summarizes and links to the individual tutorials which make up this introductory look at data science for newbies, mainly focusing on the tools, with a practical bent, written by a software engineer from the perspective of a software engineering approach.

    https://www.kdnuggets.com/2017/05/data-science-tutorial-series-software-engineers.html

  • Data science platforms are on the rise and IBM is leading the way

    Download the 2017 Gartner Magic Quadrant for Data Science Platforms today to learn why IBM is named a leader in data science and to find out why data science, analytics, and machine learning are the engines of the future.

    https://www.kdnuggets.com/2017/05/ibm-data-science-platforms-gartner.html

  • Continuous improvement for IoT through AI / Continuous learning

    In reality, especially for IoT, an analytics model does not keep giving results with the same accuracy forever once it is built. Data patterns change over time, which makes it essential to learn from new data and improve or recalibrate the models to get correct results. This article explains this phenomenon of continuous improvement in analytics for IoT.

    https://www.kdnuggets.com/2016/11/continuous-improvement-iot-ai-learning.html

  • Evaluating HTAP Databases for Machine Learning Applications

    Businesses are producing a greater number of intelligent applications, which traditional databases are unable to support. A new class of databases, Hybrid Transactional and Analytical Processing (HTAP) databases, offers a variety of capabilities with specific strengths and weaknesses to consider. This article aims to give application developers and data scientists a better understanding of the HTAP database ecosystem so they can make the right choice for their intelligent application.

    https://www.kdnuggets.com/2016/11/evaluating-htap-databases-machine-learning-applications.html

  • Big Data Science: Expectation vs. Reality

    The path to success and happiness for a data science team working on a big data project is not always clear from the beginning. It depends on the maturity of the underlying platform, the team's cross skills, and the DevOps process around their day-to-day operations.

    https://www.kdnuggets.com/2016/10/big-data-science-expectation-reality.html

  • The Big Data Ecosystem is Too Damn Big

    The Big Data ecosystem is just too damn big! It's complex, redundant, and confusing. There are too many layers in the technology stack, too many standards, and too many engines. Vendors? Too many. What is the user to do?

    https://www.kdnuggets.com/2016/06/big-data-ecosystem-too-damn-big.html

  • 5 Reasons Machine Learning Applications Need a Better Lambda Architecture

    The Lambda Architecture enables continuous processing of real-time data. It is a painful process that gets the job done, but at a great cost. Here is a simplified solution called Lambda-R (ƛ-R), for the Relational Lambda.

    https://www.kdnuggets.com/2016/05/5-reasons-machine-learning-applications-lambda-architecture.html

  • Top Big Data Processing Frameworks

    A discussion of 5 Big Data processing frameworks: Hadoop, Spark, Flink, Storm, and Samza. An overview of each is given and comparative insights are provided, along with links to external resources on particular related topics.

    https://www.kdnuggets.com/2016/03/top-big-data-processing-frameworks.html

  • Data Lake Plumbers: Operationalizing the Data Lake

    Gain insight into data lakes, their benefits, when they are appropriate, and how to operationalize them. How do they compare to the data warehouse?

    https://www.kdnuggets.com/2016/02/data-lakes-plumbers-operationalizing.html

  • Gartner 2016 Magic Quadrant for Advanced Analytics Platforms: gainers and losers (2016 Silver Blog)

    We compare the Gartner 2016 Magic Quadrant for Advanced Analytics Platforms with its 2015 version and identify notable changes for leaders and challengers: SAS, IBM, RapidMiner, KNIME, Dell, Angoss, and Microsoft.

    https://www.kdnuggets.com/2016/02/gartner-2016-mq-analytics-platforms-gainers-losers.html

  • Getting started with Python and Apache Flink

    Apache Flink is built on top of a distributed streaming dataflow architecture, which helps it crunch high-velocity, high-volume data sets. With version 1.0 it provides a Python API; learn how to write a simple Flink application in Python.

    https://www.kdnuggets.com/2015/11/getting-started-python-apache-flink.html

  • What No One Tells You About Real-Time Machine Learning

    Real-time machine learning has access to a continuous flow of transactional data, but what it really needs in order to be effective is a continuous flow of labeled transactional data, and accurate labeling introduces latency.

    https://www.kdnuggets.com/2015/11/petrov-real-time-machine-learning.html

  • 5 Best Machine Learning APIs for Data Science

    Machine Learning APIs make it easy for developers to develop predictive applications. Here we review 5 important Machine Learning APIs: IBM Watson, Microsoft Azure Machine Learning, Google Prediction API, Amazon Machine Learning API, and BigML.

    https://www.kdnuggets.com/2015/11/machine-learning-apis-data-science.html

  • Interview: Arno Candel, H2O.ai on the Basics of Deep Learning to Get You Started

    We discuss how Deep Learning differs from other Machine Learning methods, the unique characteristics and benefits of Deep Learning, and the key components of the H2O architecture.

    https://www.kdnuggets.com/2015/01/interview-arno-candel-0xdata-deep-learning.html

  • R and Hadoop make Machine Learning Possible for Everyone

    R and Hadoop make machine learning approachable enough for inexperienced users to begin analyzing and visualizing interesting data to start down the path in this lucrative field.

    https://www.kdnuggets.com/2014/11/r-hadoop-make-machine-learning-possible-everyone.html

  • KDnuggets™ News 14:n30, Nov 19

    Features | Software | Opinions | Interviews | Reports | News | Webcasts | Courses | Meetings | Jobs | Academic | Publications | Tweets

    https://www.kdnuggets.com/2014/n30.html

  • KDnuggets™ News 14:n27, Oct 22

    Features | Software | Opinions | Interviews | Reports | News | Webcasts | Courses | Meetings | Jobs | Academic | Publications | Tweets

    https://www.kdnuggets.com/2014/n27.html

  • KDnuggets™ News 14:n09, Apr 16

    Features (6) | Opinions (4) | Software (2) | News (6) | Webcasts (2) | Courses (4) | Meetings (3) | Jobs (9) | Academic

    https://www.kdnuggets.com/2014/n09.html

  • Software Suites/Platforms for Analytics, Data Mining, Data Science, and Machine Learning

    https://www.kdnuggets.com/software/suites.html

  • Consulting Companies in AI, Analytics, Data Science, and Machine Learning

    https://www.kdnuggets.com/companies/consulting.html
