Search results for Research
-
Paradoxes in Data Science
Have a look into some of the main paradoxes associated with Data Science and its statistical foundations.https://www.kdnuggets.com/2021/09/paradoxes-data-science.html
-
What 2 years of self-teaching data science taught me
Many of us self-learn data science from the very beginning. While continuing to self-learn on demand is crucial, especially after you become a professional, there can be many pitfalls early on for learning the wrong way or missing out on key ideas that are important for the real-world application of data science.https://www.kdnuggets.com/2021/09/2-years-self-teaching-data-science.html
-
Introducing TensorFlow Similarity
TensorFlow Similarity is a newly-released library from Google that facilitates the training, indexing and querying of similarity models. Check out more here.https://www.kdnuggets.com/2021/09/introducing-tensorflow-similarity.html
-
What Is The Real Difference Between Data Engineers and Data Scientists?
To launch your data career, you’ll need both theoretical knowledge and applied skills. Bootcamp programs like Springboard’s Data Science Career Track and Data Engineering Career Track can help make you job-ready through hands-on, project-based learning and one-on-one mentorship. Wondering which data career path is right for you? Read on to find out.https://www.kdnuggets.com/2021/09/springboard-difference-data-engineers-data-scientists.html
-
The Machine & Deep Learning Compendium Open Book
After years in the making, this extensive and comprehensive ebook resource is now available and open for data scientists and ML engineers. Learn from and contribute to this tome of valuable information to support all your work in data science from engineering to strategy to management.https://www.kdnuggets.com/2021/09/machine-deep-learning-open-book.html
-
Introduction to Automated Machine Learning
AutoML enables developers with limited ML expertise (and coding experience) to train high-quality models specific to their business needs. For this article, we will focus on AutoML systems which cater to everyday business and technology applications.https://www.kdnuggets.com/2021/09/introduction-automated-machine-learning.html
-
How Many AI Neurons Does It Take to Simulate a Brain Neuron?
New research shows some shocking answers to that question.https://www.kdnuggets.com/2021/09/ai-neurons-simulate-brain-neuron.html
-
Text Preprocessing Methods for Deep Learning
While the preprocessing pipeline we are focusing on in this post is mainly centered around Deep Learning, most of it will also be applicable to conventional machine learning models.https://www.kdnuggets.com/2021/09/text-preprocessing-methods-deep-learning.html
-
8 Deep Learning Project Ideas for Beginners
Have you studied Deep Learning techniques, but never worked on a useful project? Here, we highlight eight deep learning project ideas for beginners that will help you sharpen your skills and boost your resume.https://www.kdnuggets.com/2021/09/8-deep-learning-project-ideas-beginners.html
-
7 Differences Between a Data Analyst and a Data Scientist
This article discusses the 7 key differences between data analysts and data scientists with an aim to help potential data analysts/scientists determine which is the right one for them. I touch on day-to-day tasks, skill requirements, typical career progression, and salary and career prospects for both.https://www.kdnuggets.com/2021/09/7-differences-between-data-analyst-data-scientist.html
-
Math 2.0: The Fundamental Importance of Machine Learning
Machine learning is not just another way to program computers; it represents a fundamental shift in the way we understand the world. It is Math 2.0.https://www.kdnuggets.com/2021/09/math-fundamental-importance-machine-learning.html
-
How Machine Learning Leverages Linear Algebra to Solve Data Problems
Why you should learn the fundamentals of linear algebra.https://www.kdnuggets.com/2021/09/machine-learning-leverages-linear-algebra-solve-data-problems.html
-
ebook: Learn Data Science with R – free download
Check out this new book for data science beginners with many practical examples that covers statistics, R, graphing, and machine learning. As a source to learn the full breadth of data science foundations, "Learn Data Science with R" starts at the beginner level and gradually progresses into expert content.https://www.kdnuggets.com/2021/09/ebook-learn-data-science-r.html
-
Fast AutoML with FLAML + Ray Tune
Microsoft Researchers have developed FLAML (Fast Lightweight AutoML) which can now utilize Ray Tune for distributed hyperparameter tuning to scale up FLAML’s resource-efficient & easily parallelizable algorithms across a cluster.https://www.kdnuggets.com/2021/09/fast-automl-flaml-ray-tune.html
-
Five Key Facts About Wu Dao 2.0: The Largest Transformer Model Ever Built
The record-setting model combines some clever research and engineering methods.https://www.kdnuggets.com/2021/09/five-key-facts-wu-dao-largest-transformer-model.html
-
How to solve machine learning problems in the real world
Is becoming a machine learning engineer pro your goal? Sure, online ML courses and Kaggle-style competitions are great resources to learn the basics. However, the daily job of an ML engineer requires an additional layer of skills that you won’t master through these approaches.https://www.kdnuggets.com/2021/09/solve-machine-learning-problems-real-world.html
-
Best Resources to Learn Natural Language Processing in 2021
In this article, the author has listed the best resources to learn natural language processing, including online courses, tutorials, books, and YouTube videos.https://www.kdnuggets.com/2021/09/best-resources-learn-natural-language-processing-2021.html
-
How is Machine Learning Beneficial in Mobile App Development?
Mobile app developers have a lot to gain by implementing AI & Machine Learning from the revolutionary changes that these disruptive technologies can offer. This is due to AI and ML's potential to strengthen mobile applications, providing for smoother user experiences capable of leveraging powerful features.https://www.kdnuggets.com/2021/09/machine-learning-beneficial-mobile-app-development.html
-
Multilabel Document Categorization, step by step example
This detailed guide explores an unsupervised and supervised learning two-stage approach with LDA and BERT to develop a domain-specific document categorizer on unlabeled documents.https://www.kdnuggets.com/2021/08/multilabel-document-categorization.html
-
3 Data Acquisition, Annotation, and Augmentation Tools
Check out these 3 projects found around GitHub that can help with your data acquisition, annotation, and augmentation tasks.https://www.kdnuggets.com/2021/08/3-data-labeling-synthesizing-augmentation-tools.html
-
The Significance of Data-centric AI
How a systematic way of maintaining data quality can do wonders to your model performance.https://www.kdnuggets.com/2021/08/significance-data-centric-ai.html
-
11 Best Data Science Education Platforms
We cover the 11 best Data Science Education platforms for 11 different use cases, ranging from specific languages to hands-on learners, to the best free option.https://www.kdnuggets.com/2021/08/11-best-data-science-education-platforms.html
-
Essential Features of An Efficient Data Integration Solution
This blog highlights the essential features of a data integration solution that help an organization generate consistent and accurate data to keep the business running smoothly.https://www.kdnuggets.com/2021/08/essential-features-efficient-data-integration-solution.html
-
Learning Data Science and Machine Learning: First Steps After The Roadmap
Just getting into learning data science may seem as daunting as (if not more than) trying to land your first job in the field. With so many options and resources online and in traditional academia to consider, these pre-requisites and pre-work are recommended before diving deep into data science and AI/ML.https://www.kdnuggets.com/2021/08/learn-data-science-machine-learning.html
-
7 reasons you should get a formal degree in Data Science
So many options are now available online to learn in the field of data science. There are several factors to consider to determine if these options or a traditional degree from an academic institution is the best approach for your personal learning style and career aspirations.https://www.kdnuggets.com/2021/08/7-reasons-degree-data-science.html
-
5 Data Science Career Mistakes To Avoid
Everyone makes mistakes, which can be a good thing when they lead to learning and improvements over time. But, we can also try to first learn from others to expedite our personal growth. To get started, consider these lessons learned the hard way, so you don’t have to.https://www.kdnuggets.com/2021/08/5-data-science-career-mistakes-avoid.html
-
15 Things I Look for in Data Science Candidates
This article presents advice for anyone looking or hiring for data science jobs, written by someone with practical and useful insight.https://www.kdnuggets.com/2021/08/15-things-data-science-candidates.html
-
Amazon Web Services Webinar: Accelerating clinical trial and biomedical development processes with healthcare data
Join this webinar on August 27 to learn how to leverage external healthcare datasets to make faster decisions with greater accuracy – accelerating biomedical development and improving patient welfare.https://www.kdnuggets.com/2021/08/aws-webinar-clinical-trial-biomedical-development-healthcare.html
-
Open Source Datasets for Computer Vision
Access to high-quality, noise-free, large-scale datasets is crucial for training complex deep neural network models for computer vision applications. Many open-source datasets are developed for use in image classification, pose estimation, image captioning, autonomous driving, and object segmentation. These datasets must be paired with the appropriate hardware and benchmarking strategies to optimize performance.https://www.kdnuggets.com/2021/08/open-source-datasets-computer-vision.html
-
Data Scientist’s Guide to Efficient Coding in Python
Read this fantastic collection of tips and tricks the author uses for writing clean code on a day-to-day basis.https://www.kdnuggets.com/2021/08/data-scientist-guide-efficient-coding-python.html
-
What I Learned From “Women in Data Science” Conferences
Read the author's perspective after attending 3 "Women in Data Science" conferences.https://www.kdnuggets.com/2021/08/learned-women-data-science-conferences.html
-
Agile Data Labeling: What it is and why you need it
The notion of Agile in software development has made waves across industries with its revolution for productivity. Can the same benefits be applied to the often arduous task of annotating data sets for machine learning?https://www.kdnuggets.com/2021/08/agile-data-labeling.html
-
MLOps And Machine Learning Roadmap
A 16–20 week roadmap to review machine learning and learn MLOps.https://www.kdnuggets.com/2021/08/mlops-machine-learning-roadmap.html
-
2021 State of Production Machine Learning Survey
We invite you to take the 2021 State of Production Machine Learning survey and help shed light on the latest trends in the adoption of machine learning (ML) in the industry.https://www.kdnuggets.com/2021/08/anyscale-2021-state-production-machine-learning-survey.html
-
The Difference Between Data Scientists and ML Engineers
What's the difference? Responsibilities, expertise, and salary expectations.https://www.kdnuggets.com/2021/08/difference-between-data-scientists-ml-engineers.html
-
DeepMind’s New Super Model: Perceiver IO is a Transformer that can Handle Any Dataset
The new transformer-based architecture can process audio, video and images using a single model.https://www.kdnuggets.com/2021/08/deepmind-new-super-model-perceiver-io-transformer.html
-
Visualizing Bias-Variance
In this article, we'll explore some different perspectives of what the bias-variance trade-off really means with the help of visualizations.https://www.kdnuggets.com/2021/08/visualizing-bias-variance.html
-
Using Twitter to Understand Pizza Delivery Apprehension During COVID
Analyzing customer sentiment and capturing any specific differences in emotion around ordering Dominos pizza in India during lockdown.https://www.kdnuggets.com/2021/08/twitter-understand-pizza-delivery-covid.html
-
Essential Math for Data Science: Introduction to Systems of Linear Equations
In this post, you’ll see how you can use systems of equations and linear algebra to solve a linear regression problem.https://www.kdnuggets.com/2021/08/essential-math-data-science-introduction-systems-linear-equations.html
-
Most Common Data Science Interview Questions and Answers
After analyzing 900+ data science interview questions from companies over the past few years, the most common data science interview question categories are reviewed in this guide, each explained with an example.https://www.kdnuggets.com/2021/08/common-data-science-interview-questions-answers.html
-
Artificial Intelligence vs Machine Learning in Cybersecurity
Artificial Intelligence and Machine Learning are next-gen technologies used in various fields. With the rise in online threats, it has become essential to include these technologies in cybersecurity. In this post, we will look at the roles AI and ML play in cybersecurity.https://www.kdnuggets.com/2021/08/artificial-intelligence-machine-learning-cybersecurity.html
-
How Visualization is Transforming Exploratory Data Analysis
Data analysts are dealing with bigger datasets than ever before, making interrogation difficult. Visualized Exploratory Data Analysis, supported by advanced parallel computing, promises an answer.https://www.kdnuggets.com/2021/08/visualization-transforming-exploratory-data-analysis.html
-
Free dataset worth $1350 to test the accent gap!
With so many accent variations, how do speech and voice technologies keep up? In a few words: accented speech training data, representative of diverse groups of people. The more people your model can understand, the more likely you are to acquire and retain customers.https://www.kdnuggets.com/2021/08/definedcrowd-free-dataset-accent-gap.html
-
30 Most Asked Machine Learning Questions Answered
There is always a lot to learn in machine learning. Whether you are new to the field or a seasoned practitioner and ready for a refresher, understanding these key concepts will keep your skills honed in the right direction.https://www.kdnuggets.com/2021/08/30-machine-learning-questions-answered.html
-
How To 2x Your Data Analytics Consulting Rates (Overnight)
Looking to up your data analytics consulting rates? Learn exactly what most freelancers are charging, and the rates you SHOULD be charging as a business intelligence and analytics consultant. This post will show you what you need to know to achieve maximum results for your data consulting career.https://www.kdnuggets.com/2021/08/2x-data-analytics-consulting-rates-overnight.html
-
GPU-Powered Data Science (NOT Deep Learning) with RAPIDS
How to utilize the power of your GPU for regular data science and machine learning even if you do not do a lot of deep learning work.https://www.kdnuggets.com/2021/08/gpu-powered-data-science-deep-learning-rapids.html
-
Towards a Responsible and Ethical AI
It is not the technology at fault, but the intention.https://www.kdnuggets.com/2021/07/towards-responsible-ethical-ai.html
-
Data Monetization 101
The evolving marketplace of data now includes many firms that support a variety of needs from organizations looking to grow with data. This listing of the key players categorized by target market provides an interesting picture of this exciting industry sector.https://www.kdnuggets.com/2021/07/data-monetization-101.html
-
10 Machine Learning Model Training Mistakes
These common ML model training mistakes are easy to overlook but costly to redeem.https://www.kdnuggets.com/2021/07/10-machine-learning-model-training-mistakes.html
-
A Brief Introduction to the Concept of Data
Every aspiring data scientist must know the concept of data and the kind of analysis they can run. This article introduces the concept of data (quantitative and qualitative) and the types of analysis.https://www.kdnuggets.com/2021/07/brief-introduction-concept-data.html
-
Machine Learning Skills – Update Yours This Summer
The process of mastering new knowledge often requires multiple passes to ensure the information is deeply understood. If you already began your journey into machine learning and data science, then you are likely ready for a refresher on topics you previously covered. This eight-week self-learning path will help you recapture the foundations and prepare you for future success in applying these skills.https://www.kdnuggets.com/2021/07/update-your-machine-learning-skills.html
-
Facebook Open Sources a Chatbot That Can Discuss Any Topic
The new version expands the capabilities of its predecessor building a much more natural conversational experience.https://www.kdnuggets.com/2021/07/facebook-open-sources-chatbot-discuss-any-topic.html
-
Not Only for Deep Learning: How GPUs Accelerate Data Science & Data Analytics
Modern AI/ML systems’ success has been critically dependent on their ability to process massive amounts of raw data in a parallel fashion using task-optimized hardware. Can we leverage the power of GPU and distributed computing for regular data processing jobs too?https://www.kdnuggets.com/2021/07/deep-learning-gpu-accelerate-data-science-data-analytics.html
-
5 Mistakes I Wish I Had Avoided in My Data Science Career
Everyone makes mistakes, which can be a good thing when they lead to learning and improvements over time. But, we can also try to first learn from others to expedite our personal growth. To get started, consider these lessons learned the hard way, so you don’t have to.https://www.kdnuggets.com/2021/07/5-mistakes-data-science-career.html
-
Full cross-validation and generating learning curves for time-series models
Standard cross-validation on time series data is not possible because the data model is sequential, which does not lend itself well to splitting the data into statistically useful training and validation sets. However, a new approach called Reconstructive Cross-validation may pave the way toward performing this type of important analysis for predictive models with temporal datasets.https://www.kdnuggets.com/2021/07/full-cross-validation-learning-curves-time-series.html
-
ColabCode: Deploying Machine Learning Models From Google Colab
New to ColabCode? Learn how to use it to start a VS Code Server, Jupyter Lab, or FastAPI.https://www.kdnuggets.com/2021/07/colabcode-deploying-machine-learning-models-google-colab.html
-
The Best SOTA NLP Course is Free!
Hugging Face has recently released a course on using its libraries and ecosystem for practical NLP, and it appears to be very comprehensive. Have a look for yourself.https://www.kdnuggets.com/2021/07/best-sota-nlp-course-free.html
-
WHT: A Simpler Version of the fast Fourier Transform (FFT) you should know
The fast Walsh Hadamard transform is a simple and useful algorithm for machine learning that was popular in the 1960s and early 1970s. This useful approach should be more widely appreciated and applied for its efficiency.https://www.kdnuggets.com/2021/07/wht-simpler-fast-fourier-transform-fft.html
-
How to Create Unbiased Machine Learning Models
In this post we discuss the concepts of bias and fairness in the Machine Learning world, and show how ML biases often reflect existing biases in society. Additionally, we discuss various methods for testing and enforcing fairness in ML models.https://www.kdnuggets.com/2021/07/create-unbiased-machine-learning-models.html
-
High-Performance Deep Learning: How to train smaller, faster, and better models – Part 5
Training efficient deep learning models with any software tool is nothing without an infrastructure of robust and performant compute power. Here, current software and hardware ecosystems are reviewed that you might consider in your development when the highest performance possible is needed.https://www.kdnuggets.com/2021/07/high-performance-deep-learning-part5.html
-
AWS Webinar: How are data-driven companies using ESG and sustainability data to make actionable decisions?
In this virtual session, on Jul 29 @ 11AM PT, 2PM ET, our panel of experts will uncover how companies across several verticals use ESG data to move beyond the reporting benchmark, deepen business insights, and create competitive differentiation.https://www.kdnuggets.com/2021/07/roidna-aws-webinar-data-driven-esg-sustainability-decisions.html
-
7 Open Source Libraries for Deep Learning Graphs
In this article we’ll go through 7 up-and-coming open source libraries for graph deep learning, ranked in order of increasing popularity.https://www.kdnuggets.com/2021/07/7-open-source-libraries-deep-learning-graphs.html
-
Top 6 Data Science Online Courses in 2021
As an aspiring data scientist, it is easy to get overwhelmed by the abundance of resources available on the Internet. With these 6 online courses, you can develop from a novice to an experienced practitioner in less than a year, and gain the skills necessary to land a job in data science.https://www.kdnuggets.com/2021/07/top-6-data-science-online-courses.html
-
Geometric foundations of Deep Learning
Geometric Deep Learning is an attempt at geometric unification of a broad class of machine learning problems from the perspectives of symmetry and invariance. These principles not only underlie the breakthrough performance of convolutional neural networks and the recent success of graph neural networks but also provide a principled way to construct new types of problem-specific inductive biases.https://www.kdnuggets.com/2021/07/geometric-foundations-deep-learning.html
-
AGI and the Future of Humanity
The possibilities for humanity's future very likely include at least one in which computers will exceed human abilities. Artificial General Intelligence (AGI) does not necessarily have to be all doom and gloom. However, we must begin now to understand how this technical evolution might progress and consider what actions to take now to prepare.https://www.kdnuggets.com/2021/07/agi-future-humanity.html
-
Exploring the SwAV Method
This post discusses the SwAV (Swapping Assignments between multiple Views of the same image) method from the paper “Unsupervised Learning of Visual Features by Contrasting Cluster Assignments” by M. Caron et al.https://www.kdnuggets.com/2021/07/swav-method.html
-
High-Performance Deep Learning: How to train smaller, faster, and better models – Part 4
With the right software, hardware, and techniques at your fingertips, your capability to effectively develop high-performing models now hinges on leveraging automation to expedite the experimental process and building with the most efficient model architectures for your data.https://www.kdnuggets.com/2021/07/high-performance-deep-learning-part4.html
-
A Learning Path To Becoming a Data Scientist
Becoming a professional data scientist may not be as easy as "1... 2... 3...", but these 10 steps can be your self-learning roadmap to kickstarting your future in the exciting and ever-expanding field of data science.https://www.kdnuggets.com/2021/07/learning-path-data-scientist.html
-
GitHub Copilot: Your AI pair programmer – what is all the fuss about?
GitHub just released Copilot, a code completion tool on steroids dubbed your "AI pair programmer." Read more about it, and see what all the fuss is about.https://www.kdnuggets.com/2021/07/github-copilot-ai-pair-programmer.html
-
Data Scientists and ML Engineers Are Luxury Employees
Maybe it seems that everyone wants to become a data scientist and every organization wants to hire one as quickly as possible. However, a mismatch often exists between what companies tend to need and what ML practitioners want to do. So, it's time for the field to take another step toward maturity through an enhanced appreciation of the broad range of technical foundations for an organization to become data-driven.https://www.kdnuggets.com/2021/07/data-scientists-machine-learning-engineers-luxury-employees.html
-
Semantic Search: Measuring Meaning From Jaccard to Bert
In this article, we’ll cover a few of the most interesting — and powerful — of these techniques — focusing specifically on semantic search. We’ll learn how they work, what they’re good at, and how we can implement them ourselves.https://www.kdnuggets.com/2021/07/semantic-search-measuring-meaning-jaccard-bert.html
-
High-Performance Deep Learning: How to train smaller, faster, and better models – Part 3
Now that you are ready to efficiently build advanced deep learning models with the right software and hardware tools, the techniques involved in implementing such efforts must be explored to improve model quality and obtain the performance that your organization desires.https://www.kdnuggets.com/2021/07/high-performance-deep-learning-part3.html
-
Prepare Behavioral Questions for Data Science Interviews
This is part 5 of a series by the author that helps readers nail data science interviews with confidence.https://www.kdnuggets.com/2021/07/prepare-behavioral-questions-data-science-interviews.html
-
5 Lessons McKinsey Taught Me That Will Make You a Better Data Scientist
How to stand out from your peers in the data world.https://www.kdnuggets.com/2021/07/5-lessons-mckinsey-taught-better-data-scientist.html
-
Computational Complexity of Deep Learning: Solution Approaches
Why has deep learning been so successful? What is the fundamental reason that deep learning can learn from big data? Why cannot traditional ML learn from the large data sets that are now available for different tasks as efficiently as deep learning can?https://www.kdnuggets.com/2021/06/computational-complexity-deep-learning-solution-approaches.html
-
Unleashing the Power of MLOps and DataOps in Data Science
Organizations trying to move forward with analytics and data science initiatives -- while floating in an ocean of data -- must enhance their overall approach and culture to embrace a foundation of DataOps and MLOps. Leveraging these operational frameworks is necessary to enable the data to generate real business value.https://www.kdnuggets.com/2021/06/power-mlops-dataops-data-science.html
-
Add A New Dimension To Your Photos Using Python
Read this to learn how to breathe new life into your photos with a 3D Ken Burns Effect.https://www.kdnuggets.com/2021/06/new-dimension-photos-python.html
-
High-Performance Deep Learning: How to train smaller, faster, and better models – Part 2
As your organization begins to consider building advanced deep learning models with efficiency in mind to improve the power delivered through your solutions, the software and hardware tools required for these implementations are foundational to achieving high-performance.https://www.kdnuggets.com/2021/06/high-performance-deep-learning-part2.html
-
Data Careers in Demand: Crowd Solutions Architect Explained
How can crowdsourcing support the applications of data teams at an organization? With an ever-increasing demand for more and higher quality data, a new role of the Crowd Solutions Architect (CSA) can leverage the potential of the masses to bring an advantage to a business's capability to deliver effective AI-driven solutions.https://www.kdnuggets.com/2021/06/data-careers-crowd-solutions-architect.html
-
Fine-Tuning Transformer Model for Invoice Recognition
The author presents a step-by-step guide from annotation to training.https://www.kdnuggets.com/2021/06/fine-tuning-transformer-model-invoice-recognition.html
-
Amazing Low-Code Machine Learning Capabilities with New Ludwig Update
Integration with Ray, MLflow and TabNet are among the top features of this release.https://www.kdnuggets.com/2021/06/ludwig-update-includes-low-code-machine-learning-capabilities.html
-
What is Segmentation?
Segmentation refers to many things, and is one of the most frequently used words in marketing. This article looks at segmentation from a somewhat different-than-usual perspective.https://www.kdnuggets.com/2021/06/what-segmentation.html
-
Overview of AutoNLP from Hugging Face with Example Project
AutoNLP is a beta project from Hugging Face that builds on the company’s work with its Transformer project. With AutoNLP you can get a working model with just a few simple terminal commands.https://www.kdnuggets.com/2021/06/overview-autonlp-hugging-face-example-project.html
-
Pandas vs SQL: When Data Scientists Should Use Each Tool
Exploring data sets and understanding their structure, content, and relationships is a routine and core process for any Data Scientist. Multiple tools exist for performing such analysis, and we take a deep dive into the benefits and different approaches of two important tools, SQL and Pandas.https://www.kdnuggets.com/2021/06/pandas-vs-sql.html
-
High Performance Deep Learning, Part 1
Advancing deep learning techniques continue to demonstrate incredible potential to deliver exciting new AI-enhanced software and systems. But, training the most powerful models is expensive--financially, computationally, and environmentally. Increasing the efficiency of such models will have profound impacts in many ways, so developing future models with this intention in mind will only help to further expand the reach, applicability, and value of what deep learning has to offer.https://www.kdnuggets.com/2021/06/efficiency-deep-learning-part1.html
-
The Best Way to Learn Practical NLP?
Hugging Face has just released a course on using its libraries and ecosystem for practical NLP, and it appears to be very comprehensive. Have a look for yourself.https://www.kdnuggets.com/2021/06/best-way-learn-practical-nlp.html
-
An introduction to Explainable AI (XAI) and Explainable Boosting Machines (EBM)
Understanding why your AI-based models make the decisions they do is crucial for deploying practical solutions in the real-world. Here, we review some techniques in the field of Explainable AI (XAI), why explainability is important, example models of explainable AI using LIME and SHAP, and demonstrate how Explainable Boosting Machines (EBMs) can make explainability even easier.https://www.kdnuggets.com/2021/06/explainable-ai-xai-explainable-boosting-machines-ebm.html
-
A Graph-based Text Similarity Method with Named Entity Information in NLP
In this article, the author summarizes the 2017 paper "A Graph-based Text Similarity Measure That Employs Named Entity Information" as per their understanding. Better understand the concepts by reading along.https://www.kdnuggets.com/2021/06/graph-based-text-similarity-method-named-entity-information-nlp.html
-
Facebook Launches One of the Toughest Reinforcement Learning Challenges in History
The FAIR team just launched the NetHack Challenge as part of the upcoming NeurIPS 2021 competition. The objective is to test new RL ideas using one of the toughest game environments in the world.https://www.kdnuggets.com/2021/06/facebook-launches-toughest-reinforcement-learning-challenges.html
-
Data Scientists Will be Extinct in 10 Years
And why it’s not a bad thing.https://www.kdnuggets.com/2021/06/data-scientists-extinct-10-years.html
-
Building a Knowledge Graph for Job Search Using BERT
A guide on how to create knowledge graphs using NER and Relation Extraction.https://www.kdnuggets.com/2021/06/knowledge-graph-job-search-bert.html
-
Five types of thinking for a high performing data scientist
The way you think about a problem and the conceptual process you go through to find a solution may be guided by your personal skills or the type of problem at hand. Many mental models exist representing a variety of thinking patterns -- and as a Data Scientist, appreciating different approaches can help you more effectively model data in the business world and communicate your results to the decision-makers.https://www.kdnuggets.com/2021/06/five-types-thinking-data-scientist.html
-
The Essential Guide to Transformers, the Key to Modern SOTA AI
You likely know Transformers from their recent spate of success stories in natural language processing, computer vision, and other areas of artificial intelligence, but are you familiar with all of the X-formers? More importantly, do you know the differences, and why you might use one over another?https://www.kdnuggets.com/2021/06/essential-guide-transformers-key-modern-sota-ai.html
-
Feature Selection – All You Ever Wanted To Know
Although your data set may contain a lot of information about many different features, selecting only the "best" of these to be considered by a machine learning model can mean the difference between a model that performs well--with better performance, higher accuracy, and more computational efficiency--and one that falls flat. The process of feature selection guides you toward working with only the data that may be the most meaningful, and to accomplish this, a variety of feature selection types, methodologies, and techniques exist for you to explore.https://www.kdnuggets.com/2021/06/feature-selection-overview.html
-
The 7 Best Open Source AI Libraries You May Not Have Heard Of
AI researchers today have many exciting options for working with specialized tools. Although starting original projects from scratch is often not necessary, knowing which existing library to leverage remains a challenge. This list of generally unknown yet awesome, open-source libraries offers an interesting collection to consider for state-of-the-art research that spans from automatic machine learning to differentiable quantum circuits.https://www.kdnuggets.com/2021/06/7-open-source-ai-libraries.html
-
5 Data Science Open-source Projects You Should Consider Contributing to
As you prepare to interview for a position in data science or are looking to jump to the next level, now is the time to enhance your skills and your resume by working on real, open-source projects. Here, we suggest a great selection of projects you can contribute to and help build something awesome, so all you need to do is choose one and tackle it head on.https://www.kdnuggets.com/2021/06/5-data-science-open-source-projects-contribute.html
-
How to Fine-Tune BERT Transformer with spaCy 3
A step-by-step guide on how to create a knowledge graph using NER and Relation Extraction.https://www.kdnuggets.com/2021/06/fine-tune-bert-transformer-spacy.html
-
Stop (and Start) Hiring Data Scientists
Large companies are losing many data scientists to smaller companies, so what should executives and managers do? These three “stop & start” tactics can improve talent retention, and help define a new way of recruiting and working for the Data Science field.https://www.kdnuggets.com/2021/06/hiring-data-scientists.html
-
Top 4 Data Extraction Tools
Data extraction tools give you the boost you need for gathering information from a multitude of data sources. These four data extraction tools will help liberate you from manual data entry, understand complex documents, and simplify the data extraction process.https://www.kdnuggets.com/2021/05/top-4-data-extraction-tools.html
-
Essential Math for Data Science: Basis and Change of Basis
In this article, you will learn what the basis of a vector space is, see that any vectors of the space are linear combinations of the basis vectors, and see how to change the basis using change of basis matrices.https://www.kdnuggets.com/2021/05/essential-math-data-science-basis-change-basis.html
-
Choosing the Right BI Tool for Your Business
Here are six questions to ask as you search for the best BI tool for your specific needs.https://www.kdnuggets.com/2021/05/choosing-right-bi-tool-business.html
-
Budgeting For Your AI Training Data: Consider These 3 Factors
Before you even plan to procure the data, one of the most important considerations is determining how much you should spend on your AI training data. In this article, we will give you insights to develop an effective budget for AI training data.https://www.kdnuggets.com/2021/05/shaip-budgeting-ai-training-data.html
-
These Soft Skills Can Make or Break Your Data Science Career
In an industry long ruled by hard skills, the future career success of tomorrow’s data scientists might well depend on their ability to deploy a variety of soft skills into the workplace.https://www.kdnuggets.com/2021/05/soft-skills-data-science-career.html
-
Data Validation in Machine Learning is Imperative, Not Optional
Before we reach model training in the pipeline, there are various components like data ingestion, data versioning, data validation, and data pre-processing that need to be executed. In this article, we will discuss data validation, why it is important, its challenges, and more.https://www.kdnuggets.com/2021/05/data-validation-machine-learning-imperative.html
-
6 Business Trends Benefiting Data Scientists
Here are six business trends making data scientists even more in-demand.https://www.kdnuggets.com/2021/05/6-business-trends-data-scientists.html
-
Awesome list of datasets in 100+ categories
With an estimated 44 zettabytes of data in existence in our digital world today and approximately 2.5 quintillion bytes of new data generated daily, there is a lot of data out there you could tap into for your data science projects. It's pretty hard to curate through such a massive universe of data, but this collection is a great start. Here, you can find data from cancer genomes to UFO reports, as well as years of air quality data to 200,000 jokes. Dive into this ocean of data to explore as you learn how to apply data science techniques or leverage your expertise to discover something new.https://www.kdnuggets.com/2021/05/awesome-list-datasets.html
-
How to Determine if Your Machine Learning Model is Overtrained
WeightWatcher is based on theoretical research (done jointly with UC Berkeley) into Why Deep Learning Works, based on our Theory of Heavy Tailed Self-Regularization (HT-SR). It uses ideas from Random Matrix Theory (RMT), Statistical Mechanics, and Strongly Correlated Systems.https://www.kdnuggets.com/2021/05/how-determine-machine-learning-model-overtrained.html
-
A checklist to track your Data Science progress
Whether you are just starting out in data science or already a gainfully-employed professional, always learning more to advance through state-of-the-art techniques is part of the adventure. But, it can be challenging to keep track of your progress and keep an eye on what's next. Follow this checklist to help you scale your expertise from entry-level to advanced.https://www.kdnuggets.com/2021/05/checklist-data-science-progress.html
-
Machine Translation in a Nutshell
Marketing scientist Kevin Gray asks Dr. Anna Farzindar of the University of Southern California for a snapshot of machine translation. Dr. Farzindar also provided the original art for this article.https://www.kdnuggets.com/2021/05/machine-translation-nutshell.html
-
The NoSQL Know-It-All Compendium
Are you a NoSQL beginner, but want to become a NoSQL Know-It-All? Well, this is the place for you. Get up to speed on NoSQL technologies from a beginner's point of view, with this collection of related progressive posts on the subject. NoSQL? No problem!https://www.kdnuggets.com/2021/05/nosql-know-it-all-compendium.html
-
6 side hustles for an aspiring data scientist
As an aspiring data scientist or an employed professional, many opportunities exist for you to offer your skills to a broader audience through side gigs. While the difficulty and risk vary, experiences from applying your data science practice to areas outside your immediate career path can increase your expertise while even increasing your bank account.https://www.kdnuggets.com/2021/05/6-side-hustles-data-scientist.html
-
How to become an online data science tutor
Your expertise in data science may be serving you well in your day job or you are on track to land that next dream position to do what you love. There are many others aspiring to attain your level of skill, and maybe you could consider helping them out... through a side gig of teaching.https://www.kdnuggets.com/2021/05/how-become-online-data-science-tutor.html
-
What Makes AI Trustworthy?
This blog covers why AI needs to be trustworthy and what makes it trustworthy. AI predictions/suggestions should not just be taken at face value, but rather delved into at a deeper level. We need to understand how an AI system makes its predictions to put our trust in it. Trust should not be built on prediction accuracy alone.https://www.kdnuggets.com/2021/05/what-makes-ai-trustworthy.html
-
A Comprehensive Guide to Ensemble Learning – Exactly What You Need to Know
This article covers ensemble learning methods, and exactly what you need to know in order to understand and implement them.https://www.kdnuggets.com/2021/05/comprehensive-guide-ensemble-learning.html
-
Feature stores – how to avoid feeling that every day is Groundhog Day
Feature stores stop the duplication of each task in the ML lifecycle. You can reuse features and pipelines for different models, monitor models consistently, and sidestep data leakage with this MLOps technology that everyone is talking about.https://www.kdnuggets.com/2021/05/feature-stores-how-avoid-feeling-every-day-is-groundhog-day.html
-
What is Neural Search?
And how to get started with it with no prior experience in Machine Learning.https://www.kdnuggets.com/2021/05/what-neural-search.html
-
What makes a winning entry in a Machine Learning competition?
So you want to show your grit in a Kaggle-style competition? Many, many others have the same idea, including domain experts and non-experts, and academic and corporate teams. What does it take for your bright ideas and skills to come out on top of thousands of competitors?https://www.kdnuggets.com/2021/05/winning-machine-learning-competition.html
-
Gradient Boosted Decision Trees – A Conceptual Explanation
Gradient boosted decision trees involve implementing several models and aggregating their results. These boosted models have become popular thanks to their performance in machine learning competitions on Kaggle. In this article, we’ll see what gradient boosted decision trees are all about.https://www.kdnuggets.com/2021/04/gradient-boosted-trees-conceptual-explanation.html
-
Introducing The NLP Index
The NLP Index is a brand new resource for NLP code discovery, combining and indexing more than 3,000 paper and code pairs at launch. If you are interested in NLP research and locating the code and papers needed to understand and implement the latest research, you should check it out.https://www.kdnuggets.com/2021/04/nlp-index.html
-
Best Podcasts for Machine Learning
Podcasts, especially those featuring interviews, are great for learning about the subfields and tools of AI, as well as the rock stars and superheroes of the AI world. Here, we highlight some of the best podcasts today that are perfect for both those learning about machine learning and seasoned practitioners.https://www.kdnuggets.com/2021/04/best-podcasts-machine-learning.html
-
Using Data Science to Predict and Prevent Real World Problems
Do you have an interest in data science but lack an understanding of what, exactly, it can be used to accomplish in the real world? Read this article for a few examples of just how helpful data science can be for predicting and preventing real world problems.https://www.kdnuggets.com/2021/04/data-science-predict-prevent-real-world-problems.html
-
Getting Started with Reinforcement Learning
Demystifying some of the main concepts and terminologies associated with Reinforcement Learning and their association with other fields of AI.https://www.kdnuggets.com/2021/04/getting-started-reinforcement-learning.html
-
Data careers are NOT one-size fits all! Tips for uncovering your ideal role in the data space
Thriving as a data professional is about more than just making good money! It’s about FULFILLMENT & IMPACT. In this article, I will help you discover the BEST data role for you given your unique skill sets, personality & goals.https://www.kdnuggets.com/2021/04/data-careers-not-one-size-fits-all.html
-
Data Science Books You Should Start Reading in 2021
Check out this curated list of the best data science books for any level.https://www.kdnuggets.com/2021/04/data-science-books-start-reading-2021.html
-
What is Adversarial Neural Cryptography?
The novel approach combines GANs and cryptography in a single, powerful security method.https://www.kdnuggets.com/2021/04/adversarial-neural-cryptography.html
-
Top 10 Must-Know Machine Learning Algorithms for Data Scientists – Part 1
New to data science? Interested in the must-know machine learning algorithms in the field? Check out the first part of our list and introductory descriptions of the top 10 algorithms for data scientists to know.https://www.kdnuggets.com/2021/04/top-10-must-know-machine-learning-algorithms-data-scientists-1.html
-
Build an Effective Data Analytics Team and Project Ecosystem for Success
Apply these techniques to create a data analytics program that delivers solutions that delight end-users and meet their needs.https://www.kdnuggets.com/2021/04/build-effective-data-analytics-team-project-ecosystem-success.html
-
Free From Stanford: Machine Learning with Graphs
Check out the freely-available Stanford course Machine Learning with Graphs, taught by Jure Leskovec, and see how a world-renowned researcher teaches their topic of expertise. Accessible materials include slides, videos, and more.https://www.kdnuggets.com/2021/04/free-stanford-machine-learning-graphs.html
-
Continuous Training for Machine Learning – a Framework for a Successful Strategy
A basic appreciation by anyone who builds machine learning models is that the model is not useful without useful data. This doesn't change after a model is deployed to production. Effectively monitoring and retraining models with updated data is key to maintaining valuable ML solutions, and can be accomplished with effective approaches to production-level continuous training that is guided by the data.https://www.kdnuggets.com/2021/04/continuous-training-machine-learning.html
-
7 Must-Haves in your Data Science CV
If you are looking for a new role as a Data Scientist -- either as a first job fresh out of school, a career change, or a shift to another organization -- then check off as many of these critical points as possible to stand out in the crowd and pass the hiring manager's initial CV screen.https://www.kdnuggets.com/2021/04/7-must-haves-data-science-cv.html