The Complete Collection of Data Science Projects – Part 2
The second part covers a list of Machine Learning, Deep Learning, Computer Vision, Natural Language Processing, Data Engineering, and MLOps projects.
Image by Author
Editor's note: For the full scope of repositories included in this two-part series, please see The Complete Collection of Data Science Projects – Part 1.
Machine Learning
Machine learning is a hot topic in data science. In these projects, you will apply classification, regression, and clustering to solve business problems. They will help you understand tabular datasets, data processing, model training, and model validation.
- Music Genre Classification: Tutorial
- Credit Card Fraud Detection: Tutorial
- Flight Price Prediction: Tutorial | Code Source
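The workflow described above (tabular data, processing, training, and validation) can be sketched in a few lines. This is a minimal illustration, not code from any of the listed projects: the synthetic dataset, the nearest-centroid classifier, and every function name are assumptions chosen to keep the example dependency-free. In practice, you would likely reach for a library such as scikit-learn.

```python
import random
import statistics


def make_toy_rows(n_per_class=50, seed=0):
    """Build a hypothetical tabular dataset: two numeric features, one binary label."""
    rng = random.Random(seed)
    rows = []
    for label, (cx, cy) in enumerate([(0.0, 0.0), (5.0, 5.0)]):
        for _ in range(n_per_class):
            rows.append(([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)], label))
    rng.shuffle(rows)
    return rows


def fit_scaler(features):
    """Data processing: learn per-column mean and standard deviation."""
    columns = list(zip(*features))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def transform(features, scaler):
    """Standardize each column with the statistics learned by fit_scaler."""
    means, stds = scaler
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in features]


def fit_centroids(features, labels):
    """Training: store the mean feature vector of each class."""
    grouped = {}
    for row, label in zip(features, labels):
        grouped.setdefault(label, []).append(row)
    return {
        label: [statistics.fmean(col) for col in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(centroids, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(row, centroids[label])),
    )


def holdout_accuracy(rows, test_fraction=0.25):
    """Validation: train on the first part of the data, score on the held-out rest."""
    split = int(len(rows) * (1 - test_fraction))
    train, test = rows[:split], rows[split:]
    train_x = [features for features, _ in train]
    train_y = [label for _, label in train]
    test_x = [features for features, _ in test]
    test_y = [label for _, label in test]

    # Fit the scaler on training rows only, so no test information leaks in.
    scaler = fit_scaler(train_x)
    centroids = fit_centroids(transform(train_x, scaler), train_y)
    predictions = [predict(centroids, row) for row in transform(test_x, scaler)]
    hits = sum(p == y for p, y in zip(predictions, test_y))
    return hits / len(test_y)


if __name__ == "__main__":
    print(f"Holdout accuracy: {holdout_accuracy(make_toy_rows()):.2f}")
```

The point to notice is the order of operations: the data is split first, and the scaler is fitted only on the training portion before the model is validated on rows it has never seen.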
Deep Learning
You will learn more advanced machine learning algorithms, neural networks, and data processing techniques. Deep learning is a huge subject. To master it, you need to learn its applications in computer vision, NLP, forecasting, automatic speech recognition, generative art, and reinforcement learning.
- Reinforcement Learning: Tutorial
- Gender and Age Detection with OpenCV: Tutorial
- Deep Learning for Time Series Forecasting: Tutorial
Computer Vision
In computer vision, you learn to process image data and train models for tasks such as image classification, generation, segmentation, and object detection.
- Automatic Colorization: Code Source
- One Shot Face Stylization: Code Source
- Image Segmentation: Tutorial
Natural Language Processing (NLP)
You will learn to process language that appears in text, audio, and images. With the introduction of transformers and large language models, NLP has found many real-world applications. It is used for translation, question answering, text summarization, text classification, text generation, and conversational AI.
- Machine Translation Yorùbá to English: Tutorial | Code Source
- BERT Text Classifier on Tensor Processing Unit: Tutorial
- Automatic Speech Recognition: Tutorial | Code Source
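To make text classification, one of the tasks mentioned above, more concrete, here is a minimal sketch. It deliberately swaps the transformer models used in the listed projects for a classical bag-of-words Naive Bayes baseline with Laplace smoothing, so that it runs without any external library. The training sentences, labels, and function names are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(examples):
    """Count words per label and documents per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab


def classify(model, text):
    """Return the label with the highest log-probability under Naive Bayes."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # log prior
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen during training
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training data, invented for this illustration.
EXAMPLES = [
    ("the flight was delayed and the staff was rude", "negative"),
    ("terrible service and a long delay", "negative"),
    ("great crew and a smooth flight", "positive"),
    ("friendly staff and great service", "positive"),
]

if __name__ == "__main__":
    model = train_naive_bayes(EXAMPLES)
    print(classify(model, "smooth flight with friendly crew"))
```

A baseline like this is useful mainly as a reference point: a fine-tuned transformer should beat it comfortably on real data, and if it does not, the data pipeline deserves a second look.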
Data Engineering
Design, validate, and deploy data pipelines for data science projects. You will learn about the data engineering process from end to end and how modern tools integrate to provide seamless data streams. These projects will introduce you to ETL, data modeling, orchestration, analytics, and serving tools.
- Design, Development, and Deployment of a simple Data Pipeline: Tutorial | Code Source
- Uber Expenses Tracking: Tutorial | Code Source
- Data Compression and Data Decompression Pipeline: Tutorial | Code Source
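As a rough picture of what ETL means in practice, the sketch below extracts rows from CSV text, transforms them (dropping incomplete records and normalizing values), and loads them into a SQLite table that can then be queried. The sample data, table name, and function names are assumptions made for illustration. The listed projects rely on dedicated orchestration and storage tools rather than this in-memory setup.

```python
import csv
import io
import sqlite3

# Hypothetical raw export with inconsistent casing, stray spaces, and a missing fare.
RAW_CSV = """trip_id,city,fare
1,Austin,12.50
2,boston,8.00
3,Austin,
4, Boston ,15.25
"""


def extract(text):
    """Extract: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows without a fare, normalize city names, cast types."""
    cleaned = []
    for row in rows:
        if not row["fare"]:
            continue
        cleaned.append(
            (int(row["trip_id"]), row["city"].strip().title(), float(row["fare"]))
        )
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS trips "
        "(trip_id INTEGER PRIMARY KEY, city TEXT, fare REAL)"
    )
    conn.executemany("INSERT INTO trips VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(text, conn):
    """Run extract, transform, and load, then serve a small aggregate query."""
    load(transform(extract(text)), conn)
    query = "SELECT city, SUM(fare) FROM trips GROUP BY city ORDER BY city"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(RAW_CSV, connection))
    # → [('Austin', 12.5), ('Boston', 23.25)]
    connection.close()
```

Keeping each stage as a separate function mirrors how orchestration tools model a pipeline: every step can be tested, retried, or replaced on its own.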
MLOps
MLOps is the production side of machine learning, where engineers test, retrain, validate, and serve inference in production. You will learn about ML pipeline tools, experiment and artifact tracking, storing and versioning data and models, cloud computing, REST APIs, and web applications. You will learn to create an end-to-end machine learning system.
- MNIST MLOps Learning: Code Source
- NLP MLOps Project With DagsHub: Tutorial | Code Source
- Machine Learning, Pipelines, Deployment, and MLOps: Tutorial
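Experiment tracking and model versioning can feel abstract, so here is a minimal file-based sketch of the idea. It is not the API of DagsHub or any other tracking tool. The directory layout, the `log_run` and `best_run` helpers, and the content-hash versioning scheme are all assumptions made for illustration.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def log_run(registry_dir, params, metrics, model_bytes):
    """Store a model artifact with its parameters and metrics.

    The version is derived from a hash of the artifact, so identical
    model bytes always map to the same version.
    """
    registry = Path(registry_dir)
    registry.mkdir(parents=True, exist_ok=True)
    version = hashlib.sha256(model_bytes).hexdigest()[:12]
    run_dir = registry / version
    run_dir.mkdir(exist_ok=True)
    (run_dir / "model.bin").write_bytes(model_bytes)
    record = {"params": params, "metrics": metrics}
    (run_dir / "run.json").write_text(json.dumps(record, indent=2))
    return version


def best_run(registry_dir, metric):
    """Return the version with the highest value for `metric`, or None if empty."""
    runs = []
    for run_file in Path(registry_dir).glob("*/run.json"):
        record = json.loads(run_file.read_text())
        runs.append((record["metrics"][metric], run_file.parent.name))
    return max(runs, default=(None, None))[1]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as registry:
        # Placeholder bytes stand in for serialized model weights.
        log_run(registry, {"lr": 0.1}, {"accuracy": 0.81}, b"weights-a")
        best = log_run(registry, {"lr": 0.01}, {"accuracy": 0.93}, b"weights-b")
        print(best_run(registry, "accuracy") == best)
```

Dedicated tools add much more on top of this, such as remote storage, lineage, and dashboards, but the core contract is the same: every trained model is stored together with the parameters and metrics that produced it.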
Working on projects and replicating their results will make you better at problem-solving, and it will also help you land your dream job.
I suggest that beginners and job seekers either start working on a pet project or contribute to open-source projects to learn more about standard practices.
We have learned about machine learning, deep learning, computer vision, natural language processing, data engineering, and MLOps. Each project comes with a description and a code source. Some of them even have a detailed tutorial to guide you through the project.
In the previous part, we covered:
- Web Scraping
- Data Analytics
- Business Intelligence
- Time Series
This is the fifth edition in the collection series. Also check out:
- The Complete Collection of Data Science Cheat Sheets – Part 1 and Part 2
- The Complete Collection of Data Repositories – Part 1 and Part 2
- The Complete Collection of Data Science Books – Part 1 and Part 2
- The Complete Collection of Data Science Interviews – Part 1 and Part 2
Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master's degree in Technology Management and a bachelor's degree in Telecommunication Engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.