Machine Learning and Cyber Security Resources
An overview of useful resources about applications of machine learning and data mining in cyber security, including important websites, papers, books, tutorials, courses, and more.
By Faizan Ahmad, Fsecurify
Data for Machine Learning and Cyber Security:
One huge source of data for applying machine learning to cyber security is SecRepo. The website hosts many kinds of security data that you can use. I have not found a better data source for cyber security.
Let’s go through a few good papers that illustrate the use of machine learning in cyber security.
- Fast, Lean, and Accurate: Modeling Password Guessability Using Neural Networks. This is an awesome paper where the authors used neural networks to crack passwords. I’ve read the paper and it’s great.
- Outside the Closed World: On Using Machine Learning for Network Intrusion Detection. The paper talks about network intrusion detection using machine learning. It is another good paper.
- Anomalous Payload-Based Network Intrusion Detection. Another paper with a large number of citations.
- Malicious PDF detection using metadata and structural features. A unique application of data science in cyber security.
- Adversarial support vector machine learning.
- Exploiting machine learning to subvert your spam filter. Another good read.
The papers below are taken from covert.io. You can check out the website for a huge collection of papers, but there are too many to list here, and not all of them are readable or recent.
- CAMP – Content Agnostic Malware Protection
- Notos – Building a Dynamic Reputation System for DNS
- Kopis – Detecting Malware Domains at the Upper DNS Hierarchy
- Pleiades – From Throw-away Traffic To Bots – Detecting The Rise Of DGA-based Malware
- EXPOSURE – Finding Malicious Domains Using Passive DNS Analysis
- Polonium – Tera-Scale Graph Mining for Malware Detection
- Nazca – Detecting Malware Distribution in Large-Scale Networks
- PAYL – Anomalous Payload-based Network Intrusion Detection
- Anagram – A Content Anomaly Detector Resistant to Mimicry Attack
There are not many books available on the use of data science and machine learning for cyber security, but I’ve found a couple that look quite promising. I’ll be reading them over my coming holidays.
- Data Mining and Machine Learning in Cybersecurity.
- Machine Learning and Data Mining for Computer Security.
There have been some amazing talks on the topic. I’ve gathered them here as well.
- Using Machine Learning to Support Information Security.
- Defending Networks with Incomplete Information.
- Applying Machine Learning to Network Security Monitoring.
- Measuring the IQ of your Threat Intelligence Feed.
- Data-Driven Threat Intelligence: Metrics On Indicator Dissemination And Sharing.
- Applied Machine Learning for Data Exfil and Other Fun Topics.
- Secure Because Math: A Deep-Dive on ML-Based Monitoring.
- Machine Duping 101: Pwning Deep Learning Systems.
- Delta Zero, KingPhish3r – Weaponizing Data Science for Social Engineering.
- Defeating Machine Learning: What Your Security Vendor Is Not Telling You.
- CrowdSource: Crowd Trained Machine Learning Model for Malware Capability Det.
- Defeating Machine Learning: Systemic Deficiencies for Detecting Malware.
- Packet Capture Village – Theodora Titonis – How Machine Learning Finds Malware.
- Build an Antivirus in 5 Min – Fresh Machine Learning #7. A fun video to watch.
- Hunting for Malware with Machine Learning.
These were the good ones I could find. I haven’t watched them all, but they seem pretty good. Let me know in the comments if you come across more talks.
I’ve found some great tutorials related to this topic.
- Click Security Data Hacking Project. This project contains many tutorials with notebooks and code. It is a must-read for everyone interested in the application of ML in InfoSec.
- Using Neural Networks to generate human readable passwords.
- Machine Learning based Password Strength Classification.
- Using Machine Learning to Detect Malicious URLs.
- Big Data and Data Science for Security and Fraud Detection. A nice one.
- Using deep learning to break a Captcha system. Nice read.
There are also a few courses on the topic. Here are the ones I could find.
- Data Mining for Cyber Security from Stanford. This is probably the best course on using data for cyber security. The instructor’s slides cover a lot of applications and techniques. The course page also lists many student projects that use machine learning for security.
Here are a few other resources that do not fit the categories above.
- System predicts 85 percent of cyber-attacks using input from human expert. A must-read post.
- A list of open source projects in cyber security using machine learning has been posted on mlsecproject.
- An Introduction to Machine Learning for Cybersecurity and Threat Hunting.
That’s all. These were some of the best resources I could find on this topic. If you know of more, please mention them in the comments below and I’ll add them.
Bio: Faizan Ahmad is a Fulbright undergraduate currently studying at NUCES FAST and a Research Assistant at Lahore University of Management Sciences, Pakistan.