Introducing MICScan – a security code analysis tool that learns from our mistakes!
Corporate data breaches are a never-ending news story. Vulnerabilities in code put all of our personal and financial data at risk. A lot of these vulnerabilities follow patterns that have been around for decades. Why haven’t we learned how to reliably catch and fix these mistakes?
Let’s take SQL injection as an example. It has been on the OWASP Top 10 since the list’s inception in 2003, and it continues to be number one today. We know how to recognize it, it is not difficult to fix, and there are tools that prevent it. So why is it still an issue?
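To make the pattern concrete, here is a minimal sketch of the mistake and its fix, using Python's built-in sqlite3 module. The table, the function names and the payload are all made up for illustration and are not taken from MICScan or from the NIST samples.

```python
import sqlite3


def make_db():
    """Build a throwaway in-memory database with two example users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_user_vulnerable(conn, name):
    # Vulnerable: user input is concatenated into the SQL text,
    # so the input can change the structure of the query.
    query = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Fixed: the input is passed as a bound parameter and is only ever
    # treated as a value, never as SQL.
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_db()
    payload = "nobody' OR '1'='1"
    print(len(find_user_vulnerable(conn, payload)))  # every row comes back
    print(len(find_user_safe(conn, payload)))        # no rows match
```

The two functions differ by a single line, which is part of why the mistake is both easy to fix and easy to keep making.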
It is with these questions in mind that we began work on MICScan. We started by creating a preprocessing application that extracts machine-learning-friendly features from code snippets. We then ran this application against the vulnerable code samples provided in the NIST SARD dataset and used the resulting features to train a machine learning model. We were able to achieve 99.8% accuracy against the NIST dataset.
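This write-up does not say which features the preprocessing application extracts, so the following is only a rough sketch of one common approach: tokenize a snippet, replace literals and user-chosen names with placeholders, and count the resulting tokens. The token pattern, the keyword list and every function name here are assumptions for illustration, not MICScan's actual implementation.

```python
import re
from collections import Counter

# Hypothetical token pattern: string literals, numbers, identifiers, then
# any single non-space character (operators and punctuation).
TOKEN_RE = re.compile(r'"[^"\n]*"|\'[^\'\n]*\'|\d+|[A-Za-z_]\w*|\S')

# Hypothetical, deliberately tiny keyword list; a real tool would use the
# full keyword set of the language being scanned.
KEYWORDS = {"if", "else", "for", "while", "return", "new", "string", "int"}


def normalize(token):
    """Map literals and user-chosen names to placeholders so the features
    describe the shape of the code rather than its exact wording."""
    if token[0] in "\"'":
        return "<STR>"
    if token.isdigit():
        return "<NUM>"
    if (token[0].isalpha() or token[0] == "_") and token not in KEYWORDS:
        return "<ID>"
    return token


def extract_features(snippet):
    """Return a bag-of-tokens count for one code snippet."""
    return Counter(normalize(t) for t in TOKEN_RE.findall(snippet))


def to_vector(features, vocabulary):
    """Project a feature Counter onto a fixed, ordered vocabulary."""
    return [features.get(term, 0) for term in vocabulary]


if __name__ == "__main__":
    snippet = 'query = "SELECT * FROM t WHERE id = " + id;'
    features = extract_features(snippet)
    print(to_vector(features, ["<ID>", "<STR>", "<NUM>", "+"]))
```

Fixed-length vectors like these, paired with the vulnerable or not-vulnerable labels from a dataset, are the kind of input a standard classifier can be trained on.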
We thought this was a great start for a few cybersecurity students with little data science background. Our next goal was to see how well we could make the model generalize to other datasets, and to transform it into a usable tool. However, we ran into a roadblock: we could not find additional labelled snippets of vulnerable code. We realized that this problem is not unique to us. If there isn’t a good, comprehensive source of what vulnerable code looks like, then how can anyone hope to build a tool that recognizes these vulnerabilities?
Our project evolved. We now had the additional goal of solving this problem, and we set out to build our own dataset. The result is the MICScan vulnerable code sharing component, which allows developers to share commits that contain vulnerabilities and fixes. The MICScan machine learning model will incorporate these shared code samples in order to get better at recognizing real-world vulnerabilities. Any shared code samples will in turn be shared back with NIST so that other tool makers can use them to improve their own tools.
The extension allows the MICScan Scan application to be added to a DevOps pipeline in order to scan for vulnerabilities during a build.
Command Line Interface
This is the equivalent command line application that takes a project file as input.
Vulnerable Code Sharing
This project contains a Visual Studio extension that allows vulnerable commits to be shared through the MICScan API in the Web project. To share a commit, right-click it in the Git history within Visual Studio and select the option to share with MICScan.