DevSecOps 101 Part 2: Detecting Insecure Source Code 📡

This article is part of a series about integrating security tooling into the development process. You can find the rest of the articles here:

This tutorial builds on the repository produced in Part 1, so be sure to follow that part first if you want to reproduce the steps below ⬇️

Like grep, but for code. ⚙️

Analyzing source code to find security vulnerabilities, known as Static Application Security Testing (SAST), has been part of the enterprise software development process for years.

But the tools used to do it were expensive, slow, and hard to master. Until recently, the only open-source tools with a decent developer experience were linters like pylint, eslint, or their equivalents in other languages.

Thus, only big corporations were able to test the security of their source code, leaving solo developers and tiny teams to rely on testing by hand, or worse: faith. 🕊️

But that era has come to an end with the release of an exciting tool: semgrep.

semgrep is, as its name suggests, like grep, but for source code. It lets developers automatically find patterns in their source code while taking semantics, such as variable renaming, into account. You can find an example of semgrep finding XSS in Django code here.
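To make the "grep, but for code" idea concrete, here is a toy comparison written with Python's standard `ast` module. It is only a sketch of the general idea and not how semgrep works internally; the `SOURCE` snippet and the helper names `grep_md5` and `semantic_md5` are made up for this illustration.

```python
import ast
import re

# Hypothetical snippet: hashlib is imported under an alias, so the literal
# text "hashlib.md5" never appears in the code.
SOURCE = '''
import hashlib as h

def fingerprint(data):
    digest = h.md5(data)
    return digest.hexdigest()
'''

def grep_md5(source):
    """Textual search: line numbers that contain the literal 'hashlib.md5'."""
    return [number for number, line in enumerate(source.splitlines(), start=1)
            if re.search(r"hashlib\.md5", line)]

def semantic_md5(source):
    """Syntax-tree search: line numbers of md5 calls on any alias of hashlib."""
    tree = ast.parse(source)
    # First pass: collect every name under which hashlib was imported.
    aliases = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name == "hashlib":
                    aliases.add(alias.asname or alias.name)
    # Second pass: find calls of the form <alias>.md5(...).
    hits = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "md5"
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id in aliases):
            hits.append(node.lineno)
    return sorted(hits)

print(grep_md5(SOURCE))      # [] : the text search misses the aliased call
print(semantic_md5(SOURCE))  # [5] : the syntax-aware search finds it
```

The text search comes back empty because the code never spells out `hashlib.md5`, while the syntax-aware search follows the import alias. semgrep applies this kind of reasoning to many languages, and with far more sophistication.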

Even better, semgrep supports a lot of languages, and the semgrep community has already written plenty of rulesets to detect bad practices and security flaws in them.

The goal of this tutorial is to deploy semgrep on our vulnerable Python app to detect insecure code. And guess what? It only takes a few minutes!

Detecting Insecure Code Patterns 🚩

Let's go into our dvpwa repository and activate the virtualenv:

cd <your_path_to_dvpwa>/dvpwa
source .venv/bin/activate

Then install semgrep using pip:

pip install semgrep

Then run semgrep:

semgrep --config "p/ci" --exclude .venv --error

You might be asking yourself "What the hell did I just write?", so let's go through the options we used:

  • --config "p/ci" means "use the community-written security rules meant for running in a CI environment"
  • --exclude .venv means "do not search for vulnerable source code in the .venv folder" (otherwise it would return hundreds of alerts!)
  • --error means "return a non-zero exit code if alerts are found", which is useful for making the CI fail when insecure patterns are detected
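If you would rather drive the scan from a script, the same three options can be assembled in Python. This is a minimal sketch: the helper names `build_semgrep_command` and `run_scan` are invented for this example, and only the flags already shown above are used.

```python
import shutil
import subprocess

def build_semgrep_command(config="p/ci", excludes=(".venv",), fail_on_findings=True):
    """Assemble the semgrep invocation described above as an argument list."""
    command = ["semgrep", "--config", config]
    for path in excludes:
        # One --exclude flag per path we do not want scanned.
        command += ["--exclude", path]
    if fail_on_findings:
        # --error makes semgrep exit with a non-zero code when it reports findings.
        command.append("--error")
    return command

def run_scan(command):
    """Run the scan; return its exit code, or None if semgrep is not on PATH."""
    if shutil.which(command[0]) is None:
        return None
    return subprocess.run(command).returncode

print(build_semgrep_command())
```

Calling `run_scan(build_semgrep_command())` from the repository root should behave like the command line above: an exit code of 0 means no alerts, and any other value is what makes a CI job fail.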

You should then see the following output:

Of course! dvpwa uses the MD5 algorithm to hash passwords, which is known to be insecure! Semgrep even gives us advice on how to solve the problem.
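For reference, here is roughly what a safer replacement could look like using only the Python standard library. Treat it as a sketch rather than semgrep's exact recommendation or dvpwa's actual code: the function names are hypothetical, and the iteration count is an assumption you should tune to your hardware and to current guidance.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumption: adjust to your hardware and current guidance

def hash_password(password, salt=None):
    """Return (salt, derived_key) using PBKDF2-HMAC-SHA256 with a random salt."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for every new password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key

def verify_password(password, salt, expected_key):
    """Recompute the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)

salt, key = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, key))  # True
print(verify_password("wrong password", salt, key))                # False
```

Unlike a bare MD5 digest, this is deliberately slow to compute and salted per user, which makes brute-force attacks and precomputed lookup tables far less practical.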

Adding Semgrep to the CI/CD 🤖

Now that we have discovered that we are using vulnerable code, why not add semgrep to our CI/CD to keep it from happening again?

Let's improve our GitHub Action from Part 1 to also use semgrep.

Open .github/workflows/main.yaml and add the following job:

  code_analysis:
    runs-on: ubuntu-latest
    name: Analyse code for security flaws
    steps:
      - uses: actions/checkout@v2
      - name: Code Security Analysis
        run: pip3 install semgrep && semgrep --config "p/ci" --error
        shell: bash

Your main.yaml file should look like this:

on: [push]

jobs:
  dependency_analysis:
    runs-on: ubuntu-latest
    name: Test dependencies for security flaws
    steps:
      - uses: actions/checkout@v2
      - name: Dependency Security
        run: pip3 install safety && safety check
        shell: bash
  code_analysis:
    runs-on: ubuntu-latest
    name: Analyse code for security flaws
    steps:
      - uses: actions/checkout@v2
      - name: Code Security Analysis
        run: pip3 install semgrep && semgrep --config "p/ci" --error
        shell: bash

Now, let's push our changes to the remote repository:

git add .github/workflows/main.yaml
git commit -m "Add static analysis security testing."
git push origin master

This should create an action named "Analyse code for security flaws" in your GitHub Actions panel.

Of course, this action fails because dvpwa contains insecure code!

Conclusion 😎

In only a few steps, we installed a tool that scans all our Python code to find insecure patterns, gives us recommendations on how to fix them, and integrates seamlessly into our CI/CD.

But the power of semgrep goes far beyond that: with it, you can write custom rules, automate refactoring, and enforce complex coding patterns. For more details, check out their documentation 😊.
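To give a taste of custom rules, here is a small sketch that writes a one-pattern rule file from Python and prints the command that would run it. The rule id, the message, and the file name are all made up for this example; check the semgrep documentation for the full rule syntax before relying on it.

```python
import tempfile
from pathlib import Path

# A hypothetical custom rule: flag any call to hashlib.md5(...) in Python code.
RULE = """\
rules:
  - id: no-md5-for-passwords
    languages: [python]
    severity: ERROR
    message: MD5 is not suitable for hashing passwords; use a dedicated KDF.
    pattern: hashlib.md5(...)
"""

def write_rule(path):
    """Write the rule to the given path and return it as a Path object."""
    rule_path = Path(path)
    rule_path.write_text(RULE, encoding="utf-8")
    return rule_path

# Written to the temp directory here; in a real project it would live in the repo.
rule_file = write_rule(Path(tempfile.gettempdir()) / "no-md5.yaml")
print(f"semgrep --config {rule_file} --exclude .venv --error")
```

The `...` inside the pattern is semgrep's ellipsis operator, which matches any arguments, so the rule fires regardless of what is passed to `md5`.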

In the next tutorial, we will have a look at dynamic analysis, that is, programs that interact with your running app to find security flaws 🚀

Follow Escape on Twitter, or check our website for more human-readable appsec content!

Tristan Kalos

Co-founder & CEO of Escape, in charge of product, sales and marketing. I was a dev before.