© 2024, Global Digital Services LLC.

Remove secrets or sensitive data from a Git repository


Carlos Noguera - May 29, 2024

When you work on a project that uses Git repositories, it is hard to be 100% sure that no secrets ended up in your commit history. In my experience, this can happen at any time and can be a genuine mistake. Git repositories are one of the most important elements of software development: they are your single source of truth, where the business logic is stored and shared with multiple actors such as team members or third-party tools.

It is important to make sure that we don’t write secrets or other sensitive data to them. One way to achieve this is to find those secrets and remove them from the repo. Another is to catch them earlier: configure a security check in your local environment with Git hooks that inspects every commit before it is pushed to a remote, or run a security scan as part of a CI/CD pipeline.

In this post, I’ll show you how to detect secrets in a Git repository and how to remove them from the commit history.

Before we start this tutorial, we are going to need the following tools:

Docker:  Software framework for building, running, and managing containers.

Git: Open source distributed version control system.

Gitleaks: SAST tool for detecting and preventing hardcoded secrets like passwords, API keys, and tokens in Git repositories.

Git-filter-repo: Tool that lets you rewrite the entire history of a repository using user-specified filters.

This demo was tested on Ubuntu 22.04. It should work on other Linux distributions too, although the package installation command may differ.

Let’s begin, first install the required tools:

# We are going to use the docker version of gitleaks and configure 
# it with an alias for this demo

# Pull gitleaks docker image
docker pull zricethezav/gitleaks:latest

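# Create an alias for this command. Single quotes matter here: they make
# $(pwd) expand each time the alias runs, not once when it is defined
alias gitleaks='docker run -v "$(pwd)":/path zricethezav/gitleaks:latest --source=/path detect'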

# Install git-filter-repo
sudo apt install git-filter-repo
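
Before going further, it can help to confirm that the tools are actually on the PATH. The following is a minimal sketch; `missing_tools` is a hypothetical helper name I am introducing here, not part of any of the tools above.

```shell
# missing_tools is a hypothetical helper introduced for this sketch.
# It prints the name of every command in its argument list that is
# not found on the PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Example:
# missing_tools docker git git-filter-repo
```

If the example call prints nothing, all three commands were found.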

Let’s create a simple Git repository:

# Create a simple folder
mkdir -p demo-secrets
cd demo-secrets

# Init a git repository
git init -b main
git commit --allow-empty -m "added main branch"

# Create a branch that is going to contain our 'secrets'
git checkout -b secrets

# Add a simple secret: 
cat << EOF > credentials
aws_access_key_id=R3IASEOASIAFI8QCU6Z0
aws_secret_access_key=2EwxuC9hwr1DAcco8hRnqbPtTSEEUNDDskUEJneA
region=us-east-2
EOF

# Add the file and commit the changes
git add --all .
git commit -m "added AWS fake secrets"

We added a secret to our repo. Now let’s go back to the main branch and check whether we can detect it:


# Return to main branch
git checkout main

# Execute gitleaks command and check if we can detect our secrets in any 
# branch in our repo
gitleaks

○
│╲
│ ○
○ ░
░ gitleaks
7:44PM INF 1 commits scanned.
7:44PM INF scan completed in 3.63ms
7:44PM WRN leaks found: 2

It detected our secrets! Let’s check the details.


# Execute gitleaks command with verbose option
gitleaks -v

○ 
│╲ 
│ ○ 
○ ░
░ gitleaks 
7:46PM INF 1 commits scanned.
Finding: aws_access_key_id=R3IASEOASIAFI8QCU6Z0
aws_secret_access_k...
Secret: R3IASEOASIAFI8QCU6Z0
RuleID: generic-api-key
Entropy: 3.746439
File: credentials
Line: 1
Commit: 242894d124058762c26d6f9fdf3b4ce2baa07743
Author: Carlos Noguera
Email: xxxxxxxxx
Date: 2024-05-29T19:39:11Z
Fingerprint: 242894d124058762c26d6f9fdf3b4ce2baa07743:credentials:generic-api-key:1

Finding: aws_secret_access_key=2EwxuC9hwr1DAcco8hRnqbPtTSEEUNDDskUEJneA
region=us-east-2
Secret: 2EwxuC9hwr1DAcco8hRnqbPtTSEEUNDDskUEJneA
RuleID: generic-api-key
Entropy: 4.703056
File: credentials
Line: 2
Commit: 242894d124058762c26d6f9fdf3b4ce2baa07743
Author: Carlos Noguera
Email: xxxxxx
Date: 2024-05-29T19:39:11Z
Fingerprint: 242894d124058762c26d6f9fdf3b4ce2baa07743:credentials:generic-api-key:2

7:46PM INF scan completed in 3.72ms
7:46PM WRN leaks found: 2

It found two matches for the generic-api-key rule in the credentials file. Since we don’t want any secrets in our repo, we can remove the file.
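
On a larger repo the verbose report can get long, so it may be handy to reduce it to the list of affected files. This is a rough sketch that assumes the `File:` line format shown in the output above; `leaked_files` is a hypothetical helper name, not a gitleaks feature.

```shell
# leaked_files is a hypothetical helper introduced for this sketch.
# It reads a gitleaks verbose report on stdin and prints each
# distinct file name taken from the "File:" lines.
leaked_files() {
  sed -n 's/^File:[[:space:]]*//p' | sort -u
}

# Example, assuming the findings are written to standard output:
# gitleaks -v | leaked_files
```

Each printed path is a candidate for the removal step that follows.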

We can remove the credentials file from the repo with the following command. Be careful with it: git-filter-repo rewrites history, creating new commits, trees, tags, and blobs, so commit hashes change and a mistake can damage the internal structure of your repo.

# Remove credentials file from repo
git-filter-repo --invert-paths --path "credentials" --force

Parsed 2 commits
New history written in 0.01 seconds; now repacking/cleaning...
Repacking your repo and cleaning out old unneeded objects
HEAD is now at 242894d added main branch
Enumerating objects: 2, done.
Counting objects: 100% (2/2), done.
Writing objects: 100% (2/2), done.
Total 2 (delta 0), reused 0 (delta 0), pack-reused 0
Completely finished after 0.05 seconds.
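
Before running a rewrite like this on a real repository, it is worth taking a backup first, since the old history is discarded. A minimal sketch, assuming a mirror clone is an acceptable backup for your case; `backup_repo` and the backup path are hypothetical names chosen for illustration.

```shell
# backup_repo is a hypothetical helper introduced for this sketch.
# It copies every branch and tag of the repo at $1 into a bare
# mirror clone at $2.
backup_repo() {
  git clone --quiet --mirror "$1" "$2"
}

# Example (the backup path is an assumption; pick any location
# outside the repo):
# backup_repo . ../demo-secrets-backup.git
```

Keep in mind that the backup still contains the secret, so delete it once you have confirmed that the rewritten repo is in good shape.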

Now let’s check again if the secret was removed.

# Execute gitleaks command with verbose option again
gitleaks -v

○
│╲
│ ○
○ ░
░ gitleaks
7:53PM INF 0 commits scanned.
7:53PM INF scan completed in 3.13ms
7:53PM INF no leaks found

Awesome! We successfully removed the secret from our repo.
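
As a second check that does not depend on the scanner, Git itself can tell us whether any commit still touches the file. A small sketch; `path_in_history` is a hypothetical helper name, and it only looks at commits reachable from branches, tags, and other refs.

```shell
# path_in_history is a hypothetical helper introduced for this sketch.
# It succeeds if any commit reachable from any ref touches the path $1.
path_in_history() {
  [ -n "$(git log --all --format=%H -- "$1")" ]
}

# Example, run inside the repo:
# path_in_history credentials && echo "still present" || echo "gone"
```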

This is useful if you need to remove secrets from a repo manually. However, I recommend adding a proactive safeguard as well, such as Git hooks or security checks in CI/CD pipelines, so that sensitive information is not pushed to a remote repository in the first place.
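
As an illustration of the Git hook approach, here is a rough sketch of a pre-commit hook installer. It assumes a gitleaks binary installed on the PATH (a shell alias like the Docker one used in this demo is not visible to hook scripts) and a gitleaks version that provides the `protect --staged` command; newer releases may name this differently, so check `gitleaks --help` first. `install_gitleaks_hook` is a hypothetical helper name.

```shell
# install_gitleaks_hook is a hypothetical helper introduced for this sketch.
# It writes a pre-commit hook into the repo at $1 and makes it executable.
install_gitleaks_hook() {
  hook="$1/.git/hooks/pre-commit"
  cat > "$hook" << 'EOF'
#!/bin/sh
# Assumes a gitleaks binary on the PATH and a version that provides
# the "protect" command; a non-zero exit status aborts the commit.
exec gitleaks protect --staged --verbose
EOF
  chmod +x "$hook"
}

# Example, run from the repo root:
# install_gitleaks_hook .
```

Hooks live only in the local clone, so every team member has to install them; a CI/CD scan remains a useful second line of defense.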
