Delete Git History After Committing Large Files or Credentials - bfg-repo-cleaner

If you accidentally committed a large file to a public Git repository, or exposed sensitive information such as credentials or passwords, deleting the file in a new commit is not enough: it stays retrievable from earlier commits, so it has to be removed from the commit history itself. This article introduces bfg-repo-cleaner for that purpose.

Installation

BFG runs on the JVM, requires a Java 8 or later runtime (JRE), and is distributed as a single jar file.

The jar binary is stored in Maven Central. You can download it with the following command or open the URL directly in a browser.

$ curl -o bfg.jar -L https://repo1.maven.org/maven2/com/madgag/bfg/1.14.0/bfg-1.14.0.jar
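
Since BFG needs Java 8 or later, it can be worth checking the installed runtime before going further. The following is a minimal sketch, not part of BFG itself: java_major is a hypothetical helper name, and it assumes the version strings follow the usual "1.8.0_292" (legacy) or "17.0.2" (modern) shapes.

```shell
# Print the major Java version for a version string such as
# "1.8.0_292" (legacy scheme, major 8) or "17.0.2" (major 17).
# java_major is a helper invented for this sketch.
java_major() {
  case "$1" in
    1.*) rest=${1#1.}; printf '%s\n' "${rest%%.*}" ;;
    *)   printf '%s\n' "${1%%[.+-]*}" ;;
  esac
}

# Only query the JVM if a java executable is on PATH.
if command -v java >/dev/null 2>&1; then
  # "java -version" writes to stderr; the version is the first quoted string.
  version=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  major=$(java_major "$version")
  if [ "${major:-0}" -ge 8 ] 2>/dev/null; then
    echo "Java $version looks new enough for BFG"
  else
    echo "Java $version is older than Java 8, or could not be parsed" >&2
  fi
else
  echo "java not found on PATH" >&2
fi
```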

Usage

Clone the repository

First, clone a fresh copy of the repository with the --mirror flag.

$ git clone --mirror git://example.com/some-big-repo.git

This creates a bare repository, so normal working files are not shown, but it contains a full copy of the repository’s Git database. Make a backup at this point in case something goes wrong.
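
One simple way to take the backup mentioned above is to copy the whole mirror directory. This is only a sketch: backup_mirror and the ".backup-<timestamp>" naming are choices made here for illustration, not anything BFG or Git prescribes.

```shell
# Copy a bare mirror clone to a timestamped sibling directory and print
# the backup path. The function name and naming scheme are this sketch's own.
backup_mirror() {
  src=${1%/}   # drop a trailing slash, if any
  [ -d "$src" ] || { echo "not a directory: $src" >&2; return 1; }
  dest="$src.backup-$(date +%Y%m%d-%H%M%S)"
  cp -R "$src" "$dest" && printf '%s\n' "$dest"
}

# Usage: backup_mirror some-big-repo.git
```

If the cleanup goes wrong, the backup directory can be renamed back into place and the procedure restarted.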

Clean the repository

Now run BFG to clean the repository.

$ java -jar bfg.jar --strip-blobs-bigger-than 100M some-big-repo.git

This example removes large files. See usage examples below for deleting sensitive files or replacing sensitive information from commit history.
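
When choosing a size threshold, it may help to see which blobs in the history are actually large. The sketch below uses only standard Git plumbing (git rev-list and git cat-file); list_large_blobs is a hypothetical helper name, and it is meant to be run inside the repository.

```shell
# List blobs reachable from any ref that are at least MIN_BYTES in size,
# largest first, as "<bytes> <path>". Run inside the repository.
# list_large_blobs is a helper invented for this sketch.
list_large_blobs() {
  min_bytes=$1
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk -v min="$min_bytes" '$1 == "blob" && $2 >= min { print substr($0, 6) }' |
    sort -rn
}

# Usage (blobs of roughly 100 MB and up):
#   cd some-big-repo.git && list_large_blobs $((100 * 1024 * 1024))
```

Running the same function again after the cleanup is one way to confirm that nothing above the threshold is still reachable.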

Apply the changed data to the repository

BFG rewrites commits, branches, and tags, but it does not physically delete the unwanted objects. After checking that the history has been updated correctly, expire the reflog and remove the now-unreferenced objects with the standard git gc command.

$ cd some-big-repo.git
$ git reflog expire --expire=now --all && git gc --prune=now --aggressive

Push to the remote repository

Finally, when you are satisfied with the updated repository state, push it back. Because the clone command used the mirror flag, this push updates all mirrored references on the remote server.

$ git push

Replace every old clone with a new clone

After that, delete everyone’s old repository copy and replace it with a newly cloned copy. The old clones can accidentally reintroduce dirty history into the newly cleaned repository, so it is best to remove them all.

Usage Examples

Delete sensitive files

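Delete every file named id_rsa or id_dsa (SSH private keys). The pattern is quoted so that the shell passes it to BFG unchanged instead of expanding the braces into two separate arguments.

$ java -jar bfg.jar --delete-files 'id_{dsa,rsa}' my-repo.git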
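
After a deletion like this, one way to check the result is to look for the file name among all objects still reachable from any ref. The sketch below relies only on git rev-list; name_in_history is a hypothetical helper name, and it matches on the last path component regardless of directory.

```shell
# Exit 0 if any object reachable from any ref has a path whose final
# component equals NAME, otherwise exit 1. Run inside the repository.
# name_in_history is a helper invented for this sketch.
name_in_history() {
  git rev-list --objects --all |
    awk -v name="$1" '
      {
        sep = index($0, " ")                 # 0 for commits (no path)
        path = substr($0, sep + 1)
        n = split(path, parts, "/")
      }
      sep > 0 && n > 0 && parts[n] == name { found = 1 }
      END { exit !found }'
}

# Usage:
#   cd my-repo.git
#   if name_in_history id_rsa; then echo "id_rsa is still reachable"; fi
```

Note that BFG by default leaves the contents of the latest commit on HEAD untouched, so a file that is still present there will continue to show up until it is removed in an ordinary commit and BFG is run again.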

Delete large files

Remove every blob (Git's object type for file contents) larger than 50 MB.

$ java -jar bfg.jar --strip-blobs-bigger-than 50M my-repo.git

Replace commit history that contains sensitive information

If you committed sensitive information such as credentials or passwords, you can replace it throughout the history with the -rt, --replace-text {filename} option.

$ java -jar bfg.jar --replace-text passwords.txt my-repo.git

The file lists one expression per line. Each expression is treated as a literal string unless the line is prefixed with regex:, in which case it is handled as a regular expression. By default, matches are replaced with ***REMOVED***; append ==> followed by a replacement string to use something else.

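An example passwords.txt is shown below. The text from # onward on each line is annotation for the reader; leave it out of the real file, since BFG's documented syntax has no comment marker.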

PASSWORD1                       # Replace literal string 'PASSWORD1' with '***REMOVED***' (default)
PASSWORD2==>examplePass         # replace with 'examplePass' instead
PASSWORD3==>                    # replace with the empty string
regex:password=\w+==>password=  # Replace, using a regex
regex:\r(\n)==>$1               # Replace Windows newlines with Unix newlines
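
As a sketch, the same expressions can be written to a file from the shell without the annotations. The quoted heredoc delimiter keeps the shell from touching $1 or the backslashes; PASSWORD1 to PASSWORD3 are placeholders for real secrets.

```shell
# Write the expressions file with a quoted heredoc delimiter ('EOF') so the
# shell leaves "$1" and backslashes untouched. PASSWORD1..3 are placeholders.
cat > passwords.txt <<'EOF'
PASSWORD1
PASSWORD2==>examplePass
PASSWORD3==>
regex:password=\w+==>password=
regex:\r(\n)==>$1
EOF
```

Because this file contains the secrets in plain text, it is safest to create it outside the repository and delete it once the cleanup is finished.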

References