Work with BagIt packages from Python.

View the Project on GitHub LibraryOfCongress/bagit-python


bagit is a Python library and command line utility for working with BagIt-style packages. BagIt is a minimalist packaging format for digital preservation. If you're not already familiar with BagIt, you may be interested in the BagIt Wikipedia article, the IETF specification, or the short video below.

bagit is a single-file Python module that you can drop into your project as needed, or you can install it globally with:

pip install bagit

Python v2.4+ is required.


From Python you can use the bagit module to make a bag like this:

import bagit
bag = bagit.make_bag('mydir', {'Contact-Name': 'Ed Summers'})

Or, if you've got an existing bag:

import bagit
bag = bagit.Bag('/path/to/bag')

Or from the command line: --contact-name 'Ed Summers' mydir

If you want to validate a bag you can:

bag = bagit.Bag('/path/to/bag')
if bag.is_valid():
    print("yay :)")
else:
    print("boo :(")
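To give a feel for what validation involves, here is a minimal standard-library sketch that recomputes each checksum listed in a payload manifest and compares it with the stored value. It is not bagit's own implementation: `file_digest` and `check_manifest` are hypothetical helper names, the sha256 default is an assumption, and the sketch covers only the `manifest-<algorithm>.txt` check (it does not look for files that are present but unlisted, or check tag files).

```python
import hashlib
import os


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash one file in fixed-size chunks so large payloads are not read at once."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_manifest(bag_dir, algorithm="sha256"):
    """Return a list of problems; an empty list means every listed file matched."""
    problems = []
    manifest_path = os.path.join(bag_dir, "manifest-%s.txt" % algorithm)
    with open(manifest_path) as manifest:
        for line in manifest:
            line = line.rstrip("\n")
            if not line:
                continue
            # Each manifest line is "<checksum> <path relative to the bag>".
            expected, rel_path = line.split(None, 1)
            full_path = os.path.join(bag_dir, rel_path)
            if not os.path.isfile(full_path):
                problems.append("%s is missing" % rel_path)
            elif file_digest(full_path, algorithm) != expected:
                problems.append("%s failed checksum" % rel_path)
    return problems
```

Calling `check_manifest('/path/to/bag')` on an intact bag that carries a sha256 manifest should return an empty list. In practice you would rely on `bag.is_valid()` rather than a hand-rolled check like this one.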

If you'd like to generate the checksums using several parallel system processes instead of a single process:

bagit.make_bag('mydir', {'Contact-Name': 'Ed Summers'}, processes=4) 

or: --processes 4 --contact-name 'Ed Summers' mydir

bag --help will give the full set of options.
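The idea behind the processes option can be illustrated with a small sketch that spreads file hashing across a pool of worker processes. This is an assumption about the general technique, not a copy of what bagit does internally; `sha256_of` and `checksum_tree` are hypothetical names, and sha256 is chosen only for illustration.

```python
import hashlib
import os
from multiprocessing import Pool


def sha256_of(path):
    """Return (path, hex digest) for one file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return path, digest.hexdigest()


def checksum_tree(root, processes=1):
    """Hash every file under root, optionally across several worker processes."""
    paths = sorted(
        os.path.join(dirpath, name)
        for dirpath, _dirs, names in os.walk(root)
        for name in names
    )
    if processes > 1:
        # Each worker hashes a share of the files; map() keeps the input order.
        pool = Pool(processes=processes)
        try:
            results = pool.map(sha256_of, paths)
        finally:
            pool.close()
            pool.join()
    else:
        results = [sha256_of(path) for path in paths]
    return dict(results)
```

For example, `checksum_tree('mydir', processes=4)` returns a dictionary mapping each file path to its digest. On platforms where multiprocessing starts workers by spawning a fresh interpreter, the call needs to sit under an `if __name__ == "__main__":` guard. Whether extra processes help depends on how much of the time goes to disk reads rather than hashing.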


% git clone git://
% cd bagit 
% python

If you'd like to see how increasing the parallelization of bag creation affects the time it takes to create a bag on your system, try the included bench utility:

% ./