Python

Using Git as a versioned data store in Python

Git has sometimes been described as a versioning file-system which happens to support the underlying notions of version control. And while most people do simply use Git as a version control system, it remains true that it can be used for other tasks as well.

For example, if you ever need to store mutating data in a series of snapshots, Git may be just what you need. It’s fast, efficient, and offers a large array of command-line tools for examining and mutating the resulting data store.

To support this kind of usage, with the upcoming goal of maintaining issue-tracking data in a Git repository, I’ve created a Python class that wraps Git as a basic shelve object.
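The general idea can be sketched in a few lines. This is not the class from the full article: `GitShelf`, its `history` method, and the one-JSON-file-per-key layout are all hypothetical names and choices made for illustration, and the sketch assumes a `git` executable on the PATH and simple key names without slashes.

```python
import json
import os
import subprocess


class GitShelf:
    """Hypothetical dict-like store: one JSON file per key, one commit per write."""

    def __init__(self, path):
        self.path = path
        os.makedirs(path, exist_ok=True)
        if not os.path.isdir(os.path.join(path, ".git")):
            self._git("init", "--quiet")

    def _git(self, *args):
        # Identity and signing are set inline so the sketch does not
        # depend on the user's global Git configuration.
        cmd = ["git",
               "-c", "user.name=shelf",
               "-c", "user.email=shelf@example.invalid",
               "-c", "commit.gpgsign=false",
               *args]
        result = subprocess.run(cmd, cwd=self.path, check=True,
                                capture_output=True)
        return result.stdout

    def __setitem__(self, key, value):
        # Write the value into the working tree, then snapshot it.
        with open(os.path.join(self.path, key), "w") as fh:
            json.dump(value, fh, sort_keys=True)
        self._git("add", "--", key)
        self._git("commit", "--quiet", "--allow-empty", "-m", f"set {key}")

    def __getitem__(self, key):
        # Read the committed version, not the working-tree file.
        try:
            blob = self._git("show", f"HEAD:{key}")
        except subprocess.CalledProcessError:
            raise KeyError(key) from None
        return json.loads(blob)

    def history(self, key):
        """Return every committed value of key, oldest first."""
        revs = self._git("log", "--format=%H", "--", key).decode().split()
        return [json.loads(self._git("show", f"{rev}:{key}"))
                for rev in reversed(revs)]
```

Because every write is an ordinary commit, the usual tools (`git log`, `git diff`, `git show`) work on the store without any extra code.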


Script of the week: linkdups

It's been a while since I've posted a script; life has been distracting lately. I also wanted to let this current script mature a lot more before sharing it, as it has the potential to be destructive. Use wisely!

Its name is linkdups, and it's a Python program that recursively walks a directory tree and hard-links together any files whose contents match exactly. That means that if you have two identical files, each taking up 10 KB, afterwards both names will point to the same contents, for a total savings of 10 KB.
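To make the technique concrete, here is a rough sketch of one way to do it. This is not the linkdups source: the function names, the size-then-hash grouping, and the `dry_run` flag are assumptions made for illustration. It defaults to a dry run because replacing files with hard links is destructive.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def link_duplicates(root, dry_run=True):
    """Hard-link identical files under root; return the bytes reclaimed.

    The total is a naive estimate: it ignores files that have other
    hard links elsewhere, whose space is not actually freed.
    """
    # Files of different sizes cannot match, so group by size first.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            size = os.path.getsize(path)
            if size:
                by_size[size].append(path)

    saved = 0
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        first_seen = {}  # digest -> first path seen with that content
        for path in sorted(paths):
            keeper = first_seen.setdefault(file_digest(path), path)
            if keeper == path:
                continue
            a, b = os.stat(keeper), os.stat(path)
            if a.st_dev != b.st_dev or a.st_ino == b.st_ino:
                continue  # different filesystem, or already linked
            if not filecmp.cmp(keeper, path, shallow=False):
                continue  # hash collision; contents differ
            saved += size
            if not dry_run:
                # Link under a temporary name, then rename over the
                # duplicate, so the path never goes missing.
                tmp = path + ".linkdups-tmp"
                os.link(keeper, tmp)
                os.replace(tmp, path)
    return saved
```

Calling `link_duplicates(some_dir)` only reports what would be saved; pass `dry_run=False` to actually create the links.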

Stateful directory scanning in Python

About half a year ago I wrote a little Python module for myself to do "stateful" directory scans. This means keeping watch on the state of a directory so that you can act on changes, such as files being added, removed, or modified. Now that I've been using this library every hour for that entire period, with only a few minor bug fixes to cover some exceptional cases, I believe that version 1.0 is ready for consumption. Today's article reviews the structure of this module and how to use it in your own scripts, since I designed it with the full intention that others would be able to use it.
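As a rough illustration of what "stateful" means here, the following sketch saves a snapshot of a directory between runs and reports the differences. It is not the module's actual API: `snapshot`, `scan`, and the JSON state file are hypothetical, and the sketch assumes the state file lives outside the scanned directory.

```python
import json
import os


def snapshot(root):
    """Map each file's relative path to [mtime in ns, size]."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat so broken symlinks do not raise
            state[os.path.relpath(path, root)] = [st.st_mtime_ns, st.st_size]
    return state


def scan(root, state_file):
    """Compare root against the saved state, then save the new state."""
    try:
        with open(state_file) as fh:
            old = json.load(fh)
    except FileNotFoundError:
        old = {}  # first run: every file counts as added

    new = snapshot(root)
    changes = {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in new.keys() & old.keys()
                          if new[k] != old[k]),
    }

    with open(state_file, "w") as fh:
        json.dump(new, fh)
    return changes
```

Run from cron or a similar scheduler, each call to `scan` returns only what changed since the previous call, which the calling script can then act on.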
© 2008 John Wiegley