Python

Using Git as a versioned data store in Python

Git has sometimes been described as a versioning file-system which happens to support the underlying notions of version control. And while most people do simply use Git as a version control system, it remains true that it can be used for other tasks as well.

For example, if you ever need to store mutating data in a series of snapshots, Git may be just what you need. It’s fast, efficient, and offers a large array of command-line tools for examining and mutating the resulting data store.

To support this kind of usage — for the upcoming purpose of maintaining issue tracking data in a Git repository — I’ve created a Python class that wraps Git as a basic shelve object.

Read More...
|

Script of the week: linkdups

It's been a while since I've posted a script; life has been distracting lately. I also wanted to let this current script mature a lot more before sharing it, as it has the potential to be destructive. Use wisely!

It's name is linkdups, and it's a Python program to recursively walk through a directory tree and hard-links any files together whose contents match exactly. That means that if you have two files, each taking up 10 Kb, afterwards they will be linked to the same contents for a total savings of 10 Kb. Read More...
|

Serving up Mercurial using mod_python

The following article resulted from several hours of battling with SELinux and Apache, attempting to find some way of serving up my Mercurial repository (now at http://hg.newartisans.com) over HTTP. Now I'm happy to bring you the fruits of that research, even though I'm still getting errors from Mercurial when trying to push (I'm using ssh at the moment). More to come on that front later... Read More...
|

Securing the Buildbot

One of the services I’m setting up on newartisans.com is Buildbot, so that every time changes get checked in the Ledger source tree, they’ll be automatically built and tested on every platform I have access to. Automatically. Without any intervention on my part whatsoever.

In terms of ease of use and assuring that things get done, Buildbot is a dream. In terms of system security, it’s a bit more of a nightmare.

Buildbot is a Python script that runs as a standalone daemon, listening for connections from the slaves in its network. Well, in terms of security I can never really trust a Python script. If anyone was able to subvert the code over the public TCP port, they could tell the script to do just about anything. It might not have much in the way of privilege, but at the very least it could read a whole lot of files on the system.

To run Buildbot securely means doing four things:

  1. Freeze the Buildbot Python script in a single static binary, with no references at all to the system while running.

  2. Run the server in a chroot jail, with access only to its own configuration files, its logs, and the state directories it uses to keep track of how the slaves are doing.

  3. Run the server as the nobody user in the nobody group, so that it has the least permission possible.

  4. Write an SELinux policy so that the daemon can only read and write exactly the files it needs to, and so it has permission to open a single TCP port and to send and receive traffic on that port.

With the environment set, a subverted daemon should just be a nuisance. The attacker could start instructing all of the slaves to keep building continuously, hoping to starve their system resources. And this can be alleviated by restricting builds to once daily. I’ll write more about the SELinux policy once it’s written.

|
© 2008 John Wiegley