Single backup repo with git

Well, as Linus says, with git he doesn't do backups, and it's true that a DVCS partially solves the backup problem as long as someone has cloned your repo.

For branches that are not published yet, you still have to take backups. Instead of setting up a backup repo for every project, you could use a single one for all of them! Every project gets its own prefix, so the namespace stays consistent.

mkdir ~/backup
cd ~/backup
git --bare init

cd projectA
git remote add backup ~/backup
# You probably don't want to fetch from this repo
git config remote.backup.fetch ''
# Push branches to a project namespace
git config remote.backup.push '+refs/heads/*:refs/heads/projectA/*'

git push backup

If you want to track the backup branches you can do it with the following fetch refspec:

git config remote.backup.fetch '+refs/heads/projectA/*:refs/remotes/backup/*'
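
To make the namespace mapping concrete, here is a minimal Python sketch of how a wildcard refspec of this shape rewrites ref names. The map_ref helper is hypothetical and only handles the single-'*' form used above; git's own matching has more rules than this.

```python
PUSH_REFSPEC = "+refs/heads/*:refs/heads/projectA/*"
FETCH_REFSPEC = "+refs/heads/projectA/*:refs/remotes/backup/*"


def map_ref(refspec, ref):
    """Map `ref` through a '<src>:<dst>' refspec with one '*' on each side.

    Returns the destination ref name, or None if `ref` does not match <src>.
    """
    # A leading '+' only means "allow non-fast-forward updates".
    spec = refspec[1:] if refspec.startswith("+") else refspec
    src, dst = spec.split(":", 1)
    prefix, suffix = src.split("*", 1)
    if len(ref) < len(prefix) + len(suffix):
        return None
    if not (ref.startswith(prefix) and ref.endswith(suffix)):
        return None
    # Whatever the '*' matched on the source side is pasted into the destination.
    matched = ref[len(prefix):len(ref) - len(suffix)]
    return dst.replace("*", matched, 1)
```

With the push refspec, refs/heads/master in projectA lands at refs/heads/projectA/master in the backup repo; the fetch refspec brings it back as refs/remotes/backup/master and ignores the branches of every other project.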

Switch to mod_wsgi

I just switched the blog to mod_wsgi because I had some issues with the Spawning deployment.

All I had to do was create a mod_wsgi application from the WebFaction control panel and edit httpd.conf to map '/' to my blog's wsgi script:

WSGIPythonPath /home/yatiohi/envs/lanata/lib/python2.5/site-packages
WSGIScriptAlias / /home/yatiohi/repos/git/django-lanata-new/lanata/wsgi/server.wsgi

I am using virtualenv, so I had to make Apache use it somehow. This can be done in many ways: hack the Apache start scripts to activate the environment, site.addsitedir() the env's site-packages dir, or use the WSGIPythonPath mod_wsgi directive. I find the latter a bit clearer.
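
For comparison, the site.addsitedir() variant would live at the top of the wsgi script. A minimal sketch, assuming the same site-packages path as in the WSGIPythonPath line above; the use_virtualenv helper name is made up for this example.

```python
import os
import site
import sys

# Assumption: the same directory that WSGIPythonPath points at above.
ENV_SITE_PACKAGES = "/home/yatiohi/envs/lanata/lib/python2.5/site-packages"


def use_virtualenv(site_packages):
    """Append a virtualenv's site-packages dir to sys.path.

    Unlike a plain sys.path.append(), site.addsitedir() also processes any
    .pth files found in that directory. Returns True once the path is listed.
    """
    site.addsitedir(site_packages)
    return os.path.abspath(site_packages) in sys.path
```

The wsgi script would call use_virtualenv(ENV_SITE_PACKAGES) before importing anything that lives inside the environment.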

Everything seems to work as expected, for now at least :)

Upgrading WordPress with git

Among other things, I maintain the WordPress blog we have at Ιδεόπολις (Ideopolis). The blog's code has its own branch inside the WordPress git repository.

In practice this means that every upgrade (like the latest security release, 2.6.5) is just three commands:

     $ git fetch origin
     $ git checkout my_blog
     $ git rebase 2.6.5   # (non-conflicting)

git coolness

I am using git to track my home directory, with a branch for every PC I use. Tracking large directories like home can be quite annoying: git status, for example, will try to stat every file for changes and produces 3-4 pages of output where all the useful information is 3 pages up from your cursor! The same happens in the git commit editor.

Luckily, git supports the status.showUntrackedFiles option.

git config status.showUntrackedFiles no saves the option to your .git/config file. The results:

$ git status|wc -l
240
$ git config  status.showUntrackedFiles no
$ git status|wc -l
7

Nice stuff :)

Submitting Spree

It seems that I have committed 25 revs in the last 2 days to the transifex submissions branch. I really have to stop and start studying!

tx-subm]$ hg log --date=-2 --template='{desc|firstline}\n'
Handle exceptions correctly in `tx' cli
Better exceptions in transifex.client
capture generic validation errors in the api
log validation errors
Now all validation exception inherit from ValidationException
tx-init.py now creates local-vcs/{git,hg,svn,cvs}
Enable submit logging
minor: cleanup module.py
Fix major issue with db session storage in submission
Add `Submission.to_session()` method
We dont need `BrowserMixin.html_diff()` any more
msgfmt_check now returns boolean instead of raising an exception
minor: None instead of Null
Trying to fix Submission.files encoding problems (again)
Changed the way diff() functions work
minor: Documentation fix
minor: remove changelog logic from the template
Teach `SResource.valid_filename()` about global ignore filter
New style wildcards + tests
minor: move debug message
Added `SResource.get_files()` which returns valid filenames
Validation errors are raised from `Submission.validate()` ...
Add `BrowserMixin.walk()` function
minor: fix wrong log message
merge tx-devel

GSoC: Tx-api & our new submission system

Google Summer of Code 'pencils down day' is approaching! Luckily, the new transifex submission layer is ready! I have currently implemented git, mercurial and svn submitters on top of it. With this new layer, writing a new submission method is really easy: you just have to write a class with 2 or 3 methods! Also, the mercurial, svn and git submitters now use Python modules to interact with the repositories.

API

A new API for the submit process has been written (preview & submit). This will help us implement a CLI client and interact with other web apps or other transifex instances in the future.

Command line client

A command line client has been implemented using the new transifex API. Its name is tx (no kidding!). Here is a demo session:

$ tx submit --module testmodule-svn --branch trunk \
--filename 'po/el.po' --contents el-new.po \
-m 'update greek translation' --diff
---  
+++  
@@ -256,7 +257,7 @@

 #: transifex/module.py:705
 msgid "Your submission was committed successfully."
-msgstr "Η υποβολή σας εναποτέθηκε με επιτυχία."
+msgstr "Η υποβολή σας καταχωρήθηκε με επιτυχία."

You can check out the code at our code browser. Any feedback is more than welcome :)

GSoC 2008

The Google Summer of Code 2008 students were announced yesterday... and I have been accepted! :)

I'll work on an abstract submission system for Transifex, a promising web UI for translators used by the Fedora Project.

It's gonna be an interesting summer!

Django command line interface

Yesterday I started experimenting with a command line interface for a Django application.

When you write a CLI you may need to pass complex data (dictionaries, Python objects) to the application. So it's not a good idea to break their contents apart and send them piecemeal via POST to the server; it's better to serialize them first.

On the application side, you have to find that data in the request.POST dictionary, deserialize it, do the things you want to do with it and finally return a json response to the client.
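
As a rough illustration of that round trip, the sketch below serializes the arguments to JSON, sends them as a single POST field, and reverses the steps on the other side. It is written for Python 3 and uses only the standard library; the field name 'data' and both helper names are assumptions for the example, not what json_utils actually uses.

```python
import json
from urllib.parse import parse_qs, urlencode


def encode_post_body(args):
    """Client side: pack JSON-serializable arguments into one form field."""
    return urlencode({"data": json.dumps(args)})


def decode_post_body(body):
    """Server side: pull the field out of the form body and deserialize it."""
    fields = parse_qs(body)
    return json.loads(fields["data"][0])
```

Nested structures such as a list of tags survive the trip intact, which would not be the case if every key were flattened into its own POST field.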

Bits

I wrote a BaseClient class that basically provides a send_request method. You just have to specify the remote action and the arguments to pass, and send_request takes care of the low level bits, POSTs the data and returns a deserialized response from the webserver.

On the server, a middleware detects whether a request requires a JSON response and whether it carries serialized data. This data is bound to request.JSON.

Finally, the view uses a JSONResponse object which is a wrapper around the famous json_encoder by Wolfram Kriesing.

I have packaged all that code in a python module named json_utils so it can be reused.
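
The server-side half could look roughly like the following. This is a framework-free Python 3 sketch rather than the actual json_utils middleware: the 'format' and 'data' POST field names are guesses, and FakeRequest merely stands in for Django's request object. Only the request.format and request.JSON attribute names come from the views in this post.

```python
import json


class FakeRequest(object):
    """Stand-in for a Django request; it only carries a POST dictionary."""

    def __init__(self, post):
        self.POST = post


class JSONMiddlewareSketch(object):
    """Hypothetical middleware: binds deserialized POST data to the request."""

    def process_request(self, request):
        # Record which response format the client asked for (assumed default: html).
        request.format = request.POST.get("format", "html")
        # Bind deserialized data only when the client actually sent some.
        request.JSON = None
        if request.format == "json" and "data" in request.POST:
            request.JSON = json.loads(request.POST["data"])
        # Returning None lets normal request handling continue.
        return None
```

A view can then check request.format and read request.JSON without knowing how the data travelled over the wire.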

Writing the application

Now, using json_utils, implementing a simple CLI is pretty straightforward:

Adding a middleware:

MIDDLEWARE_CLASSES = (
    # ...
    'json_utils.middleware.JSONMiddleware',
    # ...
)

The NotesClient:

#!/usr/bin/env python
# notes-client.py

from json_utils.client import BaseClient

class NotesClient(BaseClient):
    "A command line client to interact with notes django-app"

    def take_note(self, name, note, tags):
        reqs = dict(title=name,body=note,tags=tags)
        return self._send_request('take_note', reqs)

    def list_notes(self,tags):
        reqs = dict(tags=tags)
        return self._send_request('list_notes',reqs)

if __name__ == "__main__":
    notes_cl = NotesClient('http://localhost:8080/notes/',debug=True)
    params = dict(name='A Demo note', note="Foo", tags=['test', 'demo'])

    notes_cl.take_note(**params)
    res = notes_cl.list_notes(tags=['demo'])
    for n in res['objects']:
        print n['title']

And the urls.py and views.py serving the client:

#urls.py
urlpatterns = patterns('',
    url(r'^take_note/$', view = 'notes.views.take_note',
        name = 'take_note'
    ),
    url(r'^list_notes/$', view = 'notes.views.list',
        name = 'list'
    ),
)

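#views.py
from django.http import Http404
# JSONResponse (from json_utils) and the Note model are assumed to be imported here too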
def take_note(request):
    if request.format != 'json':
        raise Http404
    data = request.JSON
    n = Note()
    n.title = data['title']
    n.body = data['body']
    n.tags = ' '.join(data['tags'])

    errors = n.validate()
    if errors:
        return JSONResponse({ 'errors' :errors })
    n.save()
    return JSONResponse({ 'object' : n })

def list(request):
    if request.format != 'json':
        raise Http404
    data = request.JSON
    tags = data['tags']
    notes = Note.tagged.with_all(tags=tags)
    return JSONResponse({'objects' : notes })

Well, that's it! It's the first time I'm messing with JSON and remote webapps, so I started by visiting djangosnippets.com and got a ton of good ideas about handling JSON in Django. The thing I enjoyed the most is that in a CLI-based app you don't have to write any CSS at all :)

If you want to check out the code, you can find json_utils at github. Any feedback is more than welcome :)

History meme

Following up on Dimitris' entry, here are my results:

$ history | awk '{a[$2]++ } END {for(i in a){print a[i] " " i}}' \
  | sort -rn | head
302 git
130 cd
98 gvim
51 ./manage.py
48 ls
46 grep
35 cat
28 ping
27 nosetests
25 sudo
$ echo $HISTFILESIZE
1000

AutoField in sqlite

It seems that AutoField in Django behaves a bit weirdly when using SQLite.

When a new object is created, its integer primary key gets the largest value in the column at that time, plus one. In other databases, that integer gets the largest value that ever existed in the column, plus one, which keeps the key unique for the lifetime of the table.

How can this be an issue?

If you delete your last blog post and you don't take care of all the foreign keys pointing to it, comments for example, you will end up having all those comments pointing to your next blog post.

Why?

Django maps AutoField to this SQL column definition:

    "id" integer NOT NULL PRIMARY KEY

Sqlite seems to have added support recently for an AUTOINCREMENT keyword that behaves like the other dbs, so a better sql output would be:

    "id" integer NOT NULL PRIMARY KEY AUTOINCREMENT

Ticket: #6947
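
The difference is easy to reproduce with Python's built-in sqlite3 module. The sketch below uses a throwaway in-memory table; the table name 'post' and the helper name are invented for the example, and only the two column definitions are taken from above.

```python
import sqlite3

PLAIN = '"id" integer NOT NULL PRIMARY KEY'
AUTOINC = '"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT'


def next_id_after_deleting_last(column_definition):
    """Insert three rows, delete the newest one, and return the id of the next insert."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE post (%s, title TEXT)" % column_definition)
    for title in ("first", "second", "third"):
        conn.execute("INSERT INTO post (title) VALUES (?)", (title,))
    # Drop the last post (id 3), as in the blog-post scenario above.
    conn.execute("DELETE FROM post WHERE id = 3")
    cursor = conn.execute("INSERT INTO post (title) VALUES (?)", ("fourth",))
    new_id = cursor.lastrowid
    conn.close()
    return new_id
```

With the plain definition the new row reuses id 3, so any orphaned comments that still point at 3 now belong to the new post; with AUTOINCREMENT the new row gets id 4.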

Playing with signals and threadedcomments

In my blog I use Eric's threadedcomments application (although in a non-threaded way!).

I noticed yesterday that when you delete a post, the related comments remain in the database and that was a good opportunity to use django signals for the first time!

Planning

We'll introduce two new settings:

DELETE_COMMENTS_ON_DELETE: Boolean. Always delete comments when the related objects are deleted.

DELETE_COMMENTS_ON_DELETE_MODELS: List. Delete comments when the related model is in the list. The list should be in the form ('myapp.mymodel',)

A small test

Before writing the actual code, let's write a simple test:

#######################################
### delete_comments_on_delete Tests ###
#######################################
>>> import datetime
>>> from threadedcomments.models import FreeThreadedComment, TestModel
>>> from django.contrib.contenttypes.models import ContentType
>>> topic = TestModel.objects.create(name = "Test")
>>> topic.save()
>>> comment = FreeThreadedComment.objects.create_for_object(
...     topic, name = "Eric", ip_address = '127.0.0.1',
...     comment = 'This is fun!  This is very fun!',
... )
>>> comment_id = comment.id
>>> topic.delete()
>>> FreeThreadedComment.objects.filter(id=comment_id).count()
0

We run the test now and it fails, so we have something to fix!

Coding

We now have to write a handler function that connects to the post_delete signal for the models. After an object instance is deleted, the handler is called and deletes the related comments from the database.

# handlers.py

# Loads the installed apps
from django.db.models.loading import get_models

from django.conf import settings
from django.contrib.contenttypes.models import ContentType
from django.dispatch import dispatcher
from django.db.models import signals
from models import FreeThreadedComment, ThreadedComment

DELETE_COMMENTS_ON_DELETE = getattr(settings, 'DELETE_COMMENTS_ON_DELETE', False)
DELETE_COMMENTS_ON_DELETE_MODELS = getattr(settings, 'DELETE_COMMENTS_ON_DELETE_MODELS', [])

def hook_delete_comments(sender, instance, signal, *args, **kwargs):
    # Remove both kinds of comments attached to the deleted instance
    FreeThreadedComment.objects.all_for_object(instance).delete()
    ThreadedComment.objects.all_for_object(instance).delete()

def delete_comments_on_delete(*args):
    for model in args:
        dispatcher.connect(hook_delete_comments, signal=signals.post_delete, sender=model)

models = []

if DELETE_COMMENTS_ON_DELETE:
    models = get_models()
    delete_comments_on_delete(*models)
elif DELETE_COMMENTS_ON_DELETE_MODELS:
    for pair in DELETE_COMMENTS_ON_DELETE_MODELS:
        app_name, model_name = pair.split('.')
        type = ContentType.objects.get(app_label=app_name, model=model_name)
        models += [type.model_class()]
    delete_comments_on_delete(*models)

Depending on our settings, delete_comments_on_delete iterates through the models and registers the post_delete handler for each of them.

We want to make sure that this code is executed when threadedcomments is loaded, so we import it from __init__.py
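
The connect-then-fire flow can also be illustrated without Django. The sketch below is a plain-Python stand-in, not Django's dispatcher: TinySignal, the Post class and the COMMENTS dict are all invented for the example, and only the two function names mirror handlers.py.

```python
class TinySignal(object):
    """Very small observer registry: handlers are keyed by sender class."""

    def __init__(self):
        self._handlers = {}

    def connect(self, handler, sender):
        self._handlers.setdefault(sender, []).append(handler)

    def send(self, sender, instance):
        for handler in self._handlers.get(sender, []):
            handler(sender=sender, instance=instance)


post_delete = TinySignal()

# Comments stored per object, standing in for the comments table.
COMMENTS = {}


class Post(object):
    def __init__(self, name):
        self.name = name

    def delete(self):
        # The model knows nothing about comments; it only fires the signal.
        post_delete.send(sender=Post, instance=self)


def hook_delete_comments(sender, instance, **kwargs):
    # Drop whatever comments are attached to the deleted instance.
    COMMENTS.pop(instance, None)


def delete_comments_on_delete(*models):
    for model in models:
        post_delete.connect(hook_delete_comments, sender=model)
```

Once delete_comments_on_delete(Post) has run, deleting a Post instance clears its entry in COMMENTS even though Post.delete() never mentions comments.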

#__init__.py
import handlers

Conclusion

The good thing about this solution is that it doesn't touch another application's code (by overriding the delete method, for example). This can be useful for apps based on generic relations, so they can interact with the models in a transparent way.

Hello World!

So here we are! :)

Here's to a good road ahead; for now, though: sleep.