James Aylett: Recent diary entries

  1. Sunday, 7 Feb 2010: Exceptional HTTP status codes in Django
  2. Tuesday, 20 Oct 2009: Simple search for Django using Haystack and Xapian
  3. Friday, 24 Jul 2009: Implementing deferred registration (or lazy registration) with Django
  4. Tuesday, 24 Mar 2009: Ada Lovelace Day
  5. Wednesday, 25 Feb 2009: Reinstall
  6. Wednesday, 19 Nov 2008: The Open Rights Group
  7. Sunday, 21 Sep 2008: Interface magic
  8. Thursday, 11 Sep 2008: Improvising makes me gulp
  9. Tuesday, 2 Sep 2008: Thoughts on Google Chrome
  10. Tuesday, 24 Jun 2008: Complexity increases

Exceptional HTTP status codes in Django

Published at
Sunday 7th February, 2010
Tagged as
  • Django
  • /dev/fort

Note that the extension this talks about has subsequently been renamed to django_exceptional_middleware.

Django has support for emitting a response with any HTTP status code. Some of these are exceptional conditions, though, so the natural Pythonic way of generating them would be to raise a suitable exception. Except for two very common situations (in which you get default template rendering unless you’re prepared to do a bit of work), Django doesn’t make this an easy approach.

At the third /dev/fort we needed something to make this easy, which Richard Boulton threw together and I subsequently turned into a Django app. It allows you to override the way that exceptional status codes are rendered.

Usual HTTP status codes in Django

In Django, HTTP responses are built up as an object, and one of the things you can set is the HTTP status code, a number that tells your user agent (web browser, indexing spider, Richard Stallman’s email gateway) what to do with the rest of the response. Django has some convenient built-in support for various common status codes.

For 200, your code is in control of the HTTP response entity, which is generally the thing displayed in a web browser, parsed by an indexing spider, or turning up in an email. For 301 and 302, the entity is ignored in many cases, although it can contain say a web page explaining the redirect.

For 404 and 500, the default Django behaviour is to render the templates 404.html and 500.html respectively; you can change the function that does this, which allows you to set your own RenderContext appropriately (so that you can inject some variables common across all your pages, for instance). Usually I make this use my project-wide render convenience function, which sets up various things in the context that are used on every page. (You could also add something to TEMPLATE_CONTEXT_PROCESSORS to manage this, but that’s slightly less powerful since you’re still accepting a normal template render in that case.)

I want a Pony!

Okay, so I had two problems with this. Firstly, I got thoroughly fed up of copying my handler404 and handler500 functions around and remembering to hook them up. Secondly, for the /dev/fort project we needed to emit 403 Forbidden regularly, and wanted to style it nicely. 403 is clearly an exceptional response, so it should be generated by raising an exception of some sort in the relevant view function. The alternative is some frankly revolting code:

def view(request, slug):
    obj = get_object_by_slug(slug)
    if not request_allowed_to_see_object(request, obj):
        return HttpResponseForbidden(...)

Urgh. What I want to do is this:

def view(request, slug):
    obj = get_object_or_403(MyModel, slug=slug)

which will then raise a (new) Http403 exception automatically if there isn’t a matching object. Then I want the exception to trigger a nice rendered template configured elsewhere in the project.
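To make that concrete, here is a minimal sketch of what such a helper could look like. The names Http403 and get_object_or_403 come from the text above, but the bodies are my assumption rather than the code from the app; the function only relies on the model having objects.get() and a DoesNotExist exception, as Django models do.

```python
class Http403(Exception):
    """Illustrative exception meaning '403 Forbidden' (not part of Django)."""
    status_code = 403


def get_object_or_403(model, **lookup):
    """Return the single object matching `lookup`, or raise Http403.

    Assumes `model` behaves like a Django model class: it has an
    `objects.get(**lookup)` manager method and a `DoesNotExist` exception.
    """
    try:
        return model.objects.get(**lookup)
    except model.DoesNotExist:
        raise Http403("No %s matches the given query." % model.__name__)
```

Nothing here renders a response; turning the exception into a styled page is left to whatever catches it further up.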

This little unicorn went “501, 501, 501” all the way home

If you don’t care about 403, then maybe you just want to use HTTP status codes as easter eggs, in the way that Wildlife Near You does; some of its species are noted as 501 Not Implemented (in this universe).

get_object_or_403 (a digression)

If you’ve used Django a bit you’ll have encountered the convenience function get_object_or_404, which looks up a model instance based on parameters to the function, and raises Http404 for you if there’s no matching instance. get_object_or_403 does exactly the same, but raises a different exception. Combine this with some middleware (which I’ll describe in a minute) and everything works as I want. The only question is: why would I want to raise a 403 when an object isn’t found? Surely that’s what 404 is for?

The answer is: privacy, and more generally: preventing unwanted information disclosure. Say you have a site which allows users (with URLs such as /user/james) to show support for certain causes. For each cause, there’s a given slug, so “vegetarianism” has the slug vegetarianism and so on; the page about that user’s support of vegetarianism would then be /user/james/vegetarianism. Someone’s support of a cause may be public (anyone can see it) or private (only people they specifically approve can see it). This leads to three possible options for each cause, and the “usual” way of representing these using HTTP status codes would be as follows:

  • the user doesn’t support the cause: 404 Not Found
  • the user supports the cause publicly: 200 OK
  • the user supports the cause privately: 403 Forbidden (unless you’re one of the people they’ve approved)

This is fine when the main point is to hide the details of the user’s support. However if the cause is “international fascism” (international-fascism), a user supporting privately may not want the fact that they support the cause to be leaked to the general public at all. At this point, either the site should always return 404 to mean “does not support or you aren’t allowed to know”, or the site should always return 403 to mean the same thing.
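The choice between the two schemes can be written down as a small decision function. This is only a sketch: the function name and its parameters are hypothetical, and the "usual" mapping it encodes is my reading of the scenario above.

```python
def support_page_status(supports, is_public, viewer_approved, conceal_with=None):
    """Pick an HTTP status for a page such as /user/james/vegetarianism.

    conceal_with is None for the "usual" scheme, or 403 / 404 to answer
    every non-visible case identically so private support is not leaked.
    """
    if supports and (is_public or viewer_approved):
        return 200
    if conceal_with is not None:
        return conceal_with
    # Usual scheme: private support is Forbidden, no support is Not Found.
    return 403 if supports else 404
```

With conceal_with=404 (or 403), a visitor who isn't approved gets the same answer whether or not the user supports the cause.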

The HTTP specification actually suggests (weakly) using 404 where you don’t want to admit the reason for denying access, but either way is apparently fine by the book, and in some cases 403 is going to make more sense than 404. However since Django doesn’t give any magic support for raising an exception that means 403, we needed an exception and some middleware to catch it and turn it into a suitable response.

django_custom_rare_http_responses

The idea then is that you raise exceptions that are subclasses of django_custom_rare_http_responses.responses.RareHttpResponse (some of the more useful ones are in django_custom_rare_http_responses.responses), and the middleware will then render this using a template http_responses/404.html or whatever. If that template doesn’t exist, it’ll use http_responses/default.html, and if that doesn’t exist, the app has its own default (make sure that you have django.template.loaders.app_directories.load_template_source in your TEMPLATE_LOADERS setting if you want to rely on this).
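The lookup order described above can be sketched in plain Python. The two function names are hypothetical, and returning None simply stands for "fall back to whatever default the app ships"; in a real project Django's select_template, which takes a list of names and returns the first one found, does the equivalent job.

```python
def candidate_templates(http_code):
    """Template names to try, most specific first."""
    return ["http_responses/%d.html" % http_code, "http_responses/default.html"]


def pick_template(http_code, available):
    """Return the first candidate present in `available`, else None.

    `available` is any container of template names that exist in the
    project; None means "use the app's own bundled default".
    """
    for name in candidate_templates(http_code):
        if name in available:
            return name
    return None
```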

In order to override the way that the template is rendered, you want to call django_custom_rare_http_responses.middleware.set_renderer with a callable that takes a request object, a template name and a context dictionary, the last of which is set up with http_code, http_message (in English, taken from django.core.handlers.wsgi.STATUS_CODE_TEXT) and http_response_exception (allowing you to subclass RareHttpResponse yourself to pass through more information if you need). The default just calls the render_to_response shortcut in the expected way.
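Here is a framework-free sketch of how those pieces could fit together. It is a stand-in, not the app's source: the class bodies are assumptions, the middleware class name is made up, the default renderer just returns a tuple instead of calling render_to_response, and the status text comes from the standard library rather than from Django. It also skips setting the status code on the response. The process_exception hook is the usual Django middleware method for this job.

```python
from http.client import responses as STATUS_TEXT  # stdlib: code -> reason phrase


class RareHttpResponse(Exception):
    """Stand-in for the app's base exception; subclasses set http_code."""
    http_code = 500


class HttpForbidden(RareHttpResponse):
    http_code = 403


def _default_renderer(request, template_name, context):
    # Stand-in for render_to_response; returns a plain tuple here.
    return (template_name, context)


_renderer = _default_renderer


def set_renderer(renderer):
    """Install a callable taking (request, template_name, context)."""
    global _renderer
    _renderer = renderer


class ExceptionMiddlewareSketch(object):
    def process_exception(self, request, exception):
        # Returning None lets any other exception propagate as usual.
        if not isinstance(exception, RareHttpResponse):
            return None
        context = {
            "http_code": exception.http_code,
            "http_message": STATUS_TEXT.get(exception.http_code, ""),
            "http_response_exception": exception,
        }
        template_name = "http_responses/%d.html" % exception.http_code
        return _renderer(request, template_name, context)
```

A project-wide render function with the same three arguments can be passed straight to set_renderer.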

What this means in practice is that you can add django_custom_rare_http_responses.middleware.CustomRareHttpResponses to MIDDLEWARE_CLASSES, and django_custom_rare_http_responses to INSTALLED_APPS, then raise HttpForbidden (for 403), HttpNotImplemented (for 501) and so on. All you need to do to prettify them is to create a template http_responses/default.html and away you go.
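As a settings fragment, that wiring might look like the following. The two dotted names are the ones given above; where they sit relative to your existing middleware and apps is left open, and the comments mark where your own entries would go.

```python
# settings.py (fragment)
MIDDLEWARE_CLASSES = (
    # ... your existing middleware ...
    'django_custom_rare_http_responses.middleware.CustomRareHttpResponses',
)

INSTALLED_APPS = (
    # ... your existing apps ...
    'django_custom_rare_http_responses',
)
```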

My Pony only has three legs

This isn’t finished. Firstly, those 500 responses that Django generates because of other exceptions raised by your code aren’t being handled by this app. Secondly, and perhaps more importantly, adding this app currently changes the behaviour of Django in an unexpected fashion. Without it, and when settings.DEBUG is True, 404s are reported as a helpful backtrace; with it, these are suddenly rendered using your pretty template, which is a little unexpected.

However these are both fairly easy to fix; if it bugs you and I haven’t got round to it, send me a pull request on github once you’re done.

Simple search for Django using Haystack and Xapian

Published at
Tuesday 20th October, 2009
Tagged as
  • Django
  • Search
  • Xapian

Update: note that this is now pretty old, and so probably won’t work. Further, Django now supports full text search if you’re using PostgreSQL, which is probably a better starting point for getting some basic search up and running.

Search isn’t that hard

Back when programmers were real programmers, and everyone wrote in BCPL, search was hard. You had to roll your own indexing code, write your own storage backend, and probably invent your own ranking function. If you could do all that, you were probably working in the field of information retrieval yourself, which was kind of cheating.

These days, not only are there libraries that will do all the work for you, but there’s direct integration with a growing number of ORMs and web frameworks. I’m going to briefly show how to use one specific combination — Xapian and Django, bound together using Haystack — to add basic search functionality to a simple web application.

The Haystack tutorial is pretty good, but provides more options and different paths than are needed to demonstrate it working. Additionally, for licensing reasons the Xapian backend isn’t a fully-supported part of Haystack, so the options it spends most time talking about are Solr, a document search system built in Java on top of Lucene, and Whoosh, a pure-Python search library. There are ongoing problems with using Whoosh in a live environment, but people are having lots of success with both Solr and Xapian as backends; if Solr is your thing, run through the setup from the Haystack tutorial and either keep going, or come back here; after the initial setup, Haystack makes all the core features work across the different backends. If you’ve never worked with Solr or Java before, Xapian is often easier to get going, and hopefully this walk-through is easier to follow than Haystack’s by leaving out the things you don’t need.

Alternatives

Of course, these being the days of mashups, you could just pull in search from somewhere else, like Google Custom Search. This gives you something out of the box fairly quickly, and can be integrated into your site; it certainly looks like a viable alternative. However there are a number of reasons why you might not want to use it, which include:

For other reasons, it’s no longer really acceptable to build a “search engine” using things like FULLTEXT indexes in MySQL, so we’ll skip right over that.

The app we’ll use

We’ll demonstrate this using a simple bookmarking system. We’re really only going to look at the Django models involved, and how to search across them; all of this integrates nicely with the Django admin, so if you want to follow along at home, use that to input some data.

I’ve called the entire project bookmarker, and the app that contains the following models bookmarks. If you create a project using Gareth Rushgrove’s django-project-templates, start the bookmarks app and you should be ready to go.

class RemotePage(SluggableModel):
  # meta; these are updated each time someone bookmarks the page
  title = models.CharField(max_length=255)
  summary = models.TextField()
  uri = models.URLField(verify_exists=False, unique=True)

  def bookmarks(self):
    from bookmarker.bookmarks.models import Bookmark
    return Bookmark.objects.filter(page=self)

  @models.permalink
  def get_absolute_url(self):
    return ('page', (), { 'slug': self.slug } )

  def __unicode__(self):
    return self.title

class Bookmark(LooselyAuditedModel):
  # what have we bookmarked?
  page = models.ForeignKey(RemotePage)

  @staticmethod
  def bookmarks_for_request(request):
    if request.user.is_authenticated():
      return Bookmark.objects.filter(created_by=request.user)
    else:
      return Bookmark.get_stashed_in_session(request.session)

  def __unicode__(self):
    return self.page.title

SluggableModel is from django_auto_sluggable, automatically managing the (inherited) slug field, and LooselyAuditedModel is from django_audited_model, which provides fields tracking when and who created and last modified an object. Both are projects of mine available on github.

The idea is that we store a reference to each unique URL bookmarked, as RemotePage objects; each actual bookmark becomes a Bookmark object. We’re then going to search through the RemotePage objects, based on the title and summary.

The only remaining hard part about search — at least, fairly simple search — is figuring out what to make available for searching. The right way to think about this is to start at the top, at the user: figure out what sort of questions they will want to ask, and then find out where the data to answer those questions will come from.

There’s some good work ongoing on how to approach this problem; for instance, search patterns is a project to document the behavioural and design patterns helpful for this.

For now let’s just assume a very basic scenario: our users will want to find out what pages people have bookmarked about a subject, or matching a set of words. The information for this is all going to come from the RemotePage objects, specifically from the title and summary information extracted when the bookmarks are created.

Search fields

Most search systems have the concept of fields. You’re probably already familiar with the idea from using Google, where for instance you can restrict your search to a single website using the site field, with a search such as site:tartarus.org james aylett.
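As a rough illustration of the idea, here is a toy splitter that separates field restrictions from free-text terms. The function name and the list of known fields are made up for this example; real search systems parse queries far more carefully.

```python
def split_fields(query, known_fields=("site", "title")):
    """Split 'site:tartarus.org james aylett' into field filters and free terms."""
    filters, terms = {}, []
    for token in query.split():
        field, sep, value = token.partition(":")
        if sep and value and field in known_fields:
            filters.setdefault(field, []).append(value)
        else:
            # Anything that is not a recognised field:value pair is a plain term.
            terms.append(token)
    return filters, terms
```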

In this case, we’re going to create two main search fields: title and text, with the latter being the important one, built from both the title and summary. We’ll make the title more important than the summary, in the hope that people give their webpages useful names summarising their contents.

The title field will only be built out of the title, although we’re not going to need this to achieve our goal; it’s just to show how it’s done.

We need to install Xapian, which is available as packages for Debian, Ubuntu, Fedora and others; and as source code. We only care about the core and the python bindings; if building from source these are in xapian-core and xapian-bindings. See the Xapian downloads page for more information.

We also need Haystack, available on PyPI as django-haystack; and also xapian-haystack, available on PyPI as xapian-haystack. So the following should get you up and running:

easy_install django-haystack
easy_install xapian-haystack

Indexing

Haystack makes indexing reasonably simple, and automatic on saving objects. There’s a bit of administration to set up in settings.py first:
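A settings fragment along these lines should do. It assumes the Haystack 1.x setting names (HAYSTACK_SITECONF, HAYSTACK_SEARCH_ENGINE and, from xapian-haystack, HAYSTACK_XAPIAN_PATH); the index location is a hypothetical path, so pick whatever suits your deployment.

```python
# settings.py (fragment), assuming Haystack 1.x setting names
import os

INSTALLED_APPS = (
    # ... your existing apps ...
    'haystack',
    'bookmarker.bookmarks',
)

# Module that calls haystack.autodiscover()
HAYSTACK_SITECONF = 'bookmarker.haystack'
# Backend name provided by xapian-haystack
HAYSTACK_SEARCH_ENGINE = 'xapian'
# Hypothetical location for the Xapian index on disk
HAYSTACK_XAPIAN_PATH = os.path.abspath('xapian_index')
```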

Then we need to set up bookmarker.haystack. Create bookmarker/haystack/__init__.py containing:

import haystack
haystack.autodiscover()

When the Haystack app is set up, it will find this module, import it, and find the search index we’re about to create as bookmarker/bookmarks/search_indexes.py:

from haystack import indexes, site
from bookmarker.bookmarks.models import RemotePage

class RemotePageIndex(indexes.SearchIndex):
    text = indexes.CharField(document=True, use_template=True)
    title = indexes.CharField(model_attr='title')

site.register(RemotePage, RemotePageIndex)

If this looks a lot like an admin.py, then that’s good: it’s doing a similar job of registering a helper class alongside the RemotePage model that will take care of indexing it. Haystack then provides a separate facility for us to search this, which we’ll use later on.

We’ll take the text field first. It’s marked document=True, which you’ll want to put on the main search field in each index. Secondly, and more importantly, it’s marked use_template=True. This means that it won’t be simply extracted from a field on the RemotePage object, but rather that a Django template will be rendered to provide the index data. The template is stored under the slightly weird name search/indexes/bookmarks/remotepage_text.txt, which is built up from the app name (bookmarks), the model name (remotepage) and the field name in the SearchIndex (text). The contents I’ve used are:

{{ object.title }}
{{ object.title }}
{{ object.title }}
{{ object.title }}
{{ object.title }}

{{ object.summary }}

Okay, so this looks a little weird. In order to increase the weight of the title field (ie how important it is if words from the user’s search are found in the title), we have to repeat the contents of it. Haystack doesn’t provide another approach to this (and it’s unclear how it ever could, from the way it’s designed). In fact, writing a template like this can actually cause problems with phrase searching: say the title of a page is “Man runs the marathon”, the phrase search "marathon man" (including quotes) should not match the page — but will do because of the above trick.
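The phrase-search problem is easy to reproduce with a toy matcher. This sketch assumes word positions simply run on from one repeated line to the next, which is the behaviour described above; the helper names are made up and no search library is involved.

```python
import re


def tokens(text):
    """Lower-case word tokens, in document order."""
    return re.findall(r"\w+", text.lower())


def phrase_matches(phrase, text):
    """True if the words of `phrase` appear consecutively in `text`."""
    words, wanted = tokens(text), tokens(phrase)
    return any(words[i:i + len(wanted)] == wanted
               for i in range(len(words) - len(wanted) + 1))


title = "Man runs the marathon"
indexed_once = title
# What the template above produces for the title part of the text field.
indexed_repeated = "\n".join([title] * 5)
```

Against indexed_once the phrase "marathon man" doesn't match; against indexed_repeated it does, because the end of one copy of the title sits right next to the start of the next.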

The title field is simpler, and just gets its data from the title on the model, as specified by the model_attr parameter.

To build the index for the first time, run python manage.py reindex. Normally, saving an object will cause it to be reindexed, so any pages you create or modify after that will have their entry in the search index updated.

Searching

Adding search to the web interface is now really simple; in fact, because Haystack provides a default search view for us, we don’t have to write any code at all. We do two things: first, we create a search endpoint in our URLconf:

(r'^search/', include('haystack.urls')),

and then create a template search/search.html:

<h1>Search</h1>

<form method="get" action=".">
  <label for='id_q'>{{ form.q.label }}</label>
  {{ form.q }}
  <input type="submit" value="Search">
</form>

{% if page.object_list %}
  <ol>
    {% for result in page.object_list %}
      <li>
        <a href='{{ result.object.get_absolute_url }}'>{{ result.object.title }}</a>
        ({{ result.object.bookmarks.count }} bookmarks)
      </li>
    {% endfor %}
  </ol>
{% else %}
  <p>No results found.</p>
{% endif %}

And you can go to http://127.0.0.1:8000/search/ or wherever, and search any pages you’ve created.

Limitations of Haystack

When you get into writing your own search code, Haystack has a lot of cleverness on that side, making it appear much like database queries against Django’s ORM. I don’t actually think this is necessary, and in fact might be confusing in the long term, but it does make it easier to get into if you have no prior experience of search.

However the indexing side of Haystack is very limited. Because of the decision to use templates to generate the content for search fields (which is a fairly Django-ish choice), all sorts of sophisticated index strategies are simply impossible. We saw above the workaround required to upweight a particular ORM field when building a search field, and the dangers of that approach. An inability to finely control data from different sources going into a particular search field is a major drawback of Haystack’s design.

Also, Haystack is limited when it comes to excluding objects from the database. It has a mechanism which sometimes allows this to work, but it often requires reindexing the entire database, or at least a particular model, to bring things in and out of the database later. There seems to be no way of using a single ORM field to indicate whether a given object should currently be searchable or not (ie in the index or not). This makes it difficult, for instance, to support the common pattern of marking objects as visible or not using a published field; for instance, the User object in django.contrib.auth.models does something similar with the is_active field.

Going further

Haystack has a bunch more features, allowing more sophisticated fields, and searching involving ranges and so forth. Check out its documentation for more information.

If you’re building an app that needs complex search, I’m not convinced that Haystack is the right choice. However it provides a very easy route in, and once you’ve got something up and running you can start analysing the real use of search in your system to help you plan something more sophisticated.

And having some search is much, much better than having none. Haystack is a great way of getting over that first hurdle, and into the world of Django search.

Implementing deferred registration (or lazy registration) with Django

Published at
Friday 24th July, 2009
Tagged as
  • Lazy registration
  • Deferred registration
  • Django

Rationale

Imagine a shopping site where you had to sign up before you added things to your basket; you could certainly use it, but it would have to have strong incentives for you to bother rather than, say, use Amazon or another site where you can just get stuck straight into shopping. Of course, most shopping sites let you shop first and register later, so you never really think about it.

But why should you have to register for any site before using it? (Beyond special cases.) You don't have to register before using Google, and you don't have to register before using the BBC News site; you also don't have to register before using Hunch, a site to help you make decisions that first has to get various pieces of information out of you so it can make useful suggestions. Deferred registration is a way of describing this, a phrase I've heard bandied around over the last year or so among web folk. But isn't it terribly difficult?

While at /dev/fort 2 back in June, we built a cooking site, where the main object is a recipe. Since we were building the site from scratch, and since it seemed like obviously the right thing to do, we used deferred registration: you can create recipes before you login or register, and later on they get stuck under the right user. We did this partly to convince ourselves that it really wasn't that difficult; and partly so we could convince other people ;-)

At some point we may put the code for the entire site online, but for now I've extracted what we wrote into a small re-usable extension to Django; it provides an easy way of storing objects in the session until we're ready to assign them to the correct user. Here, I'll describe how to use it in building a very simple website for keeping track of your present wishlist, in a manner similar to Things I Love.

"Deferred registration"?

It made sense to me calling this deferred registration, but it turns out it's more commonly called lazy registration.

Doing it in Django

We have a few assumptions here, and one is that you know how to use Django; I'm not going to go into great detail about the mechanics that are common to Django projects, since we can get that from the online documentation. I won't show every detail, and you won't end up with a fully-working wishlist system at the end of this, but hopefully you'll understand how to implement deferred registration. Additionally, I assume you're using the Django session middleware, and the standard auth system (in django.contrib.auth).

The code I've written is on github; this just needs to be on your PYTHONPATH somewhere.

Later on, I assume you're going to use Simon Willison's new django-openid (also on github), which while still a work-in-progress is sufficiently complete that it can be used here, and makes life a little easier. (The name is slightly misleading, as it doesn't just support OpenId, but provides a consistent workflow around registration and authentication.)

Creating our models

We'll have three main models: Wishlist, WishlistItem and WishlistItemPurchase. There isn't going to be anything hugely sophisticated about what we build here; let's just keep things very simple. (Actually, I'm not even going to touch the latter two; but the intention is that this is a just-enough-complicated example that I can use again in future.)

The key thing here is that SessionStashable is a mixin to the Model class; it augments the class (and the instantiated objects) with some useful methods for managing a list of objects in the session. If you're working with more than one stashable model, as we are here, you need to set session_variable to something different on each model. For now, that's all we need to do: your models are session-stashable straight off.
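If you're wondering what such a mixin has to do, here is a plain-Python sketch of the idea. It is an assumption about the mechanism (a list of primary keys kept under session_variable), not the extension's actual source; the method names mirror the ones used in this post, and the session is treated as an ordinary dict.

```python
class StashableSketch(object):
    """Illustrative mixin: remembers object ids in a dict-like session."""
    session_variable = 'object_stash'

    def stash_in_session(self, session):
        # Objects that already have a creator never need stashing.
        if getattr(self, 'created_by', None) is not None:
            return
        ids = list(session.get(self.session_variable, []))
        if self.pk not in ids:
            ids.append(self.pk)
        # Assign rather than mutate in place, so the session sees the change.
        session[self.session_variable] = ids

    def stashed_in_session(self, session):
        return self.pk in session.get(self.session_variable, [])

    @classmethod
    def num_stashed_in_session(cls, session):
        return len(session.get(cls.session_variable, []))
```

Giving each model its own session_variable keeps the lists of ids for different models apart.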

We're making wishlists and purchases stashable, but not wishlist items, because every item is already in a wishlist, so we know all the items were created by whoever created the wishlist. Note that created_by on both our stashable models is marked null=True, blank=True, because we need to be able to create them without a creator in the database.

Finally, I'm not showing the mechanism for generating slugs; I use another tiny Django plugin that takes care of this for me. Hopefully I'll be able to write about this in due course.

# wishlist.wishlists.models
from django.db import models
from django.contrib.auth.models import User
from django_session_stashable import SessionStashable

class Wishlist(models.Model, SessionStashable):
    slug = models.CharField(max_length=255)
    created_by = models.ForeignKey(User, null=True, blank=True)
    name = models.CharField(max_length=255)

    session_variable = 'wishlist_stash'

    @models.permalink
    def get_absolute_url(self):
        return ('wishlist', (), { 'list_slug': self.slug })

    def __unicode__(self):
        return self.name

class WishlistItem(models.Model):
    slug = models.CharField(max_length=255)
    wishlist = models.ForeignKey(Wishlist)
    name = models.CharField(max_length=255)
    description = models.TextField()

    def __unicode__(self):
        return self.name

class WishlistItemPurchase(models.Model, SessionStashable):
    item = models.ForeignKey(WishlistItem)
    created_by = models.ForeignKey(User, null=True, blank=True)
    created_at = models.DateTimeField(auto_now_add=True)

    session_variable = 'purchase_stash'

    def __unicode__(self):
        if self.created_by:
            return u"%s: %s @ %s" % (self.created_by, self.item, self.created_at)
        else:
            return u"%s @ %s" % (self.item, self.created_at)

Hopefully there's nothing unexpected there, and you're following along nicely.

Creating our views

I'm not going to bother showing all the views here, as the same techniques will apply to both stashable models. Let's set up urls.py as follows.

urlpatterns += patterns('',
    url(r'^lists/$', 'wishlist.wishlists.views.list_list', name='wishlist-list'),
    url(r'^lists/create/$', 'wishlist.wishlists.views.list_create', name='create-wishlist'),
    url(r'^lists/(?P<list_slug>[a-zA-Z0-9-]+)/$', 'wishlist.wishlists.views.list_view', name='wishlist'),
    url(r'^lists/(?P<list_slug>[a-zA-Z0-9-]+)/edit/$', 'wishlist.wishlists.views.list_edit', name='edit-wishlist'),
)

Anyone (including an anonymous user) can list wishlists; in a full application, we'd show your friends' lists, but for now we'll just show any you've created. You can then edit any wishlist you created (similarly for delete, not shown here).

Don't worry about the render function for the moment; we'll look at that further in a minute. For now, it's doing basically the same job as render_to_response.

# wishlist.wishlists.views
from django.http import HttpResponseForbidden
from django.shortcuts import get_object_or_404
from django.forms.models import ModelForm
from wishlist.wishlists.models import Wishlist
from wishlist.shortcuts import render, HttpResponseRedirect

class WishlistForm(ModelForm):
    class Meta:
        model = Wishlist
        fields = ('name',)

def list_list(request):
    if request.user.is_anonymous():
        lists = Wishlist.get_stashed_in_session(request.session)
    else:
        lists = Wishlist.objects.filter(created_by=request.user)
    return render(request, 'wishlists/list_list.html', { 'lists': lists })

def list_view(request, list_slug):
    l = get_object_or_404(Wishlist, slug=list_slug)
    if not l.viewable_on_this_request(request):
        return HttpResponseForbidden("You're not allowed to see this wishlist. Sorry!")
    return render(request, 'wishlists/list_view.html', { 'list': l, 'editable': l.editable_on_this_request(request) })

def list_edit(request, list_slug):
    l = get_object_or_404(Wishlist, slug=list_slug)
    if not l.editable_on_this_request(request):
        return HttpResponseForbidden("You're not allowed to edit this wishlist. Sorry!")
    if request.method=='POST':
        form = WishlistForm(request.POST, instance=l)

        if form.is_valid():
            l = form.save()
            return HttpResponseRedirect(l.get_absolute_url())
    else:
        form = WishlistForm(instance=l)
    return render(request, 'wishlists/list_edit.html', {
        'list': l,
        'form': form,
    })

def list_create(request):
    if request.method=='POST':
        form = WishlistForm(request.POST)

        if form.is_valid():
            wishlist = form.save(commit=False)
            if not request.user.is_anonymous():
                wishlist.created_by = request.user
            wishlist.save()
            wishlist.stash_in_session(request.session)
            return HttpResponseRedirect(wishlist.get_absolute_url())
    else:
        form = WishlistForm()

    return render(request, 'wishlists/list_create.html', {
        'form': form,
    })

Again, nothing terribly complex. The list_list view pulls objects out of the session stash if you're an anonymous user (since once you've logged in or registered they'll just be marked as created by you). list_create does a tiny dance to set created_by if you're logged in, and stashes the new wishlist away in the session (stash_in_session is careful not to stash anything if the object has something in created_by).

Notice that we haven't used anything like login_required, because this shuts off everything to users who haven't registered yet. Instead, we check whether the user can view or edit a wishlist, and return a HttpResponseForbidden object. (This looks a little ugly by default, although you can set things up to get a custom template rendered instead.) We've introduced two new methods into the Wishlist class: editable_on_this_request() and viewable_on_this_request(), which are needed so we can check the session if needed.

def viewable_on_this_request(self, request):
    if self.created_by is None:
        return self.stashed_in_session(request.session)
    else:
        return True

def editable_on_this_request(self, request):
    if self.created_by is None:
        return self.stashed_in_session(request.session)
    else:
        return self.created_by == request.user

In editable_on_this_request, if created_by is set it won't match the AnonymousUser, only a real authenticated one.

Reparenting on registration/login

Finally, on registration or login, we need to pass over the wishlists we've created (and wishlist purchases we've made, although we haven't discussed any of that side of the site) and reparent them to the user we now have. This is actually very easy, and takes two lines of code in a suitable view.

Wishlist.reparent_all_my_session_objects(request.session, request.user)
WishlistItemPurchase.reparent_all_my_session_objects(request.session, request.user)
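
What a call like that has to do can be sketched without Django. This is an assumption about the behaviour rather than the extension's code: the function name and the objects mapping (standing in for a database lookup by primary key) are hypothetical.

```python
def reparent_stashed(objects, session, session_variable, user):
    """Assign `user` as creator of every stashed, ownerless object.

    `objects` maps primary key -> object and stands in for a database
    lookup; each object needs a `created_by` attribute and a `save()` method.
    """
    reparented = []
    for pk in session.get(session_variable, []):
        obj = objects.get(pk)
        if obj is not None and obj.created_by is None:
            obj.created_by = user
            obj.save()
            reparented.append(obj)
    # Once owned, the objects no longer need to live in the session.
    session[session_variable] = []
    return reparented
```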

Let's look at how this will work with django_openid. First, we modify urls.py.

from auth.views import registration_views

urlpatterns += patterns('',
    (r'^login/$', registration_views, {
        'rest_of_url': 'login/',
    }),
    (r'^logout/$', registration_views, {
        'rest_of_url': 'logout/',
    }),
    (r'^account/(.*)$', registration_views),
)

Then we write our views file; django_openid uses a Strategy pattern, so all the views are tied into a single object.

# wishlist.auth.views
from django_openid.registration import RegistrationConsumer
from django_openid.forms import RegistrationForm
from django.conf import settings
from wishlist.wishlists.models import Wishlist, WishlistItemPurchase
from wishlist.shortcuts import render, HttpResponseRedirect

class RegistrationViews(RegistrationConsumer):
    def on_registration_complete(self, request):
        Wishlist.reparent_all_my_session_objects(request.session, request.user)
        WishlistItemPurchase.reparent_all_my_session_objects(request.session, request.user)
        return HttpResponseRedirect('/')

    def on_login_complete(self, request, user, openid=None):
        Wishlist.reparent_all_my_session_objects(request.session, request.user)
        WishlistItemPurchase.reparent_all_my_session_objects(request.session, request.user)
        return HttpResponseRedirect('/')

    def render(self, request, template_name, context=None):
        context = context or {}
        return render(request, template_name, context)

registration_views = RegistrationViews()

In practice you'll want to do more work here. For a start, there are lots of templates you'll need, although django_openid ships with ones that may be suitable for you. But this shows the bits that matter to us: two hook methods, on_registration_complete and on_login_complete, which are the right place to reparent all our stashed objects. It also makes django_openid use our render function — and now it's time to look at that.

Warning users of unparented objects

One final touch: it's nice to give a hint to people who haven't registered or logged in that they will lose their wishlists unless they do. Since we want to put this message on every single page, it needs to be done centrally, at the point that we render our templates. This is where the render function we've been using comes in.

# wishlist.shortcuts
from django.template import RequestContext
from django.shortcuts import render_to_response
from django.http import HttpResponseRedirect

def render(request, template_name, context=None):
    from wishlist.wishlists.models import Wishlist, WishlistItemPurchase
    context = context or {}
    context['stashed_wishlists'] = Wishlist.num_stashed_in_session(request.session)
    context['stashed_purchases'] = WishlistItemPurchase.num_stashed_in_session(request.session)
    return render_to_response(
        template_name, context, context_instance=RequestContext(request)
    )

Using a central function like this means that we can have a number of variables appear in the render context for every template rendered. In this case, we just have two: stashed_wishlists and stashed_purchases. We can then use them to show a brief warning. This would go in a base template (so all other templates would need to use {% extends "base.html" %} or similar).

{% if stashed_wishlists %}
    <div id="notifications" class="receptacle">
        <div class="section">
            <p><em>Warning</em>: you have <a href="{% url wishlist-list %}">unsaved wishlists</a>!
            <a href="/account/register/">Register</a> or <a href="/login/">log in</a> to keep them forever.</p>
        </div>
    </div>
{% endif %}

There isn't much to say here at all. (Except, okay, the HTML could be simpler; but this is all taken from a running app, in the hope it will avoid introducing bugs.)
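The central render function is one way to get these variables into every template. A different technique, not the one used in this app, is a Django context processor: a callable that takes the request and returns a dict to merge into the template context. Here's a rough sketch of how that might look. The factory function and the Fake classes are invented for illustration, and the only thing assumed of the models is the num_stashed_in_session method used above.

```python
def make_stash_context_processor(**models):
    """Build a context processor reporting stashed-object counts.

    Each keyword maps a template variable name to a class providing
    num_stashed_in_session(session).
    """
    def stash_counts(request):
        return dict(
            (name, model.num_stashed_in_session(request.session))
            for name, model in models.items()
        )
    return stash_counts


# Stand-ins so the sketch runs without Django or the real models.
class FakeWishlist(object):
    @classmethod
    def num_stashed_in_session(cls, session):
        return len(session.get('stashed_wishlists', []))


class FakeRequest(object):
    def __init__(self, session):
        self.session = session


stash_counts = make_stash_context_processor(stashed_wishlists=FakeWishlist)
context = stash_counts(FakeRequest({'stashed_wishlists': [4, 7]}))
print(context)
```

In a real project the resulting function would be listed among the template context processors in the project's settings, and only templates rendered with a RequestContext would see the variables.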

Going further

You can use the same approach for things like comments; store the comment with no creator, stash it in the session, and come back to it later. With care, and some thought, you can apply this technique to most things that might need deferred registration.

It's a good idea to plan all the interactions around deferred registration before you start coding (which goes against the instincts of a lot of developers). As well as the code behind creation and reparenting, it can have an impact on how your pages fit together, and the interaction elements on those pages. You also have to pay attention when you render lists of things that can be created in this way, so you either don't display those waiting for a real creator, or alter the display suitably so you don't show text like created by None. If the latter, you may want to open up the display page to more than just the creator.
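The two options for lists can be sketched in a few lines of plain Python. Item, the helper names and the placeholder text are all made up for illustration, and creators are just strings here rather than user objects.

```python
class Item(object):
    """Stand-in for anything that may not have a creator yet."""
    def __init__(self, title, created_by=None):
        self.title = title
        self.created_by = created_by


def with_real_creator(items):
    """Option one: drop anything still waiting for a creator."""
    return [item for item in items if item.created_by is not None]


def creator_label(item, placeholder='someone who has not registered yet'):
    """Option two: keep the item, but never print 'None' as a name."""
    if item.created_by is None:
        return placeholder
    return item.created_by


items = [Item('Birthday', 'alice'), Item('Christmas')]
shown = [item.title for item in with_real_creator(items)]
labels = [creator_label(item) for item in items]
print(shown)
print(labels)
```

With the Django ORM, option one would more likely be a queryset filter on created_by (for instance using the isnull lookup) than a list comprehension.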

Despite a little bit more work up front, it shouldn't take much longer overall to get this right. Don't believe me? Grab the code and try using it next time you build a site.

Ada Lovelace Day

Published at
Tuesday 24th March, 2009
Tagged as
  • Inspiration
  • Ada Lovelace
  • Ada Lovelace Day
  • Women

Today, 24th March, is Ada Lovelace Day, an idea from Suw Charman-Anderson for people to celebrate the women in technology whom they admire. Amongst the aims are to provide role models for women in technology, and to celebrate inspirational women in technology. Both of these are great aims (and this approach strikes me as considerably more practical and sane than getting only women together for some tech love-in, or having serious round-table discussions about how to get more women into tech jobs).

But I'm not going to write about a leading woman in technology, either now or in the past. I want to spend a few quick words stating, flatly, that the women I've worked with on technology projects (be they programmers, designers, product folk, customer service people, or going right back to school and the girls I did experiments with in my A-level chemistry class[1]) have been, with very few exceptions, talented, dedicated, interesting and passionate people. Almost all have inspired me to be better at what I do; and many have challenged me at some point or other, forcing me to think again when I was certain I was right, or to consider a different perspective when I thought there was only one valid viewpoint. I can say that of far fewer of the men I've worked with[2].

I could spend half an hour drawing up a list of those women, but I'd be bound to leave some out. Come talk to me about them, or perhaps check out who I follow on twitter and find some to inspire you as well. I would be poorer without any one of them.


1 With them, not on them. They were entirely willing to play reflux reactions with me.

2 Although it's possible that I just ignored more of the men. I don't react well to declarative statements I disagree with, which appears to be a more male way of trying to argue with me.

Reinstall

Published at
Wednesday 25th February, 2009
Tagged as
  • The future (is more shiny)
  • Old School
  • Microsoft

I'm currently reinstalling my Windows machine, giving it brand new drives and basically a complete make-over, this to prepare it for editing Talk To Rex's next production. Generally speaking, you have to reinstall Windows every so often anyway, and this machine has gone for some time without, so all is well and good.

Except... except that it all seems so alien to the modern way of working with computers. For instance, the motherboard doesn't have SATA onboard, so I have a card to do that. Windows therefore can't install onto the new drives, despite the card installing BIOS features so that everything other than Windows knows what's going on. Instead, I have to install onto a plain IDE drive, install the SATA drivers onto that (which is painful in the extreme, because the driver disk doesn't install directly but instead tries to write a floppy image containing the installer), and then let Windows take control of the new drives. Another example: my keyboard and graphics tablet are USB, like anything sane these days, and are plugged into my monitor, which is a USB hub plugged into the computer. Windows Setup doesn't bother initialising the hub, so it's like there's no keyboard plugged into the machine until I go and find an old PS/2 one to use.

Admittedly I'm spoiled because most of the time I'm using a Mac, or a server running *nix. Both of these tend to just plug and play anything that is actually going to work with them in any way; no messing around there. (When I had to replace a drive in my server earlier this year, it took five minutes; by the time I'd found a way of logging in from the insanely set-up network in the data centre, it was already rebuilding the volume onto the drive quite happily, thank you very much.)

But this spoiling is the way of the future. It's the reason I'm able to blog while waiting for Windows to figure out what I have to do next; this is probably the first time I've installed Windows while having a second machine lying around that wasn't just a server or firewall. And, despite having just bought a brand new Samsung NC-10 (no link because their website is utter shit and I gave up looking), this will likely be the last time I install Windows. Ever. The next evolution of this machine will be either to take the two 1TB SATA drives out and put them in a Mac Pro, or to slap linux on the machine once more and be done with it. Microsoft loses: there's nothing running on that machine I cannot replace with similar software on other platforms. Usually it's better. It's almost never as annoying.

Except for one thing. I'm doing another install, at the same time as this, to get a working Windows system on my Mac again, under VMware Fusion, on the off-chance that I need to test things.

I doubt it'll be all that long before my multimedia machine ceases to run Windows. I'm guessing that Creative Suite 5 will be out, at the latest, in early 2010; at that point I'll probably bite the bullet and both upgrade and get them to transfer the license to Mac. Windows will have been relegated.

The Open Rights Group

Published at
Wednesday 19th November, 2008
Tagged as
  • ORG
  • Internet
  • Rights
  • Freedom

The Open Rights Group (sometimes known by their un-Googlable acronym ORG) has turned three. Happy birthday, and congratulations!

To, erm, me. When the initial word went out in mid 2005, I pledged to become a founder member, and (beyond a short period where PayPal cancelled my subscription) I've been a supporter ever since. Not a particularly active one, admittedly.

Anyway, as you can see from their birthday review, they've achieved lots of really good things - getting the word out on digital rights issues, working with policy makers, and confronting police officers using clipboards and hand gestures. These kinds of challenges aren't going away, so please help out, or give financial support, or even both.

Interface magic

Published at
Sunday 21st September, 2008
Tagged as
  • Interaction
  • Interfaces
  • UI
  • Doom

The PlayStation 3 has a pretty good interface for playing DVDs, all things told: the key controls are immediately available from the controller, mostly using shoulder buttons as jogs, and I'd guess that most people who have got a PS3 and know what a DVD is have little difficulty using it. The problem arises with the magic.

Magic interfaces are one of those things that are a very good idea, but very difficult to get exactly right: you want to take the thinking away from the user, so that stuff just works, but this means that if there's any significant mismatch between what you think people will want to happen and what they actually want, your users will start getting frustrated.

This is the area that Apple lives in, at least for the iPod and iPhone: don't think about how this thing works, just assume it will, and get stuck into it. By and large they do pretty well, and by and large Sony do okay with the PS3 also. The key magic that Sony have come up with (or at least the bit that I've noticed) is around what happens when you don't finish watching a film, but need to take the disc out. This is a pretty common requirement with a games machine, so it's not difficult to see why they've decided to make this simple: put the disc back in, and the PS3 will pick up where you left off. This would be an unambiguously good thing, except for a couple of points. Firstly, and less importantly, if you eject the disc at the end of the movie (or TV episode, or whatever), while the credits are running, then the next time you put the disc in it'll jump back to that point, and you have to navigate back to the menu - which some DVDs make much harder than others, because of their desire to show lots of copyright notices for countries I'm not resident in. (I suspect this doesn't happen in the US, and perhaps also not in Japan either.) If DVD authors would stop trying to persuade us that we're criminals, this would be a non-issue.

But the second issue was much more of a pain when I encountered it. Something I didn't notice for ages. Something which is almost never important, because it's simply not something you're likely to want to do. I started watching an episode of something, and noticed there was a commentary by the writer, which I thought might be interesting... and after about five minutes decided it wasn't. So I ducked out of playback back to the menu, and hit the 'play episode' button instead of the 'play with commentary'. And the PS3 helpfully picked up where I'd stopped, with the commentary track.

It took me perhaps ten minutes of turning the machine off and back on, ejecting the disc, and so forth, until I figured out that I could turn the commentary off by resetting the language options on the disc. (For some reason the audio tracks control was disabled for the disc.)

The question, of course, is: is this remotely important? I've been playing DVDs on the PS3 for about eight months, and I haven't run into this problem before now. Most people, I'd guess, don't listen to commentaries anyway; those that do probably only seldom back out once they've started. And most DVDs probably won't cause this problem, because they won't have the audio tracks disabled. So it isn't actually important at all.

The important point is that magic is by its nature opaque; if it weren't, it wouldn't be magic. And, like diesel engines and anything containing a class 4 laser, you can't take apart magic and figure out what isn't working. Instead, you have to build up a conceptual model of how it works inside, and figure out how to game it - which is doubly difficult because the point where the magic needs fixing is the point where the conceptual model that you already have doesn't match the magic in the first place. All your preconceptions go out of the window, and you have to think. Uh-oh.

There are two solutions to this when designing a system. One is not to care: this is the simple route, and has many advantages (including simpler testing). The other is to provide a way of turning the magic off; a kind of less magic switch. Personally I think that the former is a better choice: decide how your system will work, and trust to your own ability to make good decisions. Of course, you may get feedback that suggests you're better off removing the magic entirely, but options force people to think, which goes against the reason you wanted to introduce the magic in the first place.

Just use the best magic you can possibly manage.

Improvising makes me gulp

Published at
Thursday 11th September, 2008
Tagged as
  • Seedcamp
  • Improvisation
  • Terror
  • Mentoring
  • Teaching

Improvising makes me gulp; I get a visceral reaction before I go on stage. To be honest, this is a reaction I have to lots of similar situations: concerts where I was a soloist while back at school, plays, presenting, teaching. It's part of why I do it, although it doesn't feel like that at the time: adrenaline starts to rush through my body, and I want to throw up. But that passes, quickly, and then I'm on a high. I could probably sit down and rank the different things I do according to how great I feel doing them, and afterwards. Improvisation, and teaching improvisation, would come out at the top.

For the last few Barcamps I've been to I've taught impro for half an hour to whoever turns up; Barcamp Brighton 3, last weekend, was no exception. Best of all, we had a big enough crowd to play one of my favourite games to wrap up. I know it as The King Game, from Keith Johnstone (p237 of the Faber & Faber edition of Impro for Storytellers, if you have it), and it's particularly satisfying because you end up with a huge mound of bodies on the stage, all of whom are still paying attention to the scene.

Basically, people come in as servants, and the King orders them to commit suicide when (as inevitably happens) they get irritated. It's actually very hard to be a good servant, but some people actually try very hard to be bad (and in a quick session, it's generally more satisfying to play like this, admittedly breaking all the rules of good impro). Where it gets interesting is where people come up with strategies to avoid being ordered to die; at the weekend, someone came on as the King's daughter. (I think she actually lasted less time than the average.) The only time I've ever seen someone come on and survive was when I was doing this in Cambridge, preparing for a Whose Line Is It Anyway-style late-night show, and one of the women walked on and seduced the King. I suspect that works quite well as a strategy in real life, as well.

I've actually done much more of teaching improvisation than performing over the last couple of years (something I hope to change); but it does at least provide a (weak) segue to Seedcamp, where I'm a technical mentor this year. Looking over the entire list of mentors is a little daunting, but there are enough people on the list that I know from various things to make me feel I'll fit in. If you're one of this year's 22 finalists, I'll see you on Wednesday 17th, talking about How to Scale. According to Tom Coates, it has something to do with elephants.

Thoughts on Google Chrome

Published at
Tuesday 2nd September, 2008
Tagged as
  • Unwise predictions
  • Google
  • Web browser

First, let's start by saying that having a new web browser on the market to shake things up is never going to be a bad thing; and having something fresh from Ben Goodger is always going to be fun. The other people on the Chrome team sound smart as well, and it's clear that they're trying to solve a problem using a savvy blend of UI nous, applied research, and good software development. All that stuff about testing? Awesome. Oh, and using Scott McCloud for your promo literature is inspired.

But... am I really the only person who disagrees with their fundamental tenet? They claim that the majority of people's use of the web is as applications, not as web pages. I'm sure this is true of people inside Google (if nothing else because it can cause problems when they don't have high uptime), but I'm less than convinced for the general populace. It's certainly not true of me: of the tabs I have open in my browser right now, only seven fall within their broad definition of 'web applications' (actually three are films and one a list of audio clips, both of which I'd actually have watched or listened to by now if they'd instead been presented as something I could shunt onto my iPod; one is the Google Chrome comic itself, which I include as a web application mostly to deride the fact that it requires Javascript to function for absolutely no reason, giving absolutely no benefit to the user), compared to 41 'normal' pages (six items on shopping sites which use little or no Javascript; most of the rest are articles, main pages of sites with lots of articles, blogs or software sites). My remaining two tabs are one site that's down (and I can't remember what it is), and MySpace (which is anybody's guess how to classify). That's around 16% 'web applications', or a mere 6% if people would have done things properly in the first place.

Okay, so - disclaimers. I don't use web mail, which would definitely be an application, and would probably count for a reasonable amount of my usage online if I used it. I do use Facebook, sometimes, but I don't have it open anywhere right now; in fact, I almost never leave Facebook open, except for event information, which in my book is a web page not a web application. However I'm perfectly prepared to admit that I might be unusual. Freakish, even.

Of course, I'll benefit from a faster Javascript engine once they release on the Mac (on Windows I run Firefox with NoScript, so frankly I couldn't care one way or the other); and the process separation of tabs is smart (and, unlike others who've thought of it in the past, they've actually gone to the effort of doing it). But what I really want is genuine, resource-focussed, web-browsing features. Like, I don't know, proper video and audio support in the browser.

What's that you say, Lassie? Little Timmy's done what?

However it's a huge deal to bring a new browser to market (or even to beta), so congratulations to the Google Chrome team (although... Chrome... really?). As they say, it's open source, so everyone can learn from each other. (Although of course strictly speaking this is Chris DiBona wafflecrap, but that's for another post entirely.) But I'm not convinced this is on the critical path between us and jetpacks. Not even little ones.

Complexity increases

Published at
Tuesday 24th June, 2008

Or more accurately: complexity never decreases, both in the strict sense that if something’s O(n), it will always be O(n), and if you find a faster way of doing it, the complexity hasn’t changed, you’ve just been really smart (or you were particularly dumb to start off with).

But I also mean this in another way, with a more sloppy use of the word ‘complexity’: that just because you’ve done something before doesn’t necessarily mean it will be easier this time round. The complexity of the problem is the same, after all; the only real advantage you’ve got from having done it before is an existence proof. I think this is an important point that a lot of people miss. Then they act all surprised when building a new widget takes longer than they expected.

Complexity never decreases. You can represent this mathematically as:

which is what I’ve got on my t-shirt today at the second half of the O’Reilly Velocity Conference, which is all about performance and operations for internet-scale websites. And there’s some good stuff here - besides interesting conversations with vendors, and a couple of product launches, the very fact that this is happening, rather than being confined to what I’m sure a lot of my fellow delegates consider more “academic” arenas such as Usenix, feels like a positive step forward.

Of course, some of this isn’t new. One of the conference chairs is Steve Souders, who has been banging the drum about front-end web performance techniques for a while, so it’s not surprising that we’re seeing a lot of that kind of approach being talked about. Since I follow this stuff pretty closely anyway, even over the last six months when I’ve been only flakily connected to the industry, I know much of this already; however it doesn’t mean it’s a bad thing: there will be people here who haven’t had it explained sufficiently for them yet, so they’ll go away with important new tools for improving the sites they work on.

Some of the things people are saying are older yet; and some are a bad sign. At least four speakers yesterday laid into the advertising technology industry, including Souders. However I note that they don’t appear to have actually sat down with the tech vendors, and they haven’t got any representatives speaking here. No matter what people think, performance problems related to putting ads on your pages aren’t always the fault of the tech vendors, and even when they are they’re open and often eager to talk to people about improving things. There’s a panel this afternoon on how to avoid performance problems with adverts, which I’m sure will have some interesting and useful techniques, but I’m equally sure some of them will date very rapidly, and very few if any will have been submitted to the tech vendors to make sure that they don’t have unintended side-effects. People are thinking about this, which is good; but they also have to talk about it, and not just at smallish conferences in San Francisco, but at things like Ad:Tech in New York. Hopefully this is the start of the right approach, though: advertising isn’t going away.

I’ll probably have more thoughts by the end of the day, but for now sessions are starting up again, so I’m going to go learn.
