web developer & system programmer

coder . cl

ramblings and thoughts on programming...


instant xml api using pyxser

published: 05-11-2011 / updated: 05-11-2011
posted in: development, programming, projects, python, pyxser, tips
by Daniel Molina Wegener

You may not be familiar with pyxser yet. It is a serializer and deserializer that converts Python objects into XML as plain text. Alongside JSON and other formats, XML helps with tasks such as transmitting objects over the network, for example when building API calls that run remote queries. Here I will guide you through building an XML query API for your Django-driven application in a few minutes. You just need to understand how pyxser works and how to use the pyxser module. Remember that you can read the documentation once it is installed, even without Internet access, by running the pydoc daemon with pydoc -p 8080 and connecting to port 8080 on your machine. You can choose another port if that one is not available.


tl;dr

You can set up a query API that serves XML over HTTP under Django using pyxser.


advice

All examples here work, but you must be really careful with authentication and object permissions before using the routines in this post. Wrap those routines with the Django authentication components to filter query requests; OAuth-related modules may also help. The examples do not follow Python and Django best practices, so you need to adjust them to meet those requirements. Finally, do not take the examples too literally: they are just examples, and this is just a proof-of-concept article.
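
One simple way to filter query requests is to check the requested model name against an allow-list before anything else happens. The following is only a sketch under my own assumptions: restrict_models, ModelNotAllowed and demo_view are names made up for this illustration, and a real Django view would answer with an HTTP error response instead of raising.

```python
from functools import wraps


class ModelNotAllowed(Exception):
    """Raised when a request names a model that was not exposed."""


def restrict_models(allowed):
    """Decorator factory: reject calls whose 'model' is not in 'allowed'."""
    allowed = frozenset(allowed)

    def outer(view):
        @wraps(view)
        def inner(request, model=None, *args, **kwargs):
            if model not in allowed:
                raise ModelNotAllowed(model)
            return view(request, model, *args, **kwargs)
        return inner
    return outer


@restrict_models(["prod.models.Marca"])
def demo_view(request, model=None, oid=None):
    # Stand-in for a real view; it only echoes its arguments.
    return (model, oid)


print(demo_view(None, model="prod.models.Marca", oid="1"))
```

An allow-list like this does not replace authentication. It only limits which models can be named in the URL at all.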


serializing model objects

The pyxser extension, which is written in C and uses libxml2 as its basis for XML processing, has two main arguments for its serialization routines: obj and enc, where obj is the object to be serialized and enc is the XML encoding to be used. So you can serialize a valid object using pyxser.serialize(obj = my_object, enc = 'utf-8'). You can read the full pyxser documentation using the pydoc command and looking up the pyxser module.

To serialize Django models you need to restrict some fields. You do not need to process each model field yourself; you only need to filter the model fields properly using the selector argument and limit the recursion using the depth argument. Take a look at the following decorator.


from functools import wraps

from django.http import HttpResponse
import pyxser as px

def render_to_xml(**pyxser_args):
    # Extra keyword arguments (selector, depth, ...) are passed to pyxser.
    def outer(f):
        @wraps(f)
        def inner_xml(request, *args, **kwargs):

            result = f(request, *args, **kwargs)
            r = HttpResponse(content_type='text/xml')
            try:
                render = px.serialize(obj=result,
                                      enc='utf-8',
                                      **pyxser_args)
            except Exception:
                # Fall back to an empty object if serialization fails.
                render = "<pyxs:obj/>"
            if result:
                r.write(render)
            else:
                r.write("<pyxs:obj/>")
            return r
        return inner_xml
    return outer
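
If you want to try the shape of this decorator without Django or pyxser installed, here is a minimal sketch. It assumes a made-up fake_serialize function standing in for pyxser.serialize; render_with, EMPTY, good_view and failing_view are also names invented for this illustration, and the output is not pyxser's format.

```python
from functools import wraps

EMPTY = "<empty/>"


def fake_serialize(obj, enc="utf-8", **options):
    # Hypothetical stand-in for a real serializer.
    if obj is None:
        raise ValueError("nothing to serialize")
    return '<obj enc="%s" depth="%s">%s</obj>' % (
        enc, options.get("depth"), obj)


def render_with(serializer, **serializer_args):
    """Decorator factory: serialize whatever the wrapped function returns."""
    def outer(f):
        @wraps(f)
        def inner(*args, **kwargs):
            result = f(*args, **kwargs)
            try:
                # Keyword arguments given to the factory are forwarded here.
                return serializer(obj=result, enc="utf-8", **serializer_args)
            except Exception:
                # Same fallback idea as above: an empty document on failure.
                return EMPTY
        return inner
    return outer


@render_with(fake_serialize, depth=2)
def good_view():
    return "Sony"


@render_with(fake_serialize, depth=2)
def failing_view():
    return None


print(good_view())     # <obj enc="utf-8" depth="2">Sony</obj>
print(failing_view())  # <empty/>
```

The point of the three nested functions is that the outermost one captures the serializer options once, so every decorated view can pick its own selector and depth.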

If you apply the render_to_xml decorator to a Django view, it returns the serialized object as text/xml to the HTTP client, so your view must return a valid object that pyxser can serialize. The decorator applies the pyxser.serialize function to whatever your view returns. Now take a look at a view which uses this decorator to serve XML.

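import logging

from django.views.decorators.http import require_http_methods

log = logging.getLogger(__name__)

def get_class(kls):
    # Resolve a dotted path such as 'prod.models.Marca' to the class itself.
    try:
        parts = kls.split('.')
        module = ".".join(parts[:-1])
        # __import__ returns the top-level package, so walk down from it.
        m = __import__(module)
        for comp in parts[1:]:
            m = getattr(m, comp)
        return m
    except Exception:
        return False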

## use an URL as follows:
## (r'x/g/(?P<model>[\w.]+)/(?P<oid>\d+)/',
##  u'views_xml.get_model_object'),
@require_http_methods(["GET", "OPTIONS", "HEAD"])
@render_to_xml(selector=do_value_attrs, depth=2)
def get_model_object(request, model=None, oid=None):
    # Placeholder returned when the lookup fails.
    obj = object()
    try:
        db_model = get_class(model)
        obj = db_model.objects.get(pk=oid)
    except Exception as exc:
        # Log the failure and fall through with the placeholder object.
        log.error(exc)
    return obj
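
The dotted-name lookup can also be written with the standard importlib module. This is a sketch and not the routine used in this post: load_class is a hypothetical name, and collections.OrderedDict is used only because it exists on every Python installation.

```python
from importlib import import_module


def load_class(dotted_name):
    """Return the object named by 'package.module.Name', or None."""
    module_name, _, attr = dotted_name.rpartition(".")
    if not module_name:
        # No dot at all, so there is no module part to import.
        return None
    try:
        return getattr(import_module(module_name), attr)
    except (ImportError, AttributeError, TypeError, ValueError):
        # Unknown module, unknown attribute, or a malformed name.
        return None


print(load_class("collections.OrderedDict"))
print(load_class("no_such_pkg_xyz.models.Marca"))  # None
```

import_module returns the innermost module directly, so there is no need to walk the attributes from the top-level package. Keep in mind that any lookup driven by a URL lets the client choose what gets imported, so it should be combined with the permission checks mentioned in the advice section.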

The get_model_object view returns the object of the model named by model with the primary key oid. It passes the do_value_attrs selector function to pyxser as the attribute selector and restricts the serialization depth to two levels. Remember that pyxser can serialize circular references and cross references between objects, so we need to restrict the serialization depth; for Django models, two levels work for almost all models. The field selector do_value_attrs can be defined as follows.


DENIED_FIELDS = ['user', 'customer', 'users', 'customers']
DENIED_CLASSES = ['Related', 'Foreign', 'Many']

def is_allowed_class(fld):
    for nm in DENIED_CLASSES:
        if nm in fld.__class__.__name__:
            return False
    for nm in DENIED_FIELDS:
        if nm in fld.name:
            return False
    return True

def do_value_attrs(o):
    values = dict()
    if hasattr(o, '_meta') and hasattr(o._meta, 'fields'):
        for fldn in o._meta.fields:
            if is_allowed_class(fldn):
                values[fldn.name] = getattr(o, fldn.name)
    else:
        for kw in o.__dict__:
            values[kw] = getattr(o, kw)
    return values

Here we filter out the model fields that we do not want to serialize (DENIED_FIELDS) and the field classes that pyxser should not serialize for plain object transmission (DENIED_CLASSES). Objects that are not model objects are serialized as plain Python objects, using their dictionaries to get the object attributes. The resulting XML for URLs like /p/x/g/prod.models.Marca/1/ is as follows.

<?xml version="1.0" encoding="utf-8"?>
<pyxs:obj xmlns:pyxs="http://projects.coder.cl/pyxser/model/"
          version="1.0"
          type="Marca"
          module="prod.models"
          objid="id3128007116">
  <pyxs:prop type="unicode" name="nombre" size="4">Sony</pyxs:prop>
  <pyxs:prop type="long" name="id">1</pyxs:prop>
  <pyxs:prop type="unicode" name="slug" size="4">sony</pyxs:prop>
</pyxs:obj>
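
Only the fields that pass the selector show up in the output. To see which fields the deny-lists drop, here is a small sketch that runs the same substring tests on stand-in field objects. Field, CharField and ForeignKey below are plain classes I made up to mimic the class names, not the Django ones, and the categoria and username fields are hypothetical.

```python
DENIED_FIELDS = ['user', 'customer', 'users', 'customers']
DENIED_CLASSES = ['Related', 'Foreign', 'Many']


class Field(object):
    # Stand-in for a model field: only the name attribute matters here.
    def __init__(self, name):
        self.name = name


class CharField(Field):
    pass


class ForeignKey(Field):
    pass


def is_allowed(fld):
    # Same substring tests as is_allowed_class in this post.
    cls_name = fld.__class__.__name__
    return (not any(nm in cls_name for nm in DENIED_CLASSES)
            and not any(nm in fld.name for nm in DENIED_FIELDS))


fields = [CharField('nombre'), CharField('slug'),
          ForeignKey('categoria'), CharField('username')]
kept = [f.name for f in fields if is_allowed(f)]
print(kept)  # ['nombre', 'slug']
```

Note that the tests are substring matches, so a field called username is dropped as well because it contains user.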

The pyxser serialization model holds type information, so any serialized object carries the type information to be used in deserialization tasks. You can then take the XML to any machine supporting pyxser and get the object deserialized to its original class using the unserialize function.
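
To illustrate what that type information gives you, the following sketch rebuilds an object from the XML shown above using only the standard library. It is not how pyxser.unserialize works internally. REGISTRY, CASTS, rebuild and the plain Marca class are assumptions made for this example, and CASTS only covers the type names that appear in the sample.

```python
import xml.etree.ElementTree as ET

PYXS = "{http://projects.coder.cl/pyxser/model/}"
CASTS = {"unicode": str, "long": int, "int": int}

SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<pyxs:obj xmlns:pyxs="http://projects.coder.cl/pyxser/model/"
          version="1.0" type="Marca" module="prod.models"
          objid="id3128007116">
  <pyxs:prop type="unicode" name="nombre" size="4">Sony</pyxs:prop>
  <pyxs:prop type="long" name="id">1</pyxs:prop>
  <pyxs:prop type="unicode" name="slug" size="4">sony</pyxs:prop>
</pyxs:obj>"""


class Marca(object):
    # Hypothetical stand-in for the model class of the same name.
    def __init__(self, **attrs):
        self.__dict__.update(attrs)


# Maps the (module, type) pair found in the XML to a local class.
REGISTRY = {("prod.models", "Marca"): Marca}


def rebuild(xml_text):
    root = ET.fromstring(xml_text.encode("utf-8"))
    cls = REGISTRY[(root.get("module"), root.get("type"))]
    attrs = {}
    for prop in root.findall(PYXS + "prop"):
        # Convert the text back using the declared property type.
        cast = CASTS.get(prop.get("type"), str)
        attrs[prop.get("name")] = cast(prop.text or "")
    return cls(**attrs)


marca = rebuild(SAMPLE)
print(type(marca).__name__, marca.nombre, marca.id)
```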


defining object containers

The pyxser extension cannot handle Django model containers directly, that is, those returned by the all method of query sets. So you need to create a plain container to hold the objects retrieved from the database. Take a look at the following view.

class Container(object):
    count = 0
    items = []
    def __init__(self):
        pass

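def collect_filters(qd):
    # Turn 'filter__<lookup>=value' query parameters into filter() kwargs.
    prefix = 'filter__'
    filters = dict()
    for kw in qd:
        if kw.startswith(prefix):
            # Strip only the leading prefix, not later occurrences.
            filters[kw[len(prefix):]] = qd[kw]
    return filters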

### use an URL as follows:
### (r'x/l/(?P<model>[\w.]+)/(?P<limit>\d+)/',
###  u'views_xml.get_model_list'),
@require_http_methods(["GET", "OPTIONS", "HEAD"])
@render_to_xml(selector=do_value_attrs, depth=4)
def get_model_list(request, model=None, limit=1):
    container = Container()
    container.count = 0
    container.items = []
    try:
        db_model = get_class(model)
        filters = collect_filters(request.GET)
        # The URL captures limit as a string, so convert it before slicing.
        objs = db_model.objects.filter(**filters)[0:int(limit)]
        # Evaluate the query set once and keep the results in a plain list.
        container.items = list(objs)
        container.count = len(container.items)
    except Exception as exc:
        log.error(exc)
    return container
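
The query-string handling can be tried on a plain dictionary, without Django. This is a sketch with a hypothetical helper name, collect_prefixed, and a plain dict standing in for request.GET.

```python
PREFIX = 'filter__'


def collect_prefixed(params, prefix=PREFIX):
    """Keep only the prefixed keys, with the leading prefix stripped."""
    return dict((key[len(prefix):], value)
                for key, value in params.items()
                if key.startswith(prefix))


query = {'filter__nombre__contains': 'son', 'page': '2'}
print(collect_prefixed(query))  # {'nombre__contains': 'son'}
```

The resulting dictionary is what ends up as keyword arguments of the filter method, which means the client decides which lookups are run. That is one more reason to add the permission checks mentioned in the advice section.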

If you look carefully at the get_model_list view, you will notice that it uses a very simple Container class to hold the objects. The resulting XML for the URL /p/x/l/prod.models.Marca/5/?filter__nombre__contains=son is as follows.

<?xml version="1.0" encoding="utf-8"?>
<pyxs:obj xmlns:pyxs="http://projects.coder.cl/pyxser/model/"
          version="1.0"
          type="Container"
          module="prod.views_xml"
          objid="id3107379372">
  <pyxs:prop type="int" name="count">3</pyxs:prop>
  <pyxs:col type="list" name="items">
    <pyxs:obj type="Marca" module="prod.models" objid="id3107380204">
      <pyxs:prop type="unicode" name="nombre" size="5">Epson</pyxs:prop>
      <pyxs:prop type="long" name="id">10</pyxs:prop>
      <pyxs:prop type="unicode" name="slug" size="5">epson</pyxs:prop>
    </pyxs:obj>
    <pyxs:obj type="Marca" module="prod.models" objid="id3107380300">
      <pyxs:prop type="unicode" name="nombre" size="8">Ericsson</pyxs:prop>
      <pyxs:prop type="long" name="id">15</pyxs:prop>
      <pyxs:prop type="unicode" name="slug" size="8">ericsson</pyxs:prop>
    </pyxs:obj>
  </pyxs:col>
</pyxs:obj>

The resulting XML reports three Marca objects in its count property, and every serialized object carries its type so that it can be deserialized once it is retrieved. If you want your objects deserialized back, you just need to use the pyxser.unserialize function, which is documented in the pyxser extension itself. I hope that you will like how pyxser works.
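
A client without pyxser can still read this format with any XML parser. Here is a minimal sketch using the standard library that walks a shortened copy of the container document and turns it into dictionaries and lists. to_plain, CASTS and the __type__ key are my own assumptions for this example, and only the element and type names seen in this post are handled.

```python
import xml.etree.ElementTree as ET

PYXS = "{http://projects.coder.cl/pyxser/model/}"
CASTS = {"unicode": str, "long": int, "int": int}

SAMPLE = """<pyxs:obj xmlns:pyxs="http://projects.coder.cl/pyxser/model/"
          version="1.0" type="Container" module="prod.views_xml"
          objid="id3107379372">
  <pyxs:prop type="int" name="count">3</pyxs:prop>
  <pyxs:col type="list" name="items">
    <pyxs:obj type="Marca" module="prod.models" objid="id3107380204">
      <pyxs:prop type="unicode" name="nombre" size="5">Epson</pyxs:prop>
      <pyxs:prop type="long" name="id">10</pyxs:prop>
    </pyxs:obj>
  </pyxs:col>
</pyxs:obj>"""


def to_plain(node):
    """Convert a pyxs:obj element into a dict, recursing into lists."""
    data = {"__type__": node.get("type")}
    for child in node:
        name = child.get("name")
        if child.tag == PYXS + "prop":
            cast = CASTS.get(child.get("type"), str)
            data[name] = cast(child.text or "")
        elif child.tag == PYXS + "col":
            # A collection holds nested objects, so recurse into each one.
            data[name] = [to_plain(item) for item in child
                          if item.tag == PYXS + "obj"]
    return data


result = to_plain(ET.fromstring(SAMPLE))
print(result["count"], [item["nombre"] for item in result["items"]])
```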


