web developer & system programmer

coder . cl

ramblings and rants on software development...



pyxser, serializing collections

by Daniel Molina Wegener on 2009.01.18
posted in: projects, python, pyxser

Well, the last post announced pyxser as a finished Python-object-to-XML serializer, but my mistake was to forget the Python collections. Today I finished the collection serialization: it now supports list, tuple and dictionary serialization on the same algorithmic basis, an O(n) algorithm that serializes the complete object tree in one pass. Now the hard part of the job, the deserialization algorithm, is coming… ;)
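
As a rough illustration of the one-pass idea (not pyxser's actual code, which is a C extension on top of libxml2), the following pure-Python sketch walks an object tree once, handles lists, tuples and dictionaries with the same recursion, and emits a reference instead of recursing when it meets an object it has already visited. The names serialize_sketch and _emit, the use of xml.etree.ElementTree, and the Python 3 syntax are all assumptions made for this example, and the output only loosely imitates pyxser's XML layout.

```python
import xml.etree.ElementTree as ET

SCALARS = (str, int, float, bool)
CONTAINERS = (list, tuple, dict)


def _emit(parent, tag, value, seen, extra=None):
    """Append one child element describing value under parent (hypothetical helper)."""
    attrs = dict(extra or {})
    if value is None:
        return  # unset attributes are skipped in this sketch
    if isinstance(value, SCALARS):
        attrs["type"] = type(value).__name__
        ET.SubElement(parent, tag, attrs).text = str(value)
    elif isinstance(value, CONTAINERS):
        # Lists, tuples and dicts share the same recursion; only the
        # iteration differs. Containers are not tracked in `seen`, so a
        # list that contains itself would recurse forever here.
        attrs["type"] = type(value).__name__
        node = ET.SubElement(parent, tag, attrs)
        if isinstance(value, dict):
            for key, item in value.items():
                _emit(node, "item", item, seen, {"key": str(key)})
        else:
            for item in value:
                _emit(node, "item", item, seen)
    else:
        if tag != "item":
            attrs["name"] = tag
        if id(value) in seen:
            # Already expanded once: emit a back reference and stop.
            attrs["reference"] = "#%d" % id(value)
            ET.SubElement(parent, type(value).__name__, attrs)
        else:
            seen.add(id(value))
            attrs["id"] = str(id(value))
            node = ET.SubElement(parent, type(value).__name__, attrs)
            for name, attr in vars(value).items():
                _emit(node, name, attr, seen)


def serialize_sketch(obj):
    """Serialize an object tree to an XML string in a single pass (sketch only)."""
    holder = ET.Element("holder")
    _emit(holder, "item", obj, set())
    return ET.tostring(holder[0], encoding="unicode")
```

Because every object is added to the seen set before its attributes are visited, each object is expanded once and each later occurrence costs a single element, which is what keeps this walk linear in the size of the tree.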

Using the same base objects as before, the example relies on the following classes in the testpkg.sample module:


__all__ = [
    'SubNestedChild',
    'NestedChild',
    'ChildObject',
    'ParentObject']

class SubNestedChild:
    subnested1 = None
    def __init__(self, m1):
        self.subnested1 = m1

class NestedChild:
    nested1 = None
    nested2 = None
    nested3 = None
    nested4 = None
    def __init__(self, m1, m2, m3):
        self.nested1 = m1
        self.nested2 = m2
        self.nested3 = m3

class ChildObject:
    child1 = None
    child2 = None
    child3 = None
    child4 = None
    def __init__(self, m1, m2, m3):
        self.child1 = m1
        self.child2 = m2
        self.child3 = m3

class ParentObject:
    parent1 = None
    parent2 = None
    parent3 = None
    def __init__(self, m1, m2, m3):
        # m3 is accepted but not stored; parent3 is set later by child()
        self.parent1 = m1
        self.parent2 = m2
    def child(self, m1, m2, m3):
        self.parent3 = ChildObject(m1, m2, m3)
    def nested(self, m1, m2, m3):
        self.parent3.child4 = NestedChild(m1, m2, m3)
    def subnested(self, m1):
        self.parent3.child4.nested4 = SubNestedChild(m1)

Then, with a small program, we can build an object that is complex to serialize, with nested objects and cross-referenced objects:

#!/usr/bin/env python

import pyxser
import testpkg.sample

class TestAnotherObject:
    first_element = '123'
    second_element = '456'

if __name__ == '__main__':
    another = TestAnotherObject()
    another.first_element = '123'
    another.second_element = '456'
    test = testpkg.sample.ParentObject('hola', 'chao', 2354345L)
    test.child('hi', 4.5, 2354)
    test.nested('holahola', 345, 'hola')
    test.subnested(test)
    test.dyn_prop1 = ['hola', 'chao', test, another]
    test.dyn_prop3 = ('hola', test, 'sun', 'trip')
    test.dyn_prop2 = {"saludo1": "hola", "saludo2": "chao", "saludo3": test, "saludo4": "goodbye"}
    print pyxser.serialize_xml(test)

And finally, we get the XML serialization as follows:


<?xml version="1.0"?>
<ParentObject module="testpkg.sample" id="136619884">
  <dyn_prop1 type="list">
    <item type="str">hola</item>
    <item type="str">chao</item>
    <ParentObject reference="#136619884" name="ParentObject"/>
    <TestAnotherObject module="__main__" id="136620684">
      <first_element type="str">123</first_element>
      <second_element type="str">456</second_element>
    </TestAnotherObject>
  </dyn_prop1>
  <dyn_prop2 type="dict">
    <item type="str" key="saludo1">hola</item>
    <ParentObject reference="#136619884" name="ParentObject" key="saludo3"/>
    <item type="str" key="saludo2">chao</item>
    <item type="str" key="saludo4">goodbye</item>
  </dyn_prop2>
  <dyn_prop3 type="tuple">
    <item type="str">hola</item>
    <ParentObject reference="#136619884" name="ParentObject"/>
    <item type="str">sun</item>
    <item type="str">trip</item>
  </dyn_prop3>
  <ChildObject module="testpkg.sample" id="136620716">
    <child1 type="str">hi</child1>
    <child2 type="float">4.5</child2>
    <child3 type="int">2354</child3>
    <NestedChild module="testpkg.sample" id="136620748">
      <SubNestedChild module="testpkg.sample" id="136620812">
        <ParentObject module="testpkg.sample" reference="#136619884" name="subnested1"/>
      </SubNestedChild>
      <nested1 type="str">holahola</nested1>
      <nested2 type="int">345</nested2>
      <nested3 type="str">hola</nested3>
    </NestedChild>
  </ChildObject>
  <parent2 type="str">chao</parent2>
  <parent1 type="str">hola</parent1>
</ParentObject>
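
To see how the cross references in this output can be followed, here is a small sketch that pairs every element carrying a reference attribute with the element whose id it points to. It is not pyxser's deserializer (which is still to be written); resolve_references is a hypothetical helper, the sample is a trimmed copy of the output above, and it uses Python 3 with the standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the serialized output shown above.
SAMPLE = """<?xml version="1.0"?>
<ParentObject module="testpkg.sample" id="136619884">
  <dyn_prop3 type="tuple">
    <item type="str">hola</item>
    <ParentObject reference="#136619884" name="ParentObject"/>
  </dyn_prop3>
  <parent1 type="str">hola</parent1>
</ParentObject>"""


def resolve_references(xml_text):
    """Return the root and a list of (referring element, target element) pairs."""
    root = ET.fromstring(xml_text)
    # Index every element that declares an id, in one pass over the tree.
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    resolved = []
    for el in root.iter():
        ref = el.get("reference")
        if ref is not None:
            # References are written as "#<id>"; strip the marker to look up.
            resolved.append((el, by_id.get(ref.lstrip("#"))))
    return root, resolved


root, resolved = resolve_references(SAMPLE)
for ref_el, target in resolved:
    print(ref_el.get("reference"), "->", target.tag)
```

In the sample, the single reference inside dyn_prop3 resolves back to the root ParentObject element, which is the cycle created by the test program.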






6 comments to “pyxser, serializing collections”

  1. If you are looking for object persistence, maybe you can take a look at Python's pickle, or you can build a transformation from the pickle data exchange protocol version 0 (ASCII) to an XML document.

  2. I am just looking for XML serialization/deserialization to reach a certain level of interoperability between Python and other systems.

    http://docs.python.org/library/pickle.html

    Pickle seems to be a Python-only serializer. My extension uses libxml2 and it’s written in C against the Python C API, so it’s faster than other methods.

    By using pickle with protocol 0 and a second conversion, you get an O(n^2) algorithm to produce a serialized object in XML, rather than the O(n) algorithm that I have implemented, and an O(2n^2) algorithm if you reverse that kind of serialization, rather than the O(2n) (aka O(n)) algorithm for deserializing objects in my planned implementation.

    And no… I’m not looking for persistence, I’m looking for interoperability between certain systems.

  3. It is true, 2 serialization functions imply (at least) an O(n^2) algorithm. Anyway, you can transport serialized objects as payload over HTTP, RPC or a plain socket file descriptor (that is the “interoperability” support needed), but yes, it is a Python-oriented solution.

    Anyway, you have to implement a non-standard deserialize function. The advantage of XML is the large number of languages that support it, and DOM manipulation, XPath and extensions make your work easy.

  4. I missed something. Why does applying 2 serialization functions imply at least O(n^2)?

  5. Sorry, the first one is a serialization function (object tree to ASCII text), the second one is a deserialization (parsing the ASCII text) and the third one is an XML conversion of the parsed text, represented by another object tree. So you get 2 serialization functions, O(2n), and one deserialization function that implies a custom text parsing algorithm, which usually has at least O(n^2) complexity. Then we get O(2n + n^2), which is nothing other than O(n^2).

  6. [...] my last posts: "pyxser, a work in progress", "pyxser, serializing collections" and "pyxser, a work in progress". I’ve signing that I have finished my [...]
