Today I was working on pyxser: I've reduced its memory usage and improved its performance. I've rewritten some functions to use fewer instructions and gained a small speedup. It seems, however, that pyxser is still allocating a large number of dictionaries, or leaving them in memory until the test finishes. I've also modified the profiling script to run each function 1000 times. The enhancements look promising…
memory profiling
In the profiling script, the only remaining references to constructed objects are those held by the script itself. An interesting note from the memory profiling task: the most allocated object type is str, with 278 allocations (47% of objects) at 10384 bytes, while the heaviest type is dict, with 79 allocations (13%) at 457720 bytes. I've also tested pyxser under valgrind without any true report of memory leaks.
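The kind of per-type breakdown above can be approximated with the standard library alone (the actual profiling script may use a dedicated heap profiler; this is just a sketch). Note that the cyclic garbage collector only tracks container objects, so atomic types such as str will not appear here:

```python
import gc
import sys
from collections import Counter

def live_objects_by_type():
    """Group every object tracked by the garbage collector by type,
    returning per-type allocation counts and total sizes in bytes."""
    counts = Counter()
    sizes = Counter()
    for obj in gc.get_objects():
        name = type(obj).__name__
        counts[name] += 1
        sizes[name] += sys.getsizeof(obj)
    return counts, sizes

counts, sizes = live_objects_by_type()
# The most numerous and heaviest types hint at what a serializer leaves
# behind (in this post's measurements: many str objects, heavy dicts).
for name, n in counts.most_common(5):
    print(f"{name}: {n} objects, {sizes[name]} bytes")
```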
Also, in the pyxser distribution you can find the test-utf8-leak.py script, which executes the serialization and deserialization functions up to 1000000 times while the python executable uses only a small portion of the operating system's memory; you will notice that pyxser's memory usage stays flat throughout that test.
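A leak test of this shape can be sketched as a tight round-trip loop followed by a check that the number of tracked objects does not grow. Here pickle stands in for the pyxser serialize/unserialize pair so the sketch runs anywhere; the real script drives the pyxser API instead:

```python
import gc
import pickle

class Sample:
    """A small object with UTF-8 content, in the spirit of the UTF-8 leak test."""
    def __init__(self):
        self.name = "nombre con acentuación"
        self.values = [1, 2, 3]

def roundtrip(obj, iterations):
    # Serialize and deserialize repeatedly, discarding every result.
    for _ in range(iterations):
        data = pickle.dumps(obj)
        pickle.loads(data)

obj = Sample()
roundtrip(obj, 1000)           # warm-up, lets caches settle
gc.collect()
before = len(gc.get_objects())
roundtrip(obj, 10000)          # the distributed script goes up to 1000000
gc.collect()
after = len(gc.get_objects())
# A leak would show `after` growing with the iteration count.
print(before, after)
```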
speed tests
Speed tests are going fine. I've shaved a small amount off the time used by the serializer and created a new test that executes each pyxser function 1000 times. The results can be seen as follows:
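A timing script of that shape is easy to build on timeit; run each function under test 1000 times and report the total. The function names below are stand-ins using pickle, since the real script times the pyxser API itself:

```python
import timeit
import pickle

obj = {"name": "pyxser", "values": list(range(10))}
data = pickle.dumps(obj)

# Map a label to the zero-argument callable being measured.
cases = {
    "serialize":   lambda: pickle.dumps(obj),
    "deserialize": lambda: pickle.loads(data),
}

results = {}
for name, fn in cases.items():
    # Total wall time for 1000 calls of each function.
    results[name] = timeit.timeit(fn, number=1000)
    print(f"{name}: {results[name]:.6f}s for 1000 calls")
```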
Deserialization functions are the slowest ones, requiring almost double the time that serialization takes; but if you think about it for a moment, you will notice that XML parsing is slower than traversing a Python object tree. Some functions, like getdtd() and getdtdc14n(), take practically no time to execute, since both DTDs are pre-allocated at module load, and the same applies to the pyxser XML schemas. I think I will keep reducing execution and load time over time, for example through lazy resource initialization, since pyxser has reached a good maturity level, with no memory leaks and well-structured tests for different kinds of objects.
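The trade-off between the two strategies can be sketched in a few lines. _load_dtd below is a hypothetical stand-in for the expensive resource build; the real work happens in pyxser's C module:

```python
def _load_dtd():
    # Hypothetical stand-in for an expensive resource build.
    return "<!-- expensive-to-build DTD text -->"

# Eager: built once at module import, so lookups are nearly free,
# but import pays the cost even if the DTD is never used.
_DTD_EAGER = _load_dtd()

def getdtd_eager():
    return _DTD_EAGER

# Lazy: built on first call, so module load stays cheap and the cost
# is only paid by callers who actually need the resource.
_dtd_cache = None

def getdtd_lazy():
    global _dtd_cache
    if _dtd_cache is None:
        _dtd_cache = _load_dtd()
    return _dtd_cache
```

Moving pre-allocated resources behind the lazy variant is what would shave module load time without changing the cost of the serialization paths themselves.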