Re: PIM databases (SQLite vs. XML arguments)

From: Owen Cliffe <occ.a.t.cs.bath.ac.uk>
Date: Sun Feb 10 2002 - 18:49:43 EST

> * My major complaint against SQLlite is that it feels like overkill for
> such a simple application, especially if we have libxml being installed
> as a dependency for libglade, which a quite a few applications use.

I think Alan proposed replacing libglade with something lighter. Most of
libglade's code seems to be the glade XML parser, so something that just
translated the libglade XML to a binary rep which was trivially parsable,
and then hacked the XML handler code out of libglade, would be cool: you
could essentially have libglade without ever having to parse the XML,
while still retaining all the functionality/compatibility.

> * XML is human readable, and doesn't require a front-end to tweak it or
> play with it. Just a simple text editor.

SQL is hardly opaque. OK, the underlying storage is, but there is
always going to be an intermediate text SQL rep which is (vaguely)
standardised and also very readable.

> * I suspect that an XML backend could be made nearly as fast as an sql
> backend, although an indexing layer might need to be written.
I think I'd contest that statement: your storage in XML is always going
to be abstracted far away from your code, so you would spend a lot of time
going from code->xml and vice versa, whereas with an SQL db you can bind
your data tightly to your code, with very low overhead.
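
A minimal sketch of what that tight binding can look like, assuming
Python's sqlite3 module (which postdates this thread) and a hypothetical
contacts table. The Contact type and load_contacts() are made-up names for
illustration only:

```python
import sqlite3
from typing import NamedTuple


class Contact(NamedTuple):
    # Hypothetical record type; field order matches the SELECT below.
    id: int
    name: str
    phone: str


def load_contacts(conn: sqlite3.Connection) -> list[Contact]:
    """Map each result row straight onto a Contact, with no parsing layer."""
    rows = conn.execute("SELECT id, name, phone FROM contacts ORDER BY id")
    return [Contact(*row) for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)"
)
conn.executemany(
    "INSERT INTO contacts (name, phone) VALUES (?, ?)",
    [("Ann", "555-0100"), ("Bob", "555-0101")],
)
contacts = load_contacts(conn)
```

Each row arrives already typed and split into fields, so the only glue
code is the one-line mapping in load_contacts().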

> (SAX-based
> interface scans file, and generates an index on "searchable" attributes"
> and stores <keyword,offset> in a index file) This would only apply when
> rapid lookups are needed, and I'm not sure how often this is the case,
> or if a sequential scan would suffice.
There are quite a few things that deal with XML indexing. I think there
was some tool that the gnome-doc people were playing with (I think it was
in scrollkeeper) which indexes XML and references it with XLink.
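
For reference, the SAX-based indexing pass described in the quoted text
could be sketched roughly as follows. This assumes Python's xml.sax
module; IndexBuilder and build_index() are hypothetical names, and because
the SAX locator reports line numbers rather than byte offsets, the sketch
stores <keyword,line> pairs instead of <keyword,offset>:

```python
import io
import xml.sax


class IndexBuilder(xml.sax.ContentHandler):
    """Record the line on which each indexed attribute value appears."""

    def __init__(self, element, attribute):
        super().__init__()
        self.element = element      # element name to index, e.g. "contact"
        self.attribute = attribute  # "searchable" attribute, e.g. "name"
        self.index = {}             # keyword -> line number of the start tag
        self.locator = None

    def setDocumentLocator(self, locator):
        # The parser hands us a locator before parsing begins.
        self.locator = locator

    def startElement(self, name, attrs):
        if name == self.element and self.attribute in attrs:
            keyword = attrs[self.attribute]
            self.index[keyword] = self.locator.getLineNumber()


def build_index(xml_bytes, element, attribute):
    """Scan the document once and return the keyword -> line mapping."""
    handler = IndexBuilder(element, attribute)
    xml.sax.parse(io.BytesIO(xml_bytes), handler)
    return handler.index


sample = (
    b"<contacts>\n"
    b'<contact name="Ann" phone="555-0100"/>\n'
    b'<contact name="Bob" phone="555-0101"/>\n'
    b"</contacts>\n"
)
index = build_index(sample, "contact", "name")
```

Writing the mapping out to a separate index file, and keeping it in step
with the data file, is the part this sketch leaves out.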

> Does sqllite build indexes? How does it speed searches?
I think it uses a b-tree rep inside; I can't remember how the indexing
of non-primary key fields is done. Suffice to say it is fairly nippy for
the kinds of datasets you could reasonably fit on your PDA (1000s of
records), assuming you keep to a fairly small number of joins, and its
storage is pretty compact. For trivial cases it gets by out-performing
postgresql without using any internal caching (although it doesn't have
fine-grained locking, so that figures).
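
One way to check how a lookup on a non-primary-key field is handled is to
ask the engine for its query plan. A small sketch, assuming Python's
sqlite3 module (SQLite 3, so newer than the version discussed here); the
table, the index name contacts_by_name, and query_plan() are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)"
)
conn.executemany(
    "INSERT INTO contacts (name, phone) VALUES (?, ?)",
    [(f"person{i}", f"555-{i:04d}") for i in range(1000)],
)


def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " ".join(str(row[-1]) for row in rows)


lookup = "SELECT id FROM contacts WHERE name = ?"

# Without an index on name, the engine has to scan the table.
plan_before = query_plan(conn, lookup, ("person42",))

# Add an index on the non-primary-key field and ask again.
conn.execute("CREATE INDEX contacts_by_name ON contacts (name)")
plan_after = query_plan(conn, lookup, ("person42",))
```

The exact wording of the plan text varies between SQLite versions, but
plan_after should name contacts_by_name while plan_before does not.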

The thing about using an XML back end is that you end up with a fairly
hefty application-specific middle layer (i.e. a SAX handler), which would
make all the apps that wanted to use that format bigger, unless you move
that out into a library, in which case you are actually defining a common
API rather than a storage mechanism. Tying a DB to your app is a fairly
simple process.

I prefer to treat XML as an interchange format, so for example it would be
fairly easy to bind an SQL storage engine to something that did SyncML and
generated and parsed all your exchanged XML.

I would say I have the following apprehensions (feel free to enlighten me
if I am wrong) about using XML storage:
1) A lot of the data we are storing is /database/ oriented (i.e. contact
lists etc.), even if it isn't relational.
2) The overhead for record storage in XML is quite high (although I guess
it's not a huge difference), and if you include indexing it might be
substantial.
3) Updating the files is a non-trivial operation (i.e. writing and
re-writing the whole file for a single operation); reading data is also
non-trivial.
4) If you have indexing, that makes updates even less trivial.
5) Concurrent reads/updates are hard to cope with (i.e.
a->read, b->write, a->read): you would pretty much have to re-parse the
whole file, whereas SQLite can do this without a huge overhead.
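
To illustrate point 3, here is a rough sketch contrasting the two update
paths for a single changed phone number. It assumes Python's
xml.etree.ElementTree and sqlite3 modules; the file layout, element names
and both update_phone_* helpers are hypothetical:

```python
import os
import sqlite3
import tempfile
import xml.etree.ElementTree as ET


def update_phone_xml(path, name, phone):
    """Change one record: the whole file is parsed and written back."""
    tree = ET.parse(path)  # reads and parses every record
    for contact in tree.getroot().iter("contact"):
        if contact.get("name") == name:
            contact.set("phone", phone)
    tree.write(path)       # serialises every record again


def update_phone_sql(conn, name, phone):
    """Change one record: a single statement in its own transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE contacts SET phone = ? WHERE name = ?", (phone, name)
        )


workdir = tempfile.mkdtemp()

xml_path = os.path.join(workdir, "contacts.xml")
with open(xml_path, "w") as f:
    f.write(
        '<contacts><contact name="Ann" phone="555-0100"/>'
        '<contact name="Bob" phone="555-0101"/></contacts>'
    )
update_phone_xml(xml_path, "Bob", "555-0199")

conn = sqlite3.connect(os.path.join(workdir, "contacts.db"))
conn.execute("CREATE TABLE contacts (name TEXT PRIMARY KEY, phone TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?)",
    [("Ann", "555-0100"), ("Bob", "555-0101")],
)
conn.commit()
update_phone_sql(conn, "Bob", "555-0199")
```

The XML path costs a full parse and a full rewrite however small the
change is, and any other reader has to re-parse the file afterwards to
see it.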

One of the things that an SQL-oriented backend would give you is the
ability to have data that actually spanned apps, without too much
overhead, so operations like linking contacts to schedule items, or
contacts to other documents are much easier.
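
As a sketch of what linking contacts to schedule items might look like,
assuming Python's sqlite3 module and a made-up three-table schema
(contacts, events, and an event_contacts link table; events_for() is a
hypothetical helper):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE events (id INTEGER PRIMARY KEY, title TEXT, starts TEXT);
    -- Link table: one row per (event, attendee) pair.
    CREATE TABLE event_contacts (
        event_id INTEGER REFERENCES events(id),
        contact_id INTEGER REFERENCES contacts(id)
    );
    """
)
conn.executemany("INSERT INTO contacts VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(1, "Lunch", "2002-02-11 12:30"), (2, "Review", "2002-02-12 10:00")],
)
conn.executemany(
    "INSERT INTO event_contacts VALUES (?, ?)", [(1, 1), (2, 1), (2, 2)]
)


def events_for(conn, contact_name):
    """Return the titles of all events a contact is linked to, by start time."""
    rows = conn.execute(
        """
        SELECT e.title
        FROM events AS e
        JOIN event_contacts AS ec ON ec.event_id = e.id
        JOIN contacts AS c ON c.id = ec.contact_id
        WHERE c.name = ?
        ORDER BY e.starts
        """,
        (contact_name,),
    )
    return [title for (title,) in rows]
```

The address book and the scheduler would each own their table, and the
link table is the only thing they need to agree on.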

Any XML interchange (i.e. for SyncML or whatever) can be done by a DB
client.
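
A minimal sketch of such a client, assuming Python's sqlite3 and
xml.etree.ElementTree modules. The element names are made up for the
example and are not the SyncML schema; export_contacts() is a hypothetical
helper:

```python
import sqlite3
import xml.etree.ElementTree as ET


def export_contacts(conn):
    """Serialise the contacts table to an XML string for interchange."""
    root = ET.Element("contacts")
    query = "SELECT name, phone FROM contacts ORDER BY name"
    for name, phone in conn.execute(query):
        contact = ET.SubElement(root, "contact")
        ET.SubElement(contact, "name").text = name
        ET.SubElement(contact, "phone").text = phone
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?)",
    [("Bob", "555-0101"), ("Ann", "555-0100")],
)
exported = export_contacts(conn)
```

The XML only exists for the duration of the exchange; the stored data
stays in the database.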

owen

-- 
Owen Cliffe, Ph.D. Student, Dept. Computer Science
University of Bath
Received on Sun Feb 10 15:49:12 2002
