Posted by: fdmanana | January 31, 2010

Implemented 2 more features for Apache CouchDB

Since December I’ve been working on a feature for CouchDB to support the storage of gzip-compressed attachments. This not only saves disk space but also reduces disk IO. The compression is done incrementally using a zlib stream.

When an attachment chunk is received (attachments can be kilobytes, megabytes or even gigabytes in size), the chunk is compressed and written to the DB file. This required a slight change to the DB file format, while keeping full compatibility with the previous format.
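To make the incremental part concrete, here is a minimal Python sketch of compressing chunks one at a time with a single zlib stream. It is not CouchDB's code (CouchDB is written in Erlang); the function name `compress_chunks`, the compression level and the sample chunks are all assumptions made up for illustration.

```python
import gzip
import zlib


def compress_chunks(chunks, level=6):
    """Yield gzip-compressed pieces for an iterable of byte chunks."""
    # wbits=31 (16 + 15) asks zlib to emit a gzip header and trailer
    # instead of a raw zlib stream.
    compressor = zlib.compressobj(level, zlib.DEFLATED, 31)
    for chunk in chunks:
        piece = compressor.compress(chunk)
        if piece:  # zlib may buffer the input and return nothing yet
            yield piece
    # flush() emits whatever is still buffered plus the gzip trailer.
    yield compressor.flush()


# Hypothetical attachment arriving in three chunks.
chunks = [b"hello " * 1000, b"couchdb " * 1000, b"attachment " * 1000]

# In a real server each piece would be appended to the file as it is
# produced; joining them here just makes the result easy to inspect.
stored = b"".join(compress_chunks(chunks))

# The concatenated pieces form one valid gzip stream.
restored = gzip.decompress(stored)
```

The point of the sketch is that the whole attachment never has to be held in memory: only the current chunk and zlib's internal buffer are needed at any moment.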

When a client downloads an attachment, CouchDB decompresses it on the fly only if the client’s HTTP request doesn’t list gzip as an accepted content encoding (through the Accept-Encoding HTTP header).

Typically, and by default, XMLHttpRequests have an Accept-Encoding header with the value “gzip, deflate”. This way we push some work to the client side (the decompression), while reducing disk IO on the CouchDB side (serving attachments and compaction for example).
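The decision described above can be sketched in a few lines of Python. This is a simplified illustration and not CouchDB's actual logic: the helper names `accepts_gzip` and `serve_attachment` are hypothetical, and the header parsing only looks for the `gzip` coding and its optional `q` value.

```python
import gzip


def accepts_gzip(accept_encoding):
    """Return True if the Accept-Encoding value lists gzip with q > 0."""
    if not accept_encoding:
        return False
    for item in accept_encoding.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != "gzip":
            continue
        q = 1.0  # a coding without an explicit q value counts as q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not accepted"
        return q > 0
    return False


def serve_attachment(stored_gzip, accept_encoding):
    """Return (headers, body) for an attachment stored gzip-compressed."""
    if accepts_gzip(accept_encoding):
        # The client can decompress: send the stored bytes untouched.
        return {"Content-Encoding": "gzip"}, stored_gzip
    # Otherwise decompress on the server before sending.
    return {}, gzip.decompress(stored_gzip)


stored = gzip.compress(b"body { color: red }")
headers_gz, body_gz = serve_attachment(stored, "gzip, deflate")
headers_plain, body_plain = serve_attachment(stored, None)
```

With the “gzip, deflate” header the stored bytes go out as they are; without it the server pays the decompression cost itself.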

I’m leaving out a few more subtle technical details, of course. Also worth noting is that attachments are compressed only if their MIME type matches one of those listed in the CouchDB config file. Text-based MIME types (text/plain, text/css, text/javascript, application/javascript, etc.) are definitely worth compressing, while others such as images and audio (jpg, png, mp3) are not, as those formats are already compressed by nature.
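As a rough illustration of that check, here is a small Python sketch. The pattern list, the `text/*` wildcard and the name `is_compressible` are assumptions for the example; the real list lives in the CouchDB config file and its exact syntax is not shown here.

```python
from fnmatch import fnmatch

# Hypothetical list of compressible types, standing in for the config entry.
COMPRESSIBLE_TYPES = ["text/*", "application/javascript"]


def is_compressible(content_type, patterns=COMPRESSIBLE_TYPES):
    """Return True if the attachment's MIME type matches a listed pattern."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    mime = content_type.split(";", 1)[0].strip().lower()
    return any(fnmatch(mime, pattern) for pattern in patterns)


css_ok = is_compressible("text/css")
png_ok = is_compressible("image/png")
```

Here a stylesheet would be compressed on arrival, while a PNG would be stored as received.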

I’m naturally very happy to have made this contribution:

Also implemented a minor feature related to the replicator:

Special thanks to Chris Anderson and Paul Joseph Davis.





