PyMongo 1.7 Released
17 June 2010
A new release for “PyMongo”:http://api.mongodb.org/python has been long overdue: the last release (1.6) was made on May 11th, over two months ago! It’s been a busy couple of months with travel and “writing”:http://oreilly.com/catalog/9781449389536/ (and life), so a release hasn’t been at the top of the list. I finally made the effort to get a release out today, though, and the result is PyMongo 1.7. For a full list of changes, check the “changelog”:http://api.mongodb.org/python/1.7/changelog.html; in this post I’ll talk about a few of them in detail.
Using a dict to Specify Fields to Return
One of the smallest changes in 1.7 is that the fields argument to
find/find_one can now take a dict in addition to a list. So, the
following two find_ones are equivalent:
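A minimal sketch of what such a pair of calls might look like. The collection object and the key names x and y are hypothetical; only the fields argument itself comes from the release.

```python
def find_one_two_ways(collection):
    """Fetch one document twice, selecting keys with a list and then a dict.

    Assumes ``collection`` is a PyMongo 1.7 collection; ``x`` and ``y``
    are made-up key names used only for illustration.
    """
    # List form: return only the "x" and "y" keys.
    with_list = collection.find_one({"x": 1}, fields=["x", "y"])
    # Dict form: the same selection, spelled the way the shell spells it.
    with_dict = collection.find_one({"x": 1}, fields={"x": 1, "y": 1})
    return with_list, with_dict
```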
The second call should look familiar to those who have used the MongoDB
shell extensively, while the first is a little less verbose (and has
always been the PyMongo way of doing things). Since PyMongo was first
implemented, however, there have been some new features added to the
server that are best exposed through the dict interface, like
specifying keys that we don’t want returned:
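As a rough sketch, assuming a PyMongo 1.7 collection is passed in (find_one_without is a hypothetical helper, not part of PyMongo), excluding keys could look like this:

```python
def find_one_without(collection, *unwanted):
    """Fetch one document, leaving out every key named in ``unwanted``."""
    # A value of 0 tells the server to exclude that key from the result.
    fields = dict((key, 0) for key in unwanted)
    return collection.find_one(fields=fields)
```

Calling find_one_without(collection, "password") would then send {"password": 0} as the field selector; "password" is just an example key.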
Or using the new $slice operator to only return portions of an
array:
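A sketch along the same lines, again assuming a PyMongo 1.7 collection; the array key comments and the helper name are hypothetical:

```python
def find_one_with_recent(collection, count):
    """Fetch one document with only the last ``count`` array entries."""
    # A negative $slice counts from the end of the array; a positive
    # value would return the first ``count`` entries instead.
    fields = {"comments": {"$slice": -count}}
    return collection.find_one(fields=fields)
```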
datetime Handling
PyMongo 1.7 also improves support for working with datetime
instances. The first change is that timezone-aware datetimes are now
properly encoded: they are converted to UTC before being saved. Naive
datetimes are still assumed to be in UTC.
Since BSON is not currently capable of storing datetimes with timezone
information, all datetimes will still decode as UTC. Currently the
datetime instances returned will be naive, but that will probably
change in the future as well.
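The encoding rule can be illustrated in plain Python. This is only a sketch of the conversion described here, not PyMongo's actual code, and to_naive_utc is a hypothetical helper name:

```python
from datetime import datetime, timedelta, timezone


def to_naive_utc(value):
    """Return ``value`` as a naive datetime expressed in UTC."""
    offset = value.utcoffset()
    if offset is None:
        # Naive input is assumed to already be in UTC.
        return value
    # Shift by the offset, then drop the timezone information.
    return (value - offset).replace(tzinfo=None)


# Noon at UTC-4 is 16:00 UTC.
aware = datetime(2010, 6, 17, 12, 0, tzinfo=timezone(timedelta(hours=-4)))
print(to_naive_utc(aware))  # 2010-06-17 16:00:00
```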
A final change in 1.7 is that the PyMongo C extension now uses the y2038 project’s implementation of time.h. This allows the C extension to properly support dates beyond January 19, 2038. There might still be issues for users on 32-bit platforms using the pure-Python encoder, at least for older versions of Python.
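The January 19, 2038 date comes from the range of a signed 32-bit time_t, which counts seconds since the start of 1970 in UTC. The arithmetic is easy to check in Python:

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

# The largest value a signed 32-bit counter of seconds can hold.
MAX_INT32 = 2 ** 31 - 1

last_32bit_second = EPOCH + timedelta(seconds=MAX_INT32)
print(last_32bit_second)  # 2038-01-19 03:14:07
```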
max_scan
Version 1.7 adds support for the server’s new max_scan functionality.
max_scan allows a developer to set the maximum number of documents to
scan for each individual query. This functionality can be useful when a
developer wants guaranteed performance, even if it means a partial
result set.
max_scan requires a MongoDB server version >= 1.5.1.
max_scan can be passed as an argument to find/find_one, or can be added
to a Cursor using chaining; the following two queries are equivalent:
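A minimal sketch of the two spellings, assuming a PyMongo 1.7 collection is passed in; the helper name and its parameters are hypothetical:

```python
def scan_limited(collection, spec, limit):
    """Build the same scan-limited query in both supported ways."""
    # Pass max_scan directly as an argument to find...
    as_argument = collection.find(spec, max_scan=limit)
    # ...or chain it onto the cursor that find returns.
    chained = collection.find(spec).max_scan(limit)
    return as_argument, chained
```

Either cursor stops once the server has scanned limit documents, so the result set may be incomplete.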
Custom Classes for Returned Documents
The BSON decoder in PyMongo normally decodes all documents as dicts.
Version 1.7 adds the ability to specify a custom class to decode to.
All the class needs is a __setitem__ method, which will be called for
each key/value pair in the BSON being decoded.
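To make the mechanism concrete, here is a pure-Python sketch of a decoder filling in a custom class. It is not PyMongo's decoder: decode_pairs and Recorder are hypothetical names, and the sketch assumes the class can be instantiated with no arguments.

```python
class Recorder(object):
    """Remembers the order in which keys arrive; not a dict at all."""

    def __init__(self):
        self.keys = []

    def __setitem__(self, key, value):
        self.keys.append(key)


def decode_pairs(pairs, as_class=dict):
    """Build a document by assigning each key/value pair in turn."""
    document = as_class()
    for key, value in pairs:
        document[key] = value  # invokes as_class.__setitem__
    return document


print(decode_pairs([("b", 1), ("a", 2)], as_class=Recorder).keys)  # ['b', 'a']
```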
This ability is exposed through the as_class argument to find/find_one,
and a new default class can be set using a Connection’s document_class
attribute. The two queries below both decode to SON, so that the
resultant documents maintain key order:
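A rough sketch of the two approaches. The connection and collection are assumed to be PyMongo 1.7 objects, and the class is passed in as a parameter so the sketch does not depend on where SON is imported from; both helper names are hypothetical.

```python
def find_one_per_query(collection, document_class):
    """Decode a single query's result into ``document_class``."""
    return collection.find_one(as_class=document_class)


def find_one_with_default(connection, collection, document_class):
    """Make ``document_class`` the connection-wide default, then query."""
    connection.document_class = document_class
    return collection.find_one()
```

Passing SON as document_class to either helper would give back a document that keeps its keys in the order the server returned them.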
There are a couple of cool things that fall out of this functionality.
First, third-party libraries like MongoKit should have an easier time
re-hydrating their models, and will probably be a little more
efficient. Second, the class used doesn’t even need to represent a
document: it can just treat __setitem__ like a hook, allowing custom
code to run on each key/value pair. Here, we create a custom class that
just prints key/value pairs to stdout:
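A minimal sketch of such a class, assuming a PyMongo 1.7 collection for the query; the names Printer and print_one are made up for illustration:

```python
class Printer(object):
    """Prints each key/value pair it is handed instead of storing it."""

    def __setitem__(self, key, value):
        print("%s: %r" % (key, value))


def print_one(collection):
    """Decode one document into a Printer, printing its contents."""
    return collection.find_one(as_class=Printer)
```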
This example is a bit pointless, but I’m sure readers can think of much cooler things to do with it!
There’s a whole bunch more that’s new in PyMongo 1.7: check the changelog for the full list, and go upgrade!