We use Cassandra pretty heavily. For all of its advantages, one pain point for us has been the python-cassandra driver’s connection speed. Today, I finally dove into the code to figure out whether I could squeeze out a little more juice or whether I’d need to start caching results to keep our pages loading quickly.
I discovered a potential issue (which has been submitted to DataStax: https://github.com/datastax/python-driver) that forces the driver to fetch the metadata for all keyspaces each time the client connects to the cluster. A tweak that requests metadata only for the keyspace in use showed significant performance improvements (as much as 50%).
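If you want to check whether connection time is a bottleneck in your own setup, here is a minimal sketch of how it could be measured. This is not the patch and not the exact method I used; the helper names (`time_connect`, `connect_cassandra`, `summarize`), the host, and the keyspace are all placeholders I made up for illustration.

```python
import statistics
import time


def time_connect(connect_fn, repeats=5):
    """Call connect_fn `repeats` times and return each call's duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        handle = connect_fn()
        durations.append(time.perf_counter() - start)
        # Clean up between runs if the returned object supports shutdown().
        shutdown = getattr(handle, "shutdown", None)
        if callable(shutdown):
            shutdown()
    return durations


def connect_cassandra(hosts=("127.0.0.1",), keyspace="my_keyspace"):
    """Hypothetical connect function; host and keyspace are placeholders."""
    # Imported lazily so the timing helper works without the driver installed.
    from cassandra.cluster import Cluster

    cluster = Cluster(list(hosts))
    cluster.connect(keyspace)  # the step whose duration we care about
    return cluster  # returned so time_connect can shut it down


def summarize(durations):
    """Reduce a list of durations to a couple of headline numbers."""
    return {"median": statistics.median(durations), "max": max(durations)}
```

Calling `summarize(time_connect(connect_cassandra))` against a test cluster, before and after any change, gives a rough comparison. Repeating the measurement matters because a single connection time can be skewed by network noise.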
We’re only running this in our experimental environment, and because the change could cause serious issues for other people, I’m not going to post the patch. However, I’m happy to report that it does seem as though some speed improvements can be made one way or another.