With my recent move to working for Couchbase I’ve also made the leap from Big Data & SQL Analytics to NoSQL and document databases.  I’m going to ease into blogging more about it here (and in my channel on the corporate blog) so for the first step let’s go through a few basic concepts.  Then we’ll dive directly into an ultra simple Python example of storing and accessing a JSON document in Couchbase.

This is somewhat stream of consciousness (okay it’s a total ramble!) but I just want to get your juices flowing and hear what other areas you’d be interested in.

Growing Beyond NoSQL

When the term NoSQL was coined it referred, somewhat derogatorily, to alternatives to the standard SQL databases.  Oddly enough, NoSQL solutions have since grown so much that the term has stretched to cover far more than initially imagined.

For example, many SQL database vendors now offer document-handling capabilities too, so what does “NoSQL” even mean at that point?

Data Platform > NoSQL

This is why Couchbase refers to its offering as a Data Platform and not just a NoSQL database: the NoSQL label on its own doesn’t say much anymore.

Many users came to NoSQL for simplified access to binary objects in a database-like way.  That is, you specify a key/ID and get back the relevant object for your application.  This is often how a caching layer between a database and a web app works.
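Here is a minimal sketch of that caching pattern, using a plain Python dict as a stand-in for the key/value store.  The names `cache`, `slow_database_lookup` and `get_profile` are made up for illustration and are not part of any Couchbase API.

```python
# Stand-in for a key/value store: in a real deployment this would be a
# Couchbase bucket, here it is just a dict (an assumption for illustration).
cache = {}

# Counts how often the "slow" backing database is actually hit.
lookups = {"count": 0}


def slow_database_lookup(user_id):
    """Pretend to run an expensive query against the primary database."""
    lookups["count"] += 1
    return {"id": user_id, "name": "user-" + str(user_id)}


def get_profile(user_id):
    """Return the profile for user_id, checking the cache by key first."""
    key = "profile::" + str(user_id)
    if key not in cache:
        cache[key] = slow_database_lookup(user_id)
    return cache[key]


first = get_profile(42)   # miss: goes to the database
second = get_profile(42)  # hit: served straight from the cache
```

The second call never touches the backing database, which is the whole point of putting a key/value layer in front of it.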

Outside of that scenario, many are looking for a document database – the document, in this case, being JSON data.  Originally this meant just storing a chunk of JSON and sending it to applications when requested.  For example, an online game probably stores your user profile in a JSON object that it then fetches in a single call to the database when needed.  The web app easily decodes the JSON into the components needed for the GUI because JSON maps directly onto JavaScript objects!
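To make the round trip concrete, here is a small sketch using only Python's standard `json` module.  The profile and its field names are invented for this example; the point is just that one stored string decodes into native objects in a single step.

```python
import json

# Hypothetical user profile; the field names are invented for this example.
profile = {
    "name": "tyler",
    "level": 7,
    "inventory": ["sword", "shield"],
}

# What the database stores and returns is a single JSON string.
stored = json.dumps(profile)

# The application decodes it in one step and gets native objects back.
loaded = json.loads(stored)
first_item = loaded["inventory"][0]
```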

Flexibility With Structure

The most interesting part of the evolution is the ability to access specific parts of the JSON document in the database.  But wait, isn’t this why we got away from SQL databases?  Yes and no.

SQL databases depend on a static schema, often highly normalized, which introduces more complexity than many web apps really want to deal with.  Need to add a new field or table to your application?  The schema changes can be onerous to handle as they trickle down through database views, middleware query design and ultimately to the end user experience – all layers need updates.

With the flexible schemas of a modern data platform you model your data according to its end use.  Need a new field in your game profile?  You just start adding it to the JSON, and when the web UI receives it, it is available to JavaScript.
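As a rough sketch of what that looks like from the application side, assume two made-up game profiles, one saved before a `guild` field existed and one after.  Nothing gets migrated; the code simply tolerates the older shape.

```python
# Two hypothetical game profiles: the second one was saved after a new
# "guild" field was introduced. No migration was run on the first.
old_profile = {"name": "me", "level": 3}
new_profile = {"name": "Myself", "level": 5, "guild": "Night Owls"}


def describe(profile):
    """Build a display string, tolerating documents without 'guild'."""
    guild = profile.get("guild", "no guild")
    return "{} (level {}, {})".format(profile["name"], profile["level"], guild)


summaries = [describe(p) for p in (old_profile, new_profile)]
```

The trade-off is that the application, not the database, is now responsible for handling documents that lack the newer field.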

Back to SQL

To further complicate the NoSQL moniker, there are real strengths to being able to use SQL to aggregate records of data.  N1QL (pronounced “nickel”, short for Non-first Normal Form Query Language) allows us to query across sets of JSON documents in the database using SQL-like syntax.  It’s pretty sweet to use actually!  If a document doesn’t have the fields a query asks about, it simply doesn’t match and is left out.
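To illustrate how documents with missing fields drop out, here is a plain-Python imitation of that behaviour.  This is not how the server evaluates a query, and the documents and the query in the comment are invented for the example; it only mimics the outcome described above.

```python
# Hypothetical documents; only some of them carry a "level" field.
docs = [
    {"name": "me", "level": 3},
    {"name": "Myself", "level": 8},
    {"name": "I"},  # no "level" field at all
]

# Roughly the question a query like this would ask (illustrative only):
#   SELECT name, level FROM `default` WHERE level > 5
# A document without "level" cannot satisfy the condition, so it drops out.
matches = [
    {"name": d["name"], "level": d["level"]}
    for d in docs
    if "level" in d and d["level"] > 5
]

# Aggregating works the same way: only documents that have the field count.
levels = [d["level"] for d in docs if "level" in d]
average_level = sum(levels) / float(len(levels))
```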

I’ll show you more of this in the future, though there is a lot on our website about it already.

So that’s the background, now let’s do a simple example…

Python NoSQL Access to Couchbase Server

  • Install Couchbase Server – some instructions here if needed.
    • Ultra simple for OS X users, just launch a DMG – get the enterprise trial for latest features
    • It’s also very simple with Docker (user/password is Administrator/password):
docker run -t --name db -p 8091-8094:8091-8094 -p 11210:11210 couchbase/server:sandbox

These drop you into a web management console at http://localhost:8091.
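If the console doesn't come up right away, a quick way to check from Python whether anything is listening on that port is sketched below.  `is_port_open` is a made-up helper built on the standard `socket` module, not a Couchbase tool, and it only tells you the port accepts connections, not that the server is fully set up.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False
    conn.close()
    return True


# 8091 is the web console port used in this walkthrough.
console_ready = is_port_open("localhost", 8091)
```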

You can pretty much accept all the defaults for the purposes of this walkthrough; just make sure you give it enough RAM if prompted.  The Couchbase icon should be in your desktop toolbar, with a handy link to stop and relaunch the web console.

Couchbase Python Packages

I’m on a Mac so will stick with Mac instructions here.  There are only two simple commands you need to get started with Python and Couchbase.  Get libcouchbase (the C library) and the Couchbase Python module:

brew install libcouchbase
pip install couchbase

Test that it’s installed by launching Python and importing Couchbase:

$ python
>>> import couchbase
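If you'd rather check from a script without triggering an ImportError, something like the following works on Python 3.  `is_installed` is a made-up helper name; it relies only on the standard `importlib` machinery.

```python
import importlib.util


def is_installed(module_name):
    """Return True if module_name can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


# True only if the couchbase package is visible to this interpreter.
have_couchbase = is_installed("couchbase")
```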

Now let’s do something more interesting.

Basic Couchbase Python Document Example

As an aside, you may have noticed that Couchbase uses a term called Buckets – a loose collection of documents.  There is really no limitation on what kinds of documents you put in the same bucket, so for now think of a bucket as loosely like a traditional database table.  Some documents will have some fields, some documents will not.  It’s a bit of a mind-bender but there isn’t a much easier way to explain it without just trying it out.

There is a bucket called default that is already installed, so we’ll use it for now.  We’ll connect to it, then take a Python list and save (set) it to the database with a custom ID.  Then we get the document back as a list and I show how you can easily iterate over it.
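One way to picture a bucket is as a dict of document ID to document body, as in the sketch below.  The IDs and fields are invented, and the `type` field is just a common convention for telling document kinds apart, not something the bucket enforces.

```python
# A dict standing in for a bucket: document ID -> document body.
bucket = {
    "user::tyler": {"type": "user", "name": "Tyler", "level": 7},
    "user::sam": {"type": "user", "name": "Sam"},
    "item::sword": {"type": "item", "damage": 12},
}

# Pick out one kind of document, much like selecting from one table.
users = {k: v for k, v in bucket.items() if v.get("type") == "user"}

# Fields vary from document to document, so read them defensively.
levels = {k: v.get("level") for k, v in users.items()}
```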

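>>> from couchbase.bucket import Bucket
>>> # Connect to the default bucket on the local server
>>> db = Bucket("couchbase://localhost/default")

>>> vip_people = ["me", "Myself", "I"]

>>> # Save the list under a custom document ID, then read it back
>>> db.set("tyler::friends", vip_people)
>>> mydoc = db.get("tyler::friends").value
>>> for d in mydoc: print(d)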

The cool thing is that most use cases aren’t much different from this!  Get, set and operate on elements of a JSON document.
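Here is a sketch of that get, change, set cycle on a dict-shaped document.  So that it runs without a server, it uses `FakeBucket` and `FakeResult`, two made-up in-memory stand-ins that only imitate the `set`/`get(...).value` shape used above; they are not part of the Couchbase SDK.

```python
import copy


class FakeResult(object):
    """Mimics the result object used above: the document is in .value."""

    def __init__(self, value):
        self.value = value


class FakeBucket(object):
    """In-memory stand-in for a bucket (hypothetical, for illustration)."""

    def __init__(self):
        self._docs = {}

    def set(self, key, value):
        # Store a copy so later changes by the caller are not saved implicitly.
        self._docs[key] = copy.deepcopy(value)

    def get(self, key):
        return FakeResult(copy.deepcopy(self._docs[key]))


db = FakeBucket()
db.set("tyler::profile", {"name": "tyler", "friends": ["me", "Myself"]})

# Get the document, change one element, and save it back under the same ID.
profile = db.get("tyler::profile").value
profile["friends"].append("I")
db.set("tyler::profile", profile)

friends = db.get("tyler::profile").value["friends"]
```

Swapping `FakeBucket()` for a real bucket connection should leave the three-step pattern unchanged, though I haven't shown error handling or concurrent updates here.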

You can open the web console and see the document sitting in the default bucket.  Just search for the document ID I used: tyler::friends.

There are still a ton of other options that I’ll touch on in a future blog post – for example, getting the server to do all the work of managing collections frameworks in .NET or Java.

What’s Next?

Want to learn more?  Leave a comment and I’ll go in that direction!  In particular I’m planning to show more about doing NoSQL operations in Spark, Zeppelin and Apache Kafka.  Interested?

See my conference presentation video for hints on what will come next 🙂

About Tyler Mitchell

Director Product Marketing @ OmniSci.com GPU-accelerate data analytics | Sr. Product Manager @ Couchbase.com - next generation Data Platform for System of Engagement! Former Eng. Director @Actian.com, author and technology writer in NoSQL, big data, graph analytics, geospatial and Internet of Things. Follow me @1tylermitchell or get my book from http://locatepress.com/.