USING COUCHDB

2012-02-27

Introduction

Before using this information, you need to know how the JSON format works. JSON is kind of like XML, it is a way of representing data. I won't go into more details in here.

Concepts

If you are switching from a SQL database like MySQL to couchdb, then chances are you will be wondering where are the tables and how do I query them? Well there is no table. To make things simple, try to think of it this way:

CouchDB is like a database that contains only one table and one column. Each row is filled with a JSON document. You could easily do that with MySQL, except that the server doesn't understand JSON, so it can't do any special processing based on what your JSON document contains.
Everything is done through a web interface using a REST API. This doesn't mean that you query the DB directly from your website (you still make the queries from the server side). And for that matter, it doesn't mean that CouchDB is only made for websites.
If you are searching for "stored procedures", you wanna use "views" with couchDB.

So consider this: If you are building a simple blog where each posts contains a timestamp, a title and a content, then you will probably create a table like this in MySQL:

ID	TimeStamp	Title	Content
1	330133439	A Post	oh yeah
2	330133439	Another post	blah blah blah
...

What happens if you wanna add a "tag" column at one point? You'd have to modify your schema. So instead, for flexibility, you will decide to use one column only and store each post with a format you like, maybe you'll choose XML:

Data
<post> <id>1<\id> <title>A post</title> <timestamp>330133439</timestamp> <content>oh yeah</content> </post>
<post> <id>2<\id> <title>Another post</title> <timestamp>330133439</timestamp> <content>blah blah blah</content> </post>
...

This is exactly what couchDB is used for. Except that instead of a row, it calls it a document. Instead of using XML, it uses the JSON format. You might be wondering what's the point of using couchdb over mysql if both can do the same thing then. Couch DB adds more functionalities, like adding attachments to a document, create views with javascript and so much more. You will find a lot of blogs with people debating SQL vs NoSQL, so I won't cover this here. I just wanted to explain what CouchDB is.

Cheatsheet

Check if DB exists: curl -X GET http://127.0.0.1:5984/database1/
where 'database1' is the name of your database
Will return an error if DB does not exist
Create a database: curl -X PUT http://127.0.0.1:5984/database1/
where 'database1' is the name of your database
Create a document: curl -X PUT http://127.0.0.1:5984/database1/document1 -H "Content-Type: application/json" -d {"field1":"value1","field2":"value2"}
where 'database1' is the name of your database
where 'document1' is the ID of the document to create
Retrieve a document: curl -X GET http://127.0.0.1:5984/database1/document1
where 'database1' is the name of your database
where 'document1' is the ID of the document to retrieve
Create a view: curl -X PUT http://127.0.0.1:5984/database1/_design/designdocument1 -H "Content-Type: application/json" -d {JSON_REPRESENTATION_OF_VIEW}/
where 'designdocument1' is the name of your designdocument
Note that a design document can contain more than one view. A view contains a map function and a reduce function. The following is an example of what could be included as the "JSON_REPRESENTATION_OF_VIEW" { "language": "javascript", "views": { "view1": { "map": "function(doc){emit(doc._id,doc);}" }, "view2": { "map": "function(doc){emit(doc._id,doc);}", "reduce": "function (key, values){return null;}" } } }
Query a view: http://127.0.0.1:5984/database1/_design/designdocument1/_view/view2?reduce=true&group=true&skip=2&limit=5
where 'database1' is the name of your database
This will return the results of the view "view1" in "designdocument1". We have also provided parameters in the URL that says: we want the reduce function to be executed, we want results grouped, we want to skip the 2 first documents returned by the view, we want a maximum of 5 documents in total.

using the results in php

If we query curl -X GET http://127.0.0.1:5984/database1/document1 and we get the result


{
   "_id": "document1",
   "_rev": "1-a227e6b8d34d14fbc59c4dde72e53848",
   "field1": "value1",
   "field2": {"sub1":"val1","sub2":"val2"},
   "field3": ["val1","val2","val3"]
}

Then we can take that result and decode it using json_decode


$obj = json_decode($jsonString);

We get:

$obj->field1="value1"
$obj->field2->sub2"val2" ($obj->field2 is an object)
$obj->field3[1]="val2" ($obj->field3 is an array)

Text Search

Consider this SQL query: SELECT * FROM posts WHERE content LIKE 'test'. With CouchDB, it gets a little more complicated. First, you need to create a view that emits a map of ALL the words in your documents.


function(doc) {
    var tokens;
    if (doc.content) {
        var st = doc.content.replace(/<(?:.|\n)*?>/gm, '');
        tokens = st.split(/[^A-Z0-9\-_]+/i);
        var uniqueTokens = {};
        for (var i=0;i<tokens.length;i++)
        {
            var key = (tokens[i]);
            if (key!="") uniqueTokens[key] = key;
        }
        for (var token in uniqueTokens){
            emit(token,doc.title);
        }
    }
}

So if you have the following documents in your database:


{"title":"doc1","content":"hello this is a test"}
{"title":"doc2","content":"another document"}

Your view would output the following:


"hello",doc1
"this",doc1
"is",doc1
"a",doc1
"test",doc1
"another",doc2
"document",doc2

So if you want to retrieve only the documents that contains the word "test", then you could invoke the following: http://127.0.0.1:5984/database1/_design/designdocument1/_view/view1?keys=["test"]