Couch Potato
… is a persistence layer written in ruby for CouchDB.
Upgrading from 1.x to 2.x
Version 1.x monkey patched Date#to_json (=> "2015/01/01") and Time#to_json (=> "2016/01/01 12:00:00 +0000"), 2.x does not (Date => "\"2015-01-01\"", Time => "\"2016-01-01 12:00:00 +0100\"").
In order to keep the old behavior, add the following to your code:
require 'couch_potato/core_ext/time'
require 'couch_potato/core_ext/date'
This will again apply the monkey patches. It is highly recommended to add the require statements if you are upgrading from 1.x. Otherwise, the format of Date and Time objects written to your database and used to query it will change, which means data written before the update won't be returned. To avoid the monkey patching, you have to re-write all your documents, so that dates/times are stored in the new format.
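The effect of the format change on lookups can be sketched in plain Ruby, without Couch Potato or CouchDB. This is only an illustration: the stored_keys array is a hypothetical stand-in for a view index, and the old format string is taken from the example above.

```ruby
require 'json'
require 'date'

date = Date.new(2015, 1, 1)

# Key format written by the 1.x monkey patch (as quoted above).
old_format = date.strftime('%Y/%m/%d')
# Key format without the patch: the json library falls back to Date#to_s.
new_format = date.to_s

# Hypothetical stand-in for keys written to a view index by a 1.x application.
stored_keys = [old_format]

found_with_new_format = stored_keys.include?(new_format)
found_with_old_format = stored_keys.include?(old_format)

puts date.to_json           # the JSON string produced without the monkey patch
puts found_with_new_format  # false: the old document is no longer matched
puts found_with_old_format  # true
```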
Mission
The goal of Couch Potato is to create a minimal framework in order to store and retrieve Ruby objects to/from CouchDB and create and query views.
It follows the document/view/querying semantics established by CouchDB and won't try to mimic ActiveRecord behavior in any way as that IS BAD.
Code that uses Couch Potato should be easy to test.
Lastly Couch Potato aims to provide a seamless integration with Ruby on Rails, e.g. routing, form helpers etc.
Core Features
- persisting objects by including the CouchPotato::Persistence module
- declarative views with either custom or generated map/reduce functions
- extensive spec suite
Supported Environments
- CouchDB 1.6.0+
- see .github/workflows/ruby.yml
(Supported means I run the specs against those before releasing a new gem.)
Installation
Couch Potato is hosted as a gem which you can install like this:
gem install couch_potato
Using with your ruby application:
require 'rubygems'
require 'couch_potato'
After that you configure the name of the database:
CouchPotato::Config.database_name = 'name_of_the_db'
The server URL will default to http://localhost:5984/ unless specified:
CouchPotato::Config.database_name = "http://example.com:5984/name_of_the_db"
But you can also specify the database host separately from the database name:
CouchPotato::Config.database_host = "http://example.com:5984"
CouchPotato::Config.database_name = "name_of_the_db"
Or with authentication
CouchPotato::Config.database_name = "http://username:password@example.com:5984/name_of_the_db"
Optionally you can configure the default language for design documents (:javascript (default) or :erlang).
CouchPotato::Config.default_language = :erlang # or :javascript (the default)
Another switch allows you to store each CouchDB view in its own design document. Otherwise views are grouped by model.
CouchPotato::Config.split_design_documents_per_view = true
If you are using more than one database from your app, you can create aliases:
CouchPotato::Config.additional_databases = {'db1' => 'db1_production', 'db2' => 'https://db2.example.com/db'}
db1 = CouchPotato.use 'db1'
Using with Rails
Create a config/couchdb.yml:
default: &default
  split_design_documents_per_view: true # optional, default is false
  digest_view_names: true # optional, default is false
  default_language: :erlang # optional, default is javascript
  database_host: "http://127.0.0.1:5984"
development:
  <<: *default
  database: development_db_name
test:
  <<: *default
  database: test_db_name
production:
  <<: *default
  database: <%= ENV['DB_NAME'] %>
  additional_databases:
    db1: db1_production
    db2: https://db2.example.com/db
Add to your Gemfile:
# gem 'rails' # we don't want to load activerecord so we can't require rails
gem 'railties'
gem 'actionpack'
gem 'actionmailer'
gem 'activemodel'
gem "couch_potato"
gem 'tzinfo'
Introduction
This is a basic tutorial on how to use Couch Potato. If you want to know all the details feel free to read the specs and the rdocs.
Save, load objects
First you need a class.
class User
end
To make instances of this class persistent include the persistence module:
class User
include CouchPotato::Persistence
end
If you want to store any properties you have to declare them:
class User
include CouchPotato::Persistence
property :name
end
Properties can be typed:
class User
include CouchPotato::Persistence
property :address, :type => Address
end
In this case Address also implements CouchPotato::Persistence which means its JSON representation will be added to the user document.
Couch Potato also has support for the basic types (right now Integer, Date, Time and :boolean are supported):
class User
include CouchPotato::Persistence
property :age, :type => Integer
property :receive_newsletter, :type => :boolean
end
With this in place, when you set the user's age as a String (e.g. using an HTML form) it will be converted into an Integer automatically.
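The idea behind typed properties can be sketched as a small casting helper. The cast method below is hypothetical and is not Couch Potato's implementation; in particular, the set of values treated as true for :boolean is an assumption made for this sketch.

```ruby
require 'date'

# Hypothetical helper: convert a raw value (e.g. from an HTML form) to the declared type.
def cast(value, type)
  return value if value.nil?

  if type == Integer
    Integer(value.to_s, 10)
  elsif type == Date
    value.is_a?(Date) ? value : Date.parse(value.to_s)
  elsif type == :boolean
    # Assumption: these values count as true, everything else as false.
    [true, 'true', '1', 1].include?(value)
  else
    value
  end
end

age = cast('23', Integer)                 # => 23
newsletter = cast('1', :boolean)          # => true
birthday = cast('2015-01-01', Date)       # => a Date for January 1st, 2015
```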
Properties can have a default value:
class User
include CouchPotato::Persistence
property :active, :default => true
property :signed_up, :default => Proc.new { Time.now }
end
Now you can save your objects. All database operations are encapsulated in the CouchPotato::Database class. This separates your domain logic from the database access logic which makes it easier to write tests and also keeps your models smaller and cleaner.
user = User.new :name => 'joe'
CouchPotato.database.save_document user # or save_document!
You can of course also retrieve your instance:
CouchPotato.database.load_document "id_of_the_user_document" # => <#User 0x3075>
Handling conflicts
CouchDB uses MVCC to detect write conflicts. If a conflict occurs when trying to update a document CouchDB returns an error. To handle conflicts easily you can save documents like this:
CouchPotato.database.save_document user do |user|
user.name = 'joe'
end
When a conflict occurs Couch Potato automatically reloads the document, runs the block and tries to save it again. Note that the block is also run before initially saving the document.
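The reload-and-retry behavior described above can be sketched in plain Ruby. Everything here is hypothetical: FakeStore is an in-memory stand-in that checks revisions, and save_with_retry only illustrates the control flow, not Couch Potato's actual code. A real implementation would likely also cap the number of retries.

```ruby
# Raised by the fake store when the given revision is stale.
class Conflict < StandardError; end

# Hypothetical in-memory store that rejects writes with an outdated revision.
class FakeStore
  def initialize
    @docs = {}
  end

  def get(id)
    doc = @docs.fetch(id)
    { id: id, rev: doc[:rev], name: doc[:name] }
  end

  def put(doc)
    current = @docs[doc[:id]]
    raise Conflict if current && current[:rev] != doc[:rev]

    new_rev = (current ? current[:rev] : 0) + 1
    @docs[doc[:id]] = { rev: new_rev, name: doc[:name] }
    new_rev
  end
end

# Run the block, try to save; on conflict reload the document and start over.
def save_with_retry(store, doc)
  yield doc
  doc[:rev] = store.put(doc)
  doc
rescue Conflict
  doc = store.get(doc[:id])
  retry
end

store = FakeStore.new
store.put({ id: 'u1', rev: nil, name: 'joe' })  # initial save, revision 1
stale = store.get('u1')                          # our copy, revision 1
store.put({ id: 'u1', rev: 1, name: 'jim' })     # concurrent update, revision 2

block_runs = 0
saved = save_with_retry(store, stale) do |doc|
  block_runs += 1
  doc[:name] = 'joey'
end
```

In this sketch the block runs twice: once against the stale copy, and once more against the reloaded document after the conflict.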
Caching load requests
You can add a cache to a database instance to enable caching subsequent #load calls to the same id.
Any write operation will completely clear the cache.
db = CouchPotato.database
db.cache = {}
db.load '1'
db.load '1' # goes to the cache instead of to the database
In web apps, the idea is to use a per request cache, i.e. set a new cache for every request.
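The cache semantics described above (repeated loads of the same id are served from the cache, any write clears it entirely) can be illustrated with a small stand-in class. CachingLoader is hypothetical and only mimics the described behavior; it is not CouchPotato::Database.

```ruby
# Hypothetical stand-in that mimics the described caching behavior.
class CachingLoader
  attr_accessor :cache
  attr_reader :backend_reads

  def initialize(backend)
    @backend = backend # a Hash of id => document, standing in for the database
    @cache = nil
    @backend_reads = 0
  end

  def load(id)
    return @cache[id] if @cache && @cache.key?(id)

    @backend_reads += 1
    doc = @backend[id]
    @cache[id] = doc if @cache
    doc
  end

  def save(id, doc)
    @backend[id] = doc
    @cache.clear if @cache # any write empties the whole cache
    doc
  end
end

db = CachingLoader.new({ '1' => { name: 'joe' } })
db.cache = {}
db.load '1'                          # read from the backend
db.load '1'                          # served from the cache
reads_before_write = db.backend_reads
db.save '2', { name: 'jane' }        # clears the cache
db.load '1'                          # read from the backend again
reads_after_write = db.backend_reads
```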
Operations on multiple documents
You can also load a bunch of documents with one request.
CouchPotato.database.load ['user1', 'user2', 'user3'] # => [<#User 0x3075>, <#User 0x3076>, <#User 0x3077>]
Properties
You can access the properties you declared above through normal attribute accessors.
user.name # => 'joe'
user.name = {:first => ['joe', 'joey'], :last => 'doe', :middle => 'J'} # you can set any ruby object that responds_to :to_json (includes all core objects)
user._id # => "02097f33a0046123f1ebc0ebb6937269"
user._rev # => "2769180384"
user.created_at # => Fri Oct 24 19:05:54 +0200 2008
user.updated_at # => Fri Oct 24 19:05:54 +0200 2008
user.new? # => false
If you want to have properties that don't map to any JSON type, i.e. other than String, Number, Boolean, Hash or Array, you have to define the type like this:
class User
property :date_of_birth, :type => Date
end
The date_of_birth property is now automatically serialized to JSON and back when storing/retrieving objects.
If you want to store an Array of typed objects, pass the type wrapped in an Array, e.g. for an Array of Dates:
class User
property :birthdays, :type => [Date]
end
Dirty tracking
CouchPotato tracks the dirty state of attributes in the same way ActiveRecord does:
user = User.create :name => 'joe'
user.name # => 'joe'
user.name_changed? # => false
user.name_was # => nil
You can also force a dirty state:
user.name = 'jane'
user.name_changed? # => true
user.name_not_changed
user.name_changed? # => false
CouchPotato.database.save_document user # does nothing as no attributes are dirty
Optional Deep Dirty Tracking
In addition to standard dirty tracking, you can opt-in to more advanced dirty tracking for deeply structured documents by including the CouchPotato::DeepDirtyAttributes module in your models. This provides two benefits:
- Dirty checking for array and embedded document properties is more reliable, such that modifying elements in an array (by any means) or changing a property of an embedded document will make the root document be changed?. With standard dirty checking, the #{property}= method must be called on the root document for it to be changed?.
- It gives more useful and detailed change tracking for embedded documents, arrays of simple values, and arrays of embedded documents.
The #{property}_changed? and #{property}_was methods work the same as basic dirty checking, and the _was values are always deep clones of the original/previous value. The #{property}_change and changes methods differ from basic dirty checking for embedded documents and arrays, giving richer details of the changes instead of just the previous and current values. This makes generating detailed, human friendly audit trails of documents easy.
Tracking changes in embedded documents gives easy access to the changes in that document:
book = Book.new(:cover => Cover.new(:color => "red"))
book.cover.color = "blue"
book.cover_changed? # => true
book.cover_was # => <deep clone of original state of book.cover>
book.cover_change # => [<deep clone of original state of book.cover>, {:color => ["red", "blue"]}]
Tracking changes in arrays of simple properties gives easy access to added and removed items:
book = Book.new(:authors => ["Sarah", "Jane"])
book.authors.delete "Jane"
book.authors << "Sue"
book.authors_changed? # => true
book.authors_was # => ["Sarah", "Jane"]
book.authors_change # => [["Sarah", "Jane"], {:added => ["Sue"], :removed => ["Jane"]}]
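The shape of the change value for arrays of simple values can be reproduced with plain array differences. The array_change helper below is a hypothetical sketch, not the module's implementation; note that Array#- ignores duplicates and ordering, which the real tracking may treat differently.

```ruby
# Hypothetical helper: describe how an array changed relative to its original state.
def array_change(original, current)
  [original, { :added => current - original, :removed => original - current }]
end

original = ['Sarah', 'Jane']
current = original.dup # work on a copy so the original state is preserved
current.delete 'Jane'
current << 'Sue'

change = array_change(original, current)
# => [["Sarah", "Jane"], {:added => ["Sue"], :removed => ["Jane"]}]
```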
Tracking changes in an array of embedded documents also gives changed items:
book = Book.new(:pages => [Page.new(:number => 1), Page.new(:number => 2)])
book.pages[0].title = "New title"
book.pages.delete_at 1
book.pages << Page.new(:number => 3)
book.pages_changed? # => true
book.pages_was # => <deep clone of original pages array>
book.pages_change[0] # => <deep clone of original pages array>
book.pages_change[1] # => {:added => [<page 3>], :removed => [<page 2>], :changed => [[<deep clone of original page 1>, {:title => [nil, "New title"]}]]}
For change tracking in nested documents and document arrays to work, the embedded documents must have unique _id values. This can be accomplished easily in your embedded CouchPotato models by overriding initialize:
def initialize(*args)
self._id = SecureRandom.uuid
super
end
Object validations
Couch Potato by default uses ActiveModel for validation:
class User
property :name
validates_presence_of :name
end
user = User.new
user.valid? # => false
user.errors[:name] # => ["can't be blank"]
Finding stuff / views / lists
In order to find data in your CouchDB you have to create a view first. Couch Potato can create and manage those views for you. All you have to do is declare them in your classes:
class User
include CouchPotato::Persistence
property :name
view :all, :key => :created_at
end
This will create a view called "all" in the "user" design document with a map function that emits "created_at" for every user document.
CouchPotato.database.view User.all
This will load all user documents in your database sorted by created_at.
CouchPotato.database.view User.all(:key => (Time.now - 10)..(Time.now), :descending => true)
Any options you pass in will be passed onto CouchDB.
Composite keys are also possible:
class User
property :name
view :all, :key => [:created_at, :name]
end
You can let Couch Potato generate these map/reduce functions in Erlang, which results in much faster view generation:
class User
property :name
view :all, :key => [:created_at, :name], :language => :erlang
end
So far only very simple views like the above work with Erlang.
You can also pass conditions as a JavaScript string:
class User
property :name
view :completed, :key => :name, :conditions => 'doc.completed === true'
end
The creation of views is based on view specification classes (see CouchPotato::View::BaseViewSpec and its descendants for more detailed documentation). The above code uses the ModelViewSpec class which is used to find models by their properties. For more sophisticated searches you can use other view specifications (either use the built-in ones or provide your own) by passing a type parameter.
If you have larger structures and you only want to load some attributes you can use the PropertiesViewSpec (the full class name is automatically derived):
class User
property :name
property :bio
view :all, :key => :created_at, :properties => [:name], :type => :properties
end
CouchPotato.database.view(User.all).first.name # => "joe"
CouchPotato.database.view(User.all).first.bio # => nil
CouchPotato.database.first(User.all).name # => "joe" # convenience function, returns nil if nothing found
CouchPotato.database.first!(User.all) # would raise CouchPotato::NotFound if nothing was found
If you want Rails to automatically show a 404 page when CouchPotato::NotFound is raised add this to your ApplicationController:
rescue_from CouchPotato::NotFound do
render(:file => 'public/404.html', :status => :not_found, :layout => false)
end
You can also pass in custom map/reduce functions with the custom view spec:
class User
view :all, :map => "function(doc) { emit(doc.created_at, null)}", :include_docs => true, :type => :custom
end
CommonJS modules can also be used in custom views:
class User
view :all, :map => "function(doc) { emit(null, require('views/lib/test').test)}", :lib => {:test => "exports.test = 'test'"}, :include_docs => true, :type => :custom
end
If you don't want the results to be converted into models the raw view is your friend:
class User
view :all, :map => "function(doc) { emit(doc.created_at, doc.name)}", :type => :raw
end
When querying this view you will get the raw data returned by CouchDB which looks something like this:
{
  "total_entries": 2,
  "rows": [
    {
      "value": "alex",
      "key": "2009-01-03 00:02:34 +000",
      "id": "75976rgi7546gi02a"
    }
  ]
}
To process this raw data you can also pass in a results filter:
class User
view :all, :map => "function(doc) { emit(doc.created_at, doc.name)}", :type => :raw, :results_filter => lambda {|results| results['rows'].map{|row| row['value']}}
end
In this case querying the view would only return the emitted value for each row.
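To see what the filter does, you can apply the same lambda to a Ruby Hash shaped like the raw result shown above. This runs without CouchDB; the raw_results Hash is simply a hand-written copy of that sample data.

```ruby
# Hand-written copy of the raw result structure shown above.
raw_results = {
  'total_entries' => 2,
  'rows' => [
    { 'value' => 'alex', 'key' => '2009-01-03 00:02:34 +000', 'id' => '75976rgi7546gi02a' }
  ]
}

# The same filter as in the view declaration: keep only the emitted values.
results_filter = lambda { |results| results['rows'].map { |row| row['value'] } }

names = results_filter.call(raw_results) # => ["alex"]
```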
You can pass in your own view specifications by passing in :type => MyViewSpecClass. Take a look at the CouchPotato::View::*ViewSpec classes to get an idea of how this works.
Digest view names
If turned on, Couch Potato will append an MD5 digest of the map function to each view name. This makes sure (together with split_design_documents_per_view) that no views/design documents are ever updated. Instead, new ones are created. Since reindexing can take a long time once your database is larger, you want to avoid blocking your app while CouchDB is busy. Instead, you create a new view, warm it up, and only then start using it.
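The principle can be sketched with Ruby's standard Digest library. The digested_view_name helper and the name-digest format are assumptions made for illustration; the exact naming scheme Couch Potato uses may differ.

```ruby
require 'digest/md5'

# Hypothetical helper: derive a view name that changes whenever the map function changes.
def digested_view_name(name, map_function)
  "#{name}-#{Digest::MD5.hexdigest(map_function)}"
end

old_map = 'function(doc) { emit(doc.created_at, null); }'
new_map = 'function(doc) { emit(doc.updated_at, null); }'

old_name = digested_view_name('all', old_map)
new_name = digested_view_name('all', new_map)

puts old_name == new_name                          # false: a changed map function yields a new view
puts old_name == digested_view_name('all', old_map) # true: an unchanged function keeps its name
```

Because the changed map function gets a new name, the existing view and its index stay untouched while the new one is built.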
Lists
CouchPotato also supports CouchDB lists. With lists you can process the result of a view query with another JavaScript function. This can be useful for example if you want to filter your results, or add some data to each document.
Defining a list works similarly to views:
class User
  include CouchPotato::Persistence
  property :first_name
  view :with_full_name, key: :first_name, list: :add_last_name
  view :all, key: :first_name
  list :add_last_name, <<-JS
    function(head, req) {
      var row;
      send('{"rows": [');
      while(row = getRow()) {
        row.doc.name = row.doc.first_name + ' doe';
        send(JSON.stringify(row));
      };
      send(']}');
    }
  JS
end
CouchPotato.database.save User.new(first_name: 'joe')
CouchPotato.database.view(User.with_full_name).first.name # => 'joe doe'
You can also pass in the list at query time:
CouchPotato.database.view(User.all(list: :add_last_name))
And you can pass parameters to the list:
CouchPotato.database.view(User.all(list: :add_last_name, list_params: {filter: '*'}))
Associations
Not supported. Not sure if they ever will be. You can implement those yourself using views and custom methods on your models.
Callbacks
Couch Potato supports the usual lifecycle callbacks known from ActiveRecord:
class User
include CouchPotato::Persistence
before_create :do_something_before_create
before_update {|user| user.do_something_on_update}
end
This will call the method do_something_before_create before creating an object and run the given lambda before updating one. Lambda callbacks get passed the model as their first argument. Method callbacks don't receive any arguments.
Supported callbacks are: :before_validation, :before_validation_on_create, :before_validation_on_update, :before_validation_on_save, :before_create, :after_create, :before_update, :after_update, :before_save, :after_save, :before_destroy, :after_destroy.
If you need access to the database in a callback: Couch Potato automatically assigns a database instance to the model before saving and when loading. It is available as database accessor from within your model instance.
Attachments
There is basic attachment support: if you want to store any attachments set the _attachments attribute of a model before saving like this:
class User
include CouchPotato::Persistence
end
data = File.read('some_file.text') # or from upload
user = User.new
user._attachments = {'photo' => {'data' => data, 'content_type' => 'image/png'}}
When saving this object an attachment with the name photo will be uploaded into CouchDB. It will be available under the url of the user object + /photo. When loading the user at a later time you still have access to the content_type and additionally to the length of the attachment:
user_reloaded = CouchPotato.database.load user.id
user_reloaded._attachments['photo'] # => {'content_type' => 'image/png', 'length' => 37861}
Multi DB Support
Couch Potato supports accessing multiple CouchDBs:
CouchPotato.with_database('couch_customer') do |couch|
couch.save @customer
end
Unless configured otherwise this would save the customer model to http://127.0.0.1:5984/couch_customer.
You can also first retrieve the database instance:
db = CouchPotato.use('couch_customer')
db.save @customer
Testing
To make testing easier and faster database logic has been put into its own class, which you can replace and stub out in whatever way you want:
class User
include CouchPotato::Persistence
end
# RSpec
describe 'save a user' do
it 'should save' do
couchrest_db = stub 'couchrest_db'
database = CouchPotato::Database.new couchrest_db
user = User.new
couchrest_db.should_receive(:save_doc).with(...)
database.save_document user
end
end
By creating your own instances of CouchPotato::Database and passing them a fake CouchRest database instance you can completely disconnect your unit tests/specs from the database.
For stubbing out the database Couch Potato offers some helpers via the couch_potato-rspec gem. Use version 2.x of the gem if you are on RSpec 2, and 3.x for RSpec 3.
class Comment
view :by_commenter_id, :key => :commenter_id
end
# RSpec
require 'couch_potato/rspec'
db = stub_db # stubs CouchPotato.database
db.stub_view(Comment, :by_commenter_id).with('23').and_return([:comment1, :comment2])
CouchPotato.database.view(Comment.by_commenter_id('23')) # => [:comment1, :comment2]
CouchPotato.database.first(Comment.by_commenter_id('23')) # => :comment1
Testing map/reduce functions
Couch Potato provides custom RSpec matchers for testing the map and reduce functions of your views. For example you can do this:
class User
include CouchPotato::Persistence
property :name
property :age, :type => Integer
view :by_name, :key => :name
view :by_age, :key => :age
view :oldest_by_name,
:map => "function(doc) { emit(doc.name, doc.age); }",
:reduce => "function(keys, values, rereduce) { return Math.max.apply(this, values); }"
end
#RSpec
require 'couch_potato/rspec'
describe User, 'views' do
it "should map users to their name" do
User.by_name.should map(User.new(:name => 'bill', :age => 23)).to(['bill', 1])
end
it "should reduce the users to the sum of their age" do
User.by_age.should reduce([], [23, 22]).to(45)
end
it "should map/reduce users to the oldest age by name" do
docs = [
User.new(:name => "Andy", :age => 30),
User.new(:name => "John", :age => 25),
User.new(:name => "John", :age => 30),
User.new(:name => "Jane", :age => 20)]
User.oldest_by_name.should map_reduce(docs).with_options(:group => true, :startkey => "Jane").to(
{"key" => "Jane", "value" => 20}, {"key" => "John", "value" => 30})
end
end
This will actually run your map/reduce functions in a JavaScript interpreter, passing the arguments as JSON and converting the results back to Ruby. map_reduce specs map the input documents, reduce the emitted keys/values, and rereduce the results while also respecting the :group, :group_level, :key, :keys, :startkey, and :endkey CouchDB options. For more examples see the spec.
In order for this to work you must have the js executable in your PATH. This is usually part of the spidermonkey package/port. (On MacPorts that's spidermonkey, on Linux it could be one of libjs, libmozjs or libspidermonkey.) If you installed CouchDB via your package manager, Spidermonkey should already be there.
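If you want to reason about what a grouped map/reduce query returns without a JavaScript interpreter, the steps can be approximated in plain Ruby. This is a simplified sketch, not how the matchers work: UserDoc, map_fn, reduce_fn and the map_reduce method are hypothetical, rereduce is skipped, and plain Ruby string comparison stands in for CouchDB's collation rules.

```ruby
UserDoc = Struct.new(:name, :age)

# Ruby counterpart of: function(doc) { emit(doc.name, doc.age); }
map_fn = ->(doc) { [[doc.name, doc.age]] }
# Ruby counterpart of: function(keys, values, rereduce) { return Math.max.apply(this, values); }
reduce_fn = ->(_keys, values, _rereduce) { values.max }

# Map all docs, drop keys before startkey, group by key, reduce each group.
def map_reduce(docs, map_fn, reduce_fn, startkey: nil)
  rows = docs.flat_map { |doc| map_fn.call(doc) }
  rows = rows.select { |key, _value| key >= startkey } if startkey
  rows.sort_by { |key, _value| key }
      .group_by { |key, _value| key }
      .map do |key, pairs|
        values = pairs.map(&:last)
        { 'key' => key, 'value' => reduce_fn.call([key] * values.size, values, false) }
      end
end

docs = [
  UserDoc.new('Andy', 30),
  UserDoc.new('John', 25),
  UserDoc.new('John', 30),
  UserDoc.new('Jane', 20)
]

result = map_reduce(docs, map_fn, reduce_fn, startkey: 'Jane')
# => [{"key" => "Jane", "value" => 20}, {"key" => "John", "value" => 30}]
```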
Helping out
Please fix bugs, add more specs, implement new features by forking the github repo at http://github.com/langalex/couch_potato.
Issues are tracked at github: http://github.com/langalex/couch_potato/issues
There is a mailing list, just write to: [email protected]
You can run all the specs by calling 'rake spec_unit' and 'rake spec_functional' in the root folder of Couch Potato. The specs require a running CouchDB instance at http://localhost:5984
I will only accept patches that are covered by specs - sorry.
Contact
If you have any questions/suggestions etc. please contact me at alex at upstream-berlin.com or @langalex on twitter.