Adding Search with IndexTank

One tool we’ve been using more and more on Think Vitamin Membership lately is IndexTank, a hosted search indexing service. They’ve got some big customers, like Reddit, which gave us a lot more confidence to check them out. Being able to store 100,000 documents with them for free helped too. We’ve been running some of our administrative search on IndexTank for about six months, and we recently started using it for our video search as well.

Before IndexTank we used Sphinx for search. It’s a great tool and something I’d recommend to almost anyone, but you have to run Sphinx’s server on your own hardware and maintain it yourself. We try to avoid as much system administration as we can, so we were on the lookout for a hosted service to handle search for us. Sphinx also traditionally hasn’t had live indexing, so new records don’t show up in search results until the next reindex (we ran ours every 30 minutes). That means that if you get a support request from a user right after they sign up, you can’t find their information. I found myself logging in frequently to reindex by hand so I could use our admin user search, and got tired of it. Sphinx has recently added live indexing, but many of the libraries that work with it don’t yet support that feature.

Getting set up with IndexTank is really easy. After you sign up you’re prompted to create an index. Once you’ve given the new index a name, you can start using the API to insert documents into it. I’ll show how we have things set up using Ruby below, but if you normally use some other language, check out their documentation site, which has links for most popular languages and platforms.

Setting Up IndexTank With Your Rails App

First off you’ll want to add the indextank gem into your Gemfile:

gem "indextank"

And run bundle to install it. To help out with connecting to IndexTank, we put our settings into an initializer in config/initializers/indextank.rb:

class SearchConfig
  def self.api
    @api ||= IndexTank::Client.new("[YOUR INDEXTANK URL]")
  end
end

You’ll want to replace [YOUR INDEXTANK URL] with your actual IndexTank URL. You can find it on your dashboard on IndexTank’s site.
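
If you’d rather not commit that URL to source control, one option is to read it from the environment. This is only a minimal sketch: the INDEXTANK_API_URL variable name and the indextank_url helper are assumptions made up for illustration, not something IndexTank or its gem defines.

```ruby
# Hypothetical helper: look up the IndexTank URL from the environment
# instead of hard-coding it in the initializer.
# INDEXTANK_API_URL is an assumed variable name, not an IndexTank convention.
def indextank_url(env = ENV)
  url = env["INDEXTANK_API_URL"].to_s.strip
  raise ArgumentError, "INDEXTANK_API_URL is not set" if url.empty?
  url
end
```

In the initializer you would then pass indextank_url to IndexTank::Client.new in place of the literal string. Failing loudly at boot is a design choice here; it surfaces a missing setting before the first save tries to index anything.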

Updating Your ActiveRecord Models for Indexing

The code we’re using to do indexing and search is pretty simple. Think Vitamin Membership is still a Rails 2 app, so you may need to tune the code below a bit to use it with Rails 3. Even writing the integration completely from scratch, I was surprised at how easy it is to get things running with IndexTank.

We use callbacks to index after we save any of the models that are indexed. Here’s the code we use to index:

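after_save :index

# Push this record's searchable text to IndexTank after every save.
def index
  # Only talk to IndexTank in production and development (not in test).
  if Rails.env.production? || Rails.env.development?
    self.class.indextank_index.document(id).add({ :text => "#{title} #{description} #{chapter.title} #{course.title}" })
  end
rescue
  # Do nothing: we just won't worry about not updating the index.
end

# Development gets its own index so its data stays out of production search.
def self.indextank_index_name
  Rails.env.development? ? "videos_dev" : "videos"
end

# Memoized handle to this model's IndexTank index.
def self.indextank_index
  @indextank_index ||= SearchConfig.api.indexes(indextank_index_name)
end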

In the code above you can see that our indexing strategy is really simple. Right now we just build a space-delimited string from the attributes of the object we’re indexing. IndexTank has more advanced features like scoring, but we’re not using them yet. We also typically create both development and production indexes, so there’s a little bit of logic in the indextank_index_name method to pick which index to use.
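
One thing to watch with the string-building approach: if a video has no chapter or course, calling title on nil raises, and the rescue quietly skips indexing that record. Here is a small sketch of a nil-tolerant way to build the same string. The searchable_text helper is a hypothetical name, and the Structs only stand in for the ActiveRecord models so the example runs on its own.

```ruby
# Structs standing in for the ActiveRecord models (illustration only).
Chapter = Struct.new(:title)
Course  = Struct.new(:title)
Video   = Struct.new(:title, :description, :chapter, :course)

# Hypothetical helper: join whichever attributes are present into the
# space-delimited string that gets sent as the document's :text field.
def searchable_text(video)
  parts = [
    video.title,
    video.description,
    video.chapter && video.chapter.title,
    video.course && video.course.title
  ]
  # Drop nils and blank strings so missing attributes leave no stray spaces.
  parts.compact.map { |part| part.to_s.strip }.reject(&:empty?).join(" ")
end

video = Video.new("Intro to CSS", nil, Chapter.new("Selectors"), nil)
puts searchable_text(video) # prints "Intro to CSS Selectors"
```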

Once we have some data in our index we can search it with our search method:

def self.search(query, page = 1)
  limit = 20
  page ||= 1 # params[:page] is nil when no page was requested

  WillPaginate::Collection.create(page, limit, nil) do |pager|
    if query.blank?
      pager.total_entries = 0
    else
      # Ask IndexTank for just the current page of matches.
      it_results = indextank_index.search(query, { :function => 1, :len => pager.per_page, :start => pager.offset })
      pager.total_entries = it_results["matches"]
      # Load the matching records by the docids IndexTank returned.
      doc_ids = it_results["results"].collect { |result| result["docid"].to_i }
      pager.replace(all(:conditions => { :id => doc_ids }))
    end
  end
end
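
The :start and :len options in the search call come straight from will_paginate’s offset and per-page values. As a rough sketch of that arithmetic outside of will_paginate (search_window is a hypothetical helper, not part of either library):

```ruby
# Sketch of the page-to-offset arithmetic; search_window is a hypothetical
# helper, not part of will_paginate or the indextank gem.
def search_window(page, per_page = 20)
  page = page.to_i      # params[:page] arrives as a String or nil
  page = 1 if page < 1  # nil, "", "0" and non-numeric input fall back to page 1
  { :start => (page - 1) * per_page, :len => per_page }
end

p search_window(nil) # first page: start 0, len 20
p search_window("3") # third page: start 40, len 20
```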

The search method is pretty no-frills. It takes a search query string and a page number as arguments and returns a collection that’s compatible with will_paginate. So, for example, since this is on our Video model we can call it from a controller with:

Video.search(params[:q], params[:page])

and we’ll get the requested page of results back (the first page if no page parameter was given).
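
One caveat: loading records with an :id condition returns them in whatever order the database picks, not in the order IndexTank ranked them. If ranking matters, the records can be put back in docid order after the lookup. This is only a sketch. The order_by_docids helper is a hypothetical name, and the Struct stands in for the model.

```ruby
# Struct standing in for an ActiveRecord model (illustration only).
Record = Struct.new(:id, :title)

# Hypothetical helper: reorder loaded records to follow the docid order
# returned by the search service. Docids with no matching record are skipped.
def order_by_docids(records, docids)
  by_id = {}
  records.each { |record| by_id[record.id] = record }
  docids.map { |docid| by_id[docid.to_i] }.compact
end

ranked_ids = ["3", "1", "2"] # docids in the order the search returned them
loaded = [Record.new(1, "a"), Record.new(2, "b"), Record.new(3, "c")]
puts order_by_docids(loaded, ranked_ids).map(&:id).inspect # prints [3, 1, 2]
```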

Third Party Libraries

We wrote up our own indexing and search code for our models using the IndexTank gem because there weren’t many third party libraries for IndexTank around when we got started with them. That said, since then some libraries have popped up that help out with indexing, and they’re on our radar to check out. Here are a few of them:

I’m really interested in checking out all three of those libraries to see how they can help us clean up our IndexTank integration code.

Finishing Up

It’s been excellent to be able to add search to pretty much anything on our site with IndexTank without having to worry about the system administration consequences. We’re excited about the flexibility this gives us in how we develop search on the site, and we’re looking forward to taking even more advantage of it as we develop Treehouse. We’ve been really impressed with the performance of searches and the reliability of their service, and definitely recommend it.

Comments

5 comments on “Adding Search with IndexTank”

    • Absolutely. I mentioned that in the post. I haven’t been able to find a ton of libraries that actually take advantage of RT indexes yet, though, and it still doesn’t get rid of having to manage sphinx processes and all the other goodies.

      I’m definitely a huge fan of Sphinx, though, and have relied on it for a long time. I’ve just been really impressed with IndexTank lately too.

  1. The lack of live indexing really is a downside of Sphinx. We get around this by using .find in our admin backends instead of .search. A bit slower, but real-time.

    • Sphinx actually does have live indexing – it’ll just take a bit more time before all of the great plugins and libraries support it.

      I used to always try to build SQL based searches into my apps, but there’s nothing like having a real inverted index. That’s where IndexTank has been awesome for us.