Recently I’ve been playing with Sphinx, which promises to be a great site search solution at my place of work. This post isn’t meant to be a comprehensive tutorial, just a brief overview to whet your appetite.
What Is Sphinx?
Sphinx is a standalone search engine that can be used to power search in many applications. It’s extremely quick, returns relevant results, scales well, and is highly configurable.
If you’re trying to create search functionality and using MySQL to do ‘LIKE’ searches, I highly recommend you at least look into using Sphinx.
Get Sphinx
Download the source at this link. My install experience on a Linux machine went very smoothly. It’s simply a matter of unpacking and then running ./configure, make, and make install. I decided to run ./configure --prefix=/usr/local/sphinx, as that is the prefix used in the Sphinx documentation.
Here’s a quick rundown on the contents of your installation:
[sphinx dir]/etc/ – Sphinx configuration file goes here
[sphinx dir]/var/data/ – index files
[sphinx dir]/bin/ – useful command line tools and the search daemon
Simple enough so far, right?
Setting Up Sphinx
Before you can start searching, you need to create and edit a sphinx.conf file. Go into the [sphinx dir]/etc/ folder and copy the example Sphinx configuration file. Go through it and edit it to your heart’s desire. Make sure to become good friends with the documentation, as it’ll walk you through each and every available option.
The heart of the config file is as follows:
- Define sources. Each source includes an SQL query. This query selects the information you want to be searchable. You can even include fields in this query which you declare as attributes. Afterwards, you’ll be able to sort and/or filter by these attributes.
- Define indices. Each index points to a source and includes various additional options for how the information is searched. There can be multiple indices pointing to the same source. When searching, you have the ability to search one or more specific indices.
But again, make sure to check out Sphinx’s own documentation. After setting up your config file, run the [sphinx dir]/bin/indexer tool to collect the data and build your indices.
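To make the source and index idea a bit more concrete, here’s a rough sketch that renders a bare-bones sphinx.conf from Python. Everything named here is my own placeholder: the documents table and its columns, the site_src source, the site_idx index, the database credentials, and the var/log location. The directive names are the ones I know from the Sphinx reference, but they have shifted between releases, so check them against the docs for your version.

```python
from string import Template

# Hypothetical names throughout: the "documents" table, its columns, and the
# "site_src" / "site_idx" identifiers are made up for illustration only.
CONF_TEMPLATE = Template("""\
source site_src
{
    type     = mysql
    sql_host = $host
    sql_user = $user
    sql_pass = $password
    sql_db   = $database

    # The first column must be a unique document ID. Remaining columns are
    # full-text indexed unless declared as attributes below.
    sql_query = SELECT id, category_id, title, body FROM documents

    # Attribute: kept for sorting and filtering instead of full-text search.
    sql_attr_uint = category_id
}

index site_idx
{
    source = site_src
    path   = $prefix/var/data/site_idx
}

searchd
{
    # Assumed log location; adjust to wherever your install keeps logs.
    log      = $prefix/var/log/searchd.log
    pid_file = $prefix/var/log/searchd.pid
}
""")


def render_conf(prefix="/usr/local/sphinx", host="localhost", user="sphinx",
                password="secret", database="mysite"):
    """Return the text of a minimal sphinx.conf for one source and one index."""
    return CONF_TEMPLATE.substitute(prefix=prefix, host=host, user=user,
                                    password=password, database=database)
```

Writing the result of render_conf() to [sphinx dir]/etc/sphinx.conf would give indexer something to chew on. A second index block pointing at the same source is how you would get multiple indices over the same data.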
Use Sphinx!
To search the index directly without going through the daemon, use [sphinx dir]/bin/search. Doing this right after running indexer is a good idea, as it bypasses the APIs and the daemon, either of which can be a source of bugs or confusion. You might also want to play with [sphinx dir]/bin/indextool, which gives you some information about the indices you just created. Together with [sphinx dir]/bin/search, it can be a great debugging tool.
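Here’s a small sketch of that index, search, and check sequence, written as a Python helper that builds the command lines. The site_idx index name and the helper names are hypothetical, and I’m assuming the config lives at [sphinx dir]/etc/sphinx.conf. Note that the search tool only ships with older Sphinx releases, so treat that entry as optional.

```python
import subprocess


def sphinx_commands(prefix, index, query):
    """Build argv lists for the indexing and debugging tools under prefix."""
    conf = f"{prefix}/etc/sphinx.conf"
    return {
        # Rebuild every index defined in the config file.
        "index": [f"{prefix}/bin/indexer", "--config", conf, "--all"],
        # Query one index directly, bypassing searchd and the APIs.
        "search": [f"{prefix}/bin/search", "--config", conf,
                   "--index", index, query],
        # Ask indextool to run a consistency check on the index files.
        "check": [f"{prefix}/bin/indextool", "--config", conf,
                  "--check", index],
    }


def run_all(prefix="/usr/local/sphinx", index="site_idx", query="hello"):
    """Run each tool in order, stopping at the first one that fails."""
    for name, argv in sphinx_commands(prefix, index, query).items():
        print(f"[{name}] {' '.join(argv)}")
        subprocess.run(argv, check=True)
```

Calling run_all() on a machine with Sphinx installed should print each command before running it, which makes it easy to copy one out and rerun it by hand.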
If the search looks to be working well, go ahead and start the [sphinx dir]/bin/searchd daemon and try the APIs. Currently Sphinx provides APIs for PHP, Python, and Java. My experience was with the PHP version. These APIs have very useful options for sorting, field weighting, filtering, etc.
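I used the PHP API myself, but to give a feel for what the calls look like, here’s a hedged sketch against the Python client (the sphinxapi.py module that ships in the api/ directory of the Sphinx source). The category_id attribute, the site_idx index, and the helper names are placeholders of mine, and the default port is an assumption, since it differs between releases. Older copies of sphinxapi.py were written for Python 2, so you may need a recent one.

```python
def make_client(host="localhost", port=9312):
    """Create a client pointed at a running searchd.

    Assumes sphinxapi.py is importable; the port is a guess and should
    match whatever your searchd is actually listening on.
    """
    import sphinxapi  # imported lazily so this file loads without it
    client = sphinxapi.SphinxClient()
    client.SetServer(host, port)
    return client


def search_site(client, text, index="*", category_ids=None,
                field_weights=None):
    """Run one query and return a list of (document id, weight) pairs."""
    if field_weights:
        # e.g. {"title": 10, "body": 1} to favour hits in the title field
        client.SetFieldWeights(field_weights)
    if category_ids:
        # Only keep documents whose attribute is one of the given values.
        client.SetFilter("category_id", list(category_ids))
    result = client.Query(text, index)
    if result is None:
        # Query returns None on failure; the client keeps the message.
        raise RuntimeError(client.GetLastError())
    return [(match["id"], match["weight"]) for match in result["matches"]]
```

With the daemon running, something like search_site(make_client(), "hello", "site_idx", category_ids=[2]) would then hand back the matching document IDs, which you look up in your own database to render results.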
Ohnoes – API/daemon trouble!
This caused me hours of head scratching, so I’m hoping I can save some people a bit of frustration here. Remember to restart the searchd daemon after you change configuration options! I ran into this when I kept getting garbage search results from the PHP API after changing sphinx.conf.
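A restart can be scripted along these lines. This is only a sketch: the helper names are mine, the config path is assumed to be [sphinx dir]/etc/sphinx.conf, and I’m using the --stopwait flag, which waits for the daemon to exit. Older releases only have --stop, in which case you may need a short pause before starting again.

```python
import subprocess


def restart_commands(prefix):
    """Return the stop and start command lines for searchd under prefix."""
    searchd = prefix + "/bin/searchd"
    conf = prefix + "/etc/sphinx.conf"
    return [
        [searchd, "--config", conf, "--stopwait"],  # stop and wait for exit
        [searchd, "--config", conf],                # start with the new config
    ]


def restart_searchd(prefix="/usr/local/sphinx"):
    """Stop searchd if it is running, then start it again."""
    stop, start = restart_commands(prefix)
    # The stop step fails when no daemon is running; that is fine here.
    subprocess.run(stop, check=False)
    subprocess.run(start, check=True)
```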
Now go and give your users an awesome search experience!