We have been tinkering with a new quick search for HUKD, and we decided that some feedback would be great before releasing it properly, so you may notice a little "Beta Search" button beside the regular go search button. If you click the Beta Search button when you are doing a search, you will see the results from the new HUKD "search engine". Here is a bit of an explanation about HUKD search:
The current HUKD search uses a combination of several factors when displaying search results. When we first implemented the search, we used a simple MySQL full-text search, which is fairly standard in quite a few places (like vBulletin). Immediately we ran into people having problems finding relevant things. The problem was not that the search had trouble finding matching threads; the problem was that if there are 10,000 threads with the words "digital photo frame" in them, you as a user probably only care about the most recent ones. A deal on a photo frame from 2 years ago is not what you were searching for.
So, in the early days, we changed it so that all results were ordered by date. If there are 10,000 matching results, you will see the most recent ones first, because they are probably what you are looking for. Immediately after switching to that mode of searching we ran into trouble. The main problem was that if you order your results by date, you are pretty much ignoring what MySQL says is a better search result. All of a sudden a result with just the word "digital" in it can appear before results with the words "digital photo frame". I know at that point a lot of people would say to just make the search require all of the words in the result, but if you do that and I post a deal for a "photo frame" while you search for "digital photo frame", you won't be able to see my deal, even though it interests you.
So it's obvious there needs to be some kind of balance between ranking and date, with maybe a few other factors thrown in. Undoubtedly Google has mastered that balance, but if Google isn't an option, what do you do?
1) Group results by date group as opposed to date/time
- Instead of simply ordering by date/time, the results can be chunked into date groups, and the results can then be ordered by relevancy within each group. If you are searching for a deal, deals posted in the last month or 2 are probably going to be more relevant to you than deals from 5 or 6 months ago.
2) Allow movement between groups
- If something is a really good match but was posted 3 months ago, it should probably appear higher than something that was posted 1 month ago but is a really bad match, so allow some kind of movement between those groups.
3) Throw in other factors that aren't related to date/time or search term relevancy
- If there is a deal with a really hot temperature, you probably want to see it more than something that has been voted really cold, or has even expired. Google doesn't have this semantic data, so IF all other things were equal you could end up with a better search.
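To make the three ideas above a bit more concrete, here is a rough sketch in Python. It is not the actual HUKD code: the `Deal` fields, the 60-day group size, the relevance thresholds and the temperature bonuses are all made-up values chosen purely for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Deal:
    title: str
    posted: date
    relevance: float      # assumed 0.0 to 1.0 score from the text search
    temperature: int      # community vote score (hypothetical scale)
    expired: bool = False


def age_group(deal, today, days_per_group=60):
    """1) Chunk results into date groups instead of using exact date/time."""
    return (today - deal.posted).days // days_per_group


def rank_key(deal, today):
    group = age_group(deal, today)

    # 2) Allow movement between groups: a very good match moves up one
    #    group (never above group 0), a very bad match moves down one.
    if deal.relevance >= 0.8:
        group = max(0, group - 1)
    elif deal.relevance < 0.3:
        group += 1

    # 3) Other factors: hot deals get a small boost, cold or expired
    #    deals get a small penalty (illustrative numbers only).
    bonus = 0.0
    if deal.temperature >= 100:
        bonus += 0.1
    elif deal.temperature < 0:
        bonus -= 0.1
    if deal.expired:
        bonus -= 0.2

    # Sort by group first (newest group first), then by best adjusted
    # relevance within the group.
    return (group, -(deal.relevance + bonus))


def rank(deals, today):
    return sorted(deals, key=lambda d: rank_key(d, today))
```

With these made-up numbers, a hot, well-matched deal from 3 months ago lands in the same group as recent deals, while a recent deal that only barely matches drops into the next group down.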
These things have been implemented in the current search, but they were fairly difficult to do with MySQL, and some things we couldn't do at all. For example, HUKD users have been kind enough to create very descriptive thread titles, so we should give those a higher priority than the body of threads, which can contain a wide variety of different words unrelated to the deal itself. Anyone who has ever searched for PS3 Games and ended up with results for XBOX Games probably knows what I am talking about. So we have switched to an open source search engine called Sphinx. Sphinx is faster, and arguably better at matching search terms against results. The balance between date and relevance is a hard thing to pin down though, and that is where some feedback on our new search would come in handy. We need to know if you're seeing things that are too old, or things that just aren't related to what you're searching for.
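As a rough illustration of why weighting titles above thread bodies helps with the PS3/XBOX problem, here is a small Python sketch. It is plain Python, not Sphinx's API and not how Sphinx scores matches internally; the function name and the weights of 3.0 and 1.0 are assumptions picked only for the example.

```python
def field_weighted_score(query, title, body, title_weight=3.0, body_weight=1.0):
    """Score a thread by counting query words found in the title and body,
    with title matches counting for more than body matches."""
    terms = query.lower().split()
    title_words = set(title.lower().split())
    body_words = set(body.lower().split())

    title_hits = sum(1 for term in terms if term in title_words)
    body_hits = sum(1 for term in terms if term in body_words)

    return title_weight * title_hits + body_weight * body_hits


ps3_thread = field_weighted_score(
    "PS3 Games",
    title="PS3 games reduced",
    body="Several titles have been reduced this week",
)
xbox_thread = field_weighted_score(
    "PS3 Games",
    title="XBOX games sale",
    body="Cheaper than the ps3 games at other shops",
)
```

Here the XBOX thread mentions both search words in its body, but the PS3 thread still scores higher because both words appear in its title.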