FullText Search Provider for CommunityServer 2007...

People in Asia who use CommunityServer for a forum or blog system probably know that the search function does not work for Asian languages such as Chinese, Japanese, and Korean. The reason is that CommunityServer has a built-in, hashcode-indexed keyword search mechanism that only works for languages like English that use spaces as separators. It does not work for a language like Chinese, where there are no spaces to separate words.
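
To make the problem concrete, here is a minimal sketch in Python. It is not CommunityServer's actual indexing code (which is .NET and stores hash codes of keywords); the function names and sample posts are made up for illustration, and the index is keyed by the word itself to keep things short. It only shows why an index built by splitting on spaces cannot find a word inside a Chinese sentence, while a plain substring match can.

```python
from collections import defaultdict


def build_index(posts):
    """Build a keyword index by splitting each post on whitespace.

    Hypothetical stand-in for a space-delimited keyword index.
    """
    index = defaultdict(set)
    for post_id, text in posts.items():
        for word in text.lower().split():
            index[word].add(post_id)
    return index


def keyword_search(index, term):
    """Look up a term as a whole keyword in the index."""
    return sorted(index.get(term.lower(), set()))


def substring_search(posts, term):
    """Scan every post and match the term anywhere in the text."""
    return sorted(pid for pid, text in posts.items() if term in text)


# Sample data, invented for this example.
posts = {
    1: "full text search provider",
    2: "我想要搜索中文内容",  # no spaces, so the whole sentence becomes one "word"
}
index = build_index(posts)

print(keyword_search(index, "search"))   # English keyword is found in post 1
print(keyword_search(index, "搜索"))      # Chinese word is not a key in the index
print(substring_search(posts, "搜索"))    # substring match finds post 2
```

The Chinese post ends up in the index under one giant key (the entire sentence), so a search for a word inside it never matches.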

This indexing mechanism also makes the index table grow steadily over time. That usually results in a large database, and it sometimes causes problems in a web-hosting environment where disk space is limited.

Since CommunityServer 1.1, I have been solving this problem by modifying the CS source code to change the search mechanism back to a traditional full-text search on each post's title and body. In the CS2.0 era I did not keep up and upgrade my system, but my friend Jeffrey in Taiwan upgraded his blog system to CS2.0, dug into the source code, and produced a modified full-text search engine for CS2.0.

I skipped CS2.0 and upgraded my blog system straight to the latest version, CommunityServer 2007 (CS3.0), so the last task of the whole upgrade was to rewrite a full-text search provider that can search Chinese / Japanese / English content on my box. Looking at my upcoming workload, I figured I would not have time for this later, so I decided to use this weekend to finish it. And finally it's done!!! (Although it's now 6:30am on Monday morning as I write this post, and I'm going to work soon without having slept... orz).

Thanks to CS 2007's provider model, the modified search provider can now be shipped in binary form and is easy to install: just copy the DLL and modify the communityserver.config file. I have also provided the source code for reference. You can test it on my personal blog site, which can now search English / Chinese / Japanese without problems.

Be careful: don't use this provider on a high-traffic site. Every search goes against the content (cs_Posts) table, and since it uses a "like '%...%'" search, which generally cannot take advantage of an ordinary index because of the leading wildcard, you can imagine the load on your SQL server and the resulting performance...
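
For readers who want to see what a "like '%...%'" search looks like in practice, here is a rough sketch in Python. It is not the provider's real code: an in-memory SQLite database stands in for SQL Server, only the table name cs_Posts comes from this post, and the column names (PostID, Subject, Body), the helper names, and the sample rows are all assumptions made up for the example.

```python
import sqlite3


def escape_like(term):
    """Escape LIKE wildcards so the user's term is matched literally."""
    return term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def search_posts(conn, term):
    """Return IDs of posts whose title or body contains the term.

    The pattern starts with '%', so the database has to look at every row.
    """
    pattern = "%" + escape_like(term) + "%"
    rows = conn.execute(
        "SELECT PostID FROM cs_Posts "
        "WHERE Subject LIKE ? ESCAPE '\\' OR Body LIKE ? ESCAPE '\\' "
        "ORDER BY PostID",
        (pattern, pattern),
    ).fetchall()
    return [row[0] for row in rows]


def demo_connection():
    """Create an in-memory table with a few invented posts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cs_Posts ("
        "PostID INTEGER PRIMARY KEY, Subject TEXT, Body TEXT)"  # hypothetical columns
    )
    conn.executemany(
        "INSERT INTO cs_Posts (PostID, Subject, Body) VALUES (?, ?, ?)",
        [
            (1, "Full-text search provider", "A search provider for CommunityServer"),
            (2, "中文测试", "我想要搜索中文内容"),
            (3, "日本語テスト", "日本語の全文検索"),
        ],
    )
    return conn


conn = demo_connection()
print(search_posts(conn, "search"))  # English term
print(search_posts(conn, "搜索"))     # Chinese term
print(search_posts(conn, "全文"))     # Japanese term
```

The term is passed as a query parameter and its wildcards are escaped, so a search for something like "%" looks for a literal percent sign instead of matching every post. The cost is the same as described above: each search reads the whole table.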

Download the FullText Search Provider for CommunityServer 2007 here.

Enjoy~

 

Technorati tags: asp.net, programming, communityserver, search, provider, blog, platform