We have a table that stores names of people and companies/organisations. What the user needs is the ability to specify the separate words of a name in any order. Like "Seet Kin Meng" can pick out "Kin Meng Seet" in the database.
Before you may want to suggest Full-text Search, we are already using FTS for keyword indexing of other columns that have english passages. What we found from using FTS is it filters out noise words (English here) so a company name of "Order by phone" will drop "by" when it should be a perfectly legitimate exact name match. Besides the wildcarding for FTS (prefix term) only supports the trailing end of a word ("John*") and not the front.
Has anybody implemented other means of name searching that easily avoids affinity to a particular order?
The melody of logic will always play out the truth. ~ Narumi Ayumu, Spiral
http://devpinoy.org/blogs/cruizer
Maybe something like regular expression can help. Something like this:
http://www.codeproject.com/cs/library/SqlRegEx.asp
You'll also need to find a regular expression for "match all these 3 words". Too bad i'm inexperienced in RegEx.
I don't know how fast it will be, but how about a lame straight query on all rows?
Something like:
WHERE (@word1 IS NULL OR ' ' + [name] + ' ' LIKE @word1) AND (@word2.... etc. up to max number of words that can be searched on.
(Name is prefixed + suffixed with a space). Then pass in words: @word1 = "% Seet %" etc.
usoup: You'll also need to find a regular expression for "match all these 3 words". Too bad i'm inexperienced in RegEx.
See if RegEx something like this helps:
^(\s?Seet\s?|\s?Kin\s?|\s?Meng\s?){3}$
Of course you'll need to dynamically fill in the names in the places of "Seet", "Kin", and "Meng", and count the number of words to match in "{3}"
icelava: Using an | OR operator would match "Seet Koon Leng" from a user's input "Seet Kin Meng".
Well, that's why I specify the count "{3}" at the end of the Regex "^(\s?Seet\s?|\s?Kin\s?|\s?Meng\s?){3}$". This RegEx can be read as "match Seet OR Kin OR Meng, and I want exactly 3 matches." So, order is unimportant, because you're matching by words. But of course, if you have records something like "Meng Meng Kin" or "Meng Seet Kin Ong" or "Kin Kin Kin Kin", they all will match too. We need some RegEx expert here to translate "match Seet OR Kin OR Meng, with exactly 1 or more matches EACH". Deep ...
cruizer:by the way can you make use of CLR stuff in this implementation you're doing? though i personally have no experience with integrating SQL Server with .NET...
Yes in fact this query that I am enhancing is already a managed stored procedure, which makes dynamic query generation a whole lot easier. I am not aiming for any efficiency, just additional functionality for the users, so I whipped up the following construction logic
internal class NameQuery{ private static readonly char[] nameToken = new char[] { ' ' }; private static readonly string AndOperator = "AND ";
internal static void QueryName(StringBuilder queryBuilder, SqlCommand command, string name, bool anyOrder) {
if (anyOrder) QueryAnyOrder(queryBuilder, command, name); else QuerySingular(queryBuilder, command, name); }
/// <summary>/// Constructs name query for only a singular name string./// </summary>internal static void QuerySingular(StringBuilder queryBuilder, SqlCommand command, string name){ queryBuilder.AppendFormat(" AND (en.MainName LIKE '%'+ @name +'%' OR ea.Alias LIKE '%'+ @name +'%')", name); command.Parameters.AddWithValue("@name", name);}
/// <summary>/// Constructs name query to consider all parts of the name string, in any order./// </summary>/// <remarks>/// In order to use conventional T-SQL LIKE operator to support matching all parts of a name,/// in any order, the full name string has to be broken down into its individual parts/// and place into its own LIKE comparison on the same column value. This is performance impeding/// as more name parts are compared against more records./// </remarks>internal static void QueryAnyOrder(StringBuilder queryBuilder, SqlCommand command, string name){ StringBuilder mainNameBuilder = new StringBuilder(200); StringBuilder aliasBuilder = new StringBuilder(200);
string[] nameParts = name.Split(nameToken); int validCounter = 0;
foreach (string namePart in nameParts) { if (namePart.Trim().Length == 0) continue;
if (validCounter > 0) { mainNameBuilder.Append(AndOperator); aliasBuilder.Append(AndOperator); }
// Each valid namePart is added as its own parameter // for its part of the query statement. string parameterName = "@namePart" + validCounter.ToString(); mainNameBuilder.AppendFormat("en.MainName LIKE '%'+ " + parameterName + " +'%' ", namePart); aliasBuilder.AppendFormat("ea.alias LIKE '%'+ " + parameterName + " +'%' ", namePart); command.Parameters.AddWithValue(parameterName, namePart); validCounter++; }
// If no valid namePart was attached as a parameter and query statement,// do not attach the AND and OR clauses since there is no condition.
if (validCounter == 0) return;
queryBuilder.AppendLine(" AND ("); queryBuilder.AppendLine(mainNameBuilder.ToString()); queryBuilder.AppendLine(") OR ("); queryBuilder.AppendLine(aliasBuilder.ToString()); queryBuilder.AppendLine(") ");}
}