SgDotNet
Singapore Professional .NET User Group -For Cool Developers

Querying people/company names in any order

Latest post 10-30-2007 11:40 PM by icelava. 10 replies.
  • 10-30-2007 12:26 PM

    Querying people/company names in any order

    We have a table that stores names of people and companies/organisations. What the user needs is the ability to specify the separate words of a name in any order. Like "Seet Kin Meng" can pick out "Kin Meng Seet" in the database.

    Before you may want to suggest Full-text Search, we are already using FTS for keyword indexing of other columns that have english passages. What we found from using FTS is it filters out noise words (English here) so a company name of "Order by phone" will drop "by" when it should be a perfectly legitimate exact name match. Besides the wildcarding for FTS (prefix term) only supports the trailing end of a word ("John*") and not the front.

    Has anybody implemented other means of name searching that easily avoids affinity to a particular order?

    The melody of logic will always play out the truth. ~ Narumi Ayumu, Spiral

  • 10-30-2007 12:51 PM In reply to

    • cruizer
    • Top 25 Contributor
    • Joined on 07-25-2007
    • Singapore
    • Posts 153

    Re: Querying people/company names in any order

    hmm this might be more complex but maybe you can extract each word and put them (per row) in some other table that you can run a query on.
    http://devpinoy.org/blogs/cruizer
  • 10-30-2007 1:57 PM In reply to

    Re: Querying people/company names in any order

    In other words, creating our own index catalog? Hmmm not exactly how I envisioned it.... :-)

    The melody of logic will always play out the truth. ~ Narumi Ayumu, Spiral

  • 10-30-2007 2:31 PM In reply to

    • usoup
    • Top 50 Contributor
    • Joined on 12-06-2004
    • Central
    • Posts 80

    Re: Querying people/company names in any order

    Maybe something like regular expression can help. Something like this:

    http://www.codeproject.com/cs/library/SqlRegEx.asp

    You'll also need to find a regular expression for "match all these 3 words". Too bad i'm inexperienced in RegEx.

    -- Signed with Soup --
  • 10-30-2007 2:48 PM In reply to

    • hannes
    • Top 25 Contributor
    • Joined on 05-04-2004
    • South Africa
    • Posts 240

    Re: Querying people/company names in any order

    I don't know how fast it will be, but how about a lame straight query on all rows?

    Something like:

    WHERE (@word1 IS NULL OR ' ' + [name] + ' ' LIKE @word1) AND (@word2.... etc. up to max number of words that can be searched on.

    (Name is prefixed + suffixed with a space). Then pass in words: @word1 = "% Seet %" etc.

  • 10-30-2007 4:17 PM In reply to

    • usoup
    • Top 50 Contributor
    • Joined on 12-06-2004
    • Central
    • Posts 80

    Re: Querying people/company names in any order

    usoup:

    You'll also need to find a regular expression for "match all these 3 words". Too bad i'm inexperienced in RegEx.

    See if RegEx something like this helps:

    ^(\s?Seet\s?|\s?Kin\s?|\s?Meng\s?){3}$

     Of course you'll need to dynamically fill in the names in the places of "Seet", "Kin", and "Meng", and count the number of words to match in "{3}"

    -- Signed with Soup --
  • 10-30-2007 4:40 PM In reply to

    • cruizer
    • Top 25 Contributor
    • Joined on 07-25-2007
    • Singapore
    • Posts 153

    Re: Querying people/company names in any order

    i wonder how performance will be in such a scenario. this will have to be a full table scan (no indices), plus reg exp evaluation to boot. anyway i'm sure it'll get the job done...
    http://devpinoy.org/blogs/cruizer
  • 10-30-2007 5:12 PM In reply to

    Re: Querying people/company names in any order

    Regex sounds interesting but it has to match all of the name words in any order. Using an | OR operator would match "Seet Koon Leng" from a user's input "Seet Kin Meng". I am thinking hannes' suggestion may be the thoroughly and "comprehensive" solution albeit an n-times performance penalty. I am going to tinker with that and see just how slow it will get.

    The melody of logic will always play out the truth. ~ Narumi Ayumu, Spiral

  • 10-30-2007 6:08 PM In reply to

    • cruizer
    • Top 25 Contributor
    • Joined on 07-25-2007
    • Singapore
    • Posts 153

    Re: Querying people/company names in any order

    by the way can you make use of CLR stuff in this implementation you're doing? though i personally have no experience with integrating SQL Server with .NET...
    http://devpinoy.org/blogs/cruizer
  • 10-30-2007 7:07 PM In reply to

    • usoup
    • Top 50 Contributor
    • Joined on 12-06-2004
    • Central
    • Posts 80

    Re: Querying people/company names in any order

    icelava:
    Using an | OR operator would match "Seet Koon Leng" from a user's input "Seet Kin Meng".

    Well, that's why I specify the count "{3}" at the end of the Regex "^(\s?Seet\s?|\s?Kin\s?|\s?Meng\s?){3}$". This RegEx can be read as "match Seet OR Kin OR Meng, and I want exactly 3 matches." So, order is unimportant, because you're matching by words. But of course, if you have records something like "Meng Meng Kin" or "Meng Seet Kin Ong" or "Kin Kin Kin Kin", they all will match too. We need some RegEx expert here to translate "match Seet OR Kin OR Meng, with exactly 1 or more matches EACH". Deep ...

    -- Signed with Soup --
  • 10-30-2007 11:40 PM In reply to

    Re: Querying people/company names in any order

    cruizer:
    by the way can you make use of CLR stuff in this implementation you're doing? though i personally have no experience with integrating SQL Server with .NET...

    Yes in fact this query that I am enhancing is already a managed stored procedure, which makes dynamic query generation a whole lot easier. I am not aiming for any efficiency, just additional functionality for the users, so I whipped up the following construction logic

    internal class NameQuery
    {
      private static readonly char[] nameToken = new char[] { ' ' };
      private static readonly string AndOperator = "AND ";

      internal static void QueryName(StringBuilder queryBuilder, SqlCommand command, string name, bool anyOrder)
      {

        if (anyOrder)
          QueryAnyOrder(queryBuilder, command, name);
        else
         
    QuerySingular(queryBuilder, command, name);
      }

    /// <summary>
    ///
    Constructs name query for only a singular name string.
    /// </summary>
    internal static void QuerySingular(StringBuilder queryBuilder, SqlCommand command, string name)
    {
      queryBuilder.AppendFormat(" AND (en.MainName LIKE '%'+ @name +'%' OR ea.Alias LIKE '%'+ @name +'%')", name);
      command.Parameters.AddWithValue("@name", name);
    }

    /// <summary>
    ///
    Constructs name query to consider all parts of the name string, in any order.
    /// </summary>
    ///
    <remarks>
    ///
    In order to use conventional T-SQL LIKE operator to support matching all parts of a name,
    /// in any order, the full name string has to be broken down into its individual parts
    /// and place into its own LIKE comparison on the same column value. This is performance impeding
    /// as more name parts are compared against more records.
    /// </remarks>
    internal static void QueryAnyOrder(StringBuilder queryBuilder, SqlCommand command, string name)
    {
      StringBuilder mainNameBuilder = new StringBuilder(200);
      StringBuilder aliasBuilder = new StringBuilder(200);

      string[] nameParts = name.Split(nameToken);
      int validCounter = 0;

      foreach (string namePart in nameParts)
      {
        if (namePart.Trim().Length == 0) continue;

        if (validCounter > 0)
        {
          mainNameBuilder.Append(AndOperator);
          aliasBuilder.Append(AndOperator);
        }

        // Each valid namePart is added as its own parameter
        // for its part of the query statement.
       
    string parameterName = "@namePart" + validCounter.ToString();
        mainNameBuilder.AppendFormat("en.MainName LIKE '%'+ " + parameterName + " +'%' ", namePart);
        aliasBuilder.AppendFormat("ea.alias LIKE '%'+ " + parameterName + " +'%' ", namePart);
        command.Parameters.AddWithValue(parameterName, namePart);
        validCounter++;
      }

    // If no valid namePart was attached as a parameter and query statement,
    // do not attach the AND and OR clauses since there is no condition.

      if (validCounter == 0) return;

      queryBuilder.AppendLine(" AND (");
      queryBuilder.AppendLine(mainNameBuilder.ToString());
      queryBuilder.AppendLine(") OR (");
      queryBuilder.AppendLine(aliasBuilder.ToString());
      queryBuilder.AppendLine(") ");
    }

    }

    The melody of logic will always play out the truth. ~ Narumi Ayumu, Spiral

Page 1 of 1 (11 items) | RSS
Copyright SgDotNet 2004-2008
Powered by Community Server (Commercial Edition), by Telligent Systems