Monday, March 25, 2013

MySQL 5.6 InnoDB Full Text Search Index

What is full text search Index in MySQL???

It is one of the index type FULL TEXT in MySQL.

If we want to search for a word we use LIKE ‘%word%’, if more than one word we use like ‘%word1%word2%’, what many people don’t know is for this kind of search is expensive and not optimized to our MySQL, in this cases we solve our problems with Full Text Index.

In MySQL 5.6 the InnoDB storage engines have the features of Full text search index option.
 
Full-text indexes can be used only with InnoDB or MyISAM tables, and can be created only for CHAR, VARCHAR, or TEXT columns. 

Full-text searching is performed using MATCH()...AGAINST() syntax. MATCH() takes a comma-separated list that names the columns to be searched. AGAINST() takes a string to search for, and an optional modifier that indicates what type of search to perform.



There are three ways of search option in full text index,

                                              I) Natural language
                                             II) Boolean mode
                                            III) Query expansion

I)Natural Language Full text searches:
           
      It search the given word alone in the table.If there is no the given word means it will show as empty sets.The search string is given as the argument to AGAINST(). For each row in the table, MATCH() returns a relevance value; that is, a similarity measure between the search string and the text in that row in the columns named in the MATCH() list.
 
Eg:

Create a database and table in it.






















Here the engine is InnoDB and next insert the values.
See all the datas in the table.

Now we check the natural language full text search.Here,we need to search the data containing the word 'India' so the natural language mode shows the resemblance world of India in the table.It scan the full rows and shows it below.

In the above,three rows are affected among five rows because the word India contains only in those rows alone.If we need to search more than two word means see below,
 

The default minimum length is 4 characters. You can change the min and max word length with the variables ft_min_word_len and ft_max_word_len.

II)Boolean Mode full text searches:

Sometimes we need to see particular word containing rows but it should not have some other words too during that time we can use this type of searching option.The + and - operators indicate that a word is required to be present or absent, respectively, for a match to occur. 

Eg:

Here,I shown the table containing the word as India but it should not have the word South. 

Here,I shown the table which contain the word city and it should not have the word India.


III) Query expansion full text searches



In this search it will show the resemblance word from the given word.In this it first scan and show the given word and again scan to see any resemblance word like this to show in the table.It will show the words that are commonly found with the words in the query.

Eg:

Here its shows the resemblance word and the given word from the table.



These are the types of Full text search Engines in MySQL and its functions.


Thank you...





1 comment:

  1. Hello..first off nice shot on the explanation!
    I'm new to MySql and have used other familiar DB products. My questions here, when I started roving your article:

    (i) Don't we require admin privileges to enable the FULL TEXT search index? or the USER role is pretty much okay to perform the search?

    (ii) Besides the performance, is there any advantage of using this search index when compared to LIKE operator in the SELECT statement. Also, its applicable only for the columns with the data type Text, char and varchar(one of the limitations).

    (iii) If I wish to avoid/exclude some words(stop words) in the full text search, is there any way I can exclude it?

    (iv) For query optimization, we have come across indexing and paging concepts.

    TABLE: EMPLOYEE

    ID NAME REGION
    1111 ARJUNA DELHI
    1112 PRAVEEN HYD
    1113 MALLIKARJUNAN DELHI
    1114 KUMAR CHENNAI

    SELECT * FROM EMPLOYEE
    WHERE NAME LIKE ‘%ARJUN%’ Consider NAME is a non-clustered index column

    TABLE: EMPLOYEE1

    ID NAME REGION
    1111 ARJUNA DELHI
    1112 PRAVEEN HYD
    1113 MALLIKARJUNAN DELHI
    1114 KUMAR CHENNAI

    SELECT * FROM EMPLOYEE1
    WHERE NAME MATCH(NAME.ARJUN) AGAINST (‘DELHI’) consider FULL text search index is enabled in the NAME column

    a) Which of these 2 queries is the best when dealing with large volume of data(have mentioned only few rows)?

    b) How the data in the FTS index is stored(whether physically sorted)?

    c) Where the FTS index is stored default (primary or secondary file group)?

    Looking forward to hear from you on my queries.

    Thank you!

    ReplyDelete