An improved model of distributed search engine
Affiliation: School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China

CLC Number: TP393

    Abstract:

    To address the search performance problems of traditional distributed search engines, a decentralized, highly parallel search model is proposed that improves on the traditional model in both index structure and search algorithm. In the model, the index is classified by document theme, a bitmap structure is used for longer inverted lists, and a parallel search algorithm (multi max score heap, MMSH) is implemented in the index nodes using multi-threading. Experimental results show that the improved model, with index classification and the bitmap strategy for the inverted list structure, improves search pertinence in the Merge layer and reduces CPU and memory cost. When the inverted list cannot be stored entirely in memory, the MMSH algorithm achieves highly parallel search with higher query efficiency than the classical term-at-a-time algorithm, which shortens the average search time and improves system throughput. Together, index classification, the bitmap structure and the parallel query algorithm avoid blind querying and improve the performance of distributed search engines.
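    The two structural ideas in the abstract (bitmaps for long inverted lists, and per-thread score heaps that are merged afterwards) can be illustrated with a minimal Python sketch. This is not the authors' MMSH algorithm, whose details the abstract does not give. The threshold `BITMAP_THRESHOLD`, the function names, the per-term score tables and the document-range partitioning are all assumptions made for illustration.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

BITMAP_THRESHOLD = 4  # hypothetical cutoff: longer lists are stored as bitmaps


def encode_postings(doc_ids, threshold=BITMAP_THRESHOLD):
    """Store a long inverted list as an integer bitmap, a short one as a sorted tuple."""
    if len(doc_ids) > threshold:
        bitmap = 0
        for d in doc_ids:
            bitmap |= 1 << d  # bit d is set when document d contains the term
        return ("bitmap", bitmap)
    return ("list", tuple(sorted(doc_ids)))


def contains(postings, doc_id):
    """Membership test that works for either representation."""
    kind, data = postings
    if kind == "bitmap":
        return (data >> doc_id) & 1 == 1
    return doc_id in data


def top_k_in_partition(index, scores, terms, doc_range, k):
    """Score the documents of one partition, keeping the k best in a bounded min-heap."""
    heap = []
    for doc in doc_range:
        score = sum(scores[t].get(doc, 0.0) for t in terms if contains(index[t], doc))
        if score <= 0:
            continue
        item = (score, -doc)  # negated id: the lower document id wins ties
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            heapq.heapreplace(heap, item)
    return heap


def parallel_search(index, scores, terms, num_docs, k=3, workers=2):
    """Run one heap per worker thread, then merge the partial heaps into a global top-k."""
    terms = [t for t in terms if t in index]  # ignore terms missing from the index
    step = -(-num_docs // workers)  # ceiling division
    ranges = [range(s, min(s + step, num_docs)) for s in range(0, num_docs, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        heaps = pool.map(lambda r: top_k_in_partition(index, scores, terms, r, k), ranges)
        merged = [item for h in heaps for item in h]
    return [(-neg_doc, score) for score, neg_doc in heapq.nlargest(k, merged)]


# Toy data (made up for this sketch): 8 documents, two terms.
NUM_DOCS = 8
raw_postings = {"search": [0, 1, 2, 3, 5, 7], "engine": [1, 5]}
scores = {
    "search": {0: 1.0, 1: 0.5, 2: 0.25, 3: 0.75, 5: 0.5, 7: 0.125},
    "engine": {1: 2.0, 5: 1.0},
}
index = {term: encode_postings(ids) for term, ids in raw_postings.items()}
results = parallel_search(index, scores, ["search", "engine"], NUM_DOCS)
print(results)  # [(1, 2.5), (5, 1.5), (0, 1.0)]
```

    In this toy index the six-entry list for "search" becomes a bitmap while the two-entry list for "engine" stays a plain list. In standard CPython, threads do not speed up CPU-bound scoring because of the global interpreter lock, so the sketch shows only the structure of per-thread heaps followed by a merge, not the throughput gains reported in the abstract.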

History
  • Received: September 25, 2013
  • Online: July 30, 2014