This directory contains an implementation of an indexing and search
mechanism.

Architecture:
=============

The indexer and search mechanism design is modular:
- A storage mechanism
- An indexer class
- A search class
- Text processing classes.

The indexer uses a text processing class and a storage mechanism to create a
search database. The search class uses the same storage mechanism to search
the database.
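
The division of labour described above can be illustrated with a small,
language-neutral sketch. It is written in Python rather than Pascal and does
not use the API of this package: MemoryStorage, plain_text_words, Indexer and
Search are hypothetical names chosen only to show how an indexer and a search
class can share one storage mechanism.

```python
class MemoryStorage:
    """Hypothetical in-memory storage: maps each word to the sources containing it."""

    def __init__(self):
        self._postings = {}

    def add(self, word, source):
        self._postings.setdefault(word, set()).add(source)

    def lookup(self, word):
        # Return a copy so callers cannot modify the stored sets.
        return set(self._postings.get(word, set()))


def plain_text_words(text):
    """Hypothetical text processor: split on whitespace, lowercase, trim punctuation."""
    words = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return [w for w in words if w]


class Indexer:
    """Uses a text processor and a storage object to build the search database."""

    def __init__(self, storage, processor):
        self.storage = storage
        self.processor = processor

    def index(self, source, text):
        for word in self.processor(text):
            self.storage.add(word, source)


class Search:
    """Queries the same storage object that the indexer filled."""

    def __init__(self, storage):
        self.storage = storage

    def find(self, word):
        return self.storage.lookup(word.lower())


storage = MemoryStorage()
indexer = Indexer(storage, plain_text_words)
indexer.index("a.txt", "The quick brown fox.")
indexer.index("b.txt", "A lazy fox sleeps.")
search = Search(storage)
print(sorted(search.find("fox")))
```

Because both classes depend only on the storage object's add and lookup
methods, a different storage backend could be substituted without changing
either of them.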

Currently, 3 databases are supported:
- In-memory database (plus flat file storage)
- Firebird database
- SQLite database.

3 input text processors are supported:
- Plain text
- HTML
- Pascal source files.

When a file is processed, the text processor is selected based on the
file's extension.
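
Selecting a processor by file extension amounts to a lookup in a registry.
The following Python sketch shows the idea only; READERS, reader_for,
read_plain and read_html are hypothetical names, the extension list is an
assumption, and no reader for Pascal source is included.

```python
import os
from html.parser import HTMLParser


def read_plain(text):
    """Plain text needs no preprocessing."""
    return text


class _TextExtractor(HTMLParser):
    """Collects the text between tags and discards the markup."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def read_html(text):
    parser = _TextExtractor()
    parser.feed(text)
    parser.close()
    return " ".join(c.strip() for c in parser.chunks if c.strip())


# Registry mapping a lowercase file extension to a reader function.
READERS = {".txt": read_plain, ".html": read_html, ".htm": read_html}


def reader_for(filename):
    """Return the reader registered for the file's extension."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in READERS:
        raise ValueError("no reader registered for extension %r" % ext)
    return READERS[ext]


print(reader_for("page.html")("<p>Hello <b>world</b></p>"))
```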

A list of words to ignore can be specified per language, as well as a mask
for words to ignore.
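
As an illustration, word filtering of this kind might look like the Python
sketch below. It assumes that the mask is a wildcard pattern such as "x*";
IGNORE_LISTS and keep_words are hypothetical names, and the word lists are
made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical per-language ignore lists.
IGNORE_LISTS = {
    "en": {"the", "a", "and"},
    "nl": {"de", "het", "een"},
}


def keep_words(words, language, ignore_mask=None):
    """Drop words on the language's ignore list or matching the wildcard mask."""
    ignored = IGNORE_LISTS.get(language, set())
    kept = []
    for word in words:
        lowered = word.lower()
        if lowered in ignored:
            continue
        if ignore_mask is not None and fnmatchcase(lowered, ignore_mask):
            continue
        kept.append(lowered)
    return kept


print(keep_words(["The", "index", "and", "x1", "x22", "search"], "en", "x*"))
```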

On top of the file/stream indexer, a database indexer is implemented.
It can be used to implement full-text search on a database.
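
The idea of indexing a database into a separate index database can be
sketched as follows, using Python's sqlite3 module with two in-memory
databases. The table and column names (articles, word_index) and the
functions index_table and search are hypothetical and unrelated to the
schema used by this package.

```python
import sqlite3

# Source database holding the text to be searched.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT)")
source.executemany(
    "INSERT INTO articles (id, body) VALUES (?, ?)",
    [
        (1, "Firebird is a relational database"),
        (2, "SQLite is an embedded database"),
    ],
)

# Second database that holds only the word index.
index = sqlite3.connect(":memory:")
index.execute(
    "CREATE TABLE word_index ("
    "word TEXT, row_id INTEGER, PRIMARY KEY (word, row_id))"
)


def index_table(source, index):
    """Record, for every word, the ids of the source rows that contain it."""
    for row_id, body in source.execute("SELECT id, body FROM articles"):
        for word in set(body.lower().split()):
            index.execute(
                "INSERT OR IGNORE INTO word_index (word, row_id) VALUES (?, ?)",
                (word, row_id),
            )
    index.commit()


def search(index, word):
    """Return the ids of the source rows containing the word, in ascending order."""
    rows = index.execute(
        "SELECT row_id FROM word_index WHERE word = ? ORDER BY row_id",
        (word.lower(),),
    )
    return [row_id for (row_id,) in rows]


index_table(source, index)
print(search(index, "database"))
```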

Sample programs for all 3 classes (search, index and index DB) are provided
in the examples directory.

Overview of units:
==================

fpindexer:
  The indexer, search and abstract database engine classes.
  An abstract SQL storage engine class.
ireaderhtml:
  An input engine for HTML files.
ireaderpas:
  An input engine for Pascal files.
ireadertxt:
  An input engine for plain text files.
masks:
  Copied from the LCL, to implement masks on words.
memindexdb:
  A memory storage engine.
sqldbindexdb:
  An abstract SQLDB storage engine.
fbindexdb:
  A descendant of the SQLDB storage engine which uses a Firebird database.
sqliteindexdb:
  SQLite database storage engine.
dbindexdb:
  Component to index a database.

Overview of classes:
====================

fpindexer:
----------
TFPIndexer: The indexing engine.
TCustomFileReader: Abstract input engine.
TFileHandlersManager: Factory for file reader classes.
TIgnoreListDef: Word ignore list definition.
TIgnoreLists: Collection of ignore lists.
TFPSearch: The search engine.
TCustomIndexDB: Abstract storage engine.
TSQLIndexDB: Abstract SQL-based storage engine.

ireaderhtml:
------------
TIReaderHTML: HTML input engine.

ireaderpas:
-----------
TIReaderPAS: Pascal input engine.

ireadertxt:
-----------
TIReaderTXT: Plain text input engine.

memindexdb:
-----------
TMemIndexDB: In-memory storage engine.
TFileIndexDB: Descendant of TMemIndexDB which stores everything in a flat
file using a custom format.

sqldbindexdb:
-------------
TSQLDBIndexDB: Abstract class for SQLDB-based storage (descendant of TSQLIndexDB).

sqliteindexdb:
--------------
TSQLiteIndexDB: SQLite-based storage engine, descendant of TSQLIndexDB.

fbindexdb:
----------
TFBIndexDB: TSQLDBIndexDB descendant for Firebird.

dbindexdb:
----------
TDBIndexer: Implements a database indexer, using a second database as the index.
TIBIndexer: Descendant of TDBIndexer for Firebird.