- This directory contains an implementation of an indexing and search
- mechanism.
- Architecture:
- =============
- The indexer and search mechanism design is modular:
- - A storage mechanism
- - An indexer class
- - A search class
- - Text processing classes.
- The indexer uses a text processing class and a storage mechanism to create a
- search database. The search class uses the same storage mechanism to search
- the database.
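The division of labour described above can be illustrated with a small, language-neutral sketch. It is written in Python purely for illustration and is not the fpindexer API: the names `MemoryStore`, `plain_text_words`, `Indexer` and `Search` are hypothetical stand-ins for the storage mechanism, a text processor, the indexer class and the search class.

```python
import re
from collections import defaultdict


class MemoryStore:
    """Hypothetical storage mechanism: maps each word to the documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, word, doc_id):
        self.postings[word].add(doc_id)

    def lookup(self, word):
        return sorted(self.postings.get(word, ()))


def plain_text_words(text):
    """Hypothetical text processor: yields lowercase words from plain text."""
    return (w.lower() for w in re.findall(r"[A-Za-z0-9_]+", text))


class Indexer:
    """Feeds the words produced by a text processor into a storage mechanism."""

    def __init__(self, store, processor):
        self.store = store
        self.processor = processor

    def index(self, doc_id, text):
        for word in self.processor(text):
            self.store.add(word, doc_id)


class Search:
    """Queries the same storage mechanism the indexer wrote to."""

    def __init__(self, store):
        self.store = store

    def find(self, word):
        return self.store.lookup(word.lower())


store = MemoryStore()
indexer = Indexer(store, plain_text_words)
indexer.index("a.txt", "The quick brown fox")
indexer.index("b.txt", "A quick reply")
hits = Search(store).find("Quick")
print(hits)
```

Because the indexer and the search object only share the store, either side can be replaced (for example by a SQL-backed store) without touching the other.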
- Currently, 3 databases are supported:
- - In-memory database (plus flat file storage)
- - Firebird database
- - SQLite database.
- 3 input text processors are supported:
- - Plain text
- - HTML
- - Pascal source files.
- When a file is processed, the text processor is selected based on the
- extension of the file.
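Selection by extension can be sketched as a lookup table from extension to reader. This is a Python illustration under stated assumptions, not the fpindexer implementation: the `READERS` registry, the two reader functions and the error raised for an unknown extension are all hypothetical.

```python
import os
import re


def read_text(data):
    """Plain text: every word is returned as-is."""
    return re.findall(r"\w+", data)


def read_html(data):
    """HTML: strip tags first (a crude regex, adequate only for a sketch)."""
    return re.findall(r"\w+", re.sub(r"<[^>]*>", " ", data))


# Hypothetical registry mapping a file extension to its text processor.
READERS = {".txt": read_text, ".html": read_html, ".htm": read_html}


def reader_for(filename):
    """Pick a text processor from the file extension (case-insensitive)."""
    ext = os.path.splitext(filename)[1].lower()
    try:
        return READERS[ext]
    except KeyError:
        # Assumption: the sketch rejects unknown extensions outright.
        raise ValueError(f"no text processor registered for {ext!r}")


words = reader_for("page.HTML")("<p>Hello <b>world</b></p>")
print(words)
```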
- A list of words to ignore can be specified per language, as well as a mask
- for words to ignore.
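As an illustration of how an ignore list and a mask might combine, the following Python sketch filters words before they would be indexed. The list contents, the `tmp*` mask and the function name `keep_word` are assumptions made for the example; the actual mask syntax used by the masks unit is not described here.

```python
from fnmatch import fnmatchcase

# Hypothetical per-language ignore lists and a hypothetical wildcard mask.
IGNORE_LISTS = {"en": {"the", "a", "of"}, "nl": {"de", "het", "een"}}
IGNORE_MASK = "tmp*"


def keep_word(word, language):
    """Return True if the word should be indexed for the given language."""
    word = word.lower()
    if word in IGNORE_LISTS.get(language, set()):
        return False
    # Words matching the mask are dropped regardless of language.
    return not fnmatchcase(word, IGNORE_MASK)


words = ["The", "index", "of", "tmpfile", "de"]
kept = [w for w in words if keep_word(w, "en")]
print(kept)
```

With the English list only "index" and "de" survive; with the Dutch list "de" would be dropped instead and "The" and "of" would be kept.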
- On top of the file/stream indexer, a database indexer is implemented.
- It can be used to implement full-text search on a database.
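The idea of indexing a database into a separate index database can be sketched with Python's standard `sqlite3` module. This is only a conceptual illustration: the `articles` and `words` tables, their columns and the `search` function are hypothetical and do not reflect the schema used by this package.

```python
import re
import sqlite3

# Source database holding the data to be searched (hypothetical schema).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT)")
source.executemany(
    "INSERT INTO articles (id, body) VALUES (?, ?)",
    [(1, "Firebird is a relational database"),
     (2, "SQLite is an embedded database")],
)

# Separate index database: one row per (word, source row) pair.
index = sqlite3.connect(":memory:")
index.execute(
    "CREATE TABLE words (word TEXT, article_id INTEGER, UNIQUE (word, article_id))"
)

# Walk the source rows and record every distinct word with the row it came from.
for article_id, body in source.execute("SELECT id, body FROM articles"):
    for word in set(re.findall(r"\w+", body.lower())):
        index.execute("INSERT OR IGNORE INTO words VALUES (?, ?)", (word, article_id))
index.commit()


def search(word):
    """Return the ids of source rows whose text contains the word."""
    rows = index.execute(
        "SELECT article_id FROM words WHERE word = ? ORDER BY article_id",
        (word.lower(),),
    )
    return [row[0] for row in rows]


print(search("database"))
print(search("Firebird"))
```

A search returns keys of source rows, so the caller can fetch the full records from the source database afterwards.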
- Sample programs for all 3 classes (search, indexer and database indexer)
- are provided in the examples directory.
- Overview of units:
- ==================
- fpindexer:
- The indexer, search and abstract database engine classes.
- An abstract SQL storage engine class.
- ireaderhtml:
- An input engine for HTML files.
- ireaderpas:
- An input engine for Pascal files.
- ireadertxt:
- An input engine for plain text files.
- masks:
- Copied from the LCL, to implement masks on words.
- memindexdb:
- A memory storage engine.
- sqldbindexdb:
- An abstract SQLDB storage engine.
- fbindexdb:
- A descendant of the SQLDB storage engine which uses a Firebird database.
- sqliteindexdb:
- SQLite database storage engine.
- dbindexdb:
- Component to index a database.
- Overview of classes:
- ====================
- fpindexer:
- ----------
- TFPIndexer: The indexing engine.
- TCustomFileReader: Abstract input engine.
- TFileHandlersManager: Factory for file reader classes.
- TIgnoreListDef: Word ignore list definition.
- TIgnoreLists: Collection of ignore lists.
- TFPSearch: The search engine.
- TCustomIndexDB: Abstract storage engine.
- TSQLIndexDB: Abstract SQL-Based storage engine.
- ireaderhtml:
- ------------
- TIReaderHTML: HTML input engine.
- ireaderpas:
- -----------
- TIReaderPAS: Pascal input engine.
- ireadertxt:
- -----------
- TIReaderTXT: plain text input engine.
- memindexdb:
- -----------
- TMemIndexDB: In-memory storage engine.
- TFileIndexDB: Descendant of TMemIndexDB which stores everything in a flat
- file using a custom format.
- sqldbindexdb:
- -------------
- TSQLDBIndexDB: Abstract class for SQLDB-based storage (descendant of TSQLIndexDB).
- sqliteindexdb:
- --------------
- TSQLiteIndexDB: SQLite-based storage engine, descendant of TSQLIndexDB.
- fbindexdb:
- ----------
- TFBIndexDB: TSQLDBIndexDB descendant for Firebird.
- dbindexdb:
- ----------
- TDBIndexer: Implements a database indexer, using a second database as the index.
- TIBIndexer: Descendant of TDBIndexer for Firebird.