updated docs

git-svn-id: svn://svn.sphinxsearch.com/sphinx/trunk@367 406a0c4d-033a-0410-8de8-e80135713968
shodan 19 years ago
parent
commit
fcddb7d3d1
5 changed files with 640 additions and 246 deletions
  1. 1 158
      INSTALL
  2. 0 86
      README
  3. 0 0
      doc/sphinx.html
  4. 628 0
      doc/sphinx.txt
  5. 11 2
      doc/sphinx.xml

+ 1 - 158
INSTALL

@@ -1,158 +1 @@
-Sphinx 0.9.6 installation notes
-================================
-
-Supported operating systems
-----------------------------
-
-Most modern UNIX systems with a C++ compiler should be able
-to compile and run Sphinx without any modifications.
-
-Currently known systems Sphinx has been successfully compiled and
-tested on are:
-
-   - FreeBSD 4.x, 5.x, 6.x
-   - Linux 2.4.x, 2.6.x (various distributions)
-   - Windows 2000, XP
-   - NetBSD 1.6
-
-We hope Sphinx will work on other Unix platforms as well.
-If the platform you run Sphinx on is not in this list,
-please do report it!
-
-Required tools
----------------
-
-On UNIX, you will need the following tools to build
-and install Sphinx:
-
-   - a working C++ compiler. GNU gcc is known to work.
-   - a good make program. GNU make is known to work.
-
-On Windows, you will need Microsoft Visual C/C++ Studio .NET 2003.
-Other compilers/environments will probably work as well, but for the
-time being, you will have to build makefile or project file yourself.
-
-Installing Sphinx
-------------------
-
-1. Extract everything from the distribution tarball (haven't you already?)
-   and go to the 'sphinx' subdirectory:
-
-      $ tar xzvf sphinx-0.9.6.tar.gz
-      $ cd sphinx
-
-2. Run the configuration program:
-
-      $ ./configure
-
-   There's a number of options to configure. The complete listing may
-   be obtained by using '--help' switch. The most important ones are:
-   
-      '--prefix', which specifies where to install Sphinx;
-
-      '--with-mysql', which specifies where to look for MySQL
-      include and library files, if auto-detection fails;
-
-      '--with-pgsql', which specifies where to look for PostgreSQL
-      include and library files.
-
-3. Build the binaries:
-
-      $ make
-
-4. Install the binaries in the directory of your choice:
-
-      $ make install
-
-Known installation problems
-----------------------------
-
-If 'configure' fails to locate MySQL headers and/or libraries,
-try checking for and installing 'mysql-devel' package. On some systems,
-it is not installed by default.
-
-If 'make' fails with a message which looks like
-
-   /bin/sh: g++: command not found
-   make[1]: *** [libsphinx_a-sphinx.o] Error 127
-
-try checking for and installing 'gcc-c++' package.
-
-If you are getting compile-time errors which look like
-
-   sphinx.cpp:67: error: invalid application of `sizeof' to
-      incomplete type `Private::SizeError<false>'
-
-that means that some compile-time type size check failed.
-The most probable reason is that off_t type is less than 64-bit
-on your system. As a quick hack, you can edit sphinx.h and replace off_t
-with DWORD in a typedef for SphOffset_t, but note that this will prohibit
-you from using full-text indices larger than 2 GB. Even if the hack helps,
-please report such issues, providing the exact error message and
-compiler/OS details, so I could fix them in next releases.
-
-If you keep getting any other error, or the suggestions above
-do not seem to help you, please don't hesitate to contact me.
-
-Quick Sphinx usage guide
--------------------------
-
-All the example commands below assume that you installed Sphinx
-in '/usr/local/sphinx'.
-
-To use Sphinx, you need to:
-
-1. Create a configuration file.
-
-   Default configuration file name is 'sphinx.conf'. All Sphinx
-   programs look for this file in working directory by default.
-
-   Sample configuration file, 'sphinx.conf.dist', which has all the
-   options documented, is created by 'configure'. Copy and edit that
-   sample file to make your own configuration:
-
-   $ cd /usr/local/sphinx/etc
-   $ cp sphinx.conf.dist sphinx.conf
-   $ vi sphinx.conf
-
-   Sample file should index 'documents' table from MySQL database 'test';
-   so there's 'example.sql' sample data file to populate that table with
-   a few documents for testing purposes:
-
-   $ mysql -u test <example.sql
-
-2. Run the indexer to create full-text index from your data:
-
-   $ cd /usr/local/sphinx/etc
-   $ /usr/local/sphinx/bin/indexer
-
-3. Run the command-line client to query newly created index:
-
-   $ cd /usr/local/sphinx/etc
-   $ /usr/local/sphinx/bin/search test
-
-To use Sphinx from your PHP scripts, you need to:
-
-1. Run the search daemon which your script will talk to:
-
-   $ cd /usr/local/sphinx/etc
-   $ /usr/local/sphinx/bin/searchd
-
-2. Run the attached PHP API test script (to ensure that the daemon
-   was successfully started and is ready to serve the queries):
-
-   $ cd sphinx/api
-   $ php test.php test
-
-3. Include the API (it's located in api/sphinxapi.php) and use from
-   your own scripts.
-
-Happy searching!
-
-Contacts
----------
-
-E-mail: shodan(at)shodan.ru
-Web: http://shodan.ru/contact/
-
---eof--
+Please refer to <<Installation>> section in doc/sphinx.txt or doc/sphinx.html.

+ 0 - 86
README

@@ -1,86 +0,0 @@
-Sphinx 0.9.6
-=============
-
-Copyright (c) 2001-2006, Andrew Aksyonoff <shodan(at)shodan.ru>
-Distributed under GPL; see the file COPYING for details.
-See below for commercial licensing questions.
-
-Overview
----------
-
-Sphinx is a search engine primarily intended to search through
-SQL databases, though generally applicable to searching anything else.
-
-Sphinx consists of the following parts:
-
-- full-text indexing/searching library written in C++
-- a set of generic utilities built on top of that library:
-  - indexer, a utility to create full-text indices
-  - searchd, a daemon to run queries remotely (eg. from PHP/perl scripts)
-  - search, a console utility to run queries from command line
-- searchd APIs for different languages (currently, PHP)
-
-Features
----------
-
-- high indexing speed (up to 10 MB/sec on modern CPUs);
-- high search speed (avg query is under 0.1 sec on 2-4 GB text collections);
-- high scalability (up to 100 GB of text, up to 100 M documents on a single CPU);
-- distributed searching;
-- supports MySQL natively (MyISAM and InnoDB tables are both supported);
-- supports PostgreSQL natively;
-- supports phrase searching;
-- supports phrase proximity ranking, providing very good relevance;
-- supports English and Russian stemming;
-- supports any number of document fields (weights can be changed on the fly);
-- supports document groups;
-- supports stopwords;
-- supports "match all", "match phrase" and "match any" search modes;
-- supports boolean queries;
-- generic XML interface which greatly simplifies custom integration.
-
-Where to get it?
------------------
-
-Sphinx can be obtained through official project website at
-http://sphinxsearch.com/
-
-How do I use it?
------------------
-
-Please read the file INSTALL for installation instructions and
-a quick guide to using Sphinx.
-
-License
---------
-
-This program is free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 2 of the License, or
-(at your option) any later version.
-
-This program is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with this program; if not, visit http://www.gnu.org/ or write
-to the Free Software Foundation, Inc., 59 Temple Place, Suite 330,
-Boston, MA 02111-1307 USA
-
-Commercial licensing
----------------------
-
-If you don't want to be bound by GNU GPL terms (for instance,
-if you would like to embed Sphinx in your software, but would not
-like to disclose its source code), please contact me to obtain
-a commercial license.
-
-Contacts
----------
-
-E-mail: shodan(at)shodan.ru
-Web: http://shodan.ru/contact/
-
---eof--

File diff suppressed because it is too large
+ 0 - 0
doc/sphinx.html


+ 628 - 0
doc/sphinx.txt

@@ -0,0 +1,628 @@
+Sphinx 0.9.6 reference manual
+
+Free open-source SQL full-text search engine
+
+Copyright (c) 2001-2006 Andrew Aksyonoff, <shodan(at)shodan.ru>
+
+-----------------------------------------------------------------
+
+Table of Contents
+
+1. Introduction
+
+     1.1. About
+     1.2. Sphinx features
+     1.3. Where to get Sphinx
+     1.4. License
+     1.5. Author and contributors
+     1.6. History
+
+2. Installation
+
+     2.1. Supported systems
+     2.2. Required tools
+     2.3. Installing Sphinx
+     2.4. Known installation issues
+     2.5. Quick Sphinx usage tour
+
+3. Indexing
+
+     3.1. Data sources
+     3.2. Indexes
+     3.3. Restrictions on the source data
+     3.4. Charsets, case folding, and translation tables
+     3.5. SQL data sources (MySQL, PostgreSQL)
+     3.6. XMLpipe data source
+     3.7. Live index updates
+
+A. Sphinx revision history
+
+-----------------------------------------------------------------
+
+1. Introduction
+---------------
+
+1.1. About
+----------
+
+Sphinx is a full-text search engine, distributed under GPL version 2.
+Commercial licensing is also available upon request.
+
+Generally, it's a standalone search engine, meant to provide fast,
+size-efficient and relevant fulltext search functions to other
+applications. Sphinx was specially designed to integrate well with SQL
+databases and scripting languages. Currently built-in data source
+drivers support fetching data either via direct connection to MySQL,
+PostgreSQL, or from a pipe in a custom XML format.
+
+As for the name, Sphinx is an acronym which is officially decoded as
+SQL Phrase Index. Yes, I know about CMU's Sphinx project.
+
+1.2. Sphinx features
+--------------------
+
+  * high indexing speed (up to 10 MB/sec on modern CPUs);
+  * high search speed (avg query is under 0.1 sec on 2-4 GB text
+    collections);
+  * high scalability (up to 100 GB of text, up to 100 M documents on a
+    single CPU);
+  * provides good relevance through phrase proximity ranking;
+  * provides distributed searching capabilities;
+  * provides document excerpts generation;
+  * supports MySQL natively (MyISAM and InnoDB tables are both
+    supported);
+  * supports PostgreSQL natively;
+  * supports single-byte encodings and UTF-8;
+  * supports English stemming, Russian stemming, and Soundex for
+    morphology;
+  * supports any number of document fields (weights can be changed on
+    the fly);
+  * supports document groups;
+  * supports stopwords;
+  * supports "match all", "match phrase", "match any" and "boolean
+    query" search modes.
+
+1.3. Where to get Sphinx
+------------------------
+
+Sphinx is available through its official Web site at
+http://www.sphinxsearch.com/.
+
+Currently, Sphinx distribution tarball includes the following
+software:
+
+  * indexer: a utility to create fulltext indexes;
+  * search: a simple (test) utility to query fulltext indexes from
+    command line;
+  * searchd: a daemon to search through fulltext indexes from external
+    software (such as Web scripts);
+  * sphinxapi: a set of API libraries for popular Web scripting
+    languages (currently, PHP).
+
+1.4. License
+------------
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2 of the License, or (at
+your option) any later version. See COPYING file for details.
+
+This program is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
+USA
+
+If you don't want to be bound by GNU GPL terms (for instance, if you
+would like to embed Sphinx in your software, but would not like to
+disclose its source code), please contact [25]the author to obtain a
+commercial license.
+
+1.5. Author and contributors
+----------------------------
+
+Author
+------
+
+Sphinx initial author and current primary developer is:
+
+  * Andrew Aksyonoff, <shodan(at)shodan.ru>
+
+Contributors
+------------
+
+People who contributed to Sphinx and their contributions (in no
+particular order) are:
+
+  * Robert "coredev" Bengtsson (Sweden), initial version of PostgreSQL
+    data source;
+
+Many other people have contributed ideas, bug reports, fixes, etc.
+Thank you!
+
+1.6. History
+------------
+
+Sphinx development was started back in 2001, because I didn't manage
+to find an acceptable search solution (for a database driven Web site)
+which would meet my requirements. Actually, each and every important
+aspect was a problem:
+
+  * search quality (ie. good relevance)
+       + statistical ranking methods performed rather badly, especially
+         on large collections of small documents (forums, blogs, etc)
+  * search speed
+       + especially if searching for phrases which contain stopwords,
+         as in "to be or not to be"
+  * moderate disk and CPU requirements when indexing
+       + important in shared hosting environment, not to mention the
+         indexing speed.
+
+Despite the amount of time passed and numerous improvements made in
+the other solutions, there's still no solution which I personally
+would be eager to migrate to.
+
+Considering that and a lot of positive feedback received from Sphinx
+users during last years, the obvious decision is to continue
+developing Sphinx (and, eventually, to take over the world).
+
+2. Installation
+---------------
+
+2.1. Supported systems
+----------------------
+
+Most modern UNIX systems with a C++ compiler should be able to compile
+and run Sphinx without any modifications.
+
+Currently known systems Sphinx has been successfully running on are:
+
+  * Linux 2.4.x, 2.6.x (various distributions)
+  * Windows 2000, XP
+  * FreeBSD 4.x, 5.x, 6.x
+  * NetBSD 1.6
+
+I hope Sphinx will work on other Unix platforms as well. If the
+platform you run Sphinx on is not in this list, please do report it.
+
+At the moment, Windows version of Sphinx's searchd daemon is not
+intended to be used in production because it can only handle one
+client at a time.
+
+2.2. Required tools
+-------------------
+
+On UNIX, you will need the following tools to build and install
+Sphinx:
+
+  * a working C++ compiler. GNU gcc is known to work.
+  * a good make program. GNU make is known to work.
+
+On Windows, you will need Microsoft Visual C/C++ Studio .NET 2003.
+Other compilers/environments will probably work as well, but for the
+time being, you will have to build makefile (or other environment
+specific project files) manually.
+
+2.3. Installing Sphinx
+----------------------
+
+1. Extract everything from the distribution tarball (haven't you
+   already?) and go to the sphinx subdirectory:
+
+   $ tar xzvf sphinx-0.9.6.tar.gz
+   $ cd sphinx
+
+2. Run the configuration program:
+
+   $ ./configure
+
+   There's a number of options to configure. The complete listing may
+   be obtained by using --help switch. The most important ones are:
+
+      * --prefix, which specifies where to install Sphinx;
+      * --with-mysql, which specifies where to look for MySQL include
+        and library files, if auto-detection fails;
+      * --with-pgsql, which specifies where to look for PostgreSQL
+        include and library files.
+
+3. Build the binaries:
+
+   $ make
+
+4. Install the binaries in the directory of your choice:
+
+   $ make install
+
+2.4. Known installation issues
+------------------------------
+
+If configure fails to locate MySQL headers and/or libraries, try
+checking for and installing mysql-devel package. On some systems, it
+is not installed by default.
+
+If make fails with a message which looks like
+
+   /bin/sh: g++: command not found
+   make[1]: *** [libsphinx_a-sphinx.o] Error 127
+
+try checking for and installing gcc-c++ package.
+
+If you are getting compile-time errors which look like
+
+   sphinx.cpp:67: error: invalid application of `sizeof' to
+       incomplete type `Private::SizeError<false>'
+
+this means that some compile-time type size check failed. The most
+probable reason is that off_t type is less than 64-bit on your system.
+As a quick hack, you can edit sphinx.h and replace off_t with DWORD in
+a typedef for SphOffset_t, but note that this will prohibit you from
+using full-text indexes larger than 2 GB. Even if the hack helps,
+please report such issues, providing the exact error message and
+compiler/OS details, so I could fix them in next releases.
+
+If you keep getting any other error, or the suggestions above do not
+seem to help you, please don't hesitate to contact me.
+
+2.5. Quick Sphinx usage tour
+----------------------------
+
+All the example commands below assume that you installed Sphinx in
+/usr/local/sphinx.
+
+To use Sphinx, you will need to:
+
+1. Create a configuration file.
+
+   Default configuration file name is sphinx.conf. All Sphinx
+   programs look for this file in current working directory by
+   default.
+
+   Sample configuration file, sphinx.conf.dist, which has all the
+   options documented, is created by configure. Copy and edit that
+   sample file to make your own configuration:
+
+   $ cd /usr/local/sphinx/etc
+   $ cp sphinx.conf.dist sphinx.conf
+   $ vi sphinx.conf
+
+   Sample configuration file is setup to index documents table from
+   MySQL database test; so there's example.sql sample data file to
+   populate that table with a few documents for testing purposes:
+
+   $ mysql -u test < /usr/local/sphinx/etc/example.sql
+
+2. Run the indexer to create full-text index from your data:
+
+   $ cd /usr/local/sphinx/etc
+   $ /usr/local/sphinx/bin/indexer
+
+3. Query your newly created index!
+
+To query the index from command line, use search utility:
+
+   $ cd /usr/local/sphinx/etc
+   $ /usr/local/sphinx/bin/search test
+
+To query the index from your PHP scripts, you need to:
+
+1. Run the search daemon which your script will talk to:
+
+   $ cd /usr/local/sphinx/etc
+   $ /usr/local/sphinx/bin/searchd
+
+2. Run the attached PHP API test script (to ensure that the daemon
+   was successfully started and is ready to serve the queries):
+
+   $ cd sphinx/api
+   $ php test.php test
+
+3. Include the API (it's located in api/sphinxapi.php) into your own
+   scripts and use it.
+
+Happy searching!
+
+3. Indexing
+-----------
+
+3.1. Data sources
+-----------------
+
+The data to be indexed can generally come from very different sources:
+SQL databases, plain text files, HTML files, mailboxes, and so on.
+From Sphinx's point of view, the data it indexes is a set of structured
+documents, each of which has the same set of fields. This is biased
+towards SQL, where each row corresponds to a document, and each column
+to a field.
+
+Depending on what source Sphinx should get the data from, different
+code is required to fetch the data and prepare it for indexing. This
+code is called data source driver (or simply driver or data source for
+brevity).
+
+At the time of this writing, there are drivers for MySQL and
+PostgreSQL databases, which can connect to the database using its
+native C/C++ API, run queries and fetch the data. There's also a
+driver called XMLpipe, which runs a specified command and reads the
+data from its stdout. See Section 3.6, <<XMLpipe data source>>
+for the format description.
+
+There can be as many sources per index as necessary. They will be
+sequentially processed in the very same order which was specified in
+index definition. All the documents coming from those sources will be
+merged as if they were coming from a single source.
+
+3.2. Indexes
+------------
+
+To be able to answer full-text search queries fast, Sphinx needs to
+build a special data structure optimized for such queries from your
+text data. This structure is called index; and the process of building
+index from text is called indexing.
+
+Different index types are well suited for different tasks. For
+example, a disk-based tree-based index would be easy to update (ie.
+insert new documents to existing index), but rather slow to search.
+Therefore, Sphinx architecture allows for different index types to be
+implemented easily.
+
+The only index type which is implemented in Sphinx at the moment is
+designed for maximum indexing and searching speed. This comes at a
+cost of updates being really slow; theoretically, it might be slower
+to update this type of index than to reindex it from scratch.
+However, this very frequently could be worked around with multiple
+indexes, see Section 3.7, <<Live index updates>> for details.
+
+It is planned to implement more index types, including the type which
+would be updateable in real time.
+
+There can be as many indexes per configuration file as necessary.
+indexer utility can reindex either all of them (if --all option is
+specified), or a certain explicitly specified subset. searchd utility
+will serve all the specified indexes, and the clients can specify what
+indexes to search in run time.
+
+3.3. Restrictions on the source data
+------------------------------------
+
+There are a few different restrictions imposed on the source data
+which is going to be indexed by Sphinx, of which the single most
+important one is:
+
+ALL DOCUMENT IDS MUST BE UNIQUE POSITIVE 32-BIT INTEGER NUMBERS.
+
+If this requirement is not met, different bad things can happen. For
+instance, Sphinx can crash with an internal assertion while indexing;
+or produce strange results when searching due to conflicting IDs.
+Also, a 1000-pound gorilla might eventually come out of your display
+and start throwing barrels at you. You've been warned.
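The ID restriction above can be checked before indexing. The following is a minimal Python sketch, not part of Sphinx: the function name check_document_ids is hypothetical, and the upper bound of 2**32 - 1 is an assumption (the text says "positive 32-bit" without stating whether the range is signed or unsigned), so it is exposed as a parameter.

```python
def check_document_ids(ids, max_id=2**32 - 1):
    """Return a list of problems found; an empty list means the IDs look usable.

    max_id is an assumed upper bound for a "positive 32-bit" ID; pass
    2**31 - 1 instead if a signed range is required.
    """
    problems = []
    seen = set()
    for doc_id in ids:
        if not isinstance(doc_id, int) or doc_id <= 0 or doc_id > max_id:
            problems.append(f"out of range: {doc_id!r}")
        elif doc_id in seen:
            problems.append(f"duplicate: {doc_id}")
        seen.add(doc_id)
    return problems


clean = check_document_ids([1, 2, 3])
duplicated = check_document_ids([1, 1])
```

Running such a check against the primary key column of the source table is cheaper than debugging conflicting IDs after the index is built.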
+
+3.4. Charsets, case folding, and translation tables
+---------------------------------------------------
+
+When indexing some index, Sphinx fetches documents from the specified
+sources, splits the text into words, and does case folding so that
+"Abc", "ABC" and "abc" would be treated as the same word (or, to be
+pedantic, term).
+
+To do that properly, Sphinx needs to know
+
+  * what encoding is the source text in;
+  * what characters are letters and what are not;
+  * what letters should be folded to what letters.
+
+This should be configured on a per-index basis using charset_type
+and charset_table options. With charset_type, one would
+specify whether the document encoding is single-byte (SBCS) or UTF-8.
+charset_table would then be used to specify the table which maps
+letter characters to their case folded versions. The characters which
+are not in the table are considered to be non-letters and will be
+treated as word separators when indexing or searching through this
+index.
+
+Note that while default tables do not include space character (ASCII
+code 0x20, Unicode U+0020) as a letter, it's in fact perfectly legal
+to do so. This can be useful, for instance, for indexing tag clouds,
+so that space-separated word sets would index as a single search query
+term.
+
+Default tables currently include English and Russian characters.
+Please do submit your tables for other languages!
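The folding rules described in this section can be illustrated with a small Python sketch. It is not Sphinx's tokenizer and does not use the charset_table syntax; the table here is a plain dictionary covering ASCII letters and digits only, and the names build_table and split_terms are hypothetical.

```python
import string


def build_table():
    """Map every letter character to its case-folded form (ASCII only)."""
    table = {c: c for c in string.ascii_lowercase + string.digits}
    table.update({c: c.lower() for c in string.ascii_uppercase})
    return table


def split_terms(text, table):
    """Fold characters found in the table; treat all others as separators."""
    terms, current = [], []
    for ch in text:
        folded = table.get(ch)
        if folded is None:
            # Not in the table: a non-letter, so it ends the current term.
            if current:
                terms.append("".join(current))
                current = []
        else:
            current.append(folded)
    if current:
        terms.append("".join(current))
    return terms


table = build_table()
folded_terms = split_terms("Abc, ABC abc!", table)

# Adding the space character to the table makes it part of a term,
# which is the tag-cloud case mentioned above.
table_with_space = dict(table)
table_with_space[" "] = " "
tag_terms = split_terms("new york,paris", table_with_space)
```

With the default-style table all three spellings fold to the same term; with the space added, "new york" survives as one term while the comma still separates it from "paris".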
+
+3.5. SQL data sources (MySQL, PostgreSQL)
+-----------------------------------------
+
+With all the SQL drivers, indexing generally works as follows.
+
+  * connection to the database is established;
+  * pre-query (see ???) is executed to perform any necessary initial
+    setup, such as setting per-connection encoding with MySQL;
+  * main query (see ???) is executed and the rows it returns are
+    indexed;
+  * post-query (see ???) is executed to perform any necessary cleanup;
+  * connection to the database is closed;
+  * indexer does the sorting phase (to be pedantic, index-type
+    specific post-processing);
+  * connection to the database is established again;
+  * post-index query (see ???) is executed to perform any necessary
+    final cleanup;
+  * connection to the database is closed again.
+
+Most options, such as database user/host/password, are
+straightforward. However, there are a few subtle things, which are
+discussed in more detail here.
+
+Ranged queries
+--------------
+
+Main query, which needs to fetch all the documents, can impose a read
+lock on the whole table and stall the concurrent queries (eg. INSERTs
+to MyISAM table), waste a lot of memory for result set, etc. To avoid
+this, Sphinx supports so-called ranged queries. With ranged queries,
+Sphinx first fetches min and max document IDs from the table, and then
+substitutes different ID intervals into main query text and runs the
+modified query to fetch another chunk of documents. Here's an example.
+
+Example 1. Ranged query usage example
+
+   # in sphinx.conf
+
+   sql_query_range = SELECT MIN(id),MAX(id) FROM documents
+   sql_range_step = 1000
+   sql_query = SELECT * FROM documents WHERE id>=$start AND id<=$end
+
+If the table contains document IDs from 1 to, say, 2345, then
+sql_query would be run three times:
+
+1. with $start replaced with 1 and $end replaced with 1000;
+2. with $start replaced with 1001 and $end replaced with 2000;
+3. with $start replaced with 2001 and $end replaced with 2345.
+
+Obviously, that's not much of a difference for 2000-row table, but
+when it comes to indexing 10-million-row MyISAM table, ranged queries
+might be of some help.
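The interval arithmetic behind ranged queries can be sketched in Python as follows. This only mirrors the example above and is not the indexer's actual code; the function name ranged_queries is hypothetical, and min_id, max_id and step stand in for the values produced by sql_query_range and sql_range_step.

```python
def ranged_queries(query, min_id, max_id, step):
    """Yield the query once per ID interval, with $start/$end substituted."""
    start = min_id
    while start <= max_id:
        # Each interval covers `step` IDs, except possibly the last one.
        end = min(start + step - 1, max_id)
        yield query.replace("$start", str(start)).replace("$end", str(end))
        start = end + 1


template = "SELECT * FROM documents WHERE id>=$start AND id<=$end"
chunks = list(ranged_queries(template, 1, 2345, 1000))
```

For IDs 1 to 2345 and a step of 1000 this yields three queries, covering 1..1000, 1001..2000 and 2001..2345.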
+
+sql_post vs. sql_post_index
+---------------------------
+
+The difference between post-query and post-index query is in that
+post-query is run immediately when Sphinx has received all the documents,
+but further indexing may still fail for some other reason. On the
+contrary, by the time the post-index query gets executed, it is
+guaranteed that the indexing was successful. Database connection is
+dropped and re-established because sorting phase can be very lengthy
+and would just timeout otherwise.
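The ordering of the four queries, and why the post-index query is the safe place for final cleanup, can be sketched in Python. This is a simplified model of the step list in this section, not the indexer's implementation: run_sql_indexing, run_query and sort_phase are hypothetical names, and connections are represented only as entries in the returned step log.

```python
def run_sql_indexing(run_query, sort_phase):
    """Run the indexing steps in order and return a log of what happened.

    run_query(name) executes one named query; sort_phase() raises on failure.
    """
    steps = ["connect"]
    for name in ("pre-query", "main query", "post-query"):
        run_query(name)
        steps.append(name)
    steps.append("disconnect")

    # Lengthy sorting happens with no database connection held open.
    # If it raises, the post-index query below is never reached.
    sort_phase()
    steps.append("sort")

    steps.append("connect")
    run_query("post-index query")
    steps.append("post-index query")
    steps.append("disconnect")
    return steps


executed = []
steps = run_sql_indexing(executed.append, lambda: None)


def failing_sort():
    raise RuntimeError("sorting failed")


executed_on_failure = []
try:
    run_sql_indexing(executed_on_failure.append, failing_sort)
except RuntimeError:
    pass
```

In the failing run the post-query has already executed but the post-index query has not, which is the distinction the paragraph above draws.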
+
+3.6. XMLpipe data source
+------------------------
+
+XMLpipe data source is designed to enable users to plug data into
+Sphinx without having to implement new data source drivers
+themselves.
+
+To use XMLpipe, configure the data source in your configuration file
+as follows:
+
+   source example_xmlpipe_source
+   {
+       type = xmlpipe
+       xmlpipe_command = perl /www/mysite.com/bin/sphinxpipe.pl
+   }
+
+The indexer will run the command specified in xmlpipe_command, and
+then read, parse and index the data it prints to stdout.
+
+XMLpipe driver expects the data to be in special XML format. Here's
+the example document stream, consisting of two documents:
+
+Example 2. XMLpipe document stream
+
+   <document>
+   <id>123</id>
+   <group>45</group>
+   <timestamp>1132223498</timestamp>
+   <title>test title</title>
+   <body>
+   this is my document body
+   </body>
+   </document>
+
+   <document>
+   <id>124</id>
+   <group>46</group>
+   <timestamp>1132223498</timestamp>
+   <title>another test</title>
+   <body>
+   this is another document
+   </body>
+   </document>
+
+At the moment, the driver is using a custom manually written parser
+which is pretty fast but really strict; so almost all the fields must
+be present, formatted exactly as in this example, and occur exactly in
+this order. The only optional field is timestamp; it's set to 1 if
+it's missing.
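A script named in xmlpipe_command only has to print documents in this layout. The following Python sketch builds one document in the same field order as Example 2; the function name xmlpipe_document is hypothetical, and the sketch does no escaping of markup characters in the title or body, because the text above does not say how the strict parser treats them.

```python
def xmlpipe_document(doc_id, group, timestamp, title, body):
    """Format one document with the fields in the order the driver expects."""
    return (
        "<document>\n"
        f"<id>{doc_id}</id>\n"
        f"<group>{group}</group>\n"
        f"<timestamp>{timestamp}</timestamp>\n"
        f"<title>{title}</title>\n"
        "<body>\n"
        f"{body}\n"
        "</body>\n"
        "</document>\n"
    )


rows = [
    (123, 45, 1132223498, "test title", "this is my document body"),
    (124, 46, 1132223498, "another test", "this is another document"),
]
# Documents are separated by a blank line, as in Example 2.
stream = "\n".join(xmlpipe_document(*row) for row in rows)
```

Writing stream to stdout from the configured command would reproduce the two-document stream shown above.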
+
+3.7. Live index updates
+-----------------------
+
+There's a frequent situation when the total dataset is too big to be
+reindexed from scratch often, but the amount of new records is rather
+small. Example: a forum with 1,000,000 archived posts, but only
+1,000 new posts per day.
+
+In this case, "live" (almost real time) index updates could be
+implemented using so called "main+delta" scheme.
+
+The idea is to set up two sources and two indexes, with one "main"
+index for the data which only changes rarely (if ever), and one
+"delta" for the new documents. In the example above, 1,000,000
+archived posts would go to the main index, and newly inserted 1,000
+posts/day would go to the delta index. Delta index could then be
+reindexed very frequently, and the documents can be made available to
+search in a matter of minutes.
+
+Specifying which documents should go to what index and reindexing main
+index could also be made fully automatic. One option would be to
+make a counter table which would track the ID which would split the
+documents, and update it whenever the main index is reindexed.
+
+Example 3. Fully automated live updates
+
+   # in MySQL
+   CREATE TABLE sph_counter
+   (
+       counter_id INTEGER PRIMARY KEY NOT NULL,
+       max_doc_id INTEGER NOT NULL
+   );
+
+   # in sphinx.conf
+   source main
+   {
+       # ...
+       sql_query_pre = REPLACE INTO sph_counter SELECT 1, MAX(id) FROM documents
+       sql_query = SELECT id, title, body FROM documents \
+           WHERE id<=( SELECT max_doc_id FROM sph_counter WHERE counter_id=1 )
+   }
+
+   source delta : main
+   {
+       sql_query_pre =
+       sql_query = SELECT id, title, body FROM documents \
+           WHERE id>( SELECT max_doc_id FROM sph_counter WHERE counter_id=1 )
+   }
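How the counter table splits documents between the two sources can be tried out with a short Python sketch. It uses an in-memory SQLite database as a stand-in for MySQL (an assumption made only so the sketch is self-contained; SQLite also accepts REPLACE INTO), and it runs the queries from Example 3 by hand instead of through the indexer. The documents table contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, body TEXT);
    CREATE TABLE sph_counter (
        counter_id INTEGER PRIMARY KEY NOT NULL,
        max_doc_id INTEGER NOT NULL
    );
""")
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?)",
    [(i, f"title {i}", f"body {i}") for i in range(1, 6)],
)

MAIN_PRE = "REPLACE INTO sph_counter SELECT 1, MAX(id) FROM documents"
MAIN_QUERY = (
    "SELECT id FROM documents WHERE id<="
    "( SELECT max_doc_id FROM sph_counter WHERE counter_id=1 )"
)
DELTA_QUERY = (
    "SELECT id FROM documents WHERE id>"
    "( SELECT max_doc_id FROM sph_counter WHERE counter_id=1 )"
)

# "Reindex main": the pre-query records the current maximum ID first.
conn.execute(MAIN_PRE)
main_ids = sorted(row[0] for row in conn.execute(MAIN_QUERY))

# New posts arrive after the main index was built.
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?)",
    [(6, "title 6", "body 6"), (7, "title 7", "body 7")],
)

# "Reindex delta": its pre-query is empty, so the counter stays untouched
# and only documents above the recorded ID are selected.
delta_ids = sorted(row[0] for row in conn.execute(DELTA_QUERY))
```

The delta source overrides sql_query_pre with an empty value for exactly this reason: rerunning the REPLACE would move the split point and leave the new documents in neither index until main is rebuilt.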
+
+A. Sphinx revision history
+--------------------------
+
+A.1. Version 0.9.6, 26 jun 2006
+
+  * added boolean queries support (experimental, beta version)
+  * added simple file-based query cache (experimental, beta version)
+  * added storage engine for MySQL 5.0 and 5.1 (experimental, beta
+    version)
+  * added GNU style configure script
+  * added new searchd protocol (all binary, and should be backwards
+    compatible)
+  * added distributed searching support to searchd
+  * added PostgreSQL driver
+  * added excerpts generation
+  * added min_word_len option to index
+  * added max_matches option to searchd, removed hardcoded MAX_MATCHES
+    limit
+  * added initial documentation, and a working example.sql
+  * added support for multiple sources per index
+  * added soundex support
+  * added group ID ranges support
+  * added --stdin command-line option to search utility
+  * added --noprogress option to indexer
+  * added --index option to search
+  * fixed UTF-8 decoder (3-byte codepoints did not work)
+  * fixed PHP API to handle big result sets faster
+  * fixed config parser to handle empty values properly
+  * fixed redundant time(NULL) calls in time-segments mode
+
+--eof--

+ 11 - 2
doc/sphinx.xml

@@ -88,6 +88,13 @@ You should have received a copy of the GNU General Public License
 along with this program; if not, write to the Free Software Foundation, Inc.,
 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA 
 </para>
+<para>
+If you don't want to be bound by GNU GPL terms (for instance,
+if you would like to embed Sphinx in your software, but would not
+like to disclose its source code), please contact
+<link linkend="author">the author</link> to obtain
+a commercial license.
+</para>
 </sect2>
 
 
@@ -102,7 +109,7 @@ Sphinx initial author and current primary developer is:
 <bridgehead>Contributors</bridgehead>
 <para>People who contributed to Sphinx and their contributions (in no particular order) are:
 <itemizedlist>
-<listitem>Robert Bengtsson, initial version of PostgreSQL data source;</listitem>
+<listitem>Robert "coredev" Bengtsson (Sweden), initial version of PostgreSQL data source;</listitem>
 </itemizedlist>
 </para>
 <para>
@@ -650,6 +657,8 @@ source delta : main
 
 
 </sect1>
+
+<!--
 <sect1 id="searching"><title>Searching</title>
 
 
@@ -723,7 +732,7 @@ xmlpipe_command = perl /www/mysite.com/bin/sphinxpipe.pl
 
 
 </sect1>
-
+-->
 
 <appendix id="changelog"><title>Sphinx revision history</title>
 <sect2 id="ver_0_9_6"><title>Version 0.9.6, 26 jun 2006</title>

Some files were not shown because too many files changed in this diff.