FAQ

  1. What is OpenFTS?

    OpenFTS is a PostgreSQL-based full text search engine that provides online indexing and searching capabilities for multilingual documents and relevance ranking of the search results.

  2. Why is OpenFTS available only for PostgreSQL?

    PostgreSQL is a sophisticated true ACID-compliant Object-Relational DBMS, supporting almost all SQL constructs, including subselects, transactions, and user-defined types and functions. It is the most advanced open-source database available anywhere.

    OpenFTS uses PostgreSQL as its database backend, where documents are stored as arrays of integers. These arrays are indexed with an RD-Tree, which is implemented using the GiST interface, and GiST is available only in PostgreSQL.
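
    To make the storage model above more concrete, here is a minimal Python sketch of the general idea, not OpenFTS code. It assumes each distinct word is mapped to an integer id, a document is the sorted array of its unique ids, and an inner index node is keyed by the union of the sets below it (the basic RD-Tree idea). The names Lexicon, union_key and may_contain are hypothetical and exist only for this illustration.

```python
from typing import Dict, FrozenSet, Iterable, List


class Lexicon:
    """Assigns a stable integer id to each distinct word (hypothetical helper)."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}

    def word_id(self, word: str) -> int:
        # New words get the next free id; known words reuse theirs.
        return self._ids.setdefault(word.lower(), len(self._ids) + 1)

    def encode(self, text: str) -> List[int]:
        # A document becomes a sorted array of unique word ids.
        return sorted({self.word_id(w) for w in text.split()})


def union_key(children: Iterable[Iterable[int]]) -> FrozenSet[int]:
    # RD-Tree idea: an inner node's key is the union of its children's sets.
    key = set()
    for child in children:
        key.update(child)
    return frozenset(key)


def may_contain(key: Iterable[int], query_ids: Iterable[int]) -> bool:
    # A subtree (or a single document) can match only if its key
    # holds every id in the query.
    return set(query_ids) <= set(key)


lex = Lexicon()
doc_a = lex.encode("full text search in PostgreSQL")
doc_b = lex.encode("relevance ranking of search results")
node = union_key([doc_a, doc_b])       # key of a node covering both documents
query = lex.encode("search ranking")

print(may_contain(node, query))   # the subtree is worth descending into
print(may_contain(doc_a, query))  # doc_a lacks "ranking"
print(may_contain(doc_b, query))  # doc_b holds both terms
```

    The point of the union key is pruning: if a node's key does not include every query id, no document beneath it can match, so the whole subtree is skipped.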

  3. Can I use OpenFTS on a commercial website?

    Sure! OpenFTS is available under the GNU General Public License, so you are free to use it for both commercial and non-commercial purposes.

  4. Is feature X implemented?

    Check out the list of features that are currently implemented.

  5. What are the practical and theoretical limits of OpenFTS?

    The code itself imposes no real limit on the number of documents. Currently, OpenFTS is used to index the PostgreSQL mailing list archives (fts.postgresql.org), which hold more than 160,000 messages, and a Russian portal web site (www.rambler.ru), which serves more than 500,000 documents with an average length of 1000-2000 words each.

  6. Where is the line drawn between tsearch and OpenFTS?

    Currently, tsearch is the base data type for searching that we use in OpenFTS. It answers which documents contain a query but does not attempt to rank documents by relevance. There can be many different metrics for ranking documents, and the choice should be made by the specific application. The relcov ranking function is just one example (a rather useful one, though) from the IR (information retrieval) world; for many applications, such as news, sorting documents by date and time would be more useful.

    OpenFTS provides an interface to parsers, dictionaries and ranking. Currently, we use the relcov function, which uses coordinate information to judge which document better conforms to the query: in general, a document in which the query terms appear closer together gets a higher rank. There is also a crude implementation that assigns a higher rank to a document with query terms in its title. To rank a collection of HTML documents we need a more sophisticated relevance function that assigns different weights to terms in headings, bold text, etc.
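
    The proximity idea described above can be illustrated with a small Python sketch. This is not the relcov function: it is a simplified stand-in that assumes "coordinate information" means word positions, scores a document by the narrowest window of words covering every query term, and multiplies the score by an arbitrary factor when any query term appears in the title. The function names and the title_boost parameter are hypothetical.

```python
from typing import Dict, Optional, Sequence


def smallest_window(words: Sequence[str], terms: Sequence[str]) -> Optional[int]:
    """Width (in words) of the narrowest span containing every query term."""
    wanted = {t.lower() for t in terms}
    # Positions of all query-term occurrences, in document order.
    hits = [(i, w.lower()) for i, w in enumerate(words) if w.lower() in wanted]
    counts: Dict[str, int] = {}
    best: Optional[int] = None
    left = 0
    for right_pos, right_term in hits:
        counts[right_term] = counts.get(right_term, 0) + 1
        # Shrink from the left while the window still covers every term.
        while len(counts) == len(wanted):
            left_pos, left_term = hits[left]
            width = right_pos - left_pos + 1
            if best is None or width < best:
                best = width
            counts[left_term] -= 1
            if counts[left_term] == 0:
                del counts[left_term]
            left += 1
    return best  # None when some term never occurs


def proximity_rank(body: str, title: str, terms: Sequence[str],
                   title_boost: float = 2.0) -> float:
    """Higher when query terms sit closer together; boosted by a title hit."""
    width = smallest_window(body.split(), terms)
    if width is None:
        return 0.0
    # Number of distinct terms divided by the window width: 1.0 if adjacent.
    score = len({t.lower() for t in terms}) / width
    title_words = {w.lower() for w in title.split()}
    if any(t.lower() in title_words for t in terms):
        score *= title_boost  # crude title bonus, an assumed constant
    return score


near = proximity_rank("full text search engine", "Notes", ["text", "search"])
far = proximity_rank("text of the documents we search", "Notes", ["text", "search"])
titled = proximity_rank("text of the documents we search", "Search tips", ["text", "search"])
print(near, far, titled)
```

    A weighting scheme for HTML documents could follow the same pattern, with a separate multiplier per markup region instead of the single title factor used here.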

  7. Where can I get commercial support?

    You should contact XWare.

  8. My question is not answered here. Where should I go for help?

    The first place to check is the primer. If you still can't find the answer, try posting to the forums or the mailing list, or email your question to Oleg Bartunov, Teodor Sigaev, Dan Wickstrom, or Neophytos Demetriou.