Patent References
System for retrieving information from a plurality of remote databases having at least two different languages
Distribution mailing system having a control database for storing mail handling categories common to the databases of selected mailer stations
Interactive database query system and method for prohibiting the selection of semantically incorrect query parameters
System and method for query optimization using quantile values of a large unordered data set
Database system with methods for optimizing query performance with a buffer manager
System and method for query optimization using quantile values of a large unordered data set
Method for facilitating world wide web searches utilizing a document distribution fusion strategy

Patent #: 5864846
Application No.: 979109, filed on 11/26/1997
US Classes:
707/2, Access augmentation or optimizing
706/47, Ruled-based reasoning system
706/59, Creation or modification
707/10, Distributed or remote access
Examiners:
Primary: Black, Thomas G.
Assistant: Coby, Frantz
International Class: G06F 017/30

Abstract
In an information retrieval system, an automated system optimizes selection of sources in a distributed information system for query searching. A training set of documents is created for each source by randomly selecting significant portions of the documents thereof. A test set of documents is created for each source from the documents not included in the training set. Each document in the training and test sets is defined in terms of features/attributes and a name, as samples representing individual sources. Pattern recognizing means process the samples to recognize patterns in the documents that distinguish one source from another. Rule generating means provide a set of DNF rules from the patterns as a model representing each source. The test set of documents is expressed in terms of DNF rules. Evaluating means create a final classification model after minimizing any error between the DNF rules for the training and test sets. Query means enable a user to express a query in terms of features/attributes and DNF rules which, when applied to the final model, automatically select the optimal sources for query searching. The sources may also be expressed in taxonomic groupings, which reduces the number of data sources and speeds query searching on a distributed information network by a user.

Other References