Patent References
Text search system
Method and apparatus to identify the relation of meaning between words in text expressions
Method and system for generating lexicon of cooccurrence relations in natural language
Document retrieval system
Method and apparatus for abstracting concepts from natural language
Document retrieval system using analog signal comparisons for retrieval conditions including relevant keywords
Document identification by characteristics matching
Sense discrimination system and method
Information retrieval based on rank-ordered cumulative query scores calculated from weights of all keywords in an inverted index file for minimizing access to a main database
Iterative technique for phrase query formation and an information retrieval system employing same

Application No.: 148688, filed on 11/05/1993
US Classes: 707/3, Query processing (i.e., searching); 704/9, Natural language; 715/531, Text
Examiners: Primary: Weinhardt, Robert A.; Assistant: Dixon, Jennifer H.
International Class: G06F 017/30

Abstract
This is a procedure for determining text relevancy and can be used to enhance the retrieval of text documents by search queries. This system helps a user intelligently and rapidly locate information found in large textual databases. A first embodiment determines the common meanings between each word in the query and each word in the document. Then an adjustment is made for words in the query that are not in the documents. Further, weights are calculated for both the semantic components in the query and the semantic components in the documents. These weights are multiplied together, and their products are subsequently added to one another to determine a real value number (similarity coefficient) for each document. Finally, the documents are sorted in sequential order according to their real value number from largest to smallest value.
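The scoring and ranking steps of the first embodiment can be illustrated with a minimal Python sketch. It is not the patented implementation: the toy lexicon, the even split of a word's weight across its semantic categories, and the function names are all assumptions made for illustration. The adjustment for query words that are absent from the documents is left out, since the abstract does not say how it is computed.

```python
from collections import Counter

# Hypothetical toy lexicon: word -> semantic categories (illustrative only).
LEXICON = {
    "court": {"law", "sport"},
    "judge": {"law"},
    "trial": {"law", "experiment"},
    "drug": {"medicine"},
    "dose": {"medicine"},
}


def category_weights(text):
    """Spread each word occurrence evenly over its semantic categories."""
    weights = Counter()
    for word in text.lower().split():
        # Assumption: a word missing from the lexicon is its own category.
        categories = LEXICON.get(word, {word})
        for category in categories:
            weights[category] += 1.0 / len(categories)
    return weights


def similarity(query, document):
    """Multiply query and document weights per shared category, then sum."""
    q = category_weights(query)
    d = category_weights(document)
    return sum(q[c] * d[c] for c in q.keys() & d.keys())


def rank(query, documents):
    """Return (score, document) pairs sorted from largest to smallest score."""
    scored = [(similarity(query, doc), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling `rank("trial", ["drug dose", "judge court", "weather"])` places "judge court" first, because it is the only document that shares a semantic category ("law") with the query even though it shares no literal word.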
Another embodiment is for routing documents to topics/headings (sometimes referred to as filtering). Here, the importance of each word in both topics and documents is calculated. Then, the real value number (similarity coefficient) for each document is determined. Then each document is routed, one at a time, to one or more topics according to its real value number. Finally, once the documents are located with their topics, the documents can be sorted. This system can be used to search and route all kinds of document collections, such as collections of legal documents, medical documents, news stories, and patents.
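The routing embodiment can be sketched in the same hedged way. The importance measure (relative word frequency), the `threshold` parameter, and the function names below are assumptions chosen for illustration; the abstract does not specify how importance is calculated or how a document qualifies for a topic.

```python
from collections import Counter, defaultdict


def word_importance(text):
    """Assumed importance measure: relative frequency of each word."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def similarity(topic, document):
    """Sum of products of topic and document importances over shared words."""
    t = word_importance(topic)
    d = word_importance(document)
    return sum(t[w] * d[w] for w in t.keys() & d.keys())


def route(topics, documents, threshold=0.0):
    """Assign each document to every topic whose score exceeds `threshold`."""
    routed = defaultdict(list)
    for doc in documents:  # documents are routed one at a time
        for name, description in topics.items():
            score = similarity(description, doc)
            if score > threshold:
                routed[name].append((score, doc))
    # Once routed, sort the documents under each topic, largest score first.
    return {
        name: sorted(hits, key=lambda hit: hit[0], reverse=True)
        for name, hits in routed.items()
    }
```

A document can land under more than one topic, which matches the "one or more topics" wording; raising `threshold` would make the routing stricter.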