Patent References
Method and apparatus to identify the relation of meaning between words in text expressions
Method and system for generating lexicon of cooccurrence relations in natural language
Document retrieval system
Method and apparatus for abstracting concepts from natural language
Patent #: 5056021

Application No. 520027 filed on 08/28/1995

US Classes:
707/3, Query processing (i.e., searching)
704/9, Natural language
707/4, Query formulation, input preparation, or translation
707/5, Query augmenting and refining (e.g., inexact access)
715/531, Text

Examiners
Primary: Black, Thomas G.
Assistant: Lewis, C.

International Class
G06F 017/30

Abstract
This is a procedure for determining text relevancy that can be used to enhance the retrieval of text documents by search queries. The system helps a user intelligently and rapidly locate information in large textual databases. A first embodiment determines the common meanings between each word in the query and each word in the document. An adjustment is then made for words in the query that are not in the documents. Next, weights are calculated for both the semantic components in the query and the semantic components in the documents. These weights are multiplied together, and the products are added to give a real value number (similarity coefficient) for each document. Finally, the documents are sorted by their real value numbers, from largest to smallest. Another embodiment routes documents to topics/headings (sometimes referred to as filtering). Here, the importance of each word in both the topics and the documents is calculated. Then the real value number (similarity coefficient) for each document is determined. The documents are then routed, one at a time, to one or more topics according to their respective real value numbers. Finally, once the documents are located with their topics, they can be sorted.
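The two embodiments described above can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not say how semantic components or their weights are derived, so the `LEXICON` table, the function names, and the `threshold` parameter below are all hypothetical, and the adjustment for query words missing from the documents is left out.

```python
from collections import defaultdict

# Hypothetical toy lexicon: word -> {semantic category: share of the word's meaning}.
# The abstract does not specify a lexicon; these entries are invented for illustration.
LEXICON = {
    "court":   {"law": 0.5, "sport": 0.5},
    "judge":   {"law": 1.0},
    "verdict": {"law": 1.0},
    "doctor":  {"medicine": 1.0},
    "patient": {"medicine": 1.0},
    "tennis":  {"sport": 1.0},
}


def category_weights(text):
    """Sum each semantic category's share over the known words in `text`."""
    weights = defaultdict(float)
    for word in text.lower().split():
        for category, share in LEXICON.get(word, {}).items():
            weights[category] += share
    return dict(weights)


def similarity(query, document):
    """Multiply query and document weights per shared category, then add the products."""
    q = category_weights(query)
    d = category_weights(document)
    return sum(q[c] * d[c] for c in q if c in d)


def rank(query, documents):
    """Return (score, document) pairs sorted from largest to smallest score."""
    scored = [(similarity(query, doc), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def route(topics, documents, threshold=0.0):
    """Assign each document, one at a time, to every topic it scores above `threshold`."""
    routed = {topic: [] for topic in topics}
    for doc in documents:
        for topic in topics:
            score = similarity(topic, doc)
            if score > threshold:
                routed[topic].append((score, doc))
    # Within each topic, sort the routed documents by score, largest first.
    for topic in routed:
        routed[topic].sort(key=lambda pair: pair[0], reverse=True)
    return routed
```

Calling `rank("court judge", docs)` on a small list of strings returns the documents ordered by similarity coefficient, and `route(["judge verdict", "doctor patient"], docs)` groups the same documents under each topic they match. A word with several meanings, such as "court", contributes to more than one category, which is one simple way to model shared meanings between query and document words.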
This system can be used to search and route all kinds of document collections, such as collections of legal documents, medical documents, news stories, and patents.