Automatic language identification by stroke geometry analysis

Patent 6064767
Issued on May 16, 2000.
Estimated Expiration Date: January 16, 2018.
The Estimated Expiration Date is calculated from simple USPTO term provisions. It does not account for terminal disclaimers, term adjustments, failure to pay maintenance fees, or other factors that might affect the term of a patent.

Patent References

Methods and apparatus for evolving a starter set of handwriting prototypes into a user-specific set
Patent #: 5319721
Issued on: 06/07/1994
Inventor: Chefalas, et al.

Method and apparatus for cursive script recognition
Patent #: 5442715
Issued on: 08/15/1995
Inventor: Gaborski, et al.

Method and apparatus for automatic character script determination
Patent #: 5444797
Issued on: 08/22/1995
Inventor: Spitz, et al.

Detecting function words without converting a scanned document to character codes
Patent #: 5455871
Issued on: 10/03/1995
Inventor: Bloomberg, et al.

Method and apparatus for supplementing significant portions of a document selected without document image decoding with retrieved information
Patent #: 5748805
Issued on: 05/05/1998
Inventor: Withgott, et al.

Script identification from images using cluster-based templates
Patent #: 5844991
Issued on: 12/01/1998
Inventor: Hochberg, et al.

Inventors

Application

No. 008225 filed on 01/16/1998

US Classes:

382/190, Feature extraction
382/177, Segmenting individual characters or words
382/201, Point features (e.g., spatial coordinate descriptors)
382/224, Classification

Examiners

Primary: Mehta, Bhavesh M.

Attorney, Agent or Firm

International Class

G06K 009/46

Abstract

A computer-implemented process identifies an unknown language used to create a document. A set of training documents is defined in a variety of known languages and formed from a variety of text styles. Black and white electronic pixel images are formed of text material forming the training documents and the document in the unknown language. A plurality of line strokes are defined from the black pixels and point features are extracted from the strokes that are effective to characterize each of the languages. Point features from the unknown language are compared with point features from the known languages to identify one of the known languages that best represents the unknown language.
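
The pipeline in the abstract (binary pixel images, line strokes taken from the black pixels, point features, comparison against known languages) can be illustrated with a minimal sketch. The abstract does not say how strokes are defined, which point features are used, or how features are compared, so every choice below is an assumption made for illustration: strokes are taken to be horizontal runs of black pixels, the feature is a normalized histogram of run lengths, and the comparison is Euclidean distance to each language's mean feature vector. The function names and the toy images are hypothetical.

```python
from collections import Counter
from math import sqrt


def horizontal_strokes(image):
    """Return (row, start_col, length) for each horizontal run of black pixels.

    `image` is a list of equal-length strings; '#' is black, anything else is white.
    Treating a stroke as a horizontal run is an assumption, not the patent's definition.
    """
    strokes = []
    for r, row in enumerate(image):
        start = None
        # A sentinel white pixel closes any run that reaches the right edge.
        for c, px in enumerate(row + "."):
            if px == "#" and start is None:
                start = c
            elif px != "#" and start is not None:
                strokes.append((r, start, c - start))
                start = None
    return strokes


def point_features(image, max_len=4):
    """Normalized histogram of stroke lengths, capped at max_len (hypothetical feature)."""
    counts = Counter(min(length, max_len) for _, _, length in horizontal_strokes(image))
    total = sum(counts.values()) or 1  # avoid dividing by zero on a blank image
    return [counts.get(k, 0) / total for k in range(1, max_len + 1)]


def identify_language(unknown_image, training):
    """Return the known language whose mean feature vector is nearest the unknown's.

    `training` maps a language name to a non-empty list of training images.
    """
    unknown = point_features(unknown_image)
    best_lang, best_dist = None, None
    for lang, images in training.items():
        feats = [point_features(img) for img in images]
        mean = [sum(col) / len(feats) for col in zip(*feats)]
        dist = sqrt(sum((a - b) ** 2 for a, b in zip(unknown, mean)))
        if best_dist is None or dist < best_dist:
            best_lang, best_dist = lang, dist
    return best_lang


# Toy data: "lang_a" uses long strokes, "lang_b" uses isolated dots.
training = {
    "lang_a": [["####.", ".....", "####."]],
    "lang_b": [["#.#.#", ".#.#.", "#.#.#"]],
}
unknown = ["###..", ".....", ".####"]
print(identify_language(unknown, training))  # prints "lang_a"
```

A real system would need far richer features than run lengths and many training documents per language in varied text styles, as the abstract requires; the sketch only shows how the stages connect.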

Other References

  • Nakayama et al, "European Language Determination . . . " Jul. 1993 pp. 159-16