Aurigin Claim (was Cited references)

From: Alan (aengel@intlscience.com)
Date: Tue Mar 28 2000 - 09:26:23 EST


I found US 5991780 quite interesting, particularly in light of two
documents relating to the JPO's CD-ROM.

Doc 1) Nihon Joho Shori Kaihatsu Kyokai, "CD-ROM Kokai Koho yo
Kikirui no Kaihatsu Shishinsho (Specifications for Equipment for
CD-ROM Publications of Patent Applications)", October 31, 1991.
(in Japanese)

Doc 2) The Japanese Patent Office, "Specifications of CD-ROM
Publication of Patent Applications," Version 1.0, February 1992
(as corrected March 1992)

Alan wrote:

> US5991780:
>
> 1. A computer controlled display system including at least one central
> processing unit (CPU), said CPU coupled to a display for displaying a
> patent document and a patent image on said display, comprising:

Doc 1, pp.1-2 lists the minimum equipment configuration for a computer
system to search and display patent documents on the CD-ROM, namely,
personal computer (CPU80286), display (resolution: 640x400), mouse,
keyboard, hard disk, CD-ROM drive, page printer.

>
>
> means for preparing for display at least one patent document comprised
> of at least one patent text file, and at least one patent image document
> comprised of at least one patent image file, said at least one patent
> text file generated from at least one source text file, said patent text
> file including equivalency information detailing an at least partial
> equivalency relationship between said at least one patent image file and
> said at least one source text file, said equivalency information
> comprising linking information indicative of a correspondence between at
> least one portion of said at least one source text file and at least one
> portion of said at least one patent image file, said equivalency
> information also comprising one or more of (A)-(M):
>

Doc 2, p.6 describes three files for each patent document. To quote:

"(14) Text file of a document [yynnnnnn.TXT;1]"

"This file records character data of one document and is named as
document identification number with a file name extension "TXT".
(document indentification number is explained in (11))
The first two digits of a file name ("yy" in [ ]) are the last two digits
of document publishing year and last six digits of the file name
("nnnnnn" in [ ]) are the last six digits of a document number."

"(15) Image file of a document [yynnnnnn.IMG;1]
This file records image data of one document and is named as
document identification number with a file extension "IMG".
(document indentification number is explained in (11))"

"(16) Position file of a document [yynnnnnn.POS;1]
This file contains pointers which indicate where main SGML tags
are recorded in a text file. This file is named as document identification
number with a file name extension "POS".
(document indentification number is explained in (11))"

(end quote)

The image files are TIFF revision 5.0 (Doc 2, p.57).
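
The naming rule quoted above is mechanical enough to check in a few lines. What follows is only a minimal sketch in Python, not anything taken from Doc 2: the names parse_doc_file and DocFile are hypothetical, and I am assuming the ";1" version suffix is optional and that case does not matter.

```python
import re
from typing import NamedTuple

# Pattern for yynnnnnn.TXT / .IMG / .POS as quoted from Doc 2, p.6.
# Assumption: the ISO 9660 style ";1" version suffix may be absent.
FILE_NAME = re.compile(r"^(\d{2})(\d{6})\.(TXT|IMG|POS)(?:;\d+)?$", re.IGNORECASE)


class DocFile(NamedTuple):
    year_suffix: str    # last two digits of the publishing year ("yy")
    number_suffix: str  # last six digits of the document number ("nnnnnn")
    kind: str           # "TXT", "IMG" or "POS"


def parse_doc_file(name: str) -> DocFile:
    """Split a CD-ROM file name into the parts described in Doc 2."""
    m = FILE_NAME.match(name)
    if m is None:
        raise ValueError(f"not a document file name: {name!r}")
    return DocFile(m.group(1), m.group(2), m.group(3).upper())


if __name__ == "__main__":
    print(parse_doc_file("91089911.txt"))
```

The point of the sketch is that the text, image and position files of one document share the same eight-digit stem, so a viewer can find the image file for a text file from the name alone.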

>
> (A) special character information specifying at least one mapping of a
> group of characters in said at least one source text file to at least
> one special character in said at least one patent image file,

Positions of images in the text file are specified by the following
SGML tags, as explained in Doc 2, p.63 under "Correspondence of
SGML tags and TIFF tags":

<RTI ID=000001 ... > and
<EMI ID=000003 ... >

The ID numbers are the numbers of the images in the image file.

>
>
> (B) item location information identifying locations in said patent image
> file of items referred to or contained in said source text file, said
> items including any combination of figures, drawing sheets, figure
> elements, equations, non-text tables, structures, diagrams, and text
> objects,

>
>
> (C) formatting information representing at least an approximate
> arrangement of at least some bibliographic data from said at least one
> source text file as represented in said at least one patent image file,

The RTI tag above is used to indicate the image in the image file that
contains bibliographic information. In displaying the patent, this image
normally substitutes for a subdocument in the text file that contains
the same information.

>
>
> (D) column information representing at least an approximate arrangement
> of text in columns of said at least one patent image file,

Doc 2, p.45 specifies tags for layout structure, namely:

<DP N=nnnn> Page number

<TXF FR=nnnn HE=nnn WI=nnn LX=nnnn LY=nnnn> Text Frame
where FR is the frame number, HE is the height, WI is the width, LX
is the position in the X axis and LY is the position in the Y axis

<EMI ID=nnnnnn HE=nnn WI=nnn LX=nnnn LY=nnnn> Embedded Image
where ID is the number of the image in the image file. This is specified
for chemical formulas, mathematical formulas, tables and drawings.

<RTI ID=nnnnnn HE=nnn WI=nnn LX=nnnn LY=nnnn> Replacement
of Text by Image
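
To show how these tags carry the linking and layout information, here is a rough Python sketch that pulls the tags and their numeric attributes out of a text file. It is an illustration under assumptions, not the JPO's parser: the function name layout_tags is hypothetical, I assume the attributes are unquoted decimal digits as in the templates above, and the sample values are made up.

```python
import re

# Layout tags listed in Doc 2, p.45. Assumption: attributes are unquoted
# NAME=digits pairs separated by spaces, as in the templates quoted above.
TAG = re.compile(r"<(DP|TXF|EMI|RTI)\b([^>]*)>")
ATTR = re.compile(r"([A-Z]+)=(\d+)")


def layout_tags(text):
    """Return a list of (tag name, {attribute: int value}) in document order."""
    found = []
    for m in TAG.finditer(text):
        attrs = {name: int(value) for name, value in ATTR.findall(m.group(2))}
        found.append((m.group(1), attrs))
    return found


if __name__ == "__main__":
    # Invented sample; the numbers are not from any real document.
    sample = (
        "<DP N=0001>"
        "<TXF FR=0001 HE=250 WI=170 LX=0200 LY=0300>"
        "<EMI ID=000003 HE=040 WI=080 LX=0250 LY=1200>"
    )
    for name, attrs in layout_tags(sample):
        print(name, attrs)
```

For an EMI or RTI tag, the ID value is the number of the image in the .IMG file, which is the correspondence between a portion of the text file and a portion of the image file.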

>
>
> (E) line information representing at least an approximate arrangement of
> text in lines of said at least one patent image file,
>
> (F) column line number information representing approximate line numbers
> of lines of text,
>
> (G) section information representing at least approximate positions of
> patent sections,

Doc 2, p.45 specifies the following tags in the text file for logical
structure:
<SDO BIJ> Bibliographic subdocument
<SDO ABJ> Abstract subdocument
<SDO CLJ> Claims subdocument
<SDO DEJ> Description subdocument
<SDO EDJ> Explanation of drawings subdocument
<SDO DRJ> Drawing subdocument
<SDO OFJ> Overflow part of bibliographic data from front page
<SDO AMJ> Amendment subdocument
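
As an illustration of section information, the sketch below scans a text file for the SDO tags and records where each subdocument starts. The function name section_offsets is hypothetical, the offsets are character offsets in a Python string, and I am not claiming this reproduces the format of the .POS file, which Doc 2 describes only as pointers to the main SGML tags.

```python
import re

# Subdocument codes from Doc 2, p.45.
SUBDOCS = {
    "BIJ": "Bibliographic subdocument",
    "ABJ": "Abstract subdocument",
    "CLJ": "Claims subdocument",
    "DEJ": "Description subdocument",
    "EDJ": "Explanation of drawings subdocument",
    "DRJ": "Drawing subdocument",
    "OFJ": "Overflow part of bibliographic data from front page",
    "AMJ": "Amendment subdocument",
}

# Assumption: the tag appears literally as "<SDO XXJ>" in the text file.
SDO = re.compile(r"<SDO\s+([A-Z]{3})\b[^>]*>")


def section_offsets(text):
    """Return (offset, code, description) for every SDO tag, in order."""
    return [
        (m.start(), m.group(1), SUBDOCS.get(m.group(1), "unknown"))
        for m in SDO.finditer(text)
    ]


if __name__ == "__main__":
    # Invented sample text, not from a real publication.
    sample = "<SDO BIJ>front page data<SDO ABJ>abstract text<SDO CLJ>claim 1"
    for offset, code, description in section_offsets(sample):
        print(offset, code, description)
```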

>
>
> (H) font information representing font styles of characters of said at
> least one source text file as represented in said at least one patent
> image file,

Doc 2, p.46 specifies the following tag in the text file for font style
<U></U> Underscore

>
>
> (I) font size information representing font sizes of characters of said
> at least one source text file as represented in said at least one patent
> image file,

Doc 2, p.46 specifies the following tags in the text file for font size
<BAI></BAI> Double width
<HAN></HAN> Half width

>
>
> (J) superscript information indicating characters in said at least one
> source text file that are represented using superscripts in said at
> least one patent image file,

Doc 2, p.46 specifies the following tag in the text file for superscript
<SP></SP> Superscript

>
>
> (K) subscript information indicating characters in said at least one
> source text file that are represented using subscripts in said at least
> one patent image file,

Doc 2, p.46 specifies the following tag in the text file for subscript
<SB></SB> Subscript

>
>
> (L) bold attribute information indicating characters in said at least
> one source text file that are bolded in said at least one patent image
> file, and
>
> (M) italicized attribute information indicating characters in said at
> least one source text file that are italicized in said at least one
> patent image file,
>
> said at least one patent image file being at least one data file having
> stored therein one or more image pages associated with a patent, each of
> said image pages being an electronic image of at least a portion of a
> page of said patent or at least a portion of a page of a document
> related to said patent, said at least one source text file being at
> least one data file having stored therein text data representing at
> least a portion of textual data in said patent; and

In April 1992, the JPO produced a sample CD-ROM for testing by
developers. I received two copies in May 1992 and was then informed
that CAS had also received copies. Here is a directory of files for ten
patent documents:

91089911.img 1992/04/23
91089911.pos 1992/04/23
91089911.txt 1992/04/23
91089912.img 1992/04/23
91089912.pos 1992/04/23
91089912.txt 1992/04/23
91089913.img 1992/04/23
91089913.pos 1992/04/23
91089913.txt 1992/04/23
91089914.img 1992/04/23
91089914.pos 1992/04/23
91089914.txt 1992/04/23
91089915.img 1992/04/23
91089915.pos 1992/04/23
91089915.txt 1992/04/23
91089916.img 1992/04/23
91089916.pos 1992/04/23
91089916.txt 1992/04/23
91089917.img 1992/04/23
91089917.pos 1992/04/23
91089917.txt 1992/04/23
91089918.img 1992/04/23
91089918.pos 1992/04/23
91089918.txt 1992/04/23
91089919.img 1992/04/23
91089919.pos 1992/04/23
91089919.txt 1992/04/23
91089920.img 1992/04/23
91089920.pos 1992/04/23
91089920.txt 1992/04/23
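
The listing can be checked for completeness, that is, every document having its .img, .pos and .txt file, with a few lines of Python. This is just a sketch: the helper names group_by_document and incomplete are hypothetical, and the file names are rebuilt from the listing above rather than read from the disc.

```python
from collections import defaultdict

REQUIRED = frozenset({"txt", "img", "pos"})


def group_by_document(names):
    """Map each eight-digit document stem to the set of extensions present."""
    groups = defaultdict(set)
    for name in names:
        stem, _, ext = name.rpartition(".")
        groups[stem].add(ext.lower())
    return dict(groups)


def incomplete(groups, required=REQUIRED):
    """Return the stems that lack one or more of the required files."""
    return sorted(stem for stem, exts in groups.items() if not required <= exts)


if __name__ == "__main__":
    # Rebuild the thirty names shown in the directory listing above.
    names = [
        f"91{number:06d}.{ext}"
        for number in range(89911, 89921)
        for ext in ("img", "pos", "txt")
    ]
    groups = group_by_document(names)
    print(len(groups), "documents; incomplete:", incomplete(groups))
```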

>
> a user interface generated by said CPU for display on said display, said
> user interface selectively displaying said patent text file and said
> patent image file on said display, such that at least a portion of said
> at least one patent text file is displayed in a first window and at
> least a portion of said at least one patent image file is displayed in a
> second window and said windows may be selectively viewed simultaneously
> or individually on said display.
>

Toshiba sales literature dated March 1993 shows a SPARCstation-based
system for searching and displaying patent documents from
the JPO CD-ROM. Page 2 shows a screen with three separate windows
showing the bibliographic data, abstract and front page image,
respectively. (Note that, on the CD, the abstract exists only
in the text file.)

A couple of other notes:

1) Details on the JPO CD-ROM system and related hardware
are in the USPTO because of interoffice cooperation.

2) In May 1997 someone from SmartPatents called me to
demand access to our Japanese patent collection (or else...).
I am somewhat surprised that even though SmartPatents
apparently knew about the JPO CD-ROM and its contents,
nothing referring to the JPO system appears in '780.

--
---------------------------------------------
ISTA's office hours are 9:00 am to 5:00 pm
Eastern Time, TUESDAY through SATURDAY.  The
Paterra(tm) Internet Service is available 24x7
with occasional downtime on weekends.
---------------------------------------------
Alan Engel, ISTA, Inc.
http://www.intlscience.com
ConvertedKokai(tm) machine translations of Japanese patents
Paterra(tm) Instant MT(tm) for Japanese patents www.paterra.com


