--
Alan Engel
ISTA, Inc.
ConvertedKokai(tm) machine translations of Japanese patents
http://www.intlscience.com
Gregory Aharonian wrote:
> !19991119 Washington Post covers PTO search system scandal
>
> Today's Washington Post has an article on the PTO search system scandal.
> Nice to see the mainstream press covering this issue, which apparently the
> grammar-checking and fact-checking IP press showed little interest in
> covering these last few months - too busy channeling the "truths" coming
> out of the PTO PR office.
>
> Fortunately for all concerned, PTO management is gradually making
> progress fixing what it screwed up in the first place, so by the time anyone
> in Washington gets around to investigating all the money being wasted, the
> search systems will be working well. Anyways, the Post article is attached,
> with some NOTEs from me. Kudos to the Post.
>
> In other matters, I hear that the Patent Reform Bill, attached to the
> Satellite Bill, which is attached to some budget bill, will be voted on
> Saturday at some point, and probably will pass. Personally, I can live
> with it because it will lead to lower quality software patents, more for me
> to bust. Altruistically, how can we trust a PTO management to implement
> reform in light of major screwups such as the search systems? But
> first, a few notes from Japan:
>
> According to the Asahi Shimbun, the Japanese Patent Office is
> planning to simplify the patent agents' exam in order to increase
> the number of agents. Currently, only 3% (yes, that is three percent)
> pass the exam, yielding only 200 new patent agents each year.
>
> Also, on October 1, Japan promulgated its own version of the Bayh-Dole
> Act to stimulate licensing of university inventions.
>
> Greg Aharonian
> Internet Patent News Service
> ====================
>
> THE REGULATORS
> System's Trademark: Have a Slow Day
>
> By Cindy Skrzycki
> Washington Post Staff Writer
> Friday, November 19, 1999; Page E01
>
> This is the computer system from hell.
>
> The Patent and Trademark Office public search room in Crystal City, like
> most libraries, is a subdued place where the diligence of several hundred
> daily users is critical to many companies and inventors pursuing
> innovation and laying claim to something new.
>
> The quiet there has been broken by the recent arrival of a new computer
> system, which is used to search the Patent Office's 6 million patents to
> see whether an idea is really new and worthy of a pedigree from the
> government. Both patent examiners and public users say the new setup,
> supplied by several companies, is a failure that doesn't retrieve
> information, read commands correctly, or even function at times.
> [NOTE: the current system only searches about 2.5 million patents
> full text, not 6 million, though the PTO has plans to bring that
> data online]
>
> "We think it's a stinko system," said Ronald Stern, president of the
> Patent Office Professional Association, the union representing about 3,000
> examiners. "We have serious problems."
>
> "It's an anemic retrieval software," said Emory Damron, who was working in
> the search room. Adds a former examiner sitting a few terminals away: "How
> will you be able to testify to the validity [of a patent] when you know
> there are errors in the system?"
>
> The new system they are grousing about is the Web-based Examiner Search Tool,
> or WEST. Examiners have a fancier retrieval system called the Examiners
> Automated Search Tool, or EAST. These replace the Automated Patent Searching
> System, or APS, which was taken out of commission Oct. 9.
>
> PTO officials said much of the unhappiness has occurred because a key part
> of the new system was delivered late, leaving only two months for the old
> system to run side by side with it. If examiners had had a fallback, many
> of the problems they now face would have been worked out in the meantime.
> [NOTE: PTO officials conveniently forgot to mention to the Post
> reporter that more than a few months before the new systems went
> online, examiners performing Beta testing of the new systems
> complained that the new systems were completely dysfunctional and
> would crash with regular use. Their complaints were ignored by
> PTO management, making this "key part" excuse less than the truth.]
>
> Nevertheless, the joke among users is that WEST stands for "Waste of
> Examiners' Search Time" or "Worthless Examiners' Search Tool." An
> advertisement for training on the WEST system posted by an elevator has
> scrawled across it "Please sign up for worst training" and then "Why bother?"
> [NOTE: current examiner acronym for EAST is Extremely Aggravating
> Search Tool.]
>
> Outside users of the Patent Office library are an assortment of lawyers,
> patent agents, former examiners and others. They are hired by companies,
> inventors and patent applicants to wade through the 33 million paper
> documents that the PTO stores in a 65,000-square-foot room. They spend hours,
> days, months doing patent research, and much of their work is now done on
> computers.
>
> They uniformly said the new system has so many bugs that the Patent Office,
> which is part of the Commerce Department, isn't charging the customary $40
> an hour to use it. "We seem to be beta testers," said Elliot Greenwald, a
> professional patent searcher who does work for companies interested in
> patent information. "Any work done with the new WEST system is in doubt."
>
> Greenwald demonstrated by entering "Procter & Gamble" one way into the
> system. The response was that the multibillion-dollar conglomerate owns a
> mere 19 patents. Entered another way, the total was 4,587. Another user
> said the system was so cumbersome that he had spent the whole day on the
> same 20-word search. One query that took approximately 12 seconds on the
> old system took more than seven minutes to be resolved by WEST.
> [NOTE: While some examiners are reporting decent search times, others
> are still experiencing 1 to 2 hour search times for queries that
> returned within a minute or so on the old systems.]
>
> Patent Office officials said they are not completely happy with the results
> so far but said they are making fixes that will increase ease of use and
> speed of the $7 million system.
> [NOTE: It is imperative that the PTO release ALL of the financial
> records for this debacle. I have heard figures of $100,000,000 and
> $40,000,000, but even if these are exaggerations, it has to be at
> least a few tens of millions, so this $7 million figure is another
> attempt to mislead the public. Don't force me to file a FOIA -
> PTO management should own up to the financial size of the screwup,
> and 'professional' groups like the ABA and AIPLA should demand an
> accounting.]
>
> The pluses of the new system, they said, are that it costs much less to
> operate than the old one and is Y2K-compliant.
> [NOTE: Since the PTO refuses to release accurate statements on the
> cost of the new system, this statement is not credible or believable.
> And it would have been trivial to make the old system Y2K
> compliant - that is a meaningless accomplishment.]
>
> Officials added that the system has unlimited search capacity and has a
> bigger database than its predecessor.
> [NOTE: Again, a meaningless statement. The search capacity of these
> systems is limited by hardware. Additionally, the PTO blew the
> specification of the search capacity of the new system, since it had
> to waste $2,000,000 to buy that new HP server. These officials are
> just making up things to say to the reporter.]
>
> "We've had a few bumps in the road, and we are addressing them," said
> Nicholas Godici, PTO deputy assistant commissioner for patent process
> services. "Part of this is knowing the system and how it works. We've found
> it [the problem] is in the query methodology."
> [NOTE: So Nick thinks having to spend $2,000,000 to fix a flawed
> initial hardware specification is a "bump in the road"? Especially
> when he just sent out a memo freezing the hiring of new examiners for
> lack of money. A cavalier attitude towards spending patent applicants'
> money.]
>
> To help examiners over the bumps, he said supervisors are giving them more
> time to review applications and flexible work hours so they can use the
> system on off-peak hours when it seems to perform better.
> [NOTE: Yes, but this translates in practice into overtime. Are the
> examiners paid for this overtime, or are they having to "volunteer"?]
>
> Stern said problems with EAST and WEST would affect the productivity of
> examiners "big time." He said the likely outcome is that examiners will
> reduce the quality of their work rather than sacrifice efficiency.
> [NOTE: quality reduction - should make the large companies happy.
> I hate to think that software patents will be getting worse - oh
> well, more for me to bust.]
>
> Examiners are so fed up that they sent a petition to Sen. Fred D. Thompson
> (R-Tenn.), chairman of the Senate Governmental Affairs Committee, complaining
> that the PTO's "handling of our automated search systems has been inadequate
> and grossly negligent." "The new system is replete with operational bugs
> and faults, and appears even if it was bug-free is actually a step backward
> in search system capability. Searches using the new system take longer to
> perform, produce fewer results than the systems that they replaced, and
> thus prevent us from doing our jobs," the examiners' letter said.
>
> Public users such as Greenwald and others who have decades of patent
> experience have asked Q. Todd Dickinson, commissioner of patents and
> trademarks, to have a hearing so they can air their complaints.
>
> Their letter, which also went to members of Congress this week, said that
> they "are being presented with an information retrieval system with
> debilitating flaws: not only is the system frequently incapable of
> performing even simple word searches, the results are often woefully
> incomplete and even inaccurate."
>
> James F. Cottone, a patent agent, wrote in a missive to Dickinson and
> several members of Congress: "How bad is WEST? In my over 40 years of
> designing, evaluating and using high-tech electronic systems as a senior
> systems engineer, the WEST system is easily the most stupid system I've
> ever seen."
>
> So much for the "Quiet Please" entreaty in the public search room.
>
> Copyright 1999 The Washington Post Company
>
> ====================
>
> Greg
>
> Some of the examiners are VERY TICKED with the Godici quote:
> "Part of this is knowing the system and how it works. We've
> found it [the problem] is in the query methodology."
> calling it a "BLAME THE VICTIM" response. As one examiner noted and
> then questioned:
> "It [his quote] indicates that the examiners do not know how
> to perform a Text search -- which puts into doubt the validity
> of all patents in which a text search was performed.
>
> Examiners are wondering if the commissioner is going to order
> a reexamination of all patents in which a text search was performed."
>
> It seems that the party line is -- that the problem isn't the search system
> -- the problem is the examiners are too stupid to do the search queries
> right ... and retraining them to do searches "right" will solve everything.
>
> The fault can only lie either in the system itself or in the user ...
> therefore, since the PTO refuses to admit that the system is a screwup, then
> the only place left to place the blame is the users and they are getting
> it bigtime. Publicly and privately - note all the "retraining" and BRS
> search "tips" which take searching methodology two giant steps backward.
>
> On another note, right now, examiners have limited access to STN but soon
> ALL access will be gone -- (PTO claims it costs too much). That means no
> more statistics comparisons that make the PTO look very very bad.
> (STN 4-second searches vs. EAST 4-hour searches -- Congress would see
> STN as a jet airplane and EAST as a horse with two broken legs in comparison.)
>
> ====================
>
> Of course, some examiners are reporting good results with the new search
> systems:
>
> Greg,
>
> Since I have so heartily criticized the CIO, I thought I should let
> you know how the newest WEST server is doing. Assuming that everyone
> else in the PTO isn't slacking off, it is giving 20-50 second response
> times at 11:00 in the morning, which is about the speed of APS and is
> certainly acceptable. While I would wish that they had anticipated
> this problem, rather than fixing it after it occurred, it looks like
> WEST is usable now, which is good.
>
> and
>
> My searches (which I always have tried to do between 6:30 and 9:30
> a.m.) are much faster. As it gets later, the system begins to lag,
> but is still fast enough to use, unlike before. For example, if I
> send in a multiple term truncated search at 11:00, it may take several
> minutes to complete, which is longer than APS. But this is competitive
> enough with USPATFULL (which is also much faster) that I don't feel
> the need to waste PTO money on the vendor search. Previously, it would
> take 30 minutes to an hour, which is so slow that I felt justified in
> spending the money.
>
> to which, when I asked another examiner, I got:
>
> Examiner ran tests today -- search took TWO SECONDS on STN/Messenger
> and over SIX HOURS for equiv. search on EAST, BEFORE it came back with
> truncation overflow error.
>
> Examiners have told me that ONE WORD searches with NO truncation or
> fields or "common words" (like processor or disk), just TERM type
> searches, take longer than APS/STN but generally less than a minute
> (compared to a couple of seconds on STN). However, this is NOT THAT
> relevant -- except for PTO to use for spin.
>
> Because in REAL LIFE - most searches are ---NOT--- one word searches
> -- such as "disk drive", "bus bridge". And the more complicated the
> search, the longer it takes in BRS. And even if an examiner starts out
> with "one word searches" -- eventually they MUST combine the results.
> i.e. searching disk and then searching drive gives you THOUSANDS of
> results -- not plausible. You MUST combine to "disk drive" (disk adj
> drive) to be of ANY value.
>
> Forcing examiners to do each term one by one does NOTHING except
> manipulate the statistics and take the examiner LONGER. "Disk drive" in
> ONE search may take 40 minutes (average of 40 minutes per "search")
> but "disk drive" in THREE searches may take 42 minutes (average of
> 14 minutes per "search") but there is NO IMPROVEMENT in REALITY --
> it actually takes LONGER!!!!
>
> The search I referred to yesterday that took TWO SECONDS on STN and
> over SIX HOURS on BRS before erroring out was a TWO TERM SEARCH asking
> to search term1 in the same paragraph as term2 where term2 has front
> truncation ... In BRSspeak that would be term1 same $term2
>
> The examiner made the note that apparently BRS doesn't have a reverse
> index but just a single index -- meaning that BRS would have to go through
> every "dictionary" entry to see if the entry ends in term2 as opposed to
> having a second index of reversed words and just looking for 2mret$, which is
> MUCH easier and faster.
>
> ====================
>
> There is a discussion group for people who use the BRS search systems
> (what is inside EAST and WEST). Here are some comments, starting off
> with one from a PTO systems guy.
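
The examiner's reverse-index remark quoted above can be made concrete with a small sketch. This is only an illustration of the general technique, not a description of how BRS/Search is built; the function names and the toy word list are made up for the example. A front-truncated term such as $drive means "any word ending in drive". With a single sorted dictionary, every entry has to be tested. With a second index of reversed words, the same query becomes a prefix lookup on the reversed term.

```python
from bisect import bisect_left

def build_reverse_index(words):
    """Sorted list of reversed words: the hypothetical 'second index'."""
    return sorted(w[::-1] for w in set(words))

def suffix_scan(words, suffix):
    """Single-index approach: test every dictionary entry."""
    return sorted(w for w in set(words) if w.endswith(suffix))

def suffix_lookup(reverse_index, suffix):
    """Reverse-index approach: front truncation becomes a prefix range scan."""
    key = suffix[::-1]                      # e.g. 'drive' -> 'evird'
    i = bisect_left(reverse_index, key)     # binary search to the first candidate
    matches = []
    while i < len(reverse_index) and reverse_index[i].startswith(key):
        matches.append(reverse_index[i][::-1])  # un-reverse the stored word
        i += 1
    return sorted(matches)

# Toy dictionary, invented for this example.
dictionary = ["drive", "overdrive", "disk", "driver", "hyperdrive", "bridge"]
reverse_index = build_reverse_index(dictionary)

print(suffix_lookup(reverse_index, "drive"))
```

Both functions return the same words. The difference is that the scan touches every dictionary entry, while the lookup does one binary search and then reads only the matching entries, at the cost of storing and maintaining a second index.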
>
> ====
>
> From: Ian.Neil@USPTO.GOV
> Full Date: Mon, 21 Sep 1998 06:26:44 -0400
>
> Large database? What is considered a large database (1G....ha)? We
> at the United States Patent & Trademark Office are just in our infancy
> with BRS. Our smallest database is 11 Gig, our mid-sized is 56 Gig,
> and our large database is 150+ Gig. The problem we see is our
> mid-sized DB needs reorganized almost daily, we have yet to configure
> our large database. Sleep, what is sleep? Any experience with DBs
> of this size would be much appreciated.
>
> Doesn't sound like PTO management properly planned for the installation and
> maintenance of the new search systems, especially in light of known
> problems about the BRS system:
>
> At 1:49 AM +0000 09/16/1998, Charlie Farrell wrote:
>
> You may already know this - but just in case you've never done
> rebuilds of what BRS considers to be a 'Large database' - then here's
> a word of warning, when you come to reload the database. Make sure
> that you don't try to reload the whole thing in one go. Better to
> reload in several steps using files of decreasing size.
>
> The weakest link(s) in the loader when it comes to reloading large
> databases is that two crucial files have crappy filesize limits
> which could trip you up:
>
> 1) the database dictionary file (512Mb) and
> 2) Database Work file - DBNM.wrk in the database directory (1GB) -
> (at least it was true up to Release 6.2) Haven't had the good
> fortune to use 6.3 yet :-)) .
>
> Throwing too much data at the loader can cause the DBNM.wrk file to
> exceed its limit, leaving you with a useless database and requiring
> you to restart the load - from the beginning using a smaller file.
> 'Adding' too large an additional file can cause so many changes to
> the dictionary file (dict.db) that it exceeds its 512Mb limit - which
> also requires the job to be re-started.
> These problems are a veritable
> pain in the A** and have certainly caused me to lose many days/weeks
> work in the past. (I have all the information saved in Word and
> Excel files in case anyone from Dataware should be interested).
>
> I know I've said this before (to Dataware senior executives) - 3 years
> or more ago, but there's no harm in repeating it again.
>
> <RANT ON> I personally would like to see Dataware making the most of
> the huge filesystems available today. Most of the internal pointers
> used in the software are 32bits or less in length. This might have been
> fine in the early 80's, (when all the juicy bits of the software were
> written), but the capabilities of operating systems have exceeded these
> limits for many years. 64bit processors and 64bit operating systems
> aren't at all unusual and filesystem sizes have grown enormously during
> the 90's, but BRS/Search still obeys file and pointer limits that would
> seem quite advanced for the late 70's/early 80's, but which to me at
> least, (and I don't believe I'm the only one), seem rather antique for
> the 90's.
>
> Sure, increasing all the internal file limits would mean that databases
> would become incompatible with earlier releases, but when you're trying
> to work at the leading edge, with huge volumes of data, one-off database
> reloads are a small price to pay for the ENDLESS hours wasted in
> support, restoring databases and re-running updates. <RANT OFF>
>
> My advice would be that when performing a reload of an already huge
> database, it's best to start with a large file - no more than 1GB, and
> then load in the rest in increasingly smaller chunks, keeping a close
> eye on the database dictionary file and doing, at an absolute minimum,
> database dictionary reorgs after each load. If your file is bigger than
> 1GB, then I suggest that you use the '-skip' and 'use' parameters with
> brsload.
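
The reload advice above (start with a file of no more than 1GB, then load the rest in increasingly smaller chunks, with a dictionary reorg after each load) can be sketched as a simple planning step. This is a rough illustration only: it does not call brsload or use any of its parameters, and the halving ratio and the 64MB floor are assumptions picked for the example, not figures from the post. Only the 1GB ceiling comes from the quoted advice.

```python
GB = 1024 ** 3
MB = 1024 ** 2

def plan_chunks(total_bytes, first_chunk=GB, shrink=0.5, min_chunk=64 * MB):
    """Split total_bytes into chunk sizes that never grow from one load to the next.

    first_chunk: ceiling for the first load (1GB, per the quoted advice).
    shrink, min_chunk: hypothetical tuning knobs, assumed for illustration.
    """
    if total_bytes < 0 or first_chunk <= 0 or min_chunk <= 0:
        raise ValueError("sizes must be positive")
    if not 0 < shrink <= 1:
        raise ValueError("shrink must be in (0, 1]")
    min_chunk = min(min_chunk, first_chunk)  # the floor may not exceed the ceiling
    chunks = []
    remaining = total_bytes
    size = first_chunk
    while remaining > 0:
        take = min(size, remaining)
        chunks.append(take)
        remaining -= take
        size = max(min_chunk, int(size * shrink))  # next load is smaller, down to the floor
    return chunks

# Plan the loads for a 3GB source file (size chosen arbitrarily).
for step, size in enumerate(plan_chunks(3 * GB), start=1):
    print(f"load {step}: {size / MB:.0f} MB, then reorg the dictionary and check its size")
```

The plan only decides how much data goes into each load. Whether a given chunk keeps the work file and dictionary file under their limits still has to be checked after every step, as the post recommends.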