This message is a little on the long side. Recently, In Suk Song and Fred
Herman posted questions asking for assistance in creating patent maps.
Today Dr. Song (I hope that is the proper way to address someone from Korea)
posted the responses received so far. I thought I would take a few moments
to discuss the packages that were mentioned.

The analysis of patent information can mean a number of different things, as
can the concept of patent mapping. In general, patent analysis involves
extracting data from a patent document (or any other type of literature, for
that matter) and analyzing the data by different criteria. The type of map
that is created depends on the question being asked.
From my understanding, this analysis can be divided into two broad categories.
These are data mining (or mapping) and text mining. Data mining involves the
extraction of fielded data and the analysis thereof. An example would be if
someone wanted to examine the relationship between patent assignees and
International Patent Classification codes for a specific area of technology.
Mining or mapping this information can give someone an idea of who the major
players in a technology area are and what type of work they generally focus
on. When using Derwent data, a similar analysis can be done by replacing IPC
codes with Derwent manual codes.
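
As a rough illustration of this kind of fielded-data analysis, here is a
minimal Python sketch that cross-tabulates patent assignees against IPC
codes. The records, the field names ("assignee", "ipc") and the function
names are all invented for the example; none of this comes from the packages
discussed in this message.

```python
from collections import Counter, defaultdict

# Hypothetical fielded records; names and codes are made up for illustration.
records = [
    {"assignee": "Alpha Corp", "ipc": ["A61K", "C07D"]},
    {"assignee": "Alpha Corp", "ipc": ["A61K"]},
    {"assignee": "Beta Ltd", "ipc": ["C07D"]},
    {"assignee": "Beta Ltd", "ipc": ["C07D", "A61P"]},
]


def cross_tabulate(records):
    """Count how many records each assignee has under each IPC code."""
    table = defaultdict(Counter)
    for rec in records:
        # set() ensures a code listed twice on one record is counted once.
        for code in set(rec["ipc"]):
            table[rec["assignee"]][code] += 1
    return table


def format_table(table):
    """Render the counts as a plain-text grid, one row per assignee."""
    codes = sorted({c for counts in table.values() for c in counts})
    lines = ["%-12s" % "assignee" + "".join("%6s" % c for c in codes)]
    for assignee in sorted(table):
        counts = table[assignee]
        lines.append("%-12s" % assignee + "".join("%6d" % counts[c] for c in codes))
    return "\n".join(lines)


table = cross_tabulate(records)
print(format_table(table))
```

Swapping the "ipc" field for a field holding Derwent manual codes would give
the analogous table, assuming the records carry such a field.
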
Text mining or mapping typically involves clustering or categorizing documents
based on the major concepts they contain. The data source is unstructured
text. It is not fielded, and the only structure is whatever the author
applied when writing the document and relating the different concepts within
it. An example of this would be collecting patents from a specific patent
assignee and analyzing the text of these documents. In a cluster map, the
software would extract the major concepts found within the documents and
create clusters of documents that appear to cover the same concept. The
software would then visualize these clusters in some fashion, creating a map.
By looking at the clusters that were created (and subsequently the documents
themselves, but now in an organized way) you can quickly get a general idea
of the concepts that this organization is working on and how they
interrelate.
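
To make the clustering idea concrete, here is a toy Python sketch that groups
short texts by shared vocabulary. It uses plain word counts, cosine
similarity and a greedy single-pass grouping, all of which are assumptions
chosen for brevity. The commercial packages use their own, far more
sophisticated methods, and the sample texts, stopword list and threshold are
invented.

```python
import math
import re
from collections import Counter

# A tiny, hypothetical stopword list; real tools use much larger ones.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def term_vector(text):
    """Turn a text into a bag-of-words count vector, minus stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(texts, threshold=0.3):
    """Greedy single pass: a document joins the first cluster whose seed
    document is similar enough, otherwise it starts a new cluster."""
    clusters = []  # list of (seed_vector, [document indices])
    for i, text in enumerate(texts):
        vec = term_vector(text)
        for seed, members in clusters:
            if cosine(vec, seed) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((vec, [i]))
    return [members for _, members in clusters]


docs = [
    "Polymer coating for drug delivery stent",
    "Biodegradable polymer coating for a stent",
    "Antenna array for wireless signal transmission",
    "Wireless antenna with signal amplifier",
]
print(cluster(docs))
```

Each inner list holds the positions of the documents placed in one cluster. A
visualization step would then draw those groups as a map.
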
Now that some definitions are out of the way, let's look at some of the
packages that were mentioned this morning.

Manning & Napier's MapIT: When someone purchases access to this system they
are given a login id and password for accessing M&N's internet site. Take
care to log in using a secure link to the site. All of the work is done
remotely on M&N's servers, which has both advantages and disadvantages. M&N
have collected patent data from US, EP and PCT applications and granted
patents (the general rules on years covered apply to this system). The first
step in using MapIT is to construct a search query using their natural
language search system. M&N will advise that this query should be as
specific as possible and contain as many synonyms as you can think of (they
suggested using the first claim of a patent, for instance). The system will
retrieve the first 1,000 patents that meet your search criteria. There is
some flexibility in weighting whether your search terms appear in different
areas of the patent full text, but I will not go into that here.

Once you have generated a list of documents you can choose to start reading
the documents or you can apply a couple of different analysis tools to the
set. The cite sort option allows you to do some rudimentary data mining on
the set. This feature will create graphs of the first 100 patents based on
the inventors, patent assignees, USPC class and sub-class. The graphs are
provided as is; the user cannot customize them or look at other data fields.

The other major tool is called IBM clustering and, as the name implies, it
allows you to cluster the documents based on the system developed by IBM
(this is available from them as a stand-alone package called Technology
Watch, which has options for doing both data and text mining). When the
system is finished analyzing the patents it will create a list of clusters
categorizing the documents.
Overall, MapIT is an easy system to use and is a good general tool for patent
mining or mapping. For more advanced users, the lack of customizable features
may be frustrating.

Semio: This is primarily a text mining tool that creates cluster maps based
on a set of documents. Once the system is installed it is fairly easy to
create a map and post it to an intranet site so that a number of people can
share the information. A standard web browser is used to look at the maps,
and after a short introduction to how the maps work a user can quickly and
easily start using the system. One large drawback is that for Semio to work
most effectively, an individual document must be created for each reference.
For example, if you were downloading data from Derwent for analysis, you
would have to create a separate document for each Derwent record. Otherwise,
when you saw a concept you were interested in and wanted to look at the
documents in that cluster, the system would return the entire online record.
In other words, the system does not contain a feature for importing online
data and parsing it into separate records for analysis.
Overall, Semio is one of the more attractive visualization packages out there
for doing concept mapping (text mining).
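
The one-document-per-record preparation described above for Semio can be
scripted outside the tool. The Python sketch below splits a multi-record
download into separate text files. It assumes records are separated by a line
containing only "----". That delimiter, the output file names and the sample
field tags are hypothetical and are not the actual Derwent or Semio formats,
so the delimiter would need adjusting to match the real download.

```python
import tempfile
from pathlib import Path


def split_records(download_text, out_dir, delimiter="----"):
    """Write each record of a multi-record download to its own text file.

    A record ends at a line consisting solely of `delimiter` (an assumed
    convention). Returns the list of files written, in record order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    records, current = [], []
    for line in download_text.splitlines():
        if line.strip() == delimiter:
            records.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    records.append("\n".join(current).strip())  # text after the last delimiter

    paths = []
    # Skip empty chunks, e.g. from a trailing or doubled delimiter line.
    for n, record in enumerate((r for r in records if r), start=1):
        path = out_dir / f"record_{n:04d}.txt"
        path.write_text(record + "\n", encoding="utf-8")
        paths.append(path)
    return paths


# Invented sample download with two records.
sample = "AN 1999-000001\nTI First record\n----\nAN 1999-000002\nTI Second record\n"

with tempfile.TemporaryDirectory() as tmp:
    paths = split_records(sample, tmp)
    names = [p.name for p in paths]
    first = paths[0].read_text(encoding="utf-8")

print(names)
```

The resulting directory of single-record files could then be handed to the
clustering tool, so that opening a cluster shows individual records instead
of the whole download.
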
Aurigin's IPAM system: IPAM stands for Intellectual Property Asset Management
and as the name implies this system allows you to organize and manage
intellectual property (not just patents, but corporate documents as well).
The system contains tools for patent analysis as well since this is an
integral part of smart IP management. While a very interesting system,
Aurigin is a big ticket item. There are substantial costs involved in
purchasing a server to run the system and setting it up to work within an
organization. It offers a great deal of power, flexibility and security
(since it is located behind your company's firewall) but it is not trivial to
get established.

IPAM is an integrator system, meaning that Aurigin has built a platform
flexible enough to allow a number of third party applications to work within
its framework. Aurigin invited some of the best third party analysis tool
companies to partner with them and integrate their systems with Aurigin. They
have incorporated both text and data mining tools into the system and set
them up so that they all work together seamlessly.

The patent data is taken from US, EP and PCT documents (same basic rules apply
for coverage) and they also have a method for searching these references and
creating sets that can be further analyzed. Another nice feature is that,
since Aurigin began life as SmartPatents, you can have all of the annotation
and viewing capabilities of SmartPatents accessible through the system (for
an additional charge, of course, to purchase the SmartPatents of interest).
One of the key strengths of the IPAM system is the ability for individuals
within an organization to create sets of patents, analyze them, annotate
them and generally create intelligence from them, and to save all of this
knowledge in a single place where it can be preserved for the company.
Overall, this is a nice system but a big investment.

There are a number of other tools available, but due to the length of this
message I think I will save that discussion for another time.

Questions or comments are welcome. You can respond to the list or contact me
by e-mail at trippe@go-concepts.com.

Thanks,
Tony Trippe
trippe@go-concepts.com

in suk song wrote:
>
> Does anybody know the software which analyzes patent informations?
> (as likely rank or sort function in dialog)
> It seems to be useful making a patent map.
>
> I would like to know which one is available if ever.
> Any help will be greatly appreciated.
> --
> in suk song
>
> Associate Scientist
>
> SamYang Medical Research R&D Center
> 63-2,Hwaam,Yusung,Taejeon,South Korea
> zip:305-348
> tel.82.42.865.8258
> fax.82.42.865.8299
> email: nata@samyang.co.kr