Report:Delphion/Viewing Results/Analyzing Results

From Intellogist

This search system report was created by the Intellogist Team and is available for viewing only.
DWPI on Delphion is no longer available, as of March 31, 2012. DWPI data is available on the Thomson Innovation platform.

Analyzing Results

Working with Results Sets

Delphion offers two data analysis functions for the temporary manipulation of results sets: Snapshot, which provides a statistical analysis of various bibliographic data fields within a document set, and Clustering, a more unusual tool that performs linguistic (extracted keyword) analysis to show relationships among patent documents. These two features are described in more detail in the sections that follow.

Snapshot

The Snapshot feature summarizes statistical data about a results set, which can be either an unsaved hit list or a saved work file. The tool reports number and percentage occurrence statistics for the dataset, along with bar charts that show the same information graphically. It is accessed from the “Snapshot” tab at the top of the screen when viewing a hit list or work file. The analysis can be performed on either the first 500 hits or up to 20,000 results from the hit list.


The Snapshot tab.


Upon opening the Snapshot tab, default selections appear in the tab menu. The default settings generate a four-way split summary window covering Assignee, Inventor, Publication Year, and IPC-7 class. After the report is run, the top data values (by occurrence) are listed along with the actual number of occurrences, the percentage of the total dataset, and a bar showing the quantity relative to the other rows in the chart.

After the user makes selections, clicking the large “Summarize” button runs the analysis. Below is an example of the system’s output when the default settings remain selected; the Assignee results and part of the Inventor results are visible. A bar appears to the right of each line, showing the relative frequency of occurrence for each data point.


An example of Snapshot statistical analysis results. The Assignee and part of the Inventor analysis results are shown. A bar chart graphically shows the relative frequency of each data point.


After running the Snapshot, the user can set the “minimum number of occurrences” and “maximum rows shown.” The default values are minimum number of occurrences = 2 and maximum rows = 10. This means that a data point is included in the charts only if it occurs 2 or more times in the dataset, and that only the top 10 assignees, inventors, classifications, etc. are shown in the charts.
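The effect of the two thresholds can be illustrated with a minimal sketch. This is not Delphion's implementation: the function name `summarize_field`, its parameters, and the sample assignee names are all hypothetical, and the sketch assumes each record contributes exactly one value for the field being summarized.

```python
from collections import Counter


def summarize_field(values, min_occurrences=2, max_rows=10):
    """Return (value, count, percent_of_dataset) rows, most frequent first.

    Hypothetical helper: mirrors the described defaults (minimum
    occurrences = 2, maximum rows = 10), not Delphion's actual code.
    """
    counts = Counter(values)
    total = len(values)
    rows = [
        (value, count, round(100.0 * count / total, 1))
        for value, count in counts.most_common()
        if count >= min_occurrences  # drop values below the minimum
    ]
    return rows[:max_rows]  # keep only the top rows


# Made-up assignee values, one per record in the results set.
assignees = ["Acme", "Acme", "Acme", "Globex", "Globex", "Initech"]
rows = summarize_field(assignees)
for value, count, percent in rows:
    print(f"{value:10} {count:3} {percent:5.1f}%")
```

With the defaults, "Initech" is omitted because it occurs only once; lowering `min_occurrences` to 1 would bring it back into the table.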

Individual data fields that can be summarized using the Snapshot feature include:

Default 4 (Assignee, Inventor, Publication Year, IPC-7)
Assignee
Assignee City
Assignee State
Assignee Country
Designated Country
Application Year
Application Year/Month
Attorney
Inventor
Inventor City
Inventor State
Inventor Country
IPC-R Code – 4 Digit
IPC-R Code – full
IPC 1-7 Code – 4 Digit
IPC 1-7 Code – full
Publication Year
Publication Year/Month
Priority Year
Priority Year/Month
US Assignee Code
US Class – 3 Digit
US Class – full
US Examiner
US Maint. Status
US References – all
US Forward Refs – all
Unified Company
Parent Company
Ultimate Company
Derwent Assignee Code
Derwent Inventor
Derwent Class – main
Derwent Class – all
Derwent Manual Code
Derwent Update

Once the Snapshot analysis is run, the user can “drill down” into the data by choosing only the data points of most interest (for example, ticking the checkboxes for the top three company names). The program then selects that subset of the data and displays only those records for review under the Current Results tab. The new subset can be saved to a work file for later review, or it can go through a second round of manipulation (e.g. more snapshots, clustering, data extract, PDF/file history order, etc.).
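Conceptually, the drill-down step is a filter on the results set. The sketch below is only an illustration under assumed names: the record layout (dictionaries with `id` and `assignee` keys), the function `drill_down`, and the sample data are hypothetical and do not reflect Delphion's internal data model.

```python
def drill_down(records, field, selected_values):
    """Keep only the records whose `field` value was ticked by the user.

    Hypothetical helper for illustration; record keys are assumptions.
    """
    selected = set(selected_values)
    return [record for record in records if record.get(field) in selected]


# Made-up results set with placeholder document identifiers.
results = [
    {"id": "DOC-1", "assignee": "Acme"},
    {"id": "DOC-2", "assignee": "Globex"},
    {"id": "DOC-3", "assignee": "Initech"},
    {"id": "DOC-4", "assignee": "Acme"},
]

# Equivalent of ticking the "Acme" and "Globex" checkboxes.
subset = drill_down(results, "assignee", ["Acme", "Globex"])
print([record["id"] for record in subset])
```

The returned subset is itself a list of records, so it can be summarized or filtered again, which corresponds to the second round of manipulation described above.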


Editor's Note:

In Snapshot, users can choose from some unusual data fields for statistical analysis, such as “assignee city.” The tool is notable for the wide range of data fields it makes available for analysis. In contrast, some competitors restrict statistical analysis tools to major data fields like assignee, inventor, and classifications.


Clustering

Clustering is a keyword analysis feature intended to organize documents into related “clusters” based on extracted keywords from document titles and abstracts. It can be performed on either the first 500 hits, or up to 20,000 references in a dataset. Clustering analysis is performed from the “Clustering” tab at the hit list (or work file) view.


The Clustering tab.


Once clusters have been calculated, a hyperlinked list appears showing the number of occurrences for each group of keywords.


A list of groups, based on common keyword extraction.


Clustering works by assigning each document from the results set to “one and only one cluster.” Each cluster is defined by the shared keywords that “characterize the cluster” and by the number of patent documents it contains. As seen in the figure above, a cluster is based on keywords that do not necessarily have any relation to the keyword terms in the search string. Clicking any hyperlinked group number in the cluster list displays a list of the patent numbers in that group. In principle, “drilling down” into this group and exploring the documents individually should reveal how they relate to one another.
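The source does not describe the algorithm Delphion uses, so the following is only a toy stand-in that shows what “one and only one cluster” means in practice. It assumes a simple rule that is not taken from Delphion: each document is assigned to the cluster labeled by its own keyword that appears in the most documents. The stopword list, function names, and sample titles are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption, not Delphion's).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def extract_keywords(text):
    """Return the set of lowercase words in `text`, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def cluster_documents(docs):
    """Assign every document to exactly one keyword-labeled cluster.

    docs: dict mapping document id -> title/abstract text.
    Returns: dict mapping cluster label -> sorted list of document ids.
    """
    keywords = {doc_id: extract_keywords(text) for doc_id, text in docs.items()}
    # Number of documents in which each keyword appears.
    doc_freq = Counter(w for words in keywords.values() for w in words)
    clusters = defaultdict(list)
    for doc_id, words in keywords.items():
        if words:
            # Most widely shared keyword wins; ties break alphabetically.
            label = min(words, key=lambda w: (-doc_freq[w], w))
        else:
            label = "(no keywords)"
        clusters[label].append(doc_id)
    return {label: sorted(ids) for label, ids in clusters.items()}


# Made-up titles with placeholder document identifiers.
docs = {
    "D1": "Battery charging circuit",
    "D2": "Battery electrode material",
    "D3": "Optical lens assembly",
    "D4": "Lens coating method",
}
clusters = cluster_documents(docs)
for label, ids in clusters.items():
    print(label, len(ids), ids)
```

Because the labels come from the documents themselves rather than from the query, a cluster label need not match any term in the search string, which is consistent with the behavior described above.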

Visual analysis of the clusters is accessed through the Visual Map link. Once generated, the clusters can be organized (and re-organized) on this map based on the number of documents shared between them. The size of the clusters can be increased to show more of the underlying keywords that make up each cluster, and the font size can be adjusted accordingly. On the map, the clusters are arranged so that groups that are more closely related (based on keyword content) should appear closer to one another. Selecting the “start” button on the menu bar allows the clusters to spread out in relation to one another, so that unrelated clusters should move farther apart. The “link values” shown in the figure below represent the “similarity percentage” between two clusters.


A Visual Map of clusters, based on shared keywords.


Editor's Note:

The clustering feature was once unique to Delphion, although such keyword extraction analysis features are becoming more common in search tools. Word association is also the basis for other advanced search techniques such as Latent Semantic Analysis (LSA), which can be found in an emerging set of patent search systems.

In everyday legal patent searching, Delphion's clustering feature may not be of much use. However, it might be useful when applied to certain datasets, such as a patent portfolio owned by one particular inventor or assignee; in that case, the feature might allow users to investigate technological diversity within an IP portfolio. Including an analysis feature that is not relevant to legal-type patent searching, but that has this kind of competitive intelligence application, is yet another instance of Delphion’s catering to the business management side of patents, as opposed to the prior art search side.

Delphion’s help guide also notes that using Clustering on Derwent records is especially meaningful, because Derwent titles and abstracts are rewritten using industry-specific, standard terminology. Original patent documents, by contrast, can describe similar inventions in widely varying language and terminology. However, given the price of a single Derwent record, using this data in clustering can also be costly.
