The System Features tables are used to provide a brief summary of the presence of some common features in most patent search and/or analysis systems. There are two separate templates - one to be used for patent and non-patent literature search systems, and one for analysis tools. Full definitions for the table categories are provided below.
System Features - Patent Search Systems
Browser-based or Desktop interface:
Many search systems can be accessed via a web browser, and do not require you to download any software to use them. However, certain systems do require a software download. Here, we refer to these systems as "desktop interface" systems.
Patent data collections created by topic:
Some search products contain only documents directed to certain subject areas (e.g. a data file of pharmaceutical patent documents). If the system under discussion contains any of these directed collections, their subject areas will be listed here.
For Patent Search Systems: Contains data indexed by non-patent-office developed classification schemes (e.g. Derwent Manual Codes):
The presence of any classification systems that were not developed by a national or regional patenting authority is noted in this field. As an example, Derwent Manual codes have been created and applied by a commercial entity, thus the presence of manual codes would be noted here.
Non-Patent and Mixed Content Search Systems: Literature collections created by topic:
Does the system contain any non-patent or mixed-content collections that are directed towards a specific subject area?
Non-Patent and Mixed Content Search Systems: Contains value-added indexing data (e.g. Derwent Manual Codes):
Have the producers of the datafiles hosted on this system added any special indexing to the records? This could include classification codes, subject headings or controlled vocabulary, for example.
Non-Patent and Mixed Content Search Systems: Contains Patent Content in addition to Non-Patent Literature:
This question asks whether any patent literature is searchable through the system, even if it is presented alongside non-patent content.
IPC code searching:
Does the system support searching by International Patent Classification (IPC) codes? If only certain versions are supported, that information can be noted here as well.
US Classification searching:
Does the system support searching by the US National Patent Classification system? If only original classification data at time of publication is included, that information can be noted here as well.
ECLA code searching:
Does the system support searching by the ECLA Classification system?
Japanese F-terms or File Indexing:
Does the system support searching by Japanese File Index and/or Japanese F-term classification systems? These two separate classification systems often, but not always, accompany each other.
Does the system support limiting search queries by date? Any date limiter fulfills this criterion, but unusual date limiters may also be mentioned here.
Section-specific Text Searching (e.g. a search in “claims only”?):
Does the system support limiting keyword queries to certain portions of patent text (for example, searching in the title, abstract, or claims sections only)?
Does the system support limiting a search to only certain collections, for example, a search in JP patent documents only?
Chemical Structure Draw/Search Tool:
Does the system offer a chemical structure drawing tool for creating search queries, and does the underlying data support chemical structure search queries?
Command line interface:
Does the system offer a command line or command-line-style search interface? A "command-line-style" interface is defined (by Intellogist) to mean a search bar or text box where users can enter search commands via special search syntax, such as "engine/TI AND 2000/PY" for "engine" in the title, and 2000 as the publication year, of a record.
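As a rough illustration of what a command-line-style syntax involves, the sketch below parses the example query from the definition above. It is not the parser of any real system: the field-code table only covers the two codes mentioned in the example (TI and PY), and the function names and record layout are hypothetical.

```python
import re

# Field codes taken from the example above; real systems define their own sets.
FIELD_NAMES = {"TI": "title", "PY": "publication_year"}


def parse_query(query):
    """Split a 'term/FIELD AND term/FIELD' query into (field, term) pairs."""
    parsed = []
    for clause in re.split(r"\s+AND\s+", query.strip()):
        term, _, code = clause.rpartition("/")
        if not term or code not in FIELD_NAMES:
            raise ValueError(f"unrecognized clause: {clause!r}")
        parsed.append((FIELD_NAMES[code], term))
    return parsed


def matches(record, parsed):
    """Return True if every term appears in its field (case-insensitive)."""
    return all(
        term.lower() in str(record.get(field, "")).lower()
        for field, term in parsed
    )


parsed = parse_query("engine/TI AND 2000/PY")
record = {"title": "Rotary engine", "publication_year": 2000}  # made-up record
print(parsed, matches(record, parsed))
```

A web-based search form collects the same field and term pairs through separate text boxes, which is why it does not need a command language.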
Web-Browser Search Form:
Does the system offer a web-based search form (as opposed to a command line interface)? Web-based search forms allow users to input search parameters into various text boxes, and eliminate the need for a special command language.
Combining (stacking) Search Queries:
This question refers to the ability to combine two previously executed search queries. This is sometimes enabled via systems' command lines by using syntax such as "S1 AND S2" where the two queries are linked by a Boolean operator. In other cases, this can be performed from the saved search interface by selecting two searches and choosing a "combine" option.
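Conceptually, combining two executed queries is a set operation on their result lists. The sketch below assumes each saved query (S1, S2) is stored as a set of document numbers. The document numbers, the `combine` helper, and the OR and NOT operators are illustrative assumptions; the text above only states that the queries are linked by a Boolean operator.

```python
def combine(history, left, operator, right):
    """Combine two saved result sets with a Boolean operator."""
    operations = {
        "AND": set.intersection,  # documents found by both queries
        "OR": set.union,          # documents found by either query
        "NOT": set.difference,    # documents in the left set only
    }
    return operations[operator](history[left], history[right])


# Made-up search history: query label -> set of document numbers.
history = {
    "S1": {"US1", "US2", "EP3"},
    "S2": {"US2", "EP3", "JP4"},
}
print(sorted(combine(history, "S1", "AND", "S2")))
```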
Uses a relevancy ranking when listing search results:
Does the system use a ranking system to list the most relevant search results first, rather than simply sorting by publication date, publication country or other bibliographic parameters? The relevancy ranking can be determined by any method in order to get a "yes" answer here. If the ranking method is known, a brief comment may be included to describe it.
Does the system apply any automatic highlighting to search terms that appear in the search results?
Advanced Keyword Highlighting:
In our table, "Advanced Keyword Highlighting" refers to the ability of the user to define which terms should be highlighted in search results, rather than highlighting only those terms which were used in the initial search query. If this feature is present, a short comment may describe it in more detail.
Keyword In Context Display:
Keyword-In-Context (often abbreviated KWIC) describes the idea that searchers must see their search keywords within a few sentences of context to understand whether the reference is likely to be useful. Many search products will offer a keyword-in-context display that helps searchers scan their results.
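A keyword-in-context display can be approximated in a few lines. The sketch below is a simplification: it uses a window of words rather than sentences, matches single keywords only, and the `kwic` function and sample text are hypothetical rather than taken from any search product.

```python
def kwic(text, keyword, window=5):
    """Return each keyword hit with up to `window` words on either side."""
    words = text.split()
    snippets = []
    for i, word in enumerate(words):
        # Ignore surrounding punctuation and case when comparing.
        if word.strip(".,;:()\"'").lower() == keyword.lower():
            start = max(0, i - window)
            snippets.append(" ".join(words[start:i + window + 1]))
    return snippets


text = "A rotary engine comprising a housing and a rotor mounted in the housing."
print(kwic(text, "rotor", window=2))
```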
Linear Reference Viewing (clicking “Next” takes you to the next patent in the hitlist):
Intellogist uses the phrase "linear reference viewing" to describe the concept of moving sequentially from one result view to the next without being forced to return to the list of search results.
Saving Results Sets:
Most professional patent search products allow users to save search results for later review and/or for download or export. This is often done via a virtual "folder" or page of saved results.
Flagging Individual Results:
Most search products which allow users to save search results will support the ability to flag only certain relevant results to be saved, exported, or downloaded, rather than forcing the user to save all of the search results at once. This is usually done via a checkbox next to each search result.
Saving Search Histories: (Automated/User command required/Not possible)
This field addresses whether the search product has any mechanism to store search activity. The three standard answers include "Automated," which indicates the search activity is automatically saved, "User command required," which means users must take some action if they wish to save, and "Not possible." An additional comment may be added to describe the feature further.
This field asks whether the search system offers any way to see a breakdown of search results by generally useful statistics, for example by making simple charts or graphs or by listing the "top ten" assignees with their occurrence frequencies.
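A "top ten" assignee breakdown is essentially a frequency count over one bibliographic field. The sketch below assumes each result is a dictionary with an `assignee` key; the record layout, the sample data, and the `top_assignees` name are hypothetical.

```python
from collections import Counter


def top_assignees(records, n=10):
    """Return the n most frequent assignees with their occurrence counts."""
    return Counter(record["assignee"] for record in records).most_common(n)


# Made-up result set.
records = [
    {"number": "US1", "assignee": "Acme"},
    {"number": "US2", "assignee": "Acme"},
    {"number": "EP3", "assignee": "Prime Industries"},
    {"number": "JP4", "assignee": "Acme"},
]
for assignee, count in top_assignees(records):
    print(assignee, count)
```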
Other (In-program) Analysis tools:
Other analysis tools might include semantic analysis (such as keyword clustering) or citation analysis tools. This field deals with any other analysis tools beyond a simple breakdown of the search results and the generation of charts and graphs.
PDF document delivery: (Individual only/Batch/None)
Does the system offer any downloading service for original patent copies in PDF format? Standardized answers include: "Individual only," used when copies must be downloaded one at a time, "Batch," used when multiple documents can be downloaded at once, and "None." A further comment may be added to describe the feature.
PDF document delivery for non-patent literature: (Direct/Indirect - external link to publisher)
If PDF copies of non-patent literature documents are available via the system, are they available for download directly from the system (direct), or through external links to third party publishers (indirect), or both?
Does the system offer the ability to export information about search results, such as bibliographic data, drawing images, and/or full text? Any file format is acceptable for a "yes" answer to this question. A brief comment may be added describing the export file formats that are supported.
Does the system offer the ability to import information, especially patent document numbers? If yes, a brief comment may explain any limitations on this function, such as the maximum number of documents that may be imported.
Does the system allow the user to receive periodic updates, whether via e-mail or another method (such as RSS), of new publications that fit their search string criteria?
Online Help Guide: (Detailed/Cursory/None)
This field describes any online help materials provided to users regarding how to use the system properly. Three standard answers include "Detailed," "Cursory," and "None." Detailed help will generally include some screenshots and examples, and troubleshooting information.
This section addresses whether there are any live or recorded web seminars to show users how to use the product more effectively.
Does the system offer any live assistance with the product? This could involve a customer support line or a live chat feature. A comment may be added to explain the hours or other aspects of this feature.
Will the system automatically log the user out after a period of inactivity? If yes, a brief comment will explain the timeframe for this.
Secure Connection Option:
Does the system allow users to connect via a secure connection option?
Pricing (Free/Pay-per-use/Flat fee/Combination)
This field relates to the pricing structure of the tool and has four standard answers: Free, Pay-per-use, Flat fee, and Combination. Pay-per-use denotes a pricing structure where the user pays per transaction, such as per search query executed or per record viewed. Flat fee denotes a pricing structure where the user pays a regular lump sum to continue to use most basic features of the product. (Some features under a flat fee subscription may cost extra, but most basic features should be free.)
System Features Tables - Analysis Systems
Is the product a browser or desktop based system? Many search systems can be accessed via a web browser, and do not require you to download any software to use them. However, certain systems do require a software download. Here, we label these systems as "desktop interface" systems.
Does the system contain any patent data collections?
Some analysis tools are just software with no searchable databases attached. This question asks whether the product contains any searchable patent data collections.
Are patent PDF image downloads available?
Some analysis tools may provide PDF document images if they offer a patent data collection – others may have the ability to obtain PDF images via an automated process that retrieves the image from a third party provider.
Does the system contain any academic literature collections?
Some analysis tools are just software with no searchable databases attached. This question asks whether the product contains any searchable academic literature data collections.
Does the system contain any news or business collections?
Some analysis tools are just software with no searchable databases attached. This question asks whether the product contains any news or business data collections.
Does the system contain any legal or public records collections?
Some analysis tools are just software with no searchable databases attached. This question asks whether the product contains any legal or public records collections. The system may also make this data available via a 3rd party provider.
Can data from multiple collections be associated (e.g. via corporate name recognition)?
Can identifying characteristics (such as assignee name) be used to associate patents and related business or legal information regarding the patent owner? This could also be referred to as "collating" data from different sources.
Can data sets be imported? If so, what kind? (patent/literature/business/legal)
Many analysis tools are designed to support the analysis of text documents, no matter their type (e.g. to analyze both patent data and news or journal articles). This question asks which types of data can be imported into the tool.
What file formats are supported for import?
If the answer to the question above ("Can data sets be imported") was yes, this field will list the supported file formats that may be imported.
Pre-Analysis: Cleaning and Organizing Data Sets
Automated data extraction: Can system recognize basic data elements from imported text (such as author, title, abstract, etc.)?
Some analysis systems have methods for identifying data elements within imported data. This is sometimes achieved via the use of a "filter" that tells the product how to interpret data that has been imported for a specific source. In other cases, if the imported data has been tagged with an extensible markup language such as XHTML, can the system recognize the various content elements (e.g. title, author, etc.) in preparation for analysis?
Is manual data cleaning (e.g. user decides when two entities are the same) supported?
"Manual" data cleaning takes place when the user "teaches" the analysis system how to interpret data. For example, a data set may be imported with the same company name spelled two different ways, e.g. "Acme" vs. "Acme Inc." If the user can tell the system to recognize both names as belonging to the same entity, the answer to this question will be "Yes."
Are at least some basic data cleaning tasks automated?
Automated data cleaning occurs when the system itself can recognize a problem in the underlying data without user intervention. For example, if an analysis system can use a pre-built algorithm or thesaurus to recognize that the company "Acme" is equivalent to "Acme Inc.," the answer to this question will be "Yes."
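One simple form of automated cleaning is name normalization. The sketch below strips punctuation and a short list of legal suffixes so that "Acme" and "Acme Inc." fall into the same group. The suffix list and function names are assumptions for illustration; a real product would rely on a much larger thesaurus or a more robust algorithm.

```python
import re

# A small, illustrative suffix list, not an exhaustive one.
LEGAL_SUFFIXES = {"inc", "corp", "ltd", "llc", "co"}


def normalize(name):
    """Lowercase a company name and drop punctuation and trailing legal suffixes."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def group_names(names):
    """Group raw name variants under their normalized form."""
    groups = {}
    for name in names:
        groups.setdefault(normalize(name), []).append(name)
    return groups


print(group_names(["Acme", "Acme Inc.", "Prime Industries"]))
```

Manual cleaning would correspond to letting the user add entries to such a grouping by hand instead of relying on the suffix rule.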
Does the system recognize and associate various subsidiary names via corporate tree data?
Corporate tree data is hierarchical information which shows company relationships. It is provided via a number of commercial vendors, including 1790 Analytics and Lexis's Directory of Corporate Affiliations (DCA). If the analysis system can use this data to recognize that two companies are related, the answer to this question will be yes. For example, if the system uses corporate relationship data to treat "Prime Industries" as a subsidiary company of "Acme Inc.," the answer to this question would be "Yes."
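The subsidiary example can be sketched as a lookup in a child-to-parent table. The table below is made up to mirror the "Prime Industries" and "Acme Inc." example ("Prime Labs" is an invented extra level); it does not reflect the data format of any commercial corporate tree provider, and `ultimate_parent` is a hypothetical helper.

```python
# Hypothetical corporate tree: subsidiary -> direct parent.
PARENT_OF = {
    "Prime Industries": "Acme Inc.",
    "Prime Labs": "Prime Industries",
}


def ultimate_parent(company, parent_of):
    """Follow parent links upward until a company with no listed parent is reached."""
    seen = set()
    while company in parent_of and company not in seen:
        seen.add(company)  # guards against accidental cycles in the data
        company = parent_of[company]
    return company


print(ultimate_parent("Prime Labs", PARENT_OF))
```

Documents assigned to any company in the tree can then be grouped under the same ultimate parent before analysis.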
Can users assign ratings to individual documents?
User annotation is often needed in large scale analysis projects to rank documents based on their importance. The ability to tag references with any kind of numeric rating will give this field a "Yes" answer.
Can users define categories and assign them to individual documents?
User annotation is often needed in large scale analysis projects to rank documents based on their importance. The ability to tag references with a text phrase such as "tier 1, tier 2 etc." will give this field a "Yes" answer. The system must also be able to treat these tags as standardized categories, not just free text annotations.
Can users handpick documents from a dataset to become a new group or sub-group, upon which analysis will take place?
This question refers to the ability to create a sub-group of records within a dataset, upon which specific analysis tasks will be conducted.
Advanced Automated Features
Does the system perform full text analysis to extract keywords for tagging/indexing?
This question refers to whether the analysis system will automatically assign "tags" or "categories" to data records by analyzing them with a semantic algorithm to determine their important content.
Does the system tag references via a pre-loaded taxonomy based on the presence of given data elements (such as specific keywords)?
Some analysis systems can intelligently tag documents via semantic analysis methods based on a pre-loaded taxonomy (essentially a subject-specific dictionary). This method ensures that all documents pertaining to the same subject bear the same keyword tags, standardizing the language used to tag the documents and eliminating synonymous tags.
Does the system apply a proprietary relevance ranking?
A relevance ranking will sort records in the order of perceived importance. This ranking can be determined in any number of ways, and most systems will have their own proprietary method.
Can the system perform clustering on unstructured data via keyword extraction?
Unstructured data is textual information which is presented without metadata – that is, the title, author, and other bibliographic data is all mashed together with the full text of the record. This question asks whether the analysis system can intelligently analyze the contents of these unstructured records to resolve them into related groups (or clusters) based on their contents.
Does the system apply any kind of semantic or natural language processing algorithm?
A semantic or natural language processing algorithm is a computer program that can "understand" human language up to a certain point. This question pertains to whether the analysis system contains any technology that attempts to mechanize the interpretation of electronic text, e.g. to "cluster" related documents, and/or to tag them with categories based on their contents.
Can clusters be mapped in relation to one another?
Some semantic or natural language processing algorithms can determine the level of relatedness of text documents by analyzing content and keyword density. These algorithms can then calculate the degree of relatedness shared by two documents or groups of documents. This question pertains to whether the analysis tool performs this kind of relationship analysis on groups of documents.
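One common way to score relatedness from keyword frequencies is cosine similarity between word-count vectors. The sketch below uses that technique as an assumption; it is not a description of how any particular analysis tool computes relatedness, and the sample texts are made up.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Score two texts between 0.0 (no shared words) and 1.0 (same word profile)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


print(cosine_similarity("fuel cell membrane", "fuel cell stack"))
print(cosine_similarity("fuel cell membrane", "solar panel frame"))
```

Pairwise scores of this kind, computed between clusters rather than single documents, are the sort of input a cluster map could use to place related groups near one another.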
Can cluster maps evolve over time?
This question pertains specifically to whether the element of time can be added into the text mining analysis so users can appreciate how the technology landscape has changed over a specific timeline.
Can the system produce simple tables/charts?
This question relates to whether the analysis tool can generate simple tables or charts of data.
Can the system produce 3-dimensional bar charts?
If the system supports graphing, can it produce 3-dimensional bar charts? A comment may be added here to explain exactly what the capabilities are.
Can the system produce topographical maps?
Topographical maps are produced when natural language processing or semantic algorithms are used to determine the level of similarity between documents based on their content. The resulting visualization can be outputted as something similar to a topographical map of the earth, where high peaks represent the greatest concentration of documents and valleys represent outlying documents.
Can the system produce co-occurrency matrices?
Is the system capable of calculating and showing co-occurrency matrices? Co-occurrency matrices show how many times items of one field occur with items in another field. For example, in a matrix where companies and inventors are compared, the matrix would show the frequency with which company X appeared on the same document as inventor Y. Results are typically displayed using the number of occurrences or by using various shades of one color.
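The company and inventor example can be sketched as a count of field-value pairs that appear on the same document. The record layout (lists under `companies` and `inventors` keys), the sample data, and the `co_occurrence` name are hypothetical.

```python
from collections import Counter
from itertools import product


def co_occurrence(records, field_a, field_b):
    """Count how many documents each (field_a value, field_b value) pair shares."""
    counts = Counter()
    for record in records:
        # Sets ensure a pair is counted at most once per document.
        for pair in product(set(record[field_a]), set(record[field_b])):
            counts[pair] += 1
    return counts


# Made-up records.
records = [
    {"companies": ["X"], "inventors": ["Y", "Z"]},
    {"companies": ["X"], "inventors": ["Y"]},
    {"companies": ["W"], "inventors": ["Z"]},
]
matrix = co_occurrence(records, "companies", "inventors")
print(matrix[("X", "Y")], matrix[("X", "Z")], matrix[("W", "Y")])
```

Each count corresponds to one cell of the matrix; a display layer could print the numbers directly or map them to shades of a color.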
Can the system produce relationship/collaboration networks?
This question refers to whether the system can analyze metadata like authorship, ownership, and/or inventorship to show collaborations between people and companies.
Does the system have any citation visualization tools?
Citation relationships get more complex as the pool of related documents gets larger. Various citation visualization methods are used to help elucidate these complex relationships. These techniques can be used to identify seminal works in a technology area, among other applications.
Can automated reports be generated?
Can the system run a series of algorithms to generate a standard set of visualizations automatically, without much user input (if any)?
Drill Down Viewing
Can the system contain and display document full text?
Analysis systems often handle thousands of documents, but it can sometimes be important for users to be able to see data at its most basic level, and read underlying patent documents. This question relates to whether the system supports the viewing of individual data records at a very detailed level, rather than simply viewing the overall trends in the dataset.
Can the system contain and display document images?
This question is related to the question "Can the system contain and display document full text?" above. If the system supports viewing individual patent documents, does it also support viewing the document facsimile image (the original patent copy)? This can become necessary when users need to view the patent drawings or in-line data such as chemical structures or formulae.
Does the system offer any advanced full text viewing features (such as keyword highlighting)?
Advanced full text viewing features would include any feature designed to help users quickly scan through full text patent documents. Keyword highlighting, graphical maps of keyword hits, and keyword-in-context displays would all be examples of advanced full text viewing features.
Can search results be saved?
If the analysis system contains inherent patent data collections, can the results of a search in this data collection be saved for future manipulation?
Can visualization results, such as graphs, be saved?
If the system supports data visualization, can the results of this visualization be saved for later manipulation?
Can raw data from search results be exported? What file formats are supported?
If the system contains inherent patent data collections, can information about the search results (such as bibliographic data, images and/or full text) be exported for use in another program?
Can patent PDFs be downloaded? Individually or in bulk?
If the system contains inherent document image collections, can these patent copies be downloaded and saved to a local directory? "Bulk" downloading refers to the ability to download a large number of document copies at once.
Can graphs and charts be exported or saved for future viewing?
If the system supports the ability to generate charts and graphs, can this information be exported, rather than only viewed via access to the system itself? This is often necessary if users wish to share the results of their analysis projects with others, or include them in publications.
Can users take and export snapshots of interactive graphs?
If the system supports the ability to generate interactive visualizations, can the user take "snapshots" of these graphs at certain points in time? This is often necessary to share analysis results with others, or to include in publications.
Can more than one user have access to a single seat of the product?
If the product is set up so that only one person can have access to one subscription seat at any given time, can this access extend to more than one user, as long as the users are not logged on simultaneously?
Can users share the results of their analysis without encountering licensing restrictions?
Are there any licensing restrictions on the product or its inherent data that would prohibit users from sharing analysis results? For example, can you export data and send it to another user, exactly as it appeared when created? Or, can another licensed user of the system not within your company have access to your data and analysis results?
What subscription types are available? Often, subscriptions are available on a monthly or yearly basis.