Report:Delphion/Viewing Results/Weighting and Ranking Features
|DWPI on Delphion is no longer available, as of March 31, 2012. DWPI data is available on the Thomson Innovation platform.|
Weighting and Ranking Features
Although Delphion's relevancy ranking algorithm is proprietary, Delphion has made a few things known about it, most importantly that it is based on keyword density. In other words, a 1,000-word document with one keyword hit should have a higher relevancy score than a 5,000-word document with one hit. Depending on the operators used, keyword proximity can also be a factor in the relevancy score.
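Because the actual algorithm is not published, the density principle can only be illustrated, not reproduced. The following is a minimal sketch that assumes density is simply keyword hits divided by total word count; the function name `keyword_density` and the sample documents are hypothetical and are not taken from Delphion.

```python
def keyword_density(text: str, keyword: str) -> float:
    """Return keyword occurrences divided by total word count.

    Assumption: a plain hits/words ratio. Delphion's real formula is
    proprietary and may differ.
    """
    words = text.lower().split()
    if not words:
        return 0.0
    target = keyword.lower()
    # Strip common punctuation so "holder," still counts as "holder".
    hits = sum(1 for word in words if word.strip(".,;:()\"'") == target)
    return hits / len(words)


# Two synthetic documents, each with exactly one hit for "toothbrush".
short_doc = " ".join(["filler"] * 999 + ["toothbrush"])   # 1,000 words
long_doc = " ".join(["filler"] * 4999 + ["toothbrush"])   # 5,000 words

short_score = keyword_density(short_doc, "toothbrush")
long_score = keyword_density(long_doc, "toothbrush")

# The shorter document has the higher density and would rank first.
print(short_score > long_score)
```

Under this assumption, the single hit in the 1,000-word document carries five times the weight of the single hit in the 5,000-word document.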
Building on its relevancy ranking algorithm, Delphion offers several operators that are unique to the system.
- <near> - the <near> operator functions like the Boolean AND operator, in that it returns the same results set. The documents are ranked by keyword proximity rather than by density of keyword occurrence.
- <accrue> - the <accrue> operator functions like the Boolean OR operator, in that it should retrieve any document containing even one of the keyword terms. The documents are ranked by the number of keyword occurrences and the number of different keywords contained in the document, as opposed to keyword density.
- <yesno> - the <yesno> operator is designed to turn off the ranking mechanism; when this command is used, results sets are ordered by publication date rather than by Delphion’s ranking algorithm.
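The difference between the three orderings can be sketched in a few lines of Python. This is an illustration only, not Delphion's implementation: the `Doc` class, the three `rank_*` functions, the sample documents, the tie-breaking rules, and the newest-first date order are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    doc_id: str
    text: str
    published: date


def tokens(doc: Doc) -> list[str]:
    return doc.text.lower().split()


def rank_near(docs, term_a, term_b):
    """AND-style retrieval; the closest pair of terms ranks first."""
    scored = []
    for doc in docs:
        words = tokens(doc)
        pos_a = [i for i, w in enumerate(words) if w == term_a]
        pos_b = [i for i, w in enumerate(words) if w == term_b]
        if pos_a and pos_b:  # both terms must be present
            gap = min(abs(i - j) for i in pos_a for j in pos_b)
            scored.append((gap, doc.doc_id))
    return [doc_id for _, doc_id in sorted(scored)]


def rank_accrue(docs, terms):
    """OR-style retrieval; more distinct terms, then more hits, ranks first.

    Assumption: distinct-term count takes priority over total hits.
    """
    scored = []
    for doc in docs:
        words = tokens(doc)
        counts = [words.count(term) for term in terms]
        if sum(counts):  # at least one term must be present
            distinct = sum(1 for c in counts if c)
            scored.append((-distinct, -sum(counts), doc.doc_id))
    return [doc_id for _, _, doc_id in sorted(scored)]


def rank_yesno(matched_docs):
    """No relevancy scoring; order by publication date (newest first assumed)."""
    ordered = sorted(matched_docs, key=lambda d: d.published, reverse=True)
    return [doc.doc_id for doc in ordered]


docs = [
    Doc("A", "toothbrush holder with wall mount", date(2012, 1, 1)),
    Doc("B", "toothbrush with soft bristles and a cup holder for each toothbrush",
        date(2005, 6, 1)),
    Doc("C", "holder for cups", date(2009, 3, 1)),
]

near_order = rank_near(docs, "toothbrush", "holder")
accrue_order = rank_accrue(docs, ["toothbrush", "holder"])
yesno_order = rank_yesno(docs)
```

In this sample, the proximity ranking puts document A first because its two keywords are adjacent, the accrue-style ranking puts B first because it has more total keyword occurrences, and the date ordering ignores keyword placement altogether.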
Delphion’s ranking system is one of the key ways in which it differentiates itself from competing patent search systems as being directed to more casual users of patent information. Relevancy ranking is most useful to end users who are more interested in reviewing a few top documents than in exhaustively searching results sets.
Supporting this argument is the fact that Delphion keeps its ranking system logic proprietary, assuming that users do not need to know why the information is provided to them as “most relevant” as long as they find interesting results right away.
Although Delphion offers to rank references based on its own logic, there are also resources for users who like this feature but who want to design their own ranking scheme.
The previous section discusses the <yesno> operator, which turns the proprietary algorithm off. But the system even allows the user to go one step further, and create a custom ranking system of sorts.
In order to use this feature, users just enter a number between 01 and 100 next to the term or set of terms they wish to search for. The Delphion helpfile gives this example:
- ([](toothbrush and holder) or []holder)
This search string will return documents containing both of the keywords toothbrush and holder, and will also return documents containing only the term holder. Documents containing both keywords will receive a higher relevancy ranking than documents containing only holder.
If results sets from a weighted search are large, users will only see high-ranking references in their search results.
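The effect of user-assigned weights can be sketched as follows. The weights 100 and 50 are hypothetical placeholders chosen for illustration, since the numbers in the help-file example are not shown above, and summing the weights of every matching clause is an assumption about how the scores combine, not documented Delphion behavior.

```python
# Hypothetical weighted query: (toothbrush AND holder) OR holder.
# Each entry is (weight, set of terms that must ALL be present).
# The weights 100 and 50 are placeholders, not values from the help file.
WEIGHTED_CLAUSES = [
    (100, {"toothbrush", "holder"}),
    (50, {"holder"}),
]


def weighted_score(text: str, clauses=WEIGHTED_CLAUSES) -> int:
    """Sum the weights of every clause whose terms all appear in the text.

    Assumption: clause weights are additive. A score of 0 means the
    document does not match the query at all.
    """
    words = set(text.lower().split())
    return sum(weight for weight, terms in clauses if terms <= words)


both = weighted_score("toothbrush holder with suction cup")
holder_only = weighted_score("holder for umbrellas")
no_match = weighted_score("umbrella stand")

# A document containing both keywords outranks one containing only "holder".
print(both > holder_only > no_match)
```

Whatever combination rule the system actually uses, the outcome described above is the same: a document matching the heavily weighted clause ranks above a document matching only the lightly weighted one.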
The following excerpt from Delphion’s help guide gives a concise summary of the differences among the three concepts.
- Relevancy Scoring, Weighting, and <accrue> Compared
- Weighting, relevancy scoring, and <accrue> are three search options that are somewhat related and often confused. If you ask how documents are scored, the brief answer is that they are scored by frequency of key words. There are however, different aspects of that scoring that you should be aware of as well as different ways in which you can influence the scoring.
- Weighting is something you request when you form your query. You tell the search engine how you want the qualifying documents weighted against each other.
- The <accrue> operator selects documents that include at least one of the search elements you specify. The more search elements that are present, the higher the score will be.