Report:VantagePoint/Text Mining and Clustering/Clustering Unstructured Data via Keyword Extraction

From Intellogist

This report was created by the Intellogist Team.

Clustering Unstructured Data via Keyword Extraction

Patent analytics expert Anthony Trippe explains clustering unstructured data via keyword extraction. "Unstructured text is defined as text that has not been indexed or segmented into individual data fields. The only structure contained within the document is the structure that was implied by the author when they put words into sentences, sentences into paragraphs, and so on…As with the clustering of structured data, text concepts instead of codes can then be used to group documents that share a high degree of overlap…Where tools for clustering tagged or structured data start by parsing the fielded data into a database, the systems for clustering unstructured text begin by identifying relevant terms within a document." [1]

In VantagePoint, clustering of unstructured data via keyword extraction is available through the Extract Nearby Phrase command. See the Natural Language Processing section for more details.
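
The general idea Trippe describes (identify relevant terms in each document, then group documents whose terms overlap heavily) can be illustrated with a minimal Python sketch. This is not how VantagePoint or the Extract Nearby Phrase command works internally. The stopword list, the frequency-based keyword selection, the Jaccard overlap measure, the 0.3 threshold, and all function names are assumptions chosen only for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real tools use much larger ones).
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "on", "to",
             "with", "is", "by", "from", "using"}


def extract_keywords(text, top_k=5):
    """Return up to top_k of the most frequent non-stopword terms as a set."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return {term for term, _ in counts.most_common(top_k)}


def jaccard(a, b):
    """Overlap between two keyword sets: |intersection| / |union|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_documents(docs, threshold=0.3, top_k=5):
    """Group document indices whose keyword sets overlap at or above threshold.

    Documents are linked pairwise and the connected groups are returned
    (a simple single-link grouping, assumed here for brevity).
    """
    keywords = [extract_keywords(d, top_k) for d in docs]
    parent = list(range(len(docs)))  # union-find forest over document indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if jaccard(keywords[i], keywords[j]) >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(docs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical example texts standing in for unstructured patent abstracts.
docs = [
    "battery electrode lithium battery cell",
    "lithium battery electrode coating",
    "antenna signal wireless antenna receiver",
    "wireless antenna signal amplifier",
]
print(cluster_documents(docs))  # [[0, 1], [2, 3]]
```

In this sketch the two battery texts share three of five distinct keywords (overlap 0.6), as do the two antenna texts, while the cross-topic pairs share none, so two groups result. Raising the threshold yields tighter, smaller groups; lowering it merges more loosely related documents.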

Sources

  1. Trippe, Anthony J. "Patinformatics: Tasks to tools." World Patent Information 25 (2003): 211-221.