From Intellogist

This Community Report exists for our users to share information and experiences about patent search resources. The information contained on this page is the result of community collaboration and is not vetted by individual experts or fact-checkers.


CiteSeerX is a free digital library and search engine for scientific literature, focused mainly on computer and information science.[1] The system is hosted and developed by the College of Information Sciences and Technology (CIST) at Pennsylvania State University. CiteSeerX began as CiteSeer, which was developed in 1997 at the NEC Research Institute in Princeton, New Jersey, by Steve Lawrence, Lee Giles, and Kurt Bollacker.[1] According to the CiteSeerX website, maintenance of the system moved to CIST at Pennsylvania State University in 2003, and the project has since been led by Lee Giles and Isaac Councill.[1]

CiteSeer was the first digital library and search engine to use autonomous citation indexing (ACI),[1] which is defined as follows:[2]

ACI autonomously creates citation indices similar to the Science Citation Index®. An ACI system autonomously locates articles, extracts citations, identifies identical citations that occur in different formats, and identifies the context of citations in the body of articles. ACI can organize the literature and provide most of the advantages of traditional citation indices, such as literature search using citation links, and the evaluation of articles based on citation statistics.
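One step in the definition above, identifying identical citations that occur in different formats, can be illustrated with a small sketch. This is not CiteSeer's actual matching method, which the source does not describe. It assumes a simple word-overlap (Jaccard) comparison, and the function names, the threshold of 0.6, and the sample strings are all chosen here for illustration only.

```python
import re

def citation_tokens(citation):
    """Lowercase the citation and keep alphabetic words longer than two letters.

    Dropping short words removes author initials, and dropping digits removes
    years and page numbers, which tend to move around between formats.
    """
    words = re.findall(r"[a-z]+", citation.lower())
    return frozenset(w for w in words if len(w) > 2)

def jaccard(a, b):
    """Share of distinct words that two token sets have in common."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def group_citations(citations, threshold=0.6):
    """Put each citation into the first group whose seed citation it resembles."""
    groups = []
    for citation in citations:
        tokens = citation_tokens(citation)
        for group in groups:
            if jaccard(tokens, group["tokens"]) >= threshold:
                group["members"].append(citation)
                break
        else:
            # No existing group is similar enough, so start a new one.
            groups.append({"tokens": tokens, "members": [citation]})
    return [group["members"] for group in groups]

# Illustrative input: the first two strings cite the same article in two styles.
style_a = ("Bollacker, K., Giles, C. L., and Lawrence, S. Digital libraries "
           "and autonomous citation indexing. IEEE Computer, 1998.")
style_b = ("K. Bollacker, C. L. Giles, S. Lawrence (1998). Digital Libraries "
           "and Autonomous Citation Indexing, IEEE Computer.")
other = ("Garfield, E. Citation indexing: its theory and application in "
         "science. Wiley, 1979.")

groups = group_citations([style_a, style_b, other])
```

Here `groups` holds two entries: the two differently formatted strings for the same article, and the unrelated citation on its own. A production system would need a far more robust comparison than word overlap.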

CiteSeerX was developed to provide a "new architecture and data model" for the CiteSeer system, which indexes over 750,000 documents.[1] Features of CiteSeerX include:[1]

  • Autonomous Citation Indexing (ACI) - See above definition
  • Citation statistics and context
  • Reference linking
  • Awareness and tracking - Provides automatic notification of new citations to given papers and new papers matching a user profile
  • Related documents
  • Full-text indexing
  • Query-sensitive summaries
  • Harvesting of articles using CiteSeerX Crawler (citeseerxbot) - The citeseerxbot is a "focused crawler" which only crawls sites from a crawl list or submitted URL crawl request.[3]
  • Metadata of articles - CiteSeerX automatically extracts metadata from all indexed articles and makes it available.
  • Personal Content Portal (My CiteSeerX) - Provides personal collections, RSS-like notifications, social bookmarking, social network facilities, personalized search settings, institutional data tracking, and a document submission system[4]
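The "focused crawler" behavior in the feature list, where only sites from a crawl list or a submitted URL request are crawled, can be sketched as an allowlist check. The class and method names below are hypothetical and are not taken from citeseerxbot. Matching on the host name alone is also an assumption, and the example URLs are placeholders.

```python
from urllib.parse import urlparse

class FocusedCrawlFilter:
    """Allow a URL only if its host is on the crawl list or was submitted."""

    def __init__(self, crawl_list):
        # Assumption: a site is identified by its host name.
        self.allowed_hosts = {self._host(url) for url in crawl_list}

    @staticmethod
    def _host(url):
        return urlparse(url).netloc.lower()

    def submit(self, url):
        """Register a user-submitted crawl request for the URL's site."""
        self.allowed_hosts.add(self._host(url))

    def should_crawl(self, url):
        """Return True only for URLs on an allowed site."""
        return self._host(url) in self.allowed_hosts

# Placeholder crawl list with a single site.
crawl_filter = FocusedCrawlFilter(["http://example.edu/papers/"])
```

With this filter, a URL on `example.edu` passes the check, while a URL on any other host is rejected until that site is added through `submit`.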


Users can create a free personal account for the My CiteSeerX portal and log in through the CiteSeerX website. Under their personal account, users can:

  • Edit their profile information
  • View or add collections
  • Monitor "tags"
  • View any changes in "Papers I'm Monitoring" (users are also notified by email of changes to a monitored paper's metadata)

Users can search CiteSeerX (beta version) via either a simple or an advanced search form. The simple search form includes three tab options (Documents, Authors, or Table). Users can choose to include citations under the Documents and Authors tabs. The Authors tab also includes the option for a "Disambiguated Search."

The advanced search form allows users to search within various text fields (text, title, author name, author affiliation, etc.) and to restrict results by range criteria (publication year, minimum number of citations). Users can also specify how the result list is sorted (by citations, relevancy, date, or recency).
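The combination of a field search, range criteria, and a sort order can be pictured with a short sketch. This is not the CiteSeerX search interface or its query syntax. The `Record` fields, the `advanced_search` function, and the sample records are invented for illustration, and only two of the four sort orders (citations and date) are modeled.

```python
from dataclasses import dataclass

@dataclass
class Record:
    title: str
    year: int
    citations: int

def advanced_search(records, title_term="", year_from=None, year_to=None,
                    min_citations=0, sort_by="citations"):
    """Filter records by title text, year range, and citation count, then sort."""
    term = title_term.lower()
    hits = [
        r for r in records
        if term in r.title.lower()
        and (year_from is None or r.year >= year_from)
        and (year_to is None or r.year <= year_to)
        and r.citations >= min_citations
    ]
    sort_keys = {
        "citations": lambda r: r.citations,  # most cited first
        "date": lambda r: r.year,            # newest first
    }
    return sorted(hits, key=sort_keys[sort_by], reverse=True)

# Made-up sample records, not real CiteSeerX data.
records = [
    Record("Autonomous citation indexing", 1998, 120),
    Record("Focused crawling for digital libraries", 2005, 45),
    Record("Citation context extraction", 2009, 8),
]

by_citations = advanced_search(records, title_term="citation", min_citations=5)
by_date = advanced_search(records, title_term="citation", sort_by="date")
```

Both calls match the same two records. The first returns them with the most cited record on top, and the second returns them with the newest record on top.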

In the hit list, users can re-rank the results by any of the sort criteria listed above. Users can also run the same query in other search engines (Google Scholar, Yahoo!, Ask, Bing, and the Collection of Computer Science Bibliographies). The hit list shows ten results at a time, with bibliographic information and short excerpts that display the keywords in bold and in context. Mousing over the orange arrow icon beside each result displays an excerpt of the abstract.

Selecting a result opens the full record view, which has three tabs:

  • Summary - Displays linked citations, bibliographic information, the abstract, and a link to download the full-text PDF
  • Related Documents - "Active Bibliography" or "Co-Citation"
  • Version History - View changes to metadata over time
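As background for the "Co-Citation" option: in general bibliometric usage, two documents are co-cited when a third document cites both of them, and pairs that are co-cited often are treated as related. The sketch below counts co-cited pairs under that general definition. It is not a description of how CiteSeerX computes its Related Documents tab, and the function name and the paper identifiers are placeholders.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(references):
    """Count how many citing papers cite each pair of documents together.

    `references` maps a citing paper to the list of documents it cites.
    """
    counts = Counter()
    for cited in references.values():
        # Each unordered pair in one reference list is one co-citation.
        for pair in combinations(sorted(set(cited)), 2):
            counts[pair] += 1
    return counts

# Placeholder data: three citing papers and their reference lists.
references = {
    "P1": ["A", "B", "C"],
    "P2": ["A", "B"],
    "P3": ["B", "C"],
}

counts = cocitation_counts(references)
```

In this sample, documents A and B are co-cited twice (by P1 and P2), B and C twice (by P1 and P3), and A and C once (by P1).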

From the full record view, users can add the paper's metadata to the "Metacart" for later download, add tags to the paper, or use the "Add to Collection," "Correct Errors," and "Monitor Changes" options. Registered users can also bookmark the page via social media widgets.


  1. "About CiteSeerX." CiteSeerX website, Accessed August 18, 2011.
  2. Bollacker, Kurt, C. Lee Giles, and Steve Lawrence. "Digital libraries and autonomous citation indexing." IEEE Computer, 1998. CiteSeerX website, Accessed August 18, 2011.
  3. "About CiteSeerX Crawler." CiteSeerX website, (link no longer available). Accessed August 18, 2011.
  4. "About My CiteSeerX." CiteSeerX website, (link no longer available). Accessed August 18, 2011.
