GenomeQuest

From Intellogist

If you found this page through a web search, we invite you to visit our Main Page to see what Intellogist is all about. This Community Report exists for our users to share information and experiences about patent search resources. Registered users can add, edit, or delete material on this page. The information contained on this page is the result of community collaboration and is not vetted by individual experts or fact-checkers. For in-depth reviews of other patent search resources, please read our Intellogist Reports. For information about editing Community reports, please see our Help pages. If you'd like more information about why Intellogist did not write about this search tool, please see our Criteria for Intellogist Articles.

GenomeQuest is a proprietary peptide and nucleic acid sequence searching tool, available by subscription or on a pay-per-use basis.


Major Recent Updates

February 2014

As of February 2014, the GQ-PAT database contains over 475,000 documents and over 235 million sequences.[1] GenomeQuest's Platinum and Gold+ offerings include manually curated documents from major and developing authorities. These databases are growing by 1,000 to 2,000 newly curated documents a week, with the entire backfile expected to be complete by the end of 2014.[2]

June 2013

GenomeQuest has created a new documentation site, which is available from http://www.genomequest.com/docs/section/table-of-contents/

This documentation includes detailed information on coverage, including efforts to add sequences from BRIC countries and emerging country collections.

January 2013

As of January 2013, the GQ-PAT database now contains over 200 million sequences.[3]

According to a GenomeQuest representative, the following updates have been added to the system: [4]

  • Normalized Patent Assignee: Lets you group and filter results by assignee so that you can quickly find or exclude specific patent holders. This feature is only available to Gold+ and Platinum subscribers. See this page on the GenomeQuest Documentation Wiki for more information on the Normalized Patent Assignee.
  • Subject database filtering: Searches can now be prefiltered by publication date, by removing jumbo sequences, or both, which narrows the results to more relevant hits. See this page on the GenomeQuest Documentation Wiki for more information on Subject Databases Prefiltering.
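The subject database prefiltering described above can be pictured with a minimal sketch. It does not use GenomeQuest's actual interface: the `SubjectRecord` type, the `prefilter` function, and the 10,000-residue cutoff for a "jumbo" sequence are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class SubjectRecord:
    """Hypothetical stand-in for one sequence record in a subject database."""
    patent_number: str
    publication_date: date
    sequence_length: int


def prefilter(
    records: Iterable[SubjectRecord],
    earliest: Optional[date] = None,
    latest: Optional[date] = None,
    max_length: Optional[int] = None,
) -> List[SubjectRecord]:
    """Keep records inside the date window and at or below max_length.

    Any criterion left as None is not applied.
    """
    kept = []
    for record in records:
        if earliest is not None and record.publication_date < earliest:
            continue
        if latest is not None and record.publication_date > latest:
            continue
        if max_length is not None and record.sequence_length > max_length:
            continue
        kept.append(record)
    return kept


# Illustrative data only; these are not real patent records.
records = [
    SubjectRecord("EXAMPLE-1", date(2009, 5, 1), 450),
    SubjectRecord("EXAMPLE-2", date(2012, 3, 15), 2_500_000),  # "jumbo"
    SubjectRecord("EXAMPLE-3", date(2012, 8, 30), 1_200),
]

# Assumed cutoff: treat anything over 10,000 residues as a jumbo sequence.
filtered = prefilter(records, earliest=date(2012, 1, 1), max_length=10_000)
```

Applying the filters before the search runs, rather than to the hit list afterwards, is what makes this a prefilter: excluded records are never compared against the query.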


October 2012

According to a GQ-PAT Premium Update Sheet from GenomeQuest, GQ-PAT Premium now offers an Emerging Countries Domestic Patents Archive that covers "domestic-only filed applications/patents with sequences from China, Brazil, India and an expanding list of countries":[5]

  • India - GQ has identified and acquired hundreds of India patents with sequences. These are being curated and formatted. We currently have 260 of these domestic-only filed Indian patents with sequences and will only continue to add going forward. By November, we expect to have an additional 800 Indian patents available through GQ-PAT Premium.
  • Brazil - There are over 270 Brazil patents in GQ-PAT Premium. This brings us up to date with all domestic only Brazil patents with sequences. We will continue to add new Brazil patents with sequences moving forward.
  • China - We have made strides to bring new China patents with sequences into GQ-PAT Premium faster. We believe we have the industry’s fastest delivery of sequence-based patents from the China patent office (SIPO).
The next countries under consideration to add to the archive include "Argentina, South Africa, Canada, Australia, Vietnam, Russia, European Countries (patents with no EP), Mexico, Taiwan, Indonesia, Philippines, Vietnam, Chile, and Malaysia."[5]

According to a GQ-PAT Premium Update Sheet from GenomeQuest, extended legal statuses are now available on GQ-PAT Premium, and "these legal statuses are not limited to Premium content, but are only accessible to Premium customers."[5] The update sheet gives the following description of the extended legal status:
When doing searches for patents using GenomeQuest’s premier IP product, you will be able to view legal statuses beyond Granted and Application. Legal Statuses will include: Pending, Granted, Lapsed, Expired, and Revoked and are updated monthly, presenting the most current and up to date information available.

For more information on viewing the Extended Legal Status and Legal Status National Phase (which will only be available to users that subscribe to GQ-PAT Gold+ or GQ-PAT Platinum after January 2013), see this page on the GenomeQuest Documentation Wiki.[6]


December 2010

GenomeQuest announced the expansion of its data coverage to sequences filed at the Chinese patent office (SIPO). As of early December 2010, over 40,000 sequences had been indexed into the GenomeQuest collection from over 5,000 Chinese patents.

For more information, see the PRNewswire press release.

Data Coverage

GenomeQuest maintains its own patent sequence database, GQ-PAT. Many other non-patent datafiles are also simultaneously searchable on the platform, such as GenBank, RefSeq, Swiss-Prot, and other NCBI and EBI files.


GQ-PAT

The GQ-PAT database is an aggregated collection of sequence data and annotations from major national and regional patent offices, including US Applications, US Patents, US PSIPS data, EP, JP, KR, CN, IN, and WO/PCT documents, plus many smaller authorities. JP records are added from the DNA Databank of Japan (DDBJ), where the Japanese Patent Office (JPO) deposits patents that contain sequence data. Documents from the WO/PCT office are not available in electronic format, so GenomeQuest processes them in-house with OCR technology, using a proprietary process to render the sequence data searchable. The OCR output is then corrected by hand, guided by documents from machine-readable sources (such as a US family member with electronic sequence data). Other documents are identified using technology that detects sequence information, and are then manually parsed and curated.

The following information was available on the GenomeQuest wiki as of September 28, 2010:[7]

There are many mistakes existing in the original data. The typical mistakes include sequence listing malformat, typographical errors, incorrect numbering, miscounting, corrupted files, etc. A set of intelligent rules is used in constructing GQ-Pat Gold to detect these mistakes and fix them automatically whenever possible or manually when necessary without compromising GQ-Pat Gold database content accuracy, and timeliness.
Very often, non-machine readable WIPO applications and their machine-readable US (or other national) applications are filed concurrently. In this case, our automated system uses Optical Character Recognition (OCR) software to acquire and retrieve sequence information, followed by human editing to correct remaining errors with guidance from other national machine-readable sources. This strict QC process delivers sequences of high quality.
For GQ-Pat Gold+ and Platinum data, GenomeQuest employs automation to detect documents which may have sequences, and then manually curates those documents for sequences, sequences in tables, and sequences in figures.
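The quoted QC process pairs OCR output with a machine-readable source to guide human correction. GenomeQuest's process is proprietary and not documented here, so the following is only a rough sketch of the idea: it uses Python's standard `difflib` module to list where an OCR-derived sequence disagrees with a reference copy. The function name `flag_ocr_discrepancies` and the sample sequences are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def flag_ocr_discrepancies(
    ocr_sequence: str, reference_sequence: str
) -> List[Tuple[str, int, int, str, str]]:
    """List the spans where the OCR text differs from the reference.

    Each entry is (operation, start, end, ocr_text, reference_text), where
    start and end index into ocr_sequence and operation is one of
    'replace', 'delete', or 'insert'.
    """
    # autojunk=False turns off difflib's popularity heuristic, which would
    # otherwise treat very frequent characters in long sequences as junk.
    matcher = SequenceMatcher(None, ocr_sequence, reference_sequence, autojunk=False)
    return [
        (tag, i1, i2, ocr_sequence[i1:i2], reference_sequence[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Illustrative case: the OCR step misread a 'C' as the letter 'O'.
discrepancies = flag_ocr_discrepancies("ATGCCGTAOGT", "ATGCCGTACGT")
```

In a workflow like the one quoted, a list of this kind would direct a human editor to the positions that need checking. It would not apply corrections automatically.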

GeneSeq

The Thomson Reuters database of genetic sequences from patent documents, GeneSeq, can be loaded onto the GenomeQuest platform through a subscription agreement with Thomson Reuters.


GQ-PAT vs. Thomson Reuters GeneSeq

GeneSeq is a database of manually edited records that are prepared by Thomson Reuters indexing staff during the production of the Derwent World Patents Index file. It contains sequences from patent documents from over 40 patenting authorities. Because of the labor involved in manually indexing these documents, the GeneSeq database contains only one record per patent family, and there is a delay before documents are indexed into the database.

In contrast, GQ-PAT can offer more timely data, and it processes all patents with sequence data from the covered authorities. As of January 2013, GQ-PAT manually indexes all non-ST.25 sequence data it finds in patent documents, including sequences found in tables and figures. GQ-PAT draws on a broad range of sources, including US, EP, WO/PCT, CN, JP, KR, RU, IN, BR, TW, European domestic collections, and less commonly covered authorities.


USGENE vs GQ-PAT

As of January 28, 2011, statistics published by SequenceBase and GenomeQuest indicate that USGENE[8] had 4.9% more U.S. sequence publications than GQ-PAT[9]. This may be partly due to a difference in timeliness: USGENE is updated weekly, with records available 3 days after USPTO publication, whereas GQ-PAT is updated once every two weeks, with records available 11 days after USPTO publication.[10] As of January 2013, GQ-PAT has been adding non-ST.25 US documents to its Gold+ offering through a process of manual curation.


Search Interface

GenomeQuest offers four sequence search options. GenePAST identity search is usually the preferred method for intellectual property searching. BLAST similarity search is also available, as are two other algorithms: fragment search and motif search.
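To illustrate what an identity figure expresses, the sketch below computes percent identity over a pair of sequences that have already been aligned. This is a generic calculation, not GenePAST's algorithm, which is proprietary. The function name `percent_identity`, the use of `-` as the gap character, and the choice of alignment length as the denominator are assumptions, and other definitions of percent identity divide by a different length.

```python
def percent_identity(aligned_query: str, aligned_subject: str) -> float:
    """Percentage of alignment columns holding identical residues.

    Both inputs must be the same length and use '-' for gaps. Columns that
    contain a gap never count as a match. The denominator is the full
    alignment length, which is only one of several conventions in use.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_query:
        raise ValueError("alignment must not be empty")
    matches = sum(
        1
        for q, s in zip(aligned_query, aligned_subject)
        if q == s and q != "-"
    )
    return 100.0 * matches / len(aligned_query)


# Four matching columns out of six: one mismatch and one gap column.
identity = percent_identity("ACGT-A", "ACCTTA")
```

A similarity search such as BLAST instead scores alignments with substitution matrices and gap penalties, so the two approaches can rank the same hits differently.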


References

  1. GenomeQuest Website, http://genomequest.com/content. Accessed February 6, 2014.
  2. GenomeQuest Website, http://www.genomequest.com/docs/section/available-genomequest-databases. Accessed February 6, 2014.
  3. "GQ Pat Numbers." GenomeQuest Wiki, http://wiki.genomequest.com/index.php/GQ_Pat_Numbers. Accessed January 21, 2013.
  4. Email Correspondence with GenomeQuest representative. Received January 16, 2013.
  5. GenomeQuest. "GQ-PAT Premium: Emerging Countries Domestic Patents Archive & Extended Legal Status." GQ-PAT Premium Update Sheet (PDF) received from GenomeQuest representative. Accessed January 16, 2013.
  6. "ELS." GenomeQuest Documentation Wiki, http://wiki.genomequest.com/index.php/ELS#Extended_Legal_Status_and_Legal_Status_National_Phase. Accessed January 16, 2013.
  7. "GQ PAT: Data QC." GenomeQuest Wiki, http://wiki.genomequest.com/index.php/GQ_Pat#Timeliness_of_GQ-PAT_Data. Accessed September 28, 2010.
  8. STN website, http://www.stn-international.com/usgene.html. Accessed February 22, 2011.
  9. GenomeQuest website, http://genomequest.s3.amazonaws.com/GQ-IP-Content.pdf#page=2. Accessed February 22, 2011.
  10. "GQ Pat." GenomeQuest wiki, http://wiki.genomequest.com/index.php/GQ_Pat. Accessed February 9, 2012.
