This search system report was created by the Intellogist Team.
Chemical Structure Drawing
Markush structure search systems must support complex queries as well as complex indexing schemes, because a single structure query can itself admit many variations on a theme. Matching a generically defined query compound against the millions of generically disclosed structures contained in patent documents can be computationally demanding; fortunately, the Markush DARC software upon which MMS is based can handle this demand. The program offers query-building features that allow great variety through the use of generic groups, free sites, and unspecific atoms.
The MMS instruction manual divides chemical structure queries into three general categories: fully specified structure queries, substructure queries, and Markush structure queries. A fully specified structure query has no ambiguity: it represents exactly one desired chemical structure. A substructure query has a given backbone, or partial backbone, and is designed to return structures that may have additional branches through the use of free sites or G-groups; in addition, its bonds may be variable or undefined, and the exact elemental identity of some of its atoms may be undefined. Finally, a Markush structure query is defined by the use of “superatoms,” system terms that can be added to a structure to represent a generic class of substituents, such as alkyl groups.
These distinctions between the types of structure that can be drawn and searched in MMS are conceptual, intended to teach the user the system’s potential; the system does not require the user to enter any special commands before designing one of these query types. Chemical structure queries for MMS can be defined via command line input, or via the TOPFRAG program, which takes a graphical chemical structure and generates the appropriate MMS command line code from it. Users can also draw the chemical structure through a graphical interface in ChemAxon's MarvinSketch. The three sections below describe these approaches.
Chemical Structure Drawing – In MMS Command Language
Chemical structure drawing in MMS can be done through a series of drawing commands in which numbers are used to represent the nodes of a structure. The first step of any structure drawing effort is to define the structure’s backbone by using the “Graph” (GR) command. A wide variety of chemical attributes may then be defined by using the node number(s) to convert atoms and bonds to their necessary type. Most (if not all) users will find it necessary to draw and label the nodes of their desired structure before beginning to draw it, to ensure that the proper connections are made and the appropriate backbone structure is drawn without error.
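The pencil-and-paper labelling step can also be double-checked offline before any drawing commands are typed. The following is a minimal sketch in Python and is not part of MMS: it records a hand-labelled structure as numbered nodes and bonds, then reports repeated bonds and fragments that were never joined to the backbone. The example ring, the `BONDS` list, and the function names are all hypothetical, and no MMS command syntax is implied.

```python
from collections import deque

# Hypothetical hand-labelled sketch: a six-membered ring (nodes 1-6)
# carrying one substituent (node 7 attached to node 1).
BONDS = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1), (1, 7)]


def build_adjacency(bonds):
    """Map each node number to the set of nodes it is bonded to."""
    adjacency = {}
    for a, b in bonds:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def check_backbone(bonds):
    """Return a list of problems in the labelled sketch (empty if none found)."""
    problems = []
    seen_bonds = set()
    for a, b in bonds:
        if a == b:
            problems.append(f"node {a} is bonded to itself")
        key = frozenset((a, b))
        if key in seen_bonds:
            problems.append(f"bond {a}-{b} is listed more than once")
        seen_bonds.add(key)

    adjacency = build_adjacency(bonds)
    if not adjacency:
        return ["no bonds were given"]

    # Breadth-first walk from the lowest-numbered node; any node not
    # reached belongs to a fragment that was never joined to the backbone.
    start = min(adjacency)
    reached = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in reached:
                reached.add(neighbour)
                queue.append(neighbour)
    unreached = sorted(set(adjacency) - reached)
    if unreached:
        problems.append(f"nodes {unreached} are not connected to node {start}")
    return problems


if __name__ == "__main__":
    print(check_backbone(BONDS) or "sketch is connected")
```

A check like this only confirms that the labelled sketch is internally consistent; it says nothing about whether the structure is chemically sensible or acceptable to MMS.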
Just as generically claimed structures in a patent may encompass many specific compounds, chemical structure searchers may often need their queries to encompass multiple variations on a basic structure. Constructing generic queries can be accomplished in a number of ways in MMS: by using free sites, by using non-specific atoms or generic groups, or by defining a selected set of optional structure variations through a variable group. These options are explained more fully in the Generic Structure Building section. In this section, we will focus on using the interface to draw and “verify” a query for searching.
Structure drawing takes place within the “QT” or “query text” menu. After entering MMS, the user types the command QT to enter the query drawing phase. At that point they may choose to retain the last structure query created in the system, provided it was drawn within the past week: structure queries are automatically retained for up to one week unless saved by the user, and the cache of unsaved query drawings and answer sets is erased once a week, usually on Saturdays.
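To get a rough idea of how long an unsaved query is likely to survive, the weekly purge can be estimated from the drawing date. The sketch below is illustrative only: the function name is hypothetical, the purge is only "usually" on a Saturday, and treating a query drawn on a Saturday as lasting until the following Saturday is an assumption rather than documented behaviour.

```python
from datetime import date, timedelta

SATURDAY = 5  # date.weekday() numbers Monday as 0, so Saturday is 5


def next_weekly_purge(drawn_on):
    """Return the first Saturday strictly after the day the query was drawn."""
    days_ahead = (SATURDAY - drawn_on.weekday()) % 7
    if days_ahead == 0:
        # Assumption: a query drawn on a Saturday lasts until the next one.
        days_ahead = 7
    return drawn_on + timedelta(days=days_ahead)


if __name__ == "__main__":
    drawn = date(2011, 7, 28)
    print("Estimated purge date:", next_weekly_purge(drawn).isoformat())
```

Saving the query remains the only reliable way to keep it, since the estimate depends on a purge schedule that may vary.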
Whether the user chooses to draw a new structure, or edit a previously entered one, the following command prompt will be displayed in QT mode:
- - QT - (CN,CA,GM,GI,GR,BO,AT,FS,AP,VP,ATTR,VE) ?
The following text, displayed upon login to MMS, gives a basic overview of the available functionality for defining queries and the order in which the steps should occur:
- You should first specify the graph (the basic skeleton) of your query:
- - use the GR command.
- When the graph is defined, the following commands should be used to modify it:
- - AT : to move to the ATom specification level
- - BO : to move to the BOnd specification level
- - FS : to move to the Free Site (optional site of substitution) specification level
- - TRA : to move to the TRAnslation attribute specification level
- - CR : to move to the Chain-Ring attribute specification level
- and also
- - CH : to move to the CHarge specification level
- - AM : to move to the Abnormal Mass (Isotope) specification level
- - DT : to move to the Deuterium-Tritium specification level
- - PA : to move to the Polymer attribute specification level
- - AV : to move to the Abnormal Valency specification level
- A G0 group can have some [may have some variable groups, or] G groups (e.g. G1, G2, ...). In order to specify the alternatives for each of these Gi group you should first use the
- - GM command : to move to the definition of a given group. [GM stands for “Group Markush”]
- When a given Gi group has some multi-node moieties, you should use the
- - AP command : to specify the attachment points to the parent group.
- The following commands are also available
- - VP : when a group has some Variable Positions of Attachment, use the VP command to specify them.
- - GI : when 2 groups Gx and Gy are identical, use the GI command to specify the content of Gy simply by copying the previously defined content of Gx [GI stands for “Group Identical”]
- - CA : use the CA command to cancel the complete query (CA then ALL), the current group (CA then GM) or a given attribute in the current group (CA then [attribute symbol] e.g. CA TRA to cancel the translation attributes).
- - VE : use the VE command to VErify (display) the current group; use VE to get a graphic display, or VE TX to get an alphanumeric display.
- When the query is completed use the
- - FI command : to move back to the ST level and start the search.
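The command overview above can be condensed into a small lookup table for planning a drawing session offline. The Python sketch below is an illustration, not an MMS tool: the descriptions are paraphrased from the login text, `check_plan` is a hypothetical helper, and the plan holds only the two- and three-letter command codes because the full argument syntax of each command is not covered here. It also assumes a new query, which starts with GR and ends with FI.

```python
# Command codes and meanings, paraphrased from the MMS login help text.
MMS_QT_COMMANDS = {
    "GR": "specify the graph (basic skeleton)",
    "AT": "atom specification level",
    "BO": "bond specification level",
    "FS": "free site specification level",
    "TRA": "translation attribute specification level",
    "CR": "chain-ring attribute specification level",
    "CH": "charge specification level",
    "AM": "abnormal mass (isotope) specification level",
    "DT": "deuterium-tritium specification level",
    "PA": "polymer attribute specification level",
    "AV": "abnormal valency specification level",
    "GM": "definition of a given group",
    "AP": "attachment points to the parent group",
    "VP": "variable positions of attachment",
    "GI": "copy the content of an identical group",
    "CA": "cancel the query, a group, or an attribute",
    "VE": "verify (display) the current group",
    "FI": "return to the ST level to start the search",
}


def check_plan(plan):
    """Return a list of problems with a planned sequence of command codes."""
    problems = []
    unknown = [code for code in plan if code not in MMS_QT_COMMANDS]
    if unknown:
        problems.append(f"unknown command codes: {unknown}")
    if not plan or plan[0] != "GR":
        problems.append("a new query should start with GR")
    if not plan or plan[-1] != "FI":
        problems.append("the plan should end with FI")
    if "FI" in plan[:-1]:
        problems.append("FI appears before the end of the plan")
    return problems


if __name__ == "__main__":
    plan = ["GR", "AT", "BO", "FS", "VE", "FI"]
    for code in plan:
        print(f"{code:>3}  {MMS_QT_COMMANDS[code]}")
    print(check_plan(plan) or "plan order looks reasonable")
```

The ordering rules are deliberately loose: the help text only states that the graph comes first and FI comes last, so the sketch does not enforce any order among the attribute commands in between.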
After each drawing step, users can issue the “VE” command to open a graphic display window, called the MrkGraphX window, which shows the current state of the structure query. This window may take a few moments to load.
Within the MrkGraphX display window, a left-side pane shows all the G-groups that have been defined on the structure. If the user has defined variable G-groups off the parent structure, they can be viewed by double-clicking the G-group of interest in this pane. While viewing a secondary G-group, the parent structure can also be recalled, and will appear in a small pop-up window for comparison.
A setting in the display window, when activated, will also place an image of the structure into the MMS session transcript, so that the user can refer to it when typing his/her next drawing command. Choosing “Automatic Capture Query – ON” will enable this feature.
Once the structure drawing is complete, the user types “FI” to leave the structure query editor, and is placed back at the ST prompt, from which a search may be conducted.
Because structure drawing in MMS requires the user to be logged on to the service, a connect-hour fee is charged for the time needed to define the structure query. The most economical way to search by this method is therefore to draw the desired structure, label the nodes, and prepare the list of drawing commands for the needed attributes on paper before logging in.
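The saving from preparing commands in advance is simple arithmetic, sketched below in Python. Every figure here is a placeholder chosen for illustration, not actual MMS pricing or measured session times, and billing pro-rated by the minute is itself an assumption.

```python
def connect_cost(minutes_online, rate_per_hour):
    """Cost of a session billed by connect time, pro-rated by the minute."""
    if minutes_online < 0 or rate_per_hour < 0:
        raise ValueError("minutes and rate must be non-negative")
    return rate_per_hour * minutes_online / 60


# Placeholder figures, NOT actual MMS pricing or timings.
RATE_PER_HOUR = 120.0      # hypothetical connect-hour fee
UNPREPARED_MINUTES = 30    # hypothetical time spent drawing while logged in
PREPARED_MINUTES = 10      # hypothetical time spent typing prepared commands

unprepared = connect_cost(UNPREPARED_MINUTES, RATE_PER_HOUR)
prepared = connect_cost(PREPARED_MINUTES, RATE_PER_HOUR)
saving = unprepared - prepared

if __name__ == "__main__":
    print(f"unprepared: {unprepared:.2f}, prepared: {prepared:.2f}, saving: {saving:.2f}")
```

Substituting the real connect-hour rate and realistic session lengths gives a quick estimate of what offline preparation is worth for a given query.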
Chemical Structure Drawing – With the TOPFRAG Program
TOPFRAG is a chemical structure drawing tool that lets users draw the desired structure with the attributes allowed by the MMS search engine. Once the user has finished, the program converts the graphic structure into a section of code which can be pasted into the MMS command line, where it automatically generates the appropriate structure query. Used in conjunction with MMS, TOPFRAG saves time and connect-hour fees, and lets the user check the work at length before submitting the query to MMS, which helps prevent mistakes.
In addition to the advantages described above, TOPFRAG has a further timesaving feature: it allows users to select common structure elements, such as benzene rings, from a menu of templates.
Markush TOPFRAG also creates query language that can be used to search the DWPI file using chemical fragmentation code indexing; however, this is not a search that can be conducted through MMS.
Chemical Structure Drawing – In JChem
Chemical structures in JChem can be drawn using MarvinSketch, the "integrated structure drawing tool of JChem." The MarvinSketch user guide gives detailed instructions on how to utilize all features of MarvinSketch, including how to draw Markush structures and R-group queries with the software. MarvinSketch is similar in use and functionality to the TOPFRAG program, since both programs allow users to draw chemical structures through a graphical interface and define attributes for certain elements of the structure.
However, unlike TOPFRAG, MarvinSketch does not generate code representing the structure query that must then be pasted into a command line. Instead, MarvinSketch structure drawings can be uploaded directly to the query form. Users choose "Add detail field" and then select a "Markush structure" drawn in MarvinSketch to be added as a term in the query.
Once the user has uploaded a saved structure drawing from MarvinSketch into the query form, they can define the type of search they wish to conduct. According to the guide "Special search types: Markush structures" on the ChemAxon website, "all structural search types are allowed for Markush targets/tables. (DUPLICATE, SUBSTRUCTURE, SUPERSTRUCTURE, FULL and FULL_FRAGMENT search types.)"
Finally, users can define other attributes of the structure query, such as:
- Stereochemistry (On, Exact, Diastereomer, Enantiomer, or Off)
- Double bond stereo check (All, Marked, or Off)
- Atom matching (On, Exact, or Ignore for Charges, Isotopes, Radicals, and Valence)
- Tautomer (On or Off)
- Vague bond (Off, Ambiguous aromaticity 5-membered rings, Ring bonds "or aromatic", All bonds "or aromatic", or Ignore bond types)
Once a user has selected the options to apply to the structure query and added any additional text field terms, they can run the query in the MMS database.
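The search types and option values listed above can be written down as a small table and used to sanity-check a planned query before it is entered in the form. The Python sketch below is only an illustration: it does not call JChem or MarvinSketch, the names `SEARCH_TYPES`, `ALLOWED_OPTIONS`, and `validate_query` are hypothetical, and splitting atom matching into four separate keys is a modelling choice made here.

```python
# Search types quoted from the ChemAxon guide cited in the text.
SEARCH_TYPES = {"DUPLICATE", "SUBSTRUCTURE", "SUPERSTRUCTURE", "FULL", "FULL_FRAGMENT"}

# Option values as listed in the query form description above.
ATOM_MATCHING = {"On", "Exact", "Ignore"}
ALLOWED_OPTIONS = {
    "Stereochemistry": {"On", "Exact", "Diastereomer", "Enantiomer", "Off"},
    "Double bond stereo check": {"All", "Marked", "Off"},
    "Charges": ATOM_MATCHING,
    "Isotopes": ATOM_MATCHING,
    "Radicals": ATOM_MATCHING,
    "Valence": ATOM_MATCHING,
    "Tautomer": {"On", "Off"},
    "Vague bond": {
        "Off",
        "Ambiguous aromaticity 5-membered rings",
        'Ring bonds "or aromatic"',
        'All bonds "or aromatic"',
        "Ignore bond types",
    },
}


def validate_query(search_type, options):
    """Return a list of problems with a planned search (empty if none found)."""
    problems = []
    if search_type not in SEARCH_TYPES:
        problems.append(f"unknown search type: {search_type}")
    for name, value in options.items():
        allowed = ALLOWED_OPTIONS.get(name)
        if allowed is None:
            problems.append(f"unknown option: {name}")
        elif value not in allowed:
            problems.append(f"{name} cannot be set to {value!r}")
    return problems


if __name__ == "__main__":
    planned = {"Stereochemistry": "Exact", "Tautomer": "Off", "Charges": "Ignore"}
    print(validate_query("SUBSTRUCTURE", planned) or "planned options are valid")
```

Options left out of the dictionary are simply not checked, so the sketch makes no assumption about what the form's default settings are.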
- "JChem Query Guide." ChemAxon website, http://www.chemaxon.com/jchem/doc/user/queryindex.html. Accessed July 28, 2011.
- "Special search types: Markush structures." ChemAxon website, http://www.chemaxon.com/jchem/doc/user/query_markush.html. Accessed June 28, 2011.