Report:MMS/Chemical Structure Tools/Generic Structure Indexing
From Intellogist
This search system report was created by the Intellogist Team.
Generic Structure Indexing in MMS
Because generic structures in MMS have to cover the full gamut of compounds disclosed or claimed in patent documents, an indexed structure is allowed more variability than a structure query. Indexed structures created by MMS indexers may contain up to 50 G-groups and 4 levels of nesting (e.g., G3 cites G10, G10 cites G16, G16 cites G22, and G22 cites G30).
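The two limits above can be sketched as a simple check. This is only an illustration, not how MMS stores or validates structures: the dictionary layout and the function names are hypothetical, and it assumes that each "cites" step counts as one level of nesting, which is how the example chain in the text adds up to 4.

```python
# Limits quoted in the text above; every other name here is illustrative.
MAX_G_GROUPS = 50
MAX_NESTING_LEVELS = 4


def nesting_levels(cites, group, _seen=()):
    """Count the longest chain of citation steps starting from `group`."""
    if group in _seen:
        raise ValueError(f"circular citation involving {group}")
    children = cites.get(group, [])
    if not children:
        return 0
    return 1 + max(nesting_levels(cites, c, _seen + (group,)) for c in children)


def within_indexing_limits(cites):
    """Return True if the structure respects both limits from the text."""
    # Collect every G-group, whether it cites others or is only cited.
    groups = set(cites) | {c for cs in cites.values() for c in cs}
    deepest = max((nesting_levels(cites, g) for g in groups), default=0)
    return len(groups) <= MAX_G_GROUPS and deepest <= MAX_NESTING_LEVELS


# The chain from the text: G3 cites G10, G10 cites G16, and so on.
example = {"G3": ["G10"], "G10": ["G16"], "G16": ["G22"], "G22": ["G30"]}
```

Under this counting, `nesting_levels(example, "G3")` gives 4, so the example chain sits exactly at the limit; one further citation from G30 would exceed it.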
Indexers always attempt to create records that can be found by a specific query, even if portions of the structure are only broadly defined in the patent document. When a patent disclosure does not specify superatom attributes (e.g., it specifies an alkyl chain without describing its length or whether it is straight or branched), all possible attributes are included in the indexing for that structure: the document is indexed as having an alkyl chain, CHK, with all of the chain length attributes (LO, MID, and HI) and both of the straight/branched attributes (STR and BRA). As a result, if a user searches for the disclosed structure with a branched long-chain alkyl (CHK with BRA and HI), the structure query can retrieve the indexed patent with the unspecified alkyl chain.
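The effect of indexing every possible attribute can be sketched as a set-overlap test. The superatom and attribute codes come from the text; the data model, the function names, and the matching rule itself are assumptions made for illustration and are not a description of the actual MMS search engine.

```python
# Attribute codes taken from the text; the data model is hypothetical.
CHAIN_LENGTH = {"LO", "MID", "HI"}
BRANCHING = {"STR", "BRA"}


def index_superatom(code, length=None, branching=None):
    """Build an entry, filling each unspecified attribute with all options."""
    return {
        "code": code,
        "length": {length} if length else set(CHAIN_LENGTH),
        "branching": {branching} if branching else set(BRANCHING),
    }


def matches(indexed, query):
    """Assumed rule: codes agree and each attribute set overlaps."""
    return (
        indexed["code"] == query["code"]
        and bool(indexed["length"] & query["length"])
        and bool(indexed["branching"] & query["branching"])
    )


# A patent that discloses "an alkyl chain" with no further detail.
unspecified_alkyl = index_superatom("CHK")
# A patent that discloses a short, straight alkyl chain.
short_straight = index_superatom("CHK", length="LO", branching="STR")
# A query for a branched long-chain alkyl.
query = index_superatom("CHK", length="HI", branching="BRA")
```

Here `matches(unspecified_alkyl, query)` is true because the broadly indexed record carries every attribute, while `matches(short_straight, query)` is false.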
Patent documents sometimes include qualifying information about disclosed structures that cannot be represented faithfully in the chemical indexing scheme of MMS. When this occurs, text notes convey the information that cannot be represented graphically. Text notes cannot be searched, but they are displayed alongside the relevant structures in the results set. A text note may apply to the entire Markush structure, may relate to a particular superatom, or may give the number of repetitions in a repeating unit. The name “text notes” is a little misleading, as the notes are actually drawn from a list of pre-defined codes. For example, if a patent discloses a structure having an alkyl chain with 2-4 carbons, the indexer would use the superatom CHK, for alkyl chain, and designate it with the attribute LO, meaning that the chain has 1-6 carbons. The indexer knows that the chain may only have 2-4 carbons but cannot index to that level of specificity. To communicate that information to the user, the indexer adds the text note “C2-4” to that superatom. Below are a few examples of common types of text notes.[1]
| Type of Information | Text Note Format | Related Superatoms | Examples |
| Number of Carbon Atoms | C# | CHK, CHE, CHY, CYC | C2 (just 2); C2-4 (from 2 to 4); C2- (at least 2) |
| Number of Double Bonds | E# | CHE, CHY | E1; E1-2; E2- |
| Etc... | | | |
NOTE: This is not a complete list of the text notes used in the system.
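The note formats in the table are regular enough to read mechanically. The following is a minimal sketch that covers only the two formats shown (C# and E#); the function name and the returned tuple are hypothetical, and the real system uses further note types that are not listed here.

```python
import re

# Format letters from the table: C = carbon atoms, E = double bonds.
# Shapes covered: "C2" (exact), "C2-4" (range), "C2-" (open-ended).
NOTE_PATTERN = re.compile(r"^([CE])(\d+)(-(\d+)?)?$")


def parse_text_note(note):
    """Return (kind, minimum, maximum); maximum is None when open-ended."""
    m = NOTE_PATTERN.match(note)
    if m is None:
        raise ValueError(f"unrecognised text note: {note!r}")
    kind, low, dash, high = m.group(1), int(m.group(2)), m.group(3), m.group(4)
    if dash is None:
        return kind, low, low      # "C2": just 2
    if high is None:
        return kind, low, None     # "C2-": at least 2
    return kind, low, int(high)    # "C2-4": from 2 to 4
```

For the example in the text, `parse_text_note("C2-4")` yields the carbon range 2 to 4, which falls inside the 1-6 range implied by the LO attribute.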
Sources
1. Borne, Philip, Laurence Favier, and Catherine Roesch. MMS User Manual. Chapter 4, "Searching and Answer Display." Questel Website, http://www.questel.com/Prodsandservices/mms_chemistry.htm. Accessed on January 29, 2007.


