QSAR



8       Working with descriptors

A descriptor is any one of a number of molecular properties that QSAR+ can calculate and use in determining new QSAR relationships. QSAR+ provides over 100 different descriptors in a variety of categories.

For information on the descriptors available in each category or family, see the Theory section of this guide.

To meet your requirements for building QSAR equations, QSAR+ enables you to work with descriptors in a variety of ways.

Managing descriptors

You manage descriptors by using several control panels (described later in this chapter). Descriptor management includes activities such as identifying the descriptors with which you want to work, displaying and selecting only descriptors in a specific class, specifying preferences for the various descriptors, and adding descriptors to the study table.

Editing the descriptor database

When QSAR+ is installed, you can access a descriptor database that contains the equations used to calculate molecular descriptors. You can edit this database to modify the supplied descriptors, create new descriptors, specify which descriptors should be considered default descriptors, create new descriptor categories, and control the format in which the results of descriptor calculations are displayed in the study table.

The following activities related to working with descriptors are included in this chapter:

Default descriptors sets
Managing descriptors
Using receptor surface analysis descriptor
Rule of five


Descriptors in the combi-chem documentation

Additional information about descriptors (and other information) can be found in the combi-chem documentation.

In Start-up and Configuration:

Daylight setup
Oracle setup
MDL ISIS setup

In 4d. Descriptors for library analysis:

Using molecular diversity descriptors
Statistical analyses and data-mining techniques

In Theory:

Graph-theoretic descriptors
Information-theoretic descriptors
Descriptors based on projections of the molecular surface (shadow indices)
Descriptors based on partial charges mapped on surface area
2D and 3D fingerprints metrics

In 4e. Data analysis and library visualization:

Visualization of compounds in descriptor space
Principal component analysis
Factor analysis
Cluster analysis
Multidimensional scaling


Default descriptors sets

QSAR+ has predefined sets of default descriptors relevant to QSAR, Combichem, and QSPR. These sets are accessible from the study table by going to the Preferences/Defaults Set menu item and selecting the QSAR, COMBICHEM, QSPR, or, if an external set of descriptors is required, Other submenu.

You can see the descriptors in each set by selecting Descriptors/Databases from the study table menu bar. This opens the Descriptor Database control panel, which contains a list of descriptors.

The message at the top of the Descriptor Database control panel identifies the current default set.

QSAR default descriptor set

Table 2. QSAR default descriptors

Conformational
EPenalty   Conformational energy penalty.  
LowEne   Lowest energy conformer.  
Energy   Energy.  
Electronic
Charge   Sum of partial charges.  
Fcharge   Sum of formal charges.  
Apol   Sum of atomic polarizabilities.  
Dipole   Dipole moment.  
HOMO   Highest occupied molecular orbital.  
LUMO   Lowest unoccupied molecular orbital.  
Sr   Superdelocalizability.  
Information
InfoContent   Graph-theoretical Information-content indices.  
Molecular shape analysis (MSA)
DIFFV   Difference volume.  
Fo   Common overlap volume (ratio).  
NCOSV   Non-common overlap steric volume.  
ShapeRMS   RMS to shape reference.  
COSV   Common overlap steric volume.  
SRVol   Shape reference volume.  
Quantum mechanical
LUMO_MOPAC   Lowest unoccupied molecular orbital from MOPAC.  
DIPOLE_MOPAC   Dipole moment from MOPAC.  
HF_MOPAC   Heat of formation from MOPAC.  
HOMO_MOPAC   Highest occupied molecular orbital from MOPAC.  
Receptor
Receptor_energies   Molecule-receptor interaction energies.  
Receptor_RSA   Molecule-receptor points interaction energies.  
Spatial
RadOfGyration   Radius of gyration.  
Jurs descriptors   Jurs charged partial surface areas descriptors.  
Shadow indices   Surface area projections descriptors.  
Area   Molecular surface area.  
Density   Density.  
PMI   Principal moment of inertia.  
Vm   Molecular volume.  
Structural
MW   Molecular weight.  
Rotlbonds   Number of rotatable bonds.  
Hbond acceptor   Number of hydrogen-bond acceptor groups.  
Hbond donor   Number of hydrogen-bond donor groups.  
Chiral centers   Count of the number of chiral centers (R or S) present in a molecule.  
Thermodynamic
AlogP   Ghose and Crippen logP.  
AlogP98   Log of the partition coefficient, atom-type value.  
Fh2o   Desolvation free energy for water.  
Foct   Desolvation free energy for octanol.  
Hf   Heat of formation.  
MolRef   Ghose and Crippen molar refractivity.  
Topological
Balaban   Balaban indices.  
Kappa indices   Molecular shape kappa indices.  
PHI   Molecular flexibility index.  
SubgraphCount   Subgraph counts.  
Chi indices   Kier & Hall chi connectivity indices.  
Wiener   Wiener Index.  
log Z   Logarithm of Hosoya index.  
Zagreb   Zagreb index.  


Managing descriptors

This section describes how to use the default descriptors, select descriptors, set descriptor preferences, and add descriptors to the study table.

Using the default descriptors

To add the default descriptors set to the study table, select the Descriptors/Add Default menu item in the study table. This adds the current descriptors database to the study table. A button in the study table is also available to do this.

Selecting descriptors

Descriptors are selected using the Descriptors control panel. To access the Descriptors control panel, select the Descriptors/Select menu item in the study table.

The Descriptors control panel contains a list of the descriptors in the current descriptors database. To select a descriptor, click its name in the first column. For example, clicking EPenalty highlights that row of the descriptor table, which means the descriptor will be added to the study table (see the next section for details). To unselect a descriptor, click any part of the table other than the first column, so that the highlight is turned off.

The Descriptors control panel contains controls that allow you to select groups of descriptors. The left popup controls whether the action that occurs when you click the associated action button is to Select, Deselect, or Display the selected descriptors. For example, if you want to select all the conformational descriptors, choose Select in the left popup and then set the Descriptors in Family popup (far right) to Conformational. Now when you click the (unlabeled) action button (below ADD), the conformational descriptors are selected. To deselect them, change the left popup to Deselect, then click the action button again.

If you find the display of all the descriptors at the same time distracting, you can display just the selected descriptors by setting the popup to Display.

Another way to select a subset of descriptors is to use the All/Default popup. To see the effect of this control, set the Descriptors in Family popup to Electronic, select Default from the All/Default popup, then click the action button.

Setting descriptors preferences

You may have noticed that selecting certain families of descriptors causes the Preferences button to become available and to change its name.

When the Descriptors in Family popup is set to Electronic, for example, the Preferences button is labelled Electronic. When you click this newly active pushbutton, a control panel appears, which allows you to customize certain aspects of the way the electronic descriptors are calculated. For example, if you decide that only the total dipole moment is needed, uncheck the XYZ Components checkbox. Now only the total dipole moment (calculated from atomic partial charges) is added to the study table.

Preferences for the calculation of other types of descriptors can be set in the same way.

Daylight descriptors preferences

The maximum error levels allowed in the Daylight calculation of ClogP and CMR are customizable through the Daylight Descriptors control panel. Options are also provided to add the error level values to the study table as separate columns. Open this control panel by setting the family popup in the Descriptors control panel to Daylight and then selecting the Daylight pushbutton.

Information-content descriptor preferences

The Atomic Composition/Total checkbox in the Information-content Descriptors control panel sets the information of atomic composition index, created by partitioning the atoms of the molecule into equivalence classes based on their atomic numbers.
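As an illustration of the partitioning described above, the following is a minimal sketch and not the QSAR+ implementation. It assumes the usual Shannon form for information content (mean = -sum of p_i log2 p_i over the equivalence classes, total = number of atoms times the mean); the function name and the input format (a list of atomic numbers) are hypothetical.

```python
import math
from collections import Counter


def atomic_composition_information(atomic_numbers):
    """Return (total, mean) information content in bits.

    Atoms are partitioned into equivalence classes by atomic number.
    Assumption: Shannon form, mean = -sum(p_i * log2(p_i)) and
    total = n * mean, where n is the number of atoms.
    """
    n = len(atomic_numbers)
    if n == 0:
        return 0.0, 0.0
    class_sizes = Counter(atomic_numbers).values()
    mean = -sum((k / n) * math.log2(k / n) for k in class_sizes)
    return n * mean, mean


# Methane, CH4: one carbon (Z = 6) and four hydrogens (Z = 1).
total, mean = atomic_composition_information([6, 1, 1, 1, 1])
```

A molecule whose atoms all share one atomic number falls into a single class and therefore has zero information content under this definition.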

If Edge-based is checked, the four buttons below apply to information indices based on the edge adjacency and edge distance matrices, specifically:

Edge adjacency/magnitude
Edge adjacency/equality
Edge distance/magnitude
Edge distance/equality

If Vertex-based is checked, the four buttons apply to information indices based on the adjacency and distance matrices.

Vertex adjacency/magnitude
Vertex adjacency/equality
Vertex distance/magnitude
Vertex distance/equality

The four checkboxes at the bottom of the panel are switches for the Multigraph, Structural, Bonding, and Complementary information-content indices.

For a detailed explanation of this descriptor, see Chapter 5, Theory: QSAR+ descriptors.

Receptor descriptor preferences

Setting the family popup in the Descriptors control panel to Receptor and clicking the Receptor pushbutton opens two control panels: Receptor-Model Interactions and RSA Preferences (receptor surface analysis).

You cannot add receptor descriptors to the study table until you have specified a receptor surface model. For information on this, see Using receptor surface analysis descriptor on page 155.

Spatial preferences

Open the Spatial Descriptors control panel by setting the family popup in the Descriptors control panel to Spatial and then selecting the Spatial button.

This control panel controls the calculation of spatial descriptors such as the moment of inertia about the principal axes of a molecule. For example, if you want the magnitude of the moment of inertia, but not its Cartesian components, uncheck the XYZ Components checkbox. See the Principal moment of inertia (PMI) section, page 101, for a theoretical explanation of the principal moment of inertia descriptor.

Jurs charged partial surface area parameters

The definition of polar atoms and the probe radius for the solvent-accessible surface area calculation can also be customized with the Spatial Descriptors control panel.

Polar atoms can be defined:

The correlation between the Jurs Charged Partial Surface Area Parameters checkboxes in the Spatial Descriptors control panel and the list of Jurs descriptors under Jurs descriptors based on partial charges mapped on surface area is:

Checkbox Toggles calculation of descriptors
Solvent Accessible Surface Area   SAS area descriptor Jurs-SASA  
Partial Charged Surface Areas   Jurs-PPSA-1, Jurs-PNSA-1, Jurs-DPSA-1  
Total Charge Weighted Surface Areas   Jurs-PPSA-2, Jurs-PNSA-2 and Jurs-DPSA-2  
Atomic Charge Weighted Surface Areas   Jurs-PPSA-3, Jurs-PNSA-3, and Jurs-DPSA-3  
Fractional Charged Partial Surface Areas   Jurs-FPSA-1, Jurs-FPSA-2, Jurs-FPSA-3, Jurs-FNSA-1, Jurs-FNSA-2, Jurs-FNSA-3  
Surface Weighted Charged Partial Surface Areas   Jurs-WPSA-1, Jurs-WPSA-2, Jurs-WPSA-3, Jurs-WNSA-1, Jurs-WNSA-2, Jurs-WNSA-3  
Relative Positive and Negative Charges   Jurs-RPCG, Jurs-RNCG, Jurs-RPCS, Jurs-RNCS  
Relative Polar and Apolar Surface Areas   Jurs-TPSA, Jurs-TASA, Jurs-RPSA, and Jurs-RASA  

Shadow indices

For an explanation of the shadow indices see the Shadow indices section on page 97 under Theory. The correlation between the Shadow Parameters checkboxes and the descriptor names is:

Checkbox Toggles calculation of descriptors
Areas of Molecular Shadows   Shadow-XY, Shadow-XZ, and Shadow-YZ  
Fractional Areas of Molecular Shadows   Shadow-XYfr, Shadow-XZfr, and Shadow-YZfr  
Extents of Molecular Shadows   Shadow-nu, Shadow-Xleng, Shadow-Yleng, and Shadow-Zleng  

Defining hydrogen-bond acceptors and donors and rotatable bonds

The definitions of hydrogen-bond acceptors, hydrogen-bond donors, and rotatable bonds can be customized with the Structural Descriptors control panel.

Open this control panel by setting the family popup in the Descriptors control panel to Structural and then selecting the Structural pushbutton.

Thermodynamic descriptors preferences

AlogP98 descriptors

The 115 atom types defined in the calculation of AlogP98 are now available as descriptors. To calculate them, select the entry AlogP_atypes in the Thermodynamic family in the descriptor table. Each AlogP98 atom-type value represents the number of atoms of that type in the molecule. An additional atom type called Unkown_Type can also be added to the table, together with the other AlogP98 atom types. A value greater than zero for this descriptor indicates the presence of atoms that couldn't be classified as any of the defined AlogP98 atom types. The AlogP Atom Types control panel allows you to select the elements to be taken into account.

Open this control panel by setting the family popup in the Descriptors control panel to Thermodynamic and then selecting the Thermodynamic pushbutton.

Topological descriptors preferences

For an explanation of the topological descriptors see the discussion of graph-theoretical (page 73) and information-content descriptors (page 91).

To change preferences for topological descriptors, set the family popup in the Descriptors control panel to Topological and select the Topological pushbutton. The correlation between the checkboxes in the Topological Descriptors control panel and the descriptors is:

Checkbox Toggles calculation of descriptors
Unmodified   Molecular connectivity Indices CHI-0, CHI-1, and CHI-2  
Valence-modified   Valence-modified connectivity index, a refinement which takes into account the atomic number and order of connected bonds.  
Subgraph Order From and To   Range of allowable orders in subgraphs: 0 through M, where M is the number of edges in the graph.  
Subgraph Type   Checkboxes Path, Cluster, Path/Cluster, and Ring specify the subgraph types used with the molecular and valence-modified connectivity indices.  
Kier & Hall Kappa Shape Indices   Shapes of molecules in terms of the count of atoms (One), count of branchings (Two), and count of paths of length 3 (Three).  
Subgraph Counts   Path, Cluster, Path/Cluster, and Ring subgraphs found in the model.  
Balaban Indices   Characterize the shape of a molecule, which can take account of the covalent radii (JX) and electronegativity (JY) of the atoms of the model.  

Adding descriptors to the study table

When you have selected the set of descriptors that you want to use, you add them to the study table by clicking the ADD button in the Descriptors control panel.


Using ISIS keys and Daylight fingerprints

ISIS keys

To work with ISIS keys, select Descriptors/Fingerprints/Isis Keys from the study table to open the 2D Fingerprints Isis Keys control panel. With this control panel, you can:

Daylight fingerprints

To work with Daylight fingerprints, select Descriptors/Fingerprints/Daylight Fingerprints from the study table to open the 2D Fingerprints Daylight control panel. With this control panel, you can:


Using receptor surface analysis descriptor

To use the RSA descriptor, choose the Descriptors/Select menu item. Set the family popup in the Descriptors control panel to Receptor and click the Receptor pushbutton to open two control panels.

The first control panel (Receptor-Model Interactions) is concerned with addition of the receptor energy descriptors to the study table. To learn more about the receptor energy descriptors, see Receptor descriptors under Theory.

The second control panel (RSA Preferences) controls the addition of interaction energies at each vertex of the surface. You may add only the van der Waals (steric) component of the interaction energy or only the electrostatic component or both, by checking the VDW, ELE, and TOT (total) checkboxes.

A column is created in the study table for each point on the receptor surface model, containing the energy of interaction at that point between the surface and the molecule. For a large receptor surface model, this can be several thousands of columns if all points are added to the study table: too many for some of the statistical methods available. You can reduce the number of points added to the study table by using the Filter Surface Points popup.

Three main methods are available, based on selecting every nth surface point or on adding points based on their variance or correlation. When you set the Filter Surface Points popup to the desired method, additional controls appear.

a.   Add all surface points

Add all the points on the surface of the receptor model to the study table.

b.   Add every Nth surface point

Add every nth point on the surface of the receptor model to the study table. Typically this is a good place to start. Fill in the Every entry box with the frequency with which the surface is sampled.

The difference in energy at each surface point between each molecular model is used to filter input to the study table.

a.   Add points with variance higher than threshold

Those points with variance higher than the Variance Threshold are added to the study table.

b.   Add percentage of points with highest variance

The Percent is the percentage of highest-variance points to add.

The square of the correlation between energy at each surface point and biological activity data in the study table (marked Independent Y) is used to filter the RSA input to the study table.

a.   Add points with correlation higher than threshold

Correlation^2 is used to filter out any columns that show lower correlation with the activity than the specified threshold.

b.   Add percentage of points with highest correlation^2

The Percent specifies the percentage of highest-correlation squared points to add.
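The filtering methods above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the QSAR+ code: the data layout (one list of energies per surface point, one value per molecule), the function names, the use of population variance, and rounding the percentage up to a whole number of points are all hypothetical choices.

```python
import math


def every_nth(points, n):
    """Indices of every nth surface point, starting with the first."""
    return list(range(0, len(points), n))


def variance(values):
    """Population variance of one point's energies across the molecules."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def correlation_squared(xs, ys):
    """Squared Pearson correlation; 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def keep_above(scores, threshold):
    """Indices of points whose score is higher than the threshold."""
    return [i for i, s in enumerate(scores) if s > threshold]


def keep_top_percent(scores, percent):
    """Indices of the given percentage of points with the highest scores."""
    count = max(1, math.ceil(len(scores) * percent / 100))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:count])


# Four surface points, three molecules (made-up energies and activities).
points = [[1.0, 1.0, 1.0], [0.0, 2.0, 4.0], [3.0, 2.0, 1.0], [0.0, 1.0, 0.0]]
activity = [1.0, 2.0, 3.0]

variances = [variance(p) for p in points]
r2 = [correlation_squared(p, activity) for p in points]

by_variance = keep_above(variances, 0.5)
by_correlation = keep_above(r2, 0.9)
```

Each function returns the indices of the surface points that would become study-table columns, so the same scores can be reused with either the threshold or the percentage variant.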

It is probably best to start with Add Every Nth surface point. You also can select columns from the study table by selecting the Variables/Manage Independent menu item on the study table.

Next, click the action button on the extreme left side of the Descriptors control panel (underneath the ADD button). This displays the receptor descriptors Receptor_energies and Receptor_RSA. To select the Receptor_RSA descriptor, click the cell containing the label Receptor_RSA. Then click the ADD pushbutton to add the receptor surface points to the study table.

These points may be displayed with the Manage Independent Columns control panel, which is accessed by selecting the Variables/Manage Independent menu item in the study table. Set the 3D-QSAR Labels popup to RSA and click the Label Independent Variables action button.

Surface points in the study table are displayed on the receptor surface model as a label, for example, TOT/123. The first part of the label refers to the type of energy term specified in the RSA Preferences control panel under Include Molecule-Surface Point Interaction Energies. The second part is the number of the surface point and is the same index as the Surface point index in the first column of the output of the Receptor List function.
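If you post-process study-table output outside Cerius2, a label of this form can be split into its two parts. The sketch below is hypothetical: it assumes the energy part is one of VDW, ELE, or TOT (the checkbox names in the RSA Preferences control panel) and that the two parts are separated by a single slash, as in TOT/123.

```python
def parse_rsa_label(label):
    """Split a label such as 'TOT/123' into (energy_type, point_index)."""
    energy_type, index_text = label.split("/")
    # Assumption: labels use the same three codes as the checkboxes.
    if energy_type not in {"VDW", "ELE", "TOT"}:
        raise ValueError("unexpected energy type: " + energy_type)
    return energy_type, int(index_text)


energy_type, point_index = parse_rsa_label("TOT/123")
```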

Typically, the next stage is to calculate a QSAR that relates the receptor surface energy at each surface point to experimental activity data. For a guide to calculating QSARs, see Chapter 15, Using the equation viewer, and Chapter 3, QSAR+ QuickStart.


Using pKa descriptors

Installing pKa

For the pKa program to be found by Cerius2, it must be listed in the applcomm.db file in $C2DIR/libraries/applcomm.db. The form of the entry is:


A unix pKa pathname 

where pathname is replaced by the pathname of your pKa application.

Adding pKa descriptors to the study table

The pKa descriptors are included in the QSAR, COMBICHEM, and QSPR descriptor databases. The three steps to adding pKa descriptors to the study table are:

1.   Open the appropriate descriptor database

From the study table, open the Descriptor Database control panel by selecting the Descriptors/Databases menu item. In the Descriptor Database control panel, set the popup to the appropriate database and click the OPEN DATABASE pushbutton.

From the study table, open the Descriptors control panel by selecting the Descriptors/Select menu item.

2.   Set the pKa descriptor preferences

Set the family popup to ACD. Click the ACD pushbutton to open the ACD Descriptors control panel, which is used to set preferences for treating the pKa data. Two types of pKa descriptors are available: a count of pKas for each model, and a list of pKas.

To add a count of pKas to the study table, check the List a count of pKas checkbox and specify the range within which pKas are to be counted.

To add the pKa values to the study table, check the lower List checkbox, specify whether the values should be listed from High to Low or from Low to High, specify the maximum number of pKas to be listed, and specify a range or an upper or lower limit for pKas to be listed.

3.   Add the pKa descriptors to the study table

In the Descriptors control panel select the pKa row and click the ADD button. The pKa descriptors are added to the study table, and the results are calculated for any entries already in the study table. For subsequent additions to the study table, the pKa descriptors are calculated automatically.

What do the pKa column names mean?

The column names for pKa descriptors reflect the preferences defined at the time the column was created.

A count of pKa columns begins with the string n_pKa_. This is followed by the range of values being counted. For example, n_pKa_0.00_14.00 is a count of pKas with values between 0.00 and 14.00.

A list of pKa columns begins with the string pKa_. The first number tells which pKa value among the selected pKas is held in this column. The second number gives the maximum number of pKas to be listed. The third number specifies whether the pKas are listed from low to high (number = 0) or from high to low (number = 1). The fourth number specifies whether a range (number = 0) or a lower (number = 1) or upper (number = 2) bound is used to select the pKas to list. If a range is used, it is followed by two numbers specifying the range. If a lower or upper bound is used, it is followed by the number specifying the bound. For example, pKa_1_2_0_2_14.00 is the lowest pKa of a maximum of two pKas under the bound of 14.00.
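The naming convention can be decoded mechanically. The following sketch is an illustration only: the function name and the returned dictionary keys are hypothetical, and it assumes the fields are separated by single underscores exactly as in the two examples above.

```python
def parse_pka_column(name):
    """Decode a pKa column name into a dictionary of its settings."""
    if name.startswith("n_pKa_"):
        # Count column: n_pKa_<low>_<high>
        low, high = name[len("n_pKa_"):].split("_")
        return {"kind": "count", "range": (float(low), float(high))}
    if name.startswith("pKa_"):
        # List column: pKa_<position>_<maximum>_<order>_<mode>_<limits...>
        fields = name[len("pKa_"):].split("_")
        position, maximum, order, mode = (int(f) for f in fields[:4])
        limits = tuple(float(f) for f in fields[4:])
        if order not in (0, 1) or mode not in (0, 1, 2):
            raise ValueError("unexpected order or selection code in " + name)
        # A range (mode 0) carries two limits; a bound carries one.
        if len(limits) != (2 if mode == 0 else 1):
            raise ValueError("wrong number of limits in " + name)
        return {
            "kind": "list",
            "position": position,
            "maximum": maximum,
            "order": "low_to_high" if order == 0 else "high_to_low",
            "selection": ("range", "lower_bound", "upper_bound")[mode],
            "limits": limits,
        }
    raise ValueError("not a pKa column name: " + name)


count_info = parse_pka_column("n_pKa_0.00_14.00")
list_info = parse_pka_column("pKa_1_2_0_2_14.00")
```

For the second example, the decoded settings read: first pKa, at most two listed, low to high, selected by an upper bound of 14.00, which matches the description in the text.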


Using ADME descriptors

To use ADME descriptors, select Descriptors/Select in the Study Table to bring up the Descriptors control panel. Then change Descriptors in Family to ADME and click the ADME button to bring up the ADME Models Preferences control panel.

The panel is divided into three sections, one for each of the ADME models (Egan et al. 2000). Each is described in the following sections.

Intestinal Absorption Model

Reports the molecule's predicted absorption level (Good, Moderate, Poor, Very Poor).

First select a model type:

2D Model: model is based on Polar Surface Area (PSA) calculated from connectivity data only (2D structure)

3D Model: model is based on Polar Surface Area (PSA) calculated from 3D coordinates

2D and 3D Models: both 2D and 3D models are calculated

Then customize the output:

T-squared Values Check this to include a column with the T^2 values (distance from the center of the ellipse) used in the ADME absorption model calculation.

AlogP98 Values Check this to include a column with the AlogP98 values used in the ADME calculation.

PSA Values Check this to include a column with the polar surface area values used in the ADME calculation.

BBB Penetration Model

Reports the logarithm of the brain-blood concentration ratio and the corresponding BBB penetration level.

First select a model type:

2D Model: model is based on Polar Surface Area (PSA) calculated from connectivity data only (2D structure).

3D Model: model is based on Polar Surface Area (PSA) calculated from 3D coordinates.

2D and 3D Models: both 2D and 3D models are calculated.

Then customize the output:

BBB Level Values Check this to include a column with the BBB Penetration level corresponding to the calculated Brain-Blood ratio.

Table 3. Key to BBB penetration levels

Level Value Description
0   Very High   Brain-Blood ratio greater than 5:1  
1   High   Brain-Blood ratio between 1:1 and 5:1  
2   Medium   Brain-Blood ratio between 0.3:1 and 1:1  
3   Low   Brain-Blood ratio less than 0.3:1  
4   Undefined   Outside 99% confidence ellipse  
5   Alogp98   Warning: molecules with one or more unknown Alogp98 types  
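The key in Table 3 can be written as a small lookup function. This is a sketch, not the QSAR+ implementation: the function name and arguments are hypothetical, the table does not say which level a ratio of exactly 5:1, 1:1, or 0.3:1 receives (here each boundary value goes to the band listed "between"), and giving the unknown-atom-type warning precedence over the ellipse test is also an assumption.

```python
def bbb_level(brain_blood_ratio, inside_99_ellipse=True,
              unknown_alogp98_types=False):
    """Map a brain-blood concentration ratio to a Table 3 level (0 to 5)."""
    if unknown_alogp98_types:
        return 5  # warning: one or more unknown AlogP98 atom types
    if not inside_99_ellipse:
        return 4  # undefined: outside the 99% confidence ellipse
    if brain_blood_ratio > 5.0:
        return 0  # very high
    if brain_blood_ratio >= 1.0:
        return 1  # high
    if brain_blood_ratio >= 0.3:
        return 2  # medium
    return 3      # low


def bbb_level_from_log(log_bb, **flags):
    """Same lookup, starting from the base-10 logarithm of the ratio."""
    return bbb_level(10.0 ** log_bb, **flags)
```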

AlogP98 Values Check this to include a column with the AlogP98 values used in the ADME calculation.

PSA Values Check this to include a column with the polar surface area values used in the ADME calculation.

Water Solubility Model

Predicts the logarithm of the water solubility at 25 °C and the solubility level assigned to the molecule.

Report Solubility Level Values: Check this to include a column of solubility levels corresponding to the logarithm of the water solubility.

Table 4. Key to water solubility levels

LogSw Level Description
< -8.0   0   extremely low solubility, lower than 95% of drugs  
-8.0 to -6.0   1   very low solubility, at border line of 95% of drugs  
-6.0 to -4.0   2   low solubility, at lower end of 95% of drugs  
-4.0 to -2.0   3   good, slightly soluble to soluble  
-2.0 to 0.0   4   optimal solubility  
> 0.0   5   very soluble, perhaps too soluble  
1000   6   Warning: molecules with one or more unknown Alogp98 types  
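Table 4 can likewise be expressed as a lookup. The sketch below is hypothetical and makes two assumptions: interior boundary values (-6.0, -4.0, -2.0) are assigned to the higher level, and the unknown-atom-type case is passed in as a flag rather than detected from the LogSw value of 1000 shown in the table.

```python
def solubility_level(log_sw, unknown_alogp98_types=False):
    """Map log(Sw) to a Table 4 solubility level (0 to 6)."""
    if unknown_alogp98_types:
        return 6  # warning: one or more unknown AlogP98 atom types
    if log_sw > 0.0:
        return 5  # very soluble, perhaps too soluble
    # Levels 0 to 3 end just below these upper bounds.
    for level, upper in enumerate((-8.0, -6.0, -4.0, -2.0)):
        if log_sw < upper:
            return level
    return 4      # -2.0 to 0.0: optimal solubility
```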

Rule of five

Reports the number of violations of Lipinski's Rule of 5 (Lipinski et al. 1997):

MW <= 500
Hbond donors <= 5
Hbond acceptors <= 10
logP <= 5
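Counting violations amounts to testing each criterion and summing the failures. The following is a minimal sketch using the standard Lipinski thresholds (molecular weight at most 500, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors, logP at most 5); the function name and argument names are hypothetical, and how donors and acceptors are counted depends on the hydrogen-bond preferences in effect.

```python
def rule_of_five_violations(mw, hbond_donors, hbond_acceptors, logp):
    """Return the number of Rule of 5 criteria that a molecule violates."""
    violations = (
        mw > 500,              # molecular weight
        hbond_donors > 5,      # hydrogen-bond donor groups
        hbond_acceptors > 10,  # hydrogen-bond acceptor groups
        logp > 5,              # partition coefficient
    )
    return sum(violations)


# Made-up property values for illustration.
n_violations = rule_of_five_violations(mw=612.7, hbond_donors=3,
                                       hbond_acceptors=11, logp=4.2)
```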

Set Hydrogen Bonds Preferences for Rule of 5 Click this to bring up the HBond Descriptors control panel, which allows you to set preferences for Hbond donors and acceptors.

Unknown and undefined AlogP98 atom types

Report Number of Unknown AlogP98 Atom Types Check this to include a column with the number of unknown atom types for each molecule in the AlogP98 calculation.

Don't Calculate ADME Descriptors for molecules with Unknown AlogP98 Atom Types Check this to set ADME descriptors to undefined whenever the corresponding molecule has undefined AlogP98 atom types.

Calculation Times

The following table shows ADME descriptors calculation time estimates for several datasets. Times are expressed in thousands of molecules per hour. The calculations were carried out on an SGI R10K, 180 MHz machine. The Absorption and BBB Penetration results correspond to the 2D models. Note that only the 2D models can be calculated in Fast Descriptors mode.

Table 5. ADME descriptors calculation time estimates

Dataset Mode Absorption BBB Penetration Solubility
400 dipeptides   Fast Descriptors   1127   1080   1004  
625 benzodiazepines   Fast Descriptors   1306   1130   1083  
1000 ACD molecules   Fast Descriptors   1987   1674   1720  
400 dipeptides   Study Table   16   16   17  
625 benzodiazepines   Study Table   17   17   18  
1000 ACD molecules   Study Table   21   20   24  
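Because the figures in Table 5 are throughputs (thousands of molecules per hour), a rough run-time estimate for a dataset is the molecule count divided by the throughput. The helper below is a hypothetical illustration of that arithmetic; actual times depend on the hardware and on the molecules themselves.

```python
def estimated_seconds(n_molecules, thousands_per_hour):
    """Rough run time in seconds for a Table 5 throughput figure."""
    molecules_per_hour = thousands_per_hour * 1000.0
    return n_molecules / molecules_per_hour * 3600.0


# 1000 ACD molecules, Absorption model, Study Table mode (21 in Table 5).
seconds = estimated_seconds(1000, 21)
```

With these inputs the estimate is a little under three minutes, whereas the Fast Descriptors figure for the same dataset (1987) corresponds to about two seconds.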


Analyzing ADME descriptors

Once the ADME descriptors have been calculated and saved in either the Study Table or BDF files, the results can be analyzed using tools accessible from the menu bar in the Study Table (under Descriptors/ADME...) or from the new menu bar in the Select BDF panel (Analysis/ADME Models...).

In addition to displaying the molecular structure in the Cerius2 models window, double-clicking the row name in the Study Table also displays the calculated ADME properties for the corresponding molecule, including Absorption, BBB Penetration and Solubility levels.

Intestinal Absorption Model

You can analyze the results of either or both of the following models:

2D absorption model obtained using polar surface areas calculated from molecular connectivity.

3D absorption model obtained using the X, Y, Z coordinates to calculate polar surface areas.

The data to be analyzed can come from all the molecules in the Study Table or from the currently selected rows.

The PLOT button generates a plot of PSA vs. AlogP98, such as the one shown below.

Two check boxes below specify the display of the 95% and 99% confidence limit ellipses obtained in the development of the model (Lipinski et al. 1997).

There are also options to display BBB Penetration model ellipses, which occupy a slightly different position in the plot. The Absorption level is calculated based on the position of each molecule in the PSA vs. AlogP98 plot:

Table 6. Key to absorption levels

Value Level Description
0   Good   Inside 95% Ellipse  
1   Moderate   Inside 99% Ellipse  
2   Poor   Inside box defined by PSA between 0 and 150 and AlogP98 between -2 and 7  
3   Very Poor   Outside box  
4   Undefined   Molecules with unknown AlogP98 atom types  
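The decision order implied by Table 6 can be sketched as follows. The ellipse parameters are not given in this chapter, so this hypothetical function takes the two ellipse-membership results as inputs instead of computing them; treating the box limits as inclusive is also an assumption.

```python
def absorption_level(psa, alogp98, inside_95_ellipse, inside_99_ellipse,
                     unknown_alogp98_types=False):
    """Map a molecule's position in the PSA vs. AlogP98 plot to a level."""
    if unknown_alogp98_types:
        return 4  # undefined
    if inside_95_ellipse:
        return 0  # good
    if inside_99_ellipse:
        return 1  # moderate
    if 0.0 <= psa <= 150.0 and -2.0 <= alogp98 <= 7.0:
        return 2  # poor: outside both ellipses but inside the box
    return 3      # very poor: outside the box
```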

By default the plot is centered on the good absorption areas (around the 95% and 99% ellipses) and the points are color-coded according to Absorption level.

Points can also be color-coded according to the number of violations to the Rule of 5. Additional controls allow the user to specify which points to include in the plot: points with Absorption level Greater or equal to, Equal to, or Smaller or equal to a specified level.

Molecules corresponding to points in the plot can be identified by using the Plot Pick Tool, which selects all the molecules within the specified PSA and AlogP98 distance of the click position and highlights them in the Study Table.

The PRINT button prints out the percentage of molecules in each Absorption level.

The SELECT button selects all the molecules in the Study Table which satisfy the specified Absorption levels.

The Absorption panel launched from the Binary Data File control panel is the same as the one described above, except the SELECT button is replaced by CREATE BDF FILE which creates a new BDF file with the molecules that satisfy the specified criteria.

BBB Penetration Model

The ADME BBB Penetration model control panel works in a similar way to the absorption model control panel.

The log10 of the Brain-Blood ratio (logBB) is calculated only for molecules whose PSA and AlogP98 values lie within the 99% confidence limit ellipse corresponding to the BBB Penetration model. Points outside the ellipse are classified as undefined (probably poor penetration). For those points within the ellipse, a QSAR equation is used to predict logBB. The BBB Penetration levels are defined in Table 3.

Try pushing the PRINT button to generate sample output.

Water Solubility Model

The ADME Solubility control panel is relatively simple and self-explanatory.


Editing a descriptor database

A descriptor database is a Cerius2 table containing equations and equation coefficients used to calculate molecular descriptors. When QSAR+ is installed, you can access a database that contains over 100 spatial, electronic, thermodynamic, conformational, and other descriptors.

You can modify an existing descriptor database or create a new one by editing the installed descriptor database provided in QSAR+, then saving the modified descriptor database under a new name.

This section describes the following activities related to editing a descriptor database:

Before you begin

Because the descriptor database is accessed as a Cerius2 table, you should be familiar with Cerius2 tables before performing any activities described in this section. For information about tables and basic table operations, see Cerius2 Modeling Environment.

Opening a descriptor database

You must select and open a descriptor database in a descriptor database table before you can edit it. The default database name is listed in the text window when you open QSAR+.

To open a descriptor database

If you have only a single database or if you want to use the currently selected database, select Descriptors/Databases in the study table or on the QSAR card. The Descriptor Database control panel appears.

If you have more than one descriptor database and want to change the selected database:

or:

The descriptor database you specified is displayed in the Descriptor Database control panel, which is essentially a Cerius2 table.

The descriptor database table contains one row for each descriptor. Each row contains columns, some of which are described below (to see all columns, use the horizontal scroll bar).

You can create your own family names (as described in Creating new descriptor categories on page 171), but most descriptors fit into one of the existing groups.

Identifying default descriptors

QSAR+ has default descriptors that are automatically used to calculate a QSAR equation unless you override them. You can determine which are the default descriptors by looking at the Default column in the Descriptor Database control panel. Any descriptor with Yes in this column is a default descriptor.

You can change the set of default descriptors by editing the Default column.

Adding a descriptor to the default set

To add a descriptor to the default set:

1.   Select the cell in the Default column for that descriptor.

2.   Clear the edit window and enter 1.

3.   Press <Return> or click any other cell in the table.

The Default cell now displays Yes.

Removing a descriptor from the default set

To remove a descriptor from the default set:

1.   Select the cell in the Default column for that descriptor.

2.   Clear the edit window and enter 0.

3.   Press <Return> or click any other cell in the table.

The Default cell now displays No.

Creating new descriptors

You can create new descriptors using one or more of the operators that are supplied with QSAR+. Three categories of operators are available for creating a new descriptor:

To create a new descriptor:

1.   Insert a new row in the Descriptor Database table using the Insert tool.

2.   In the Family column of the new row, enter a family name.

Most descriptors can be categorized into one of the existing families, usually spatial, electronic, or thermodynamic. For example, if you want to create a descriptor that counts halogen atoms in a structure, enter spatial.

3.   Enter a descriptor equation in the Value column using valid math and molecular operators.

For example, to create the equation for a descriptor that counts halogens in a structure, enter:


ecount(col "Structure", "Cl") + ecount(col "Structure", "Br")

This descriptor counts and reports the total number of chlorine and bromine atoms in a structure.

4.   In the Description column, enter a short description of the descriptor. For example, enter:


	Number of halogen atoms

for the description of the descriptor created in Step 3.

5.   In the 3D column, enter 0 if your descriptor is not a 3D descriptor. Enter 1 if the descriptor is 3D.

6.   In the Default column, enter 1 if you want the descriptor to be part of the default set. Enter 0 if the descriptor is not to be a default descriptor (see Identifying default descriptors on page 168).

7.   In the Format column, enter the format for descriptor values to be displayed in the study table. The choices are float, integer, or scientific.

8.   In the Decimal column, enter the number of decimal places to be displayed in a descriptor value. If you entered integer in the Format column, enter 0.

9.   In the Units column, enter the units (for example, kcal/mol) to be applied to the descriptor value. If no units are to be applied, leave the cell blank.

10.   If the descriptor can be modified from a Cerius2 control panel, enter the name of the control panel in the Panel column. Otherwise, leave this column blank.

11.   To name the descriptor, click the first column in the row, then click the Prop (properties) tool. The Table Properties control panel appears. Select Row from the Properties popup.

12.   Enter a name (for example, Halogens) in the Row Name entry box.

13.   Click APPLY TO. The row name is entered in the first column of the selected row. QSAR+ sorts the descriptor list as it performs calculations, so the position of a descriptor in the list may change.

14.   Save the database containing the new descriptor. You can save the descriptor to the current database, to another existing database, or to a new database. For more information, see Saving a descriptor database on page 171.

Note

To activate a new descriptor, you must first save the descriptor database with the descriptor in it.  

When you finish creating a descriptor, you can check to see that it is correctly entered by adding it to the study table and inspecting the generated data (see Adding descriptors to the study table on page 154).
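As a way of keeping track of what the procedure above asks for, the columns of a descriptor row can be modeled as a small Python record. This is a sketch for checking your own entries outside Cerius2; the class name DescriptorRow, its field names, and the problems method are hypothetical and are not part of QSAR+.

```python
from dataclasses import dataclass

ALLOWED_FORMATS = ("float", "integer", "scientific")


@dataclass
class DescriptorRow:
    """One row of a descriptor database table (illustrative model only)."""
    name: str            # Row Name, set through the Table Properties panel
    family: str          # Family column
    value: str           # Value column: the descriptor equation
    description: str     # Description column
    is_3d: int = 0       # 3D column: 1 for a 3D descriptor, else 0
    default: int = 0     # Default column: 1 for the default set, else 0
    fmt: str = "float"   # Format column
    decimal: int = 2     # Decimal column
    units: str = ""      # Units column; blank if none
    panel: str = ""      # Panel column; blank if none

    def problems(self):
        """Return a list of rule violations based on the steps above."""
        issues = []
        if self.is_3d not in (0, 1):
            issues.append("3D must be 0 or 1")
        if self.default not in (0, 1):
            issues.append("Default must be 0 or 1")
        if self.fmt not in ALLOWED_FORMATS:
            issues.append("Format must be float, integer, or scientific")
        if self.fmt == "integer" and self.decimal != 0:
            issues.append("Decimal must be 0 when Format is integer")
        return issues


# The Halogens example from the procedure, entered as a record.
halogens_row = DescriptorRow(
    name="Halogens",
    family="spatial",
    value='ecount(col "Structure", "Cl") + ecount(col "Structure", "Br")',
    description="Number of halogen atoms",
    fmt="integer",
    decimal=0,
)
```

Calling halogens_row.problems() returns an empty list, while a row with Format set to integer and a nonzero Decimal value is reported.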

Modifying descriptors

You can modify an existing descriptor in a database by editing the entry for the descriptor in the Value column of the descriptor database table. For example, to modify the Halogens descriptor defined above so that it counts fluorine as well as chlorine and bromine atoms, enter:


ecount(col "Structure", "Cl") + ecount(col "Structure", "Br") + ecount(col "Structure", "F")

in the Value column for the descriptor.

Save the database to activate the edited descriptor (see Saving a descriptor database on page 171).

When you finish modifying a descriptor, you can check to see that the modifications are correct by adding it to the study table and inspecting the generated data (see Adding descriptors to the study table on page 154).
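To predict what the original and modified Halogens equations should report before inspecting the study table, the element count can be emulated in Python. This sketch assumes a structure is represented as a plain list of element symbols; the Python ecount function here only mimics the behavior described for the QSAR+ operator and is not its implementation.

```python
from collections import Counter


def ecount(atoms, element):
    """Count atoms of one element in a list of element symbols."""
    return Counter(atoms)[element]


def halogens(atoms, elements=("Cl", "Br")):
    """Sum the counts of the given elements, one ecount term per element."""
    return sum(ecount(atoms, element) for element in elements)


# A made-up structure with two Cl, one Br, and one F atom.
atoms = ["C", "C", "Cl", "F", "Br", "Cl", "H"]

original = halogens(atoms)                      # Cl + Br
modified = halogens(atoms, ("Cl", "Br", "F"))   # Cl + Br + F
```

For this example structure the original equation gives 3 and the modified equation gives 4, so a fluorinated test molecule in the study table is a quick way to confirm that the edit took effect.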

Controlling the descriptor display format

You can control the numerical format of a descriptor value using one of the following options: floating decimal (float), integer (integer), or scientific notation (scientific).

To change the descriptor display format, edit the entry displayed in the Format column of the descriptor database table.
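The effect of the Format and Decimal columns on a displayed value can be approximated in Python as follows. This is a sketch under assumptions: the function name format_value is hypothetical, and rounding to the nearest integer for the integer format is an assumption, since the text does not say whether QSAR+ rounds or truncates.

```python
def format_value(value, fmt="float", decimal=2):
    """Render a descriptor value the way the three formats suggest."""
    if fmt == "integer":
        # Assumption: round to nearest; the guide does not specify this.
        return str(int(round(value)))
    if fmt == "float":
        return f"{value:.{decimal}f}"
    if fmt == "scientific":
        return f"{value:.{decimal}e}"
    raise ValueError(f"unknown format: {fmt!r}")


as_float = format_value(1234.5678, "float", 2)
as_scientific = format_value(1234.5678, "scientific", 2)
as_integer = format_value(3.0, "integer", 0)
```

The same stored value therefore appears as 1234.57 in float format and 1.23e+03 in scientific format when two decimal places are requested.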

Creating new descriptor categories

The entry in the Family column of the descriptor database table categorizes descriptors and determines the list of choices in the family popup in the Descriptors control panel.

Creating a new descriptor family

You can create new categories of descriptors by placing new entries in the Family column. For example, if investigator Jones wants to place all saved equations in a category named Jones-QSARs, Jones simply enters this designation in the Family column for the rows containing QSARs and saves the modified table. The value Jones-QSARs now appears as a choice in the family popup on the Descriptors control panel.
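The relationship between the Family column and the family popup can be pictured as collecting the distinct Family entries from all rows. The Python sketch below assumes each row is a dictionary with a "family" key and lists families in order of first appearance; the real popup ordering is not documented here, and the row names other than Halogens are invented for the example.

```python
def family_choices(rows):
    """Return distinct family names in order of first appearance."""
    # dict.fromkeys keeps insertion order and drops duplicates.
    return list(dict.fromkeys(row["family"] for row in rows))


rows = [
    {"name": "Halogens", "family": "spatial"},
    {"name": "ExampleA", "family": "electronic"},   # invented row
    {"name": "ExampleB", "family": "spatial"},      # invented row
    {"name": "SavedEq1", "family": "Jones-QSARs"},  # invented row
]

choices = family_choices(rows)
```

Here Jones-QSARs appears as a choice only because at least one row carries that entry in its Family column, which matches the behavior described above.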

Saving a descriptor database

If you make a change in the descriptor database table, that change is not activated until the table is saved and then read back into Cerius2 with OPEN DATABASE.

To save the displayed database to the current database file, select Descriptors/Databases in the study table or on the QSAR card to open the Descriptor Database control panel, then click the SAVE DATABASE pushbutton. The Save Database control panel appears, which lets you choose a name for your new or modified database.



Last updated June 13, 2001 at 03:27PM Pacific Daylight Time.
Copyright © 2001, Accelrys. All rights reserved.