| QSAR |

To meet your requirements for building QSAR equations, QSAR+ enables you to work with descriptors in a variety of ways.
Managing descriptors
You manage descriptors by using several control panels (described later in this chapter). Descriptor management includes identifying the descriptors with which you want to work, displaying and selecting only the descriptors in a specific class, specifying preferences for the various descriptors, and adding descriptors to the study table.
When QSAR+ is installed, you can access a descriptor database that contains the equations used to calculate molecular descriptors. You can edit this database to modify the supplied descriptors, create new descriptors, specify which descriptors should be considered default descriptors, create new descriptor categories, and control the format in which the results of descriptor calculations are displayed in the study table.

In Start-up and Configuration:

You can see the descriptors in each set by selecting Descriptors/Databases from the study table menu bar. This opens the Descriptor Database control panel, which contains a list of descriptors.
The message at the top of the Descriptor Database control panel identifies the current default set.

The Descriptors control panel contains a list of the descriptors in the current descriptor database. To select a descriptor, click its name in the first column. For example, clicking EPenalty highlights that row of the descriptor table, which means that the descriptor will be added to the study table (see the next section for details). To deselect a descriptor, click any part of the table other than the first column, so that the highlight is turned off.
The Descriptors control panel contains controls that allow you to select groups of descriptors. The left popup determines whether clicking the associated action button will Select, Deselect, or Display descriptors. For example, to select all the conformational descriptors, choose Select in the left popup and then set the Descriptors in Family popup (far right) to Conformational. Now when you click the (unlabeled) action button (below ADD), the conformational descriptors are selected. To deselect them, change the left popup to Deselect, then click the action button again.
If you find the display of all the descriptors at the same time distracting, you can display just the selected descriptors by setting the popup to Display.
Another way to select a subset of descriptors is to use the All/Default popup. To see the effect of this control, set the Descriptors in Family popup to Electronic, select Default from the All/Default popup, then click the action button.
When the Descriptors in Family popup is set to Electronic, for example, the Preferences button is labelled Electronic. When you click this newly active pushbutton, a control panel appears, which allows you to customize certain aspects of the way the electronic descriptors are calculated. For example, if you decide that only the total dipole moment is needed, uncheck the XYZ Components checkbox. Now only the total dipole moment (calculated from atomic partial charges) is added to the study table.
Preferences for the calculation of other types of descriptors can be set in the same way.
If Edge-based is checked, the four buttons below apply to information indices based on the edge adjacency and edge distance matrices, specifically,
For a detailed explanation of this descriptor, see Chapter 5, Theory: QSAR+ descriptors.
Receptor descriptor preferences
Setting the family popup in the Descriptors control panel to Receptor and clicking the Receptor pushbutton opens two control panels: Receptor-Model Interactions and RSA Preferences (receptor surface analysis).
Spatial preferences
Open the Spatial Descriptors control panel by setting the family popup in the Descriptors control panel to Spatial and then selecting the Spatial button.
Jurs charged partial surface area parameters
The definition of polar atoms and the probe radius for the solvent-accessible surface area calculation can also be customized with the Spatial Descriptors control panel.
For an explanation of the shadow indices see the Shadow indices section on page 97 under Theory. The correlation between the Shadow Parameters checkboxes and the descriptor names is:
Defining hydrogen-bond acceptors and donors and rotatable bonds
The definitions of hydrogen-bond acceptors, hydrogen-bond donors, and rotatable bonds can be customized with the Structural Descriptors control panel.
Thermodynamic descriptors preferences
AlogP98 descriptors
The 115 atom types defined in the calculation of AlogP98 are now available as descriptors. To calculate them, select the entry AlogP_atypes in the Thermodynamic family in the descriptor table. Each AlogP98 atom-type value represents the number of atoms of that type in the molecule. An additional atom type called Unkown_Type can also be added to the table, together with the other AlogP98 atom types. A value greater than zero for this descriptor indicates the presence of atoms that couldn't be classified as any of the defined AlogP98 atom types. The AlogP Atom Types control panel allows you to select the elements to be taken into account.
Topological descriptors preferences
For an explanation of the topological descriptors see the discussion of graph-theoretical (page 73) and information-content descriptors (page 91).
To change preferences for topological descriptors, set the family popup in the Descriptors control panel to Topological and select the Topological pushbutton. The correlation between the checkboxes in the Topological Descriptors control panel and the descriptors is:
Adding descriptors to the study table
When you have selected the set of descriptors that you want to use, you add them to the study table by clicking the ADD button in the Descriptors control panel.
Using ISIS keys and Daylight fingerprints
ISIS keys
To work with ISIS keys, select Descriptors/Fingerprints/Isis Keys from the study table to open the 2D Fingerprints Isis Keys control panel. With this control panel, you can:

The first control panel (Receptor-Model Interactions) is concerned with addition of the receptor energy descriptors to the study table. To learn more about the receptor energy descriptors, see Receptor descriptors under Theory.
The second control panel (RSA Preferences) controls the addition of interaction energies at each vertex of the surface. You may add only the van der Waals (steric) component of the interaction energy or only the electrostatic component or both, by checking the VDW, ELE, and TOT (total) checkboxes.
a. Add all surface points
b. Add every Nth surface point
a. Add points with variance higher than threshold
b. Add percentage of points with highest variance
a. Add points with correlation higher than threshold
b. Add percentage of points with highest correlation^2
Next, click the action button on the extreme left side of the Descriptors control panel (underneath the ADD button). This displays the receptor descriptors Receptor_energies and Receptor_RSA. To select the Receptor_RSA descriptor, click the cell containing the label Receptor_RSA. To add the receptor surface data to the study table, click the ADD pushbutton. The receptor surface points are added to the study table.
These points may be displayed with the Manage Independent Columns control panel, which is accessed by selecting the Variables/Manage Independent menu item in the study table. Set the 3D-QSAR Labels popup to RSA and click the Label Independent Variables action button.
Surface points in the study table are displayed on the receptor surface model as a label, for example, TOT/123. The first part of the label refers to the type of energy term specified in the RSA Preferences control panel under Include Molecule-Surface Point Interaction Energies. The second part is the number of the surface point and is the same index as the Surface point index in the first column of the output of the Receptor List function.
Typically, the next stage is to calculate a QSAR that relates the receptor surface energy at each surface point to experimental activity data. For a guide to calculating QSARs, see Chapter 15, Using the equation viewer, and Chapter 3, QSAR+ QuickStart.
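The variance-based options for adding surface points (points with variance above a threshold, or a percentage of the points with the highest variance) can be pictured with a short Python sketch. This is only an illustration of the idea, not the Cerius2 implementation: the function names, the sample data, and the use of population variance are assumptions.

```python
from statistics import pvariance


def points_above_variance(columns, threshold):
    """Return labels of surface points whose energy variance exceeds threshold.

    columns maps a surface-point label (for example "TOT/123") to the list of
    interaction energies of that point, one value per molecule.
    """
    return [label for label, values in columns.items()
            if pvariance(values) > threshold]


def top_percent_by_variance(columns, percent):
    """Return the given percentage of labels, highest variance first."""
    ranked = sorted(columns, key=lambda label: pvariance(columns[label]),
                    reverse=True)
    keep = max(1, round(len(ranked) * percent / 100))
    return ranked[:keep]


# Hypothetical energies for four surface points across three molecules.
columns = {
    "TOT/1": [1.0, 1.0, 1.0],
    "TOT/2": [0.0, 2.0, 4.0],
    "TOT/3": [1.0, 1.5, 2.0],
    "TOT/4": [0.0, 10.0, 20.0],
}

print(points_above_variance(columns, 1.0))
print(top_percent_by_variance(columns, 50))
```

Points whose energy hardly varies from molecule to molecule carry little information for a regression, which is the usual motivation for filtering on variance before building a QSAR.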

Using pKa descriptors
Installing pKa
For the pKa program to be found by Cerius2, it must be listed in the applcomm.db file in $C2DIR/libraries/applcomm.db. The form of the entry is:
A unix pKa pathname
where pathname is replaced by the pathname of your pKa application.
1. Open the appropriate descriptor database
2. Set the pKa descriptor preferences:
3. Add the pKa descriptors to the study table
A count of pKa columns begins with the string n_pKa_. This is followed by the range of values being counted. For example, n_pKa_0.00_14.00 is a count of pKas with values between 0.00 and 14.00.
A list of pKa columns begins with the string pKa_. The first number tells which pKa value among the selected pKas is held in this column. The second number gives the maximum number of pKas to be listed. The third number specifies whether the pKas are listed from low to high (number = 0) or from high to low (number = 1). The fourth number specifies whether a range (number = 0), a lower bound (number = 1), or an upper bound (number = 2) is used to select the pKas to list. If a range is used, it is followed by two numbers specifying the range. If a lower or upper bound is used, it is followed by the number specifying the bound. For example, pKa_1_2_0_2_14.00 is the lowest pKa of a maximum of two pKas under the bound of 14.00.
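The naming convention for pKa columns can be decoded mechanically. The following Python sketch parses both kinds of column name according to the rules described above. It is an illustration only: the PkaColumn class and the parse_pka_column function are hypothetical names, and the sketch assumes that the fields are always separated by single underscores.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PkaColumn:
    kind: str                         # "count" or "list"
    index: Optional[int] = None       # which of the selected pKas is held
    max_listed: Optional[int] = None  # maximum number of pKas to list
    descending: Optional[bool] = None # True if listed from high to low
    selection: str = "range"          # "range", "lower", or "upper"
    bounds: Tuple[float, ...] = ()


def parse_pka_column(name: str) -> PkaColumn:
    """Decode a pKa column name such as n_pKa_0.00_14.00 or pKa_1_2_0_2_14.00."""
    if name.startswith("n_pKa_"):
        low, high = name[len("n_pKa_"):].split("_")
        return PkaColumn(kind="count", bounds=(float(low), float(high)))
    if name.startswith("pKa_"):
        fields = name[len("pKa_"):].split("_")
        index, max_listed, order, mode = (int(f) for f in fields[:4])
        selection = {0: "range", 1: "lower", 2: "upper"}[mode]
        bounds = tuple(float(f) for f in fields[4:])
        expected = 2 if selection == "range" else 1
        if len(bounds) != expected:
            raise ValueError(f"expected {expected} bound value(s) in {name!r}")
        return PkaColumn("list", index, max_listed, order == 1, selection, bounds)
    raise ValueError(f"not a pKa column name: {name!r}")


print(parse_pka_column("n_pKa_0.00_14.00"))
print(parse_pka_column("pKa_1_2_0_2_14.00"))
```

For pKa_1_2_0_2_14.00 the sketch reports the first pKa, a maximum of two pKas, low-to-high ordering, and an upper bound of 14.00, which matches the example in the text.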

The panel is divided into three sections, one for each of the ADME models (Egan et al. 2000). Each is described in the following sections.
First select a model type:
Report Solubility Level Values: Check this to include a column of solubility levels corresponding to the logarithm of the water solubility.
Rule of five
Reports the number of violations to Lipinski's Rule of 5 (Lipinski et al. 1997):
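As a rough illustration, the count of violations can be sketched in Python using the criteria as they are commonly stated for the Rule of 5: molecular weight above 500, log P above 5, more than 5 hydrogen-bond donors, and more than 10 hydrogen-bond acceptors. The function name and arguments are hypothetical, and the exact definitions used by QSAR+ (for example, which log P estimate is applied) may differ.

```python
def rule_of_five_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Count violations of the commonly stated Rule of 5 criteria."""
    checks = [
        mol_weight > 500,   # molecular weight
        log_p > 5,          # octanol/water partition coefficient
        h_donors > 5,       # hydrogen-bond donors
        h_acceptors > 10,   # hydrogen-bond acceptors
    ]
    return sum(checks)


# Hypothetical property values for two molecules.
print(rule_of_five_violations(350.0, 2.1, 2, 5))
print(rule_of_five_violations(650.0, 6.2, 6, 12))
```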
| Dataset | Mode | Absorption | BBB Penetration | Solubility |
| 400 dipeptides | Fast Descriptors | 1127 | 1080 | 1004 |
| 625 benzodiazepines | Fast Descriptors | 1306 | 1130 | 1083 |
| 1000 ACD molecules | Fast Descriptors | 1987 | 1674 | 1720 |
| 400 dipeptides | Study Table | 16 | 16 | 17 |
| 625 benzodiazepines | Study Table | 17 | 17 | 18 |
| 1000 ACD molecules | Study Table | 21 | 20 | 24 |
Once the ADME descriptors have been calculated and saved in either the Study Table or BDF files, the results can be analyzed using tools accessible from the menu bar in the Study Table (under Descriptors/ADME...) or from the new menu bar in the Select BDF panel (Analysis/ADME Models...). 
Analyzing ADME descriptors
Intestinal Absorption Model
You can analyze the results of either or both of the following models:
The PLOT button generates a plot of PSA vs. AlogP98, such as the one shown below.
Two check boxes below specify the display of the 95% and 99% confidence limit ellipses obtained in the development of the model (Lipinski et al. 1997).
There are also options to display BBB Penetration model ellipses, which occupy a slightly different position in the plot. The Absorption level is calculated based on the position of each molecule in the PSA vs. AlogP98 plot:
By default the plot is centered on the good absorption areas (around the 95% and 99% ellipses) and the points are color-coded according to Absorption level.
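Deciding where a molecule falls relative to a confidence ellipse amounts to a point-in-ellipse test. The Python sketch below shows a generic version of that test for a rotated ellipse. The function names are hypothetical, and the ellipse centers, semi-axes, and rotation angle are placeholders for illustration only; they are not the parameters of the absorption model.

```python
import math


def inside_ellipse(x, y, cx, cy, a, b, theta):
    """Return True if (x, y) lies inside an ellipse centered at (cx, cy).

    a and b are the semi-axes; theta is the rotation of the a axis from
    the x axis, in radians.
    """
    dx, dy = x - cx, y - cy
    # Express the point in the ellipse's own (rotated) coordinate frame.
    u = dx * math.cos(theta) + dy * math.sin(theta)
    v = -dx * math.sin(theta) + dy * math.cos(theta)
    return (u / a) ** 2 + (v / b) ** 2 <= 1.0


def classify(x, y, inner, outer):
    """Classify a point against two nested ellipses (for example 95% and 99%)."""
    if inside_ellipse(x, y, *inner):
        return "inside inner ellipse"
    if inside_ellipse(x, y, *outer):
        return "between ellipses"
    return "outside outer ellipse"


# Placeholder ellipses: (cx, cy, a, b, theta). Not the model's real values.
inner = (0.0, 0.0, 2.0, 1.0, 0.0)
outer = (0.0, 0.0, 3.0, 1.5, 0.0)

print(classify(1.0, 0.0, inner, outer))
print(classify(2.5, 0.0, inner, outer))
print(classify(4.0, 0.0, inner, outer))
```

In the plot described here, x would be AlogP98 and y would be PSA, and each molecule would be tested against the ellipses in order from the innermost outward.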
BBB Penetration Model
The ADME BBB Penetration model control panel works in a similar way to the absorption model control panel.
Try pushing the PRINT button to generate sample output.
Water Solubility Model
The ADME Solubility control panel is relatively simple and self-explanatory.
A descriptor database is a Cerius2 table containing equations and equation coefficients used to calculate molecular descriptors. When QSAR+ is installed, you can access a database that contains over 100 spatial, electronic, thermodynamic, conformational, and other descriptors.
Editing a descriptor database
Because the descriptor database is accessed as a Cerius2 table, you should be familiar with Cerius2 tables before performing any activities described in this section. For information about tables and basic table operations, see Cerius2 Modeling Environment.
Opening a descriptor database
You must select a descriptor database and open it in a descriptor database table before you can edit it. The default database name is listed in the text window when you open QSAR+.
If you have only a single database or if you want to use the currently selected database, select Descriptors/Databases in the study table or on the QSAR card. The Descriptor Database control panel appears.
The descriptor database table contains one row for each descriptor. Each row contains columns, some of which are described below (to see all columns, use the horizontal scroll bar).
You can change the set of default descriptors by editing the Default column.
1. Select the cell in the Default column for that descriptor.
2. Clear the edit window and enter 1.
3. Press <Return> or click any other cell in the table.
1. Select a cell in the Default column.
2. Press <Return> or clear the edit window and enter 0.
3. Click any other cell in the table.
1. Insert a new row in the Descriptor Database table using the Insert tool.
2. In the Family column of the new row, enter a family name.
3. Enter a descriptor equation in the Value column using valid math and molecular operators. For example, enter:
ecount(col "Structure", "Cl") + ecount(col "Structure", "Br")
4. In the Description column, enter a short description of the descriptor. For example, enter:
Number of halogen atoms
5. In the 3D column, enter 0 if your descriptor is not a 3D descriptor. Enter 1 if the descriptor is 3D.
6. In the Default column, enter 1 if you want the descriptor to be part of the default set. Enter 0 if the descriptor is not to be a default descriptor (see Identifying default descriptors on page 168).
12. Enter a name (for example, Halogens) in the Row Name entry box.
14. Save the database containing the new descriptor. You can save the descriptor to the current database, to another existing database, or to a new database. For more information, see Saving a descriptor database on page 171.
To activate a new descriptor, you must first save the descriptor database with the descriptor in it.
When you finish creating a descriptor, you can check to see that it is correctly entered by adding it to the study table and inspecting the generated data (see Adding descriptors to the study table on page 154).
Modifying descriptors
You can modify an existing descriptor in a database by editing the entry for the descriptor in the Value column of the descriptor database table. For example, to modify the Halogens descriptor defined above so that it counts fluorine as well as chlorine and bromine atoms, enter:
ecount(col "Structure", "Cl") + ecount(col "Structure", "Br") + ecount(col "Structure", "F")
in the Value column for the descriptor.
Save the database to activate the edited descriptor (see Saving a descriptor database on page 171).
When you finish modifying a descriptor, you can check to see that the modifications are correct by adding it to the study table and inspecting the generated data (see Adding descriptors to the study table on page 154).
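When inspecting the generated data, it can help to know what value to expect. The Python sketch below mimics the arithmetic of the modified Halogens equation on a plain list of element symbols. It is an analogue for checking results by hand, not Cerius2 table syntax: the Python ecount function here is a hypothetical stand-in that counts symbols in a list rather than reading the Structure column.

```python
def ecount(elements, symbol):
    """Count how many atoms in the list have the given element symbol."""
    return sum(1 for element in elements if element == symbol)


def halogens(elements):
    """Python analogue of the modified Halogens equation (Cl + Br + F)."""
    return (ecount(elements, "Cl")
            + ecount(elements, "Br")
            + ecount(elements, "F"))


# Hypothetical molecule given as a list of element symbols.
atoms = ["C", "C", "Cl", "Br", "F", "F", "H"]
print(halogens(atoms))
```

For this list the sum is one chlorine, one bromine, and two fluorines, so the study table value for the corresponding molecule would be expected to be 4.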
Controlling the descriptor display format
You can control the numerical format of a descriptor value using one of the following options: floating decimal (float), integer (integer), or scientific notation (scientific).
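The difference between the three options is easiest to see on a single value. The Python sketch below renders the same number in each style. The format_descriptor function is hypothetical, and the number of decimal places and the rounding behavior are assumptions for illustration; the actual study table display may differ.

```python
def format_descriptor(value, style):
    """Render a descriptor value as float, integer, or scientific text."""
    if style == "float":
        return f"{value:.4f}"
    if style == "integer":
        return f"{round(value):d}"
    if style == "scientific":
        return f"{value:.3e}"
    raise ValueError(f"unknown display format: {style!r}")


for style in ("float", "integer", "scientific"):
    print(style, format_descriptor(1234.5678, style))
```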
Creating new descriptor categories
The entry in the Family column of the descriptor database table categorizes descriptors and determines the list of choices in the family popup in the Descriptors control panel.
You can create new categories of descriptors by placing new entries in the Family column. For example, if investigator Jones wants to place all saved equations in a category named Jones-QSARs, Jones simply enters this designation in the Family column for the rows containing QSARs and saves the modified table. The value Jones-QSARs now appears as a choice in the family popup on the Descriptors control panel.
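The relationship between the Family column and the family popup can be sketched in Python as a simple grouping of table rows. The rows below are hypothetical examples (only AlogP_atypes in the Thermodynamic family is taken from this chapter), and the functions are illustrative names rather than part of QSAR+.

```python
# Hypothetical rows of a descriptor database table.
rows = [
    {"name": "AlogP_atypes", "family": "Thermodynamic", "default": 0},
    {"name": "Halogens", "family": "Structural", "default": 1},
    {"name": "Jones_eq1", "family": "Jones-QSARs", "default": 0},
    {"name": "Jones_eq2", "family": "Jones-QSARs", "default": 1},
]


def family_choices(rows):
    """Distinct Family entries, as they might populate the family popup."""
    return sorted({row["family"] for row in rows})


def descriptors_in_family(rows, family):
    """Names of the descriptors whose Family entry matches."""
    return [row["name"] for row in rows if row["family"] == family]


def default_descriptors(rows):
    """Names of the descriptors flagged with 1 in the Default column."""
    return [row["name"] for row in rows if row["default"] == 1]


print(family_choices(rows))
print(descriptors_in_family(rows, "Jones-QSARs"))
print(default_descriptors(rows))
```

Because the choices are derived from the column contents, adding a row with a new Family entry is all that is needed to create a new category.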
Saving a descriptor database
If you make a change in the descriptor database table, that change is not activated until the table is saved and then read back into Cerius2 again with OPEN DATABASE.