A characteristic of most synthetic polymers, as compared to small rigid molecules or to biopolymers in the native state, is the tendency to adopt an enormous number of very different conformations with finite probability. Due to entropic considerations, the minimum energy conformation of a polymer chain is generally of negligible importance in determining the chain's properties. That is, there is only one minimum energy structure, but there is typically a great number of energetically accessible structures, such that the chain is overwhelmingly likely not to be in the lowest energy structure at any given instant.
Consequently, it is generally not useful to focus on the lowest energy structure alone if one is interested in the properties of a polymer in the melt or in solution. (It might be appropriate for the crystalline state, however.) Rather, one must deal with the statistical behavior of a chain, taking into account all the accessible possibilities for the chain conformation. In this discussion, the focus is on the structural properties of chains--their size and stiffness. These are the simplest properties both to visualize and to calculate, but one can also calculate dipole moments, optical properties, and other quantities by means of Rotational Isomeric State (RIS) theory.
Theoretically, the simplest measure of a chain's size
is its mean squared end-to-end distance, denoted r2.
For a long chain, this quantity scales as a power of the number of bonds (or molecular
weight) of the chain. A quantity closely related to the end-to-end distance is
the radius of gyration, defined as the root mean square distance of each of the
atoms (or groups) in the chain from the chain's center of mass. The radius of
gyration is denoted s, and for a simple chain its average is given by:
A measure of polymer stiffness, indirectly related to polymer shape, is the persistence length. This can be defined in various ways. The persistence length may be defined as the average sum of the projections of all bond vectors onto the first bond of a chain. Alternatively, it may be defined as the projection of all succeeding bonds (including the bond itself) onto an internal bond of the chain. The persistence length is a measure of the distance over which a chain retains "memory" of its initial direction. A flexible chain tends to lose all memory of its direction after a short distance and thus tends to have a small value of the persistence length. By contrast, a stiff chain has a comparatively large value. Note that the "stiffness" defined here is a static, equilibrium stiffness, as opposed to a dynamic stiffness. (The latter depends on the barriers to rotations of the chain backbone, not just the relative energies of the various rotational states of the chain.)
Simple models of polymer chains
Because computing accurate conformational statistics for real polymer chains is difficult, various models with differing degrees of simplification have been proposed. The simplest of these is the random walk or random flight model. In the most basic version of this model, the polymer chain is envisioned as a random walk on a lattice. To construct a representative chain, one first lays down a segment (bond) with arbitrary orientation, and continues to randomly add segments until the desired number has been reached. Because measured properties are averages over all possible conformations, rather than just properties of a single conformations, it is necessary that any calculation perform this averaging. For a random walk, the results of doing so are as follows:
Random walk models are not limited to those on lattices. An example of an off-lattice random walk model is the freely jointed chain. Such a chain consists of n segments, joined at vertices with unrestricted bond angles. (It is as if a pivot is at each vertex, allowing the connected segments to swing about freely; hence the term freely jointed.) The chain statistics for a freely jointed chain are the same as for a random walk on a lattice; i.e., r2 = nl2 and s2 = r2/6. Note that because all succeeding bonds are uncorrelated with the first, the persistence length is equal to the bond length for both the random walk on a lattice and the freely jointed chain. That is, these chains have no stiffness beyond the trivial stiffness of a single bond.
It is obvious that the above models omit all chemical details of chains. In addition, they make another major approximation. They neglect excluded volume interactions--that is, they neglect the fact that a real polymer chain cannot intersect with itself. For a chain in dilute solution under theta conditions, this approximation is a good one. It is also valid for a chain in the melt and for relatively short or stiff chains. (Of course, it is not literally true that the chain can intersect itself under these conditions. But for the purpose of computing conformational averages, the chain can be treated as if this were possible.) In other cases, such as in a good solvent, excluded volume can have a dramatic effect on the chain statistics. In particular, r2 scales roughly as n1.2, rather than n1, for a chain whose excluded volume interactions are not negligible. Thus, an excluded volume chain is significantly larger than a theta chain. Excluded volume interactions make theoretical treatment of a chain difficult, and in order to predict properties of realistic chains with excluded volume, explicit simulation (Monte Carlo or molecular dynamics) must be performed. RIS theory treats theta chains only. Although doing so is computationally expensive, the RIS Monte Carlo method can be modified to treat excluded volume in an approximate fashion. (This has been done in the RIS module.) When excluded volume interactions are negligible, a chain's dimensions are referred to as unperturbed dimensions and are usually designated by the subscript 0 (e.g., r20).
In this section, a brief presentation of rotational isomeric state (RIS) theory, the basis for the RIS module, is given. Although the basic theory is close to thirty years old, even today it is one of the simplest and fastest ways of calculating unperturbed conformational properties of real polymer chains.
RIS theory forms the basis for Flory's book, Statistical Mechanics of Chain Molecules (originally published in 1969, and republished in 1989), and for a later review article (Flory 1974) that reformulated the theory in a simpler and more general form. Refer to these sources for the details of the method, which is only outlined here.
Like most computationally tractable theories of polymer chains, RIS theory neglects long range excluded volume interactions. (Here, long range refers to the separation along the contour of the chain, not to the separation in 3D space.) The theory does account for short range excluded volume effects and for chain stiffness, however. Most importantly, unlike simplified random walk models, it accounts for the detailed chemical structure of a polymer chain.
The RIS approach takes advantage of the fact that in determining the overall conformation of a linear (i.e., unbranched) chain, the most important degrees of freedom are torsional rotations about the bonds of the backbone. The force constants for other degrees of freedom, such as bond bending and stretching, are so large that bond angles and lengths do not vary greatly from their equilibrium values. Consequently, such motions generally make little contribution to the gross conformation of a chain. In the RIS scheme, all degrees of freedom except torsions in the backbone are explicitly neglected. (The effects of these other degrees of freedom may be implicitly taken into account in determining the statistical weights, however.)
A torsion angle may in principle adopt a continuum of values from -180º to 180º. For the purpose of determining conformational statistics, the rotational isomeric state approximation treats a torsion angle in a chain's backbone as if it can adopt only a discrete subset of these values. The justification for this is made clear by inspecting a plot of the potential energy as a function of the dihedral angle. Such a plot typically contains a small number of distinct minima, separated by relatively high energy barriers. In polyethylene, for example, a typical RIS model considers only three values for any given torsion angle, corresponding to one trans and two gauche minima in the energy profile. Although only a single value of the dihedral angle () for each state enters the calculation, deviations from this value are not disregarded but are implicitly included in the weights.
In the RIS scheme, any conformational property, A, is assumed to be a function only of the dihedral angles in the chain backbone:
For certain properties A, the value of the property for a given conformation may be expressed:
Writing A as a product of generator matrices gives its value for a single conformation, but to obtain conformational statistics it is necessary to average over all possible conformations. That is, one must perform the sums over all the i with each term weighted by the exponential Boltzmann factor. To see how this is done, it is convenient to first address the calculation of the configurational partition function. This will introduce the statistical weights matrices, which will be used in performing the sums over the i.
In the RIS scheme, the configurational partition function may be written:
For independent bond rotations, the weight associated with a given conformation is:
For an internal bond pair in polyethylene at 140ºC, the statistical weights matrix is:
The energies that are used to determine the statistical weights may be obtained from contour maps of the energy as a function of two adjacent torsion angles in a small fragment of a chain. In principle, the "energy" in such maps should be a potential of mean force, obtained by averaging over all other degrees of freedom in the problem. In addition, some or all of the statistical weights may be optimized by constraining certain RIS results (e.g., characteristic ratios) to agree with experiment. These same statistical weights may then be used to calculate other properties for which experimental data are not available.
Given the above formulation, the average property A may be expressed:
is the diagonal array of generator matrices Fi for the rotational states 1 ... i, and Es is the identity matrix of the same order as Fi. The subscript 0 has been appended to A because neglect of long range forces is implied by the assumption that bond energies depend on nearest neighbor interactions alone.
Although a fair amount of algebra (most of which has not been shown in this introductory presentation) is required to obtain the above result, the simplicity of this result should be noted. In the RIS model, the extremely complicated problem of averaging a property over all possible chain conformations is reduced to serial multiplication of matrices. To summarize, the RIS model gives the conformational statistics of unperturbed, realistic polymer chains by reducing complicated integrals to simple matrix multiplication.
Some of the more common properties of unperturbed chains that may be calculated using RIS theory are the following: mean square end-to-end distance (and consequently, the characteristic ratio), radius of gyration, persistence length, mean square dipole moment, polarization anisotropy, stress-optical coefficient, molar Kerr constant (related to electric birefringence), higher moments of r and s (e.g., r4), and macrocyclization and epimerization equilibria.
RIS Monte Carlo simulation
The advantage of the RIS approach extends beyond straightforward
RIS calculations using matrix multiplication. The statistical weights of RIS can
be used to perform Monte Carlo generation of chains, and thereby calculate some
of the properties that are not directly calculable by simple matrix multiplication.
The distribution function p(r) and intramolecular structure
factor S(k) are two examples of these. To get p(r),
one needs all of the moments of r (i.e., r4,
etc.), not just r2.
Any particular moment can be calculated by means of RIS, but for higher moments,
the generator matrices become progressively larger and more unwieldy. The structure
factor S(k), observable by means of radiation scattering, is
proportional to the average of:
Although a RIS Monte Carlo (RIS MC) calculation is computationally more intensive than a standard RIS calculation, the theory behind it is simpler in some ways. RIS MC makes use of the statistical weights described previously, but avoids the need to compute generator matrices for each property of interest.
In performing a RIS MC calculation, chain configurations (or conformations--these terms are used interchangeably here) are generated with the same probability distribution as that implicit in the statistical weights. A given property or set of properties is calculated for each conformation, and the results are averaged over a large number of these.
The normalized probability of occurrence for a given chain conformation in the RIS scheme (see Eq. 492 and Eq. 495) is:
The RIS MC procedure generates a chain conformation by beginning at the tail of the chain and setting backbone rotational states one-by-one until the head of the chain is reached. To be able to select such states requires that single-bond a priori probabilities and bond pair conditional probabilities be calculated. These are defined as follows. The probability of bond i being in state , irrespective of the states of all other bonds, is referred to as the single-bond a priori probability. The probability of bond i being in state , given that bond (i - 1) is in state ', is the bond pair conditional probability. These probabilities are determined by the statistical weights in such a way that the probability of generating a conformation using RIS MC is the same as that given in Eq. 500.
Specifically, the single bond a priori probability is given by:
To obtain conditional bond-pair probabilities, we first need a priori pair probabilities, which are given by expressions of the type:
For every conformation generated by the Monte Carlo process, appropriate data (e.g., end-to-end distance, radius of gyration) are calculated and running averages or distribution functions are maintained. The process of generating a conformation is repeated until a pre-selected total number has been reached.
Properties calculated from RIS Monte Carlo simulations
A richer variety of quantities may be calculated from RIS Monte Carlo computations than from conventional RIS computations. Although a significant increase in computer time is often required, RIS Monte Carlo is still faster than the usual (Metropolis) Monte Carlo or molecular dynamics methods.
Among the quantities calculable from RIS Monte Carlo are the following:
Probability distribution of end-to-end vector
The probability distribution of the end-to-end vector
(denoted p(r)) is a function giving the relative probability
that a randomly selected conformation of a polymer chain has a certain end-to-end
vector. For a chain in an amorphous system (liquid or glass), this distribution
is spherically symmetric, and thus a function of the magnitude of r alone.
For very long chains in the melt or in a theta solvent, p(r)
takes on a Gaussian form. For shorter chains, it deviates significantly from Gaussian
behavior. (See Figure 55)
Probability distribution of radius of gyration
The probability distribution of the radius of gyration, p(s), is analogous to p(r), except that the radius of gyration, s, is a scalar rather than a vector. This means that p(r) and p(s) have slightly different normalizations:
The eigenvalues of this tensor give the principal components of s2 in three mutually perpendicular directions. The probability distribution of the square roots of these eigenvalues (denoted p(S1, S2, S3)) indicates the average shape of the chain; for example, the extent to which it tends to be prolate as opposed to oblate. When represented as a 3D contour plot, p(S1, S2, S3) gives a vivid illustration of the chain's overall shape.
Pair correlation function
The pair correlation function, g(r), gives the relative probability of finding two atoms within the same chain separated by a distance r. Note that this is an intramolecular version of the usual intermolecular pair correlation function, also denoted g(r). The pair correlation function is directly related to the Fourier transform of the scattering function.
The scattering function, S(k), also known as the single chain structure factor (), is proportional to a polymer's scattering intensity for radiation (e.g., neutrons or x-rays) of wavevector k. RIS MC allows one to calculate an intramolecular scattering function (as opposed to the intermolecular or the total S(k)). The expression for S(k) for a molecule in an isotropic environment is:
The total scattering function for a molecule is a sum of contributions from all different types of scattering centers. In a RIS Monte Carlo calculation, each of these separate contributions may be computed as well. Suppose A and B are two types of scattering centers in a molecule. Then the total S(k) is represented in terms of partial S(k)'s as follows (assuming the scattering amplitude is the same for both centers):
|You can access online RMMC documentation by pointing your browser at the file $BIOSYM/doc/polymer/ris/rmmc/index.html. This file and those to which it links more fully describe some aspects and capabilities of the RMMC program.|
Both the conventional RIS and RIS MC methods require that statistical weights exist for the polymer of interest. There is a third method--"RIS" Metropolis Monte Carlo (RMMC) simulation--that does not require statistical weights. Instead, RMMC uses a forcefield (such as CVFF or PCFF) to determine the conformational properties of a chain. Because it does not use statistical weights (or even discrete torsion angles), RMMC is not, strictly speaking, a rotational isomeric state method. However, it shares several assumptions with RIS theory, and is used to compute the same types of properties as RIS. In this sense, it is closely related to RIS theory.
A step-by-step presentation of the RMMC procedure appears later in the The RMMC algorithm section. First, if you wish to decide which method is best for your polymer, consider these major differences between the RIS MC and RMMC methods:
|RIS MC requires that statistical weights exist for your polymer.||RMMC does not; instead, it relies on a forcefield in order to calculate the chain properties.|
|RIS MC assumes discrete torsion states (e.g., t, g+, g-).||RMMC allows torsions to vary continuously.|
|RIS MC cannot treat branched polymers.||RMMC can treat branched polymers.|
|RIS MC can account in an approximate way for the effect of an external field or long range excluded volume on a chain's conformations.||RMMC lacks this ability.|
|If statistical weights for your polymer exist, RIS MC is a faster method for getting results than RMMC.||However, RMMC is more general, and is easier to use if statistical weights do not already exist.|
Properties calculated with RMMC
The RMMC program can directly calculate the following properties:
As in traditional RIS methods, only torsional degrees of freedom are considered in determining a chain's conformation; bond lengths and angles are fixed. Unlike these methods, RMMC allows torsion angles to vary continuously; it does not impose the assumption of discrete rotational states.
As its name indicates, RMMC is a Metropolis (1953) Monte Carlo method. (This can be contrasted to the Markovian approach used in a RIS MC calculation to build independent chains.) In a Metropolis simulation of a polymer, one begins with a chain in an arbitrary conformation. A Monte Carlo step consists of making a small change to that conformation--e.g., by rotating a bond--and then deciding whether or not to retain that change, based on the temperature and the energy of the new conformation relative to the old one. This process is repeated many times in order to yield a set of conformations characteristic of that chain at the specified temperature.
In outline, a RMMC simulation proceeds as follows:
1. Perform an energy minimization on the molecule (so that bond lengths and angles adopt reasonable values).
2. Randomly select a rotatable backbone bond.
3. Select a random torsion value for this bond between -180 and +180 degrees.
4. Rotate the bond to its new torsion value and compute the new energy of the chain.
5. Generate a random number, R, between 0 and 1. If exp[-(Enew -Eold)/kT] > R, keep the new torsion value. Otherwise, restore the old value.
6. After enough steps have been done that the chain is equilibrated: compute the properties of the chain conformation (e.g., r2, s2), and update the running averages of these properties.
7. Repeat from step (2) until the desired number of iterations has been performed.
Treatment of constraints
In a RMMC simulation, bond lengths and bond angles are constrained. (For this reason, "pre-minimization" is recommended in order that the bonds lengths and angles adopt reasonable values.) The next three paragraphs deal with some of the technical subtleties of simulations with constraints. Knowledge of these subtleties is not required for a basic understanding of the RMMC simulation approach, but is included here for the sake of completeness.
In a Monte Carlo simulation that incorporates constraints (such as fixed bond lengths and angles) and uses internal coordinates (rather than Cartesian coordinates) in making its moves, care must be taken that the conformations are sampled correctly. Simulations may be performed with constraints that are regarded as "rigid" or as "flexible." In the rigid case, the momenta conjugate to the constrained coordinates are neglected. In the flexible case, it is imagined that the force constants for the constrained coordinates are so large that the coordinates do not move significantly from their equilibrium values, but that the conjugate momenta are activated. It is hard to say a priori which is a more reasonable approximation for a polymer. In a real polymer, bond lengths and angles are, of course, not rigidly constrained. On the other hand, according to quantum mechanics, very stiff degrees of freedom are not activated at ordinary temperatures. Go and Scheraga (1976) argue that, in practice, flexible constraints should be more quantitatively accurate.
It should be emphasized that the distinction between "rigid" and "flexible" constraints is in the assumptions behind the physical model. Thus, even in the "flexible" case, the constrained coordinates do not move during the simulation.
A flexible constraint requires no special treatment in a Monte Carlo simulation. This is because the configuration integral, as expressed in terms of torsional degrees of freedom, contains a Jacobian-like term that is independent of these degrees of freedom. Thus, this term can be moved outside the integral, and is divided out of any conformational averages. In the rigid case, this is not true, and in principle the Boltzmann factor used in the Metropolis acceptance criterion needs to be multiplied by an extra term (Fixman 1974). In practice, this extra term has been found to change the conformational statistics only slightly (Almarza et al. 1990). For simplicity, the flexible constraint assumption, with no extra terms in the Boltzmann factor, is made for the RMMC algorithm. Any errors introduced by this approach are likely to be much smaller than those due to other approximations in the RMMC algorithm--such as approximating the potential of mean force by a forcefield with a truncated interaction range. For more on the issue of constraints in simulations, see Fixman (1974), Allen and Tildesley (1987), and Almarza et al. (1990).
Polymer allows you to use RIS theory to calculate various properties of polymers. Databases of bond-level properties and statistical weights are provided. You may update these if they do not contain data for polymers you are interested in. Before you can perform a calculation on a given polymer, the RIS databases must contain information on the groups, properties, and statistical weights appropriate to that polymer.
In the database you receive with the polymer software, all data have been obtained from the literature. (The references are given in the statistical weights data file; see also Appendix C, File Formats.) Parameters for the following polymers and only these polymers are included in the database:
Several fundamental concepts are employed in the RIS software. Some of these, such as statistical weights, are directly extracted from the RIS theory. Others, such as substitution, are computationally based. The following concepts are important for understanding the scope, usage, and limitations of the RIS module.
Before a RIS calculation can be performed on a polymer chain, the RIS module must "sequence" the chain to determine the groups making up its backbone. This is necessary for proper statistical weights and bond properties to be assigned to the chain. In the context of the RIS module, a group is normally defined to be an atom in the backbone of a linear chain, along with its side groups. For example, in the molecule CH3-CH2-CHCl-CH2-CH3, there are three distinct groups: (CH3-), (-CH2-), and (-CHCl-). For the last of these to be uniquely specified, note must be made of its stereochemical orientation (whether l or d--see the discussion of stereochemistry below).
The following are special cases of groups: A ring or fused ring in the backbone is defined to be a single group, irrespective of the number of atoms it contains. A cis double bond is regarded as a single group, connected by virtual bonds to neighboring groups. On the other hand, a trans double bond is regarded as a sequence of two groups, which in a RIS calculation, are treated as comprising a bond with only one rotational state. A triple bond or any sequence of more than one multiple bond is defined to be a single group.
The RIS module maintains a database of known groups. For the module to identify a chain properly, the groups making up the chain must be in this database. When the chain sequencer is presented with a polymer, it attempts to find a match between the groups in the chain's backbone and known groups. In attempting to find a match, it accounts only for the connectivity between atoms, not for their positions in 3D space, with one exception. The exception is consideration for the stereochemical orientation of pseudo-asymmetric centers in the backbone of a tactic chain. In this way, account is made for tacticity. Account is not made for distinctions based on stereochemistry in side groups.
It is quite easy to add new groups to the database of known groups, but the bond properties and statistical weights databases must be updated as well for RIS calculations to be performed on a new polymer (see the Adding a new group and Adding new properties sections).
Rings in the backbone
In the RIS module, a ring or set of fused rings (e.g., p-phenylene) in the backbone is identified as a single group. Conceptually, such a group is connected by virtual bonds to neighboring groups.1 Care must be taken in assigning statistical weights and properties to virtual bonds. Many published results treat the ring itself as part of a virtual bond connecting its neighbors; that is, the ring itself is not treated as an explicit group. The approach taken here should be contrasted with that one. In particular, note that characteristic ratios calculated based on the two different approaches will disagree, even when the calculated chain dimensions are identical in both cases.
For the RIS module to compute the properties of an entire polymer chain, it must first have bond property data for each of the distinct types of bonds making up the chain. These properties include bond length (for structural properties, the properties most commonly calculated using RIS theory), bond dipole moment (for molecular dipole moments and Kerr constants), and polarizability tensors (for all optical properties). Bond properties are kept in a properties database that you may update.
For the purpose of specifying bond properties, a bond is defined by two connected groups. For example, an interior bond of polyethylene is defined by the group sequence methylene, methylene.
In performing RIS calculations, one end of a chain is chosen to be the tail and the other to be the head. However, this designation is completely arbitrary, and RIS theory yields the same conformational properties irrespective of which chain end is chosen as the tail. Because of the head/tail arbitrariness in a RIS calculation, if A and B are two different groups, and you add properties for an A-B bond to the database, you should also enter properties for a B-A bond.
A description of the concept of statistical weights
has been given in the Theory section.
Some details regarding treatment of these weights in the RIS
module are given here.
Because of the pentane effect, a bond pair is defined by a sequence of five groups, for the purpose of assigning statistical weights. For example, an interior bond pair in polyethylene is defined by the sequence methylene, methylene, methylene, methylene, methylene. For syndiotactic poly(vinyl chloride), one type of internal bond pair has the sequence methylene, d-chloromethylene, methylene, l-chloromethylene, methylene. For a RIS calculation to be performed on a particular polymer, all distinct sequences of five groups in the chain's backbone must have weights stored in the database. (However, see the information in Appendix C, File Formats, on using wildcards in specifying bond sequences. Also, see below for dealing with terminal and near-terminal bonds.)
A statistical weight for a bond pair is stored in the database as a prefactor and an energy, such that the value of the weight at a given temperature T is given by:
The calculation of the statistical weights matrix for a given bond pair in a polymer chain entails the following steps. First, a short chain segment (which commonly has a length of four skeletal bonds including the bond pair in question) is identified and extracted from the polymer chain. Consider, for example, the following chain segment which has been extracted from polytetrafluoroethylene (PTFE):
The second step is the determination of the total energy of this segment as a function of the two torsion angles 1 and 2. This is accomplished by running Discover to find the minimum energy corresponding to each fixed pair of values of the rotational angles.
Having obtained energy Et(1,2)2 as a function of the two torsion angles, the third step is the construction of a contour map showing the lines of constant energy on a two dimensional graph. The energy map for the PTFE segment shown above is given in Figure 56.
Next, the map is analyzed to locate the energy minima. The purpose of this analysis is to determine the discrete states (i.e., angle values) to be used in the RIS approximation. To this end, the map is divided into a set of regions. In most common situations, each local minimum would be surrounded by a distinct region. This is exemplified in Figure 57 for the PTFE segment.
When two or more minima are within close proximity to each other, however, it makes sense to group them together within a single region which then allows the representation of the minima in question together as a single state on the map.
The application of these ideas to another map (which is included below as an example) is illustrated in Figure 58. Note that there are 11 minima and 9 regions on this map. Two pairs of minima have been formed by lumping individual minima which are close to each other into pairs, each pair being surrounded by a single region.
Before discussing how these regions are employed in the calculation of statistical weights, another crucial step of the calculations is first explained here.
The energy values obtained for the PTFE segment above include all first-order and second-order interactions in the segment. Those interactions that depend on the value of exactly one torsion angle are termed first-order interactions. The interactions that depend on the values of exactly two (usually adjacent in the backbone) torsion angles are termed second-order interactions (Flory 1989).
For the PTFE segment shown earlier, the interaction between C1 and C4, for example, is a first-order interaction. The energy changes due to the attractive and repulsive forces between C1 and C4 depend only on the value of the first torsion angle, 1, and not on the second angle, 2. On the other hand, the interaction between C1 and C5 depends on both of the torsion angles, since the distance between C1 and C5 depends on both 1 and 2. Thus, the C1-C5 interaction is a second-order interaction.
By convention, the statistical weight matrix for a given bond pair should be based on all the second-order interactions in the segment and only those first-order interactions that are dependent on the second torsion angle 2. In the RIS approximation, the total conformational energy of a polymer chain is expressed as the sum of the energies of its segments. Since the energy changes due to the rotation of the first bond (i.e., the first-order interactions dependent on 1) are accounted for in the calculations for the neighboring segment (which includes the first bond of the present segment as its second bond) these changes should be excluded from the calculations for the present segment to avoid double counting them.
As was pointed out earlier, the energy map constructed includes all first- and second-order interactions within the segment in question. To calculate the statistical weights correctly from this map, the energy due to the first-order interactions dependent on 1 must be determined and subtracted. To this end, a second set of Discover calculations is carried out on the following chopped segment extracted from the original polymer segment.
Note that 2 is not defined on this molecule since there are no side groups attached to C4. Therefore, all first- and second-order interactions involving 2 are absent from this smaller segment. In this model, only the first-order interactions (such as the C1-C4 interaction) that depend on 1 are included. Discover is used to calculate the energy of the chopped segment as a function of the first torsion angle, i.e., Eseg(1). The net energy corresponding to a value of 1 is obtained by subtracting the chopped segment energy from the total energy at the same value of 1:
The next step is the calculation of average angles 1 and 2, the average energy E, and the partition function z for each region. The following integrals are evaluated to obtain these quantities:
The next step is the determination of the number and locations of the states, i.e., 1 and 2 values that will be employed as the discrete states in the RIS approximation. If all 1 and 2 values were distinct (that is, if i j then 1i 1j, and the same condition holds for 2 ), then there would be as many discrete 1and 2 values as there are regions. If the number of regions is Nr, then the dimension of the resulting weight matrix would be Nr XNr. Frequently, however, two or more regions may have the same, or similar, 1 or 2 values. In such cases, it is desirable to represent both regions using average 1, 2 values. If the 1 values of certain regions are to be combined so that they are all represented by a single 1combined value, this combined average angle is determined as follows:
Let the number of 1 values (obtained in the manner described above) be N1, and the number of 2 values be N2 (note that N1 Nr and N2 Nr). Further, let these angles (some of which may have been obtained by combining the average angles of two or more regions) be denoted by 1ci and 2cj, 1 i N1; 1 j N2. These angles are indexed such that they are in ascending order, for example:
If there is no region with the average angles 1ci and 2cj, then:
To perform calculations on vinyl chains, or any chains with pseudo-asymmetric centers, you should know the convention used by the RIS module for designating such centers. It is the same as that used by Flory (1989). For a -CHR- group, a l center is shown in the following figure. A d center has the positions of H and R reversed.
In performing RIS calculations on vinyl and other tactic chains, many authors have taken advantage of symmetries inherent in such chains to simplify their calculations. For example, in some treatments of vinyl chains, the sign convention for dihedral angles differs for different types of bonds in the chains. (See Chapter VI of Flory 1989, and Flory et al. 1974.) Because the core of the RIS module is general, it does not take advantage of such simplifications and uses the same sign convention and same form of transformation matrix for all bonds of a chain. The sign convention is that used in Flory's book, Chapter I (1989). (Positive rotations are measured in a right-handed sense.) If you obtain weights for vinyl chains from published results, you must be careful to confirm that angles are of the proper sign before you enter them.
Even if you consider only isotactic chains, you must enter parameters for both d-centers and l-centers in the chain. This is because the selection of one end of a chain as tail and the other as head is arbitrary, and therefore unpredictable in advance. This means, for example, that an isotactic chain may be regarded equally well as a sequence of d-centers or l-centers by the RIS module.
"RIS" Metropolis Monte Carlo (RMMC) concepts
In an RMMC simulation, every rotatable bond can potentially have its torsion angle changed. The RMMC program uses backbone atom flags to determine which bonds are rotatable for simulation purposes. A bond is considered rotatable in this context if it satisfies all of the following criteria:
Unlike RIS and RIS MC calculations, RMMC does not use statistical weights. Instead, it uses the energy as computed from a forcefield in order to calculate chain conformational properties. The only energy terms considered in a RMMC calculation are:
Parameters controlling the simulation
A number of parameters can affect the outcome of a RMMC simulation. These include the following:
Whether articulated side groups are treated as flexible in a RMMC simulation is determined by whether the side group atoms have their backbone flags set. (The Sequencer/Set_Backbone command can be used to set these flags. It is best to do so on the repeat units before building the polymer.) In principle, it is more realistic to treat side groups as flexible rather than rigid. However, doing so increases the computation time and might not be necessary if the groups are small (as, for example, in polypropylene).
The "energy scaling factor" is a number that multiplies all computed energies. If experimental data are available, this factor can be used to refine the properties calculated by RMMC so that they match experimental properties as closely as possible. The same factor can then be used for calculations on related chains for which no experimental data exist.
The reason for the relatively large number of parameters controlling the energy calculation is as follows. A RMMC simulation is performed on an isolated polymer chain. Yet, the properties desired are those for a chain in solution or in the melt. If solvent molecules or other chain molecules were present in the simulation, then a high quality forcefield such as PCFF should be capable of predicting the correct properties. Because these other molecules are not explicitly present, their effects must be mimicked by altering the way the energy is computed. (In the language of statistical mechanics, it is not "bare" interaction energies but potentials of mean force (PMF) that determine the polymer conformations in a solvent or in the presence of other chains. See McQuarrie (1976) and Mattice and Suter (1994) for more on the PMF concept. The goal in a RMMC simulation is to have the computed energy come as close to the potential of mean force as possible.)
Using Max Bonds to limit the range of nonbond interactions is a way of mimicking theta conditions. However, there is no known a priori way to determine the ideal value of Max Bonds for a particular polymer. Thus, calculations with differing values of this parameter should be performed and the results evaluated according to their reasonableness or agreement with established data.
The dielectric constant provides another way of mimicking the presence of solvent. But it must be kept in mind that, at a molecular level, a solvent is not a dielectric continuum, so that this too is an approximation. Consequently, the best value for the dielectric constant in a RMMC calculation might not be the same as the system's bulk dielectric constant. In the event that variation of Max Bonds and the dielectric constant within reasonable limits does not give adequate results, the energy scaling parameter is available as a last resort.
The RIS module allows you to perform calculations on linear polymer chains using Rotational Isomeric State (RIS) theory. Among the properties you can calculate are the mean square end-to-end distance, radius of gyration, persistence vector, mean square dipole moment, and various optical properties.
In addition to the core pulldowns in the top menu bar, the RIS module adds the Sequencer, RIS_Compute, RIS_Monte_Carlo, NMR_System, Database, Stat_Weights, Graph, and Background_Job pulldowns to the lower menu bar.
The Sequencer pulldown allows you to sequence a polymer into known and unknown RIS or QSPR groups, to locate and identify these groups, to perform group substitutions, and to perform group database maintenance. This pulldown contains the commands: S_Run, Substitute, Group_Database, Set_Backbone, Color_Alt_Grps, and Color_Indiv_Grps.
Please see Chapter 21, Sequencer Pulldown, for more information on the commands within the Sequencer pulldown.
The RIS_Compute pulldown contains commands for the final set up and running of a RIS calculation. These commands are Files, Define_Segment, and C_Run.
The RIS_Compute/Files command allows you to specify the bond properties and statistical weights files to be used in the calculation. By default, the system files are used.
The RIS_Compute/Define_Segment command allows you to specify Tail, Repeating, and Head segments in preparation for a variable length chain calculation using the C_Run command. This allows you to calculate quantities as a function of the degree of polymerization.
The RIS_Compute/C_Run command causes statistical weights to be assigned and a background RIS process to be run. All properties that you specify are calculated. These are mean square end-to-end distance, mean square radius of gyration, persistence vector, mean square dipole moment, polarization anisotropy, molar Kerr constant, optical configuration parameter, and the log of the configurational partition function. You may also specify a range of temperatures over which to calculate the properties.
Calculations may be performed on a single chain (of fixed or variable length), or on each chain in an assembly of chains, with averaged results compiled automatically.
The RIS_Monte_Carlo pulldown contains commands that allow you to set up, run, and see the results of RIS Monte Carlo simulations. The commands accessible from this pulldown are Range, Scatterers, MC_Run, Playback, MC_To_arc, and RIS_Metrop_MC_Run.
The RIS_Monte_Carlo/Range command allows you to specify the range of argument over which to calculate various functions using RIS Monte Carlo.
The RIS_Monte_Carlo/Scatterers command allows you to specify scattering centers within your polymer chain for the purpose of calculating partial scattering functions or pair distribution functions.
The RIS_Monte_Carlo/MC_Run command causes statistical weights to be assigned and a background RIS Monte Carlo job to be run. The functions you may calculate using RIS Monte Carlo are p(r/r0), p(s/s0), p(S1, S2, S3), S(k), g(r), r(f).
Calculations may be performed on a single chain of fixed length, or on each chain in an assembly of chains, with averaged results compiled automatically.
The RIS_Monte_Carlo/Playback command is used to display configurations (conformations) previously generated using the RIS_Monte_Carlo pulldown.
The RIS_Monte_Carlo/MC_To_arc command is used to convert RIS conformation files (.riscnf) to Biosym archive files (.arc). RIS conformation files can be created by the MC_Run command. Archive files can be used as input to Discover, Flexiblend and several other commands within Insight II.
The RIS_Monte_Carlo/RIS_Metrop_MC_Run command is used to compute the conformational properties of a polymer chain. It performs a RIS Metropolis Monte Carlo (RMMC) simulation of your polymer, calculates the properties that you specify, and (optionally) saves the conformations for later viewing or analysis. If you choose to, you can modify various parameters controlling the simulation in order to enhance the quality of the results.
The Database pulldown contains commands that allow you to update information in the RIS databases of properties and statistical weights. These commands are Properties and Weights.
The Database/Properties command allows you to add to, delete from, or get the database of bond properties used in RIS calculations.
The Database/Weights command allows you to add records to or to delete or get records from the database of statistical weights.
The Stat_Weights pulldown allows you to perform all the steps necessary to calculate estimated statistical weight matrices for unknown bond pairs. This pulldown contains the commands: Bond_Pairs, Define_Torsions, Dihdrl_Dihdrl_Run, DihdrlDihdrlCntour, Analyze_Energy, Refine_Region, Compute, Combine_Angles and Define_Edge.
The Stat_Weights/Bond_Pairs command allows you to locate all unknown bond pairs in a polymer chain, label and display each unknown bond pair, one at a time, and extract an unknown bond pair into a new fragment.
The Stat_Weights/Define_Torsions command allows you to define torsion angles. You can Add new torsions, Clear all defined torsions, Delete a particular torsion angle, or List the currently defined torsion angles.
The Stat_Weights/Dihdrl_Dihdrl_Run command allows you to set up and execute a Discover background job to perform the minimizations necessary for calculating estimated statistical weight matrices for an unknown bond pair.
The Stat_Weights/DihdrlDihdrlCntour command allows you to construct a energy-dihedral-dihedral graph, and to obtain a contour map from the graph.
The Stat_Weights/Analyze_Energy command carries out the initial analysis which locates all the local minima in the given map or curve. For a map, a set of default regions are defined and displayed. For a curve (i.e., graph of energy vs. dihedral angle), a set of default segments are defined and displayed.
The Stat_Weights/Region_Set_Up command allows you to modify or totally replace the default set of regions on a map.
The Stat_Weights/Refine_Region command allows you to merge and unmerge regions on a contour map and to color regions. Merging two regions causes them to be considered one conceptually, without physically changing them (other than color).
The Stat_Weights/Compute command allows you to calculate the partition function, average energy, and average rotational angles when Compute Step is Compute_Averages. You can reduce the number of rotational states by combining average rotational angles when Compute Step is Combine_Angles. In addition, you can calculate the statistical weight matrix for a given set of angles when Compute Step is Compute_Weights.
The Stat_Weights/Combine_Angles command is activated when the Compute Step parameter in the Compute command is set to Combine_Angles. This command allows you to combine, uncombine, or list average rotational angles. Average rotational angles may be combined by specifying a tolerance, or by explicitly picking the angles to be combined.
The Stat_Weights/Define_Edge command allows you to add a new region by defining its edges. This command is activated when the Region_Mode parameter under the command Region_Set_Up/Stat_Weights is set to Add_Region.
The Graph pulldown includes the following commands: Boolean, CharSize, Color, Contour, Correlate, Differentiation, Equation, FFT_Real, Get, Info, Interpolation, Integration, Label, Line_Fit, Modify_Display, Move_Axis, Put, Scale_Axis, Smoothing, Threshold, and Tick_Mark.
Please see Chapter 19, Graph Pulldown for more information.
The Background_Job pulldown allows you to set up background jobs to run concurrently or interactively with Insight II. You are given the choice of whether to send background jobs to a local or remote host. This pulldown is generic and is found in many Insight II modules that run background jobs. The Background_Job pulldown contains the following commands: Setup_Bkgd_Job, Control_Bkgd_Job, Completion_Status, and Kill_Bkgd_Job.
Please see Chapter 15, Background_Job Pulldown for more information.
Procedures when parameters are known
To perform a RIS calculation for a polymer whose parameters (statistical weights and bond properties) are known, the following sequence of steps must be performed.
Performing group substitution
When preparing to run a RIS calculation, the first possibility that may be encountered is a message that one or more groups in the chain are unknown to the RIS module. (This means that properties and statistical weights for chains containing these groups are unknown as well). If you perform step 3 under Calculations on Assemblies of Chains, (above), and get a message that groups are unknown, execute the Sequencer/Substitute command to label them. In addition to the unknown groups, the tail of the chain is also labeled to allow you to consider stereochemistry when performing substitutions.
The Substitute command is used to perform a group substitution. This is accomplished by identifying the atom to change and the name of the group to be associated with that atom. As groups are substituted, the labels marking the unknown atoms are removed. When all unknown groups are substituted, no labels remain. You may then carry out the RIS calculation by performing step 4 above.
Even if you use the Substitute command successfully, you may find that statistical weights for your chain are unknown. This means that even though all the groups in your chain are known, there are sequences of groups for which weights are not known. If this happens, you must update the properties and weights databases to perform calculations on your molecule. Before doing so, you must first add any new groups to the database of known groups. Statistical weights may be obtained from the literature or estimated from energy contour plots, using the commands in the Stat_Weights pulldown.
Calculating new statistical weights
When all of the groups in your chain are known, but there are sequences of groups for which weights are unknown, statistical weights for the unknown bond pairs (usually, five groups define a bond pair,) can be calculated by performing the following steps:
The bond pair and the corresponding rotational angles in the cis arrangement can be defined as follows:
The calculations carried out on this segment will give the desired statistical weights.
As pointed out above, the RIS method treats a trans double bond as a one-state bond. The statistical weight matrix for the angle pair i-1 and i requires no calculation as it is a column matrix consisting of 1's. Its dimension is determined by the number of columns of the matrix for the previous bond pair i-2 and i-1. The matrix corresponding to the bond pair i and i+1 is a row matrix. The calculation of this matrix requires conformational energy calculations based on the rotation of a single angle (i.e., i+1) only.
Consider the following structure within the backbone of a chain:
The two backbone bonds connecting the ring to the rest of the chain are "replaced" by two virtual bonds as shown in the illustration above. That is, this ring is treated in exactly the same manner as a cis double bond in the backbone. The statistical weight matrix for the virtual bond pair in question can be obtained by defining the rotational angles as follows:
The following ring structure is analogous to a trans double bond in the backbone:
Here, a one-state virtual bond (i.e., nonrotatable) is placed across the ring cluster as shown in the illustration. The matrix corresponding to the angle pair i and i+1 is a row matrix which requires single-angle conformational calculations. Although the Stat_Weights tools can be used to calculate this matrix automatically, note that the Sequencer currently does not treat such structures properly. You may, however, treat fused rings of this type by editing the Sequencer output file.
Adding a new group
See Chapter 21, Sequencer
Pulldown for the procedure for adding new groups to the database.
Note: You must have write privileges for the system groups database to be able to add new groups. If not, an error message is returned when you try to add a new group. If this happens, see your system manager regarding write privileges for the database, or make your own personal copy of the system database as described in the Sequencer Tutorial.
Adding new properties
After adding a new group, you should add properties for bonds between that group and any other groups to which it is bonded in the polymer of interest. To add new properties, you must first have a properties file in your current directory. If you do not, then enter the UNIX shell and type:
> cp $BIOSYM/data/polymer/ris/system.prop user.propBecause you need to add new weights as well, type:
> cp $BIOSYM/data/polymer/ris/system.statwt user.statwtto get a local copy of the system weights file. (Later, when performing calculations using data in these files, you will use the RIS_Compute/Files command to select these files.)
Once a local properties file is in place, use the Database/Properties command to add new entries to this file. Specify the particular file you want to modify (in case there is more than one) and the following properties: bond length (in Å3), bond dipole moment (in Debye), elements of the anisotropic optical polarizability tensor (in Å), and elements of the anisotropic static polarizability tensor (in Å3). Of these properties, only the bond length is essential. If you lack data on the other properties, enter 0 for each of them, but keep in mind that quantities dependent on them (mean square dipole moment and optical properties) will be calculated incorrectly.
Adding new statistical weights
The final step in adding parameters for a new polymer is using the Database/Weights command. Before using this command you need to obtain a set of statistical weights for your polymer. Such weights can be obtained from the literature or from conformational energy calculations that you perform yourself (e.g., by generating energy contour maps using the tools in the Stat_Weights pulldown).
|2||A1k, EPS1k||1.0, 1.0||1.0, 0.5||1.0, 0.5|
|3||A2k, EPS2k||1.0, 0.0||1.0, 2.5||1.0, 0.5|
|4||A3k, EPS3k||1.0, 0.0||1.0, 0.5||1.0, 2.5|
Conceptually, a statistical weights matrix is associated with the second bond of a bond pair; that is, the bond between Group 3 and Group 4 when you define the bond pair by a group sequence. The first bond of a chain has an imaginary zeroth bond that precedes it. The first bond is specified as follows (e.g., for polyethylene):
Group 1 = NONE
Group 2 = NONE
Group 3 = methyl
Group 4 = methylene
Group 5 = methylene
Note, first, that you use NONE to specify nonexistent groups off the end of the chain, and second, that it is a methyl (-CH3) rather than a methylene (-CH2-) group that terminates the chain. In Flory's convention, the statistical weights matrix for the first bond of a chain is a 1 X 1 vector of the form:
The weights matrices of the second and third bonds of a chain do not take on any special form. However, for the RIS module to be able to break down a chain into contiguous sequences of five groups and thereby assign weights for the complete chain, the sequences defining these bonds must be specified. For example:
Group 1 = NONE
Group 2 = methyl
Group 3 = methylene
Group 4 = methylene
Group 5 = methylene
Group 1 = methyl
Group 2 = methylene
Group 3 = methylene
Group 4 = methylene
Group 5 = methylene
If you want to use the same weights matrix for bond 2 and bond 3 of your chain as for interior bonds, you may perform step 4 above multiple times, specifying a slightly different sequence of groups in each case, as indicated above. In the case of simple chains such as polyethylene, it is also appropriate to use the same matrix for the second-to-last bond of the chain, specified as:
Bond n - 1:
Group 1 = methylene
Group 2 = methylene
Group 3 = methylene
Group 4 = methylene
Group 5 = methyl
To enter properties for the first or last bond of a chain, you specify pairs of groups such as methyl, methylene or methylene, methyl as appropriate.
The convention shown above is not the only possible one for the first weights matrix in a chain. In particular, the form of the bond 1 matrix may vary depending on the structure of the matrix associated with the second bond of the chain.
Strictly speaking, the weights matrix for bond 2 should include contributions only from first-order interactions (see the Statistical weights section). Because of this, and because the state of the first bond is undefined, the weights for bond 2 can be placed in a diagonal matrix, instead of a matrix whose first row represents the first order interactions for this bond. In the case of a diagonal bond 2 matrix, the bond 1 matrix must have the form:
Another possibility is that the reference state (e.g., the trans state) for bond 2 is not the state represented by the first row of its weights matrix. When statistical weights are calculated automatically using the commands in the Stat_Weights pulldown, this is a distinct possibility. The order of rows (and also columns) in this case, is from the most negative to most positive value of the torsion angle. (Thus, for polyethylene, the order of the states would be gauche-minus, trans, gauche-plus--corresponding to the angles -120° 0°, +120°). If such a convention is used, and the bond 2 matrix is not of diagonal form, then you must either:
The final bond of the chain has the following sequence:
Group 1 = methylene
Group 2 = methylene
Group 3 = methylene
Group 4 = methyl
Group 5 = NONE
and also has a special weights matrix that is a n-1 X 1 column vector of the form:
How terminal groups are determined
The RIS module uses the RIS theory as formulated in Flory's 1974 review paper. One of the premises of this theory is that the terminal groups of a chain have cylindrical symmetry. This premise determines what one must select as the terminal group of a chain. For example, if a chain ends in a methyl group (-CH3), then this group has cylindrical symmetry and is acceptable according to the above rule. However, if a methyl chloride group (-CH2Cl) ends the chain, such is not the case. The rule applied by the RIS module is that the heaviest singly--coordinated atom becomes the new terminal group in this case. Thus, for the purpose of assigning weights and bond properties, the sequence -CH2-CH2Cl would be identified as methylene, methylene, chloride; not as methylene, methylchloride.
Specifying the argument range of a function to be calculated
The RIS_Monte_Carlo/Range command allows you to specify the argument range and fineness of discretization (Number Of Bins) for any function to be calculated by RIS Monte Carlo. If you use this command, you must execute it before executing the MC_Run command.
To use the RIS_Monte_Carlo/Range command, select the function for which you wish to set the argument range. Then enter values for Min Argument, Max Argument, and Number Of Bins. The parameters of Min Argument and Max Argument are self explanatory. The Number Of Bins parameter determines the number of points between the minimum and maximum argument at which the function is to be calculated.
You may use the RIS_Monte_Carlo/Range command for any number of functions before invoking the RIS_Monte_Carlo/MC_Run command. If you choose not to use the Range command for a function, reasonable default values for Min Argument, Max Argument, and Number Of Bins are used when you calculate the function.
Identifying scattering centers
If you calculate the structure factor, S(k), or pair correlation function, g(r), then by default every key backbone atom in the polymer chain is regarded as a scattering center. If you wish to identify particular scattering centers--e.g., for the purpose of calculating partial scattering functions--you may do so using the RIS_Monte_Carlo/Scatterers command.
Each scattering center that you identify has a Scatterer Number associated with it. For the purpose of calculating partial scattering functions, however, it is the Scatterer Name that is relevant. You may have several different scattering centers that you select identified by the same Scatterer Name. However, no more than four distinct Scatterer Names are allowed.
To identify a scattering center, you must first identify the key backbone atom associated with it. The key backbone atom of a group is the atom in the chain's backbone with which the group is identified. In the case of a simple group such as methylene, this is an unambiguous selection. In the case of a more complicated group, such as p-phenylene, there is a certain arbitrariness in identifying the key atom. In such a case, the key atom is defined as the tailmost backbone atom in the group. Because of the head/tail arbitrariness in a polymer chain, you may have to select first the atom at one end of a group and then that at the other end in order to identify the key atom. (You are prompted with an error message if a Backbone Atom you select is not a key atom.)
Once a key atom is selected, you may simply let the key atom be the scattering center, or you may select up to three additional atoms to establish the position in space of the scattering center. The average position of these auxiliary atoms (Scatter Atom 1, 2, and 3) gives the scattering center position. In performing the Monte Carlo calculations, this position is assumed to be invariant in the coordinate frame associated with the bond connecting the key atom with its immediately headward key backbone atom. (See Flory 1989, page 20, for the definition of this coordinate frame). When you have identified a scatterer, all groups with the same key as that of the key backbone atom are identified with scatterers of the same type.
The specific steps in identifying scatterers are as follows:
Select the Read_Configs option and enter the name of the Configuration File you wish to read. Select Execute. The file is read, and the Display_Config option is automatically selected. You may then repeatedly select Execute to step through the configurations, or enter any specific Config Number that you wish to see.
Hints for saving computation time
In RIS Monte Carlo simulations, different functions require vastly different numbers of configurations to obtain results with good precision. For example, a calculation of p(r) or p(s) may require 10,000 or more configurations for adequate statistics to be obtained. On the other hand, a calculation of S(k) or g(r) may require only a few hundred configurations. The latter is fortunate, because the time per configuration required to calculate S(k) or g(r) scales as n2, where n is the number of backbone bonds in the chain. The time per configuration required for calculating p(r), p(s), p(s tensor), or r(f) scales only as n.
When you wish to calculate many different functions for a large chain, it is best to calculate the order n and order n2 functions separately. The order n functions typically require many more configurations than do those of order n2. (Note: for S(k), an approximate order n calculation method is available, and may be selected using the RIS_Monte_Carlo/Range command. For chains containing more than about 100 bonds, this method produces results of comparable precision to the order n2 method in a much shorter time).
RIS theory considers only torsional degrees of freedom in backbone bonds in performing its calculations. Chains with large, flexible side groups are therefore inappropriate for RIS and result in a branched chain error message. Very large rings in the chain backbone may also result in a branched chain error message.
A chain of only one bond results in an error message. This is a trivial case, of course. A chain of only two bonds is likely not to have statistical weights assigned, and thus may generate an error message. (A special weights record must be entered for the two-bond case, with the first and last groups both being NONE. Since the two-bond case is normally of little interest, this has not been done for the system statistical weights library).
Note that the above restrictions apply only to RIS and RIS MC calculations. They do not apply to RMMC calculations, which are discussed next.
"RIS" Metropolis Monte Carlo (RMMC) simulations
If your chain is branched, or if its statistical weights are unknown, RMMC simulation is an alternative to the RIS and RIS MC approaches. To calculate the properties of a polymer chain using RMMC, perform the following steps.
To calculate a bond pair distribution function, use Subset/Template to define a subset of n-tuples containing 5 to 8 atoms each. The first 4 atoms of each n-tuple define the first dihedral angle of a bond pair; the last 4 atoms define the second. Execute the RIS_Metrop_MC_Run command with the Dihedral_Pair_Dist parameter toggled on. The RMMC program will calculate both a priori and conditional probability distributions for the bond pair.
A RMMC calculation may produce a number of output files. Listed below are the extensions and contents of these files.
Then, from the Open Tutorial window, select Polymer Modeling and Property Prediction tutorials, then select RIS Module Tutorials, and choose from the list of available lessons:
Lesson 1: Calculating Properties of Isotactic and Syndiotactic Polystyrenes
Lesson 2: Using RIS Metropolis Monte Carlo Simulation
Lesson 3: Calculating RIS properties for a new polymer is not included as an online tutorial. For this tutorial, follow the step-by-step instructions given below.
You can access the Open Tutorial window at any time by clicking the Open File button in the lower left corner of the Pilot window.
For a more complete description of Pilot and its use, click the on-screen help button in the Pilot interface or refer to the Introduction in the Insight II manual.
The topic covered in this tutorial is:
Overview of tutorial Lesson 3
> mkdir ris_tut > cd ris_tut
In order to carry out RIS calculations, all the groups in the polymer must be present in a RIS groups database and statistical weights for all the bond pairs must be present in a RIS statistical weights database. It is suggested that you use local copies of the RIS groups and weights databases rather than modifying the centrally located files.
It should be emphasized that the statistical weights you calculate here are based on minimizations in vacuum. Thus, any effects due to surrounding chains or solvent in a bulk system are neglected. Although weights computed in this manner may correctly predict trends within families of polymers, they should be validated and fine tuned by comparison with experimental data whenever possible. This optimization has been done for virtually all statistical weights reported in the literature.
The topics covered in this tutorial are:
1. Initial Setup
Perform the following procedure in the UNIX environment before started Insight II.
|Make a new directory named statwt_tut for this lesson, and move to this directory. At the UNIX operating system prompt, enter:|
This creates a statwt_tut subdirectory and copies to it the files you need for this lesson.
2. Set the Environmental Variable RIS_FILES
Set the environment variable for the RIS files to point to your statwt_tut directory. This tells the RIS program where to look for the RIS database files. You can also change the environment variable within Insight using the Session/Env_var command. If you always want to use local versions of these RIS database files you can place the setenv RIS_FILES command in your .cshrc file.
|To set the required environment variable, and verify that it points to your statwt_tut directory, enter:|
3. Build A "New" Polymer
|From the statwt_tut directory, start an Insight II session by entering insightII at the UNIX operating system prompt.|
|Click the Accelrys icon and select the Polymerizer module. Select the Homopolymer/Polymerize command. Select the Vinyls option in the Type Of Repeat_Unit parameter box. When the repeat unit icons appear on the screen, pick one of the atoms on the dClEth (dichloroethylene) repeat unit icon. Enter 5 in the Polymerization Deg parameter box. Leave the other parameters at their default values, and select Execute to construct the polymer.|
4. Sequence the Polymer and Display Unknown Groups
|Click the Accelrys icon and select the RIS module. You are prompted to Fix the potentials in the Forcefield/Potentials command. Select Fix for Potential Action and Charge Action, then Execute.|
|Select the Sequencer/S_Run command. The polymer name is automatically entered in the Assembly/Molecule parameter box, and Group_Type is set to RIS. Leaving SEQ Object Type set to Molecule, select Execute.|
The following message appears in the information area:
Starting S_Run background job on <hostname> as job <job number>After a few moments the Sequencer background job completes and the following message appears in the information area:
S_Run job terminated with diagnostics: There are unknown groups in the chain.Next you display the unknown groups in the polymer.
|Select the Sequencer/Substitute command. The polymer name is automatically entered into the Molecule Name parameter box, Group_Type is set to RIS, and Substitute_Action is set to Show_Unknown_Grps. Select Execute.|
Make sure that the end with the Tail label is on the left.
In the poly(vinylidene chloride) molecule you have built there are six labeled atoms, i.e., one unknown group which occurs five times, plus the tail atom, which is also labeled.
5. Add Unknown Groups to the Groups Database
Here, you add dichloromethylene to the groups data base.
|Select the Sequencer/Group_Database command. The Group_Type is set to RIS, and Group_Action is Add. To add the dichloromethylene (-CCl2-) group as a new group, choose the carbon atom in the first (-CCl2-) group to be the Headward Atom (labeled C1 in the diagram below). Choose the carbon atom attached to the left side of the (-CCl2-) group to be the Tailward Atom (labeled C2 in the diagram). Enter dichloromethylene as the Group Name and select Execute.|
The information area contains the following messages after the Add is performed:
Starting S_Run background job on <hostname> as job <job number>Notice that because dichloromethylene is not pseudochiral, choosing the Tailward Atom to be on the right side of the (-CCl2-) group would not make any difference in the group definition.
S_Run job completed successfully.
You next re-sequence the polymer into groups.
|Select the Sequencer/S_Run command. Select Execute.|
After a few moments the Sequencer background job is completed and the following message appears in the information area:
S_Run job completed successfully.6. Verify the Absence of Known Groups
|Select the Sequencer/Substitute command. The polymer name is automatically entered into the Molecule Name parameter box, Group_Type is set to RIS, and Substitute_Action is set to Show_Unknown_Grps. Select Execute.|
The information area contains the following message:
There are no unknown groups.Notice that there are no labeled atoms on the polymer.
7. Employ Local User Files Instead of Default System Files
|Select the RIS_Compute/Files command. Stat Weights File: User (default is System)|
This command allows you to employ user.statwt as the statistical weights database instead of the default file $BIOSYM/data/polymer/ris/system.statwt, and user.prop as the properties file instead of $BIOSYM/data/polymer/ris/system.prop. Note that because you defined the RIS_FILES environment variable, there are no system files shown. If, after finishing this tutorial, you wish to access the system files, open a new UNIX window.
The local file user.statwt is empty at this point. You will enter the statistical weights for poly(vinylidene chloride) into this file after they are calculated.
The local file user.prop is also empty. You will add the bond lengths associated with poly(vinylidene chloride) into this file.
8. Add Properties for the New Groups to the Properties Database
|Select the Database/Properties command. Action: Add|
|Next, specify the same bond length by choosing methyl for Group 1 and dichloromethylene for Group 2.|
In general, if you enter properties for A-B, where either A or B is a new group, you should also enter properties for B-A.
|Repeat the same process for the groups in the following table:|
|Group 1||Group 2||Bond Length|
9. Identify and Extract Bond Pairs for Which Statistical Weights
Are Not Known
|Select the Stat_Weights/Bond_Pairs command. Select Locate_Unknown for Bond Pair Action. The molecule name (POLYDCLE) is automatically entered for the Molecule Name parameter. Select Execute.|
The following message appears in the information area:
There are 6 unique unknown bond pairs.
|Select Execute to display the first "pair".|
Groups identified are: methyl dichloromethylene methylene
|Select Execute again to display the second bond pair.|
Groups identified are: methyl dichloromethylene methylene dichloromethyleneNote that the first bond "pair" is identified by three consecutive backbone atoms, and the second "pair" is identified by four such atoms.
It should be noted that a bond pair is defined by a sequence of five groups. In such a sequence, the bond connecting the backbone atom of group 2 to that of group 3 is the first bond, whereas the bond connecting the backbone atom of group 3 to that of group 4 is termed the second bond.
Therefore, strictly speaking, the first two sequences of groups located and displayed by the Stat_Weights/Bond_Pairs command do not define bond pairs. The Bond_Pairs command identifies these two sequences as "unknown", because the statistical weights database has to contain a matrix corresponding to each of these sequences before you can calculate conformation dependent properties for the polymer in question using the RIS module.
The matrix corresponding to the first sequence (CH3-CCl2-CH2) is a row vector (given by Eqn (43) in Flory 1974) and does not require any calculations. The specification of this matrix will be discussed later in this lesson.
The second sequence (CH3-CCl2-CH2-CCl2) identifies the second bond (the first internal bond) on the backbone of the polymer chain. The statistical weight matrix (denoted by U2 on pp.67-68 of Flory 1989) for this bond can be calculated automatically by the Stat_Weights tools. This matrix can be approximated, however, by using the weight matrix of the internal bond pair defined by the sequence CCl2-CH--CCl2-CH2-CCl2 (see pp.67-68 of Flory 1989).
|Select Execute again to display the third bond pair.|
The information area contains the following message:
Groups identified are: methyl dichloromethylene methylene dichloromethylene methyleneThe backbone atoms of these groups are labeled.
These five consecutive groups define a bond pair for which the statistical weights can be estimated using the procedure described below. The statistical weight matrix for this bond pair, however, is the same as that for the fifth bond pair defined by the groups:
methylene dichloromethylene methylene dichloromethylene methyleneA separate calculation for the third bond pair, therefore, is not necessary. The results that will be obtained for the fifth bond pair will be employed for this pair also. (To see the basis of this usage, you may note that both sequences contain the same bond pair: CCl2-CH2 and CH2-CCl2. Furthermore, the neighboring backbone atoms are carbon atoms, either in the form of CH3 or CH2C, which are essentially the same for our purposes).
|Select Execute again to display the fourth bond pair.|
The information area shows the following:
Groups identified are:
dichloromethylene methylene dichloromethylene methylene dichloromethylene.
|Now, set Bond Pair Action to Extract_Unknown, Type DCL1 for the Fragment Name, and select Execute.|
|Move this segment (DCL1) to the upper left side of your screen using the mouse. Bond Pair Action: Display_Unknown|
The information area shows the following:
Groups identified are: methylene dichloromethylene methylene dichloromethylene methylene
|Set Bond Pair Action to Extract_Unknown, type DCL2 for the Fragment Name, and select Execute.|
|Move this segment (DCL2) to the upper right side of your screen. Set Bond Pair Action to Display_Unknown again. Type 6 for the Bond Pair Number, and select Execute.|
The information area shows the following:
Groups identified are: methylene dichloromethylene methylene dichloromethylene hydrogenThis sequence of groups, while not equivalent to the fifth sequence, contains the same bond pair: CCl2-CH2 and CH2-CCl2. Furthermore, it is on the "right" end of the chain and occurs only once along the chain. Therefore, its statistical weight matrix is approximated by that of the fifth sequence, the error introduced being negligible for long chains for which end effects are not significant.
Thus, there are six unknown bond pairs (i.e., six sequences of groups) for which you have to enter the statistical weight matrices into the statistical weights database before you can calculate conformation dependent properties for this "new" polymer using the RIS module.
These six sequences of groups are:
The calculations carried out using DCL1 will give the matrix for sequence 4.
The matrix for sequence 1 is a row matrix, which can be determined without any calculations (these will be described later).
The matrix for sequence 2 will be approximated using that of sequence 4.
|At this point, delete the original polymer chain using the Object/Delete command.|
10. Calculate Statistical Weights for the Internal Bond Pairs
|Select the Stat_Weights/Define_Torsions command. Select the Add mode for Activation. For Torsion Name, type phi1. Starting from the left end of the segment DCL1, pick the 4 backbone carbon atoms for Atom 1 Spec, Atom 2 Spec, Atom 3 Spec, and Atom 4 Spec.|
These 4 consecutive backbone atoms define the first rotational angle and are shown in the illustration below:
|Now type phi2 for Torsion Name and pick the 4 backbone atoms starting from the second backbone atom from the left end of the segment:|
This sequence of atoms defines the second torsion angle in your molecule.
|Next, select the Stat_Weights/Dihdrl_Dihdrl_Run command. Make sure that you identify the molecule (in this case DCL1) for which you have just defined torsions. You can do this by picking that molecule using the mouse, or by typing the name DCL1 for Molecule Name. Leave all the parameters at their default values. In particular, make sure that the Run Name parameter is set to the same name as the Molecule Name (i.e., DCL1). Select Execute.|
The following message is displayed at the information area:
Starting Discover minimization background job on <hostname> as job <job number>
|Repeat the above commands (starting with the Stat_Weights/Define_Torsions command, and ending with the Stat_Weights/Dihdrl_Dihdrl_Run command) for the segment DCL2.|
Starting Discover minimization background job on <hostname> as job <job number>Even with the coarse grid used by default (i.e., 36°), the Discover calculations take about ten minutes on a 20 MHz 4D/25 SGI workstation to complete. To obtain maps with enough detail that "good" estimates for the statistical weights can be obtained using them generally requires a finer grid and several hours of CPU time.
The main purpose of this lesson is to familiarize you with the procedure of constructing contour maps of energy, analyzing these maps and calculating statistical weights, which is the same whether the grid you use is fine or coarse.
Because of the consequent large increase in CPU time and disk space requirements, the use of an angle increment smaller than 10° is usually not warranted. Also note that the angle increment you specify should divide 360.0 evenly.
When the first Discover job is complete, the following message is displayed at the information area:
Job <job number> (Dihdrl_Dihdrl_Run DCL1) completed; status 0You now proceed to calculate the statistical weights for DCL1. (You later perform the same step for DCL2.)
Discover job has successfully completed.
|After the first Discover job is complete, select the Stat_Weights/DihdrlDihdrlCntour command. Contour Step: Construct_Graph|
A 3D graph showing the total energy of the molecule as a function of two dihedral angles is constructed and displayed.
|Move and rotate the graph so that you can see it in 3D form and you can read the minimum and maximum energy values on the z-axis. Select the Contour_Graph option for Contour Step, and Execute.|
|Using the mouse, pick the graph you just constructed for Plot Spec. Select Range_Create for Contour_Action_2D. Make sure that New_Graph is toggled on, and leave the Diplay_Dimension set to 2D. Select a color from the Color Palette for the Color parameter, and enter the minimum energy value you read from the graph (see note below) for the Start Value. Enter 1.0 for Interval Size, and 10 for Num of Levels. Select Execute.|
Note: You should enter an energy value that is slightly greater than the minimum energy that you can read from the graph. For example, if the axis minimum is labeled 4.4 then you should enter 4.45 as the minimum value. You need to do this because Graph labels are rounded off, and if you enter a minimum energy less than the true minimum on the graph, you will get an error message.
|Select the Stat_Weights/Analyze_Energy command. Select Execute.|
This command analyzes the given map to find the minima on it. A set of default regions is generated and displayed. Each such region represents a state in the RIS approximation.
|Select the Stat_Weights/Compute command. Select Compute_Averages for Compute Step. Enter 300.0 for Temp, and select Execute.|
This command computes the average angles, partition function, and average energy for each region on the map. The average angles are displayed (using yellow triangular marks) upon the completion of this command.
|Select the Combine_Angles option for Compute Step, and Execute.|
|In the Combine_Angles menu, make sure that Angles Action is set to Combine, and Combine Type is set to Tolerance, and Tolerance is set to 20 (degrees), and select Execute.|
|Select Cancel to return to the main selection menu of the Compute command. Finally, select the Compute_Weights for Compute Step, and Execute.|
The statistical weights for the bond pair are calculated and the results are displayed in a table named RIS_DCL1. The weights are also written to a file named DCL1.weights.
|Now, repeat the same commands for DCL2 (start by selecting the Stat_Weights/DihdrlDihdrlCntour command and setting Contour Step to Construct_Graph).|
11. Add Statistical Weights to the Database
The statistical weights for the two internal bond pairs of poly(vinylidene chloride) are in the tables named RIS_DCL1 and RIS_DCL2. The next step is to add these weights to your local database file user.statwt.
Select the Database/Weights command.
Choose the following as Group 1 through Group 5: Group 1: dichloromethylene
As an approximation, you will use the same matrix (appropriate for interior bonds) for the second bond from the tail end of a chain. The matrix RIS_DCL1 contains contributions from both first and second order interactions (see the Statistical weights section), and in principle the matrix for the second bond should contain only first order interactions. However, the error introduced by this approximation becomes vanishingly small for long chains.
|While still in the Database/Weights command, change the group names to the following: Group 1: NONE|
Thus, the terminal group sequence CH3-CCl2-CH2-CCl2 is assigned the same statistical weights as the internal group sequence CCl2-CH2-CCl2-CH2-CCl2.
Next, store the matrix RIS_DCL2.
|Move the focus to Table Name, and choose RIS_DCL2 from the value-aid. Leave the Theta_Sup, Temp, and User Weights File parameters as they are. Choose the following for Group 1 through Group 5: Group 1: methylene|
|Select the Database/Weights command again (with the RIS_DCL2 matrix) for the following sequences of the groups in the following table:|
|Group 1: methyl and||Group 1: methylene|
|Group 2: dichloromethylene||Group 2: dichloromethylene|
|Group 3: methylene||Group 3: methylene|
|Group 4: dichloromethylene||Group 4: dichloromethylene|
|Group 5: methylene||Group 5: hydrogen|
Lastly, you must add the weights for the first bond of this polymer, because the results of the Stat_Weights/Compute command are not in the standard Flory form (see the Terminal and near-terminal bonds section). 4
Remain in the Database/Weights command.
Nu0: 1 |
(these parameters give the dimensions of the matrix) Type FIRST for Table Name, and select Execute.
|The Weights_Action parameter is automatically set to Edit_Wts_Table. Type 0 for Table Cell Value and press <Enter> to fill in the first cell. Repeat this procedure two more times to fill in the second and third cells.|
This indicates that the dihedral angles for the states of the first bond are all zero. (The state of the first bond is actually undefined, but by convention it is assumed to be always trans. See Flory 1974, p.386.)
|Type 0, 0 for Table Cell Value and press <Enter>. Type 1, 0 and press <Enter> again. To complete the matrix, type 0, 0 and <Enter>.|
These are the prefactor (A1k) and energy (EPS1k) values for each state. When complete, the matrix should appear as given below:
|2||A1k, EPS1k||0, 0||1, 0||0, 0|
|If your matrix looks different from this, use the mouse to set the focus to the appropriate cell, and retype its value. When the matrix is satisfactory, set Weights_Action to Add.|
Enter the following for the group names, and then select Execute:
Group 1: NONE|
Group 2: NONE
Group 3: methyl
Group 4: dichloromethylene
Group 5: methylene
12. Verify Presence of Statistical Weights for Each Bond Pair in the Polymer
|If you have deleted your original polymer, rebuild a chain of poly(vinylidene chloride) using five repeat units.|
|Next, select the Stat_Weights/Bond_Pairs command. Select Locate_Unknown for Bond Pair Action, enter the name (POLYDCLE) of your polymer, and select Execute.|
The following message is displayed at the information window:
There are no unknown bond pairs13. Calculate Conformation Dependent Properties
Now that you have entered all the necessary properties of poly(vinylidene chloride) in your local database, you will carry out a simple RIS calculation to obtain the characteristic ratio as a function of number of bonds using the statistical weights you have calculated.
To perform RIS calculations on chains of variable length, you must first use the RIS_Compute/Define_Segment command to define head, tail, and repeating segments for your chain. In this lesson, you will do this for the poly(vinylidene chloride) chain you have constructed.
|To simplify the process of defining segments, first select the Sequencer/Substitute command. Set Substitute_Action to Show_All_Grps.|
This causes the key atom of every group in the chain backbone to be labeled with its atom name.
|Select the RIS_Compute/Define_Segment command. Set Segment Type to Tail. Make sure Color_Segment is toggled on, and choose a Color from the Palette. Pick the carbon atom (labeled C1) in the terminal methyl group as both the Headward Atom and Tailward Atom.|
When this is done, the command executes, and the Segment Type is automatically set to Repeating.
To define the repeating segment, choose another color, and pick an unsubstituted methylene carbon atom (labeled C1) in the chain interior as the Headward Atom. Pick an adjacent substituted methylene (labeled C2) as the Tailward Atom.|
Make sure Segment Type is set to Head, and choose a Color. For Headward Atom, pick the labeled hydrogen atom (labeled H2) attached to the terminal substituted methyl group. For the Tailward Atom, pick the attached carbon atom (labeled C2).
|Select the RIS_Compute/C_Run command. Run Name: PVDC|
When the RIS calculation is complete, the following message is displayed in the information window:
RIS job completed successfully
|Select the Graph/Get command. File Name: PVDC.ristbl|
|To exit Insight II, enter quit at the command prompt and press <Enter>. To confirm, press <Enter> again.|
1 For example, in the case of p-phenylene, it is as if an imaginary atom lies at the center of the ring. The virtual bonds connect this imaginary atom to neighboring groups.
2 The superscript t on Et stands for total, indicating that this is the total energy of the segment.
3 Note that for the common 3-state case, the order of the rows and columns used here is g-, t, g+; as opposed to the common arrangement t, g+, g-.
Each row of the weights matrix corresponds to a distinct phi1 value, and each column corresponds to a distinct phi2 value. The rows and columns of a matrix generated by the Stat_Weights tools are put in ascending order of phi1 and phi2 values, respectively. Thus, the phi1 value corresponding to the i-th row is smaller than that corresponding to the (i + 1)-th row.
Hence, for symmetric vinyl chains, the order is g-, t, g+ instead of t, g+, g-. This approach permits consistent handling of all linear polymers: not all chains can be handled using the familiar 3-state (t,g+,g-) model.
For example, the trans state (i.e., phi = 0) may not be one of the low energy states, and therefore may not be used as a rotational state in the RIS model.
Note also that you should not mix the weight matrices generated by the Stat_Weights tools by those in the system database. That is, if the weight matrices of some of the bond pairs are available in the system database, and you calculate the unknown weight matrices using the Stat_Weights tools, you should not mix these matrices. The correct approach for a such a polymer is to calculate all the weight matrices using the Stat_Weights tools and put these matrices into a new (user) database as described in this lesson.
Last updated April 20, 1998 at 10:02AM PDT. Copyright © 1997, Accelrys Inc. All rights reserved.