 KoopmansTheorem - Maple Help

Koopman's Theorem with Applications in Medicine

Copyright (c) RDMCHEM LLC 2019 Overview Very often, chemical reactivity can be thought of in terms of ionization energies.  For example, the lower an atom or molecule's ionization energy, the higher its reactivity.  For an atom or molecule, X, its ionization energy, IE, can be defined as   X(g) -->  X+(g) + 1e-         ΔE = IE   Of course, ionization energies will depend on from which molecular orbital an electron is removed.   The lowest ionization energy would correspond to removing an electron from the highest occupied molecular orbital (HOMO).   So, given its importance, it is often useful to calculate an atom or molecule's ionization energy to predict or explain its chemical reactivity. In this activity, we explore two approaches. One approach is to calculate explicitly the energy of X, and X+ and take energy differences:   IE = E(X+) − E(X)          (1)   For smaller molecules, this is a straightforward approach. However, as the size of the molecule increases, this approach becomes increasingly difficult.  Another more approximate method, known as Koopman's Theorem, requires the calculation of the electronic structure of X only:   Koopman's Theorem: The lowest (vertical) ionization energy of a molecule can be approximated as the negative of the highest occupied molecular orbital (HOMO) energy:   IE_Koopman = − e(HOMO)          (2)                                            More generally, ionization from the ith molecular orbital (MO) is approximately the negative of the MO energy.  In this exercise, we will explore to some degree the quality, or lack thereof, of Koopman's theorem for approximating ionization energies.  We will then consider an application in which the ionization energies of a series of non-steroidal anti-inflammatory drugs (NSAIDs) can be related to their medicinal activity! Koopman's Theorem and Periodic Trends

In this section, we explore Koopman's theorem for a series of atoms (and molecules) and compare calculated ionization energies to their experimentally observed values. In particular, we consider periodic trends for atoms as one proceeds down a group or across a period.  We consider  Group 1A (H - Cs), Group 7A (F - I), and  Period 2 (Li - Ne).  For those that are interested, we also consider a series of isoelectronic binary compounds: methane, ammonia, water, and hydrogen fluoride. Initialize

Here we initialize the Maple packages we wish to use as well as define the experimental values of the quantities we wish to calculate for comparison.

 > Atomic Ionization Energies and Periodic Trends

In this section, we calculate the energies of the neutral and cation species for Group 1A, Group 7A, and Period 2 elements and use equations (1) and (2) to determine IEs.  We then plot the results along with experimental values to assess Koopman's theorem.    We will use Hartree Fock theory with a DZP basis, which allows us to calculate energies of Group 1A elements up to cesium and Group 7A elements up to iodine.

The following Maple input is divided into steps.  The first 3 steps are used to calculate the IE of the H.  Step 4 requires you to repeat Steps 1 - 3 to calculate IEs for the remaining elements in Group 1A.  Once you have completed the series, you can plot the results and then repeat the process for Group7A and Period 2.  For logistical reasons, we consider the following variables to be specified before each calculation:

Group1A series has n = 6 elements.  The index m labels which atom:

H (m = 1),

Li (m = 2),

Na (m = 3),

K (m = 4),

Rb (m = 5), and

Cs (m = 6).

Num_neutral = 1, Num_cation = 0.

Group7A series has n = 4 elements.  The index m labels which atom:

F (m = 1),

Cl (m = 2),

Br (m = 3), and

I (m = 4).

Num_neutral = 1,

Num_cation = 2.

Period 2 series has n = 8 elements.  The index m labels which atom:

Li (m = 1),

Be (m = 2),

B (m = 3),

C (m = 4),

N (m = 5),

O (m = 6),

F (m = 7), and

Ne (m = 8).

Num_neutral = varies according to ground electron configuration and Hund's Rule,

Num_cation = varies according to ground electron configuration and Hund's Rule.

Step 0:  Initialize vectors (Run once for each series: Group 1A, Group 7A, and Period 2)

 >

Step 1: Specify parameters

 >
 >
 >
 >
 >

Step 2: Calculate energies of neutral atom and cation. Determine HOMO.

 >
 ${\mathrm{HOMO}}{≔}{1}$ (2.2.1)

Step 3: Calculate IE (in kJ/mol) using Equations (1) and (2)

 >
 ${{\mathrm{_rtable}}}_{{18446744829320603638}}$
 ${{\mathrm{_rtable}}}_{{18446744829320603758}}$ (2.2.2)

Step 4: Repeat steps 1 - 3 for each element in group or period

 > 

Step 5: Plot Ionization Energies and Compare

 > How do ionization energies compare with experimental values using Equations (1) and (2)? How does the *trend* in ionization energies compare? Answer We see that Equation (1) using Hartree-Fock and the DZP basis compares well with experiment.  Approximate IEs using Koopman's theorem (Equation (2)) are qualitatively correct with the same trend as experimental values.

Repeat Steps 0 - 5 for Group 7A and Period 2. Ionization Energies of Small, Isoelectronic Binary Hydrides

In this section, we consider Koopman's theorem for an isoelectronic series of binary hydrides: methane (CH_4), ammonia (NH_3), water (H_2O), and hydrogen fluoride (HF).  Similar to what we did above, we let

m = 1 (methane),

m = 2 (ammonia),

m = 3 (water), or

m = 4 (hydrogen fluoride).

Step 0: Initialize vectors

 >

Step 1: Specify Parameters

 > $m≔3:\phantom{\rule[-0.0ex]{0.0em}{0.0ex}}\mathrm{molecule}≔"water":$

Step 2: Calculate energies of neutral molecule and cation. Determine HOMO.

 >
 >
 ${\mathrm{HOMO}}{≔}{5}$ (2.3.1)

Step 3: Calculate IE (in kJ/mol) using Equations (1) and (2)

 >
 $\left[\begin{array}{c}0\\ 0\\ 1330.65232026\\ 0\end{array}\right]$
 $\left[\begin{array}{c}0\\ 0\\ 1066.41477630\\ 0\end{array}\right]$ (2.3.2)



Step 4: Repeat steps 1 - 3 for ammonia (m=2), water (m=3), and hydrogenfluoride (m=4)

 > 

Step 5: Plot Ionization Energies and Compare

Note: Equation (1) = Blue,   Equation (2) (Koopman's Thm) = Red, Experiment = Green

 > We find that Koopman's theorem is slightly larger than the experimental ionization for each molecule, but the trend is correct. Optional It is important to note that ionization energies calculated using Koopman's theorem using Equation (2) or explicitly using Equation (1)  depend on the level of theory and basis.  Using higher levels of theory than Hartree Fock can improve calculated ionization energies. Repeat the calculations above for methane - hydrogen fluoride with the tzp basis.  Did the calculations improve? Application of Koopman's Theorem: Ionization Energies and NSAIDs

Now that we are familiar with Koopman's theorem, let's use it to predict activities of non-steroidal anti-inflammatory drugs (NSAIDs). The compounds we consider are based on substituted salicylic acids (SAA), benzoic acids (BZA), and phenols [1,2].  Figure 1 depicts the structures considered.  R-groups include 3-OH, 4-OH, 5-OH, 3-F, 5-F, 3-Cl, 4-Cl, 5-Cl, 3-IPR (isopropyl), 4-IPR, and 5-IPR.  When X=H, we have SAA, and when S=OH, we have BZA. Figure 1: Skeletal structure represents substituted salicylic acids (SAA) and benzoic acids (BZA) as NSAIDs. Substitution of the R group occurs at the numbered positions.   The skeletal structure was generated with the SkeletalStructure command.

 Salicylic acid derivative   (SA_index) Coordinate Filename Activity H (salicylic acid)   (1) H.xyz 3.33 3-Cl    (2) 3Cl.xyz 3.89 3-F    (3) 3F.xyz 3.82 3-OH    (4) 3OH.xyz 4.43 3-IPR    (5) 3IPR.xyz 3.92 4-Cl    (6) 4Cl.xyz 3.31 4-OH    (7) 4OH.xyz 3.02 4-IPR    (8) 4IPR.xyz 3.29 5-Cl    (9) 5Cl.xyz 4.06 5-F    (10) 5F.xyz 3.82 5-OH    (11) 5OH.xyz 4.61 5-IPR    (12) 5IPR.xyz 4.12

Use the Maple input below to estimate the ionization energy of each salicylic acid derivative.

Step 1: Initialize and read in activities of salicylic acid derivatives.   You will not have to reinitialize again.

 >

Step 2: Specify salicylic acid derivative to be calculated.  The "SA_index" and coordinate filename can be found in Table 1:

 >
 ${\mathrm{SA_index}}{≔}{2}$ (3.1)
 >
 ${\mathrm{molec}}{≔}\left[\left[{"O"}{,}{-0.25234200}{,}{-1.82785300}{,}{-0.00028600}\right]{,}\left[{"O"}{,}{3.34848900}{,}{0.38099100}{,}{0.00014600}\right]{,}\left[{"O"}{,}{2.35622900}{,}{-1.66916500}{,}{-0.00007400}\right]{,}\left[{"C"}{,}{-0.23597400}{,}{-0.46877200}{,}{-0.00006500}\right]{,}\left[{"C"}{,}{0.96542400}{,}{0.29463800}{,}{0.00005900}\right]{,}\left[{"C"}{,}{-1.45886000}{,}{0.22292000}{,}{-0.00002900}\right]{,}\left[{"C"}{,}{0.90804900}{,}{1.70344800}{,}{0.00022600}\right]{,}\left[{"C"}{,}{-1.50685900}{,}{1.60886400}{,}{0.00014300}\right]{,}\left[{"C"}{,}{-0.31660900}{,}{2.35789300}{,}{0.00027100}\right]{,}\left[{"C"}{,}{2.23723300}{,}{-0.41825800}{,}{0.00001200}\right]{,}\left[{"Cl"}{,}{-3.00103200}{,}{-0.72522200}{,}{-0.00015800}\right]{,}\left[{"H"}{,}{1.83746100}{,}{2.25851900}{,}{0.00032100}\right]{,}\left[{"H"}{,}{-2.47031400}{,}{2.10464600}{,}{0.00018100}\right]{,}\left[{"H"}{,}{-0.36247200}{,}{3.44041100}{,}{0.00040400}\right]{,}\left[{"H"}{,}{4.15935500}{,}{-0.17151400}{,}{0.00013100}\right]{,}\left[{"H"}{,}{0.68008900}{,}{-2.17947200}{,}{-0.00034200}\right]\right]$ (3.2)
 > $\mathrm{PlotMolecule}\left(\mathrm{molec}\right);$ Step 3: Calculate energy using DFT with 6-31g basis.  For improved results, you may try a larger basis set at a computational (time) cost.

 > $\mathrm{data}≔\mathrm{DensityFunctional}\left(\mathrm{molec},\mathrm{basis}="6-31g"\right);$
 ${\mathrm{table}}{}\left({\mathrm{%id}}{=}{18446744829309859870}\right)$ (3.3)
 >
 ${\mathrm{HOMO}}{≔}{44}$ (3.4)
 >
 ${{\mathrm{IonizationEnergies}}}_{{2}}{≔}{6.55694334}$ (3.5)



Now repeat Steps 2 - 3 for SA_index = 2 to 12!

Once you have calculated the ionization energy for each salicylic acid derivative, you can plot the ionization energies verses activity and use the interactive CurveFitting function to calculate a linear fit.

 > $\mathrm{plots}\left[\mathrm{pointplot}\right]\left(\mathrm{IonizationEnergies},\mathrm{Activities}\right);$ >
 > $f\left(\mathrm{IE}\right)≔9.35295734976844-0.879064241187929\cdot \mathrm{IE}$
 ${f}{≔}{\mathrm{IE}}{↦}{9.35295735}{+}\left({-1}\right){\cdot }{0.87906424}{\cdot }{\mathrm{IE}}$ (3.6)

Now you can use the function f(x) to estimate the activity of new derivatives!  Consider a derivative with a fluorine at the 3-position.   The coordinate file is "4F.xyz".

 >
 ${\mathrm{molec}}{≔}\left[\left[{"O"}{,}{-0.57423900}{,}{2.20316900}{,}{-0.00033600}\right]{,}\left[{"O"}{,}{-2.61515200}{,}{-1.50531600}{,}{0.00009700}\right]{,}\left[{"O"}{,}{-2.76883700}{,}{0.76845100}{,}{-0.00021900}\right]{,}\left[{"C"}{,}{0.08316400}{,}{1.00840100}{,}{-0.00013300}\right]{,}\left[{"C"}{,}{-0.59083300}{,}{-0.24584600}{,}{0.00001100}\right]{,}\left[{"C"}{,}{1.48479400}{,}{1.04612000}{,}{-0.00008400}\right]{,}\left[{"C"}{,}{0.16589100}{,}{-1.43790100}{,}{0.00021100}\right]{,}\left[{"C"}{,}{2.17656000}{,}{-0.15135800}{,}{0.00011300}\right]{,}\left[{"C"}{,}{1.55297100}{,}{-1.40487300}{,}{0.00026600}\right]{,}\left[{"C"}{,}{-2.04454000}{,}{-0.25944200}{,}{-0.00005300}\right]{,}\left[{"H"}{,}{1.99737200}{,}{1.99781200}{,}{-0.00019900}\right]{,}\left[{"H"}{,}{-0.36039200}{,}{-2.38396500}{,}{0.00032000}\right]{,}\left[{"F"}{,}{3.56252000}{,}{-0.11046100}{,}{0.00016300}\right]{,}\left[{"H"}{,}{2.14977600}{,}{-2.30713700}{,}{0.00041800}\right]{,}\left[{"H"}{,}{-3.59295500}{,}{-1.42528600}{,}{0.00004700}\right]{,}\left[{"H"}{,}{-1.55869500}{,}{2.05168100}{,}{-0.00036800}\right]\right]$ (3.7)
 > $\mathrm{PlotMolecule}\left(\mathrm{molec}\right);$ >
 > $\mathrm{data}≔\mathrm{DensityFunctional}\left(\mathrm{molec},\mathrm{basis}="6-31g"\right);$
 ${\mathrm{table}}{}\left({\mathrm{%id}}{=}{18446744829479293342}\right)$ (3.8)
 >
 ${\mathrm{HOMO}}{≔}{40}$ (3.9)
 >
 ${\mathrm{IE}}{≔}{6.69549138}$ (3.10)
 > $f\left(\mathrm{IE}\right)$
 ${3.46719030}$ (3.11)

Do you expect the 4-F derivative to be more or less active then the average derivative? Answer Less... The average activity is 3.80. The predicted activity of the 4-F derivative is only 3.5.

How could the prediction of activity be improved, if possible? Answer Our estimation of activity is based on two approximations:  1) Koopman's theory for calculating IE, and 2) estimating activity as a linear function of IE.   From the point plot, it is clear that there is only a trend relating IE and activity and that the relationship is not exactly linear.  We can only improve how we estimate IE, either using a higher level of electronic theory, a larger basis, or calculating IE using Eq. (1) from the Overview instead of Koopman's theorem.   We should point out, however, that improving our estimation of IE does not necessarily mean the linear fit will be any better. See Ref. 2 for more details on this matter!     One possible way to improve the prediction of activity would be to include more derivatives in the linear fit. Here we chose only a subset of the activity data presented in Ref. 2. Appendix References 1. Mehler, et al.,  Int. J. of Quant. Chem., 35, 205 (1989). 2. Shakman and Mazziotti, J. Phys. Chem. A, 111, 7223 (2007). 3. A. Szabo and N. S. Ostlund, Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory (Dover Books, New York, 1996).