__Data Regression__

`> `
**restart;**

The problem at hand is the fitting of experimental data to equations used to represent the vapor pressure. We begin with a list of the temperatures (in Celcius) at which the vapor pressure has been measured.

`> `
**T(C) := [-36.7,-19.6,-11.5,-2.6,7.6,15.4,26.1,42.2,60.6,80.1];**

Note the perhaps unconventional symbol used to denote the list of temperatures. As far as Maple is concerned
here is a name just like
would be, and can be assigned to any valid Maple object; in this case a list of temperature values. The corresponding values of the vapor pressure (in mm Hg) is next

`> `
**P(mmHg) := [1,5,10,20,40,60,100,200,400,760];**

A quick check to see that the number of values of temperature and pressure are equal.

`> `
**nops(P(mmHg)), nops(T(C));**

We plot the data to see its general form

`> `
**datapoints := [seq([T(C)[k],P(mmHg)[k]],k=1..nops(T(C)))];**

`> `
**dataplot:=plot(datapoints,T=-60..100,style=point,symbol=box): dataplot;**

We continue by attempting to fit simple polynomials to the data. We define a function to give us a polynomial in
of any order.

`> `
**p := n->sum(a[k]*T^k,k=0..n);**

A quadratic, for example, is given by

`> `
**p(2);**

and a cubic is

`> `
**p(3);**

Maple includes a least squares fitting procedure in its statistical package. The package is loaded as follows

`> `
**with(stats):**

We fit a quadratic to the data as follows

`> `
**n:=2; **

expr := P=p(n);

`> `
**eqn[n]:=fit[leastsquare[[T,P], expr, {seq(a[k],k=0..n)}]]([T(C),P(mmHg)]);**

We can fit higher order polynomials just as easily

`> `
**n:=3; expr := P=p(n);**

`> `
**eqn[n]:=fit[leastsquare[[T,P], expr, {seq(a[k],k=0..n)}]]([T(C),P(mmHg)]);**

`> `
**n:=4; expr := P=p(n);**

`> `
**eqn[n]:=fit[leastsquare[[T,P], expr, {seq(a[k],k=0..n)}]]([T(C),P(mmHg)]);**

`> `
**n:=5; expr := P=p(n);**

`> `
**eqn[n]:=fit[leastsquare[[T,P], expr, {seq(a[k],k=0..n)}]]([T(C),P(mmHg)]);**

We can compare these expressions graphically

`> `
**n:='n':**

polyplot:=plot([seq(rhs(eqn[n]),n=2..5)],T=-40..100,color=[red,black,blue,green]):

`> `
**plots[display]({polyplot,dataplot});**

From which we observe that the red line (the quadratic) is clearly poorer than the others; all of which seem more or less equally good. However, if we plot the equations on a semilog plot we see that the polynomials suffer from a rather serious problem.

`> `
**lplot1:=plots[logplot]([seq(rhs(eqn[n]),n=2..5)],T=-40..100,color=[red,black,blue,green]):**

`> `
**lplot2:=plots[logplot](datapoints,style=point,symbol=box):**

`> `
**plots[display]({lplot1,lplot2});**

From this diagram (if viewed in color) we would notice that the fourth and fifth order polyomials are best (as might be expected).

We now attempt to fit the Clausius Clapeyron (CC) equation to the data.

`> `
**CCeqn:= log10(P) = A-B/(T): CCeqn;**

The bottom line of the second term on the right is the temperature in Kelvin. We define two new variables

`> `
**v1:= Tau =1/T: v1;**

`> `
**v2:=rho=log10(P): v2;**

and substitute into the CC equation.

`> `
**CC2:=subs(rhs(v2)=lhs(v2),rhs(v1)=lhs(v1),CCeqn): CC2;**

Thus, this equation is linear in
and
.

The next step is to transform the data, beginning with the vapor pressures

`> `
**lnpdata := evalf(map(log10,P(mmHg)));**

and now for the temperatures.

`> `
**ToKelvin:=x->x+273.15;**

T(K) := map(ToKelvin,T(C));

We must take the reciprocal of these values

`> `
**r := x-> 1/x;**

Tover := map(r,T(K));

We are now ready to fit the transformed data

`> `
**CCfit:=fit[leastsquare[[Tau,rho], CC2, {A,B}]]([Tover,lnpdata]): CCfit;**

This result can be expressed in terms of our original variables

`> `
**CC3:=subs(v1,v2,T=ToKelvin(T),CCfit): CC3;**

We make a function of the right hand side for plotting purposes

`> `
**CCF:=unapply(10^rhs(CC3),T);**

and display this new result along with the data.

`> `
**n:='n':**

p1:=plots[logplot]([CCF(T)],T=-40..100,color=[red]):

`> `
**plots[display]({p1,lplot2});**

From this we see immediately that the Clausius Clapeyron form is far superior to the simple polynomial fits.

Another popular equation used for fitting vapor pressure data is the Antoine equation.

`> `
**Antoine := log10(P)=A-B/(T+C): Antoine;**

This equation cannot be made linear in
and
. However, some rearrangement of the above expression gives:

`> `
**A1:=collect(expand(normal(Antoine*(T+C))/T),T):**

`> `
**t:=op([1,2],A1):**

`> `
**A2:=A1-(t=t): A2;**

This equation is linear in
and
with parameters
,
, and
.

We define new temperature and pressure variables as was done for the Clausius-Clapeyron equation.

`> `
**v1:=Tau=1/T: v1;**

`> `
**v2:=rho=log10(P): v2;**

We must also define the combined variable:

`> `
**v3:=Gamma = rho*Tau: v3;**

With the above definitions we may rewrite the Antoine equation as

`> `
**read `e:/maple/utils/utils.mpl`:**

`> `
**A3:=subs(Solve([v1,v2,v3],[T,ln(P),Tau]),A2):**

A4:=algsubs(Tau*rho=Gamma,A3):

A4:=sort(A4,[a[0],a[1],a[2]]): A4;

We define new parameters:

`> `
**params := [a[0]=A,a[1]=coeff(rhs(A4),Tau,1),a[2]=-coeff(rhs(A4),Gamma,1)]: op(params);**

and rewrite the equation in terms of these parameters:

`> `
**A4:=subs(map(flip,params,equation),A4): A4;**

The next step is to obtain the necessary data. We already have the column of
; the column of values of
is obtained as follows.

`> `
**r := x-> 1/x;**

Tover := map(r,T(K));

`> `
**Gammavalues := [seq(Tover[k]*lnpdata[k],k=1..nops(lnpdata))];**

We are now ready to fit the transformed data

`> `
**Antoinefit:=evalf(fit[leastsquare[[Tau,Gamma,rho], A4, {a[0],a[1],a[2]}]]([Tover,Gammavalues,lnpdata])):**

Antoinefit;

In the next few lines we reconstruct the Antoine equation in terms of oroginal variables:

`> `
**peqn1:=a[0]=coeff(coeff(rhs(Antoinefit),Tau,0),Gamma,0): peqn1;**

`> `
**peqn2:=a[1]=coeff(rhs(Antoinefit),Tau): peqn2;**

`> `
**peqn3:=a[2]=-coeff(rhs(Antoinefit),Gamma): peqn3;**

`> `
**params;**

`> `
**params2:=solve(subs(peqn1,peqn2,peqn3,{op(params)}),{A,B,C}): params2;**

This result can be expressed in terms of our original variables

`> `
**A5:=subs(params2,T=ToKelvin(T),Antoine): A5;**

We make a function of the right hand side for plotting purposes

`> `
**AF:=unapply(10^rhs(A5),T);**

and display this new result along with the Clausius-Clapeyron equation and the data.

`> `
**p1:=plots[logplot]([CCF(T),AF(T)],T=-40..100,color=[red,blue]):**

`> `
**plots[display]({p1,lplot2});**

In this case the Antoine fit (in blue) appears to be even better than the was obtained with the Clausius-Clapeyron equation (in red). Contrast the behavior of both methods with that of the simple polynomial fits given above.

`> `