UWA logoMaterial to support teaching in Environmental Science at The University of Western Australia

Units ENVT3361, ENVT4461, and ENVT5503

Andrew Rate, School of Agriculture and Environment, The University of Western Australia, 2025-11-27

 

Introduction

The Back Story

The problem is to write R code that replicates the output of some Fortran-77 code I wrote and used in my PhD thesis (Rate, 1990). The original Fortran code was to fit models to data on complex-formation of copper and cadmium ions to humic acids (stable organic macromolecular substances present in soils; see Figure 1). Optimisation was originally done with the amoeba Nelder-Mead simplex algorithm from Press et al. (1986). The same multidimensional simplex algorithm is implemented in R in the optim() function. I originally used a hybrid Newton-Raphson/bisection method to solve cubic equations (subroutine newton from Press et al., 1986). To simplify the code and remove the need to specify the differential function, I used the R uniroot() function which worked fine with my code and uses Brent's method for finding univariate roots using an adaptive hybrid of bisection, the secant method, and inverse quadratic interpolation. These functions allow all the hard number-crunching to be done in base-R (R Core Team, 2025), and I used the stringr package (Wickham, 2025) to manipulate object and file name strings.

Figure 1: Map of areas in Waitaha/Canterbury, Aotearoa New Zealand where soil was collected for extraction of humic acids.

Figure 1: Map of areas in Waitaha/Canterbury, Aotearoa New Zealand where soil was collected for extraction of humic acids.

 

The Setting

It's very common in disciplines like environmental and soil sciences to present adsorption data (such as metal adsorption on humic acids) as the amount of substance adsorbed (q) vs. the concentration of that substance in solution (c). This is the classical “adsorption isotherm” (Figure 2), and the data are commonly fitted to empirical equations such as the Langmuir or Freundlich equations.

Equation which reads qM =  (C1*K1*cM/(1+K1*cM))

Figure 2: Classical isotherm representation of adsorption, for the reaction of copper ions with solid-phase calcium humate derived from humic acid extracted from Waimari Peat, Waitaha/Canterbury, Aotearoa New Zealand. Circle symbols show experimental observations, and the smooth curve represents a single site-binding model (the most simple, a.k.a. the ‘Langmuir’ equation).

Figure 2: Classical isotherm representation of adsorption, for the reaction of copper ions with solid-phase calcium humate derived from humic acid extracted from Waimari Peat, Waitaha/Canterbury, Aotearoa New Zealand. Circle symbols show experimental observations, and the smooth curve represents a single site-binding model (the most simple, a.k.a. the ‘Langmuir’ equation).

 

There are many issues which make this “traditional” isotherm approach problematic. There is a whole universe in the rabbit hole of whether a model can be realistic if it describes such reaction only in terms of the concentrations of adsorbed and non-adsorbed species (e.g. ignoring electrostatic effects). Similar diversions can be indulged in the discussion of whether a large complex molecule like a humic acid would ever have only one type of binding site defined by a concentration \(C_{1}\) and a conditional stability “equilibrium” constant \(log(K_{1})\) as assumed by the Langmuir equation (the “heterogeneity” rabbit-hole). These arguments are scientifically valid, but incessant. For practical purposes, therefore, many researchers pragmatically implement a model with 2 or more Langmuir-type sites having additive behaviour. This approach simplifies (or even ignores) the real physicochemical issues, but provides enough adjustable fitting parameters to describe the data.

Those are the “chemistry” issues, but fitting equations/models is also a mathematical/statistical undertaking, and there are some problems with the traditional isotherm approach there, too.

Typically isotherms are generated by adding (e.g. titrating) metal ion solutions to a suspension or solution of adsorbent (forms of humic acid, in this case). The experimental conditions (mass of adsorbent, amounts of metal ions added, volumes, temperatures, etc.) are very well-controlled, and statistically are the most appropriate independent variables. During the metal ion addition/titration experiment, the critical measurement is the concentration of metal ion not adsorbed to the adsorbent. For my PhD experiments, I measured unadsorbed (dissolved) metal ion concentration (\(cM\)) using ion-selective electrodes, which have a linear calibration between log-concentration and electrode potential, with well-established theory from the Nernst Equation. My point here is that the dependent variable for model fitting should reflect the actual experimental measurement (Figure 3). We could use the original electrode potentials, but since potentials are collinear with log10-metal concentrations, it makes sense to use concentrations.

Figure 3: Adsorption isotherm for Cu²⁺-calcium humate shown as cM (note log-scale) as a function of qM. Circle symbols show experimental observations, and the smooth curve again epresents the best-fit Langmuir equation.

Figure 3: Adsorption isotherm for Cu²⁺-calcium humate shown as cM (note log-scale) as a function of qM. Circle symbols show experimental observations, and the smooth curve again epresents the best-fit Langmuir equation.

 

So far, so good... the \(log_{10}(cM)\) vs. \(qM\) plot still looks like an adsorption curve... but we also end up with a maths problem. The usual adsorption isotherm equations are written in terms of adsorbed metal (\(qM\)) being a function of dissolved metal, \(cM\) (so dissolved metal concentration is the independent variable, which we don't want).

Equation which reads qM = the summation from i=1 to 3 of (Ci*Ki*cM/(1+Ki*cM))

Conveniently, algebraic rearrangement of these equations to solve for \(cM\) gives polynomials, where the polynomial order is equal to the number of sites. This makes life easy with a two-site model; with three sites, however, the equation is cubic. To predict \(cM\), we need to find the root(s) of the cubic equation, which can't be done algebraically, and requires a numerical solution – an algorithm, i.e. some code.

The cubic equation . . . \(cM = A + B(qM) + C(qM)^{2} + D(qM)^{3}\)

where the parameters \(A, B, C\), and \(D\) for a 3-site model are:

\(A = qM;\)

\(B = qM(K_{1}+K_{2}+K_{3}) - (C_{1}K_{1}+C_{2}K_{2}+C_{3}K_{3});\)

\(C = K_{1}K_{2}(qM-(C_{1}+C_{2}) + K_{1}K_{3}(qM-(C_{1}+C_{3}) + K_{2}K_{3}(qM-(C_{2}+C_{3});\)

\(D = K_{1}K_{2}K_{3}(qM-(C_{1}+C_{2}+C_{3})\)

Figure 4: Curve plots for cubic functions for which roots need to be found. Each root is the predicted Cu²⁺ concentration (cM) for each amount of Cu²⁺ adsorbed to calcium humate (qM).

Figure 4: Curve plots for cubic functions for which roots need to be found. Each root is the predicted Cu²⁺ concentration (cM) for each amount of Cu²⁺ adsorbed to calcium humate (qM).

 

Coding it

First we need to define some user functions:

  1. one to set up our cubic equation for feeding into the uniroot() function, and
  2. a second which calculates a residual sum-of-squares to be minimised by the optim() function.

Some things to note:

# 1. Basic form of cubic equation to pass to uniroot() function

fcubic <- function(x, a, b, c, d) {
  a + b*x + c*x^2 + d*x^3
}

# 2. Calculate residual (error) sum-of-squares at each iteration of optim()

calcSSE <- function(par, data, justSSE=TRUE){
  # convert log10-parameters
  C1 <- unname(10^par[1])
  K1 <- unname(10^par[2])
  C2 <- unname(10^par[3])
  K2 <- unname(10^par[4])
  C3 <- unname(10^par[5])
  K3 <- unname(10^par[6])
  
  # calculate cubic coefficients
  A = data[,2]
  B =(data[,2]*(K1+K2+K3)) - (C1*K1 + C2*K2 + C3*K3)
  C = K1*K2*(data[,2]-(C1+C2)) + K1*K3*(data[,2]-(C1+C3)) +
      K2*K3*(data[,2]-(C2+C3))
  D = K1*K2*K3*(data[,2]-(C1+C2+C3))

  # calculate roots:
  # - first pre-allocate memory in data frame...
  cMpred <- data.frame(cMpred=rep(NA, nrow(obs)), f.root=rep(NA, nrow(obs)))
  #   ... then loop to find roots for cubic equations for each data point
  for(i in 1:nrow(obs)){
    result0 <- uniroot(fcubic, interval=c(1e-15, 10*max(data$cM,na.rm=T)),
                       a=A[i], b=B[i], c=C[i], d=D[i],
                       tol=1e-12, extendInt = "yes")
    # write root and f(root) to data frame row i
    cMpred[i,] <- c(unname(log10(result0$root)),unname(result0$f.root))
  }
  pred <-
    data.frame(qMobs = data$qM, cMobs.log = log10(data$cM),
               cMfit.log = cMpred[,1], y.at.root = cMpred[,2])
  # calculate residuals
  pred$cMres <- pred$cMobs.log - pred$cMfit.log
  # calculate squared difference for total sum-of-squares
  pred$diffsq <- (pred$cMobs.log - mean(pred$cMobs.log, na.rm=T))^2
  # calculate residuals squared for error sum-of-squares
  pred$cMres2 <- pred$cMres^2
  ssq <- colSums(pred[ ,c(ncol(pred)-1,ncol(pred))]) # just last 2 columns
  names(ssq) <- c("SST","SSE")

  if(justSSE==TRUE){
    SSE <- ssq["SSE"]
    return(SSE)
  } else {
    r.squared <- 1-(ssq["SSE"]/ssq["SST"])
    # define and format output object
    #       if(is.matrix(par)) par <- par[1,]
    result <- list(parameters=par,
                   data.table=pred,
                   sums.of.squares=ssq,
                   stats=r.squared)
    # names(result[[3]]) <- c("SST","SSE")
    names(result[[4]]) <- "r.squared"
    return(result)
  }
}

Check our functions!

cat("The object 'fcubic' is a", class(fcubic), "\n");cat("The object 'calcSSE' is a", class(calcSSE))
## The object 'fcubic' is a function 
## The object 'calcSSE' is a function

Now we need some inputs:

# our dataset
infile <- "wpcu05b.csv"
obs <- read.csv(infile, stringsAsFactors = TRUE)
str(obs) # to check it

# named vector of log-transformed parameters
P <- log10(c(2., 7.e4, 30., 4.e3, 400., 2.e2))
names(P) <- c("logC1","logK1","logC2","logK2","logC3","logK3")
P
## 'data.frame':    18 obs. of  5 variables:
##  $ cM    : num  4.89e-09 5.44e-09 1.07e-08 1.56e-08 1.96e-08 ...
##  $ qM    : num  0.00315 0.00315 0.00629 0.00629 0.00943 ...
##  $ M.add : num  3.15e-07 3.15e-07 6.29e-07 6.29e-07 9.43e-07 ...
##  $ VT    : num  0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 ...
##  $ massHA: num  1e-04 1e-04 1e-04 1e-04 1e-04 1e-04 1e-04 1e-04 1e-04 1e-04 ...
##    logC1    logK1    logC2    logK2    logC3    logK3 
## 0.301030 4.845098 1.477121 3.602060 2.602060 2.301030

 

Do the optimisation for model fitting

We now have what we need to try optimisation, which will adjust the parameters of our model to best fit the data, by minimising the residual sum-of-squares in log10(cM). Note that the arguments for the function to be minimised (calcSSE, that we made above) are included as arguments for the optim() function. We also increase the maximum iterations from the default value of 500.

Apparently the inverse of the Hessian matrix can allow us to calculate standard errors for the optimised parameters. We haven't done that here, but it may be a future improvement. (For simplex optimisation, the hessian matrix is calculated numerically, since we don't supply derivative (gradient) function(s).)

predOut <- optim(par=P,
                 fn = calcSSE,
                 data = obs,
                 justSSE=TRUE,
                 method = "Nelder-Mead",     # this is the Simplex algorithm
                 hessian = FALSE,            # we can do hessian=TRUE if we want
                 control = list(maxit=2000)) # max. 2000 iterations
{cat(paste("\n-------- OPTIMISATION by Nelder-Mead SIMPLEX method for",
            substr(infile,1,str_locate(infile,"\\.")-1),"--------\n"))
print(predOut)}
## 
## -------- OPTIMISATION by Nelder-Mead SIMPLEX method for wpcu05b --------
## $par
##      logC1      logK1      logC2      logK2      logC3      logK3 
## -1.6107206  7.3839742 -0.2935944  5.1015828  5.3156356 -2.8998928 
## 
## $value
## [1] 0.06035422
## 
## $counts
## function gradient 
##      555       NA 
## 
## $convergence
## [1] 0
## 
## $message
## NULL

The output of optim() above shows us the best-fit values of the model parameters logC1...logK3, with the minimised residual sum-of-squares ($value) ≃ 0.06. Under $counts, the output shows the number of calls to the function calcSSE (it was evaluated 555 times!), as the simplex made its multidimensional contortions (we didn't need a gradient function). A $convergence value of zero means the optimisation converged successfully.

More detail about the model output

It can be useful to inspect the output in full, not just the optimised parameters (especially if the results look weird or wrong).

The long output of our calcSSE function, obtained with the option justSSE = FALSE, includes

The code below also outputs the optimised parameters as non-log10-transformed values for convenience.

# first retrieve optimised parameters from optim() results object
P <- predOut$par
# then use the calcSSE() function without restricting output to just SSE
calcOut <- calcSSE(par=P, data=obs, justSSE = FALSE)
cat(paste("\n---- Output for dataset",
          substr(infile,1,str_locate(infile,"\\.")-1),
          "based in OPTIMISED parameter estimates ----\n"))
print(calcOut)
# output non-log-transformed parameters as well
{rawpars <- 10^calcOut$parameters
  names(rawpars) <- c("C1","K1","C2","K2","C3","K3"); print(rawpars, digits=4)}
## 
## ---- Output for dataset wpcu05b based in OPTIMISED parameter estimates ----
## $parameters
##      logC1      logK1      logC2      logK2      logC3      logK3 
## -1.6107206  7.3839742 -0.2935944  5.1015828  5.3156356 -2.8998928 
## 
## $data.table
##      qMobs cMobs.log cMfit.log     y.at.root        cMres       diffsq       cMres2
## 1  0.00315 -8.310514 -8.272440  1.496226e-07 -0.038073398 1.1545100381 1.449584e-03
## 2  0.00315 -8.264321 -8.272440  1.496226e-07  0.008118880 1.0573782888 6.591621e-05
## 3  0.00629 -7.971836 -7.920802  2.008326e-09 -0.051033666 0.5414065211 2.604435e-03
## 4  0.00629 -7.806041 -7.920802  2.008326e-09  0.114760893 0.3249099845 1.317006e-02
## 5  0.00943 -7.706859 -7.689381  1.829176e-08 -0.017477930 0.2216773603 3.054780e-04
## 6  0.00943 -7.564633 -7.689381  1.829176e-08  0.124747094 0.1079787681 1.556184e-02
## 7  0.01398 -7.554863 -7.430552  4.537565e-11 -0.124310989 0.1016530583 1.545322e-02
## 8  0.01397 -7.497983 -7.431064  4.498171e-11 -0.066918580 0.0686180757 4.478096e-03
## 9  0.01886 -7.222863 -7.199548  1.824740e-09 -0.023315083 0.0001734219 5.435931e-04
## 10 0.01886 -7.186619 -7.199548  1.824740e-09  0.012929011 0.0024416515 1.671593e-04
## 11 0.02513 -6.929593 -6.947124  2.482648e-08  0.017531526 0.0939052195 3.073544e-04
## 12 0.02513 -6.927750 -6.947124  2.482648e-08  0.019374102 0.0950378910 3.753558e-04
## 13 0.03140 -6.728391 -6.741511  9.540941e-09  0.013120089 0.2576999586 1.721367e-04
## 14 0.03139 -6.682773 -6.741803  9.589225e-09  0.059029896 0.3060961993 3.484529e-03
## 15 0.03766 -6.615288 -6.578819 -7.398722e-08 -0.036469498 0.3853231218 1.330024e-03
## 16 0.03765 -6.563519 -6.579050 -7.466198e-08  0.015530352 0.4522736797 2.411918e-04
## 17 0.05018 -6.346787 -6.342870 -1.548380e-12 -0.003917566 0.7907562743 1.534733e-05
## 18 0.05018 -6.367948 -6.342870 -1.548380e-12 -0.025077913 0.7535705784 6.289017e-04
## 
## $sums.of.squares
##        SST        SSE 
## 6.71541009 0.06035422 
## 
## $stats
## r.squared 
## 0.9910126 
## 
##        C1        K1        C2        K2        C3        K3 
## 2.451e-02 2.421e+07 5.086e-01 1.264e+05 2.068e+05 1.259e-03

The optimised parameters are interesting here; C1 = 0.0245 and C2 = 0.509 mol/kg make sense given the range of qM values. Values for logK1 = 7.38 and logK2 = 5.1 also make sense as they suggest strong adsorption of Cu2+ to humic acid, something that is widely observed in the literature. In contrast, C3 = 206800 mol/kg which is impossibly high, but this is partially explained by logK3 = -2.9, representing very weak adsorption which actually wouldn't contribute to adsorbed Cu2+ in the experimental range of metal ion additions. We might conclude that only two sites are needed to describe the data, which seems OK.

Plot it!

The final thing we would probably like to do is visualise the fit of the model to our data with a plot (Figure 5).

# clunky base-R code used as I don't know how to do the same in ggplot 🙄
par(mar=c(3,3,1,1), mgp=c(1.6,0.2,0), tcl=0.25, font.lab=2)
with(calcOut$data.table, plot(qMobs, cMobs.log, 
             xlab=expression(bold(paste("Cu"^"2+"," adsorbed (mol/kg)"))),
             ylab=expression(bold(paste("[Cu"^"2+","] (mol/L)")))))
grid();with(calcOut$data.table, points(qMobs, cMobs.log, pch=21, cex=1.4, bg="gold"))
with(calcOut$data.table, lines(qMobs, cMfit.log, col="#300080b0", lwd=3))
rect(0.0375,-8.35,0.0505,-7.3, col="#ffffffa0", border="transparent")
mtext(paste0("R² = ", signif(calcOut$stats[[1]],4)*100,"%"), adj=0.95,
      side=1, line=-1.5, font=2)
parz <- c("C1","logK1","C2","logK2","C3","logK3")
for(i in seq(2,length(predOut$par),2)){
  mtext(rev(parz)[i], side=1, line=(0.5-i)-2.2, adj=0.8)
  mtext(signif(10^(predOut$par[(length(predOut$par)+1)-i]),4),
        side=1, line=(0.5-i)-2.2, adj=0.95)}
for(i in seq(1,length(predOut$par),2)){
  mtext(rev(parz)[i], side=1, line=(0.5-i)-2.2, adj=0.8)
  mtext(signif(predOut$par[(length(predOut$par)+1)-i],4),
        side=1, line=(0.5-i)-2.2, adj=0.95)}
legend("topleft", bg="#ffffffa0", box.col="transparent", inset=0.02, 
       legend=c("Experimental observations","Model predictions"),
       pch=c(21,NA), pt.bg=c("gold",NA), col=c(1,"#300080b0"), pt.cex=c(1.4,NA),
       lwd=c(NA,3))
Figure 5: Plot of log(cM) vs. qM for adsorption of Cu²⁺ onto Waimari Peat calcium humate from Waitaha/Canterbury, Aotearoa New Zealand. Points show observations, with the line showing predictions from a 3-site adsorption model fitted by minimising a residual sum-of-squares in log(cM).

Figure 5: Plot of log(cM) vs. qM for adsorption of Cu²⁺ onto Waimari Peat calcium humate from Waitaha/Canterbury, Aotearoa New Zealand. Points show observations, with the line showing predictions from a 3-site adsorption model fitted by minimising a residual sum-of-squares in log(cM).

 

Another Example: Adsorption by a Distribution of Sites

The idea behind this example is a little different. The modelling objective is broadly the same – to describe adsorption of metal ions on a chemically heterogeneous material. In the previous example, we assumed that the heterogeneity in metal ion adsorption could be descried by an additive mixture of 3 sites.

Figure 6: Two different ways of expressing heterogeneity for metal ion adsorption: (a) discrete binding sites defined by Ci and log Ki values; (b) a continuum of binding sites which are normally distributed in log K values.

Figure 6: Two different ways of expressing heterogeneity for metal ion adsorption: (a) discrete binding sites defined by Ci and log Ki values; (b) a continuum of binding sites which are normally distributed in log K values.

 

Instead of adding the contribution of discrete sites (as in the first example above), we need to solve the integral of the distribution function. Figure 6(b) shows binding from a continuum of sites, normally distributed with respect to log K. If each infinitesimal “site” behaves like a Langmuir site, the relevant equation is:

Equation which reads qM = the summation from i=1 to 3 of (Ci*Ki*cM/(1+Ki*cM))

where \(κ\) (≡ log K) is the variable of integration, and µ is the mean of the distribution in \(κ\) with standard deviation σ.

Figure 7: The distribution of metal ion adsorption sites with respect to log K and cM (µ = 4, σ = 1.5). The value of qM is equal to the integral with repsect to log K at each value of cM. Note that at low cM, sites with high log K are occupied preferentially; the complete distribution becomes filled at greater cM values. You can click and drag on the image to rotate or zoom.

 

The integral cannot be solved analytically, so we need a numerical method (as we did above for finding cubic roots). The numerical method for integration we'll use is Romberg integration, an efficient improvement on the trapezoid method which fits a polynomial curve to early trapezoidal steps and extrapolates to the solution. The Romberg method is available for R in the appropriately-named romberg() function in the pracma package (Borchers, 2023). The best-fit values of µ and σ are found using simplex optimisation with the optim() function, and qmax is a fixed value based on the numbers of acidic functional groups on the humic acid, which are probable adsorption sites.

As in the previous example, we set all this up first with user-defined functions:

# equation defining log K distribution for adsorption,
# using log10-parameters kmu (µ) and ksig (σ) to avoid negatives

knorm <- function(x, kmu, ksig, maxq, cM){
  (1/(ksig*sqrt(2*pi))) *
    (maxq*(10^x)*cM/(1+((10^x)*cM))) * (exp(-0.5*((kmu-(x))/ksig)^2))
}

# function to calculate integral of distribution function, and subsequent
# residual sum of squares (and optionally other output if long=TRUE).

calcSSE2 <- function(params, maxq, data, long=FALSE){
  # load package for Romberg integration
  require(pracma)
  # pre-allocate data frame to hold obs + pred data
  pred <-
    data.frame(cMobs = data$cM, qMobs = data$qM,
               qMfit = rep(NA,length(data$qM)),
               qMerr = rep(NA,length(data$qM)))
  
  # loop through observations to solve integral for each with Romberg method
  for(i in 1:nrow(pred)){
    romint <- romberg(f=knorm, a=params[1]-5*params[2], b=params[1]+5*params[2],
                      tol=1e-15, kmu=params[1], ksig=params[2], maxq=maxq,
                      cM=obs$cM[i])
    pred$qMfit[i] <- romint$value
    pred$qMerr[i] <- romint$rel.error
  }

  # calculate residuals
  pred$qMres <- pred$qMobs - pred$qMfit
  # calculate squared difference for total sum-of-squares
  pred$diffsq <- (pred$qMobs - mean(pred$qMobs, na.rm=T))^2
  # calculate residuals squared for error sum-of-squares
  pred$qMres2 <- pred$qMres^2
  # calculate sums of squares for everything
  ssq <- colSums(pred[ ,c(ncol(pred)-1,ncol(pred))])
  r.squared <- 1-(ssq["qMres2"]/ssq["diffsq"])
  # define and format output object
  #       if(is.matrix(params)) params <- params[1,]
  result <- list(parameters=params,
                 data.table=pred,
                 sums.of.squares=ssq,
                 stats=r.squared)
  names(result[[3]]) <- c("SST","SSE")
  names(result[[4]]) <- "r.squared"
  SSE <- unname(result$sums.of.squares["SSE"])
  if(long==FALSE){
    return(SSE)
  } else {
    return(result)
  }
}

We have already input some metal ion adsorption data, but of course we need initial estimates of the model parameters.

# Vector P to hold initial estimates of LOG10-TRANSFORMED parameters
P2 <- c(kmu=2.5, ksig=0.55)
# non-adjustable value of qmax = maxq
maxq <- 1.95e2

Then using simplex optimisation as before:

predOut2 <- optim(par=P2,
                 fn = calcSSE2,
                 data = obs,
                 maxq = maxq,
                 method = "Nelder-Mead",
                 # hessian = TRUE,
                 control = list(maxit=2000, reltol=1.e-15))
## Loading required package: pracma
{cat(paste("\n-------- OPTIMISATION by Nelder-Mead SIMPLEX method for",
          substr(infile,1,str_locate(infile,"\\.")-1),"--------\n"))
print(predOut2)}
## 
## -------- OPTIMISATION by Nelder-Mead SIMPLEX method for wpcu05b --------
## $par
##       kmu      ksig 
## -4.614805  3.051154 
## 
## $value
## [1] 3.206444e-05
## 
## $counts
## function gradient 
##      287       NA 
## 
## $convergence
## [1] 0
## 
## $message
## NULL

More detailed lognormal model output

P2 <- predOut2$par
  calcOut2 <- calcSSE2(params=P2,data=obs, maxq=maxq, long = TRUE)
  cat(paste("\n-------- RESULTS for",
            substr(infile,1,str_locate(infile,"\\.")-1),"from OPTIMISED parameter estimates --------\n"))
## 
## -------- RESULTS for wpcu05b from OPTIMISED parameter estimates --------
  print(calcOut2)
## $parameters
##       kmu      ksig 
## -4.614805  3.051154 
## 
## $data.table
##        cMobs   qMobs       qMfit        qMerr         qMres       diffsq       qMres2
## 1  4.892e-09 0.00315 0.004156830 5.204170e-18 -0.0010068304 3.472632e-04 1.013707e-06
## 2  5.441e-09 0.00315 0.004428016 4.336809e-18 -0.0012780155 3.472632e-04 1.633324e-06
## 3  1.067e-08 0.00629 0.006570285 5.204170e-18 -0.0002802847 2.400950e-04 7.855953e-08
## 4  1.563e-08 0.00629 0.008183691 1.040834e-17 -0.0018936906 2.400950e-04 3.586064e-06
## 5  1.964e-08 0.00943 0.009319798 8.673617e-18  0.0001102021 1.526460e-04 1.214451e-08
## 6  2.725e-08 0.00943 0.011210040 0.000000e+00 -0.0017800396 1.526460e-04 3.168541e-06
## 7  2.787e-08 0.01398 0.011352312 1.734723e-18  0.0026276881 6.091802e-05 6.904745e-06
## 8  3.177e-08 0.01397 0.012214860 5.204170e-18  0.0017551402 6.107422e-05 3.080517e-06
## 9  5.986e-08 0.01886 0.017328751 1.734723e-17  0.0015312493 8.555625e-06 2.344725e-06
## 10 6.507e-08 0.01886 0.018135818 1.387779e-17  0.0007241817 8.555625e-06 5.244392e-07
## 11 1.176e-07 0.02513 0.024955427 2.081668e-17  0.0001745731 1.118903e-05 3.047576e-08
## 12 1.181e-07 0.02513 0.025012029 2.081668e-17  0.0001179714 1.118903e-05 1.391725e-08
## 13 1.869e-07 0.03140 0.031899183 2.775558e-17 -0.0004991833 9.244822e-05 2.491840e-07
## 14 2.076e-07 0.03139 0.033707092 1.387779e-17 -0.0023170919 9.225603e-05 5.368915e-06
## 15 2.425e-07 0.03766 0.036558075 0.000000e+00  0.0011019251 2.520156e-04 1.214239e-06
## 16 2.732e-07 0.03765 0.038896266 6.938894e-18 -0.0012462658 2.516982e-04 1.553178e-06
## 17 4.500e-07 0.05018 0.050285930 5.551115e-17 -0.0001059301 8.062760e-04 1.122118e-08
## 18 4.286e-07 0.05018 0.049050158 5.551115e-17  0.0011298416 8.062760e-04 1.276542e-06
## 
## $sums.of.squares
##          SST          SSE 
## 3.942460e-03 3.206444e-05 
## 
## $stats
## r.squared 
## 0.9918669
 par(mar=c(3,3,1,1), mgp=c(1.7,0.3,0), tcl=0.25, font.lab=2)
  with(calcOut2$data.table, plot(cMobs, qMobs, pch=21, cex=1.4, bg="gold"))
  with(calcOut2$data.table, lines(cMobs, qMfit, col="#400080b0", lwd=2))
  mtext(paste("Optimised estimates of parameters:", substr(infile,1,str_locate(infile,"\\.")-1)), col="#6000a0")
  mtext(paste0("R² = ", signif(calcOut2$stats[[1]],3)*100,"%"), adj=0.95,
        side=1, line=-1.4, font=2, col="#6000a0")
  for(i in 1:length(predOut2$par)){
    mtext(c("µ","σ")[(length(predOut2$par)+1)-i], side=1, line=(-0.5-i)-1.2, 
          adj=0.85)
    mtext(signif(predOut2$par[(length(predOut2$par)+1)-i],4), side=1, 
          line=(-0.5-i)-1.2, adj=0.95)
  }
Figure 8: Plot of log(cM) vs. qM for adsorption of Cu²⁺ onto Waimari Peat calcium humate from Waitaha/Canterbury, Aotearoa New Zealand. Points show observations, with the line showing predictions from an adsorption model havng a continuum of sites which are normally distributed with respect to log K.

Figure 8: Plot of log(cM) vs. qM for adsorption of Cu²⁺ onto Waimari Peat calcium humate from Waitaha/Canterbury, Aotearoa New Zealand. Points show observations, with the line showing predictions from an adsorption model havng a continuum of sites which are normally distributed with respect to log K.

 

The adsorption model based on a continuum of sites having a lognormal distribution of K values fits the data at least as well (R² 0.992) as the first model (R² 0.991). This is significant since the lognormal model includes only 2 adjustable parameters (µ and σ), whereas the first example's discrete site model had six adjustable parameters (C1, K1, C2, K2, C3, K3).

 

The code and ideas here should be adaptable to other optimisation problems, especially those involving equations without analytical solutions.

 

References

Borchers H (2023). pracma: Practical Numerical Math Functions. doi:10.32614/CRAN.package.pracma, R package version 2.4.4, https://CRAN.R-project.org/package=pracma.

Press, W. H., Flannery, B. P., Teukolsky, S. A., & Vetterling, W. T. (1986). Numerical Recipes – the art of Scientific Computing. Cambridge University Press.

R Core Team. (2025). R: A language and environment for statistical computing. In (Version 4.5.1) R Foundation for Statistical Computing. https://www.R-project.org

Rate, A. W. (1990). Humic Acid Protonation, Metal Ion Binding, and Metal Complex Dissociation Kinetics [Ph.D. Thesis, Lincoln University]. Lincoln, New Zealand. |thesis link|

Wickham H (2025). stringr: Simple, Consistent Wrappers for Common String Operations. doi:10.32614/CRAN.package.stringr, R package version 1.5.2, https://CRAN.R-project.org/package=stringr.

 


CC-BY-SA • All content by Ratey-AtUWA. My employer does not necessarily know about or endorse the content of this website.
Created with rmarkdown in RStudio. Currently using the free yeti theme from Bootswatch.