Many methods for estimating the area under a function evaluate the integral of simpler sub-curves that resemble the original function. Essentially, the interval is broken up into sub-intervals, and a simpler curve that approximates the original curve is created for each sub-interval. These sub-curves are generally polynomials, since polynomials are easy to integrate. The integral of the sub-curve in each sub-interval is then evaluated, and the results are summed. The sum is an approximation of the integral of the entire curve.

Some notable and widely used methods include the Riemann sum, which uses horizontal lines to estimate the integral; the trapezoidal rule, which uses sloped straight lines; and Simpson's 1/3 and 3/8 rules, which use degree-two and degree-three polynomials, respectively. Estimating the integral of the overall curve then boils down to integrating the sub-curves. If the area bounded by the sub-curve in each sub-interval and the x-axis is a simple shape (e.g. rectangles for the Riemann sum or trapezoids for the trapezoidal rule), then the integral of each sub-curve can be represented geometrically by that area.
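As a rough illustration of the three families named above, the following Python sketch applies a left Riemann sum, the trapezoidal rule, and Simpson's 1/3 rule to the same integral. The function names, the test integrand $e^x$ on $[0,1]$, and the choice $n=8$ are assumptions made for this example only.

```python
import math

def left_riemann(f, a, b, n):
    """Left Riemann sum: a horizontal line on each sub-interval."""
    h = (b - a) / n
    return h * sum(f(a + i * h) for i in range(n))

def trapezoid(f, a, b, n):
    """Trapezoidal rule: a straight line on each sub-interval."""
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * (f(a) + f(b)) + interior)

def simpson_13(f, a, b, n):
    """Simpson's 1/3 rule: a parabola on each pair of sub-intervals."""
    if n % 2 != 0:
        raise ValueError("Simpson's 1/3 rule needs an even number of sub-intervals")
    h = (b - a) / n
    odd = sum(f(a + i * h) for i in range(1, n, 2))
    even = sum(f(a + i * h) for i in range(2, n, 2))
    return (h / 3) * (f(a) + 4 * odd + 2 * even + f(b))

exact = math.e - 1  # integral of e^x over [0, 1]
for rule in (left_riemann, trapezoid, simpson_13):
    estimate = rule(math.exp, 0.0, 1.0, 8)
    print(f"{rule.__name__:>12}: {estimate:.8f}  error = {abs(estimate - exact):.2e}")
```

For a smooth integrand such as this one, the printed error should shrink as the degree of the approximating polynomial grows.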

## The Trapezoidal Rule

The trapezoidal rule is one of the easiest methods for estimating the value of an integral. It uses straight line segments to approximate the shape of a curve, and integrating all of those segments gives the estimate. The rule is also simple to use: the equations are simple, and the number of sub-intervals has no restriction other than being a positive integer (by contrast, Simpson's 1/3 rule requires the number of sub-intervals to be even).

### Derivation

We briefly discuss a geometric derivation of the trapezoidal rule. Suppose we have an integral that spans from $x_0=a$ to $x_n=b$, and assume the function is continuous over the interval. We can partition the interval $[a,b]$ into $n$ smaller sub-intervals such that $a = x_0 \lt x_1 \lt \dots \lt x_i \lt \dots \lt x_{n-1} \lt x_n=b$. Define $\Delta x_i$ to be $x_i - x_{i-1}$.

Consider a sub-interval $[x_{i-1},x_i]$. The curve in the sub-interval is approximated by the straight line joining the points $(x_{i-1}, f(x_{i-1}))$ and $(x_i, f(x_i))$. If $f(x_i)$ and $f(x_{i-1}) \geq 0$, then the integral of this line is the area of the trapezoid bounded by the line and the x-axis over the sub-interval. Thus, the integral over the sub-interval is estimated as

$$I_i \approx \frac{f(x_i)+f(x_{i-1})}{2}\Delta x_i \label{eq:1} \tag{1}$$

For $f(x_i)$ and $f(x_{i-1}) \leq 0$, the integral is estimated in the same way and is the negative of the area, that is, $-\frac{|f(x_i)+f(x_{i-1})|}{2}\Delta x_i$. This equals equation \ref{eq:1} since $f(x_i)$ and $f(x_{i-1}) \leq 0$.

If $f(x_i)$ and $f(x_{i-1})$ have opposite signs, we can break up the sub-interval further. The approximating line is continuous and changes sign, so by the intermediate value theorem it crosses the x-axis, and because it is not horizontal it does so at exactly one point $x_k$. Writing $f(x_k)=0$ for the value of the line at that point, the estimated integral of the function in $[x_{i-1},x_i]$ is:

\begin{align*}I_i &\approx \frac{f(x_k)+f(x_{i-1})}{2}(x_k-x_{i-1})+\frac{f(x_i)+f(x_k)}{2}(x_i-x_k)\\
&=\frac{f(x_{i-1})}{2}(x_k-x_{i-1})+\frac{f(x_i)}{2}(x_i-x_k)\end{align*}

By similar triangles we see that $x_k-x_{i-1}=-\frac{f(x_{i-1})\Delta x_i}{f(x_i)-f(x_{i-1})}$ and $x_i-x_k=\frac{f(x_i)\Delta x_i}{f(x_i)-f(x_{i-1})}$, so

\begin{align*}I_i &\approx -\frac{f^2(x_{i-1})}{2(f(x_i)-f(x_{i-1}))}\Delta x_i+\frac{f^2(x_i)}{2(f(x_i)-f(x_{i-1}))}\Delta x_i\\
&=\frac{f^2(x_i)-f^2(x_{i-1})}{2(f(x_i)-f(x_{i-1}))}\Delta x_i\\
&=\frac{f(x_i)+f(x_{i-1})}{2}\Delta x_i\end{align*}

Therefore, regardless of the sign of the function at the endpoints of a sub-interval, the integral in $[x_{i-1},x_i]$ is estimated by equation \ref{eq:1}.
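The sign-change case can also be checked numerically. The sketch below, with hypothetical helper names, computes the signed area under the chord once with equation \ref{eq:1} and once as two triangles meeting at the root $x_k$, using the similar-triangle widths from the derivation.

```python
def chord_area_formula(f_left, f_right, dx):
    """Signed area under the chord, from equation (1)."""
    return 0.5 * (f_left + f_right) * dx

def chord_area_split(f_left, f_right, dx):
    """Signed area under the chord, as two triangles meeting at the root x_k."""
    if f_left * f_right >= 0:
        raise ValueError("endpoint values must have opposite signs")
    left_width = -f_left * dx / (f_right - f_left)   # x_k - x_{i-1}
    right_width = f_right * dx / (f_right - f_left)  # x_i - x_k
    return 0.5 * f_left * left_width + 0.5 * f_right * right_width

# Example values chosen for illustration: f(x_{i-1}) = -1, f(x_i) = 3, dx = 2
print(chord_area_formula(-1.0, 3.0, 2.0))
print(chord_area_split(-1.0, 3.0, 2.0))
```

Both calls should print the same value, since the two triangle areas combine into the single trapezoid expression.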

The integral of the overall curve in the interval $[a,b]$ is the sum of all the estimated integrals in each sub-interval, explicitly:

$$I = \sum_{i=1}^{n} I_i$$

$$I \approx \sum_{i=1}^{n} \frac{f(x_{i-1})+f(x_{i})}{2}\Delta x_i \label{eq:2} \tag{2}$$
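Equation \ref{eq:2} maps directly onto tabulated data with unequal spacing. The following is a minimal sketch; the function name `trapezoid_nonuniform` and the sample points are assumptions made for illustration.

```python
def trapezoid_nonuniform(xs, ys):
    """Apply equation (2) to samples ys[i] = f(xs[i]) with increasing xs."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    if len(xs) < 2:
        raise ValueError("at least two sample points are required")
    total = 0.0
    for i in range(1, len(xs)):
        dx = xs[i] - xs[i - 1]  # width of the i-th sub-interval
        if dx <= 0:
            raise ValueError("xs must be strictly increasing")
        total += 0.5 * (ys[i - 1] + ys[i]) * dx
    return total

# Samples of y = x^2 at unevenly spaced points (illustrative values)
print(trapezoid_nonuniform([0.0, 1.0, 3.0], [0.0, 1.0, 9.0]))
```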

If the sub-intervals are evenly spaced, then $\Delta x_i = \frac{b-a}{n}$ for every $i$, and

$$I \approx \frac{b-a}{2n}\sum_{i=1}^{n} \big(f(x_{i-1})+f(x_i)\big)$$

$$I \approx \frac{b-a}{2n} \bigg(f(x_0)+f(x_n)+2\sum_{i=1}^{n-1} f(x_{i})\bigg) \label{eq:3} \tag{3}$$

Equations \ref{eq:2} and \ref{eq:3} are the final forms used in this numerical method.
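For equally spaced sub-intervals, equation \ref{eq:3} can be sketched as follows. The function name `trapezoid_uniform` is hypothetical, and the demonstration integrand $\sin(x)/x$ on $[1,3]$ is chosen only as an example.

```python
import math

def trapezoid_uniform(f, a, b, n):
    """Apply equation (3): n equal sub-intervals on [a, b]."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    h = (b - a) / n
    # Interior points x_1 .. x_{n-1} each appear twice in the sum
    interior = sum(f(a + i * h) for i in range(1, n))
    return (b - a) / (2 * n) * (f(a) + f(b) + 2 * interior)

print(trapezoid_uniform(lambda x: math.sin(x) / x, 1.0, 3.0, 5000))
```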

### Implementation

Implementation of the trapezoidal rule is relatively straightforward. One thing to note is that the higher the number of sub-intervals, the better the accuracy; *however*, the higher the number, the more prominent round-off errors become. Generally the round-off errors associated with this method are negligible as long as the number of sub-intervals does not get extremely large. It is recommended to calculate and use the minimum number of sub-intervals that achieves a desired maximum error, using the error formula for the trapezoidal rule. Then, as the integration bounds change, it is generally recommended to triple the number of sub-intervals each time the length of the integration interval doubles; this keeps the maximum error roughly the same, assuming the maximum concavity of the function over the new interval stays the same as over the old one.

```lua
-- Integrand: sin(x)/x has no elementary antiderivative
local function f(x)
  return math.sin(x) / x
end

local a = 1     -- lower bound
local b = 3     -- upper bound
local n = 5000  -- number of sub-intervals
local sum = 0

for i = 1, n do
  local xi  = ((b - a) * i / n) + a        -- right endpoint x_i
  local xi1 = ((b - a) * (i - 1) / n) + a  -- left endpoint x_{i-1}
  -- Add the trapezoid over [x_{i-1}, x_i]
  sum = sum + (f(xi) + f(xi1)) * ((b - a) / (2 * n))
end

print(sum)
```

Figure 3: Lua code for the implementation of the trapezoidal rule. Note the estimation of a non-elementary integral sin(x)/x.
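The recommendation above refers to an error formula that is not reproduced in this section. A minimal sketch, assuming it is the standard bound for the trapezoidal rule with $n$ equal sub-intervals, $|E| \le \frac{(b-a)^3}{12n^2}\max_{a\le\xi\le b}|f''(\xi)|$: solving for $n$ gives the smallest sub-interval count that guarantees a given tolerance. The function name `min_subintervals` and its parameters are hypothetical, and the caller is assumed to supply a bound on $|f''|$.

```python
import math

def min_subintervals(a, b, max_abs_f2, tol):
    """Smallest n with (b - a)^3 * max|f''| / (12 n^2) <= tol.

    max_abs_f2 is an upper bound on |f''| over [a, b], supplied by the caller.
    """
    if b <= a:
        raise ValueError("require a < b")
    if tol <= 0 or max_abs_f2 < 0:
        raise ValueError("tol must be positive and max_abs_f2 non-negative")
    n = math.sqrt((b - a) ** 3 * max_abs_f2 / (12 * tol))
    return max(1, math.ceil(n))

# f(x) = x^2 has |f''| = 2 everywhere (illustrative choice)
print(min_subintervals(0.0, 1.0, 2.0, 1e-4))
print(min_subintervals(0.0, 2.0, 2.0, 1e-4))
```

Under this bound, doubling the interval length with the same bound on $|f''|$ multiplies the required $n$ by $2^{3/2} \approx 2.83$, which is consistent with the rule of thumb of tripling.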

The method above is adequate but relies on the user to choose a suitable sub-interval count. A better implementation estimates the integral with progressively more sub-intervals until the answer converges to a single value (i.e. is numerically stable). The estimate is recomputed as the number of sub-intervals increases, and at each step the relative difference between the current estimate and the previous best estimate is calculated. Only when this difference falls below a preset tolerance is the answer returned. In the example below, this method is implemented in MATLAB and improved with Romberg's method, a way to decrease the number of iterations needed with the trapezoidal rule.

```matlab
rombInt(@(x) sin(x)./x,1,3)

function I=rombInt(f,a,b)
    maxIter=1000;  % Max iterations
    etol = 1e-6;   % Tolerance for relative error
    n=1;
    if a>=b
        error("Must provide unequal and smaller a value than b")
    end
    prev=Inf;
    for j=1:maxIter
        % Apply trapezoidal rule twice for Romberg's method
        trap1=0;
        xDif=(b-a)/n;
        x=a:xDif:b;
        y=f(x);
        for i=1:length(x)-1
            trap1=trap1+0.5*(y(i)+y(i+1))*xDif;
        end
        n=n*2;  % Double the trapezoidal rule subinterval count
        trap2=0;
        xDif=(b-a)/n;
        x=a:xDif:b;
        y=f(x);
        for i=1:length(x)-1
            trap2=trap2+0.5*(y(i)+y(i+1))*xDif;
        end
        n=n*2;  % Double again so the next pass starts from a finer grid
        % Calculate final integral
        I=(4/3)*trap2-(1/3)*trap1;
        if abs((prev-I)/I) <= etol
            fprintf("Convergence in %.d iterations",j)
            break
        elseif j==maxIter
            warning("Integral did not converge")
            break
        end
        prev=I;
    end
end
```

Figure 4: MATLAB code for the implementation of an adaptive trapezoidal method.
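The same idea can be sketched in Python. This is a variant under stated assumptions rather than a translation of Figure 4: it doubles the sub-interval count once per pass, reuses the previous trapezoidal estimate so that only the new midpoints are evaluated, and applies the same $(4T_{2n}-T_n)/3$ extrapolation. The name `refine_until_converged` and its default tolerances are hypothetical.

```python
import math

def refine_until_converged(f, a, b, rel_tol=1e-6, max_iter=20):
    """Double the sub-interval count until the extrapolated estimate settles.

    Returns (estimate, passes). Raises RuntimeError if max_iter is reached.
    """
    if not a < b:
        raise ValueError("require a < b")
    n = 1
    trap = 0.5 * (b - a) * (f(a) + f(b))  # trapezoidal estimate with n = 1
    prev = None
    for passes in range(1, max_iter + 1):
        h = (b - a) / (2 * n)  # width after doubling the count
        # Only the new midpoints need fresh function evaluations
        midpoints = sum(f(a + (2 * k + 1) * h) for k in range(n))
        finer = 0.5 * trap + h * midpoints  # trapezoidal estimate with 2n
        estimate = (4 * finer - trap) / 3   # one extrapolation step
        if prev is not None and abs(estimate - prev) <= rel_tol * abs(estimate):
            return estimate, passes
        prev, trap, n = estimate, finer, 2 * n
    raise RuntimeError("estimate did not converge")

value, passes = refine_until_converged(lambda x: math.sin(x) / x, 1.0, 3.0)
print(value, passes)
```

Reusing the coarser estimate is a design choice that avoids re-evaluating the integrand at points already visited; the stopping test compares successive extrapolated values, as in the MATLAB listing.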

## Exercises

- Using the error formula for the trapezoidal rule, what is the equation that gives the *maximum* possible error for the trapezoidal estimation of $\int^{x}_{0} \left(t^3+e^{t}\right)~dt$ using $n$ equal sub-intervals? What is the formula for the *exact* error?
- Prove that the trapezoidal rule, for any number of sub-intervals greater than or equal to 1,
  - overestimates a parabola's integral if it faces upward.
  - perfectly estimates the integral for functions of the form $y=ax+b$.
- Use the trapezoidal rule with $n=4$ to estimate $\int_{0}^{5}\left(5x-5\right)~dx$.
- Estimate the integral from $\pi$ to $2\pi$ of $\frac{\sin(x)}{x}$ with the minimum number of sub-intervals such that the maximum error is no more than 0.05.
- For batch/simple distillation where a binary liquid mixture is boiled and condensed into a receiving flask, the equation $\ln\Big(\frac{W}{F}\Big)=-\int_{x_w}^{x_f}\frac{dx}{y-x}$ is important to quantify the ending amounts of liquid in both containers and their respective concentrations. $x$ is related to $y$ as provided in the table below:

  Figure 5: Table of values for question 5.

  If $x_w=0.02$, $x_f=0.20$, and $F=1$, estimate $W$, the remaining liquid in the feed pot in mols. [1]
- Derive equation \ref{eq:1} using
  - a geometric approach.
  - a mathematical approach.
- Using one sub-interval with the trapezoidal rule to estimate certain functions is sometimes more accurate than using two or more sub-intervals. Provide a function that shows this.
- Refer to Figure 1 and identify the numerical method that estimates the integral
  - the most poorly?
  - the most accurately?
  - the quickest, if calculated manually?
  - the slowest, if calculated manually?
- Refer to equation \ref{eq:3} and the equation directly above it. Why is it more efficient to use equation \ref{eq:3} rather than the other one, especially when manually evaluating the formula?
- We've seen different methods that can be used to estimate the integral of a curve. The ones mentioned use polynomials of varying degrees to resemble the shape of the original curve. Referring to Figure 5 and considering all the methods presented in this article, which method would be the best (efficiency and accuracy wise) at estimating the integral of the functions from $x=1$ to $x=5$? For each function,
  - is equation \ref{eq:2} or \ref{eq:3} more useful?
  - how many sub-intervals would you choose?
  - where would you partition the sub-intervals?
