## Quantophile

### Financial programming for Quants

A quadratic polynomial $$p(x)=ax^2+bx+c$$ has only three degrees of freedom $$(a,b,c)$$. If you want your quadratic function to pass through two points, you have only one degree of freedom left. If the slope of the curve at one of the end-points is fixed, this uses up the last degree of freedom, leaving no choice of slope at the other end. A cubic polynomial $$ax^3+bx^2+cx+d$$ has four degrees of freedom, which allows us to prescribe four conditions: passing through two points and having specific slopes at the two end-points.
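To make the four-conditions argument concrete, here is a minimal Python sketch that builds the cubic through two points with prescribed end slopes. It uses the standard cubic Hermite basis; the function name `hermite_cubic` and the sample numbers are illustrative assumptions, not code from the notebook.

```python
def hermite_cubic(x0, y0, m0, x1, y1, m1):
    """Return (p, dp): the cubic with p(x0)=y0, p(x1)=y1, p'(x0)=m0, p'(x1)=m1."""
    h = x1 - x0  # interval length, assumed non-zero

    def p(x):
        t = (x - x0) / h  # rescale to [0, 1]
        h00 = 2 * t**3 - 3 * t**2 + 1
        h10 = t**3 - 2 * t**2 + t
        h01 = -2 * t**3 + 3 * t**2
        h11 = t**3 - t**2
        return h00 * y0 + h10 * h * m0 + h01 * y1 + h11 * h * m1

    def dp(x):
        t = (x - x0) / h
        d00 = 6 * t**2 - 6 * t
        d10 = 3 * t**2 - 4 * t + 1
        d01 = -6 * t**2 + 6 * t
        d11 = 3 * t**2 - 2 * t
        # chain rule: d/dx = (1/h) d/dt
        return (d00 * y0 + d01 * y1) / h + d10 * m0 + d11 * m1

    return p, dp


# Illustrative data: pass through (0, 1) and (2, 3) with slopes 2 and -1.
p, dp = hermite_cubic(0.0, 1.0, 2.0, 2.0, 3.0, -1.0)
```

All four conditions are met at once, which a single quadratic could not do in general.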

Quadratic functions are more rigid, because a quadratic has a fixed degree of bentness: its second derivative is constant. If we glue together a few quadratics to interpolate a set of points, a passenger on a roller-coaster travelling along this curve would experience long ascents and descents. Cubics are more relaxed. Unlike cubics, quadratics can’t have a point of inflexion. Intuitively, this is why cubic polynomials are used for interpolating a set of points.

However, quadratic splines also have many applications. For example, they are used in designing TrueType fonts.

Cubic splines are also used to construct the cubic Bezier curves used in car design.

I have put together the intuition, the math behind cubic splines and a Python code snippet implementing the algorithm in this notebook.

Let $$u$$ and $$v$$ be differentiable functions of $$x$$. Then the differential of the product $$uv$$ is found by the product rule

$$d(uv)=udv+vdu$$

Whence, by integration, we have

$$uv=\int{udv}+\int{vdu}$$

$$\bbox[5pt, border:2px solid blue] { \int{udv}=uv-\int{vdu}\qquad{ (1) } }$$

This formula is called integration by parts. It is often used to integrate expressions that can be written as a product of two factors $$u$$ and $$dv$$, chosen so that finding the function $$v$$ from its differential $$dv$$ and evaluating the integral $$\int{vdu}$$ are, taken together, a simpler problem than evaluating the integral $$\int{udv}$$ directly. To become skilled at breaking up a given element of integration into factors $$u$$ and $$dv$$, one has to solve problems.
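As a minimal worked instance of formula (1) (an illustrative example, not one taken from the notes), consider $$\int{xe^{x}dx}$$. Choosing $$u=x$$ and $$dv=e^{x}dx$$ gives $$du=dx$$ and $$v=e^{x}$$, so

\begin{aligned} \int{xe^{x}dx}&=xe^{x}-\int{e^{x}dx}\\ &=xe^{x}-e^{x}+C \end{aligned}

Differentiating the result gives $$e^{x}+xe^{x}-e^{x}=xe^{x}$$, which recovers the integrand.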

Let us consider some functions of a quadratic trinomial. We express the trinomial as a sum or difference of squares and then proceed to integrate the given function.

Example. \begin{aligned} \int{\frac{dx}{x^{2}+2x+5}} \end{aligned}

Solution. We can simplify the quadratic trinomial $$x^{2}+2x+5$$ as,

\begin{aligned} x^{2}+2x+5&=x^{2}+2(x)(1)+1^{2}+4\\ &=(x+1)^{2}+2^{2} \end{aligned}

We make the substitution $$x+1=t$$.

\begin{aligned} I&=\int{\frac{dx}{(x+1)^{2}+2^{2}}}\\ &=\int{\frac{dt}{t^{2}+2^{2}}}\\ &=\frac{1}{2}\arctan\left(\frac{t}{2}\right)+C\\ &=\frac{1}{2}\arctan\left(\frac{x+1}{2}\right)+C \end{aligned}
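A quick numerical sanity check of this antiderivative can be sketched in Python. The interval $$[0,1]$$ and the composite Simpson rule are choices made here for illustration only.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) sub-intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def integrand(x):
    return 1.0 / (x * x + 2 * x + 5)


def antiderivative(x):
    # (1/2) * arctan((x + 1) / 2), the result derived above (constant C dropped)
    return 0.5 * math.atan((x + 1) / 2)


numeric = simpson(integrand, 0.0, 1.0)
exact = antiderivative(1.0) - antiderivative(0.0)
```

The two values `numeric` and `exact` should agree up to the quadrature error.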

Some more examples follow.

Let us attempt to graph functions – draw a rough sketch of the function $$y=f(x)$$ or the implicit functions $$F(x,y)=0$$ by hand. These are my notes.

The method of substitution is one of the basic methods for calculating indefinite integrals. Even when we integrate by some other technique, we often resort to substitution in the intermediate stages. The success of integration largely depends on how appropriate the substitution is, in simplifying the integrand. Essentially, the study of the methods of integration, reduces to finding out what kind of substitution should be done. I solve a few interesting problems here.

##### Change of variable

Theorem. The form of the integration formula is independent of the nature of the variable of integration.

Proof. Let it be required to find the integral

$$\int{f(x)dx}$$

Let us change the variable in the expression under the integral sign, putting

$$x=\phi(t)$$

$$\int{f(x)dx}=\int{f[\phi(t)]d(\phi(t))}=\int{f[\phi(t)]\phi'(t)dt}$$

In many problems, a substitution such as the one above, leads to a simpler integral. Let us establish, that the indefinite integral is independent of the nature of the variable of integration – whether $$x$$ or $$\phi(t)$$. It is necessary to prove that their derivatives with respect to $$x$$ are equal.

Differentiating the left side with respect to $$x$$:

$$\left(\int{f(x)dx}\right)_{x}=f(x)$$

Differentiating the right side with respect to $$x$$:

$$\left(\int{f[\phi(t)]\phi'(t)dt}\right)_{x}=\left(\int{f[\phi(t)]\phi'(t)dt}\right)_{t}\cdot\frac{dt}{dx}=f[\phi(t)]\phi'(t)\frac{dt}{dx}$$

But,

$$\frac{dt}{dx}=\frac{1}{dx/dt}=\frac{1}{\phi'(t)}$$

Therefore,

$$\left(\int{f[\phi(t)]\phi'(t)dt}\right)_{x}=f[\phi(t)]\phi'(t)\frac{1}{\phi'(t)}=f[\phi(t)]=f(x)$$

Therefore, the derivatives with respect to $$x$$ of the right and left sides are equal, as required. Hence, the expressions on the right and left sides are the same, up to an additive constant.

This is part 1 of 4 posts that I would like to share on FX Volatility surfaces. The posts assume that the reader is familiar with the concept of volatility skew. There are many excellent references out there and I don’t want to repeat what you can already find in texts.

This part will focus on how FX traders build a volatility surface. Beginning with three data points, the $$25\Delta$$ put ($$75\Delta$$ call) volatility, the at-the-money volatility and the $$25\Delta$$ call ($$75\Delta$$ put) volatility, we are to interpolate the curve. The curve should be arbitrage-free. Traders use a rule of thumb called the Vanna-Volga method.

VV is extremely intuitive. The idea is to construct a portfolio of $$1$$ long call, short $$\Delta$$ units of the stock and short the three calls ($$25\Delta$$, $$50\Delta$$ and $$75\Delta$$) in the proportion $$(x_{1},x_{2},x_{3})$$. For the portfolio to be arb-free, we set vega, volga and vanna to zero. This yields $$(x_{1},x_{2},x_{3})$$. The weights can be used to compute the vanna-volga adjustment. The market price of the option is simply the sum of the Black Scholes Price and the vanna-volga adjustment.

I could write a first order approximation of the adjustment as :

$$\text{Adjustment}= \text{Vega}\times(\text{IV} - \text{Flat volatility }\sigma_{atm})$$

And a second order one as :

\begin{aligned} \text{Adjustment} &= \text{Vega}\times(\text{IV} - \text{Flat volatility }\sigma_{atm}) \\ &+ \text{Volga}\times(\text{IV} - \text{Flat volatility }\sigma_{atm})^{2} \end{aligned}

Solving the above for IV, it’s easy to extract a first-order and second-order approximation.
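A minimal Python sketch of that inversion follows, taking the two adjustment formulas exactly as written above. The function names are hypothetical. The code assumes a positive vega and picks the quadratic root that reduces to the first-order answer as volga tends to zero.

```python
import math


def iv_first_order(adjustment, vega, sigma_atm):
    """Solve Adjustment = Vega * (IV - sigma_atm) for IV."""
    return sigma_atm + adjustment / vega


def iv_second_order(adjustment, vega, volga, sigma_atm):
    """Solve Adjustment = Vega * x + Volga * x**2 for x = IV - sigma_atm."""
    if volga == 0:
        return iv_first_order(adjustment, vega, sigma_atm)
    disc = vega**2 + 4 * volga * adjustment
    if disc < 0:
        raise ValueError("no real solution for the second-order approximation")
    # Assumption: vega > 0, so the '+' root tends to adjustment / vega as volga -> 0.
    x = (-vega + math.sqrt(disc)) / (2 * volga)
    return sigma_atm + x
```

The inputs (adjustment, vega, volga) are expected to come from the Vanna-Volga weights described above; this sketch does not compute them.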

This document outlines the Vanna-Volga method and a sample code snippet in Python. In my next post, I plan to write on other stochastic volatility models and their C++ implementation.

A system of linear equations can be solved by using Gaussian elimination. For example, I have three equations:

$$\begin{matrix} 2u&+v&+w&=5\\ 4u&-6v& &=-2 \\ -2u&+7v&+2w&=9 \end{matrix}$$

The problem is to find the unknown values of $$u$$, $$v$$ and $$w$$ that satisfy the three equations.

##### Gaussian Elimination

I start by subtracting multiples of the first equation from the other equations.

The coefficient $$2$$ is the first pivot. Elimination consists of repeatedly finding the right multiplier $$l$$ by dividing the pivot into the entries below it, so that one of the variables is eliminated from each equation. To eliminate $$u$$ from $$4u-6v =-2$$, I must subtract $$2$$ times the first equation $$2u+v+w=5$$ from it. The multiplier is $$l=4/2=2$$. Similarly, to eliminate $$u$$ from $$-2u+7v+2w=9$$, the multiplier is $$l=-2/2=-1$$.

So, the operations

1. Subtract $$2$$ times equation 1 from equation 2.
2. Subtract $$-1$$ times equation 1 from equation 3.

result in :

$$\begin{matrix} 2u&+v&+w&=5\\ &-8v&-2w &=-12 \\ &+8v&+3w&=14 \end{matrix}$$
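The elimination step above can be sketched in a few lines of Python. The helper name `eliminate_below` is hypothetical, the system is stored as an augmented matrix, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def eliminate_below(aug, k):
    """Use pivot aug[k][k] to clear column k in every row below row k."""
    rows = [[Fraction(x) for x in row] for row in aug]
    pivot = rows[k][k]
    if pivot == 0:
        raise ValueError("zero pivot; a row exchange would be needed")
    for i in range(k + 1, len(rows)):
        l = rows[i][k] / pivot  # the multiplier l for row i
        rows[i] = [a - l * b for a, b in zip(rows[i], rows[k])]
    return rows


# Augmented matrix [A | b] of the three equations above.
system = [
    [2, 1, 1, 5],
    [4, -6, 0, -2],
    [-2, 7, 2, 9],
]
step1 = eliminate_below(system, 0)
```

Calling `eliminate_below(step1, 1)` would then use the second pivot $$-8$$ to clear the entry below it.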

These are my hand-written notes on conic sections.

Conic sections, Parametric and Polar curves

A random variable $$X$$ is a function on a sample space. Typical random variables are the result of tossing a coin, the value on rolling a die, the number of aces in a Poker hand, the number of multiple birthdays in a company of $$n$$ people, and the number of successes in $$n$$ Bernoulli trials. The classical theory of probability was devoted mainly to the study of a gambler’s gain, which is again a random variable.

The position of a particle under diffusion, the energy, temperature etc. of physical systems are random variables, but they are defined in non-discrete sample spaces. The source of randomness in the random variable is the experiment itself, in which events $$\{X=x_{j}\}$$ are observed according to the probability function $$P$$.

In a large number of repetitions of an experiment, what is the average value of the random variable? What is the variance of the random variable? The links to my notes and solved problems are given below.

Theorems and Notes
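As a small illustration of the two questions above, here is a Python sketch that computes the mean and variance of a discrete random variable from its probability function. The fair die is an example chosen here, and the helper names are hypothetical.

```python
from fractions import Fraction


def expectation(pmf):
    """E[X] = sum of x * P(X = x) over all values x."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Var(X) = E[(X - E[X])^2]."""
    mu = expectation(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


# Probability function of a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}
mean_die = expectation(die)   # Fraction(7, 2)
var_die = variance(die)       # Fraction(35, 12)
```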

Conditional probability has numerous applications in scientific, medical and legal reasoning, statistical genetics, cryptography etc. Whenever we observe new evidence or information, the odds of an event must be updated.

##### A motivating example.

Example. The Monty Hall problem.

On the game show Let’s Make a Deal, hosted by Monty Hall, there are three doors. One door, chosen at random, has a car behind it; the other two doors have goats behind them. You, as the contestant, have no idea which door has the car; all three are equally likely. But Monty Hall knows which one has the car.

You as a contestant, pick a door, say door $$1$$. Then, what happens is, Monty opens up either of the remaining doors $$2$$ or $$3$$, revealing a goat. The door he opens always has a goat behind it (he never reveals the car!), for example, Monty opens goat door $$3$$. So, then you know, that the car is behind the door you initially picked, door $$1$$ or the other unopened door $$2$$.

Monty then offers you the option of switching to the other unopened door or keeping your original choice. The question is, should you stay with your initial choice or should you switch?

Assumption. Monty always opens a goat door. If he has a choice of which door to open, he picks with equal probabilities.

For example, if you initially guessed right, so that the car is behind door $$1$$, doors $$2$$ and $$3$$ both have goats. Monty could open either door $$2$$ or door $$3$$. Assume these are equally likely.

###### Should you stay with your initial choice or switch?

Many people, upon seeing the problem for the first time, argue that there is no advantage to switching : “There are two doors remaining, and one of them has the car, so the chances are 50-50”. Controversy has raged about this problem for years and years.

Solution.
Under the standard assumptions above, the answer is that you should switch. Let’s label the doors $$1$$ through $$3$$. We know that each of the $$3$$ doors is equally likely to be the car-door.

$$P(C_{1})=P(C_{2})=P(C_{3})=1/3$$

Without loss of generality, we can assume that the contestant picked door $$1$$. Suppose that Monty opens door $$3$$, an event denoted by $$M_{3}$$. Suppose we believe the car is behind door $$2$$, denoted by event $$C_{2}$$. The prior probability that the car is behind door $$2$$, $$P(C_{2})=1/3$$, can be updated in light of the new information about Monty opening door $$3$$.

Bayes’s rule is used to update prior probabilities by incorporating new information.

\begin{align} P(C_{2}|M_{3})&=P(C_{2})\cdot\frac{P(M_{3}|C_{2})}{P(M_{3})}\\ &=(1/3)\cdot\frac{1}{(1/2)}\\ &=2/3 \end{align}

And, the probability that the car is behind door $$1$$ is,

$$P(C_{1}|M_{3})=1-P(C_{2}|M_{3})=1/3$$
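The Bayes update above, including the denominator $$P(M_{3})=1/2$$ obtained from the law of total probability, can be checked with a short Python sketch. The dictionary layout and the function name `posterior` are illustrative assumptions; the likelihoods follow from the stated assumption that the contestant picked door $$1$$ and Monty chooses uniformly when he has a choice.

```python
from fractions import Fraction

# Prior: the car is equally likely to be behind each door.
prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# P(Monty opens door 3 | car behind door c), given the contestant picked door 1.
likelihood_m3 = {
    1: Fraction(1, 2),  # Monty chooses between doors 2 and 3 at random
    2: Fraction(1),     # door 3 is the only goat door he may open
    3: Fraction(0),     # he never reveals the car
}


def posterior(prior, likelihood):
    """Bayes' rule: P(C | M) = P(C) * P(M | C) / P(M)."""
    evidence = sum(prior[c] * likelihood[c] for c in prior)  # P(M), total probability
    return {c: prior[c] * likelihood[c] / evidence for c in prior}


post = posterior(prior, likelihood_m3)  # post[2] is P(C_2 | M_3)
```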

It would be an abuse of the naive definition of probability to immediately say that the two doors are equally likely to have a car behind them. There is a $$1/3$$ chance that the car is behind the first door, and a $$2/3$$ chance that it is behind the second door.

Thus, the observation that Monty opened door $$3$$ makes us more sure of our belief that the car is behind door $$2$$.

###### Building correct intuition

Let’s consider an extreme case. Suppose that there are a thousand doors, $$999$$ of which have goats behind them and $$1$$ of which has a car. After the contestant’s initial pick, Monty opens $$998$$ doors with goats behind them. Let’s update our prior probabilities as Monty opens doors with goats behind them.

Assume, Monty opens door $$1000$$.

$$P(C_{2}|M_{1000})=P(C_{2})\frac{P(M_{1000}|C_{2})}{P(M_{1000})}=\frac{999}{998}\cdot\frac{1}{1000}$$

Next, Monty opens door $$999$$.

$$P(C_{2}|M_{999}M_{1000})=P(C_{2}|M_{1000})\frac{P(M_{999}|C_{2}M_{1000})}{P(M_{999}|M_{1000})}=\frac{999}{997}\cdot\frac{1}{1000}$$

Continuing in this fashion,

$$P(C_{2}|M_{3}\ldots{M_{1000}})=\frac{999}{1000}$$

In this extreme case, it becomes clear that the probabilities are not $$50-50$$. As Monty eliminates $$998$$ of the $$999$$ doors, we are extremely confident of the belief that the car is behind the remaining unopened door.
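The door-by-door updating can be reproduced exactly with a short Python sketch. It assumes, as in the three-door case, that Monty opens one goat door at a time, chosen uniformly among the unopened doors he is allowed to open. The function name is hypothetical.

```python
from fractions import Fraction


def posterior_other_door(n_doors, n_opened):
    """P(car behind one specific unopened, non-picked door) after Monty has
    opened n_opened goat doors. Requires n_opened <= n_doors - 2."""
    prior = Fraction(1, n_doors)
    w_pick = prior   # joint weight: car behind the contestant's pick
    w_other = prior  # joint weight: car behind one specific other unopened door
    m = n_doors - 1  # unopened doors other than the pick
    for _ in range(n_opened):
        w_pick *= Fraction(1, m)       # Monty may open any of the m doors
        w_other *= Fraction(1, m - 1)  # he must avoid the car door
        m -= 1
    evidence = w_pick + m * w_other    # law of total probability
    return w_other / evidence


after_one = posterior_other_door(1000, 1)    # (999/998) * (1/1000)
after_two = posterior_other_door(1000, 2)    # (999/997) * (1/1000)
after_all = posterior_other_door(1000, 998)  # 999/1000
```

With `n_doors=3` and `n_opened=1` the same function returns the $$2/3$$ found earlier.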

###### R Simulation

A simple R simulation demonstrates that the strategy to always switch has a success probability of $$2/3$$.

The experiment is repeated $$1$$ million times.
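For readers who prefer Python, an equivalent simulation can be sketched as follows. This is not the R code referred to above. The function names and the fixed seed are assumptions made for illustration, and doors are numbered from $$0$$ for convenience.

```python
import random


def play(switch, rng):
    """Play one game; return True if the contestant wins the car."""
    car = rng.randrange(3)
    pick = rng.randrange(3)
    # Monty opens a goat door other than the pick, uniformly if he has a choice.
    monty = rng.choice([d for d in range(3) if d != pick and d != car])
    if switch:
        pick = next(d for d in range(3) if d != pick and d != monty)
    return pick == car


def simulate(n_games, switch, seed=0):
    """Return (games won, games lost) over n_games independent games."""
    rng = random.Random(seed)
    wins = sum(play(switch, rng) for _ in range(n_games))
    return wins, n_games - wins


if __name__ == "__main__":
    wins, losses = simulate(1_000_000, switch=True)
    print("always switch:", wins, "won,", losses, "lost")
```

The fraction of games won with `switch=True` should settle near $$2/3$$, and near $$1/3$$ with `switch=False`.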

The number of games won and lost are :

The links to my notes and the solved problems are given below.