kappa_post = sqrt(C^2 + S^2),    mu_post = atan2(S, C)

where:
- C = kappa_prior * cos(mu_prior) + kappa_likelihood * cos(theta_n) and S = kappa_prior * sin(mu_prior) + kappa_likelihood * sin(theta_n)
- kappa_post is the posterior concentration parameter
- mu_post is the posterior mean direction
Yay, we made it! The posterior is again a von Mises distribution with updated parameters (mu_post, kappa_post). Now we can update the prior with every new observation and watch how the mean direction changes.
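Before we build the full update loop, here is a minimal sketch of a single update step following the formulas above (the function name and argument names are mine, not from the original code):

import numpy as np

def vonmises_posterior_update(mu_prior: float, kappa_prior: float,
                              theta_obs: float, kappa_likelihood: float):
    """One Bayesian update of a von Mises prior with a single observed angle (radians)."""
    C = kappa_prior * np.cos(mu_prior) + kappa_likelihood * np.cos(theta_obs)
    S = kappa_prior * np.sin(mu_prior) + kappa_likelihood * np.sin(theta_obs)
    kappa_post = np.sqrt(C**2 + S**2)   # posterior concentration
    mu_post = np.arctan2(S, C)          # posterior mean direction
    return mu_post, kappa_post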
Welcome back to those of you who skipped the maths, and congratulations to those who made it through! Let's code the Bayesian update and visualize the results.
First, let's define a helper function for visualizing the posterior distribution. We'll need it to create a nice animation shortly.
import imageio
from io import BytesIO
from typing import List, Tuple

def get_posterior_distribution_image_array(
    mu_grid: np.ndarray,
    posterior_pdf: np.ndarray,
    current_samples: List[float],
    idx: int,
    fig_size: Tuple[int, int],
    dpi: int,
    r_max_posterior: float
) -> np.ndarray:
"""
Creates the posterior distribution and seen samples histogram on a polar plot,
converts it to an image array, and returns it for GIF processing.
Parameters:
-----------
mu_grid (np.ndarray):
Grid of indicate path values for plotting the posterior PDF.
posterior_pdf (np.ndarray):
Posterior chance density carry out values for the given `mu_grid`.
current_samples (Guidelines[float]):
Guidelines of seen angle samples in radians.
idx (int):
The current step index, used for labeling the plot.
fig_size (Tuple[int, int]):
Measurement of the plot decide (width, peak).
dpi (int):
Dots per inch (resolution) for the plot.
r_max_posterior (float):
Most radius for the posterior PDF plot, used to set plot limits.
Returns:
np.ndarray: Image array of the plot in RGB format, acceptable for GIF processing.
"""
    fig = plt.figure(figsize=fig_size, dpi=dpi)
    ax = plt.subplot(1, 1, 1, projection='polar')
    ax.set_theta_zero_location('N')
    ax.set_theta_direction(-1)
    ax.plot(mu_grid, posterior_pdf, color='red', linewidth=2, label='Posterior PDF')

    # observed samples histogram
    n_bins = 48
    hist_bins = np.linspace(-np.pi, np.pi, n_bins + 1)
    hist_counts, _ = np.histogram(current_samples, bins=hist_bins)

    # normalize the histogram counts
    if np.max(hist_counts) > 0:
        hist_counts_normalized = hist_counts / np.max(hist_counts)
    else:
        hist_counts_normalized = hist_counts

    bin_centers = (hist_bins[:-1] + hist_bins[1:]) / 2
    bin_width = hist_bins[1] - hist_bins[0]

    # set the maximum radius to accommodate both the posterior pdf and the histogram bars
    r_histogram_height = r_max_posterior * 0.9
    r_max = r_max_posterior + r_histogram_height
    ax.set_ylim(0, r_max)

    # plot the histogram bars outside the circle
    for i in range(len(hist_counts_normalized)):
        theta = bin_centers[i]
        width = bin_width
        hist_height = hist_counts_normalized[i] * r_histogram_height
        if hist_counts_normalized[i] > 0:
            ax.bar(
                theta, hist_height, width=width, bottom=r_max_posterior,
                color='teal', edgecolor='black', alpha=0.5
            )

    ax.text(
        0.5, 1.1, f'Posterior Distribution (Step {idx + 1})',
        transform=ax.transAxes, ha='center', va='bottom', fontsize=18
    )
    ax.set_yticklabels([])
    ax.grid(linestyle='--')
    ax.yaxis.set_visible(False)
    ax.spines['polar'].set_visible(False)
    plt.subplots_adjust(top=0.85, bottom=0.05, left=0.05, right=0.95)

    # save to buffer for gif processing
    buf = BytesIO()
    plt.savefig(buf, format='png', bbox_inches=None, pad_inches=0)
    buf.seek(0)
    img_array = plt.imread(buf)
    img_array = (img_array * 255).astype(np.uint8)

    plt.close(fig)
    return img_array
Now we're ready to write the update loop. Remember that we need to set our prior distribution. I'll start with a circular uniform prior, which is equivalent to a von Mises distribution with a concentration parameter of 0 (its density is then a constant 1/(2π)). For kappa_likelihood I set a fixed moderate concentration parameter of 2. That will make the posterior updates more visible.
# initial prior parameters
mu_prior = 0.0     # initial mean direction (any value, since kappa_prior = 0)
kappa_prior = 0.0  # uniform prior over the circle

# fixed concentration parameter for the likelihood
kappa_likelihood = 2.0

posterior_mus = []
posterior_kappas = []

mu_grid = np.linspace(-np.pi, np.pi, 200)

# visualization parameters
fig_size = (10, 10)
dpi = 100

current_samples = []
frames = []
for idx, theta_n in enumerate(data['radians']):
    # compute posterior parameters
    C = kappa_prior * np.cos(mu_prior) + kappa_likelihood * np.cos(theta_n)
    S = kappa_prior * np.sin(mu_prior) + kappa_likelihood * np.sin(theta_n)
    kappa_post = np.sqrt(C**2 + S**2)
    mu_post = np.arctan2(S, C)

    # posterior distribution
    posterior_pdf = np.exp(kappa_post * np.cos(mu_grid - mu_post)) / (2 * np.pi * i0(kappa_post))

    # store posterior parameters and observed samples
    posterior_mus.append(mu_post)
    posterior_kappas.append(kappa_post)
    current_samples.append(theta_n)

    # plot the posterior distribution
    r_max_posterior = max(posterior_pdf) * 1.1
    img_array = get_posterior_distribution_image_array(
        mu_grid,
        posterior_pdf,
        current_samples,
        idx,
        fig_size,
        dpi,
        r_max_posterior
    )
    frames.append(img_array)

    # update the prior for the next iteration
    mu_prior = mu_post
    kappa_prior = kappa_post
# Create GIF
fps = 10
frames.extend([img_array] * fps * 3)  # repeat the last frame several times to create a "pause" at the end of the GIF
imageio.mimsave('../images/posterior_updates.gif', frames, fps=fps)
And that's it! The code generates a GIF showing how the posterior distribution updates with every new observation. Here's the glorious result:
With every new observation the posterior distribution gets more and more concentrated around the true mean direction. If only I could replace the red line with Auri's silhouette, it would be perfect!
We can further visualize the history of the posterior mean direction and concentration parameter. Let's plot them:
# Convert posterior_mus to degrees
posterior_mus_deg = np.rad2deg(posterior_mus) % 360
n_samples = data.shape[0]
true_mu = data['degrees'].mean()

# Plot evolution of the posterior mean direction
fig, ax1 = plt.subplots(figsize=(12, 6))

color = 'tab:blue'
ax1.set_xlabel('Number of Observations')
ax1.set_ylabel('Posterior Mean Direction (Degrees)', color=color)
ax1.plot(range(1, n_samples + 1), posterior_mus_deg, marker='o', color=color)
ax1.tick_params(axis='y', labelcolor=color)
ax1.axhline(true_mu, color='red', linestyle='--', label='Sample Distribution Mean Direction')
ax1.legend(loc='upper left')
ax1.grid(True)

ax2 = ax1.twinx()  # instantiate a second axes that shares the same x-axis
color = 'tab:orange'
ax2.set_ylabel('Posterior Concentration Parameter (kappa)', color=color)  # we already handled the x-label with ax1
ax2.plot(range(1, n_samples + 1), posterior_kappas, marker='o', color=color)
ax2.tick_params(axis='y', labelcolor=color)

fig.tight_layout()  # otherwise the right y-label is slightly clipped
sns.despine()
plt.title('Evolution of Posterior Mean Direction and Concentration Over Time')
plt.show()
The plot shows how the posterior mean direction and concentration parameter evolve with each new observation. The mean direction eventually converges to the sample value, while the concentration parameter increases as the estimate becomes more certain.
The last thing I wanted to try was the Bayes Factor approach to hypothesis testing. The idea behind the Bayes Factor is very simple: it's the ratio of two marginal likelihoods for two competing hypotheses/models.
Formally, the Bayes Factor is defined as:

BF_ij = p(D|M_i) / p(D|M_j) = [p(M_i|D) / p(M_j|D)] × [p(M_j) / p(M_i)]

where:
- p(D|Mi) and p(D|Mj) are the marginal likelihoods of the data under hypotheses i and j
- p(Mi|D) and p(Mj|D) are the posterior probabilities of the models given the data
- p(Mi) and p(Mj) are the prior probabilities of the models
The result is a number that tells us how much more likely one hypothesis is compared to the other. There are different approaches to interpreting the Bayes Factor, but a common one is Jeffreys' scale by Harold Jeffreys:
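One common formulation of the scale, for a Bayes Factor in favour of the alternative hypothesis, is roughly:
- 1–3.2: barely worth mentioning
- 3.2–10: substantial
- 10–32: strong
- 32–100: very strong
- above 100: decisive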
What are the models, you may ask? Simple! They're just distributions with different parameters. I'll be using PyMC to define the models and sample posterior distributions from them.
First of all, let's re-introduce the null hypothesis. I still assume it's a circular uniform von Mises distribution with kappa=0, but this time we need to calculate the likelihood of the data under this hypothesis. To simplify further calculations, we'll work with log-likelihoods.
# Calculate log-likelihood for H0
log_likelihood_h0 = vonmises.logpdf(data['radians'], kappa=0, loc=0).sum()
Next, it's time to build the alternative model. Let's start with the simple case of a unimodal South direction, where I assume that the distribution is concentrated around 180°, or π in radians.
Let's define the model in PyMC. We'll use a von Mises distribution with a fixed location parameter μ=π and a Half-Normal prior for the non-negative concentration parameter κ. This allows the model to learn the concentration parameter from the data and check whether the South direction is preferred.
import pymc as pm
import arviz as az
import arviz.data.inference_data as InferenceData
from scipy.stats import halfnorm, gaussian_kde

with pm.Model() as model_uni:
    # Prior for kappa
    kappa = pm.HalfNormal('kappa', sigma=10)

    # Likelihood
    likelihood_h1 = pm.VonMises('angles', mu=np.pi, kappa=kappa, observed=data['radians'])

    # Sample from the posterior
    trace_uni = pm.sample(
        10000, tune=3000, chains=4,
        return_inferencedata=True,
        idata_kwargs={'log_likelihood': True})
This gives us a nice simple model, which we can also visualize:
# Model graph
pm.model_to_graphviz(model_uni)
And here's the posterior distribution of the concentration parameter κ:
az.plot_posterior(trace_uni, var_names=['kappa'])
plt.show()
All that's left is to calculate the log-likelihood for the alternative model and the Bayes Factor.
# Posterior samples for kappa
kappa_samples = trace_uni.posterior.kappa.values.flatten()

# Log-likelihood for each posterior sample
log_likes = []
for k in kappa_samples:
    # von Mises log-likelihood
    log_like = vonmises.logpdf(data['radians'], k, loc=np.pi).sum()
    log_likes.append(log_like)

# Log-mean-exp trick for numerical stability
log_likelihood_h1 = np.max(log_likes) + \
    np.log(np.mean(np.exp(log_likes - np.max(log_likes))))
BF = np.exp(log_likelihood_h1 - log_likelihood_h0)
print(f"Bayes Difficulty: {BF:.4f}")
print(f"Likelihood kappa > 0.5: {np.indicate(kappa_samples > 0.5):.4f}")
>> Bayes Difficulty: 32.4645
>> Likelihood kappa > 0.5: 0.0649
Since we're dividing the likelihood of the alternative model by the likelihood of the null model, the Bayes Factor tells us how much more likely the data is under the alternative hypothesis. In this case, we get 32.46, very strong evidence, suggesting that the data is not uniformly distributed around the circle and that there is a preference for the South direction.
However, we also calculate the probability that the concentration parameter kappa is greater than 0.5. This is a simple way to check whether the distribution is meaningfully different from the uniform one. With the unimodal South model, this probability is only 0.0649, which indicates that the distribution is still quite spread out.
Let's try another model: a Bimodal North-South Mixture.
This time I'll assume that the distribution is bimodal, with peaks around 0° and 180°, just as we saw in the compass rose.
To achieve this, I'll need a mixture of two von Mises distributions with different fixed mean directions and a shared concentration parameter.
Let's define a few helper functions first:
from typing import Dict, Union
import numpy.typing as npt

# Type aliases
ArrayLike = Union[np.ndarray, pd.Series]
ResultDict = Dict[str, Union[float, InferenceData.InferenceData]]

def compute_mixture_vonmises_logpdf(
    series: ArrayLike,
    kappa: float,
    weights: npt.NDArray[np.float64],
    mus: List[float]
) -> float:
    """
    Compute the log PDF for a mixture of von Mises distributions

    Parameters:
    -----------
    series: ArrayLike
        Array of observed angles in radians
    kappa: float
        Concentration parameter
    weights: npt.NDArray[np.float64]
        Array of mixture weights
    mus: List[float]
        Array of means for each component

    Returns:
    --------
    float: Sum of log probabilities for all data points
    """
    mixture_pdf = np.zeros_like(series)
    for w, mu in zip(weights, mus):
        mixture_pdf += w * vonmises.pdf(series, kappa, loc=mu)
    return np.log(np.maximum(mixture_pdf, 1e-300)).sum()
def compute_log_likelihoods(
    trace: az.InferenceData,
    series: ArrayLike,
    mus: List[float]
) -> np.ndarray:
    """
    Compute log-likelihoods for each sample in the trace

    Parameters:
    -----------
    trace: az.InferenceData
        The trace from the PyMC model sampling.
    series: ArrayLike
        Array of observed angles in radians
    """
    kappa_samples = trace.posterior.kappa.values.flatten()
    # infer the number of mixture components from the trace
    n_components = trace.posterior.weights.values.shape[-1]
    weights_samples = trace.posterior.weights.values.reshape(-1, n_components)

    # Calculate log-likelihood for each posterior sample
    log_likes = []
    for k, w in zip(kappa_samples, weights_samples):
        log_like = compute_mixture_vonmises_logpdf(
            series,
            kappa=k,
            weights=w,
            mus=mus
        )
        log_likes.append(log_like)

    # Calculate the marginal likelihood using the log-mean-exp trick
    log_likelihood_h1 = np.max(log_likes) + np.log(np.mean(np.exp(log_likes - np.max(log_likes))))
    return log_likelihood_h1
def posterior_report(
    log_likelihood_h0: float,
    log_likelihood_h1: float,
    kappa_samples: ArrayLike,
    kappa_threshold: float = 0.5
) -> str:
    """
    Generate a report with the Bayes Factor and the probability that kappa > threshold

    Parameters:
    -----------
    log_likelihood_h0: float
        Log-likelihood for the null hypothesis
    log_likelihood_h1: float
        Log-likelihood for the alternative hypothesis
    kappa_samples: ArrayLike
        Flattened posterior samples of the concentration parameter
    kappa_threshold: float
        Threshold for computing the probability that kappa > threshold

    Returns:
    --------
    summary: str
        A formatted string containing the summary statistics.
    """
    BF = np.exp(log_likelihood_h1 - log_likelihood_h0)
    summary = (
        f"Bayes Factor: {BF:.4f}\n"
        f"Probability kappa > {kappa_threshold}: {np.mean(kappa_samples > kappa_threshold):.4f}"
    )
    return summary
And now back to the model:
mu1 = 0        # 0 degrees
mu2 = np.pi    # 180 degrees

with pm.Model() as model_mixture_bimodal_NS:
    # Prior for the concentration parameter
    kappa = pm.HalfNormal('kappa', sigma=10)

    # Prior for the component weights
    weights = pm.Dirichlet('weights', a=np.ones(2))

    # Define the two von Mises components
    vm1 = pm.VonMises.dist(mu=mu1, kappa=kappa)
    vm2 = pm.VonMises.dist(mu=mu2, kappa=kappa)

    # Mixture distribution
    likelihood = pm.Mixture(
        'angles',
        w=weights,
        comp_dists=[vm1, vm2],
        observed=data['radians']
    )

    # Sample from the posterior
    trace_mixture_bimodal_NS = pm.sample(
        10000, tune=3000, chains=4, return_inferencedata=True, idata_kwargs={'log_likelihood': True})

# Get kappa samples
kappa_samples = trace_mixture_bimodal_NS.posterior.kappa.values.flatten()
Once again, let's visualize the model graph and the posterior distribution of the concentration parameter κ:
# Model graph
pm.model_to_graphviz(model_mixture_bimodal_NS)
# Posterior Analysis
az.plot_posterior(trace_mixture_bimodal_NS, var_names=['kappa'])
plt.show()
And finally, let's calculate the Bayes Factor and the probability that the concentration parameter κ is greater than 0.5:
log_likelihood_h1 = compute_log_likelihoods(trace_mixture_bimodal_NS, data['radians'], [mu1, mu2])
print(posterior_report(log_likelihood_h0, log_likelihood_h1, kappa_samples))

>> Bayes Factor: 214.2333
>> Probability kappa > 0.5: 0.9110
Fantastic! Both of our metrics indicate that this model is a considerably better fit for the data. The Bayes Factor suggests decisive evidence, and most of the posterior κ samples are greater than 0.5, with a mean value of 0.99, as we saw on the distribution plot.
Let's try a couple more models before wrapping up.
This model once again assumes a bimodal distribution, but this time with peaks around 270° and 180°, which were also frequent directions in the compass rose.
mu1 = np.pi            # 180 degrees
mu2 = 3 * np.pi / 2    # 270 degrees

with pm.Model() as model_mixture_bimodal_WS:
    # Prior for the concentration parameter
    kappa = pm.HalfNormal('kappa', sigma=10)

    # Prior for the component weights
    weights = pm.Dirichlet('weights', a=np.ones(2))

    # Define the two von Mises components
    vm1 = pm.VonMises.dist(mu=mu1, kappa=kappa)
    vm2 = pm.VonMises.dist(mu=mu2, kappa=kappa)

    # Mixture distribution
    likelihood = pm.Mixture(
        'angles',
        w=weights,
        comp_dists=[vm1, vm2],
        observed=data['radians']
    )

    # Sample from the posterior
    trace_mixture_bimodal_WS = pm.sample(
        10000, tune=3000, chains=4, return_inferencedata=True, idata_kwargs={'log_likelihood': True})

# Get kappa samples
kappa_samples = trace_mixture_bimodal_WS.posterior.kappa.values.flatten()

# Posterior Analysis
az.plot_posterior(trace_mixture_bimodal_WS, var_names=['kappa'])
plt.show()
log_likelihood_h1 = compute_log_likelihoods(trace_mixture_bimodal_WS, data['radians'], [mu1, mu2])
print(posterior_report(log_likelihood_h0, log_likelihood_h1, kappa_samples))

>> Bayes Factor: 20.2361
>> Probability kappa > 0.5: 0.1329
Nope, definitely not as good as the previous model. Next!
Final round. Maybe my dog really likes to align himself with the cardinal directions? Let's try a quadrimodal distribution with peaks around 0°, 90°, 180°, and 270°.
mu1 = 0                # 0 degrees
mu2 = np.pi / 2        # 90 degrees
mu3 = np.pi            # 180 degrees
mu4 = 3 * np.pi / 2    # 270 degrees

with pm.Model() as model_mixture_quad:
    # Prior for the concentration parameter
    kappa = pm.HalfNormal('kappa', sigma=10)

    # Prior for the component weights
    weights = pm.Dirichlet('weights', a=np.ones(4))

    # Define the four von Mises components
    vm1 = pm.VonMises.dist(mu=mu1, kappa=kappa)
    vm2 = pm.VonMises.dist(mu=mu2, kappa=kappa)
    vm3 = pm.VonMises.dist(mu=mu3, kappa=kappa)
    vm4 = pm.VonMises.dist(mu=mu4, kappa=kappa)

    # Mixture distribution
    likelihood = pm.Mixture(
        'angles',
        w=weights,
        comp_dists=[vm1, vm2, vm3, vm4],
        observed=data['radians']
    )

    # Sample from the posterior
    trace_mixture_quad = pm.sample(
        10000, tune=3000, chains=4, return_inferencedata=True, idata_kwargs={'log_likelihood': True}
    )

# Get kappa samples
kappa_samples = trace_mixture_quad.posterior.kappa.values.flatten()

# Posterior Analysis
az.plot_posterior(trace_mixture_quad, var_names=['kappa'])
plt.show()
log_likelihood_h1 = compute_log_likelihoods(trace_mixture_quad, data['radians'], [mu1, mu2, mu3, mu4])
print(posterior_report(log_likelihood_h0, log_likelihood_h1, kappa_samples))

>> Bayes Factor: 0.0000
>> Probability kappa > 0.5: 0.9644
Well… probably not. Although the probability that the concentration parameter κ is greater than 0.5 is quite high, the Bayes Factor is 0.0.
The beauty of the Bayes Factor is that it penalizes overly complex models, effectively preventing overfitting: the marginal likelihood averages the fit over everything the prior allows, so a model that spends prior mass on peaks the data doesn't support ends up with a lower marginal likelihood.
Let's summarize the results for all the models using information criteria. We'll use the Widely Applicable Information Criterion (WAIC) and Leave-One-Out Cross-Validation (LOO) to compare the models.
# Compute WAIC for each model
waic_uni = az.waic(trace_uni)
waic_quad = az.waic(trace_mixture_quad)
waic_bimodal_NS = az.waic(trace_mixture_bimodal_NS)
waic_bimodal_WS = az.waic(trace_mixture_bimodal_WS)

model_dict = {
    'Quadrimodal Model': trace_mixture_quad,
    'Bimodal Model (NS)': trace_mixture_bimodal_NS,
    'Bimodal Model (WS)': trace_mixture_bimodal_WS,
    'Unimodal Model': trace_uni
}

# Compare models using WAIC
waic_comparison = az.compare(model_dict, ic="waic")
waic_comparison
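The LOO comparison works the same way. A minimal sketch, assuming the same model_dict and traces sampled with idata_kwargs={'log_likelihood': True} as above:

# Compare models using LOO (leave-one-out cross-validation)
loo_comparison = az.compare(model_dict, ic="loo")
loo_comparison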