Supervised Learning Using the Bayesian Decision Rule
Python Functions for Bayesian Learning (COSC 522 Project)
By Harshvardhan
September 7, 2021
In this project, I applied Bayesian decision theory to a classification problem. The datasets come from Ripley’s Pattern Recognition and Neural Networks. The first dataset has two features and a balanced class distribution. The second is the Pima Indians diabetes dataset with seven features, where diabetic patients are far outnumbered by normal patients.
Code to perform these calculations is in this GitHub repository.
The heart of this project lies in three functions, all sharing the interface sketched just after this list.
- Euclidean Classifier
- Mahalanobis Classifier
- Bayesian Quadratic Classifier
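All three take the training features and labels, the test features and labels, and the prior probability of class 0, and return the predictions, three accuracies, and the elapsed time. So the calls below are self-contained, here is a minimal data-setup sketch; the file names synth.tr and synth.te and their column layout are assumptions for illustration, not the repository’s actual loading code:

import numpy as np
import pandas as pd

# hypothetical file names: Ripley's synthetic data as whitespace-separated
# train/test files with two feature columns and a 0/1 class label
train = pd.read_csv("synth.tr", sep=r"\s+")
test = pd.read_csv("synth.te", sep=r"\s+")

xtrain = train.iloc[:, :2].to_numpy()  # two features
ytrain = train.iloc[:, 2].to_numpy()   # class labels (0 or 1)
xtest = test.iloc[:, :2].to_numpy()
ytest = test.iloc[:, 2].to_numpy()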
Euclidean Classifier
- The features are assumed to be statistically independent of each other (strictly speaking, uncorrelated) and to have the same variance.
- Geometrically, the samples fall in equal-radii hyperspherical clusters.
- The decision boundary for a two-class problem is a hyperplane in \(d\) dimensions.
$$ \Sigma = \begin{bmatrix} \sigma^2 & \dots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \dots & \sigma^2 \end{bmatrix} = \sigma^2 I. $$
$$ g_i(\vec{x}) = - \frac{\|\vec{x} - \vec{\mu}_i\|^2}{2\sigma^2} + \ln{P(\omega_i)}. $$
Python Function
import numpy as np
import time as t

def euclid_classifier(xtrain, ytrain, xtest, ytest, pw):
    t1 = t.time()
    pw0 = pw       # prior probability of class 0
    pw1 = 1 - pw   # prior probability of class 1
    nn, nf = xtest.shape
    # for class 0
    arr = xtrain[ytrain == 0]
    covs0 = np.cov(np.transpose(arr))
    means0 = np.mean(arr, axis=0)
    # for class 1
    arr = xtrain[ytrain == 1]
    covs1 = np.cov(np.transpose(arr))
    means1 = np.mean(arr, axis=0)
    # for the Euclidean case, a single variance: average the two covariance
    # matrices, then average the diagonal
    covavg = (covs0 + covs1) / 2
    avg_var = np.mean(np.diagonal(covavg))
    # initialising yhat array
    yhat = np.ones(len(ytest))
    for i in range(len(ytest)):
        # for class 0
        d = np.dot(xtest[i] - means0, xtest[i] - means0)
        g0 = -d / (2 * avg_var) + np.log(pw0)
        # for class 1
        d = np.dot(xtest[i] - means1, xtest[i] - means1)
        g1 = -d / (2 * avg_var) + np.log(pw1)
        # if g0 > g1, then sample i belongs to class 0, else class 1
        if g0 > g1:
            yhat[i] = 0
    overall_acc = np.sum(yhat == ytest) / len(ytest)
    class0_acc = np.sum(yhat[ytest == 0] == 0) / np.sum(ytest == 0)
    class1_acc = np.sum(yhat[ytest == 1] == 1) / np.sum(ytest == 1)
    t2 = t.time()
    tt = t2 - t1
    return yhat, overall_acc, class0_acc, class1_acc, tt
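A quick usage sketch, with the arrays from the setup above and an assumed 50/50 prior:

yhat, acc, acc0, acc1, tt = euclid_classifier(xtrain, ytrain, xtest, ytest, pw=0.5)
print(f"Euclidean: overall {acc:.3f}, class 0 {acc0:.3f}, class 1 {acc1:.3f}, {tt:.4f}s")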
Mahalanobis Classifier
- The covariance matrices of all classes are identical, but not necessarily \(\sigma^2\) times the identity; the variance is constant across classes.
- Geometrically, the samples fall in hyperellipsoidal clusters.
- The decision boundary is a hyperplane in \(d\) dimensions.
$$ g_i(\vec{x}) = - \frac{1}{2}(\vec{x} - \vec{\mu}_i)'\Sigma^{-1} (\vec{x} - \vec{\mu}_i) + \ln{P(\omega_i)}, $$
where \(\Sigma_i = \Sigma\) for all classes.
Python Function
def maha_classifier(xtrain, ytrain, xtest, ytest, pw):
    t1 = t.time()
    pw0 = pw
    pw1 = 1 - pw
    nn, nf = xtest.shape
    # for class 0
    arr = xtrain[ytrain == 0]
    covs0 = np.cov(np.transpose(arr))
    means0 = np.mean(arr, axis=0)
    # for class 1
    arr = xtrain[ytrain == 1]
    covs1 = np.cov(np.transpose(arr))
    means1 = np.mean(arr, axis=0)
    # for the Mahalanobis distance, the average of the two covariance matrices is used
    covavg = (covs0 + covs1) / 2
    inv_cov = np.linalg.inv(covavg)   # invert once, outside the loop
    # initialising yhat array
    yhat = np.ones(len(ytest))
    for i in range(len(ytest)):
        # for class 0; the 1/2 factor matches the discriminant function above
        d = np.matmul(np.matmul(xtest[i] - means0, inv_cov), xtest[i] - means0)
        g0 = -0.5 * d + np.log(pw0)
        # for class 1
        d = np.matmul(np.matmul(xtest[i] - means1, inv_cov), xtest[i] - means1)
        g1 = -0.5 * d + np.log(pw1)
        # if g0 > g1, then sample i belongs to class 0, else class 1
        if g0 > g1:
            yhat[i] = 0
    overall_acc = np.sum(yhat == ytest) / len(ytest)
    class0_acc = np.sum(yhat[ytest == 0] == 0) / np.sum(ytest == 0)
    class1_acc = np.sum(yhat[ytest == 1] == 1) / np.sum(ytest == 1)
    t2 = t.time()
    tt = t2 - t1
    return yhat, overall_acc, class0_acc, class1_acc, tt
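Called the same way, again assuming the setup arrays and equal priors:

yhat, acc, acc0, acc1, tt = maha_classifier(xtrain, ytrain, xtest, ytest, pw=0.5)

Averaging the two covariance matrices is a simple pooled estimate; for unbalanced classes like the Pima data, weighting each matrix by its class’s sample size would be the more standard pooling.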
Bayesian (Quadratic) Classifier
- The covariance matrix differs from class to class.
- It is a quadratic classifier: the decision boundary is a hyperquadric rather than a hyperplane.
$$ g_i(\vec{x}) = -\frac{1}{2} (\vec{x} - \vec{\mu}_i)'\Sigma_i^{-1} (\vec{x} - \vec{\mu}_i) - \frac{1}{2} \ln{|\Sigma_i|} + \ln{P(\omega_i)}. $$
Python Function
def bayes_classifier(xtrain, ytrain, xtest, ytest, pw):
    t1 = t.time()
    pw0 = pw
    pw1 = 1 - pw
    nn, nf = xtest.shape
    # for class 0
    arr = xtrain[ytrain == 0]
    covs0 = np.cov(np.transpose(arr))
    means0 = np.mean(arr, axis=0)
    # for class 1
    arr = xtrain[ytrain == 1]
    covs1 = np.cov(np.transpose(arr))
    means1 = np.mean(arr, axis=0)
    # each class keeps its own covariance matrix; invert once, outside the loop
    inv0 = np.linalg.inv(covs0)
    inv1 = np.linalg.inv(covs1)
    # initialising yhat array
    yhat = np.ones(len(ytest))
    for i in range(len(ytest)):
        # for class 0
        d = np.matmul(np.matmul(xtest[i] - means0, inv0), xtest[i] - means0) * -0.5
        g0 = -0.5 * np.log(np.linalg.det(covs0)) + d + np.log(pw0)
        # for class 1
        d = np.matmul(np.matmul(xtest[i] - means1, inv1), xtest[i] - means1) * -0.5
        g1 = -0.5 * np.log(np.linalg.det(covs1)) + d + np.log(pw1)
        # if g0 > g1, then sample i belongs to class 0, else class 1
        if g0 > g1:
            yhat[i] = 0
    overall_acc = np.sum(yhat == ytest) / len(ytest)
    class0_acc = np.sum(yhat[ytest == 0] == 0) / np.sum(ytest == 0)
    class1_acc = np.sum(yhat[ytest == 1] == 1) / np.sum(ytest == 1)
    t2 = t.time()
    tt = t2 - t1
    return yhat, overall_acc, class0_acc, class1_acc, tt
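To compare all three on one split, a small driver loop (assuming the setup arrays above; pw=0.5 is an illustrative prior, not a tuned one):

for name, clf in [("Euclidean", euclid_classifier),
                  ("Mahalanobis", maha_classifier),
                  ("Bayes (quadratic)", bayes_classifier)]:
    yhat, acc, acc0, acc1, tt = clf(xtrain, ytrain, xtest, ytest, pw=0.5)
    print(f"{name}: overall {acc:.3f}, class 0 {acc0:.3f}, class 1 {acc1:.3f}, {tt:.4f}s")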