# Computing probability on test data with Naive Bayes


I am using the following code to predict the output for an SMS text using Naive Bayes:

``````python
from sklearn.naive_bayes import MultinomialNB
mnb=MultinomialNB()
mnb.fit(X,Y)
X_test = np.array(['This is a sample sms'], dtype=object)

X_test_transformed = vec.transform(X_test)

X_test = X_transformed.toarray()

proba=mnb.predict_proba(X_test)
print(proba)
``````

I train the model using the `fit` function on X, Y, and now I want to predict whether the
SMS `This is a sample sms` is spam or not. I am not sure what I am doing wrong,
because the last line should give me a probability, but it gives me the following output:


``````
[[9.99999987e-01 1.30424974e-08]
[9.99996703e-01 3.29712871e-06]
[1.15232279e-22 1.00000000e+00]
...
[9.62666043e-01 3.73339566e-02]
[9.99984562e-01 1.54382674e-05]
[9.66244280e-01 3.37557203e-02]]
``````

Notice that for each row these two numbers add up to 1. For the first row:

9.99999987e-01 = 9.99999987 × 0.1 = 0.999999987

1.30424974e-08 = 1.30424974 × 0.00000001 ≈ 0.000000013
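The same arithmetic can be checked in a few lines of plain Python. This is only a sketch: the values are the first three rows copied from the printed output, and the names `rows`, `first_row_fixed` and `row_sums` are made up for illustration.

```python
import math

# The first three rows, copied from the printed output above.
rows = [
    (9.99999987e-01, 1.30424974e-08),
    (9.99996703e-01, 3.29712871e-06),
    (1.15232279e-22, 1.00000000e+00),
]

# Scientific notation is only a display format; these are the same floats
# written in fixed-point form with nine decimals.
first_row_fixed = [f"{p:.9f}" for p in rows[0]]
print(first_row_fixed)

# Each row is a probability distribution over the two classes, so it sums
# to 1 up to the rounding of the printed digits.
row_sums = [a + b for a, b in rows]
print(all(math.isclose(s, 1.0, abs_tol=1e-8) for s in row_sums))
```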

So the predicted probability that the SMS in the first row belongs to class A (which could be either spam or ham, depending on the rest of the code; the column order follows `mnb.classes_`) is about 0.999999987, and the probability that it belongs to class B is about 0.000000013.

So NB predicted class A for that row with a probability close to 1. If, for example, one row of your output matrix were 0.6, 0.4, you would know that NB predicted class A with probability 0.6 and class B with probability 0.4. This additional information can be used, for example, to apply your own threshold to the predictions.
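As a rough illustration of thresholding, here is a sketch in plain Python. The probability rows are hypothetical, and both the choice of class B as "spam" and the 0.9 cut-off are assumptions made only for this example.

```python
# Hypothetical predict_proba rows: (P(class A), P(class B)) for three messages.
proba = [
    (0.60, 0.40),
    (0.30, 0.70),
    (0.02, 0.98),
]

# Assumption for illustration: class B is "spam", and a message is flagged
# only when the model is at least 90% sure.
SPAM_THRESHOLD = 0.9
flagged = [p_b >= SPAM_THRESHOLD for _, p_b in proba]
print(flagged)

# Default behaviour for comparison: pick whichever class is more probable.
most_likely = ["B" if p_b > p_a else "A" for p_a, p_b in proba]
print(most_likely)
```

The second message shows the difference: the default choice is class B, but it is not flagged under the stricter threshold.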

Edit: If you don’t want these scores, replace `.predict_proba` with `.predict`.
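One more detail that may explain the many rows: `predict_proba` returns one row per input sample, and the snippet in the question passes `X_transformed.toarray()` to it rather than `X_test_transformed`. If `X_transformed` is the vectorized training set (an assumption, since its definition is not shown), the output contains one row per training message. The sketch below uses a tiny made-up corpus and assumes `vec` is a `CountVectorizer`, which the question does not confirm. It passes the transformed test message instead and gets a single row back. The `.toarray()` call is left out because `MultinomialNB` accepts sparse matrices.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny made-up training set, for illustration only.
texts = [
    "win a free prize now",
    "free cash claim your prize sms",
    "is this the sample you wanted",
    "see you at home tonight",
]
labels = ["spam", "spam", "ham", "ham"]

# Assumption: `vec` is a CountVectorizer; the question does not show it.
vec = CountVectorizer()
X = vec.fit_transform(texts)

mnb = MultinomialNB()
mnb.fit(X, labels)

# Transform the new message and pass that matrix to the model.
X_test = np.array(["This is a sample sms"], dtype=object)
X_test_transformed = vec.transform(X_test)

proba = mnb.predict_proba(X_test_transformed)  # shape (1, 2): one row, one message
label = mnb.predict(X_test_transformed)        # shape (1,): most probable class

print(mnb.classes_)  # column order of proba
print(proba)
print(label)
```

Printing `mnb.classes_` next to the probabilities shows which column belongs to which label, which settles the "class A or class B" question for your own data.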