Seaborn Grouped Violin Plot WITHOUT pandas
For reasons I won’t get into, I need to make violin plots without using a pandas dataframe. For example, I have the following ndarray and categories:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
data = np.random.randn(5, 3)
category = np.array(["yes", "no", "no", "no", "yes", "yes","yes", "no", "yes", "yes", "yes", "no", "no", "no", "no"])
ax = sns.violinplot(data = data)
plt.show()
This results in an ungrouped violin plot.
However, I’d like to use the categorical data to make a grouped violin plot:
ax = sns.violinplot(data = data, x = category)
plt.show()
This raises AttributeError: 'numpy.ndarray' object has no attribute 'get'.
Is there any way around this without pandas?
- Do not use the data parameter if you are passing separate numpy arrays for x, y and hue.
- From y, you can create an array of column indices with np.nonzero (this relies on y containing no values that are exactly zero).
- Make sure all of your numpy arrays are one-dimensional, using .flatten(). For example, I flatten your array of random floats from a shape of (5, 3) to (15,). Otherwise, you will get an error, since the arrays have different shapes and Seaborn has no way to work out how they line up, as it can with a pandas dataframe.
Likewise, if you pass three (5, 3) arrays to x, y and hue, Seaborn won’t know what to do. So you must either a) flatten all arrays so that they have the same shape, (15,), or b) use a pandas dataframe.
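The flattening step can be checked with NumPy alone. The following is a minimal sketch using the same (5, 3) shape as the question; the variable names values and flat are only illustrative.

```python
import numpy as np

values = np.random.randn(5, 3)   # 2-D array, as in the question
flat = values.flatten()          # 1-D copy, in row-major (C) order

print(values.shape)  # (5, 3)
print(flat.shape)    # (15,)

# Row-major order means element [i, j] ends up at position i * 3 + j,
# so consecutive entries of flat cycle through columns 0, 1, 2.
assert flat[1 * 3 + 2] == values[1, 2]
```

Because the flattened values cycle through the columns, the category array from the question (15 entries) lines up with them element by element.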
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
y = np.random.randn(5, 3)
x = np.nonzero(y)[-1]
y = y.flatten()
hue = np.array(["yes", "no", "no", "no", "yes", "yes","yes", "no", "yes", "yes", "yes", "no", "no", "no", "no"])
sns.violinplot(x=x, y=y, hue=hue)
print(x, '\n\n', y, '\n\n', hue)
[0 1 2 0 1 2 0 1 2 0 1 2 0 1 2]
[-0.28618123 -1.18132595 0.70535902 0.90685532 -1.27258432 0.90417094
3.03506025 0.99796779 0.20247628 0.43226169 0.25005372 -0.9923336
-0.43102785 -0.17117549 -0.16147393]
['yes' 'no' 'no' 'no' 'yes' 'yes' 'yes' 'no' 'yes' 'yes' 'yes' 'no' 'no'
'no' 'no']
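One caveat about the np.nonzero step: it only returns positions of non-zero entries, so it would give too few indices if the data ever contained an exact 0.0. A possible alternative, sketched here under the assumption that the input is a 2-D array whose columns are the groups, builds the column labels from the shape instead. The helper name column_labels is hypothetical, not part of NumPy or Seaborn.

```python
import numpy as np

def column_labels(arr):
    """Column index of every element of a 2-D array, in flattened (row-major) order."""
    n_rows, n_cols = arr.shape
    # Repeat 0..n_cols-1 once per row: [0, 1, 2, 0, 1, 2, ...]
    return np.tile(np.arange(n_cols), n_rows)

# Worst case for np.nonzero: every value is exactly zero.
y2d = np.zeros((5, 3))
x = column_labels(y2d)
y = y2d.flatten()

print(x)                           # [0 1 2 0 1 2 0 1 2 0 1 2 0 1 2]
print(np.nonzero(y2d)[-1].size)    # 0, i.e. no indices at all
```

The resulting x and y can be passed to sns.violinplot(x=x, y=y, hue=hue) exactly as in the code above.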