How to drop all the 1’s in a correlation matrix
I’m trying to change/eliminate the 1’s that run diagonally in a correlation matrix so that when I take the average of the rows of the correlation matrix, the 1s don’t affect the mean of each of the rows.
Let’s say I have the dataset,
A B C D E F
0 45 100 58 78 80 35
1 49 80 80 104 58 20
2 49 80 65 78 79 20
3 65 100 80 159 83 45
4 65 123 78 115 100 50
5 45 122 84 100 85 20
6 60 120 78 44 105 55
7 62 80 109 48 78 25
8 63 39 85 65 79 25
9 80 52 100 50 103 30
10 80 43 78 64 120 60
11 60 60 130 43 135 45
12 80 50 111 59 115 50
13 82 65 130 63 78 90
14 83 58 85 80 45 80
15 100 64 100 65 30 70
When I run dfcorr = df.corr() and display dfcorr, I get
A B C D E F
A 1.000000 0.842125 0.834808 0.832773 0.844158 0.806787
B 0.842125 1.000000 0.847606 0.907595 0.818668 0.863645
C 0.834808 0.847606 1.000000 0.718199 0.804671 0.582033
D 0.832773 0.907595 0.718199 1.000000 0.884236 0.878421
E 0.844158 0.818668 0.804671 0.884236 1.000000 0.718668
F 0.806787 0.863645 0.582033 0.878421 0.718668 1.000000
I want all the 1’s to be dropped so that if I want to take the mean of each of the rows, the 1’s won’t affect them.
If the correlation matrix is a pandas DataFrame, you can replace the 1’s with NaN and then take the row means, since mean() skips NaN by default:
import pandas as pd
df = pd.DataFrame({'c1': [1, 0, 0.3, 0.4], 'c2': [0.2, 1, 0.6, 0.4], 'c3': [0.1, 0, 1, 0.4], 'c4': [0.7, 0.2, 0.2, 1]})
# where() keeps values that are not 1 and turns the rest into NaN
df.where(df != 1).mean(axis=1)
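Note that this masks every value equal to 1, not just the diagonal, so it only gives the right result if the only 1’s in the matrix are the ones on the diagonal.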
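If an off-diagonal correlation can also be exactly 1, one option is to mask by position instead of by value. The following is a minimal sketch of that idea, assuming a square correlation matrix. The function name row_means_without_diagonal and the small sample matrix are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd


def row_means_without_diagonal(corr: pd.DataFrame) -> pd.Series:
    """Mean of each row of a square matrix, ignoring the diagonal entries."""
    # Boolean mask that is True only at the diagonal positions.
    diagonal = np.eye(len(corr), dtype=bool)
    # mask() replaces the True positions with NaN; mean() skips NaN by default.
    return corr.mask(diagonal).mean(axis=1)


# Made-up matrix with an off-diagonal 1 (A vs. B) to show that it is kept.
corr = pd.DataFrame(
    [[1.0, 1.0, 0.5],
     [1.0, 1.0, 0.2],
     [0.5, 0.2, 1.0]],
    index=list("ABC"),
    columns=list("ABC"),
)
means = row_means_without_diagonal(corr)
print(means)
```

With the question's data this would be called as row_means_without_diagonal(df.corr()). Because the mask depends only on position, it also does not rely on the diagonal entries comparing exactly equal to 1.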