# Delete columns with several zeros


I need to remove all columns with more than 1 zero. I am currently using `df <- df[, colSums(df != 0) > 1]`, but this does not remove all of the columns that contain many zeros. How can this be fixed, or approached a different way?

``````
> tibble(df)
# A tibble: 551 x 1,046
   `aa`   `ab`  `ac`  `ad`  `ae`  `af`
  <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
1 32458  65068 32654     0 43115  1450
2 19387  38457 19447     0 22523   958
3 42690  85105 43247     0 14156  1088
4 62290 123325 61878 58422 36300  1145
``````
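The condition in the question counts non-zero entries rather than zeros: `colSums(df != 0) > 1` keeps every column that has at least two non-zero values, however many zeros it also contains. A minimal sketch of the difference, using a small made-up data frame (`toy` and its columns are hypothetical, not the asker's data):

```r
# Hypothetical toy data: column b has two zeros, column c has one
toy <- data.frame(a = c(1, 2, 3, 4),
                  b = c(0, 0, 5, 6),
                  c = c(0, 7, 8, 9))

# Original condition: counts NON-zero values, so b (two non-zeros) survives
kept_original <- toy[, colSums(toy != 0) > 1, drop = FALSE]
names(kept_original)

# Counting zeros instead: keep columns with at most one zero
kept_fixed <- toy[, colSums(toy == 0) <= 1, drop = FALSE]
names(kept_fixed)
```

With the original condition all three columns are kept; counting zeros drops `b`, which is the stated goal.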

We can use `select` with `where` to keep only the columns in which the proportion of zeros (the `mean` of the logical vector `. %in% 0`) is less than 0.7:

``````
library(dplyr)
df %>%
  select(where(~ mean(. %in% 0) < 0.7))
``````

Output:

``````
     aa     ab    ac    ae   af
1 32458  65068 32654 43115 1450
2 19387  38457 19447 22523  958
3 42690  85105 43247 14156 1088
4 62290 123325 61878 36300 1145
``````
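To see why `ad` is the only column dropped, it may help to print the fraction of zeros per column. This sketch rebuilds the sample data from the `data` section of this answer and uses base R's `sapply`; the name `zero_frac` is only illustrative:

```r
# Sample data, copied from the data section of this answer
df <- data.frame(aa = c(32458L, 19387L, 42690L, 62290L),
                 ab = c(65068L, 38457L, 85105L, 123325L),
                 ac = c(32654L, 19447L, 43247L, 61878L),
                 ad = c(0L, 0L, 0L, 58422L),
                 ae = c(43115L, 22523L, 14156L, 36300L),
                 af = c(1450L, 958L, 1088L, 1145L))

# Fraction of zeros in each column; `x %in% 0` is TRUE where x is 0
zero_frac <- sapply(df, function(x) mean(x %in% 0))
zero_frac
# ad is 3/4 = 0.75, which is not below 0.7; every other column is 0
```

The 0.7 cutoff is a choice, not a rule: any threshold between 0 and 0.75 gives the same result on this four-row sample.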

If the goal is to remove columns with more than one zero, count the zeros with `sum` instead:

``````
df %>%
  select(where(~ sum(. %in% 0) < 2))
``````

Output:

``````
     aa     ab    ac    ae   af
1 32458  65068 32654 43115 1450
2 19387  38457 19447 22523  958
3 42690  85105 43247 14156 1088
4 62290 123325 61878 36300 1145
``````

Or a similar option in base R, using `Filter`:

``````
Filter(function(x) mean(x %in% 0) < 0.7, df)
     aa     ab    ac    ae   af
1 32458  65068 32654 43115 1450
2 19387  38457 19447 22523  958
3 42690  85105 43247 14156 1088
4 62290 123325 61878 36300 1145
``````

or, using `sum` to count the zeros:

``````
Filter(function(x) sum(x %in% 0) < 2, df)
``````
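If the cutoff needs to vary, the count-based filter can be wrapped in a small function. This is only a sketch: `drop_zero_cols` and its `max_zeros` argument are hypothetical names, not part of base R or dplyr, and `toy` is made-up data.

```r
# Hypothetical helper: keep columns with at most `max_zeros` zeros
drop_zero_cols <- function(data, max_zeros = 1) {
  Filter(function(x) sum(x %in% 0) <= max_zeros, data)
}

# Small made-up example: b has two zeros, c has one
toy <- data.frame(a = c(1, 2, 3, 4),
                  b = c(0, 0, 5, 6),
                  c = c(0, 7, 8, 9))

default_kept <- drop_zero_cols(toy)                 # at most one zero
looser_kept  <- drop_zero_cols(toy, max_zeros = 2)  # looser cutoff keeps b too
names(default_kept)
names(looser_kept)
```

Because `x %in% 0` never returns `NA`, the helper treats missing values as non-zero.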

### data

``````
df <- structure(list(
  aa = c(32458L, 19387L, 42690L, 62290L),
  ab = c(65068L, 38457L, 85105L, 123325L),
  ac = c(32654L, 19447L, 43247L, 61878L),
  ad = c(0L, 0L, 0L, 58422L),
  ae = c(43115L, 22523L, 14156L, 36300L),
  af = c(1450L, 958L, 1088L, 1145L)),
  class = "data.frame", row.names = c("1", "2", "3", "4"))
``````

You can also try `colMeans`, which computes the proportion of zeros in every column at once:

``````
df[colMeans(df == 0) < 0.7]
``````
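One difference worth noting between this and the `%in%` versions: `df == 0` gives `NA` for missing values, so `colMeans(df == 0)` is `NA` for any column that contains an `NA`, whereas `x %in% 0` treats `NA` as not zero. A sketch with made-up data (`toy` is hypothetical), assuming missing values should simply be ignored via `na.rm = TRUE`:

```r
# Hypothetical data with missing values
toy <- data.frame(p = c(0, NA, 3, 4),
                  q = c(0, 0, 0, NA))

plain   <- colMeans(toy == 0)                       # NA for both columns
ignored <- colMeans(toy == 0, na.rm = TRUE)         # share of non-missing values
by_rows <- sapply(toy, function(x) mean(x %in% 0))  # share of all rows

# Keep columns whose zero share (ignoring NA) is below 0.7
kept <- toy[colMeans(toy == 0, na.rm = TRUE) < 0.7]
names(kept)
```

Which denominator is appropriate (non-missing values or all rows) depends on how missing values should count in your data.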