# Take max with condition

Solution for Take max with condition
is Given Below:

I have a data frame with a transaction date `txn_date` and 6 more date columns (`date_1`, `date_2`, etc) that can take values after or before the `txn_date` for a given observation.

I need to take the max of the `date_` s for a given observation but only taking the max amongst the dates that are less than the `txn_date`.

Date frame looks like this (all columns as.Date in the actual date frame but just numeric here for ease)

``````dat1 <- read.table(text = "txn_date date_0 date_1 date_2 date_3 date_4 date_5
50 30 40 50 60 70 75
60 30 40 50 60 70 75", header = TRUE)
``````

And ideally it would look like this:

``````dat1 <- read.table(text = "txn_date date_0 date_1 date_2 date_3 date_4 date_5 max
50 30 40 50 60 70 75 40
60 30 40 50 60 70 75 50", header = TRUE)
``````

Another base R option using `max.col`

``````dat1\$Max <- dat1[-1][
cbind(
1:nrow(dat1),
max.col(dat1[-1] < dat1[, 1], ties.method = "last")
)
]
``````

gives

``````> dat1
txn_date date_0 date_1 date_2 date_3 date_4 date_5 Max
1       50     30     40     50     60     70     75  40
2       60     30     40     50     60     70     75  50
``````

``````dat1\$max <- apply(dat1, 1, function(x) max(x[-1][x[-1] < x[1]], na.rm = TRUE))
dat1\$max
[1] 40 50
``````

If there are rows that doesn’t meet the condition, use an if/else

``````apply(dat1, 1, function(x) if(any(x[-1] < x[1])) max(x[-1][x[-1] < x[1]], na.rm = TRUE) else NA_real_)
[1] 40 50
``````

We could use apply in base R

``````dat1\$max <- apply(dat1, 1, function(x) max(x[-1][x[-1] < x[1]], na.rm = TRUE))
dat1\$max
[1] 40 50
``````

If there are rows that doesn’t meet the condition, use an if/else

``````apply(dat1, 1, function(x) if(any(x[-1] < x[1])) max(x[-1][x[-1] < x[1]], na.rm = TRUE) else NA_real_)
[1] 40 50
``````

Or use a much faster version of apply with dapply (from collapse)

``````library(collapse)
dapply(dat1, MARGIN = 1, FUN = function(x)
max(x[-1][x[-1] < x[1]], na.rm = TRUE))
#[1] 40 50
``````

Or make it faster with a vectorized option

``````do.call(pmax, c(replace(dat1[-1], dat1[-1] >= dat1[[1]], NA), na.rm = TRUE))
[1] 40 50
``````

Or using tidyverse

``````library(dplyr)
library(purrr)
dat1 %>%
mutate(max = pmap_dbl(., ~ {
tmp <- c(...)
max(tmp[-1][tmp[-1] < tmp[1]], na.rm = TRUE)}))
``````

-outupt

``````   txn_date date_0 date_1 date_2 date_3 date_4 date_5 max
1       50     30     40     50     60     70     75  40
2       60     30     40     50     60     70     75  50
``````

We can use `dplyr` and `purrr` with `rowwise()`and `max`, `keep`ing only columns with values < txn_date to feed to `max`

``````library(dplyr)
library(purrr)

dat1 %>% rowwise() %>%
mutate(max=max(keep(c_across(date_0:date_5), ~ .x < txn_date), na.rm=TRUE))

# A tibble: 2 x 8
# Rowwise:
txn_date date_0 date_1 date_2 date_3 date_4 date_5   max
<int>  <int>  <int>  <int>  <int>  <int>  <int> <int>
1       50     30     40     50     60     70     75    40
2       60     30     40     50     60     70     75    50
``````