# Find non-matching values across multiple columns in R

Solution for Find non-matching values across multiple columns in R
is Given Below:

I have a table similar to this:

``````library(tidyverse)
tribble(~PatientID, ~A1.recipient, ~A2.recipient, ~DonorID, ~A1.donor, ~A2.donor,
#--|---|---|---|---|---
101, "1", "3", 201, "3", "1",
102, "1", "2", 202, "2", "68",
103, "68", "69", 203, "2", "3",
104, "1", "2", 204, NA, NA,
)
``````

I would like to use R to find values in the two donor columns that are not in either of the two patient columns. My desired output would look something like this:

PatientID A1.recipient A2.recipient DonorID A1.donor A2.donor A1.mismatch A2.mismatch
101 1 3 201 3 1 NA NA
102 1 2 202 2 68 68 NA
103 68 69 203 2 3 2 3
104 1 2 204 NA NA NA NA

I think I need to use the `across` function, but I’m not sure how to do it. Solutions written in dplyr would be great, as that’s what I’m trying to learn.

``````df %>%
mutate(A1.mismatch = case_when(A1.donor == A1.recipient | A1.donor == A2.recipient ~ NA_integer_,
TRUE ~ as.integer(A1.donor))) %>%
mutate(A2.mismatch = case_when(A2.donor == A1.recipient | A2.donor == A2.recipient ~ NA_integer_,
TRUE ~ as.integer(A2.donor)))
``````

Using `base R`

``````nm1 <-  sub("donor", "mismatch", names(df1)[endsWith(names(df1), 'donor')])

df1[nm1] <- lapply(df1[endsWith(names(df1), 'donor')], function(x)
replace(x, Reduce(`|`, lapply(df1[endsWith(names(df1), 'recipient')],
function(y) x == y)), NA))
``````

-output

`````` df1
# A tibble: 4 x 8
PatientID A1.recipient A2.recipient DonorID A1.donor A2.donor A1.mismatch A2.mismatch
<dbl> <chr>        <chr>          <dbl> <chr>    <chr>    <chr>       <chr>
1       101 1            3                201 3        1        <NA>        <NA>
2       102 1            2                202 2        68       <NA>        68
3       103 68           69               203 2        3        2           3
4       104 1            2                204 <NA>     <NA>     <NA>        <NA>
``````

We want to extract the ‘donor’ value in all `case_when` it is not `%in%` the “recipient” values. We can call a function `across` the columns that `ends_with` “donor”, then compare the values of each `.x` with each of the columns that `ends_with` “recipient”.

Your expected output is likely wrong, I suspect you have exchanged A1 for A2 in the ‘mismatch’ columns.

This will work with any number of donor or recipient columns.

``````library(dplyr)

df %>% rowwise() %>%
mutate(across(ends_with('donor'), ~case_when(!.x %in% c_across(ends_with('recipient')) ~ .x), .names="{.col}_mismatch"))

# A tibble: 4 x 8
# Rowwise:
PatientID A1.recipient A2.recipient DonorID A1.donor A2.donor A1.donor_mismatch A2.donor_mismatch
<dbl> <chr>        <chr>          <dbl> <chr>    <chr>    <chr>             <chr>
1       101 1            3                201 3        1        NA                NA
2       102 1            2                202 2        68       NA                68
3       103 68           69               203 2        3        2                 3
4       104 1            2                204 NA       NA       NA                NA
``````