# R to Latex summary table for categorical variables by year

Solution for R to Latex summary table for categorical variables by year
is Given Below:

``````year <- c(2000,2000,2000,2001,2001,2001)
gender <- c("F","M","M","F","F","M")
``````

I would like to make a summary table but minimize manual codes and automates the process as much as possible. In my project, I have 170 variables to summarise.
I tried tidyverse group by but didn’t get the result that I want.
I will use xtable to move to latex file. (I tried add.to.row but failed to add “Gender” on the first line.)

Here is the outcome that I want. You can use `pivot_longer` and `summarise` to generate the summary values.

``````library(tidyverse)

df %>%
pivot_longer(-year) %>%
group_by(year, name, value) %>%
summarise(n = n()) %>%
mutate(prop = round(n / sum(n), 3) * 100)

# A tibble: 10 x 5
# Groups:   year, name 
year name   value     n  prop
<dbl> <chr>  <chr> <int> <dbl>
1  2000 gender F         1  33.3
2  2000 gender M         2  66.7
3  2000 grade  A         1  33.3
4  2000 grade  B         1  33.3
5  2000 grade  C         1  33.3
6  2001 gender F         2  66.7
7  2001 gender M         1  33.3
8  2001 grade  A         1  33.3
9  2001 grade  B         1  33.3
10  2001 grade  C         1  33.3
``````

You can also get closer to your desired table by joining values in a formatted string, and then using `pivot_wider`:

``````df %>%
pivot_longer(-year) %>%
group_by(year, name, value) %>%
summarise(n = n()) %>%
mutate(prop = round(n / sum(n), 3) * 100,
summary_str = glue::glue("{n}({prop}%)")) %>%
pivot_wider(id_cols = c(name, value), names_from = "year",
values_from = "summary_str")

name   value `2000`   `2001`
<chr>  <chr> <glue>   <glue>
1 gender F     1(33.3%) 2(66.7%)
2 gender M     2(66.7%) 1(33.3%)
``````

I mentioned in my comment that you can do this in the `tables` package. Here’s an example:

``````year <- c(2000,2000,2000,2001,2001,2001)
gender <- c("F","M","M","F","F","M")

# Our table treats the columns as factors, so save them that way
# I'll change the names to the way we'd like them to appear.

df <- data.frame(Year = factor(year),
Gender = factor(gender),

library(tables)
# write a small function to format the percent values the way you want.
fmtPercent <- function(x, digits = 1) paste0("(", format(x, digits = digits), "\%)")

# Calculate the table object.
data = df)

# Print it as text.
tab
#>
#>           Year
#>           2000         2001
#>  Gender F 1    (33\%) 2    (67\%)
#>         M 2    (67\%) 1    (33\%)
#>  Grade  A 1    (33\%) 1    (33\%)
#>         B 1    (33\%) 1    (33\%)
#>         C 1    (33\%) 1    (33\%)
``````

Created on 2021-07-31 by the reprex package (v2.0.0)

The reason I added the escapes before the percent sign is so that it would print properly in LaTeX. In PDF output in an R Markdown document it looks like this: 