Showing posts with label R. Show all posts
Showing posts with label R. Show all posts

May 30, 2020

Mimic Excel's Conditional Formatting in R


The DT package is an interface between R and the JavaScript DataTables library (RStudio DT documentation). In Example 3 (at this page) they show how to heatmap-format a table. This post modifies the example to
  1. format each column individually
  2. shade in green rather than red
  3. use base R syntax rather than piping
  4. omit the extra accoutrements of the displayed table (from the answer to this stackoverflow post), except
  5. include a title.
Here we generate data similar to that in Example 3, but with average values growing by column
set.seed(12345)
df = as.data.frame(
  cbind(round(rnorm(10, mean = 0), 3), 
  round(rnorm(10, mean = 4), 3), 
  round(rnorm(10, mean = 8), 3), 
  round(rnorm(10, mean = 16), 3), 
  round(rnorm(10, mean = 32), 3), 
  sample(0:1, 10, TRUE)))
Using the code in the example -- modified to green -- the darker values naturally appear in columns V4 and V5.


But that's not what we want.

For each column to have it's own scale, simply apply RStudio's algorithm to each column of df in a loop. The trick to notice is that formatStyle wants a datatable object as its first argument, and produces a datatable object as its result. Therefore, start off with a plain-Jane datatable and successively format each column, saving the result each time. Almost like building a ggplot. At the end, view the final result.
# Start with a (relatively) plain, unformatted datatable object
dt <- DT::datatable(df, 
                    options = list(dom = 't', ordering = FALSE),
                    caption = "Example 3 By Column")
# Loop through the columns formatting according to that column's distribution
for (j in seq_along(df)) {
  # Create breaks for shading column values high to low
  brks <- stats::quantile(x <- df[[j]], probs = seq(.05, .95, .05), na.rm = TRUE)
  # Create shades of green for backgrounds
  y <- round(seq(255, 40, length.out = length(brks) + 1), 0)
  clrs <- paste0("rgb(", y, ", 255,", y, ")")
  # Format cells in j-th column
  dt <- DT::formatStyle(dt, j, backgroundColor = DT::styleInterval(brks, clrs))
}
dt

Actuaries in the crowd might recognize the image at the top of the post as the table of link ratios from the GenIns dataset in the ChainLadder package. There do not appear to be any distinctive trends in the ratios by age.




Aug 30, 2016

Benford's Law Graphed in R

Using R to replicate Sara Silverstein's post at BusinessInsider.com


A first-year student near and dear to my heart at the Kellogg School of Management thought I would be interested in this Business Insider story by Sara Silverstein on Benford’s Law. After sitting through the requisite ad, I became engrossed in Ms. Silverstein’s talk about what that law theoretically is and how it can be applied in financial forensics.

I thought I would try duplicating the demonstration in R.1 This gave me a chance to compare and contrast the generation of combined bar- and line-plots using base R and ggplot2. It also gave me an opportunity to learn how to post RMarkdown output to blogger.

Using base R

Define the Benford Law function using log base 10 and plot the predicted values.
benlaw <- function(d) log10(1 + 1 / d)
digits <- 1:9
baseBarplot <- barplot(benlaw(digits), names.arg = digits, xlab = "First Digit", 
                       ylim = c(0, .35))
Remarks:
  • That was easy!