May 30, 2020

Mimic Excel's Conditional Formatting in R

The DT package is an interface between R and the JavaScript DataTables library (RStudio DT documentation). In Example 3 (at this page) they show how to heatmap-format a table. This post modifies the example to
  1. format each column individually
  2. shade in green rather than red
  3. use base R syntax rather than piping
  4. omit the extra accoutrements of the displayed table (from the answer to this stackoverflow post), except
  5. include a title.
Here we generate data similar to that in Example 3, but with average values growing by column:
df =
  as.data.frame(
  cbind(round(rnorm(10, mean = 0), 3),
  round(rnorm(10, mean = 4), 3),
  round(rnorm(10, mean = 8), 3),
  round(rnorm(10, mean = 16), 3),
  round(rnorm(10, mean = 32), 3),
  sample(0:1, 10, TRUE)))
Using the code in the example -- modified to green -- the darker values naturally appear in columns V4 and V5.

But that's not what we want.

For each column to have its own scale, simply apply RStudio's algorithm to each column of df in a loop. The trick to notice is that formatStyle wants a datatable object as its first argument, and produces a datatable object as its result. Therefore, start off with a plain-Jane datatable and successively format each column, saving the result each time. Almost like building a ggplot. At the end, view the final result.
# Start with a (relatively) plain, unformatted datatable object
dt <- DT::datatable(df, 
                    options = list(dom = 't', ordering = FALSE),
                    caption = "Example 3 By Column")
# Loop through the columns formatting according to that column's distribution
for (j in seq_along(df)) {
  # Create breaks for shading column values high to low
  brks <- stats::quantile(x <- df[[j]], probs = seq(.05, .95, .05), na.rm = TRUE)
  # Create shades of green for backgrounds
  y <- round(seq(255, 40, length.out = length(brks) + 1), 0)
  clrs <- paste0("rgb(", y, ", 255,", y, ")")
  # Format cells in j-th column
  dt <- DT::formatStyle(dt, j, backgroundColor = DT::styleInterval(brks, clrs))
}
# View the final result
dt
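To see why the per-column loop evens out the shading, here is a minimal base R sketch of what happens inside one pass of the loop. The helpers column_shades and shade_of are hypothetical names made up for illustration, not part of DT, and the findInterval lookup only approximates how styleInterval assigns a colour to a cell.

```r
# Hypothetical helper (not part of DT): compute the breaks and green shades
# that the loop passes to DT::styleInterval() for a single column.
column_shades <- function(x, probs = seq(.05, .95, .05)) {
  brks <- stats::quantile(x, probs = probs, na.rm = TRUE)
  y <- round(seq(255, 40, length.out = length(brks) + 1), 0)
  clrs <- paste0("rgb(", y, ", 255,", y, ")")
  list(brks = unname(brks), clrs = clrs)
}

# Hypothetical lookup: approximate which shade a value would receive,
# treating each interval as closed on the right.
shade_of <- function(value, shades) {
  shades$clrs[findInterval(value, shades$brks, left.open = TRUE) + 1]
}

set.seed(1)
small <- rnorm(10, mean = 0)   # a low-valued column
large <- rnorm(10, mean = 32)  # a high-valued column

s_small <- column_shades(small)
s_large <- column_shades(large)

# styleInterval() needs one more colour than there are breaks
length(s_small$clrs) == length(s_small$brks) + 1

# The largest value in each column lands in the darkest shade,
# regardless of the column's overall level
shade_of(max(small), s_small)
shade_of(max(large), s_large)
```

Because the breaks are quantiles of the column itself, the maximum of the mean-0 column and the maximum of the mean-32 column both come out as "rgb(40, 255,40)".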

Actuaries in the crowd might recognize the image at the top of the post as the table of link ratios from the GenIns dataset in the ChainLadder package. There do not appear to be any distinctive trends in the ratios by age.

May 15, 2020

How to Add a Vignette to a Package in RStudio

R package vignettes are user-friendly ways to demo your package's capabilities. They can also be helpful documentation for your own reference when modifying the package in the future -- "what was I thinking there?"

The easiest way to create a vignette in RStudio is using File | New File | R Markdown | From Template | Package Vignette (HTML). Write your RMarkdown document. {Note: Be sure to copy the "title" of your vignette to where "Vignette Title" shows in the section below.}
vignette: >
  %\VignetteIndexEntry{Vignette Title}

After that, build your package. But if you use the typical procedure for developing a package in RStudio, your vignette will not show up. [Pulling hair out] Let me explain.

My typical procedure for building package myPackage in RStudio is to use two menu items in the Build pane
  • Check, then
  • Install and Restart
Check makes sure everything checks out correctly. This includes processing vignettes in the myPackage/vignettes directory, which you can see in multiple areas of Check's output in that pane:
-  installing the package to build vignettes
v  creating vignettes (7.4s)
* checking files in 'vignettes' ... OK
* checking for unstated dependencies in vignettes ... OK
* checking package vignettes in 'inst/doc' ... OK
* checking re-building of vignette outputs ... OK
When R CMD check succeeded appears at the bottom of that output, click Install and Restart in the Build pane again and myPackage will be recompiled, reloaded, and ready to go ... all except for the vignettes, because when you look for them, they won't be there:
> browseVignettes("myPackage")
No vignettes found by browseVignettes("myPackage")
It turns out, according to RStudio's slightly outdated online documentation, this behavior is to be expected because, presumably, developers don't want to worry about vignettes in the typical code/Install-and-Restart/re-code development cycle [I get that]; to wit:
RStudio’s “Build & reload” does not build vignettes to save time.
{Note: "Build and reload" has been renamed "Install and Restart" in today's RStudio Version 1.2.5042.}

In any event, the correct workflow was pointed out in 2018 by user2554330 in a stackoverflow post:
No. Go to the Build pane, then in the "More" dropdown, choose "Build Source Package". This will create a file with extension .tar.gz, which you can distribute to others. You install it using "Packages | Install | Install from... | Package Archive file". – user2554330 Mar 25 '18 at 19:10
That worked great. Thank you RStudio for software that tends to simply get it done one way or another, and thank you user2554330 for explaining how to get it done for vignettes.
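For those who prefer the console to the menus, the same build-then-install steps can be sketched in code. This is only a sketch, assuming the devtools package is installed; install_with_vignettes is a hypothetical helper name, not something RStudio or devtools provides.

```r
# Hypothetical helper: build a source tarball (vignettes included),
# then install the package from that tarball.
install_with_vignettes <- function(pkg_dir = ".") {
  if (!requireNamespace("devtools", quietly = TRUE)) {
    stop("This sketch assumes the devtools package is installed.")
  }
  # devtools::build() returns the path of the .tar.gz it creates
  tarball <- devtools::build(pkg_dir, vignettes = TRUE)
  # Install from the local archive rather than from a repository
  utils::install.packages(tarball, repos = NULL, type = "source")
  invisible(tarball)
}

# Usage, from the package's root directory:
# install_with_vignettes()
# browseVignettes("myPackage")
```

Nothing runs until the function is called, so it is safe to source during the usual code/Install-and-Restart cycle and call only when the vignettes are wanted.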

Apr 24, 2020

New R Package 'foo' -- Updated


# Navigate to where you want your folder to be located
# Run create_package in 'usethis' package and read below what happens
usethis::create_package("foo")
# Messages show up in the console including this line
# ✔ Opening 'foo/' in new RStudio session

# Add a test environment and add your first test script
setwd("R") # this is where you need to be
usethis::use_test("firsttest") # edit your first test; close
# That will create a file 'test-firsttest.R' in foo\tests\testthat
# LICENSE...not necessary, but Check Package will issue warning without it
#  Open DESCRIPTION -- just click in RStudio
#  Assuming GPL-3 ...
#  Replace
#   License: What license it uses
# with
#   License: GPL-3 | file LICENSE
# Put a LICENSE file in the root. In RStudio, File, New File, Text File
#   "GPL3 License file"
# as the sole contents works for me, or the license here:
# File, Save As, LICENSE
# roxygenise to create help files, run tests, etc.
# Your package is ready to be checked.
# In RStudio Menu, go to Build, Check (as in "check package integrity")
# All should check ok, no errors, no problems.
# Write some code.
# Then roxygen2::roxygenise() and Build, Check
# Repeat
# When ready to build the actual package
# In RStudio Menu, go to Build, Install and Restart

That's it.

Note: Whenever you add new functionality from another package, don't forget to change the DESCRIPTION file -- roxygen can't do that for you.
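One way to catch a forgotten dependency before Check does is to look for it in DESCRIPTION yourself. Below is a minimal base R sketch; is_declared is a hypothetical helper, and the DESCRIPTION contents are made up for illustration. (usethis::use_package() is the usual way to add the entry once you know it is missing.)

```r
# Hypothetical check: is `pkg` listed under Imports in a DESCRIPTION file?
is_declared <- function(pkg, description = "DESCRIPTION") {
  fields <- read.dcf(description, fields = "Imports")
  imports <- trimws(strsplit(fields[1, "Imports"], ",")[[1]])
  # Drop version constraints such as "(>= 0.13)"
  imports <- sub("[[:space:]]*\\(.*\\)$", "", imports)
  pkg %in% imports
}

# A made-up DESCRIPTION, written to a temporary file for the demonstration
desc <- tempfile()
writeLines(c("Package: foo",
             "Version: 0.0.1",
             "Imports:",
             "    stats,",
             "    DT (>= 0.13)"),
           desc)

is_declared("DT", desc)     # listed under Imports
is_declared("dplyr", desc)  # not listed
```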

May 30, 2019

How to start a new package with testing in R

# Navigate where you want your folder to be located
setwd("C:/Users/chief/Documents/Github")
# Assumes usethis is installed
usethis::create_package("foo")
# Say yes or no to next (annoying) popup window, it doesn't matter.
# Add a test environment
setwd("foo")
usethis::use_testthat()
# Add your first test function to at least get something in that folder.
# Go to foo\tests\testthat
# and add this file with a name that begins with 'test_'
context("foo")
library(foo)
test_that("I'm testing something", {
  # do something with your code
  expect_equal(1:4, 1:4)
})
# After writing a function with roxygen comments, roxygenize your package
roxygen2::roxygenise()
# Then click "Check" under RStudio's Build tab
# You may get a warning about "Non-standard license specification".
# To clean that up, see below.
# Keep changing your code and roxygenizing until your package checks out clean.
# Once no errors, click "Install and Restart" next to "Check" and you're done.
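For reference, here is a minimal sketch of "a function with roxygen comments" together with a test for it. The function add_two and its test are hypothetical examples, not part of any real package; in a package the function would live in foo\R and the test in foo\tests\testthat.

```r
#' Add two numbers
#'
#' @param x,y Numeric vectors.
#' @return The element-wise sum of `x` and `y`.
#' @export
#' @examples
#' add_two(1, 2)
add_two <- function(x, y) {
  x + y
}

# The matching test; it only runs here if testthat is installed
if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("add_two sums its arguments", {
    testthat::expect_equal(add_two(1, 2), 3)
  })
}
```

Running roxygen2::roxygenise() on a package containing this function would turn the #' lines into a help file and add the export to NAMESPACE.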

You're Done!

# Don't forget!
# The only thing roxygen doesn't handle is,
# whenever you add new functionality from another package,
# you have to change DESCRIPTION.

License

Assuming you just go GPL:
Open DESCRIPTION
Replace "What license it uses" with
GPL-3 | file LICENSE
and put a file named LICENSE in the same directory as DESCRIPTION. For me, this file content sufficed
Something about GPL
but the GNU community would probably prefer you used the one here

Apr 12, 2017

CAS RPM: Installing & running Rattle in RStudio

At the Ratemaking and Product Management (RPM) seminar of the Casualty Actuarial Society (CAS) in San Diego last month, Linda Brobeck, Peggy Brinkmann, and yours truly gave a concurrent session on a machine learning technique called decision tree analysis. See the DSPA-2 session -- and other sessions -- at this link. Technical issues precluded showing the video of installing and running Rattle within RStudio. Thus this post.

As Peggy notes, installing and running Rattle takes only three "commands" in R:
  install.packages("rattle", dependencies=c('Depends', "Suggests"))
  library(rattle)
  rattle()

Although not necessary, the "Suggests" option above can avoid Rattle's annoying requests for more packages as you click through its GUI; the downside is the time to download and install about 900 packages. Below ("Read more >>") is a link to a video of me running through the three lines above:

Oct 27, 2016

ChainLadder version 0.2.3 available on CRAN

ChainLadder is an R package for actuarial analysis of General / Property & Casualty insurance reserves. Version 0.2.3 on CRAN is the first update in about a year. For the most part, the new version expands upon existing capabilities, as illustrated in the News vignette. Two of the most important are

  • the rownames (origin period) of a Triangle need no longer be numeric -- for example, accident years may be labeled with the beginning date of the period
  • the exposures of a glmReserve analysis may use names to match with origin period
Comments and contributors (!) are always welcome. Please refer to the package's repository.

Oct 16, 2016

October 2016 BARUG Meeting

The October meeting of the San Francisco Bay Area R User Group, held at Santa Clara University, consisted of socializing, an intro, and three speakers. In the intro, host representative Sanjiv Das highlighted the curriculum and advisory board of the school's new MS in Business Analytics program. The first speaker, yours truly, reenacted Sara Silverstein's Benford's Law post using R and insurance industry data (see previous posts in this blog). In light of the Yahoo email scandal that broke that same day, it was posed to attendees whether a similar "law" might be found to discriminate between harmless and harmful emails without regard to message content. The last comment from the audience seemed to capture the evening's temperament: "Snooping is snooping!"

The other two timely talks dealt with election forecasting.

Mac Roach previewed a new online app from Alteryx to predict U.S. election results at the neighborhood level. Equally interesting was Mac's countrywide display, which was the first time I had seen graphical evidence of the increasing polarity of the American electorate, a disturbing trend IMO.

The last speaker, Pete Mohanty, spoke about presidential forecasting using bigKRLS. I was struck by the existence of a closed form solution to the problem. Pete's slides can be found here.

For a brief summary of the meeting, see BARUG's Meetup site.