The Applied Guide to R Lists: From Basics to Advanced Usage

Lists are the often-overlooked workhorse of R data structures. Beyond basic vectors and data frames lie flexible lists, ready to organize your workflows.

In this comprehensive guide, we'll cover creating, wrangling, and applying lists across your R programming with hands-on examples. Buckle up, friend: this will take your R skills to the next level!

Why Care About Lists?

Before jumping into syntax, let's motivate why you should even care about lists:

1. Flexible data storage that handles varied data types and structures in one object. Tigers and lambs cozying up next to each other!

2. Critical for storing complex model objects and function outputs. Ever peek under the hood of what many R functions return? Surprise – lists aplenty!

3. Pass configuration parameters easily when calling functions. Rather than endless individual arguments, you can bundle them tidily into a single list.

4. Represent hierarchical and irregular data like JSON documents or ragged arrays lacking matrix conformity.

Let's demonstrate the list data structure with some motivating examples…

[Figure: Lists grouped data]

Here we store mixed personal data including identifiers, contacts, appointments, and even medical history together in one list object per person. All tidy and easy to access!
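As a minimal sketch of that idea, one person's record might look like the list below. Every name and value here (person, the phone number, the dates, the allergy) is invented for illustration and is not taken from the figure.

```r
# Hypothetical record: all values are made up for illustration
person <- list(
  id = "P-001",
  contacts = c(phone = "555-0100", email = "pat@example.com"),
  appointments = as.Date(c("2024-01-15", "2024-03-02")),
  history = list(allergies = "penicillin", surgeries = character(0))
)

# Each element keeps its own type and length
str(person, max.level = 1)

# Reach into individual elements by name
person$contacts[["email"]]
length(person$appointments)
```

A character vector, a Date vector, and a nested list sit side by side, which an atomic vector could not hold without coercing everything to one type.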

Now that you appreciate why lists exist, let's get our hands dirty with some practice…

Creating Lists Just Got Easier

The constructor function is predictably named list() in base R:

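# Stand-in objects so the example runs; swap in your own data
arranged_ids <- 1:3
some_long_dataframe <- head(mtcars)
stored_models <- list(lm(mpg ~ wt, data = mtcars))

my_list <- list(
  first = "Gandalf",
  id = arranged_ids,
  data = some_long_dataframe,
  model_objects = stored_models
)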

Mix and match whatever elements you desire! Even nest complex objects like data frames and model outputs.

A bare list is not always the most convenient container, though. When each element belongs to one row of a table, tibble() from the tibble package is a handy companion.

It lets you define list columns alongside ordinary columns in a data frame. Best of both worlds!

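library(tibble)

# P1_data and P2_data are stand-ins for per-patient vitals
P1_data <- data.frame(hr = c(72, 75))
P2_data <- data.frame(hr = c(64, 66))

patients <- tibble(
  id = c("P-1", "P-2"),
  vitals = list(P1_data, P2_data),   # list column: one data frame per row
  history = list(NULL, "Family illness")
)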

So easy to bundle related data sets and notes together rather than wrestle wide data frames!

Subsetting List Elements Like a Ninja

Creating lists ain't too rough. But what about getting our data back out? Let's discuss how to slice and dice elements!

The main R list subsetting operators:

  • [[ ]] – Extract or replace a single element, by name or position
  • $ – Extract a single element by name
  • [ ] – Subset and reorder a collection of elements; always returns a list

Consider our patient list:

patients <- list(
  id = c("P1", "P2"),
  age = c(52, 31), 
  medications = list(c("A", "B"), c("X", "Y"))
)

Extracting entries looks like:

first_med <- patients$medications[[1]] # "A", "B"

ages <- patients[["age"]] # 52, 31

Where it gets really fun is subsetting the list itself with []:

just_ages_meds <- patients[c("age", "medications")] # Keep 2 elements (still a list)

reversed <- patients[c(3,2,1)] # Reverse element order
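The difference between [ ] and [[ ]] is easy to trip over, so here is a small sketch using the same patients list. It only illustrates base R behavior; the replacement ages are made up.

```r
patients <- list(
  id = c("P1", "P2"),
  age = c(52, 31),
  medications = list(c("A", "B"), c("X", "Y"))
)

# [ ] always returns a list, even when you ask for one element
class(patients["age"])                      # "list"

# [[ ]] and $ return the element itself
class(patients[["age"]])                    # "numeric"
identical(patients$age, patients[["age"]])  # TRUE

# [[ ]] also replaces a single element (illustrative values)
patients[["age"]] <- c(53, 32)

# Chain operators to reach into nested lists
patients$medications[[2]][1]                # "X"
```

A handy rule of thumb: use [ ] when you want a smaller list back, and [[ ]] or $ when you want the contents.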

Now you can slice and dice lists to your heart's content. But what else can they do?…

Applying Functions Over Lists

A common task is running an operation across every element. Rather than messy for loops, the apply family has your back!

These functions call a function on each element of a list:

  • lapply() – Apply func to every element and return a list
  • sapply() – Apply func and simplify the result to a vector or matrix if possible
  • vapply() – Apply func and check each result against a declared type and length

Let's call class() on every element of our patients list using lapply():

lapply(patients, class)

$id
[1] "character"

$age
[1] "numeric"

$medications
[1] "list"

Now we peek at the structure with no looping! Combining lapply() and unlist() is common for quick ops:

library(magrittr) # provides the %>% pipe

med_counts <- lapply(patients$medications, length) %>%
  unlist() %>%
  sum() # Total meds = 4
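To make the difference between the three functions concrete, here is a short sketch on the medications data. The meds object is just a copy of patients$medications so the snippet runs on its own.

```r
meds <- list(c("A", "B"), c("X", "Y"))

lapply(meds, length)              # a list holding 2 and 2
sapply(meds, length)              # simplified to the vector 2 2
vapply(meds, length, integer(1))  # same values, with the type checked

# With empty input, sapply() falls back to a list,
# while vapply() still returns the declared type
sapply(list(), length)              # list()
vapply(list(), length, integer(1))  # integer(0)

# Base R shortcut for the total count computed above
sum(lengths(meds))                  # 4
```

Because its return type never changes, vapply() tends to be the safer choice inside functions and packages, while sapply() is convenient at the console.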

The apply family quietly works wonders under the hood across R. Now let's visualize some list workflows…

[Figure: List workflow]

Lists in the Wild

Beyond our own usage, lists permeate R thanks to their versatility. Understanding common cases unlocks better cooperation with R:

  • Model objects such as those returned by lm() are lists containing coefficients, residuals, fitted values, the original call, and more.

  • Data frames, including built-in data sets like USArrests, are lists under the hood: each column is one list element.

  • Function configuration via named list parameters instead of endless individual args.

  • Return objects from many R functions are complex lists holding key outputs.

  • Recursive lists represent tree-like data (see the data.tree package)
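The configuration point can be sketched with do.call(), which calls a function using a list of arguments. The paste_opts name and its values are hypothetical; any named list whose names match the function's arguments works the same way.

```r
# Hypothetical settings bundled into one named list
paste_opts <- list(sep = "-", collapse = " | ")

# do.call() spreads the list out as individual arguments
label <- do.call(
  paste,
  c(list(c("a", "b"), c("x", "y")), paste_opts)
)
label   # "a-x | b-y"

# Equivalent call with the arguments written out by hand
paste(c("a", "b"), c("x", "y"), sep = "-", collapse = " | ")
```

Keeping options in a list means they can be stored, modified, and reused across several calls instead of being retyped each time.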

So poke around built-in lists with str()! Marvel at massive model elements! Appreciate each package's unique take. Lists are inescapable within R itself.
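A quick way to start poking around, using only data sets that ship with R. The model formula is an arbitrary choice for illustration.

```r
# A fitted linear model is a list with a class attribute
fit <- lm(mpg ~ wt, data = mtcars)
is.list(fit)              # TRUE
names(fit)                # includes "coefficients", "residuals", ...
fit$coefficients          # the same values coef(fit) reports
str(fit, max.level = 1)   # one line per top-level element

# A data frame is a list of equal-length columns
typeof(USArrests)         # "list"
names(USArrests)          # "Murder" "Assault" "UrbanPop" "Rape"
sapply(USArrests, class)  # works column by column, like any list
```

Passing max.level = 1 to str() keeps the output readable for large objects by showing only the top level of nesting.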

Which brings us to wrapping up with some key takeaways…

In Summary, My Friend

We've covered the motivation, creation, wrangling, application, and ubiquity of handy R lists, which:

  • Offer flexible storage of varied data types
  • Let you pass configuration to functions as a single object
  • Support subsetting and extraction with [ ], [[ ]], and $
  • Work with the lapply() family to apply functions across elements
  • Recur throughout R itself as nested data structures

Today you leveled up from R basics to intermediate data wrangler!

Lists certainly have quirks, but becoming comfortable with them unlocks the next stage of your R journey.

Let me know if any other list-related questions arise on your adventures!
