Workshop Overview

Instructor(s) name(s) and contact information

Workshop Description

This workshop will introduce users to the CoreGx and PharmacoGx R packages, which are useful tools for pharmacogenomic modelling to discover biomarkers of treatment response in cancer model systems. PharmacoGx specifically focuses on drug sensitivity experiments in cancer cell lines, which will be the major focus of this workshop. Additional infrastructure from our lab includes ToxicoGx for toxicogenomics in healthy human cell-lines, RadioGx for radiogenomics in cancer cell-lines and Xeva for pharmacogenomics in patient derived xenograph (PDX) murine models.

Participants will learn the fundamentals of using CoreGx and PharmacoGx to create a PharmacoSet—an integrative container for the storage, analysis and visualization of pharmacogenomic experiments. Particular focus will be placed on newly developed support for storing, analyzing and visualizing drug combination sensitivity experiments and correlating results therefrom with multi-omic molecular profiles to discover biomarkers of drug sensitivity, resistance, synergy, or antagonism.

Pre-requisites

Useful publications:

  • Smirnov, P., Safikhani, Z., El-Hachem, N., Wang, D., She, A., Olsen, C., Freeman, M., Selby, H., Gendoo, D. M. A., Grossmann, P., Beck, A. H., Aerts, H. J. W. L., Lupien, M., Goldenberg, A., & Haibe-Kains, B. (2016). PharmacoGx: An R package for analysis of large pharmacogenomic datasets. Bioinformatics (Oxford, England), 32(8), 1244–1246. https://doi.org/10.1093/bioinformatics/btv723
  • Tonekaboni, M., Ali, S., Soltan Ghoraie, L., Manem, V. S. K. & Haibe-Kains, B. Predictive approaches for drug combination discovery in cancer. Brief Bioinform 19, 263–276 (2018).

Workshop Participation

Participants expected to have the following required packages installed on their machines to be able to run the commands along with the instructors:

R / Bioconductor packages used

Bioconductor:

CRAN:

Time outline

For a 1.5-hr workshop:

Activity Time
Introduction to CoreGx and PharmacoGx 5m
Overview of Data Structures 15m
How the TRE Supports Drug Combinations Data Analysis 10m
Using a DataMapper to build a Drug Combo PharmacoSet 10m
Dose Response Models and Drug Sensitivity measures 10m
Drug Combination Synergy Models 15m
Biomarker Discovery 10m
Further Resources 5m

Workshop goals and objectives

Learning goals

  • Describe pharmacogenomic mono and combination datasets and their usefulness in cancer research
  • Understand how experimental designs and research questions map onto data structures
  • Learn how to extract information from these datasets
  • Learn how to visualize experimental results from these datasets
  • Learn how to model dose-response for both monotherapy and combination small compound datasets
  • Learn measures to quantify response and synergy in cancer cell-line sensitivity screens

Learning objectives

  • Describe use cases for CoreGx and PharmacoGx in pharmacogenomics and beyond
  • Understand the structure of the CoreSet and PharmacoSet classes to facilitate their use in subsequent analyses
  • Learn how the TreatmentResponseExperiment provides a highly flexible container for storing high-throughput drug combination screening data
  • Use a TREDataMapper to create a TreatmentResponseExperiment and incorporate it into a PharmacoSet
  • Learn to apply arbitrary R functions over a TreatmentResponseExperiment using the aggregate and endoaggregate methods
  • Access the molecular features, dose-response and metadata contained within the data structures defined in our packages
  • Fit Hill Slope models to dose-response experiments using small compound treatments in cell-lines
  • Calculate the AAC, AUC, and IC50 metrics for response quantification in cell lines
  • Compute and store various metrics of drug synergy and antagonism using PharmacoGx and the TreatmentResponseExperiment
  • Learn how to use PharmacoGx to discover univariate biomarkers of drug response, resistance, synergy, or antagnoism

Introduction to CoreGx and PharmacoGx

This tutorial, titled Pharmacogenomic Analysis of Drug Combination Experiments to Identify Biomarkers of Response or Resistance, focuses on using the PharmacoGx R package to correlate treatment response, measured as the viability of cancer cell-lines after in vitro drug treatment, with their respective multi-omic profiles. CoreGx provides the core infrastructure for storing, analyzing, and visualizing generic treatment response experiments. It provides methods and classes which can be inherited from in downstream packages, such as ToxicoGx and RadioGx. We hope that the CoreSet object is generalized enough that it can be reused by other developers for their specific treatment (or stimuli) response use case.

CoreGx

Package Nomenclature

To facilitate modularization of the GxSuite of R packages, we have shifted the nomenclature within a CoreSet—and therefore in inheriting packages —to be more general.

To this end, have made the following changes:

  • Previous reference to cell (cell-line) have become sample, allowing the CoreSet to be used for other model systems
  • Drug (radiation in RadioGx) have become treatment, allowing the CoreSet to be treatment type (or stimuli) agnostic
  • Sensitivity will become response (sensitivity slot becomes treatmentResponse)

As a result of these changes, the names of some common accessors have been updated. The old accessors still remain functional to ensure backwards compatibility for at least two Bioconductor releases. A deprecation warning will be added to old accessors informing users of the corresponding new function, as per Bioconductor best practices.

PharmacoGx

PharmacoGx stores drug screening data together with molecular profiling of cell-lines in an object called a PharmacoSet, or PSet for short. This object inherits for the CoreSet class defined in our package CoreGx, which is used to share common datas structure and method across our suite of package. The primary use case for PharmacoGx is to provide a standardized and highly curated container for high-throughput screens in cancer-cell lines, with the goal of enabling discovery of biomarkers of treatment response or resistance which can be easily compared and validated across large published pharmacogneomic datasets.


Overview of Data Structures

The GxSuite of packages make use of various Bioconductor classes for storing molecular profile data.

CoreSet and PharmacoSet

With the recent updates in the structure and nomenclature of the CoreSet, the underling data structures are essentially identical. Previously, the “treatment” slot was specific to inheriting classes and named according to their domain specific use case, for example drug in PharmacoGx and radiation in RadioGx. Correspondingly, the “sample” slot was named cell and corresponded to in vitro immortalized human cancer cell-lines.

CoreSet / PharmacoSet class diagram. Objects comprising a CoreSet are enclosed in boxes. The first box indicates the type and name of each object. The second box indicates the structure of an object or class. The third box shows accessor methods from PharmacoGx for that specific object. ‘=>’ represents return and specifies what is returned from that item or method.

MultiAssayExperiment and SummarizedExperiment

The MultiAssayExperiment (MAE) is a common Bioconductor class for storing the results of diverse multi-omic profiling experiments, including mutation, RNA expression, copy number abberation and protein abundace. Along with the SummarizedExperiment (SE), this object is used to store molecular phenotype data within a CoreSet and PharmacoSet.

At the heart of both objects is a feature by sample matrix, with the major difference being that the SE require congruent matrices (i.e., observations on the same set of features and samples) where as the MAE allows storage of disjoint sets of feature and samples via the use of the sampleMap, which, conceptually, stores a graph of the mapping from samples to each assay in the object.

SummarizedExperiment and MultiAssayExperiment classes.

For both these objects, rows map to molecular features (often genomic coordinates to better generalize across diverse molecular datatypes), and columns to samples. Metadata for both is available via the rowData and colData accessors, respectively.

TreatmentResponseExperiment

The TRE object shares much of its design with the MultiAssayExperiment and SummarizedExperiment. We store dimension metadata in rowData and colData, similar to an SE but also allow disjoint rows and columns to be stored in assays, like an MAE. The major difference is that instead of using a matrix or array for storing assay observations, we instead store them as a long format data.table where integer keys define the relevant “shape” of the underlying data and observations live within columns of that table.

For those familar with relational modelling from RDBMS systems, our design has taken inspiration from the star schema often used in OLAP applications to improve the efficiency of analytic database queries.

TreatmentResponseExperiment class diagram. Objects comprising a TreatmentResponseExperiment are enclosed in boxes. The first box indicates the type and name of each object. The second box indicates the structure of an object or class. The third box shows accessor methods from PharmacoGx for that specific object. ‘=>’ represents return and specifies what is returned from that item or method.

The TRE differentiates between full assays and summary assays by the presence of various identifier columns which can be viewed with the idCols method. Each TRE has a set of rowIDs and colIDs which define these respective dimensions. By storing an assayIndex, analagous to the sampleMap from the MAE object, we are able to summarize over one or more row or column identifiers while storing only the relevant rows within the assays slot, allowing for a measure of compression relative to a matrix or array, where missing values must be stored as NA.

The association between the various identifiers in the object are shown below, with the assayIndex being the central repository for the graph like relationships between the rowIDs, colIDs and each respective assay.

The structure of a TreatmentResponseExperiment centers around the assayIndex which contains a mapping from unique identifiers for row or columns and the primary key of each assay in the object. Summaries over rowKey and colKeys, and thus their corresponding identifiers, and made possible by repeating a givan assay key for ecah rowKey/colKey combination it has been summarized over.


How the TRE Support Drug Combinations Data Analysis

Drug Combination Experiments

The field of precision oncology is increasingly interested in exploring possible synergies between cancer therapies as a means of overcoming acquired treatment resistance or overcoming in complete responses to first line therapies. To accelerate the process of translating insights from in vitro drug combination experiments into clinical practice, we have extended the CoreSet and PharmacoSet classes with a new object, the TreatmentResponseExperiment (TRE), which has been specifically designed to handle high dimensional biological stimilus-response data.

While drug combinations are the first use case for this data structure, as cancer model systems continue to advance we expected to also see sample-wise combinations. For example, 3D organoid models could include multiple cell-lines or patient tissue types. Additionally, we can imagine the object being useful for other types of stimuli-response experiments across the biological sciences, such as monitoring the response of nerve cells to various types of sensory stimuli.

Building A Drug Combination PharmacoSet

DataMapper Class

The DataMapper is an abstract class included in CoreGx which defines metadata specifying a “recipe” to go from rawdata files associated with a specific experiment or publication to a curated and standardized data format within an R S4 class.

In CoreGx we have implemented the TREDataMapper class, which encodes how the rawdata from a drug combination experiment should be transformed into its associated TRE object. A primary motivation for the creation of this class was to document and standardize creation of TRE objects from the myriad experimental designs available in drug combination experiments in the wild. It is our hope that, for example, 3x3 drug combination experiments have enough in common that we could theoretically create a 3by3TREDataMapper which works for (almost) all publications following this design.

As things stand right now, the TREDataMapper primarily acts as helper class to explore the specifics of a drug combination experimental design. It allows us to generate hypotheses about which identifiers are required to identify observations in an assay. Below we will see how, along with the guessMapping method, we can use the TREDataMapper to ensure we have mapped all data from the NCI-ALMANAC drug combination experiment into a TRE object, ensuring that all study observations and relevant metadata are captured in our TRE and therefore in our PharmacoSet.

By curating and releasing such PharmacoSet objects, we ensure that our users can access all information from the original publication via PharmacoGx and therefore easily ensure the reproducibility of the results as well as simplify their comparison against other curated mono and combination therapy PharmacoSets object. In doing so, we help to improve the reproducibility and consistency of in vitro pharmacogenomic experiments and accelerate translation of insights from the lab into the clinic.

Using the TREDataMapper to Construct a TreatmentResponseExperiment

For this workshop we have included a subset of the NCI_ALMANAC_2017 PharmacoSet, curated from (Holbeck et al. 2017). When approaching new treament response experiment data, the first step is to merge it into a single, long format table by attaching additional metadata for treatments, sample and any other entities defined in the publication. We have done this already and included it as the NCI_ALMANAC_raw dataset in the workshop package.

data(NCI_ALMANAC_raw)
(head(NCI_ALMANAC_raw))
treatment1id treatment2id treatment1dose treatment2dose sampleid CONC1 CONC2 NSC2 CELLNAME CELLNBR PANEL PANELNBR viability PERCENTGROWTH PERCENTGROWTHNOTZ EXPECTEDGROWTH TESTVALUE CONTROLVALUE TZVALUE CONCINDEX1 CONCINDEX2 SAMPLE1 SAMPLE2 SCORE COMBODRUGSEQ STUDY TESTDATE PLATE SCREENER PREFIX2 bio_rep tech_rep
1,3-Bis(2-chloroethyl)-1-nitrosourea 0.05 NA 786-O 0 NA NA 786-0 18 Renal Cancer 9 98.89562 98.367 NA NA 2.776 2.807 0.909 -1 0 23 NA NA 5587379 1307AC31 07/08/2013 3577 1A 1 1
1,3-Bis(2-chloroethyl)-1-nitrosourea 0.05 NA A-498 0 NA NA A498 13 Renal Cancer 9 98.63179 96.232 NA NA 2.451 2.485 1.589 -1 0 23 NA NA 5587364 1307AC31 07/08/2013 3432 1A 1 1
1,3-Bis(2-chloroethyl)-1-nitrosourea 0.05 NA A-549 0 NA NA A549/ATCC 4 Non-Small Cell Lung Cancer 1 100.91533 101.089 NA NA 1.764 1.748 0.325 -1 0 23 NA NA 5587194 1307AC31 07/08/2013 3548 1A 1 1
1,3-Bis(2-chloroethyl)-1-nitrosourea 0.05 NA ACHN 0 NA NA ACHN 23 Renal Cancer 9 100.81652 101.054 NA NA 2.099 2.082 0.492 -1 0 23 NA NA 5587384 1307AC31 07/08/2013 3606 1A 1 1
1,3-Bis(2-chloroethyl)-1-nitrosourea 0.05 NA BT-549 0 NA NA BT-549 13 Breast Cancer 5 95.67643 91.405 NA NA 2.058 2.151 1.069 -1 0 23 NA NA 5587284 1307AC31 07/08/2013 3577 1A 1 1

Viewing the table, we see that each column represents a variable, and that the number of rows in the table corresponds to the number of dose-response observations measured in the NCI-ALMANAC study. To those familiar with the R tidyverse, you may recognize that this dataset is in tidy format. It is also said be in a “long” format because all observations are contained in a single column; contrast this with a “wide” format where columns generally respresent samples and values are observations are stored across both rows and columns (i.e., as in a matrix or tensor).

In order for the helper methods of the TREDataMapper class to work, our rawdata must first be in a single large, tidy data.frame or data.table. Since these conditions are already met for the included example data, we can create of TREDataMapper object and start exploring the to dimensionality of the experiment in question!

print(treDataMapper <- TREDataMapper(rawdata=NCI_ALMANAC_raw))
## 
[33m
[1m
[3m<TREDataMapper> 
## 
[23m
[22m
[39m
[33m
[1mrawdata:
[22m
[39m dim(534179, 32)
##                              treatment1id treatment2id treatment1dose
##                                    <char>       <char>          <num>
##   1: 1,3-Bis(2-chloroethyl)-1-nitrosourea                        0.05
##   2: 1,3-Bis(2-chloroethyl)-1-nitrosourea                        0.05
##   3: 1,3-Bis(2-chloroethyl)-1-nitrosourea                        0.05
##   29 variables not shown: [treatment2dose <num>, sampleid <char>, CONC1
##     <num>, CONC2 <num>, NSC2 <int>, CELLNAME <char>, CELLNBR <int>,
##     PANEL <char>, PANELNBR <int>, viability <num>, ...] 
## 

[33m
[1mrowDataMap:
[22m
[39m
##   
[32mrowIDs:
[39m 
##   
[32mrowMeta:
[39m 
## 
[33m
[1mcolDataMap:
[22m
[39m
##   
[32mcolIDs:
[39m 
##   
[32mcolMeta:
[39m 
[33m
[1m
## assayMap:
[22m
[39m
[33m
[1m
## metadataMap:
[22m
[39m
[32m NA
[39m

A DataMapper as a concept is simply a collection of metadata around our raw data which defines how different columns map to different aspects of our target S4 class. For a TRE, similar to a SummarizedExperiment, we have the concept of rowData which defines what is considered a treatment (a specific drug combination, its doses and potential technical replicates) and colData which defines what is considered a sample (a specific cell-line and possible biological replicate identifier).

The guessMapping method can be used to test our hypotheses about which information maps to treatments (rows), samples (columns) or assays (experimental observations). This is done by defining a set of groupings and querying the table to find other columns which map 1:1 with those groups.

For our first guess, lets assume there are no replicates and see if we are able to map all of our rawdata columns using just treatment, treatment dose and sample identifiers. Once we format our guess for rows, columns and assays, we specify the subset parmeter to indicate we want to remove any mapped columns before moving on to the next group. This prevents multimapping a column to several groups and is generally recommended.

groups1 <- list(
    rowDataMap=c("treatment1id", "treatment2id", "treatment1dose",
        "treatment2dose"),
    colDataMap=c("sampleid"),
    assayMap=c("treatment1id", "treatment2id", "treatment1dose",
        "treatment2dose", "sampleid"
    )
)

(guess1 <- guessMapping(treDataMapper, groups=groups1, subset=TRUE))
## [CoreGx::guessMapping,LongTableDataMapper-method]
##  Mapping for group rowDataMap: treatment1id, treatment2id, treatment1dose, treatment2dose
## [CoreGx::guessMapping,LongTableDataMapper-method]
##  Mapping for group colDataMap: sampleid
## [CoreGx::guessMapping,LongTableDataMapper-method]
##  Mapping for group assayMap: treatment1id, treatment2id, treatment1dose, treatment2dose, sampleid
## $metadata
## $metadata$id_columns
## [1] NA
## 
## $metadata$mapped_columns
## character(0)
## 
## 
## $rowDataMap
## $rowDataMap$id_columns
## [1] "treatment1id"   "treatment2id"   "treatment1dose" "treatment2dose"
## 
## $rowDataMap$mapped_columns
## [1] "CONC1"      "CONC2"      "NSC2"       "CONCINDEX2"
## 
## 
## $colDataMap
## $colDataMap$id_columns
## [1] "sampleid"
## 
## $colDataMap$mapped_columns
## [1] "CELLNAME" "CELLNBR"  "PANEL"    "PANELNBR"
## 
## 
## $assayMap
## $assayMap$id_columns
## [1] "treatment1id"   "treatment2id"   "treatment1dose" "treatment2dose"
## [5] "sampleid"      
## 
## $assayMap$mapped_columns
## [1] "EXPECTEDGROWTH" "SAMPLE2"        "SCORE"         
## 
## 
## $unmapped
##  [1] "viability"         "PERCENTGROWTH"     "PERCENTGROWTHNOTZ"
##  [4] "TESTVALUE"         "CONTROLVALUE"      "TZVALUE"          
##  [7] "CONCINDEX1"        "SAMPLE1"           "COMBODRUGSEQ"     
## [10] "STUDY"             "TESTDATE"          "PLATE"            
## [13] "SCREENER"          "PREFIX2"           "bio_rep"          
## [16] "tech_rep"

Observing the results of our first guess, we can see that many column remain unmapped. This likely indicates there is an additional piece of information needed to uniquely identify the dimensions of our drug combination experiment. One potential cause for this is inclusion of technical or biological replicates.

Indeed, for the NCI-ALMANAC both of these kinds of replicates are present.

str(rawdata(treDataMapper), give.attr=FALSE)
## Classes 'data.table' and 'data.frame':   534179 obs. of  32 variables:
##  $ treatment1id     : chr  "1,3-Bis(2-chloroethyl)-1-nitrosourea" "1,3-Bis(2-chloroethyl)-1-nitrosourea" "1,3-Bis(2-chloroethyl)-1-nitrosourea" "1,3-Bis(2-chloroethyl)-1-nitrosourea" ...
##  $ treatment2id     : chr  "" "" "" "" ...
##  $ treatment1dose   : num  0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 ...
##  $ treatment2dose   : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ sampleid         : chr  "786-O" "A-498" "A-549" "ACHN" ...
##  $ CONC1            : num  5e-08 5e-08 5e-08 5e-08 5e-08 5e-08 5e-08 5e-08 5e-08 5e-08 ...
##  $ CONC2            : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ NSC2             : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ CELLNAME         : chr  "786-0" "A498" "A549/ATCC" "ACHN" ...
##  $ CELLNBR          : int  18 13 4 23 13 3 10 15 3 2 ...
##  $ PANEL            : chr  "Renal Cancer" "Renal Cancer" "Non-Small Cell Lung Cancer" "Renal Cancer" ...
##  $ PANELNBR         : int  9 9 1 9 5 7 4 9 11 4 ...
##  $ viability        : num  98.9 98.6 100.9 100.8 95.7 ...
##  $ PERCENTGROWTH    : num  98.4 96.2 101.1 101.1 91.4 ...
##  $ PERCENTGROWTHNOTZ: num  NA NA NA NA NA NA NA NA NA NA ...
##  $ EXPECTEDGROWTH   : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ TESTVALUE        : num  2.78 2.45 1.76 2.1 2.06 ...
##  $ CONTROLVALUE     : num  2.81 2.48 1.75 2.08 2.15 ...
##  $ TZVALUE          : num  0.909 1.589 0.325 0.492 1.069 ...
##  $ CONCINDEX1       : int  -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
##  $ CONCINDEX2       : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ SAMPLE1          : int  23 23 23 23 23 23 23 23 23 23 ...
##  $ SAMPLE2          : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ SCORE            : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ COMBODRUGSEQ     : int  5587379 5587364 5587194 5587384 5587284 5587324 5587244 5587369 5587429 5587229 ...
##  $ STUDY            : chr  "1307AC31" "1307AC31" "1307AC31" "1307AC31" ...
##  $ TESTDATE         : chr  "07/08/2013" "07/08/2013" "07/08/2013" "07/08/2013" ...
##  $ PLATE            : chr  "3577" "3432" "3548" "3606" ...
##  $ SCREENER         : chr  "1A" "1A" "1A" "1A" ...
##  $ PREFIX2          : chr  "" "" "" "" ...
##  $ bio_rep          : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ tech_rep         : int  1 1 1 1 1 1 1 1 1 1 ...

Examining the columns in our data, we see we have already defined the both bio_reps and tech_reps during raw data in preprocessing. We identified biological replicates as observations of the sample treatment, dose and sample combination across multiple dates. Technical replicates were the defined as any additional rows not mapped to treatment, dose, sample and biological replicates.

For your own data in the wild, you will likely need to consult the original publication and explore the data to differentiate these two types of replicates and assign identifiers which make every observation within an experiment unique.

For our second guess we will include these columns, mapping biological replicates to samples and technical replicates to treatments.

groups2 <- list(
    rowDataMap=c("treatment1id", "treatment2id", "treatment1dose",
        "treatment2dose", "tech_rep"),
    colDataMap=c("sampleid", "bio_rep"),
    assayMap=c("treatment1id", "treatment2id", "treatment1dose",
        "treatment2dose",  "sampleid", "bio_rep", "tech_rep")
)

(guess2 <- guessMapping(treDataMapper, groups=groups2, subset=TRUE))
## [CoreGx::guessMapping,LongTableDataMapper-method]
##  Mapping for group rowDataMap: treatment1id, treatment2id, treatment1dose, treatment2dose, tech_rep
## [CoreGx::guessMapping,LongTableDataMapper-method]
##  Mapping for group colDataMap: sampleid, bio_rep
## [CoreGx::guessMapping,LongTableDataMapper-method]
##  Mapping for group assayMap: treatment1id, treatment2id, treatment1dose, treatment2dose, sampleid, bio_rep, tech_rep
## $metadata
## $metadata$id_columns
## [1] NA
## 
## $metadata$mapped_columns
## character(0)
## 
## 
## $rowDataMap
## $rowDataMap$id_columns
## [1] "treatment1id"   "treatment2id"   "treatment1dose" "treatment2dose"
## [5] "tech_rep"      
## 
## $rowDataMap$mapped_columns
## [1] "CONC1"      "CONC2"      "NSC2"       "CONCINDEX2"
## 
## 
## $colDataMap
## $colDataMap$id_columns
## [1] "sampleid" "bio_rep" 
## 
## $colDataMap$mapped_columns
## [1] "CELLNAME" "CELLNBR"  "PANEL"    "PANELNBR"
## 
## 
## $assayMap
## $assayMap$id_columns
## [1] "treatment1id"   "treatment2id"   "treatment1dose" "treatment2dose"
## [5] "sampleid"       "bio_rep"        "tech_rep"      
## 
## $assayMap$mapped_columns
##  [1] "viability"         "PERCENTGROWTH"     "PERCENTGROWTHNOTZ"
##  [4] "EXPECTEDGROWTH"    "TESTVALUE"         "CONTROLVALUE"     
##  [7] "TZVALUE"           "CONCINDEX1"        "SAMPLE1"          
## [10] "SAMPLE2"           "SCORE"             "COMBODRUGSEQ"     
## [13] "STUDY"             "TESTDATE"          "PLATE"            
## [16] "SCREENER"          "PREFIX2"          
## 
## 
## $unmapped
## character(0)

Since we no longer have any unmapped columns, we know that we have found valid dimension identifiers for our data. It is possible that some datasets will have multiple valid dimension definitions, so further exploration may be required if the first solution doesn’t fit with your domain knowledge of the experiment or with the design of the TRE, which stores data more efficiently when only one dimension is large.

Lets assign our mapped columns back to the TREDataMapper so we can use it to construct a TreatmentResponseExperiment.

For identified metadata, we need to assign a name to be used in the metadata list of the createed TRE, other dimension maps can be assigned directly to the slots of the TREDataMapper.

metadataMap(treDataMapper) <- list(experiment_metadata=guess2$metadata$mapped_columns)
rowDataMap(treDataMapper) <- guess2$rowDataMap
colDataMap(treDataMapper) <- guess2$colDataMap

Unless we wish to have a single large assay in our TRE, we should further subdivide our assay columns into sensible groups.

(guess2$assayMap)
## $id_columns
## [1] "treatment1id"   "treatment2id"   "treatment1dose" "treatment2dose"
## [5] "sampleid"       "bio_rep"        "tech_rep"      
## 
## $mapped_columns
##  [1] "viability"         "PERCENTGROWTH"     "PERCENTGROWTHNOTZ"
##  [4] "EXPECTEDGROWTH"    "TESTVALUE"         "CONTROLVALUE"     
##  [7] "TZVALUE"           "CONCINDEX1"        "SAMPLE1"          
## [10] "SAMPLE2"           "SCORE"             "COMBODRUGSEQ"     
## [13] "STUDY"             "TESTDATE"          "PLATE"            
## [16] "SCREENER"          "PREFIX2"

For the NCI-ALMANAC we have decided to separate our treatment-response measurements into the “response” assay. Any additional metadata which doesn’t map cleaning to rows or columns gets assigned to “assay_metadata”. The publsihed synergy score will go in the “profiles” assay, which generally stores metrics computed from the “response” assay.

For our assay definitions we setup a nested list where each item defines an assay in the TRE, with the “id_columns” containing identifiers needed to uniquely map each row in that assay to its associated treatments and samples. The “mapped_columns” item specifies the definition of the assay table.

For this case, all of our assays have the same identifier columns. However, this is not a requirement and it may be more efficient to test further subgroupings with our identifier columns to prevent uncessarily repeating values in our TRE.

assaymap <- list(
    response=list(
        id_columns=guess2$assayMap$id_columns,
        mapped_columns=c("viability", "PERCENTGROWTH", "PERCENTGROWTHNOTZ",
            "EXPECTEDGROWTH", "TESTVALUE", "CONTROLVALUE", "TZVALUE")
    ),
    profiles=list(
        id_columns=guess2$assayMap$id_columns,
        mapped_columns=c("SCORE")
    ),
    assay_metadata=list(
        id_columns=guess2$assayMap$id_columns,
        mapped_columns=c("COMBODRUGSEQ", "STUDY", "TESTDATE", "PLATE",
            "SCREENER", "PREFIX2", "CONCINDEX1", "CONCINDEX2", "SAMPLE1",
            "SAMPLE2")
    )
)
assayMap(treDataMapper) <- assaymap
(treDataMapper)
## 
[33m
[1m
[3m<TREDataMapper> 
## 
[23m
[22m
[39m
[33m
[1mrawdata:
[22m
[39m dim(534179, 32)
##                              treatment1id treatment2id treatment1dose
##                                    <char>       <char>          <num>
##   1: 1,3-Bis(2-chloroethyl)-1-nitrosourea                        0.05
##   2: 1,3-Bis(2-chloroethyl)-1-nitrosourea                        0.05
##   3: 1,3-Bis(2-chloroethyl)-1-nitrosourea                        0.05
##   29 variables not shown: [treatment2dose <num>, sampleid <char>, CONC1
##     <num>, CONC2 <num>, NSC2 <int>, CELLNAME <char>, CELLNBR <int>,
##     PANEL <char>, PANELNBR <int>, viability <num>, ...] 
## 

[33m
[1mrowDataMap:
[22m
[39m
##   
[32mrowIDs:
[39m treatment1id, treatment2id, treatment1dose, treatment2dose, tech_rep
##   
[32mrowMeta:
[39m CONC1, CONC2, NSC2, CONCINDEX2
## 
[33m
[1mcolDataMap:
[22m
[39m
##   
[32mcolIDs:
[39m sampleid, bio_rep
##   
[32mcolMeta:
[39m CELLNAME, CELLNBR, PANEL, PANELNBR
[33m
[1m
## assayMap:
[22m
[39m
##   
[31mresponse:
[39m
##     keys: treatment1id, treatment2id, treatment1dose, treatment2dose, sampleid,
##       bio_rep, tech_rep
##     values: viability, PERCENTGROWTH, PERCENTGROWTHNOTZ, EXPECTEDGROWTH, TESTVALUE,
##       CONTROLVALUE, TZVALUE
##   
[31mprofiles:
[39m
##     keys: treatment1id, treatment2id, treatment1dose, treatment2dose, sampleid,
##       bio_rep, tech_rep
##     values: SCORE
##   
[31massay_metadata:
[39m
##     keys: treatment1id, treatment2id, treatment1dose, treatment2dose, sampleid,
##       bio_rep, tech_rep
##     values: COMBODRUGSEQ, STUDY, TESTDATE, PLATE, SCREENER, PREFIX2, CONCINDEX1,
##       CONCINDEX2, SAMPLE1, SAMPLE2
[33m
[1m
## metadataMap:
[22m
[39m
##   
[32mexperiment_metadata:
[39m

Now that we have assigned all the necessary metadata to the TREDataMapper, creating the object is as simple as calling metaConstruct, which combines together the rawdata and the metadata into a constructor call for the TreatmentResponseExperiment.

(NCI_ALMANAC_tre <- metaConstruct(treDataMapper))
## <TreatmentResponseExperiment> 
## dim:  10080 62 
## assays(3): response profiles assay_metadata 
## rownames(10080): 1,3-Bis(2-chloroethyl)-1-nitrosourea::0.05:NA:1 1,3-Bis(2-chloroethyl)-1-nitrosourea::0.15:NA:1 ... Vorinostat:Vincristine:1:0.01:1 Vorinostat:Vincristine:1:0.1:1 
## rowData(9): treatment1id treatment2id treatment1dose ... CONC2 NSC2 CONCINDEX2 
## colnames(62): 786-O:1 A-498:1 A-549:1 ... UACC-257:1 UACC-62:1 UO-31:1 
## colData(6): sampleid bio_rep CELLNAME CELLNBR PANEL PANELNBR 
## metadata(1): experiment_metadata

Computing Over a TreatmentResponseExperiment

So far we have seen how a TRE allows for flexible definitions of treatments and samples to enable storage of dose-response observations across a range of experimental designs. However, data storage is not the sole point of the TRE. Once our data is curated and stored safely inside our object we are ready to start answering interesting questions about the experiment.

To facilitate computation over a TRE, we have included the aggregate and endoaggregate methods which allows arbitrary R functions to be applied over groups of identifiers within an assay. For example, we may want to take averages for all monotherapy viabilities over technical replicates. We can do so using the aggregate method as follows:

(NCI_ALMANAC_tre |>
  subset(treatment2id == "") |>
  aggregate(
    assay="response",
    mean(viability),
    by=c("treatment1id", "treatment1dose", "sampleid", "bio_rep")
  ) ->
  mono_response_no_reps)
##                                treatment1id treatment1dose sampleid bio_rep
##     1: 1,3-Bis(2-chloroethyl)-1-nitrosourea           0.05    786-O       1
##     2: 1,3-Bis(2-chloroethyl)-1-nitrosourea           0.05    A-498       1
##     3: 1,3-Bis(2-chloroethyl)-1-nitrosourea           0.05    A-549       1
##     4: 1,3-Bis(2-chloroethyl)-1-nitrosourea           0.05     ACHN       1
##     5: 1,3-Bis(2-chloroethyl)-1-nitrosourea           0.05   BT-549       1
##    ---                                                                     
## 15181:                           Vorinostat          20.00    TK-10       1
## 15182:                           Vorinostat          20.00  U-251MG       1
## 15183:                           Vorinostat          20.00 UACC-257       1
## 15184:                           Vorinostat          20.00  UACC-62       1
## 15185:                           Vorinostat          20.00    UO-31       1
##        mean_viability
##     1:       98.89562
##     2:       98.63179
##     3:      100.91533
##     4:      100.81652
##     5:       95.67643
##    ---               
## 15181:       35.99182
## 15182:       11.43731
## 15183:       31.02885
## 15184:       17.48634
## 15185:       25.83149

As you can see above, we select only our monotherapy response observations using the subset method. Aggregate provides an argument for the assay we want compute over, in this case the “response” assay. We also specify how we want our groups defined using the by argument. Finally, we can specify one or more functions to aggregate columns with. In this case, we are simply taking the mean of viability within groups defined by by. We have excluded “tech_rep” for the by argument which means that our summary will be computed over it.

The resulting table will contain the identifer columns specified to by, as well as any columns which were summarized. By default, for unnamed aggregations, aggregate tries to guess the column name. We can see in the results it figured out that we are taking the mean of the viability column and returned the results as “mean_viability”.

You can also customize your resulting column names by naming the aggregation arguments:

For example:

(NCI_ALMANAC_tre |>
  aggregate(
    subset=treatment2id == "",
    assay="response",
    my_viability=mean(viability),
    my_dose=mean(treatment1dose),
    by=c("treatment1id", "treatment1dose", "sampleid", "bio_rep")
  ) ->
  mono_response_custom_names)
##                                treatment1id treatment1dose sampleid bio_rep
##     1: 1,3-Bis(2-chloroethyl)-1-nitrosourea           0.05    786-O       1
##     2: 1,3-Bis(2-chloroethyl)-1-nitrosourea           0.05    A-498       1
##     3: 1,3-Bis(2-chloroethyl)-1-nitrosourea           0.05    A-549       1
##     4: 1,3-Bis(2-chloroethyl)-1-nitrosourea           0.05     ACHN       1
##     5: 1,3-Bis(2-chloroethyl)-1-nitrosourea           0.05   BT-549       1
##    ---                                                                     
## 15181:                           Vorinostat          20.00    TK-10       1
## 15182:                           Vorinostat          20.00  U-251MG       1
## 15183:                           Vorinostat          20.00 UACC-257       1
## 15184:                           Vorinostat          20.00  UACC-62       1
## 15185:                           Vorinostat          20.00    UO-31       1
##        my_viability my_dose
##     1:     98.89562    0.05
##     2:     98.63179    0.05
##     3:    100.91533    0.05
##     4:    100.81652    0.05
##     5:     95.67643    0.05
##    ---                     
## 15181:     35.99182   20.00
## 15182:     11.43731   20.00
## 15183:     31.02885   20.00
## 15184:     17.48634   20.00
## 15185:     25.83149   20.00

Here we can see that the viability column is named “my_viability” and that our dose column named “my_dose”. We also use the subset parameter, which was added to aggregate for convenience to filter to relevant assay rows in a single function call.

For more complex operations, such as optimizing parameters to a non-linear function (as is done for fitting the Hill curve model to dose-response data), we have provided the nthread argument to parallelize computations when we have additional compute resources available. In generaly, parallelized computations with aggregate will use much more RAM than their serialized counterparts. Since there is also overhead associated with dispatching parallel threads for a computation, this feature should be used sparingly for only complex and time consuming computations.

The final feature of aggregate is allowing information be shared between summarized columns to prevent the need for multiple parallelized calls On our TRE object. For these cases, we set the enlist parameter to FALSE and pass a block of code wrapped in {}. The final statment in the curly brackets should return a named list, where each list item will become a column in the returned table.

The utility of this option will also become apparent in subsequent sections.

Combining with Omics Data into a PharmacoSet

The NCI-ALMANAC study did not conduct its own set of molecular profiling experiments, since the NCI60 already profiled all cell-lines available in the NCI-ALMANAC. We have already curated a PharmacoSet for NCI60, so adding our molecular data to the NCI-ALMANAC PharmacoSet requires no manual curation. We must simply subset the NCI60 dataset to those cell-lines present in NCI-ALMANAC.

Lets examine the included molecualr data:

data(NCI60_molecular_data)
(NCI60_molecular_data)
## A MultiAssayExperiment object of 4 listed
##  experiments with user-defined names and respective classes.
##  Containing an ExperimentList class object of length 4:
##  [1] rna: RangedSummarizedExperiment with 54609 rows and 59 columns
##  [2] mirna: RangedSummarizedExperiment with 417 rows and 59 columns
##  [3] rnaseq.comp: RangedSummarizedExperiment with 23808 rows and 59 columns
##  [4] rnaseq.iso: RangedSummarizedExperiment with 23808 rows and 59 columns
## Functionality:
##  experiments() - obtain the ExperimentList instance
##  colData() - the primary/phenotype DataFrame
##  sampleMap() - the sample coordination DataFrame
##  `$`, `[`, `[[` - extract colData columns, subset, or experiment
##  *Format() - convert into a long or wide DataFrame
##  assays() - convert ExperimentList to a SimpleList of matrices
##  exportClass() - save data to flat files

We can see the molecular data is available as MultiAssayExperiment containing four RangedSummarizedExperiment objects. The molecular data types are:

  1. “rna” represents RNA expression microarray data.
  2. “mirna” represents microRNA expression data.
  3. “rnaseq.comp” contains RNA sequencing data.
  4. “rnaseq.iso” is RNA sequencing data specifically targeting transcript isoforms.

Once we have our MultiAssayExperiment and TreatmentResponseExperiment objects ready, we next have to extract and curate sample and treatment metadata tables which store a summary of metadata for available samples and treatments in our PharmacoSet object.

For brevity, we have included preassembled annotation files for NCI-ALMANAC since manual curation is beyong the scope of this workshop. When creating your own PharmacoSet, we recommend assembling a set of standardized annotations from sources like PubChem or ChEMBL for treatments and Cellosaurus for cancer cell-lines and using these to standardize identifiers in your data as well as adding links to additional information in external databases.

data(NCI_ALMANAC_sample_metadata)
head(NCI_ALMANAC_sample_metadata)
sampleid tissueid cellosaurus_disease cellosaurus_id NCI_ALMANAC.sampleid NCI_ALMANAC.disease pharmacodb_cid
786-O 786-O Kidney Renal cell carcinoma CVCL_1051 786-0 Renal Cancer 7860_28_2019
A-498 A-498 Kidney Renal cell carcinoma CVCL_1056 A498 Renal Cancer A498_47_2019
A-549 A-549 Lung Lung adenocarcinoma CVCL_0023 A549/ATCC Non-Small Cell Lung Cancer A549_48_2019
ACHN ACHN Kidney Papillary renal cell carcinoma CVCL_1067 ACHN Renal Cancer ACHN_54_2019
BT-549 BT-549 Breast Invasive ductal carcinoma, not otherwise specified CVCL_1092 BT-549 Breast Cancer BT549_111_2019
CCRF-CEM CCRF-CEM Lymphoid Childhood T acute lymphoblastic leukemia CVCL_0207 CCRF-CEM Leukemia CCRFCEM_190_2019

The above table contains the sample metadata which will comprise the @sample slot of our PharmacoSet. It contains a set of standardized identifiers including cellid and tissueid, as well as cell-line identifiers from the NCI-ALMANAC publication and some additional metadata we have added.

data(NCI_ALMANAC_treatment_metadata)
head(NCI_ALMANAC_treatment_metadata)
treatmentid NCI_ALMANAC.treatmentid cid smiles inchikey
Pralatrexate Pralatrexate pralatrexate 148121 C#CCC(CC1=CN=C2C(=N1)C(=NC(=N2)N)N)C3=CC=C(C=C3)C(=O)NC(CCC(=O)O)C(=O)O OGSBUKJUDHAQEA-WMCAAGNKSA-N
1,3-Bis(2-chloroethyl)-1-nitrosourea 1,3-Bis(2-chloroethyl)-1-nitrosourea Carmustine 2578 C(CCl)NC(=O)N(CCCl)N=O DLGOEMSEDOSKAD-UHFFFAOYSA-N
Ixabepilone Ixabepilone Ixabepilone 6445540 CC1CCCC2(C(O2)CC(NC(=O)CC(C(C(=O)C(C1O)C)(C)C)O)C(=CC3=CSC(=N3)C)C)C FABUFPQFXZVHFB-PVYNADRNSA-N
Clofarabine Clofarabine Clofarabine 119182 Nc1nc(Cl)nc2n(cnc12)C3OC(CO)C(O)C3F WDDPHFBMKLOVOX-UHFFFAOYSA-N
Dasatinib Dasatinib Dasatinib 3062316 CC1=C(C(=CC=C1)Cl)NC(=O)C2=CN=C(S2)NC3=NC(=NC(=C3)N4CCN(CC4)CCO)C ZBNZXTGUTAYRHI-UHFFFAOYSA-N
Vincristine Vincristine Vincristine sulfate 5978 CCC1(O)CC2CN(C1)CCc3c([nH]c4ccccc34)C(C2)(C(=O)OC)c5cc6c(cc5OC)N(C=O)C7C86CCN9CC=CC(CC)(C98)C(C(=O)OC)C7(O)C(=O)OC

Similar to the sample table, this one contains treatment metadata which will be assigned to the @treatment slot of our PharmacoSet. It also contains a set of standardized identifiers including treatmentid, structural chemical identifiers, links to external databases as well as the drug identifiers from the NCI-ALMANAC publication.

For the curation list, we extract our mappings from our standardized identifiers to dataset specific identifiers and store those separately to ensure these associations are not lost should the treatment or sample metadata tables be modified.

curation <- list(
  sample=NCI_ALMANAC_sample_metadata[, c("sampleid", "NCI_ALMANAC.sampleid")],
  treatment=NCI_ALMANAC_treatment_metadata[, c("treatmentid", "NCI_ALMANAC.treatmentid")],
  tissue=data.frame(
    NCI_ALMANAC_sample_metadata[, c("tissueid", "NCI_ALMANAC.disease")],
    rownames=NCI_ALMANAC_sample_metadata$sampleid
  )
)

With all of our pieces in place, we can finally construct a drug combination PharmacoSet which can be used in subsequent analysis. Along with the recent changes in the structure and nomenclature of a PharmacoSet comes the introduction of the PharmacoSet2 contructor. This version enforces new requirements for the object such as a MultiAssayExperiment for the molecular profiles slot and a TreatmentResponseExperiment for treatment response slot. The original constructor is still included in the package for backwards compatibility but will be retired after we finish updating all our existing datasets on ORCESTRA.ca.

If you have an old style PharmacoSet but want to use the new package features, simply call updateObject to return the dataset with an updated class structure. This method will be available in all future releases to ensure datasets can be updated and reused regardless of their age.

(NCI_ALMANAC_2017 <- PharmacoSet2(
  name="NCI_ALMANAC_2017",
  molecularProfiles=NCI60_molecular_data,
  treatmentResponse=NCI_ALMANAC_tre,
  treatment=NCI_ALMANAC_treatment_metadata,
  sample=NCI_ALMANAC_sample_metadata,
  curation=curation
))
## <PharmacoSet>
## Name:  NCI_ALMANAC_2017 
## Date Created:  Fri Aug  5 23:52:07 2022 
## Number of samples:  60 
## Molecular profiles:
##    A MultiAssayExperiment object of 4 listed
##    experiments with user-defined names and respective classes.
##    Containing an ExperimentList class object of length 4:
##    [1] rna: RangedSummarizedExperiment with 54609 rows and 59 columns
##    [2] mirna: RangedSummarizedExperiment with 417 rows and 59 columns
##    [3] rnaseq.comp: RangedSummarizedExperiment with 23808 rows and 59 columns
##    [4] rnaseq.iso: RangedSummarizedExperiment with 23808 rows and 59 columns 
## Treatment response:
##    <TreatmentResponseExperiment> 
##   dim:  10080 62 
##   assays(3): response profiles assay_metadata 
##   rownames(10080): 1,3-Bis(2-chloroethyl)-1-nitrosourea::0.05:NA:1 1,3-Bis(2-chloroethyl)-1-nitrosourea::0.15:NA:1 ... Vorinostat:Vincristine:1:0.01:1 Vorinostat:Vincristine:1:0.1:1 
##   rowData(9): treatment1id treatment2id treatment1dose ... CONC2 NSC2 CONCINDEX2 
##   colnames(62): 786-O:1 A-498:1 A-549:1 ... UACC-257:1 UACC-62:1 UO-31:1 
##   colData(6): sampleid bio_rep CELLNAME CELLNBR PANEL PANELNBR 
##   metadata(1): experiment_metadata

Finally we see our finished drug combination PharmacoSet and are ready to use it to answer interesting research questions and discover the relationship between molecular phenotypes and treatment response!


Dose Response Models and Drug Sensitivity Measures

Understanding monotherapy dose-viability data is an important step in assessing drug combination response. Drug synergy is assessed by comparing the observed combination response to what is expected from the response of cells to each drug separately. While different models for “expected” combination effect will be discussed in the next section, we first review how monotherapy data are analyzed.

Monotherapy dose-response experiments are usually analyzed by first fitting a dose-response or dose-viability model to the observed measurements. The most common model used in the literature is a form of sigmoid function, called the Hill Curve. The Hill Curve has its basis in the Michaelis-Menten chemical kinetics model, while adding an extra parameter that is traditionally interpreted to account for co-operativity of binding. To better match real observed data, the model can also be extended to include parameters controlling the maximum and minimum observed viability.

NB: Sometimes, dose-viability and dose-response curves are used interchangebly in the literature. Precisely, dose-response curves for a cell kiling screen would plot % cell death against dose, leading to an increasing sigmoid curve.

In PharmacoGx, we use the 3 Parameter Hill Curve function as our model of drug response in cancer cell lines:

\[y = E_\infty + \frac{1-E_\infty}{1 + (\frac{x}{EC50})^{HS}}\]

This is a log-logistic model, meaning that it takes the form of a logistic curve when x is on the log scale. The three parameters are: \(E_\infty\), which is the maximal inhibition predicted at infinite concentration of the drug (the right asymptote); \(EC50\), which is the inflection point of the logit, where the inhibition reaches 50% of the maximum; \(HS\), the Hill Slope, which is thought to be a measure of the cooperativity of binding. This parameter controls the steepness of the logit, and is interpreted based on its relation to 1. Values larger than 1 imply positive cooperativity in inhibition/binding of the target for the small molecule ligands, which values less than 1 imply negative cooperativity/antagonism.

Diagram of a logLogistic Hill Slope

PharmacoGx provides functions to both fit and visualize dose-viability curves. Plotting dose-viability data is easy using the drugDoseResponseCurve function:

concentrations <- (1 / 2)^seq(0, 8) * 1
viabilities <- c(0.1, 5, 15, 23.8, 55.2, 85.9, 96.9, 98.4, 99.2)

drugDoseResponseCurve(
  concentrations = list("Exp 1" = concentrations),
  viabilities = list("Exp 1" = viabilities),
  plot.type = "Both"
)

Similarly, fitting Log-Logistic models to data is simple using the logLogisticRegression function in PharmacoGx.

concentrations <- (1 / 2)^seq(0, 8) * 1
viabilities <- c(0.1, 5, 15, 23.8, 55.2, 85.9, 96.9, 98.4, 99.2)

(pars <- logLogisticRegression(conc = concentrations, viability = viabilities))
## $HS
## [1] 2.053572
## 
## $E_inf
## [1] 3.284808
## 
## $EC50
## [1] 0.06902647
## 
## attr(,"Rsquare")
## [1] 0.9965584

Note that sometimes, it is may be more appropriate to use either a 2 parameter model (excluding the \(E_\infty\) parameter), or a 4 parameter model, including a parameter for the left-assymptote, that is, the measurement when no drug is added. PharmacoGx is opiniated in using the 3 parameter model, as the logical value for the viability as a percentage of untreated cells when no drug is added is 100%. However, the drc CRAN package provides a larger variety of dose-response models, should it be appropriate in your application.

We will soon demonstrate how to apply the logLogisticRegression to dose-viability data stored in a TreatmentResponseExperiment, but first we will talk about how to summarize dose-response curves for downstream analysis.

Computing Summary Measures for DDRCs

Often with in vitro Pharmacogenomics data, we want to compare the drug sensitivity of a cell line to some omic features. For this, we want to summarize the drug dose response curve into a single number representing the sensitivity of the cell line.

Some commonly used summary metrics for a dose-response curve

The IC50 and Area Above the Curve are two convenient metrics for quantifying the observed drug sensitivity. If you noticed above, the drugDoseResponseCurve function computes them by default. In PharmacoGx, they can be computed manually as follows:

concentrations <- (1 / 2)^seq(0, 8) * 1
viabilities <- c(0.1, 5, 15, 23.8, 55.2, 85.9, 96.9, 98.4, 99.2)

print(PharmacoGx::computeAUC(concentration = concentrations, viability = viabilities))
## [1] 46.63697
print(PharmacoGx::computeIC50(concentration = concentrations, viability = viabilities))
## [1] 0.0713488

If measurements are passed into these function, the Log Logistic model is fit to compute the IC50 and AAC. Clearly, if multiple metrics are being computed, redoing the fit is inefficent. Therefore, these functions can also take parameters fit by logLogisticRegression.

concentrations <- (1 / 2)^seq(0, 8) * 1
viabilities <- c(0.1, 5, 15, 23.8, 55.2, 85.9, 96.9, 98.4, 99.2)
pars <- logLogisticRegression(conc = concentrations, viability = viabilities)

print(PharmacoGx::computeAUC(concentration = concentrations, Hill_fit = pars))
## [1] 46.63697
print(PharmacoGx::computeIC50(concentration = concentrations, Hill_fit = pars))
## [1] 0.0713488

As processsing monotherapy data is often a step in preparing for assessing synergy from combination experiments, we will demonstrate how to use these functions together with a TreatmentResponseExperiment during our preprocessing for synergy calculation.

Drug Combination Synergy Models

When assessing drug combinations, the question of interest is usually not to quantify drug response to the therapy, but to identify combinations that give more or less response to therapy than expected from the monotherapy regiments, that is, to look for synergy or antagonism. Alternatively, for a paritcular drug combination, we may be interested to identify which biological models show synergy or antagonism, and understand why that could be the case on a molecular level.

For these tasks, it is important to define what we expect the result of combining two drugs at a particular combination to be without the presence of any synergy or antagonism between them. Usually, when searching for synergy, will have access to experiments testing each drug in monotherapy, as well as the drugs in combination. For our discussion, we we restrict ourselves to the two-drug combination case, however, the principles we discuss can be extended to handle multiple drugs.

Unfortunately, there is no consensus of what the theoretical expected value of response for combining two drugs is, given the response that each drug elicits in monotherapy. However, there are a few models widely applied in the literature, each with different biological assumptions behind them.

To introduce these models, we will define some mathematical notation, which we will use both throughout our exposition and the example computations. Let \(x_{1}\), and \(x_{2}\) correspond to the corresponding drug 1 and drug 2 doses at which we wish to predict the expected combination response. Let \(v_{1}\) and \(v_{2}\) be the viabilities that are observed in monotherapy for these drugs, and \(v_{c}\) be the viability observed in combination. Furthermore, let \(r_{i} = 1 - v_{i}\) be the response value corresponding to each measured viability. We will use both response and viability measurements in explaining the models, as some are more natural from one or the other perspective. However, it is easy to convert between these variables using the relationship above.

When we look for drug synergy, we are looking for cases where \(v_c < v_{model}\), or \(r_c > r_{model}\). The degree of synergy is usually measured either on the additive or multplicative scale, that is, looking at either \(v_c - v_{model}\) or \(v_c / v_{model}\). However, some models, such as the ZIP, also perscribe a certain perspective or scoring method to assess synergy.

This vignette covers 4 different models for predicting what the effect of drug combination treatment should be.

In the live workshop we focus only on the first 3 models.

  1. Highest Single Agent
  2. Bliss Independence
  3. Loewe Additivity
  4. Zero Interaction Potency (ZIP)

As we introduce each method, we will go through the theoretical motivation behind it, the calculation, and then show how to compute the expected viability/response for each model using the TreatmentResponseExperiment API.

Preprocessing Monotherapy Data for Synergy Estimation

Prior to computing expected viabilities from combination experiments, the monotherapy data may need to be preprocessed. In our case, the dataset we are using has replicate experiments for the monotherapy curves, as seen below:

treatmentResponse(NCI_ALMANAC_2017) |>
  subset(treatment2id == "") |>
  aggregate(
      assay="response",
      N=.N,
      by=c("treatment1id", "treatment1dose", "sampleid", "bio_rep")
  ) |>
  with(expr=hist(N))

The majority of these were replicates across plates, most probably so that monotherapy results could be plate matched to combination results. Therefore, we can make two choices for how to handle these replicates, either average across plates, or to match analysis by plate whenever possible. Ideally, we would recommend the latter approach, however, it is computationally more expensive (as we must fit monotherapy curves per-plate). In interest of computation time, we average replicates across all experiments for our demonstration.

Here, we use the endoaggregate TRE method while setting the by argument to compute an aggregation over each unique combination of treatment1id, treatment1dose, sampleid and bio_rep (summarizing over technical replicates).

NCI_TRE <- treatmentResponse(NCI_ALMANAC_2017)

# -- Extract and summarize monotherapy data, defined by an absent treatment2id
NCI_TRE |>
  endoaggregate(
    assay="response",
    target="mono_response",
    subset = treatment2id == "",
    viability=mean(viability),
    by=c("treatment1id", "treatment1dose", "sampleid", "bio_rep")
  ) ->
  NCI_TRE

head(NCI_TRE$mono_response)
##                            treatment1id treatment1dose sampleid bio_rep
## 1: 1,3-Bis(2-chloroethyl)-1-nitrosourea           0.05    786-O       1
## 2: 1,3-Bis(2-chloroethyl)-1-nitrosourea           0.05    A-498       1
## 3: 1,3-Bis(2-chloroethyl)-1-nitrosourea           0.05    A-549       1
## 4: 1,3-Bis(2-chloroethyl)-1-nitrosourea           0.05     ACHN       1
## 5: 1,3-Bis(2-chloroethyl)-1-nitrosourea           0.05   BT-549       1
## 6: 1,3-Bis(2-chloroethyl)-1-nitrosourea           0.05 CCRF-CEM       1
##    viability
## 1:  98.89562
## 2:  98.63179
## 3: 100.91533
## 4: 100.81652
## 5:  95.67643
## 6: 102.38559

Additionally, computing the Loewe and ZIP scores requires knowing the Hill Fit parameters for the monotherapy experiments. This is also a required step whenever it is necessary to interpolate between measured monotherapy doses to predict for the measured combination viability doses.

We can fit the curves using logLogisticRegression, and save the monotherapy fits back to the TreatmentResponseExperimentClassDiagram using the same endoaggregate pattern as above. Since we are already calculating the Hill Curve fits for these experiments, we will also calculate the AAC and IC50 values.

Note that by default, logLogisticRegression returns the E_inf parameter on a percentage scale when the input data is on a percentage scale. We convert to a fraction scale because that makes the subsequent calculations simpler.

# -- Fit the Hill curve model to monotherapy drugs and compute associated metrics
NCI_TRE |>
  endoaggregate(
    {
      fit <- PharmacoGx::logLogisticRegression(treatment1dose, viability)
      ic50 <- PharmacoGx::computeIC50(treatment1dose, Hill_fit = fit)
      aac <- PharmacoGx::computeAUC(treatment1dose, Hill_fit = fit, area.type = "Fitted")
      list(
        HS = fit[["HS"]], E_inf = fit[["E_inf"]]/100, EC50 = fit[["EC50"]],
        Rsq = as.numeric(unlist(attributes(fit))),
        aac = aac,
        ic50 = ic50
      )
    },
    assay = "mono_response",
    by = c("treatment1id", "sampleid", "bio_rep"),
    enlist = FALSE,
    target = "mono_profiles"
  ) -> NCI_TRE

head(NCI_TRE$mono_profiles)
##                            treatment1id sampleid bio_rep HS      E_inf EC50
## 1: 1,3-Bis(2-chloroethyl)-1-nitrosourea    786-O       1  1 0.08977556   50
## 2: 1,3-Bis(2-chloroethyl)-1-nitrosourea    A-498       1  1 0.56096579  500
## 3: 1,3-Bis(2-chloroethyl)-1-nitrosourea    A-549       1  1 0.15732265   50
## 4: 1,3-Bis(2-chloroethyl)-1-nitrosourea     ACHN       1  1 0.01729107   50
## 5: 1,3-Bis(2-chloroethyl)-1-nitrosourea   BT-549       1  1 0.05578801   20
## 6: 1,3-Bis(2-chloroethyl)-1-nitrosourea CCRF-CEM       1  1 0.12852970   50
##          Rsq       aac     ic50
## 1: 0.9425047 23.687649 60.94225
## 2: 0.7388963  3.303585       NA
## 3: 0.8796848 21.929806 72.95492
## 4: 0.8303083 25.573983 51.79104
## 5: 0.4578065 33.375273 22.51177
## 6: 0.9603414 22.679113 67.30013

We will now extract the combination data into its own assay in the LongTable, and append the monotherapy Hill fits using the mergeAssays function, so that we can later use the fitted data from above to calculate expected combination viability. We also filter our downstream analysis only to those experiments where the monotherapy fits were good (R squared better than 0.5).

Notice that we need to do two merges, one for appending the monotherapy Hill fits for the first compound treatment, and one for appending the monotherapy Hill fits for the second compound treatment. Fortunately, the suffixes parameter will allow us to differentiate between the two once they are merged to the same table.

The mergeAssays is built on top of R’s merge function (actually, merge from data.table), and therefore accepts the same arguments to control how the merge is done.

# -- Attach the Hill parameters from the monotherapy profiles to the combination data
# Extract the monotherapy fits
NCI_TRE |>
  assay("mono_profiles", summarize=TRUE) |>
  subset(
    Rsq > 0.5,  # filter to only good quality fits
    c("treatment1id", "sampleid", "bio_rep", "HS", "E_inf", "EC50")
  ) ->
  NCI_TRE$mono_fits

# Extract the combination response observations
NCI_TRE |>
  assay("response") |>
  subset(treatment2id != "") ->
  NCI_TRE$combo_response

# Merge with the fits from the monotherapy experiments
NCI_TRE |>
  mergeAssays(
    x="combo_response",
    y="mono_fits",
    by = c("treatment1id", "sampleid", "bio_rep"),
    sort = FALSE # workaround to not have weird DT keys as a result.
  ) |>
  mergeAssays(
    x="combo_response",
    y="mono_fits",
    by.x = c("treatment2id", "sampleid", "bio_rep"),
    by.y = c("treatment1id", "sampleid", "bio_rep"),
    suffixes = c("_1", "_2"),
    sort = FALSE # workaround to not have weird DT keys as a result.
  ) ->
  NCI_TRE
(NCI_TRE$combo_response)
##                                 treatment1id treatment2id treatment1dose
##      1: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
##      2: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
##      3: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
##      4: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
##      5: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
##     ---                                                                 
## 127310:                           Vorinostat  Vincristine            1.0
## 127311:                           Vorinostat  Vincristine            1.0
## 127312:                           Vorinostat  Vincristine            1.0
## 127313:                           Vorinostat  Vincristine            1.0
## 127314:                           Vorinostat  Vincristine            1.0
##         treatment2dose tech_rep sampleid bio_rep CONC1 CONC2 CONCINDEX2   NSC2
##      1:           0.03        1    786-O       1 2e-07 3e-08          1 102816
##      2:           0.03        1    A-498       1 2e-07 3e-08          1 102816
##      3:           0.03        1    A-549       1 2e-07 3e-08          1 102816
##      4:           0.03        1     ACHN       1 2e-07 3e-08          1 102816
##      5:           0.03        1 CCRF-CEM       1 2e-07 3e-08          1 102816
##     ---                                                                       
## 127310:           0.10        1       SR       1 1e-06 1e-07          3  67574
## 127311:           0.10        1    SW620       1 1e-06 1e-07          3  67574
## 127312:           0.10        1    T-47D       1 1e-06 1e-07          3  67574
## 127313:           0.10        1 UACC-257       1 1e-06 1e-07          3  67574
## 127314:           0.10        1  UACC-62       1 1e-06 1e-07          3  67574
##          CELLNAME CELLNBR                      PANEL PANELNBR  viability
##      1:     786-0      18               Renal Cancer        9 101.481127
##      2:      A498      13               Renal Cancer        9 103.771525
##      3: A549/ATCC       4 Non-Small Cell Lung Cancer        1  93.210581
##      4:      ACHN      23               Renal Cancer        9 100.015651
##      5:  CCRF-CEM       3                   Leukemia        7 106.627710
##     ---                                                                 
## 127310:        SR      19                   Leukemia        7   2.297507
## 127311:    SW-620       9               Colon Cancer        4  39.885949
## 127312:     T-47D      14              Breast Cancer        5  40.278674
## 127313:  UACC-257      21                   Melanoma       10  47.402650
## 127314:   UACC-62      20                   Melanoma       10  36.644778
##         PERCENTGROWTH PERCENTGROWTHNOTZ EXPECTEDGROWTH TESTVALUE CONTROLVALUE
##      1:        101.75            101.48        100.000   6951240      6849786
##      2:        104.84            103.77        100.000   4182200      4030200
##      3:         92.34             93.21         95.400   5888640      6317566
##      4:        100.02            100.02         98.098   6607800      6606766
##      5:        107.76            106.63        100.000  13606520     12760773
##     ---                                                                      
## 127310:        -77.71              2.30        -82.190    199440      8680713
## 127311:         32.22             39.89         14.564   7887720     19775686
## 127312:          4.38             40.28          4.133   1612280      4002813
## 127313:          9.23             47.40          5.886   6928760     14616820
## 127314:        -25.42             36.64        -15.760   4056120     11068753
##         TZVALUE      HS_1    E_inf_1     EC50_1 HS_2   E_inf_2      EC50_2
##      1: 1042043 1.0000000 0.08977556  50.000000    1 0.2720558  3.00000000
##      2:  889724 1.0000000 0.56096579 500.000000    1 0.4647826 30.00000000
##      3:  716463 1.0000000 0.15732265  50.000000    1 0.2215019  3.00000000
##      4: 1158169 1.0000000 0.01729107  50.000000    1 0.2172949  3.00000000
##      5: 1863442 1.0000000 0.12852970  50.000000    1 0.2407407  3.00000000
##     ---                                                                   
## 127310:  894742 1.0000000 0.17711243   2.000000    4 0.1928162  0.01523481
## 127311: 2235554 1.0000000 0.07429719   1.000000    1 0.4680214  0.03000000
## 127312: 1502784 4.0000000 0.36282782   1.429975    1 0.4751067  0.03000000
## 127313: 6146751 0.9672031 0.26448079   1.146777    1 0.5252780  0.03000000
## 127314: 5438488 1.0000000 0.17486339   2.000000    1 0.6472657  0.03000000

Finally, we can make predictions for each of the measured combination doses using the corresponding fit parameters and the Hill Curve models. This step is necessary when interpolating between monotherapy doses, but is often done in any case, as the predicted viabilities can be less noisy than each individual monotherapy measurement.

# predict viability at the combo doses from the Hill curves fit to monotherapy data
NCI_TRE |>
  endoaggregate(
    {
      viability_1 <- PharmacoGx:::.Hill(
        log10(treatment1dose),
        c(HS_1, E_inf_1, log10(EC50_1))
      )
      viability_2 <- PharmacoGx:::.Hill(
        log10(treatment2dose),
        c(HS_2, E_inf_2, log10(EC50_2))
      )
      list(
        viability_1 = viability_1,
        viability_2 = viability_2
      )
    },
    assay = "combo_response",
    by = c("treatment1id", "treatment2id", "treatment1dose", "treatment2dose",
    "sampleid", "bio_rep"),
    enlist = FALSE
  ) -> NCI_TRE

Last, we also put the measured combination viabilily onto a fractional, not percentage scale.

NCI_TRE$combo_response$viability <-
  NCI_TRE$combo_response$viability / 100

Finally, we can talk about the different models to compute the predicted response from monotherapy, and thereby compute estimates of synergy or antagonism.

Highest Single Agent / Highest Single Effect

The simplest model that one can assume for the response of drugs in combination is the Highest Single Agent model. This model predicts that the result of treating with two drugs is the same as the highest response (or lowest viability) seen by treating with either drug independently. Mathematically, this is equivalent to:

\[HSA((r_1,x_1), (r_2,x_2)) = \mbox{max}(r_1, r_2)\]

For viability values, this flips to a minimum.

Intuitively, this model can be interpreted as assuming that one compound is at the maximum possible effect, and the other is treated as a potentiator, contributing little to the response itself, but allowing the maximum response of the first drug to be increased. Practically, this case is not very often observed during in vitro dose-response experiments, and the highest single agent model is often a very weak baseline from which to compute synergy.

Computing HSA Reference Viabilities

As one can imagine, this is very simple to do using the min or max functions in R.

# -- compute our drug synergy metrics
NCI_TRE |>
  endoaggregate(
    HSA = min(viability_1, viability_2),
    by = c("treatment1id", "treatment2id", "treatment1dose", "treatment2dose",
      "sampleid", "bio_rep"),
    assay = "combo_response",
    target = "combo_response"
  ) -> NCI_TRE
head(NCI_TRE$combo_response)
##                            treatment1id treatment2id treatment1dose
## 1: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 2: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 3: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 4: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 5: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 6: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
##    treatment2dose tech_rep sampleid bio_rep CONC1 CONC2 CONCINDEX2   NSC2
## 1:           0.03        1    786-O       1 2e-07 3e-08          1 102816
## 2:           0.03        1    A-498       1 2e-07 3e-08          1 102816
## 3:           0.03        1    A-549       1 2e-07 3e-08          1 102816
## 4:           0.03        1     ACHN       1 2e-07 3e-08          1 102816
## 5:           0.03        1 CCRF-CEM       1 2e-07 3e-08          1 102816
## 6:           0.03        1 COLO 205       1 2e-07 3e-08          1 102816
##     CELLNAME CELLNBR                      PANEL PANELNBR viability
## 1:     786-0      18               Renal Cancer        9 1.0148113
## 2:      A498      13               Renal Cancer        9 1.0377152
## 3: A549/ATCC       4 Non-Small Cell Lung Cancer        1 0.9321058
## 4:      ACHN      23               Renal Cancer        9 1.0001565
## 5:  CCRF-CEM       3                   Leukemia        7 1.0662771
## 6:  COLO 205      10               Colon Cancer        4 0.9205163
##    PERCENTGROWTH PERCENTGROWTHNOTZ EXPECTEDGROWTH TESTVALUE CONTROLVALUE
## 1:        101.75            101.48        100.000   6951240      6849786
## 2:        104.84            103.77        100.000   4182200      4030200
## 3:         92.34             93.21         95.400   5888640      6317566
## 4:        100.02            100.02         98.098   6607800      6606766
## 5:        107.76            106.63        100.000  13606520     12760773
## 6:         90.74             92.05         86.870   5822480      6325233
##    TZVALUE HS_1    E_inf_1 EC50_1 HS_2    E_inf_2 EC50_2 viability_1
## 1: 1042043    1 0.08977556     50    1 0.27205581      3   0.9963736
## 2:  889724    1 0.56096579    500    1 0.46478261     30   0.9998245
## 3:  716463    1 0.15732265     50    1 0.22150189      3   0.9966427
## 4: 1158169    1 0.01729107     50    1 0.21729490      3   0.9960848
## 5: 1863442    1 0.12852970     50    1 0.24074074      3   0.9965280
## 6:  895972    1 0.07365439     50    1 0.02186312      3   0.9963094
##    viability_2       HSA
## 1:   0.9927926 0.9927926
## 2:   0.9994653 0.9994653
## 3:   0.9922921 0.9922921
## 4:   0.9922504 0.9922504
## 5:   0.9924826 0.9924826
## 6:   0.9903155 0.9903155

For convenience, PharmacoGx::computeHSA(viability_1, viability_2) is a more readable alias for computing the HSA reference viability.

Bliss Independence

Bliss Independence can be understood by making the assumption that cell death from treatment with a drugs is a stochastic process with the probability of cell death being proportional to the response observed in the experiment. The model than assumes that the cell death events from the two drugs are independent, for example because they occur through completely separate pathways. As such, the model predicts that adding a second drug will kill the same fraction of cells as it kills on its own, and therefore the fractions of cells remaining alive are caculated by multplying the fraction of cells alive after treatment with drug 1 in monotherapy, with the fraction of cells alive after treatment with drug 2 in monotherapy.

Diagram of Bliss Independence Assumption. Bliss independence assumes a constant fraction of cells is killed independently by each drug. In this diagram, the orange drug leaves 40% of the cells alive, and the blue drug leaves 60% of the cells alive. Therefore, Bliss predicts that the two drugs in combination will leave 24% of the cells alive.

Mathematically, that works out to:

\[BLISS((v_1,x_1), (v_2,x_2)) = v_1 v_2\]

In response units this is:

\[BLISS((r_1,x_1), (r_2,x_2)) = r_1 + r_2 - r_1*r_2\]

This model is always going to be more strict that HSA (as long as response is between 0 and 1), and is usually a more practical assumption to make when searching for drug synergy. Intuitively, one can think of it as modeling the case where the two drugs have separate mechanisms, with synergy signifiying a biological interaction that makes them more effective than expected.

Computing Bliss Reference Viabilities

This is also quite simple to do, as we just use the product of the monotherapy viabilities.

# -- compute our drug synergy metrics
NCI_TRE |>
  endoaggregate(
    Bliss = prod(viability_1, viability_2),
    by = c("treatment1id", "treatment2id", "treatment1dose", "treatment2dose",
      "sampleid", "bio_rep"),
    assay="combo_response",
    target = "combo_response"
  ) -> NCI_TRE
head(NCI_TRE$combo_response)
##                            treatment1id treatment2id treatment1dose
## 1: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 2: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 3: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 4: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 5: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 6: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
##    treatment2dose tech_rep sampleid bio_rep CONC1 CONC2 CONCINDEX2   NSC2
## 1:           0.03        1    786-O       1 2e-07 3e-08          1 102816
## 2:           0.03        1    A-498       1 2e-07 3e-08          1 102816
## 3:           0.03        1    A-549       1 2e-07 3e-08          1 102816
## 4:           0.03        1     ACHN       1 2e-07 3e-08          1 102816
## 5:           0.03        1 CCRF-CEM       1 2e-07 3e-08          1 102816
## 6:           0.03        1 COLO 205       1 2e-07 3e-08          1 102816
##     CELLNAME CELLNBR                      PANEL PANELNBR viability
## 1:     786-0      18               Renal Cancer        9 1.0148113
## 2:      A498      13               Renal Cancer        9 1.0377152
## 3: A549/ATCC       4 Non-Small Cell Lung Cancer        1 0.9321058
## 4:      ACHN      23               Renal Cancer        9 1.0001565
## 5:  CCRF-CEM       3                   Leukemia        7 1.0662771
## 6:  COLO 205      10               Colon Cancer        4 0.9205163
##    PERCENTGROWTH PERCENTGROWTHNOTZ EXPECTEDGROWTH TESTVALUE CONTROLVALUE
## 1:        101.75            101.48        100.000   6951240      6849786
## 2:        104.84            103.77        100.000   4182200      4030200
## 3:         92.34             93.21         95.400   5888640      6317566
## 4:        100.02            100.02         98.098   6607800      6606766
## 5:        107.76            106.63        100.000  13606520     12760773
## 6:         90.74             92.05         86.870   5822480      6325233
##    TZVALUE HS_1    E_inf_1 EC50_1 HS_2    E_inf_2 EC50_2 viability_1
## 1: 1042043    1 0.08977556     50    1 0.27205581      3   0.9963736
## 2:  889724    1 0.56096579    500    1 0.46478261     30   0.9998245
## 3:  716463    1 0.15732265     50    1 0.22150189      3   0.9966427
## 4: 1158169    1 0.01729107     50    1 0.21729490      3   0.9960848
## 5: 1863442    1 0.12852970     50    1 0.24074074      3   0.9965280
## 6:  895972    1 0.07365439     50    1 0.02186312      3   0.9963094
##    viability_2       HSA     Bliss
## 1:   0.9927926 0.9927926 0.9891924
## 2:   0.9994653 0.9994653 0.9992899
## 3:   0.9922921 0.9922921 0.9889607
## 4:   0.9922504 0.9922504 0.9883656
## 5:   0.9924826 0.9924826 0.9890367
## 6:   0.9903155 0.9903155 0.9866606

For convenience, PharmacoGx::computeBliss(viability_1, viability_2) is a more readable alias for computing the Bliss reference viability.

Loewe Additivity

Loewe Additivity comes from a different assumption on what the effect of adding two drugs should be. While Bliss Independence assumes that the drugs act independently, for example through different pathways, Loewe assumes that in the null case, adding \(x_2\) concentration of a second drug should lower the viability as if we added a proportional amount of the same drug, with that proportion caculated according to the relative efficacy of the two drugs at the final combination viability. This proportinal relationship is defined implicity at each predicted value of $v_{loewe} by the following formula:

\[ \frac{x_1}{X^{loewe}_1} + \frac{x_2}{X^{loewe}_2} = 1\]

where $ X^{loewe}_1 $ is the concentration of drug 1 required to achieve \(v_{loewe}\) on its own, and similarly for $ X^{loewe}_2$ and drug 2. Translating this to english, it assumes that we can lower the concentration of drug 1 needed to see the same effect by replacing the missing dose with a similar amount of drug 2, adjusted for the difference in potency between the two drugs.

Practically, this formula is not very useful for estimating \(v_{loewe}\) from actual dose-response data. However, if we assume that drug response follows a Hill Curve model, we can define \(y_{loewe}\) implicitly using the parameters \(HS_{1}\), \(EC50_{1}\), and \(E_{\infty,1}\), and \(HS_{2}\), \(EC50_{2}\), and \(E_{\infty,2}\) by rearranging the Hill Curve relationship as follows:

\[ x = EC50 * (\frac{v-1}{v-E_{\infty}})^{\frac{1}{HS}}\]

We can then plug this relationship in to get an emplicit formula for \(v_{loewe}\):

\[ \frac{x_1}{EC50_1 (\frac{1 - v_{loewe}}{v_{loewe} - E_{\infty,1}})^{\frac{1}{HS_1}}} + \frac{x_2}{EC50_2 (\frac{1 - v_{loewe}}{v_{loewe} - E_{\infty,2}})^{\frac{1}{HS_2}}} = 1 \]

This formula can be solved numerically to find the value of \(v_{loewe}\) for any two particular fitted curves.

While the Loewe Additivity model can be used to calculate the expected response at any pair of doses, more commonly, the Loewe Additivity is assessed in terms of the Combination Index. The combination index measures the deviation from the Loewe assumption as a ratio at each point, by computing the Loewe model with the observed combination viability instead of the reference viability. As it doesn’t require calculating \(v_{loewe}\), it is also easier to compute. The formula for the Combination Index given a Loewe assumption is:

\[ CI_{loewe} = \frac{x_1}{EC50^1(\frac{v_c}{v_c - E_{\infty}^1})^{1/HS_1}} + \frac{x_2}{EC50^2(\frac{v_c}{v_c - E_{\infty}^2})^{1/HS_2}} \]

Diagram of Loewe Additivity Assumption. Loewe additivity assumes that when plotting viability as a function of both concentrations (bottom plots), lines of equal viability (isoboles) will be straight. Using this assumption, either the expected viability for each pair of concentrations can be calculated (left), or the combination index can be computed to assess synergy or antagonism directly (right).

Computing Loewe Reference Viabilities

As mentioned above, the computation of the Loewe reference viability requires solving a non-linear implicitly defined equation. PharmacoGx provides the computeLoewe function to do this calculation.

# -- compute our drug synergy metrics
NCI_TRE |>
  endoaggregate(
    Loewe = PharmacoGx::computeLoewe(
      treatment1dose, HS_1, E_inf_1, EC50_1,
      treatment2dose, HS_2, E_inf_2, EC50_2
    ),
    by = c("treatment1id", "treatment2id", "treatment1dose", "treatment2dose",
      "sampleid", "bio_rep"),
    assay = "combo_response"
  ) -> NCI_TRE
head(NCI_TRE$combo_response)
##                            treatment1id treatment2id treatment1dose
## 1: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 2: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 3: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 4: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 5: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 6: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
##    treatment2dose tech_rep sampleid bio_rep CONC1 CONC2 CONCINDEX2   NSC2
## 1:           0.03        1    786-O       1 2e-07 3e-08          1 102816
## 2:           0.03        1    A-498       1 2e-07 3e-08          1 102816
## 3:           0.03        1    A-549       1 2e-07 3e-08          1 102816
## 4:           0.03        1     ACHN       1 2e-07 3e-08          1 102816
## 5:           0.03        1 CCRF-CEM       1 2e-07 3e-08          1 102816
## 6:           0.03        1 COLO 205       1 2e-07 3e-08          1 102816
##     CELLNAME CELLNBR                      PANEL PANELNBR viability
## 1:     786-0      18               Renal Cancer        9 1.0148113
## 2:      A498      13               Renal Cancer        9 1.0377152
## 3: A549/ATCC       4 Non-Small Cell Lung Cancer        1 0.9321058
## 4:      ACHN      23               Renal Cancer        9 1.0001565
## 5:  CCRF-CEM       3                   Leukemia        7 1.0662771
## 6:  COLO 205      10               Colon Cancer        4 0.9205163
##    PERCENTGROWTH PERCENTGROWTHNOTZ EXPECTEDGROWTH TESTVALUE CONTROLVALUE
## 1:        101.75            101.48        100.000   6951240      6849786
## 2:        104.84            103.77        100.000   4182200      4030200
## 3:         92.34             93.21         95.400   5888640      6317566
## 4:        100.02            100.02         98.098   6607800      6606766
## 5:        107.76            106.63        100.000  13606520     12760773
## 6:         90.74             92.05         86.870   5822480      6325233
##    TZVALUE HS_1    E_inf_1 EC50_1 HS_2    E_inf_2 EC50_2 viability_1
## 1: 1042043    1 0.08977556     50    1 0.27205581      3   0.9963736
## 2:  889724    1 0.56096579    500    1 0.46478261     30   0.9998245
## 3:  716463    1 0.15732265     50    1 0.22150189      3   0.9966427
## 4: 1158169    1 0.01729107     50    1 0.21729490      3   0.9960848
## 5: 1863442    1 0.12852970     50    1 0.24074074      3   0.9965280
## 6:  895972    1 0.07365439     50    1 0.02186312      3   0.9963094
##    viability_2       HSA     Bliss     Loewe
## 1:   0.9927926 0.9927926 0.9891924 0.9892436
## 2:   0.9994653 0.9994653 0.9992899 0.9993076
## 3:   0.9922921 0.9922921 0.9889607 0.9889720
## 4:   0.9922504 0.9922504 0.9883656 0.9884240
## 5:   0.9924826 0.9924826 0.9890367 0.9890752
## 6:   0.9903155 0.9903155 0.9866606 0.9867043

BONUS: ZIP

The Zero interaction Potency model was recently introduced with the goal of capturing the ideas behind both the Bliss and Loewe assumptions. As the name implies the model assumes that the two drugs do not “potentiate” each other, meaning, that the only difference between the dose response curves of Drug 1 with and without Drug 2 present is that the presence of Drug 2 changes the baseline response, without affecting the potency of Drug 1. While the ZIP model can be extended to 3 and 4 parameter Hill Curve models, it is simplest to understand and explain using the 2 parameter Hill Curve, which is equivalent to the case where \(E_{\infty}=0\).

The authors were then able to show that the ZIP model can be derived from multiplying the two fitted Dose-Response curves together.

\[ ZIP((v_1,x_1), (v_2,x_2)) = (E_{\infty}^1 + \frac{1-E_{\infty}^1}{1+(\frac{x_1}{EC50^1})^{HS^1}}) (E_{\infty}^2 + \frac{1-E_{\infty}^2}{1+(\frac{x_2}{EC50^2})^{HS^2}}) \]

While the formula for computing the ZIP reference value at each point works out to being the Bliss formula applied to the fitted curves, the ZIP assumptions suggest a new method to assess syngergy or antagosism at each point. Building on the assumption that combination without synergy should not change the potency of each particular drug, the ZIP \(\delta\) score defines a measure comparing the difference between the fitted curve for each drug with and without the presence of the second drug, taking the average of differences for the two drugs.

Mathematically, ZIP defines two values: \[ v_c^{1 \leftarrow 2}((v_1,x_1), (v_2,x_2)) = \frac{v_2}{1 + (\frac{x_1}{EC50^{1 \leftarrow 2}})^{HS^{1 \leftarrow 2}}} \] \[ v_c^{2 \leftarrow 1}((v_1,x_1), (v_2,x_2)) = \frac{v_1}{1 + (\frac{x_2}{EC50^{2 \leftarrow 1}})^{HS^{2 \leftarrow 1}}} \]

In this notation, {\(1 \leftarrow 2\)} means that we fit the Hill Curve for drug 1 after the addition of drug 2 at the dose \(x_2\). Intuitively, we adjust for the observed effect of adding \(x_2\) of drug 2 (\(v_2\)), and then treat drug 1 as if it is a monotherapy experiment. The \(\delta\) score is then defined as the difference of the viability predicted by this curve and the ZIP predicted viability, averaged over treating each of the drugs as the “main drug”:

\[ \delta((v_1,x_1), (v_2,x_2)) = \frac{v_c^{1 \leftarrow 2} - v_{ZIP}}{2} + \frac{v_c^{2 \leftarrow 1} - v_{ZIP}}{2} \]

For more information on the ZIP model, we refer the reader to the original publication (Yadav et al. 2015). Alternatively, the synergyfinder package contains vignettes and the reference implementation for ZIP calculations.

Picking a Null Drug Combination Model

Given the fact that there is no universal consensus on which null model is correct in all cases, when analyzing drug combination data, a choice must be made by the analyst. Ideally, this choice would be made according to reasonable assumptions about the drugs and their mechanisms of action. For example, for two compounds which inhibit targets converging on the same pathway, the Loewe model may be most appropriate, whereas for drug combinations with completely unrelated mechanisms, Bliss may be a more natural choice.

However, practical considerations may also limit your choices. For both the Loewe and ZIP approaches, one needs enough points to be measured to be able to fit Hill Curves to the available data. For Loewe, this needs to be done only in monotherapy, while for ZIP, both monotherapy and combination data must have sufficient points to fit a curve. On the other hand, both HSA and Bliss can be computed from the raw measurements of monotherapy viability, not requiring any models of the dose-response data to be fit.

In general, we would recommend using a model which makes stronger assumptions than HSA, which is the weakest of the presented models. If the number of tested doses is low (<6), the Bliss model is probably the reference of choice, even though it is known to be quite strict. When possible to calculate, the ZIP approach has advantages in a high-throughput setting, where it would be difficult to pick between Bliss and Loewe a-priori for each individual combination.

Quantifying Synergy or Antagonism Compared to a Null Model

Once a null model is chosen, the expected combination viability or response is compared to the observed value, as mentioned above. For all methods, this can be done using either a difference, or a ratio. In addition, the Loewe model directly facilitates calculating the Combination Index described above, and the ZIP model defines the \(\delta\) score for quantifying synergy. The main practical consideration when choosing how to compare the observations to the null is the scale on which results will be interpreted or further analyzed. Taking the ratio of \(v_c/v_{model}\), or using the Combination Index for Loewe leads to values that lie in \((0, \infty]\) (with 1 as the reference for no syngergy or antagonism), similar to fold changes. On the other hand, both differences and the ZIP \(\delta\)-score lie on a bounded scale \([-1,1]\), with 0 as the reference.

If the goal is simply to identify whether there was synergy or antagonism observed at a certain dose, the last step would be to visualize the results, and examine the extent of deviation from the null model. For this, the scale chosen is a matter of preference. However, if the downstream goal is biomarker discovery using a summary of synergy for each model tested with a particular combination, if the Combination Index or a ratio is used, it may be more appropriate to analyze the results on a log-scale, as otherwise the range is asymmetric around the reference.

In practice, our personal preference is to use the absolute deviation from the Bliss predicted value for downstream analysis.

Note that we also skip the ZIP delta score computation in the interest of time.

# -- compute combination index and score
NCI_TRE |>
  endoaggregate(
    HSA_CI = viability / HSA,
    HSA_score = HSA - viability,
    Bliss_CI = viability / Bliss,
    Bliss_score = Bliss - viability,
    Loewe_CI = (
      (treatment1dose /
        EC50_1 * ((1 - viability_1) / (viability_1 - E_inf_1))^(1 / HS_1)
      ) +
        (treatment2dose /
          EC50_2 * ((1 - viability_2) / (viability_2 - E_inf_2))^(1 / HS_2)
        )
        ),
    Loewe_score = Loewe - viability,
    viability = viability,
    viability_1 = viability_1,
    viability_2 = viability_2,
    by = c("treatment1id", "treatment2id", "treatment1dose", "treatment2dose",
      "sampleid", "bio_rep"),
    assay = "combo_response",
    target = "combo_scores"
  ) ->
  NCI_TRE

combo_scores <- NCI_TRE$combo_scores
head(combo_scores)
##                            treatment1id treatment2id treatment1dose
## 1: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 2: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 3: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 4: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 5: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
## 6: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine            0.2
##    treatment2dose sampleid bio_rep    HSA_CI    HSA_score  Bliss_CI Bliss_score
## 1:           0.03    786-O       1 1.0221785 -0.022018634 1.0258988 -0.02561889
## 2:           0.03    A-498       1 1.0382704 -0.038249933 1.0384527 -0.03842538
## 3:           0.03    A-549       1 0.9393462  0.060186284 0.9425105  0.05685488
## 4:           0.03     ACHN       1 1.0079678 -0.007906062 1.0119297 -0.01179090
## 5:           0.03 CCRF-CEM       1 1.0743535 -0.073794516 1.0780966 -0.07724041
## 6:           0.03 COLO 205       1 0.9295182  0.069799189 0.9329614  0.06614431
##    Loewe_CI Loewe_score viability viability_1 viability_2
## 1: 1.16e-04 -0.02556769 1.0148113   0.9963736   0.9927926
## 2: 1.16e-06 -0.03840768 1.0377152   0.9998245   0.9994653
## 3: 1.16e-04  0.05686616 0.9321058   0.9966427   0.9922921
## 4: 1.16e-04 -0.01173252 1.0001565   0.9960848   0.9922504
## 5: 1.16e-04 -0.07720193 1.0662771   0.9965280   0.9924826
## 6: 1.16e-04  0.06618805 0.9205163   0.9963094   0.9903155

Once we have a measure of synergy calculated for each dose combination in a particular experiment, we often would like to visualize the whole surface of drug-dose combinations to understand how synergy depends on dosage.

Below, we plot the Bliss score for an antagonistic combination of drugs, Mitotane and Paclitaxel, on the A-549 cell line. If we recall the formula for the Bliss score above, we took \(v_{model} - v_c\), meaning that in this case, negative numbers indicate antagonism, as there is more viability observed than expected.

library(ggplot2)

p <- ggplot(combo_scores[
    treatment1id == "Mitotane" & treatment2id == "Paclitaxel" & sampleid == "A-549"
    ],
    aes(treatment1dose, treatment2dose, fill = Bliss_score)
  ) +
  geom_tile() +
  scale_x_log10() +
  scale_y_log10() +
  xlab("Mitotane") +
  ylab("Paclitaxel")
print(p)

Note that here, it seems that antagonism is exacerbated at higher doses, suggesting a dose-dependant mechanism for antagonism.

Summarizing Synergy Matrices into Single Values

Often, we will want to summarize a whole synergy surface into a single synergy score for a drug combination per cell-line, for example, if we want to discover biomarkers of synergy. There are several methods that this can be done, the most common of which is taking the average value of your score matrix for the combination. If your doses are equally spaced (on the log scale), this will give a measure of Volume Under the Surface.

For some experiments, you may want to be a bit more careful, and not select the average of the whole surface. For example, in the experiment above, we see very strong antagonism when both doses of the drugs are high. However, this may not be a very interesting regime, as often one of the goals of drug combination therapy is to limit the dose of any single drug used to reduce associated toxicities. Therefore, an option could be to summarize only over a dose-range of interest.

If you find yourself in a situation where synergy is observed only in a narrow dose range, the most extreme deviation from the reference (deviation from 0 or 1, depending on use of additive or fold-change metrics) may instead be more appropriate.

For our demo, we will summarize the combination profiles we have calculated into a Mean Bliss Score per experiment.

(NCI_TRE |>
  aggregate(
    Mean_Bliss = mean(Bliss_score),
    assay = "combo_scores",
    by = c("treatment1id", "treatment2id", "sampleid")
  ) ->
  mean_bliss_score)
##                                treatment1id treatment2id sampleid   Mean_Bliss
##     1: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine    786-O -0.038395834
##     2: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine    A-498  0.063834787
##     3: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine    A-549 -0.048417317
##     4: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine     ACHN -0.039941211
##     5: 1,3-Bis(2-chloroethyl)-1-nitrosourea  Azacitidine CCRF-CEM -0.018621532
##    ---                                                                        
## 14142:                           Vorinostat  Vincristine       SR  0.170580816
## 14143:                           Vorinostat  Vincristine    SW620 -0.078155666
## 14144:                           Vorinostat  Vincristine    T-47D -0.006124021
## 14145:                           Vorinostat  Vincristine UACC-257 -0.025048742
## 14146:                           Vorinostat  Vincristine  UACC-62  0.099422703

Biomarker Discovery:

Monotherapy response

PharmacoGx makes an initial analysis for univariate biomarker discovery very simple, by providing the drugSensitivitySig function which will compute the association between molecular features and drug response summaries (such as AAC) across the cell lines in a PharmacoSet.

There are two approaches implemented in PharmacoGx for measuring the association between molecular features and drug response. The first, older approach is to use a linear model to estimate the effect of the molecular feature measurements on the drug response summary, and an analysis of variance F-test to test the significance of the molecular feature in predicting drug response over a Null model. If multiple tissue types are present in the dataset, the Null model includes the tissue as an explanatory features, otherwise, it is just the intercept. This approach is most flexible, allowing for both continous and binary measurements of drug response, as well as continous, categorical and binary molecular features. However, the estimate of the effect returned is a standardized regression coefficient, which may be difficult to interpret and compare between biomarkers.

The second approach is to use a Pearson Correlation, or a Partial Pearson Correlation corrected for tissue type to test for associations. Currently, this is our preferred approach for two reasons:

  1. The returned effect size estimate will be on the scale of a Pearson Correlation, which is familiar to most scientists and comparable between molecular features (as long as the sample sizes do not vary too drastically).
  2. Currently, PharmacoGx has implemented efficient permutation testing of Pearson Correlations for significance assessment, in addition to the regular asymptotic formulas for inferring significance.

As our focus is on drug combination analysis in this workshop, we won’t be demonstrating the usage of drugSensitivitySig to compute monotherapy associations with drug response. Rather, we skip directly to computing associations with drug synergy. An example of monotherapy analysis can be found in the PharmacoGx vignette: vignette("PharmacoGx").

Combination Synergy

The drugSensitivitySig function can be used to find biomarkers for synergy, by passing in a matrix of synergy values for each drug combination and cell line.

Above, we had already summarized our synergy surfaces into a single mean value for each experiment. Now, we need to reshape the “long” format table into a table of drug combinations x cell lines.

## Create a matrix of drug combo x cell lines.
synergy_summaries <- as.data.frame(data.table::dcast(
  mean_bliss_score,
  treatment1id + treatment2id ~ sampleid
))

## Name our rows something sensible related to the drug combinations.
rownames(synergy_summaries) <- paste(
    synergy_summaries$treatment1id,
    synergy_summaries$treatment2id,
    sep="_+_"
  )

## Removing the two columns of treatment ids, and converting to a numerical matrix.
synergy_summaries <- data.matrix(synergy_summaries[, -c(1:2)])

Finally, we can run our biomarker discovery. Note that we pass in the treatment x sample matrix to the sProfiles argument, short for sensitivity profiles. For the purposes of this demonstration, we also subset the features we are computing over to 100 random genes, simply to save time.

gene_exp_sigs <- drugSensitivitySig(NCI_ALMANAC_2017,
  mDataType = "rnaseq.comp",
  sProfiles = synergy_summaries,
  modeling.method = "pearson",
  features = sample(rownames(molecularProfiles(NCI_ALMANAC_2017, "rnaseq.comp")), 100),
  nthread = 1
)
## Computing drug sensitivity signatures...

The object returned by drugSensitivitySig is a 3D array, with the first two dimensions of features X treatments, and the third dimension returning computed statistics, the associated confidence intervals, p values, and FDR correction.

One intersting (toy) example we find here is that expression of a couple of genes are significantly correlated with antagonism (negative Mean Bliss Score) for the combination of Vinblastine and Dasatinib.

(significant_vin_das <- subset(
  gene_exp_sigs[, "Vinblastine_+_Dasatinib", ],
  gene_exp_sigs[, "Vinblastine_+_Dasatinib", "significant"] == 1
))
##           estimate  n df significant       pvalue      lower      upper
## NEUROD2 -0.5833826 42 30           1 0.0004574348 -0.7745218 -0.2946210
## MOBP    -0.6247997 42 30           1 0.0001320618 -0.7993452 -0.3530149
##                fdr
## NEUROD2 0.02287174
## MOBP    0.01320618

To get a more global view of the results, we can also plot the resulting object, which by default will display as a volcano plot. Note below that the object returned is a normal ggplot2 plot, and therefore we can add other annotations to it, such as the gene labels. Below, we will add the labels for the two genes predicting antagonism between Vinblastine and Dasatinib to the plot.

p <- plot(gene_exp_sigs)
p <- p + geom_label(aes(x = significant_vin_das[, "estimate"],
                        y = -log10(significant_vin_das[, "pvalue"]),
                        label = rownames(significant_vin_das)))
print(p)

Further Resources

We hope that this workshop leaves you feeling familiar with both the data structures defined in CoreGx as well as the analysis of drug combination experiments. Over time, our group will be curating public drug combination studies and releasing PharmacoSet objects through our orcestra.ca platform, which will be available for everyone to download and use in their own research. These studies will use identifiers mapped to the same databases (Pubchem and Cellosaurus) as already existing monotherapy PSets, allowing studies combining monotherapy and combination response data.

Furthermore, on orcestra.ca we are curating clinical datasets combining genomic or transcriptomic profiling of cancer patient samples with clinical drug response, mapped to the same identifiers and processed as much as possible with the same pipelines. Much of the time, clinical treatments are done in combination therapies, as standard of care often involves more than 1 treatment. As this resource grows, it will become possible to evaluate how well cell line combination experiment results translate to clinical outcomes.


References

Holbeck, Susan L., Richard Camalier, James A. Crowell, Jeevan Prasaad Govindharajulu, Melinda Hollingshead, Lawrence W. Anderson, Eric Polley, et al. 2017. “The National Cancer Institute ALMANAC: A Comprehensive Screening Resource for the Detection of Anticancer Drug Pairs with Enhanced Therapeutic Activity.” Cancer Research 77 (13): 3564–76. https://doi.org/10.1158/0008-5472.CAN-17-0489.
Yadav, Bhagwan, Krister Wennerberg, Tero Aittokallio, and Jing Tang. 2015. “Searching for Drug Synergy in Complex Dose an Interaction Potency Model.” Computational and Structural Biotechnology Journal 13 (January): 504–13. https://doi.org/10.1016/j.csbj.2015.09.001.

SessionInfo

## R version 4.2.0 (2022-04-22)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.4 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] ggplot2_3.3.6               data.table_1.14.2          
##  [3] PharmacoGx_3.1.5            CoreGx_2.1.7               
##  [5] SummarizedExperiment_1.27.1 Biobase_2.57.1             
##  [7] GenomicRanges_1.49.0        GenomeInfoDb_1.33.3        
##  [9] IRanges_2.31.0              S4Vectors_0.35.1           
## [11] MatrixGenerics_1.9.1        matrixStats_0.62.0         
## [13] BiocGenerics_0.43.1         pgxBioc2022Workshop_1.0.9  
## [15] BiocStyle_2.25.0           
## 
## loaded via a namespace (and not attached):
##   [1] fgsea_1.23.0                colorspace_2.0-3           
##   [3] ellipsis_0.3.2              rprojroot_2.0.3            
##   [5] lsa_0.73.3                  XVector_0.37.0             
##   [7] fs_1.5.2                    farver_2.1.1               
##   [9] SnowballC_0.7.0             MultiAssayExperiment_1.23.4
##  [11] DT_0.23                     fansi_1.0.3                
##  [13] codetools_0.2-18            cachem_1.0.6               
##  [15] knitr_1.39                  jsonlite_1.8.0             
##  [17] magicaxis_2.2.14            cluster_2.1.3              
##  [19] shinydashboard_0.7.2        shiny_1.7.2                
##  [21] mapproj_1.2.8               BiocManager_1.30.18        
##  [23] compiler_4.2.0              backports_1.4.1            
##  [25] Matrix_1.4-1                fastmap_1.1.0              
##  [27] bench_1.1.2                 limma_3.53.5               
##  [29] cli_3.3.0                   later_1.3.0                
##  [31] visNetwork_2.1.0            htmltools_0.5.3            
##  [33] tools_4.2.0                 igraph_1.3.4               
##  [35] gtable_0.3.0                glue_1.6.2                 
##  [37] GenomeInfoDbData_1.2.8      reshape2_1.4.4             
##  [39] RANN_2.6.1                  dplyr_1.0.9                
##  [41] maps_3.4.0                  fastmatch_1.1-3            
##  [43] Rcpp_1.0.9                  slam_0.1-50                
##  [45] jquerylib_0.1.4             pkgdown_2.0.6              
##  [47] vctrs_0.4.1                 BumpyMatrix_1.5.0          
##  [49] xfun_0.31                   stringr_1.4.0              
##  [51] mime_0.12                   lifecycle_1.0.1            
##  [53] gtools_3.9.3                MASS_7.3-57                
##  [55] zlibbioc_1.43.0             scales_1.2.0               
##  [57] ragg_1.2.2                  promises_1.2.0.1           
##  [59] relations_0.6-12            parallel_4.2.0             
##  [61] RColorBrewer_1.1-3          sets_1.0-21                
##  [63] yaml_2.3.5                  memoise_2.0.1              
##  [65] gridExtra_2.3               downloader_0.4             
##  [67] sass_0.4.2                  stringi_1.7.8              
##  [69] highr_0.9                   NISTunits_1.0.1            
##  [71] desc_1.4.1                  plotrix_3.8-2              
##  [73] checkmate_2.1.0             caTools_1.18.2             
##  [75] boot_1.3-28                 BiocParallel_1.31.12       
##  [77] rlang_1.0.4                 pkgconfig_2.0.3            
##  [79] systemfonts_1.0.4           bitops_1.0-7               
##  [81] pracma_2.3.8                evaluate_0.15              
##  [83] lattice_0.20-45             purrr_0.3.4                
##  [85] labeling_0.4.2              htmlwidgets_1.5.4          
##  [87] tidyselect_1.1.2            plyr_1.8.7                 
##  [89] magrittr_2.0.3              bookdown_0.27              
##  [91] R6_2.5.1                    gplots_3.1.3               
##  [93] generics_0.1.3              sm_2.2-5.7.1               
##  [95] DelayedArray_0.23.1         withr_2.5.0                
##  [97] pillar_1.8.0                RCurl_1.98-1.8             
##  [99] tibble_3.1.8                crayon_1.5.1               
## [101] KernSmooth_2.23-20          utf8_1.2.2                 
## [103] rmarkdown_2.14              grid_4.2.0                 
## [105] marray_1.75.0               piano_2.13.0               
## [107] digest_0.6.29               xtable_1.8-4               
## [109] httpuv_1.6.5                textshaping_0.3.6          
## [111] celestial_1.4.6             coop_0.6-3                 
## [113] munsell_0.5.0               bslib_0.4.0                
## [115] shinyjs_2.1.0