Analysis of growth factor signaling in genetically diverse breast cancer lines

Mario Niepel1*, Marc Hafner1*, Emily A. Pace2*, Mirra Chung1, Diana H. Chai2, Lili Zhou1, Jeremy L. Muhlich1, Birgit Schoeberl2, and Peter K. Sorger1

1 HMS LINCS Center, Harvard Medical School, Boston, MA; 2 Merrimack Pharmaceuticals, Cambridge, MA

BMC Biol (2014) 12:20.
doi:10.1186/1741-7007-12-20 / PMID:24655548 / PMCID:PMC4234128

Responses of genetically diverse cell lines to biological ligands

Peptidyl ligands (cytokines, chemokines and growth factors) are among the most important naturally occurring perturbagens. They play a role in regulating cell motility, differentiation, adhesion, proliferation and cell survival. Growth factors are one class of biological ligands that function by binding to one or more of the 58 known human receptor tyrosine kinases (RTKs) and activating “immediate early” signaling by MAPK, PI3K/Akt, and other kinase cascades.

Description of the dataset and downloads

This dataset describes the responsiveness of a canonical collection of 39 breast cancer cell lines of the NCI-ICBP43 set, whose genotypes span many of the mutations observed in primary disease 1. The data comprises the phosphorylation levels of ERK (MAPK1/3) and Akt (Akt1/2/3) kinases following exposure to 15 growth factors and cytokines, at two doses and three time points, as well as the abundance and phosphorylation levels of 20 RTKs and several additional intracellular proteins. We previously used this dataset to predict the sensitivity of the cell lines to 43 therapeutic inhibitors by statistical modeling 2. Here we explore in more detail how the signaling responses vary across the cell lines and how the responses relate to the basal expression and activity levels of RTKs 3.


The goals for this dataset were to: (i) characterize the diversity of growth factor response and determine how it mapped to clinical subtypes, (ii) identify factors that control the magnitude and duration of ligand response and (iii) generate a simple means to look up and compare ligand-response data that have hitherto been unavailable or scattered across the literature. To this end, we have produced several "views" of the dataset:

  • Response matrix: Graphical lookup tables summarizing the signaling responses across all experimental conditions.
  • By cell line: Analysis and visualization of dataset slices corresponding to data from individual cell lines.
  • By ligand: Analysis and visualization of dataset slices corresponding to data from individual ligand treatments.

For a more detailed description of how we analyzed and plotted the data, refer to the methods and results section of the publication 3. Please cite this paper when reusing any data or analysis found here. For more information, contact Mario Niepel (

Response matrix

The response matrix graphically summarizes the pERK and pAkt signaling responses for each of the ~600 cell line / ligand combinations in a tabular format. For each combination, the actual raw time course data can also be displayed.

Data by Cell Line

Simple network maps, displayed as node-edge graphs, give a compact overview of both the basal and signaling profiles of each cell line. This visualization allows the user to quickly identify which pathways are highly expressed, strongly phosphorylated, or particularly sensitive in a given cell line. The outer and inner circles represent basal measurements, with circle size indicating expression levels and shading indicating phosphorylation levels. The data is normalized across all cell lines, and gray outer circles denote phosphorylation levels below the threshold of detection. The lines between the outer and middle circles denote simplified ligand/receptor binding. The colored arrows between the outer and middle circles denote the strength of pERK (blue) or pAkt (red) induction by ligand. The maximal response by ligand was normalized across all responses, and gray lines denote responses that are not significant.

To put the individual network maps into context we created three maps that represent each clinical subtype (TNBC, HER2amp, HR+). To create these maps we used the trimmed mean of all values of a given subtype, rather than the mean, to reduce artifacts that might occur due to ligand/receptor combinations that only result in sporadic responses.
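
The effect of using a trimmed mean rather than a plain mean can be illustrated with a minimal sketch. The trim fraction (10% from each end) and the example values are assumptions made for illustration only; the source does not state which trimming level was used.

```python
from statistics import mean


def trimmed_mean(values, trim_fraction=0.1):
    """Mean after dropping the lowest and highest `trim_fraction` of values.

    `trim_fraction` is an assumed, illustrative parameter.
    """
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # number of values dropped per end
    kept = ordered[k:len(ordered) - k]
    return mean(kept)


# Hypothetical fold-change responses of one subtype to one ligand,
# with a single sporadic strong responder (6.0).
responses = [1.0, 1.1, 0.9, 1.0, 1.2, 1.0, 0.95, 1.05, 1.0, 6.0]

plain = mean(responses)
trimmed = trimmed_mean(responses, 0.1)
print(f"plain mean: {plain:.3f}, trimmed mean: {trimmed:.3f}")
```

In this made-up example the single outlier pulls the plain mean well above the typical response, while the trimmed mean stays close to it, which is the kind of artifact the subtype maps are meant to avoid.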




We can also plot the distribution of individual signaling features by subtype. Here we show the fraction of responses to low doses of ligands for pERK and pAkt. For example, TNBCs are particularly sensitive and respond even to low doses of ligand when measuring pERK. No such difference is apparent for pAkt. Other features that can be compared in the same way are the mean fold-changes upon treatment with high doses of ligand and the pERK/pAkt induction bias. We then show the data for individual cell lines as a diamond overlaid on the overall distribution of the subtypes.

The basal expression and phosphorylation levels of RTKs also vary widely across the cell lines. For each cell line we can highlight the most upregulated and downregulated basal measurements. We indicate the level in a given cell line by a diamond overlaid on the distribution of the same measurement across all lines. The distributions by subtype are indicated as colored bars.

The magnitude of response to individual ligands varies significantly from one cell line to the next. We can illustrate this range of responses as a boxplot for each ligand, for both pERK and pAkt fold-changes. Again, we can indicate the measurement of any given cell line as a diamond, to put a single cell line into its appropriate context.

Data by Ligand

We can display the full time courses for individual ligand/downstream kinase combinations across all cell lines to see the variance in both the kinetics and the magnitude of the responses. For a given ligand there are two plots corresponding to a low and high dose treatment and the two different downstream targets. Six cell lines, chosen because they cover much of the variability observed in the whole dataset, are highlighted in every plot and can be used as a reference.

We can then classify these time courses into four kinetic response classes (sustained, transient, late, none) and plot their distribution by target and ligand family. This shows that certain ligands predominantly give responses of certain classes: ErbB/FGF responses are mainly sustained, whereas INS/IGF responses are often late. We can then compare the response to individual ligands with that of the main ligand families.
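
As a rough illustration of what a four-class kinetic assignment could look like, the sketch below applies simple rules to a three-point fold-change time course. The thresholds, the rules, and the function name `classify_kinetics` are assumptions for illustration; the classification used for the dataset is described in the publication and may differ.

```python
def classify_kinetics(fold_changes, threshold=1.5, decay_fraction=0.5):
    """Assign a time course to 'sustained', 'transient', 'late' or 'none'.

    `fold_changes` holds the response relative to baseline (1.0 = no change)
    at successive time points. `threshold` and `decay_fraction` are assumed,
    illustrative cutoffs, not values taken from the dataset.
    """
    early, last = fold_changes[0], fold_changes[-1]
    peak = max(fold_changes)
    if peak < threshold:
        return "none"       # never rises above the response threshold
    if early < threshold:
        return "late"       # no response at the first time point
    # Compare the excess over baseline at the end with the peak excess.
    if (last - 1.0) < decay_fraction * (peak - 1.0):
        return "transient"  # responds early, then decays
    return "sustained"      # responds early and stays elevated


# Hypothetical time courses (three time points each).
examples = {
    "stays high": [3.0, 2.8, 2.6],
    "decays": [3.0, 1.8, 1.1],
    "rises late": [1.0, 1.2, 2.5],
    "flat": [1.0, 1.1, 0.9],
}
for name, course in examples.items():
    print(name, "->", classify_kinetics(course))
```

Rule-based cutoffs keep the sketch easy to read; a clustering approach over the full set of time courses would be an alternative design.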

Even though we collected responses only at two doses, we can use this data to approximate the sensitivity of any cell line to a particular ligand. Analogous to the time courses above, we can plot the response of all cell lines to a specific ligand measured by a specific downstream kinase across the two doses.

We find that there is a significant range in the magnitude of responses at both low and high dose. This means we can classify these sensitivities based on whether cells respond equally at high and low dose, higher at high than at low dose, only at high dose, or not at all. Once classified this way, we can plot the overall distribution of these sensitivity classes for all cell lines given a ligand/downstream kinase combination. This shows that certain ligands measured by certain targets predominantly fall into certain classes; for example, ErbB and FGF responses are typically higher at high dose than at low dose, meaning they are rarely saturated at low dose.
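
The four sensitivity classes can be sketched as a small decision rule on the low-dose and high-dose responses. The response threshold, the ratio cutoff, and the function name `classify_dose_response` are hypothetical choices made for illustration and are not taken from the dataset.

```python
def classify_dose_response(low, high, threshold=1.5, ratio_cutoff=1.25):
    """Classify a (low dose, high dose) pair of fold-change responses.

    `threshold` (minimum fold-change counted as a response) and
    `ratio_cutoff` (how much larger the high-dose response must be to count
    as 'higher') are assumed, illustrative values.
    """
    low_responds = low >= threshold
    high_responds = high >= threshold
    if not low_responds and not high_responds:
        return "none"
    if not low_responds:
        return "high dose only"
    if high >= ratio_cutoff * low:
        return "higher at high dose"
    # Assumption: a low-dose response that is not exceeded at high dose is
    # treated as already saturated, i.e. 'equal' at both doses.
    return "equal"


# Hypothetical (low, high) fold-change pairs.
pairs = [(2.0, 2.1), (2.0, 4.0), (1.1, 3.0), (1.0, 1.2)]
for low, high in pairs:
    print((low, high), "->", classify_dose_response(low, high))
```

With only two doses, a rule like this can only approximate sensitivity; it separates responses that are already saturated at low dose from those that are not.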

When we compare the pERK and pAkt responses for the same ligand in a single cell line (see above), it is apparent that individual ligands induce a pathway bias. We can plot this bias as the response angle distribution or as boxplots to compare individual ligands to the remaining ligands. When browsing an individual ligand, we highlight it in black so that it stands out from the other ligands.
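
One plausible way to express such a bias as an angle is to treat the pERK and pAkt responses as the two coordinates of a vector. The definition below (log2 fold-changes, with 0° meaning pERK only and 90° meaning pAkt only) is an assumption for illustration; the exact definition of the response angle is given in the publication.

```python
import math


def response_angle(perk_fold_change, pakt_fold_change):
    """Angle in degrees between the pERK axis and the response vector.

    Assumed convention: 0 = response in pERK only, 45 = balanced,
    90 = response in pAkt only. Fold-changes must be positive. Decreases
    below baseline are clipped to zero, and None is returned when neither
    pathway responds.
    """
    x = max(math.log2(perk_fold_change), 0.0)
    y = max(math.log2(pakt_fold_change), 0.0)
    if x == 0.0 and y == 0.0:
        return None
    return math.degrees(math.atan2(y, x))


# Hypothetical fold-changes for one ligand in three cell lines.
for perk, pakt in [(4.0, 4.0), (4.0, 1.0), (1.0, 4.0)]:
    print((perk, pakt), "->", response_angle(perk, pakt))
```

Collecting such angles across cell lines for one ligand gives a distribution that can be compared with the distribution for the remaining ligands.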

Available data and software

  • Data: All basal levels (HMS Dataset #20137 from Niepel et al. (2013) Sci Signal. 6, ra84). Details, Download (.xlsx)
  • Data: All ligand responses (HMS Dataset #20140 from Niepel et al. (2013) Sci Signal. 6, ra84). Details, Download (.xlsx)
  • Signatures: Results of kinetic clustering and dose-dependence classification. Details, Download (.xlsx)
  • Software: Code for the signaling data matrix presented on this project exploration website, available from the HMS LINCS GitHub (hmslincs at GitHub).


  1. Neve, R. M. et al. (2006) A collection of breast cancer cell lines for the study of functionally distinct cancer subtypes. Cancer Cell. 10(6), 515-527. doi:10.1016/j.ccr.2006.10.008 PMID:17157791 PMCID:PMC2730521
  2. Niepel, M., Hafner, M., Pace, E. A., Chung, M., Chai, D. H., Zhou, L., Schoeberl, B., Sorger, P.K. (2013) Profiles of Basal and Stimulated Receptor Signaling Networks Predict Drug Response in Breast Cancer Lines. Sci Signal. 6(294), ra84. doi:10.1126/scisignal.2004379 PMID:24065145 PMCID:PMC3845839
  3. Niepel, M., Hafner, M., Pace, E. A., Chung, M., Chai, D. H., Zhou, L., Muhlich J. L., Schoeberl, B., Sorger, P.K. (2014) Analysis of growth factor signaling in genetically diverse breast cancer lines. BMC Biol. 12:20. doi:10.1186/1741-7007-12-20 PMID:24655548 PMCID:PMC4234128

When using any of the data or analysis presented here, please cite Niepel et al., BMC Biology, 2014. For more information, contact Mario Niepel (