Transformed datasets used for drug predictions
==============================================

Details
-------

To use the datasets described above as independent variables in our partial
least squares regression (PLSR) models, which predict responses to targeted
therapeutics, we normalized and transformed the data as follows.

Method for the ligand response datasets
---------------------------------------

1. We considered only the levels after stimulation and did not use the control
   values as variables.

2. For the continuous datasets, we log-transformed the values.

3. For the binarized datasets, we considered a response to be significant (1) if
   the measured value was 3 standard deviations above the control value (measured
   in 6 replicates); otherwise, we assigned the value 0.

4. Depending on the dataset used for prediction, we either:

   * Averaged the values across the three time points (also valid for the
     binarized datasets), or

   * Took the maximal value across the three time points (also valid for the
     binarized datasets).

5. For each individual measure, we linearly rescaled the values to the range 0
   to 1 across all cell lines. By construction, the binarized datasets are
   already normalized.
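
The steps above can be sketched in Python with NumPy. This is a minimal
illustration, not the code used to produce the datasets. The function names,
the array layout (cell lines x measures x time points), the use of the sample
standard deviation of the control replicates as the "standard deviation", and
the choice of a base-10 logarithm are all assumptions. The logarithm base does
not change the final result, because the rescaling to the range 0 to 1 cancels
any constant factor.

.. code-block:: python

   import numpy as np


   def min_max_scale(values):
       """Rescale each measure (column) to [0, 1] across cell lines (rows)."""
       lo = values.min(axis=0, keepdims=True)
       hi = values.max(axis=0, keepdims=True)
       # Assumption: a measure that is constant across cell lines maps to 0.
       span = np.where(hi > lo, hi - lo, 1.0)
       return (values - lo) / span


   def aggregate_over_time(values, how):
       """Collapse the time axis (last axis) by averaging or taking the maximum."""
       if how == "mean":
           return values.mean(axis=2)
       if how == "max":
           return values.max(axis=2)
       raise ValueError("how must be 'mean' or 'max'")


   def transform_continuous(stimulated, how="mean"):
       """Log-transform, aggregate over time, then rescale to [0, 1].

       stimulated: positive array of shape (cell_lines, measures, time_points).
       """
       logged = np.log10(stimulated)
       return min_max_scale(aggregate_over_time(logged, how))


   def transform_binarized(stimulated, control, how="max", n_sd=3.0):
       """Flag responses above the control level, then aggregate over time.

       control: array of shape (cell_lines, measures, replicates).
       Assumption: the threshold is the control mean plus n_sd sample
       standard deviations of the control replicates.
       """
       threshold = control.mean(axis=2) + n_sd * control.std(axis=2, ddof=1)
       significant = (stimulated > threshold[:, :, np.newaxis]).astype(float)
       return aggregate_over_time(significant, how)


   # Hypothetical data: 3 cell lines, 1 measure, 3 time points.
   stimulated = np.array([
       [[1.0, 10.0, 100.0]],
       [[10.0, 100.0, 1000.0]],
       [[100.0, 1000.0, 10000.0]],
   ])
   scaled = transform_continuous(stimulated, how="mean")

Under these assumptions, averaging a binarized measure over the three time
points yields one of the values 0, 1/3, 2/3, or 1, so the result already lies
between 0 and 1 and needs no further rescaling.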

Method for the ligand response datasets
---------------------------------------

1. The measures that were below the detection threshold (i.e. not detected) were
   set to the value of the detection threshold.

2. For the binarized datasets, we considered the measurement for a cell line to
   be high (1) if the measured value was above the median across all cell lines
   for that measure (computed after excluding measurements below the detection
   threshold); otherwise, the measurement was considered low (0).

3. For the continuous datasets, we log-transformed the values.

4. For each individual measure, we linearly rescaled the values to the range 0
   to 1 across all cell lines. By construction, the binarized datasets are
   already normalized.
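
The steps above can be sketched in Python with NumPy. This is a minimal
illustration, not the code used to produce the datasets. The function names,
the array layout (cell lines x measures), the representation of undetected
measurements as values below a positive ``threshold``, and the base-10
logarithm are assumptions.

.. code-block:: python

   import numpy as np


   def min_max_scale(values):
       """Rescale each measure (column) to [0, 1] across cell lines (rows)."""
       lo = values.min(axis=0, keepdims=True)
       hi = values.max(axis=0, keepdims=True)
       # Assumption: a measure that is constant across cell lines maps to 0.
       span = np.where(hi > lo, hi - lo, 1.0)
       return (values - lo) / span


   def transform_continuous(values, threshold):
       """Floor at the detection threshold, log-transform, rescale to [0, 1].

       values: array of shape (cell_lines, measures).
       threshold: positive scalar or array of shape (measures,).
       """
       floored = np.maximum(values, threshold)
       return min_max_scale(np.log10(floored))


   def binarize_by_median(values, threshold):
       """Return 1 where a value exceeds the per-measure median, else 0.

       The median is taken over detected measurements only, that is, values
       at or above the detection threshold.
       """
       detected = values >= threshold
       medians = np.nanmedian(np.where(detected, values, np.nan), axis=0)
       return (values > medians).astype(float)


   # Hypothetical data: 4 cell lines, 2 measures, detection threshold of 1.0.
   values = np.array([
       [0.5, 8.0],
       [10.0, 2.0],
       [100.0, 4.0],
       [1000.0, 0.1],
   ])
   continuous = transform_continuous(values, threshold=1.0)
   binary = binarize_by_median(values, threshold=1.0)

In this sketch, undetected measurements are left out of the median but are
still assigned a value: they fall below the median of the detected values and
are therefore coded as low (0).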