Transformed datasets used for drug predictions
==============================================

Details
-------

To use the datasets described above as independent variables for our PLSR models to predict responses to targeted therapeutics, we normalized and transformed the data.

Method for the ligand response datasets
---------------------------------------

1. We only considered the levels after stimulation and did not use the control values as variables.
2. For the continuous datasets, we log-transformed the values.
3. For the binarized datasets, we considered a response to be significant (1) if the measured value was 3 standard deviations above the control value (measured in 6 replicates); otherwise we assigned the value 0.
4. Depending on the dataset used for prediction, we either:

   * averaged the values across the three time points (also valid for the binarized datasets), or
   * took the maximal value across the three time points (also valid for the binarized datasets).

5. For each individual measure, we linearly normalized the values between 0 and 1 across all cell lines. By construction, the binarized datasets are already normalized.

Method for the ligand response datasets
---------------------------------------

1. Measures that were below the detection threshold (i.e. not detected) were set to the value of the detection threshold.
2. For the binarized datasets, we considered the measurement for a cell line to be high (1) if the measured value was above the median of all cell lines for this particular measure (computed after removing the measurements below the detection threshold); otherwise the measurement was considered low (0).
3. For the continuous datasets, we log-transformed the values.
4. For each individual measure, we linearly normalized the values between 0 and 1 across all cell lines. By construction, the binarized datasets are already normalized.
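
The two procedures above can be illustrated with a minimal Python sketch for a single measure. It is not the code used to produce the datasets. The function names (``minmax``, ``ligand_response_measure``, ``basal_measure``) are hypothetical, and several details the text leaves open are filled in as assumptions: the "control value" is taken to be the mean of the six replicates, the standard deviation is the sample standard deviation of those replicates, a value exactly at the cutoff counts as significant, and the logarithm is the natural logarithm. The choice of base does not affect the continuous result, because the final linear normalization cancels any constant factor.

```python
import math
import statistics


def minmax(values):
    """Linearly rescale one measure to [0, 1] across all cell lines."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant measure is mapped to 0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def ligand_response_measure(stimulated, controls, binarize=False,
                            reduce="mean", n_sd=3.0):
    """Transform one ligand response measure.

    stimulated: one list of time-point values per cell line.
    controls:   one list of control replicates per cell line
                (used only to set the significance cutoff).
    """
    reducer = {"mean": statistics.fmean, "max": max}[reduce]
    collapsed = []
    for time_points, replicates in zip(stimulated, controls):
        if binarize:
            # Assumption: cutoff = control mean + n_sd * sample SD of replicates.
            cutoff = (statistics.fmean(replicates)
                      + n_sd * statistics.stdev(replicates))
            values = [1.0 if v >= cutoff else 0.0 for v in time_points]
        else:
            values = [math.log(v) for v in time_points]
        # Average or take the maximum across the time points.
        collapsed.append(reducer(values))
    # Binarized values already lie in [0, 1]; only continuous data is rescaled.
    return collapsed if binarize else minmax(collapsed)


def basal_measure(values, detection_threshold, binarize=False):
    """Transform one measure from the second procedure (one value per cell line)."""
    detected = [v for v in values if v >= detection_threshold]
    # Undetected measurements are set to the detection threshold.
    floored = [max(v, detection_threshold) for v in values]
    if binarize:
        if not detected:
            return [0.0 for _ in values]
        # Median over detected measurements only.
        median = statistics.median(detected)
        return [1.0 if v > median else 0.0 for v in floored]
    return minmax([math.log(v) for v in floored])


# Usage with made-up numbers: four cell lines, detection threshold of 1.0.
print(basal_measure([0.5, 2.0, 4.0, 8.0], 1.0, binarize=True))
# [0.0, 0.0, 0.0, 1.0]
```

Note that averaging a binarized measure across the three time points yields values in {0, 1/3, 2/3, 1}, which still lie between 0 and 1, so no further normalization is applied in that case.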