HMS LINCS is collaborating with the developers of ISA Infrastructure to explore the possibility of adapting the ISA-Tab file format to accommodate the metadata requirements of LINCS experimental data. ISA Infrastructure consists of a new general-purpose format (ISA-Tab) and a freely available desktop software suite (ISA-Tools) that assist in the reporting and local management of experimental metadata. ISA Infrastructure uses community-defined minimum information checklists and ontologies. It assists in formatting data for submission to public repositories and in complying with emerging minimum information reporting standards such as MIACA.
“ISA” refers to the organization of experimental metadata into three hierarchical tiers: Investigation, Study, and Assay. These three tiers correspond to the three types of files in the ISA-Tab format standard. The following describes an example of how this was implemented for an HMS LINCS breast cancer cue-signal-response dataset.
An example of how ISA-Tab can be used:
Our first use of the ISA-Tab format was for an HMS LINCS breast cancer cue-signal-response dataset. There are 43 Study files, one for each cell line assayed. Each Study file contains 576 (6 x 96) rows describing the treatment conditions (perturbagen, time, concentration) in each well of the six 96-well assay plates run for each cell line. For every Study file there is a corresponding Assay file that describes the data readout (phosphorylation state of 4 proteins monitored by imaging) for each perturbagen condition. In this case, each well was imaged in two channels, measuring the phosphorylation of two different target proteins, at 4 replicate sites, resulting in 4608 (576 x 2 x 4) 16-bit grayscale images per cell line. Thus, each Assay file contains 4608 rows, one per image, describing the phosphorylation state measured in that image. The Assay file also contains pointers to the image files and to the image processing, data extraction, and processing protocols that will be applied to each image. The actual experimental data (in this case 207,360 images and quantitation of protein phosphorylation states) are not stored in the ISA-Tab files. Rather, pointers in the ISA Assay files describe where the raw and processed data are stored. The figure below depicts a simplified representation of the ISA-Tab files in this example.
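The per-cell-line row counts quoted above can be reproduced with a short sketch. It is an illustration only: the dictionary keys, the well-labeling scheme (rows A to H, columns 1 to 12, the usual 96-well layout), and the image file naming are assumptions made for this example, not the actual ISA-Tab column headers or the file paths used by HMS LINCS.

```python
from itertools import product

PLATES = range(1, 7)      # six assay plates per cell line
PLATE_ROWS = "ABCDEFGH"   # assumed layout: 8 rows x 12 columns = 96 wells
PLATE_COLS = range(1, 13)
CHANNELS = (1, 2)         # two imaging channels per well
SITES = range(1, 5)       # four replicate sites per well


def study_rows(cell_line):
    """Return one record per well: 6 plates x 96 wells."""
    return [
        {"cell_line": cell_line, "plate": plate, "well": f"{row}{col:02d}"}
        for plate, row, col in product(PLATES, PLATE_ROWS, PLATE_COLS)
    ]


def assay_rows(cell_line):
    """Return one record per image: every well x channel x site."""
    rows = []
    for well in study_rows(cell_line):
        for channel, site in product(CHANNELS, SITES):
            # Hypothetical naming scheme; the record stores a pointer
            # to the image, not the image data itself.
            image = (
                f"{cell_line}/plate{well['plate']}/"
                f"{well['well']}_c{channel}_s{site}.tif"
            )
            rows.append(
                {**well, "channel": channel, "site": site, "image_file": image}
            )
    return rows


study = study_rows("cell_line_01")
assay = assay_rows("cell_line_01")
print(len(study), len(assay))  # prints "576 4608"
```

Each assay record carries only a path to its image, mirroring the way the ISA Assay files point to raw and processed data rather than containing it.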