Frequently Asked Questions
Analyses
How can I reproduce an environment to explore JUMP data?
The easiest way to set things up will be installing from pip in your environment of choice:
pip install jump-deps
Data
Does JUMP contain my compound/gene of interest?
The easiest way to find out is querying your dataset using this web tool. Alternatively, you can explore the metadata tables on the datasets repository, which are used to generate the explorable table.
Where are the datasets specification?
The main resource to understand the technicalities of the JUMP datasets collection and assembly is on this repo.
Why do some samples have images but no downstream analysis?
Some plates failed Quality Control (QC) but we kept them because they may be useful for developing QC methods.
Why do some perturbations have so many replicates?
Most chemical compound plates contain 16 negative control wells, while some have as many as 28 wells. In the ORF dataset, replicates are positioned in wells O23, O24, P23 and P24. The remaining wells contain ORF treatments, with a single replicate of each per plate map and with five replicate plates produced per plate map (issue).
Which pipelines produced the final datasets?
Details on the pipelines at each step can be found on this page.
Do we expect one gene’s JCP (JUMP Cell Painting ID) to be associated with multiple targets?
Yes, many genes are associated with multiple targets and are correctly annotated as such. For instance, JCP2022_050797
(quinidine/quinine) has the targets KCNK1
and KCNN4
. Other genes are annotated as targeting genes in disparate families.
Do JCPs within the compound dataset refer to the same compound?
Sometimes, two compounds were given separate JCPs because they had different names and broad_sample
names. But after all of the data cleanup steps, they ended up being the same. Hence two different entries.
Do JCPs within either the CRISPR or ORF datasets refer to the same gene?
In CRISPR, each JCP ID corresponds to a different gene. But in ORF there are frequently multiple reagents representing the same gene. In this case, we compute consensus profiles at the gene level (more info here).