Getting Started
Install Hail:
pip install hail
Use hailctl to start a Google Dataproc cluster with the gnomad package installed (see Hail on the Cloud for more detail on hailctl):
hailctl dataproc start cluster-name --packages gnomad
Connect to a Jupyter Notebook on the cluster:
hailctl dataproc connect cluster-name notebook
Import gnomAD data in Hail Table format:
gnomAD v2.1.1 variants:
from gnomad.resources.grch37 import gnomad

gnomad_v2_exomes = gnomad.public_release("exomes")
exomes_ht = gnomad_v2_exomes.ht()
exomes_ht.describe()

gnomad_v2_genomes = gnomad.public_release("genomes")
genomes_ht = gnomad_v2_genomes.ht()
genomes_ht.describe()
gnomAD v3 variants:
from gnomad.resources.grch38 import gnomad

gnomad_v3_genomes = gnomad.public_release("genomes")
ht = gnomad_v3_genomes.ht()
ht.describe()
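The v2.1.1 and v3 imports above differ only in the reference build and the data type. As a minimal sketch, the helper below wraps that pattern in one function. The name load_public_release and the SUPPORTED mapping are hypothetical and not part of the gnomad package; the mapping lists only the combinations shown above, and other releases may exist. The resource module is imported lazily, so input validation works even where Hail is not installed.

```python
import importlib

# Build / data-type combinations taken from the examples above.
# This is an assumption for illustration; other releases may exist.
SUPPORTED = {
    "grch37": {"exomes", "genomes"},  # gnomAD v2.1.1
    "grch38": {"genomes"},            # gnomAD v3
}


def load_public_release(build: str, data_type: str):
    """Return the public release Hail Table for a build and data type.

    Hypothetical helper, not part of the gnomad package.
    """
    if build not in SUPPORTED:
        raise ValueError(f"unknown reference build: {build!r}")
    if data_type not in SUPPORTED[build]:
        raise ValueError(f"{data_type!r} is not listed for {build}")
    # Same module the examples above import,
    # e.g. gnomad.resources.grch37.gnomad
    resources = importlib.import_module(f"gnomad.resources.{build}.gnomad")
    return resources.public_release(data_type).ht()
```

On the cluster, load_public_release("grch38", "genomes").describe() should print the same schema as the v3 example.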
Shut down the cluster when finished with it:
hailctl dataproc stop cluster-name
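If the cluster is started and stopped from a script rather than by hand, the three hailctl commands above can be built as argument lists. This is a sketch only: hailctl_commands is a hypothetical helper, and it uses no subcommands or flags beyond those shown above.

```python
def hailctl_commands(cluster_name: str, package: str = "gnomad"):
    """Build the hailctl commands from the steps above as argument lists.

    Hypothetical helper; it only assembles the commands and runs nothing.
    """
    return {
        # Start a Dataproc cluster with the given package installed.
        "start": ["hailctl", "dataproc", "start", cluster_name,
                  "--packages", package],
        # Open a Jupyter Notebook on the running cluster.
        "connect": ["hailctl", "dataproc", "connect", cluster_name,
                    "notebook"],
        # Shut the cluster down when finished.
        "stop": ["hailctl", "dataproc", "stop", cluster_name],
    }
```

Each list can be passed to subprocess.run(argv, check=True). Wrapping the work in try/finally and running the "stop" command in the finally clause helps avoid leaving a cluster running after an error.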