Installation
------------

The easiest and fastest way to install BAGHERA is with **conda**::

    $ conda install -c stracquadaniolab -c bioconda -c conda-forge

We have also prepared a **docker** image, which can be pulled as follows::

    $ docker pull docker.pkg.github.com/stracquadaniolab/baghera/baghera:latest

Pulling the image downloads a container image with the latest version of BAGHERA and its requirements. Once the image is downloaded, you can run BAGHERA inside the container. For example, the command below prints BAGHERA's help::

    $ docker run docker.pkg.github.com/stracquadaniolab/baghera/baghera:latest -h

The example below shows how to use the `create-files` command::

    $ docker run --rm -v "$PWD:$PWD" -w "$PWD" docker.pkg.github.com/stracquadaniolab/baghera/baghera:latest create-files -l -a -s -g

Tutorial
---------------

A typical BAGHERA analysis consists of 3 steps. We briefly explain them here; more details can be found in the documentation, and a practical example is in the snakemake workflow.

1- Build a SNP annotation file
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Build a SNP annotation file, where SNPs are annotated to genes and LD scores are assigned. We use precomputed LD scores, from the set of variants for the European population of 1000 Genomes, and the genes in the Gencode v31 annotations, using only the protein-coding genes. To cope with overlapping genes, we clustered them, obtaining a dataset of 15000 non-overlapping genes. For the annotation, we use a 50 kb window.

::

    $ baghera-tool create-files -l -a -s -g

2- Annotate summary statistics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Annotate summary statistics with the SNP annotation built in step 1::

    $ baghera-tool generate-snp-file -s -i -o -a

3- Run the regression
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Run the regression::

    $ baghera-tool gene-heritability --sweeps --burnin --n-chains --n-cores -m

Example
+++++++

Running BAGHERA on the UK Biobank summary statistics for breast cancer, using EUR LD scores and the Gencode annotation::

    $ baghera-tool create-files -l data/eur_w_ld_chr/ -a data/gencode.v31lift37.basic.annotation.gtf -s data/ld_annotated_gencode_v31.csv -g data/genes_gencode_v31.csv
    $ baghera-tool generate-snp-file -s data/C50.gwas.imputed_v3.both_sexes.tsv -i position_ukbb -o data/c50.snps.csv -a data/ld_annotated_gencode_v31.csv
    $ baghera-tool gene-heritability data/c50.snps.csv data/results_normal_c50.csv data/summary_normal_c50.csv data/log_normal_c50.txt --sweeps 10000 --burnin 2500 --n-chains 4 --n-cores 4 -m normal

Workflow
++++++++

Alongside BAGHERA, we are providing a snakemake workflow repository with sample data.
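
For a quick run without snakemake, the three tutorial steps could also be chained in a small shell script. The sketch below is only an illustration and assumes the file layout of the breast cancer example. The ``run`` helper, the ``baghera_pipeline`` function and the ``DRY_RUN`` variable are hypothetical conveniences and are not part of BAGHERA. Only flags shown in the example are used.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper (not part of BAGHERA): print the command when
# DRY_RUN=1 (the default in this sketch), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Chain the three tutorial steps for one trait.
# $1: data directory, $2: summary statistics file, $3: output prefix.
# The argument layout is an assumption made for this sketch.
baghera_pipeline() {
    data="$1"
    sumstats="$2"
    prefix="$3"
    annotation="$data/ld_annotated_gencode_v31.csv"

    # Step 1: build the SNP annotation file.
    run baghera-tool create-files \
        -l "$data/eur_w_ld_chr/" \
        -a "$data/gencode.v31lift37.basic.annotation.gtf" \
        -s "$annotation" \
        -g "$data/genes_gencode_v31.csv"

    # Step 2: annotate the summary statistics.
    run baghera-tool generate-snp-file \
        -s "$sumstats" -i position_ukbb \
        -o "$data/$prefix.snps.csv" -a "$annotation"

    # Step 3: run the regression.
    run baghera-tool gene-heritability \
        "$data/$prefix.snps.csv" \
        "$data/results_normal_$prefix.csv" \
        "$data/summary_normal_$prefix.csv" \
        "$data/log_normal_$prefix.txt" \
        --sweeps 10000 --burnin 2500 --n-chains 4 --n-cores 4 -m normal
}

baghera_pipeline data data/C50.gwas.imputed_v3.both_sexes.tsv c50
```

As written, the script only prints the three commands, so they can be checked before anything is executed. Setting ``DRY_RUN=0`` in the environment runs them instead.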