We develop quantitative and computational methods and tools to sift through vast amounts of genomic and other high-dimensional data, with the goal of making discoveries that can be translated into improvements in human health.

News and Posts

In this report I calculate lower bounds on the p-values achievable with very rare variants, those whose minor allele counts are in the single digits. The report was prepared in 2012 for the T2D-GENES consortium, which had just generated 10K whole-exome sequences.
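
To make the idea of a p-value floor concrete, here is a minimal Python sketch. It is not the report's own derivation. It assumes a one-sided Fisher's exact test, that each minor allele is carried by a different individual, and a hypothetical split of the 10K exomes into 5,000 cases and 5,000 controls. The function name `min_fisher_pvalue` is illustrative only.

```python
from math import comb


def min_fisher_pvalue(n_cases: int, n_controls: int, minor_allele_count: int) -> float:
    """Smallest one-sided Fisher exact p-value for a variant seen `minor_allele_count` times.

    Assumption: each minor allele sits in a different individual, so the most
    extreme possible table has every carrier among the cases.
    """
    if not 0 <= minor_allele_count <= n_cases:
        raise ValueError("minor_allele_count must be between 0 and n_cases")
    total = n_cases + n_controls
    # Hypergeometric probability that all carriers fall among the cases by chance.
    return comb(n_cases, minor_allele_count) / comb(total, minor_allele_count)


if __name__ == "__main__":
    # Hypothetical balanced design; the actual case/control split is not given in the post.
    for mac in range(1, 10):
        print(mac, min_fisher_pvalue(5000, 5000, mac))
```

Under these assumptions the floor is roughly 0.5 raised to the minor allele count, so a variant seen only a handful of times cannot reach a stringent significance threshold however strong its true effect.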


We have opened direct access to the gene2pheno database, which hosts PrediXcan results for close to 3000 phenotypes (from public GWAS meta-analysis results and UK Biobank results from the Ben Neale/HAIL team). Below are R functions that allow you to access and query the database. These results are based on GTEx V6p models; details of the analysis can be found in our preprint, now in press at Nature Communications.


We are releasing prediction models trained on GTEx Version 7 data. Download from here. We have updated our processing pipeline and restricted the analysis to individuals of European ancestry to obtain more reliable LD data. This reduces false positive associations in the summary version of PrediXcan. Because of this choice, the gain in sample size relative to V6p is modest (ranging from -18 to 89), with whole blood, LCLs, and fibroblasts losing samples.


After two years in the lab, Scott has decided to join the well-paid workforce. Scott is a wonderful colleague to all of us and has made important contributions to our team. Thank you and best of luck, Scott 🍀🍀🍀

Also thank you, Wenndy, for organizing and buying the present for Scott.


Keep in mind that the significant associations shown here do not imply causality. That said, PrediXcan tests the role of gene expression variation in traits, and we and others have shown that significant PrediXcan genes are enriched for causal genes, so these results should be useful for delving into the mechanisms underlying gene-to-phenotype associations.

False positives can arise from several factors, including LD contamination. By computing the probability of LD contamination, we try to reduce false positives that are due to LD rather than to genuine colocalization of the trait and expression causal variants.


Recent Publications

  • Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics

  • Survey of the Heritability and Sparse Architecture of Gene Expression Traits across Human Tissues

  • A gene-based association method for mapping traits using reference transcriptome data

  • Poly-Omic Prediction of Complex Traits: OmicKriging

  • On sharing quantitative trait GWAS results in an era of multiple-omics data and the limits of genomic privacy

  • Mixed effects modeling of proliferation rates in cell-based models: consequence for pharmacogenomics and cancer

Recent & Upcoming Talks

More Talks

Software & Resources


Gene Correlation Across Tissues

Web Apps

Access our web applications (PrediXcan, multi-tissue PrediXcan)


Mechanistically driven gene-level association test

Summary PrediXcan

Summary extension of PrediXcan


Database of prediction models to be used with PrediXcan


Catalog of gene to phenome associations

OmicKriging

Predicting traits by integrating heterogeneous sources of data

Current Members

  • Hae Kyung Im, PhD - Bio

  • Alvaro Barbeira, MS - Bio

  • Milton Pividori, PhD - Bio

  • Rodrigo Bonazzola, MS - Bio

  • Yanyu Liang (GGSB Grad Student) - Bio

  • Padma Sheila Rajagopal, MD - Bio


  • Jiamao Zheng, PhD - Bio

  • Scott Dickinson, MS - Bio