About Me

Hi! My name is Razvan Ioan Panea and I am a Lead Genomic Data Engineer at Regeneron Genetics Center. I currently lead the engineering efforts in the Genome Informatics group to ensure an at-scale, at-speed genomics production environment.

Contact Details

Razvan Panea

Contact me

Education

Duke University

PhD in Computational Biology and Bioinformatics August 2014 - May 2020

Thesis: A Cloud-Based Infrastructure for Cancer Genomics

Jacobs University Bremen

B.S. in Applied and Computational Mathematics August 2011 - May 2014

Thesis: Metabolic network modeling of Theobroma cacao

Research and Work

Design, innovate and develop the genomics production environment

Lead Genomic Data Engineer August 2020 - Present

Optimized and innovated genomics workflows that process and analyze over 500,000 samples per year. Led the software engineering efforts by applying software development best practices, specifically version control using Git, software development life cycle tracking using Jira, and continuous integration, delivery and testing through Amazon Web Services CodePipeline. Worked with bioinformaticians and data scientists to develop tools on data platforms such as DNAnexus to facilitate at-scale, at-speed genomics pipelines for QC, variant calling and multi-omics analysis. Worked closely with other data engineers to integrate production with distributed compute environments. Interacted with software engineers and external technology partners to ensure 24/7 production uptime.

Cloud-Based Infrastructure using CloudConductor

Research Software Engineer September 2015 - May 2020

Led a group of five people to develop CloudConductor, a modular, scalable, elastic, parallelizable and extensible framework for generating different analysis pipelines. Researched cloud-based services for optimizing and improving the performance of analysis pipelines. Integrated over 50 bioinformatics tools into the newly developed framework. Integrated and tested the framework on Google Cloud Platform and am currently working to make the framework platform-agnostic. Finally, extended CloudConductor to a complete infrastructure using Kubernetes for automatic deployment and Django for the user interface.

The Whole Genome Landscape of Burkitt Lymphoma subtypes

Research Assistant January 2018 - November 2019

Sequenced the whole genome and transcriptome of 101 Burkitt Lymphoma cases of different subtypes. Identified genetic alterations in the sequencing data using the newly developed cloud-based framework. Identified mutations in driver genes, such as MYC, ID3 and DDX3X, the presence of the Epstein-Barr virus (EBV) in the samples, and translocations characteristic of Burkitt Lymphoma. Identified associations between lymphoma subtypes and EBV status. The work has been accepted and published in the journal Blood.

Features implementation for the SILVA project

Software Developer June 2013 - August 2014

Interned and continued working at the Max Planck Institute for Marine Microbiology to implement features for the SILVA project. Researched SWIG and implemented SWIG interfaces in the SILVA project to bind and call methods implemented in Python on C++ objects. Benchmarked, compared and deployed a new, more efficient phylogenetic tree generator. Implemented additional features to ensure software stability.

Defining the interaction between neighboring genes in E. coli

Research Assistant March 2013 - June 2014

Developed software in Java to aggregate and analyze the expression levels of all neighboring genes at different time points in the E. coli life cycle. Identified a correlation between the differential expression of two neighboring genes and their transcriptional orientation relative to each other.

Publications

Panea R.*, Love C.*, Shingleton J.*, Reddy A.* et al., “The whole genome landscape of Burkitt lymphoma subtypes”, Blood (2019).


Shingleton J., [ and 16 authors, including Panea R. ] (2020). "Non-Hodgkin Lymphomas: Malignancies Arising from Mature B Cells". In M. Kharas, R. L. Levine, A. M. Melnick (Eds.), "Leukemia and Lymphoma: Molecular & Therapeutic Insights". Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press, (In Press).


Li X., [ and 75 authors, including Panea R. ], "Whole Exome and Transcriptome Sequencing in 1042 Cases Reveals Distinct Clinically Relevant Genetic Subgroups of Follicular Lymphoma" Blood (2019), (Manuscript in preparation).

Presentations

Google Cloud Next '18

Breakout session July 2018

Title: Speeding up Research in Genomics

Duke Cancer Institute

Poster presentation October 2017

Title: A Cloud-Based Framework for Cancer Genomics

Intelligent Systems for Molecular Biology (ISMB)

Poster presentation June 2016

Title: Defining the Microbiome of Lymphomas

Duke Cancer Institute

Poster presentation October 2015, October 2016

Title: Defining the Microbiome of Lymphomas

Skills

Programming: Python, C, R, Java, C++, SQL

Computing Platforms: Google Cloud, SLURM, Kubernetes, AWS

Libraries/Packages: Apache Libcloud, Plotly, Pandas, SQLAlchemy

Operating Systems: Linux, Windows, OS X, Chrome OS

Software: PyCharm, Jupyter Notebook, RStudio, Adobe Illustrator, Microsoft Suite