McCormick Bioinformatics Core (MBC)

About

The McCormick Bioinformatics Core (MBC), which is supported by the McCormick Endowment Fund, offers bioinformatics services to researchers within the Department of Biochemistry and Molecular Medicine and the School of Medicine and Health Sciences. We emphasize the use of cutting-edge bioinformatics tools and methodologies to help unlock the full potential of complex multidimensional genomic data. Our mission is to foster a collaborative environment that brings together experimental and computational biologists from GWU and worldwide, with the ultimate objective of advancing biomedical research.

Our Team

Anelia Horvath, PhD

Professor, Department of Biochemistry and Molecular Medicine, GW
Director, McCormick Bioinformatics Core
Co-Director, The McCormick Genomic & Proteomic Center, GW

Following my postdoctoral training in Cancer Genomics (NICHD, NIH, Bethesda, MD), where I led efforts to identify novel genes implicated in endocrine tumors, I joined the George Washington University (GWU) in September 2013 as Co-Director of the McCormick Genomic and Proteomic Center (MGPC) and a Professor in the Department of Biochemistry and Molecular Medicine. Here, my primary focus lay in the development of novel methodologies for the comprehensive analysis and integration of genomics and transcriptomics data. I have more than 15 years of experience leading next-generation sequencing (NGS) analyses and developing custom bioinformatics solutions.

Over the past several years, my scientific focus has been on single-cell genomics and, more specifically, on intratumor heterogeneity related to expressed genetic variation. This work is not limited to DNA-derived germline and somatic variants; it extends to the dynamic landscape of RNA-originating variation. As one of the leaders in this field, I have spearheaded the development of multiple analytical tools for studying cell-level expressed genetic variants. Ongoing initiatives include a cloud-based, scalable platform for public analysis of cell-level expressed genetic variation, complemented by the integration of artificial intelligence to guide the modeling of the origin and effects of these variations.

The Horvath Lab

Siera Martinez

BA in Biology with a concentration in Neurobiology
MS in Electrical Engineering (ongoing)

Siera brings a unique blend of expertise in bioinformatics and computational biology, merging her background in biology and neuroscience with advanced skills in machine learning, programming, and statistical analysis. Her experience includes satellite data analysis at NASA and creating bioinformatics workflows for single-cell RNA sequencing in Dr. Horvath's lab at George Washington University. With strong proficiency in next-generation sequencing, software development, and cancer data modeling, Siera is exceptionally equipped to contribute to the McCormick Bioinformatics Core. Her technical skills and hands-on research experience align seamlessly with the Core’s mission to propel genomic and transcriptomic research through innovative computational approaches.

Luke Johnson

MS in Bioinformatics

Luke Johnson, a graduate of GWU's Bioinformatics and Molecular Biochemistry program, combines a passion for technology with a strong commitment to scientific discovery. With expertise in Linux, shell scripting, R, and Python, alongside a solid foundation in biochemistry and practical experience in machine learning, Luke brings a comprehensive technical skill set to the team. His focus on single-cell data analytics drives his enthusiasm for bioinformatics, where he is eager to make impactful contributions and continue exploring the field’s potential.

Hovhannes Arestakesyan, PhD

Bioinformatics Specialist

Hovhannes Arestakesyan holds a PhD in Human and Animal Physiology and an MS in Bioinformatics and Molecular Biochemistry.
He completed his postdoctoral training at GW, studying cardiovascular and metabolic disorders, and was a Fulbright Visiting Scholar focused on advanced imaging, stem cell biology, and tissue engineering.
At the McCormick Bioinformatics Core, he supports both lab-based and collaborative projects through NGS data analysis, single-cell and bulk RNA-seq workflows, variant discovery, and reproducible pipeline development. His toolkit includes R, Seurat, Cell Ranger, Snakemake, and CLI-based workflows in Linux, with ongoing work in Python and machine learning. As a neuroscientist and bioinformatician, Hovhannes is passionate about integrating neuroscience with computational biology to explore complex questions in cancer, immunology, and the tumor microenvironment. Known for his consistency, both in the lab and in his 5 a.m. workouts, he brings focus, discipline, and a collaborative spirit to every project.

Vania Ballesteros Prieto

B.S. in Biology, M.S. in Bioinformatics

As an innovative bioinformatician at the Horvath Lab, Vania leverages her expertise in genetics, molecular biology and programming languages like R, Python, and Bash to develop and implement new bioinformatics pipelines. Her work directly contributes to the team's ability to apply computational methods and unravel the complexities of genetic variation.

Dr. Jewel Josephine Dias

M.B.B.S., M.S. in Bioinformatics & Molecular Biochemistry
Volunteer, McCormick Bioinformatics Core

Dr. Jewel J. Dias is a physician-scientist who earned her M.B.B.S. (M.D. equivalent) in India and served during the COVID-19 pandemic. She has completed a Master’s degree in Bioinformatics & Molecular Biochemistry at The George Washington University, with focused training in machine learning methodologies and AI-driven approaches for biomedical data analysis.
She conducted research in the Horvath Lab, where she applied single-cell and long-read sequencing technologies to investigate genetic variation, allele-specific expression, and transcriptome complexity. Her long-term research interests center on integrating clinical insight with computational and artificial intelligence–based approaches to advance regenerative and preventive strategies.
She will be volunteering with the McCormick Bioinformatics Core within the Department of Biochemistry & Molecular Medicine, applying her computational, machine learning, and data analysis expertise to support ongoing research efforts.

Services

Genomics Data Analysis

Our primary aim revolves around the comprehensive analysis of genomics data. This involves data preprocessing, quality control, statistical analysis, and standard and custom bioinformatics pipelines.

Visualization

We offer comprehensive data visualizations that cater to both standard and custom needs. For standard needs, we offer readily available visualizations such as UMAP and t-SNE plots, volcano plots, heatmaps, and Circos plots. For custom needs, we create visualizations tailored to the unique characteristics and research objectives of the study.

Data Management

We maintain a robust system for fast and efficient ‘omics’ data acquisition and transfer to our secure and dedicated high-performance server. This system is meticulously designed to ensure both precision and efficiency throughout the data transfer, analysis, and delivery of outputs.

Custom Solutions

When existing pipelines do not adequately meet the needs of a particular study, we design and deploy custom workflows tailored to the specific project. This may involve the incorporation of machine learning models, specialized statistical methods for differential gene expression analysis, or pathway enrichment tools for functional interpretation.

Methodological Writing

We offer expert support and guidance in developing methodology sections for various contexts, including research papers, grant applications, and collaborative projects.

Experimental Design

Statistical power is a critical metric that measures the ability of an experiment to detect true effects accurately while minimizing the risk of false negatives. We can help you plan every aspect of an experiment, from sample size determination to the choice of controls and treatments. By carefully considering factors such as effect size, variability, and statistical tests, we aim to design experiments that are optimized to detect even subtle biological effects.
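As a rough illustration of how effect size, significance level, and power interact when choosing a sample size, the sketch below uses the standard normal approximation for a two-sided comparison of two group means. It is a generic textbook calculation, not the Core's actual procedure; the function names and default values (alpha = 0.05, power = 0.80) are assumptions chosen for the example. Sequencing count data usually call for dedicated power methods.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate samples per group for a two-sided, two-group comparison of means.

    effect_size is Cohen's d (difference in means divided by the pooled SD).
    Uses the normal approximation n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def approx_power(n_per_group, effect_size, alpha=0.05):
    """Approximate power for a given group size (ignores the negligible far tail)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    return z.cdf(effect_size * math.sqrt(n_per_group / 2) - z_alpha)


if __name__ == "__main__":
    # Hypothetical effect sizes: smaller effects need many more samples.
    for d in (0.5, 0.8, 1.2):
        n = sample_size_per_group(d)
        print(f"d={d}: n per group = {n}, approx. power = {approx_power(n, d):.2f}")
```

Because the required sample size scales with 1/d², halving the expected effect size roughly quadruples the number of samples needed per group, which is why a realistic effect size estimate matters early in planning.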

How We Work

Submit Your Request

We kindly request that you complete this form, providing a concise overview of your project's initial information. Upon receiving your form, we will promptly reach out to you, typically on the same day, to set up an initial meeting at a time that suits your schedule.

Acknowledgements

The Core's services are currently generously subsidized by the McCormick Endowment Fund and are free to all labs in the Department. We kindly request that you acknowledge our team's contributions by including their percentage of effort in your future grant proposals and recognizing them as co-authors in publications. Please also acknowledge the McCormick Endowment Fund's support in publications and in poster and oral presentations.

Resources

MGPC Server

We maintain a secure, dedicated High-Performance Computing (HPC) server that is specifically equipped, and approved by the National Institutes of Health (NIH), to handle individual-level human data securely and in compliance with applicable requirements. The server's computational and storage configuration is as follows: 2 x Intel Xeon E5-2680 v4 (14 cores each @ 2.40 GHz), 1.5 TB of DDR4 RAM, and 70 TB of usable storage. The server runs Rocky Linux 8 and is dbGaP data compliant.

The MGPC server hosts over 100 software packages curated for genomic data analysis, including STAR, STARsolo, BWA, GATK, SAMtools, Strelka2, and CellRanger. These packages are updated regularly so that current versions are available to support genomics research.

Software Licenses

We hold a variety of commercial software licenses for genomic analysis, including Partek, IPA, Graphia Professional, and BioRender, and provide assistance in using these tools.

Methods

Data Types

We employ a systematic approach to acquire and preprocess various types of biological data, including:

Sequencing Data: High-throughput DNA- and RNA-sequencing data from various platforms, including Illumina, PacBio and Oxford Nanopore.
Sequencing Approach: Whole Genome Sequencing (WGS), Whole Exome Sequencing (WES), Targeted panel sequencing, Enrichment Libraries Sequencing (e.g., ATAC-seq, ChIP-seq), bulk RNA-sequencing, Single Cell RNA-sequencing (scRNA-seq).

Analysis Types

Genomic Analysis: Alignment, Variant Calling, Variant Functional Annotation
Bulk RNA-seq: Alignment, Assembly, Expression quantification, Differential Gene Expression, Isoform Analysis, Splicing Analysis, Gene Set Enrichment Analysis, Variant Calling, Variant Functional Annotation
Single Cell Analysis: Alignment, Assembly, Expression quantification, Cell Type Identification, Differential Gene Expression, Isoform Analysis, Splicing Analysis, Variant Calling, Gene Set Enrichment Analysis, Velocity, Trajectory, Spatial transcriptomics.
Epigenomic Analysis: DNA methylation patterns, histone modifications, chromatin accessibility.
Multiomics Analysis: To gain comprehensive insights, we integrate multi-omics data using advanced bioinformatic tools.

Software and Tools

We utilize a diverse range of bioinformatics software and tools, including but not limited to:

Single Cell Transcriptomics: STARsolo, CellRanger, Seurat, SingleR, ScType, Slingshot, velocyto
Genomic Analysis: BWA, Bowtie, GATK, SAMtools
Transcriptomic Analysis: STAR, Salmon, DESeq2, edgeR, DEXSeq

Quality Control and Validation

All analyses are subjected to rigorous quality control and validation to ensure the reliability and accuracy of results.

Our Research

Examples of projects that we helped with

Trim28 in prostate cancer and immune response
CDK4/6 inhibitor in Breast Cancer
Spatial transcriptomics of TNBC
HDAC8 & HDAC11 RNAseq data analysis
Fibroblast-macrophage communication
CD8 immunity against toxoplasmosis
Correlating genetic changes of DCIS cells with biomechanical alterations
Single cell RNA-seq of human neuroblastoma cells after photothermal nanoparticle treatment
Jurkat_BRCA1_HA_FKBP_clone
KRAS and P16 in Mouse Models of Esophageal Cancer
Investigating the Renin-Angiotensin system and Dopamine system in adolescent and adult prefrontal cortex.
Brain Renin–Angiotensin System as Novel and Potential Therapeutic Target for Alzheimer’s Disease
Differential expression of genes and pathways in parental and resistant cell lines and patient samples
SULF1 and SULF2 Dependent Pathways
Analysis of TCR repertoire in response to immunotherapy
BACH1 in head and neck cancer
Association of somatic variants with ethnicity in patients with colon cancer
ERV expression in Trim28 deleted prostates
CDK4/6-inhibitor resistance

The Department of Biochemistry & Molecular Medicine