McCormick Bioinformatics Core (MBC)

About

The McCormick Bioinformatics Core (MBC), which is supported by the McCormick Endowment Fund, offers bioinformatics services to researchers within the Department of Biochemistry and Molecular Medicine and the School of Medicine and Health Sciences. We emphasize the use of cutting-edge bioinformatics tools and methodologies to help unlock the full potential of complex multidimensional genomic data. Our mission is to foster a collaborative environment that brings together experimental and computational biologists from GWU and around the world, with the ultimate objective of advancing biomedical research.

Our Team

Anelia Horvath, PhD

Professor, Department of Biochemistry and Molecular Medicine, GW

Director, McCormick Bioinformatics Core

Co-Director, The McCormick Genomic & Proteomic Center, GW

Following my postdoctoral training in cancer genomics (NICHD, NIH, Bethesda, MD), where I led efforts to identify novel genes implicated in endocrine tumors, I joined the George Washington University (GWU) in September 2013 as a Co-Director of the McCormick Genomic and Proteomic Center (MGPC) and a Professor in the Department of Biochemistry and Molecular Medicine. Here, my primary focus lay in the development of novel methodologies for the comprehensive analysis and integration of genomics and transcriptomics data. My expertise spans more than 15 years of leading next-generation sequencing (NGS) analyses and developing custom bioinformatics solutions.

Over the past several years, my research has focused on single-cell genomics and, more specifically, on intratumor heterogeneity related to expressed genetic variation. This work is not limited to DNA-derived germline and somatic variants but extends to RNA-originating variation. As one of the leaders in this field, I have spearheaded the development of multiple analytical tools for studying cell-level expressed genetic variants. Ongoing initiatives include a cloud-based, scalable platform for public analysis of cell-level expressed genetic variation, complemented by the integration of artificial intelligence to guide the modeling of the origin and effects of these variations.

The Horvath Lab

Raja Mazumder, PhD

Professor, Department of Biochemistry and Molecular Medicine, GW

Co-Director, The McCormick Genomic & Proteomic Center, GW

In his roles as Professor of Biochemistry and Molecular Medicine at The George Washington University (GW), member of the GW Cancer Center, co-team lead at UniProt, and leader of international projects such as BioCompute, GlyGen and OncoMX, Dr. Mazumder has worked closely with his colleagues to develop international molecular biology and informatics resources and standards. He is the co-founder of the BioCompute bioinformatics data analysis reporting standard supported by the United States Food and Drug Administration (FDA). Through NCI, NSF, NIGMS, NIAID, pharmaceutical, and FDA funding, his group is actively involved in genomic and bioinformatics research associated with cancer biology, proteomics, glycobiology, metagenomics, and electronic health records. He is the co-founder of the cloud-compatible High Performance Integrated Virtual Environment (HIVE) for genomics and healthcare data analysis using novel algorithms, including machine learning. He co-directs The McCormick Genomic & Proteomic Center as well as the Bioinformatics M.S. and Ph.D. graduate programs at GW.

Allen Kim, PhD

Bioinformatics Specialist

As a Bioinformatics Specialist at The George Washington University, Allen is responsible for providing support to several next-generation sequencing projects in the Department of Biochemistry and Molecular Medicine. He also works on a Center project that involves identifying and characterizing single nucleotide variants (SNVs) in publicly available scRNA-seq datasets. In the past, he has worked on numerous other scientific projects spanning proteomics, cell biology, and structural biology. These have included experimental work to characterize different folds of a protein and various kinase reporter designs. His past computational work has included measuring protein turnover rates in proteomics experiments and identifying proteins with the ability to switch secondary structures.

 

  Services

Genomics Data Analysis

Our primary aim is the comprehensive analysis of genomics data. This involves data preprocessing, quality control, statistical analysis, and both standard and custom bioinformatics pipelines.

Visualization

We offer comprehensive data visualizations that cater to both standard and custom needs. For standard needs, we offer readily available visualizations such as UMAP and t-SNE plots, volcano plots, heatmaps, and Circos plots. For custom needs, we can create visualizations tailored to the unique characteristics and research objectives of the study.

Data Management

We maintain a robust system for fast and efficient ‘omics’ data acquisition and transfer to our secure and dedicated high-performance server. This system is meticulously designed to ensure both precision and efficiency throughout the data transfer, analysis, and delivery of outputs.​

Custom Solutions

When existing pipelines do not adequately meet the needs of a particular study, we design and deploy custom workflows tailored to the specific project. This may involve the incorporation of machine learning models, specialized statistical methods for differential gene expression analysis, or pathway enrichment tools for functional interpretation. ​

Methodological Writing

We offer expert support and guidance in developing the methodology section for various contexts, including research papers, grant applications, and collaborative projects.

Experimental Design

Statistical power is the probability that an experiment detects a true effect, that is, one minus the false-negative rate. We can help you plan every aspect of an experiment, from sample size determination to the choice of controls and treatments. By carefully considering factors such as effect size, variability, and the statistical tests to be used, we aim to design experiments that are optimized to detect even subtle biological effects.
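To illustrate how effect size, significance level, and power interact in sample size determination, here is a minimal sketch. It is not the Core's actual procedure; it assumes a two-sided, two-sample comparison of means with equal group sizes and uses the standard normal approximation. The function names are hypothetical. Count-based data such as RNA-seq usually call for dedicated methods, so treat the numbers as rough guidance only.

```python
from math import ceil, sqrt
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate samples per group for a two-sided, two-sample test.

    Normal approximation: n = 2 * ((z_(1 - alpha/2) + z_power) / d) ** 2,
    where d is the standardized effect size (difference in means divided
    by the common standard deviation).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def approximate_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power for a given group size (same assumptions as above).

    The negligible chance of rejecting in the wrong direction is ignored.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    return std_normal.cdf(effect_size * sqrt(n_per_group / 2) - z_alpha)


if __name__ == "__main__":
    # Smaller effects need many more samples to reach the same power.
    for d in (1.0, 0.5, 0.2):
        print(f"effect size {d}: {samples_per_group(d)} samples per group")
```

Under these assumptions, halving the effect size roughly quadruples the required sample size, which is why a realistic estimate of effect size and variability matters before data collection begins.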

  How We Work

Submit Your Request

We kindly request that you complete this form, providing a concise overview of your project's initial information. Upon receiving your form, we will promptly reach out to you, typically on the same day, to set up an initial meeting at a time that suits your schedule.​

Acknowledgements

The Core's services are currently generously subsidized by the McCormick Endowment Fund and are free to all labs in the Department. We kindly request that you acknowledge our team's contributions by including their percentage of effort in your future grant proposals and recognizing them as co-authors in publications. Please also acknowledge the support of the McCormick Endowment Fund in publications and in poster and oral presentations.

  Resources

MGPC Server

We maintain a secure and dedicated High-Performance Computing (HPC) server specifically equipped, and approved by the National Institutes of Health (NIH), to handle individual-level human data with the utmost security and compliance. The storage and computational configuration of the server is as follows: 2 x Intel Xeon E5-2680 v4, 14 cores @ 2.40 GHz; 1.5 TB DDR4 RAM; 70 TB of usable space for storing data. The server runs Rocky Linux 8 and is dbGaP data compliant.

The MGPC server hosts over 100 software packages curated for genomic data analysis. These include, but are not limited to, STAR, STARsolo, BWA, GATK, SAMtools, Strelka2, and CellRanger. These packages are updated regularly so that we maintain the most current versions to facilitate cutting-edge research in genomics.

Software Licenses

We hold a variety of commercial software licenses for genomic analysis, including Partek, IPA, Graphia Professional, and BioRender, and provide assistance in using these tools.

  Methods

Data Types

We employ a systematic approach to acquire and preprocess various types of biological data, including:​

  • Sequencing Data: High-throughput DNA- and RNA-sequencing data from various platforms, including Illumina, PacBio and Oxford Nanopore.​
  • Sequencing Approach: Whole Genome Sequencing (WGS), Whole Exome Sequencing (WES), Targeted panel sequencing, Enrichment Libraries Sequencing (e.g. ATAC-seq, ChIP-seq), bulk RNA-sequencing, Single Cell RNA-sequencing (scRNA-seq).

Analyses Types

  • Genomic Analysis: Alignment, Variant Calling, Variant Functional Annotation​
  • Bulk RNA-seq: Alignment, Assembly, Expression quantification, Differential Gene Expression, Isoform Analysis, Splicing Analysis, Gene Set Enrichment Analysis, Variant Calling, Variant Functional Annotation​
  • Single Cell Analysis: Alignment, Assembly, Expression quantification, Cell Type Identification, Differential Gene Expression, Isoform Analysis, Splicing Analysis, Variant Calling, Gene Set Enrichment Analysis, Velocity, Trajectory, Spatial transcriptomics.​
  • Epigenomic Analysis: DNA methylation patterns, histone modifications, chromatin accessibility.​
  • Multiomics Analysis: To gain comprehensive insights, we integrate multi-omics data using advanced bioinformatic tools.

Software and Tools

We utilize a diverse range of bioinformatics software and tools, including but not limited to:​

  • Single Cell Transcriptomics: STARsolo, CellRanger, Seurat, SingleR, ScType, Slingshot, velocyto
  • Genomic Analysis: BWA, Bowtie, GATK, SAMtools
  • Transcriptomic Analysis: STAR, Salmon, DESeq2, edgeR, DEXSeq

Quality Control and Validation

All analyses are subjected to rigorous quality control and validation to ensure the reliability and accuracy of results.

Our Research

Examples of projects that we helped with

  • Trim28 in prostate cancer and immune response
  • CDK4/6 inhibitor in Breast Cancer
  • Spatial transcriptomics of TNBC​
  • HDAC8 & HDAC11 RNAseq data analysis​
  • Fibroblast-macrophage communication​
  • CD8 immunity against toxoplasmosis​
  • Correlating genetic changes of DCIS cells with biomechanical alterations​
  • Single cell RNA-seq of human neuroblastoma cells after photothermal nanoparticle treatment
  • Jurkat_BRCA1_HA_FKBP_clone​
  • KRAS and P16 in Mouse Models of Esophageal Cancer​
  • Investigating the Renin-Angiotensin system and Dopamine system in adolescent and adult prefrontal cortex.​
  • Brain Renin–Angiotensin System as Novel and Potential Therapeutic Target for Alzheimer’s Disease​
  • Differential expression of genes and pathways in parental and resistant cell lines and patient samples​
  • SULF1 and SULF2 Dependent Pathways​
  • Analysis of TCR repertoire in response to immunotherapy​
  • BACH1 in head and neck cancer​
  • Association of somatic variants with ethnicity in patients with colon cancer​
  • ERV expression in Trim28 deleted prostates​
  • CDK4/6-inhibitor resistance​

Our Own Research