About
The McCormick Bioinformatics Core (MBC), which is supported by the McCormick Endowment Fund, offers bioinformatics services to researchers within the Department of Biochemistry and Molecular Medicine and the School of Medicine and Health Sciences. We emphasize the use of cutting-edge bioinformatics tools and methodologies to help unlock the full potential of complex multidimensional genomic data. Our mission is to foster a collaborative environment that brings together experimental and computational biologists from GWU and around the world, with the ultimate objective of advancing biomedical research.
Our Team
- Anelia Horvath, PhD
Professor, Department of Biochemistry and Molecular Medicine, GW
Director, McCormick Bioinformatics Core
Co-Director, The McCormick Genomic & Proteomic Center, GW
Following my postdoctoral training in Cancer Genomics (NICHD, NIH, Bethesda, MD), where I led efforts to identify novel genes implicated in endocrine tumors, I joined the George Washington University (GWU) in September 2013 as Co-Director of the McCormick Genomic and Proteomic Center (MGPC) and a Professor in the Department of Biochemistry and Molecular Medicine. Here, my primary focus has been the development of novel methodologies for the comprehensive analysis and integration of genomics and transcriptomics data. I have more than 15 years of experience leading next-generation sequencing (NGS) analyses and developing custom bioinformatics solutions.
Over the past several years, my scientific focus has been on single-cell genomics and, more specifically, on intratumor heterogeneity related to expressed genetic variation. This work is not limited to DNA-derived germline and somatic variants but extends to RNA-originating variation. As one of the leaders in this field, I have spearheaded the development of multiple analytical tools for studying cell-level expressed genetic variants. Ongoing initiatives include the creation of a cloud-based, scalable platform for public analysis of cell-level expressed genetic variation, complemented by the integration of artificial intelligence to guide the modeling of the origin and effects of these variations.
- Siera Martinez
BS in Neurobiology
MS in Computer Science and Machine Learning (ongoing)
Siera brings a unique blend of expertise in bioinformatics and computational biology, merging her background in biology and neuroscience with advanced skills in machine learning, programming, and statistical analysis. Her experience includes satellite data analysis at NASA and creating bioinformatics workflows for single-cell RNA sequencing in Dr. Horvath's lab at George Washington University. With strong proficiency in next-generation sequencing, software development, and cancer data modeling, Siera is exceptionally equipped to contribute to the McCormick Bioinformatics Core. Her technical skills and hands-on research experience align seamlessly with the Core’s mission to propel genomic and transcriptomic research through innovative computational approaches.
- Luke Johnson
MS in Bioinformatics
Luke Johnson, a graduate of GWU's Bioinformatics and Molecular Biochemistry program, combines a passion for technology with a strong commitment to scientific discovery. With expertise in Linux, shell scripting, R, and Python, alongside a solid foundation in biochemistry and practical experience in machine learning, Luke brings a comprehensive technical skill set to the team. His focus on single-cell data analytics drives his enthusiasm for bioinformatics, where he is eager to make impactful contributions and continue exploring the field’s potential.
Services
Genomics Data Analysis
Our primary aim is the comprehensive analysis of genomics data. This involves data preprocessing, quality control, statistical analysis, and both standard and custom bioinformatics pipelines.
Visualization
We offer comprehensive data visualizations that cater to both standard and custom needs. For standard needs, we offer readily available visualizations such as UMAP and tSNE plots, volcano plots, heatmaps, and Circos plots. For custom needs, we can create visualizations tailored to the unique characteristics and research objectives of the study.
Data Management
We maintain a robust system for fast and efficient ‘omics’ data acquisition and transfer to our secure and dedicated high-performance server. This system is meticulously designed to ensure both precision and efficiency throughout the data transfer, analysis, and delivery of outputs.
Custom Solutions
When existing pipelines do not adequately meet the needs of a particular study, we design and deploy custom workflows tailored to the specific project. This may involve the incorporation of machine learning models, specialized statistical methods for differential gene expression analysis, or pathway enrichment tools for functional interpretation.
Methodological Writing
We offer expert support and guidance for the development of the methodology section in various contexts, including research papers, grant applications, and collaborative projects.
Experimental Design
Statistical power is a critical metric that measures the ability of an experiment to detect true effects accurately while minimizing the risk of false negatives. We can help you plan every aspect of an experiment, from sample size determination to the choice of controls and treatments. By carefully considering factors such as effect size, variability, and statistical tests, we aim to design experiments that are optimized to detect even subtle biological effects.
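As a rough illustration of how effect size, significance level, and power interact in sample size determination, the sketch below estimates the number of samples per group for a two-sided comparison of two group means. It uses the textbook normal approximation and is not the Core's actual procedure; the function name `samples_per_group` and the default values (alpha of 0.05, power of 0.80) are assumptions made for this example. Real genomics designs usually need methods that account for count data, multiple testing, and batch structure.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate samples per group for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2,
    where d is the standardized effect size (difference in means divided by
    the common standard deviation). Hypothetical helper for illustration only.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value of the two-sided test
    z_power = standard_normal.inv_cdf(power)          # quantile matching the target power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Smaller (subtler) effects require many more samples per group.
    for d in (0.3, 0.5, 0.8):
        print(f"effect size {d}: about {samples_per_group(d)} samples per group")
```

Because the required sample size scales with the inverse square of the effect size, halving the expected effect roughly quadruples the number of samples needed, which is why an early estimate of effect size and variability matters so much.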
How We Work
Submit Your Request
We kindly request that you complete this form, providing a concise overview of your project's initial information. Upon receiving your form, we will promptly reach out to you, typically on the same day, to set up an initial meeting at a time that suits your schedule.
Acknowledgements
The Core's services are currently generously subsidized by the McCormick Endowment Fund and are free to all labs in the Department. We kindly request that you acknowledge our team's contributions by including their percentage of effort in your future grant proposals and recognizing them as co-authors in publications. Please also acknowledge the McCormick Endowment Fund's support in publications and in poster and oral presentations.
Resources
MGPC Server
We maintain a secure and dedicated High-Performance Computing (HPC) server specifically equipped and approved by the National Institutes of Health (NIH) to handle individual-level human data with the utmost security and compliance. The storage and computational configuration of the server is as follows: 2 x Intel Xeon E5-2680 v4 (14 cores @ 2.40 GHz), 1.5 TB of DDR4 RAM, and 70 TB of usable space for storing data. The server runs Rocky Linux 8 and is dbGaP compliant.
The MGPC server hosts over 100 software packages curated for genomic data analysis. These include, but are not limited to, STAR, STARsolo, BWA, GATK, SAMtools, Strelka2, and CellRanger. These packages are updated regularly so that current versions are available to support cutting-edge research in genomics.
Software Licenses
We hold a variety of commercial software licenses for genomic analysis, including Partek, IPA, Graphia Professional, and BioRender, and provide assistance in using these tools.
Methods
Data Types
We employ a systematic approach to acquire and preprocess various types of biological data, including:
- Sequencing Data: High-throughput DNA- and RNA-sequencing data from various platforms, including Illumina, PacBio and Oxford Nanopore.
- Sequencing Approach: Whole Genome Sequencing (WGS), Whole Exome Sequencing (WES), Targeted panel sequencing, Enrichment Libraries Sequencing (e.g., ATAC-seq, ChIP-seq), bulk RNA-sequencing, Single Cell RNA-sequencing (scRNA-seq).
Analysis Types
- Genomic Analysis: Alignment, Variant Calling, Variant Functional Annotation
- Bulk RNA-seq: Alignment, Assembly, Expression quantification, Differential Gene Expression, Isoform Analysis, Splicing Analysis, Gene Set Enrichment Analysis, Variant Calling, Variant Functional Annotation
- Single Cell Analysis: Alignment, Assembly, Expression quantification, Cell Type Identification, Differential Gene Expression, Isoform Analysis, Splicing Analysis, Variant Calling, Gene Set Enrichment Analysis, Velocity, Trajectory, Spatial transcriptomics.
- Epigenomic Analysis: DNA methylation patterns, histone modifications, chromatin accessibility.
- Multiomics Analysis: To gain comprehensive insights, we integrate multi-omics data using advanced bioinformatic tools.
Software and Tools
We utilize a diverse range of bioinformatics software and tools, including but not limited to:
- Single Cell Transcriptomics: STARsolo, CellRanger, Seurat, SingleR, ScType, Slingshot, velocyto
- Genomic Analysis: BWA, Bowtie, GATK, SAMtools
- Transcriptomic Analysis: STAR, Salmon, DESeq2, edgeR, DEXSeq
Quality Control and Validation
All analyses are subjected to rigorous quality control and validation to ensure the reliability and accuracy of results.
Our Research
Examples of projects that we helped with
- Trim28 in prostate cancer and immune response
- CDK4/6 inhibitor in Breast Cancer
- Spatial transcriptomics of TNBC
- HDAC8 & HDAC11 RNAseq data analysis
- Fibroblast-macrophage communication
- CD8 immunity against toxoplasmosis
- Correlating genetic changes of DCIS cells with biomechanical alterations
- Single cell RNA-seq of human neuroblastoma cells after photothermal nanoparticle treatment
- Jurkat_BRCA1_HA_FKBP_clone
- KRAS and P16 in Mouse Models of Esophageal Cancer
- Investigating the Renin-Angiotensin system and Dopamine system in adolescent and adult prefrontal cortex.
- Brain Renin–Angiotensin System as Novel and Potential Therapeutic Target for Alzheimer’s Disease
- Differential expression of genes and pathways in parental and resistant cell lines and patient samples
- SULF1 and SULF2 Dependent Pathways
- Analysis of TCR repertoire in response to immunotherapy
- BACH1 in head and neck cancer
- Association of somatic variants with ethnicity in patients with colon cancer
- ERV expression in Trim28 deleted prostates
- CDK4/6-inhibitor resistance