LentiMPRA identifies breast cancer associated risk variants with functional regulatory activity

Embargo End Date

2026-09-05

ICR Authors

Authors

Mackie, K

Document Type

Thesis or Dissertation

Date

2026-02-05

Date Accepted

Abstract

Genome-wide association studies (GWAS), combined with fine-mapping, have identified 196 independent signals associated with breast cancer risk. Identifying functional variants, and their target genes, at these regions is important as it can inform on underlying mechanisms contributing to breast cancer risk and in turn direct prevention and treatment interventions. The functionality of thousands of putative regulatory regions can be measured simultaneously using massively parallel reporter assays (MPRA). One iteration of this technique uses a lentivirus-based protocol (lentiMPRA), which has the advantage of assessing regulatory potential following genomic integration of sequences into cells. LentiMPRA was used to assess the functionality of 5,116 breast cancer associated credible causal variants (CCVs) reported by the Breast Cancer Association Consortium (BCAC). A barcoded oligonucleotide library of candidate regulatory sequences (CRS), centred on the reference and alternative allele of each CCV, was generated. This was then packaged into a lentivirus library and used to infect T-47D cells, from which a transcription rate for each CRS was estimated using DNA and RNA sequencing. Cell type specificity was assessed by repeating the lentiMPRA, using a subset of the oligonucleotide library, in a fibroblast cell line (GS2). Output data from the T-47D lentiMPRA was used to identify the subset of CCVs that (i) map to a CRS with enhancer activity and (ii) show allelic differences in that activity. There were 306 CRS with significant enhancer activity and 709 variants, mapping to 140 risk regions, that were associated with significant variation between alleles. Validation of individual CRS and CCVs, and investigation into possible target genes, was carried out using CRISPR interference (CRISPRi) and CRISPR activation (CRISPRa). This identified CCDC88C and ZMIZ1 as target genes of lentiMPRA-prioritised variants. Follow-up analysis found both genes to be associated with ER+ disease, and, for CCDC88C, patient outcome.

Citation

2026

DOI

Source Title

Publisher

Institute of Cancer Research (University Of London)

ISSN

eISSN

Research Team

Functional Genetic Epi

Notes