Sources & Data

References & Data Sources

Every fact, statistic, and biological claim on this website is sourced from peer-reviewed literature or authoritative public databases. This page catalogues them all.

Educational use only. This website is a learning tool built on publicly available research data. It is not a clinical resource and should not be used for medical decisions. For clinical information about TP53 mutations, consult a qualified healthcare provider or refer directly to the databases listed below.
Primary Databases · data sources
NCI TP53 Database
tp53.cancer.gov
The authoritative resource for TP53 variants, maintained by the National Cancer Institute. R21 (January 2025) contains ~29,900 somatic tumor variants from the literature since 1989, plus germline variants, functional data, and cell line information.
Used for: Mutation map data · Cancer frequency data · Hotspot frequencies
RCSB Protein Data Bank
rcsb.org · Structure 2OCJ
The global archive of 3D macromolecular structure data. Structure 2OCJ is the crystal structure of the p53 core domain (DNA-binding domain) at 2.05 Å resolution, solved by Joerger, Ang & Fersht (2006). Used for all 3D visualizations in this site.
Used for: 3D protein viewer · Domain visualization
UniProt P04637
uniprot.org/uniprot/P04637
The Swiss-Prot entry for human p53 (TP53_HUMAN). Used as the authoritative source for domain boundary positions, post-translational modifications, and protein length (393 amino acids).
Used for: Domain boundaries · Protein length · Domain functions
COSMIC
cancer.sanger.ac.uk/cosmic
Catalogue of Somatic Mutations in Cancer, maintained by the Wellcome Sanger Institute. Provides mutation frequency data and tissue distribution across cancer types, complementary to the NCI TP53 Database.
Used for: Mutation frequencies · Cancer type context
cBioPortal
cbioportal.org
An open platform for exploring multidimensional cancer genomics data from TCGA and other large studies. Used to cross-check mutation prevalence figures across cancer types.
Used for: Prevalence cross-check · Cancer genomics
ClinVar
ncbi.nlm.nih.gov/clinvar
NCBI's archive of relationships between genomic variants and human health. Used to verify germline TP53 variant classifications related to Li-Fraumeni syndrome.
Used for: Germline variants · Clinical significance
Discovery & Historical Overview · foundational papers
[1]
Lane DP, Crawford LV.
T antigen is bound to a host protein in SV40-transformed cells.
Nature 278, 261–263 (1979).
Used for: Discovery of p53 (1979). This paper first described a 53 kDa cellular protein co-immunoprecipitating with the SV40 large T antigen. Along with Linzer & Levine and DeLeo et al. (all 1979), it established p53 as a cellular protein of interest. Lane later coined the phrase "guardian of the genome" (1992).
[2]
Linzer DIH, Levine AJ.
Characterization of a 54K dalton cellular SV40 tumor antigen present in SV40-transformed cells and uninfected embryonal carcinoma cells.
Cell 17(1), 43–52 (1979).
Used for: Independent co-discovery of p53 in May 1979, two months after Lane & Crawford, using a different experimental system.
[3]
Lane DP.
Cancer. p53, guardian of the genome.
Nature 358(6381), 15–16 (1992).
Used for: Origin of the phrase "guardian of the genome" — used in the title and hero section of this website.
[4]
Olivier M, Hollstein M, Hainaut P.
TP53 mutations in human cancers: origins, consequences, and clinical use.
Cold Spring Harbor Perspectives in Biology 2(1), a001008 (2010).
Used for: Mutation prevalence statistics; the figure that 86% of TP53 mutations cluster between codons 125–300 (the DNA-binding domain); and the observation that >75% of all TP53 mutations are missense substitutions.
[5]
Vousden KH, Prives C.
Blinded by the light: the growing complexity of p53.
Cell 137(3), 413–431 (2009).
Used for: Authoritative review of p53 function: DNA damage sensing, cell cycle arrest (growth arrest), apoptosis induction, and the MDM2 negative feedback loop. Basis for the About page functional descriptions.
[6]
Bouaoun L, Sonkin D, Ardin M, et al.
TP53 variations in human cancers: new lessons from the IARC TP53 database and genomics data.
Human Mutation 37(9), 865–876 (2016).
Used for: Current mutation frequency data; comparison of NCI/IARC and TCGA genomics data; and hotspot codon positions (codons 248, 273, 175, and 245 are the top four by frequency). This is the primary methodological reference for the mutation map.
[7]
de Andrade KC, Lee EE, Tookmanian EM, et al.
The TP53 Database: transition from the International Agency for Research on Cancer to the US National Cancer Institute.
Cell Death & Differentiation 29, 1071–1073 (2022).
Used for: The official citation for the NCI TP53 Database (R21, January 2025), as requested by the database itself. The R21 release contains ~29,900 somatic tumor variants, which is the source of the "29,900+ catalogued variants" statistic on the Home page.
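The codon-range figure cited under [4] is a share of counts, so it can be recomputed from any per-codon tally. The following is a minimal sketch, assuming counts are held in an array indexed by codon. The function name `shareInCodonRange` and the toy counts are hypothetical and are not taken from the NCI TP53 Database.

```javascript
// Hypothetical helper: share of mutations whose codon lies in [start, end].
// countsByCodon[i] is assumed to hold the count for codon i + 1,
// so a full-length p53 array has 393 entries.
function shareInCodonRange(countsByCodon, start, end) {
  if (!Number.isInteger(start) || !Number.isInteger(end) || start < 1 || end < start) {
    throw new RangeError("expected 1 <= start <= end");
  }
  let inRange = 0;
  let total = 0;
  countsByCodon.forEach((count, index) => {
    const codon = index + 1; // codons are 1-based
    total += count;
    if (codon >= start && codon <= end) inRange += count;
  });
  return total === 0 ? 0 : inRange / total;
}

// Toy data only: 393 codons with made-up counts at three positions.
const toyCounts = new Array(393).fill(0);
toyCounts[175 - 1] = 6; // inside 125-300
toyCounts[273 - 1] = 3; // inside 125-300
toyCounts[337 - 1] = 1; // outside 125-300

console.log(shareInCodonRange(toyCounts, 125, 300)); // 0.9 for this toy array
```

Run against real per-codon counts from a database release, the same function would show whether a quoted percentage still holds for that release.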
Protein Structure · 3D viewer sources
[8]
Joerger AC, Ang HC, Fersht AR.
Structural basis for understanding oncogenic p53 mutations and designing rescue drugs.
Proceedings of the National Academy of Sciences 103(41), 15056–15061 (2006).
Used for: The crystal structure loaded in the 3D viewer is PDB entry 2OCJ, which is the structure described in this paper. It shows the p53 core domain (DNA-binding domain) at 2.05 Å resolution and includes structural context for hotspot mutations R175H, R248W, and R273H.
[9]
Joerger AC, Fersht AR.
Structural biology of the tumor suppressor p53.
Annual Review of Biochemistry 77, 557–582 (2008).
Used for: Definitive reference for p53 protein domain structure and function, structural and contact mutant classification, and the structural basis of hotspot mutations. Domain boundary positions cited in protein_domains.json are consistent with this review.
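Domain boundaries such as those in protein_domains.json can be sanity-checked against the 393-residue protein length given in the UniProt entry. The sketch below is illustrative only. The entry shape `{ name, start, end }`, the function name `findDomainProblems`, and the placeholder entries are all assumptions, because the file's actual schema is not shown on this page. The non-overlap rule is likewise an assumption that a given annotation scheme may not share.

```javascript
// Hypothetical check for a domain list such as protein_domains.json.
// Assumed entry shape: { name, start, end } with 1-based inclusive positions.
function findDomainProblems(domains, proteinLength = 393) {
  const problems = [];
  const sorted = [...domains].sort((a, b) => a.start - b.start);
  sorted.forEach((domain, i) => {
    if (!Number.isInteger(domain.start) || !Number.isInteger(domain.end)) {
      problems.push(`${domain.name}: non-integer boundary`);
      return;
    }
    if (domain.start < 1 || domain.end > proteinLength || domain.start > domain.end) {
      problems.push(`${domain.name}: out of range 1-${proteinLength}`);
    }
    // Assumption: domains are expected not to overlap their predecessor.
    const previous = sorted[i - 1];
    if (previous && domain.start <= previous.end) {
      problems.push(`${domain.name}: overlaps ${previous.name}`);
    }
  });
  return problems;
}

// Placeholder entries, deliberately wrong, to show both kinds of report.
const sampleDomains = [
  { name: "placeholder-1", start: 1, end: 100 },
  { name: "placeholder-2", start: 90, end: 400 },
];
console.log(findDomainProblems(sampleDomains));
```

An empty array means every entry fits within the protein and none overlaps its neighbour.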
Hotspot Mutation Biology · mutations page
[10]
Brosh R, Rotter V.
When mutants gain new powers: news from the mutant p53 field.
Nature Reviews Cancer 9, 701–713 (2009).
Used for: The gain-of-function concept for hotspot mutants described on the About page ("When It Fails" card) and Mutations page. Documents how R175H, R248W, and R273H not only lose tumour suppressor function but acquire new cancer-promoting activities.
[11]
Bullock AN, Fersht AR.
Rescuing the function of mutant p53.
Nature Reviews Cancer 1, 68–76 (2001).
Used for: Classification of hotspot mutations into structural mutants (e.g., R175H), which distort the protein fold, and contact mutants (e.g., R248W, R273H), which directly disrupt DNA binding without major structural change. This classification is used in the hotspot detail cards.
[12]
Petitjean A, Mathe E, Kato S, et al.
Impact of mutant p53 functional properties on TP53 mutation patterns and tumor phenotype: lessons from recent developments in the IARC TP53 database.
Human Mutation 28(6), 622–629 (2007).
Used for: Hotspot codon frequencies (R175, G245, R248, R249, R273, R282 are the six most frequent somatic hotspots in the IARC database). Also the source for the figure that >75% of TP53 mutations are missense substitutions.
[13]
Muller PAJ, Vousden KH.
Mutant p53 in cancer: new functions and therapeutic opportunities.
Cancer Cell 25(3), 304–317 (2014).
Used for: Comprehensive review of gain-of-function mutant p53 biology, including the oncogenic activities of R175H (structural mutant) and R273H/R248W (contact mutants). Also the basis for the website's description that some mutant p53 proteins "promote cancer cell survival, invasion, and drug resistance."
[14]
Joerger AC, Fersht AR.
The p53 pathway: origins, inactivation in cancer, and emerging therapeutic approaches.
Annual Review of Biochemistry 85, 375–404 (2016).
Used for: Molecular detail for individual hotspot mutations: R175H disrupts the L2/L3 loops and the hydrophobic core of the DNA-binding domain; R248 and R273 are DNA-contact residues whose substitution directly abolishes DNA binding without major conformational change. Basis for hotspot card descriptions on the Mutations page.
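The structural/contact split described in the entries above maps naturally onto a small lookup table. This is a sketch under stated assumptions, not the site's implementation. The names `HOTSPOT_CLASS`, `parseMissense`, and `classifyHotspot` are hypothetical, and only the three variants classified on this page are included.

```javascript
// Classes as stated in the entries above: R175H is a structural mutant;
// R248W and R273H are contact mutants. Other variants are left unclassified.
const HOTSPOT_CLASS = new Map([
  ["R175H", "structural"],
  ["R248W", "contact"],
  ["R273H", "contact"],
]);

// Hypothetical parser for protein-level shorthand such as "R175H".
function parseMissense(label) {
  const match = /^([A-Z])(\d+)([A-Z])$/.exec(String(label).trim().toUpperCase());
  if (!match) return null;
  const codon = Number(match[2]);
  if (codon < 1 || codon > 393) return null; // p53 is 393 residues long
  return { ref: match[1], codon, alt: match[3] };
}

// Returns "structural", "contact", "unclassified", or "invalid".
function classifyHotspot(label) {
  const parsed = parseMissense(label);
  if (!parsed) return "invalid";
  const key = `${parsed.ref}${parsed.codon}${parsed.alt}`;
  return HOTSPOT_CLASS.get(key) ?? "unclassified";
}

console.log(classifyHotspot("R175H")); // "structural"
console.log(classifyHotspot("R273H")); // "contact"
```

Variants that parse correctly but are not in the table return "unclassified" rather than a guess, which keeps the lookup limited to what the cited reviews state.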
MDM2 Regulation & Feedback Loop · about page
[15]
Momand J, Zambetti GP, Olson DC, George D, Levine AJ.
The mdm-2 oncogene product forms a complex with the p53 protein and inhibits p53-mediated transactivation.
Cell 69(7), 1237–1245 (1992).
Used for: Discovery of the MDM2–p53 interaction and the negative feedback loop. MDM2 binds to p53's transactivation domain and blocks its activity; p53 in turn transcriptionally activates MDM2, completing the auto-regulatory loop described in the "MDM2 Balance" card.
[16]
Vogelstein B, Lane D, Levine AJ.
Surfing the p53 network.
Nature 408, 307–310 (2000).
Used for: Overview of the p53 pathway and its role as a network hub integrating stress signals, activating the MDM2 feedback loop, and coordinating cell fate decisions (arrest, repair, apoptosis, senescence). Foundational context for the About page.
Li-Fraumeni Syndrome · about & statistics pages
[17]
Li FP, Fraumeni JF Jr.
Soft-tissue sarcomas, breast cancer, and other neoplasms: a familial syndrome?
Annals of Internal Medicine 71(4), 747–752 (1969).
Used for: Original description of Li-Fraumeni syndrome — the hereditary cancer predisposition syndrome caused by germline TP53 mutations. Mentioned on the Statistics and About pages.
[18]
Guha T, Malkin D.
Inherited TP53 mutations and the Li-Fraumeni syndrome.
Cold Spring Harbor Perspectives in Medicine 7(4), a026187 (2017).
Used for: Li-Fraumeni syndrome prevalence (~1 in 5,000–20,000 individuals); lifetime cancer risk of approximately 90% for women and 70% for men who carry a germline TP53 pathogenic variant; and the spectrum of associated tumours (adrenocortical carcinoma, breast cancer, CNS tumours, osteosarcoma, soft-tissue sarcoma). These sex-stratified figures supersede earlier "near-100%" estimates, which did not stratify by sex.
Software & Libraries · technical credits
[19]
Rose AS, Bradley AR, Valasatava Y, Duarte JM, Prlić A, Rose PW.
NGL viewer: web-based molecular graphics for large complexes.
Bioinformatics 34(21), 3755–3758 (2018).
Used for: NGL Viewer is the JavaScript library that renders the 3D protein structure in structure.html. All molecular visualizations (cartoon, surface, ball-and-stick representations) are produced by NGL.
[20]
Bostock M, Ogievetsky V, Heer J.
D³: data-driven documents.
IEEE Transactions on Visualization and Computer Graphics 17(12), 2301–2309 (2011).
Used for: D3.js is the JavaScript library used to render the 393-bar mutation frequency heatmap (one bar per codon) on mutations.html.
[21]
Chart.js Contributors.
Chart.js: Simple yet flexible JavaScript charting library.
Open source software. Version 4.x. MIT License.
Used for: Chart.js renders the cancer frequency bar chart and mutation type doughnut chart on statistics.html.
[22]
Instrument Fonts — Rodrigo Fuenzalida / Google Fonts.
Instrument Serif, Instrument Sans.
Open Font License. Available at fonts.google.com.
Used for: Typography throughout the website. Instrument Serif is used for headings and the site logo, and Instrument Sans for body text. Inconsolata, a separate typeface that is not part of the Instrument family credited above, is used for data labels, navigation links, and monospaced displays.
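For readers curious how NGL [19] can display PDB entry 2OCJ, the following is a minimal sketch rather than the code used in structure.html. The function name `showCoreDomain` and the element id `"viewport"` are hypothetical. The `NGL` object is passed in as a parameter so that the function can also be exercised outside a browser.

```javascript
// Sketch only: NGL is injected so the function can be tested without a browser.
// containerId is a hypothetical element id, not one taken from structure.html.
function showCoreDomain(NGL, containerId, pdbId = "2OCJ") {
  const stage = new NGL.Stage(containerId);
  // The "rcsb://" prefix asks NGL to fetch the entry from the RCSB PDB.
  return stage.loadFile(`rcsb://${pdbId.toLowerCase()}`).then((component) => {
    component.addRepresentation("cartoon"); // ribbon-style backbone view
    component.autoView();                   // centre and zoom on the structure
    return component;
  });
}
```

In a page that has already loaded the NGL script and contains an element with id "viewport", the call would be `showCoreDomain(NGL, "viewport")`. Other representation names, such as "surface" or "ball+stick", can be passed to `addRepresentation` in the same way.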
A note on data currency
The mutation frequency data displayed on this website are based on the NCI TP53 Database R21 (January 2025). Mutation databases are updated periodically as new literature is added, so some frequencies may differ slightly in future releases. The protein structure (PDB 2OCJ) is a static crystal structure from 2006 and remains a valid reference for the p53 DNA-binding domain. For the most current variant data, always consult tp53.cancer.gov directly.