Int J Med Sci 2018; 15(1):46-58. doi:10.7150/ijms.20508

Research Paper

Antioxydation And Cell Migration Genes Are Identified as Potential Therapeutic Targets in Basal-Like and BRCA1 Mutated Breast Cancer Cell Lines

Maud Privat1,2, Justine Rudewicz2, Nicolas Sonnier1,2,3, Christelle Tamisier2, Flora Ponelle-Chachuat1,2, Yves-Jean Bignon1,2,3

1. Université Clermont Auvergne, Centre Jean Perrin, INSERM, U1240 Imagerie Moléculaire et Stratégies Théranostiques, F-63000 Clermont Ferrand, France
2. Département d'Oncogénétique, Centre Jean Perrin, F-63000 Clermont Ferrand, France
3. Biological Resources Center BB-0033-00075, Centre Jean Perrin, F-63000 Clermont Ferrand, France

This is an open access article distributed under the terms of the Creative Commons Attribution (CC BY-NC) license; see the license for full terms and conditions.
Privat M, Rudewicz J, Sonnier N, Tamisier C, Ponelle-Chachuat F, Bignon YJ. Antioxydation And Cell Migration Genes Are Identified as Potential Therapeutic Targets in Basal-Like and BRCA1 Mutated Breast Cancer Cell Lines. Int J Med Sci 2018; 15(1):46-58. doi:10.7150/ijms.20508.

Abstract


Basal-like breast cancers are among the most aggressive cancers, and effective targeted therapies are still lacking. To identify new therapeutic targets, we performed Methyl-Seq and RNA-Seq on 10 breast cancer cell lines with different phenotypes. We confirmed that breast cancer subtypes cluster the RNA-Seq data but not the Methyl-Seq data. The hypermethylated phenotype of basal-like tumors was not confirmed in our study, but RNA-Seq analysis allowed us to identify 77 genes significantly overexpressed in basal-like breast cancer cell lines. Among them, 48 were also overexpressed in triple-negative breast cancers in TCGA data. Some molecular functions were overrepresented in this candidate gene list. Genes involved in antioxidation, such as SOD1, MGST3 and PRDX1, or cadherin-binding genes, such as PFN1, ITGB1 and ANXA1, could thus be considered as basal-like breast cancer biomarkers. We then asked whether these genes were linked to BRCA1, since this gene is often inactivated in basal-like breast cancers. Nine genes were found to be overexpressed in both basal-like breast cancer cells and BRCA1-mutated cells. Among them, at least 3 genes code for proteins implicated in epithelial cell migration and epithelial-to-mesenchymal transition (VIM, ITGB1 and RHOA).

Our study provides several potential therapeutic targets for triple-negative and BRCA1-mutated breast cancers. The acquisition of migratory and mesenchymal properties by basal-like breast cancer cells appears to be a key functional pathway in these tumors, which have a high metastatic potential.

Keywords: Basal-like breast cancer, BRCA1, RNA-Seq, cell migration, antioxidation


Introduction

Basal-like breast cancers (BLBC) represent between 10 and 20% of breast cancers. They are associated with an aggressive phenotype, high histological grade, poor clinical behavior, and high rates of relapse (1). This cancer subgroup is characterized by the lack of estrogen receptor (ER) and progesterone receptor (PR) expression and of HER2 amplification (triple-negative breast cancers, TNBC), together with expression of basal cytokeratins 5/6, 14 and 17, epidermal growth factor receptor (EGFR), and/or c-KIT. BLBC currently lack any specific targeted therapy: because they do not express ER or HER2, they are typically refractory to endocrine therapy and to trastuzumab, a humanized monoclonal antibody that targets HER2.

The identification of new markers and therapeutic targets is thus necessary for this poor-prognosis cancer type. First, BRCA1-associated breast cancers are mostly BLBC (2), and sporadic BLBC (occurring in women without germline BRCA1 mutations) often show dysfunction of the BRCA1 pathway. The characteristics of hereditary BRCA1-associated breast cancers found in sporadic BLBC have thus been termed "BRCAness", with potential clinical implications (3). As the BRCA1 pathway may be deficient in BLBC, these tumors may respond to specific therapeutic regimens, such as inhibitors of the poly(ADP-ribose) polymerase (PARP) enzyme (4). Cells deficient in BRCA1 have a defect in the repair of DNA double-strand breaks, which could make them particularly sensitive to drugs that generate such breaks, such as PARP inhibitors. However, not all BLBC are associated with BRCA1 inactivation.

Second, EGFR could represent a therapeutic target, as it is often overexpressed in BLBC. Recently, a phase II clinical trial showed good results (57% pathological complete response) for panitumumab combined with an anthracycline/taxane-based chemotherapy in operable triple-negative breast cancer (5). Nevertheless, this study highlighted biological signatures correlated with treatment response. The heterogeneity of triple-negative breast cancers requires subtyping in order to better identify molecular-based therapies. As early as 2006, Neve et al. separated BLBC cell lines into two subgroups (basal A and basal B) with different invasive properties (6). Lehmann et al. then identified 6 triple-negative breast cancer subtypes: 2 basal-like (BL1 and BL2), an immunomodulatory (IM), a mesenchymal (M), a mesenchymal stem-like (MSL) and a luminal androgen receptor (LAR) subtype (7).

All these subclassifications of triple-negative breast cancers were identified by studying transcriptomic profiles. Epigenetic modifications in breast cells could also help to characterize these breast cancers. Roll et al. reported a hypermethylator phenotype in BLBC, characterized by methylation-dependent silencing of the CEACAM6, CDH1, CST6, ESR1, GNA11, MUC1, MYB, SCNN1A, and TFF3 genes, which are involved in a wide range of neoplastic processes relating to tumors with poor prognosis (8).

Our Methyl-Seq results did not confirm this hypermethylator phenotype, and we could not identify BLBC-specific hypermethylated genes. On the other hand, the RNA-Seq data allowed us to identify antioxidation and cell migration as specifically activated pathways in basal-like breast cancer cells.

Materials and methods

Biological material

The main characteristics of the cell lines used are presented in Table 1. MDA-MB-231 and HCC1937 human breast cancer cell lines were purchased from the American Type Culture Collection (Rockville, MD, USA) and were grown in RPMI medium supplemented with 10% foetal calf serum, 2 mM L-glutamine and 20 μg/ml gentamicin. SUM149 and SUM1315 human breast cancer cell lines were obtained from Asterand (Hertfordshire, UK) and grown in Ham's F12 medium according to the manufacturer's instructions. SUM1315MO2 cells were transfected with a pLXSN plasmid containing the full-length BRCA1 cDNA using Fugene 6 transfection reagent (Roche Molecular Biochemicals). Control cells were transfected with the empty pLXSN vector. After selection in 721.5 µM G418 (Sigma Aldrich), clones were tested for BRCA1 expression by Western blotting (9). All cell lines were grown at 37 °C in a humidified atmosphere containing 5% CO2. All our cell lines are stored and managed by the CJP Biological Resources Center (BB-0033-00075).

Cell immunohistochemistry

Cells were fixed in Preservcyt solution (Thinprep) and cytoblocks were prepared with Shandon Cytoblock kit (Thermo Scientific). Hormone receptors (ER and PR), HER2, EGFR and cytokeratin status were studied as already described (5). The immunostainings were scored semi-quantitatively by an expert pathologist under a light upright microscope.

Table 1

Main characteristics of the cell lines.

Cell line | Site of origin | Pathology | Molecular type (6) | Triple negative subtype (7) | BRCA1 status | TP53 status
MCF10A | Normal breast | Fibrocystic | Basal B | - | Wild type | Wild type
MCF7 | Pleural effusion | Adenocarcinoma | Luminal | - | Wild type | Wild type
T47D | Pleural effusion | Adenocarcinoma | Luminal | - | Wild type | Missense mutation
MDA231 | Pleural effusion | Adenocarcinoma | Basal B | MSL | Wild type | Missense mutation
MDA436 | Pleural effusion | Adenocarcinoma | Basal B | MSL | 5382insC | Nonsense mutation
HCC1937 | Primary tumor | Infiltrating ductal carcinoma | Basal A | BL1 | 5396 + 1G>A | Nonsense mutation
SUM149 | Primary tumor | Inflammatory breast carcinoma | Basal B | BL2 | 2288delT | Missense mutation
SUM1315 | Skin metastasis | Infiltrating ductal carcinoma | Basal B | - | 185delAG | Missense mutation
SUM1315-LXSN (SL) | Skin metastasis | Infiltrating ductal carcinoma | Basal B | - | 185delAG | Missense mutation
SUM1315-BRCA1 (SB) | Skin metastasis | Infiltrating ductal carcinoma | Basal B | - | 185delAG + wild type | Missense mutation

Nucleic acid extraction

DNA extraction was performed using the QIAamp DNA mini kit (Qiagen) for cell lines and the QIAamp DNA micro kit (Qiagen) for tumors. RNA extraction was performed using the RNeasy mini kit (Qiagen). The quantity and quality of the nucleic acids obtained were measured spectrophotometrically at 260 nm and 280 nm. RNA quality was also checked on a 2100 Bioanalyzer (Agilent Technologies).

RNA sequencing and data processing

First, mRNA was purified using the Oligotex mRNA mini kit (Qiagen). cDNA libraries were then generated following the GS-FLX Titanium cDNA Rapid Library Preparation Method Manual (Roche). Finally, emPCR amplification and 454 sequencing were performed according to the manufacturer's protocol (emPCR Amplification Manual- Lib-L LV and Sequencing Method Manual-GS FLX Titanium Series, Roche). RNA-Seq data are available in the ArrayExpress database under accession number E-MTAB-5465.

Sequence reads were aligned to the human genome (hg19) with GS Reference Mapper software (Roche) and mapped to the human exome using in-house software named AGSA. Data were then normalized by calculating the 'reads per kilobase per million mapped reads' (RPKM) for each gene. RPKM values below the threshold of 0.3 were considered background noise and replaced by zero.
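The RPKM normalization and noise filtering described above can be sketched as follows. This is a minimal illustration and not the AGSA software itself: the function name `rpkm` and its parameters are hypothetical, and the total number of mapped reads is assumed here to be the sum of the per-gene read counts.

```python
def rpkm(read_counts, gene_lengths_bp, noise_threshold=0.3):
    """Return one RPKM value per gene, with background noise set to zero.

    Assumption: the total number of mapped reads is taken as the sum
    of the per-gene counts supplied in read_counts.
    """
    total_millions = sum(read_counts) / 1_000_000
    values = []
    for count, length_bp in zip(read_counts, gene_lengths_bp):
        # Reads per kilobase of gene, per million mapped reads.
        value = count / ((length_bp / 1_000) * total_millions)
        # Values below the threshold are treated as background noise.
        values.append(value if value >= noise_threshold else 0.0)
    return values
```

With one million mapped reads in total, a 100 kb gene covered by only 10 reads has an RPKM of 0.1 and is therefore set to zero.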

Validation of gene regulation by q-RT-PCR

Total RNA was extracted from cell lines using the RNeasy mini kit (Qiagen) according to the manufacturer's protocol. RNA quality was checked using the 2100 Bioanalyzer (Agilent Technologies). Five micrograms of RNA were then reverse-transcribed using the First-strand cDNA synthesis kit (GE Healthcare). Multiplex quantitative RT-PCR was performed using a 7900HT Fast Real-Time PCR System (Applied Biosystems).

Predesigned and validated gene-specific probe-based TaqMan Gene Expression Assays were used, and relative gene expression was determined using the comparative threshold cycle method. Ribosomal 18S was chosen as the endogenous control gene.
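The comparative threshold cycle calculation is commonly written as 2^(-ΔΔCt). The sketch below assumes roughly 100% amplification efficiency for both the target gene and the 18S control; the function name, the choice of a calibrator sample and the Ct values in the usage note are hypothetical and do not come from the study.

```python
def relative_expression(ct_target_sample, ct_control_sample,
                        ct_target_calibrator, ct_control_calibrator):
    """Fold change of a target gene in a sample relative to a calibrator.

    Each Ct is first normalized to the endogenous control (18S in the
    text), then the sample is compared with the calibrator.
    """
    delta_ct_sample = ct_target_sample - ct_control_sample
    delta_ct_calibrator = ct_target_calibrator - ct_control_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    # Assumes the amount of product doubles at every cycle.
    return 2.0 ** (-delta_delta_ct)
```

For example, a target detected two cycles earlier in the sample than in the calibrator, with identical 18S Ct values, corresponds to a four-fold higher relative expression.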

Methyl-DNA sequencing and data processing

First, DNA was fragmented by nebulization for 1 min 30 s at 2.1 bar of nitrogen pressure. After DNA purification using the Qiaquick PCR purification kit (Qiagen), methylated DNA was captured using the MethylCap kit (Diagenode) following the supplier's recommendations. Libraries were generated using the GS FLX Titanium Rapid Library Preparation Kit (Roche). Finally, emPCR amplification and 454 sequencing were performed according to the manufacturer's protocol (emPCR Amplification Manual- Lib-L LV and Sequencing Method Manual-GS FLX Titanium Series, Roche). Methyl-Seq data are available in the ArrayExpress database under accession number E-MTAB-5468.

Sequence reads were aligned to the human genome (hg19) with GS Reference Mapper software (Roche) and mapped to the human proximal promoters using in-house software named AGSA. Data were then normalized by calculating the 'reads per million mapped reads' (RPM) for each gene. RPM values below the threshold of 0.3 were considered background noise and replaced by zero.

TCGA data analysis

Both clinical and RNA sequencing data (Illumina HiSeq RNAseq Version 2 data) of invasive breast cancers were downloaded from The Cancer Genome Atlas (TCGA) database. A total of 449 patients with information on ER, PR and HER2 status were selected to compare the expression profiles of genes in the respective tumors. 71 cases were found to have a negative ER, PR and HER2 phenotype (i.e., triple-negative), whereas 371 cases were positive for at least one of these receptors.

Statistical analysis

Statistical analysis of the data was performed using R software. Significant differences between cell line groups were sought using the Wilcoxon test. Statistical overrepresentation tests were performed using the PANTHER classification system (10). For TCGA data, Student's t-test was used to assess statistical differences in mean normalized expression between the triple-negative and non-triple-negative groups. A p-value ≤ 0.05 was considered statistically significant.


Results

Data normalization

Our transcriptome data consisted of 16,008 genes before and 13,168 genes after RPKM normalization. Methylome data consisted of 6,140 genes before and 6,109 genes after RPM normalization. Eliminated genes are those whose expression or methylation did not exceed the background noise in any cell line.

For the transcriptome, the mean of each cell line's data was much higher than the median and the third quartile, revealing a very high concentration of data around zero and a very large number of reads for some genes, as observed in the box plots (Fig 1 and Table 2). For the methylome, the median and third quartile were zero while the mean was between 0.8 and 1.8 reads per gene, again revealing a very high concentration of data around zero and a very large number of reads for some genes. In addition, for both the transcriptome and the methylome, all cell lines displayed a standard deviation well above the mean number of reads per gene, revealing a very high dispersion of the data.

From the RPKM or RPM data, a log transformation was performed to spread out the low values and compress the high values. Standardization was also performed to obtain centered and reduced data (zero mean, unit variance). These two normalizations could be combined to give centered-reduced log-normalized data.
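The two transformations can be sketched as below. Several details are assumptions, since the text does not specify them: a pseudocount of 1 is added before taking the logarithm (so that zero values stay at zero), standardization is applied to the values of one cell line at a time, and the sample standard deviation is used. The function names are hypothetical.

```python
from math import log1p
from statistics import mean, stdev


def log_normalize(values):
    """Log-transform RPKM or RPM values; log1p(x) = ln(1 + x), so 0 maps to 0."""
    return [log1p(value) for value in values]


def standardize(values):
    """Center to zero mean and scale to unit (sample) standard deviation."""
    center = mean(values)
    spread = stdev(values)
    return [(value - center) / spread for value in values]


def log_standardize(values):
    """Combine both steps: log transformation followed by standardization."""
    return standardize(log_normalize(values))
```

Applied to the values of one cell line, `log_standardize` returns data with mean 0 and standard deviation 1, while the logarithm reduces the weight of the few very highly expressed genes.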

 Fig 1 

Data standardization. The distributions of transcriptome and methylome data are represented as box plots. For most of the cell lines, the distribution has many values close to zero and a minority of extreme values.


 Table 2 

Summary of transcriptome and methylome data.

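Transcriptome (RPKM) | Mean | 3.
Transcriptome (RPKM) | 3rd quartile | 1.56 | 1.47 | 1.58 | 1.76 | 1.45 | 1.82 | 1.63 | 1.46 | 1.37 | 1.43
Transcriptome (RPKM) | Standard deviation | 21.12 | 11.91 | 9.980 | 14.65 | 12.31 | 14.55 | 15.56 | 11.65 | 8.33 | 9.43
Methylome (RPM) | Mean | 0.95 | 0.89 | 0.95 | 1.53 | 1.34 | 0.88 | 0.83 | 0.75 | 0.92 | 0.76
Methylome (RPM) | 3rd quartile | 0.
Methylome (RPM) | Standard deviation | 1.77 | 2.08 | 2.70 | 3.06 | 3.90 | 1.99 | 2.33 | 1.61 | 2.33 | 2.18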

Means, medians, 3rd quartile, maximum values and standard deviations of the normalized data RPKM (transcriptome) and RPM (methylome) are presented for each cell line.

Unsupervised analysis

First we investigated how breast cancer cell lines clustered by RNA-Seq and Methyl-Seq.

For RNA-Seq, a hierarchical clustering of the RPKM-normalized data was generated from Euclidean distances according to Ward's method (Fig 2). This clustering was performed on the base matrix (Fig 2A), the log-normalized data (Fig 2B), the standardized data (Fig 2C) and the log-standardized data (Fig 2D). Whatever the normalization, the SUM1315, SB and SL lines were always classified together, as were the two luminal cell lines T47D and MCF7. As already observed with RNA microarrays, transcriptomic analysis by RNA-Seq thus made it possible to separate luminal and basal-like breast cancer cells.

In contrast, the benign MCF10A line appears to be different from the tumor lines only for the base matrix (Fig 2A). This is in agreement with the fact that this cell line was classified as basal-like in several studies (6,11).

For Methyl-Seq, a hierarchical clustering of the RPM-normalized data was generated from Euclidean distances according to Ward's method (Fig 3). This clustering was performed on the base matrix (Fig 3A), the log-normalized data (Fig 3B), the standardized data (Fig 3C) and the log-standardized data (Fig 3D). Methyl-Seq data were massively influenced neither by breast cancer subtype nor by BRCA1 mutation. MDA231 and MDA436 seem to present a hypermethylated phenotype: these two cell lines have a higher mean methylation (1.535 and 1.343, versus 0.75 to 0.95 for all the other cell lines).
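The clustering principle used for both data sets can be illustrated with a naive implementation of Ward's criterion on Euclidean distances. This is only a sketch for a handful of samples: the function name `ward_merges` and the toy profiles in the usage note are hypothetical, and a real analysis would rely on an established implementation such as `hclust` in R.

```python
def ward_merges(profiles):
    """Agglomerate samples with Ward's criterion; return the merges in order.

    profiles maps a sample name to its vector of per-gene values.
    """
    # Each cluster keeps its member names and its centroid.
    clusters = {name: ([name], list(vector)) for name, vector in profiles.items()}
    merges = []
    while len(clusters) > 1:
        best = None
        names = list(clusters)
        for i, first in enumerate(names):
            for second in names[i + 1:]:
                members_a, centroid_a = clusters[first]
                members_b, centroid_b = clusters[second]
                n_a, n_b = len(members_a), len(members_b)
                squared_distance = sum(
                    (x - y) ** 2 for x, y in zip(centroid_a, centroid_b)
                )
                # Ward's criterion: increase in within-cluster sum of squares.
                cost = n_a * n_b / (n_a + n_b) * squared_distance
                if best is None or cost < best[0]:
                    best = (cost, first, second)
        _, first, second = best
        members_a, centroid_a = clusters.pop(first)
        members_b, centroid_b = clusters.pop(second)
        n_a, n_b = len(members_a), len(members_b)
        centroid = [
            (n_a * x + n_b * y) / (n_a + n_b)
            for x, y in zip(centroid_a, centroid_b)
        ]
        clusters[first + "+" + second] = (members_a + members_b, centroid)
        merges.append((members_a, members_b))
    return merges
```

Given four toy profiles in which A resembles B and C resembles D, the first two merges group A with B and C with D, and the last merge joins the two pairs, which is the kind of structure read from the dendrograms.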

Search for genes most significantly regulated

To determine whether the data were normally distributed, a Kolmogorov-Smirnov test against the normal distribution was performed on the base matrix (data), the log-normalized matrix (data log), the centered-reduced matrix (data C & R) and the centered-reduced log-normalized matrix (log data C & R). For both transcriptome and methylome data, the p-value was less than 2 × 10⁻¹⁶ for all four matrices, well below the type I error risk α, set at 0.01. We therefore concluded that the data do not follow a normal distribution. A non-parametric Wilcoxon test was thus performed for each gene to select genes significantly regulated between two groups of cell lines.
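As an illustration of the per-gene test, the sketch below computes an exact two-sided Wilcoxon rank-sum p-value by enumerating every possible assignment of ranks to the first group. The analysis in the study was run in R; this Python version, its function names and the expression values in the usage note are hypothetical and only meant to show the principle for small groups.

```python
from itertools import combinations


def midranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda index: values[index])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks


def exact_rank_sum_p(group_a, group_b):
    """Exact two-sided p-value of the Wilcoxon rank-sum test."""
    ranks = midranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    expected = n_a * (len(ranks) + 1) / 2
    observed = abs(sum(ranks[:n_a]) - expected)
    # Enumerate every way of drawing n_a ranks out of the pooled ranks.
    deviations = [
        abs(sum(subset) - expected) for subset in combinations(ranks, n_a)
    ]
    extreme = sum(1 for deviation in deviations if deviation >= observed - 1e-9)
    return extreme / len(deviations)
```

With hypothetical groups of 2 and 8 cell lines in which both values of the small group lie below all values of the large group, the function returns 2/45, about 0.044, which is the smallest p-value attainable for groups of these sizes.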

Subtype: 1205 genes with p<0.05

First, we compared the luminal and basal cell lines. 1205 genes were found to be differentially expressed between luminal and basal-like cell lines (p<0.05). Among them, we found well-known basal-specific genes, such as EGFR, VIM, CAV1 and CAV2 (1). Conversely, ESR1, coding for the estrogen receptor alpha, is expressed only in the two luminal cell lines. Luminal keratins (KRT8, KRT18 and KRT19) are also significantly more expressed in luminal cell lines.

 Fig 2 

Ascending hierarchical classification of breast cancer cell line RNA-Seq data. Classification of cell lines according to the Euclidean distances of gene expression for the basic matrix (A), log normalized (B), centered reduced (C) and log normalized and centered reduced (D).


 Fig 3 

Ascending hierarchical classification of breast cancer cell line Methyl-Seq data. Classification of cell lines according to the Euclidean distances of gene methylation for the basic matrix (A), log normalized (B), centered reduced (C) and log normalized and centered reduced (D).


 Table 3 

Comparison of RNA-Seq and immunohistochemistry data.

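IHC | ER | - | + (90%) | + (50%) | - | - | - | - | - | - | -
IHC | PR | - | + (40%) | + (80%) | - | - | - | - | - | - | -
IHC | CK5/6 | + (90%) | - | - | - | - | + (25%) | + (25%) | - | - | -
IHC | CK14 | + (80%) | - | - | - | - | + (20%) | + (<1%) | - | - | -
IHC | EGFR | + (100%) | - | + (40%) | + (100%) | + (90%) | + (100%) | + (100%) | + (90%) | + (60%) | + (80%)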

RNA-Seq results are presented as RPKM values. Results of immunohistochemistry are presented as negative (-) or positive (+), specifying the percentage of labeled cells.

IHC: immunohistochemistry; ER: estrogen receptor; PR: progesterone receptor; CK: cytokeratin.

For the genes used for clinical classification of basal-like breast cancers, we could compare our RNA-Seq data to immunohistochemistry results (Table 3). A very good correlation was observed between mRNA expression analysed by RNA-Seq and protein expression studied by immunocytochemistry.

With the objective of identifying new therapeutic targets for basal-like breast cancers, we selected genes that were highly expressed (>10 RPKM) and significantly up-regulated in basal cell lines. This reduced the list to 77 candidate genes (Table 4). Using the expression data from the TCGA "Breast Invasive Carcinoma" project, we confirmed significant overexpression of 48 genes of this list in triple-negative breast tumors compared with non-triple-negative breast tumors. Among them, some have already been described as basal-like markers, such as Annexin A1 (12,13) and Vimentin (14,15).
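The TCGA comparison relies on Student's t-test between triple-negative and non-triple-negative tumors. A minimal sketch of the pooled-variance statistic is given below, with hypothetical function names. Because the Python standard library has no t distribution, the p-value is approximated here with the normal distribution, which is only reasonable for large groups such as the 71 and 371 cases mentioned in the methods; it is not the exact computation performed in R.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def student_t(group_a, group_b):
    """Pooled-variance Student's t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    degrees_of_freedom = n_a + n_b - 2
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / degrees_of_freedom
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, degrees_of_freedom


def approximate_two_sided_p(t_statistic):
    """Two-sided p-value from the normal approximation (large samples only)."""
    return 2 * (1 - NormalDist().cdf(abs(t_statistic)))
```

A positive statistic with a p-value at or below 0.05 would correspond to significantly higher mean expression in the first group, under the threshold stated in the statistical analysis section.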

 Table 4 

List of the 77 genes highly expressed and significantly up-regulated in basal-like breast cancer cell lines.

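UQCRHL # | 58.16 | 77.51 | 430.89 | 182.15 | 184.14 | 222.29 | 352.09 | 412.39 | 223.07 | 181.27
TMSB10 # | 25.42 | 9.40 | 150.63 | 508.74 | 111.48 | 85.40 | 215.62 | 244.01 | 142.27 | 192.18
PFN1 # | 77.62 | 79.56 | 152.67 | 276.20 | 206.92 | 233.86 | 257.91 | 111.56 | 137.20 | 122.80
S100A6 # | 22.98 | 19.45 | 416.84 | 181.76 | 87.12 | 197.18 | 213.62 | 134.53 | 77.28 | 78.92
S100A2 *# | 0 | 0 | 396.33 | 1.83 | 6.93 | 1.03 | 58.95 | 13.39 | 12.77 | 4.03
LGALS1 # | 4.33 | 8.97 | 94.73 | 31.84 | 40.83 | 66.74 | 88.92 | 76.64 | 45.56 | 35.35
MRPL51 # | 9.48 | 6.10 | 33.88 | 40.25 | 83.61 | 53.47 | 79.59 | 53.90 | 43.90 | 33.53
GADD45GIP1 # | 20.21 | 13.10 | 56.25 | 46.81 | 46.27 | 52.13 | 54.98 | 61.42 | 49.41 | 44.08
VIM *# | 0.50 | 0 | 4.16 | 66.68 | 0.77 | 45.39 | 5.55 | 92.39 | 111.79 | 57.37
GBA3 # | 23.01 | 13.33 | 38.60 | 63.11 | 41.76 | 26.47 | 47.50 | 44.67 | 42.65 | 43.55
SF3B5 # | 21.57 | 11.73 | 60.91 | 26.86 | 30.75 | 65.28 | 22.73 | 44.05 | 40.86 | 47.65
RHOA * | 18.08 | 15.94 | 31.02 | 23.71 | 34.34 | 31.11 | 30.49 | 39.17 | 42.33 | 26.96
ANXA1 # | 0 | 0.74 | 28.25 | 48.48 | 68.29 | 32.67 | 23.70 | 21.05 | 15.59 | 19.83
SEC61G # | 6.17 | 9.31 | 31.31 | 15.74 | 25.86 | 25.40 | 28.09 | 55.55 | 18.64 | 22.24
PRKCDBP # | 0 | 0 | 12.95 | 110.28 | 58.21 | 1.48 | 16.03 | 5.36 | 5.97 | 11.87
IGFBP3 # | 0 | 0 | 0.52 | 3.06 | 35.62 | 5.03 | 0.42 | 7.23 | 71.32 | 72.54
MRPL18 # | 11.32 | 10.28 | 20.14 | 25.41 | 31.24 | 22.67 | 22.04 | 25.26 | 23.32 | 21.50
TOMM5 # | 5.41 | 8.71 | 22.67 | 37.90 | 32.04 | 23.64 | 10.35 | 24.73 | 13.88 | 17.39
NOL7 # | 5.33 | 8.45 | 10.25 | 27.78 | 22.53 | 8.94 | 11.42 | 42.43 | 28.90 | 29.18
NCL # | 12.21 | 12.21 | 18.56 | 20.20 | 12.60 | 40.10 | 20.97 | 24.80 | 20.65 | 20.23
LOC100134713 # | 1.05 | 0.40 | 45.26 | 32.04 | 38.63 | 13.57 | 5.78 | 18.77 | 12.09 | 9.76
PRNP # | 3.60 | 4.94 | 26.82 | 31.78 | 39.99 | 6.25 | 19.82 | 15.22 | 6.41 | 15.54
SNX3 *# | 7.77 | 4.94 | 20.69 | 25.26 | 40.09 | 23.57 | 10.62 | 17.08 | 13.86 | 9.34
GADD45A * | 0.78 | 1.59 | 8.91 | 31.59 | 32.62 | 17.73 | 2.67 | 20.19 | 27.45 | 12.92
TM4SF1 # | 0.51 | 0 | 3.27 | 37.44 | 25.03 | 6.81 | 9.97 | 26.03 | 18.67 | 26.40
MRPL34 # | 12.20 | 11.19 | 29.64 | 19.45 | 19.31 | 18.26 | 15.31 | 18.10 | 13.49 | 18.97
MANF *# | 3.78 | 4.36 | 8.94 | 52.26 | 30.04 | 8.20 | 13.00 | 13.69 | 16.19 | 8.36
PRDX1 # | 7.42 | 6.62 | 15.43 | 8.62 | 20.45 | 16.24 | 21.76 | 21.97 | 19.50 | 26.71
MLF2 # | 9.78 | 7.32 | 13.41 | 13.15 | 26.36 | 20.47 | 23.37 | 16.44 | 19.19 | 16.94
C17orf89 # | 4.00 | 5.19 | 13.21 | 32.82 | 51.74 | 7.74 | 12.94 | 8.25 | 8.05 | 11.30
WBP5 # | 0 | 1.76 | 9.72 | 42.26 | 11.18 | 31.02 | 4.59 | 20.00 | 15.79 | 9.17
EEF1E1 # | 4.18 | 5.29 | 15.98 | 41.85 | 24.68 | 12.12 | 20.58 | 7.80 | 5.37 | 8.01
EBNA1BP2 # | 3.20 | 7.18 | 9.58 | 23.10 | 18.68 | 10.83 | 9.71 | 17.32 | 26.00 | 16.38
TGFBI # | 0 | 0 | 14.00 | 0.36 | 1.69 | 0.98 | 7.18 | 28.45 | 25.45 | 53.07
SSBP1 # | 4.59 | 4.44 | 22.03 | 17.02 | 17.64 | 12.73 | 7.92 | 17.30 | 12.90 | 11.92
FAM96B # | 2.86 | 2.49 | 7.98 | 20.27 | 37.20 | 3.58 | 15.76 | 9.45 | 14.00 | 7.22
MGST3 # | 2.93 | 3.59 | 28.31 | 6.61 | 13.63 | 12.62 | 10.69 | 14.69 | 12.31 | 10.71
ITGB1 * | 1.71 | 3.11 | 3.24 | 12.42 | 13.20 | 5.08 | 6.81 | 27.43 | 21.63 | 12.56
CAPG # | 0.44 | 2.72 | 16.12 | 24.80 | 14.30 | 8.38 | 10.00 | 12.28 | 9.43 | 6.45
NOP16 # | 7.04 | 3.55 | 17.47 | 11.58 | 11.93 | 11.64 | 11.67 | 17.28 | 9.52 | 10.16
MRPS15 # | 3.55 | 4.53 | 15.28 | 8.90 | 16.47 | 8.13 | 11.86 | 11.24 | 12.77 | 12.19
UBA52 # | 4.68 | 3.17 | 18.72 | 8.21 | 8.33 | 11.36 | 14.47 | 17.47 | 8.43 | 8.56
MRPS6 # | 1.55 | 3.25 | 10.22 | 6.78 | 4.26 | 8.21 | 16.38 | 26.15 | 9.38 | 12.48
TRAPPC3 # | 5.10 | 4.38 | 11.16 | 13.86 | 11.94 | 7.53 | 9.19 | 14.14 | 13.21 | 12.60
SSU72 # | 7.68 | 7.17 | 12.52 | 13.80 | 15.57 | 7.80 | 8.45 | 11.33 | 10.31 | 11.90
RBX1 # | 1.51 | 1.99 | 8.76 | 22.44 | 22.06 | 7.59 | 9.59 | 10.66 | 2.05 | 3.22
COTL1 # | 2.13 | 2.15 | 4.19 | 5.45 | 20.06 | 13.86 | 6.97 | 9.80 | 10.02 | 13.43
DDT # | 3.88 | 4.63 | 29.74 | 5.75 | 6.53 | 5.77 | 9.57 | 9.91 | 7.70 | 8.61
RPS12 # | 4.11 | 3.59 | 17.47 | 7.04 | 8.50 | 9.25 | 9.52 | 14.96 | 8.63 | 7.79
NOP56 # | 3.23 | 4.49 | 7.52 | 20.87 | 19.58 | 7.55 | 8.79 | 4.96 | 5.14 | 7.90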

RNA-Seq results are presented as RPKM values.

* : genes that are also significantly overexpressed in BRCA1 mutated (SL) compared to BRCA1 restored (SB) cell lines.

# : genes that are significantly overexpressed in triple negative breast cancers in the TCGA RNAseq data.

 Fig 4 

Expression of some genes involved in antioxidation and in cadherin binding. A. Gene expression levels found in our RNA-Seq study are presented as RPKM values. For all these genes, significant overexpression was observed in triple-negative cell lines. B. TCGA data are presented as RPKM values. **: p<0.01; ***: p<0.001


This list of 77 basal-like-specific genes was also submitted to the PANTHER statistical overrepresentation test. Two molecular functions were overrepresented in this list (Fig 4A): antioxidant activity (5 genes, p=0.0318) and cadherin binding involved in cell-cell adhesion (6 genes, p=0.0231). For these 11 genes, TCGA data were compared between triple-negative and non-triple-negative breast cancers (Fig 4B). Significant overexpression in triple-negative breast cancers was found for the antioxidation genes SOD1, MGST3 and PRDX1 and for the cadherin-binding genes PFN1, ITGB1, ARGLU1 and ANXA1.
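The idea behind an overrepresentation test can be sketched with a hypergeometric upper tail: how likely is it to see at least the observed number of annotated genes in a list drawn at random from a reference set? This is an illustration of the principle only and may differ from the exact test and corrections applied by PANTHER; the function name and the reference-set sizes in the usage note are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(population, annotated, sample, hits):
    """P(at least `hits` annotated genes in a random sample of genes).

    population: number of genes in the reference set
    annotated:  number of reference genes carrying the annotation
    sample:     size of the candidate gene list
    hits:       annotated genes observed in the candidate list
    """
    total = comb(population, sample)
    upper = min(annotated, sample)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return favourable / total
```

For the list above, `sample` would be 77 and `hits` would be 5 for antioxidant activity; the sizes of the reference set and of the annotated category are not given in the text and would have to be taken from the annotation database.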

Influence of BRCA1: specific study SL / SB: 979 genes with p<0.05

We then studied our BRCA1-transfected cell lines: SL and SB are derived from the BRCA1-mutated SUM1315 cell line, stably transfected with either the empty LXSN plasmid (SL: SUM1315-LXSN) or a BRCA1-coding plasmid (SB: SUM1315-BRCA1). As already demonstrated (9), we verified that the SB cell line expressed around 5 times more BRCA1 transcripts than the SL cell line. Comparing the RNA-Seq data of these 2 cell lines, we found 979 differentially expressed genes. Among them, 304 genes were expressed at least two-fold higher in SL than in SB (Table 5). This gene list was submitted to the PANTHER statistical overrepresentation test. The most overrepresented molecular functions included semaphorin receptor activity (5 genes, p=0.0071), growth factor binding (12 genes, p=0.00327) and cell adhesion molecule binding (21 genes, p=0.0177).

We found 9 genes that were overexpressed both in basal-like cell lines compared to luminal cell lines and in BRCA1-mutated (SL) compared to BRCA1-restored (SB) cell lines (Fig 5A). We performed q-RT-PCR experiments in order to validate these overexpressed genes. For 7 of these genes, we could validate higher expression in basal-like cell lines (Fig 5B) and in the BRCA1-mutated cell line (Fig 5C). These 7 genes can be considered as basal-like biomarkers and potential therapeutic targets, particularly in BRCA1-mutated cancers. Among these genes, four are linked to the cytoskeleton and could thus be implicated in epithelial cell migration: VIM, ITGB1, RHOA and TUBB6.


Discussion

Our study tested RNA-Seq and Methyl-Seq as sensitive methods to categorize breast cancer tumor cells. We found that Methyl-Seq is not an appropriate method to differentiate breast cancer subtypes. The small amount of data generated with our technology is a limitation that could probably be overcome with the latest generations of sequencers. Nevertheless, it seems that breast cancer subtypes do not broadly influence genome methylation. Among the genes published by Roll et al. (8), only ESR1 was found to be methylated in the MDA-MB-231 and MCF10A cell lines. We also observed a specific profile of the MDA-MB-231 and MDA-MB-436 cell lines, which appear to be globally hypermethylated. Further study will be needed to understand why these cell lines are hypermethylated.

 Table 5 

List of the 304 genes significantly up-regulated in BRCA1 mutated cell line (SL) compared to BRCA1 wild-type cell line (SB).


RNA-Seq results are presented as RPKM values.

 Fig 5 

Expression of genes overexpressed in triple-negative and in BRCA1-mutated cell lines. A. Gene expression levels found in our RNA-Seq study are presented as RPKM values. For all these genes, significant overexpression was observed both in triple-negative versus luminal cell lines and in BRCA1-mutated (SL) versus BRCA1-restored (SB) cell lines. B. Gene overexpression in basal-like breast cancer cell lines was confirmed by q-RT-PCR. *: p<0.05 C. Gene overexpression in BRCA1-mutated (SL) compared to BRCA1-restored (SB) cell lines was confirmed by q-RT-PCR. *: p<0.05


By contrast, RNA-Seq proved to be a reliable method for breast cancer classification. We showed that different subtypes could be separated by global gene clustering. Moreover, transcript expression analyzed by RNA-Seq was shown to correlate with protein expression evaluated by immunohistochemistry. With the steady decrease in sequencing prices, RNA-Seq could become the standard method for the molecular characterization of breast tumors. This could make it possible to propose a personalized treatment according to the therapeutic targets overexpressed in the tumor.

In our study we chose to focus on basal-like breast tumors. We thus identified some potentially new targets in this breast cancer subtype. PANTHER analysis (10) revealed that genes involved in cadherin binding were overrepresented among the genes overexpressed in basal-like cells. For most of them, we could confirm overexpression in triple-negative breast cancers using The Cancer Genome Atlas data. Among them, PFN1 codes for profilin 1, a regulator of actin polymerization, but it is known to be downregulated in breast cancers (16). In contrast, our study confirms the overexpression of two genes involved in breast cancer progression, ANXA1 and ITGB1. ANXA1 was already shown to be associated with triple-negative breast cancers (13,17). A recent study identified ITGB1, which codes for integrin beta 1, as a potential prognostic biomarker in triple-negative breast cancers (18).

Furthermore, genes involved in the regulation of cell oxidation were also found to be overexpressed in basal-like breast cancers. GPX1 codes for glutathione peroxidase 1, which protects cells against oxidative stress. A polymorphism of this gene has been associated with breast cancer risk (19). The PARK7 gene (also known as DJ-1) is involved in the protection of neurons against oxidative stress and cell death. It was recently shown to interact with the HER3 receptor (20). For these two genes, our results showed at least a 3-fold overexpression in basal-like breast cancer cell lines, but this overexpression was not confirmed in TCGA triple-negative breast tumors. For SOD1, PRDX1 and MGST3, overexpression in triple-negative cells was observed both in our cell lines and in TCGA tumors. MGST3 is another gene of the antioxidant system, shown to be overexpressed in some melanomas (21). Superoxide dismutase 1 is an enzyme encoded by the SOD1 gene. It has already been suggested as a potential anti-cancer drug target (22,23). Finally, PRDX1 has a controversial role in the oxidation-reduction balance, and its expression seems to be up-regulated in breast cancer tissues (24). Oxidative stress could thus be targeted in basal-like breast cancers. This therapeutic strategy has already been proposed in other types of cancer (25,26).

BRCA1 is a double-strand break repair gene known to be frequently inactivated in basal-like breast cancers. In our results, BRCA1 mutation does not appear to massively influence the transcriptome of breast cancer cells. Nevertheless, we looked for genes overexpressed both in basal-like and in BRCA1-mutated breast cancer cells. ITGB1 is one of these genes; BRCA1 could thus be proposed as a regulator of integrin beta 1. It could be particularly interesting to inhibit ITGB1 in BRCA1-mutated basal-like breast cancers. The VIM gene codes for vimentin, a mesenchymal marker, and we also found it overexpressed in BLBC cell lines and in the BRCA1-mutated cell line. It is therefore also an interesting biomarker of triple-negative and BRCA1-mutated breast cancers. MicroRNA-138, which targets vimentin, has been proposed as a therapeutic agent for breast cancer (28). RHOA and TUBB6 are also overexpressed in BRCA1-mutated and basal-like breast cancer cells. The TUBB6 gene codes for a tubulin protein, the major constituent of the microtubule cytoskeleton. RhoA is a small GTPase involved in actin cytoskeleton organization, and it thus regulates cell shape and motility (27). Physical reorganization of the cytoskeleton appears to be important in BRCA1-mutated breast cancers. This ability to remodel cell shape could explain the high metastatic capacity of these cancers. Targeting the proteins involved in this function could thus be an effective therapeutic strategy for basal-like breast cancers.


Acknowledgements

The results published here are in part based upon data generated by the TCGA Research Network. We used the PANTHER classification system to analyze RNA-Seq data (10).

Competing Interests

The authors have declared that no competing interest exists.


1. Valentin MD, da Silva SD, Privat M, Alaoui-Jamali M, Bignon Y-J. Molecular insights on basal-like breast cancer. Breast Cancer Res Treat. 2012Jul;134(1):21-30

2. Waddell N, Arnold J, Cocciardi S, da Silva L, Marsh A, Riley J. et al. Subtypes of familial breast tumours revealed by expression and copy number profiling. Breast Cancer Res Treat. 2010 Oct;123(3):661-77

3. Lips EH, Mulder L, Oonk A, van der Kolk LE, Hogervorst FBL, Imholz ALT. et al. Triple-negative breast cancer: BRCAness and concordance of clinical features with BRCA1-mutation carriers. Br J Cancer. 2013 May 28;108(10):2172-7

4. Lee J, Ledermann JA, Kohn EC. PARP Inhibitors for BRCA1/2 mutation-associated and BRCA-like malignancies. Ann Oncol. 2014 Jan;25(1):32-40

5. Nabholtz JM, Abrial C, Mouret-Reynier MA, Dauplat MM, Weber B, Gligorov J. et al. Multicentric neoadjuvant phase II study of panitumumab combined with an anthracycline/taxane-based chemotherapy in operable triple-negative breast cancer: identification of biologically defined signatures predicting treatment impact. Ann Oncol. 2014 Aug 1;25(8):1570-7

6. Neve RM, Chin K, Fridlyand J, Yeh J, Baehner FL, Fevr T. et al. A collection of breast cancer cell lines for the study of functionally distinct cancer subtypes. Cancer Cell. 2006 Dec;10(6):515-27

7. Lehmann BD, Bauer JA, Chen X, Sanders ME, Chakravarthy AB, Shyr Y. et al. Identification of human triple-negative breast cancer subtypes and preclinical models for selection of targeted therapies. J Clin Invest. 2011 Jul 1;121(7):2750-67

8. Roll JD, Rivenbark AG, Sandhu R, Parker JS, Jones WD, Carey LA. et al. Dysregulation of the epigenome in triple-negative breast cancers: Basal-like and claudin-low breast cancers express aberrant DNA hypermethylation. Exp Mol Pathol. 2013;95(3):276-87

9. Privat M, Aubel C, Arnould S, Communal Y, Ferrara M, Bignon Y-J. Breast cancer cell response to genistein is conditioned by BRCA1 mutations. Biochem Biophys Res Commun. 2009 Feb 13;379(3):785-9

10. Mi H, Huang X, Muruganujan A, Tang H, Mills C, Kang D. et al. PANTHER version 11: expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements. Nucleic Acids Res. 2016 gkw1138

11. Subik K, Lee J-F, Baxter L, Strzepek T, Costello D, Crowley P. et al. The Expression Patterns of ER, PR, HER2, CK5/6, EGFR, Ki-67 and AR by Immunohistochemical Analysis in Breast Cancer Cell Lines. Breast Cancer Basic Clin Res. 2010 May 20;4:35-41

12. Sobral-Leite M, Wesseling J, Smit VTHBM, Nevanlinna H, van Miltenburg MH, Sanders J. et al. Annexin A1 expression in a pooled breast cancer series: association with tumor subtypes and prognosis. BMC Med. 2015:13

13. Bhardwaj A, Ganesan N, Tachibana K, Rajapakshe K, Albarracin CT, Gunaratne PH. et al. Annexin A1 Preferentially Predicts Poor Prognosis of Basal-Like Breast Cancer Patients by Activating mTOR-S6 Signaling. PLoS ONE. 2015;10(5)

14. Zelenko Z, Gallagher EJ, Tobin-Hess A, Belardi V, Rostoker R, Blank J. et al. Silencing vimentin expression decreases pulmonary metastases in a pre-diabetic mouse model of mammary tumor progression. Oncogene. 2016 Aug 29

15. Tanaka K, Tokunaga E, Inoue Y, Yamashita N, Saeki H, Okano S. et al. Impact of Expression of Vimentin and Axl in Breast Cancer. Clin Breast Cancer.

16. Valenzuela-Iglesias A, Sharma VP, Beaty BT, Ding Z, Gutierrez-Millan LE, Roy P. et al. Profilin1 regulates invadopodium maturation in human breast cancer cells. Eur J Cell Biol. 2015 Feb;94(2):78-89

17. Sobral-Leite M, Wesseling J, Smit VTHBM, Nevanlinna H, van Miltenburg MH, Sanders J. et al. Annexin A1 expression in a pooled breast cancer series: association with tumor subtypes and prognosis. BMC Med. 2015 Jul 2;13

18. Klahan S, Huang W-C, Chang C-M, Wong HS-C, Huang C-C, Wu M-S. et al. Gene expression profiling combined with functional analysis identify integrin beta1 (ITGB1) as a potential prognosis biomarker in triple negative breast cancer. Pharmacol Res. 2016;104:31-7

19. Hu J, Zhou G-W, Wang N, Wang Y-J. GPX1 Pro198Leu polymorphism and breast cancer risk: a meta-analysis. Breast Cancer Res Treat. 2010 Nov 1;124(2):425-31

20. Zhang S, Mukherjee S, Fan X, Salameh A, Mujoo K, Huang Z. et al. Novel association of DJ-1 with HER3 potentiates HER3 activation and signaling in cancer. Oncotarget. 2016 Aug 25;7(40):65758-69

21. Bracalente C, Ibañez IL, Berenstein A, Notcovich C, Cerda MB, Klamt F. et al. Reprogramming human A375 amelanotic melanoma cells by catalase overexpression: Upregulation of antioxidant genes correlates with regression of melanoma malignancy and with malignant progression when downregulated. Oncotarget. 2016 May 10;7(27):41154-71

22. Che M, Wang R, Li X, Wang H-Y, Zheng XFS. Expanding roles of superoxide dismutases in cell regulation and cancer. Drug Discov Today. 2016 Jan;21(1):143-9

23. Papa L, Manfredi G, Germain D. SOD1, an unexpected novel target for cancer therapy. Genes Cancer. 2014 Jan;5(1-2):15-21

24. Ding C, Fan X, Wu G. Peroxiredoxin 1 - an antioxidant enzyme in cancer. J Cell Mol Med. 2016 Sep 1

25. Trachootham D, Alexandre J, Huang P. Targeting cancer cells by ROS-mediated mechanisms: a radical therapeutic approach? Nat Rev Drug Discov. 2009 Jul 1;8(7):579-91

26. Fang J, Seki T, Maeda H. Therapeutic strategies by modulating oxygen stress in cancer and inflammation. Adv Drug Deliv Rev. 2009;61(4):290-302

27. O'Connor K, Chen M. Dynamic functions of RhoA in tumor cell migration and invasion. Small GTPases. 2013 Jul 1;4(3):141-7

28. Zhang J, Liu D, Feng Z, Mao J, Zhang C, Lu Y. et al. MicroRNA-138 modulates metastasis and EMT in breast cancer cells by targeting vimentin. Biomed Pharmacother. 2016;77:135-41

Received 2017-9-19
Accepted 2017-10-11
Published 2018-1-1