
Biomedical Journal of Scientific & Technical Research

March, 2022, Volume 42, 3, pp 33543-33549

Short Communication


The Abundance of Homologous MicroRNA-Binding Sites in the Human c-MET mRNA 3’UTR

Holly Clifton1, Miruna Giurgiu1, Summer Stephens1, Robert Kaltenbach1, Emma Davis1, Jahnavi Raval1, Shraddha ChandThakuri2, Brian Helms1, Valery Drozhenko3, Kai Ding4, Samuel Milanovich5, Alexei Savinov5, Vitaly Voloshin6, Jiling Zhong7 and Alexander Kofman1*

Author Affiliations

1Department of Biological and Environmental Sciences, Troy University, USA

2Hunter College, City University of New York, USA

3Filatov Institute, Odesa, Ukraine

4The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, USA

5Sanford Research, Sioux Falls, USA

6Department of Mathematics, Troy University, USA

7Department of Computer Sciences, Troy University, USA

Received: February 18, 2022 | Published: March 04, 2022

Corresponding author: Alexander Kofman, Department of Biological & Environmental Sciences, 210-D MSCX (McCall Hall), Troy University, Troy, AL 36082, USA, Email: akofman@troy.edu

DOI: 10.26717/BJSTR.2022.42.006740

ABSTRACT

MicroRNAs are small non-coding RNAs that regulate the expression of virtually all genes in nearly all biological species. They exert their function by binding to a complementary site on the targeted mRNA. Single nucleotide polymorphisms (SNPs) within microRNA-binding sites may weaken or abolish the interaction between microRNAs and their target transcripts. Each microRNA sequence can potentially influence the expression of hundreds of genes, whereas almost every mRNA can be targeted by dozens or even hundreds of microRNAs. Multiple seed-matching sites for the same microRNA on the same mRNA target sequence have been reported and are suggested to act cooperatively to enhance microRNA-mediated effects. However, multiple microRNA-binding sites could also safeguard the microRNA-mediated control of gene expression if one of the sites is disrupted by SNPs. To address the question of whether such multiple binding sites for the same microRNA occur at sufficient frequency, we analyzed the microRNA-binding sites located in the 3’UTR of the human c-MET mRNA. We found that these sites are abundant, include both conserved and non-conserved sites, are mostly canonical but sometimes non-canonical, belong to various structural types, and are positioned at variable distances from each other. Since the repetitive microRNA seed-matching sites are not always structurally identical and each of them is surrounded by a unique nucleotide context, we termed them homologous sites. We suggest that the homologous microRNA-binding sites may not only enhance the effects of the microRNA but also rescue the phenotype if one of the microRNA-binding sites becomes non-functional due to the presence of SNPs within or beyond the mRNA seed-matching region, tissue-specific variations in the expression of RNA-binding proteins, 3’UTR isoforms, or other factors that may influence the microRNA-mRNA interaction.

Keywords: microRNA; mRNA; Seed; Repetitive; Homologous; Single Nucleotide Polymorphisms

Abbreviations: miR: microRNA; mRNA: Messenger RNA; UTR: Untranslated Region; RISC: RNA-Induced Silencing Complex; Ago: Argonaute (proteins); SNP: Single Nucleotide Polymorphism; HGF: Hepatocyte Growth Factor; PDGFRA: Platelet-Derived Growth Factor Receptor Alpha; ACE2: Angiotensin-Converting Enzyme 2; CCR2: C-C Motif Chemokine Receptor Type 2; CD46: Complement Regulatory Protein; VTCN1: V-Set Domain-Containing T-Cell Activation Inhibitor 1; ARID5B: AT-Rich Interaction Domain 5B

Introduction

MicroRNAs (miRs) are small noncoding RNAs, about 20-24 nucleotides long, that are expressed in a wide range of eukaryotic organisms [1]. miRs bind to complementary sequences of their mRNA targets as part of the RNA-induced silencing complex (RISC), whose core components are the Argonaute (Ago) proteins [2]. In plants, miRs show perfect or near-perfect complementarity to their target RNA. In mammalian cells, as well as in other eukaryotes, miRs generally bind to their target mRNA sequences with imperfect complementarity [2,3]. The 5’ end of a miR carries the so-called seed sequence, which is complementary to 6-8 nucleotides in the 3’UTR of the mRNA [1,4]. Canonical miR target sites are classified by the extent and location of the matching miR nucleotides: 6mer sites pair perfectly with nucleotides 2-7 at the 5’ end of the miR, 7mer-1A sites add an adenine opposite miR nucleotide 1, 7mer-m8 sites add pairing with miR nucleotide 8, and 8mer sites combine both features and thus span miR nucleotides 1-8. The efficacy of these sites is usually augmented by additional pairing at the 3’ end of the miR. Sites that pair only with nucleotides at the 3’ end or in the center of the miR (the so-called 3’-compensatory sites and centered sites, respectively), as well as other non-canonical sites, are believed to be much less effective in regulating gene expression [4-7].
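The site categories described above can be illustrated with a minimal Python sketch. It is not the TargetScan implementation: it only scans an RNA string for perfect matches to miR nucleotides 2-7 and then checks the two flanking positions. The function names, the example miR sequence, and the toy UTR are assumptions made for illustration.

```python
# Minimal sketch (not the TargetScan algorithm): label perfect seed matches
# in an RNA string as 6mer, 7mer-1A, 7mer-m8, or 8mer.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (Watson-Crick pairs only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def classify_sites(mirna, utr):
    """Return (position, site_type) for every match to miR nucleotides 2-7.

    position is the 1-based index, within `utr`, of the first nucleotide
    of the 6-nt seed match. Both sequences are written 5' to 3'.
    """
    seed_match = reverse_complement(mirna[1:7])  # pairs with miR nt 2-7
    m8_base = COMPLEMENT[mirna[7]]               # pairs with miR nt 8
    sites = []
    start = utr.find(seed_match)
    while start != -1:
        # The base pairing with miR nt 8 lies immediately 5' of the seed match.
        has_m8 = start > 0 and utr[start - 1] == m8_base
        # The adenine opposite miR nt 1 lies immediately 3' of the seed match.
        has_a1 = start + 6 < len(utr) and utr[start + 6] == "A"
        if has_m8 and has_a1:
            site_type = "8mer"
        elif has_m8:
            site_type = "7mer-m8"
        elif has_a1:
            site_type = "7mer-1A"
        else:
            site_type = "6mer"
        sites.append((start + 1, site_type))
        start = utr.find(seed_match, start + 1)
    return sites


# Illustrative 22-nt miR sequence and a toy UTR built from one site of each type.
example_mir = "UGGCAGUGUCUUAGCUGGUUGU"
toy_utr = "GG" + "CACUGCCA" + "GGG" + "ACUGCCA" + "GG" + "CACUGCCG" + "UU" + "GACUGCCG"

for position, site_type in classify_sites(example_mir, toy_utr):
    print(position, site_type)
```

The sketch ignores G:U wobble pairing, site conservation, and the context features that prediction tools use to rank sites, so it serves only to make the four category definitions concrete.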

Naturally occurring variations, particularly single nucleotide polymorphisms (SNPs) in miRs and their target mRNAs [8], affect the miR-mRNA interaction, which in turn may contribute to susceptibility to various diseases [9-11], including cancer [12]. It should be taken into consideration that miR-binding efficacy also depends on many other factors, such as 3’UTR isoforms [5,7,13-15] (UTR size, alternative cleavage, and the location of target sequences within AU-rich regions), cell type (expression of RNA-binding proteins that compete with miRs for binding to the same mRNA region), and the presence of SNPs outside the target region [7,16], which can alter the mRNA secondary structure and affect the availability of miR-binding sites [7,17]. Each miR sequence can potentially influence the expression of hundreds of genes [7], whereas almost every mRNA can be targeted by dozens or even hundreds of miRs [18,19]. Thus, several miRs can act synergistically [20-26] or antagonize each other when targeting the same transcript [27]. It has been reported that some miRs may have more than one binding (seed-matching) site in their mRNA target [25,28-34]. Such multiple sites are proposed to act synergistically [21,23,29,31-33], which in turn may depend on the distance between the sites [21,29] as well as on the activity and structural variations of miR-Ago complexes [34,35]. It would be rational to suggest that if SNPs disrupt one of the miR-binding sites, the other sites for the same miR could preserve the miR-mediated control of gene expression. However, the published data on repetitive seed-matching sites are very scarce, and it is not clear whether such sites are frequent in human genes.

c-MET is a receptor tyrosine kinase for the hepatocyte growth factor (HGF). The binding of HGF to c-MET regulates several cellular functions, such as cell differentiation, proliferation, epithelial cell motility, angiogenesis, and epithelial-mesenchymal transition [36]. Abnormal upregulation of the c-MET gene is well known to be associated with the development of cancer through activation of downstream oncogenic pathways, increased angiogenesis of the growing tumors, and metastasis [36]. c-MET has been predicted and shown to be the target gene of multiple miRs, which play a crucial role in controlling its activity [37]. We analyzed the presence, structural characteristics, and distances between the repetitive miR seed-matching sites in the 3’UTR of the human c-MET mRNA. We found that these sites are abundant, include both conserved and non-conserved sites, are mostly canonical but sometimes non-canonical, belong to various structural types, and are positioned at variable distances from each other. Since the repetitive miR seed-matching sites are not always structurally identical and each of them is surrounded by a unique nucleotide context, their functionality may differ significantly, and they ought to be considered homologous sites. We also found that while 72 miRs had SNPs in their mRNA target sites and could therefore be predicted to fail to regulate expression of the targeted gene, only 3 of these miRs lacked an alternative homologous binding site undisrupted by SNPs. These observations allowed us to propose that the presence of homologous sites may safeguard the miR-mediated regulation of target gene expression in case one of the homologous miR-binding sites is disrupted by SNPs.

Methods

miR Target Prediction

We used the TargetScanHuman database [38] (Release 7.2, March 2018) for computational prediction of miR-binding sites in the human c-MET mRNA 3’UTR (reference sequence ENST00000397752.3).

Identification of Repetitive miR-Binding Sites

To find the repetitive miR-binding sites, we used conditional formatting in Excel (MS Office) to flag miRs listed more than once and then selected the sites that had different locations in the c-MET mRNA 3’UTR.
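The same selection can be expressed programmatically. The following Python sketch is an illustration only, not the procedure used in this study, which relied on Excel; the function name and the placeholder records are assumptions.

```python
# Minimal sketch: group predicted sites by miR and keep the miRs that have
# sites at more than one distinct UTR position.
from collections import defaultdict


def find_repetitive(sites):
    """sites: iterable of (mir_name, utr_start) pairs.

    Returns {mir_name: sorted distinct start positions} for every miR
    with at least two sites at different positions.
    """
    positions = defaultdict(set)
    for mir, start in sites:
        positions[mir].add(start)  # a set drops exact duplicate rows
    return {mir: sorted(p) for mir, p in positions.items() if len(p) > 1}


# Hypothetical records; names and positions are placeholders, not study data.
records = [
    ("miR-A", 120), ("miR-B", 340), ("miR-A", 980),
    ("miR-C", 55), ("miR-C", 55), ("miR-B", 1500), ("miR-B", 2210),
]
print(find_repetitive(records))
```

Collecting start positions in a set mirrors the requirement that repetitive sites occupy different locations: a miR listed twice at the same position (miR-C in the placeholder data) is not counted as repetitive.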

Characterization of miR-Binding Sites

We used the TargetScanHuman database to compare the structural properties of the non-repetitive and repetitive miR seed-matching sequences: the number of repetitive sites for each individual miR, whether the sites were conserved or non-conserved, and the structural category to which they belonged (canonical 7mer-1A, 7mer-m8, or 8mer sites, or non-canonical sites).

Spacing between the Repetitive miRNA-Binding Sites

We calculated the distances between the 5’ ends of the repetitive mRNA seed-matching regions [21,25]. If a specific miR had more than 2 homologous sites, we calculated the distances between neighboring sites.
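As a worked illustration of this calculation, the short Python sketch below computes the distances between the 5’ ends of neighboring sites. The function name is an assumption; the start positions are those reported in the Results for the three homologous sites of miR-495-3p.

```python
# Minimal sketch: distances between the 5' ends of neighboring homologous sites.
def adjacent_distances(starts):
    """Return the gaps between consecutive start positions, in sorted order."""
    ordered = sorted(starts)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


# 5' ends of the three miR-495-3p sites given in the Results section.
mir_495_starts = [4929, 4933, 5954]
print(adjacent_distances(mir_495_starts))  # [4, 1021]
```

For a miR with two sites the list contains a single distance; for three or four sites it contains the distances between each pair of neighbors.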

Single Nucleotide Polymorphisms within the Repetitive miR-Binding Sites

Information about the presence of SNPs within the human c-MET mRNA 3’UTR was obtained from GenBank [39]. We collected only those SNPs for which minor allele frequencies were available as of January 1, 2021. The sequence NM_001127500.1 (GenBank) served as the reference sequence because it matches the ENST00000397752.3 reference sequence used to localize the miR-binding sites. Because the TargetScanHuman database numbers the positions of miR-binding sites from the first nucleotide of the 3’UTR, whereas GenBank numbers SNP positions from the first nucleotide of the reference sequence, we recalculated the positions of the miR-binding sites relative to the beginning of the mRNA reference sequence (GenBank, NM_001127500.1). To perform this task, we used the NCBI BLAST Global Alignment Tool (Needleman-Wunsch algorithm [40]). To visualize the locations of SNPs within the seed-matching mRNA regions, as well as within the whole miR-corresponding region of the mRNA, we used the NEIL algorithm [41], which aligns two sequences according to their complementarity. Subsequently, the alignments were superimposed on the map of the corresponding mRNA sequence, thus providing an image of the locations of the miR-binding sites and SNPs.
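The coordinate recalculation can be sketched in Python as follows. This is a simplified illustration, not the alignment-based procedure described above: it assumes that the 3’UTR maps onto the mRNA reference without gaps, so that a single offset converts one numbering into the other. The function names, the offset value, and the SNP positions are hypothetical.

```python
# Minimal sketch, assuming an ungapped match between the 3'UTR and the mRNA
# reference, so one offset converts UTR numbering to mRNA numbering.
def utr_to_mrna(utr_pos, utr_start_in_mrna):
    """Convert a 1-based 3'UTR position to a 1-based mRNA position."""
    return utr_start_in_mrna + utr_pos - 1


def snps_in_site(site_start, site_end, snp_positions):
    """Return the SNP positions that fall inside the site (inclusive bounds)."""
    return [pos for pos in snp_positions if site_start <= pos <= site_end]


# Hypothetical values for illustration only (not taken from the study).
UTR_START_IN_MRNA = 4500        # assumed mRNA position of the first UTR base
site_utr = (430, 436)           # assumed 7-nt site in UTR numbering
site_mrna = tuple(utr_to_mrna(p, UTR_START_IN_MRNA) for p in site_utr)
snps = [4100, 4931, 5000]       # assumed SNP positions in mRNA numbering

print(site_mrna)
print(snps_in_site(*site_mrna, snps))
```

If the two reference sequences differed by insertions or deletions, a position-by-position map derived from the global alignment would be needed instead of a constant offset.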

Results

The Repetitive miR-Binding Sites in the Human c-MET mRNA 3’UTR

Figure 1: Canonical miR-binding sites recognized by the TargetScan database.

Data mining of the miR-binding sites revealed that the 3’UTR of the human c-MET mRNA had 871 miR-binding sites, of which 41 were conserved and 830 were poorly conserved. The conserved sites were represented by 7mer-1A (4 sites, 9.8%), 7mer-m8 (19 sites, 46.3%), and 8mer (18 sites, 43.9%) sites, whereas the poorly conserved sites were represented by 7mer-1A (382 sites, 46.0%), 7mer-m8 (304 sites, 36.6%), 8mer (139 sites, 16.7%), and non-canonical (5 sites, 0.6%) sites (Figures 1 & 2). At the same time, the human c-MET mRNA 3’UTR could be targeted by only 683 miRs; among them, 128 miRs had 2 target sites, 27 miRs had 3 target sites, and 2 miRs had 4 target sites (Supplementary Table 1). Thus, 157 of the 683 miRs (23.0%) had repetitive sites, that is, more than one binding site for the same miR.

We further compared the non-repetitive and repetitive miR-binding sites. Altogether, the human c-MET mRNA 3’UTR harbored 526 (60.4%) non-repetitive and 345 (39.6%) repetitive sites located at different positions within the 3’UTR. Of the 345 repetitive sites, 256 occurred as duplicates, 81 as triplicates, and 8 as quadruplicates (Figure 3). The non-repetitive sites comprised 17 conserved and 509 non-conserved sites (3.3% and 96.7% of all non-repetitive sites, respectively). The repetitive sites comprised 24 conserved and 321 non-conserved sites (7.0% and 93.0% of all repetitive sites, respectively) (Table 1).

The conserved non-repetitive sites comprised 7mer-1A (2 sites, 11.8%), 7mer-m8 (3 sites, 17.6%), and 8mer (12 sites, 70.6%) sites, whereas the poorly conserved non-repetitive sites included 7mer-1A (247 sites, 48.5%), 7mer-m8 (191 sites, 37.5%), 8mer (68 sites, 13.4%), and non-canonical (3 sites, 0.6%) sites (Figure 4). The conserved repetitive sites comprised 7mer-1A (2 sites, 8.3%), 7mer-m8 (16 sites, 66.7%), and 8mer (6 sites, 25.0%) sites, while the poorly conserved repetitive sites included 7mer-1A (135 sites, 42.1%), 7mer-m8 (113 sites, 35.2%), 8mer (71 sites, 22.1%), and non-canonical (2 sites, 0.6%) sites (Figure 5).
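The relationships among the reported totals can be verified with simple arithmetic. The Python sketch below uses only the counts given in this section; the variable names are assumptions.

```python
# Consistency check of the reported totals, using only counts from this section.
mirs_by_site_count = {2: 128, 3: 27, 4: 2}  # sites per miR -> number of miRs
total_mirs = 683
total_sites = 871

repetitive_mirs = sum(mirs_by_site_count.values())
repetitive_sites = sum(n * count for n, count in mirs_by_site_count.items())
# Every remaining miR contributes exactly one (non-repetitive) site.
non_repetitive_sites = total_mirs - repetitive_mirs

print(repetitive_mirs, round(100 * repetitive_mirs / total_mirs, 1))     # 157 23.0
print(repetitive_sites, round(100 * repetitive_sites / total_sites, 1))  # 345 39.6
print(non_repetitive_sites, repetitive_sites + non_repetitive_sites)     # 526 871
```

The 345 repetitive sites thus correspond to 256 duplicated (128 x 2), 81 triplicated (27 x 3), and 8 quadruplicated (2 x 4) sites, and together with the 526 single sites they account for all 871 predicted sites.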

Figure 2: The occurrence of various structural types related to conserved (A) and poorly conserved (B) miR-binding sites.

Figure 3: The presence of repetitive (red color) miR-binding sites in the human c-MET mRNA 3’UTR. A: miRs with multiple binding sites; 1, 2, 3, and 4 indicate the numbers of homologous binding sites for the same miRs. B: repetitive miR-binding sites; 1x, 2x, 3x, and 4x correspond to single, duplicated, triplicated, and quadruplicated homologous sites.

Figure 4: The structural characteristics of the non-repetitive conserved (A) and poorly conserved (B) miR-binding sites.

Figure 5: The structural characteristics of the repetitive conserved (A) and poorly conserved (B) miR-binding sites.

Table 1: Repetitive vs. non-repetitive miR-binding sites.

The Structural Characteristics and Distances between the Repetitive Sites

Some of the repetitive sites had the same structural pattern (i.e., 7mer-1A, 7mer-m8, or 8mer), whereas others did not. Of note, miR-665 had two homologous sites: an 8mer and a non-canonical one (Supplementary Table 2). Structurally different sites for the same miR occurred among miRs with two, three, and four homologous sites, indicating the diversity of binding patterns. The homologous sites were located at variable distances from each other, ranging from 2114 nucleotides (miR-34a-5p, miR-34b-5p, miR-34c-5p, miR-449a, miR-449b-5p, miR-449c-5p) to only 4 nucleotides (miR-495-3p, miR-5688) and 5 nucleotides (miR-6851-5p, miR-3689d). Since a seed-matching site is at least 6 nucleotides long, distances of 4 and 5 nucleotides indicate that the sites overlap. Of note, the miR-495-3p and miR-5688 sites have identical locations on the mRNA (positions 4929-4935, 4933-4939, and 5954-5960); each of these miRs has three homologous sites, and the third site is well separated from the other two (1021 nucleotides). The miR-6851-5p and miR-3689d sites also have identical locations on the mRNA (positions 5006-5012 and 5011-5017), yet these miRs have only two homologous sites, which overlap each other.
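The overlap between closely spaced sites follows directly from the reported coordinates. As a small illustration (the function name is an assumption), the Python sketch below counts the nucleotides shared by two sites given as inclusive start and end positions.

```python
# Minimal sketch: number of mRNA positions shared by two sites,
# each given as inclusive (start, end) coordinates.
def shared_nucleotides(site_a, site_b):
    (a_start, a_end), (b_start, b_end) = site_a, site_b
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


# Site coordinates reported above.
print(shared_nucleotides((4929, 4935), (4933, 4939)))  # 3
print(shared_nucleotides((5006, 5012), (5011, 5017)))  # 2
print(shared_nucleotides((4933, 4939), (5954, 5960)))  # 0
```

By this count, the two neighboring miR-495-3p/miR-5688 sites share 3 nucleotides and the two miR-6851-5p/miR-3689d sites share 2 nucleotides, whereas the third miR-495-3p/miR-5688 site shares none with the others.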

SNPs in Homologous miR-Binding Sites

The homologous sites might rescue the miR-mediated regulation of the target genes if one of the sites is disrupted by a SNP. To address this hypothesis, we assessed the distribution of SNPs within the homologous sites (Supplementary Table 3). Out of a total of 157 miRs with repetitive sites, 72 miRs had at least one SNP in their mRNA target sites. Eight miRs had 2 SNPs in one of their homologous sites while possessing only two homologous sites. Three miRs had SNPs in one of their homologous sites and 1 SNP in the second homologous site. Of note, these miRs (miR-4421, miR-5699-3p, and miR-6748-3p) possessed only 2 homologous binding sites, and, speculatively, their binding to the target mRNA could be completely disrupted. The analysis demonstrates that while 72 miRs could be predicted to fail to regulate expression of their targeted gene due to SNPs, only 3 of them lacked an alternative homologous binding site undisrupted by SNPs.
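The reasoning behind this count can be sketched in Python: a miR is treated as potentially rescued when at least one of its homologous sites carries no SNP. The function name, category labels, and per-site SNP counts below are hypothetical placeholders, not the data of Supplementary Table 3.

```python
# Minimal sketch of the rescue logic; inputs are hypothetical placeholders.
def classify_rescue(site_snp_counts):
    """site_snp_counts: {mir_name: [SNP count for each homologous site]}.

    Returns {mir_name: label} with one of three labels:
      "intact"        - no site carries a SNP
      "rescuable"     - some sites carry SNPs, but at least one does not
      "all disrupted" - every site carries at least one SNP
    """
    labels = {}
    for mir, counts in site_snp_counts.items():
        if all(count == 0 for count in counts):
            labels[mir] = "intact"
        elif any(count == 0 for count in counts):
            labels[mir] = "rescuable"
        else:
            labels[mir] = "all disrupted"
    return labels


# Placeholder miR names and SNP counts for illustration only.
example = {"miR-X": [0, 0], "miR-Y": [2, 0], "miR-Z": [1, 1], "miR-W": [0, 1, 0]}
print(classify_rescue(example))
```

Applied to the counts reported above, 72 miRs fall outside the "intact" category and 3 of them have every site affected, which leaves 69 miRs with at least one SNP-free homologous site.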

Discussion

The analysis of seed-matching sites remains the favored approach to predicting miR targeting [21,42]. Using the TargetScan database, we found that the human c-MET mRNA 3’UTR is rich in seed-matching sites that represent repetitive binding targets for various miRs. These sites differ in their structure and in the nucleotide sequence context beyond the seed-matching regions. Consequently, they should neither be considered identical nor be expected to be functionally equal. Therefore, we termed them homologous miR-binding sites to reflect their structural and conceivably functional diversity. We explored the homologous sites in the mRNAs of several other randomly chosen human genes, such as PDGFRA, ACE2, VTCN1, ARID5B, CCR2, and CD46. All of these genes have homologous miR-binding sites in the 3’UTR of their mRNAs, although at varying rates (publication in preparation). Our data indicate that while many genes possess homologous miR-binding sites, their frequency varies considerably and may depend on the type of the gene and the properties of the gene product.

Multiple miR-binding sites for the same miR on the target gene can enhance its effect. However, such cooperation appears to depend on the distance between the sites. It has been reported that the synergistically favorable distance should be about 15-26 nucleotides [23] or between 13 and 35 nucleotides [21]. It may also depend on the specific Ago proteins within RISC [35]. We suggest that, besides cooperating with each other, at least two homologous sites could preserve the miR-mediated regulation of the target gene if one of the sites is disrupted by SNPs. Even SNPs positioned in the proximity of miR-binding sites, rather than within them, may change the local nucleotide context and affect the mRNA secondary structure, thus blocking the accessibility of the miR-binding sites [16,43]. In this case, the second homologous site should presumably lie at a distance sufficient to escape these structural perturbations and to preserve miR functionality.

Some of the homologous sites of the human c-MET mRNA 3’UTR do harbor SNPs. Our results indicate that while 72 miRs could be predicted to fail to regulate expression of their targeted gene due to SNPs, only 3 of them lacked an alternative homologous binding site undisrupted by SNPs. One may suggest that, analogously to the protein-coding regions, SNPs within homologous miR-binding sites could be subject to selection pressure. The human c-MET gene is a potent oncogene, and deregulation of miR-mediated control may result in its overexpression, followed by oncogenic transformation and elimination of the host individual. It would be of interest to further study the presence of homologous sites and SNPs in various genes while attempting to bridge the data with the available information on the clinical impact of certain SNPs. In conclusion, we suggest that the homologous miR-binding sites may not only enhance the effects of the miR but also rescue the phenotype if one of the miR-binding sites becomes non-functional due to the presence of SNPs within or beyond the mRNA seed-matching region, tissue-specific variations in the expression of RNA-binding proteins, 3’UTR isoforms, or other factors that may influence the miR-mRNA interaction. The presence of homologous miR-binding sites should be taken into consideration when predicting whether a specific miR can exert a regulatory effect on its gene target.

Acknowledgement

The work was supported by the Faculty Development Grant and the Department of Biological and Environmental Sciences, Troy University, AL, U.S.A.

References
