Revealing lncRNA Structures and Interactions by Sequencing-Based Approaches
Review

Xingyang Qian,1,3 Jieyu Zhao,2,3 Pui Yan Yeung,2,3 Qiangfeng Cliff Zhang,1,* and Chun Kit Kwok2,*

1 MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
2 Department of Chemistry, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
3 These authors contributed equally to this work
*Correspondence: (Q.C. Zhang) and ckkwok42@cityu.edu.hk (C.K. Kwok).

Trends in Biochemical Sciences, Month Year, Vol. xx, No. yy. https://doi.org/10.1016/j.tibs.2018.09.012 © 2018 Elsevier Ltd. All rights reserved.

Long noncoding RNAs (lncRNAs) have emerged as significant players in almost every level of gene function and regulation. Thus, characterizing the structures and interactions of lncRNAs is essential for understanding their mechanistic roles in cells. Through a combination of (bio)chemical approaches and automated capillary and high-throughput sequencing (HTS), the complexity and diversity of RNA structures and interactions have been revealed in the transcriptomes of multiple species. These methods have uncovered important biological insights into the mechanistic and functional roles of lncRNA in gene expression and RNA metabolism, as well as in development and disease. In this review, we summarize the latest sequencing strategies to reveal RNA structure, RNA-RNA, RNA-DNA, and RNA-protein interactions, and highlight the recent applications of these approaches to map functional lncRNAs. We discuss the advantages and limitations of these strategies, and provide recommendations to further advance methodologies capable of mapping RNA structure and interactions in order to discover new biology of lncRNAs and decipher their molecular mechanisms and implications in disease.

Highlights

Higher-order structures and interactions of lncRNA are critical for its diverse roles in gene function and regulation.

Novel chemical and sequencing toolkits are being developed to decipher RNA structures and interactions in vitro and in vivo.

Application of these innovative methods to lncRNA has revealed new and important structural motifs and interaction groups.

The methods and results reviewed here can help to better understand and further investigate the lncRNA structure-function relationship.

Glossary

Bivalent RNA-DNA linker: a linker that can ligate RNA to proximal DNA.
Click reaction: a chemical reaction that is selective, high yielding, and simple to perform.
Crosslinking and immunoprecipitation (CLIP): a method that couples UV crosslinking with immunoprecipitation to identify transcripts that interact with a specific protein.
Cross-species control: a control assay performed using samples from two species. For example, MARIO was used in Drosophila S2 cells and mouse ES cells to test the extent of random ligation of RNA molecules.
Fragmentation sequencing (Frag-seq): a method that couples nuclease P1-mediated cleavage with HTS.
G-quadruplex: a nucleic acid secondary structure formed by a G-rich sequence that can self-assemble into two or more G-quartet planes, which then stack on top of each other.
Parallel analysis of RNA structure (PARS): a method that couples RNase V1- or RNase S1-mediated cleavage with HTS.
Proximity ligation: a ligation of two physically proximal nucleic acid termini.
Ribonucleoprotein particle: a biomolecular complex that consists of RNA and RBPs.
Riboswitch: an RNA molecule that can sense small ligands, such as metabolites or ions, and induce RNA conformational changes to affect gene expression.
Ribozyme: an RNA molecule that can act as an enzyme and catalyze reactions, such as RNA ligation or cleavage.
RNA interactome: a term to describe the interactions of all transcripts in the transcriptome, such as RNA-RNA, RNA-DNA, RNA-protein, and others.
RNA immunoprecipitation sequencing (RIP-seq): a method that couples native RNA immunoprecipitation with HTS to identify transcripts that interact with a specific protein.
RNA structurome: a term to describe the structures of all transcripts in the transcriptome.
Selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE): a technique that uses an acylating agent, such as 1M7 or NAI, to react with flexible 2′-OH groups of RNA, followed by a primer extension reaction for readout.
Shotgun secondary structure (3S) approach: a technique that breaks down long RNA into smaller fragments for structure probing, and allows the identification of RNA structural domains.
Single-stranded/double-stranded RNA sequencing (ss/dsRNA-seq): a method that couples cleavage by ss/dsRNA ribonucleases with HTS.
Structural probing of elongating transcripts sequencing (SPET-seq): a method that couples treatment with a fast-reacting DMS probe and HTS to determine RNA secondary structures of transcription intermediates.

Biological Significance of lncRNAs and RNA Structure

In the human genome, approximately 93% of DNA can be transcribed as RNA, only 2% of which is protein-encoding mRNA, while the remaining 98% is known as noncoding RNA [1,2]. Among these noncoding RNA "dark matters", RNAs longer than 200 bases are classified as lncRNAs. Since the advent of the genomic era in the 2000s, significant progress has been made toward the understanding of the prevalence, abundance, biogenesis, and functions of lncRNAs across different cell types and species [3,4]. In particular, lncRNAs have been demonstrated to play important roles in epigenetic control and the regulation of transcription, translation, RNA metabolism (Table 1), as well as stem cell maintenance and differentiation, cell autophagy and apoptosis, and embryonic development [5-7]. In addition, lncRNAs have been implicated in major diseases including different types of cancer, and neurological and cardiovascular diseases [8-10]. With the accumulating knowledge of genomic variations and the expanding lncRNA repository, many disease-associated single nucleotide polymorphisms (SNPs) have been mapped to lncRNA genes [11-14]. Databases such as LincSNP and LincSNP 2.0 have been created to facilitate the exploration of the potential functions of lncRNA-associated SNPs [15,16].

Table 1. Roles of Representative lncRNAs in Gene Expression and RNA Metabolism

| Biological process | lncRNA example | Role in gene expression and RNA metabolism | Refs |
| --- | --- | --- | --- |
| Transcription | NRON | NRON interacts with importin-β proteins and inhibits the trafficking of the NFAT transcription factor from the cytoplasm to the nucleus, which can lead to inactivation of target genes. | 154 |
| Transcription | HSR1 | HSR1 can interact with eEF1A, forming an HSR1-eEF1A complex, which can capture and activate the transcription factor HSF1, resulting in the transcription of Hsp and expression of HSPs in response to heat and other stress stimuli. | 155 |
| Splicing | MALAT1 | MALAT1 regulates alternative splicing by controlling the phosphorylation and distribution of serine/arginine splicing factors in nuclear speckle domains. | 156 |
| Splicing | ASCO-lncRNA | The ASCO-lncRNA is a nuclear alternative splicing regulator and influences splicing patterns through binding with nuclear speckle RBP during development in Arabidopsis. | 157 |
| Translation | Antisense Uchl1 | The Uchl1 mRNA is complemented by an antisense lncRNA Uchl1, which is shuttled from the nucleus to the cytoplasm under stress conditions to increase UCHL1 protein synthesis. | 158 |
| Translation | HULC | HULC, which is upregulated in hepatocellular carcinoma, can bind to miR-372 and downregulate its activity, leading to reduced translational suppression of its target transcript PRKACB. | 159 |
| RNA localization | Xist | The A-repeat within the lncRNA Xist contains two long stem-loop structures that can recruit PRC2, while the C-repeat binds the YY1 transcription factor, assisting the Xist-PRC2 complex in targeting specific sites on the X-inactivation center, leading to X-linked gene silencing. | 90,160 |
| RNA localization | ENOD40 | A novel nuclear speckle RBP, MtRBP1 (Medicago truncatula RNA binding protein 1), can be transported into cytoplasmic granules during nodule organogenesis by interacting with ENOD40 in leguminous plants. | 161,162 |
| RNA decay | 1/2-sbsRNAs | Alu elements within cytoplasmic lncRNAs (1/2-sbsRNAs) can form imperfectly complementary RNA duplexes with other Alu elements in the 3′ untranslated regions (UTRs) of mRNAs; STAU1 protein subsequently recognizes and binds the resultant dsRNA elements and initiates target mRNA degradation. | 107 |
| RNA decay | gadd7 | gadd7 can regulate the cell cycle G1/S checkpoint in response to UV irradiation. UV-induced gadd7 can directly bind to TAR DNA-binding protein (TDP-43) and interfere with the interaction between TDP-43 and cyclin-dependent kinase 6 (Cdk6) mRNA, resulting in Cdk6 mRNA degradation. | 163 |
| RNA editing | CTN-RNA | The 3′ UTR of CTN-RNA contains inverted repeat sequences that can form a stem loop recognized by the ADAR enzyme for adenosine-to-inosine editing; the edited RNA then interacts with p54nrb, promoting its nuclear retention. This nuclear retention can be involved in the regulation of mCAT2 gene expression. | 164 |
| RNA editing | sas-10 | The sas-10 transcripts pair with 4f-rnp mRNA to form double-stranded molecules as targets for A-to-G editing by dADAR editase in the 3′ UTR, leading to downregulation of 4f-rnp mRNA levels. | 165 |
| Epigenetic remodeling | pRNA | pRNA is a lncRNA that is complementary to the rDNA promoter and can interact with the target site of the transcription factor TTF-I, forming a DNA:RNA triplex that is specifically recognized by the DNA methyltransferase DNMT3b, which then mediates de novo CpG methylation of rRNA genes to repress their expression. | 166 |
| Epigenetic remodeling | HOTAIR | HOTAIR forms multiple double stem-loop structures that bind to PRC2 histone-modification complexes and lysine-specific demethylase 1, mediating different patterns of histone modifications on target genes related to cancer. | 167 |

The discovery of the catalytic and regulatory functions of RNA has refined the central dogma of molecular biology and highlighted the multifaceted biological roles of RNA [17,18]. A significant body of research has shown that the higher-order structures as well as interactions of RNA serve versatile functions, as exemplified by ribozymes, riboswitches, and ribonucleoprotein complexes (see Glossary) [19,20]. RNA adopts diverse structural motifs, such as stem-loop, pseudoknot, triplex, and G-quadruplex, and is capable of long-range interactions, contributing to its basic biological functions [21]. These structural elements can form through cis (intramolecular) interactions within the same RNA molecule, or through trans (intermolecular) interactions with other biomolecules such as RNA, DNA, and proteins, to regulate fundamental cellular processes. Identifying RNA structures and interactions that are involved in gene regulation and function is thus critical for the elucidation of the underlying biochemical mechanisms. In addition, major efforts have been dedicated to predicting the impact of SNPs on lncRNA secondary structures and lncRNA-miRNA interactions, especially with respect to their recently uncovered mechanistic roles in various diseases [22-24]. To facilitate these efforts, there is a need to experimentally obtain lncRNA structures and interactions in vivo across diverse disease and cancer models. Combining these experimental data with a robust computational pipeline will likely generate more accurate candidates of functional, disease-related lncRNA SNPs for further mechanistic characterization and potential therapeutic intervention.

One of the main mysteries of lncRNAs is the discrepancy between their low sequence conservation and functionally important roles. Thus, many studies have been dedicated to the search for conserved structural elements [25,26]. For example, a large number of correlated positions in lncRNA were revealed by multiple alignments, suggesting evolutionary conservation of lncRNA secondary structures [27]. Additionally, many conserved structure elements were found to be enriched in lncRNAs by screening for functional RNA structures conserved between mice and 59 other vertebrates [26]. However, all this evidence is from computational predictions. It remains to be seen how many of these predicted conserved structures are real and functionally important in vivo.

Another mystery of lncRNAs is their association with the ribosome and potential for encoding peptides/proteins. Studies have shown that many lncRNAs in the cytosol are bound by ribosomes [28-30]. While a number of studies have suggested that these ribosome-bound lncRNAs do not yield peptide/protein products [31,32], implying that the function of these lncRNAs is at the RNA level, others have suggested that some lncRNAs are likely to generate peptides/proteins [33,34]. It is still not entirely clear how and why ribosomes and translation regulators recognize and interact with lncRNAs, and possibly lead to productive peptide/protein synthesis.

Despite the importance of structural information for the understanding of lncRNA functions, our knowledge of lncRNA structures is limited. According to the PDB database [35], the only entries that contain tertiary RNA structure information are the classical ncRNAs such as rRNAs, tRNAs, and small nuclear RNAs. The flexibility and the relatively large size of lncRNAs have made their structures difficult to resolve by traditional 3D structural determination methods. The computational approach is a useful alternative to predict RNA secondary structure and interactions between two RNAs [36-39]. Classically, the default module of most computational methods predicts the most thermodynamically stable structure of an isolated RNA molecule using a minimum free energy approach, with accuracy of prediction decreasing for longer and more complex RNAs [37]. Recent approaches have suggested that the real biological structure is more likely to be found when considering a Boltzmann ensemble of suboptimal folding states [40]. In parallel with computational approaches, experimental enzymatic and chemical RNA probing methods were developed to analyze the structure of individual RNA transcripts [41]. Nevertheless, in contrast to the classical
RNAs such as rRNAs and tRNAs, the structure and interaction of lncRNAs and other RNA types remained elusive until recently. In the last few years, HTS methodology development has allowed us to discover and appreciate the elaborate structure and interaction landscape on a transcriptome-wide scale, and this progress is discussed below.

Recent Advances in RNA Structure Probing

Methods for probing RNA structuromes, which couple HTS with ribonuclease cleavage or chemical probing, have facilitated the transcriptome-wide mapping of RNA structure [21,42]. The initial approaches, such as parallel analysis of RNA structure (PARS) [43], fragmentation sequencing (Frag-seq) [44], single-stranded/double-stranded RNA sequencing (ss/dsRNA-seq) [45,46], and selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE)-seq [47], were only able to determine transcriptome-wide RNA secondary structures in vitro. Subsequent development of a new generation of methods, including Structure-seq [48], DMS-seq [49], Mod-seq [50], SHAPE-MaP [51], and icSHAPE [52], enabled in vivo transcriptome-wide RNA structure probing, t