Volume 1, Issue 1, Journal of Artificial Intelligence in Bioinformatics
Academic Editor
Abdur Rasool
University of Hawaii at Manoa, United States
Journal of Artificial Intelligence in Bioinformatics, Volume 1, Issue 1, 2025: 12-29

Open Access | Review Article | 30 April 2025
Advances in Intelligent Design and Optimization Methods for Nucleic Acid Sequences
1 Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian 116622, China
2 School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
* Corresponding Author: Yanfen Zheng, [email protected]
Received: 03 January 2025, Accepted: 27 March 2025, Published: 30 April 2025  
Abstract
Nucleic acids (DNA and RNA) carry genetic information, and the precise design and optimization of their sequences is critical for realizing specific biological functions. Gene synthesis, the design of novel nucleic acid drugs, the construction of efficient expression vectors, and the modification of microbial metabolic pathways all depend on fine control of nucleic acid sequences. The primary purpose of nucleic acid sequence design is to generate new sequences tailored to specific requirements in gene expression, function prediction, drug development, and other applications. In this paper, we first review the theoretical basis of nucleic acid sequence design and then survey current research methods. Traditional design methods rely on manual experience and experimentation; although they have made progress over the past decades, in most cases they remain costly, slow, and inefficient. Optimizing nucleic acid sequences to improve their performance and stability has therefore become particularly important. In recent years, artificial intelligence has opened a new direction for the design and optimization of nucleic acid sequences, making more efficient and accurate design methods possible. The methods covered include traditional rule-based nucleic acid sequence optimization approaches as well as AI-driven optimization methods for nucleic acid sequence generation. This review systematically examines the latest advances in both traditional and AI-driven design methods and analyzes the technical details, strengths, and limitations of each. Finally, the article discusses the current challenges and future directions of nucleic acid sequence design.

Keywords
nucleic acid sequence design
machine learning
optimization methods
generative models
heuristic algorithms
large language models
nucleic acid structure prediction

Data Availability Statement
Data will be made available on request.

Funding
This work was supported in part by the 111 Project under Grant D23006; in part by the National Natural Science Foundation of China under Grant 62272079; in part by the National Foreign Expert Project of China under Grant D20240244; in part by the Natural Science Foundation of Liaoning Province under Grant 2024-MS-212; in part by the Scientific Research Project of Liaoning Provincial Department of Education under Grant LJ222411258005; in part by the LiaoNing Revitalization Talent Program under Grant XLYC2403039; in part by the Artificial Intelligence Innovation Development Plan Project of Liaoning Province under Grant 2023JH26/10300025; in part by the Dalian Outstanding Young Science and Technology Talent Support Program under Grant 2022RJ08; in part by the Dalian Major Projects of Basic Research under Grant 2023JJ11CG002; in part by the Interdisciplinary Project of Dalian University under Grant DLUXK-2024-YB-001; in part by the Joint plan of Liaoning Province science and technology plan under Grant 2024JH2/102600064 and Grant 2024-MSLH-009.

Conflicts of Interest
The authors declare no conflicts of interest.

Ethical Approval and Consent to Participate
Not applicable.

References
  1. Shin, S., Lee, I., Kim, D., & Zhang, B. (2005). Multiobjective evolutionary optimization of DNA sequences for reliable DNA computing. IEEE Transactions on Evolutionary Computation, 9(2), 143-158.
    [CrossRef]   [Google Scholar]
  2. Feldkamp, U., Rauhe, H., & Banzhaf, W. (2003). Software tools for DNA sequence design. Genetic Programming and Evolvable Machines, 4, 153-171.
    [CrossRef]   [Google Scholar]
  3. Raab, D., Graf, M., Notka, F., Schödl, T., & Wagner, R. (2010). The GeneOptimizer algorithm: Using a sliding window approach to cope with the vast sequence space in multiparameter DNA sequence optimization. Systems and Synthetic Biology, 4(3), 215-225.
    [CrossRef]   [Google Scholar]
  4. Mardis, E. R. (2008). Next-generation DNA sequencing methods. Annu. Rev. Genomics Hum. Genet., 9(1), 387-402.
    [CrossRef]   [Google Scholar]
  5. Bates, M., Lachoff, J., Meech, D., Zulkower, V., Moisy, A., Luo, Y., Tekotte, H., Franziska Scheitz, C. J., Khilari, R., Mazzoldi, F., Chandran, D., & Groban, E. (2017). Genetic constructor: An online DNA design platform. ACS Synthetic Biology, 6(12), 2362-2365.
    [CrossRef]   [Google Scholar]
  6. Villalobos, A., Ness, J. E., Gustafsson, C., Minshull, J., & Govindarajan, S. (2006). Gene designer: A synthetic biology tool for constructing artificial DNA segments. BMC Bioinformatics, 7(1).
    [CrossRef]   [Google Scholar]
  7. Angelbello, A. J., Chen, J. L., Childs-Disney, J. L., Zhang, P., Wang, Z., & Disney, M. D. (2018). Using genome sequence to enable the design of medicines and chemical probes. Chemical Reviews, 118(4), 1599-1663.
    [CrossRef]   [Google Scholar]
  8. Alarcon, C. M., Shan, G., Layton, D. T., Bell, T. A., Whipkey, S., & Shillito, R. D. (2018). Application of DNA- and protein-based detection methods in agricultural biotechnology. Journal of Agricultural and Food Chemistry, 67(4), 1019-1028.
    [CrossRef]   [Google Scholar]
  9. Tyo, K. E., Kocharin, K., & Nielsen, J. (2010). Toward design-based engineering of industrial microbes. Current Opinion in Microbiology, 13(3), 255-262.
    [CrossRef]   [Google Scholar]
  10. Goldman, N., Bertone, P., Chen, S., Dessimoz, C., LeProust, E. M., Sipos, B., & Birney, E. (2013). Towards practical, high-capacity, low-maintenance information storage in synthesized DNA. Nature, 494(7435), 77-80.
    [CrossRef]   [Google Scholar]
  11. Zhang, P., Wang, H., Xu, H., Wei, L., Liu, L., Hu, Z., & Wang, X. (2023). Deep flanking sequence engineering for efficient promoter design using DeepSEED. Nature Communications, 14(1).
    [CrossRef]   [Google Scholar]
  12. Hoose, A., Vellacott, R., Storch, M., Freemont, P. S., & Ryadnov, M. G. (2023). DNA synthesis technologies to close the gene writing gap. Nature Reviews Chemistry, 7(3), 144-161.
    [CrossRef]   [Google Scholar]
  13. Nguyen, E., Poli, M., Durrant, M. G., Thomas, A. W., Kang, B., Sullivan, J., Ng, M. Y., Lewis, A., Patel, A., Lou, A., Ermon, S., Baccus, S. A., Hernandez-Boussard, T., Re, C., Hsu, P. D., & Hie, B. L. (2024). Sequence modeling and design from molecular to genome scale with Evo. Science, 386(6723), eado9336.
    [CrossRef]   [Google Scholar]
  14. Zrimec, J., Fu, X., Muhammad, A. S., Skrekas, C., Jauniskis, V., Speicher, N. K., Börlin, C. S., Verendel, V., Chehreghani, M. H., Dubhashi, D., Siewers, V., David, F., Nielsen, J., & Zelezniak, A. (2022). Controlling gene expression with deep generative design of regulatory DNA. Nature Communications, 13(1), 5099.
    [CrossRef]   [Google Scholar]
  15. Keskin Karakoyun, H., Yüksel, Ş. K., Amanoglu, I., Naserikhojasteh, L., Yeşilyurt, A., Yakıcıer, C., Timuçin, E., & Akyerli, C. B. (2023). Evaluation of AlphaFold structure-based protein stability prediction on missense variations in cancer. Frontiers in Genetics, 14, 1052383.
    [CrossRef]   [Google Scholar]
  16. Gosai, S. J., Castro, R. I., Fuentes, N., Butts, J. C., Mouri, K., Alasoadura, M., Kales, S., Nguyen, T. T., Noche, R. R., Rao, A. S., Joy, M. T., Sabeti, P. C., Reilly, S. K., & Tewhey, R. (2024). Machine-guided design of cell-type-targeting cis-regulatory elements. Nature, 634(8036), 1211-1220.
    [CrossRef]   [Google Scholar]
  17. Cox, B., Denyer, J. C., Binnie, A., Donnelly, M. C., Evans, B., Green, D. V., ... & Watson, S. P. (2000). Application of high-throughput screening techniques to drug discovery. Progress in Medicinal Chemistry, 37, 83-133.
    [CrossRef]   [Google Scholar]
  18. Gervasio, J. H. D. B., da Costa Oliveira, H., da Costa Martins, A. G., Pesquero, J. B., Verona, B. M., & Cerize, N. N. P. (2024). How close are we to storing data in DNA?. Trends in Biotechnology, 42(2), 156-167.
    [CrossRef]   [Google Scholar]
  19. Clark, D. P., & Pazdernik, N. J. (2012). Molecular biology. Elsevier.
    [Google Scholar]
  20. Waterman, M. S. (2018). Introduction to computational biology: maps, sequences and genomes. Chapman and Hall/CRC.
    [Google Scholar]
  21. Watson, J. D., & Crick, F. H. (1953, January). The structure of DNA. In Cold Spring Harbor symposia on quantitative biology (Vol. 18, pp. 123-131). Cold Spring Harbor Laboratory Press.
    [CrossRef]   [Google Scholar]
  22. Travers, A., & Muskhelishvili, G. (2015). DNA structure and function. The FEBS journal, 282(12), 2279-2295.
    [CrossRef]   [Google Scholar]
  23. Komili, S., Farny, N. G., Roth, F. P., & Silver, P. A. (2007). Functional specificity among Ribosomal proteins regulates gene expression. Cell, 131(3), 557-571.
    [CrossRef]   [Google Scholar]
  24. Neylon, C. (2004). Chemical and biochemical strategies for the randomization of protein encoding DNA sequences: library construction methods for directed evolution. Nucleic acids research, 32(4), 1448-1459.
    [CrossRef]   [Google Scholar]
  25. Rohs, R., Jin, X., West, S. M., Joshi, R., Honig, B., & Mann, R. S. (2010). Origins of specificity in Protein-DNA recognition. Annual Review of Biochemistry, 79(1), 233-269.
    [CrossRef]   [Google Scholar]
  26. Beerli, R. R., Segal, D. J., Dreier, B., & Barbas III, C. F. (1998). Toward controlling gene expression at will: specific regulation of the erbB-2/HER-2 promoter by using polydactyl zinc finger proteins constructed from modular building blocks. Proceedings of the National Academy of Sciences, 95(25), 14628-14633.
    [CrossRef]   [Google Scholar]
  27. Goldenzweig, A., Goldsmith, M., Hill, S., Gertman, O., Laurino, P., Ashani, Y., Dym, O., Unger, T., Albeck, S., Prilusky, J., Lieberman, R., Aharoni, A., Silman, I., Sussman, J., Tawfik, D., & Fleishman, S. (2016). Automated structure- and sequence-based design of proteins for high bacterial expression and stability. Molecular Cell, 63(2), 337-346.
    [CrossRef]   [Google Scholar]
  28. Matange, K., Tuck, J. M., & Keung, A. J. (2021). DNA stability: A central design consideration for DNA data storage systems. Nature Communications, 12(1), 1-9.
    [CrossRef]   [Google Scholar]
  29. Szostak, J. W., Bartel, D. P., & Luisi, P. L. (2001). Synthesizing life. Nature, 409(6818), 387-390.
    [CrossRef]   [Google Scholar]
  30. Bulyk, M. L. (2003). Computational prediction of transcription-factor binding site locations. Genome biology, 5, 1-11.
    [CrossRef]   [Google Scholar]
  31. Kudla, G., Murray, A. W., Tollervey, D., & Plotkin, J. B. (2009). Coding-sequence determinants of gene expression in Escherichia coli. Science, 324(5924), 255-258.
    [CrossRef]   [Google Scholar]
  32. Francis, D. M., & Page, R. (2010). Strategies to optimize protein expression inE. coli. Current Protocols in Protein Science, 61(1), 5.24. 1-5.24. 29.
    [CrossRef]   [Google Scholar]
  33. Murakami, S., & Jaffrey, S. R. (2022). Hidden codes in mRNA: Control of gene expression by m6A. Molecular Cell, 82(12), 2236-2251.
    [CrossRef]   [Google Scholar]
  34. Dowell, R. D., & Eddy, S. R. (2004). Evaluation of several lightweight stochastic context-free grammars for RNA secondary structure prediction. BMC Bioinformatics, 5(1), 1-14.
    [CrossRef]   [Google Scholar]
  35. WANG, Y., WANG, H., YAN, M., HU, G., & WANG, X. (2021). Design of biomolecular sequences by artificial intelligence. Synthetic Biology Journal, 2(1), 1-14.
    [Google Scholar]
  36. Condon, A. (2006). Designed DNA molecules: Principles and applications of molecular nanotechnology. Nature Reviews Genetics, 7(7), 565-575.
    [CrossRef]   [Google Scholar]
  37. Lathe, R. (1985). Synthetic oligonucleotide probes deduced from amino acid sequence data. Journal of Molecular Biology, 183(1), 1-12.
    [CrossRef]   [Google Scholar]
  38. Newman, Z. R., Young, J. M., Ingolia, N. T., & Barton, G. M. (2016). Differences in codon bias and GC content contribute to the balanced expression of TLR7 and TLR9. Proceedings of the National Academy of Sciences, 113(10).
    [CrossRef]   [Google Scholar]
  39. Burgess-Brown, N. A., Sharma, S., Sobott, F., Loenarz, C., Oppermann, U., & Gileadi, O. (2008). Codon optimization can improve expression of human genes in escherichia coli: A multi-gene study. Protein Expression and Purification, 59(1), 94-102.
    [CrossRef]   [Google Scholar]
  40. Gingold, H., & Pilpel, Y. (2011). Determinants of translation efficiency and accuracy. Molecular Systems Biology, 7(1), 481.
    [CrossRef]   [Google Scholar]
  41. Sharp, P. M., & Li, W. (1987). The codon adaptation index-a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Research, 15(3), 1281-1295.
    [CrossRef]   [Google Scholar]
  42. Browne, P. D., Nielsen, T. K., Kot, W., Aggerholm, A., Gilbert, M. T., Puetz, L., Rasmussen, M., Zervas, A., & Hansen, L. H. (2020). GC bias affects genomic and metagenomic reconstructions, underrepresenting GC-poor organisms. GigaScience, 9(2), giaa008.
    [CrossRef]   [Google Scholar]
  43. Mathews, D. H., Sabina, J., Zuker, M., & Turner, D. H. (1999). Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. Journal of Molecular Biology, 288(5), 911-940.
    [CrossRef]   [Google Scholar]
  44. Brahmachari, S. K., Meera, G., Sarkar, P. S., Balagurumoorthy, P., Tripathi, J., Raghavan, S., Shaligram, U., & Pataskar, S. (1995). Simple repetitive sequences in the genome: Structure and functional significance. ELECTROPHORESIS, 16(1), 1705-1714.
    [CrossRef]   [Google Scholar]
  45. Van Belkum, A., Scherer, S., Van Alphen, L., & Verbrugh, H. (1998). Short-sequence DNA repeats in prokaryotic genomes. Microbiology and Molecular Biology Reviews, 62(2), 275-293.
    [CrossRef]   [Google Scholar]
  46. Treangen, T. J., & Salzberg, S. L. (2011). Repetitive DNA and next-generation sequencing: Computational challenges and solutions. Nature Reviews Genetics, 13(1), 36-46.
    [CrossRef]   [Google Scholar]
  47. Kosuri, S., & Church, G. M. (2014). Large-scale de Novo DNA synthesis: Technologies and applications. Nature Methods, 11(5), 499-507.
    [CrossRef]   [Google Scholar]
  48. Chen, Z., Pan, N., & Beachy, R. N. (1988). A DNA sequence element that confers seed-specific enhancement to a constitutive promoter. The EMBO Journal, 7(2), 297-302.
    [CrossRef]   [Google Scholar]
  49. Slobodin, B., Han, R., Calderone, V., Vrielink, J. A., Loayza-Puch, F., Elkon, R., & Agami, R. (2017). Transcription impacts the efficiency of mRNA translation via Co-transcriptional N6-adenosine methylation. Cell, 169(2), 326-337.e12.
    [CrossRef]   [Google Scholar]
  50. Shaul, O. (2017). How introns enhance gene expression. The International Journal of Biochemistry & Cell Biology, 91, 145-155.
    [CrossRef]   [Google Scholar]
  51. Studier, F., & Moffatt, B. A. (1986). Use of bacteriophage T7 RNA polymerase to direct selective high-level expression of cloned genes. Journal of Molecular Biology, 189(1), 113-130.
    [CrossRef]   [Google Scholar]
  52. Kozak, M. (1986). Influences of mRNA secondary structure on initiation by eukaryotic ribosomes. Proceedings of the National Academy of Sciences, 83(9), 2850-2854.
    [CrossRef]   [Google Scholar]
  53. Kozak, M. (2005). Regulation of translation via mRNA structure in prokaryotes and eukaryotes. Gene, 361, 13-37.
    [CrossRef]   [Google Scholar]
  54. Milenkovic, O., & Kashyap, N. (2005). DNA codes that avoid secondary structures. Proceedings. International Symposium on Information Theory, 2005. ISIT 2005, 288-292.
    [CrossRef]   [Google Scholar]
  55. Lorenz, R., Bernhart, S. H., Höner zu Siederdissen, C., Tafer, H., Flamm, C., Stadler, P. F., & Hofacker, I. L. (2011). ViennaRNA Package 2.0. Algorithms for molecular biology, 6, 1-14.
    [CrossRef]   [Google Scholar]
  56. Schlake, T., Thess, A., Fotin-Mleczek, M., & Kallen, K. (2012). Developing mrna-vaccine technologies. RNA Biology, 9(11), 1319-1330.
    [CrossRef]   [Google Scholar]
  57. Arita, M., & Kobayashi, S. (2002). DNA sequence design using templates. New Generation Computing, 20(3), 263-277.
    [CrossRef]   [Google Scholar]
  58. Ling, M. M., & Robinson, B. H. (1997). Approaches to DNA mutagenesis: An overview. Analytical Biochemistry, 254(2), 157-178.
    [CrossRef]   [Google Scholar]
  59. Wachsmuth, M., Findeiss, S., Weissheimer, N., Stadler, P. F., & Morl, M. (2012). De Novo design of a synthetic riboswitch that regulates transcription termination. Nucleic Acids Research, 41(4), 2541-2551.
    [CrossRef]   [Google Scholar]
  60. Angermueller, C., Pärnamaa, T., Parts, L., & Stegle, O. (2016). Deep learning for computational biology. Molecular systems biology, 12(7), 878.
    [CrossRef]   [Google Scholar]
  61. Brophy, J. A., & Voigt, C. A. (2014). Principles of genetic circuit design. Nature methods, 11(5), 508-520.
    [CrossRef]   [Google Scholar]
  62. Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., ... & Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. nature, 596(7873), 583-589.
    [Google Scholar]
  63. Zhang, H., Zhang, L., Lin, A., Xu, C., Li, Z., Liu, K., ... & Huang, L. (2021). Lineardesign: Efficient algorithms for optimized mrna sequence design.
    [Google Scholar]
  64. Liu, J., Li, J., Wang, H., & Yan, J. (2020). Application of deep learning in genomics. Science China Life Sciences, 63, 1860-1878.
    [CrossRef]   [Google Scholar]
  65. Li, X., Cao, B., Wang, J., Meng, X., Wang, S., Huang, Y., ... & Song, T. (2025). Predicting mutation-disease associations through protein interactions via deep learning. IEEE Journal of Biomedical and Health Informatics.
    [CrossRef]   [Google Scholar]
  66. Killoran, N., Lee, L. J., DeLong, A., Duvenaud, D., & Frey, B. J. (2017). Generating and designing DNA with deep generative models. arXiv preprint arXiv:1712.06148.
    [Google Scholar]
  67. Riesselman, A. J., Ingraham, J. B., & Marks, D. S. (2018). Deep generative models of genetic variation capture the effects of mutations. Nature Methods, 15(10), 816-822.
    [CrossRef]   [Google Scholar]
  68. Kingma, D. P., & Welling, M. (2013). Auto-Encoding Variational Bayes. arXiv e-prints, arXiv-1312. https://ui.adsabs.harvard.edu/link_gateway/2013arXiv1312.6114K/doi:10.48550/arXiv.1312.6114
    [Google Scholar]
  69. Sumi, S., Hamada, M., & Saito, H. (2024). Deep generative design of RNA family sequences. Nature Methods, 21(3), 435-443.
    [CrossRef]   [Google Scholar]
  70. Seo, E., Choi, Y., Shin, Y., Kim, D., & Lee, J. (2023). Design of synthetic promoters for cyanobacteria with generative deep-learning model. Nucleic Acids Research, 51(13), 7071-7082.
    [CrossRef]   [Google Scholar]
  71. Hawkins-Hooker, A., Depardieu, F., Baur, S., Couairon, G., Chen, A., & Bikard, D. (2021). Generating functional protein variants with variational autoencoders. PLOS Computational Biology, 17(2), e1008736.
    [CrossRef]   [Google Scholar]
  72. Sadeghi, E., Mastracco, P., Gonzàlez-Rosell, A., Copp, S. M., & Bogdanov, P. (2024). Multi-objective design of DNA-stabilized Nanoclusters using variational Autoencoders with automatic feature extraction. ACS Nano, 18(39), 26997-27008.
    [CrossRef]   [Google Scholar]
  73. Greener, J. G., Moffat, L., & Jones, D. T. (2018). Design of metalloproteins and novel protein folds using variational autoencoders. Scientific Reports, 8(1), 16189.
    [CrossRef]   [Google Scholar]
  74. Moomtaheen, F., Killeen, M., Oswald, J., Gonzàlez-Rosell, A., Mastracco, P., Gorovits, A., Copp, S. M., & Bogdanov, P. (2022). DNA-stabilized silver Nanocluster design via regularized variational Autoencoders. Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 3593-3602.
    [CrossRef]   [Google Scholar]
  75. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., ... & Bengio, Y. (2020). Generative adversarial networks. Communications of the ACM, 63(11), 139-144.
    [CrossRef]   [Google Scholar]
  76. Sohn, K., Lee, H., & Yan, X. (2015). Learning structured output representation using deep conditional generative models. Advances in neural information processing systems, 28.
    [Google Scholar]
  77. Sohn, K., Lee, H., & Yan, X. (2015). Learning structured output representation using deep conditional generative models. Advances in neural information processing systems, 28.
    [Google Scholar]
  78. Dai, J., Zhang, Y., Shi, C., Liu, Y., Xiu, P., & Wang, Y. (2024). BEGAN: Boltzmann-Reweighted Data Augmentation for Enhanced GAN-Based Molecule Design in Insect Pheromone Receptors. The Journal of Physical Chemistry B, 128(47), 11666-11675.
    [CrossRef]   [Google Scholar]
  79. Chiquitto, A. G., Oliveira, L. S., Bugatti, P. H., Saito, P. T. M., Basham, M., Raittz, R. T., & Paschoal, A. R. (2024). Generative Approaches for Nucleotide Sequences to Enhance Non-coding RNA Classification. bioRxiv, 2024-11.
    [CrossRef]   [Google Scholar]
  80. Yelmen, B., Decelle, A., Boulos, L. L., Szatkownik, A., Furtlehner, C., Charpiat, G., & Jay, F. (2023). Deep convolutional and conditional neural networks for large-scale genomic data generation. PLOS Computational Biology, 19(10), e1011584.
    [CrossRef]   [Google Scholar]
  81. Yu, H., & Welch, J. D. (2021). MichiGAN: sampling from disentangled representations of single-cell data using generative adversarial networks. Genome biology, 22(1), 158.
    [CrossRef]   [Google Scholar]
  82. Yelmen, B., Decelle, A., Ongaro, L., Marnetto, D., Tallec, C., Montinaro, F., ... & Jay, F. (2021). Creating artificial human genomes using generative neural networks. PLoS genetics, 17(2), e1009303.
    [CrossRef]   [Google Scholar]
  83. Macedo, B., Ribeiro Vaz, I., & Taveira Gomes, T. (2024). MedGAN: optimized generative adversarial network with graph convolutional networks for novel molecule design. Scientific reports, 14(1), 1212.
    [CrossRef]   [Google Scholar]
  84. Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N., & Ganguli, S. (2015, June). Deep unsupervised learning using nonequilibrium thermodynamics. In International conference on machine learning (pp. 2256-2265). pmlr.
    [Google Scholar]
  85. DaSilva, L. F., Senan, S., Patel, Z. M., Reddy, A. J., Gabbita, S., Nussbaum, Z., ... & Pinello, L. (2024). DNA-diffusion: leveraging generative models for controlling chromatin accessibility and gene expression via synthetic regulatory elements. bioRxiv.
    [CrossRef]   [Google Scholar]
  86. Sarkar, A., Tang, Z., Zhao, C., & Koo, P. K. (2024). Designing DNA with tunable regulatory activity using discrete diffusion. bioRxiv, 2024-05.
    [CrossRef]   [Google Scholar]
  87. Li, Z., Ni, Y., Huygelen, T. A. B., Das, A., Xia, G., Stan, G. B., & Zhao, Y. (2023). Latent diffusion model for dna sequence generation. arXiv preprint arXiv:2310.06150.
    [CrossRef]   [Google Scholar]
  88. Luo, S., Su, Y., Peng, X., Wang, S., Peng, J., & Ma, J. (2022). Antigen-specific antibody design and optimization with diffusion-based generative models for protein structures. Advances in Neural Information Processing Systems, 35, 9754-9767.
    [Google Scholar]
  89. Wang, Z., Liu, Z., Zhang, W., Li, Y., Feng, Y., Lv, S., ... & Li, X. (2024). AptaDiff: de novo design and optimization of aptamers based on diffusion models. Briefings in Bioinformatics, 25(6), bbae517.
    [CrossRef]   [Google Scholar]
  90. Avdeyev, P., Shi, C., Tan, Y., Dudnyk, K., & Zhou, J. (2023, July). Dirichlet diffusion score model for biological sequence generation. In International Conference on Machine Learning (pp. 1276-1301). PMLR.
    [Google Scholar]
  91. Consens, M. E., Dufault, C., Wainberg, M., Forster, D., Karimzadeh, M., Goodarzi, H., ... & Wang, B. (2023). To transformers and beyond: large language models for the genome. arXiv preprint arXiv:2311.07621.
    [CrossRef]   [Google Scholar]
  92. Bengio, Y., Ducharme, R., Vincent, P., & Jauvin, C. (2003). A neural probabilistic language model. Journal of machine learning research, 3(Feb), 1137-1155.
    [Google Scholar]
  93. Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training.
    [Google Scholar]
  94. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. Advances in neural information processing systems, 30.
    [Google Scholar]
  95. Cao, B., Wang, B., & Zhang, Q. (2023). GCNSA: DNA storage encoding with a graph convolutional network and self-attention. Iscience, 26(3).
    [CrossRef]   [Google Scholar]
  96. Zheng, Y., Cao, B., Zhang, X., Cui, S., Wang, B., & Zhang, Q. (2024). DNA-QLC: an efficient and reliable image encoding scheme for DNA storage. BMC genomics, 25(1), 266.
    [CrossRef]   [Google Scholar]
  97. Sanabria, M., Hirsch, J., Joubert, P. M., & Poetsch, A. R. (2024). DNA language model GROVER learns sequence context in the human genome. Nature Machine Intelligence, 6(8), 911-923.
    [CrossRef]   [Google Scholar]
  98. Zhang, D., Zhang, W., Zhao, Y., Zhang, J., He, B., Qin, C., & Yao, J. (2023). DNAGPT: a generalized pre-trained tool for versatile DNA sequence analysis tasks. arXiv preprint arXiv:2307.05628.
    [CrossRef]   [Google Scholar]
  99. Chen, Y., & Zou, J. (2024). GenePT: a simple but effective foundation model for genes and cells built from ChatGPT. bioRxiv, 2023-10.
    [CrossRef]   [Google Scholar]
  100. Ji, Y., Zhou, Z., Liu, H., & Davuluri, R. V. (2021). DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome. Bioinformatics, 37(15), 2112-2120.
    [CrossRef]   [Google Scholar]
  101. Shao, B., & Yan, J. (2024). A long-context language model for deciphering and generating bacteriophage genomes. Nature Communications, 15(1), 9392.
    [CrossRef]   [Google Scholar]
  102. Madani, A., Krause, B., Greene, E. R., Subramanian, S., Mohr, B. P., Holton, J. M., ... & Naik, N. (2023). Large language models generate functional protein sequences across diverse families. Nature biotechnology, 41(8), 1099-1106.
    [CrossRef]   [Google Scholar]
  103. Tinoco Jr, I., & Bustamante, C. (1999). How RNA folds. Journal of molecular biology, 293(2), 271-281.
    [CrossRef]   [Google Scholar]
  104. Aslam, S., Rasool, A., Li, X., & Wu, H. (2025). Cel: A continual learning model for disease outbreak prediction by leveraging domain adaptation via elastic weight consolidation. Interdisciplinary Sciences: Computational Life Sciences, 1-19.
    [CrossRef]   [Google Scholar]
  105. Butt, M. H. F., Li, J. P., Ji, J., Riaz, W., Anwar, N., Butt, F. F., ... & Uddin, M. Y. (2024). Intelligent tumor tissue classification for Hybrid Health Care Units. Frontiers in Medicine, 11, 1385524.
    [CrossRef]   [Google Scholar]
  106. Dutt, Y., Pandey, R. P., Dutt, M., Gupta, A., Vibhuti, A., Vidic, J., ... & Priyadarshini, A. (2023). Therapeutic applications of nanobiotechnology. Journal of nanobiotechnology, 21(1), 148.
    [CrossRef]   [Google Scholar]
  107. Rasool, A., Hong, J., Hong, Z., Li, Y., Zou, C., Chen, H., ... & Dai, J. (2024). An Effective DNA‐Based File Storage System for Practical Archiving and Retrieval of Medical MRI Data. Small Methods, 8(10), 2301585.
    [CrossRef]   [Google Scholar]
  108. Rasool, A., Qu, Q., Jiang, Q., & Wang, Y. (2021, December). A strategy-based optimization algorithm to design codes for DNA data storage system. In International Conference on Algorithms and Architectures for Parallel Processing (pp. 284-299). Cham: Springer International Publishing.
    [CrossRef]   [Google Scholar]
  109. Westhof, E. R. I. C., Auffinger, E., & Gaspin, C. (1996). DNA and RNA structure prediction. DNA–Protein Sequence Analysis, Oxford, 255-278.
    [Google Scholar]
  110. Hofacker, I. L. (2003). Vienna RNA secondary structure server. Nucleic acids research, 31(13), 3429-3431.
    [CrossRef]   [Google Scholar]
  111. Wen, X., Sun, L., Xie, L., Zheng, Y., Cao, B., & Wang, B. (2024, December). MFN: Explainable DNA triple helixes Stabilized Design based on mCGR and flow network. In 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (pp. 248-253). IEEE.
    [CrossRef]   [Google Scholar]
  112. Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 255-260.
    [CrossRef]   [Google Scholar]
  113. Turner, D. H., & Mathews, D. H. (2010). NNDB: the nearest neighbor parameter database for predicting stability of nucleic acid secondary structure. Nucleic acids research, 38(suppl\_1), D280-D282.
    [CrossRef]   [Google Scholar]
  114. Bellaousov, S., Reuter, J. S., Seetin, M. G., & Mathews, D. H. (2013). RNAstructure: web servers for RNA secondary structure prediction and analysis. Nucleic acids research, 41(W1), W471-W474.
    [CrossRef]   [Google Scholar]
  115. Xia, T., SantaLucia Jr, J., Burkard, M. E., Kierzek, R., Schroeder, S. J., Jiao, X., ... & Turner, D. H. (1998). Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson--Crick base pairs. Biochemistry, 37(42), 14719-14735.
    [CrossRef]   [Google Scholar]
  116. Zakov, S., Goldberg, Y., Elhadad, M., & Ziv-Ukelson, M. (2011). Rich parameterization improves RNA structure prediction. Journal of Computational Biology, 18(11), 1525-1542.
    [CrossRef]   [Google Scholar]
  117. Akiyama, M., Sato, K., & Sakakibara, Y. (2018). A max-margin training of RNA secondary structure prediction integrated with the thermodynamic model. Journal of bioinformatics and computational biology, 16(06), 1840025.
    [CrossRef]   [Google Scholar]
  118. Sato, K., Akiyama, M., & Sakakibara, Y. (2021). RNA secondary structure prediction using deep learning with thermodynamic integration. Nature communications, 12(1), 941.
    [CrossRef]   [Google Scholar]
  119. Wang, L., Liu, Y., Zhong, X., Liu, H., Lu, C., Li, C., & Zhang, H. (2019). DMfold: a novel method to predict RNA secondary structure with pseudoknots based on deep learning and improved base pair maximization principle. Frontiers in genetics, 10, 143.
    [CrossRef]   [Google Scholar]
  120. Kunitski, M., Eicke, N., Huber, P., Köhler, J., Zeller, S., Voigtsberger, J., ... & Dörner, R. (2019). Double-slit photoelectron interference in strong-field ionization of the neon dimer. Nature communications, 10(1), 1.
    [CrossRef]   [Google Scholar]
  121. Singh, J., Paliwal, K., Zhang, T., Singh, J., Litfin, T., & Zhou, Y. (2021). Improved RNA secondary structure and tertiary base-pairing prediction using evolutionary profile, mutational coupling and two-dimensional transfer learning. Bioinformatics, 37(17), 2589-2600.
    [CrossRef]   [Google Scholar]
  122. Shen, T., Hu, Z., Peng, Z., Chen, J., Xiong, P., Hong, L., ... & Li, Y. (2022). E2Efold-3D: end-to-end deep learning method for accurate de novo RNA 3D structure prediction. arXiv preprint arXiv:2207.01586.
  123. Townshend, R. J., Eismann, S., Watkins, A. M., Rangan, R., Karelina, M., Das, R., & Dror, R. O. (2021). Geometric deep learning of RNA structure. Science, 373(6558), 1047-1051.
  124. Chen, C. C., & Chan, Y. M. (2023). REDfold: accurate RNA secondary structure prediction using residual encoder-decoder network. BMC Bioinformatics, 24(1), 122.
  125. Fu, L., Cao, Y., Wu, J., Peng, Q., Nie, Q., & Xie, X. (2022). UFold: fast and accurate RNA secondary structure prediction with deep learning. Nucleic Acids Research, 50(3), e14.
  126. Truong-Quoc, C., Lee, J. Y., Kim, K. S., & Kim, D. N. (2024). Prediction of DNA origami shape using graph neural network. Nature Materials, 23(7), 984-992.
  127. Zablocki, L. I., Bugnon, L. A., Gerard, M., Di Persia, L., Stegmayer, G., & Milone, D. H. (2025). Comprehensive benchmarking of large language models for RNA secondary structure prediction. Briefings in Bioinformatics, 26(2), bbaf137.
  128. Tušek, A., & Kurtanjek, Z. (2012). Mathematical modelling of gene regulatory networks. Applied Biological Engineering—Principles and Practice.
  129. Lu, T., Liang, H., Li, H., & Wu, H. (2011). High-dimensional ODEs coupled with mixed-effects modeling techniques for dynamic gene regulatory network identification. Journal of the American Statistical Association, 106(496), 1242-1258.
  130. Bubeck, S., Chandrasekaran, V., Eldan, R., Gehrke, J., Horvitz, E., Kamar, E., ... & Zhang, Y. (2023). Sparks of artificial general intelligence: Early experiments with GPT-4. arXiv preprint arXiv:2303.12712.
  131. Boycott, K. M., Vanstone, M. R., Bulman, D. E., & MacKenzie, A. E. (2013). Rare-disease genetics in the era of next-generation sequencing: discovery to translation. Nature Reviews Genetics, 14(10), 681-691.
  132. Jucker, M., & Walker, L. C. (2018). Propagation and spread of pathogenic protein assemblies in neurodegenerative diseases. Nature Neuroscience, 21(10), 1341-1349.
  133. Thalpage, N. (2023). Unlocking the black box: Explainable artificial intelligence (XAI) for trust and transparency in ai systems. J. Digit. Art Humanit, 4(1), 31-36.
  134. Abramson, J., Adler, J., Dunger, J., Evans, R., Green, T., Pritzel, A., ... & Jumper, J. M. (2024). Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature, 630(8016), 493-500.

Cite This Article
APA Style
Yang, T., Han, M., Wen, X., & Zheng, Y. (2025). Advances in Intelligent Design and Optimization Methods for Nucleic Acid Sequences. Journal of Artificial Intelligence in Bioinformatics, 1(1), 12–29. https://doi.org/10.62762/JAIB.2025.194547

Publisher's Note
IECE stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions
Copyright © 2025 by the Author(s). Published by Institute of Emerging and Computer Engineers. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.