I am pleased to share another recent paper from Dr. Gail Rosen's lab (Drexel University College of Engineering), “Enhancing nucleotide sequence representations in genomic analysis with contrastive optimization”, just published in Communications Biology. This paper introduces Scorpio (Sequence Contrastive Optimization for Representation and Predictive Inference on DNA), a versatile framework for nucleotide sequences that employs contrastive learning to improve embeddings. By leveraging pre-trained genomic language models and k-mer frequency embeddings, Scorpio demonstrates competitive performance in diverse applications, including taxonomic and gene classification, antimicrobial resistance (AMR) gene identification, and promoter detection. A key strength of Scorpio is its ability to generalize to novel DNA sequences and taxa, addressing a significant limitation of alignment-based methods. Kudos to the lead author, Mohammad Saleh Refahi, a very talented PhD student, and to co-authors Bahrad Sokhansanj, Joshua C. Mell, Hyunwoo Yoo, and Gavin Hearne.