Margaret Foster
arXiv:2602.00022v1 Announce Type: new Abstract: We propose a measurement framework for difficult-to-access contexts that uses indirect data traces, interpretable machine-learning models, and theory...
preprint
Zhenyu Pu, Yu Yang, Lun Yang, Qing-Shan Jia, Xiaohong Guan, Costas J. Spanos
arXiv:2602.00027v1 Announce Type: new Abstract: Hydrogen-based multi-energy systems (HMES) have emerged as a promising low-carbon and energy-efficient solution, as they can enable the coordinated ope...
preprint
Aneeqa Mehrab, Jan Willem Van Looy, Pietro Demurtas, Stefano Iotti, Emil Malucelli, Francesca Rossi, Ferdinando Zanchetta, Rita Fioresi
arXiv:2602.00159v1 Announce Type: new Abstract: The purpose of this paper is to elucidate the theory and mathematical modelling behind the sheaf neural network (SNN) algorithm and then show how SNN...
preprint
Bob Junyi Zou, Lu Tian
arXiv:2602.00218v1 Announce Type: new Abstract: Identifying truly predictive covariates while strictly controlling false discoveries remains a fundamental challenge in nonlinear, highly correlated,...
preprint
Meng Ding, Zeqing Zhang, Di Wang, Lijie Hu
arXiv:2602.00329v1 Announce Type: new Abstract: Reliable data attribution is essential for mitigating bias and reducing computational waste in modern machine learning, with the Shapley value servin...
preprint
Anushka Narayanan, Karianne J. Bergen
arXiv:2602.00331v1 Announce Type: new Abstract: Explainable AI (XAI) is essential for understanding machine learning (ML) decision-making and ensuring model trustworthiness in scientific applicatio...
preprint
Delia McGrath, Curtis Chong, Rohil Kulkarni, Gerbrand Ceder, Adeesh Kolluru
arXiv:2602.00376v1 Announce Type: new Abstract: Scientific reasoning in materials science requires integrating multimodal experimental evidence with underlying physical theory. Existing benchmarks ...
preprint
Philipp Hoellmer, Stefano Martiniani
arXiv:2602.00424v1 Announce Type: new Abstract: Continuous-time generative models for crystalline materials enable inverse materials design by learning to predict stable crystal structures, but inc...
preprint
Jiarui Zhang, Yuchen Yang, Ran Yan, Zhiyu Mei, Liyuan Zhang, Daifeng Li, Wei Fu, Jiaxuan Gao, Shusheng Xu, Yi Wu, Binhang Yuan
arXiv:2602.00482v1 Announce Type: new Abstract: Reinforcement learning (RL) based post-training for large language models (LLMs) is computationally expensive, as it generates many rollout sequences...
preprint
Minghui Sun, Haoyu Gong, Xingyu You, Jillian Hurst, Benjamin Goldstein, Matthew Engelhard
arXiv:2602.00520v1 Announce Type: new Abstract: Event stream data often exhibit hierarchical structure in which multiple events co-occur, resulting in a sequence of multisets (i.e., bags of events)...
preprint
Xinmo Jin, Bowen Fan, Xunkai Li, Henan Sun, YuXin Zeng, Zekai Chen, Yuxuan Sun, Jia Li, Qiangqiang Dai, Hongchao Qin, Rong-Hua Li, Guoren Wang
arXiv:2602.00539v1 Announce Type: new Abstract: Drug-Drug Interactions (DDIs) significantly influence therapeutic efficacy and patient safety. As experimental discovery is resource-intensive and ti...
preprint
Zilin Jing, Vincent Jeanselme, Yuta Kobayashi, Simon A. Lee, Chao Pang, Aparajita Kashyap, Yanwei Li, Xinzhuo Jiang, Shalmali Joshi
arXiv:2602.00541v1 Announce Type: new Abstract: Clinical events captured in Electronic Health Records (EHR) are irregularly sampled and may consist of a mixture of discrete events and numerical mea...
preprint
Seunghyun Yoo, Sanghong Kim, Namkyung Yoon, Hwangnam Kim
arXiv:2602.00547v1 Announce Type: new Abstract: Identifying molecules from mass spectrometry (MS) data remains a fundamental challenge due to the semantic gap between physical spectral peaks and un...
preprint
David Craveiro, Hugo Silva
arXiv:2602.00809v1 Announce Type: new Abstract: Smartphone sensors can be extremely useful in providing information on the activities and behaviors of persons. Human activity recognition is increas...
preprint
Yuhao Huang, Shih-Hsin Wang, Andrea L. Bertozzi, Bao Wang
arXiv:2602.00849v1 Announce Type: new Abstract: Mean flow (MeanFlow) enables efficient, high-fidelity image generation, yet its single-function evaluation (1-NFE) generation often cannot yield comp...
preprint
Shih-Hsin Wang, Yuhao Huang, Taos Transue, Justin Baker, Jonathan Forstater, Thomas Strohmer, Bao Wang
arXiv:2602.00862v1 Announce Type: new Abstract: Graph neural networks (GNNs) have emerged as powerful tools for learning protein structures by capturing spatial relationships at the residue level. ...
preprint
Louis Serrano, Jiequn Han, Edouard Oyallon, Shirley Ho, Rudy Morel
arXiv:2602.00884v1 Announce Type: new Abstract: Neural operators have shown promise in learning solution maps of partial differential equations (PDEs), but they often struggle to generalize when te...
preprint
Cuong Manh Nguyen, Truong-Son Hy
arXiv:2602.00910v1 Announce Type: new Abstract: Deep learning has revolutionized medical image analysis, playing a vital role in modern clinical applications. However, the deployment of large-scale...
preprint
Sahar Almahfouz Nasser, Juan Francisco Pesantez Borja, Jincheng Liu, Tanvir Hasan, Zenghan Wang, Suman Ghosh, Sandeep Manandhar, Shikhar Shiromani, Twisha Shah, Naoto Tokuyama, Anant Madabhushi
arXiv:2602.00953v1 Announce Type: new Abstract: Despite significant progress in computational pathology, many AI models remain black-box and difficult to interpret, posing a major barrier to clinic...
preprint
Leonardo Ferreira Guilhoto, Akshat Kaushal, Paris Perdikaris
arXiv:2602.00960v1 Announce Type: new Abstract: Scientific machine learning (SciML) increasingly requires models that capture multimodal conditional uncertainty arising from ill-posed inverse probl...
preprint
Fukang Ge, Jiarui Zhu, Linjie Zhang, Haowen Xiao, Xiangcheng Bao, Fangnan Xie, Danyang Chen, Yanrui Lu, Yuting Wang, Ziqian Guan, Lin Gu, Jinhao Bi, Yingying Zhu
arXiv:2602.00019v1 Announce Type: new Abstract: Modern AI technologies for drug discovery are distributed across heterogeneous platforms-including web applications, desktop environments, and code l...
preprint
Tingting Dan, Jiaqi Ding, Guorong Wu
arXiv:2602.00057v1 Announce Type: new Abstract: Neural coupling in both neuroscience and artificial intelligence emerges as dynamic oscillatory patterns that encode abstract concepts. To this end, ...
preprint
Yang Tan, Yuyuan Xi, Can Wu, Bozitao Zhong, Mingchen Li, Guisheng Fan, Jiankang Zhu, Yafeng Liang, Nanqing Dong, Liang Hong
arXiv:2602.00197v1 Announce Type: new Abstract: Zero-shot mutation prediction is vital for low-resource protein engineering, yet existing protein language models (PLMs) often yield statistically co...
preprint
Hasi Hays, William J. Richardson
arXiv:2602.00586v1 Announce Type: new Abstract: Network topology excels at structural predictions but fails to capture functional semantics encoded in biomedical literature. We present a retrieval-...
preprint
Jiahao Zhang, Zeqing Zhang, Di Wang, Lijie Hu
arXiv:2602.00782v1 Announce Type: new Abstract: Protein language models (PLMs) have enabled advances in structure prediction and de novo protein design, yet they frequently collapse into pathologic...
preprint
Masayuki Nagai, Alan E. Murphy, Kaeli Rizzo, Peter K. Koo
arXiv:2602.01230v1 Announce Type: new Abstract: Deciphering how DNA sequence encodes gene regulation remains a central challenge in biology. Advances in machine learning and functional genomics hav...
preprint
Alexander Dack, Tomislav Plesa, Thomas E. Ouldridge
arXiv:2602.02374v1 Announce Type: new Abstract: Both natural and synthetic chemical systems not only exhibit a range of non-trivial dynamics, but also transition between qualitatively different dyn...
preprint
Laura Cif, Diane Demailly, Gabriella A. Horvàth, Juan Dario Ortigoza Escobar, Nathalie Dorison, Mayté Castro Jiménez, Cécile A. Hubsch, Thomas Wirth, Gun-Marie Hariz, Sophie Huby, Morgan Dornadic, Zohra Souei, Muhammad Mushhood Ur Rehman, Simone Hemm, Mehdi Boulayme, Eduardo M. Moraud, Jocelyne Bloch, Xavier Vasques
arXiv:2602.00163v1 Announce Type: cross Abstract: Hyperkinetic movement disorders (HMDs) such as dystonia, tremor, chorea, myoclonus, and tics are disabling motor manifestations across childhood an...
preprint
Kunyi Fan, Mengjie Chen, Longlong Li, Cunquan Qu
arXiv:2602.01751v1 Announce Type: cross Abstract: Predicting drug-drug interactions (DDIs) is essential for safe pharmacological treatments. Previous graph neural network (GNN) models leverage mole...
preprint
Furkan Eris
arXiv:2602.01845v1 Announce Type: cross Abstract: Protein language models (PLMs) face a fundamental divide: masked language models (MLMs) excel at fitness prediction while causal models enable gene...
preprint
Nima Shoghi, Yuxuan Liu, Yuning Shen, Rob Brekelmans, Pan Li, Quanquan Gu
arXiv:2602.02128v1 Announce Type: cross Abstract: Molecular dynamics (MD) simulations remain the gold standard for studying protein dynamics, but their computational cost limits access to biologica...
preprint
Feiyang Cai, Guijuan He, Yi Hu, Jingjing Wang, Joshua Luo, Tianyu Zhu, Srikanth Pilla, Gang Li, Ling Liu, Feng Luo
arXiv:2602.02320v1 Announce Type: cross Abstract: Molecular function is largely determined by structure. Accurately aligning molecular structure with natural language is therefore essential for ena...
preprint
Amaru Caceres Arroyo, Lea Bogensperger, Ahmed Allam, Michael Krauthammer, Konrad Schindler, Dominik Narnhofer
arXiv:2602.02425v1 Announce Type: cross Abstract: Protein fitness optimization is challenged by a vast combinatorial landscape where high-fitness variants are extremely sparse. Many current methods...
preprint
Dulhan Jayalath, Oiwi Parker Jones
arXiv:2602.02494v1 Announce Type: cross Abstract: Clinical brain-to-text interfaces are designed for paralysed patients who cannot provide extensive training recordings. Pre-training improves data-...
preprint
T. Anderson Keller, Lyle Muller, Terrence J. Sejnowski, Max Welling
arXiv:2409.13669v2 Announce Type: replace Abstract: Spatiotemporal flows of neural activity, such as traveling waves, have been observed throughout the brain since the earliest recordings; yet ther...
preprint
Elvire Roblin, Paul-Henry Cournède, Stefan Michiels
arXiv:2506.12277v2 Announce Type: replace Abstract: Objective: In randomized clinical trials, prediction models can be used to explore the relationships between patients' variables (e.g., clinical,...
preprint
Kangcong Li, Peng Ye, Chongjun Tu, Lin Zhang, Chunfeng Song, Jiamin Wu, Tao Yang, Qihao Zheng, Tao Chen
arXiv:2506.17310v2 Announce Type: replace Abstract: While Large Language Models (LLMs) demonstrate strong performance across domains, their long-context capabilities are limited by transient neural...
preprint
Wei Wu, Qiuyi Li, Yuanyuan Zhang, Zhihao Zhan, Ruipu Chen, Mingyang Li, Kun Fu, Junyan Qi, Yongzhou Bao, Chao Wang, Yiheng Zhu, Zhiyun Zhang, Jian Tang, Fuli Feng, Jieping Ye, Yuwen Liu, Hui Xiong, Zheng Wang
arXiv:2502.07272v5 Announce Type: replace-cross Abstract: The rapid advancement of DNA sequencing has produced vast genomic datasets, yet interpreting and engineering genomic function remain fundam...
preprint
Runhan Shi, Letian Chen, Gufeng Yu, Yang Yang
arXiv:2511.06356v3 Announce Type: replace-cross Abstract: Chemical reaction prediction remains a fundamental challenge in organic chemistry, where existing machine learning models face two critical...
preprint
Jingjie Ning, Xiangzhen Shen, Li Hou, Shiyi Shen, Jiahao Yang, Junrui Li, Hong Shan, Sanan Wu, Sihan Gao, H. Eric Xu, Xinheng He
arXiv:2601.19149v2 Announce Type: replace-cross Abstract: G protein-coupled receptors (GPCRs) govern diverse physiological processes and are central to modern pharmacology. Yet discovering GPCR mod...
preprint
Zambaldi et al.
AlphaProteo generates novel protein binders with state-of-the-art binding affinities across diverse targets.
protein-design binder deepmind
bioRxiv
CURATED
2024-06-25 Hayes et al.
A multimodal generative language model that reasons over the sequence, structure, and function of proteins.
protein-lm generative foundation-model
Nature
CURATED
2024-05-08 Abramson et al.
AlphaFold 3 can predict the joint structure of complexes including proteins, nucleic acids, small molecules, ions, and modified residues.
structure-prediction protein deepmind
Nature
CURATED
2023-07-11 Watson et al.
A structure denoising diffusion probabilistic model for protein backbone generation.
protein-design diffusion baker-lab
ICLR 2023
CURATED
2023-02-01 Corso et al.
A diffusion generative model over the non-Euclidean manifold of ligand poses for molecular docking.
docking diffusion drug-discovery
Miao, Z., Fang, Z., Shi, X., Zeng, Y., Wu, T., Zheng, R., Li, M.
RNA velocity techniques have emerged as efficient tools for unraveling the complex trajectories of cell development and differentiation. However, most existing RNA velocity approaches are constr...
preprint
Tasmin, M., Mohanty, S., Kulkarni, S., Farhat, M. R., Green, A. G.
Foundation models aim to learn useful representations of biological sequences. However, the applicability of these representations for a wide range of tasks, including phenotype prediction and vari...
preprint
Guo, T., Dang, P., Fang, Y., Zhu, H., Wang, X., Wang, J., Ma, A., Ma, Q., Cao, S., Zhang, C.
DNA methylation is a central epigenetic modification that regulates gene expression, maintains genomic stability, and guides cellular differentiation. However, direct measurements of DNA methylatio...
preprint
Schroeder, A., Yu, X., Li, W., Mao, L., Yuan, M., Yang, J., Sachs, N., Dumoulin, B., Xu, G. X., Luo, X., Huang, A., Susztak, K., Hwang, T. H., Kadara, H., Maegdefessel, L., Yu, J., Li, M.
High-resolution histology images are indispensable for pathology and increasingly serve as the structural backbone for spatial omics. Yet whole-slide images (WSIs) frequently contain artifacts, ace...
preprint
Niu, K., Kulmanov, M., Hoehndorf, R.
Motivation: Current machine learning methods for enzyme function prediction primarily treat proteins as independent entities, ignoring the metabolic context in which they operate. This reductionist...
preprint
Kunz, T. R., Rivera-Feliciano, J.
The effects of perturbation on a biological system can be readily measured in terms of transcriptional changes. However, despite a wealth of transcriptional perturbation response data, there are cu...
preprint
Wang, X., Wang, Y., Visscher, P. M., Wray, N. R., Yengo, L.
Conditional and joint (COJO) analysis of genome-wide association study (GWAS) summary statistics to identify single nucleotide polymorphisms (SNPs) independently associated with a trait is standard...
preprint
Christidis, A., Ghazi, A. R., Chawla, S., Turaga, N., Gentleman, R., Geistlinger, L.
Although cell type annotation has become an integral part of single-cell analysis workflows, the assessment of computational annotations remains challenging. Many annotation tools transfer labels f...
preprint
Hinkston, M. A., Bradley, A. S.
Molecular biomarkers preserved in rocks provide evidence about ancient life but interpreting them requires inference through multiple stages of information loss arising from phylogenetic, biosynthe...
preprint
Van Puyvelde, B. R., Devreese, R., Chiva, C., Sabido, E., Pfammatter, S., Panse, C., Rijal, J. B., Keller, C., Batruch, I., Pribil, P., Vincendet, J.-B., Fontaine, F., Lefever, L., Magalhaes, P., Deforce, D., Nanni, P., Ghesquiere, B., Perez-Riverol, Y., Martens, L., Carapito, C., Bouwmeester, R., Dhaenens, M.
Recent advances in liquid chromatography mass spectrometry (LCMS) have accelerated the adoption of high-throughput workflows that deliver deep proteome coverage using minimal sample amounts. This t...
preprint
Takaesu, F., Villarreal, D. J., Zhou, A., Jimenez, M., Turner, M., Spiess, J. L., Kievert, J., Deshetler, C., Schwartzman, W., Yates, A. R., Kelly, J. M., Breuer, C. K., Davis, M.
Background: Post-operative tachycardia is a common and poorly understood complication following the Fontan procedure. Post-operative factors such as surgical scarring and venous hypertension can co...
preprint
Chen, S., Zevnik, U., Ziegenhain, C.
Motivation: Gene-body coverage bias differs across scRNA-seq protocols and can influence downstream analyses, yet coverage is often assessed using bulk-level summaries that obscure cell-to-cell var...
preprint
Arend, L., Woller, F., Rehor, B., Emmert, D., Frasnelli, J., Fuchsberger, C., Blumenthal, D. B., List, M.
Motivation: A growing volume of large-scale genome-wide association study (GWAS) datasets offers unprecedented power to uncover the genetic determinants of complex traits, but existing web-based pl...
preprint
Ramachandran, S., Ramakrishnan, N.
Epigenetic mechanisms regulate gene expression by altering the structure of the chromatin without modifying the underlying DNA sequence. Histone post-translational modifications (PTMs) are critical...
preprint
Portal, N., Karroucha, W., Mallet, V., Bonomi, M.
Protein function and other biological properties often depend on structural dynamics, yet most machine-learning predictors rely on static representations. Physics-based molecular simulations can de...
preprint
Alipour Pijani, B., Rifat, J. I. M., Bozdag, S.
Motivation: Multi-omics datasets capture complementary aspects of biological systems and are central to modern machine learning applications in biology and medicine. Existing graph-based integratio...
preprint
Thomsen, A. A., Jensen, O. N.
We assess batch correction methods for MALDI mass spectrometry imaging experiments. ComBAT reduced batch-related technical variance, maintained biological variation, and improved the overall score ...
preprint
Cohen, P., Johnson, S., Zavala, E. I., Moorjani, P., Slon, V.
Kinship reconstruction in ancient populations provides key insights into past social organization and evolutionary history. Sedimentary ancient DNA (sedaDNA) enables access to deep-time human popul...
preprint
Rahman, M. T., Al Olaimat, M., Bozdag, S., Alzheimer's Disease Neuroimaging Initiative
Motivation: Electronic Health Records (EHRs) contain vast amounts of longitudinal patient medical history data, making them highly informative for early disease prediction. Numerous computational m...
preprint
Lodh, E., Majumder, S., Chowdhury, T., De, M.
Large-scale pharmacogenomic screens provide extensive measurements of drug response across diverse cancer cell lines; however, most computational approaches emphasize point-wise sensitivity predict...
preprint