Research Papers

Curated papers with reproducible code and real impact

Auto-updated from arXiv & bioRxiv · Last sync: 6/11/2026

arXiv cs.LG
2026-06-10

SocraticPO: Policy Optimization via Interactive Guidance

Zirui Liu, Jie Ouyang, Qi Liu, Xianquan Wang, Jiayu Liu, Tingyue Pan, Qingchuan Li, Jing Sha, Zhenya Huang, Shijin Wang, Enhong Chen

arXiv:2606.09887v1. Abstract: Reinforcement learning (RL) for large language models usually supervises reasoning with scalar outcome rewards, such as binary correctness. Such rewa...

preprint
Paper → arXiv
arXiv cs.LG
2026-06-10

Towards Diverse Scientific Hypothesis Search with Large Language Models

Haorui Wang, Parshin Shojaee, Kazem Meidani, Kunyang Sun, José Miguel Hernández-Lobato, Teresa Head-Gordon, Jiajun He, Chandan K. Reddy, Chao Zhang, Yuanqi Du

arXiv:2606.10587v1. Abstract: Large language models (LLMs) are on the rise for accelerating scientific discovery, most recently in advanced tasks such as generating valid scientif...

preprint
Paper → arXiv
arXiv cs.LG
2026-06-10

Pre-AF 13: An Interpretable Atrial Fibrillation Risk Score Mined from Discharge Reports

Olga Shakhmatova, Dmitrii Kriukov, Daniil Larionov, Nikita Khromov, Iaroslav Bespalov, Alexander Zolotarev, Kirill Grishchenkov, Ekaterina Ivanova, Miron Kuznetsov, Ilya Sochenkov, Elizaveta Panchenko, Artem Shelmanov, Dmitry V. Dylov

arXiv:2606.10725v2. Abstract: Background. Atrial fibrillation (AF) is the most prevalent cardiac arrhythmia and a major determinant of prognosis. Established AF risk scores rely o...

preprint
Paper → arXiv
bioRxiv

scFAIR Consortium: a decentralized hub for single-cell RNA-Seq data standardization and unification

Gardeux, V., Carsanaro, S., Chen, W. J., David, F. P. A., Goutte-Gattat, D., Hilton, J. A., Lubiana, T., Patel, N., Raymor, B., Zucchi, I., Deplancke, B., Ernst, C., Osumi-Sutherland, D., Robinson-Rechavi, M., Sternberg, P. W., Bastian, F. B.

The rapid accumulation of single-cell RNA-Seq (scRNA-seq) data across multiple repositories presents major challenges for data accessibility, integration, and reproducibility. While primary reposit...

preprint
Paper →
bioRxiv

Quantifying Evidence for Competing Biomedical Hypotheses using Large Language Models and Bayesian Analysis

Moore, B. M., Freeman, J., Millikin, R. J., Mohanty, C., George, K. S., Bal, A., Lock, C., Sauer, J.-D., Spurgeon, M. E., Moore, D. L., Travers, B. G., Stewart, R.

Science fundamentally depends on the generation and testing of hypotheses, many of them controversial. An explosion in scientific literature has made evaluating hypotheses even within a domain a pr...

preprint
Paper →