
Generative artificial intelligence (AI) and large language models (LLMs) in particular are changing the way we do data science. Most
prominently, scientists use the technology for interacting with scientific data [1], answering data analysis questions [2,3], generating data analysis code [4,5,6], and (re-)writing scientific
manuscripts [7]. Unfortunately, the prompts sent to LLMs are commonly not conserved, and thus, at the time of publication, it might be hard to differentiate human-made and AI-generated parts of
the scientific work. A professional peer-review system for documenting how LLM-generated code was prompted and which human reviewed it is not established in contemporary scientific
culture. However, such systems do exist for collaborative code editing involving multiple humans. For example, the source code repositories GitHub and GitLab are well-established in the
open-source software community for discussing issues and potential solutions, building code together, and for peer-reviewing content. Because LLMs have been shown to solve real-world
GitHub issues [8], developing an AI-assistant that interacts with humans directly within the GitHub platform is the obvious next step.

Here, I present git-bob, a GitHub/GitLab-integration of an
LLM-based AI-assistant that can respond to GitHub issues, discuss potential solutions with humans iteratively, write code for them, and submit it as a pull-request to be reviewed by humans.
It is technically similar to various online services for data analysis such as the OpenAI ChatGPT Data Analyst or GitHub Copilot workflows, with three major differences. First, multiple
humans can interact with git-bob in one communication thread. This allows bringing together domain specialists, such as life scientists, data-analysts and the AI-assistant in one discussion,
stimulating knowledge exchange on how to interact properly with the AI-assistant. Second, discussions with git-bob and resulting code modifications are conserved in an online platform that
others can read and follow, making the interaction with the AI-assistant fully transparent. Third, git-bob is completely open-source and extensible. Other developers can read its built-in
system prompts and modify them to their needs. Developers can implement custom connectors to other LLM service providers and write plugins for their custom AI agents, which may deal with
GitHub issues differently.

CODE AVAILABILITY

The complete source code
of git-bob is available online at GitHub [11]: https://github.com/haesleinhuepf/git-bob

REFERENCES

1. Royer, L. A. _Nat. Methods_ 20, 951–952 (2023).
2. Lai, Y. et al. Preprint at https://arxiv.org/abs/2211.11501 (2022).
3. Lei, W. et al. _Nat. Methods_ 21, 1368–1370 (2024).
4. Royer, L. A. _Nat. Methods_ 21, 1371–1373 (2024).
5. Haase, R., Tischer, C., Hériché, J.-K. & Scherf, N. Preprint at _bioRxiv_ https://doi.org/10.1101/2024.04.19.590278 (2024).
6. Chen, M. et al. Preprint at https://arxiv.org/abs/2107.03374 (2021).
7. Lu, C. et al. Preprint at https://arxiv.org/abs/2408.06292 (2024).
8. Jimenez, C. E. et al. Preprint at https://arxiv.org/abs/2310.06770 (2024).
9. Yin, Z. et al. Preprint at https://arxiv.org/abs/2305.18153 (2023).
10. About GitHub-hosted runners. _GitHub_ https://docs.github.com/en/actions/using-github-hosted-runners/using-github-hosted-runners/about-github-hosted-runners (accessed 14 October 2024).
11. Haase, R. git-bob. _GitHub_ https://github.com/haesleinhuepf/git-bob (2024).

ACKNOWLEDGEMENTS

I would like to thank E. K. Nicolay (UFZ Leipzig) and M. Lampert (TU Dresden) for testing git-bob in its
early days and for providing constructive feedback on the manuscript. I also would like to thank V. Hilsenstein for pushing for GitLab interoperability. I acknowledge the financial support
by the Federal Ministry of Education and Research of Germany and by Sächsische Staatsministerium für Wissenschaft, Kultur und Tourismus in the programme Center of Excellence for AI-research
“Center for Scalable Data Analytics and Artificial Intelligence Dresden/Leipzig”, project identification number: ScaDS.AI. I also acknowledge financial support from the Deutsche
Forschungsgemeinschaft (DFG, German Research Foundation) under the National Research Data Infrastructure – NFDI 46/1 – 501864659 - NFDI4BioImage.

AUTHOR INFORMATION

AUTHORS AND AFFILIATIONS

* Data Science Center, Leipzig University, Leipzig, Germany: Robert Haase
* Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI) Dresden / Leipzig, Leipzig, Germany: Robert Haase

CORRESPONDING AUTHOR

Correspondence to Robert Haase.

ETHICS DECLARATIONS

COMPETING INTERESTS

The author declares no competing interests.

PEER REVIEW

PEER REVIEW INFORMATION

_Nature Computational Science_ thanks Virginie Uhlmann and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

SUPPLEMENTARY INFORMATION

Supplementary Figures 1–6.

ABOUT THIS ARTICLE

CITE THIS ARTICLE

Haase, R. Towards transparency and knowledge exchange in AI-assisted data analysis code generation. _Nat Comput Sci_ 5, 271–272 (2025).
https://doi.org/10.1038/s43588-025-00781-1

* Published: 27 March 2025
* Issue Date: April 2025