Scientific software documentation now has a second audience: AI systems that use it to explain software behavior, generate code, and help users reach complex APIs. As users interact with software through AI, documentation becomes part of the software's interface.
In this article, we summarize our recent experience revamping the documentation for the Visualization Toolkit (VTK) software package so it can better support both human readers and AI-assisted workflows.
Opening problem
Scientific software is powerful, but it still depends on documentation to make that power usable.
Documentation has always been part of what makes scientific software usable. VTK offers a rich set of algorithms, data structures, and rendering components, but users still need help understanding how those pieces work together in practice. Documentation has traditionally supplied that missing layer by explaining concepts, walking through workflows, and connecting APIs to real applications [1].
That role remains. The difference now is that documentation has a second audience.
What changed
Documentation no longer serves only human readers; it also serves AI systems that use it to explain software and generate code.
What is new is not documentation itself, but who, and what, uses it. AI systems now work from the same API pages, guides, tutorials, and examples that human users rely on. They use that material to answer questions, explain software behavior, and help generate code. Once that happens, documentation stops being just explanatory material and becomes part of the way the software is accessed and used.
How AI uses documentation
AI systems consume documentation in several ways, including model adaptation, retrieval, structured tool access, and code generation.
Public information usually does not tell us whether a general-purpose model such as ChatGPT was trained on a specific documentation set such as VTK's, even though OpenAI says its models draw on a mix of public, licensed, synthetic, and human-generated data [2]. But developers can still build smaller domain models from curated documentation, tutorials, examples, and API descriptions.
They can also bring documentation into the loop at runtime. Retrieval can supply relevant passages and examples when a user asks a question, while tool interfaces such as the Model Context Protocol (MCP) can expose structured software knowledge more directly [3, 4, 5]. In code generation, those pieces often work together: a model produces the response, retrieval supplies context, and tools expose structure [6]. For VTK, that only works well when the documentation makes roles, relationships, inputs, outputs, and common usage patterns explicit.
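As a rough illustration of the retrieval step described above, the sketch below ranks a few documentation passages against a user question by simple word overlap. The passages, the `tokenize`, `score`, and `retrieve` helpers, and the scoring rule are all assumptions made for illustration; real systems typically use embedding-based search rather than keyword overlap.

```python
import re

# Hypothetical documentation passages; a real system would load these
# from the project's documentation rather than hard-code them.
PASSAGES = [
    {"id": "vtkSphereSource",
     "text": "vtkSphereSource is a source that creates a polygonal sphere as vtkPolyData."},
    {"id": "vtkPolyDataMapper",
     "text": "vtkPolyDataMapper maps polygonal data to graphics primitives for rendering."},
    {"id": "vtkActor",
     "text": "vtkActor represents an object in a rendered scene and holds a mapper and properties."},
]


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(question, passage_text):
    """Count the distinct words shared by the question and the passage."""
    return len(tokenize(question) & tokenize(passage_text))


def retrieve(question, passages, top_k=2):
    """Return the top_k passages with the highest overlap score."""
    ranked = sorted(passages, key=lambda p: score(question, p["text"]), reverse=True)
    return ranked[:top_k]


results = retrieve("How do I create a sphere as polygonal data?", PASSAGES)
for passage in results:
    print(passage["id"])
```

Even in this toy form, the ranking depends entirely on the words each passage happens to contain, which is one reason explicit statements of role, input, and output help retrieval.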
Why that matters
AI systems do not read documentation the way people do, so documentation now needs to make its structure and relationships explicit.
AI systems do not use documentation the way people do. A person can read a page from beginning to end, follow an explanation across links, and supply missing context from prior experience. AI systems typically operate within a constructed context composed of retrieved passages, examples, and structured metadata. Retrieval and ranking systems select that material, and the model then matches patterns between the prompt and the supplied context to predict a response. It does not absorb the documentation as a continuous argument. It works over selected fragments and inferred relationships. That means documentation now has to do more than read well. It also has to state concepts, relationships, and common usage patterns clearly enough for software systems to use them.
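To make the idea of a constructed context more concrete, here is a minimal sketch that assembles a prompt from structured metadata and as many retrieved passages as fit in a fixed character budget. The `build_context` function, the bracketed labels, and the budget are hypothetical; they stand in for whatever a particular AI system does internally.

```python
def build_context(question, passages, metadata, max_chars=300):
    """Assemble a prompt from metadata and the passages that fit in max_chars."""
    parts = [f"[metadata] {key}: {value}" for key, value in metadata.items()]
    used = 0
    for passage in passages:  # assumed to be ordered by relevance already
        if used + len(passage) > max_chars:
            break  # this and all lower-ranked fragments are dropped
        parts.append(f"[passage] {passage}")
        used += len(passage)
    parts.append(f"[question] {question}")
    return "\n".join(parts)


passages = [
    "vtkSphereSource creates a polygonal sphere.",
    "Connect a source to a mapper with SetInputConnection.",
    "x" * 400,  # stands in for a long page that does not fit the budget
]
metadata = {"class": "vtkSphereSource", "output": "vtkPolyData"}

context = build_context("How do I render a sphere?", passages, metadata)
print(context)
```

The model only ever sees the assembled string, so anything left out of the selected fragments is simply unavailable to it.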
In VTK, that difference shows up quickly. If the documentation clearly says what a class does, what role it plays in the software, what it takes as input, what it produces, and where it fits in the visualization pipeline, AI-assisted tools have a much better chance of producing something useful. If that information is scattered, left implicit, or described inconsistently, the answer may still sound convincing while getting important details wrong.
What to do differently
Keep the website for people, but add a structured documentation layer that machines can use directly.
Traditional documentation websites are still the right place for people to learn. That is where users browse, follow links, compare examples, and slowly build a mental model of how a system works. For VTK, that kind of exploration still matters. People need explanations, examples, tutorials, and enough context to understand not just an API call, but how the larger visualization pipeline fits together.
AI systems need some of that same information, but they do better when it is exposed more directly. Instead of leaving everything buried in prose, projects can add a structured layer that software can parse and use more reliably. In practice, that might mean JSON or JSONL records that spell out class roles, parameters, relationships, example usage, and pipeline behavior. The website is still where people learn. The structured layer helps machines work with the same knowledge without having to infer quite so much from scattered pages.
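As one possible shape for such a structured layer, the sketch below writes two class records as JSONL and reads them back to answer a simple question. The schema is hypothetical: field names such as `role`, `inputs`, `outputs`, and `related` are illustrative assumptions, not an established VTK format.

```python
import json

# Hypothetical record schema; the field names are illustrative only.
records = [
    {
        "class": "vtkSphereSource",
        "role": "source",
        "inputs": [],
        "outputs": ["vtkPolyData"],
        "parameters": {"Radius": "float", "ThetaResolution": "int", "PhiResolution": "int"},
        "related": ["vtkPolyDataMapper"],
        "summary": "Creates a polygonal sphere.",
    },
    {
        "class": "vtkPolyDataMapper",
        "role": "mapper",
        "inputs": ["vtkPolyData"],
        "outputs": [],
        "parameters": {},
        "related": ["vtkActor"],
        "summary": "Maps polygonal data to graphics primitives.",
    },
]


def to_jsonl(items):
    """Serialize each record as one JSON object per line."""
    return "\n".join(json.dumps(item, sort_keys=True) for item in items)


def from_jsonl(text):
    """Parse JSONL text back into a list of dictionaries."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


jsonl_text = to_jsonl(records)
loaded = from_jsonl(jsonl_text)

# A tool can now answer "what consumes vtkPolyData?" without parsing prose.
consumers = [r["class"] for r in loaded if "vtkPolyData" in r["inputs"]]
print(consumers)
```

Because inputs and outputs are explicit fields, a tool can follow pipeline connections by lookup instead of inferring them from scattered pages.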
That shift also changes how it makes sense to build documentation. Rather than writing pages first and trying to pull structure out of them later, it may be better to make structure part of the system from the beginning. Some of that structure can come straight from the software through introspection: class names, inheritance, method signatures, and docstrings. An ontology can add another layer by defining the categories and relationships that make those facts useful [7, 8]. Human experts then handle the smaller, higher-value part of the work: validating the model, curating the mappings, and fixing the places where automation falls short. The result is not two separate documentation systems. It is a single documentation system that serves people through the website and AI through a structured knowledge layer.
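The introspection step might look something like the following sketch, which uses Python's standard `inspect` module to collect a class name, its base classes, method signatures, and docstrings. `ExampleSource` is a stand-in class and `describe_class` is a hypothetical helper; applying the same idea to VTK's Python-wrapped classes is an assumption here, and wrapped methods may not expose a signature, which the code tolerates.

```python
import inspect


class ExampleSource:
    """Stand-in class used here in place of a real VTK class."""

    def set_radius(self, radius: float) -> None:
        """Set the sphere radius."""
        self.radius = radius

    def get_radius(self) -> float:
        """Return the sphere radius."""
        return self.radius


def describe_class(cls):
    """Collect name, base classes, docstring, and public methods of a class."""
    methods = {}
    for name, member in inspect.getmembers(cls, callable):
        if name.startswith("_"):
            continue  # skip private and dunder members
        try:
            signature = str(inspect.signature(member))
        except (TypeError, ValueError):
            signature = None  # some extension-module callables expose no signature
        methods[name] = {"signature": signature, "doc": inspect.getdoc(member)}
    return {
        "name": cls.__name__,
        "bases": [base.__name__ for base in cls.__mro__[1:]],
        "doc": inspect.getdoc(cls),
        "methods": methods,
    }


record = describe_class(ExampleSource)
print(record["bases"])
print(record["methods"]["set_radius"]["signature"])
```

Facts gathered this way are mechanical; the roles and relationships that an ontology supplies, and the expert review described above, still have to be layered on top.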
The practical takeaway is straightforward. Most projects do not need a massive documentation overhaul to support AI. However, a few habits now matter even more than they used to.
- Clear structure matters.
- Consistent terminology matters.
- Self-contained pages matter.
- Good examples matter.
- Machine-readable metadata matters.
Those were already signs of good documentation. Now they also shape how well AI can work with the software.
Final takeaway
As users reach software through AI, and AI reaches software through documentation, documentation becomes part of the software's interface.
More and more people will come to software through AI rather than through source code, manuals, or search results alone. When that happens, the AI still needs some way to figure out what the software does, how its pieces fit together, and what patterns of use actually make sense. In many cases, that path runs straight through the documentation.
That is the bigger shift. Documentation is no longer just supporting material. It is becoming part of how software is actually used.
As users interact with software through AI, the AI itself relies on documentation to understand APIs, relationships, and usage patterns. In that context, documentation becomes a new computational interface to the software.
Author bios
Vicente Bolea is a senior R&D engineer at Kitware Inc. He is a core developer in VTK and its ecosystem, ParaView, Viskores, and ADIOS2, and is a regular contributor to other projects within the DAV and Tools integration initiative. He has a strong interest in open source solutions, system programming, software sustainability, and high-performance computing.
Jaswant Panchumarti is a graduate student at RPI and a Senior R&D Engineer at Kitware. He is leading the VTK WebAssembly effort at Kitware. His research focuses on detecting shocks in hypersonic flows with neural networks.
Berk Geveci leads the scientific visualization and informatics teams at Kitware Inc. He is one of the leading developers of the ParaView visualization application and the Visualization Toolkit (VTK). His research interests include large-scale parallel computing, computational dynamics, finite elements, and visualization algorithms. Dr. Geveci regularly publishes and teaches courses at conferences, including IEEE Visualization and Supercomputing.
Will Schroeder, Ph.D., is a co-founder of Kitware and served as its CEO for 19 years. His role is to identify technology and business opportunities and obtain the necessary support for Kitware to meet these opportunities. He also provides technical leadership in large open source projects such as the National Library of Medicine Insight Toolkit (ITK) project, the NA-MIC NIH National Center for Biomedical Computing, and the Visualization Toolkit (VTK), where he is a lead developer and the first author of the VTK textbook.
Patrick O'Leary is the Assistant Director of Scientific Computing for Kitware, Inc. Dr. O'Leary's research interests include high-performance computing (HPC), numerical analysis, finite elements, and visualization.
References
- [1] Kitware and the VTK Community. VTK Documentation. Accessed April 20, 2026.
- [2] OpenAI. How ChatGPT and Our Foundation Models Are Developed. Help Center. Accessed April 20, 2026.
- [3] Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. In Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 9459-9474.
- [4] Anthropic. Introducing the Model Context Protocol. Anthropic, November 25, 2024.
- [5] Model Context Protocol. Specification. Accessed April 20, 2026.
- [6] Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie Cai, Michael Terry, Quoc Le, and Charles Sutton. Program Synthesis with Large Language Models. arXiv, 2021.
- [7] Thomas R. Gruber. A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition 5, no. 2 (1993): 199-220.
- [8] Aidan Hogan, Eva Blomqvist, Michael Cochez, Claudia D'Amato, Gerard de Melo, Claudio Gutierrez, Sabrina Kirrane, Jose Emilio Labra Gayo, Roberto Navigli, Sebastian Neumaier, Axel-Cyrille Ngonga Ngomo, Axel Polleres, Sabbir M. Rashid, Anisa Rula, Lukas Schmelzeisen, Juan Sequeda, Steffen Staab, and Antoine Zimmermann. Knowledge Graphs. ACM Computing Surveys 54, no. 4 (2021): 1-37.


