Documentation Is the Interface

Scientific software documentation now has a second audience: AI systems that use it to explain software behavior, generate code, and help users reach complex APIs. As users interact with software through AI, documentation becomes part of the software's interface.

In this article, we summarize our recent experience revamping the documentation for the Visualization Toolkit (VTK) software package so it can better support both human readers and AI-assisted workflows.

Opening problem

Scientific software is powerful, but it still depends on documentation to make that power usable.

Documentation has always been part of what makes scientific software usable. VTK offers a rich set of algorithms, data structures, and rendering components, but users still need help understanding how those pieces work together in practice. Documentation has traditionally supplied that missing layer by explaining concepts, walking through workflows, and connecting APIs to real applications.¹

That role remains. The difference now is that documentation has a second audience.

What changed

Documentation no longer serves only human readers; it also serves AI systems that use it to explain software and generate code.

What is new is not documentation itself, but who, and what, uses it. AI systems now work from the same API pages, guides, tutorials, and examples that human users rely on, and they use that material to answer questions, explain software behavior, and help generate code. Once that happens, documentation stops being just explanatory material and becomes part of how the software is accessed and used.

How AI uses documentation

AI systems consume documentation in several ways, including model adaptation, retrieval, structured tool access, and code generation.

Public information usually does not tell us whether a general-purpose model such as ChatGPT was trained on a specific documentation set like VTK, even though OpenAI says its models draw on a mix of public, licensed, synthetic, and human-generated data.² But developers can still build smaller domain models from curated documentation, tutorials, examples, and API descriptions.

They can also bring documentation into the loop at runtime. Retrieval can supply relevant passages and examples when a user asks a question, while tool interfaces such as the Model Context Protocol (MCP) can expose structured software knowledge more directly.³,⁴,⁵ In code generation, those pieces often work together: a model produces the response, retrieval supplies context, and tools expose structure.⁶ For VTK, that only works well when the documentation makes roles, relationships, inputs, outputs, and common usage patterns explicit.
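To make the retrieval step concrete, here is a minimal sketch of how a question might be matched against documentation passages and turned into context for a model. It is an illustration only, not how any particular assistant works: the passage text is placeholder wording rather than quoted VTK documentation, the function names `retrieve` and `build_context` are hypothetical, and simple word overlap stands in for the embedding-based ranking a real system would more likely use.

```python
import re
from collections import Counter

# Placeholder passages keyed by class name (illustrative wording, not quoted VTK docs).
PASSAGES = {
    "vtkSphereSource": "Source that generates a polygonal sphere. Output: vtkPolyData.",
    "vtkContourFilter": "Filter that generates isosurfaces or isolines from scalar values. Output: vtkPolyData.",
    "vtkPolyDataMapper": "Mapper that maps vtkPolyData to graphics primitives for rendering.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, passages, top_k=2):
    """Rank passages by word overlap with the query (a stand-in for real ranking)."""
    query_counts = Counter(tokenize(query))
    scored = []
    for name, text in passages.items():
        doc_counts = Counter(tokenize(name + " " + text))
        score = sum(min(query_counts[t], doc_counts[t]) for t in query_counts)
        scored.append((score, name))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:top_k] if score > 0]


def build_context(query, passages, top_k=2):
    """Assemble the retrieved passages and the question into one prompt string."""
    hits = retrieve(query, passages, top_k)
    lines = [f"[{name}] {passages[name]}" for name in hits]
    return "\n".join(lines + [f"Question: {query}"])


print(build_context("How do I generate isosurfaces from scalar values?", PASSAGES))
```

Even in this toy form, the model only ever sees the passages that were selected. A passage that never names its inputs, outputs, or role gives the ranking step nothing to match on.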

Why that matters

AI systems do not read documentation the way people do, so documentation now needs to make its structure and relationships explicit.

AI systems do not use documentation the way people do. A person can read a page from beginning to end, follow an explanation across links, and supply missing context from prior experience. AI systems typically operate within a constructed context composed of retrieved passages, examples, and structured metadata. Retrieval and ranking systems select that material, and the model then matches patterns between the prompt and the supplied context to predict a response. It does not absorb the documentation as a continuous argument. It works over selected fragments and inferred relationships. That means documentation now has to do more than read well. It also has to state concepts, relationships, and common usage patterns clearly enough for software systems to use them.

In VTK, that difference shows up quickly. If the documentation clearly says what a class does, what role it plays in the software, what it takes as input, what it produces, and where it fits in the visualization pipeline, AI-assisted tools have a much better chance of producing something useful. If that information is scattered, left implicit, or described inconsistently, the answer may still sound convincing while getting important details wrong.
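One way to check for that kind of completeness is a small lint pass over per-class documentation records. The sketch below assumes a hypothetical record layout with five fields (`summary`, `role`, `inputs`, `outputs`, `pipeline_stage`) that mirror the questions above; neither the field names nor the function are part of VTK.

```python
# Hypothetical required fields, one per question a class page should answer.
REQUIRED_FIELDS = ("summary", "role", "inputs", "outputs", "pipeline_stage")


def missing_fields(record):
    """Return the required fields that are absent, None, or an empty string.

    An empty list is allowed: a source explicitly documents "no inputs" that way.
    """
    return [
        field
        for field in REQUIRED_FIELDS
        if field not in record or record[field] is None or record[field] == ""
    ]


# Illustrative records; the descriptions are placeholders, not quoted VTK docs.
complete = {
    "class": "vtkSphereSource",
    "summary": "Generates a polygonal sphere.",
    "role": "source",
    "inputs": [],
    "outputs": ["vtkPolyData"],
    "pipeline_stage": "source",
}
incomplete = {
    "class": "vtkContourFilter",
    "summary": "Generates isosurfaces or isolines from scalar values.",
    "role": "filter",
    "inputs": ["vtkDataSet"],
}

print(missing_fields(complete))
print(missing_fields(incomplete))
```

A check like this does not judge whether the text is good. It only flags the gaps that would otherwise force a tool to guess.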

What to do differently

Keep the website for people, but add a structured documentation layer that machines can use directly.

Traditional documentation websites are still the right place for people to learn. That is where users browse, follow links, compare examples, and slowly build a mental model of how a system works. For VTK, that kind of exploration still matters. People need explanations, examples, tutorials, and enough context to understand not just an API call, but how the larger visualization pipeline fits together.

AI systems need some of that same information, but they do better when it is exposed more directly. Instead of leaving everything buried in prose, projects can add a structured layer that software can parse and use more reliably. In practice, that might mean JSON or JSONL records that spell out class roles, parameters, relationships, example usage, and pipeline behavior. The website is still where people learn. The structured layer helps machines work with the same knowledge without having to infer quite so much from scattered pages.
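As an illustration of what such a layer could look like, the sketch below writes a few class records as JSONL, one JSON object per line, and reads them back. The schema (`class`, `role`, `inputs`, `outputs`, `parameters`, `related`, `example`) is an assumption made for this example, not the format used by the VTK project, and the helper names are hypothetical.

```python
import json

# Illustrative records under an assumed schema; values are simplified for the example.
records = [
    {
        "class": "vtkSphereSource",
        "role": "source",
        "inputs": [],
        "outputs": ["vtkPolyData"],
        "parameters": {"Radius": "float", "ThetaResolution": "int"},
        "related": ["vtkPolyDataMapper"],
        "example": "sphere = vtkSphereSource(); sphere.SetRadius(2.0)",
    },
    {
        "class": "vtkPolyDataMapper",
        "role": "mapper",
        "inputs": ["vtkPolyData"],
        "outputs": [],
        "parameters": {"ScalarVisibility": "bool"},
        "related": ["vtkSphereSource"],
        "example": "mapper.SetInputConnection(sphere.GetOutputPort())",
    },
]


def to_jsonl(items):
    """Serialize each record as one JSON object per line."""
    return "\n".join(json.dumps(item, sort_keys=True) for item in items)


def from_jsonl(text):
    """Parse JSONL text back into a list of dictionaries, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


jsonl_text = to_jsonl(records)
print(jsonl_text)
```

Because each line is an independent record, a retrieval system or tool server can load, index, or update one class at a time without parsing surrounding prose.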

That shift also changes how it makes sense to build documentation. Rather than writing pages first and trying to pull structure out of them later, it may be better to make structure part of the system from the beginning. Some of that structure can come straight from the software through introspection: class names, inheritance, method signatures, and docstrings. An ontology can add another layer by defining the categories and relationships that make those facts useful.⁷,⁸ Human experts then handle the smaller, higher-value part of the work: validating the model, curating the mappings, and fixing the places where automation falls short. The result is not two separate documentation systems. It is a single documentation system that serves people through the website and AI through a structured knowledge layer.
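The introspection step can be sketched with Python's standard `inspect` module. To keep the example self-contained, it uses small stand-in classes rather than VTK itself; compiled bindings may expose less to `inspect` and could need a different mechanism. The `ONTOLOGY_ROLES` mapping is a hypothetical, one-entry stand-in for the ontology layer, and `describe` is an illustrative helper, not an existing API.

```python
import inspect


class Algorithm:
    """Stand-in base class for pipeline objects."""


class SphereSource(Algorithm):
    """Generate a polygonal sphere."""

    def set_radius(self, radius: float) -> None:
        """Set the sphere radius."""
        self.radius = radius


# Hypothetical ontology fragment: maps a base class name to a pipeline role.
ONTOLOGY_ROLES = {"Algorithm": "pipeline-algorithm"}


def describe(cls):
    """Collect the class name, docstring, base classes, and public method signatures."""
    methods = {}
    for name, func in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private and special methods
        methods[name] = {
            "signature": str(inspect.signature(func)),
            "doc": inspect.getdoc(func),
        }
    bases = [base.__name__ for base in cls.__mro__[1:] if base is not object]
    # The ontology adds meaning that raw introspection cannot supply.
    roles = [ONTOLOGY_ROLES[base] for base in bases if base in ONTOLOGY_ROLES]
    return {
        "class": cls.__name__,
        "doc": inspect.getdoc(cls),
        "bases": bases,
        "methods": methods,
        "roles": roles,
    }


info = describe(SphereSource)
print(info)
```

The split of labor is visible here: introspection fills in names, signatures, and docstrings automatically, while the role mapping is the part a human expert would curate and validate.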

The practical takeaway is straightforward. Most projects do not need a massive documentation overhaul to support AI. However, a few habits now matter even more than they used to.

Clear structure matters.
Consistent terminology matters.
Self-contained pages matter.
Good examples matter.
Machine-readable metadata matters.

Those were already signs of good documentation. Now they also shape how well AI can work with the software.

Final takeaway

As users reach software through AI, and AI reaches software through documentation, documentation becomes part of the software's interface.

More and more people will come to software through AI rather than through source code, manuals, or search results alone. When that happens, the AI still needs some way to figure out what the software does, how its pieces fit together, and what patterns of use actually make sense. In many cases, that path runs straight through the documentation.

That is the bigger shift. Documentation is no longer just supporting material. It is becoming part of how software is actually used.

As users interact with software through AI, the AI itself relies on documentation to understand APIs, relationships, and usage patterns. In that context, documentation becomes a new computational interface to the software.

Author bios

Vicente Bolea is a senior R&D engineer at Kitware Inc. He is a core developer in VTK and its ecosystem, ParaView, Viskores, and ADIOS2, and is a regular contributor to other projects within the DAV and Tools integration initiative. He has a strong interest in open source solutions, system programming, software sustainability, and high-performance computing.

Jaswant Panchumarti is a graduate student at RPI and a senior R&D engineer at Kitware, where he leads the VTK WebAssembly effort. His research focuses on detecting shocks in hypersonic flows with neural networks.

Berk Geveci leads the scientific visualization and informatics teams at Kitware Inc. He is one of the leading developers of the ParaView visualization application and the Visualization Toolkit (VTK). His research interests include large-scale parallel computing, computational dynamics, finite elements, and visualization algorithms. Dr. Geveci regularly publishes and teaches courses at conferences including IEEE Visualization and Supercomputing conferences.

Will Schroeder, Ph.D., is a co-founder of Kitware and served as its CEO for 19 years. His role is to identify technology and business opportunities and obtain the necessary support for Kitware to meet these opportunities. He also provides technical leadership in large open source projects such as the National Library of Medicine Insight Toolkit (ITK) project, the NA-MIC NIH National Center for Biomedical Computing, and the Visualization Toolkit (VTK), where he is a lead developer and the first author of the VTK textbook.

Patrick O'Leary is the Assistant Director of Scientific Computing for Kitware, Inc. Dr. O'Leary's research interests include high-performance computing (HPC), numerical analysis, finite elements, and visualization.
