Working with the elephant

Arno_Bosse · April 14, 2026, 6:53pm

I was very pleased, recently, to come across Hans Brandhorst’s essay, “The elephant in the room: Iconography, Iconclass and Artificial Intelligence”. As I read it, it makes three critical points about the use of ChatGPT for iconographic research and cataloging (and I agree with all of them). These are: i) hallucinated Iconclass notations, ii) inaccurate pictorial descriptions and unreliable transcriptions, which then lead to iii) incorrect interpretations and notations. Other (also legitimate) issues about LLMs in general are raised in the essay as further contributing factors.

I was pleased because the essay is directly relevant to two resources I’ve been building over the last few weeks: rijksmuseum-iconclass-mcp, and its companion resource, rijksmuseum-mcp+. As it happens, both were designed to try and address above all Hans’ first point of criticism about LLMs (hallucinated references) and to at least make some progress towards addressing the third (incorrect interpretations). What they do, in a nutshell, is to direct an LLM to draw on a pre-defined data source in answering a prompt instead of falling back on its pre-trained, background knowledge. For the first resource, this is an online database of Iconclass notations, and for the second, the Rijksmuseum’s catalogue of artworks. The two are designed to work together, but for the sake of this discussion, let’s set aside rijksmuseum-mcp+ and just focus on rijksmuseum-iconclass-mcp which can, in any case, also be used as a standalone resource.

Once you ‘connect’ your LLM to rijksmuseum-iconclass-mcp (how to do so is described on the site) you can query it as usual in natural language and ask it to “look up the iconclass notation for an elephant” which it will then search for in its online database. The resource offers more than that, however, because it also allows you to, for example, search the Iconclass database by concept or meaning (i.e. semantic search) and not just keyword, to list all the key-expanded variants of a concept, or to explore the different places a concept appears within the Iconclass hierarchy. In my (admittedly still limited, and also not as an Iconclass expert) experience while building and testing the resource, hallucinated notations or labels are now rare.
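To illustrate the general idea of grounding (not the actual implementation of rijksmuseum-iconclass-mcp), here is a minimal Python sketch that checks notations proposed by a model against a lookup table and separates the ones that exist from the ones that don't. The function name `split_grounded` and the sample entries `EX1` and `EX2` are hypothetical placeholders, not real Iconclass notations; the real resource queries a full database.

```python
from typing import Iterable, Mapping


def split_grounded(
    proposed: Iterable[str], lookup: Mapping[str, str]
) -> tuple[dict[str, str], list[str]]:
    """Split proposed notations into those found in `lookup` and those not found.

    `lookup` stands in for the Iconclass database (notation -> label).
    Anything missing from it is treated as unverified, i.e. possibly hallucinated.
    """
    verified: dict[str, str] = {}
    unverified: list[str] = []
    for notation in proposed:
        key = notation.strip()
        if key in lookup:
            verified[key] = lookup[key]
        else:
            unverified.append(key)
    return verified, unverified


# Placeholder data: these are NOT real Iconclass notations or labels.
sample_lookup = {"EX1": "placeholder label one", "EX2": "placeholder label two"}

verified, unverified = split_grounded(["EX1", " EX2 ", "EX9"], sample_lookup)
print(verified)    # notations confirmed by the lookup, with their labels
print(unverified)  # notations the lookup does not contain
```

The point of the sketch is only that the label always comes from the data source and never from the model, so a notation the database doesn't contain is flagged instead of being passed along.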

With respect to Hans’ second (or rather, in my summary, third) point, about incorrect interpretations, this is, of course, due in large part to LLMs not being able to interpret the details of images correctly or to accurately transcribe historical texts. But (and here is where I disagree a little with his approach, though not with his conclusion) it is in part also due to how the LLM was being prompted in the ‘Elephants’ essay. LLMs live and die by the context they are given, and if very little is given (e.g. “interpret this” or “describe that”) then they will try to guess what exactly was meant by ‘interpret’ or ‘describe’ and consequently often guess wrong. But if you provide an LLM with more guidance and context, in the form of a more detailed and well-structured prompt, it is likely to do much better. This, in effect, is what lies behind the research skills feature, which is nothing more than a detailed set of instructions and guidance, in natural language (i.e. more context), to help the LLM better address Iconclass queries. Now, the ‘skill’ file I have created is very much geared towards teaching it how to query rijksmuseum-iconclass-mcp. It says little or nothing at all about how to conduct iconographic research, what constitutes a ‘good’ interpretation, what matters to focus on or to avoid, and so on. And it can’t, in its present form, because I’m really not an expert in this area. But it could, with the help of collaborators, and I suspect (based on my experience in working with LLMs) that once given this added context, an LLM could then also offer far better interpretations and cataloguing advice. But this remains a hypothesis to be tested!

Both resources are still under development and are, in part, only technology demonstrators, but of course I also want them to have a practical function and to provide real value to users. For this reason, I’d be very grateful for any and all feedback and criticism from members of this forum.

Arno_Bosse · April 17, 2026, 11:04am

To make it easier to compare ChatGPT’s attempts (based on its background, pre-trained knowledge) against Claude (using actual Iconclass data via rijksmuseum-iconclass-mcp) I’ve reproduced the tests Hans set to ChatGPT in his Elephants (PDF) essay:

Part 1: https://claude.ai/share/b2beb1c3-d26a-45f1-b520-cba3bdd36787
Part 2: https://claude.ai/share/b0dbfe2e-ba30-4011-be16-2d5bb621bd94

It’s not quite a fair fight – Hans drew on a now outdated version of ChatGPT (4o) back in July 2025 while Claude Opus 4.7 (used here) was just released yesterday. But my claim is that it’s less about the model and more about where the model gets its knowledge about Iconclass.

epoz · April 23, 2026, 7:32am

Arno, this looks very interesting, and extremely useful.
I wonder if we could integrate this more closely into the iconclass.org website, and make it available to more users.

Would you maybe be interested in presenting your work at one of the monthly online Iconclass meetings?
(usually on the last Friday of the month, at 17:00 Amsterdam time via Google Meet)

HansBrandhorst · April 23, 2026, 8:09am

Of a once famous Dutch author (Simon Vestdijk) it was said that he could “write faster than God could read” which more or less describes the current AI-explosion. The Elephant article is due to be published in Emblematica - also on paper - at the end of a process that is quite fast for traditional publications. But it does mean that whatever you write about AI-technology is outdated before the ink is dry.
I totally agree with Arno’s point that the prompt is crucial. Ideally the model should function as an assistant with which we enter into a conversation. E.g. if you have a hunch about the subject matter of an image, or think the response gets certain details wrong, you should be able to take a next step and suggest it to the model… It is my impression that we are quickly getting to this situation, also for iconography and Iconclass.

Arno_Bosse · May 4, 2026, 5:33pm

To answer Etienne’s question – yes, I deliberately set it up so that it consists of one larger, more complex database (the Iconclass notations) and a very small and simple sidecar database consisting of just the artwork counts at the Rijksmuseum for each notation. The latter is very easy to update. So you could, for example, add artwork counts to it from the RKD with just a CSV file and an import script. Then the Iconclass MCP server would be able to refer users to both the Rijksmuseum and the RKD. Hosting it online (I use railway.com for this) is also inexpensive.
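As a rough sketch of what such an import script might look like (under assumptions, not the actual code or schema of the server): the Python below loads a CSV of per-notation artwork counts into a small SQLite table keyed by notation and source. The table name `artwork_counts`, the CSV column names `notation` and `artwork_count`, the function `import_counts`, and the sample rows `EX1` and `EX2` are all hypothetical.

```python
import csv
import io
import sqlite3
from typing import TextIO


def import_counts(conn: sqlite3.Connection, source: str, rows: TextIO) -> int:
    """Load notation counts for one institution into the sidecar table.

    Expects CSV text with the (assumed) header: notation,artwork_count
    Re-importing the same source overwrites its earlier counts.
    """
    conn.execute(
        """CREATE TABLE IF NOT EXISTS artwork_counts (
               notation      TEXT    NOT NULL,
               source        TEXT    NOT NULL,
               artwork_count INTEGER NOT NULL,
               PRIMARY KEY (notation, source)
           )"""
    )
    data = [
        (row["notation"].strip(), source, int(row["artwork_count"]))
        for row in csv.DictReader(rows)
    ]
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT OR REPLACE INTO artwork_counts "
            "(notation, source, artwork_count) VALUES (?, ?, ?)",
            data,
        )
    return len(data)


# Placeholder data only: EX1/EX2 are not real notations, and the counts are made up.
conn = sqlite3.connect(":memory:")
import_counts(conn, "Rijksmuseum", io.StringIO("notation,artwork_count\nEX1,40\n"))
import_counts(conn, "RKD", io.StringIO("notation,artwork_count\nEX1,12\nEX2,3\n"))

for row in conn.execute(
    "SELECT source, artwork_count FROM artwork_counts WHERE notation = ?", ("EX1",)
):
    print(row)
```

Keying the table on the pair (notation, source) means that adding a second institution is just another import run, and the larger notations database doesn't need to change.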

For now, I’ve fixed a few bugs, made it a little faster, and added a new feature where it will provide a link to a custom search at ArtResearch for artworks with a specific Iconclass notation.