There was once a noble scholarly life in which a person learned to plot by hand.

One sat before Matplotlib, perhaps, and slowly discovered that everything had a spine, a tick, a locator, a formatter, a transform, and a private grievance. In R, one met ggplot2 and learned that beauty could be achieved through grammar, provided one remembered where the parentheses went. Seaborn arrived as a kindly middle power, doing much of the work until the exact moment one needed the legend somewhere else.

This was character-building, presumably.

Now, of course, one can ask an LLM.

Not always successfully. This should be said early, before the machine becomes too pleased with itself. Benchmarks for plotting-code generation note both the promise of LLMs in automating data-analysis and visualization tasks and the continuing difficulty of producing fully executable code for complex plotting requests [1]. LIDA, one of the better-known systems in this space, frames visualization generation as a multi-stage problem: summarize the data, propose goals, generate visualization code, refine it, and produce grammar-agnostic charts across libraries such as Matplotlib and Seaborn [2]. Newer agentic systems such as PlotGen make the same point more explicitly: one agent plans, another writes code, and others inspect numbers, labels, and visual output before sending the poor figure back for repair [3].

This is, in a narrow but important sense, brilliant.

Not because it abolishes the need to understand visualization. It does not. If anything, it reveals more quickly who understands the plot and who merely recognizes one. But it changes the unit of labour. The researcher no longer needs to remember, from first principles and late at night, how to rotate the x-axis labels without accidentally rotating the entire moral arc of the figure. One can instead say: make this a dot heatmap; encode magnitude by colour and frequency by size; split by cell class; keep the legend outside; make it legible for a paper; do not, under any circumstances, invent a rainbow.
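Dictated to the machine, that request might come back as something like the following. It is a minimal sketch, not a canonical recipe; the genes, clusters, cell classes, and every value below are invented for illustration.

```python
# A hypothetical dot heatmap: magnitude -> colour, frequency -> dot size,
# one panel per (invented) cell class, legends kept outside the axes.
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
genes = ["GeneA", "GeneB", "GeneC", "GeneD"]
clusters = ["c1", "c2", "c3"]
classes = ["excitatory", "inhibitory"]  # hypothetical cell classes

fig, axes = plt.subplots(1, len(classes), figsize=(8, 3), sharey=True)
for ax, cls in zip(axes, classes):
    x, y = np.meshgrid(np.arange(len(genes)), np.arange(len(clusters)))
    magnitude = rng.uniform(0, 1, x.shape)     # encoded by colour
    frequency = rng.uniform(0.05, 1, x.shape)  # encoded by dot size
    sc = ax.scatter(x.ravel(), y.ravel(),
                    c=magnitude.ravel(), s=frequency.ravel() * 200,
                    cmap="viridis")            # perceptually uniform; no rainbow
    ax.set_xticks(range(len(genes)), genes, rotation=45, ha="right")
    ax.set_yticks(range(len(clusters)), clusters)
    ax.set_title(cls)

# keep the legends outside the plotting area
fig.colorbar(sc, ax=axes, label="magnitude", shrink=0.8)
handles, labels = sc.legend_elements(prop="sizes", num=3)
axes[-1].legend(handles, labels, title="frequency",
                loc="upper left", bbox_to_anchor=(1.05, 1))
fig.savefig("dot_heatmap.png", bbox_inches="tight")
```

Whether the first draft lands exactly here is beside the point; the point is that each clause of the request maps to one visible decision in the code.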

The machine will try. It will sometimes fail in a new and decorative way. But failure has become conversational.

This is the part I find useful. The LLM is not merely a code generator. It is a plotting intern with infinite patience, middling taste, and no resentment when asked for the fifteenth time to make the labels smaller, then larger, then smaller but in a more publication-ready way. It can translate between ggplot2 and Seaborn, between a sketch and an implementation, between “this looks wrong” and “the hue mapping is inconsistent across facets.” It can remember the boilerplate, which frees the researcher to remember the point.

There is a quiet academic virtue in this.

A figure is not the code that made it. The figure is an argument in visual form. It says: here is the comparison; here is the uncertainty; here is the structure; here is the exception that matters. The best use of an LLM is not to avoid learning what a good plot is. It is to spend less of one’s finite life relearning how to summon it.

Of course, one must still inspect the result. The axes may lie. The bins may drift. The colour scale may flatter the hypothesis. A model may produce perfectly runnable nonsense, which is, in academia, not exactly a new threat. The final responsibility remains with the human author, as it always did, though now the author has fewer excuses for a bad legend.
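Some of that inspection can itself be scripted. What follows is a small, entirely hypothetical audit of a finished Matplotlib figure; the checks are invented examples of what a reviewer would eyeball anyway, not a standard API.

```python
# A hypothetical sanity pass over a finished figure: complain about
# unlabelled axes and a log scale with a non-positive lower limit.
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt

def audit(ax):
    """Return a list of complaints about obvious figure sins."""
    problems = []
    if not ax.get_xlabel():
        problems.append("x-axis is unlabelled")
    if not ax.get_ylabel():
        problems.append("y-axis is unlabelled")
    lo, _ = ax.get_ylim()
    if ax.get_yscale() == "log" and lo <= 0:
        problems.append("log axis with a non-positive lower limit")
    return problems

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [2, 4, 8])
ax.set_xlabel("dose")
ax.set_ylabel("response")
ax.set_yscale("log")

assert audit(ax) == []  # silent only when nothing obvious is wrong
```

None of this replaces looking at the figure. It merely makes a few of the lies mechanical to catch.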

So yes, there is something ridiculous about never learning to plot manually. There is also something ridiculous about pretending that scholarship consists in memorizing the difference between theme_minimal() and theme_classic() while one’s actual scientific question waits in the next room.

The mature position is less heroic.

Learn enough to know what the figure should say.
Use the agent to make it say that.
Then check every line, every axis, every transformation, every silence.

The rest is formatting, and formatting has already taken enough from us.

— § —