
TL;DR
Claude, Anthropic's large language model AI, can build tools to help teach AI,
BUT
The tools that Claude builds are prototypes -- not high-quality finished products:
They look like "programmer art" and
They plateau at a certain level of complexity
Join me on a journey to convince Claude that a functional interactive tool that actually looks nice really would be a good idea...
(If you missed the first installment, you can catch up with Part 1 here)
Intro
Last month we set off on a quest to use the Claude AI to build a tool that can help teach machine learning concepts. After several attempts we were able to convince Claude to plot 2D Gaussian point clouds and draw 90% confidence ellipses for them.
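For reference, those ellipses fall straight out of each Gaussian's covariance matrix. Here is a minimal sketch of the math -- my own TypeScript, not the artifact's code, with the 2x2 covariance representation and function name made up for illustration:

```typescript
// Sketch: compute a 90% confidence ellipse from a 2x2 covariance matrix
// [[a, b], [b, c]]. 4.605 is the 90% quantile of the chi-squared
// distribution with 2 degrees of freedom.
function confidenceEllipse(cov: { a: number; b: number; c: number }) {
  const trace = cov.a + cov.c;
  const det = cov.a * cov.c - cov.b ** 2;
  const gap = Math.sqrt((trace * trace) / 4 - det); // eigenvalue half-spread
  const l1 = trace / 2 + gap; // larger eigenvalue (major-axis direction)
  const l2 = trace / 2 - gap; // smaller eigenvalue (minor-axis direction)
  const chi2 = 4.605;
  return {
    semiMajor: Math.sqrt(l1 * chi2),
    semiMinor: Math.sqrt(l2 * chi2),
    angleRadians: 0.5 * Math.atan2(2 * cov.b, cov.a - cov.c), // major-axis tilt
  };
}
```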
Now that we can generate data, let's have Claude implement one of the first machine learning algorithms that every student learns: decision trees.
The Decision Tree Algorithm
A decision tree learns to approximate the distribution of a dataset by repeatedly subdividing the training data into smaller but purer subsets. To make one subdivision, a decision tree finds the purest split that can be achieved using any single input feature.
Let's look at an example dataset.

This dataset is 2-dimensional. That is, each datapoint consists of 2 input features (i.e. an x and a y coordinate). While a diagonal line could easily separate the red and blue points, a diagonal line would need to reference both input features -- which a decision tree cannot do. Instead, a decision tree can only draw either a vertical line dividing the points into left and right groups, or a horizontal line dividing the points into top and bottom groups. (Counterintuitively, a vertical line only relies on the value of the horizontal feature, while a horizontal line only uses the value of the vertical feature.) In this particular dataset, neither a vertical line nor a horizontal line can perfectly separate the red points from the blue points, but a vertical split results in the purest subgroups:

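To make "purest" concrete, here is a minimal sketch of that single-feature split search, using Gini impurity as the purity measure (one common choice -- the tool may use something else). The Point type and function names are mine, invented for illustration:

```typescript
// A hypothetical datapoint: two input features plus a class label.
type Point = { x: number; y: number; label: number };

// Gini impurity of a group: 1 minus the sum of squared class proportions.
// 0 means perfectly pure (one class); higher means more mixed.
function gini(points: Point[]): number {
  const counts = new Map<number, number>();
  for (const p of points) counts.set(p.label, (counts.get(p.label) ?? 0) + 1);
  let sumSq = 0;
  for (const c of counts.values()) sumSq += (c / points.length) ** 2;
  return 1 - sumSq;
}

// Try every candidate vertical and horizontal line and keep the one whose
// two sides have the lowest size-weighted impurity.
function bestSplit(points: Point[]): { feature: "x" | "y"; threshold: number } {
  let best = { feature: "x" as "x" | "y", threshold: 0, score: Infinity };
  for (const feature of ["x", "y"] as const) {
    const values = [...new Set(points.map((p) => p[feature]))].sort((a, b) => a - b);
    for (let i = 1; i < values.length; i++) {
      const threshold = (values[i - 1] + values[i]) / 2; // midpoint between neighbors
      const left = points.filter((p) => p[feature] <= threshold);
      const right = points.filter((p) => p[feature] > threshold);
      const score =
        (left.length * gini(left) + right.length * gini(right)) / points.length;
      if (score < best.score) best = { feature, threshold, score };
    }
  }
  return { feature: best.feature, threshold: best.threshold };
}
```

Note that splitting on feature "x" draws a vertical line and splitting on feature "y" draws a horizontal one -- exactly the counterintuitive pairing noted above.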
Having performed one split, the decision tree algorithm then considers each of the two resulting subsets of data. If a subset does not yet consist entirely of datapoints from one single class, then the decision tree can draw a new vertical or horizontal line a la:

Splitting continues until the tree arrives at a set of subdivisions that fully separates the training datapoints by color class. The decision tree approximates the distribution of the blue points as the blue area of the input plane, and it approximates the distribution of the red points as the red area of the plane.
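In code, the recursive step is a small addition on top of the split search. A sketch, reusing the hypothetical Point and bestSplit from the previous snippet:

```typescript
// A tree is either a pure leaf or a split with two subtrees.
type Tree =
  | { kind: "leaf"; label: number }
  | { kind: "split"; feature: "x" | "y"; threshold: number; left: Tree; right: Tree };

// Most common class label in a group (used as a fallback below).
function majorityLabel(points: Point[]): number {
  const counts = new Map<number, number>();
  for (const p of points) counts.set(p.label, (counts.get(p.label) ?? 0) + 1);
  let best = points[0].label;
  for (const [label, count] of counts) if (count > (counts.get(best) ?? 0)) best = label;
  return best;
}

function growTree(points: Point[]): Tree {
  // Stop when the subset is pure: every point has the same class.
  if (new Set(points.map((p) => p.label)).size === 1) {
    return { kind: "leaf", label: points[0].label };
  }
  const { feature, threshold } = bestSplit(points);
  const left = points.filter((p) => p[feature] <= threshold);
  const right = points.filter((p) => p[feature] > threshold);
  if (left.length === 0 || right.length === 0) {
    // Duplicate points with mixed labels can make a clean split impossible.
    return { kind: "leaf", label: majorityLabel(points) };
  }
  return { kind: "split", feature, threshold, left: growTree(left), right: growTree(right) };
}
```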
Claude and Decision Trees
An interactive visual tool that could incrementally build a decision tree in real time would be a powerful teaching aid. So let's have Claude augment our 2D Gaussian clouds with an implementation of a decision tree:

And the result:
Ummm, where's the tree? I see no vertical or horizontal subdivisions...
Before we go down the rabbit hole of convincing Claude to actually build a tree, I should mention that the jump from 2 color classes to 4 is intentional. Before I tasked Claude with adding a decision tree, I asked the AI to up the number of distributions from 2 to 4, and to simultaneously modify the Sample Size slider to grade on a logarithmic scale. Claude successfully implemented both modifications on the first try!
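For the curious, a log-scale slider needs only a one-line mapping: keep the slider itself linear and exponentiate its value. A sketch (the 10-to-1000 sample range is my assumption, not necessarily what the artifact uses):

```typescript
// Sketch: map a linear slider position t in [0, 1] to a sample size on a
// logarithmic scale. The 10-to-1000 range is an assumption for illustration.
function sliderToSampleSize(t: number, min = 10, max = 1000): number {
  return Math.round(min * Math.pow(max / min, t));
}

sliderToSampleSize(0.0); // 10
sliderToSampleSize(0.5); // 100 (the geometric midpoint, not 505)
sliderToSampleSize(1.0); // 1000
```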
But the AI was not so adept at building the decision tree. When I reported back to Claude that "The decision boundary is not printed to the screen", Claude returned with an artifact that displayed absolutely nothing inside the plot area. On Claude's third try, the code crashed outright:

The reference to "tf" suggested that Claude was using the TensorFlow library for some part of the calculation. TensorFlow is a library meant for building neural networks, not decision trees, so I tried the instructions: "Don't use tensorflow for the decision tree but write a simple decision tree algorithm from scratch". No dice: "Error: Error in updatePlot: ReferenceError: n_samples is not defined".

Step-by-Step Instruction
It seems Claude is stuck in a code cul-de-sac: the code doesn't do what Claude claims it does, and when Claude tries to fix the existing code, the AI fails. Repeatedly.
This is not the first time on this journey that Claude has ended up in a dead end. In last month's post, I discussed how Claude could not get past the very first step of this process — plotting points from a 2D Gaussian. The solution that time was to simplify the instructions. Instead of explaining to Claude why I wanted to build this ML teaching tool, I had to explicitly tell Claude exactly what to build. one. small. step. at. a. time.
So, apparently "Train a decision tree ... and draw the decision tree ... boundaries" is too big a jump for Claude. Let's simplify by asking Claude not to build a full decision tree but rather just the first step of the decision tree algorithm (sometimes called a decision stump 😏). And furthermore, I won't use the words "decision stump" but will instead explicitly state exactly how to build one:

And hooray! It worked.
Now we can build up the tree little by little:

Which also works:
Let's build the tree still taller...

Which again works (but is hard to read due to the horrible color choices).
Wow! Improve the color scheme, add a few sliders to give the user more control over how they build the decision tree, and this could be a useful tool!

Much better!
You can actually see the division lines. And as an added bonus, you can also see the order in which the divisions were added, because darker lines were drawn first.
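My guess is that the artifact fades each stroke based on when its split was drawn, something like the sketch below (pure speculation -- I never looked at this part of Claude's code):

```typescript
// Speculative sketch of the darker-lines-first effect: earlier splits get
// more opaque strokes, later splits progressively fainter ones.
function splitOpacity(splitIndex: number, totalSplits: number): number {
  // splitIndex 0 is the first split drawn (fully opaque); the last fades to 0.2.
  return 1 - 0.8 * (splitIndex / Math.max(totalSplits - 1, 1));
}
```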
Although, looking closer, I'm not sure why Claude's artifact draws the faintest line; it doesn't look like there are any datapoints at all in the central box! And when I tried to cherry-pick a better picture by hitting the "Generate New Distributions" button, I found that Claude's decision tree artifact almost always makes questionable subdivisions in the last few steps. Unfortunately, I did not notice this shortcoming at the time.
Instead, I decided that this tool met the basic functional requirements I had set for myself. I had a tool that could demonstrate how a decision tree works! (Indeed, I used output from later versions of the tool as the pictures for the section on decision trees above, proving that this tool can be used to teach machine learning :-)
Now it was time to polish the tool. Join me next time as I attempt to convince Claude to paint regions by color and add sliders to control both the number of classes and the minimum number of datapoints in any one leaf of the decision tree.
Will Claude be able to polish this rough gem of a prototype into a fully functional teaching aid?
(Spoiler Alert... No)