




An experiment
(with questionable results)
Slop is all about human attention. LLM-generated code is slop when no person owns it, understands it, and has verified it works.
If we can quantify how much attention a given piece of software needs (attention cost), and measure how much attention it received (attention spent), we can calculate how "sloppy" it is.





We use a project's git history and GitHub activity to estimate the two attention quantities above. For attention cost, we look at the codebase's historical size. For attention spent, we look at "signals of human interaction" such as commits and PR comments.
A week-by-week slop score is then calculated from the estimates.
Weeks that see significant amounts of code added, with disproportionately little human activity, increase the sloppiness of the project. Weeks with high human activity and few (or negative) code additions reduce it.
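The weekly scoring idea can be sketched roughly as follows. This is an illustrative toy, not the actual algorithm: the `Week` fields, the `signals_per_line` rate, and the linear cumulative formula are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Week:
    lines_added: int    # net code added this week (proxy for attention cost)
    human_signals: int  # commits, PR comments, reviews (proxy for attention spent)

def slop_score(weeks, signals_per_line=0.01):
    """Cumulative sloppiness over a project's history.

    Each week, the code added implies some amount of human attention it
    "costs"; signals actually observed pay that cost down. Weeks with lots
    of code and little activity raise the score, weeks with high activity
    and little new code lower it. The score never goes below zero.
    """
    score = 0.0
    for w in weeks:
        expected = w.lines_added * signals_per_line  # attention the new code needs
        score += expected - w.human_signals          # unmet cost accumulates as slop
    return max(score, 0.0)
```

For example, a week with a 5,000-line code drop and a single commit scores high, while a quiet week full of review activity pulls the total back down.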
Unreliable, unfortunately.
For many repos they're plausible, but for just as many they're clearly incorrect.
Accuracy depends heavily on having enough human interaction signals: a feature developed behind closed doors and then code-dropped all at once comes with very few signals attached. To the algorithm it looks indistinguishable from a ladleful of steaming LLM slop.
Other factors can also throw off the estimates: for example, vendored dependencies kept in non-standard folders, or large code files holding configuration or demo data.
The algorithm does try to account for some of these exceptions, but it seems that, ultimately, the two measures we're using are just too indirect.
Hop over to GitHub, open a PR, and you'll get (pending approval) a preview environment where you can test your theories.
Visit pscanf's blog for a more detailed explanation and analysis of the experiment.