Fork on GitHub

Can We Measure Software Slop?

An experiment (with questionable results)

The Idea

Slop is all about human attention. LLM-generated code is slop when no person owns it, understands it, and has verified it works.

If we can quantify how much attention a given piece of software needs (attention cost), and measure how much attention it received (attention spent), we can calculate how "sloppy" it is.
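As a rough sketch of the idea (the formula and units here are illustrative assumptions, not the project's actual definition), sloppiness could be the share of needed attention that was never spent, mapped onto the 0–5 scale below:

```python
def sloppiness(attention_cost: float, attention_spent: float) -> float:
    """Fraction of required attention that was never paid, scaled to 0-5.

    attention_cost  : estimated attention the software needs (illustrative units)
    attention_spent : estimated attention it actually received (same units)
    """
    if attention_cost <= 0:
        return 0.0  # nothing to own or understand, nothing to be sloppy about
    unpaid = max(attention_cost - attention_spent, 0.0)
    return 5.0 * unpaid / attention_cost
```

A fully attended codebase scores 0; one that received no attention at all scores 5.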

The Scale

The scale runs from 0 to 5:

- Buttered Noodles: Slop? Too dated to have any
- Spaghetti Bolognese: A little sauce never hurt
- Lasagna: Lots of slop, but with structure
- Sloppy Joe: Containment is just pretense
- Just The Slop: Pure. Proud. Let it flow

The Algorithm

We use a project's git history and GitHub activity to estimate the two attention quantities above. For attention cost, we look at the codebase's historical size. For attention spent, we look at "signals of human interaction" such as commits and PR comments.

A week-by-week slop score is then calculated from the estimates.

Weeks that see significant amounts of code added, with disproportionately little human activity, increase the sloppiness of the project. Weeks with high human activity and few (or negative) code additions reduce it.
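The week-by-week logic above can be sketched roughly as follows. Everything here is an illustrative assumption (the conversion rates, the damping factor, the clamping) rather than the project's actual code; it only shows the shape of the calculation:

```python
def weekly_slop_delta(lines_added: int, human_signals: int) -> float:
    """How much one week shifts the slop score.

    lines_added   : net lines of code added that week (proxy for attention cost)
    human_signals : commits, PR comments, reviews, etc. (proxy for attention spent)
    """
    cost = max(lines_added, 0) / 100  # assumption: 100 added lines ~ 1 attention unit
    spent = float(human_signals)      # assumption: each signal ~ 1 attention unit
    return (cost - spent) * 0.1       # damped so no single week dominates


def slop_score(weeks: list[tuple[int, int]]) -> float:
    """Accumulate weekly deltas, clamped to the 0-5 scale."""
    score = 0.0
    for lines_added, human_signals in weeks:
        score = min(max(score + weekly_slop_delta(lines_added, human_signals), 0.0), 5.0)
    return score
```

A big code drop with no discussion pushes the score up; a quiet week of reviews and deletions pulls it back down.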

The Results

Unreliable, unfortunately.

For many repos they're plausible, but for just as many they're clearly incorrect.

Accuracy depends heavily on having enough human interaction signals: a feature developed behind closed doors and then code-dropped all at once comes with very few signals attached. To the algorithm it looks indistinguishable from a ladleful of steaming LLM slop.

Other factors can also throw off the estimates, for example vendored dependencies kept in non-standard folders, or large code files holding configuration or demo data.

The algorithm does try to account for some of these exceptions, but it seems that, ultimately, the two measures we're using are just too indirect.

Can You Make a Better Algorithm?

Hop over to GitHub, open a PR, and you'll get (pending approval) a preview environment where you can test your theories.

Read The Full Story

Visit pscanf's blog for a more detailed explanation and analysis of the experiment.