I agree with this, but I wouldn't say LOC being an anti-metric has been the consensus for 40 years. Lots of your quotes are from 20 years ago because that's the last time the lesson was forgotten.
20 years is about the amount of time it takes for the pain of an old mistake to fade so I'm not surprised the lesson is being forgotten now.
I'm confident the lesson will be learned again, but I expect the failures to be more consequential this time because AI is an amplifier, and it's especially good at amplifying failure.
I like your "Comprehension coverage" metric. This is especially important for microservices requiring high availability. Error handling and attribution, backoff, retries, failover: all of it needs to be understood and manually tested. I can't imagine outsourcing the thinking of a distributed system to an AI. It's a good final lint check, but not an author.
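To illustrate the point: even the "boring" retry logic embeds judgment calls (attempt cap, backoff base, jitter, when to give up and fail over) that the author has to understand, not delegate. A minimal sketch, with a hypothetical `call_with_retries` helper, not anyone's production code:

```python
import random
import time

def call_with_retries(op, max_attempts=4, base_delay=0.1, max_delay=2.0):
    """Retry `op` with capped exponential backoff and full jitter.

    Every constant here is a design decision someone must be able to
    defend: too few attempts drops requests, too many amplifies load
    on an already-struggling dependency.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return op()
        except Exception:
            if attempt == max_attempts:
                raise  # out of retries: the caller decides how to fail over
            # full jitter: sleep a random fraction of the capped backoff window
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
            time.sleep(delay)

# A transient failure that succeeds on the third call.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

result = call_with_retries(flaky, base_delay=0.01)
```

Whether the exception is retryable at all, and whether the backoff should be coordinated across callers, are exactly the kinds of questions a final lint check won't ask.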
I can understand Microsoft, Meta, and Anthropic talking about lines of code written using AI, because they sell or publish products related to AI-assisted code generation. For everybody else not in the business of AI copilots, focusing on AI-generated lines of code as a metric is really a proxy for "we too are using the latest, greatest tools available in software engineering, like the coolest teams out there". However, as you explain, focusing on lines of code signals a misunderstanding of what is actually meaningful in software engineering, which, in addition to being unfocused at the business level, also erodes trust in the engineering leadership itself.
Low trust, a high volume of AI-generated code, and a low understanding of what that code is really doing may be the real AI bubble we are about to watch burst.
In simple terms, LOC is an output, while meeting requirements is an outcome. The number of lines of code written, manually or otherwise, is irrelevant; speed and quality are what matter.
The % of tests passed per cycle is a better metric.