
I am dreading our LLM-written incident report future


The other day, Reginald Braithwaite posted the following toot. For posterity, I’ve also included my own response to it:

Braithwaite’s post is dripping with sarcasm, but make no mistake: incident reports written entirely by LLMs are coming. And I am not looking forward to this future.

Before I dive in here, I want to note that there is a lot of toil you need to do in order to gather the data you need to write a good incident report, and LLMs can help significantly reduce that toil. I’ve got no issues there. But there’s a world of difference between using LLMs to help you assemble the ingredients involved in writing an incident report, and using an LLM to actually write the report itself.

Braithwaite’s post is horrifying to me precisely because of the seduction of the LLM as a tool for generating an incident report. After all, you can just ask it to write the report, and it’ll do it. And that’s exactly what scares me.

There’s a famous quote by the cartoonist Dick Guindon: “Writing is Nature’s way of showing you how sloppy your thinking is.” You might think you understand a concept, but it’s only when you put metaphorical pen to paper, when you actually try to explain the concept in written words to a potential reader, that you realize how fuzzy your understanding actually is. Writing in your own words forces you to confront how much you actually understand what it is that you’re writing about. Or, as Leslie Lamport put it, “If you’re thinking without writing, you only think you’re thinking.”

Having an LLM generate the text of an incident write-up bypasses this thinking step. Now there’s no human in the loop of the writing process who has to confront whether the explanation is actually consistent with the evidence that they’ve gathered. Instead, what you get is an explanation of what happened that sounds plausible to someone who is not intimately familiar with the details. They might read it, nod along, and think, “yes, that makes sense.” But the LLM may have invented couplings between systems that aren’t there, and may have missed critical interactions that were actually part of the incident. And because nobody did the hard work of actually synthesizing the data to do the write-up, nobody will notice. After all, if you’re trying to reduce the writing effort, how much effort are you really going to put into checking the LLM’s work?

In my view, LLM-generated incident write-ups are more dangerous than using LLMs for coding or for AI SRE style tasks. For coding tasks, there’s always a testing step to check that the code exhibits the desired behavior, even if nobody looks at the code itself for meaningful details. For AI SRE tasks, either the LLM output helps you resolve the incident, or it doesn’t. In both cases, Nature is the ultimate arbiter of the LLM output.

But incident write-ups aren’t like that. The consequences of a poor report aren’t immediately apparent the way the consequences of incorrect code or an incorrect operational diagnosis are in the moment. Instead, we get incident reports that have a superficially correct form, but are actually incorrect, with no obvious test for correctness.

And, because incident reports are time-consuming to write, the temptation to use AI tools to generate them will be overwhelming. But these LLMs will not go around talking to people who were involved in the incident. These reports will be simulacra; they will have the right form, but they will not provide readers with genuine insights into the nature of the system. The amount of learning will be significantly curtailed.

And, yes, people will probably use AI to summarize them as well.

It’s not a future I’m looking forward to.



huskerboy
15 hours ago
Seattle

Seattle about to get a ‘Parabolic’ heat wave?


(Non-math majors: “What?”)

(Math majors: … “Ummm, yeah Scott… What?”)

(CAUTION AHEAD: This post is rated “M” for… “Math”.  Yes, this forecast uses obscure math vocabulary. If you were thinking you were coming to Facebook explicitly to avoid having to think about — or learn about — math, here is the TL;DWM  (“Too Long; Don’t Wanna Math”) version: It’s going to get hot on Sunday and Monday, then it’s not. Thanks for stopping by!)

As you’ve probably heard, it’s going to get hot around Seattle. Yes, avoiding record heat is likely the preferable choice for a vast majority of people, especially in a region where air conditioning is about as abundant as Capitol Hill street parking.

BUT! If you were going to have to endure a heat wave, this is about the best version you can hope for, as there are a number of ways it could have been worse.

AND SOMEHOW YOU’RE GOING TO TELL ME USING ALGEBRA OR SOMETHING?

Maybe more like geometry.

First, the forecast: We’re already into the sunny phase, now it’s just the temperature climb. Highs on Friday will reach the low-mid 70s around Seattle, then reach close to 80 on Saturday.

As a ridge of high pressure intensifies in the region over the weekend, we’ll start to bring in some warm, easterly winds that are the key to our local heat waves. The air undergoes a dual process of heating and drying as it blows over and then sinks down the western slopes of the Cascades, bringing a California/desert-like heat.

That heating engine starts to rev up on Sunday, with highs reaching the mid-upper 80s, and peaks on Monday with highs now pegged to reach the low 90s around Seattle; hotter toward the foothills and a little cooler along the coast.

91 DEGREES IN SEATTLE SEEMS PRETTY YUCK. HOW WAS THIS BETTER?

By my count, there are at least five items in our favor with this heat wave that will help ease the impacts.

1- It’s a dry heat. With the easterly wind component, it won’t be humid at all, so your sweat will be more effective at keeping you cool (especially if, say, you’re playing soccer in front of billions of fans…)

2- While there will still be some fire danger with the hot and dry winds, it certainly helps that we had all that rain just a few days ago and have had occasional rain the past 6 weeks as opposed to a weeks-long bone dry streak leading into this heat.

3- It could have been hotter. The initial long range models were suggesting perhaps mid to even upper 90s were in the realm for the peak of the heat (some Euro ensemble models had 101-103!). We’ve already seen other heat waves around the planet in the first few months go well beyond the norm with the record-obliterating heat in the Desert Southwest at the end of March and the intense heat wave in Northern Europe a few weeks ago.  We still have scars from June 2021 around here. But this will NOT be that kind of heat. (Though I do give kudos to the models for correctly sleuthing out a heat wave pattern for us 11-12 days in advance!)

4- There are places to escape the heat that aren’t far. It’ll still be warm on the coast on Sunday and near the shorelines, but more like upper 70s to low 80s.

And 5: It’s not going to last very long. Really, it’s just one day of the hottest heat. And then unlike stubborn Heat Domes that last for days, this ridge breaks down pretty quick and marine breezes start to cool us off as soon as Tuesday, increasing their cooling power through the week.

In fact, if you were to chart the high temperature forecast this week on a graph, it would look much like a parabolic curve (aaah! Math!)

What’s a parabolic curve? Essentially in this case, it’s an upside-down “U”.

The high temperatures gradually ramp up, peak, and then ramp down almost as fast.

The true math majors would look at the forecast and say it’s not *truly* parabolic — for that to happen the highs before the heat wave would have to equal the highs behind it. Right now we’re at 72-80-86-91-83-76-73. But since when do we have to be exact with math?
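For anyone who does want to be a little exact, here is a rough sketch of how you might check the "almost a parabola" claim against the seven forecast highs quoted above. It compares the highs at equal distances on either side of the peak, then fits an upside-down "U" whose top is pinned at the hottest day. The function names `mirror_gaps` and `fit_vertex_parabola` are made up for this illustration, and pinning the vertex at the peak is a simplifying assumption, not how any forecast model works.

```python
# Forecast highs quoted in the post, one per day, in order.
highs = [72, 80, 86, 91, 83, 76, 73]

# Position of the hottest day in the list (0-based).
peak = highs.index(max(highs))


def mirror_gaps(values, center):
    """High d days before the peak minus the high d days after it, for d = 1, 2, ...

    A true parabola centered on the peak would give all zeros.
    """
    span = min(center, len(values) - 1 - center)
    return [values[center - d] - values[center + d] for d in range(1, span + 1)]


def fit_vertex_parabola(values, center):
    """Least-squares 'a' for y = top - a * (x - center)**2.

    Assumption for this sketch: the vertex is pinned at the observed peak,
    so only the steepness 'a' is fitted.
    """
    top = values[center]
    num = sum((i - center) ** 2 * (top - v) for i, v in enumerate(values))
    den = sum((i - center) ** 4 for i in range(len(values)))
    return num / den


gaps = mirror_gaps(highs, peak)
a = fit_vertex_parabola(highs, peak)
fitted = [round(highs[peak] - a * (i - peak) ** 2, 1) for i in range(len(highs))]

print("before-minus-after gaps:", gaps)
print("steepness a:", round(a, 2))
print("fitted highs:", fitted)
```

The gaps come out as 3, 4, and -1 degrees: close to the zeros a perfect parabola would give, but not quite, which is the math majors' point.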

But more importantly, the forecast is mimicking a parabola, and not a flat ridge of days-long heat. The Monday Belgium-Egypt World Cup match will indeed be broiling. But by the time the USA match against Australia arrives next Friday, we should be back down in the 70s — perfect weather to turn that parabolic “frown” upside down 🙂

huskerboy
7 days ago
Seattle

Grappling With the Existential AI Threat


Charity Majors, writing about how high-performing engineering teams are dealing with the transition from pre-AI to AI-native development: AI enthusiasts are in a race against time, AI skeptics are in a race against entropy.

This is not a situation where one side is right and the other is huffing paint. (O, that it were!) Each side is grappling with a real, alarming, escalating threat to the company’s existence, and the closer they look the more (again: real, alarming) evidence they find.

The enthusiasts are not wrong. We are starting to see real, non-imaginary, discontinuous leaps in capabilities from teams that lean in hard to working with AI. And this does not feel like a normal technology cycle where you can wait for the dust to settle; teams that sit this out while competitors are hustling could be out of business before the dust settles. That’s a real, existential threat.

The skeptics are also not wrong. When you ship code faster than engineers can read it, in domains where nobody has full context, you are making withdrawals from a trust account that took years to build. Reliability degrades, institutional knowledge evaporates. You end up with systems nobody understands, products burbling into incoherence, and on-call rotations that grind people up and spit them out. That is ALSO a real existential threat.

She goes on to say that “the wins and costs are happening to two different groups of people. There is no natural feedback loop.” Interesting read.

Tags: artificial intelligence · Charity Majors · programming

huskerboy
11 days ago
Seattle

AI enthusiasts are in a race against time, AI skeptics are in a race against entropy (xpost)


Both sides are grappling with a real existential threat, and both sides feel like they are screaming into the void. There is a way to close the gap and get everyone pulling in the same direction. Cross-posted from Substack.

huskerboy
11 days ago
Seattle

The demon of the gaps

Mephistopheles (a medieval demon from German folklore) flying over Wittenberg, in a lithograph by Eugène Delacroix.

Modern software systems contain within them a mind-boggling level of complexity. As software engineers, we make this complexity manageable through techniques like decomposition, information hiding, and abstraction. We endeavor to break our systems up into components that interact over well-defined interfaces. By doing this, the surface exposed to individual software engineers is dramatically reduced: no individual has to understand how the entire complex system works in order to contribute to their system. Instead, each software engineer needs to understand only the individual component that they work on, along with the interfaces of the other components that they interact with. Decomposition is synonymous with analysis, where you study a larger thing by breaking it up into smaller pieces that are more amenable to understanding.

You can see this strategy of complexity management in action in microservice architectures. An engineer needs to understand the service that their team owns, and the interfaces of the services that their team calls out to. This architecture effectively bounds the information that an engineer needs in order to work effectively. Microservice architectures aren’t there for scaling the software itself; they’re there for scaling the software organization.

Unfortunately, when the system breaks down, this complexity management strategy itself breaks down. Just as hurricanes don’t respect political boundaries, system failures don’t respect component boundaries. Yes, sometimes the problem in a software system is limited to the failure of a single component. Those are the easiest cases to diagnose and mitigate. However, the hairy incidents are the ones that arise due to unexpected interactions across components. Maybe you have several services that are throwing errors, or maybe none of the services are throwing errors but customers are still seeing incorrect behavior. There’s no obvious change that correlates with the start of impact, or maybe you don’t even know when the impact started because the customer impact isn’t reflected in your existing metrics.

When you’re in the throes of an incident that involves an unexpected interaction, this architecture that was built for managing complexity now works against you. Because you’ve built an analysis solution but you’re now faced with a synthesis problem. You need to understand how the pieces all normally fit together to function in order to determine what is going wrong with the system right now. You’ve optimized to avoid requiring anybody to understand how the whole thing works, but now the whole thing isn’t working, and no one person knows how the whole thing works.

The job of the incident responders is to collectively figure out how to do that synthesis. You’ve brought together a group of people who each understand the functions of different components of the system, and you need to work together to build enough of an understanding of how the system functions to debug what’s going wrong. As an ad hoc team, the incident responders have to move up and down the abstraction hierarchy to figure this out.

This sort of in-the-moment reconstruction of system function from component parts is an essential part of incident response for the most complex incidents, but it’s rarely treated as first-class work that’s worthy of study and support. The recent book Crisis Engineering by Marina Nitze, Matthew Weaver, and Mikey Dickerson is the exception that proves the rule: they do discuss the work of building a model of the system during a crisis to help figure out what’s gone wrong. But I struggle to recall any other guidance I’ve read about incident response that talks about how to prepare for doing this sort of work. It’s important work, and it’s difficult, and the ability to do it well can have a huge impact on the time it takes to mitigate the hardest incidents. This is stuff that even the best individual humans struggle with, because it involves a group of humans working together effectively, with each person having a partial model of the system. And I don’t think AI SRE tools are going to save us here: if the best humans struggle, the AIs will too. We need to figure out how to get better at this collectively. Like so many things, it’s a coordination problem.



huskerboy
12 days ago
Seattle

Cloudflare CEO says AI bots now exceed human traffic on the web

Over a year earlier than anticipated; currently 57% to 43% by their metrics.
huskerboy
15 days ago
Seattle