Making such an agent event-driven is important, as it can be tied to services like Azure’s Security Graph. By using current Azure security policies as a best practice, it’s able to compare the current state with what it should be, informing users of issues and performing basic remediations in line with Azure recommendations. For example, it can update web servers to a new version of TLS, ensuring that your applications remain online.
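The comparison of current state against a policy baseline can be illustrated with a small sketch. This is not the agent's implementation or any Azure API; the names `Finding`, `find_tls_drift`, and the resource inventory are hypothetical, and the inventory is assumed to be a simple mapping of resource name to its configured minimum TLS version.

```python
from __future__ import annotations

from dataclasses import dataclass

# Known TLS versions, oldest to newest, used only for ordering.
TLS_ORDER = ["1.0", "1.1", "1.2", "1.3"]


@dataclass
class Finding:
    """One deviation between a resource's setting and the policy (hypothetical type)."""
    resource: str
    current: str
    required: str


def find_tls_drift(resources: dict[str, str], minimum: str) -> list[Finding]:
    """Return a finding for every resource whose TLS version is below the policy minimum."""
    floor = TLS_ORDER.index(minimum)
    return [
        Finding(resource=name, current=version, required=minimum)
        for name, version in resources.items()
        if TLS_ORDER.index(version) < floor
    ]


if __name__ == "__main__":
    # Assumed inventory; a real agent would read this from the platform.
    inventory = {"web-1": "1.0", "web-2": "1.2", "web-3": "1.3"}
    for finding in find_tls_drift(inventory, minimum="1.2"):
        print(f"{finding.resource}: TLS {finding.current} is below required {finding.required}")
```

Each finding could then be reported to a user or passed to a remediation step that raises the TLS version.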
Events can be sourced from Azure tools like Monitor, pulling alert details to drive an automated root-cause analysis. Because the agent is designed to work with known Azure data sources, it’s able to use these to detect exceptions and then determine the possible cause, reporting its conclusions back to the site reliability engineer on duty. This gives the engineer not only an alert but a place to start investigations and remediations.
There is even the option of handling basic fixes once they’re approved by a site reliability engineer. The list of approved operations is sensibly small, including triggering scaling, restarting, and, where appropriate, rolling back changes.
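One way to picture this approval gate is as a short allow-list checked before any fix runs. The sketch below is an assumption about how such a gate might look, not the agent's actual design; `Operation`, `execute_fix`, and the `approved_by` parameter are hypothetical names.

```python
from enum import Enum
from typing import Optional


class Operation(Enum):
    """The small set of fixes named in the text (hypothetical enum)."""
    SCALE = "scale"
    RESTART = "restart"
    ROLLBACK = "rollback"


def execute_fix(operation: str, approved_by: Optional[str]) -> str:
    """Run a fix only if it is on the allow-list and an engineer has approved it."""
    try:
        op = Operation(operation)
    except ValueError:
        raise PermissionError(f"{operation!r} is not an approved operation") from None
    if not approved_by:
        raise PermissionError("a site reliability engineer must approve the fix first")
    # A real agent would call the platform here; this sketch only records the decision.
    return f"{op.value} approved by {approved_by}"


if __name__ == "__main__":
    print(execute_fix("restart", approved_by="alice"))
```

Keeping the allow-list as a closed enumeration means that anything outside scaling, restarting, and rolling back is rejected by default rather than by a special case.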