
Step-by-Step Guide to AI Agent Development Using Microsoft Agent-Lightning


In this tutorial, we walk through setting up an advanced AI agent using Microsoft's Agent-Lightning framework. We run everything directly inside Google Colab, which means we can experiment with both the server and client components in one place. By defining a small QA agent, connecting it to a local Agent-Lightning server, and then training it with several system prompts, we can observe how the framework supports resource updates, task queuing, and automated evaluation. Check out the FULL CODES here.

!pip -q install agentlightning openai nest_asyncio python-dotenv > /dev/null
import os, threading, time, asyncio, nest_asyncio, random
from getpass import getpass
from agentlightning.litagent import LitAgent
from agentlightning.trainer import Trainer
from agentlightning.server import AgentLightningServer
from agentlightning.types import PromptTemplate
import openai
if not os.getenv("OPENAI_API_KEY"):
    try:
        os.environ["OPENAI_API_KEY"] = getpass("🔑 Enter OPENAI_API_KEY (leave blank if using a local/proxy base): ") or ""
    except Exception:
        pass
MODEL = os.getenv("MODEL", "gpt-4o-mini")

We begin by installing the required libraries and importing all of the core modules we need for Agent-Lightning. We also set up our OpenAI API key securely and define the model we'll use for the tutorial.

class QAAgent(LitAgent):
    def training_rollout(self, task, rollout_id, resources):
        """Given a task {'prompt':..., 'answer':...}, ask the LLM using the
        server-provided system prompt and return a reward in [0, 1]."""
        sys_prompt = resources["system_prompt"].template
        user = task["prompt"]; gold = task.get("answer", "").strip().lower()
        try:
            r = openai.chat.completions.create(
                model=MODEL,
                messages=[{"role": "system", "content": sys_prompt},
                          {"role": "user", "content": user}],
                temperature=0.2,
            )
            pred = r.choices[0].message.content.strip()
        except Exception as e:
            pred = f"[error]{e}"
        def score(pred, gold):
            P = pred.lower()
            base = 1.0 if gold and gold in P else 0.0       # exact containment
            gt = set(gold.split()); pr = set(P.split())
            inter = len(gt & pr); denom = (len(gt) + len(pr)) or 1
            overlap = 2 * inter / denom                     # token-set Dice overlap
            # Small bonus for short, correct answers (weights/threshold illustrative)
            brevity = 0.2 if base == 1.0 and len(P.split()) <= 8 else 0.0
            return max(0.0, min(1.0, 0.7 * base + 0.25 * overlap + brevity))
        return score(pred, gold)

We define a simple QAAgent by extending LitAgent, where we handle each training rollout by sending the user's prompt to the LLM, collecting the response, and scoring it against the gold answer. We design the reward function to check correctness, token overlap, and brevity, encouraging the agent to produce concise, accurate outputs.
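As a standalone illustration, here is a sketch of the same reward idea with the LLM call stubbed out; the exact weights and the brevity threshold are illustrative assumptions, not values fixed by the framework:

```python
def score(pred: str, gold: str) -> float:
    """Reward in [0, 1]: exact containment, token-set overlap, brevity bonus."""
    P, G = pred.lower(), gold.lower()
    base = 1.0 if G and G in P else 0.0            # gold answer appears verbatim
    gt, pr = set(G.split()), set(P.split())
    denom = (len(gt) + len(pr)) or 1
    overlap = 2 * len(gt & pr) / denom             # Dice coefficient of token sets
    brevity = 0.2 if base == 1.0 and len(pr) <= 8 else 0.0
    return max(0.0, min(1.0, 0.7 * base + 0.25 * overlap + brevity))

print(score("Paris", "Paris"))                          # terse and correct → full reward
print(score("I believe it might be London", "Paris"))   # wrong → zero reward
```

A terse exact match saturates the reward at 1.0, while a wrong answer with no token overlap scores 0.0; partially overlapping answers land in between.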

TASKS = [
   {"prompt":"Capital of France?","answer":"Paris"},
   {"prompt":"Who wrote Pride and Prejudice?","answer":"Jane Austen"},
   {"prompt":"2+2 = ?","answer":"4"},
]
PROMPTS = [
   "You are a terse expert. Answer with only the final fact, no sentences.",
   "You are a helpful, knowledgeable AI. Prefer concise, correct answers.",
   "Answer as a rigorous evaluator; return only the canonical fact.",
   "Be a friendly tutor. Give the one-word answer if obvious."
]
nest_asyncio.apply()
HOST, PORT = "127.0.0.1", 9997

We define a tiny benchmark with three QA tasks and curate several candidate system prompts to optimize. We then apply nest_asyncio and set our local server host and port, allowing us to run the Agent-Lightning server and clients inside a single Colab runtime.

async def run_server_and_search():
    server = AgentLightningServer(host=HOST, port=PORT)
    await server.start()
    print("✅ Server started")
    await asyncio.sleep(1.5)
    results = []
    for sp in PROMPTS:
        await server.update_resources({"system_prompt": PromptTemplate(template=sp, engine="f-string")})
        scores = []
        for t in TASKS:
            tid = await server.queue_task(sample=t, mode="train")
            rollout = await server.poll_completed_rollout(tid, timeout=40)  # waits for a worker
            if rollout is None:
                print("⏳ Timeout waiting for rollout; continuing...")
                continue
            scores.append(float(getattr(rollout, "final_reward", 0.0)))
        avg = sum(scores) / len(scores) if scores else 0.0
        print(f"🔎 Prompt avg: {avg:.3f}  |  {sp}")
        results.append((sp, avg))
    best = max(results, key=lambda x: x[1]) if results else ("", 0)
    print("\n🏁 BEST PROMPT:", best[0], " | score:", f"{best[1]:.3f}")
    await server.stop()

We start the Agent-Lightning server and iterate through our candidate system prompts, updating the shared system_prompt before queuing each training task. We then poll for completed rollouts, compute average rewards per prompt, report the best-performing prompt, and gracefully stop the server.
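The search structure itself is independent of the live server. As a minimal sketch, with a stand-in evaluator replacing the real rollout reward (the `evaluate` function below is purely hypothetical), the select-the-best-prompt loop looks like this:

```python
# Stand-in evaluator: in the real loop this value comes from the
# rollout's final_reward; here we simply favor shorter system prompts.
def evaluate(system_prompt: str, task: dict) -> float:
    return 1.0 / (1 + len(system_prompt.split()))

PROMPTS = [
    "You are a terse expert. Answer with only the final fact, no sentences.",
    "Be a friendly tutor. Give the one-word answer if obvious.",
]
TASKS = [{"prompt": "Capital of France?", "answer": "Paris"}]

results = []
for sp in PROMPTS:
    scores = [evaluate(sp, t) for t in TASKS]
    avg = sum(scores) / len(scores) if scores else 0.0
    results.append((sp, avg))

best = max(results, key=lambda x: x[1])
print(best[0])
```

Swapping `evaluate` for the server round-trip (queue_task, then poll_completed_rollout) recovers the loop above without changing the averaging or selection logic.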

def run_client_in_thread():
    agent = QAAgent()
    trainer = Trainer(n_workers=2)
    trainer.fit(agent, backend=f"http://{HOST}:{PORT}")
client_thr = threading.Thread(target=run_client_in_thread, daemon=True)
client_thr.start()
asyncio.run(run_server_and_search())

We launch the client in a separate thread with two parallel workers, allowing it to process tasks dispatched by the server. At the same time, we run the server loop, which evaluates different prompts, collects rollout results, and reports the best system prompt based on average reward.
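The server/worker hand-off can be pictured with plain standard-library pieces. This is only an analogy sketch (a `queue.Queue` standing in for the HTTP task queue, threads standing in for the Trainer's workers), not Agent-Lightning's actual transport:

```python
import queue
import threading

tasks, rewards = queue.Queue(), queue.Queue()

def worker():
    # Each worker pulls tasks until it sees the shutdown sentinel (None).
    while True:
        t = tasks.get()
        if t is None:
            tasks.task_done()
            break
        rewards.put((t["prompt"], 1.0))   # pretend every rollout scores 1.0
        tasks.task_done()

workers = [threading.Thread(target=worker, daemon=True) for _ in range(2)]
for w in workers:
    w.start()

for t in [{"prompt": "2+2 = ?", "answer": "4"},
          {"prompt": "Capital of France?", "answer": "Paris"}]:
    tasks.put(t)
for _ in workers:
    tasks.put(None)                       # one sentinel per worker
tasks.join()                              # block until every item is processed

print(rewards.qsize())                    # → 2
```

The same producer/consumer shape underlies the real setup: the server enqueues tasks and collects rewards, while independent workers drain the queue concurrently.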

In conclusion, we see how Agent-Lightning lets us create a flexible agent pipeline with only a few lines of code. We can start a server, run parallel client workers, evaluate different system prompts, and automatically measure performance, all within a single Colab environment. This demonstrates how the framework streamlines building, testing, and optimizing AI agents in a structured way.




Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence media platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among audiences.
