LangChain agents tutorial: build a multi-step workflow in Python

Last Updated on June 12, 2026

A chatbot that answers questions from a knowledge base sounds impressive in a demo. Then someone asks it a question that requires checking two sources, comparing the results, and deciding whether to follow up. The bot returns a confident, single-source answer that misses half the problem. It could retrieve information, but it could not decide what to do with it.

This is where most “agent” tutorials fall short. They show how to wire an LLM to a tool and call it agentic. The real challenge is different: designing a system that chooses actions, evaluates results, and navigates a multi-step workflow in Python without hard-coding every path.

What needs to be understood is control flow and decision-making, not just syntax. The concepts that matter are how agents select tools, when they stop, how they recover from bad results, and what separates autonomous decision-making from a glorified function call.

This piece covers what LangChain agents actually are, when to use them, how to build a simple agentic workflow, and what breaks in practice. One clear stance up front: if your workflow has no real decisions to make, an agent is usually the wrong abstraction.

Why LangChain agents are more than just LLM extensions

A common misconception treats LangChain agents as prompt wrappers with tool access. Add a search function, give the model a system message, and the application is “agentic.” That framing misses what actually changes when an application becomes an agent.

An agentic system does something fundamentally different from a static chain. It can choose among tools at runtime based on context. It can maintain state across steps. It can decide whether to continue, retry, or stop. It can coordinate with external systems like databases, APIs, or retrieval pipelines.

These are not cosmetic differences. They change the architecture. A fixed chain executes steps in order. An agent evaluates the situation and picks the next step. That distinction is what makes agent orchestration useful for workflows where the path is not predictable in advance.

A langchain agents tutorial that only covers model invocation misses this. The real lesson is in the control flow: how the agent decides, what information it carries forward, and what happens when a tool returns something unexpected. Building an AI agent in Python means designing these decision points, not just importing a class.

The foundational role of LangChain’s unified message format

LangChain introduces messages as the unit of communication between every component in the system. This sounds abstract until you try to build a multi-step workflow in Python without it.

Think of messages like standardized shipping containers. A container ship, a train, and a truck can all move the same container because the shape is predictable. LangChain messages work the same way. Whether the message comes from the user, the system prompt, the model’s response, or a tool’s output, it follows a consistent structure. That consistency is what makes chaining, branching, and tool invocation reliable across steps.

The practical benefits are concrete:

Easier tool invocation. The model’s output can be parsed into a tool call and the tool’s result feeds back as a message the model can process.
Clearer history. Every interaction is recorded in the same format, making it easier to inspect what happened and why.
Less brittle prompt glue code. Without a standard format, developers end up writing custom string concatenation at every step. That breaks fast.
Better debugging. When something goes wrong at step four of a six-step workflow, a uniform message log makes it possible to trace the failure.

LangChain supports system, human, AI, and tool message types. Each one serves a clear role in the conversation history. Understanding this format is a prerequisite for building anything that holds state across multiple steps.

What makes LangChain agents distinct

The capabilities that separate LangChain agents from simpler LLM applications are practical, not theoretical. Each one unlocks a specific kind of workflow:

Tool selection at runtime. The agent inspects the current task and picks the right tool instead of following a fixed sequence. This is what enables flexible task handling.
Multi-step reasoning. The agent can take an action, evaluate the result, and decide what to do next. Workflows that require gathering information from multiple sources depend on this.
Integration with external systems. APIs, databases, retrieval pipelines, and code execution environments can all be registered as tools.
Short-term memory and state passing. The agent carries context from one step to the next, so it does not lose track of what it has already done.
Error recovery and retries. When a tool call fails or returns unexpected output, the agent can try again or choose an alternative path.
Human-in-the-loop checkpoints. For sensitive actions, the agent can pause and request approval before proceeding.

Each of these features maps to a real build requirement. Tool selection matters when the agent handles varied user requests. Error recovery matters when external APIs are unreliable. Human checkpoints matter when the output affects customers or financial transactions.

The strategic value of autonomous decision-making in AI workflows

Consider an internal support agent that handles employee IT requests. A ticket arrives: “My VPN stopped working after the latest update.” A useful agent triages the ticket, searches the knowledge base for known issues related to the update, checks the employee’s device metadata, drafts a response with troubleshooting steps, and escalates to a human only when its confidence is low.

Without an agent, this workflow requires either a human at every step or a rigid decision tree that breaks whenever a new edge case appears. The value of LangChain agents is not that they “think like humans.” The value is that they handle branching workflows with less manual orchestration.

The business and technical outcomes are measurable:

Reduced handoffs. Fewer steps require a person to copy information between systems or make a routing decision.
Faster completion of repeatable knowledge tasks. Summarization, lookup, drafting, and classification can happen in one agent loop instead of a multi-person queue.
More resilient workflows when requirements are incomplete. The agent can ask clarifying questions or gather missing data instead of failing silently.

These outcomes matter for teams building production AI workflows. The question is not whether agents are impressive. The question is whether they reliably reduce friction in a specific process.

The reflection pattern and why it matters

The reflection pattern for agents is a repeating cycle: the agent produces a draft, evaluates it against criteria, and revises before acting or returning a result.

This is one of the first patterns that separates simple prompting from actual agent design. In a single-pass system, the model generates an answer and returns it. With reflection, the model generates an answer, critiques its own output, and produces a better version. The cycle can repeat until the result meets a quality threshold or a maximum iteration count is reached.

Reflection helps most in tasks where first-draft quality is unreliable:

Summarization. First attempts often miss key details or include irrelevant information.
Structured extraction. Pulling specific fields from unstructured text benefits from a validation pass.
Code generation. A review step catches syntax errors and logic gaps before execution.
Planning before tool use. The agent can draft a plan, evaluate whether the plan addresses the user’s actual question, and revise before executing any tool calls.

The tradeoffs are real. Reflection increases latency because the model runs multiple passes. It increases token usage and cost. For simple lookups or deterministic tasks, reflection is unnecessary overhead.

For workflows that affect customers or downstream systems, a single-pass answer is rarely good enough. Reflection is the mechanism that makes output quality controllable rather than random.

Industry case studies that show strategic value

Healthcare operations. A patient intake agent collects preliminary details from a patient portal submission, checks insurance policy rules, identifies missing information, and drafts handoff notes for clinical staff. A fixed chain would struggle because intake forms vary widely and policy rules change by plan type. The agent handles decision points like which follow-up questions to ask and when to flag a case for manual review.

Financial services. A compliance review agent processes transaction narratives, queries internal policy rules, cross-references patterns against historical flags, and surfaces suspicious activity for analyst review. The branching logic here is significant: different transaction types trigger different policy checks. Human review is still required for final disposition, but the agent reduces analyst triage time by handling initial screening.

E-commerce support. A customer service agent checks order status, retrieves return and shipping policies, evaluates whether a refund is warranted, and drafts a response. Escalation logic routes complex cases to a human. The value is handling the 70-80% of requests that follow predictable patterns while preserving human attention for exceptions.

Software engineering. A bug triage agent receives a new report, searches logs and documentation for related issues, proposes reproduction steps, and opens a draft issue with structured metadata. The decision points include whether the bug is a duplicate, what severity to assign, and whether to request more information from the reporter.

In each case, the agent handles autonomous decision-making within bounded rules. Human review remains part of the system. The agent does not replace judgment. It reduces the manual work required before judgment is needed.

Complexity and challenges: deploying LangChain agents effectively

Deployment is where enthusiasm meets edge cases. A demo agent that works on five test inputs can fail unpredictably on the sixth.

Agentic systems are harder to control because they operate in fuzzy problem spaces with variable inputs and branching paths. “If you’re afraid of fuzziness and want to have full control,” agents will feel uncomfortable. That discomfort is the point. Agents are useful precisely when hard-coded logic becomes too brittle. But they require tolerance for probabilistic behavior and investment in guardrails that keep that behavior bounded.

Effective deployment depends on understanding the problem context, not just choosing a framework. The same LangChain agent pattern that works well for internal document Q&A can fail badly in a payment-processing workflow where every action has financial consequences.

Why problem context matters more than framework syntax

Before writing any agent code, five questions shape the architecture more than the choice of prompt template:

What is the agent actually deciding? If every request maps to the same action, a simple chain is cheaper and more reliable.
Are the tools reliable and bounded? An agent calling a flaky third-party API needs different error handling than one querying a local database.
What does a failed action look like? A wrong search result is recoverable. A wrong payment transfer is not.
Are latency and cost acceptable? Multi-step agents with reflection can take 10-30 seconds per request and consume significant tokens.
Does auditability matter? Regulated industries need a clear trace of why the agent took each action.

A ticket-routing agent has different failure tolerances than a workflow that modifies customer records. Retrieval-heavy workflows need evaluation of source quality, not just agent logic. These distinctions determine whether an agent architecture is appropriate and how much infrastructure it needs around it.

Common pitfalls and how to avoid them

The agent loops or takes too many steps.
Cause: Unclear stopping criteria or weak tool descriptions that leave the agent unsure when it has finished.
Fix: Set explicit max iterations. Add clear success conditions to the system prompt. Tighten tool schemas so the agent knows what “done” looks like.

The wrong tool gets selected.
Cause: Overlapping tool purposes. If two tools have similar descriptions, the model picks semi-randomly.
Fix: Write tool definitions with non-overlapping language. Add “when to use” and “when not to use” notes. Include examples in the tool description.

Outputs look plausible but are unusable.
Cause: No validation layer between the agent’s output and the consumer of that output.
Fix: Use structured outputs with schema validation. Add post-processing checks before returning results.

Latency is too high.
Cause: Too many tool calls or reflection passes per request.
Fix: Reduce the number of steps by combining related operations. Cache retrieval results. Reserve reflection for high-value tasks only.

Costs spike unexpectedly.
Cause: Verbose context windows and repeated retries that inflate token usage.
Fix: Trim conversation history to relevant messages. Summarize state instead of passing full logs. Monitor token usage per request.

The agent fails silently.
Cause: Missing logging and observability. The agent returns a generic fallback without recording what went wrong.
Fix: Record step traces, tool inputs and outputs, errors, and final decision paths. Build observability before scaling usage.

Identifying the right use cases for LangChain agents

How do you know whether a workflow needs an agent or a simpler chain? The answer is not about complexity for its own sake. It is about whether the workflow requires decisions that cannot be hard-coded in advance.

The best use cases for LangChain agents share several characteristics:

Multiple possible actions depending on the input
Incomplete information that requires gathering before acting
External tool use (search, APIs, databases, code execution)
Variable task order that changes based on intermediate results
Useful fallback or escalation paths when the agent is uncertain

Agents are the wrong fit for:

Deterministic pipelines where every input follows the same steps
One-shot transformations like format conversion or simple summarization
Highly regulated actions with zero tolerance for ambiguity where every decision must be pre-approved and auditable to a fixed rule set

A simple decision framework

Use this as a quick filter. If the answer is “yes” to three or more of these criteria, an agent is worth considering:

Does the workflow require choosing among several tools?
Can the task branch based on intermediate results?
Does the system need to ask clarifying questions?
Is the input messy or incomplete?
Does the final answer depend on external systems or retrieved knowledge?
Would a static chain require too many hard-coded paths?

Two or fewer? A fixed chain or a prompt-only approach is likely simpler, cheaper, and more reliable.

Compare agents vs simpler alternatives

Approach	Best for	Strengths	Tradeoffs	Example
Prompt-only LLM app	Single-turn tasks with clear inputs	Simple, fast, low cost	No tool use, no state, no branching	Rewrite this paragraph in a formal tone
Fixed chain/workflow	Multi-step tasks with predictable order	Reliable, easy to test and debug	Brittle when inputs vary or steps need to change	Extract entities → classify → format output
LangChain agent	Flexible decision-making with tools	Adapts to variable inputs, selects tools dynamically	Higher latency, harder to debug, requires guardrails	Answer a question by searching docs, checking a database, and drafting a response
LangGraph or graph-based orchestration	Stateful workflows with explicit control	Inspectable state transitions, retry logic, human checkpoints	More setup, steeper learning curve	Multi-agent system with approval gates and conditional branching

The key distinction: LangChain agents offer flexibility for workflows where the path is not fully predictable. LangGraph becomes the better choice when you need explicit stateful orchestration, production-grade retry logic, and branches you can inspect and control.

From reactive to proactive: transforming workflows with LangChain agents

Reactive systems wait for a direct request. A user asks a question, the system responds. Proactive systems monitor context and prepare or trigger the next useful action without being explicitly asked.

Proactive behavior does not mean uncontrolled autonomy. It means the system recognizes patterns and initiates approved next steps. A proactive customer support agent notices that a user has asked about return shipping three times in five minutes and offers the return form before being asked. A proactive research agent recognizes that a search result contradicts a previous finding and flags the discrepancy.

The tradeoff is clear. Proactive behavior improves throughput and user experience when the triggers are well-defined. It creates confusion and distrust when the system acts on ambiguous signals. Boundaries matter more than ambition here.

Using few-shot prompting to guide proactive behavior

Few-shot prompting means showing the model examples of the behavior you want before the live task. Instead of describing the ideal response in abstract instructions, you provide two or three concrete input-output pairs.

For an agent, few-shot prompting shapes how it decides what to do next. A customer support agent shown examples of when to offer a refund path, when to check policy first, and when to escalate learns to apply similar logic to new requests. The examples function as behavioral templates.

Few-shot prompting helps an agent:

Infer when to ask follow-up questions instead of guessing at incomplete information
Suggest next actions that match the demonstrated pattern
Format tool requests consistently so downstream systems receive predictable inputs
Avoid under-acting or over-acting by calibrating the level of initiative to the examples provided

This technique guides behavior but does not replace evaluation or guardrails. An agent that learns from examples to be proactive still needs iteration limits, confidence thresholds, and validation checks. Few-shot prompting is a steering mechanism, not a safety mechanism.

Before-and-after workflow comparison

Before: manual quarterly account summary.
A user requests a quarterly summary. An analyst manually pulls metrics from three dashboards, checks for anomalies against the previous quarter, drafts a narrative summary, formats it for the stakeholder, and sends it for review. Elapsed time: 2-4 hours. Requires analyst availability.

After: agent-assisted quarterly account summary.
The agent receives the request, retrieves metrics from connected data sources, flags anomalies that exceed defined thresholds, drafts a narrative summary using a standard template, and asks the analyst to confirm before sending. Elapsed time: 10-15 minutes of agent processing plus 5-10 minutes of analyst review.

What changes: the analyst shifts from data gathering and drafting to reviewing and approving. The total effort drops significantly. Human oversight remains. The agent handles the repeatable parts. The analyst handles the judgment.

Common pitfalls and misconceptions about LangChain agents

The word “agent” gets used loosely across tools, demos, and vendor marketing. That loose usage creates persistent misconceptions that lead to poor implementation choices. A useful langchain agents tutorial should leave readers with sharper judgment, not just code to copy.

Myth: all agents are equally agentic

Systems that get called “agents” vary enormously in how much they actually decide. The spectrum runs from prompt-driven assistants that follow a fixed instruction, to tool-using agents that select from a set of functions, to multi-step planners that decompose tasks, to stateful orchestrated systems with review loops and human checkpoints.

More agentic is not automatically better. A multi-step planning agent is powerful, but also slower, harder to debug, and riskier in sensitive workflows. A tool-using agent with a single well-scoped tool might be the right level of agency for the task.

The practical implication: match the level of agency to the problem. Overbuilding creates unnecessary complexity. Underbuilding creates an application that cannot handle the variation it will encounter.

Myth: if an agent can call tools, it understands the task

Tool access is not the same as good decision-making. A model with access to a search tool, a calculator, and a database query function can call any of them. Whether it calls the right one depends on:

How clearly the tool descriptions explain what each tool does
How much context the agent has about the user’s actual intent
Whether the agent has seen examples of correct tool selection
Whether there are evaluation criteria for what counts as a good result

A common failure mode: the agent calls a search tool when it should ask a clarifying question first. It has the capability to search, so it searches. But searching without understanding the question produces irrelevant results. Capability without judgment is just expensive noise.

Myth: agents replace workflow design

Agents reduce some hard-coded logic. They do not remove the need for engineering discipline. A production agent still requires:

Constraints on what actions are allowed
Validation of outputs before they reach users or downstream systems
Logging of every step for debugging and audit
Fallback behavior when the agent cannot complete the task
Governance over what data the agent can access and what decisions it can make

Teams that skip these because “the agent will figure it out” end up with systems that are impossible to debug, expensive to run, and unreliable under real traffic. Agent design is software engineering, not prompt engineering alone.

Practical steps to integrate LangChain agents into your projects

LangChain agents become useful when they solve a real workflow problem, not when they simply demonstrate tool calling. The following steps outline how to build a simple multi-step agent in Python, from defining the task through evaluating performance.

Step 1: define the workflow and success criteria

Start with one concrete workflow. A practical example: a research assistant that answers a question by deciding whether to search documents, summarize findings, and produce a final response.

Define these before writing code:

Task input. A natural language question from the user.
Acceptable output. A clear, sourced answer that addresses the question. If the answer cannot be found, the agent says so.
Tools needed. A document search tool, a summarization function, and a response formatter.
Stopping condition. The agent has either produced a validated answer or exhausted its search options.
Escalation condition. The agent’s confidence is below a threshold, or the question falls outside the scope of available documents.

Many projects go wrong here. They start coding before they define “done.” An agent without a clear stopping condition will loop. An agent without an escalation path will fabricate answers when it should ask for help.

Step 2: select tools and write clear tool descriptions

Tool descriptions are the steering wheel of an agent. The model reads these descriptions to decide which tool to use. Vague descriptions produce unpredictable tool selection.

For the research assistant example:

search_docs: “Search the internal knowledge base for documents relevant to a specific question. Input: a search query string. Output: a list of document excerpts with source metadata. Use when the user asks a factual question. Do not use for math calculations or opinion questions.”
summarize_text: “Condense a long text passage into a concise summary. Input: a text string. Output: a shorter summary preserving key facts. Use after retrieving documents when the retrieved text is too long to return directly.”
format_response: “Structure the final answer for the user. Input: answer text and source references. Output: a formatted response with citations. Use only when the agent is ready to deliver a final answer.”

Each definition includes purpose, expected input, expected output, when to use it, and when not to use it. This level of specificity directly improves tool calling accuracy.

Step 3: create the agent loop in Python

The core flow of a LangChain agent follows a predictable pattern:

Receive user input and initialize the message history
Pass the messages to the model along with available tool definitions
The model either returns a final answer or requests a tool call
If a tool call is requested, execute the tool and append the result as a tool message
Pass the updated message history back to the model
Repeat until the model returns a final answer or a max iteration limit is reached

In code, this is a loop with a conditional check. The structure matters more than the specific API calls:

messages = [system_message, user_message]

for step in range(max_iterations):
    response = model.invoke(messages)
    
    if response.tool_calls:
        for tool_call in response.tool_calls:
            result = execute_tool(tool_call)
            messages.append(tool_message(result))
        messages.append(response)
    else:
        final_answer = response.content
        break

This is a simplified skeleton. The key idea is that the agent loop is a decision loop, not a sequential pipeline. The model decides what happens next at each step.

Step 4: add memory, validation, and guardrails

An agent that works on a single turn needs additional infrastructure to work reliably:

Memory management. Preserve only the context that matters. Long conversation histories inflate token usage and confuse the model. Summarize earlier steps when the history grows beyond a useful window.
Structured outputs. Where possible, define output schemas so the agent’s final response can be validated programmatically. A JSON schema for the response format catches malformed outputs before they reach users.
Iteration limits. Set a hard ceiling on the number of steps. Five to ten iterations is a reasonable starting point for most workflows.
Confidence thresholds. If the model expresses low confidence or the retrieved documents have low relevance scores, trigger escalation instead of returning a weak answer.
Tool allowlists. Restrict which tools are available based on the task context. Not every tool needs to be accessible for every request.
Human approval for sensitive actions. Any action that modifies data, sends a message to a customer, or triggers a financial transaction should require explicit approval.

Step 5: evaluate and optimize performance

An agent that works once is a demo. An agent that works reliably across varied inputs is a production system. The gap between the two is evaluation.

Test across these dimensions:

Task completion rate. Does the agent produce a correct, complete answer for a representative set of inputs?
Tool selection accuracy. Does the agent choose the right tool at each step? Log tool calls and review them.
Latency. How long does the full loop take? Identify which steps are slowest.
Cost. How many tokens does each request consume? Track this per request, not just in aggregate.
Failure recovery. When a tool call fails, does the agent recover gracefully or return an error?

Optimization follows directly from these measurements:

Shorten prompts to reduce token usage without losing essential instructions
Reduce unnecessary reflection passes for tasks where first-pass quality is sufficient
Refine tool descriptions based on observed misselection patterns
Cache retrieval results for repeated queries
Use simpler chains for deterministic substeps instead of routing everything through the agent

Moving from experiment to production requires instrumentation. Logging, monitoring, and evaluation pipelines are not optional. They are the difference between a prototype and a system a team can rely on.

Optional extension: when to move from LangChain agents to LangGraph

LangChain agents are effective for getting started with agentic workflows quickly. The agent loop pattern handles many use cases with minimal setup.

LangGraph becomes the better choice when the workflow requires:

Explicit state transitions that are visible and inspectable
Retry logic with configurable backoff and fallback
Branches you can trace through a graph structure rather than an implicit loop
Human checkpoints built into the execution graph
Production-grade orchestration with persistence and recovery

The practical recommendation: start with a LangChain agent to validate the workflow. Move to LangGraph when you need the control, observability, and stateful orchestration that production systems demand.

Interactive checkpoint: can this workflow be an agent?

Test your understanding with three scenarios. For each one, decide whether it should use a prompt-only app, a fixed chain, or a LangChain agent.

Scenario 1: Summarize a single PDF into bullet points.
Best approach: Prompt-only app. The input is clear, the task is single-step, and there are no decisions to make. A direct prompt with the document text is the simplest and most cost-effective option.

Scenario 2: Answer account questions by checking documentation and policy tools, with different answers depending on account type.
Best approach: LangChain agent. The workflow branches based on account type, requires tool use (doc search and policy lookup), and the agent needs to decide which source to check first and whether the information is sufficient.

Scenario 3: Route a support issue across several systems based on the content of the message, the customer’s history, and current system status.
Best approach: LangChain agent or LangGraph. Multiple tools are involved, the routing logic depends on intermediate results, and the system may need to check multiple sources before making a decision. If the routing requires explicit state management and human approval gates, LangGraph is the stronger fit.

Conclusion

LangChain agents are valuable not because they are a fancier LLM wrapper. They are valuable because they enable autonomous, multi-step workflows when tasks involve uncertainty, tools, and decisions that cannot be hard-coded in advance.

The skill is not in calling the API. The skill is in designing the workflow, writing tool descriptions that guide correct behavior, building guardrails that keep the system bounded, and evaluating performance beyond “it worked once.” Knowing when not to use an agent is just as important as knowing how to build one.

If you are ready to go deeper, building multi-agent systems, mastering orchestration patterns with LangGraph, and shipping production-ready agentic workflows, the Agentic AI Engineer with LangChain and LangGraph program is designed for exactly that progression. You can also explore the full catalog to find programs that match where you are now and where you want to go next.

Schools

Popular

Featured

LangChain agents tutorial: build a multi-step workflow in Python

Why LangChain agents are more than just LLM extensions

The foundational role of LangChain’s unified message format

What makes LangChain agents distinct

The strategic value of autonomous decision-making in AI workflows

The reflection pattern and why it matters

Industry case studies that show strategic value

Complexity and challenges: deploying LangChain agents effectively

Why problem context matters more than framework syntax

Common pitfalls and how to avoid them

Identifying the right use cases for LangChain agents

A simple decision framework

Compare agents vs simpler alternatives

From reactive to proactive: transforming workflows with LangChain agents

Using few-shot prompting to guide proactive behavior

Before-and-after workflow comparison

Common pitfalls and misconceptions about LangChain agents

Myth: all agents are equally agentic

Myth: if an agent can call tools, it understands the task

Myth: agents replace workflow design

Practical steps to integrate LangChain agents into your projects

Step 1: define the workflow and success criteria

Step 2: select tools and write clear tool descriptions

Step 3: create the agent loop in Python

Step 4: add memory, validation, and guardrails

Step 5: evaluate and optimize performance

Optional extension: when to move from LangChain agents to LangGraph

Interactive checkpoint: can this workflow be an agent?

Conclusion

Popular Nanodegrees

Programming for Data Science with Python

Data Scientist Nanodegree

Self-Driving Car Engineer

Data Analyst Nanodegree

Android Basics Nanodegree

Intro to Programming Nanodegree

AI for Trading

Predictive Analytics for Business Nanodegree

AI For Business Leaders

Data Structures & Algorithms

School of Artificial Intelligence

School of Cyber Security

School of Data Science

School of Business

School of Autonomous Systems

School of Executive Leadership

School of Programming and Development

Related Articles

What is the Claude Agent SDK, and why are engineers building their own harnesses?

Claude Code Best Practices: How I actually use Claude Code as a senior cloud architect

The Claude Certified Architect Exam, Explained by Someone Who Passed It

Prompt chaining explained: how to build reasoning pipelines in Python