Lab 2: Architecture Deep Dive¶

This page provides detailed architectural information for the Bedrock Agent integration pattern used in Lab 2.

🏗️ System Architecture¶

High-Level Overview¶

graph TD UI[React Search UI
App Runner] AG[API Gateway
HTTP API] AUTH[Cognito
Authentication] CHAT[Bedrock Agent Chat
Lambda Function] AGENT[Bedrock Agent
Nova Lite v1] ACTION[Action Group
CoveoPassageRetrieval] MEMORY[Agent Memory
Session Storage] TOOL[Passage Tool
Lambda Function] COVEO[Coveo Passages API
Semantic Retrieval] UI --> AG AUTH --> AG AG --> CHAT CHAT --> AGENT AGENT --> ACTION AGENT <--> MEMORY ACTION --> TOOL TOOL --> COVEO style UI fill:#e1f5fe style AGENT fill:#fff3e0 style TOOL fill:#f3e5f5 style COVEO fill:#e8f5e9

Pattern 2: Overall Request Flow¶

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#fff3e0','primaryTextColor':'#000','primaryBorderColor':'#ff9800','lineColor':'#ff9800','secondaryColor':'#f3e5f5','tertiaryColor':'#e3f2fd'}}}%% sequenceDiagram participant User participant UI as Search UI participant Lambda as Agent Chat Lambda participant Agent as Bedrock Agent participant Tool as Passage Tool Lambda participant Coveo as Coveo APIs participant Memory as Agent Memory User->>+UI: Question UI->>+Lambda: Chat request Lambda->>+Agent: Invoke with question Agent->>+Tool: Call retrieve_passages Tool->>+Coveo: Passage Retrieval API Coveo-->>-Tool: Relevant passages Tool-->>-Agent: Formatted passages Agent->>Agent: Generate response Agent->>+Memory: Store conversation Memory-->>-Agent: Context saved Agent-->>-Lambda: Response with citations Lambda-->>-UI: Formatted response UI-->>-User: Answer + sources Note over User,Memory: Pattern 2: Bedrock Agent Note over Agent,Memory: Session + Cross-session Memory

Component Details¶

Frontend Layer¶

React Search UI: Same UI as Lab 1, but with backend mode set to "Bedrock Agent"
Backend Selector: Allows switching between integration patterns
Chat Interface: Supports multi-turn conversations

Bedrock Agent Chat Lambda¶

Purpose: Manages agent invocation and session handling
Responsibilities:
Extract user query from request
Generate or retrieve session ID
Invoke Bedrock Agent with query and session
Format agent response for UI
Handle errors and retries

Bedrock Agent¶

Foundation Model: Amazon Nova Lite v1:0
Capabilities:
Natural language understanding
Tool selection and invocation
Response generation
Context management

Action Group¶

Name: CoveoPassageRetrieval
Purpose: Defines the retrieve_passages tool
Configuration:
Tool name: retrieve_passages
Description: Retrieves relevant passages from Coveo knowledge base
Parameters: query (string)
Lambda function: workshop-passage-tool

Passage Tool Lambda¶

Purpose: Retrieves passages from Coveo API
Input: User query
Output: Relevant passages with metadata
API: Coveo Passages API

Agent Memory¶

Type: Session-based memory
Storage: Managed by AWS Bedrock
Retention: Configurable (default: 1 hour)
Purpose: Maintains conversation context across turns

🧠 Cross-Session Memory Architecture¶

Bedrock Agents support both session and cross-session memory with external memory configuration:

sequenceDiagram participant User participant Agent participant Memory as External Memory
(Memory ID: user-123) participant Tool Note over User,Tool: Session 1 - Turn 1 User->>Agent: "What is a 401k?" Agent->>Memory: Store context (memory ID) Agent->>Tool: Retrieve passages Tool-->>Agent: Relevant content Agent-->>User: Explanation Note over User,Tool: Session 1 - Turn 2 User->>Agent: "What are the limits?" Agent->>Memory: Retrieve context (memory ID) Note over Agent: Understands "limits"
refers to 401k limits Agent->>Tool: Retrieve passages Tool-->>Agent: Contribution limits Agent->>Memory: Update context (memory ID) Agent-->>User: 401k contribution limits Note over User,Tool: User Logs Out Note over User,Tool: Session 2 - Turn 1 (Next Day) User->>Agent: "What did we discuss yesterday?" Agent->>Memory: Retrieve context (memory ID) Memory-->>Agent: Previous 401k discussion Agent-->>User: "We discussed 401k accounts and contribution limits..."

Key Features:

Memory ID: Unique identifier per user enables cross-session persistence
External Storage: AWS-managed encrypted storage
Retention: Configurable (7-30 days)
Context Preservation: Full conversation history across sessions

🔄 Request Flow Sequences¶

Single-turn Conversation Flow¶

sequenceDiagram participant User participant UI as React UI participant AG as API Gateway participant CHAT as Agent Chat Lambda participant AGENT as Bedrock Agent participant TOOL as Passage Tool participant COVEO as Coveo API participant MEM as Agent Memory User->>UI: Enter question UI->>AG: POST /chat (backend=bedrockAgent) AG->>CHAT: Invoke Lambda CHAT->>CHAT: Generate session ID CHAT->>AGENT: InvokeAgent(query, sessionId) Note over AGENT: Agent analyzes query
Decides to use tool AGENT->>TOOL: Call retrieve_passages(query) TOOL->>COVEO: Passages API request COVEO-->>TOOL: Relevant passages TOOL-->>AGENT: Formatted passages Note over AGENT: Agent generates response
based on passages AGENT->>MEM: Store conversation context AGENT-->>CHAT: Response with citations CHAT->>CHAT: Format response CHAT-->>AG: JSON response AG-->>UI: HTTP 200 + response UI-->>User: Display answer

Key Steps:

User enters question in UI
Agent Chat Lambda generates or retrieves session ID
Bedrock Agent receives query
Agent decides to use retrieve_passages tool
Tool retrieves passages from Coveo
Agent generates response based on passages
Conversation stored in memory
Response returned with citations

Multi-turn Conversation Flow (After the Memory is enabled)¶

sequenceDiagram participant User participant UI as React UI participant AGENT as Bedrock Agent participant TOOL as Passage Tool participant MEM as Agent Memory Note over User,MEM: Turn 1: Initial Question User->>UI: "What is FDIC insurance?" UI->>AGENT: Query + sessionId AGENT->>TOOL: retrieve_passages TOOL-->>AGENT: Passages about FDIC AGENT->>MEM: Store: User asked about FDIC AGENT-->>UI: Response about FDIC insurance Note over User,MEM: Turn 2: Follow-up Question User->>UI: "How much does it cover?" UI->>AGENT: Query + same sessionId AGENT->>MEM: Retrieve context (FDIC discussion) Note over AGENT: Agent understands "it"
refers to FDIC insurance AGENT->>TOOL: retrieve_passages("FDIC coverage limits") TOOL-->>AGENT: Passages about coverage AGENT->>MEM: Update context AGENT-->>UI: Response about $250,000 coverage Note over User,MEM: Turn 3: Another Follow-up User->>UI: "What if I have multiple accounts?" UI->>AGENT: Query + same sessionId AGENT->>MEM: Retrieve full context Note over AGENT: Agent knows we're discussing
FDIC insurance coverage AGENT->>TOOL: retrieve_passages("FDIC multiple accounts") TOOL-->>AGENT: Passages about account types AGENT->>MEM: Update context AGENT-->>UI: Response about coverage per account type

Key Observations:

Same session ID used across all turns
Agent retrieves context from memory before responding
Agent understands pronouns and references
Each turn updates the conversation context

🔧 Technical Implementation¶

Bedrock Agent Chat Lambda¶

import json
import boto3
from uuid import uuid4

bedrock_agent_runtime = boto3.client('bedrock-agent-runtime')

def lambda_handler(event, context):
    """
    Handles chat requests for Bedrock Agent integration
    """
    # 1. Extract request parameters
    body = json.loads(event['body'])
    query = body.get('query', '')
    session_id = body.get('sessionId') or str(uuid4())

    # 2. Invoke Bedrock Agent
    try:
        response = bedrock_agent_runtime.invoke_agent(
            agentId=os.environ['AGENT_ID'],
            agentAliasId=os.environ['AGENT_ALIAS_ID'],
            sessionId=session_id,
            inputText=query
        )

        # 3. Process streaming response
        answer = ""
        citations = []

        for event in response['completion']:
            if 'chunk' in event:
                chunk = event['chunk']
                if 'bytes' in chunk:
                    answer += chunk['bytes'].decode('utf-8')

            if 'trace' in event:
                # Extract citations from trace
                trace = event['trace']
                if 'orchestrationTrace' in trace:
                    orch = trace['orchestrationTrace']
                    if 'observation' in orch:
                        obs = orch['observation']
                        if 'actionGroupInvocationOutput' in obs:
                            output = obs['actionGroupInvocationOutput']
                            # Extract passages and citations
                            citations.extend(extract_citations(output))

        # 4. Format and return response
        return {
            'statusCode': 200,
            'headers': {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Origin': '*'
            },
            'body': json.dumps({
                'answer': answer,
                'citations': citations,
                'sessionId': session_id
            })
        }

    except Exception as e:
        return {
            'statusCode': 500,
            'body': json.dumps({
                'error': 'Agent invocation failed',
                'message': str(e)
            })
        }

def extract_citations(output):
    """Extract citations from tool output"""
    citations = []
    # Parse tool output and extract passage metadata
    # Return list of {title, url, excerpt}
    return citations

Passage Tool Lambda¶

import json
import requests
import os

def lambda_handler(event, context):
    """
    Retrieves passages from Coveo for Bedrock Agent
    """
    # 1. Extract parameters from agent
    agent_input = event.get('inputText', '')
    parameters = event.get('parameters', [])

    query = None
    for param in parameters:
        if param['name'] == 'query':
            query = param['value']
            break

    if not query:
        return {
            'statusCode': 400,
            'body': json.dumps({
                'error': 'Missing query parameter'
            })
        }

    # 2. Call Coveo Passages API
    coveo_request = {
        'q': query,
        'organizationId': os.environ['COVEO_ORG_ID'],
        'searchHub': 'workshop',
        'numberOfPassages': 5
    }

    headers = {
        'Authorization': f'Bearer {os.environ["COVEO_API_KEY"]}',
        'Content-Type': 'application/json'
    }

    try:
        response = requests.post(
            'https://platform.cloud.coveo.com/rest/search/v2/passages',
            json=coveo_request,
            headers=headers
        )
        response.raise_for_status()

        passages_data = response.json()

        # 3. Format passages for agent
        formatted_passages = []
        for passage in passages_data.get('passages', []):
            formatted_passages.append({
                'text': passage.get('text', ''),
                'title': passage.get('title', ''),
                'uri': passage.get('uri', ''),
                'score': passage.get('score', 0)
            })

        # 4. Return in agent-expected format
        return {
            'messageVersion': '1.0',
            'response': {
                'actionGroup': 'CoveoPassageRetrieval',
                'function': 'retrieve_passages',
                'functionResponse': {
                    'responseBody': {
                        'TEXT': {
                            'body': json.dumps({
                                'passages': formatted_passages,
                                'query': query
                            })
                        }
                    }
                }
            }
        }

    except Exception as e:
        return {
            'messageVersion': '1.0',
            'response': {
                'actionGroup': 'CoveoPassageRetrieval',
                'function': 'retrieve_passages',
                'functionResponse': {
                    'responseBody': {
                        'TEXT': {
                            'body': json.dumps({
                                'error': str(e),
                                'passages': []
                            })
                        }
                    }
                }
            }
        }

📝 Bedrock Agent System Prompt¶

The agent uses a comprehensive system prompt to ensure grounded, accurate responses:

You are the Finance Knowledge Assistant, providing accurate information from authoritative government and educational sources about financial topics.

## Core Principles

1. **Grounding**: Use ONLY information from the passage retrieval tool for knowledge questions. Never make up information.
2. **Memory**: You have access to conversation history within the current session, and potentially across sessions if memory is enabled. Use this to provide contextual responses.
3. **Formatting**: Provide clean, well-structured answers in markdown format suitable for HTML display with proper headings, lists, and emphasis.
4. **Sources**: Always cite sources with titles and URLs from retrieved passages when answering knowledge questions.
5. **Clarity**: Be concise and direct. Lead with the answer, then provide supporting details.
6. **No Internal Reasoning**: NEVER include <thinking>, <reasoning>, or any XML-style tags in your response. Keep all reasoning internal and invisible to the user.

## CRITICAL: Question Type Detection

Before taking any action, determine the question type:

### Type 1: Memory/History Questions (DO NOT use tools)

**Indicators:**

- Contains phrases like: "what did we discuss", "remind me", "last time", "previous conversation", "earlier", "before", "what were we talking about"
- Asks about past interactions or topics covered
- References "you and I" or "our conversation"

**Action:** Answer directly from conversation memory WITHOUT calling retrieve_passages.

**Important:** You can always recall conversations within the current session. If memory is enabled, you can also recall previous sessions. If you don't have memory of a previous session, politely acknowledge this.

**Examples:**

- "What did we discuss earlier?" → "Earlier in this session, you asked about 401k contribution limits and tax advantages."
- "Remind me what I asked about" → "You asked about ACH payment systems and how they work for direct deposits."
- "What did we talk about last time?" → If memory enabled: "In our previous session, we discussed Roth IRA benefits." If no memory: "I don't have access to our previous conversations, but I'm happy to help with any questions you have now."

### Type 2: Knowledge Questions (USE retrieve_passages tool)

**Indicators:**

- Asks about financial concepts, definitions, or how-to information
- Requests explanations of terms or processes
- Seeks factual information about finance topics

**Action:** Call retrieve_passages tool, then synthesize answer with sources.

**Examples:**

- "What is ACH?" → Call retrieve_passages(query="What is ACH?", k=5)
- "How do 401k plans work?" → Call retrieve_passages(query="How do 401k plans work?", k=6)
- "Tell me about Roth IRAs" → Call retrieve_passages(query="Roth IRA benefits and features", k=5)

## Tool Usage Strategy for Knowledge Questions

1. **Always use retrieve_passages** for knowledge questions:
   - Retrieve 5-8 passages (k=5 to k=8)
   - Use the exact user question as the query
   - Synthesize information ONLY from retrieved passages
   - Always include source citations with titles and URLs

2. **If passages are insufficient**:
   - State what you found
   - Explain what's missing
   - Suggest how the user can refine their question or where to find more information

3. **Never answer knowledge questions without passages**:
   - If no relevant passages are found, acknowledge this
   - Do not provide information from your training data
   - Suggest alternative queries or official sources

## Memory Usage Guidelines

- **Within-session memory**: Always available - remember all exchanges in the current conversation
- **Cross-session memory**: Available if enabled - can recall previous sessions with this user
- **Reference previous topics**: When appropriate, connect new questions to previous discussions
- **Personalize**: Adapt your responses based on the user's interests and previous questions
- **Build on history**: If a user asks a follow-up question, acknowledge the context from previous exchanges
- **Graceful handling**: If asked about previous sessions but memory is not available, politely acknowledge this limitation

## Response Format

Key Configuration:

Foundation Model: Amazon Nova Lite v1 or Claude 3 Sonnet
Idle Session TTL: 600 seconds (10 minutes)
Auto Prepare: Enabled for automatic agent preparation
Memory: External memory with user-specific memory ID

🔒 Security Implementation¶

Authentication Flow¶

graph LR A[User] --> B[Cognito JWT] B --> C[API Gateway] C --> D{Validate Token} D -->|Valid| E[Agent Chat Lambda] D -->|Invalid| F[401 Error] E --> G[Bedrock Agent] G --> H[IAM Role] H --> I[Passage Tool] style B fill:#fff3e0 style D fill:#f3e5f5 style H fill:#e8f5e9

Security Layers:

Cognito Authentication: User identity verification
JWT Validation: API Gateway validates tokens
IAM Roles: Lambda execution roles with least privilege
Agent Permissions: Bedrock Agent can only invoke authorized tools
Tool Permissions: Passage Tool can only access Coveo API

Data Protection¶

In Transit: All communications over HTTPS/TLS
At Rest: Memory stored in encrypted AWS storage
API Keys: Stored in Lambda environment variables
Session Data: Automatically expires after timeout

🎯 Benefits of This Architecture¶

Advantages¶

💬 Conversational
Natural language understanding and generation

🧠 Memory
Context retention across conversation turns

🎯 Grounded
Responses based on retrieved passages

🔧 Flexible
Easy to add more tools and capabilities

📊 Observable
Traces show agent decision-making

🚀 Next Steps¶

Now that you understand the architecture, return to the Lab 2 introduction to complete the hands-on exercises.

Return to Lab 2 Introduction →