Document AI: From OCR to Agentic Doc Extraction

Other DeepLearning Resources

🎓 Keep learning → Explore all DeepLearning.AI courses — taught by the people building the future of AI. Find your next one.

💻 Explore more course artifacts → Browse the DeepLearning.AI course artifacts repo to find notebooks, projects, and notes from other courses across the DeepLearning.AI library.

About this repository

This repository contains supplemental material for Lab 6 of the course. Document processing surfaces information from files and images by parsing, extracting, and splitting their contents. In Lab 5, you built a local document-processing pipeline with LandingAI. Here you will build the same pipeline in the cloud on AWS and use it to power a chatbot for deep research over your documents. The AWS services involved are Lambda, S3, IAM, and Bedrock. To learn more about cloud computing on AWS, see the following resources:

Documentation
- S3
- Lambda
- IAM
- Bedrock
Libraries

Overview

The pipeline consists of three components:

1. Document Processing

S3 Bucket: Stores uploaded PDF documents
Lambda Function: Automatically triggered on file upload to S3
LandingAI ADE:
- Processes documents and extracts chunks with bounding boxes
- Creates an individual JSON file for each document chunk
Storage:
- output/medical/: Markdown files
- output/medical_grounding/: Grounding data with bounding boxes
- output/medical_chunks/: Individual chunk JSON files for Knowledge Base
- output/medical_chunk_images/: Dynamically generated cropped chunk images
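
The storage layout above can be sketched as a mapping from an uploaded key to its output locations. This is a minimal sketch and not the code in ade_s3_handler.py: the function names and the file-naming scheme (`.md`, `_grounding.json`, one sub-folder per document) are assumptions made for illustration. Only the four output prefixes and the standard S3 event shape are taken as given.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote_plus

# Output prefixes listed in the Storage section above.
OUTPUT_PREFIXES = {
    "markdown": "output/medical/",
    "grounding": "output/medical_grounding/",
    "chunks": "output/medical_chunks/",
    "chunk_images": "output/medical_chunk_images/",
}


def uploaded_objects(event):
    """Return (bucket, key) pairs from an S3 "ObjectCreated" event.

    S3 URL-encodes object keys in event notifications, so decode them.
    """
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def output_keys(input_key):
    """Map an uploaded PDF key to hypothetical output keys (naming is assumed)."""
    stem = PurePosixPath(input_key).stem
    return {
        "markdown": f"{OUTPUT_PREFIXES['markdown']}{stem}.md",
        "grounding": f"{OUTPUT_PREFIXES['grounding']}{stem}_grounding.json",
        "chunks_prefix": f"{OUTPUT_PREFIXES['chunks']}{stem}/",
        "chunk_images_prefix": f"{OUTPUT_PREFIXES['chunk_images']}{stem}/",
    }
```

Inside a Lambda handler, `uploaded_objects(event)` would supply the bucket and key to download before the document is sent to LandingAI ADE.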

2. Knowledge Base

AWS Bedrock Knowledge Base: Indexes individual chunk JSON files
Metadata: Maintains chunk type, page number, and bounding box coordinates
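
To make the metadata concrete, here is a sketch of what one chunk file might contain. The class name, field names, and the zero-based page convention are assumptions for illustration. They are not the schema the lab actually writes.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ChunkRecord:
    """One document chunk plus the metadata named above (field names are assumed)."""
    chunk_id: str
    source_document: str
    chunk_type: str   # e.g. "text", "table", "figure"
    page: int         # zero-based page index (assumption)
    bbox: tuple       # (left, top, right, bottom), assumed normalized to 0..1
    text: str


def to_json(record):
    """Serialize a chunk to the JSON string that would be stored as one file."""
    return json.dumps(asdict(record), indent=2)
```

Keeping one chunk per file lets each search result carry its own page number and bounding box back to the chatbot.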

3. Chatbot

Strands Agent Framework: Orchestrates conversation flow
Bedrock Memory Service: Maintains conversation context
Visual Grounding:
- Extracts and crops specific chunk regions from PDFs
- Adds red border highlighting around chunks

Dependencies

To replicate the lab, you must configure your own AWS account.

Python
- Use version 3.10
OS
- x86_64 is recommended
AWS
- An AWS account with permissions for the following services:
  - Lambda
  - S3
  - IAM
  - Bedrock
  - CloudWatch Logs
- The following resources set up in your account:
  - S3 Bucket
  - Bedrock Knowledge Base
LandingAI
- Vision Agent API Key
- Remember that you can create a free account at LandingAI

Folder Structure

sc-landingai/
├── Lab-6.ipynb                       # Main lab notebook
├── ade_s3_handler.py                 # Lambda function for document processing
├── lambda_helpers.py                 # Helper functions for Lambda deployment
├── visual_grounding_helper.py        # Functions for creating cropped chunk images
├── medical/                          # Sample medical PDF documents
│   ├── Common_cold_clinincal_evidence.pdf
│   ├── CT_Study_of_the_Common_Cold.pdf
│   ├── Evaluation_of_echinacea_for_the_prevention_and_treatment_of_the_common_cold.pdf
│   ├── Prevention_and_treatment_of_the_common_cold.pdf
│   ├── The_common_cold_a_review_of_the_literature.pdf
│   ├── Understanding_the_symptoms_of_the_common_cold_and_influenza.pdf
│   ├── Viruses_and_Bacteria_in_the_Etiology_of_the_Common_Cold.pdf
│   └── Vitamin_C_for_Preventing_and_Treating_the_Common_Cold.pdf
└── README.md                         # This file

Getting Started

Step 0: S3 and Bedrock

Create two folders in your S3 bucket: input/ and output/
Connect the Bedrock Knowledge Base to the bucket as its data source (Step 3 configures it to index output/medical_chunks/)

Step 1: Environment Setup

Create a .env file with your credentials:

AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AWS_REGION=us-west-2
S3_BUCKET=your-bucket-name
VISION_AGENT_API_KEY=your_landingai_api_key
BEDROCK_MODEL_ID=us.anthropic.claude-sonnet-4-5-20250929-v1:0
BEDROCK_KB_ID=your_knowledge_base_id
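
A missing variable usually surfaces later as a confusing AWS error, so it can help to check the settings up front. The sketch below is one way to do that. The helper name `missing_settings` is hypothetical, and the variable names are the ones listed above.

```python
import os

REQUIRED_SETTINGS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
    "S3_BUCKET",
    "VISION_AGENT_API_KEY",
    "BEDROCK_MODEL_ID",
    "BEDROCK_KB_ID",
)


def missing_settings(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]
```

In the notebook, call `load_dotenv()` from python-dotenv first, then stop early if `missing_settings()` returns anything.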

Step 2: Install Dependencies

pip install boto3 python-dotenv Pillow PyMuPDF landingai-ade typing-extensions
pip install bedrock-agentcore strands-agents pandas

Step 3: Run the Notebook

Open Lab-6.ipynb in Jupyter and follow the step-by-step instructions to:

Deploy the Lambda function
Set up S3 triggers
Process medical documents (creates chunks automatically)
Configure Bedrock Knowledge Base to index output/medical_chunks/
Test chunk-based search with search_medical_chunks()
Launch the interactive chatbot
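
For orientation, the "Set up S3 triggers" step amounts to attaching a notification configuration to the bucket. The notebook's helpers handle this for you. The sketch below only shows the shape of the request, and it assumes PDFs are uploaded under input/. The function name is hypothetical.

```python
def s3_trigger_config(lambda_arn, prefix="input/", suffix=".pdf"):
    """Build the NotificationConfiguration that invokes a Lambda on PDF uploads."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {
                        "FilterRules": [
                            {"Name": "prefix", "Value": prefix},
                            {"Name": "suffix", "Value": suffix},
                        ]
                    }
                },
            }
        ]
    }
```

The result would be passed to `s3_client.put_bucket_notification_configuration(Bucket=bucket, NotificationConfiguration=...)`. S3 also needs permission to invoke the function, which is granted separately on the Lambda side.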

Monitoring & Debugging

CloudWatch Logs

Monitor Lambda execution in AWS CloudWatch:

Processing status for each document
Error messages and stack traces
Performance metrics and duration
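
If you prefer to pull errors from code rather than the console, a sketch along these lines may help. It assumes the function logs to the default /aws/lambda/<function name> log group. The helper name `recent_errors` is hypothetical and is not part of lambda_helpers.py.

```python
def recent_errors(logs_client, function_name, start_ms, limit=20):
    """Return up to `limit` log messages containing "ERROR" since `start_ms`."""
    response = logs_client.filter_log_events(
        logGroupName=f"/aws/lambda/{function_name}",
        startTime=start_ms,      # milliseconds since the Unix epoch
        filterPattern="ERROR",
        limit=limit,
    )
    return [event["message"].rstrip() for event in response.get("events", [])]
```

Here `logs_client` is a boto3 CloudWatch Logs client, created with `boto3.client("logs")`.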

S3 Output Verification

Check processed outputs:

# List all processed files
stats = monitor_lambda_processing(logs_client, s3_client, bucket_name)

Knowledge Base Sync

Start an ingestion job so the Knowledge Base indexes the newly created chunks:

response = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=BEDROCK_KB_ID,
    dataSourceId=DATA_SOURCE_ID
)
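
Ingestion runs asynchronously, so the call above returns before indexing finishes. A minimal polling sketch follows. The function name and the polling defaults are assumptions. `get_ingestion_job` is the boto3 bedrock-agent call that reports job status.

```python
import time


def wait_for_ingestion(bedrock_agent, kb_id, data_source_id, job_id,
                       poll_seconds=10, max_polls=60):
    """Poll an ingestion job until it reaches a terminal status, then return it."""
    terminal = {"COMPLETE", "FAILED", "STOPPED"}
    for _ in range(max_polls):
        job = bedrock_agent.get_ingestion_job(
            knowledgeBaseId=kb_id,
            dataSourceId=data_source_id,
            ingestionJobId=job_id,
        )["ingestionJob"]
        if job["status"] in terminal:
            return job["status"]
        time.sleep(poll_seconds)
    raise TimeoutError(f"ingestion job {job_id} still running after {max_polls} polls")
```

The job ID comes from the start call: `response["ingestionJob"]["ingestionJobId"]`.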

Troubleshooting

Common Issues

Lambda Timeout: Confirm the timeout set at deployment (default: 900s, which is also the Lambda maximum) and raise it if it was lowered
Memory Errors: Increase Lambda memory (default: 1024 MB)
IAM Permissions: Ensure the Lambda execution role has S3 and CloudWatch Logs access
Python Version Mismatch: Use Python 3.10 for compatibility
Knowledge Base Not Found: Verify the KB ID and region settings

Debug Commands

# Check Lambda logs
monitor_lambda_processing(logs_client, s3_client, bucket)

# Verify S3 outputs
s3_client.list_objects_v2(Bucket=bucket, Prefix='output/')

# Test chunk-based search
results = search_medical_chunks("test query", s3_client, bucket)

# Test knowledge base search
test_result = search_knowledge_base("test query")
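
A single `list_objects_v2` call returns at most 1,000 keys, so a full check of output/ is safer with a paginator. The sketch below counts files per output folder. The function name `count_outputs` is hypothetical.

```python
from collections import Counter


def count_outputs(s3_client, bucket, prefix="output/"):
    """Count objects under each folder directly below `prefix`."""
    counts = Counter()
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                continue  # skip zero-byte "folder" placeholder objects
            rest = key[len(prefix):]
            folder = rest.split("/", 1)[0] if "/" in rest else "(top level)"
            counts[folder] += 1
    return dict(counts)
```

A folder that is missing from the result, such as medical_chunks, suggests the Lambda function did not finish processing.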

⚠️ Note: This lab requires active AWS services which may incur costs. Remember to clean up resources after completing the exercises.
