Building a GenAI PDF Query App with a RAG Architecture

September 19, 2024

This article covers building a Generative AI PDF query web application that uses a retrieval-augmented generation (RAG) architecture. The application searches your PDFs and answers your questions based on their contents. The value of RAG is that it scopes the response generated by the LLM, in our case Claude 3.5 Sonnet, to up-to-date, accurate, and reliable source material. RAG allows for domain-specific, contextually relevant responses tailored to your data rather than the model's static training data.

The PDF query web application uses Amazon Titan Embeddings to create vector representations of unstructured text and Facebook AI Similarity Search (FAISS) to store and search those embeddings. LangChain provides the prompt template that guides the model's response, the RetrievalQA chain that retrieves pertinent data, and various PDF processing tools. Amazon Bedrock provides access to Claude 3.5 Sonnet and Amazon Titan Embeddings.
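
To make the architecture concrete, here is a minimal sketch of the query path using LangChain with Bedrock. It is illustrative only, not the repository's exact code: the model IDs, the faiss_index path, the prompt wording, and the variable names are assumptions, and package layouts vary by LangChain version.

# Minimal sketch of the query path; illustrative only, not the repository's exact code.
import boto3
from langchain_aws import ChatBedrock, BedrockEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

# Amazon Titan Embeddings turns text into vectors; FAISS stores and searches them.
embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1", client=bedrock)
vectorstore = FAISS.load_local("faiss_index", embeddings,
                               allow_dangerous_deserialization=True)

# Claude 3.5 Sonnet generates the answer from the retrieved PDF context.
llm = ChatBedrock(model_id="anthropic.claude-3-5-sonnet-20240620-v1:0", client=bedrock)

# The prompt template keeps the model grounded in the retrieved context.
prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=("Use only the following context from the PDFs to answer the question.\n\n"
              "{context}\n\nQuestion: {question}\nAnswer:"),
)

# RetrievalQA pulls the most relevant chunks and stuffs them into the prompt.
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3}),
    chain_type_kwargs={"prompt": prompt},
)

print(qa.invoke({"query": "What topics do these PDFs cover?"})["result"])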

PDF Query RAG-Based LLM GenAI Web Application Process Flow

Code:

GitHub repo with the files referenced in this blog: pdf-query-rag-llm-app

Prerequisites:

  • Amazon Web Services Account
  • Enable Amazon Bedrock access (specifically Amazon Titan Embeddings and Claude 3.5 Sonnet); see: Manage access to Amazon Bedrock foundation models
  • EC2 Instance Role with the AmazonBedrockFullAccess policy attached (note: you can make this more secure with a custom least-privilege policy)
  • Verified on EC2 Instance Ubuntu 22.04 and Ubuntu 24.04
  • Verified with Python 3.10, 3.11, 3.12
  • Virtualenv
  • The AWS default region is set to us-east-1; you can change the region in the pdf_query_rag_llm_app.py file under region_name='us-east-1' (see the snippet following this list)
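
If you need a different region, the change happens where the Bedrock client is created in pdf_query_rag_llm_app.py. A hedged example of what that line looks like (the exact variable names in the file may differ):

import boto3

# Change region_name to a region where you have enabled the Bedrock models you need.
bedrock_client = boto3.client("bedrock-runtime", region_name="us-east-1")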

AWS Resource Cost:

As with most AWS services, you will incur costs for usage.

EC2 Ubuntu Instance Setup:
(This article assumes you have an ubuntu user with a /home/ubuntu home directory.)

Step 0
Install some dependencies

sudo apt -y update
 
sudo apt -y install build-essential openssl
 
sudo apt -y install libpq-dev libssl-dev libffi-dev zlib1g-dev
 
sudo apt -y install python3-pip python3-dev
 
sudo apt -y install nginx
 
sudo apt -y install virtualenvwrapper

Step 1
Clone the Git Repository

cd /home/ubuntu
 
git clone https://github.com/nethacker/pdf-query-rag-llm-app.git

Step 2
Set up the Python Environment

virtualenv pdf-query-rag-llm-app_env
 
source pdf-query-rag-llm-app_env/bin/activate

Step 3
Install the PDF Query RAG LLM application package dependencies

cd /home/ubuntu/pdf-query-rag-llm-app
 
pip install -r requirements.txt

Step 4
Set up systemd to daemonize and bootstrap the PDF Query RAG-Based LLM App (Port 8080)

sudo cp systemd/pdf-query-rag-llm-app.service /etc/systemd/system/
 
sudo systemctl start pdf-query-rag-llm-app
 
sudo systemctl enable pdf-query-rag-llm-app.service

Step 5
Install NGINX to help scale and handle connections on Port 80 (proxying to the application on Port 8080)

sudo cp nginx/nginx_pdf-query-rag-llm-app.conf /etc/nginx/sites-available/nginx_pdf-query-rag-llm-app.conf
 
sudo rm /etc/nginx/sites-enabled/default
 
sudo ln -s /etc/nginx/sites-available/nginx_pdf-query-rag-llm-app.conf /etc/nginx/sites-enabled
 
sudo systemctl restart nginx

Step 6
Test your application. Remember to put the PDFs you want to query in the data directory on the instance and click New Data Update before querying.

http://{yourhost}

Notes:

  • Make sure to open port 80 in the EC2 Security Group associated with the instance.
  • For HTTPS (TLS) you can use AWS ALB or AWS CloudFront.
  • Depending on how many PDFs you have, how big they are, and your CPU specifications, clicking New Data Update can take a while, as it builds your vector embeddings (see the sketch after these notes).
  • Any time you add or change PDFs, make sure to click “New Data Update” to rebuild your vector embeddings.
  • This application does not implement security controls; those are your responsibility.
  • Please read the Amazon Bedrock FAQs for general questions about the AWS LLM resources used.
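
For a rough sense of what clicking New Data Update does behind the scenes, the sketch below loads the PDFs, splits them into chunks, embeds the chunks with Titan, and saves a FAISS index to disk. It is illustrative only; the directory name, chunk sizes, index path, and exact loader used in the repository are assumptions.

# Illustrative sketch of building the vector index from PDFs; not the repository's exact code.
import boto3
from langchain_aws import BedrockEmbeddings
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1", client=bedrock)

# Load every PDF in the data directory and split it into overlapping chunks.
docs = PyPDFDirectoryLoader("data").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

# Embed each chunk with Titan and persist the FAISS index to disk for later queries.
index = FAISS.from_documents(chunks, embeddings)
index.save_local("faiss_index")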
