
Agentic AI – LLM with Web Scraping – Beginner Bootcamp
Welcome to our beginner-friendly AI bootcamp series! In this post, we’ll explore how to build intelligent web agents using OpenAI’s powerful language models combined with web scraping techniques. You’ll also learn how to enhance your agent using RAG (Retrieval-Augmented Generation) and a vector database for scalability and performance.
Let’s dive into building your own Web Insights Agent from scratch.
What is a Web Insights Agent?
A Web Insights Agent is an AI-powered assistant that can extract information from a website (and its linked child pages) and answer user questions based on the content. It’s like a mini search engine customized for a single website!
By combining LLMs (like GPT-4) with web scraping, chunking, embeddings, and vector stores, you can build an agent that reads the internet and responds intelligently.
Web Insights Agent 1 – Using OpenAI + Web Scraping (No RAG)
GitHub Code
🛠️ GitHub Repository: https://github.com/debabratapruseth/AI-agent-with-LLM-and-Web-Scrapping
What You Will Learn
- How to scrape web content and internal links using `requests` and `BeautifulSoup`
- How to chunk and manage large text using token limits
- How to call OpenAI’s GPT-4 with contextual prompts
- How to build a basic AI agent using Python functions and LLM reasoning
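To make the scraping step concrete, here is a minimal sketch of how `requests` and `BeautifulSoup` can pull the visible text and same-domain child links from a page. The function names (`extract_text_and_links`, `scrape_site`) are my own illustrative choices, not the names used in the repo:

```python
from urllib.parse import urljoin, urlparse

import requests  # third-party: pip install requests beautifulsoup4
from bs4 import BeautifulSoup


def extract_text_and_links(html: str, base_url: str):
    """Return visible text and same-domain child links from an HTML page."""
    soup = BeautifulSoup(html, "html.parser")
    text = soup.get_text(separator=" ", strip=True)
    base_domain = urlparse(base_url).netloc
    links = []
    for a in soup.find_all("a", href=True):
        url = urljoin(base_url, a["href"])
        if urlparse(url).netloc == base_domain and url not in links:
            links.append(url)
    return text, links


def scrape_site(url: str, max_children: int = 10):
    """Fetch the main page plus up to max_children internal pages."""
    pages = {}
    text, links = extract_text_and_links(requests.get(url, timeout=10).text, url)
    pages[url] = text
    for child in links[:max_children]:
        try:
            child_html = requests.get(child, timeout=10).text
            pages[child], _ = extract_text_and_links(child_html, child)
        except requests.RequestException:
            continue  # skip broken or unreachable links
    return pages
```

The `max_children` cap mirrors the 5–10 child-page limit described below, and the domain check keeps the agent from wandering off-site.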
How It Works
- User asks a question and provides a website URL.
- The agent scrapes the main page and 5–10 child pages.
- Text is chunked and passed to GPT-4 with the question.
- GPT-4 analyzes the chunks and generates a relevant response.
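The chunking step above can be sketched as follows. Exact token counting would use a tokenizer such as `tiktoken`; this sketch uses the common rough approximation of about 4 characters per English token:

```python
def chunk_text(text: str, max_tokens: int = 3000, chars_per_token: int = 4):
    """Split text into chunks that fit within a model's token budget.

    Uses a characters-per-token heuristic; swap in a real tokenizer
    (e.g. tiktoken) for exact counts.
    """
    max_chars = max_tokens * chars_per_token
    chunks, current, size = [], [], 0
    for word in text.split():
        # Flush the current chunk before it would exceed the budget.
        if size + len(word) + 1 > max_chars and current:
            chunks.append(" ".join(current))
            current, size = [], 0
        current.append(word)
        size += len(word) + 1
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk is then sent to GPT-4 together with the user's question, e.g. as a user message of the form "Context: {chunk}\n\nQuestion: {question}", and the per-chunk answers are combined.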
Educational Goals
- Learn to combine traditional web scraping with AI models
- Understand OpenAI function calling and prompt engineering
- Practice writing modular Python code with robust error handling
- Build your first LLM-powered application from scratch
Future Improvements
- Add a user-friendly Streamlit or Gradio interface
- Improve failure handling for broken or missing links
- Add summarization support for long child pages
Architecture Improvement – Why RAG is Better
While this version works well for small pages, it runs into issues when the scraped content is large. GPT models have token limits, and sending the full text of every page doesn’t scale.
This is where RAG (Retrieval-Augmented Generation) comes in:
Instead of sending all content to the LLM, we store it in a vector database, and retrieve only the top relevant chunks during query time.
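To make "retrieve only the top relevant chunks" concrete, here is a tiny, dependency-free sketch of the ranking logic. In a real system the vectors come from an embedding model such as `text-embedding-ada-002`; the retrieval step itself is just a similarity ranking:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_chunks(query_vec, chunk_vecs, chunks, k=3):
    """Return the k chunks whose embeddings are most similar to the query."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]
```

Only these k chunks, not every scraped page, are pasted into the GPT-4 prompt, so the context stays within token limits no matter how large the site is.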
Try It Yourself
Open Google Colab (or any Python environment of your choice). Upload the notebook from the GitHub repo. Run the code and see the agent in action. Use any AI assistant (e.g., Gemini, Copilot, ChatGPT) for real-time debugging or customizations.
Web Insights Agent 2 – Using OpenAI + Web Scraping + RAG + FAISS
GitHub Code
🛠️ GitHub Repository: https://github.com/debabratapruseth/AI-agent-with-LLM-Web-Scrapping-and-RAG
This is the upgraded version of the Web Insights Agent that uses RAG (Retrieval-Augmented Generation) with a FAISS vector store. It’s faster, more scalable, and much closer to production-ready.
What You Will Learn
- Build a vector-based retrieval system with FAISS
- Store and query large sets of web content using embeddings
- Use OpenAI’s embedding models (`text-embedding-ada-002`)
- Create a RetrievalQA chain with LangChain and GPT-4
- Understand how modern enterprise search works under the hood
How It Works
- User provides a URL and a question.
- The site and child links are scraped and text is chunked.
- Embeddings are generated for each chunk and stored in FAISS.
- At query time, only top-k most relevant chunks are retrieved.
- GPT-4 uses those chunks to generate a precise answer.
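Steps 3 and 4 above boil down to: embed each chunk once, then at query time find the nearest neighbors. FAISS's simplest index, `IndexFlatL2`, does an exhaustive L2 search; the NumPy sketch below shows the equivalent store-and-query pattern (with FAISS itself, the calls would be `index = faiss.IndexFlatL2(dim)`, `index.add(vectors)`, `index.search(query, k)`; the `TinyVectorStore` name is my own):

```python
import numpy as np


class TinyVectorStore:
    """Minimal stand-in for a FAISS IndexFlatL2: exhaustive L2 search."""

    def __init__(self, dim: int):
        self.vectors = np.empty((0, dim), dtype=np.float32)
        self.chunks = []

    def add(self, embeddings, chunks):
        """Store chunk embeddings alongside their source text."""
        self.vectors = np.vstack([self.vectors, np.asarray(embeddings, dtype=np.float32)])
        self.chunks.extend(chunks)

    def search(self, query, k=3):
        """Return the k chunks closest (by L2 distance) to the query vector."""
        dists = np.linalg.norm(self.vectors - np.asarray(query, dtype=np.float32), axis=1)
        idx = np.argsort(dists)[:k]
        return [self.chunks[i] for i in idx]
```

At query time, `store.search(embed(question), k=4)` returns the handful of chunks that get placed in the GPT-4 prompt, which is what keeps the approach fast regardless of how much content was scraped.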
Educational Goals
- Understand how RAG solves the token limit problem
- Learn how vector similarity search works
- Practice real-world information retrieval concepts
- Move from toy models to scalable, professional AI agents
Future Improvements
- Persist FAISS index for faster re-use
- Add support for large-scale scraping and background sync
- Switch to managed vector DBs like Pinecone or ChromaDB
- Add summarization layers for more efficient embedding
Try It Yourself
Open Google Colab (or any Python environment of your choice). Upload the notebook from the GitHub repo. Run the code and see the agent in action. Use any AI assistant (e.g., Gemini, Copilot, ChatGPT) for real-time debugging or customizations.


