RAG Explained for Non-Engineers: How I Built an AI-Powered Economic and Financial Research Tool


I am not a software engineer. My background is in applied economics: econometric modelling, regression analysis, policy research, and financial model validation. For most of my career, Python meant running regressions or traditional ML models and writing statistical scripts, not building applications.

So when I decided to build an AI-powered research tool from scratch, I was not coming at it as a developer. I was coming at it as an economist who had spent years working through dense policy and financial documents (WEF Global Competitiveness Reports, IMF World Economic Outlooks, regulatory filings, bank underwriting and pricing policies, model documentation packages, and ministry internal briefs) and thinking: there has to be a better way to do this.

This article is about what I built, how it works, and what it taught me. It is written for economists, analysts, and finance professionals who are curious about generative AI but do not have a computer science background. No jargon, no assumptions.

Anyone who works in economic research or financial analysis knows the routine. A new IMF report drops. It is 180 pages. You need to understand the key risks, the inflation outlook, and how it compares to the previous year’s report — ideally before your next meeting.

You skim. You use Ctrl+F. You read the executive summary and hope it covers what you need. Sometimes it does. Often it doesn’t.

Now multiply that by five reports, three institutions, and a quarterly cycle. The information is all there. The bottleneck is human reading time.

This is the problem I wanted to solve. Not replace the analyst — but give the analyst a better way to interact with the documents they already have.

RAG stands for Retrieval-Augmented Generation. The name sounds technical, but the concept is intuitive once you strip away the acronym.

Think of a standard AI chatbot like ChatGPT. It knows a lot — it was trained on enormous amounts of text from the internet. But it does not know your specific documents. If you ask it “what does the IMF’s April 2025 report say about Gulf region inflation?”, it will either make something up or tell you it doesn’t have that information.

RAG solves this by giving the AI a way to search your documents before it answers.

Here is the analogy I find most useful. Imagine you are a brilliant research analyst. I give you a stack of 500-page reports and ask you a question. You don’t read all 500 pages before answering. You flip to the relevant sections, read those carefully, and then formulate your answer based on what you actually found — not what you vaguely remember.

RAG works the same way. It retrieves the most relevant passages from your documents, then uses an AI model to generate an answer grounded in those passages. The result is an AI that can answer questions about documents it has never seen before — and tell you exactly which page it found the answer on.

Let me walk through what happens under the hood when you upload a report and ask a question. I’ll use plain language throughout.

Step 1: You upload a PDF

The application reads your PDF and breaks it into smaller chunks of text — roughly 1,200 characters each, with a 250-character overlap between chunks. The overlap is important as it ensures that a sentence that falls at the boundary between two chunks doesn’t get lost.

Think of it like cutting a long research paper into index cards, with each card sharing a few lines with the one before it, so nothing falls through the cracks.
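To make the index-card idea concrete, here is a minimal sketch of fixed-size chunking with overlap. It is not the application's actual splitter (a LangChain text splitter would typically also try to break on paragraph and sentence boundaries); the function name `split_into_chunks` and the stand-in text are assumptions made for illustration. Only the 1,200 and 250 figures come from the description above.

```python
def split_into_chunks(text: str, chunk_size: int = 1200, overlap: int = 250) -> list[str]:
    """Cut text into character windows; neighbours share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Stand-in for text extracted from a PDF (3,000 characters of digits).
report_text = "".join(str(i % 10) for i in range(3000))
chunks = split_into_chunks(report_text)

print(len(chunks), [len(c) for c in chunks])
# The tail of one card is the head of the next:
print(chunks[0][-250:] == chunks[1][:250])
```

Each window advances by 950 characters (1,200 minus 250), which is why consecutive chunks share their last and first 250 characters.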

Step 2: The chunks are converted into numbers

This is the part that sounds the most abstract but is actually the most important. Each chunk of text is converted into a list of numbers — called an embedding — using OpenAI’s embedding model.

Here’s the intuition. Words and sentences that mean similar things produce similar sets of numbers. “Inflation risk” and “price pressure” will be close together in this numerical space. “Central bank policy” and “monetary tightening” will be close together. Completely unrelated concepts will be far apart.

This means the system can find relevant content even when your question doesn’t use the exact same words as the document. It is matching by meaning, not by keyword.
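A small sketch of what "close together" means in practice. Closeness between two embeddings is commonly measured with cosine similarity, which is what the code below computes. The three-number vectors are invented by hand purely for illustration; real embeddings come from the embedding model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return 1.0 for vectors pointing the same way, near 0.0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made toy "embeddings" (assumed values, not output of a real model).
toy_embeddings = {
    "inflation risk": [0.9, 0.1, 0.0],
    "price pressure": [0.8, 0.2, 0.1],
    "football results": [0.0, 0.1, 0.9],
}

related = cosine_similarity(toy_embeddings["inflation risk"], toy_embeddings["price pressure"])
unrelated = cosine_similarity(toy_embeddings["inflation risk"], toy_embeddings["football results"])

print(f"related: {related:.2f}, unrelated: {unrelated:.2f}")
```

The two inflation-related phrases score close to 1, while the unrelated phrase scores close to 0, even though none of the phrases share a word.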

Step 3: The numbers are stored in a vector database

These numerical representations are stored in a vector database called Chroma. Think of it as a very sophisticated filing system. Vector databases are often described as the long-term memory for AI models: they map unstructured data (like text or images) into high-dimensional vectors, so the data can be searched by semantic meaning rather than exact keywords.

Step 4: You ask a question

Your question is also converted into numbers using the same embedding model. The system then searches the vector database for the chunks of text whose numbers are closest to your question’s numbers — i.e., the passages most relevant to what you’re asking.
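The search step can be sketched in a few lines. This is a simplified in-memory stand-in, not how Chroma works internally: the list called `store`, the function `top_k_chunks`, and all vectors are hypothetical, and the chunk labels are placeholders rather than text from any real report.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def top_k_chunks(query_vector, store, k=2):
    """Rank every stored chunk by similarity to the question and keep the best k."""
    scored = [(cosine_similarity(query_vector, vector), label) for label, vector in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [label for _, label in scored[:k]]


# Toy "vector database": (chunk label, made-up embedding) pairs.
store = [
    ("chunk about headline inflation", [0.9, 0.1, 0.1]),
    ("chunk about core price pressures", [0.8, 0.3, 0.0]),
    ("chunk about the fiscal deficit", [0.1, 0.9, 0.2]),
    ("chunk about methodology notes", [0.0, 0.1, 0.9]),
]

# Pretend this is the embedding of "What is the inflation outlook?"
query_vector = [1.0, 0.2, 0.0]

results = top_k_chunks(query_vector, store, k=2)
print(results)
```

A real vector database does the same job with indexing structures that avoid comparing the question against every chunk, which matters once there are many documents.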

FinSight, the application described here, uses a retrieval method called MMR (Maximal Marginal Relevance), which is designed to return results that are both relevant and diverse. This prevents the system from returning eight slightly different versions of the same paragraph; instead, it deliberately picks passages that differ from the ones already selected.
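For readers who want to see the relevance-versus-diversity trade-off spelled out, here is a minimal sketch of the standard MMR selection rule: each pick maximizes relevance to the question minus similarity to what has already been picked. It is an illustration, not the library's implementation; the function `mmr_select`, the weight name `lambda_mult`, and the two-number vectors are all assumptions.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr_select(query_vec, candidates, k=2, lambda_mult=0.5):
    """Pick k items, trading off relevance against redundancy with earlier picks."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def mmr_score(item):
            relevance = cosine_similarity(query_vec, item[1])
            # Similarity to the closest already-selected item (0 if none yet).
            redundancy = max(
                (cosine_similarity(item[1], chosen[1]) for chosen in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return [label for label, _ in selected]


# Toy candidates: two near-duplicate passages and one different passage.
candidates = [
    ("inflation paragraph, version 1", [1.0, 0.0]),
    ("inflation paragraph, version 2", [0.99, 0.05]),
    ("exchange-rate paragraph", [0.6, 0.8]),
]
query_vec = [1.0, 0.3]

# Plain similarity ranking returns the two near-duplicates.
plain_top2 = sorted(
    candidates, key=lambda c: cosine_similarity(query_vec, c[1]), reverse=True
)[:2]
plain_top2 = [label for label, _ in plain_top2]

mmr_top2 = mmr_select(query_vec, candidates, k=2)
print(plain_top2)
print(mmr_top2)
```

In this toy example, plain ranking keeps both versions of the inflation paragraph, while MMR keeps one of them and replaces the other with the less similar exchange-rate paragraph.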

Step 5: The AI generates a grounded answer

The retrieved passages are sent to GPT-4o-mini along with your question. The model is instructed to answer using only what it found in those passages — not its general training knowledge. It also tells you which document and which page each point came from.

This source-grounding is what makes RAG genuinely useful for professional research. You are not just getting an answer — you are getting a cited answer you can verify.
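A rough sketch of how retrieved passages and a question might be packaged into a single instruction for the model. The prompt wording, the function `build_grounded_prompt`, the file names, and the placeholder passages are all hypothetical; the application's real prompts are not reproduced here.

```python
def build_grounded_prompt(question, passages):
    """passages: list of dicts with 'source', 'page', and 'text' keys."""
    numbered = [
        f"[{i}] ({p['source']}, page {p['page']}) {p['text']}"
        for i, p in enumerate(passages, start=1)
    ]
    return (
        "Answer using ONLY the passages below, not general knowledge. "
        "If they do not contain the answer, say so. "
        "Cite the document and page for every point.\n\n"
        "Passages:\n" + "\n".join(numbered) + "\n\n"
        f"Question: {question}"
    )


# Placeholder passages standing in for what retrieval would return.
passages = [
    {"source": "report_a.pdf", "page": 14, "text": "Placeholder passage about inflation."},
    {"source": "report_b.pdf", "page": 3, "text": "Placeholder passage about growth."},
]

prompt = build_grounded_prompt("What is the inflation outlook?", passages)
print(prompt)
```

Because every passage carries its document name and page number into the prompt, the model has what it needs to cite its sources in the answer.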

The application has five core research functions, all of which I use for economic and financial document analysis.

Individual report summaries: Upload a single report and the application generates a structured executive summary covering the main themes, growth and inflation outlook, key risks, policy implications, and a five-bullet summary for senior management. Useful when you need to brief someone quickly without reading 150 pages yourself.

Combined multi-report summaries: Upload several reports at once — say, the IMF World Economic Outlook and a central bank financial stability report — and the system synthesizes them into a single summary, noting where they agree and where they diverge.

Report comparison: Ask a specific comparison question — “How does the 2024 report’s inflation outlook differ from the 2025 report?” — and the system searches across both documents to produce a structured comparison with source references.

Risk identification: The system retrieves all passages related to risks, vulnerabilities, and downside scenarios across the uploaded documents and produces a ranked risk register with severity levels. Exactly the kind of output a risk analyst or credit officer needs.

Conversational question answering: Ask follow-up questions in plain English. The system remembers your conversation history and is smart enough to handle vague follow-ups. If you ask “Is it worrisome?” after asking about inflation, it knows to rewrite your question as “Is the inflation outlook described in the report worrisome?” before searching — so it retrieves the right content even when your question is conversational rather than precise.
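The follow-up rewriting step can be sketched as one more prompt that is sent to the model before any search happens. This is an illustrative assumption about how such a step is commonly built, not the application's code: `build_rewrite_prompt`, the prompt wording, and the placeholder conversation are hypothetical, and the call to the language model itself is left out.

```python
def build_rewrite_prompt(history, follow_up):
    """history: list of (speaker, message) tuples, oldest first."""
    transcript = "\n".join(f"{speaker}: {message}" for speaker, message in history)
    return (
        "Rewrite the follow-up question so it can be understood without the "
        "conversation. Return only the rewritten question.\n\n"
        f"Conversation:\n{transcript}\n\n"
        f"Follow-up question: {follow_up}\n"
        "Standalone question:"
    )


history = [
    ("User", "What does the report say about inflation?"),
    ("Assistant", "Placeholder answer about the inflation outlook."),
]

rewrite_prompt = build_rewrite_prompt(history, "Is it worrisome?")
print(rewrite_prompt)
```

The model's reply to this prompt (the standalone question) is what gets embedded and searched, rather than the vague follow-up itself.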

The technical tools — LangChain, Chroma, Streamlit, the OpenAI API — are well-documented and more accessible than I expected. What took more thought was the design.

The hardest problem was not the code. It was deciding what the application should do and how to prompt the model to do it well. Every major function in FinSight has a carefully written instruction set — a prompt — that tells the model exactly how to structure its output, what to prioritize, and what not to do (most importantly: do not answer from general knowledge, only from the retrieved documents).

This is where an economics or research background is genuinely useful. Writing a good prompt for a risk identification function is not a software engineering problem. It is a research design problem. What does a well-structured risk analysis look like? What categories matter? What does a senior banker or policy professional need to see? These are questions I could answer from experience.

The insight I keep coming back to is this: the gap between “someone who can code” and “someone who can build something useful” is not purely technical. It is about understanding the problem well enough to design a solution. Domain knowledge is not a consolation prize for not being a software engineer. It is the thing that makes the tool worth building.

For those curious about what is actually running under the hood:

OpenAI GPT-4o-mini is the language model that reads the retrieved text and generates answers. It is fast, cost-effective, and well-suited for structured analytical tasks.

OpenAI Embeddings (text-embedding-3-small) converts text into numerical vectors. This is what makes meaning-based search possible.

LangChain is the framework that connects everything together: the document loader, the text splitter, the retrieval system, and the language model. Think of it as the plumbing.

Chroma is the vector database that stores the embeddings and handles the similarity search.

Streamlit is the tool that turns the Python code into a web application with buttons, text inputs, and a chat interface — no web development experience required.

PyPDF handles reading and extracting text from uploaded PDF files.

The financial services industry generates enormous volumes of structured, text-heavy analytical content: credit reports, regulatory filings, central bank publications, risk assessments, earnings transcripts. Most of this content is read by humans under time pressure, with significant information being missed or underweighted simply because there is too much of it.

RAG-based tools are not replacing that analytical work. They are removing the bottleneck. An analyst who can query five central bank reports in two minutes — getting cited, structured answers — can spend their time on the part that actually requires human judgment: interpreting the implications, forming a view, making a recommendation.

This is the same logic that made econometric modelling valuable. You are not replacing the economist with the regression. You are giving the economist a more powerful instrument.

The current version handles PDF documents and conversational question answering. The roadmap includes adding support for real-time data feeds — central bank websites, financial news sources — so the system can answer questions about current conditions, not just uploaded documents. I am also exploring ways to incorporate structured data outputs, so the system can produce tables and charts alongside its narrative summaries.

The GitHub repository is public. The README includes the full architecture, technology stack, and setup instructions.

GitHub: github.com/eramtafsir/finsight-rag

If you are an economist, banker, policy professional, or consultant thinking about what AI can realistically do for your research workflows, or simply trying to understand what RAG actually means in practice, I hope this was useful. The tools are more accessible than the terminology suggests. The harder question is always the one that requires domain expertise: what problem is actually worth solving?

Eram Tafsir is an Applied Economist and Quantitative Analyst with experience in econometric modeling, ML model validation, financial research, and economic policy analysis. She writes about the intersection of economics, data science, and AI.

Connect on LinkedIn: linkedin.com/in/eramtafsir

View projects on GitHub: github.com/eramtafsir

RAG Explained for Non-Engineers: How I Built an AI-Powered Economic and Financial Research Tool was originally published in Towards AI on Medium.

source & further reading

pub.towardsai.net (original article)
