Key Takeaway
Building a secure RAG-based in-house LLM utilization environment
Using a RAG architecture built on AIR Studio and AWS OpenSearch, we established a chatbot environment that makes safe use of in-house documents, and verified a security-focused LLM utilization system that automatically serves a RAG or LLM Only response depending on whether relevant materials exist.
Client: Automotive (D Company)
Industry: Automotive / Manufacturing
Service Area: Data & AI
1. Overview (Project Background)
This project was initiated to establish a secure LLM usage environment that minimizes the risk of technical information leakage and data-training exposure as generative AI adoption expands within the company. As employees began using public LLMs such as ChatGPT, concerns arose that internal corporate data could be leaked externally or used for model training, so a security-focused approach to generative AI utilization was needed.
Additionally, going beyond simple question-and-answer interactions, we implemented a RAG (Retrieval-Augmented Generation) chatbot based on in-house documents and embedding data, aiming for a structure that automatically switches the response method depending on whether relevant materials exist:
When internal documents exist → RAG-based response
When internal documents do not exist → LLM Only response
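The switch described above can be sketched as follows. This is a minimal illustration, not the delivered AIR Studio logic: the retriever is a toy keyword-overlap stub standing in for the OpenSearch vector search, and both response strings stand in for actual model calls.

```python
# Minimal sketch of the response-path switch: if the retriever finds
# relevant in-house documents, the query goes through the RAG path;
# otherwise it falls back to an LLM Only answer.

def retrieve(query: str, index: dict[str, str]) -> list[str]:
    """Toy retriever: return stored passages whose keyword appears in the query."""
    return [text for key, text in index.items() if key in query.lower()]

def answer(query: str, index: dict[str, str]) -> tuple[str, str]:
    """Return (path, response). Path is 'rag' when documents exist, else 'llm_only'."""
    passages = retrieve(query, index)
    if passages:
        # RAG path: ground the generation in the retrieved passages
        return "rag", f"Based on internal documents: {passages[0]}"
    # LLM Only path: no internal material, answer from the base model alone
    return "llm_only", "General answer without internal context."

index = {"vacation": "Employees receive 15 days of annual leave."}
print(answer("How many vacation days do I get?", index))   # takes the RAG path
print(answer("What is the capital of France?", index))     # takes the LLM Only path
```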
2. Solution (Resolution Approach)
Objective Definition
Verification of a data-leakage-prevention structure based on security solutions
Benchmarking the performance and response quality of AWS-based LLMs against GPT-4o
Key Verification Tasks
Verification of architecture to prevent internal data from being used for external training
Verification of response quality and accuracy using AWS LLM models
3. Result (Achievements)
Construction of RAG-based Data Processing Pipeline
Establishment of preprocessing process that converts various types of documents into structures suitable for RAG
Ensuring search accuracy by vector indexing preprocessed data in AWS OpenSearch
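The preprocessing step can be sketched as below: heterogeneous documents are normalized and split into uniform, overlapping text chunks sized for embedding. The chunk size and overlap values are illustrative assumptions; the actual pipeline would tune them per document type.

```python
# Sketch of the preprocessing that converts raw document text into
# RAG-suitable chunks before vector indexing.

def normalize(text: str) -> str:
    """Collapse runs of whitespace and line breaks into single spaces."""
    return " ".join(text.split())

def chunk(text: str, size: int = 300, overlap: int = 50) -> list[str]:
    """Split normalized text into overlapping chunks for embedding."""
    text = normalize(text)
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Overlapping chunks reduce the chance that a fact straddling a chunk boundary becomes unretrievable.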
Document Parsing and Indexing Enhancement
Document content parsing using LLM-based OCR
Composition of parsed documents into a RAG-usable structure by loading them into VectorDB (OpenSearch)
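A rough sketch of the OpenSearch side follows: a k-NN index mapping and a `_bulk` payload builder for loading parsed chunks as vectors. The index name, field names, and the 1536-dimension embedding size are illustrative assumptions; in the delivered system the opensearch-py client would send this payload to the cluster's `_bulk` endpoint.

```python
# Sketch of a k-NN vector index mapping for OpenSearch and a bulk-payload
# builder for loading (text, embedding) chunks into it.
import json

KNN_INDEX_MAPPING = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "content": {"type": "text"},
            "embedding": {
                "type": "knn_vector",
                "dimension": 1536,  # must match the embedding model's output size
            },
        }
    },
}

def build_bulk_payload(index_name: str, chunks: list[tuple[str, list[float]]]) -> str:
    """Serialize (text, vector) chunks into OpenSearch _bulk API lines."""
    lines = []
    for text, vector in chunks:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps({"content": text, "embedding": vector}))
    return "\n".join(lines) + "\n"

payload = build_bulk_payload("internal-docs", [("Leave policy...", [0.1, 0.2])])
```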
Chat API Business Logic Implementation
Intent classification performed on each user query (in-house regulations / ESG / other)
Automatic selection of the RAG pipeline or LLM Only response path based on the classification result
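The classification-then-routing logic can be sketched as follows. In production an LLM would perform the classification; the keyword table here is only a stand-in for that call, and the keyword lists themselves are illustrative assumptions.

```python
# Sketch of intent classification and path selection. A keyword stub
# stands in for the LLM classifier; labels mirror the three categories
# above (in-house regulations / ESG / other).

INTENT_KEYWORDS = {
    "in_house_regulations": ["leave", "regulation", "policy", "expense"],
    "esg": ["esg", "carbon", "emission", "sustainability"],
}

def classify_intent(query: str) -> str:
    """Return the first intent whose keywords appear in the query, else 'other'."""
    q = query.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(k in q for k in keywords):
            return intent
    return "other"

def select_path(intent: str) -> str:
    """Route regulation and ESG questions to RAG; everything else to LLM Only."""
    return "rag" if intent in ("in_house_regulations", "esg") else "llm_only"
```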
Document Correction Function Verification
Implementation of correction pipeline for typos and expression errors using LLM
Verified the potential for LLM-based document quality improvement
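The correction pipeline's chunk-and-reassemble flow can be sketched like this. The `correct_chunk` body is a toy substitution table standing in for the actual LLM call, which would prompt a model to return a corrected version of each chunk.

```python
# Sketch of the typo/expression correction pipeline: split the document
# into chunks, correct each chunk, and reassemble the result.

TYPO_FIXES = {"recieve": "receive", "seperate": "separate"}  # stand-in for LLM output

def correct_chunk(chunk: str) -> str:
    """Stand-in for the per-chunk LLM correction call."""
    for wrong, right in TYPO_FIXES.items():
        chunk = chunk.replace(wrong, right)
    return chunk

def correct_document(text: str, chunk_size: int = 200) -> str:
    """Correct a document chunk by chunk and reassemble it."""
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    return "".join(correct_chunk(c) for c in chunks)
```

A real implementation would chunk on sentence boundaries so that a misspelling cannot be split across two chunks.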
Expected Benefits
RAG-based Chatbot Utilization
Provision of in-house document RAG chatbot and Web RAG chatbot through AIR Studio
Support for document management and configuration management functions by repository
Establishment of chatbot verification system based on expected question-answer sets
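The verification system above can be sketched as a scoring loop over an expected question-answer set. The containment check is a deliberately simple stand-in for whatever quality metric the actual evaluation used; the QA pairs and toy bot are illustrative.

```python
# Sketch of chatbot verification against an expected question-answer set:
# each question is sent to the bot and the response is checked against
# the expected answer.

def evaluate(qa_set: list[tuple[str, str]], answer_fn) -> float:
    """Return the fraction of questions whose response contains the expected answer."""
    correct = sum(1 for question, expected in qa_set if expected in answer_fn(question))
    return correct / len(qa_set)

# Toy QA set and chatbot to illustrate the scoring
qa_set = [("How many leave days?", "15 days"), ("Who approves expenses?", "team lead")]
bot = lambda q: "You get 15 days of annual leave." if "leave" in q else "Not found."
```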
Document Correction Automation
Provision of Streamlit-based UI
Automatic inspection of the entire document on upload, with corrected output returned
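A rough sketch of the Streamlit flow is below. The inspect/correct core is kept as a pure function; `run_app` shows the assumed UI wiring (uploader, metric, corrected text) and would be launched with `streamlit run`. The inline fix table again stands in for the LLM correction step, and all widget labels are illustrative, not the delivered app.

```python
# Sketch of the Streamlit-based correction UI: upload a document, inspect
# the whole text, and show the corrected output plus a change count.

def inspect_and_correct(text: str) -> dict:
    """Return the corrected text and the number of corrections applied."""
    fixes = {"recieve": "receive", "teh": "the"}  # stand-in for LLM corrections
    corrected, changes = text, 0
    for wrong, right in fixes.items():
        changes += corrected.count(wrong)
        corrected = corrected.replace(wrong, right)
    return {"corrected": corrected, "changes": changes}

def run_app():
    """UI wiring; run via `streamlit run app.py` (not executed here)."""
    import streamlit as st
    st.title("Document correction")
    uploaded = st.file_uploader("Upload a document", type=["txt"])
    if uploaded:
        report = inspect_and_correct(uploaded.read().decode("utf-8"))
        st.metric("Corrections applied", report["changes"])
        st.text_area("Corrected document", report["corrected"])
```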