Key Takeaway
Building a secure RAG-based in-house LLM utilization environment
Using a RAG architecture built on AIR Studio and AWS OpenSearch, we established a chatbot environment that makes safe use of in-house documents, and verified a security-focused LLM utilization system that automatically serves a RAG or LLM Only response depending on whether relevant materials exist.
Client: Automotive (D Company)
Industry: Automotive / Manufacturing
Service Area: Data & AI
1. Overview (Project Background)
This project was initiated to establish a secure LLM usage environment that minimizes the risk of technical information leakage and data-training exposure as generative AI adoption expands within the company. As employees began using public LLMs such as ChatGPT, concerns arose that internal corporate data could be leaked externally or used for model training, so a security-focused approach to generative AI utilization was needed.
Additionally, going beyond simple question-and-answer interactions, we implemented a RAG (Retrieval-Augmented Generation) chatbot based on in-house documents and embedding data, aiming for a structure that automatically switches the response method depending on whether relevant materials exist:
When internal documents exist → RAG-based response
When internal documents do not exist → LLM Only response
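The switch described above can be sketched as follows. This is a minimal illustration, not the delivered AIR Studio logic: the retriever is a toy keyword-overlap stub standing in for the OpenSearch vector search, and both response strings stand in for actual model calls.

```python
# Minimal sketch of the response-path switch: if the retriever finds
# relevant in-house documents, the query goes through the RAG path;
# otherwise it falls back to an LLM Only answer.

def retrieve(query: str, index: dict[str, str]) -> list[str]:
    """Toy retriever: return stored passages whose keyword appears in the query."""
    return [text for key, text in index.items() if key in query.lower()]

def answer(query: str, index: dict[str, str]) -> tuple[str, str]:
    """Return (path, response). Path is 'rag' when documents exist, else 'llm_only'."""
    passages = retrieve(query, index)
    if passages:
        # RAG path: ground the generation in the retrieved passages
        return "rag", f"Based on internal documents: {passages[0]}"
    # LLM Only path: no internal material, answer from the base model alone
    return "llm_only", "General answer without internal context."

index = {"vacation": "Employees receive 15 days of annual leave."}
print(answer("How many vacation days do I get?", index))   # takes the RAG path
print(answer("What is the capital of France?", index))     # takes the LLM Only path
```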
2. Solution (Resolution Approach)
Objective Definition
Verification of a data-leakage-prevention structure based on security solutions
Benchmarking the performance and response quality of AWS-based LLMs against GPT-4o
Key Verification Tasks
Verification of architecture to prevent internal data from being used for external training
Verification of response quality and accuracy using AWS LLM models
3. Result (Achievements)
Construction of RAG-based Data Processing Pipeline
Establishment of preprocessing process that converts various types of documents into structures suitable for RAG
Ensuring search accuracy by vector indexing preprocessed data in AWS OpenSearch
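The preprocessing step can be sketched as below: heterogeneous documents are normalized and split into uniform, overlapping text chunks sized for embedding. The chunk size and overlap values are illustrative assumptions; the actual pipeline would tune them per document type.

```python
# Sketch of the preprocessing that converts raw document text into
# RAG-suitable chunks before vector indexing.

def normalize(text: str) -> str:
    """Collapse runs of whitespace and line breaks into single spaces."""
    return " ".join(text.split())

def chunk(text: str, size: int = 300, overlap: int = 50) -> list[str]:
    """Split normalized text into overlapping chunks for embedding."""
    text = normalize(text)
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Overlapping chunks reduce the chance that a fact straddling a chunk boundary becomes unretrievable.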
Document Parsing and Indexing Enhancement
Document content parsing using LLM-based OCR
Composition of parsed documents into a RAG-usable structure by loading them into VectorDB (OpenSearch)
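A rough sketch of the OpenSearch side follows: a k-NN index mapping and a `_bulk` payload builder for loading parsed chunks as vectors. The index name, field names, and the 1536-dimension embedding size are illustrative assumptions; in the delivered system the opensearch-py client would send this payload to the cluster's `_bulk` endpoint.

```python
# Sketch of a k-NN vector index mapping for OpenSearch and a bulk-payload
# builder for loading (text, embedding) chunks into it.
import json

KNN_INDEX_MAPPING = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "content": {"type": "text"},
            "embedding": {
                "type": "knn_vector",
                "dimension": 1536,  # must match the embedding model's output size
            },
        }
    },
}

def build_bulk_payload(index_name: str, chunks: list[tuple[str, list[float]]]) -> str:
    """Serialize (text, vector) chunks into OpenSearch _bulk API lines."""
    lines = []
    for text, vector in chunks:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps({"content": text, "embedding": vector}))
    return "\n".join(lines) + "\n"

payload = build_bulk_payload("internal-docs", [("Leave policy...", [0.1, 0.2])])
```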
Chat API Business Logic Implementation
Intent classification performed on each user query (in-house regulations / ESG / other)
Automatic selection of the RAG pipeline or LLM Only response path based on the classification result
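The classification-then-routing logic can be sketched as follows. In production an LLM would perform the classification; the keyword table here is only a stand-in for that call, and the keyword lists themselves are illustrative assumptions.

```python
# Sketch of intent classification and path selection. A keyword stub
# stands in for the LLM classifier; labels mirror the three categories
# above (in-house regulations / ESG / other).

INTENT_KEYWORDS = {
    "in_house_regulations": ["leave", "regulation", "policy", "expense"],
    "esg": ["esg", "carbon", "emission", "sustainability"],
}

def classify_intent(query: str) -> str:
    """Return the first intent whose keywords appear in the query, else 'other'."""
    q = query.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(k in q for k in keywords):
            return intent
    return "other"

def select_path(intent: str) -> str:
    """Route regulation and ESG questions to RAG; everything else to LLM Only."""
    return "rag" if intent in ("in_house_regulations", "esg") else "llm_only"
```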
Document Correction Function Verification
Implementation of correction pipeline for typos and expression errors using LLM
Verified the potential for LLM-based document quality improvement
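The correction pipeline's chunk-and-reassemble flow can be sketched like this. The `correct_chunk` body is a toy substitution table standing in for the actual LLM call, which would prompt a model to return a corrected version of each chunk.

```python
# Sketch of the typo/expression correction pipeline: split the document
# into chunks, correct each chunk, and reassemble the result.

TYPO_FIXES = {"recieve": "receive", "seperate": "separate"}  # stand-in for LLM output

def correct_chunk(chunk: str) -> str:
    """Stand-in for the per-chunk LLM correction call."""
    for wrong, right in TYPO_FIXES.items():
        chunk = chunk.replace(wrong, right)
    return chunk

def correct_document(text: str, chunk_size: int = 200) -> str:
    """Correct a document chunk by chunk and reassemble it."""
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    return "".join(correct_chunk(c) for c in chunks)
```

A real implementation would chunk on sentence boundaries so that a misspelling cannot be split across two chunks.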
Expected Benefits
RAG-based Chatbot Utilization
Provision of in-house document RAG chatbot and Web RAG chatbot through AIR Studio
Support for document management and configuration management functions by repository
Establishment of chatbot verification system based on expected question-answer sets
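The verification system above can be sketched as a scoring loop over an expected question-answer set. The containment check is a deliberately simple stand-in for whatever quality metric the actual evaluation used; the QA pairs and toy bot are illustrative.

```python
# Sketch of chatbot verification against an expected question-answer set:
# each question is sent to the bot and the response is checked against
# the expected answer.

def evaluate(qa_set: list[tuple[str, str]], answer_fn) -> float:
    """Return the fraction of questions whose response contains the expected answer."""
    correct = sum(1 for question, expected in qa_set if expected in answer_fn(question))
    return correct / len(qa_set)

# Toy QA set and chatbot to illustrate the scoring
qa_set = [("How many leave days?", "15 days"), ("Who approves expenses?", "team lead")]
bot = lambda q: "You get 15 days of annual leave." if "leave" in q else "Not found."
```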
Document Correction Automation
Provision of Streamlit-based UI
Automatic inspection of the entire document on upload, with corrected output returned
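A rough sketch of the Streamlit flow is below. The inspect/correct core is kept as a pure function; `run_app` shows the assumed UI wiring (uploader, metric, corrected text) and would be launched with `streamlit run`. The inline fix table again stands in for the LLM correction step, and all widget labels are illustrative, not the delivered app.

```python
# Sketch of the Streamlit-based correction UI: upload a document, inspect
# the whole text, and show the corrected output plus a change count.

def inspect_and_correct(text: str) -> dict:
    """Return the corrected text and the number of corrections applied."""
    fixes = {"recieve": "receive", "teh": "the"}  # stand-in for LLM corrections
    corrected, changes = text, 0
    for wrong, right in fixes.items():
        changes += corrected.count(wrong)
        corrected = corrected.replace(wrong, right)
    return {"corrected": corrected, "changes": changes}

def run_app():
    """UI wiring; run via `streamlit run app.py` (not executed here)."""
    import streamlit as st
    st.title("Document correction")
    uploaded = st.file_uploader("Upload a document", type=["txt"])
    if uploaded:
        report = inspect_and_correct(uploaded.read().decode("utf-8"))
        st.metric("Corrections applied", report["changes"])
        st.text_area("Corrected document", report["corrected"])
```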