BLOG

Implementing GenAI for Effective Log Analysis and Query Automation in Amazon Security Lake (Amazon Bedrock, RAG)
Published: 2024-06-25

The only Korean MSP with the Amazon Security Lake Service Partner Certification!

 

 

Have you seen Amazon Security Lake featured on the MegazoneCloud Tech blog? Amazon Security Lake is an AWS-native security data lake service that makes it easy to collect various AWS logs and convert them into a unified format, the Open Cybersecurity Schema Framework (OCSF). In October 2023, MegazoneCloud became the first company in Korea to achieve Security Lake Service Partner status, and we have been introducing this service to various clients based on our technical expertise.

 

While Amazon Security Lake excels at collecting AWS-native logs, many of our clients find it challenging to analyze them. To address this, we implemented and tested an architecture that leverages increasingly popular GenAI services to assist with log analysis. We showcased this implementation at the MegazoneCloud booth at AWS Summit Seoul 2024, and we are excited to share it here.

 

 

AWS Services and Technologies for a High-Performance GenAI Architecture

 

Earlier this year, a post on the AWS Security Blog detailed how to utilize data from Amazon Security Lake with Amazon SageMaker and Amazon Bedrock. The post demonstrates how LLMs can analyze logs in the OCSF format, collected centrally through Security Lake, enabling more efficient threat detection and faster incident response. The content was fascinating, and if the desired performance could be achieved in an actual implementation, it would undoubtedly be a valuable tool for enterprises.

 

At the MegazoneCloud Cloud Technology Center (CTC), we used this blog post as a reference to implement the solution, conduct performance testing, and optimize it. Before diving into the demo configuration, let’s first explore the relevant services and technologies.

 

Amazon Bedrock and Amazon SageMaker

 

Amazon Bedrock and Amazon SageMaker are AWS platforms for AI/ML services, each serving distinct purposes and functionalities.

 

Amazon Bedrock is a fully managed service that integrates high-performance foundation models (FMs) from leading AI companies. It allows users to easily test and evaluate various large-scale AI models, select models based on specific requirements, and personalize them through techniques such as fine-tuning and retrieval-augmented generation (RAG). (RAG will be explained later.) Amazon Bedrock offers a variety of large language models (LLMs) and places a strong emphasis on the secure and safe management of user data.
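For readers who want to see what this model selection looks like in practice, here is a minimal sketch (assuming boto3 credentials and the us-east-1 region used later in this post) that lists the foundation models available to an account:

import boto3

# List the foundation models available to the account.
# Region assumed: us-east-1, since Bedrock was not yet available in Seoul at the time of this demo.
bedrock = boto3.client("bedrock", region_name="us-east-1")

for model in bedrock.list_foundation_models()["modelSummaries"]:
    print(model["modelId"], "-", model["providerName"])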

 

Amazon SageMaker, on the other hand, supports the entire ML workflow, streamlining the development of ML models. It offers an integrated platform that facilitates visual management of all ML workloads, from data preparation to model training, tuning, and deployment.

 

Both services are pivotal to AWS’s ML and AI ecosystem, enabling users to implement AI and ML functionalities effortlessly without the complexities of managing infrastructure. However, Amazon Bedrock focuses primarily on LLM-based large-scale generative AI applications, while Amazon SageMaker supports a broader ML workflow, allowing developers to finely control and optimize ML projects for various workloads.

 

In this demo, the integration of Amazon Bedrock and Amazon SageMaker enables in-depth analysis and rapid response for security data.

 

 

Foundation Models (Anthropic Claude – Haiku, Sonnet, Opus)

 

For the demo, we tested three Anthropic Claude 3 models: Haiku, Sonnet, and Opus. We chose them from among the various LLMs supported by Amazon Bedrock specifically for their Korean language capabilities.

 

Here are the features of each model:

 

1. Haiku

    1. Description: The smallest model in the Claude series, optimized for efficiency and rapid response times.
    2. Capabilities: Well-suited for basic language understanding, simple content generation, and quick response tasks.
    3. Ideal For: Applications with limited resources and budget constraints, such as customer support chatbots, simple content generation tools, and lightweight language processing applications.
    4. Advantages: Cost-competitive with lower computational requirements and faster processing speeds.
    5. Target Users: Small businesses, startups, or individual developers.

 

2. Sonnet

    1. Description: A medium-sized model in the Claude series, balancing performance and resource consumption.
    2. Capabilities: Enhanced language understanding and generation, handling complex content generation, contextual understanding, and detailed language tasks.
    3. Ideal For: Advanced customer support systems, content generation platforms, marketing automation tools, and interactive AI applications.
    4. Advantages: Optimized cost-performance for enhanced features.
    5. Target Users: Small and medium-sized enterprises, tech companies, and services requiring sophisticated language processing.

 

3. Opus

    1. Description: The largest and most powerful model in the Claude series, offering state-of-the-art capabilities.
    2. Capabilities: Advanced content generation, deep context understanding, complex task handling, and large-scale application support.
    3. Ideal For: Detailed content creation, extensive marketing and sales automation, and research and development.
    4. Advantages: Top-tier AI capabilities from extensive and diverse training datasets.
    5. Target Users: Large enterprises and organizations requiring cutting-edge AI solutions.
    6. Cost: The most expensive Claude model due to its extensive training and advanced features.

 

In summary, Haiku is ideal for businesses needing a low-cost, high-efficiency AI solution for basic tasks. Sonnet balances performance and cost, catering to businesses requiring advanced functionality within a reasonable budget. Opus targets organizations ready to invest in premium performance for advanced applications, offering top-tier AI capabilities. Each model in the Claude series is designed to meet specific requirements and budget constraints, making it possible to cater to a variety of business needs.

 

 

Text2SQL

 

Text2SQL (Text-to-SQL) is a technology that automatically converts natural language questions into SQL (Structured Query Language) queries. This allows users to ask questions in everyday language, which the system translates into SQL queries tailored to retrieve the requested data from databases.

 

This capability is particularly valuable for users who manage complex databases or lack technical expertise, as it lets them access the information they need without writing complex query syntax. By simply asking questions in natural language, users can easily analyze and extract the required data. This is especially beneficial in the security field, allowing security professionals without SQL expertise to effortlessly analyze security log data and extract the important information they need.
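To make this concrete, here is a purely illustrative question-to-query mapping; the column names loosely follow the tables used later in this post, but the query itself is hypothetical and not the demo's actual output:

# Hypothetical Text2SQL mapping (illustrative only)
question = "Which source IP made the most API calls in the past week?"

generated_sql = """
SELECT src_endpoint_ip, COUNT(*) AS api_call_count
FROM cloudtrail_table
WHERE time >= now() - interval '7' day
GROUP BY src_endpoint_ip
ORDER BY api_call_count DESC
LIMIT 10
"""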

 

Text2SQL facilitates faster and more efficient data-driven decision-making by reducing barriers to database usage. Its applications span diverse fields such as business analysis, customer service management, and security monitoring, enabling a wide range of use cases.

 

Architecture

 

Next, let’s delve into the actual implementation and testing carried out by MegazoneCloud using these AWS services and technologies.

 

Demo Implementation

 

Amazon Security Lake with GenAI

 

For the demo, we configured only the necessary services, following the architecture guidelines provided in the AWS blog post. First, we separated the account where Amazon Security Lake is activated from the account where GenAI would be configured, because Amazon Security Lake is an organization-level service that supervises other accounts. In the GenAI for Security Lake account, we then enabled Amazon SageMaker and Amazon Bedrock and configured the GenAI workflow with them.

 

Within this account, we set up the SageMaker VPC and SageMaker domain and configured the Amazon SageMaker instance. Additionally, we enabled data analysis via Amazon Athena, and the analysis results were stored in Amazon S3.

 

Since Amazon Bedrock was not available in the Seoul region at the time, we deployed all services in the Northern Virginia (us-east-1) region. Additionally, because Amazon Security Lake integrates and collects logs from various sources such as cloud services, servers, and databases, testing all log types would require significant resources and face numerous constraints. To keep the environment realistic, we therefore configured and tested only a subset of logs: VPC Flow Logs and CloudTrail logs.

Amazon SageMaker Architecture

 

Amazon SageMaker Workflow

 

After setting up the AWS environment, we configured a notebook instance in Amazon SageMaker. The fundamental logic in SageMaker involves creating a prompt template based on the table and data information, invoking the LLM with that template, outputting the results, validating the generated query, and explaining the analysis results.

 

The main modules in SageMaker are as follows, and we’ll take a closer look at the code for each module in the Amazon SageMaker Workflow architecture above:

 

● table_info_call_athena: Extracts the table and data information used.
● create_query_prompt: Defines the basic prompt template structure.
● task_define_template: A detailed task-definition template, combined with the basic structure from `create_query_prompt`.
● query_validation_with_record: Validates the generated query against actual records.

 

(Note: The code provided is the final configuration, and there may be some differences from the code tested during the process.)
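Before looking at each module individually, the condensed sketch below shows how they might be chained end to end. It follows the module list above; names such as `query_invoke_model`, `query_extract`, `query_execute_with_athena`, and the Streamlit session state are taken from the code snippets later in this post, and the signatures are simplified for illustration.

# Condensed orchestration sketch of the notebook logic (simplified, for illustration)
# 1. Build a prompt from the table metadata and the user's natural-language question.
table_info = slm.table_info_call_athena(table_name=table_name, info_type='structure')
prompt = slm.create_query_prompt(table_info=table_info, query_k=query_k, user_input=user_input, table_name=table_name)

# 2. Invoke the Claude model on Amazon Bedrock and extract only the SQL query from its answer.
return_text, boto3_bedrock = slm.query_invoke_model(model_name=model_name, prompt=prompt, max_token=None)
model_query = slm.query_extract(return_text=return_text)

# 3. Execute the query on Athena; if it fails or returns no data, regenerate and retry.
query_result, session_log, valid_log, query_id = slm.query_validation_with_record(
    valid_roop=valid_roop, model_query=model_query, table_info=table_info, query_k=query_k,
    user_input=user_input, table_name=table_name, boto3_bedrock=boto3_bedrock,
    max_tokens=max_tokens, query_mode=query_mode)

# 4. Ask the model to explain the final query and analyze its results.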

 

1. Creating a Template Based on Table and Data Information

 

Firstly, we start by receiving the user’s natural language input. Then, we extract the relevant table and data information from the input and create a prompt template based on this information. The prompt template not only contains the table and data information but also incorporates business logic commonly used in practical scenarios, which enhances the accuracy of the Amazon Bedrock model.

 

# Extracting table and data information, declaring the prompt template
table_name = 'cloudtrail_table'
table_info = slm.table_info_call_athena(table_name=table_name, info_type='structure')
prompt = slm.create_query_prompt(table_info=table_info, query_k=query_k, user_input=user_input,
                                 table_name=table_name, retrieve_few_shots=globals()['retrieve_few_shots'])

 

# Extracting database table information
def table_info_call_athena(table_name=None, info_type=None):
    if info_type == 'structure':
        # Return the predefined schema description (with sample rows) for the table
        return spt.cloudtrail_table_info

 

# Create the prompt template from the table information and the user input
def create_query_prompt(table_info=None, sample_log=None, query_k=None, user_input=None,
                        table_name=None, retrieve_few_shots=None):
    template = spt.task_define_template(table_info=table_info, sample_log=sample_log,
                                        query_k=query_k, table_name=table_name)
    prompt = f"""
        System: {template}
        Human: {user_input}
        Assistant:
    """
    return prompt

...


def task_define_template(table_info=None, sample_log=None, query_k=None, table_name=None):
    task_define_template = f"""
        You are the chief security officer in charge of in-house security control.
        You are in charge of analyzing AWS Security Lake data.
        Your role is to convert natural language requests into valid SQL queries.
        <table_info>
        {table_info}
        </table_info>
        <table_info></table_info> contains the database schema and 3 sample rows from the table.
        Generate a query using only the columns that exist in the database schema provided in the table information.
        The DB table name is "{table_name}".
        Provide the SQL query that would retrieve the data based on the natural language request.
        Always limit a query to a maximum of {query_k} results.
        If you think the question is not related to the database, answer "The requested action cannot be performed."
    """
    return task_define_template

2. Invoking LLM and Outputting Results Based on the Template

 

We call the Claude model with the prompt template and output the results. Users can choose an Anthropic Claude model according to their preference or business environment, which showcases the breadth of model choice Amazon Bedrock offers.

 

 

# Calling the Claude model with the prompt template and outputting the results
return_text, boto3_bedrock = slm.query_invoke_model(model_name=model_name, prompt=st.session_state.prompt_session, max_token=None)

# Extracting only the SQL query from the output
model_query = slm.query_extract(return_text=return_text)
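The post does not reproduce `query_invoke_model` itself; the snippet below is a minimal sketch of what it might look like, reusing the Anthropic Messages API request format shown in `analysis_model` further below. The mapping from friendly model names to Bedrock model IDs is an assumption for illustration, and the abridged demo code calls this helper with slightly different signatures in different places, so the sketch follows the call site above.

import json
import boto3

# Illustrative mapping of friendly names to Bedrock model IDs (assumption, not from the demo code)
MODEL_IDS = {
    "haiku": "anthropic.claude-3-haiku-20240307-v1:0",
    "sonnet": "anthropic.claude-3-sonnet-20240229-v1:0",
}

# Minimal sketch of query_invoke_model: call Claude on Amazon Bedrock and return the generated text
def query_invoke_model(model_name=None, prompt=None, max_token=None):
    boto3_bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_token or 4096,
        "messages": [{"role": "user", "content": [{"type": "text", "text": prompt}]}],
    })
    response = boto3_bedrock.invoke_model(body=body, modelId=MODEL_IDS[model_name],
                                          accept="application/json", contentType="application/json")
    result = json.loads(response["body"].read())
    return result["content"][0]["text"], boto3_bedrock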

 

3. Output Query Validation

 

The generated query is executed on the actual database. If it runs successfully, the results are validated; if it fails, the query is regenerated. The `query_validation_with_record` function uses this execute-and-validate loop to improve query accuracy. While Amazon Bedrock provides highly capable LLMs, they may occasionally fail to produce a perfect answer for every requirement, so adding validation logic improves the completeness of the overall system.

 

 

#Query Validation
query_result, st.session_state.prompt_session, valid_log, query_id = slm.query_validation_with_record(
    valid_roop=valid_roop, model_query=model_query, retrieve_few_shots=globals()['retrieve_few_shots'], 
    table_info=table_info, query_k=query_k, session_log=st.session_state.prompt_session, model_name=model_name,
    user_input=user_input, table_name=table_name, boto3_bedrock=boto3_bedrock, max_tokens=max_tokens, query_mode=query_mode)

 

 

def query_validation_with_record(valid_roop=None, model_query=None, db_conn=None, model_name=None,
                                 table_info=None, sample_log=None, query_k=None, session_log=None,
                                 user_input=None, table_name=None, boto3_bedrock=None, max_tokens=None, query_mode=None,
                                 retrieve_few_shots=None):
    query_result = None
    valid_model_query = ''
    valid_log = []
    query_id = None  # Athena query execution ID (returned by the execution helper in the full demo; kept as None in this abridged snippet)

    # Retry loop: execute the generated query and, on failure, ask the model to fix it
    for roop in range(valid_roop):
        if query_result is None:
            try:
                query_result = query_execute_with_athena(query=model_query)
                if query_result.shape[0] < 1:
                    print('NO DATA')
                    raise Exception('No data')
                else:
                    print('EXECUTE SUCCESS')
                    break
            except Exception as e:
                query_result = None
                valid_model_query += f'\n\nAssistant : {model_query}'
                valid_model_query += '\n\nHuman: An error occurs in the query you created. Please re-create the query with appropriate syntax.'
                prompt = query_valid_prompt(table_info=table_info, sample_log=sample_log, query_k=query_k, user_input=user_input,
                                            valid_model_query=valid_model_query, table_name=table_name, query_mode=query_mode)
                return_text = query_invoke_model(prompt=prompt, boto3_bedrock=boto3_bedrock, max_tokens=max_tokens)
                model_query = query_extract(return_text=return_text)
                print(return_text)
    else:
        # All retries failed: no valid result
        query_result = None

    return query_result, session_log, valid_log, query_id
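`query_execute_with_athena` is likewise not reproduced in the post. A minimal sketch using the AWS SDK for pandas (awswrangler), with the Glue database name assumed from the demo environment, could look like this:

import awswrangler as wr

# Minimal sketch of query_execute_with_athena: run the generated SQL on Athena and
# return the result as a pandas DataFrame so its .shape can be checked above.
def query_execute_with_athena(query=None):
    return wr.athena.read_sql_query(
        sql=query,
        database="amazon_security_lake_glue_db_ap_northeast_2",  # assumed from the demo environment
        ctas_approach=False,
    )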

4. Output of Analysis Results

 

The analysis output is detailed enough to understand the rationale behind the query generated by the LLM. This was designed to improve user satisfaction, query comprehension, and the explainability of the overall system.

 

analysis_prompt = st.session_state.prompt_session
PaS_analysis = spt.result_analysis_template()
analysis_prompt += f'\n\nHuman: {PaS_analysis} \nAssistant: '
msg = analysis_model(prompt=analysis_prompt, boto3_bedrock=boto3_bedrock, max_tokens=4096)

 

def result_analysis_template(table_info=None, query=None, result=None):
    result_analysis_template = """
    Explain the last query you created and what logic you used to create the last query based on the results of the last query you created.
    Let's make a plan and solve the problem step by step.
    """
    return result_analysis_template

 

# Calling the analysis model (streams the explanation back from Bedrock)
import json

def analysis_model(prompt=None, boto3_bedrock=None, max_tokens=None, log_f=None):
    # Build an Anthropic Messages API request body for Claude 3
    prompt_config = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                ],
            }
        ],
    }

    body = json.dumps(prompt_config)
    modelId = "anthropic.claude-3-sonnet-20240229-v1:0"
    accept = "application/json"
    contentType = "application/json"

    # Invoke the model with a streaming response so the explanation can be displayed incrementally
    response = boto3_bedrock.invoke_model_with_response_stream(body=body, modelId=modelId, accept=accept, contentType=contentType)
    msg = response.get("body")

    return msg
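Because `invoke_model_with_response_stream` returns an event stream rather than plain text, the caller still needs to assemble the streamed chunks into the final explanation (e.g. message = stream_to_text(msg)). A minimal sketch of that step, assuming the Anthropic Claude 3 streaming chunk format, is shown below:

import json

# Sketch of consuming the Bedrock response stream returned by analysis_model.
# For Claude 3, the generated text arrives in "content_block_delta" chunks.
def stream_to_text(msg):
    text = ""
    for event in msg:
        chunk = json.loads(event["chunk"]["bytes"])
        if chunk.get("type") == "content_block_delta":
            text += chunk["delta"].get("text", "")
    return text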

 

5. Return of Results

 

If the query is validated and executed successfully, the query results are shown to the user.

 

Return of Result based on a user Question

 

After completing the Amazon SageMaker configuration, we conducted tests to evaluate accuracy with real questions and answers. During testing we posed typical questions such as “Tell me the source IP with the highest number of API calls in the past week” and “Provide logs for usage of the root account.” Additionally, to assess whether it could analyze real security threats, we generated logs in advance for scenarios such as port scans and credential theft and asked questions about detecting them. We queried various security logs and prioritized latency and query accuracy during testing.

Test Results

 

Initially, we configured Amazon SageMaker using a Zero-Shot Learning approach. We conducted queries for both general questions and threat detection related to port scans and credential theft scenarios. The following results were obtained: 

 

Return of Failed Result based on a user Question (VPC Flow Logs)

 

Return of Failed Result based on a user Question (CloudTrail Logs)

 

As shown in the screenshots above, with Zero-Shot Learning the LLM was unable to generate accurate Text2SQL queries. The dataset Amazon Security Lake uses, OCSF (Open Cybersecurity Schema Framework) logs containing nested JSON data, appears to have contributed to the complexity of query generation. This complexity, together with a lack of information about the OCSF schema, likely limited the model's ability to produce accurate queries.

 

 

First Enhancement – Few-Shot Learning

 

 

To address the inaccuracy, we separated the JSON portion from the logs containing JSON data, created a separate dataset, and tested it in a new environment. We then applied the Few-Shot Learning technique, which guides the model toward accurate responses using examples from internal data sources, and configured a Query Assistant that provides example SQL queries alongside the questions. We ran the tests again with this setup.

 

Return of Successful Result based on a user Question (VPC Flow Logs)

 

Return of Successful Result based on a user Question (CloudTrail Logs)

 

As shown in the screenshots, the improved configuration provided answers that matched the query intent. To obtain more reliable results, we then tested the foundation models (Haiku, Sonnet, Opus) against a list of diverse queries. Sonnet provided the most accurate answers among the three, achieving approximately 40% accuracy. This is a significant improvement over the initial configuration, but there were still areas to enhance.

 

 

Second Enhancement – RAG (Retrieval-Augmented Generation)

 

 

After the initial tests and the first enhancement, we moved on to a second enhancement. Its test results showed a significant improvement in accuracy, and we decided to showcase the demo at the AWS Summit Seoul 2024 MegazoneCloud booth.

 

The second enhancement consisted of two main components: adding Few-Shot Learning data, and setting up RAG (Retrieval-Augmented Generation) along with modifying some prompts. When we added the Few-Shot Learning data, the S3 bucket name from that data was sometimes referenced in the generated query, resulting in hallucinations. Hallucinations occur when the LLM produces responses that do not match the available information or are logically inconsistent, indicating lower model performance. To address this, we implemented RAG to generate more accurate and reliable responses. Let's take a closer look at the RAG implementation and the demo we conducted at AWS Summit Seoul 2024.

 

RAG is a technique that combines information retrieval and information generation in the field of AI. It allows language models to generate more accurate and useful responses by retrieving related information from external data sources and incorporating it into the text creation process. The workflow of RAG mainly consists of handling user questions → retrieving information → generating a response. More specifically, when a user inputs a question or request, the input is transformed into an embedding vector to search for similar documents. Then, the LLM generates the final response based on the contents of the searched document. In this step, the retrieved information is combined with the language model’s knowledge (training data) to generate more accurate responses.

 

The advantage of RAG is that it significantly enhances the accuracy of language model responses by utilizing external data. It allows real-time searching and reflection of the latest information, going beyond the model’s limitations and providing richer information.

 

RAG combines the strengths of information retrieval and information generation to produce more accurate and useful text. This approach significantly improves GenAI model performance and can be applied in various applications.

 

RAG is made up of two main components: indexing (embedding) and search and generation. Indexing involves collecting and indexing data from the source into a pipeline, while search and generation consists of retrieving relevant data from the index based on user queries and sending it to the model to generate the response.

 

Here’s a diagram of the RAG indexing process: 

 

 

RAG Indexing Process

 

 

After loading multiple data sources, the process of dividing large texts into smaller segments, known as Data Chunking, is carried out. Following this, the chunked data undergoes encoding (embedding) and is stored in a Vector Database to facilitate retrieval of the segmented data.
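The retrieval code shown further below loads pre-built pickle files for the chunked questions and their embeddings. The indexing step that produces those files is not shown in the post; a minimal sketch, assuming the same Korean embedding model and the `chunking_dataset` and `os_path` variables from the demo, might look like this:

import pickle
from sentence_transformers import SentenceTransformer

# Sketch of the indexing step: embed the prepared question set with the same
# Korean embedding model used at query time, then persist both artifacts.
embedder = SentenceTransformer("jhgan/ko-sroberta-multitask")
corpus_embeddings = embedder.encode(chunking_dataset, convert_to_tensor=True)

pickle.dump(chunking_dataset, open(os_path + '/pickle_dir/chunking_dataset.pkl', 'wb'))
pickle.dump(corpus_embeddings, open(os_path + '/pickle_dir/corpus_embeddings.pkl', 'wb'))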

 

The embedded question input by the user is searched against the Vector Database to find similar data and is provided to the LLM. This allows the LLM to deliver more efficient responses. The RAG diagram below offers a clearer understanding.

 

 

RAG Diagram

 

During the demo, we created a Vector Database by setting up frequently used questions and their corresponding answer queries from the logs stored in Amazon Security Lake. This setup allowed us to implement RAG, enabling the system to generate output by referring to the most semantically similar question queries when the user inputs a question.

# RAG implementation
import pickle
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

# Loading the pre-built question set, embeddings, and answer queries
chunking_dataset = pickle.load(open(os_path+'/pickle_dir/chunking_dataset.pkl', 'rb'))
corpus_embeddings = pickle.load(open(os_path+'/pickle_dir/corpus_embeddings.pkl', 'rb'))
query_list = pickle.load(open(os_path+'/pickle_dir/query_list.pkl', 'rb'))

# Loading the HuggingFace Korean embedding model
embeddings = SentenceTransformer("jhgan/ko-sroberta-multitask")
# Similarity search with the FAISS engine (IndexFlatIP: cosine similarity)
index = faiss.IndexIDMap(faiss.IndexFlatIP(768))
# Question embeddings and index connection
index.add_with_ids(corpus_embeddings, np.array(range(0, corpus_embeddings.shape[0])))

# Encode user_input with the Korean embedding model
query_vector = embeddings.encode([user_input])
# Output the indexes of the stored questions most similar to the user question
top_k = index.search(query_vector, few_shot_k)
docs = [chunking_dataset[_id] for _id in top_k[1].tolist()[0]]

# From the similarity search results, place the answer query for each similar question into the prompt template
temp_str = ""
for doc, idx in zip(docs, top_k[1][0].tolist()):
    temp_str += f"Human: {doc}\nAssistant: {query_list[idx]}\n\n"

 

 

print(chunking_dataset)  # Question set split into sentences (Korean questions, shown here in English)

['I would like to know the date when the CMK was rotated.',
'Can I track snapshot restore events for Amazon RDS instances?',
'Can I check for tag changes on AWS resources as well?',
'List the IPs that made the most API calls in the last week, and show the API calls made by the top 5 IPs',
'How many login attempts were there yesterday?',
'Are there logs for AWS Management Console authentication failures?',
'Are there logs for Security Group changes?',
'Was there any login to the management console with MFA disabled?',
'Are there logs of root account usage?',
'Can we check the start, stop, and delete actions for EC2 and RDS instances?',
'Can we check events where rules were added or removed from a specific security group?',
'Let me know the IPs that logged into the AWS console that are not 54216116106',
'List API call failures in the last 24 hours and show the errorCode and errorMessage fields',
'Provide the AWS KMS event records for the last 24 hours.',
'What was the most frequently occurring API Call in the past week?',
'What was the IP with the most API calls two days ago?',
'What was the API Call that encountered the most errors in the past month?',
'Did arn:aws:iam::551508107696/DemoUser use MFA when logging in?',
'What was the time period with the most API calls yesterday?',
'Was there anyone accessing during the early morning?',
'Who was the IAM User that logged in the most in the past week?',
'Which service/user had the most decryption requests with KMS?',
]

 

 

print(corpus_embeddings)  # Question embeddings

tensor([[-5.3411e-03, -7.4023e-02, 3.7926e-01, ..., -7.2996e-04,
-1.6781e-01, -1.9504e-01],
[ 3.5060e-02, -2.2566e-01, 4.5334e-01, ..., -2.9150e-01,
-1.2969e-01, 1.0309e-01],
[-4.2866e-01, -6.3441e-01, 5.5720e-01, ..., -9.3364e-02,
6.0054e-02, -6.7089e-01],
...,
[-9.8832e-01, 6.1827e-01, 3.7764e-02, ..., -2.4416e-01,
5.9593e-02, 1.0958e-01],
[-2.9902e-01, -1.5139e-01, 1.4176e-01, ..., 1.6358e-02,
8.6800e-01, -7.1446e-01],
[-2.9451e-01, -4.9310e-01, 9.2341e-01, ..., 6.7432e-02,
-3.1949e-01, -1.7224e-01]])

 

 

print(query_list)  # Answer SQL queries paired with each question (output truncated)

['SELECT * \nFROM \n "amazon_security_lake_glue_db_ap_northeast_2"."security_lake_genai"\nWHERE\n eventday >=
'SELECT\n *\nFROM \n "amazon_security_lake_glue_db_ap_northeast_2"."security_lake_genai"\nWHERE\n eventday
'SELECT COUNT(activity_name) activity_name_count\nFROM "amazon_security_lake_glue_db_ap_northeast_2"."security_lake_
'SELECT user_uuid, COUNT(actor_user_uuid) user_uuid_count, COUNT(activity_name) activity_name_count\nFROM "amazon_secu
'SELECT src_endpoint_ip, COUNT(src_endpoint_ip) src_endpoint_ip_count\nFROM "amazon_security_lake_glue_db_ap_northeast
'SELECT api_operation, api_service_name, eventday, COUNT(api_operation) as api_count\nFROM "amazon_security_lake_glue_
'SELECT api_operation, api_service_name, eventday, COUNT(api_operation) as api_count\nFROM "amazon_security_lake_glue_
'SELECT api_operation, COUNT(activity_name) activity_name_count\nFROM "amazon_security_lake_glue_db_ap_northeast_2"."
'SELECT src_endpoint_ip, COUNT(src_endpoint_ip) src_endpoint_ip_count\nFROM "amazon_security_lake_glue_db_ap_northeast
'SELECT \n CAST(CONCAT(CAST(DATE("time") AS VARCHAR), \' \', CAST(HOUR("time") AS VARCHAR), \':00\') AS TIMESTAM
'WITH \n role_tbl AS(\n -- assumerole_table\n SELECT actor_invoked_by as actor, COUNT(actor_invoked_by
'SELECT\n api_operation\n , count(api_operation) AS operation_count\nFROM \n "amazon_security_lake_glue_db_ap
'SELECT\n api_operation\n , count(api_operation) AS operation_count\nFROM \n "amazon_security_lake_glue_db_ap
'SELECT\n api_operation\n , count(api_operation) AS operation_count\nFROM \n "amazon_security_lake_glue_db_ap
'WITH ip_list_table AS (\n SELECT\n src_endpoint_ip\n , COUNT(src_endpoint_ip) AS ip_count\n FROM \
'SELECT\n *\nFROM \n "amazon_security_lake_glue_db_ap_northeast_2"."security_lake_genai"\nWHERE \n eventda
'SELECT\n *\nFROM \n "amazon_security_lake_glue_db_ap_northeast_2"."security_lake_genai"\nWHERE\n eventday
"SELECT * \nFROM security_lake_genai\nWHERE actor_user_uuid ='arn:aws:iam::551508107696:user/DemoUser' AND mfa= true\nO
'SELECT\n *\nFROM \n "amazon_security_lake_glue_db_ap_northeast_2"."security_lake_genai"\nWHERE \n eventda
'SELECT\n api_operation\n , api_response_error\n , api_response_message\n , api_service_name\nFROM \n "
'SELECT\n api_operation\n , eventday\n , resources_uid\nFROM \n "amazon_security_lake_glue_db_ap_northeast
]

 

In this way, we explored the entire process of analyzing log data and generating SQL queries by combining Amazon Security Lake with GenAI and RAG. In the end, we improved the stability of the LLM and increased accuracy from roughly 40% to 52%, meaning the AWS-native system as a whole produces more accurate responses.

Final Result

 

We configured the demo environment into a web application and demonstrated it at the AWS Summit Seoul 2024 MegazoneCloud booth. You can see the demonstration details below.

 

 

 

 

 

Additional Recommendations for Customer Environments and Requirements

 

 

We have reviewed how to make security logs collected in Amazon Security Lake easier to access and analyze using Amazon Bedrock and Amazon SageMaker. Through the actual demo setup and testing, we saw that analyzing security logs became significantly easier. However, this demo was conducted in a limited environment and there is room for improvement, so further enhancement work is essential. Below is a summary of the considerations that emerged from the demo testing.

 

Firstly, in this test we generated simple queries and aggregate queries against a single table in the database. In actual operational environments, however, it is often necessary to derive results from multiple tables. In particular, since Security Lake stores a variety of security logs, analyzing the relationships between them requires additional learning and tuning, including defining rule sets. This increases query complexity, and performance validation for multi-join queries becomes necessary. We therefore need to consider query configurations that use multiple tables and optimize performance accordingly.

 

Additionally, when generating queries from natural language, it is important to compose prompt templates with specific guidelines. For example, when a user asks about “traffic,” their intent may relate to the count of network logs, but the LLM might instead look for data related to log size. To prevent such mismatches, we should provide clear guidelines for synonyms, that is, terms with the same meaning but different expressions, to ensure consistent query generation.
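For example, one way to encode such guidance would be a short synonym section appended to the system prompt. The snippet below is a hypothetical addition to the `task_define_template` shown earlier, not part of the demo code:

# Hypothetical synonym guideline that could be appended to task_define_template
synonym_guideline = """
<synonyms>
- "traffic", "network traffic volume" -> the count of VPC Flow Log records, not the byte size of logs
- "login", "sign-in", "console access" -> ConsoleLogin events in CloudTrail
</synonyms>
When a question uses one of the terms above, follow the mapping defined in <synonyms>.
"""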

 

Finally, when generating output with the LLM, the format is not always consistent; explanations are sometimes included and sometimes omitted. To resolve this, we should run repeated tests to ensure consistent output and improve the reliability of the analysis results.

 

These considerations are essential elements for security and AI/ML experts when applying this demo to real operational environments. By carefully reviewing and incorporating these elements, we can establish a more stable and efficient environment for analyzing security data.

 

 

MegazoneCloud and Amazon Security Lake with GenAI

 

 

By leveraging Amazon Security Lake and GenAI to consolidate and store data in the Open Cybersecurity Schema Framework (OCSF) format, security log searches and correlation analysis of each log can be performed from a single prompt. This approach offers security professionals and data scientists an opportunity to save time and resources.

 

We hope this blog provides valuable insights to those interested in security data analysis, aiding them in extracting meaningful insights from complex security datasets. MegazoneCloud remains committed to ongoing research and innovation, ensuring that users can establish secure and efficient cloud environments.

 

MegazoneCloud plans to continue researching and developing this demo environment, enhancing it to detect and respond to security threats effectively. Additionally, we will strive to offer tailored, customized solutions according to specific customer environments and requirements. As an Amazon Security Lake Service Partner, we are dedicated to providing top-notch service and technical expertise, and we encourage those interested in Amazon Security Lake to reach out to us anytime.

 

 

📧Contact: megazonecloud-asl@mz.co.kr

 

 

 

[References]

AWS Security Blog – Generate AI-powered insights for Amazon Security Lake using Amazon SageMaker Studio and Amazon Bedrock: https://aws.amazon.com/ko/blogs/security/generate-ai-powered-insights-for-amazon-security-lake-using-amazon-sagemaker-studio-and-amazon-bedrock/
MegazoneCloud – Amazon Security Lake Service Partner: https://www.megazone.com/amazon_security_lake_231012/
MegazoneCloud Tech Blog – Introducing OCSF 1.1 and 1.2: https://www.megazone.com/techblog_awssecuritylake_ocsfv1-1_240514/
LangChain documentation: https://python.langchain.com/v0.2/docs/introduction/

Written by MegazoneCloud Cloud Technology Center