Leveraging GPT-4 to Streamline ESG Compliance Audits

by Dima Knivets on Tue, 24 Oct 2023

In today's rapidly evolving corporate landscape, Environmental, Social, and Governance (ESG) compliance is more crucial than ever. Auditors specializing in ESG standards are often burdened with the cumbersome task of manually sifting through mountains of company documents to extract relevant information, a task that creates a significant bottleneck in the auditing workflow.

Introduction

To enhance TSP's auditing capabilities, we developed a new AI-based feature, leveraging the GPT-4 API. This feature automates the extraction of relevant ESG compliance data from company documents, directly aiding ESG auditors in their tasks.

Streamlining Auditor Workflows

One of the key aspects of the new AI feature lies in its capability to facilitate the auditor's job. When the workflow is initiated for a specific company within the TSP system, the AI feature generates two key components in response to each question from the ESG questionnaire:

Answer
A direct response to the question based on the analysis of the company's documentation.
Text Excerpt
A snippet from the company document (along with the URL to the document) that the model used to formulate its answer.

These results create a foundation that ESG auditors can use for further verification, cutting down the time and effort required for manual document analysis. It's important to note, however, that GPT-4's output should not be taken at face value. The model can sometimes produce incorrect or "hallucinated" responses. Therefore, it remains the auditor's responsibility to validate the model's claims.

If the model is unable to find an answer, auditors would still need to look through the documents manually. However, when GPT-4 does provide an answer accompanied by a document excerpt, auditors can save time by:

  1. Reviewing the model's conclusion based on the provided excerpt.
  2. Performing a quick document search to validate the excerpt's authenticity.

By utilizing this feature, auditors can drastically reduce their workload, focusing on validation rather than exhaustive document analysis.

High-Level Workflow Overview

Data Collection
The company completes an ESG questionnaire on the TSP platform and uploads supporting documents. Our AI feature gathers these inputs for processing.
Data cleaning and AI analysis
Prompts are crafted, incorporating questions from the ESG questionnaire and company documents. We also address GPT-4 token constraints by partitioning extensive documents before prompting the model.
Parsing results
Responses from GPT-4 are mapped to corresponding questions and documents. Answers, supportive text, and document references are extracted.
Persisting responses
The parsed information is stored and then displayed in an organized manner on the auditor's dashboard within the TSP platform.
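The four stages above can be sketched in Python. Every name here is a hypothetical stand-in for the actual TSP implementation, and the word-based splitting is a simplification of the token-based partitioning described below:

```python
def split_document(text, max_words=1000):
    """Partition a long document into segments.

    A simplified stand-in for the token-based splitting we actually use;
    see the "Token Limitations" section below.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

def build_prompt(questions, segment):
    """Combine the ESG questionnaire with one document segment."""
    return ("Answer the following ESG questions:\n"
            + "\n".join(questions)
            + "\n\nDocument:\n" + segment)

def run_esg_analysis(questions, documents, ask_model):
    """Drive each document segment through the model and collect raw answers.

    `ask_model` abstracts the GPT-4 API call, so the pipeline can be
    exercised without network access.
    """
    results = []
    for doc in documents:
        for segment in split_document(doc):
            results.append(ask_model(build_prompt(questions, segment)))
    return results
```

The raw answers returned here would then go through the parsing and persistence stages before reaching the auditor's dashboard.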

Technical Considerations and Challenges

Designing the GPT-4 Prompt

We chose to approach the prompt design from the perspective of an ESG auditor tasked with auditing a company. Based on our research, adopting a user persona in the prompt seems to be a straightforward method for enhancing the model's performance.

It was essential to include all relevant questions and company documents in the prompt. This approach ensures that GPT-4 has access to the information it needs to generate accurate and comprehensive responses.

The final prompt is composed of three main components:

Task Description
A brief outline of the assignment, explaining what the ESG auditor is expected to accomplish.
List of Questions
These are the queries the auditor must answer, drawn from a comprehensive list of ESG metrics and considerations.
Document Contents
The actual textual content of the company documents, which serves as the basis for answering the auditor's questions.
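As a rough sketch, the three components might be concatenated like this. The wording below is illustrative, not our production prompt:

```python
def compose_prompt(questions, document_text):
    """Assemble the three prompt components: task description,
    question list, and document contents. The phrasing here is an
    illustrative example, not the production prompt."""
    task_description = (
        "You are an ESG auditor reviewing a company's documents. "
        "For each question below, provide an answer and quote the text "
        "excerpt from the document that supports it."
    )
    question_list = "\n".join(f"{i}. {q}" for i, q in enumerate(questions, 1))
    return (
        f"{task_description}\n\n"
        f"Questions:\n{question_list}\n\n"
        f"Document contents:\n{document_text}"
    )
```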

Token Limitations

Given the fixed token limit for each prompt, and considering that our prompts would be very long due to the inclusion of document content, we had to find a creative solution. We chose not to summarize the prompts, a frequently used workaround, as this could lead to the loss of crucial information.

Instead, we divided the documents into segments to perform multiple API calls, where each call would incorporate a segment of the document into the final prompt. To achieve this, we first calculated the token length of the prompt and subtracted it from the total token limit. This gave us the number of tokens available for each document segment.

For instance, let's say the base prompt, which includes the task description and a list of questions, amounts to 2,000 tokens. Given a token limit of 8,000 tokens, this would leave us with 6,000 tokens available for document content. If a document is 18,000 tokens long, it would have to be split into three segments of 6,000 tokens each. Consequently, we would need to make three separate API calls. Each call would use the same 2,000-token base prompt, along with a 6,000-token segment of the document.

Finally, note that a "token" in a GPT model does not correspond directly to a character or a word: a token can be as short as a single character or as long as a whole word. We used the Tiktoken package to calculate the token length of our prompts.
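The segment-size arithmetic from the worked example can be sketched as follows. In our pipeline the token counts come from Tiktoken; here they are taken as precomputed integers so the splitting logic stands on its own:

```python
import math

def plan_segments(base_prompt_tokens, document_tokens, token_limit):
    """Compute the per-call segment size and how many API calls a
    document needs.

    Token counts would come from a tokenizer such as Tiktoken, e.g.
    len(tiktoken.encoding_for_model("gpt-4").encode(text)); here we
    accept them as precomputed integers.
    """
    budget = token_limit - base_prompt_tokens  # tokens left for document content
    if budget <= 0:
        raise ValueError("Base prompt already exceeds the token limit")
    num_calls = math.ceil(document_tokens / budget)
    return num_calls, budget

# The example from the text: a 2,000-token base prompt, an 8,000-token
# limit, and an 18,000-token document yield three 6,000-token segments.
calls, segment_size = plan_segments(2_000, 18_000, 8_000)
```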

Output Truncation

Initially, we attempted to use prompts that consumed the entire token limit. However, we soon discovered that this approach led to truncated outputs from GPT-4, resulting in incomplete or missing information. As it turns out, the token limit is shared between the prompt and GPT-4's response, which means some tokens must be reserved for the response.

Given that the length of the model's output is unpredictable, we experimented with adjusting the maximum prompt length. We found that reserving approximately two thousand tokens for the output ensured complete responses in all subsequent API calls.
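A minimal sketch of this budgeting, assuming an 8,000-token context window and the roughly 2,000-token output reserve we settled on:

```python
CONTEXT_WINDOW = 8_000   # total tokens shared by prompt and response
OUTPUT_RESERVE = 2_000   # tokens kept free so the response is never cut off

def max_prompt_tokens(context_window=CONTEXT_WINDOW, reserve=OUTPUT_RESERVE):
    """Largest prompt we allow; the remainder is left for the model's output."""
    return context_window - reserve

# The reserve can also be passed as max_tokens in the API request so the
# response cannot run past it (parameter name per the OpenAI chat API).
request_params = {
    "model": "gpt-4",
    "max_tokens": OUTPUT_RESERVE,
}
```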

Non-Deterministic Behavior and Inconsistent Formatting

For our use case, we ideally need consistent responses across multiple API calls when given the same prompt, as this would simplify the parsing process. In our workflow, the generated output must first be parsed into a list of questions. The answers are then mapped to corresponding questions in our data model. In some cases, inconsistent outputs completely broke our parsing logic.

We initially thought tweaking the temperature parameter to zero might produce more consistent responses. However, despite this adjustment, we found that GPT-4 continued to generate variable outputs.

What proved most effective was including a highly detailed response structure as an example within the prompt itself. This approach produced more consistent responses, allowing us to parse the response more easily. We also applied some simple post-processing steps to clean and perform sanity checks on the outputs.
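For illustration, here is a response template of the kind one might embed in the prompt, together with a matching parser. The format shown is a hypothetical example, not our production schema:

```python
import re

# An example response structure to embed in the prompt; the exact
# format here is hypothetical, not our production schema.
RESPONSE_TEMPLATE = """\
Question 1:
Answer: <direct answer, or "Not found">
Excerpt: <verbatim quote from the document>
"""

ENTRY_RE = re.compile(
    r"Question (\d+):\s*Answer:\s*(.*?)\s*Excerpt:\s*(.*?)(?=Question \d+:|\Z)",
    re.DOTALL,
)

def parse_answers(raw):
    """Map each answered question number to its answer and supporting excerpt."""
    return {
        int(num): {"answer": ans.strip(), "excerpt": exc.strip()}
        for num, ans, exc in ENTRY_RE.findall(raw)
    }
```

Because the model is asked to reproduce this exact skeleton, a single regular expression can recover the question number, answer, and excerpt; the post-processing step then strips whitespace and sanity-checks each field.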

First Impressions and What's Next

We have conducted local tests to measure the model's ability to accurately extract and categorize ESG compliance data. The preliminary results are very promising. The GPT-4 model was able to accurately answer questions by providing relevant text excerpts or numerical data, even specifying the year when the data was recorded. We're in the process of demoing the tool to various companies to gather initial user feedback.

One immediate improvement on our roadmap concerns the tool's scalability and fault tolerance. Currently, the tool processes GPT-4 API calls synchronously. This approach can be time-consuming and prone to failure, especially for companies with a large volume of documents. To address this, we plan to implement a task queue using tools like Celery. This change will allow us to parallelize API calls and ensure that failed tasks are retried, thereby improving the tool's efficiency and reliability.
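The parallel-calls-with-retries idea can be sketched with the standard library alone; Celery would additionally give us persistent, distributed workers and built-in retry policies. The tasks passed in below are hypothetical stand-ins for individual GPT-4 requests:

```python
import concurrent.futures
import time

def call_with_retries(task, max_attempts=3, backoff_seconds=1.0):
    """Retry a flaky callable with exponential backoff, as a task queue would."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise
            time.sleep(backoff_seconds * 2 ** (attempt - 1))

def run_parallel(tasks, max_workers=4):
    """Run independent API-call tasks concurrently, retrying transient failures."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(call_with_retries, t) for t in tasks]
        return [f.result() for f in futures]
```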

Conclusion

In the ever-evolving landscape of corporate responsibility, the need for efficient and accurate ESG auditing is becoming increasingly critical. Our newly developed AI feature, powered by GPT-4, aims to revolutionize the auditing process by automating the extraction of important compliance data from a plethora of company documents.

Through high-level workflow processes and overcoming several technical challenges, we have created a solution that not only streamlines the ESG auditing workflow but also ensures high-quality, detailed analysis. While there's more work ahead, we believe this innovation is a significant step toward making ESG auditing more efficient and accurate.

More to read

Following DjaoDjin's Values, the source code underlying this blog post is available as Open Source.

If you are curious about building SaaS products, you might like How to build an Environmental, Social, and Governance (ESG) SaaS Product?

If you are looking for more posts about DjaoDjin's experiments with Artificial Intelligence (AI), you might enjoy Generating picture-in-picture of a speaker with Synthesia, or Producing Voice Over for Video Tutorials with Open Source.

More business lessons we learned running a SaaS application hosting platform are also available on the DjaoDjin blog. For our fellow engineers, there are in-depth technical posts available.


