SQL Agent LangChain: Query Database using Natural Language

Table of Contents Hide

Introduction
Recap of Agents and Chains
1. LangChain Agent Abstraction
SQL Agent
1. Why SQL Agent Important?
Practical Implementation of SQL Agents
1. Querying Database with Natural Language
Conclusion

Introduction

Are you tired of traditional, clunky database querying methods? Say goodbye to confusing syntax and hello to a whole new era of intuitive interaction with SQL Agent LangChain.

Brace yourself as we dive into the extraordinary world of LangChain and SQL Agents, powered by OpenAI’s LLMs, including the renowned ChatGPT. Prepare to witness the incredible potential of these technologies as we guide you through the implementation process of querying databases using the power of natural language.

Recap of Agents and Chains

Let’s take a quick trip down memory lane to recap the fascinating world of agents and chains in LangChain.

Agents in LangChain are entities that have access to a suite of tools and can decide which tools to call based on user input. There are two main types of agents in LangChain:

Action Agents: These agents decide on an action to take and execute that action one step at a time.
Plan-and-Execute Agents: These agents first decide on a plan of action to take and then execute those actions one at a time.

Action Agents are more conventional and suitable for small tasks, while Plan-and-Execute Agents are better for complex or long-running tasks that require planning and maintaining long-term objectives. It is often best to have an Action Agent in charge of the execution of the Plan-and-Execute Agent.

LangChain Agent Abstraction

To create an agent in LangChain, you need to define the following abstractions:

Agent: This is where the logic of the application lives. Agents expose an interface that takes in user input and a list of previous steps taken by the agent. It returns either an AgentAction or AgentFinish.
- AgentAction corresponds to the tool to use and the input to that tool.
- AgentFinish means the agent is done and has information on what to return to the user.
Tools: These are the actions an agent can take. The tools given to an agent depend on the desired functionality.
Toolkits: These are groups of tools designed for specific use cases. For example, an agent interacting with a SQL database may need tools to execute queries and inspect tables.
Agent Executor: This wraps an agent and a list of tools. It is responsible for running the agent iteratively until the stopping criteria is met.

The most important abstraction to understand is that of the agent. An agent can be constructed using a PromptTemplate, a Language Model, and an Output Parser. The PromptTemplate constructs a prompt to send to the language model based on user input and previous steps.

The Language Model generates output based on the prompt, and the Output Parser parses the output into an AgentAction or AgentFinish object.

SQL Agent

The SQL Database Agent is a component within LangChain that acts as a bridge between users and SQL databases. It enables users to ask questions in natural language, eliminating the need for writing complex SQL queries.

By translating these natural language queries into SQL queries and executing them against the database, the SQL Database Agent makes it easier for users to access and retrieve information from SQL databases.

Why SQL Agent Important?

The SQL Database Agent is essential because it enhances the accessibility and usability of SQL databases. Writing SQL queries requires knowledge of the database schema and query syntax, which can be challenging for users who are not familiar with SQL.

The SQL Database Agent eliminates this barrier by providing a user-friendly interface that allows users to interact with the database using natural language queries. This makes SQL databases more accessible to a wider range of users, enabling them to retrieve information without the need for extensive SQL knowledge.

Understanding the SQL Database Agent is crucial for developers and users who work with SQL databases. By knowing about the SQL Database Agent, developers can leverage its capabilities to create applications that provide a more user-friendly and intuitive experience for interacting with SQL databases.

On the other hand, users can benefit from knowing about the SQL Database Agent as it allows them to access and retrieve information from SQL databases without the need for extensive SQL knowledge.

Overall, knowledge of the SQL Database Agent empowers both developers and users to make the most out of SQL databases and streamline their data retrieval processes.

Practical Implementation of SQL Agents

Developers and users alike can now leverage the power of the SQL Database Agent to effortlessly retrieve information without the need for complex SQL syntax. This groundbreaking approach unlocks the true potential of databases and simplifies the data retrieval process.

Here we will learn how to implement SQL Agents. Let’s start by installing basic libraries:

pip install  langchain openai pymysql --upgrade -q

import basic libraries and set API key:

from langchain.agents import create_sql_agent
from langchain.agents.agent_toolkits import SQLDatabaseToolkit
from langchain.sql_database import SQLDatabase
from langchain.llms.openai import OpenAI
from langchain.agents import AgentExecutor
from langchain.agents.agent_types import AgentType
from langchain.chat_models import ChatOpenAI
import os

os.environ["OPENAI_API_KEY"] = "your open ai key"

now to use the SQL database we will need a SQL database for the sake of this tutorial we will be using an SQL database created using (https://www.w3resource.com/sql/sql-table.php)

now if you are using this same SQL database great! you can create it using sqlite3 on your local system and then save it. Let’s continue with the assumption you have created the database and it is located at path “/home/user/sample_data.db”

db = SQLDatabase.from_uri("sqlite:////home/user/sample_data.db")

Now let us set up our agents, and chains for this:

toolkit = SQLDatabaseToolkit(db=db, llm=OpenAI(temperature=0))
agent_executor = create_sql_agent(
    llm=ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613"),
    toolkit=toolkit,
    verbose=True
)

Querying Database with Natural Language

Now that we have our agent and everything up we can execute it to query our database:

agent_executor.run("what are columns in agents table?")

> Entering new  chain...
Action: sql_db_list_tables
Action Input: ""
Observation: AGENTS, CUSTOMER, ORDERS
Thought:The tables in the database are AGENTS, CUSTOMER, and ORDERS. I should query the schema of the AGENTS table to see its columns.
Action: sql_db_schema
Action Input: "AGENTS"
Observation: 
CREATE TABLE "AGENTS" (
	"AGENT_CODE" CHAR(6) NOT NULL, 
	"AGENT_NAME" CHAR(40), 
	"WORKING_AREA" CHAR(35), 
	"COMMISSION" NUMERIC(10, 2), 
	"PHONE_NO" CHAR(15), 
	"COUNTRY" TEXT(25), 
	PRIMARY KEY ("AGENT_CODE")
)

/*
3 rows from AGENTS table:
AGENT_CODE	AGENT_NAME	WORKING_AREA	COMMISSION	PHONE_NO	COUNTRY
A007	Ramasundar	Bangalore	0.15	077-25814763	
A003	Alex 	London	0.13	075-12458969	
A008	Alford	New York	0.12	044-25874365	
*/
Thought:The columns in the AGENTS table are AGENT_CODE, AGENT_NAME, WORKING_AREA, COMMISSION, PHONE_NO, and COUNTRY.
Final Answer: AGENT_CODE, AGENT_NAME, WORKING_AREA, COMMISSION, PHONE_NO, COUNTRY

> Finished chain.

Output:

'AGENT_CODE, AGENT_NAME, WORKING_AREA, COMMISSION, PHONE_NO, COUNTRY'

This works fine now let us try executing some more queries:

agent_executor.run("what is primary key in agents table?")

> Entering new  chain...
Action: sql_db_list_tables
Action Input: ""
Observation: AGENTS, CUSTOMER, ORDERS
Thought:I can query the schema of the AGENTS table to find the primary key.
Action: sql_db_schema
Action Input: "AGENTS"
Observation: 
CREATE TABLE "AGENTS" (
	"AGENT_CODE" CHAR(6) NOT NULL, 
	"AGENT_NAME" CHAR(40), 
	"WORKING_AREA" CHAR(35), 
	"COMMISSION" NUMERIC(10, 2), 
	"PHONE_NO" CHAR(15), 
	"COUNTRY" TEXT(25), 
	PRIMARY KEY ("AGENT_CODE")
)

/*
3 rows from AGENTS table:
AGENT_CODE	AGENT_NAME	WORKING_AREA	COMMISSION	PHONE_NO	COUNTRY
A007	Ramasundar	Bangalore	0.15	077-25814763	
A003	Alex 	London	0.13	075-12458969	
A008	Alford	New York	0.12	044-25874365	
*/
Thought:The primary key in the AGENTS table is "AGENT_CODE".
Final Answer: The primary key in the AGENTS table is "AGENT_CODE".

> Finished chain.

Output:

'The primary key in the AGENTS table is "AGENT_CODE".'

It works fine with simple queries now what if we ask it a bit complex query of how tables relate to each other.

agent_executor.run("how are agents and customer tables related?")


> Entering new  chain...
Action: sql_db_list_tables
Action Input: ""
Observation: AGENTS, CUSTOMER, ORDERS
Thought:There are three tables in the database: AGENTS, CUSTOMER, and ORDERS. I should query the schema of the AGENTS and CUSTOMER tables to see how they are related.
Action: sql_db_schema
Action Input: "AGENTS, CUSTOMER"
Observation: 
CREATE TABLE "AGENTS" (
	"AGENT_CODE" CHAR(6) NOT NULL, 
	"AGENT_NAME" CHAR(40), 
	"WORKING_AREA" CHAR(35), 
	"COMMISSION" NUMERIC(10, 2), 
	"PHONE_NO" CHAR(15), 
	"COUNTRY" TEXT(25), 
	PRIMARY KEY ("AGENT_CODE")
)

/*
3 rows from AGENTS table:
AGENT_CODE	AGENT_NAME	WORKING_AREA	COMMISSION	PHONE_NO	COUNTRY
A007	Ramasundar	Bangalore	0.15	077-25814763	
A003	Alex 	London	0.13	075-12458969	
A008	Alford	New York	0.12	044-25874365	
*/


CREATE TABLE "CUSTOMER" (
	"CUST_CODE" TEXT(6) NOT NULL, 
	"CUST_NAME" TEXT(40) NOT NULL, 
	"CUST_CITY" CHAR(35), 
	"WORKING_AREA" TEXT(35) NOT NULL, 
	"CUST_COUNTRY" TEXT(20) NOT NULL, 
	"GRADE" NUMERIC, 
	"OPENING_AMT" NUMERIC(12, 2) NOT NULL, 
	"RECEIVE_AMT" NUMERIC(12, 2) NOT NULL, 
	"PAYMENT_AMT" NUMERIC(12, 2) NOT NULL, 
	"OUTSTANDING_AMT" NUMERIC(12, 2) NOT NULL, 
	"PHONE_NO" TEXT(17) NOT NULL, 
	"AGENT_CODE" CHAR(6) NOT NULL, 
	PRIMARY KEY ("CUST_CODE"), 
	FOREIGN KEY("AGENT_CODE") REFERENCES "AGENTS" ("AGENT_CODE")
)

/*
3 rows from CUSTOMER table:
CUST_CODE	CUST_NAME	CUST_CITY	WORKING_AREA	CUST_COUNTRY	GRADE	OPENING_AMT	RECEIVE_AMT	PAYMENT_AMT	OUTSTANDING_AMT	PHONE_NO	AGENT_CODE
C00013	Holmes	London	London	UK	2.0000000000	6000.00	5000.00	7000.00	4000.00	BBBBBBB	A003
C00001	Micheal	New York	New York	USA	2.0000000000	3000.00	5000.00	2000.00	6000.00	CCCCCCC	A008
C00020	Albert	New York	New York	USA	3.0000000000	5000.00	7000.00	6000.00	6000.00	BBBBSBB	A008
*/
Thought:The AGENTS and CUSTOMER tables are related through the AGENT_CODE column in the CUSTOMER table, which is a foreign key referencing the AGENT_CODE column in the AGENTS table.
Final Answer: The AGENTS and CUSTOMER tables are related through the AGENT_CODE column in the CUSTOMER table.

> Finished chain.

Output:

'The AGENTS and CUSTOMER tables are related through the AGENT_CODE column in the CUSTOMER table.'

Conclusion

In conclusion, the combination of LangChain, SQL Agents, and OpenAI’s LLMs has transformed the way we query databases. By enabling natural language interaction, these technologies have made database querying more accessible, user-friendly, and efficient.

SQL Agent LangChain have truly revolutionized the way we interact with databases, making querying a seamless and intuitive experience.