Hello :) Today is Day 254!
A quick summary of today:
- learning about agents with LangGraph
- movie reviewer LLM evaluation metrics
Intro to LangGraph by LangChain
Setup
- Cloning the academy repo
- Running Jupyter
- Setting up LangSmith/OpenAI env variables
- Setting up the Tavily API (a search engine optimized for LLMs and RAG, aimed at efficient, quick, and persistent search results)
- Downloading LangGraph Studio for Mac
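For the env variables, the notebooks use a small helper along these lines (a sketch; which keys you need depends on your setup):

import os, getpass

def _set_env(var: str):
    # Prompt for a key only if it isn't already set in the environment
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")

_set_env("OPENAI_API_KEY")
_set_env("LANGCHAIN_API_KEY")
_set_env("TAVILY_API_KEY")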
LangGraph is a framework developed by LangChain for building agent and multi-agent applications. Unlike LangChain’s main package, LangGraph is specifically designed to give developers more control and precision in agent workflows, especially for complex real-world systems. Its goal is to help ensure that agents execute tasks more reliably by allowing developers to dictate the order in which tools are used or modify prompts based on the agent’s current state. LangGraph is part of an effort to overcome the challenges of building agents that can perform tasks autonomously with high reliability.
Module 0: Using LangChain’s chat models
from langchain_openai import ChatOpenAI
gpt4o_chat = ChatOpenAI(model="gpt-4o", temperature=0)
gpt35_chat = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)
Here is an example with OpenAI models; we specify the model name and the temperature.
Chat models in LangChain have a number of default methods. For the most part, we’ll be using:
- stream: stream back chunks of the response
- invoke: call the chain on an input
Chat models take messages as input. Messages have a role (that describes who is saying the message) and a content property. We’ll be talking a lot more about this later, but here let’s just show the basics.
from langchain_core.messages import HumanMessage
# Create a message
msg = HumanMessage(content="Hello world", name="Lance")
# Message list
messages = [msg]
# Invoke the model with a list of messages
gpt4o_chat.invoke(messages)
Output:
AIMessage(content='Hello! How can I assist you today?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 9, 'prompt_tokens': 11, 'total_tokens': 20}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_25624ae3a5', 'finish_reason': 'stop', 'logprobs': None}, id='run-9abd403f-f4de-4581-ba13-1c834a36c63d-0', usage_metadata={'input_tokens': 11, 'output_tokens': 9, 'total_tokens': 20})
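Of the two methods, invoke is shown above; here is a minimal sketch of stream (reusing the messages list from the snippet above):

# Stream the response in chunks instead of waiting for the full message
for chunk in gpt4o_chat.stream(messages):
    print(chunk.content, end="", flush=True)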
Module 1: Introduction
Here is a FAQ that was provided; it gave me a better idea of what LangGraph is.
LangGraph FAQ Summary
- Do I need to use LangChain to use LangGraph? No. LangGraph is a more low-level and controllable orchestration framework for complex agentic systems. LangChain, by contrast, provides a standard interface for interacting with models and components, suited for simpler tasks.
- How is LangGraph different from other agent frameworks? Unlike other frameworks for simple tasks, LangGraph is designed to handle bespoke, complex tasks, offering more control without restricting users to a specific cognitive architecture.
- Does LangGraph impact the performance of my app? No. LangGraph is designed for streaming workflows and will not add overhead to your code.
- Is LangGraph open source? Is it free? Yes, LangGraph is an open-source library under the MIT license, and it's free to use.
- Is LangGraph Cloud open source? No. LangGraph Cloud is proprietary software that will eventually become a paid service with certain tiers.
- How do I enable LangGraph Cloud? LangGraph Cloud is currently in beta and accessible to LangSmith Plus and Enterprise plan users.
- How are LangGraph and LangGraph Cloud different? LangGraph is a framework for building agent workflows, while LangGraph Cloud is a service for deploying, scaling, and debugging these applications. It also includes a Studio for prototyping.
- How does LangGraph fit into the LangChain ecosystem? LangGraph is used to build stateful agents with streaming and human-in-the-loop support, while LangChain's components aid development, and LangSmith assists in monitoring and optimizing production deployments. LangGraph Cloud helps turn LangGraph applications into production-ready systems.
A solitary LLM is fairly limited, as it does not have access to tools, external context, or multi-step workflows. There are apps that use chains (a series of steps, like retrieval, augmentation, generation, etc.). Chains are nice because they are deterministic: they will always follow the steps that we outline for them. What we want to do next is give multiple chains to an LLM system so that, depending on the need, the system can decide on its own which chain to invoke.
Module 1 structure:
Simple graph
3 nodes and 1 conditional edge
State
First we need to define the State of the graph. The State schema serves as the input schema for all Nodes and Edges in the graph.
We can use TypedDict, and we give it 1 key: graph_state
from typing_extensions import TypedDict
class State(TypedDict):
    graph_state: str
Then, we define Nodes
Nodes
Nodes are just Python functions. The first positional argument is the state, as defined above. Because the state is a TypedDict with the schema defined above, each node can access the key graph_state with state['graph_state']. Each node returns a new value of the state key graph_state. By default, the new value returned by each node will override the prior state value.
def node_1(state):
    print("---Node 1---")
    return {"graph_state": state['graph_state'] + " I am"}

def node_2(state):
    print("---Node 2---")
    return {"graph_state": state['graph_state'] + " happy!"}

def node_3(state):
    print("---Node 3---")
    return {"graph_state": state['graph_state'] + " sad!"}
Edges
Edges connect the nodes. Normal Edges are used if you want to always go from, for example, node_1 to node_2. Conditional Edges are used if you want to optionally route between nodes. Conditional edges are implemented as functions that return the next node to visit based upon some logic.
import random
from typing import Literal
def decide_mood(state) -> Literal["node_2", "node_3"]:
    # Often, we will use state to decide on the next node to visit
    user_input = state['graph_state']

    # Here, let's just do a 50 / 50 split between nodes 2, 3
    if random.random() < 0.5:
        # 50% of the time, we return Node 2
        return "node_2"

    # 50% of the time, we return Node 3
    return "node_3"
Graph construction
Now, we build the graph from our components defined above.
The StateGraph class is the graph class that we can use.
First, we initialize a StateGraph with the State class we defined above.
Then, we add our nodes and edges.
We use the START Node, a special node that sends user input to the graph, to indicate where to start our graph.
The END Node is a special node that represents a terminal node.
Finally, we compile our graph to perform a few basic checks on the graph structure.
We can visualize the graph as a Mermaid diagram.
from IPython.display import Image, display
from langgraph.graph import StateGraph, START, END
# Build graph
builder = StateGraph(State)
builder.add_node("node_1", node_1)
builder.add_node("node_2", node_2)
builder.add_node("node_3", node_3)
# Logic
builder.add_edge(START, "node_1")
builder.add_conditional_edges("node_1", decide_mood)
builder.add_edge("node_2", END)
builder.add_edge("node_3", END)
# Add
graph = builder.compile()
# View
display(Image(graph.get_graph().draw_mermaid_png()))
Graph Invocation
The compiled graph implements the runnable protocol.
This provides a standard way to execute LangChain components.
invoke is one of the standard methods in this interface.
The input is a dictionary {"graph_state": "Hi, this is Lance."}, which sets the initial value for our graph state dict.
When invoke is called, the graph starts execution from the START node.
It progresses through the defined nodes (node_1, node_2, node_3) in order.
The conditional edge will traverse from node_1 to node_2 or node_3 using a 50/50 decision rule.
Each node function receives the current state and returns a new value, which overrides the graph state.
The execution continues until it reaches the END node.
graph.invoke({"graph_state" : "Hi, this is Lance."})
Output: {'graph_state': 'Hi, this is Lance. I am sad!'}
invoke runs the entire graph synchronously.
This waits for each step to complete before moving to the next.
It returns the final state of the graph after all nodes have executed.
In this case, it returns the state after node_3 has completed:
{'graph_state': 'Hi, this is Lance. I am sad!'}
Node 1 appended " I am", and then with a 50% chance we go to node 2 or node 3 and append " happy!" or " sad!".
I can open this graph in LangGraph Studio as well.
In the provided academy repo, there is a studio folder with files that define the graph's logic, and all I have to do is drag the folder into LangGraph Studio.
I can interact with the graph, like in the notebook, and also see what each node did.
Chain
Next, let’s build up to a simple chain that combines 4 concepts:
- Using chat messages as our graph state
- Using chat models in graph nodes
- Binding tools to our chat model
- Executing tool calls in graph nodes
Messages
Chat models can use messages, which capture different roles within a conversation.
LangChain supports various message types, including HumanMessage, AIMessage, SystemMessage, and ToolMessage.
These represent a message from the user, from the chat model, for the chat model to instruct behavior, and from a tool call, respectively.
Let’s create a list of messages.
Each message can be supplied with a few things:
- content - content of the message
- name - optionally, a message author
- response_metadata - optionally, a dict of metadata (e.g., often populated by the model provider for AIMessages)
from pprint import pprint
from langchain_core.messages import AIMessage, HumanMessage
messages = [AIMessage(content=f"So you said you were researching ocean mammals?", name="Model")]
messages.append(HumanMessage(content=f"Yes, that's right.",name="Lance"))
messages.append(AIMessage(content=f"Great, what would you like to learn about.", name="Model"))
messages.append(HumanMessage(content=f"I want to learn about the best place to see Orcas in the US.", name="Lance"))
for m in messages:
    m.pretty_print()
Output:
================================== Ai Message ==================================
Name: Model
So you said you were researching ocean mammals?
================================ Human Message =================================
Name: Lance
Yes, that's right.
================================== Ai Message ==================================
Name: Model
Great, what would you like to learn about.
================================ Human Message =================================
Name: Lance
I want to learn about the best place to see Orcas in the US.
Chat models
Chat models can use a sequence of messages as input and support the message types discussed above. Below, OpenAI's chat model is used.
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o")
result = llm.invoke(messages)
result
Out:
AIMessage(content="Orcas, also known as killer whales, can be observed in several locations in the United States, but one of the best places to see them is the Pacific Northwest, particularly around the San Juan Islands in Washington State. \n\n### Key Locations:\n\n1. **San Juan Islands, Washington:**\n - **Best Time:** The peak season is from May to September, but they can be seen year-round.\n - **Why It's Great:** The nutrient-rich waters attract a variety of marine life, making it an ideal habitat for orcas. There are both resident and transient orca pods in the area.\n\n2. **Puget Sound, Washington:**\n - **Best Time:** Summer months, primarily from May to September.\n - **Why It's Great:** Puget Sound is home to the Southern Resident orcas, which are a unique and endangered population.\n\n3. **Monterey Bay, California:**\n - **Best Time:** Spring and fall are the best times to see transient orcas.\n - **Why It's Great:** The nutrient-rich waters of Monterey Bay attract a variety of marine life, including orcas, which come to prey on seals and sea lions.\n\n4. **Southeast Alaska:**\n - **Best Time:** Summer months, from May to September.\n - **Why It's Great:** The Inside Passage is a prime location for spotting orcas, along with humpback whales and other marine animals.\n\n### Tips for Whale Watching:\n- **Tours:** Consider booking a whale-watching tour with a reputable company that follows responsible wildlife viewing guidelines.\n- **Binoculars and Camera:** Bring binoculars for a closer look and a camera with a good zoom lens to capture the experience.\n- **Weather:** Dress in layers and prepare for variable weather conditions, especially if you're going out on a boat.\n\n### Conservation Note:\nOrcas face various threats, including habitat destruction, pollution, and reduced prey availability. Supporting conservation efforts and choosing eco-friendly tour operators can help protect these magnificent creatures for future generations.\n\nIs there anything more specific you’d like to know about orcas or whale watching?", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 427, 'prompt_tokens': 67, 'total_tokens': 494}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_157b3831f5', 'finish_reason': 'stop', 'logprobs': None}, id='run-98914d63-e629-401b-a74e-dea0e68eff03-0', usage_metadata={'input_tokens': 67, 'output_tokens': 427, 'total_tokens': 494})
Metadata from result.response_metadata:
{'token_usage': {'completion_tokens': 427,
'prompt_tokens': 67,
'total_tokens': 494},
'model_name': 'gpt-4o-2024-05-13',
'system_fingerprint': 'fp_157b3831f5',
'finish_reason': 'stop',
'logprobs': None}
Tools
Tools are useful whenever you want a model to interact with external systems.
External systems (e.g., APIs) often require a particular input schema or payload, rather than natural language.
When we bind an API, for example, as a tool, we give the model awareness of the required input schema.
The model will choose to call a tool based upon the natural language input from the user.
And, it will return an output that adheres to the tool's schema.
Many LLM providers support tool calling, and the tool calling interface in LangChain is simple.
You can simply pass any Python function into ChatModel.bind_tools(function).
Here is an example function as a tool:
def multiply(a: int, b: int) -> int:
    """Multiply a and b.

    Args:
        a: first int
        b: second int
    """
    return a * b

llm_with_tools = llm.bind_tools([multiply])
If we pass an input - e.g., "What is 2 multiplied by 3" - we see a tool call returned.
The tool call has specific arguments that match the input schema of our function along with the name of the function to call.
{'arguments': '{"a":2,"b":3}', 'name': 'multiply'}
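For reference, a small sketch of producing that tool call (assuming the llm_with_tools object bound above; the returned AIMessage exposes the parsed calls on its tool_calls attribute):

from langchain_core.messages import HumanMessage

# The model decides to call multiply instead of answering in plain text
tool_call = llm_with_tools.invoke([HumanMessage(content="What is 2 multiplied by 3", name="Lance")])
print(tool_call.tool_calls)
# e.g., [{'name': 'multiply', 'args': {'a': 2, 'b': 3}, 'id': '...', 'type': 'tool_call'}]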
Using messages as state
With these foundations in place, we can now use messages in our graph state.
We can define our state MessagesState as a TypedDict with a single key: messages.
messages is simply a list of messages, as we defined above (e.g., HumanMessage, etc).
from typing_extensions import TypedDict
from langchain_core.messages import AnyMessage
class MessagesState(TypedDict):
    messages: list[AnyMessage]
Reducers
Now, we have a minor problem!
As discussed, each node will return a new value for our state key messages.
But, this new value will override the prior messages value.
As our graph runs, we want to append messages to our messages state key.
We can use reducer functions to address this.
Reducers allow us to specify how state updates are performed.
If no reducer function is specified, then it is assumed that updates to the key should override it, as we saw before.
But, to append messages, we can use the pre-built add_messages reducer.
This ensures that any messages are appended to the existing list of messages.
We simply need to annotate our messages key with the add_messages reducer function as metadata.
from typing import Annotated
from langgraph.graph.message import add_messages
class MessagesState(TypedDict):
    messages: Annotated[list[AnyMessage], add_messages]
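The reducer can also be called directly to see the append behavior in isolation (a quick sketch with made-up messages):

from langchain_core.messages import AIMessage, HumanMessage
from langgraph.graph.message import add_messages

initial_messages = [AIMessage(content="Hello! How can I assist you?", name="Model"),
                    HumanMessage(content="I'm looking for information on whales.", name="Lance")]
new_message = AIMessage(content="Sure, what would you like to know?", name="Model")

# Returns a new 3-message list: new_message is appended, not overwritten
add_messages(initial_messages, new_message)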
Since having a list of messages in graph state is so common, LangGraph has a pre-built MessagesState!
MessagesState is defined:
- With a pre-built single messages key
- This is a list of AnyMessage objects
- It uses the add_messages reducer
We'll usually use MessagesState because it is less verbose than defining a custom TypedDict, as shown above.
from langgraph.graph import MessagesState
class MessagesState(MessagesState):
    # Add any keys needed beyond messages, which is pre-built
    pass
The graph
from IPython.display import Image, display
from langgraph.graph import StateGraph, START, END
# Node
def tool_calling_llm(state: MessagesState):
    return {"messages": [llm_with_tools.invoke(state["messages"])]}
# Build graph
builder = StateGraph(MessagesState)
builder.add_node("tool_calling_llm", tool_calling_llm)
builder.add_edge(START, "tool_calling_llm")
builder.add_edge("tool_calling_llm", END)
graph = builder.compile()
# View
display(Image(graph.get_graph().draw_mermaid_png()))
If we pass in Hello!, the LLM responds without any tool calls.
messages = graph.invoke({"messages": HumanMessage(content="Hello!")})
for m in messages['messages']:
    m.pretty_print()
================================ Human Message =================================
Hello!
================================== Ai Message ==================================
Hi there! How can I assist you today?
The LLM chooses to use a tool when it determines that the input or task requires the functionality provided by that tool.
messages = graph.invoke({"messages": HumanMessage(content="Multiply 2 and 3!")})
for m in messages['messages']:
    m.pretty_print()
================================ Human Message =================================
Multiply 2 and 3!
================================== Ai Message ==================================
Tool Calls:
multiply (call_Er4gChFoSGzU7lsuaGzfSGTQ)
Call ID: call_Er4gChFoSGzU7lsuaGzfSGTQ
Args:
a: 2
b: 3
Router
Above, we saw that the graph can:
- Return a tool call
- Return a natural language response
We can think of this as a router, where the chat model routes between a direct response or a tool call based upon the user input.
This is a simple example of an agent, where the LLM is directing the control flow either by calling a tool or just responding directly.
Below, we will extend the graph to work with either output.
First, we create the tool again:
from langchain_openai import ChatOpenAI
def multiply(a: int, b: int) -> int:
    """Multiply a and b.

    Args:
        a: first int
        b: second int
    """
    return a * b
llm = ChatOpenAI(model="gpt-4o")
llm_with_tools = llm.bind_tools([multiply])
We use the built-in ToolNode and simply pass a list of our tools to initialize it.
We use the built-in tools_condition as our conditional edge.
from IPython.display import Image, display
from langgraph.graph import StateGraph, START, END
from langgraph.graph import MessagesState
from langgraph.prebuilt import ToolNode
from langgraph.prebuilt import tools_condition
# Node
def tool_calling_llm(state: MessagesState):
    return {"messages": [llm_with_tools.invoke(state["messages"])]}
# Build graph
builder = StateGraph(MessagesState)
builder.add_node("tool_calling_llm", tool_calling_llm)
builder.add_node("tools", ToolNode([multiply]))
builder.add_edge(START, "tool_calling_llm")
builder.add_conditional_edges(
    "tool_calling_llm",
    # If the latest message (result) from assistant is a tool call -> tools_condition routes to tools
    # If the latest message (result) from assistant is not a tool call -> tools_condition routes to END
    tools_condition,
)
builder.add_edge("tools", END)
graph = builder.compile()
# View
display(Image(graph.get_graph().draw_mermaid_png()))
messages = graph.invoke({"messages": ("user", "Multiply 3 and 2")})
for m in messages['messages']:
    m.pretty_print()
================================ Human Message =================================
Multiply 3 and 2
================================== Ai Message ==================================
Tool Calls:
multiply (call_om27lpEIRsGdjQ9K78JUZtps)
Call ID: call_om27lpEIRsGdjQ9K78JUZtps
Args:
a: 3
b: 2
================================= Tool Message =================================
Name: multiply
6
Here is the same thing, but using LangGraph Studio:
Agent
We can extend the above into a generic agent architecture.
In the above router, we invoked the model and, if it chose to call a tool, we returned a ToolMessage to the user.
But, what if we simply pass that ToolMessage back to the model?
We can let it either (1) call another tool or (2) respond directly.
This is the intuition behind ReAct, a general agent architecture:
- act - let the model call specific tools
- observe - pass the tool output back to the model
- reason - let the model reason about the tool output to decide what to do next (e.g., call another tool or just respond directly)
This general purpose architecture can be applied to many types of tools.
Let's add some tools:
from langchain_openai import ChatOpenAI
def multiply(a: int, b: int) -> int:
    """Multiply a and b.

    Args:
        a: first int
        b: second int
    """
    return a * b

# This will be a tool
def add(a: int, b: int) -> int:
    """Adds a and b.

    Args:
        a: first int
        b: second int
    """
    return a + b

def divide(a: int, b: int) -> float:
    """Divide a by b.

    Args:
        a: first int
        b: second int
    """
    return a / b
tools = [add, multiply, divide]
llm = ChatOpenAI(model="gpt-4o")
llm_with_tools = llm.bind_tools(tools)
Create the assistant node
from langgraph.graph import MessagesState
from langchain_core.messages import HumanMessage, SystemMessage
# System message
sys_msg = SystemMessage(content="You are a helpful assistant tasked with performing arithmetic on a set of inputs.")
# Node
def assistant(state: MessagesState):
    return {"messages": [llm_with_tools.invoke([sys_msg] + state["messages"])]}
As before, we use MessagesState and define a Tools node with our list of tools.
The Assistant node is just our model with bound tools.
We create a graph with Assistant and Tools nodes.
We add the tools_condition edge, which routes to END or to Tools based on whether the Assistant calls a tool.
Now, we add one new step: we connect the Tools node back to the Assistant, forming a loop.
- After the assistant node executes, tools_condition checks if the model's output is a tool call.
- If it is a tool call, the flow is directed to the tools node.
- The tools node connects back to assistant.
- This loop continues as long as the model decides to call tools.
- If the model response is not a tool call, the flow is directed to END, terminating the process.
from langgraph.graph import START, StateGraph
from langgraph.prebuilt import tools_condition
from langgraph.prebuilt import ToolNode
from IPython.display import Image, display
# Graph
builder = StateGraph(MessagesState)
# Define nodes: these do the work
builder.add_node("assistant", assistant)
builder.add_node("tools", ToolNode(tools))
# Define edges: these determine how the control flow moves
builder.add_edge(START, "assistant")
builder.add_conditional_edges(
    "assistant",
    # If the latest message (result) from assistant is a tool call -> tools_condition routes to tools
    # If the latest message (result) from assistant is not a tool call -> tools_condition routes to END
    tools_condition,
)
builder.add_edge("tools", "assistant")
react_graph = builder.compile()
# Show
display(Image(react_graph.get_graph(xray=True).draw_mermaid_png()))
messages = [HumanMessage(content="Add 3 and 4, then multiply by 2, and finally divide by 5")]
messages = react_graph.invoke({"messages": messages})
for m in messages['messages']:
    m.pretty_print()
================================ Human Message =================================
Add 3 and 4, then multiply by 2, and finally divide by 5
================================== Ai Message ==================================
Tool Calls:
add (call_s056B7Zg4pWzrBeb5t0ZqTM7)
Call ID: call_s056B7Zg4pWzrBeb5t0ZqTM7
Args:
a: 3
b: 4
================================= Tool Message =================================
Name: add
7
================================== Ai Message ==================================
Tool Calls:
multiply (call_Wh2R0c53q9Kin7AeCtFcD5ng)
Call ID: call_Wh2R0c53q9Kin7AeCtFcD5ng
Args:
a: 7
b: 2
================================= Tool Message =================================
Name: multiply
14
================================== Ai Message ==================================
Tool Calls:
divide (call_U9eDdbIfW9N2DbCPCKi0i6TA)
Call ID: call_U9eDdbIfW9N2DbCPCKi0i6TA
Args:
a: 14
b: 5
================================= Tool Message =================================
Name: divide
2.8
================================== Ai Message ==================================
The result of the calculation is \(2.8\).
We can check it out in LangGraph Studio as well:
Now, the system responds with natural language, rather than a tool_call.
In LangSmith, we can even see extra tracing:
Agent with memory
Let's introduce memory to our agent.
Setup for the agent is like the above section, but for memory, here is how to set it up:
from langgraph.checkpoint.memory import MemorySaver
memory = MemorySaver()
react_graph_memory = builder.compile(checkpointer=memory)
When we use memory, we need to specify a thread_id.
This thread_id will store our collection of graph states.
Here is a cartoon:
- The checkpointer writes the state at every step of the graph
- These checkpoints are saved in a thread
- We can access that thread in the future using the thread_id
# Specify a thread
config = {"configurable": {"thread_id": "1"}}
# Specify an input
messages = [HumanMessage(content="Add 3 and 4.")]
# Run
messages = react_graph_memory.invoke({"messages": messages},config)
for m in messages['messages']:
    m.pretty_print()
================================ Human Message =================================
Add 3 and 4.
================================== Ai Message ==================================
Tool Calls:
add (call_23tmdvhv1VAfMxIZnFBFHJD7)
Call ID: call_23tmdvhv1VAfMxIZnFBFHJD7
Args:
a: 3
b: 4
================================= Tool Message =================================
Name: add
7
================================== Ai Message ==================================
The sum of 3 and 4 is 7.
If we pass the same thread_id, then we can proceed from the previously logged state checkpoint!
In this case, the above conversation is captured in the thread.
The HumanMessage we pass ("Multiply that by 2.") is appended to the above conversation.
So, the model now knows that that refers to "The sum of 3 and 4 is 7.".
messages = [HumanMessage(content="Multiply that by 2.")]
messages = react_graph_memory.invoke({"messages": messages}, config)
for m in messages['messages']:
    m.pretty_print()
================================ Human Message =================================
Add 3 and 4.
================================== Ai Message ==================================
Tool Calls:
add (call_23tmdvhv1VAfMxIZnFBFHJD7)
Call ID: call_23tmdvhv1VAfMxIZnFBFHJD7
Args:
a: 3
b: 4
================================= Tool Message =================================
Name: add
7
================================== Ai Message ==================================
The sum of 3 and 4 is 7.
================================ Human Message =================================
Multiply that by 2.
================================== Ai Message ==================================
Tool Calls:
multiply (call_aFKM1qZxhnsW9YICm8xSeLpG)
Call ID: call_aFKM1qZxhnsW9YICm8xSeLpG
Args:
a: 7
b: 2
================================= Tool Message =================================
Name: multiply
14
================================== Ai Message ==================================
The result of multiplying 7 by 2 is 14.
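Since the checkpointer saves state at every step, we can also peek at the latest checkpoint for the thread (a sketch using the react_graph_memory and config objects from above; get_state returns a snapshot whose values field is the state dict):

# Inspect the most recent checkpoint for thread "1"
snapshot = react_graph_memory.get_state(config)
for m in snapshot.values["messages"]:
    m.pretty_print()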
Deployment
There are a few central concepts to understand -
LangGraph
- Python and JavaScript library
- Allows creation of agent workflows

LangGraph API
- Bundles the graph code
- Provides a task queue for managing asynchronous operations
- Offers persistence for maintaining state across interactions

LangGraph Cloud
- Hosted service for the LangGraph API
- Allows deployment of graphs from GitHub repositories
- Also provides monitoring and tracing for deployed graphs
- Accessible via a unique URL for each deployment

LangGraph Studio
- Integrated Development Environment (IDE) for LangGraph applications
- Uses the API as its back-end, allowing real-time testing and exploration of graphs
- Can be run locally or with cloud-deployment

LangGraph SDK
- Python library for programmatically interacting with LangGraph graphs
- Provides a consistent interface for working with graphs, whether served locally or in the cloud
- Allows creation of clients, access to assistants, thread management, and execution of runs
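To make the SDK bullet concrete, here is a minimal sketch (hedged: the URL is whatever your local LangGraph API server prints on startup, and "agent" is a placeholder for the graph name registered in langgraph.json):

from langgraph_sdk import get_client

# Point the client at a locally served LangGraph API (URL is an assumption)
client = get_client(url="http://localhost:8123")

async def run_once():
    # Threads persist graph state between runs
    thread = await client.threads.create()
    async for chunk in client.runs.stream(
        thread["thread_id"],
        "agent",  # assumed graph name from langgraph.json
        input={"messages": [{"role": "user", "content": "Multiply 3 and 2"}]},
        stream_mode="values",
    ):
        print(chunk.data)

# e.g., run with: asyncio.run(run_once())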
The tutorial showed how to deploy the above agent with tools using GitHub and LangSmith.
But I cannot actually deploy it at the moment. Nevertheless, local is fine for now.
That is all for module 1. Tbh, extremely informative, and I definitely learned a lot, especially about agents and how those functions (like the multiply one) get registered with the system so the LLM can use them.
Movie reviewer LLM
Last week, I mentioned a project with one of my professors. I still cannot share anything related to it, but today I looked into evaluation metrics like ROUGE and BLEU. BLEU seems to be geared more toward machine translation than our case, and ROUGE seems to fit us better as it bundles multiple metrics.
The evaluate module on Hugging Face provides multiple ROUGE (Recall-Oriented Understudy for Gisting Evaluation) metrics to assess the overlap between the model-generated text and the reference text. Here's the breakdown of the four different ROUGE metrics:
- ROUGE-1: Measures the overlap of unigrams (individual words) between the generated and reference summaries. It gives an indication of how many single words from the reference are captured in the generated summary.
- ROUGE-2: Measures the overlap of bigrams (two consecutive words) between the generated and reference summaries. This metric takes into account the preservation of word sequences, reflecting how well the generated text captures small sequences of words from the reference.
- ROUGE-L: Measures the longest common subsequence (LCS) between the generated and reference summaries. It evaluates the longest sequence of words that appear in both texts in the same order but not necessarily consecutively. This metric captures a more flexible notion of similarity based on the overall structure of the text.
- ROUGE-Lsum: Specifically designed for evaluating summarization tasks, it computes the LCS at the sentence level, focusing on the structure of summaries. ROUGE-Lsum looks at how well the sentences in the generated summary align with those in the reference summary.
The value for each is between 0 and 1, with 0 meaning completely different summaries and 1 meaning exactly the same, so it is a nice and easy metric to understand. I ran a test on my generated data, using Hugging Face's evaluate library (which contains all 4 ROUGE metrics from above), and I got: {'rouge1': 0.7613263913669215, 'rouge2': 0.4409441673251364, 'rougeL': 0.6578334536498124, 'rougeLsum': 0.7542867701046572}
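For reference, here is roughly the shape of that check (a sketch with made-up strings rather than my actual data):

import evaluate

# Loads all four ROUGE variants (rouge1, rouge2, rougeL, rougeLsum)
rouge = evaluate.load("rouge")

predictions = ["The movie has a compelling storyline and strong characters."]
references = ["The film features a compelling storyline with strong characters."]

print(rouge.compute(predictions=predictions, references=references))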
However, I am concerned: in addition to text, the model outputs a score out of 10 for each movie category (storyline, characters, etc.), and metrics like ROUGE do not evaluate the model on these numbers. So as part of my update email to my professor, I mentioned this concern and asked whether we should look into an evaluation approach suited to models that output scores.
That is all for today!
See you tomorrow :)