Hello :) Today is Day 254!
A quick summary of today:
- learning about agents with LangGraph
- movie reviewer LLM evaluation metrics
Intro to LangGraph by LangChain
Setup
- Cloning the academy repo
- Running Jupyter
- Setting up LangSmith/OpenAI env variables
- Setting up the Tavily API (a search engine optimized for LLMs and RAG, aimed at efficient, quick, and persistent search results)
- Downloading LangGraph Studio for Mac
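For the env variables, the notebooks use a small helper along these lines (a sketch; which keys you need depends on your setup):

import os, getpass

def _set_env(var: str):
    # Prompt for a key only if it isn't already set in the environment
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")

_set_env("OPENAI_API_KEY")
_set_env("LANGCHAIN_API_KEY")
_set_env("TAVILY_API_KEY")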
LangGraph is a framework developed by LangChain for building agent and multi-agent applications. Unlike LangChain’s main package, LangGraph is specifically designed to give developers more control and precision in agent workflows, especially for complex real-world systems. Its goal is to help ensure that agents execute tasks more reliably by allowing developers to dictate the order in which tools are used or modify prompts based on the agent’s current state. LangGraph is part of an effort to overcome the challenges of building agents that can perform tasks autonomously with high reliability.
Module 0: Using LangChain’s chat models
from langchain_openai import ChatOpenAI
gpt4o_chat = ChatOpenAI(model="gpt-4o", temperature=0)
gpt35_chat = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)
Here is an example with OpenAI models; we specify the model name and the temperature.
Chat models in LangChain have a number of default methods. For the most part, we’ll be using:
- stream: stream back chunks of the response
- invoke: call the chain on an input
Chat models take messages as input. Messages have a role (that describes who is saying the message) and a content property. We’ll be talking a lot more about this later, but here let’s just show the basics.
from langchain_core.messages import HumanMessage
# Create a message
msg = HumanMessage(content="Hello world", name="Lance")
# Message list
messages = [msg]
# Invoke the model with a list of messages
gpt4o_chat.invoke(messages)
Output:
AIMessage(content='Hello! How can I assist you today?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 9, 'prompt_tokens': 11, 'total_tokens': 20}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_25624ae3a5', 'finish_reason': 'stop', 'logprobs': None}, id='run-9abd403f-f4de-4581-ba13-1c834a36c63d-0', usage_metadata={'input_tokens': 11, 'output_tokens': 9, 'total_tokens': 20})
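Of the two methods, invoke is shown above; here is a minimal sketch of stream (reusing the messages list from the snippet above):

# Stream the response in chunks instead of waiting for the full message
for chunk in gpt4o_chat.stream(messages):
    print(chunk.content, end="", flush=True)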
Module 1: Introduction
Here is a FAQ that was provided; it gave me a better idea of what LangGraph is.
LangGraph FAQ Summary
- Do I need to use LangChain to use LangGraph? No. LangGraph is a more low-level and controllable orchestration framework for complex agentic systems. LangChain, by contrast, provides a standard interface for interacting with models and components, suited for simpler tasks.
- How is LangGraph different from other agent frameworks? Unlike other frameworks for simple tasks, LangGraph is designed to handle bespoke, complex tasks, offering more control without restricting users to a specific cognitive architecture.
- Does LangGraph impact the performance of my app? No. LangGraph is designed for streaming workflows and will not add overhead to your code.
- Is LangGraph open source? Is it free? Yes, LangGraph is an open-source library under the MIT license, and it's free to use.
- Is LangGraph Cloud open source? No. LangGraph Cloud is proprietary software that will eventually become a paid service with certain tiers.
- How do I enable LangGraph Cloud? LangGraph Cloud is currently in beta and accessible to LangSmith Plus and Enterprise plan users.
- How are LangGraph and LangGraph Cloud different? LangGraph is a framework for building agent workflows, while LangGraph Cloud is a service for deploying, scaling, and debugging these applications. It also includes a Studio for prototyping.
- How does LangGraph fit into the LangChain ecosystem? LangGraph is used to build stateful agents with streaming and human-in-the-loop support, while LangChain's components aid development, and LangSmith assists in monitoring and optimizing production deployments. LangGraph Cloud helps turn LangGraph applications into production-ready systems.
A solitary LLM is fairly limited, as it does not have access to tools, external context, or multi-step workflows. There are apps that use chains (a series of steps, like retrieval, augmentation, generation, etc.). Chains are nice because they are deterministic: they will always follow the steps that we outline for them. What we want to do next is give multiple chains to an LLM system so that, depending on the need, the system can decide on its own which chain to invoke.
Module 1 structure:
Simple graph
3 nodes and 1 conditional edge
State
First we need to define the State of the graph. The State schema serves as the input schema for all Nodes and Edges in the graph.
We can use TypedDict, and we give it 1 key: graph_state
from typing_extensions import TypedDict
class State(TypedDict):
    graph_state: str
Then, we define Nodes
Nodes
Nodes are just Python functions. The first positional argument is the state, as defined above. Because the state is a TypedDict with the schema defined above, each node can access the key graph_state with state['graph_state']. Each node returns a new value of the state key graph_state. By default, the new value returned by each node will override the prior state value.
def node_1(state):
    print("---Node 1---")
    return {"graph_state": state['graph_state'] + " I am"}

def node_2(state):
    print("---Node 2---")
    return {"graph_state": state['graph_state'] + " happy!"}

def node_3(state):
    print("---Node 3---")
    return {"graph_state": state['graph_state'] + " sad!"}
Edges
Edges connect the nodes. Normal Edges are used if you want to always go from, for example, node_1 to node_2. Conditional Edges are used if you want to optionally route between nodes. Conditional edges are implemented as functions that return the next node to visit based upon some logic.
import random
from typing import Literal
def decide_mood(state) -> Literal["node_2", "node_3"]:
    # Often, we will use state to decide on the next node to visit
    user_input = state['graph_state']

    # Here, let's just do a 50 / 50 split between nodes 2, 3
    if random.random() < 0.5:
        # 50% of the time, we return Node 2
        return "node_2"

    # 50% of the time, we return Node 3
    return "node_3"
Graph construction
Now, we build the graph from our components defined above.
The StateGraph class is the graph class that we can use.
First, we initialize a StateGraph with the State class we defined above.
Then, we add our nodes and edges.
We use the START Node, a special node that sends user input to the graph, to indicate where to start our graph.
The END Node is a special node that represents a terminal node.
Finally, we compile our graph to perform a few basic checks on the graph structure.
We can visualize the graph as a Mermaid diagram.
from IPython.display import Image, display
from langgraph.graph import StateGraph, START, END
# Build graph
builder = StateGraph(State)
builder.add_node("node_1", node_1)
builder.add_node("node_2", node_2)
builder.add_node("node_3", node_3)
# Logic
builder.add_edge(START, "node_1")
builder.add_conditional_edges("node_1", decide_mood)
builder.add_edge("node_2", END)
builder.add_edge("node_3", END)
# Add
graph = builder.compile()
# View
display(Image(graph.get_graph().draw_mermaid_png()))
Graph Invocation
The compiled graph implements the runnable protocol.
This provides a standard way to execute LangChain components.
invoke is one of the standard methods in this interface.
The input is a dictionary {"graph_state": "Hi, this is Lance."}, which sets the initial value for our graph state dict.
When invoke is called, the graph starts execution from the START node.
It progresses through the defined nodes (node_1, node_2, node_3) in order.
The conditional edge will traverse from node_1 to node_2 or node_3 using a 50/50 decision rule.
Each node function receives the current state and returns a new value, which overrides the graph state.
The execution continues until it reaches the END node.
graph.invoke({"graph_state" : "Hi, this is Lance."})
Output: {'graph_state': 'Hi, this is Lance. I am sad!'}
invoke runs the entire graph synchronously.
This waits for each step to complete before moving to the next.
It returns the final state of the graph after all nodes have executed.
In this case, it returns the state after node_3 has completed:
{'graph_state': 'Hi, this is Lance. I am sad!'}
Node 1 appended " I am", and then with a 50% chance we go to node 2 or node 3 and append " happy!" or " sad!".
I can open this graph in LangGraph Studio as well.
In the provided academy repo, there is a studio folder with files that define the graph's logic, and all I have to do is drag the folder into LangGraph Studio.
I can interact with the graph, like in the notebook, and also see what each node did.
Chain
Next, let’s build up to a simple chain that combines 4 concepts:
- Using chat messages as our graph state
- Using chat models in graph nodes
- Binding tools to our chat model
- Executing tool calls in graph nodes
Messages
Chat models can use messages, which capture different roles within a conversation.
LangChain supports various message types, including HumanMessage, AIMessage, SystemMessage, and ToolMessage.
These represent a message from the user, from the chat model, for the chat model to instruct behavior, and from a tool call, respectively.
Let’s create a list of messages.
Each message can be supplied with a few things:
- content - content of the message
- name - optionally, a message author
- response_metadata - optionally, a dict of metadata (e.g., often populated by the model provider for AIMessages)
from pprint import pprint
from langchain_core.messages import AIMessage, HumanMessage
messages = [AIMessage(content=f"So you said you were researching ocean mammals?", name="Model")]
messages.append(HumanMessage(content=f"Yes, that's right.",name="Lance"))
messages.append(AIMessage(content=f"Great, what would you like to learn about.", name="Model"))
messages.append(HumanMessage(content=f"I want to learn about the best place to see Orcas in the US.", name="Lance"))
for m in messages:
    m.pretty_print()
Output:
================================== Ai Message ==================================
Name: Model
So you said you were researching ocean mammals?
================================ Human Message =================================
Name: Lance
Yes, that's right.
================================== Ai Message ==================================
Name: Model
Great, what would you like to learn about.
================================ Human Message =================================
Name: Lance
I want to learn about the best place to see Orcas in the US.
Chat models
Chat models can use a sequence of messages as input and support the message types discussed above. Below, OpenAI's chat model is used.
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o")
result = llm.invoke(messages)
result
Out:
AIMessage(content="Orcas, also known as killer whales, can be observed in several locations in the United States, but one of the best places to see them is the Pacific Northwest, particularly around the San Juan Islands in Washington State. \n\n### Key Locations:\n\n1. **San Juan Islands, Washington:**\n - **Best Time:** The peak season is from May to September, but they can be seen year-round.\n - **Why It's Great:** The nutrient-rich waters attract a variety of marine life, making it an ideal habitat for orcas. There are both resident and transient orca pods in the area.\n\n2. **Puget Sound, Washington:**\n - **Best Time:** Summer months, primarily from May to September.\n - **Why It's Great:** Puget Sound is home to the Southern Resident orcas, which are a unique and endangered population.\n\n3. **Monterey Bay, California:**\n - **Best Time:** Spring and fall are the best times to see transient orcas.\n - **Why It's Great:** The nutrient-rich waters of Monterey Bay attract a variety of marine life, including orcas, which come to prey on seals and sea lions.\n\n4. **Southeast Alaska:**\n - **Best Time:** Summer months, from May to September.\n - **Why It's Great:** The Inside Passage is a prime location for spotting orcas, along with humpback whales and other marine animals.\n\n### Tips for Whale Watching:\n- **Tours:** Consider booking a whale-watching tour with a reputable company that follows responsible wildlife viewing guidelines.\n- **Binoculars and Camera:** Bring binoculars for a closer look and a camera with a good zoom lens to capture the experience.\n- **Weather:** Dress in layers and prepare for variable weather conditions, especially if you're going out on a boat.\n\n### Conservation Note:\nOrcas face various threats, including habitat destruction, pollution, and reduced prey availability. Supporting conservation efforts and choosing eco-friendly tour operators can help protect these magnificent creatures for future generations.\n\nIs there anything more specific you’d like to know about orcas or whale watching?", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 427, 'prompt_tokens': 67, 'total_tokens': 494}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_157b3831f5', 'finish_reason': 'stop', 'logprobs': None}, id='run-98914d63-e629-401b-a74e-dea0e68eff03-0', usage_metadata={'input_tokens': 67, 'output_tokens': 427, 'total_tokens': 494})
Metadata from result.response_metadata:
{'token_usage': {'completion_tokens': 427,
'prompt_tokens': 67,
'total_tokens': 494},
'model_name': 'gpt-4o-2024-05-13',
'system_fingerprint': 'fp_157b3831f5',
'finish_reason': 'stop',
'logprobs': None}
Tools
Tools are useful whenever you want a model to interact with external systems.
External systems (e.g., APIs) often require a particular input schema or payload, rather than natural language.
When we bind an API, for example, as a tool, we give the model awareness of the required input schema.
The model will choose to call a tool based upon the natural language input from the user.
And, it will return an output that adheres to the tool's schema.
Many LLM providers support tool calling, and the tool calling interface in LangChain is simple.
You can simply pass any Python function into ChatModel.bind_tools(function).
Here is an example function as a tool:
def multiply(a: int, b: int) -> int:
    """Multiply a and b.

    Args:
        a: first int
        b: second int
    """
    return a * b

llm_with_tools = llm.bind_tools([multiply])
If we pass an input - e.g., "What is 2 multiplied by 3" - we see a tool call returned.
The tool call has specific arguments that match the input schema of our function along with the name of the function to call.
{'arguments': '{"a":2,"b":3}', 'name': 'multiply'}
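For reference, a small sketch of producing that tool call (assuming the llm_with_tools object bound above; the returned AIMessage exposes the parsed calls on its tool_calls attribute):

from langchain_core.messages import HumanMessage

# The model decides to call multiply instead of answering in plain text
tool_call = llm_with_tools.invoke([HumanMessage(content="What is 2 multiplied by 3", name="Lance")])
print(tool_call.tool_calls)
# e.g., [{'name': 'multiply', 'args': {'a': 2, 'b': 3}, 'id': '...', 'type': 'tool_call'}]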
Using messages as state
With these foundations in place, we can now use messages in our graph state.
We can define our state MessagesState as a TypedDict with a single key: messages.
messages is simply a list of messages, as we defined above (e.g., HumanMessage, etc).
from typing_extensions import TypedDict
from langchain_core.messages import AnyMessage
class MessagesState(TypedDict):
    messages: list[AnyMessage]
Reducers
Now, we have a minor problem!
As discussed, each node will return a new value for our state key messages.
But, this new value will override the prior messages value.
As our graph runs, we want to append messages to our messages state key.
We can use reducer functions to address this.
Reducers allow us to specify how state updates are performed.
If no reducer function is specified, then it is assumed that updates to the key should override it, as we saw before.
But, to append messages, we can use the pre-built add_messages reducer.
This ensures that any messages are appended to the existing list of messages.
We simply need to annotate our messages key with the add_messages reducer function as metadata.
from typing import Annotated
from langgraph.graph.message import add_messages
class MessagesState(TypedDict):
    messages: Annotated[list[AnyMessage], add_messages]
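The reducer can also be called directly to see the append behavior in isolation (a quick sketch with made-up messages):

from langchain_core.messages import AIMessage, HumanMessage
from langgraph.graph.message import add_messages

initial_messages = [AIMessage(content="Hello! How can I assist you?", name="Model"),
                    HumanMessage(content="I'm looking for information on whales.", name="Lance")]
new_message = AIMessage(content="Sure, what would you like to know?", name="Model")

# Returns a new 3-message list: new_message is appended, not overwritten
add_messages(initial_messages, new_message)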
Since having a list of messages in graph state is so common, LangGraph has a pre-built MessagesState!
MessagesState is defined:
- With a pre-built single messages key
- This is a list of AnyMessage objects
- It uses the add_messages reducer
We'll usually use MessagesState because it is less verbose than defining a custom TypedDict, as shown above.
from langgraph.graph import MessagesState
class MessagesState(MessagesState):
    # Add any keys needed beyond messages, which is pre-built
    pass
The graph
from IPython.display import Image, display
from langgraph.graph import StateGraph, START, END
# Node
def tool_calling_llm(state: MessagesState):
    return {"messages": [llm_with_tools.invoke(state["messages"])]}
# Build graph
builder = StateGraph(MessagesState)
builder.add_node("tool_calling_llm", tool_calling_llm)
builder.add_edge(START, "tool_calling_llm")
builder.add_edge("tool_calling_llm", END)
graph = builder.compile()
# View
display(Image(graph.get_graph().draw_mermaid_png()))
If we pass in Hello!, the LLM responds without any tool calls.
messages = graph.invoke({"messages": HumanMessage(content="Hello!")})
for m in messages['messages']:
    m.pretty_print()
================================ Human Message =================================
Hello!
================================== Ai Message ==================================
Hi there! How can I assist you today?
The LLM chooses to use a tool when it determines that the input or task requires the functionality provided by that tool.
messages = graph.invoke({"messages": HumanMessage(content="Multiply 2 and 3!")})
for m in messages['messages']:
    m.pretty_print()
================================ Human Message =================================
Multiply 2 and 3!
================================== Ai Message ==================================
Tool Calls:
multiply (call_Er4gChFoSGzU7lsuaGzfSGTQ)
Call ID: call_Er4gChFoSGzU7lsuaGzfSGTQ
Args:
a: 2
b: 3
Router
Above, we saw that the graph can:
- Return a tool call
- Return a natural language response
We can think of this as a router, where the chat model routes between a direct response or a tool call based upon the user input.
This is a simple example of an agent, where the LLM is directing the control flow either by calling a tool or just responding directly.
Below, we will extend the graph to work with either output.
First, we create the tool again:
from langchain_openai import ChatOpenAI
def multiply(a: int, b: int) -> int:
    """Multiply a and b.

    Args:
        a: first int
        b: second int
    """
    return a * b
llm = ChatOpenAI(model="gpt-4o")
llm_with_tools = llm.bind_tools([multiply])
We use the built-in ToolNode and simply pass a list of our tools to initialize it.
We use the built-in tools_condition as our conditional edge.
from IPython.display import Image, display
from langgraph.graph import StateGraph, START, END
from langgraph.graph import MessagesState
from langgraph.prebuilt import ToolNode
from langgraph.prebuilt import tools_condition
# Node
def tool_calling_llm(state: MessagesState):
    return {"messages": [llm_with_tools.invoke(state["messages"])]}
# Build graph
builder = StateGraph(MessagesState)
builder.add_node("tool_calling_llm", tool_calling_llm)
builder.add_node("tools", ToolNode([multiply]))
builder.add_edge(START, "tool_calling_llm")
builder.add_conditional_edges(
    "tool_calling_llm",
    # If the latest message (result) from assistant is a tool call -> tools_condition routes to tools
    # If the latest message (result) from assistant is not a tool call -> tools_condition routes to END
    tools_condition,
)
builder.add_edge("tools", END)
graph = builder.compile()
# View
display(Image(graph.get_graph().draw_mermaid_png()))
messages = graph.invoke({"messages": ("user", "Multiply 3 and 2")})
for m in messages['messages']:
    m.pretty_print()
================================ Human Message =================================
Multiply 3 and 2
================================== Ai Message ==================================
Tool Calls:
multiply (call_om27lpEIRsGdjQ9K78JUZtps)
Call ID: call_om27lpEIRsGdjQ9K78JUZtps
Args:
a: 3
b: 2
================================= Tool Message =================================
Name: multiply
6
Here is the same thing, but using LangGraph Studio:
Agent
We can extend the above into a generic agent architecture.
In the above router, we invoked the model and, if it chose to call a tool, we returned a ToolMessage to the user.
But, what if we simply pass that ToolMessage back to the model?
We can let it either (1) call another tool or (2) respond directly.
This is the intuition behind ReAct, a general agent architecture:
- act - let the model call specific tools
- observe - pass the tool output back to the model
- reason - let the model reason about the tool output to decide what to do next (e.g., call another tool or just respond directly)
This general purpose architecture can be applied to many types of tools.
Let's add some tools:
from langchain_openai import ChatOpenAI
def multiply(a: int, b: int) -> int:
    """Multiply a and b.

    Args:
        a: first int
        b: second int
    """
    return a * b

# This will be a tool
def add(a: int, b: int) -> int:
    """Adds a and b.

    Args:
        a: first int
        b: second int
    """
    return a + b

def divide(a: int, b: int) -> float:
    """Divide a by b.

    Args:
        a: first int
        b: second int
    """
    return a / b
tools = [add, multiply, divide]
llm = ChatOpenAI(model="gpt-4o")
llm_with_tools = llm.bind_tools(tools)
Create the assistant node
from langgraph.graph import MessagesState
from langchain_core.messages import HumanMessage, SystemMessage
# System message
sys_msg = SystemMessage(content="You are a helpful assistant tasked with performing arithmetic on a set of inputs.")
# Node
def assistant(state: MessagesState):
    return {"messages": [llm_with_tools.invoke([sys_msg] + state["messages"])]}
As before, we use MessagesState and define a Tools node with our list of tools.
The Assistant node is just our model with bound tools.
We create a graph with Assistant and Tools nodes.
We add the tools_condition edge, which routes to END or to Tools based on whether the Assistant calls a tool.
Now, we add one new step: we connect the Tools node back to the Assistant, forming a loop.
- After the assistant node executes, tools_condition checks if the model's output is a tool call.
- If it is a tool call, the flow is directed to the tools node.
- The tools node connects back to assistant.
- This loop continues as long as the model decides to call tools.
- If the model response is not a tool call, the flow is directed to END, terminating the process.
from langgraph.graph import START, StateGraph
from langgraph.prebuilt import tools_condition
from langgraph.prebuilt import ToolNode
from IPython.display import Image, display
# Graph
builder = StateGraph(MessagesState)
# Define nodes: these do the work
builder.add_node("assistant", assistant)
builder.add_node("tools", ToolNode(tools))
# Define edges: these determine how the control flow moves
builder.add_edge(START, "assistant")
builder.add_conditional_edges(
    "assistant",
    # If the latest message (result) from assistant is a tool call -> tools_condition routes to tools
    # If the latest message (result) from assistant is not a tool call -> tools_condition routes to END
    tools_condition,
)
builder.add_edge("tools", "assistant")
react_graph = builder.compile()
# Show
display(Image(react_graph.get_graph(xray=True).draw_mermaid_png()))
messages = [HumanMessage(content="Add 3 and 4, then multiply by 2, and finally divide by 5")]
messages = react_graph.invoke({"messages": messages})
for m in messages['messages']:
    m.pretty_print()
================================ Human Message =================================
Add 3 and 4, then multiply by 2, and finally divide by 5
================================== Ai Message ==================================
Tool Calls:
add (call_s056B7Zg4pWzrBeb5t0ZqTM7)
Call ID: call_s056B7Zg4pWzrBeb5t0ZqTM7
Args:
a: 3
b: 4
================================= Tool Message =================================
Name: add
7
================================== Ai Message ==================================
Tool Calls:
multiply (call_Wh2R0c53q9Kin7AeCtFcD5ng)
Call ID: call_Wh2R0c53q9Kin7AeCtFcD5ng
Args:
a: 7
b: 2
================================= Tool Message =================================
Name: multiply
14
================================== Ai Message ==================================
Tool Calls:
divide (call_U9eDdbIfW9N2DbCPCKi0i6TA)
Call ID: call_U9eDdbIfW9N2DbCPCKi0i6TA
Args:
a: 14
b: 5
================================= Tool Message =================================
Name: divide
2.8
================================== Ai Message ==================================
The result of the calculation is \(2.8\).
We can check it out in LangGraph Studio as well:
Now, the system responds with natural language, rather than a tool_call.
In LangSmith, we can even see extra tracing:
Agent with memory
Let's introduce memory to our agent.
Setup for the agent is like the above section, but for memory, here is how to set it up:
from langgraph.checkpoint.memory import MemorySaver
memory = MemorySaver()
react_graph_memory = builder.compile(checkpointer=memory)
When we use memory, we need to specify a thread_id.
This thread_id will store our collection of graph states.
Here is a cartoon:
- The checkpointer writes the state at every step of the graph
- These checkpoints are saved in a thread
- We can access that thread in the future using the thread_id
# Specify a thread
config = {"configurable": {"thread_id": "1"}}
# Specify an input
messages = [HumanMessage(content="Add 3 and 4.")]
# Run
messages = react_graph_memory.invoke({"messages": messages},config)
for m in messages['messages']:
    m.pretty_print()
================================ Human Message =================================
Add 3 and 4.
================================== Ai Message ==================================
Tool Calls:
add (call_23tmdvhv1VAfMxIZnFBFHJD7)
Call ID: call_23tmdvhv1VAfMxIZnFBFHJD7
Args:
a: 3
b: 4
================================= Tool Message =================================
Name: add
7
================================== Ai Message ==================================
The sum of 3 and 4 is 7.
If we pass the same thread_id, then we can proceed from the previously logged state checkpoint!
In this case, the above conversation is captured in the thread.
The HumanMessage we pass ("Multiply that by 2.") is appended to the above conversation.
So, the model now knows that that refers to "The sum of 3 and 4 is 7.".
messages = [HumanMessage(content="Multiply that by 2.")]
messages = react_graph_memory.invoke({"messages": messages}, config)
for m in messages['messages']:
    m.pretty_print()
================================ Human Message =================================
Add 3 and 4.
================================== Ai Message ==================================
Tool Calls:
add (call_23tmdvhv1VAfMxIZnFBFHJD7)
Call ID: call_23tmdvhv1VAfMxIZnFBFHJD7
Args:
a: 3
b: 4
================================= Tool Message =================================
Name: add
7
================================== Ai Message ==================================
The sum of 3 and 4 is 7.
================================ Human Message =================================
Multiply that by 2.
================================== Ai Message ==================================
Tool Calls:
multiply (call_aFKM1qZxhnsW9YICm8xSeLpG)
Call ID: call_aFKM1qZxhnsW9YICm8xSeLpG
Args:
a: 7
b: 2
================================= Tool Message =================================
Name: multiply
14
================================== Ai Message ==================================
The result of multiplying 7 by 2 is 14.
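Since the checkpointer saves state at every step, we can also peek at the latest checkpoint for the thread (a sketch using the react_graph_memory and config objects from above; get_state returns a snapshot whose values field is the state dict):

# Inspect the most recent checkpoint for thread "1"
snapshot = react_graph_memory.get_state(config)
for m in snapshot.values["messages"]:
    m.pretty_print()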
Deployment
There are a few central concepts to understand -
LangGraph
- Python and JavaScript library
- Allows creation of agent workflows

LangGraph API
- Bundles the graph code
- Provides a task queue for managing asynchronous operations
- Offers persistence for maintaining state across interactions

LangGraph Cloud
- Hosted service for the LangGraph API
- Allows deployment of graphs from GitHub repositories
- Also provides monitoring and tracing for deployed graphs
- Accessible via a unique URL for each deployment

LangGraph Studio
- Integrated Development Environment (IDE) for LangGraph applications
- Uses the API as its back-end, allowing real-time testing and exploration of graphs
- Can be run locally or with cloud-deployment

LangGraph SDK
- Python library for programmatically interacting with LangGraph graphs
- Provides a consistent interface for working with graphs, whether served locally or in the cloud
- Allows creation of clients, access to assistants, thread management, and execution of runs
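To make the SDK bullet concrete, here is a minimal sketch (hedged: the URL is whatever your local LangGraph API server prints on startup, and "agent" is a placeholder for the graph name registered in langgraph.json):

from langgraph_sdk import get_client

# Point the client at a locally served LangGraph API (URL is an assumption)
client = get_client(url="http://localhost:8123")

async def run_once():
    # Threads persist graph state between runs
    thread = await client.threads.create()
    async for chunk in client.runs.stream(
        thread["thread_id"],
        "agent",  # assumed graph name from langgraph.json
        input={"messages": [{"role": "user", "content": "Multiply 3 and 2"}]},
        stream_mode="values",
    ):
        print(chunk.data)

# e.g., run with: asyncio.run(run_once())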
The tutorial showed how to deploy the above agent with tools using GitHub and LangSmith.
But I cannot actually deploy it at the moment. Nevertheless, local is fine for now.
That is all for module 1. Tbh, extremely informative, and I definitely learned a lot, especially about agents and how those functions (like the multiply one) get registered with the system so the LLM can use them.
Movie reviewer LLM
Last week, I mentioned a project with one of my professors. I still cannot share anything related to it, but today I looked into evaluation metrics like ROUGE and BLEU. BLEU seems to be geared more toward machine translation than our case, and ROUGE seems to fit us better as it bundles multiple metrics.
The evaluate module on Hugging Face provides multiple ROUGE (Recall-Oriented Understudy for Gisting Evaluation) metrics to assess the overlap between the model-generated text and the reference text. Here's the breakdown of the four different ROUGE metrics:
- ROUGE-1: Measures the overlap of unigrams (individual words) between the generated and reference summaries. It gives an indication of how many single words from the reference are captured in the generated summary.
- ROUGE-2: Measures the overlap of bigrams (two consecutive words) between the generated and reference summaries. This metric takes into account the preservation of word sequences, reflecting how well the generated text captures small sequences of words from the reference.
- ROUGE-L: Measures the longest common subsequence (LCS) between the generated and reference summaries. It evaluates the longest sequence of words that appear in both texts in the same order but not necessarily consecutively. This metric captures a more flexible notion of similarity based on the overall structure of the text.
- ROUGE-Lsum: Specifically designed for evaluating summarization tasks, it computes the LCS at the sentence level, focusing on the structure of summaries. ROUGE-Lsum looks at how well the sentences in the generated summary align with those in the reference summary.
The value for each is between 0 and 1, with 0 meaning completely different summaries and 1 meaning exactly the same, so it is a nice and easy metric to understand. I ran a test on my generated data, using Hugging Face's evaluate library (which contains all 4 ROUGE metrics from above), and I got: {'rouge1': 0.7613263913669215, 'rouge2': 0.4409441673251364, 'rougeL': 0.6578334536498124, 'rougeLsum': 0.7542867701046572}
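For reference, here is roughly the shape of that check (a sketch with made-up strings rather than my actual data):

import evaluate

# Loads all four ROUGE variants (rouge1, rouge2, rougeL, rougeLsum)
rouge = evaluate.load("rouge")

predictions = ["The movie has a compelling storyline and strong characters."]
references = ["The film features a compelling storyline with strong characters."]

print(rouge.compute(predictions=predictions, references=references))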
However, I am concerned: in addition to text, the model outputs a score out of 10 for each movie category (storyline, characters, etc.), and metrics like ROUGE do not evaluate the model on these numbers. So as part of my update email to my professor, I mentioned this concern and asked whether we should look into an evaluation approach suited to models that output scores.
That is all for today!
See you tomorrow :)