
Unexpected ValueError in Metadata Summarization Function #1

Open
BennisonDevadoss opened this issue Jan 17, 2025 · 1 comment
Description:
I encountered an issue while trying to execute the file multiagent_sql_data_analyst.py in the repository. I followed the provided steps without modifying the model name or any part of the code. Additionally, I used the same data provided in the repository.

Path to the Code:
free-ai-tips/008_multiagent_sql_data_analyst/

Environment Details:

  • OS: Ubuntu 22.04.4 LTS
  • Python Version: 3.11
  • Pandas Version: 2.2.3
  • LangGraph Version: 0.2.57
  • LangChain Version: 0.3.14

Error Log:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[104], line 3
      1 # * Donut Chart
----> 3 sql_data_analyst.invoke_agent(
      4     user_instructions="Make a donut chart of sales revenue by territory for the top 5 territories.",
      5 )

File ~/miniconda3/envs/hdfc-ai/lib/python3.11/site-packages/ai_data_science_team/multiagents/sql_data_analyst.py:140, in SQLDataAnalyst.invoke_agent(self, user_instructions, **kwargs)
    119 def invoke_agent(self, user_instructions, **kwargs):
    120     """
    121     Invokes the SQL Data Analyst Multi-Agent.
    122     
   (...)
    138     ```
    139     """
--> 140     response = self._compiled_graph.invoke({
    141         "user_instructions": user_instructions,
    142     }, **kwargs)
    143     self.response = response

File ~/miniconda3/envs/hdfc-ai/lib/python3.11/site-packages/langgraph/pregel/__init__.py:1955, in Pregel.invoke(self, input, config, stream_mode, output_keys, interrupt_before, interrupt_after, debug, **kwargs)
   1953 else:
   1954     chunks = []
-> 1955 for chunk in self.stream(
   1956     input,
   1957     config,
   1958     stream_mode=stream_mode,
   1959     output_keys=output_keys,
   1960     interrupt_before=interrupt_before,
   1961     interrupt_after=interrupt_after,
   1962     debug=debug,
   1963     **kwargs,
   1964 ):
   1965     if stream_mode == "values":
   1966         latest = chunk

File ~/miniconda3/envs/hdfc-ai/lib/python3.11/site-packages/langgraph/pregel/__init__.py:1670, in Pregel.stream(self, input, config, stream_mode, output_keys, interrupt_before, interrupt_after, debug, subgraphs)
   1664     # Similarly to Bulk Synchronous Parallel / Pregel model
   1665     # computation proceeds in steps, while there are channel updates.
   1666     # Channel updates from step N are only visible in step N+1
   1667     # channels are guaranteed to be immutable for the duration of the step,
   1668     # with channel updates applied only at the transition between steps.
   1669     while loop.tick(input_keys=self.input_channels):
-> 1670         for _ in runner.tick(
   1671             loop.tasks.values(),
   1672             timeout=self.step_timeout,
   1673             retry_policy=self.retry_policy,
   1674             get_waiter=get_waiter,
   1675         ):
   1676             # emit output
   1677             yield from output()
   1678 # emit output

File ~/miniconda3/envs/hdfc-ai/lib/python3.11/site-packages/langgraph/pregel/runner.py:171, in PregelRunner.tick(self, tasks, reraise, timeout, retry_policy, get_waiter)
    169 t = tasks[0]
    170 try:
--> 171     run_with_retry(
    172         t,
    173         retry_policy,
    174         configurable={
    175             CONFIG_KEY_SEND: partial(writer, t),
    176             CONFIG_KEY_CALL: partial(call, t),
    177         },
    178     )
    179     self.commit(t, None)
    180 except Exception as exc:

File ~/miniconda3/envs/hdfc-ai/lib/python3.11/site-packages/langgraph/pregel/retry.py:40, in run_with_retry(task, retry_policy, configurable)
     38     task.writes.clear()
     39     # run the task
---> 40     return task.proc.invoke(task.input, config)
     41 except ParentCommand as exc:
     42     ns: str = config[CONF][CONFIG_KEY_CHECKPOINT_NS]

File ~/miniconda3/envs/hdfc-ai/lib/python3.11/site-packages/langgraph/utils/runnable.py:422, in RunnableSeq.invoke(self, input, config, **kwargs)
    418 config = patch_config(
    419     config, callbacks=run_manager.get_child(f"seq:step:{i+1}")
    420 )
    421 if i == 0:
--> 422     input = step.invoke(input, config, **kwargs)
    423 else:
    424     input = step.invoke(input, config)

File ~/miniconda3/envs/hdfc-ai/lib/python3.11/site-packages/langgraph/pregel/__init__.py:1955, in Pregel.invoke(self, input, config, stream_mode, output_keys, interrupt_before, interrupt_after, debug, **kwargs)
   1953 else:
   1954     chunks = []
-> 1955 for chunk in self.stream(
   1956     input,
   1957     config,
   1958     stream_mode=stream_mode,
   1959     output_keys=output_keys,
   1960     interrupt_before=interrupt_before,
   1961     interrupt_after=interrupt_after,
   1962     debug=debug,
   1963     **kwargs,
   1964 ):
   1965     if stream_mode == "values":
   1966         latest = chunk

File ~/miniconda3/envs/hdfc-ai/lib/python3.11/site-packages/langgraph/pregel/__init__.py:1670, in Pregel.stream(self, input, config, stream_mode, output_keys, interrupt_before, interrupt_after, debug, subgraphs)
   1664     # Similarly to Bulk Synchronous Parallel / Pregel model
   1665     # computation proceeds in steps, while there are channel updates.
   1666     # Channel updates from step N are only visible in step N+1
   1667     # channels are guaranteed to be immutable for the duration of the step,
   1668     # with channel updates applied only at the transition between steps.
   1669     while loop.tick(input_keys=self.input_channels):
-> 1670         for _ in runner.tick(
   1671             loop.tasks.values(),
   1672             timeout=self.step_timeout,
   1673             retry_policy=self.retry_policy,
   1674             get_waiter=get_waiter,
   1675         ):
   1676             # emit output
   1677             yield from output()
   1678 # emit output

File ~/miniconda3/envs/hdfc-ai/lib/python3.11/site-packages/langgraph/pregel/runner.py:171, in PregelRunner.tick(self, tasks, reraise, timeout, retry_policy, get_waiter)
    169 t = tasks[0]
    170 try:
--> 171     run_with_retry(
    172         t,
    173         retry_policy,
    174         configurable={
    175             CONFIG_KEY_SEND: partial(writer, t),
    176             CONFIG_KEY_CALL: partial(call, t),
    177         },
    178     )
    179     self.commit(t, None)
    180 except Exception as exc:

File ~/miniconda3/envs/hdfc-ai/lib/python3.11/site-packages/langgraph/pregel/retry.py:40, in run_with_retry(task, retry_policy, configurable)
     38     task.writes.clear()
     39     # run the task
---> 40     return task.proc.invoke(task.input, config)
     41 except ParentCommand as exc:
     42     ns: str = config[CONF][CONFIG_KEY_CHECKPOINT_NS]

File ~/miniconda3/envs/hdfc-ai/lib/python3.11/site-packages/langgraph/utils/runnable.py:422, in RunnableSeq.invoke(self, input, config, **kwargs)
    418 config = patch_config(
    419     config, callbacks=run_manager.get_child(f"seq:step:{i+1}")
    420 )
    421 if i == 0:
--> 422     input = step.invoke(input, config, **kwargs)
    423 else:
    424     input = step.invoke(input, config)

File ~/miniconda3/envs/hdfc-ai/lib/python3.11/site-packages/langgraph/utils/runnable.py:197, in RunnableCallable.invoke(self, input, config, **kwargs)
    195 else:
    196     context.run(_set_config_context, config)
--> 197     ret = context.run(self.func, input, **kwargs)
    198 if isinstance(ret, Runnable) and self.recurse:
    199     return ret.invoke(input, config)

File ~/miniconda3/envs/hdfc-ai/lib/python3.11/site-packages/ai_data_science_team/agents/data_visualization_agent.py:544, in make_data_visualization_agent.<locals>.chart_instructor(state)
    541 data_raw = state.get("data_raw")
    542 df = pd.DataFrame.from_dict(data_raw)
--> 544 all_datasets_summary = get_dataframe_summary([df], n_sample=n_samples, skip_stats=False)
    546 all_datasets_summary_str = "\n\n".join(all_datasets_summary)
    548 chart_instructor = recommend_steps_prompt | llm 

File ~/miniconda3/envs/hdfc-ai/lib/python3.11/site-packages/ai_data_science_team/tools/metadata.py:69, in get_dataframe_summary(dataframes, n_sample, skip_stats)
     67     for idx, df in enumerate(dataframes):
     68         dataset_name = f"Dataset_{idx}"
---> 69         summaries.append(_summarize_dataframe(df, dataset_name, n_sample, skip_stats))
     71 else:
     72     raise TypeError(
     73         "Input must be a single DataFrame, a list of DataFrames, or a dictionary of DataFrames."
     74     )

File ~/miniconda3/envs/hdfc-ai/lib/python3.11/site-packages/ai_data_science_team/tools/metadata.py:121, in _summarize_dataframe(df, dataset_name, n_sample, skip_stats)
    101 # 6. Generate the summary text
    102 if not skip_stats:
    103     summary_text = f"""
    104     Dataset Name: {dataset_name}
    105     ----------------------------
    106     Shape: {df.shape[0]} rows x {df.shape[1]} columns
    107 
    108     Column Data Types:
    109     {column_types}
    110 
    111     Missing Value Percentage:
    112     {missing_summary}
    113 
    114     Unique Value Counts:
    115     {unique_counts_summary}
    116 
    117     Data (first {n_sample} rows):
    118     {df.head(n_sample).to_string()}
    119 
    120     Data Description:
--> 121     {df.describe().to_string()}
    122 
    123     Data Info:
    124     {info_text}
    125     """
    126 else:
    127     summary_text = f"""
    128     Dataset Name: {dataset_name}
    129     ----------------------------
   (...)
    136     {df.head(n_sample).to_string()}
    137     """

File ~/miniconda3/envs/hdfc-ai/lib/python3.11/site-packages/pandas/core/generic.py:10940, in NDFrame.describe(self, percentiles, include, exclude, datetime_is_numeric)
  10691 @final
  10692 def describe(
  10693     self: NDFrameT,
   (...)
  10697     datetime_is_numeric: bool_t = False,
  10698 ) -> NDFrameT:
  10699     """
  10700     Generate descriptive statistics.
  10701 
   (...)
  10938     max            NaN      3.0
  10939     """
> 10940     return describe_ndframe(
  10941         obj=self,
  10942         include=include,
  10943         exclude=exclude,
  10944         datetime_is_numeric=datetime_is_numeric,
  10945         percentiles=percentiles,
  10946     )

File ~/miniconda3/envs/hdfc-ai/lib/python3.11/site-packages/pandas/core/describe.py:94, in describe_ndframe(obj, include, exclude, datetime_is_numeric, percentiles)
     89     describer = SeriesDescriber(
     90         obj=cast("Series", obj),
     91         datetime_is_numeric=datetime_is_numeric,
     92     )
     93 else:
---> 94     describer = DataFrameDescriber(
     95         obj=cast("DataFrame", obj),
     96         include=include,
     97         exclude=exclude,
     98         datetime_is_numeric=datetime_is_numeric,
     99     )
    101 result = describer.describe(percentiles=percentiles)
    102 return cast(NDFrameT, result)

File ~/miniconda3/envs/hdfc-ai/lib/python3.11/site-packages/pandas/core/describe.py:171, in DataFrameDescriber.__init__(self, obj, include, exclude, datetime_is_numeric)
    168 self.exclude = exclude
    170 if obj.ndim == 2 and obj.columns.size == 0:
--> 171     raise ValueError("Cannot describe a DataFrame without columns")
    173 super().__init__(obj, datetime_is_numeric=datetime_is_numeric)

ValueError: Cannot describe a DataFrame without columns
During task with name 'chart_instructor' and id 'fe59d395-cf50-5c7a-08d4-88cd596d5f42'
During task with name 'data_visualization_agent' and id '5da46a15-cdc4-076a-5325-cd3ffb1e368b'

Steps to Reproduce:

  1. Clone the repository and navigate to the directory free-ai-tips/008_multiagent_sql_data_analyst/.
  2. Execute the file multiagent_sql_data_analyst.py without making any changes.
  3. Use the same dataset provided in the repository.
  4. Run the code and observe the error during execution.

Expected Behavior:
The program should generate the donut chart as requested in the invoke_agent call.

Actual Behavior:
The program raises a ValueError during invoke_agent. The failure originates in pandas' df.describe(), called from _summarize_dataframe in metadata.py while the data visualization agent's chart_instructor node is summarizing the data it received.
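For reference, the final ValueError in the traceback can be reproduced directly in pandas (independent of the agents): describe() refuses a DataFrame that has no columns, which is what the chart_instructor apparently received. A minimal sketch:

```python
import pandas as pd

# pd.DataFrame.from_dict({}) (as used in chart_instructor) yields a
# DataFrame with zero columns; describe() then raises, as in the traceback.
df = pd.DataFrame.from_dict({})

try:
    df.describe()
except ValueError as e:
    print(e)  # Cannot describe a DataFrame without columns
```

This suggests the upstream SQL step handed an empty dict to the visualization agent, rather than a pandas/version incompatibility.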

Request for Assistance:

  • Could you help identify what might be causing this issue?
  • Is there any compatibility issue with the versions of the libraries I’m using?
  • Should I modify any part of the code or dataset to resolve this?

Looking forward to your guidance. Let me know if further details are needed.

Thank you!

@mdancho84 (Collaborator)

Looks like the SQL Agent step didn't return a data frame. The chart instructor failed because the data frame provided didn't have columns.

I'd try it again and see if the problem persists.
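If the problem does persist, one possible local workaround (not an upstream fix, and the helper name safe_describe is hypothetical) is to guard the describe() call so an empty result from the SQL step degrades gracefully instead of raising:

```python
import pandas as pd

def safe_describe(df: pd.DataFrame) -> str:
    """Return df.describe() as text, or a placeholder when the
    DataFrame has no columns (describe() raises in that case)."""
    if df.columns.size == 0:
        return "<no columns to describe - upstream step returned no data>"
    return df.describe().to_string()

print(safe_describe(pd.DataFrame()))               # placeholder, no exception
print(safe_describe(pd.DataFrame({"x": [1, 2]})))  # normal describe() output
```

A guard like this in _summarize_dataframe would surface "the SQL agent returned no data" in the summary text instead of crashing the whole graph run.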
