Sign In

Integrate GPT-4o into comfyui to achieve LLM visual functions!

14

245

7

Updated: May 20, 2024

workflowagentgpt4v

Type

Workflows

Stats

171

0

Reviews

Published

May 20, 2024

Base Model

SD 1.5

Hash

AutoV2
842D15CB09
default creator card background decoration
hst97858's Avatar

hst97858

GPT-4o has been released, and I’m joining the excitement by enabling my comfyui agent open-source project to support GPT-4o integration into comfyui, achieving visual functions.

The project address is:heshengtao/comfyui_LLM_party: A set of block-based LLM agent node libraries designed for ComfyUI development.(一组面向comfyui开发的积木化LLM智能体节点库)

In my open-source project, you can use these features:

  1. You can right-click in the comfyui interface, select llm from the context menu, and you will find the nodes for this project. [how to use nodes](how_to_use_nodes.md)

  2. Supports API integration or local large model integration. Modular implementation for tool invocation.When entering the base_url, please use a URL that ends with /v1/.You can use [ollama](https://github.com/ollama/ollama) to manage your model. Then, enter http://localhost:11434/v1/ for the base_url, ollama for the api_key, and your model name for the model_name, such as: llama3. If the call fails with a 503 error, you can try turning off the proxy server.

  3. Local knowledge base integration with RAG support.

  4. Ability to invoke code interpreters.

  5. Enables online queries, including Google search support.

  6. Implement conditional statements within ComfyUI to categorize user queries and provide targeted responses.

  7. Supports looping links for large models, allowing two large models to engage in debates.

  8. Attach any persona mask, customize prompt templates.

  9. Supports various tool invocations, including weather lookup, time lookup, knowledge base, code execution, web search, and single-page search.

  10. Use LLM as a tool node.

  11. Rapidly develop your own web applications using API + Streamlit.The picture below is an example of a drawing application.

  12. Added a dangerous omnipotent interpreter node that allows the large model to perform any task.

  13. It is recommended to use the show_text node under the function submenu of the right-click menu as the display output for the LLM node.