Desktop AI Assistant powered by o1, GPT-4, GPT-4V, DALL-E 3, Llama 3, Gemini, Claude. Chatbot, assistant, vision and more.
PyGPT is an all-in-one Desktop AI Assistant that provides direct interaction with OpenAI language models, including o1, GPT-4o, GPT-4 Vision, and GPT-3.5, through the OpenAI API. The application also integrates with other LLMs, like Llama 3, Gemini, Mistral, Claude, Bielik, and more, by utilizing Langchain, Llama-index and Ollama.
Features
- Desktop AI Assistant for
Linux
, Windows
and Mac
, written in Python.
- Works similarly to
ChatGPT
, but locally (on a desktop computer).
- 11 modes of operation: Chat, Vision, Completion, Assistant, Image generation, LangChain, Chat with Files, Chat with Audio, Experts, Autonomous Mode and Agents.
- Supports multiple models:
o1
, GPT-4o
, GPT-4
, GPT-3.5
, and any model accessible through LangChain
, LlamaIndex
and Ollama
such as Llama 3
, Mistral
, Google Gemini
, Anthropic Claude
, Bielik
, etc.
- Chat with your own Files: integrated
LlamaIndex
support: chat with data such as: txt
, pdf
, csv
, html
, md
, docx
, json
, epub
, xlsx
, xml
, webpages, Google
, GitHub
, video/audio, images and other data types, or use conversation history as additional context provided to the model.
- Built-in vector databases support and automated files and data embedding.
- Included support features for individuals with disabilities: customizable keyboard shortcuts, voice control, and translation of on-screen actions into audio via speech synthesis.
- Handles and stores the full context of conversations (short and long-term memory).
- Internet access via
Google
and Microsoft Bing
.
- Speech synthesis via
Microsoft Azure
, Google
, Eleven Labs
and OpenAI
Text-To-Speech services.
- Speech recognition via
OpenAI Whisper
, Google
and Microsoft Speech Recognition
.
- Real-time video camera capture in Vision mode.
- Image analysis via
GPT-4 Vision
and GPT-4o
.
- Integrated
LangChain
support (you can connect to any LLM, e.g., on HuggingFace
).
- Integrated calendar, day notes and search in contexts by selected date.
- Tools and commands execution (via plugins: access to the local filesystem, Python Code Interpreter, system commands execution, and more).
- Custom commands creation and execution.
- Crontab / Task scheduler included.
- Manages files and attachments with options to upload, download, and organize.
- Context history with the capability to revert to previous contexts (long-term memory).
- Allows you to easily manage prompts with handy editable presets.
- Provides an intuitive operation and interface.
- Includes a notepad.
- Includes simple painter / drawing tool.
- Supports multiple languages.
- Requires no previous knowledge of using AI models.
- Simplifies image generation using
DALL-E
.
- Fully configurable.
- Themes support.
- Real-time code syntax highlighting.
- Plugins support.
- Built-in token usage calculation.
- Possesses the potential to support future OpenAI models.
- Open source; source code is available on
GitHub
.
- Utilizes the user's own API key.
- and many more.
The application is free, open-source, and runs on PCs with Linux, Windows and Mac. The full Python source code is available on GitHub.
Project Website: https://pygpt.net
GitHub: https://github.com/szczyglis-dev/py-gpt
PyPi: https://pypi.org/project/pygpt-net
Documentation: https://pygpt.readthedocs.io/en/latest
Changelog:
2.4.46 (2024-12-16)
- Added a new tab in Settings: "API Keys", where the API keys configuration for Google and Anthropic models has been relocated.
- Introduced a new mode in "Chat with Files": "Retrieve Only", which allows for retrieving raw documents from the index.
- Fixed a bug related to tool calls in the Gemini provider when using Chat with Files mode.