Large Language Model Configuration Guide

This guide helps you quickly configure local or remote large language models for inference.

1. Local Model Configuration

1.1 Model Address Configuration

Ollama is enabled by default, and the system automatically sets the model address to the local Ollama instance.

  • Modify Address: Manually change the address to point to a remote Ollama server if needed.
  • Reset Address: Click the “Reset” button to restore the default local address.
  • Check Available Models: Click the “Check” button to detect the models available on the Ollama server and display the results in the interface.
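
If you want to verify the connection outside the interface, Ollama also exposes a small REST API that lists installed models. The sketch below performs roughly the same check as the “Check” button, assuming a default installation at http://localhost:11434:

```python
import requests

# Query the Ollama REST API for locally installed models.
# Assumes Ollama's default address; replace with your remote server if needed.
OLLAMA_URL = "http://localhost:11434"

resp = requests.get(f"{OLLAMA_URL}/api/tags", timeout=5)
resp.raise_for_status()

for model in resp.json().get("models", []):
    print(model["name"])
```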

Model Configuration Interface

1.2 Model Download

  • The system filters suitable models based on the device’s memory and GPU capability.
  • Select an appropriate model specification and click “Download” to start the download.

Ollama Model Download Interface

1.3 URL Download (Supports Hugging Face and Ollama)

2. Hugging Face Model Download Guide

Taking DeepSeek R1 Distill as an example, follow these steps:

  1. Visit the model page: DeepSeek R1 Distill (https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)
  2. Copy the Model ID (e.g., deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B).
  3. Paste the Model ID into the Argo > Model URL Download page.
  4. Click “Download”, and the system will parse and display model details.
  5. The default quantization method is q8_0, but it can be adjusted if needed.
  6. Confirm the information and click “OK” to start downloading.

Example Screenshots:

Hugging Face Model ID Copy

Parsing Hugging Face Model Info

Downloading Hugging Face Model
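
For reference, the same model files can also be fetched outside Argo with the huggingface_hub library. This is a minimal sketch of that alternative, not a description of Argo’s internal download logic:

```python
from huggingface_hub import snapshot_download

# Download the full model repository to the local Hugging Face cache.
# The repo id matches the Model ID copied in step 2.
local_dir = snapshot_download(repo_id="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")
print(f"Model files downloaded to: {local_dir}")
```

Note that this fetches the raw model files; the q8_0 quantization mentioned in step 5 is applied by Argo itself and is not part of this sketch.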

3. Ollama Model Download Guide

Taking DeepSeek R1 as an example, follow these steps:

  1. Visit the model page: DeepSeek R1 (https://ollama.com/library/deepseek-r1)
  2. Copy the Model ID (e.g., deepseek-r1) and Model Specification (e.g., 14b).
  3. Paste the relevant information into the Argo > Model URL Download page.
  4. Click “Download” to start downloading.

Example Screenshots:

Ollama Model Info

Parsing Ollama Model Info
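
If you prefer to pull the model outside Argo, Ollama’s REST API exposes the same operation. A minimal sketch, again assuming the default local address:

```python
import json
import requests

OLLAMA_URL = "http://localhost:11434"

# POST /api/pull streams progress updates as newline-delimited JSON.
# "deepseek-r1:14b" combines the Model ID and specification from the steps above.
with requests.post(
    f"{OLLAMA_URL}/api/pull",
    json={"model": "deepseek-r1:14b"},
    stream=True,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if line:
            print(json.loads(line).get("status", ""))
```

The equivalent command-line form is `ollama pull deepseek-r1:14b`.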

4. Model API Services

Argo supports multiple API providers, allowing access to various large models.

4.1 Configure API Access

  • Enter the API Key and save it.
  • Click “Check” to verify that the API is accessible.
  • Use the toggle switch to enable or disable each API provider.

4.2 SiliconFlow Configuration

  • Access SiliconFlow models (e.g., Qwen/Qwen2.5-7B-Instruct, THUDM/glm-4-9b-chat).
  • Default API Address: https://api.siliconflow.cn/v1
  • Requires API Key for access.
  • Calls are made through the official OpenAI Python SDK.

SiliconFlow API Configuration
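
Because SiliconFlow’s endpoint is OpenAI-compatible, a direct call with the official OpenAI Python SDK looks like the sketch below (the API key is a placeholder; supply your own):

```python
from openai import OpenAI

# SiliconFlow exposes an OpenAI-compatible endpoint, so the official SDK works
# as-is once the base URL and API key are swapped in.
client = OpenAI(
    api_key="YOUR_SILICONFLOW_API_KEY",  # placeholder; use your real key
    base_url="https://api.siliconflow.cn/v1",
)

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```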

4.3 OpenAI Configuration

  • Access OpenAI models (e.g., gpt-4, gpt-4o).
  • Default API Address: https://api.openai.com/v1
  • Requires API Key for access.
  • If using a proxy service, manually modify the API URL.
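
At the SDK level, “modify the API URL” amounts to overriding base_url, which otherwise defaults to https://api.openai.com/v1. The proxy address below is a placeholder:

```python
from openai import OpenAI

# base_url defaults to https://api.openai.com/v1; point it at your proxy instead.
# "https://your-proxy.example.com/v1" is a placeholder address.
client = OpenAI(
    api_key="YOUR_OPENAI_API_KEY",
    base_url="https://your-proxy.example.com/v1",
)
```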

4.4 Anthropic Configuration

  • Access Anthropic models (e.g., the Claude Sonnet series).
  • Default API Address: https://api.anthropic.com
  • Requires API Key for access.
  • Supports proxy configuration; manually modify API URL if needed.
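
For reference, direct access with the official anthropic Python SDK looks roughly like this; the model id is one example from the Sonnet series, and Argo’s internal calls may differ:

```python
from anthropic import Anthropic

# The SDK can also read ANTHROPIC_API_KEY from the environment;
# base_url can be overridden if you route through a proxy.
client = Anthropic(api_key="YOUR_ANTHROPIC_API_KEY")  # placeholder key

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # example Sonnet model id
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)
```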

4.5 Custom OpenAI-Compatible API

  • Supports OpenAI-compatible APIs, which can be accessed with the official OpenAI SDK.
  • Requires an API Key and API URL, with an option to set a custom provider name.

OpenAI-Compatible API Configuration
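
Any OpenAI-compatible endpoint follows the same pattern as the SiliconFlow example above. The URL and key below are placeholders for your own provider, and listing models doubles as a quick accessibility check, similar to the “Check” button:

```python
from openai import OpenAI

# Placeholders: substitute your provider's endpoint and API key.
client = OpenAI(
    api_key="YOUR_PROVIDER_API_KEY",
    base_url="https://your-provider.example.com/v1",
)

# Listing models is a quick way to confirm the endpoint and key are valid.
for model in client.models.list():
    print(model.id)
```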

5. Conclusion

With the steps above, you can quickly configure and use both local and cloud-based large model services.