TokenAI
Horus 1.0 4B

Text Generation Model

Published 8 April 2026

Horus 1.0 4B is a compact language model designed for efficient text generation and chat, with strong performance on reasoning and conversational tasks.

Architecture

LLaMA-based architecture with proven attention mechanisms

Context

8K-token context length for efficient text generation

Chat

Optimized for conversational AI and interactive chat

Reasoning

Strong reasoning and logical inference capabilities

Model Versions

Full Weights

F16 (Original)

9.03 GB

Unquantized 16-bit (F16) weights

RAM: 12 GB

VRAM: 10 GB

View on Hugging Face

Compressed (GGUF)

Q8_0

4.8 GB

Near-lossless

RAM: 6 GB

VRAM: 5 GB

Q6_K

3.71 GB

Excellent

RAM: 5 GB

VRAM: 4 GB

Q5_K_M

3.23 GB

Very Good

RAM: 4 GB

VRAM: 3.5 GB

Q4_K_M

2.78 GB

Good

RAM: 3.5 GB

VRAM: 3 GB


Detailed Specifications

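| Format | File Size | Min RAM | Min VRAM | Quality | Best For |
|--------|-----------|---------|----------|---------|----------|
| F16 | 9.03 GB | 12 GB | 10 GB | Maximum quality | High-end GPUs (RTX 3090, A100) |
| Q8_0 | 4.8 GB | 6 GB | 5 GB | Near-lossless | RTX 3060 12GB, RTX 4060 |
| Q6_K | 3.71 GB | 5 GB | 4 GB | Excellent | RTX 3060, RTX 4060 Laptop |
| Q5_K_M | 3.23 GB | 4 GB | 3.5 GB | Very Good | GTX 1650, RTX 3050 |
| Q4_K_M | 2.78 GB | 3.5 GB | 3 GB | Good | Entry-level GPUs, CPU-only |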
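
The table can be read as a lookup: given the memory on a machine, take the highest-quality format whose minimums fit. The sketch below encodes the rows above and does that selection. The names `Variant` and `pick_variant` are hypothetical helpers, not part of any TokenAI tooling, and the sketch assumes both the RAM and VRAM minimums must be met (GPU inference); a CPU-only setup would only need the RAM check.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    name: str
    file_gb: float
    min_ram_gb: float
    min_vram_gb: float


# Values copied from the specifications table, ordered from highest
# to lowest quality.
VARIANTS = [
    Variant("F16", 9.03, 12, 10),
    Variant("Q8_0", 4.8, 6, 5),
    Variant("Q6_K", 3.71, 5, 4),
    Variant("Q5_K_M", 3.23, 4, 3.5),
    Variant("Q4_K_M", 2.78, 3.5, 3),
]


def pick_variant(ram_gb: float, vram_gb: float) -> Optional[Variant]:
    """Return the highest-quality variant whose minimums fit, or None.

    Assumption: both the RAM and the VRAM minimum must be satisfied.
    """
    for variant in VARIANTS:
        if ram_gb >= variant.min_ram_gb and vram_gb >= variant.min_vram_gb:
            return variant
    return None


if __name__ == "__main__":
    choice = pick_variant(ram_gb=16, vram_gb=8)
    print(choice.name if choice else "no variant fits")
```

With 16 GB of RAM and 8 GB of VRAM, F16 is ruled out by its 10 GB VRAM minimum, so the function falls through to Q8_0.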

Model Configurator

Customize your setup and get a ready-to-run code snippet

Generated code for the Q4_K_M version:

import neuralnode as nn

# Load the Horus 1.0 4B model
horus = nn.load("tokenai/Horus_1.0_4B_GGUF/Horus_1.0_4B_Q4_K_M.gguf")

# Generate text
response = horus.generate("Explain quantum computing in simple terms")

print(response.content)
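
To load a different quantized version, only the file name in the path should need to change. The sketch below builds that path for each GGUF format listed on this page. It assumes the other files follow the same naming pattern as the Q4_K_M file shown above, which the page does not confirm; `gguf_path` is a hypothetical helper.

```python
# Repository and file-name pattern taken from the Q4_K_M snippet above.
# Assumption: the other GGUF files use the same naming pattern.
REPO = "tokenai/Horus_1.0_4B_GGUF"
GGUF_QUANTS = ("Q8_0", "Q6_K", "Q5_K_M", "Q4_K_M")


def gguf_path(quant: str) -> str:
    """Build the model path for one of the listed GGUF quantizations."""
    if quant not in GGUF_QUANTS:
        raise ValueError(
            f"unknown quantization {quant!r}; expected one of {GGUF_QUANTS}"
        )
    return f"{REPO}/Horus_1.0_4B_{quant}.gguf"


if __name__ == "__main__":
    for quant in GGUF_QUANTS:
        print(gguf_path(quant))
```

The result could then be passed to the loader in place of the literal string, for example `nn.load(gguf_path("Q6_K"))`, provided the file exists under that name.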

4 billion parameters: a compact transformer optimized for efficient local deployment.

LLaMA-based architecture with proven attention mechanisms for reliable performance.

8K-token context length, suitable for most text generation and coding tasks.

Excels in reasoning and chat tasks while keeping resource requirements low.

Open license enabling unrestricted use and fine-tuning for custom applications.

Performance Benchmarks

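| Benchmark | Metric | Horus 1.0 4B | Comparison |
|-----------|--------|--------------|------------|
| MMLU | 5-shot | 71.5 | Phi-3 mini: 68.5 |
| HumanEval | Pass@1 | 65.2 | Phi-3 mini: 62.8 |
| GSM8K | Math | 72.8 | Phi-3 mini: 70.5 |
| BBH | Reasoning | 68.4 | Phi-3 mini: 65.2 |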
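
For a quick read of the gaps, the margin over Phi-3 mini can be computed per benchmark. This is a small arithmetic sketch using only the scores reported in the table; `margins` is a hypothetical helper, and the page does not state whether both models were evaluated under identical settings.

```python
# (Horus 1.0 4B score, Phi-3 mini score), as reported in the table above.
SCORES = {
    "MMLU": (71.5, 68.5),
    "HumanEval": (65.2, 62.8),
    "GSM8K": (72.8, 70.5),
    "BBH": (68.4, 65.2),
}


def margins(scores):
    """Return the Horus minus Phi-3 mini difference per benchmark, in points."""
    return {
        name: round(horus - phi3, 1)
        for name, (horus, phi3) in scores.items()
    }


if __name__ == "__main__":
    for name, delta in margins(SCORES).items():
        print(f"{name}: {delta:+.1f} points")
```

On the reported numbers the margins range from 2.3 points (GSM8K) to 3.2 points (BBH).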

Standardized Verification

[Images: Benchmark Results 1 to 3]

Horus 1.0 4B scores above Phi-3 mini on all four benchmarks listed above, which makes it a fit for developers who want capable language models without large computational resources.