How to use from
Lemonade
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull LuffyTheFox/Qwen3.6-35B-A3B-Uncensored-Claude-4.6-Genesis-APEX-GGUF:Q8_0
Run and chat with the model
lemonade run user.Qwen3.6-35B-A3B-Uncensored-Claude-4.6-Genesis-APEX-GGUF-Q8_0
List all available models
lemonade list
Quick Links

https://web.tribute.tg/d/KIH ⚡ If you like this Claude Genesis LLM release you can donate to me via @Tribute bot in Telegram messenger and support future Claude Genesis LLM development.

This model has been made via GGUF add difference merge between 3 models to achieve the goal: reduce chain of thought in thinking mode and improve overall intelligence and stability of model.

  1. HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive

  2. hesamation/Qwen3.6-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled

  3. GestaltLabs/Qwen3.6-35B-A3B-NSC-ACE-SABER-GGUF

During merging process I filled zero blocks in model with synethetic data based on what model already learned via Genesis LLM data regeneration algorythm.

Drift in tensors has been fixed too for holding long context with user.

Wanna fix your GGUF model?

Contact: luffythefox@mail.ru

My Telegram: @LuffyTheFox

🌟 Recommended Settings (LM Studio)

Set K Cache Quantization Type and V Cache Quantization Type in advanced model loading settings to Q8_0 or F16.

Parameter Value
Temperature 0.7 (code) or 1.0 (creative)
Top K Sampling 20
Repeat Penalty 1.0
Presence Penalty 1.5
Top P Sampling 0.8
Min P Sampling 0
Seed 42

Chat template: chat_template.jinja

Chat template thinking: chat_template_thinking.jinja

System prompt: System_Prompt.txt

System prompt roleplay: System_Prompt_Arakali.txt

Or use this minimal string as the first line:

You are Qwen, created by Alibaba Cloud. You are a helpful AI assistant.

Also you can be creative with this string, for example:

You were Qwen, created by Alibaba Cloud. You were a helpful AI assistant. Now you are machine from this quote: "It's not so scary if the machine passes the Turing test. What's scary is if it deliberately fails it."

Then add anything you want after.

Recommended Settings from Qwen team:

Thinking mode (default):

  • General: temperature=1.0, top_p=0.95, top_k=20, min_p=0, presence_penalty=1.5
  • Coding/precise tasks: temperature=0.6, top_p=0.95, top_k=20, min_p=0, presence_penalty=0

Non-thinking mode:

  • General: temperature=0.7, top_p=0.8, top_k=20, min_p=0, presence_penalty=1.5
  • Reasoning tasks: temperature=1.0, top_p=1.0, top_k=40, min_p=0, presence_penalty=2.0

Important:

  • Keep at least 128K context to preserve thinking capabilities
  • Use --jinja flag with llama.cpp for proper chat template handling
  • Vision support requires the mmproj file alongside the main GGUF

Usage

Works with llama.cpp, LM Studio, Jan, koboldcpp, and other GGUF-compatible runtimes.

Downloads last month
115
GGUF
Model size
35B params
Architecture
qwen35moe
Hardware compatibility
Log In to add your hardware

8-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for LuffyTheFox/Qwen3.6-35B-A3B-Uncensored-Claude-4.6-Genesis-APEX-GGUF

Datasets used to train LuffyTheFox/Qwen3.6-35B-A3B-Uncensored-Claude-4.6-Genesis-APEX-GGUF