---
license: apache-2.0
tags:
- uncensored
- qwen3.6
- moe
- gguf
- vision
- multimodal
- genesis
language:
- en
- zh
- multilingual
pipeline_tag: image-text-to-text
base_model:
- HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive
---

# 🔥 **Claude 4.6 Genesis release now available:** [Qwen3.6-35B-A3B-Uncensored-Claude-4.6-Genesis-APEX-GGUF](https://huggingface.co/LuffyTheFox/Qwen3.6-35B-A3B-Uncensored-Claude-4.6-Genesis-APEX-GGUF)

> ⚡ [https://web.tribute.tg/d/KIH](https://web.tribute.tg/d/KIH) ⚡ If you like this Genesis LLM release, you can [**donate**](https://web.tribute.tg/d/KIH) via the [@Tribute](https://t.me/tribute) bot on Telegram to support future Genesis LLM development.

# 🌟 Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive -> Genesis-V2

> The key difference from the Wasserstein release and the old Genesis release is data regeneration: tensor data is rebuilt using statistics computed from what the model has already learned and stored in its tensors. In this version I regenerated even more dead blocks from the data in healthy blocks.

> **[Join the Discord](https://discord.gg/SZ5vacTXYf)** for updates, roadmaps, projects, or just to chat.

Base model: [HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive](https://huggingface.co/HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive) - **0/465 refusals.**

Thanks to [HauhauCS](https://huggingface.co/HauhauCS) 

## Usage

**Ready to use.** Recommended quant: **APEX** or **MTP-APEX**

On my RTX 3060 12GB, regular chat runs at more tokens per second **without MTP**.

**Tensor drift repair by me. Method: Sig-ScaleSync-[Wasserstein](https://en.wikipedia.org/wiki/Wasserstein_metric)** 

LLM weights often suffer from:

- **Saturated weights**: the model's activations are stuck, gradients vanish, outputs degrade
- **Scale mismatches**: one layer's weights are 10× larger than its peers for no good reason
- **Mean drift**: weight distributions shifted positive or negative, breaking symmetry assumptions

My approach fixes all of that without retraining - pure numerical surgery on the raw bytes of the file.
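To make the "scale mismatch" idea concrete, here is a minimal sketch of how a tensor that is much "louder" than its peers could be detected and rescaled. It is not the Sig-ScaleSync-Wasserstein implementation: the function names, the log-ratio test, and the `0.5` threshold are assumptions chosen for illustration, and it works on plain Python lists rather than on GGUF bytes.

```python
import math
import statistics

def scale_factors(peer_tensors, log_ratio_threshold=0.5):
    """Return {name: alpha} for tensors whose spread is far from the peer median.

    alpha = sigma_median / sigma_tensor, so multiplying a flagged tensor by
    alpha brings its standard deviation back to the peer median.
    The threshold on |ln(sigma_tensor / sigma_median)| is hypothetical.
    """
    sigmas = {name: statistics.pstdev(w) for name, w in peer_tensors.items()}
    sigma_med = statistics.median(sigmas.values())
    factors = {}
    for name, sigma in sigmas.items():
        if sigma == 0 or sigma_med == 0:
            continue  # degenerate tensor: the log-ratio is undefined
        if abs(math.log(sigma / sigma_med)) > log_ratio_threshold:
            factors[name] = sigma_med / sigma
    return factors

def rescale(weights, alpha):
    """Apply the scalar correction to every weight."""
    return [w * alpha for w in weights]

# Toy peer group: the third tensor is exactly twice as "loud" as the others.
peers = {
    "blk.0.w": [-0.02, 0.01, 0.03, -0.01],
    "blk.1.w": [-0.02, 0.01, 0.03, -0.01],
    "blk.2.w": [-0.04, 0.02, 0.06, -0.02],
}
factors = scale_factors(peers)
repaired = rescale(peers["blk.2.w"], factors["blk.2.w"])
```

In this toy case only `blk.2.w` is flagged, and rescaling it restores the peer-median spread. A real repair would also need to handle quantized blocks and decide which tensors count as peers.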

**Quantization script available here: https://pastebin.com/hXhcMJn9**

Feel free to do your own quants if you want.

## Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive: Diagnostic & Repair Summary

| Metric | Value |
|--------|-------|
| Weight tensors analyzed | 500 |
| Healthy (all criteria) | 497 |
| Repaired (C2 – scale misalignment) | 3 |
| Skipped | 233 | 

### Repair Effectiveness

| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| S (saturation error) | 0.0023 | 0.0008 | **63.7%** |
| W1 (Wasserstein‑1) | 0.0035 | 0.0008 | **76.2%** |

**Scale correction factors (α):** min = 0.577, mean = 0.602, max = 0.653.  
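For readers unfamiliar with the W1 metric, the sketch below shows one common way to compute the Wasserstein-1 distance between two equal-size 1-D samples: sort both and average the absolute differences. Which reference distribution the repair compares each tensor against is not stated here, so the inputs below are placeholders only.

```python
def wasserstein1(xs, ys):
    """W1 between two equal-size 1-D samples with uniform weights.

    For empirical distributions on the real line, the optimal transport plan
    matches sorted values pairwise, so W1 is the mean |x_(i) - y_(i)|.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def relative_improvement(before, after):
    """Fractional reduction of an error metric, e.g. W1 before vs. after."""
    return (before - after) / before

shifted = wasserstein1([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])   # every point moves by 1
same = wasserstein1([3.0, 1.0, 2.0], [1.0, 2.0, 3.0])      # same values, different order
```

Note that applying `relative_improvement` to the rounded table values will not reproduce the reported percentages exactly, presumably because those were computed from unrounded numbers.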

### Repaired Tensors

All three are `ssm_conv1d.weight` layers – recurrent state transition layers responsible for long‑context memory.

| Tensor | α | D (log‑ratio) | W1 before | W1 after |
|--------|---|---------------|-----------|----------|
| blk.36.ssm_conv1d.weight | 0.5765 | 0.553 | 0.0038 | 0.0009 |
| blk.37.ssm_conv1d.weight | 0.5768 | 0.725 | 0.0040 | 0.0009 |
| blk.38.ssm_conv1d.weight | 0.6533 | 0.649 | 0.0026 | 0.0006 |

**Interpretation:** All three layers were too loud (σ_w > σ_med by 50–100%). Scale correction restored them to the peer median. W1 dropped by roughly 76–78% per tensor, confirming that the distribution shape was normalized.
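The figures in the table can be cross-checked with a few lines of arithmetic. The sketch below assumes that α = σ_med / σ_w and that D = ln(σ_w / σ_med); neither definition is spelled out above, so treat both overshoot estimates as readings of the table rather than as the method's actual formulas.

```python
import math

# (alpha, D, W1 before, W1 after) copied from the "Repaired Tensors" table.
repaired = {
    "blk.36.ssm_conv1d.weight": (0.5765, 0.553, 0.0038, 0.0009),
    "blk.37.ssm_conv1d.weight": (0.5768, 0.725, 0.0040, 0.0009),
    "blk.38.ssm_conv1d.weight": (0.6533, 0.649, 0.0026, 0.0006),
}

summary = {}
for name, (alpha, d, w1_before, w1_after) in repaired.items():
    summary[name] = {
        # Overshoot if alpha fully maps sigma_w onto sigma_med (assumption).
        "overshoot_from_alpha": 1.0 / alpha - 1.0,
        # Overshoot if D is the natural log of sigma_w / sigma_med (assumption).
        "overshoot_from_D": math.exp(d) - 1.0,
        # Relative W1 reduction from the rounded table values.
        "w1_drop": (w1_before - w1_after) / w1_before,
    }

mean_alpha = sum(alpha for alpha, _, _, _ in repaired.values()) / len(repaired)

for name, row in summary.items():
    print(name, {k: f"{v:.1%}" for k, v in row.items()})
print(f"mean alpha = {mean_alpha:.3f}")
```

Under these assumptions the two overshoot estimates do not coincide, which suggests that α is not simply the inverse of exp(D); the W1 reductions, however, all fall between 75% and 80%.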

---

**Verdict:** Model is clinically healthy. 497 out of 500 weight tensors passed all four criteria. Three SSM layers repaired successfully. No saturation, no W1 drift, no ReLU asymmetry. Ready for use.

---

**Links:**
- [Original uncensored model](https://huggingface.co/HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive)
- [Quantization Script with Unsloth profiles support](https://pastebin.com/hXhcMJn9)

---

## Wanna fix your GGUF model?

Contact: luffythefox@mail.ru

My Telegram: @LuffyTheFox

## 🌟 Recommended Settings (LM Studio)

Set K Cache Quantization Type and V Cache Quantization Type in advanced model loading settings to Q8_0 or F16.

**Chat template:** [chat_template.jinja](https://huggingface.co/LuffyTheFox/Qwen3.6-35B-A3B-Uncensored-Genesis-V2-APEX-MTP-GGUF/raw/main/chat_template.jinja)

**Chat template (thinking):** [chat_template_thinking.jinja](https://huggingface.co/LuffyTheFox/Qwen3.6-35B-A3B-Uncensored-Genesis-V2-APEX-MTP-GGUF/raw/main/chat_template_thinking.jinja)

| Parameter | Value |
|-----------|-------|
| Temperature | 0.7 |
| Top K Sampling | 20 |
| Presence Penalty | 1.5 |
| Repeat Penalty | 1.0 |
| Top P Sampling | 0.8 |
| Min P Sampling | 0 |
| Seed | 42 |

**System prompt:** [System_Prompt.txt](https://huggingface.co/LuffyTheFox/Qwen3.6-35B-A3B-Uncensored-Genesis-APEX-MTP-GGUF/raw/main/System_Prompt.txt)

Or use this minimal string as the **first line**:

> `You are Qwen, created by Alibaba Cloud. You are a helpful assistant.`

Then add anything you want after.

## About

No changes to datasets or capabilities. Fully functional - 100% of what the original authors intended, just without refusals and with the critical architecture bug fixed on output layers.

**These are meant to be the best lossless uncensored models out there.**

---

## Specs

- 35B total parameters, ~3B active per forward pass (MoE)
- 256 experts, 8 routed + 1 shared per token
- Hybrid architecture: Gated DeltaNet linear attention + full softmax attention (3:1 ratio)
- 40 layers, pattern: 10 × (3 × DeltaNet-MoE + 1 × Attention-MoE)
- 262K native context (extendable to 1M with YaRN)
- Natively multimodal (text, image, video)
- Multi-token prediction (MTP) support
- 248K vocabulary, 201 languages
- Base model: [HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive](https://huggingface.co/HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive)
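The layer pattern from the specs can be written out explicitly. This small sketch assumes blocks are numbered from 0 and that each group of four starts with its three DeltaNet layers; both are assumptions about indexing, not something confirmed by the spec list.

```python
def layer_pattern(groups=10, deltanet_per_group=3):
    """Expand 10 x (3 x DeltaNet-MoE + 1 x Attention-MoE) into a flat list."""
    pattern = []
    for _ in range(groups):
        pattern += ["deltanet"] * deltanet_per_group + ["attention"]
    return pattern

layers = layer_pattern()
deltanet_blocks = [i for i, kind in enumerate(layers) if kind == "deltanet"]
attention_blocks = [i for i, kind in enumerate(layers) if kind == "attention"]

print(len(layers), "layers;", "attention at", attention_blocks)
```

Under that indexing, blocks 36, 37, and 38 are DeltaNet layers and block 39 is full attention, which is consistent with `ssm_conv1d.weight` tensors appearing at `blk.36` through `blk.38`.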

---

## Recommended Settings (Official Qwen Authors)

**Thinking mode (default):**
- General: `temperature=1.0, top_p=0.95, top_k=20, min_p=0, presence_penalty=1.5`
- Coding/precise tasks: `temperature=0.6, top_p=0.95, top_k=20, min_p=0, presence_penalty=0`

**Non-thinking mode:**
- General: `temperature=0.7, top_p=0.8, top_k=20, min_p=0, presence_penalty=1.5`
- Reasoning tasks: `temperature=1.0, top_p=1.0, top_k=40, min_p=0, presence_penalty=2.0`

**Important:**
- Keep at least 128K context to preserve thinking capabilities
- Use `--jinja` flag with llama.cpp for proper chat template handling
- Vision support requires the `mmproj` file alongside the main GGUF
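As an illustration of how these recommendations fit together, the sketch below assembles a `llama-server` command line from the presets above. The file names `model.gguf` and `mmproj.gguf` are placeholders, the helper name is hypothetical, and the 131072-token context is simply 128K written out; adjust everything to your own files and hardware.

```python
import shlex

# Sampling presets copied from the lists above, keyed by (mode, task).
PRESETS = {
    ("thinking", "general"): dict(temperature=1.0, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=1.5),
    ("thinking", "coding"): dict(temperature=0.6, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=0.0),
    ("non-thinking", "general"): dict(temperature=0.7, top_p=0.8, top_k=20, min_p=0.0, presence_penalty=1.5),
    ("non-thinking", "reasoning"): dict(temperature=1.0, top_p=1.0, top_k=40, min_p=0.0, presence_penalty=2.0),
}

def llama_server_args(model_path, mmproj_path, mode="thinking", task="general", ctx_size=131072):
    """Build an argv list for llama-server (hypothetical helper)."""
    p = PRESETS[(mode, task)]
    return [
        "llama-server",
        "-m", model_path,
        "--mmproj", mmproj_path,   # needed for vision input
        "--jinja",                 # use the chat template embedded in the GGUF
        "-c", str(ctx_size),       # 128K context
        "--temp", str(p["temperature"]),
        "--top-p", str(p["top_p"]),
        "--top-k", str(p["top_k"]),
        "--min-p", str(p["min_p"]),
        "--presence-penalty", str(p["presence_penalty"]),
    ]

args = llama_server_args("model.gguf", "mmproj.gguf")
print(shlex.join(args))
```

Switching to the coding preset is then a matter of calling `llama_server_args(..., task="coding")`. Whether a 128K context fits in memory depends on the quant and the GPU, so a smaller `ctx_size` may be necessary in practice.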

---

## Compatibility

Works with llama.cpp, LM Studio, koboldcpp, and other GGUF-compatible runtimes.