--- license: apache-2.0 tags: - uncensored - qwen3.6 - moe - gguf - vision - multimodal - genesis language: - en - zh - multilingual pipeline_tag: image-text-to-text base_model: - HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive --- # 🔥 **Claude 4.6 Genesis release now available:** [Qwen3.6-35B-A3B-Uncensored-Claude-4.6-Genesis-APEX-GGUF](https://huggingface.co/LuffyTheFox/Qwen3.6-35B-A3B-Uncensored-Claude-4.6-Genesis-APEX-GGUF) > ⚡ [https://web.tribute.tg/d/KIH](https://web.tribute.tg/d/KIH) ⚡ If you like this Genesis LLM release you can [**donate**](https://web.tribute.tg/d/KIH) to me via [@Tribute](https://t.me/tribute) bot in Telegram messenger and support future Genesis LLM development. # 🌟 Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive -> Genesis-V2 > Key difference from Wasserstein release and old Genesis release is data regeneration in model via mathematical statistics based on what it's already learned and stored in tensors. I regenerated even more dead blocks from data in healthy blocks in this version. > **[Join the Discord](https://discord.gg/SZ5vacTXYf)** for updates, roadmaps, projects, or just to chat. Base model. [HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive](https://huggingface.co/HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive)- **0/465 refusals.** Thanks to [HauhauCS](https://huggingface.co/HauhauCS) ## Usage **Ready to use.** Recommended quant: **APEX** or **MTP-APEX** On my RTX 3060 12GB and regular chatting, I have more tokens per second **without MTP**. **Tensor drift repair by me. Method: Sig-ScaleSync-[Wasserstein](https://en.wikipedia.org/wiki/Wasserstein_metric)** LLM models often have: - **Saturated weights**: the model's activations are stuck, gradients vanish, outputs degrade - **Scale mismatches**: one layer's weights are 10× larger than its peers for no good reason - **Mean drift**: weight distributions shifted positive or negative, breaking symmetry assumptions My approach fixes all of that without retraining - pure numerical surgery on the raw bytes of the file. **Quantization script available here: https://pastebin.com/hXhcMJn9** Feel free to do your own quants if you want. ## Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive: Diagnostic & Repair Summary | Metric | Value | |--------|-------| | Weight tensors analyzed | 500 | | Healthy (all criteria) | 497 | | Repaired (C2 – scale misalignment) | 3 | | Skipped | 233 | ### Repair Effectiveness | Metric | Before | After | Improvement | |--------|--------|-------|-------------| | S (saturation error) | 0.0023 | 0.0008 | **63.7%** | | W1 (Wasserstein‑1) | 0.0035 | 0.0008 | **76.2%** | **Scale correction factors (α):** min = 0.577, mean = 0.602, max = 0.653. ### Repaired Tensors All three are `ssm_conv1d.weight` layers – recurrent state transition layers responsible for long‑context memory. | Tensor | α | D (log‑ratio) | W1 before | W1 after | |--------|---|---------------|-----------|----------| | blk.36.ssm_conv1d.weight | 0.5765 | 0.553 | 0.0038 | 0.0009 | | blk.37.ssm_conv1d.weight | 0.5768 | 0.725 | 0.0040 | 0.0009 | | blk.38.ssm_conv1d.weight | 0.6533 | 0.649 | 0.0026 | 0.0006 | **Interpretation:** All three layers were too loud (σ_w > σ_med by 50–100%). Scale correction restored them to peer median. W1 dropped by ≈80%, confirming distribution shape normalized. --- **Verdict:** Model is clinically healthy. 497 out of 500 weight tensors passed all four criteria. Three SSM layers repaired successfully. No saturation, no W1 drift, no ReLU asymmetry. Ready for use. --- **Links:** - [Original uncensored model](https://huggingface.co/HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive) - [Quantization Script with Unsloth profiles support](https://pastebin.com/hXhcMJn9) --- ## Wanna fix your GGUF model? Contact: luffythefox@mail.ru My Telegram: @LuffyTheFox ## 🌟 Recommended Settings (LM Studio) Set K Cache Quantization Type and V Cache Quantization Type in advanced model loading settings to Q8_0 or F16. **Chat template:** [chat_template.jinja](https://huggingface.co/LuffyTheFox/Qwen3.6-35B-A3B-Uncensored-Genesis-V2-APEX-MTP-GGUF/raw/main/chat_template.jinja) **Chat template:** [chat_template_thinking.jinja](https://huggingface.co/LuffyTheFox/Qwen3.6-35B-A3B-Uncensored-Genesis-V2-APEX-MTP-GGUF/raw/main/chat_template_thinking.jinja) | Parameter | Value | |-----------|-------| | Temperature | 0.7 | | Top K Sampling | 20 | | Presence Penalty| 1.5 | | Repeat Penalty| 1.0 | | Top P Sampling | 0.8 | | Min P Sampling | 0 | | Seed | 42 | **System prompt:** [System_Prompt.txt](https://huggingface.co/LuffyTheFox/Qwen3.6-35B-A3B-Uncensored-Genesis-APEX-MTP-GGUF/raw/main/System_Prompt.txt) Or use this minimal string as the **first line**: > `You are Qwen, created by Alibaba Cloud. You are a helpful assistant.` Then add anything you want after. ## About No changes to datasets or capabilities. Fully functional - 100% of what the original authors intended, just without refusals and with the critical architecture bug fixed on output layers. **These are meant to be the best lossless uncensored models out there.** --- ## Specs - 35B total parameters, ~3B active per forward pass (MoE) - 256 experts, 8 routed + 1 shared per token - Hybrid architecture: Gated DeltaNet linear attention + full softmax attention (3:1 ratio) - 40 layers, pattern: 10 × (3 × DeltaNet-MoE + 1 × Attention-MoE) - 262K native context (extendable to 1M with YaRN) - Natively multimodal (text, image, video) - Multi-token prediction (MTP) support - 248K vocabulary, 201 languages - Base model. [HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive](https://huggingface.co/HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive) --- ## Recommended Settings (Official Qwen Authors) **Thinking mode (default):** - General: `temperature=1.0, top_p=0.95, top_k=20, min_p=0, presence_penalty=1.5` - Coding/precise tasks: `temperature=0.6, top_p=0.95, top_k=20, min_p=0, presence_penalty=0` **Non-thinking mode:** - General: `temperature=0.7, top_p=0.8, top_k=20, min_p=0, presence_penalty=1.5` - Reasoning tasks: `temperature=1.0, top_p=1.0, top_k=40, min_p=0, presence_penalty=2.0` **Important:** - Keep at least 128K context to preserve thinking capabilities - Use `--jinja` flag with llama.cpp for proper chat template handling - Vision support requires the `mmproj` file alongside the main GGUF --- ## Compatibility Works with llama.cpp, LM Studio, koboldcpp, and other GGUF-compatible runtimes.