Fix: Add Offline DeepSeek Model #82
intelligentnode/Intelli#94

/claim #82

Hi, this is Enity300. I am sending this pull request from another ID due to some issues I’m currently facing with Algora-pbc. Please be assured that any modifications to the approach, or any related discussion, will be handled promptly through this ID and through @Enity300.

Offline DeepSeek Model Loader Implementation

Changes Made:

  • Added model/deepseek implementation directory with:
    • Core wrapper (wrapper.py lines 1-98)
    • Quantization helpers (helpers/quantize.py)
    • Memory mapping system (helpers/memory_map.py)
  • Implemented optimized loading patterns from llama.cpp (reference: llama_cpp_wrapper.py; a hedged sketch of the pattern follows this list)
  • Added integration tests (test_deepseek_wrapper.py)
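
For reviewers, here is a minimal sketch of the llama.cpp-style offline loading pattern via the llama-cpp-python bindings. The local file name and generation settings are placeholders, not the exact values used in llama_cpp_wrapper.py:

```python
# Hedged sketch: offline GGUF load via llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/deepseek-r1-distill-qwen-7b-q4_k_m.gguf",  # hypothetical local file
    n_ctx=4096,       # context window
    n_gpu_layers=-1,  # offload all layers to GPU when available; 0 = CPU only
    use_mmap=True,    # memory-map the weights instead of copying them into RAM
)

out = llm("Q: What is 12 * 7?\nA:", max_tokens=16, temperature=0.0)
print(out["choices"][0]["text"])
```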

Key Features:

  1. HuggingFace Hub Integration
    • Direct model downloads with config fallback (see wrapper.py; sketched below this list)
    • Supports both full and quantized GGUF models
  2. Memory Optimizations
    • Layer-wise loading (reference: model_loader.py)
    • mmap-based tensor loading (reference: memory_map.py; sketched below)
  3. Quantization Support
    • 4/8-bit via bitsandbytes (quantize.py; sketched below)
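
To make the Hub integration concrete, a hedged sketch of downloading either the quantized GGUF file or the full model, with a fallback. The helper name and the exact GGUF filename are assumptions, not the code in wrapper.py:

```python
# Sketch: fetch a quantized GGUF file, falling back to the full model snapshot.
from huggingface_hub import hf_hub_download, snapshot_download

def fetch_deepseek(quantized: bool = True) -> str:
    """Return a local path to the model weights (hypothetical helper)."""
    if quantized:
        try:
            # Filename is an assumption; the GGUF repo ships several quant levels.
            return hf_hub_download(
                repo_id="bartowski/DeepSeek-R1-Distill-Qwen-7B-GGUF",
                filename="DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf",
            )
        except Exception:
            pass  # fall back to the full model below
    return snapshot_download(repo_id="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B")
```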
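
The mmap-based, layer-wise loading can be illustrated with safetensors' lazy reader; this shows the general technique rather than the contents of memory_map.py:

```python
# Sketch: read tensors lazily from a memory-mapped safetensors file so only
# the layer currently being materialized occupies RAM.
from safetensors import safe_open

def iter_layer_tensors(path: str):
    """Yield (name, tensor) pairs one at a time from a .safetensors file."""
    with safe_open(path, framework="pt", device="cpu") as f:
        for name in f.keys():
            yield name, f.get_tensor(name)  # materialized on demand, not up front
```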
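
And a sketch of the 4-bit bitsandbytes path through transformers' BitsAndBytesConfig; the options used in quantize.py may differ:

```python
# Sketch: load the full model in 4-bit via bitsandbytes (8-bit is analogous).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # use load_in_8bit=True for the 8-bit path
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16 while weights stay 4-bit
)

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```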

Testing:

  • Verified with DeepSeek-R1-Distill-Qwen-7B models:
    • Quantized: bartowski/DeepSeek-R1-Distill-Qwen-7B-GGUF
    • Full: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
  • Added arithmetic and QA test cases (test_deepseek.py; a sample test is sketched below)
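
For reference, a hedged sketch of what an arithmetic test case can look like, written here against llama-cpp-python directly; the real tests exercise the repo's own wrapper, and the model path is a placeholder:

```python
# Sketch of an arithmetic test case; path and prompt are illustrative only.
from llama_cpp import Llama

def test_arithmetic():
    llm = Llama(
        model_path="./models/deepseek-r1-distill-qwen-7b-q4_k_m.gguf",  # hypothetical path
        n_ctx=2048,
    )
    out = llm("Q: What is 25 + 17? Answer with just the number.\nA:",
              max_tokens=8, temperature=0.0)
    assert "42" in out["choices"][0]["text"]
```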

Other necessary details are documented in the README under model/deepseek.
