lastmass/Qwen3.5-Medical-GSPO

https://huggingface.co/lastmass/Qwen3.5-Medical-GSPO
Active · by lastmass · 3.7K · 8 · updated 1 week ago

A Chinese medical reasoning model fine-tuned from Qwen3.5-4B using a two-stage training pipeline: Supervised Fine-Tuning (SFT) for format alignment, followed by Group Sequence Policy Optimization (GSPO) with an LLM-as-Judge reward function.
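The second stage can be illustrated with a minimal sketch of a GSPO-style objective as it is commonly described: each sampled response gets a length-normalized, sequence-level importance ratio, and that ratio is clipped against a group-normalized advantage. This is not the model author's training code. The function names, the `eps` value, and the use of a population standard deviation are assumptions for illustration, and the rewards stand in for scores that an LLM judge would produce.

```python
import math
from statistics import mean, pstdev


def group_advantages(rewards):
    """Normalize rewards within one group of responses to the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations vary
    if sigma == 0:
        return [0.0] * len(rewards)
    return [(r - mu) / sigma for r in rewards]


def sequence_ratio(new_logprobs, old_logprobs):
    """Length-normalized sequence-level ratio: exp(mean token log-ratio)."""
    diff = sum(n - o for n, o in zip(new_logprobs, old_logprobs))
    return math.exp(diff / len(new_logprobs))


def gspo_objective(new_lp, old_lp, rewards, eps=0.2):
    """Clipped surrogate averaged over the group (to be maximized).

    new_lp / old_lp: per-response lists of token log-probabilities under the
    current and the sampling policy. eps is a hypothetical clip range.
    """
    advantages = group_advantages(rewards)
    total = 0.0
    for n, o, a in zip(new_lp, old_lp, advantages):
        s = sequence_ratio(n, o)
        clipped = min(max(s, 1.0 - eps), 1.0 + eps)
        total += min(s * a, clipped * a)
    return total / len(rewards)


# Two responses to one prompt, with placeholder judge scores of 1.0 and 0.0.
value = gspo_objective(
    new_lp=[[0.0], [-1.0]],
    old_lp=[[-1.0], [-1.0]],
    rewards=[1.0, 0.0],
)
```

In this sketch the clipping is applied once per response rather than once per token, which is the main difference from token-level objectives such as GRPO.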

Sourced from

  • HuggingFace: lastmass/Qwen3.5-Medical-GSPO

Related resources

Hulu-Med: A Transparent Generalist Model towards Holistic Medical Vision-Language Understanding

Active · 879 · 1 week ago
Python

First architecture deeply integrating a DNA foundation model with an LLM for multimodal biological reasoning, achieving 98% accuracy on KEGG disease pathway prediction and 15%+ average gains on variant effect prediction with interpretable step-by-step reasoning traces (bowang-lab, 390+ stars)

Active · 390 · 2 months ago
Jupyter Notebook
Apache-2.0

The MediPhi Model Collection comprises 7 small language models of 3.8B parameters from the base model Phi-3.5-mini-instruct specialized in the medical and clinical domains. The collection is designed in a modular fashion. Five MediPhi experts are fine-tuned on various medical corpora (i.e.

Active · 2K · 5 months ago
Python

In search engines, rerankers are crucial for improving the accuracy of your retrieval system.

Active · 22.9K · 3 months ago
Python

This model was created as part of joint research by the HUMADEX research group (https://www.linkedin.com/company/101563689/) and has received funding from the European Union Horizon Europe Research and Innovation Program project SMILE (grant number 101080923) and Marie Skłodowska-Curie Actions…

Idle · 336 · 1 year ago