I’ve spent the last two months building something that might change how students prepare for the USMLE, UKMLE, and NEET-PG. Meet Neeto-1.0-8B: a specialized, 8-billion-parameter biomedical LLM fine-tuned on a curated dataset of over 500K items. Our goal was clear: build a model that not only assists with medical exam prep (NEET-PG, USMLE, UKMLE) but also strengthens factual recall and clinical reasoning for practitioners, outperforming general-purpose models by 25% on medical datasets.
Docs + model on Hugging Face 👉 https://huggingface.co/S4nfs/Neeto-1.0-8b
🤯 The Problem
While my company was preparing a research paper on the USMLE/UKMLE/NEET-PG and medical science, I realized existing AI assistants couldn’t handle medical reasoning. They’d hallucinate drug interactions, miss diagnostic nuances, and offer dangerous oversimplifications. So I decided to build something better at my organization.
🚀 The Breakthrough
After one month of training on more than 410,000 medical samples (MedMCQA, USMLE questions, clinical cases) plus private datasets from my organization’s platform, medicoplasma[dot]com, we achieved:
| Metric | Score | Outperforms |
| --- | --- | --- |
| MedQA Accuracy | 85.8% | +87% vs general AI |
| PubMedQA | 79.0% | +23% vs other medical AIs |
| Response Time | <2 seconds | Real-time clinical use |
🔧 Technical Deep Dive
- Architecture: Llama-3.1-8B with full-parameter fine-tuning
- Training: 8×H200 GPUs using FSDP (Fully Sharded Data Parallel)
- Quantization: 4-bit GGUF for consumer hardware compatibility
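The 4-bit quantization point is easy to sanity-check with a back-of-envelope memory estimate. A rough sketch only: the ~4.5 bits/param figure is my assumption for a typical Q4 GGUF, and real files add overhead for quantization scales, embeddings, and the KV cache.

```python
# Back-of-envelope estimate of why 4-bit quantization lets an
# 8B-parameter model run on consumer hardware. Illustrative only.

def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight memory in GiB for a dense model."""
    return n_params * bits_per_param / 8 / (1024 ** 3)

N_PARAMS = 8e9  # 8 billion parameters

fp16_gb = weight_memory_gb(N_PARAMS, 16)   # full-precision weights: ~15 GiB
q4_gb = weight_memory_gb(N_PARAMS, 4.5)    # ~4.5 bits/param for Q4 GGUF: ~4 GiB

print(f"fp16: ~{fp16_gb:.1f} GiB, 4-bit: ~{q4_gb:.1f} GiB")
```

That is the difference between needing a data-center GPU and fitting comfortably in the VRAM of a single consumer card.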
Here’s how we compare to other models:
| Model | MedQA Score | Medical Reasoning |
| --- | --- | --- |
| Neeto-1.0-8B | 85.8% | Expert-level |
| Llama-3-8B-Instruct | 62.3% | Intermediate |
| OpenBioLM-8B | 59.1% | Basic |
Yesterday, I watched a friend use Neeto to work through a complex case of ureteral calculus with aberrant renal artery anatomy, something that would take hours with textbooks. Neeto provided the differential diagnosis in 1.7 seconds with 92% confidence.
💻 How to Use It Right Now
1. Install vLLM
pip install vllm
2. Run the medical AI server
vllm serve S4nfs/Neeto-1.0-8b
3. Ask medical questions
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "S4nfs/Neeto-1.0-8b",
    "prompt": "A 55-year-old male with flank pain and hematuria...",
    "max_tokens": 4096,
    "temperature": 0.7
  }'
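The same request can be sent from Python using only the standard library. This is a minimal sketch mirroring the curl call above; it assumes the vLLM server from step 2 is running on the default localhost port.

```python
# Query a locally running vLLM server via its OpenAI-compatible
# /v1/completions endpoint, using only the Python standard library.
import json
import urllib.request


def build_completion_request(prompt: str,
                             model: str = "S4nfs/Neeto-1.0-8b",
                             base_url: str = "http://localhost:8000"):
    """Assemble the URL, headers, and JSON payload for a completion call."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": 4096,
        "temperature": 0.7,
    }
    headers = {"Content-Type": "application/json"}
    return f"{base_url}/v1/completions", headers, payload


def send_completion(prompt: str) -> str:
    """POST the request to the running server and return the completion text."""
    url, headers, payload = build_completion_request(prompt)
    req = urllib.request.Request(url, data=json.dumps(payload).encode(),
                                 headers=headers)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]


# With the server from step 2 running:
# print(send_completion("A 55-year-old male with flank pain and hematuria..."))
```

Because vLLM exposes an OpenAI-compatible API, any OpenAI client library pointed at `http://localhost:8000/v1` works the same way.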
🌟 What Makes This Different
- Cultural Context: Optimized for modern healthcare systems and terminology
- Real Clinical Validation: Tested by 50+ doctors across universities worldwide
- Accessibility: Runs on a single GPU
- Transparency: Full training data and methodology disclosed (2 datasets remain private while I seek my organization’s permission to release them)
📈 Benchmark Dominance
We’re outperforming every similar-sized model across 7 medical benchmarks (see the docs for full results):
- MedMCQA: 66.2% (+18% over competitors)
- MMLU Medical Genetics: 87.1% (best in class)
- Clinical Knowledge: 79.4% (near-specialist level)
Upvote & like the model for medical research. Feedback, criticism & collaborations welcome! 🤗
💬 Discussion: r/LocalLLaMA (107 points, 17 comments) 🔗 Source