
Param Δ for Direct Weight Mixing: Post-Train Large Language Model at Zero Cost

Training-free weight mixing for large language models


Param Δ is a new post-training method for large language models. Instead of re-running fine-tuning, it computes the weight difference between a post-trained model and its base model, ΔΘ = Θ_post − Θ_base, and adds it directly to the weights of a new base model: Θ_ParamΔ = Θ'_base + ΔΘ. The mixed model achieves performance comparable to conventional post-training without any additional training.

Method
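As a rough illustration of the weight-mixing step above, here is a minimal sketch that merges model state dicts, assuming all three checkpoints share the same architecture and parameter names. The function name and model identifiers are hypothetical, not part of the original post; real checkpoints may also need dtype and device handling.

```python
import torch
from transformers import AutoModelForCausalLM

def param_delta_merge(base_sd, post_sd, new_base_sd):
    """Direct weight mixing:
    ΔΘ = Θ_post − Θ_base, then Θ_ParamΔ = Θ'_base + ΔΘ.
    All inputs are state dicts with identical keys and tensor shapes."""
    merged = {}
    for name, new_base_w in new_base_sd.items():
        delta = post_sd[name] - base_sd[name]  # post-training weight delta
        merged[name] = new_base_w + delta      # apply the delta to the new base
    return merged

# Hypothetical usage (model names are illustrative placeholders):
# base  = AutoModelForCausalLM.from_pretrained("org/base-v1").state_dict()
# post  = AutoModelForCausalLM.from_pretrained("org/base-v1-instruct").state_dict()
# newer = AutoModelForCausalLM.from_pretrained("org/base-v2").state_dict()
# merged_sd = param_delta_merge(base, post, newer)
```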

Last updated: 2025-05-03