Tag: fine

  • A Theoretical Framework for LLM Fine-tuning Using Early Stopping for Non-random Initialization

    arXiv:2602.13942v1. Announce type: new. Abstract: In the era of large language models (LLMs), fine-tuning pretrained models has become ubiquitous. Yet the theoretical underpinning remains an open question. A central question is why only a few epochs of fine-tuning are typically sufficient to achieve…

  • Fine Tuning a Simulation-Driven Estimator

    arXiv:2504.04480v2. Announce type: cross. Abstract: Many industries now deploy high-fidelity simulators (digital twins) to represent physical systems, yet their parameters must be calibrated to match the true system. This motivated the construction of simulation-driven parameter estimators, built by generating synthetic observations for sampled parameter values and learning a supervised mapping…

  • Tilt Matching for Scalable Sampling and Fine-Tuning

    arXiv:2512.21829v1. Announce type: new. Abstract: We propose a simple, scalable algorithm for using stochastic interpolants to sample from unnormalized densities and for fine-tuning generative models. The approach, Tilt Matching, arises from a dynamical equation relating the flow matching velocity to one targeting the same distribution tilted by a…

  • Q-Learning with Fine-Grained Gap-Dependent Regret

    arXiv:2510.06647v1. Announce type: new. Abstract: We study fine-grained gap-dependent regret bounds for model-free reinforcement learning in episodic tabular Markov Decision Processes. Existing model-free algorithms achieve minimax worst-case regret, but their gap-dependent bounds remain coarse and fail to fully capture the structure of suboptimality gaps. We address this limitation by establishing…

  • Fine-Tune Your Topic Modeling Workflow with BERTopic

    Learn how to fine-tune BERTopic settings for more focused, reproducible, and interpretable results. Tiffany Chen, Towards Data Science.

  • How I Fine-Tuned Granite-Vision 2B to Beat a 90B Model — Insights and Lessons Learned

    A hands-on journey exploring fine-tuning techniques that unlock the power of small vision models. Julio Sanchez, Towards Data Science.

  • How to Fine-Tune Small Language Models to Think with Reinforcement Learning

    A visual tour and from-scratch guide to training GRPO reasoning models in PyTorch. Avishek Biswas, Towards Data Science.

  • Learning to Choose or Choosing to Learn: Best-of-N vs. Supervised Fine-Tuning for Bit String Generation

    arXiv:2505.17288v1. Announce type: new. Abstract: Using the bit string generation problem as a case study, we theoretically compare two standard methods for adapting large language models to new tasks. The first, referred to as supervised fine-tuning, involves training a new…

  • Fine-Tuning vLLMs for Document Understanding

    In this article, I discuss how you can fine-tune VLMs (visual large language models, often called vLLMs) like Qwen 2.5 VL 7B. I will introduce you to a dataset of handwritten digits, which the base version of Qwen 2.5 VL struggles with. We will then inspect the dataset, annotate it,…

  • Fundamental Safety-Capability Trade-offs in Fine-tuning Large Language Models

    arXiv:2503.20807v1. Announce type: new. Abstract: Fine-tuning Large Language Models (LLMs) on some task-specific datasets has been a primary use of LLMs. However, it has been empirically observed that this approach to enhancing capability inevitably compromises safety, a phenomenon also known as the safety-capability trade-off in LLM fine-tuning.…

  • Are You Still Using LoRA to Fine-Tune Your LLM?

    LoRA (Low Rank Adaptation – arxiv.org/abs/2106.09685) is a popular technique for fine-tuning Large Language Models (LLMs) on the cheap. But 2024 has seen an explosion of new parameter-efficient fine-tuning techniques, an alphabet soup of LoRA alternatives: SVF, SVFT, MiLoRA, PiSSA, LoRA-XS … And most are based…

  • How to Fine-Tune DistilBERT for Emotion Classification

    At every company I’ve worked at, the customer support teams were drowning in an overwhelming volume of customer inquiries. Have you had similar experiences? What if I told you that you could use AI to automatically identify, categorize, and even resolve the most common issues? By fine-tuning a…

  • Fine-tuning Multimodal Embedding Models

    Adapting CLIP to YouTube Data (with Python Code). This is the 4th article in a larger series on multimodal AI. In the previous post, we discussed multimodal RAG systems, which can retrieve and synthesize information from different data modalities (e.g. text, images, audio). There, we saw how we could implement such a…

  • The Next Frontier in LLM Accuracy

    Exploring the Power of Lamini Memory Tuning. Accuracy is often critical for LLM applications, especially in cases such as API calling or summarisation of financial reports. Fortunately, there are ways to enhance precision. The best practices to improve accuracy include the following steps: You can start…

  • Building an LLM fine-tuning Dataset

    sentdex