
Factuality Correction

Model Summary: Factuality Correction is a LoRA adapter (always active) for ibm-granite/granite-4.0-micro, designed to correct factually incorrect LLM-generated responses by explicitly taking into account contextual passages that may contain conflicting or contradictory information. Rather than assuming contextual consistency, the adapter evaluates LLM-generated responses against one or more context sources and identifies cases where the response conflicts with, misrepresents, or selectively ignores evidence present in those contexts. The adapter can correct factual inaccuracies in long-form responses composed of multiple atomic units, such as individual facts or claims, while preserving the full generative and reasoning capabilities of the base model.

Usage

Intended Use: Factuality Correction is a LoRA adapter for IBM’s Granite-4.0-micro model. It enables the granite-4.0-micro model to correct a generated long-form response containing multiple atomic facts or claims with respect to contextual information that may be incomplete, conflicting, or contradictory. This adapter is designed to operate as an integrated component of the Granite inference pipeline. The model is specifically designed to correct factually incorrect responses according to the following definition:

  • A factually incorrect response occurs when the assistant's message contains one or more factual claims that are unsupported by, inconsistent with, or directly contradicted by the information provided in the documents or context. This includes situations where the assistant: introduces details not grounded in the context, misstates or distorts facts contained within the context, misinterprets the meaning or implications of the context, or supplies erroneous or conflicting information relative to the context. Even if only a small portion of the response contains such inaccuracies, the overall message is considered factually incorrect.
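In practice, this definition maps onto a simple output contract: the adapter replies with a JSON object whose single "correction" key holds either a corrected message or the string "none". The two raw outputs below are illustrative stand-ins only (actual corrections vary run to run, and whether "none" is wrapped in the JSON object should be verified against real adapter outputs):

```python
import json

# Hypothetical raw adapter outputs (illustrative, not actual model generations).
# A factually incorrect response yields a corrected message:
flagged = '{"correction": "No, Ozzy Osbourne passed away on July 22, 2025, at the age of 76."}'
# A response already consistent with the context yields the literal string "none":
clean = '{"correction": "none"}'

assert json.loads(flagged)["correction"].startswith("No,")
assert json.loads(clean)["correction"] == "none"
```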

Quickstart Example (LoRA)

import json

from huggingface_hub import snapshot_download
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

def parse_output(output) -> str:
    correction = None
    output_text = next(iter(output.outputs)).text.strip()

    # Output must be in JSON format
    try:
        data = json.loads(output_text)
        correction = data.get("correction")
    except json.JSONDecodeError as e:
        print(f"Cannot parse JSON output: {e}")
        correction = "FAIL"
    return correction

def make_prompt(query: str, response: str, contexts: list, tokenizer):

    factuality_correction_text = """
<guardian>As a judge agent, your role is to help assess whether the provided text meets the given judging criteria, utilizing all available information, including conversations, documents, and tools.

### Criteria: A factually incorrect response occurs when the assistant's message contains one or more factual claims that are unsupported by, inconsistent with, or directly contradicted by the information provided in the documents or context. This includes situations where the assistant: introduces details not grounded in the context, misstates or distorts facts contained within the context, misinterprets the meaning or implications of the context, supplies erroneous or conflicting information relative to the context. Even if only a small portion of the response contains such inaccuracies, the overall message is considered factually incorrect.

### Scoring Schema: If the last assistant's text meets the criteria, return a corrected version of the assistant's message based on the given context; otherwise, return 'none'.
"""

    messages = []    
    messages.append({
        "role": "user",
        "content": query
    })
    messages.append({
        "role": "assistant",
        "content": response
    })

    # Add the guardian prompt (as a user turn)
    messages.append({
        "role": "user",
        "content": factuality_correction_text
    })

    # Apply the granite 4.0-micro chat template
    formatted_text = tokenizer.apply_chat_template(
        messages, 
        tokenize=False, 
        add_generation_prompt=True,
        documents=[{"doc_id": "0", "text": "\n\n".join(contexts)}]
    )

    return formatted_text

# Load the model
BASE_PATH = "ibm-granite/granite-4.0-micro"
adapter_repo = "ibm-granite/granitelib-guardian-r1.0"
adapter_subfolder = "factuality-correction/granite-4.0-micro/lora"

# Download the adapter to the local cache and resolve its path
local_repo = snapshot_download(adapter_repo, allow_patterns=f"{adapter_subfolder}/*")
LORA_PATH = f"{local_repo}/{adapter_subfolder}"

sampling_params = SamplingParams(max_tokens=4096, temperature=0.0, seed=42)
lora_request = LoRARequest("adapter1", 1, LORA_PATH)
model = LLM(
    model=BASE_PATH,
    tensor_parallel_size=1,
    gpu_memory_utilization=0.95,
    dtype="bfloat16",
    enable_lora=True,
    max_lora_rank=128,
)

# Prepare the prompt
question = "Is Ozzy Osbourne still alive?"
response = "Yes, Ozzy Osbourne is alive in 2025 and preparing for another world tour, continuing to amaze fans with his energy and resilience."
contexts = ["Ozzy Osbourne passed away on July 22, 2025, at the age of 76 from a heart attack. He died at his home in Buckinghamshire, England, with contributing conditions including coronary artery disease and Parkinson's disease. His final performance took place earlier that month in Birmingham."]
tokenizer = AutoTokenizer.from_pretrained(BASE_PATH)
prompts = [make_prompt(question, response, contexts, tokenizer)]

# Generate the output
output = model.generate(prompts, sampling_params, lora_request=lora_request)

# Display the output
correction = parse_output(output[0])
print(f"# Corrected response : {correction}")

Training Details

The Factuality Correction model is a LoRA adapter finetuned to correct factually incorrect responses with respect to conflicting or contradicting contextual information based on the post-hoc correction method described in [Cano et al. 2026] FactCorrector: A Graph-Inspired Approach to Long-Form Factuality Correction of Large Language Models.

Training Data:

The model was trained on synthetic data generated from the ELI5-Category dataset, which augments long-form explanatory question–answer threads scraped from the r/explainlikeimfive Reddit forum with explicit topical annotations. This resource contains questions in which users request intuitive explanations of complex topics, each assigned by community moderators to one of 12 high-level categories (11 topical domains plus a Repost category) and paired with multiple candidate answers and their corresponding upvote scores. For each question, we deterministically select the answer with the highest number of upvotes as the canonical response. While these answers are generally high quality and well articulated, they are not guaranteed to be factually correct and may include inaccuracies, outdated claims, or speculative statements.

To further diversify the dataset, we intentionally introduce factually incorrect responses to user questions. These synthetic responses are generated by prompting a reasonably strong LLM, such as the Mixtral-8x22B-Instruct-v0.1 model. The ratio of synthetic to human-authored responses is maintained at 50%, ensuring a balanced mix of realistic and adversarial content. The final dataset comprises 17,522 instances, uniformly distributed across 12 categories. We further partition the dataset into training (14,017 samples), validation (1,752 samples), and test (1,753 samples) splits.
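As a quick sanity check, the reported split sizes are internally consistent with the stated dataset size and correspond to a roughly 80/10/10 partition:

```python
train, val, test = 14_017, 1_752, 1_753
total = train + val + test

assert total == 17_522  # matches the stated dataset size
assert abs(train / total - 0.80) < 0.001  # ~80% training
assert abs(val / total - 0.10) < 0.001    # ~10% validation
assert abs(test / total - 0.10) < 0.001   # ~10% test
```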

Evaluation

The adapter was evaluated on the test split of the ELI5 dataset, comprising 1,753 samples that were not used in training. For each dataset D, we computed three factuality metrics: precision (the factuality score), recall at K (Recall@K), and F1@K, averaged over all prompts in D, where K is set to the median number of atoms. These metrics were evaluated using a factuality assessor such as FactReasoner with Google search results as the external knowledge source. For each factuality metric S (e.g., precision), we report its relative gain G(S) = 2 * (S_c - S_r) / (S_c + S_r), where S_r and S_c are the metric values for the original response and the correction, respectively. A positive G(S) indicates that the correction outperforms the original response, while a negative value means the correction performs worse. By construction, G(S) ranges from -2 to 2 and remains well defined even when either S_r or S_c equals zero.
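The relative gain is straightforward to compute; a minimal sketch of the formula above:

```python
def relative_gain(s_c: float, s_r: float) -> float:
    """G(S) = 2 * (S_c - S_r) / (S_c + S_r), for correction score s_c and response score s_r."""
    return 2.0 * (s_c - s_r) / (s_c + s_r)

# The correction doubles the score: G = 2 * (0.6 - 0.3) / 0.9 = 2/3
assert abs(relative_gain(0.6, 0.3) - 2.0 / 3.0) < 1e-12
# Well defined when one score is zero, and bounded by the extremes -2 and 2:
assert relative_gain(0.5, 0.0) == 2.0
assert relative_gain(0.0, 0.5) == -2.0
```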

Dataset       Precision   F1@K   Recall@K
ELI5 (test)   0.26        0.20   0.09
BIO (OOD)     0.44        0.25   0.37

Adapter Configuration

Parameter              LoRA
Base model             ibm-granite/granite-4.0-micro
LoRA rank (r)          64
LoRA alpha             128
Target modules         all linear
Output format          {"correction": "X"} where X is the correction
Max completion tokens  4096
KV cache               Supported

Citation

If you find this adapter useful, please cite the following works.

@inproceedings{marinescu2025factreasoner,
  title={FactReasoner: A Probabilistic Approach to Long-Form Factuality Assessment for Large Language Models},
  author={Marinescu, Radu and Bhattacharjya, Debarun and Lee, Junkyu and Tchrakian, Tigran and Cano, Javier Carnerero and Hou, Yufang and Daly, Elizabeth and Pascale, Alessandra},
  booktitle={Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year={2025}
}

@misc{carnererocano2026factcorrector,
  title={FactCorrector: A Graph-Inspired Approach to Long-Form Factuality Correction of Large Language Models},
  author={Carnerero-Cano, Javier and Pronesti, Massimiliano and Marinescu, Radu and Tchrakian, Tigran and Barry, James and Gajcin, Jasmina and Hou, Yufang and Pascale, Alessandra and Daly, Elizabeth},
  year={2026},
  eprint={2601.11232},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2601.11232}
}

Infrastructure: We trained the Granite 4.0 Micro Factuality Correction adapter on IBM's Vela cluster using 4 A100 GPUs.

Ethical Considerations: The Granite 4.0 Micro Factuality Correction adapter is primarily fine-tuned on English-only input–output pairs. Although the underlying base model supports multilingual dialogue, the adapter's performance on non-English tasks may differ from its performance on English. In addition, while the base model has been aligned with safety considerations in mind, the adapter may, in some cases, produce inaccurate, biased, or otherwise unsafe outputs in response to user prompts. It is also important to note that there is no built-in safeguard guaranteeing that the correction output is always correct. As with other generative models, safety assurance relies on offline evaluation procedures (see Evaluation), and while we expect the generated outputs to meet safety standards, this cannot be guaranteed. Finally, this adapter is specifically optimized for the factuality definition described above, and its behavior outside that scope may be limited.

Resources