
Evaluating Bias and Toxicity in LLMs

dc.contributor.advisor: Herrera García, Vicente Octavio
dc.contributor.author: Rodríguez del Corral, María Victoria
dc.date.accessioned: 2025-07-30T09:34:00Z
dc.date.available: 2025-07-30T09:34:00Z
dc.date.issued: 2025-07
dc.identifier.citation: Rodríguez del Corral, M.V. Evaluating Bias and Toxicity in LLMs [Master's thesis, Universidad Loyola Andalucía] [es]
dc.identifier.uri: https://hdl.handle.net/20.500.12412/6737
dc.description.abstract: This master's thesis investigates bias and toxicity in Large Language Models (LLMs) as a central concern for AI Safety and AI Alignment. Guided by a set of benchmarks and the 3H framework, it systematically examines how publicly available checkpoints behave when faced with reasoning, demographic, and open-ended safety challenges. Three Jupyter notebooks combine harness-based evaluation, Hugging Face bias evaluation, and customized safety prompts to deliver a reliable, standardized benchmarking framework. Beyond establishing that raw accuracy is no guarantee of ethical soundness, the thesis details how those gaps were uncovered. Each notebook covers a different layer of the problem: one benchmarks factual and reasoning skills, another measures toxicity and bias, and a third runs multi-turn dialogues that surface context-dependent harms. Because of this setup, new models or datasets can be swapped in with minimal code changes, giving future AI Safety work a solid base for its tests. The study argues that current AI systems reflect the social power dynamics that already exist offline. Addressing those issues calls for more than clever code modifications; it demands continuous processes, including broader data curation, tighter model-governance rules, and well-educated humans kept firmly in the loop. Together, these suggestions provide a clearer guide to using models effectively and responsibly. Overall, this work applies the 3H framework to a practical benchmarking process, highlighting where current models still have weaknesses and offering clear steps toward AI that is safer and fairer. Future work should involve a broader range of people and keep the evaluation tests up to date so that they remain useful as the technology changes. [es]
dc.language.iso: eng [es]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.title: Evaluating Bias and Toxicity in LLMs [es]
dc.type: masterThesis [es]
dc.description.master: Máster Universitario en Inteligencia Artificial [es]
dc.rights.accessRights: openAccess [es]
dc.subject.keyword: AI [es]
dc.subject.keyword: AI Safety AI [es]
dc.subject.keyword: Bias [es]
dc.subject.keyword: Toxicity [es]
dc.subject.keyword: 3H [es]
dc.subject.keyword: AI Alignment [es]



Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International