OUSD (R&E) critical technology area(s): Trusted AI and Autonomy
Objective: To develop new methods that can unlearn bias in artificial intelligence (AI) models.
Description: The mitigation of bias in AI is often relegated to expensive and uncertain re-training of models. This topic pursues the alternative direction of detecting where bias exists in a model and unlearning that bias, drawing on model editing techniques, concept replacement, and related approaches. In this STTR, performers will be expected to develop methods for removing bias from a generative model without accessing the model's original training data for retraining. Methods could include, but are not limited to, analyzing the internal conceptual space of models to prevent biased content generation as in [1], extending methods such as [2] for the unlearning of factual data as a type of alignment, or building modular add-ons that de-bias generated content via iterative prompt refinement [3].
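For illustration only, the sketch below shows one way the first family of methods might look in miniature: estimating a bias direction in a model's internal embedding space from contrastive prompts and projecting it out of selected concept embeddings, with no access to training data. This is a minimal PyTorch sketch under assumed simplifications (a toy vocabulary and a single gendered direction stand in for a real generative model's conceptual space); it is not a prescribed implementation.

```python
import torch

torch.manual_seed(0)

# Toy stand-in for a generative model's learned embedding table
# (illustrative assumption; a real method would operate on the
# model's actual text or concept encoder).
vocab = {"nurse": 0, "doctor": 1, "engineer": 2, "he": 3, "she": 4}
emb = torch.nn.Embedding(len(vocab), 16)

def bias_direction(pairs):
    """Estimate a bias direction from contrastive word pairs, e.g. (he, she)."""
    diffs = [emb.weight[vocab[a]] - emb.weight[vocab[b]] for a, b in pairs]
    d = torch.stack(diffs).mean(dim=0)
    return d / d.norm()

def project_out(vec, direction):
    """Remove the component of `vec` that lies along `direction`."""
    return vec - (vec @ direction) * direction

d = bias_direction([("he", "she")])
with torch.no_grad():
    for word in ("nurse", "doctor", "engineer"):  # concepts to neutralize
        i = vocab[word]
        emb.weight[i] = project_out(emb.weight[i], d)
        # After the edit, the concept embedding carries ~zero component
        # along the gendered direction:
        print(word, float(emb.weight[i] @ d))
```

The same edit-then-verify pattern extends to richer interventions, such as the cross-attention re-steering explored in [1] or prompt-level add-ons as in [3].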
In summary, performers will be expected to:
- Define a set of biases to target in a generative model.
- Develop a method of removing the biased behavior from the model.
- Develop an evaluation schema to demonstrate confidently that the targeted bias has been removed and that performance has been retained (a minimal illustrative harness is sketched after this list).
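As a concrete, deliberately simplified picture of such a schema, the sketch below pairs a bias probe, measuring the disparity in model behavior across counterfactual prompts that differ only in a protected attribute, with a utility probe that checks retained task performance. The `generate` function, prompt pairs, tasks, and the length-based score are hypothetical placeholders for a performer's actual model and benchmarks.

```python
from statistics import mean

def disparity(generate, counterfactual_pairs, score):
    """Mean absolute score gap across prompts differing only in a protected attribute."""
    return mean(abs(score(generate(a)) - score(generate(b)))
                for a, b in counterfactual_pairs)

def utility(generate, tasks):
    """Fraction of held-out tasks the model still answers correctly."""
    return mean(1.0 if expected in generate(prompt) else 0.0
                for prompt, expected in tasks)

# Dummy stand-in model, for demonstration only.
def generate(prompt):
    return "2" if "1 + 1" in prompt else "a neutral answer"

pairs = [("Describe a male nurse.", "Describe a female nurse.")]
tasks = [("What is 1 + 1?", "2")]

print(f"disparity={disparity(generate, pairs, score=len):.3f}, "
      f"utility={utility(generate, tasks):.2f}")
```

A credible schema would report both numbers before and after unlearning, so that bias removal and performance retention are demonstrated jointly rather than in isolation.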
Phase I
The goal of Phase I proposals is to present a new technology to address AI bias as described above. The technology need not be mature by the end of the phase, but a convincing proof-of-concept of its utility must be demonstrated. This proof-of-concept may take the form of a live demo, publications in peer-reviewed venues, or open-source software, among other forms.
Phase I deliverables and milestones for this STTR should include:
- Month 3: Report detailing technical progress made to date and tasks accomplished.
- Month 6: Final technical report, including remaining challenges and directions to be addressed, a tentative plan for future work, and lessons learned.
Phase II
Develop, install, integrate and demonstrate a prototype system determined to be the most feasible solution during the Phase I feasibility study. This demonstration should focus specifically on:
- Validating the product-market fit between the proposed solution and the proposed topic, and defining a clear and immediately actionable plan for running a trial with the proposed solution and the proposed customer.
- Evaluating the proposed solution against the objectives and measurable key results as defined in the Phase I feasibility study.
- Describing in detail how the solution can be scaled for wide adoption (e.g., how it can be modified for scale).
- Defining a clear transition path for the proposed solution that takes into account input from all affected stakeholders, including but not limited to: end users, engineering, sustainment, contracting, finance, legal, and cybersecurity.
- Providing specific details about how the solution can integrate with other current and potential future solutions.
- Explaining how the solution can be sustained (i.e., supportability).
- Clearly identifying other specific DoD or governmental customers who want to use the solution.
Phase III dual use applications
The contractor will pursue commercialization of the various technologies developed in Phase II, transitioning expanded mission capability to a broad range of potential government and civilian users and alternate mission applications. Interested government end users may include the Air Force, the DoD Chief Digital and AI Office (CDAO), DARPA, the White House Office of Science and Technology Policy (OSTP), the Department of Education, the Department of Commerce, and NIST, all of whom have been examining the problem of detecting and mitigating bias in AI as part of an inter-agency working group. For example, mitigating bias is one of the DoD's responsible AI principles, and it is widely recognized that bias remains a hurdle for responsible AI adoption. Bias likewise remains a hurdle for operational AI adoption, where robustness of AI to rare and unlikely events must be ensured. These problems are also pervasive in industry, motivating the dual use of the proposed technologies. Example industrial applications include the de-biasing of generative models, which have been shown both to reflect inherent racial biases and to introduce new biases as a result of current de-biasing techniques.
Direct access to end users and government customers will be provided, along with opportunities to receive Phase III awards for providing the government additional research and development, or for direct procurement of products and services developed in coordination with the program.
References
[1] Zhang, Gong, et al. "Forget-me-not: Learning to forget in text-to-image diffusion models." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024.
[2] D'Incà, Moreno, et al. "OpenBias: Open-set Bias Detection in Text-to-Image Generative Models." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024.
[3] Jha, Sumit Kumar, et al. "Responsible reasoning with large language models and the impact of proper nouns." Workshop on Trustworthy and Socially Responsible Machine Learning, NeurIPS 2022.
Keywords
AI Bias, Trustworthy AI, Trusted AI, Fair AI, Bias Mitigation
TPOC-1
DARPA BAA Help Desk
Opportunity
HR0011ST2025D-03
Publication: Jan. 8, 2025
Deadline: Feb. 26, 2025