Explaining AI through mechanistic interpretability

DOI to cite the version on EPub Bayreuth: https://doi.org/10.15495/EPub_UBT_00008273
URN to cite this document: urn:nbn:de:bvb:703-epub-8273-8

Title data

Kästner, Lena ; Crook, Barnaby:
Explaining AI through mechanistic interpretability.
In: European Journal for Philosophy of Science. Vol. 14 (2024), Issue 4. - 52.
ISSN 1879-4920
DOI of the publisher's version: https://doi.org/10.1007/s13194-024-00614-4

Format: PDF
Name: s13194-024-00614-4.pdf
Version: Published Version
Available under License Creative Commons BY 4.0: Attribution

Project information

Project financing: VolkswagenStiftung

Abstract

Recent work in explainable artificial intelligence (XAI) attempts to render opaque AI systems understandable through a divide-and-conquer strategy. However, this fails to illuminate how trained AI systems work as a whole. Precisely this kind of functional understanding is needed, though, to satisfy important societal desiderata such as safety. To remedy this situation, we argue, AI researchers should seek mechanistic interpretability, viz. apply coordinated discovery strategies familiar from the life sciences to uncover the functional organisation of complex AI systems. Additionally, theorists should accommodate for the unique costs and benefits of such strategies in their portrayals of XAI research.

Further data

Item Type: Article in a journal
Keywords: AI; ANN; Deep learning; Discovery; Explanation; Mechanistic interpretability; XAI
DDC Subjects: 100 Philosophy and psychology > 100 Philosophy
Institutions of the University: Faculties > Faculty of Cultural Studies > Department of Philosophy > Chair Philosophy, Computer Science and Artificial Intelligence > Chair Philosophy, Computer Science and Artificial Intelligence - Univ.-Prof. Dr. Lena Kästner
Language: English
Originates at UBT: Yes
URN: urn:nbn:de:bvb:703-epub-8273-8
Date Deposited: 05 Mar 2025 06:12
Last Modified: 05 Mar 2025 06:12
URI: https://epub.uni-bayreuth.de/id/eprint/8273
