
Machine Learning Interpretability: A Comprehensive Collection of Methods, Papers, and Resources





    Business applications of machine learning aim to output decisions and judgments. Interpretability refers to the degree to which humans can understand the reasons behind a decision: the more interpretable a machine learning model is, the easier it is to understand why particular decisions or predictions were made. Model interpretability covers both understanding the model's internal mechanics and understanding its outputs. Its importance shows up in two stages: during modeling, it helps developers understand the model, compare and select models, and tune them where necessary; in production, it is used to explain the model's internal mechanics to the business side and to interpret the model's results. For example, a fund recommendation model needs to explain why a particular fund was recommended to a given user.

     

    This collection organizes the main methods in explainable machine learning, together with the latest papers, code, and data resources for each method, and shares them with anyone who needs them.

     

    The resources were collected from the web; original source:

https://github.com/stefanoteso/awesome-explanatory-supervision

 

Contents


 

Passive Learning

     Approaches that supervise the model's explanations.

    

    Tangent Prop - A formalism for specifying selected invariances in an adaptive network Patrice Simard, Bernard Victorri, Yann Le Cun, John Denker; NeurIPS 1992 paper Notes: injects invariances into a neural net by regularizing its gradient; precursor to learning from gradient-based explanations.

    

    Right for the right reasons: training differentiable models by constraining their explanations Andrew Slavin Ross, Michael C. Hughes, and Finale Doshi-Velez; IJCAI 2017 paper

    

    Interpretable Machine Teaching via Feature Feedback Shihan Su, Yuxin Chen, Oisin Mac Aodha, Pietro Perona, Yisong Yue; Workshop on Teaching Machines, Robots, and Humans 2017 paper

    

    Deriving Machine Attention from Human Rationales Yujia Bao, Shiyu Chang, Mo Yu, and Regina Barzilay; ACL 2019 paper

    

    Do Human Rationales Improve Machine Explanations? Julia Strout, Ye Zhang, Raymond Mooney; ACL Workshop BlackboxNLP 2019 paper

    

    Concept bottleneck models Pang Wei Koh, Thao Nguyen, Yew Siang Tang, Stephen Mussmann, Emma Pierson, Been Kim, and Percy Liang; ICML 2020 paper

    

    Debiasing Concept Bottleneck Models with Instrumental Variables Mohammad Taha Bahadori, and David E. Heckerman; arXiv 2020 paper

    

    Learning Global Transparent Models Consistent with Local Contrastive Explanations Tejaswini Pedapati, Avinash Balakrishnan, Karthikeyan Shanmugam, Amit Dhurandhar; NeurIPS 2020 paper

    

    Reflective-Net: Learning from Explanations Johannes Schneider, Michalis Vlachos; arXiv 2020 paper

    

    Teaching with Commentaries Aniruddh Raghu, Maithra Raghu, Simon Kornblith, David Duvenaud, and Geoffrey Hinton; arXiv 2020 paper

    

    Improving performance of deep learning models with axiomatic attribution priors and expected gradients Gabriel Erion, Joseph D. Janizek, Pascal Sturmfels, Scott Lundberg, Su-In Lee; arXiv 2020 paper

    

    Interpretations are useful: penalizing explanations to align neural networks with prior knowledge Laura Rieger, Chandan Singh, William Murdoch, Bin Yu; ICML 2020 paper

    

    When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data Peter Hase, Mohit Bansal; arXiv 2020 paper
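
    Many of the passive approaches above follow a common recipe: compute a differentiable explanation of the model (for example its input gradient) and add a loss term that pushes that explanation toward the annotated one. As a rough illustration, here is a minimal PyTorch sketch in the spirit of the "Right for the right reasons" loss (Ross et al., IJCAI 2017); it is not the authors' reference implementation, and the mask name `irrelevant` and the weight `lambda_expl` are hypothetical.

```python
# Minimal sketch (not the authors' code) of an explanation-supervision loss:
# standard cross-entropy plus a penalty on input gradients over features that
# an annotator marked as irrelevant to the prediction.
import torch
import torch.nn.functional as F

def right_for_the_right_reasons_loss(model, x, y, irrelevant, lambda_expl=10.0):
    """x: inputs, y: labels, irrelevant: 0/1 mask (1 = feature must not matter)."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    ce = F.cross_entropy(logits, y)

    # The gradient of the summed log-probabilities w.r.t. the input serves as a
    # simple gradient-based explanation of the model's predictions.
    log_probs = F.log_softmax(logits, dim=-1)
    (input_grad,) = torch.autograd.grad(log_probs.sum(), x, create_graph=True)

    # Penalize any explanation mass that falls on annotated-irrelevant features.
    explanation_penalty = (irrelevant * input_grad).pow(2).sum()
    return ce + lambda_expl * explanation_penalty
```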

    

Interactive Learning

    Approaches that combine supervision on the explanations with interactive machine learning:

    

    Principles of Explanatory Debugging to Personalize Interactive Machine Learning Todd Kulesza, Margaret Burnett, Weng-Keen Wong, Simone Stumpf; IUI 2015 paper

    

    Explanatory Interactive Machine Learning Stefano Teso, Kristian Kersting; AIES 2019 paper Note: introduces explanatory interactive learning, focuses on active learning setup.

    

    Taking a hint: Leveraging explanations to make vision and language models more grounded Ramprasaath R. Selvaraju, Stefan Lee, Yilin Shen, Hongxia Jin, Shalini Ghosh, Larry Heck, Dhruv Batra, and Devi Parikh; ICCV 2019 pdf

    

    Toward Faithful Explanatory Active Learning with Self-explainable Neural Nets Stefano Teso; IAL Workshop 2019 paper Note: explanatory active learning with self-explainable neural networks.

    

    Making deep neural networks right for the right scientific reasons by interacting with their explanations Patrick Schramowski, Wolfgang Stammer, Stefano Teso, Anna Brugger, Franziska Herbert, Xiaoting Shao, Hans-Georg Luigs, Anne-Katrin Mahlein, Kristian Kersting; Nature Machine Intelligence 2020 paper Note: introduces end-to-end explanatory interactive learning; fixes Clever Hans behavior in deep neural networks.

    

    One explanation does not fit all Kacper Sokol, Peter Flach; Künstliche Intelligenz 2020 paper

    

    Human-in-the-loop Debugging Deep Text Classifiers Piyawat Lertvittayakumjorn, Lucia Specia, Francesca Toni; EMNLP 2020 paper

    

    Human-driven FOL explanations of deep learning Gabriele Ciravegna, Francesco Giannini, Marco Gori, Marco Maggini, Stefano Melacci; IJCAI 2020 paper Notes: first-order logic.

    

    Machine Guides, Human Supervises: Interactive Learning with Global Explanations Teodora Popordanoska, Mohit Kumar, Stefano Teso; arXiv 2020 paper Note: introduces narrative bias and explanatory guided learning, focuses on human-initiated interaction and global explanations.

    

    Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations Wolfgang Stammer, Patrick Schramowski, and Kristian Kersting; arXiv 2020 paper Notes: first-order logic, attention.

    

    Right for Better Reasons: Training Differentiable Models by Constraining their Influence Function Xiaoting Shao, Arseny Skryagin, Patrick Schramowski, Wolfgang Stammer, Kristian Kersting; AAAI 2021 preliminary paper

    

    Bandits for Learning to Explain from Explanations Freya Behrens, Stefano Teso, Davide Mottin; XAI Workshop 2021 paper Notes: preliminary.
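
    To make the interaction pattern behind these works concrete (show a prediction together with its explanation, let the user correct the explanation, retrain), here is a minimal self-contained toy sketch. It uses simple feature masking with a simulated user rather than the counterexample augmentation or gradient penalties of the papers above, and all names and the synthetic data are invented for the example.

```python
# Toy sketch of an explanatory interactive learning loop: each round the model's
# (global) explanation is shown, and simulated user feedback vetoes a feature
# the model should not rely on. Illustration only, not any specific paper's protocol.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(int)               # the label truly depends only on feature 0
X[:, 1] = y + 0.01 * rng.normal(size=200)   # feature 1 is a spurious shortcut

mask = np.ones(5)            # 1 = feature allowed, 0 = vetoed by the user
model = LogisticRegression()

for round_id in range(3):
    model.fit(X * mask, y)                    # retrain on the masked features
    explanation = np.abs(model.coef_[0])      # weight magnitudes as a global explanation
    top_feature = int(np.argmax(explanation))
    print(f"round {round_id}: model relies most on feature {top_feature}")

    # Simulated user feedback: the user knows feature 1 is a confound and vetoes it,
    # so later rounds are forced to rely on the right feature instead.
    if top_feature == 1:
        mask[1] = 0.0
```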

    

Distillation

    Model reconstruction from model explanations Smitha Milli, Ludwig Schmidt, Anca D. Dragan, Moritz Hardt; FAccT 2019 paper

    

    Evaluating Explanations: How much do explanations from the teacher aid students? Danish Pruthi, Bhuwan Dhingra, Livio Baldini Soares, Michael Collins, Zachary C. Lipton, Graham Neubig, and William W. Cohen; arXiv 2020 paper Notes: defines importance of different kinds of explanations by measuring their impact when used as supervision.

    

Regularization without Supervision

    Approaches that regularize the model's explanations in an unsupervised manner, often for improved interpretability.

    

    Beyond sparsity: Tree regularization of deep models for interpretability Mike Wu, Michael Hughes, Sonali Parbhoo, Maurizio Zazzi, Volker Roth, Finale Doshi-Velez; AAAI 2018 paper

    

    Regional tree regularization for interpretability in deep neural networks Mike Wu, Sonali Parbhoo, Michael Hughes, Ryan Kindle, Leo Celi, Maurizio Zazzi, Volker Roth, Finale Doshi-Velez; AAAI 2020 paper

    

    Regularizing black-box models for improved interpretability Gregory Plumb, Maruan Al-Shedivat, Ángel Alexander Cabrera, Adam Perer, Eric Xing, Ameet Talwalkar; NeurIPS 2020 paper
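
    These methods shape the model's explanations without any human annotation. As a generic illustration of the idea (not the tree- or local-explanation regularizers cited above), the sketch below adds an L1 sparsity penalty on input-gradient explanations; it assumes PyTorch, and `lambda_sparse` is a hypothetical hyperparameter.

```python
# Minimal sketch of an unsupervised explanation regularizer: encourage sparse
# input-gradient explanations by penalizing their L1 norm. Generic illustration,
# not one of the specific methods cited above.
import torch
import torch.nn.functional as F

def sparse_explanation_loss(model, x, y, lambda_sparse=0.1):
    x = x.clone().requires_grad_(True)
    logits = model(x)
    ce = F.cross_entropy(logits, y)

    # The input gradient of the true-class logit acts as the explanation to regularize.
    true_class_logit = logits.gather(1, y.unsqueeze(1)).sum()
    (input_grad,) = torch.autograd.grad(true_class_logit, x, create_graph=True)

    # L1 penalty: explanations should concentrate on a few input features.
    return ce + lambda_sparse * input_grad.abs().mean()
```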

    

Related Works

    Explanation-based learning, focusing on logic-based formalisms and learning strategies:

    

    Explanation-based generalization: A unifying view Tom Mitchell, Richard Keller, Smadar Kedar-Cabelli; MLJ 1986 paper

    

    Explanation-based learning: A survey of programs and perspectives Thomas Ellman; ACM Computing Surveys 1989 paper

    

    Probabilistic explanation based learning Angelika Kimmig, Luc De Raedt, Hannu Toivonen; ECML 2007 paper

    

    Injecting invariances / feature constraints into models:

    

    Training invariant support vector machines Dennis DeCoste, Bernhard Schölkopf; MLJ 2002 paper

    

    The constrained weight space SVM: learning with ranked features Kevin Small, Byron Wallace, Carla Brodley, Thomas Trikalinos; ICML 2011 paper

    

    Dual label-feature feedback:

    

    Active learning with feedback on features and instances Hema Raghavan, Omid Madani, Rosie Jones; JMLR 2006 paper

    

    An interactive algorithm for asking and incorporating feature feedback into support vector machines Hema Raghavan, James Allan; ACM SIGIR 2007 paper

    

    Learning from labeled features using generalized expectation criteria Gregory Druck, Gideon Mann, Andrew McCallum; ACM SIGIR 2008 paper

    

    Active learning by labeling features Gregory Druck, Burr Settles, Andrew McCallum; EMNLP 2009 paper

    

    A unified approach to active dual supervision for labeling features and examples Josh Attenberg, Prem Melville, Foster Provost; ECML-PKDD 2010 paper

    

    Closing the loop: Fast, interactive semi-supervised annotation with queries on features and instances Burr Settles; EMNLP 2011 paper

    

    Learning from rationales:

    

    Using “annotator rationales” to improve machine learning for text categorization Omar Zaidan, Jason Eisner, Christine Piatko; NAACL 2007 paper

    

    Modeling annotators: A generative approach to learning from annotator rationales Omar Zaidan, Jason Eisner; EMNLP 2008 paper

    

    Active learning with rationales for text classification Manali Sharma, Di Zhuang, Mustafa Bilgic; NAACL 2015 paper

    

    Critiquing in recommenders:

    

    Critiquing-based recommenders: survey and emerging trends Li Chen, Pearl Pu; User Modeling and User-Adapted Interaction 2012 paper

    

    Coactive critiquing: Elicitation of preferences and features Stefano Teso, Paolo Dragone, Andrea Passerini; AAAI 2017 paper

    

Resources

    A selection of general resources on Explainable AI focusing on overviews, surveys, societal implications, and critiques:

    

    Survey and critique of techniques for extracting rules from trained artificial neural networks Robert Andrews, Joachim Diederich, Alan B. Tickle; Knowledge-Based Systems 1995 page

    

    The Mythos of Model Interpretability Zachary Lipton; CACM 2018 paper

    

    A survey of methods for explaining black box models Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi; ACM Computing Surveys 2018 paper

    

    Explanation in Artificial Intelligence: Insights from the Social Sciences Tim Miller; Artificial Intelligence 2019 paper

    

    Unmasking Clever Hans predictors and assessing what machines really learn Sebastian Lapuschkin, Stephan Wäldchen, Alexander Binder, Grégoire Montavon, Wojciech Samek, Klaus-Robert Müller; Nature Communications 2019 paper

    

    Interpretation of neural networks is fragile Amirata Ghorbani, Abubakar Abid, James Zou; AAAI 2019 paper

    

    Is Attention Interpretable? Sofia Serrano, Noah A. Smith; ACL 2019 paper

    

    Attention is not Explanation Sarthak Jain, and Byron C. Wallace; ACL 2019 paper

    

    Attention is not not Explanation Sarah Wiegreffe, and Yuval Pinter; EMNLP-IJCNLP 2019 paper

    

    The (un)reliability of saliency methods Pieter-Jan Kindermans, Sara Hooker, Julius Adebayo, Maximilian Alber, Kristof T. Schütt, Sven Dähne, Dumitru Erhan, and Been Kim; Explainable AI: Interpreting, Explaining and Visualizing Deep Learning 2019 paper

    

    Explanations can be manipulated and geometry is to blame Ann-Kathrin Dombrowski, Maximillian Alber, Christopher Anders, Marcel Ackermann, Klaus-Robert Müller, and Pan Kessel; NeurIPS 2019 paper

    

    Fooling Neural Network Interpretations via Adversarial Model Manipulation Juyeon Heo, Sunghwan Joo, and Taesup Moon; NeurIPS 2019 paper

    

    Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead Cynthia Rudin; Nature Machine Intelligence 2019 page

    

Related Resources

    Awesome explainable AI

    

    Awesome machine learning interpretability

    

Not Yet Sorted

    e-SNLI: natural language inference with natural language explanations Oana-Maria Camburu, Tim Rocktäschel, Thomas Lukasiewicz, and Phil Blunsom; NeurIPS 2018 paper

    

    Multimodal explanations: Justifying decisions and pointing to the evidence Dong Huk Park, Lisa Anne Hendricks, Zeynep Akata, Anna Rohrbach, Bernt Schiele, Trevor Darrell, Marcus Rohrbach; CVPR 2018 paper

    

    "Learning Deep Attribution Priors Based On Prior Knowledge Ethan Weinberger, Joseph Janizek, Su-In Lee; NeurIPS 2020 paper

    

    Explain and Predict, and then Predict Again Zijian Zhang, Koustav Rudra, Avishek Anand; arXiv 2021 paper

    

Source: https://blog.csdn.net/lqfarmer/article/details/117607131
