
Machine Learning Interpretability: A Comprehensive Collection of Methods, Papers, and Resources





    Business applications of machine learning aim to output decisions and judgments. Interpretability refers to the degree to which humans can understand the reasons behind a decision: the more interpretable a machine learning model is, the easier it is to understand why particular decisions or predictions were made. Model interpretability covers both understanding the model's internal mechanics and understanding its outputs. Its importance shows up in two stages: during modeling, it helps developers understand the model, compare and select models, and tune them where necessary; in production, it is used to explain the model's internal mechanics to the business side and to interpret the model's results. For example, a fund recommendation model needs to explain why a particular fund was recommended to a given user.

     

    This collection organizes the main methods in explainable machine learning, together with the latest papers, code, and data resources for each method, and shares them with anyone who needs them.

     

    The resources were collected from the web; original source:

https://github.com/stefanoteso/awesome-explanatory-supervision

 

Contents


 

Passive Learning

     Approaches that supervise the model's explanations.

    

    Tangent Prop - A formalism for specifying selected invariances in an adaptive network Patrice Simard, Bernard Victorri, Yann Le Cun, John Denker; NeurIPS 1992 paper Notes: injects invariances into a neural net by regularizing its gradient; precursor to learning from gradient-based explanations.

    

    Right for the right reasons: training differentiable models by constraining their explanations Andrew Slavin Ross, Michael C. Hughes, and Finale Doshi-Velez; IJCAI 2017 paper

    

    Interpretable Machine Teaching via Feature Feedback Shihan Su, Yuxin Chen, Oisin Mac Aodha, Pietro Perona, Yisong Yue; Workshop on Teaching Machines, Robots, and Humans 2017 paper

    

    Deriving Machine Attention from Human Rationales Yujia Bao, Shiyu Chang, Mo Yu, and Regina Barzilay; ACL 2019 paper

    

    Do Human Rationales Improve Machine Explanations? Julia Strout, Ye Zhang, Raymond Mooney; ACL Workshop BlackboxNLP 2019 paper

    

    Concept bottleneck models Pang Wei Koh, Thao Nguyen, Yew Siang Tang, Stephen Mussmann, Emma Pierson, Been Kim, and Percy Liang; ICML 2020 paper

    

    Debiasing Concept Bottleneck Models with Instrumental Variables Mohammad Taha Bahadori, and David E. Heckerman; arXiv 2020 paper

    

    Learning Global Transparent Models Consistent with Local Contrastive Explanations Tejaswini Pedapati, Avinash Balakrishnan, Karthikeyan Shanmugam, Amit Dhurandhar; NeurIPS 2020 paper

    

    Reflective-Net: Learning from Explanations Johannes Schneider, Michalis Vlachos; arXiv 2020 paper

    

    Teaching with Commentaries Aniruddh Raghu, Maithra Raghu, Simon Kornblith, David Duvenaud, and Geoffrey Hinton; arXiv 2020 paper

    

    Improving performance of deep learning models with axiomatic attribution priors and expected gradients Gabriel Erion, Joseph D. Janizek, Pascal Sturmfels, Scott Lundberg, Su-In Lee; arXiv 2020 paper

    

    Interpretations are useful: penalizing explanations to align neural networks with prior knowledge Laura Rieger, Chandan Singh, William Murdoch, Bin Yu; ICML 2020 paper

    

    When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data Peter Hase, Mohit Bansal; arXiv 2020 paper
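
    Many of the passive approaches above follow a common recipe: compute a differentiable explanation of the model (for example its input gradient) and add a loss term that pushes that explanation toward the annotated one. As a rough illustration, here is a minimal PyTorch sketch in the spirit of the "Right for the right reasons" loss (Ross et al., IJCAI 2017); it is not the authors' reference implementation, and the mask name `irrelevant` and the weight `lambda_expl` are hypothetical.

```python
# Minimal sketch (not the authors' code) of an explanation-supervision loss:
# standard cross-entropy plus a penalty on input gradients over features that
# an annotator marked as irrelevant to the prediction.
import torch
import torch.nn.functional as F

def right_for_the_right_reasons_loss(model, x, y, irrelevant, lambda_expl=10.0):
    """x: inputs, y: labels, irrelevant: 0/1 mask (1 = feature must not matter)."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    ce = F.cross_entropy(logits, y)

    # The gradient of the summed log-probabilities w.r.t. the input serves as a
    # simple gradient-based explanation of the model's predictions.
    log_probs = F.log_softmax(logits, dim=-1)
    (input_grad,) = torch.autograd.grad(log_probs.sum(), x, create_graph=True)

    # Penalize any explanation mass that falls on annotated-irrelevant features.
    explanation_penalty = (irrelevant * input_grad).pow(2).sum()
    return ce + lambda_expl * explanation_penalty
```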

    

Interactive Learning

    Approaches that combine supervision on the explanations with interactive machine learning:

    

    Principles of Explanatory Debugging to Personalize Interactive Machine Learning Todd Kulesza, Margaret Burnett, Weng-Keen Wong, Simone Stumpf; IUI 2015 paper

    

    Explanatory Interactive Machine Learning Stefano Teso, Kristian Kersting; AIES 2019 paper Note: introduces explanatory interactive learning, focuses on active learning setup.

    

    Taking a hint: Leveraging explanations to make vision and language models more grounded Ramprasaath R. Selvaraju, Stefan Lee, Yilin Shen, Hongxia Jin, Shalini Ghosh, Larry Heck, Dhruv Batra, and Devi Parikh; ICCV 2019 pdf

    

    Toward Faithful Explanatory Active Learning with Self-explainable Neural Nets Stefano Teso; IAL Workshop 2019 paper Note: explanatory active learning with self-explainable neural networks.

    

    Making deep neural networks right for the right scientific reasons by interacting with their explanations Patrick Schramowski, Wolfgang Stammer, Stefano Teso, Anna Brugger, Franziska Herbert, Xiaoting Shao, Hans-Georg Luigs, Anne-Katrin Mahlein, Kristian Kersting; Nature Machine Intelligence 2020 paper Note: introduces end-to-end explanatory interactive learning; fixes Clever Hans behavior in deep neural networks.

    

    One explanation does not fit all Kacper Sokol, Peter Flach; Künstliche Intelligenz 2020 paper

    

    Human-in-the-loop Debugging Deep Text Classifiers Piyawat Lertvittayakumjorn, Lucia Specia, Francesca Toni; EMNLP 2020 paper

    

    Human-driven FOL explanations of deep learning Gabriele Ciravegna, Francesco Giannini, Marco Gori, Marco Maggini, Stefano Melacci; IJCAI 2020 paper Notes: first-order logic.

    

    Machine Guides, Human Supervises: Interactive Learning with Global Explanations Teodora Popordanoska, Mohit Kumar, Stefano Teso; arXiv 2020 paper Note: introduces narrative bias and explanatory guided learning, focuses on human-initiated interaction and global explanations.

    

    Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations Wolfgang Stammer, Patrick Schramowski, and Kristian Kersting; arXiv 2020 paper Notes: first-order logic, attention.

    

    Right for Better Reasons: Training Differentiable Models by Constraining their Influence Function Xiaoting Shao, Arseny Skryagin, Patrick Schramowski, Wolfgang Stammer, Kristian Kersting; AAAI 2021 preliminary paper

    

    Bandits for Learning to Explain from Explanations Freya Behrens, Stefano Teso, Davide Mottin; XAI Workshop 2021 paper Notes: preliminary.
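
    To make the interaction pattern behind these works concrete (show a prediction together with its explanation, let the user correct the explanation, retrain), here is a minimal self-contained toy sketch. It uses simple feature masking with a simulated user rather than the counterexample augmentation or gradient penalties of the papers above, and all names and the synthetic data are invented for the example.

```python
# Toy sketch of an explanatory interactive learning loop: each round the model's
# (global) explanation is shown, and simulated user feedback vetoes a feature
# the model should not rely on. Illustration only, not any specific paper's protocol.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(int)               # the label truly depends only on feature 0
X[:, 1] = y + 0.01 * rng.normal(size=200)   # feature 1 is a spurious shortcut

mask = np.ones(5)            # 1 = feature allowed, 0 = vetoed by the user
model = LogisticRegression()

for round_id in range(3):
    model.fit(X * mask, y)                    # retrain on the masked features
    explanation = np.abs(model.coef_[0])      # weight magnitudes as a global explanation
    top_feature = int(np.argmax(explanation))
    print(f"round {round_id}: model relies most on feature {top_feature}")

    # Simulated user feedback: the user knows feature 1 is a confound and vetoes it,
    # so later rounds are forced to rely on the right feature instead.
    if top_feature == 1:
        mask[1] = 0.0
```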

    

Distillation

    Model reconstruction from model explanations Smitha Milli, Ludwig Schmidt, Anca D. Dragan, Moritz Hardt; FAccT 2019 paper

    

    Evaluating Explanations: How much do explanations from the teacher aid students? Danish Pruthi, Bhuwan Dhingra, Livio Baldini Soares, Michael Collins, Zachary C. Lipton, Graham Neubig, and William W. Cohen; arXiv 2020 paper Notes: defines importance of different kinds of explanations by measuring their impact when used as supervision.

    

Regularization without Supervision

    Approaches that regularize the model's explanations in an unsupervised manner, often for improved interpretability.

    

    Beyond sparsity: Tree regularization of deep models for interpretability Mike Wu, Michael Hughes, Sonali Parbhoo, Maurizio Zazzi, Volker Roth, Finale Doshi-Velez; AAAI 2018 paper

    

    Regional tree regularization for interpretability in deep neural networks Mike Wu, Sonali Parbhoo, Michael Hughes, Ryan Kindle, Leo Celi, Maurizio Zazzi, Volker Roth, Finale Doshi-Velez; AAAI 2020 paper

    

    Regularizing black-box models for improved interpretability Gregory Plumb, Maruan Al-Shedivat, Ángel Alexander Cabrera, Adam Perer, Eric Xing, Ameet Talwalkar; NeurIPS 2020 paper
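
    These methods shape the model's explanations without any human annotation. As a generic illustration of the idea (not the tree- or local-explanation regularizers cited above), the sketch below adds an L1 sparsity penalty on input-gradient explanations; it assumes PyTorch, and `lambda_sparse` is a hypothetical hyperparameter.

```python
# Minimal sketch of an unsupervised explanation regularizer: encourage sparse
# input-gradient explanations by penalizing their L1 norm. Generic illustration,
# not one of the specific methods cited above.
import torch
import torch.nn.functional as F

def sparse_explanation_loss(model, x, y, lambda_sparse=0.1):
    x = x.clone().requires_grad_(True)
    logits = model(x)
    ce = F.cross_entropy(logits, y)

    # The input gradient of the true-class logit acts as the explanation to regularize.
    true_class_logit = logits.gather(1, y.unsqueeze(1)).sum()
    (input_grad,) = torch.autograd.grad(true_class_logit, x, create_graph=True)

    # L1 penalty: explanations should concentrate on a few input features.
    return ce + lambda_sparse * input_grad.abs().mean()
```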

    

Related Works

    Explanation-based learning, focusing on logic-based formalisms and learning strategies:

    

    Explanation-based generalization: A unifying view Tom Mitchell, Richard Keller, Smadar Kedar-Cabelli; MLJ 1986 paper

    

    Explanation-based learning: A survey of programs and perspectives Thomas Ellman; ACM Computing Surveys 1989 paper

    

    Probabilistic explanation based learning Angelika Kimmig, Luc De Raedt, Hannu Toivonen; ECML 2007 paper

    

    Injecting invariances / feature constraints into models:

    

    Training invariant support vector machines Dennis DeCoste, Bernhard Schölkopf; MLJ 2002 paper

    

    The constrained weight space SVM: learning with ranked features Kevin Small, Byron Wallace, Carla Brodley, Thomas Trikalinos; ICML 2011 paper

    

    Dual label-feature feedback:

    

    Active learning with feedback on features and instances Hema Raghavan, Omid Madani, Rosie Jones; JMLR 2006 paper

    

    An interactive algorithm for asking and incorporating feature feedback into support vector machines Hema Raghavan, James Allan; ACM SIGIR 2007 paper

    

    Learning from labeled features using generalized expectation criteria Gregory Druck, Gideon Mann, Andrew McCallum; ACM SIGIR 2008 paper

    

    Active learning by labeling features Gregory Druck, Burr Settles, Andrew McCallum; EMNLP 2009 paper

    

    A unified approach to active dual supervision for labeling features and examples Josh Attenberg, Prem Melville, Foster Provost; ECML-PKDD 2010 paper

    

    Closing the loop: Fast, interactive semi-supervised annotation with queries on features and instances Burr Settles; EMNLP 2011 paper

    

    Learning from rationales:

    

    Using “annotator rationales” to improve machine learning for text categorization Omar Zaidan, Jason Eisner, Christine Piatko; NAACL 2007 paper

    

    Modeling annotators: A generative approach to learning from annotator rationales Omar Zaidan, Jason Eisner; EMNLP 2008 paper

    

    Active learning with rationales for text classification Manali Sharma, Di Zhuang, Mustafa Bilgic; NAACL 2015 paper

    

    Critiquing in recommenders:

    

    Critiquing-based recommenders: survey and emerging trends Li Chen, Pearl Pu; User Modeling and User-Adapted Interaction 2012 paper

    

    Coactive critiquing: Elicitation of preferences and features Stefano Teso, Paolo Dragone, Andrea Passerini; AAAI 2017 paper

    

Resources

    A selection of general resources on Explainable AI focusing on overviews, surveys, societal implications, and critiques:

    

    Survey and critique of techniques for extracting rules from trained artificial neural networks Robert Andrews, Joachim Diederich, Alan B. Tickle; Knowledge-Based Systems 1995 page

    

    The Mythos of Model Interpretability Zachary Lipton; CACM 2018 paper

    

    A survey of methods for explaining black box models Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi; ACM Computing Surveys 2018 paper

    

    Explanation in Artificial Intelligence: Insights from the Social Sciences Tim Miller; Artificial Intelligence 2019 paper

    

    Unmasking Clever Hans predictors and assessing what machines really learn Sebastian Lapuschkin, Stephan Wäldchen, Alexander Binder, Grégoire Montavon, Wojciech Samek, Klaus-Robert Müller; Nature Communications 2019 paper

    

    Interpretation of neural networks is fragile Amirata Ghorbani, Abubakar Abid, James Zou; AAAI 2019 paper

    

    Is Attention Interpretable? Sofia Serrano, Noah A. Smith; ACL 2019 paper

    

    Attention is not Explanation Sarthak Jain, and Byron C. Wallace; ACL 2019 paper

    

    Attention is not not Explanation Sarah Wiegreffe, and Yuval Pinter; EMNLP-IJCNLP 2019 paper

    

    The (un)reliability of saliency methods Pieter-Jan Kindermans, Sara Hooker, Julius Adebayo, Maximilian Alber, Kristof T. Schütt, Sven Dähne, Dumitru Erhan, and Been Kim; Explainable AI: Interpreting, Explaining and Visualizing Deep Learning 2019 paper

    

    Explanations can be manipulated and geometry is to blame Ann-Kathrin Dombrowski, Maximillian Alber, Christopher Anders, Marcel Ackermann, Klaus-Robert Müller, and Pan Kessel; NeurIPS 2019 paper

    

    Fooling Neural Network Interpretations via Adversarial Model Manipulation Juyeon Heo, Sunghwan Joo, and Taesup Moon; NeurIPS 2019 paper

    

    Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead Cynthia Rudin; Nature Machine Intelligence 2019 page

    

Related Resources

    Awesome explainable AI

    

    Awesome machine learning interpretability

    

Not Yet Sorted

    e-SNLI: natural language inference with natural language explanations Oana-Maria Camburu, Tim Rocktäschel, Thomas Lukasiewicz, and Phil Blunsom; NeurIPS 2018 paper

    

    Multimodal explanations: Justifying decisions and pointing to the evidence Dong Huk Park, Lisa Anne Hendricks, Zeynep Akata, Anna Rohrbach, Bernt Schiele, Trevor Darrell, Marcus Rohrbach; CVPR 2018 paper

    

    "Learning Deep Attribution Priors Based On Prior Knowledge Ethan Weinberger, Joseph Janizek, Su-In Lee; NeurIPS 2020 paper

    

    Explain and Predict, and then Predict Again Zijian Zhang, Koustav Rudra, Avishek Anand; arXiv 2021 paper

    

Source: https://blog.csdn.net/lqfarmer/article/details/117607131
