
Adversarial image detection based on error computing with Taylor series



Abstract. This thesis explores adversarial attacks on learning systems and describes some of the most common attack strategies. These techniques aim to make the smallest possible changes to the input data, so as to remain undetected, that lead the system into making intended mistakes. Multiple defensive methods have been proposed, but with limited results and applicability. This is because the domain is a constant race between attackers and defenders, with new methods constantly appearing that surpass their predecessors. The work is divided into two stages: a reflection and a proposal. In the first part, different methods, from both the attacker’s and the defender’s side, are compared and analysed, highlighting their strengths and weaknesses. The attack-defence pairing is followed in chronological order, showing how methods evolve specifically to counter previous work. Finally, we propose a solution for increasing the robustness and security of the model: a defence mechanism that approximates the whole network with a Taylor series expansion and uses this approximation to decide whether a sample has been tampered with. The principal focus is on robustly patching current vulnerabilities to small, highly targeted changes in the data. The result is a mechanism that has a low impact on normal classification whilst providing good and robust detection of adversarial input.
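
To make "error computing with Taylor series" concrete, the following is a minimal sketch of how a first-order Taylor approximation error can be computed and compared against a threshold. It is not the thesis's implementation: the abstract does not state how the network is approximated or how the decision is made, so the one-unit `toy_model`, the reference point `x0`, the function names, and the threshold value are all assumptions chosen for illustration.

```python
import math

def toy_model(x, w, b):
    """Hypothetical stand-in for a network: a single tanh unit."""
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)

def toy_gradient(x, w, b):
    """Analytic gradient of toy_model with respect to the input x."""
    y = toy_model(x, w, b)
    return [(1.0 - y * y) * wi for wi in w]

def taylor_error(x, x0, w, b):
    """Absolute gap between f(x) and its first-order Taylor expansion at x0."""
    f0 = toy_model(x0, w, b)
    g0 = toy_gradient(x0, w, b)
    approx = f0 + sum(g * (xi - x0i) for g, xi, x0i in zip(g0, x, x0))
    return abs(toy_model(x, w, b) - approx)

def exceeds_threshold(x, x0, w, b, threshold):
    """Assumed decision rule: flag the sample when the error is large."""
    return taylor_error(x, x0, w, b) > threshold

# Illustrative values only (not taken from the thesis).
w, b = [1.0, -2.0], 0.1
x0 = [0.0, 0.0]
near, far = [0.01, 0.0], [1.0, 0.0]
print(taylor_error(near, x0, w, b), taylor_error(far, x0, w, b))
print(exceeds_threshold(near, x0, w, b, 0.01), exceeds_threshold(far, x0, w, b, 0.01))
```

The sketch only shows how the approximation error behaves as an input moves away from the expansion point; it says nothing about detection quality on real adversarial images.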
