20 Jul 2016

On July 20, 2016, Zhelun Wu defended his master's thesis, “Decipherment of Evasive or Encrypted Offensive Text”.

Abstract: A very common computational task in monitoring online chat sessions is stopping users from sending malicious chat messages. Examples of malicious messages include age-inappropriate language, cyber-bullying, and the disclosure of personal information. Rule-based filtering systems are commonly used to deal with this problem, but they cannot filter out every malicious message, because people invent increasingly subtle ways to disguise their messages and bypass such filters. Machine learning classifiers can also be used to identify and filter malicious messages. However, such classifiers still rely on training data that goes out of date, so they cannot detect new forms of malicious text. In this thesis, to solve this problem, we model the messages corrupted by a malicious user to bypass a chat filter as ciphertext. We apply automatic decipherment techniques, using Expectation-Maximization with Hidden Markov Models and a beam search algorithm, to decrypt corrupted malicious text back into plaintext, which can then be filtered using rules or a classifier.
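To make the idea in the abstract concrete, here is a minimal Python sketch of the decoding step only. It is not the thesis implementation: the corpus, the substitution table `CANDIDATES`, the blocklist, and all function names are hypothetical, and the substitution candidates are set by hand, whereas the thesis learns its model with Expectation-Maximization. The sketch treats a disguised word as ciphertext, scores candidate plaintexts with a character bigram model, and keeps the best few hypotheses with beam search.

```python
import math
from collections import defaultdict

# Hypothetical toy "plaintext" corpus for the character bigram model.
CORPUS = ["you are stupid", "stupid idea", "that is stupid", "you are nice"]

# Hypothetical hand-set table: observed symbol -> plausible plaintext letters.
# (Assumption: a learned model would estimate these with EM instead.)
CANDIDATES = {"$": "s", "5": "s", "7": "t", "+": "t", "0": "o",
              "1": "il", "!": "i", "@": "a", "3": "e", "v": "uv"}

BLOCKLIST = {"stupid"}  # hypothetical rule-based filter vocabulary


def train_bigram(corpus):
    """Count character bigrams; '^' marks the start of a message."""
    counts = defaultdict(lambda: defaultdict(int))
    for line in corpus:
        prev = "^"
        for ch in line:
            counts[prev][ch] += 1
            prev = ch
    return counts


def log_score(counts, prev, ch, vocab_size=27, alpha=0.1):
    """Add-alpha smoothed bigram score (a rough score, not exactly normalised)."""
    total = sum(counts[prev].values())
    return math.log((counts[prev][ch] + alpha) / (total + alpha * vocab_size))


def decipher(text, counts, beam_width=5):
    """Beam search over plaintext hypotheses, one observed symbol at a time."""
    beam = [(0.0, "")]  # (log score, plaintext prefix)
    for sym in text:
        # The observed symbol itself is always one of the candidates.
        options = list(dict.fromkeys(CANDIDATES.get(sym, "") + sym))
        expanded = []
        for score, prefix in beam:
            prev = prefix[-1] if prefix else "^"
            for ch in options:
                expanded.append((score + log_score(counts, prev, ch), prefix + ch))
        # Highest score first; ties are broken alphabetically for determinism.
        expanded.sort(key=lambda item: (-item[0], item[1]))
        beam = expanded[:beam_width]
    return beam[0][1]


def is_blocked(text):
    """Rule-based filter: block a message if any word is on the blocklist."""
    return any(word in BLOCKLIST for word in text.split())


MODEL = train_bigram(CORPUS)
disguised = "$7vp1d"
print(is_blocked(disguised))                   # the raw text slips past the filter
print(is_blocked(decipher(disguised, MODEL)))  # the deciphered text is caught
```

On this toy data the disguised word `$7vp1d` passes the rule-based filter unchanged, while its deciphered form `stupid` is blocked. The beam keeps several partial hypotheses alive so that an early, locally unlikely substitution can still win once later characters are scored.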