Booker Conference Room #2512, Jacobs School of Engineering, 9500 Gilman Dr, La Jolla, San Diego, California 92093
The Generative Adversarial Network (GAN) is a powerful framework for training generative models and has recently shown impressive results in computer vision. Broadly, the framework estimates generative models via an adversarial process in which a generator and a discriminator play a two-player game. Although there are only a few successful cases so far, the GAN framework is useful for NLP tasks and overcomes inherent limitations of conventional methods. In this talk, we will first discuss a game-theoretic formulation of the GAN architecture and evaluate the framework on an image generation task. Next, we will address the limitations of the vanilla framework and study modifications that circumvent common training problems, including mode collapse and non-convergence. Equipped with this knowledge, we will extend the GAN framework to natural language and review techniques that enable training with discrete outputs. Subsequently, we will introduce adversarial training and study its commonly used configurations, including domain adaptation. Finally, we will demonstrate applications of GANs in building conversational models, neural translation models, text style transformations, and related NLP tasks.
Abhishek Sethi is currently a Research Scientist (Machine Learning) at Amazon, Inc. in Sunnyvale, California. In his current research, he is developing deep learning-based natural language processing frameworks and techniques for Amazon Echo. Prior to Amazon, Abhishek completed his graduate degree at the Massachusetts Institute of Technology (MIT), focusing on game theory and machine learning under Prof. Saurabh Amin at the Operations Research Center. In his graduate research, he proposed a Bayesian game-theoretic formulation to model the interactions between a profit-maximizing firm and a population of strategic customers. Specifically, he characterized the existence and uniqueness of the Nash equilibrium for both sequential and simultaneous games and showed that, under certain technical (yet realistic) conditions on the ROC curve, the value of information about certain combinations of theft levels can attain negligibly small values. In addition to academic research, Abhishek has developed machine learning-based automated long-short trading strategies for equities and corporate bonds at a proprietary hedge fund and Deutsche Bank, respectively.