SOP Key points, and Drafts

Final Version

The study of intelligence spans three fundamental approaches: social-scientific, biological, and technological, with the ultimate goal of creating intelligence.

My decision to pursue research in artificial intelligence is inspired by the following question: Is there anything we cannot do if we successfully combine human intellect with a computer's efficiency?

I have gained a breadth of knowledge in A.I. from working on ETA prediction for buses running in Delhi, infrared-to-RGB image translation, and predicting the root-cause mutation behind a tumor. Even though each problem has different inputs, parameters, and outputs, the same class of algorithms achieves state-of-the-art results. Thus, creating artificial intelligence is not limited to technologizing the models of intelligence seen in humans; it involves venturing into uncharted territories, opening up infinite possibilities.

I wish to delve into A.I. at a much deeper level, focusing on natural language understanding, computer vision, and reinforcement learning, and to be closely involved in advancing these areas. Hence, I want to pursue a career in research, where I know I will be continuously challenged and will have the chance to work on revolutionary technologies that help shape A.I.'s future.

In my junior year of undergraduate studies, I worked on my first project in artificial intelligence. The goal was to detect emotion from speech using a Deep Neural Network and an Extreme Learning Machine. I used the experience gained in this project to work on the cocktail party problem, which aims to extract the desired speech from a mixture of sounds. We developed an algorithm based on a speech-synthesis architecture, which models the source speaker's mel spectrogram directly rather than learning a mask over the input mixture.

In my first year of graduate studies, I worked on the image colorization problem, intending to model the color distribution conditioned on a grayscale input. Initially, we solved it in a discriminative manner, i.e., given the same image n times, the output remains the same. But image colorization is an ill-posed problem: for a given grayscale input, there can be more than one plausible color image. To capture this, we extended the model to work in a generative fashion using an autoregressive algorithm.

I decided to continue working on image colorization during my next semester, focusing on reducing artifacts and improving the coloring of larger objects. To achieve this, we fed the output to a fully convolutional network acting as a denoiser, inspired by Tacotron, a speech-synthesis architecture. After that, my professor suggested applying image colorization to I.R. images. In low-lighting applications, I.R. cameras come in handy, but interpreting I.R. images is not straightforward for a human, so translating them to RGB improves their understandability. I.R. images introduced two challenges: i) the task is no longer self-supervised and requires a paired dataset; ii) it is computationally expensive, since with grayscale images we can learn the color information at a lower spatial resolution and upscale it with minimal impact on visual quality, but with I.R. images we need to learn the luminance too. I first tried the discriminative model, but the output was blurry even after extensive hyperparameter tuning. I observed that this happens because of the pair-wise loss: the RGB images are more detailed than their I.R. counterparts, which introduces blurriness around different objects. To reduce the blurring, I trained a GAN-based model, as it is not optimized using a pair-wise loss.

I am quite excited to work in the domain of Contextual Matching or on Project Melange because 1) I have explored both topics on my own, and 2) I had a few ideas I wanted to implement in both but could not because of time limitations. For example, in contextual matching, I worked on a project where we first model the explicit context between two sentences, then generate a sentence conditioned on that explicit context and an input sentence. I wanted to extend the generation process to cases where the context is implicit, just like answering questions based on an unseen passage.

Key points

  • Class 9th
    • 9.0 pointer, science project
  • Class 10th
    • 10.0 pointer, merit
  • Class 11th
    • Downfall
  • Class 12th
    • 95 in PCM, nothing apart from that
  • Bachelor's in CS without much prior knowledge except the basics of HTML
    • first yr
      • built an image compression algorithm in MATLAB
      • developed a string search algorithm based on tries, serialization, and hashing
      • 3D modeling of a space capsule demonstrating the importance of a dome at the bottom
    • second semester
  • Comprehensive Projects
    • Participation in the IBM Quantum Challenge
    • Grayscale and Infrared Image Colorization
    • Object Detection
    • Speech Generation, and Source separation
    • ETA prediction
    • Phylogeny estimation
    • Conditional Generation of sentences
  • Undergraduate studies
    • My majors include mathematics, computer science, and electronics, and I have worked mainly on optimizing algorithms and developing new tools to aid other users in one way or another.
    • During my second-year internship, I worked with DRDO on machine consciousness, which is also where I was first introduced to Human-Computer Interaction. My main work lay in understanding what machine consciousness is; it was mostly theoretical.
    • In the meantime, I also worked on scraping and on building automated testing tools using PhantomJS.
    • During my third-year internship, I went to the National Aerospace Laboratories, Bengaluru, where I was selected as an APJ Abdul Kalam Scholar to work with CSIR on a project involving keystroke dynamics; it is currently on the brink of completion. (I completed it using RNNs and was working on a research paper, but then I learned about CNNs and how much more efficient they are; I asked my professor, who approved, so I am now completing a CNN implementation of it.)
    • The goal was to develop an algorithm that reduces the access time over a large corpus using tries and hashing; in the end, we also implemented serialization to store the bytestream directly, which is why my main interest shifted to Java. For the next six months, I focused entirely on developing algorithms and applications in Java.
    • This was complemented by an internship at DRDO, which helped me learn the concepts of neural networks and consciousness, and whether consciousness can arise in machines. It was mainly a theoretical internship, with prototyping on a tank bot, but it laid my foundations in machine learning.
    • During the following summer, I won a scholarship under the APJ Abdul Kalam Fellowship and was given the opportunity to work at the National Aerospace Laboratories in CSIR-4PI under Senior Principal Scientist Dr. G. K. Patra. I worked on a project involving time-series analysis of keystroke dynamics, along with natural language analysis of keystrokes, to increase prediction accuracy.