Statement of Purpose

Around 200,000 people were victims of road accidents in India in 2016 alone. That is more than the number of people killed in all of our wars put together. Even the influx of autonomous vehicles onto the roads will not mean the end of road traffic accidents. Motivated by these facts, during my research internship I explored the provocative yet scarcely studied possibility of detecting and predicting road crashes from dash-cam videos, which may in the future prove key to curtailing vehicle collisions. Though the problem has existed for quite some time, automated approaches to detecting and predicting crashes have hardly been explored, mainly due to the lack of freely available data and computing resources.

During my research internship at National Chung Cheng University in Taiwan, an opportunity arose for me to join a research team tasked with developing a robust self-driving model. My role in the team was to develop a model that detects and predicts road crashes from dash-cam videos. To give me a head start, my mentor suggested building on the object detection algorithm "You Only Look Once" (YOLO). Detecting crashes required two pieces of information: the depth of surrounding objects and their positions. My algorithm rested on a simple idea: if the predicted depths and positions of multiple objects converge, a crash is likely. To implement this, I combined a tracking algorithm (Median Flow) with a depth-prediction CNN, followed by a time-series forecasting model, ARIMA. Taking 10 consecutive frames as a historical window, I fed the tracked positions and depths to the ARIMA model to forecast the next 20 frames, and slid this window across the rest of the video. The resulting model had a precision of ~40% and a recall of ~85%, which unfortunately made it inapplicable in practice. Moreover, because the pipeline chained several computationally expensive components (YOLO, the depth-prediction network, and ARIMA), it was also far too slow.
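Below is a minimal Python sketch of the sliding-window forecasting idea described above, using statsmodels' ARIMA. The 10-frame history and 20-frame horizon follow the description in the text, but the ARIMA order, the convergence tolerances, and the helper names are my own illustrative assumptions rather than the original implementation.

```python
# Hypothetical sketch of the sliding-window forecasting step described above.
# Assumes per-frame (x, y, depth) tracks have already been produced by the
# detector/tracker; window sizes follow the text, everything else is assumed.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

HISTORY = 10   # frames used as the historical window
HORIZON = 20   # frames forecast ahead

def forecast_track(track, order=(2, 1, 0)):
    """Forecast one scalar series (e.g. x, y, or depth) HORIZON frames ahead."""
    model = ARIMA(np.asarray(track, dtype=float), order=order)
    fitted = model.fit()
    return fitted.forecast(steps=HORIZON)

def predict_collision(obj_a, obj_b, depth_tol=1.0, pos_tol=20.0):
    """Flag a likely crash if two objects' forecast depths and positions converge.

    obj_a / obj_b: dicts with 'x', 'y', 'depth' lists covering the last frames.
    The tolerances are illustrative placeholders, not tuned values.
    """
    fa = {k: forecast_track(obj_a[k][-HISTORY:]) for k in ("x", "y", "depth")}
    fb = {k: forecast_track(obj_b[k][-HISTORY:]) for k in ("x", "y", "depth")}
    close_depth = np.abs(fa["depth"] - fb["depth"]) < depth_tol
    close_pos = (np.abs(fa["x"] - fb["x"]) < pos_tol) & (np.abs(fa["y"] - fb["y"]) < pos_tol)
    return bool(np.any(close_depth & close_pos))
```

In the actual pipeline, the per-object tracks would come from YOLO detections linked by Median Flow, with depth supplied by the depth-prediction CNN.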

The impracticality of the above algorithm exposed an obvious limitation of CNN-only models: their inability to capture contextual information. To counter this issue, I had to start from scratch and delved into the sea of RNN papers, searching for any hint of the word 'detect'. I eventually came across an excellent publication by three Taiwanese researchers who had suggested an alternative approach to the same problem using LSTMs. They had access to a hand-labeled dataset comprising 2000 videos (each 4 seconds long, with 100 frames). I implemented their approach to compare it with the model described above. To simplify the input, I extracted motion features using improved dense trajectories (IDT) over 5 consecutive frames and encoded them with a Gaussian mixture model (GMM). The resulting encoding served as the input to the LSTM, and using an exponential loss function to penalize the output probability, I trained the model for 24 hours. It achieved a precision of 56% and a recall of 80%, a considerable improvement over the former model. The main issue this algorithm faced was the lack of sufficient data, since accidents occur in a wide variety of scenarios and ways. The limited number of available videos does not fully capture every context, and the presence of innumerable outliers did not help. Another perspective, examined by Dr. Rachel, detects accidents using a hierarchical RNN rather than LSTMs, feeding it downscaled raw pixel values as input. This model fared about the same, though there is room for improvement by hand-tuning the input features.
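As a rough illustration of the training objective described above, the PyTorch sketch below runs an LSTM over per-frame features and applies an exponentially weighted loss so that early, confident predictions on positive clips are penalized less harshly as they approach the crash frame. The feature dimension, hidden size, and the assumption that the crash occurs at the final frame of a positive clip are placeholders of mine, not the published settings.

```python
# Rough sketch of an LSTM with an exponential (anticipation-style) loss.
# Assumes IDT+GMM features are already encoded per frame; dimensions and
# hyperparameters are illustrative placeholders.
import torch
import torch.nn as nn

class CrashAnticipationLSTM(nn.Module):
    def __init__(self, feat_dim=512, hidden_dim=256):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, feats):                 # feats: (batch, frames, feat_dim)
        out, _ = self.lstm(feats)
        return torch.sigmoid(self.head(out)).squeeze(-1)  # per-frame crash prob

def exponential_loss(probs, is_positive, eps=1e-6):
    """Exponentially weighted log loss over the per-frame crash probabilities.

    probs: (batch, frames); the crash is assumed to occur at the last frame
    of positive clips, so earlier frames receive exponentially smaller weight.
    is_positive: (batch,) boolean tensor marking clips that contain a crash.
    """
    batch, frames = probs.shape
    t = torch.arange(frames, dtype=probs.dtype)
    weights = torch.exp(-(frames - 1 - t))            # e^{-(T - t)}
    pos = -(weights * torch.log(probs + eps)).sum(dim=1)
    neg = -torch.log(1.0 - probs + eps).sum(dim=1)
    return torch.where(is_positive, pos, neg).mean()
```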

On a more fundamental level, I became better acquainted with the trade-offs of choosing one model over another and gained a firmer grasp of the essentials of research. It also gave me insight into the major issues in these kinds of problems, such as identifying vital input features, tuning the network topology, and choosing the loss function. For example, the networks that were fed only the essential features, or extra spatial information alongside the input, performed relatively better. The main difference is that such a model can map input features directly to the output without being burdened by redundant information. Likewise, the differences between model families also became apparent: where RNNs excel at capturing contextual information, CNNs can precisely analyze the subtle spatial features present in each frame. Their complementary nature proved useful in understanding how the networks summarized above work. In addition, the lack of an adequate dataset can be partially compensated for by careful hand-tuning of the model, though that alone does not seem enough to resolve the issue.

Besides these, I have implemented a few other research papers, such as NEAT (NeuroEvolution of Augmenting Topologies), published in 2002. It is based on genetic algorithms: we start with a minimal topology and progressively add nodes until the network reaches a satisfactory fitness level. To avoid the common pitfalls of genetic algorithms, I used historical markings to track the origin of each gene and speciation to prevent newly formed networks from dying out prematurely. I used the model mainly as a substitute for reinforcement learning to play arcade games such as Pong and Mario. Though it worked well, training took considerably longer and was notably more computationally expensive than deep reinforcement learning. Although improved variants of NEAT, such as HyperNEAT, are now available, the full extent of evolutionary algorithms' applicability remains to be seen.
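The toy Python sketch below illustrates two of the NEAT ideas mentioned above: historical markings (innovation numbers) attached to connection genes, and the add-node mutation that grows a minimal topology. It is a simplified illustration rather than the reference NEAT implementation, and it omits crossover, speciation, and fitness evaluation.

```python
# Simplified illustration of NEAT-style innovation numbers and the
# add-node mutation; not the reference implementation.
import random
from dataclasses import dataclass, field

_innovation_counter = 0
def next_innovation():
    global _innovation_counter
    _innovation_counter += 1
    return _innovation_counter

@dataclass
class ConnectionGene:
    in_node: int
    out_node: int
    weight: float
    enabled: bool = True
    innovation: int = field(default_factory=next_innovation)  # historical marking

@dataclass
class Genome:
    num_nodes: int
    connections: list

    def mutate_add_node(self):
        """Split a random enabled connection by inserting a new hidden node."""
        enabled = [c for c in self.connections if c.enabled]
        if not enabled:
            return
        conn = random.choice(enabled)
        conn.enabled = False
        new_node = self.num_nodes
        self.num_nodes += 1
        # Incoming connection gets weight 1, outgoing keeps the old weight,
        # so the network's behaviour is initially preserved.
        self.connections.append(ConnectionGene(conn.in_node, new_node, 1.0))
        self.connections.append(ConnectionGene(new_node, conn.out_node, conn.weight))

# Minimal starting genome: two inputs fully connected to one output.
genome = Genome(num_nodes=3, connections=[
    ConnectionGene(0, 2, random.uniform(-1, 1)),
    ConnectionGene(1, 2, random.uniform(-1, 1)),
])
genome.mutate_add_node()
```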

Though I am inclined to describe a few more research projects I have implemented, such as deep RL for an end-to-end self-driving model and artistic style transfer, I should stress that at the heart of my research concentration lies a more general interest in data science, algorithms, and, more specifically, AI. To date, my experience and knowledge have been sustained by an array of courses I have taken, such as Advanced Algorithms, Machine Learning, and AI, in which I have maintained an average grade of B. My interest stems from reading about current developments in science as it heads ever closer to automation, and I yearn to be part of this progression, which has instilled in me an appetite for research. With these interests, I wish to pursue an INTERN/MS/Ph.D. under your guidance, which I believe will further nurture my passion for the subject and prepare me for new challenges.

Besides academics, I indulge in a few hobbies such as competitive programming, in which, over my undergraduate years, I have won 3 gold, 1 silver, and 3 bronze medals in national contests held on online platforms. Last year I qualified for the ACM ICPC national round, where my three-member team placed 25th out of 2000 teams. In addition to programming, I have a strong disposition towards sketching, which I pursue in my leisure time; I won a silver medal at the annual intra-school sketching competition at IIT Ropar in 2017.

Apart from these personal pursuits, I have held multiple positions of responsibility throughout my undergraduate years, such as being a core team member of Project Prabhas and Project E-Swasthya at Enactus. Enactus, which I joined in my freshman year, is a non-profit organization dedicated to creating a generation of entrepreneurial leaders and social innovators. Noticing the many slums around my college, where electricity shortages caused a myriad of problems for residents, our team sought a solution by providing solar energy products at affordable rates with the help of a volunteer from the community. Though the project initially flourished, it ultimately did not work as expected and had to be wound down, largely due to our own lack of understanding and inexperience in dealing with people. However, it gave us invaluable judgment and experience in how to approach similar projects in the future. Armed with this knowledge, our team launched an enterprise, E-Swasthya, whose aim was to ease the problems caused by the shortage of doctors by providing remote diagnosis and treatment of patients through telecommunication technology. The project was quite successful and was recognized by our district administration, which adopted our model soon after. This project, along with the efforts of other teams, earned us 4th place out of approximately 100 universities at the 2016 Enactus nationals.

Being genuinely inquisitive about competitive programming, I soon noticed the absence of a coding culture during my initial years at IIT Ropar. A general lack of guidance and understanding about software development plagued the students, and many of us wished for a change. To usher in a culture of software development, our team of six kick-started our university's first-ever software community in our 6th semester. The aim was to foster collaborative work on open-source projects and prepare us for real-world problems in today's fiercely competitive market. Within a span of four months, we successfully launched several full-fledged applications, such as our Facetime Attendance application and our college's own web portal.

To make good use of my time, I also took part in volunteer work during college, such as organizing a blood donation camp last year to mitigate the shortage of blood supply at our local hospital. Additionally, I was part of the core creative team of our cultural festival, Zeitgeist, from 2015 to 2016, where I coordinated poster design and its publicity on social media.
