Student to Student: Artificial intelligence/machine learning
By Tyler Hayes
Artificial Intelligence (AI) has recently gained widespread popularity, with news articles appearing daily about its uses in industry, including facial recognition, personal assistants like Alexa or Siri, and self-driving cars. More specifically, machine learning is a branch of AI that focuses on developing machines that learn patterns from data without being explicitly programmed to do so. Currently, the most popular models used for machine learning are deep neural networks, which are loosely inspired by how human brains process data. For example, given an image of a cat, a neural network passes the image through several layers of “neurons” before making a prediction about what animal it has seen. These models work especially well when they are trained on large static datasets to perform a pre-defined task, and they even outperform humans in some cases.
However, consider the following example. Let’s suppose we have a neural network that has been trained to recognize images of cats and dogs. Now, we would like to update our network to also classify birds, without retraining it on cats and dogs. With conventional methods, if we update our network on only images of birds, it catastrophically forgets everything it had learned about cats and dogs. This phenomenon poses serious risks in safety-critical applications such as self-driving cars, and solving it is important for building more robust models that can learn new information over time. The field concerned with developing models that continuously learn and adapt to new information without forgetting previous information is known as lifelong machine learning, which is the focus of my Ph.D. studies.
Specifically, my research focuses on developing brain-inspired models that learn from real-time data streams. Humans acquire new knowledge all the time without catastrophically forgetting previous information. To do this, the brain uses the hippocampal complex to quickly learn new information, a process thought to be facilitated by neurogenesis, i.e., the creation of new neurons. When we go to sleep, representations of information in the hippocampus are re-activated and replayed to the neocortex, which is used for long-term storage and generalization. In neural network models, the hippocampus is modeled as a memory buffer that replays previous data to the long-term neural network, e.g., replaying previous examples of cats and dogs to a network when it is learning the new bird category. So far, I have developed two models that make the replay buffer more memory efficient by using data clustering and quantization techniques, and I look forward to developing more models during my studies at RIT.
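To make the replay idea concrete, here is a minimal Python sketch of a memory buffer that mixes stored cat and dog examples into each batch of new bird examples. It is only an illustration under stated assumptions, not the models described above: the names ReplayBuffer and make_training_batch are hypothetical, the buffer stores raw examples instead of clustered or quantized ones, and it uses reservoir sampling as a simple stand-in for deciding what to keep once the buffer is full.

```python
import random


class ReplayBuffer:
    """Hypothetical fixed-capacity memory of past (example, label) pairs."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # total number of examples offered to the buffer
        self.rng = random.Random(seed)

    def add(self, example, label):
        # Reservoir sampling (an assumption of this sketch): once the buffer
        # is full, each example seen so far is equally likely to be retained.
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append((example, label))
        else:
            slot = self.rng.randrange(self.seen)
            if slot < self.capacity:
                self.items[slot] = (example, label)

    def sample(self, k):
        # Draw up to k stored examples without replacement.
        k = min(k, len(self.items))
        return self.rng.sample(self.items, k)


def make_training_batch(new_pairs, buffer, replay_size):
    """Mix new examples with replayed old ones, then store the new ones."""
    batch = list(new_pairs) + buffer.sample(replay_size)
    for example, label in new_pairs:
        buffer.add(example, label)
    return batch


# Usage: the buffer already holds cats and dogs when birds arrive.
buffer = ReplayBuffer(capacity=4)
for i in range(2):
    buffer.add(f"cat_{i}", "cat")
    buffer.add(f"dog_{i}", "dog")

batch = make_training_batch(
    [("bird_0", "bird"), ("bird_1", "bird")], buffer, replay_size=2
)
labels = [label for _, label in batch]
```

In a real system, each batch built this way would be passed to the network's training step, so that every update on birds is accompanied by a reminder of cats and dogs.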
How did you come to study Artificial Intelligence/Machine Learning at RIT?
I previously earned my BS and MS degrees in Applied and Computational Mathematics at RIT. The summer after my first year as an MS student, I took an internship at UTC Aerospace Systems, where I was an image science intern. During the internship, I used computer vision and machine learning techniques to estimate the quality of images taken from airborne image sensors. When I came back to RIT that fall, I was interested in working on machine learning research for my MS thesis and started working with Dr. Nathan Cahill in the math department. When I wanted to learn more, he recommended that I take the Image Processing and Computer Vision course (IMGS-682), which is required for Imaging Science Ph.D. students and taught by Dr. Christopher Kanan. While performing my MS research and taking the course, I decided that I wanted to pursue a career in machine learning research and to apply to RIT’s Imaging Science Ph.D. program.
How did you become interested in this topic?
I first learned about machine learning during an internship at Liberty Mutual. My manager was interested in researching how we could integrate predictive analytics into their workflow to use past data to predict future trends. I thought this concept seemed pretty neat, and after using different regression techniques in my courses and at my internship at UTC Aerospace Systems, I knew I wanted to learn more. This led me to pursue research in machine learning for my MS thesis and to take the Image Processing and Computer Vision course, both of which led me to continue my research in a Ph.D. program.
What advice would you like to share with other students?
Most researchers in machine learning and computer vision come from a computer science background. While computational mathematics has a lot of overlap with computer science, there were still a lot of foundational computer science skills that I was lacking. To overcome these challenges, I signed up to take two graduate-level computer science courses as electives during my first year in the Ph.D. program, which helped solidify my skills. Overall, diving into a new research field always comes with lots of new challenges, but if you remain dedicated to learning new material and reach out for help when needed, research can be a lot of fun.