Rui Li earns NSF CAREER Award to create machine intelligence that can grow and adapt
RIT assistant professor aims to develop novel framework that prevents catastrophic forgetting in AI
A Rochester Institute of Technology professor has earned a prestigious National Science Foundation award to develop machine intelligence that can actually grow when given new information. His work will help to solve the problem of catastrophic forgetting in artificial intelligence networks.
Rui Li, an assistant professor in RIT’s Ph.D. program in computing and information sciences, received an NSF Faculty Early Career Development (CAREER) award and grant for his five-year project, titled “Co-evolution of Machine Intelligence and Continuous Information.”
Li aims to create an adaptive deep learning framework that allows machine learning models to continuously learn from new bits of data and grow beyond their existing structures. The framework would be used to create more efficient and effective machine learning systems for experts in many domains—from computational biology to astrophysics.
“In the real world, new information about a subject is incrementally available over time,” said Li. “We want to create machine intelligence that can automatically enhance itself—and go through qualitative transformation—in response to new observations and data coming in.”
To do this, Li needs to solve a fundamental problem that plagues today’s artificial neural networks, called catastrophic forgetting.
Unlike humans, artificial neural networks can’t simply add new bits of information to what they have already learned. Doing so would require exponentially expanding and rebuilding the network structure. Li described this as a scalability problem, where the structure becomes larger and larger as the model gets new data.
“If your model is too complex, it will capture non-essential noise that decreases efficiency and effectiveness,” said Li. “We want quality growth, instead of quantity growth. We want machine learning models that are not constrained by their initial structures.”
Li is applying the principle of Occam’s razor to keep the model simple. The principle holds that entities should not be multiplied unnecessarily; here, that means the network should not grow more than the data require, which reduces redundancy. This can be achieved automatically via Bayesian inference, a statistical technique that updates the probability of a hypothesis as more evidence becomes available.
“For Bayesian, top-down soft constraint comes from a prior probability—which is a distribution defined over a hypothesis space—which includes all models accessible to us as rational thinkers,” said Li. “Bottom-up hard constraint comes from the likelihood function of the model from the data.”
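As a rough illustration of the prior-times-likelihood idea Li describes, the sketch below updates beliefs over a tiny hypothesis space as new batches of data arrive. It is not Li's framework: the coin-flip setting, the three candidate biases, and the function names `posterior` and `binomial_likelihood` are all assumptions chosen to keep the example small.

```python
from math import comb

def binomial_likelihood(bias, heads, flips):
    """Probability of seeing `heads` in `flips` tosses of a coin with this bias."""
    return comb(flips, heads) * bias**heads * (1 - bias) ** (flips - heads)

def posterior(prior, likelihoods):
    """Bayes' rule: posterior is proportional to prior times likelihood."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Hypothetical hypothesis space: three candidate coin biases, equally likely a priori.
biases = [0.25, 0.5, 0.75]
belief = [1 / 3, 1 / 3, 1 / 3]

# Data arrive incrementally; each batch is (heads, flips).
for heads, flips in [(3, 4), (7, 10)]:
    likelihoods = [binomial_likelihood(b, heads, flips) for b in biases]
    belief = posterior(belief, likelihoods)  # yesterday's posterior is today's prior

for b, p in zip(biases, belief):
    print(f"bias={b}: posterior={p:.4f}")
```

Each batch reuses the previous posterior as its prior, so the model never has to revisit old data to incorporate new evidence.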
By defining a prior probability distribution with a stochastic process, Li plans to create a non-parametric Bayesian framework.
“If we use a family of random variables to index models, we can construct an interesting prior with various stochastic processes,” Li said. “It’s important because it enables us to make use of infinite-dimensional math structures to grow machine learning models as the size of a data set grows.”
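One well-known example of a stochastic-process prior whose complexity grows with the data is the Chinese restaurant process, sketched below. The article does not say which processes Li's framework uses, so this is only an assumed stand-in; the function name and the `alpha` concentration parameter are illustrative choices.

```python
import random

def chinese_restaurant_process(n_points, alpha, seed=0):
    """Sample cluster sizes for n_points from a Chinese restaurant process.

    Each new point joins an existing cluster with probability proportional
    to that cluster's size, or opens a new cluster with probability
    proportional to alpha (which must be positive).
    """
    rng = random.Random(seed)
    cluster_sizes = []
    for _ in range(n_points):
        weights = cluster_sizes + [alpha]  # last slot stands for "new cluster"
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(cluster_sizes):
            cluster_sizes.append(1)        # the model grows by one component
        else:
            cluster_sizes[choice] += 1
    return cluster_sizes

for n in (10, 100, 1000):
    sizes = chinese_restaurant_process(n, alpha=1.0)
    print(f"{n} points -> {len(sizes)} clusters")
```

The number of clusters is not fixed in advance. It tends to rise slowly as more points arrive, which is the sense in which a non-parametric prior lets model size track data size.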
Collaborating with Li on the research is a team of three RIT computing and information sciences Ph.D. students from different backgrounds in computing, engineering, and biomedicine. Li noted the importance of having a multidisciplinary team. As experts apply the new deep learning framework to their respective fields, it can help people make decisions and extract a better understanding from data.
“Rui has a deep understanding of statistical machine learning theory and an unusual ability to look at a range of complex problems and see how his framework can be applied,” said Anne Haake, dean of RIT’s Golisano College of Computing and Information Sciences, who has collaborated with Li on research. “Computational biology and image understanding are just some examples.”
RIT has more than a dozen NSF CAREER award winners working at the university.
The CAREER program is an NSF-wide activity that offers awards in support of junior faculty who exemplify the role of teacher-scholars through outstanding research, excellent education, and the integration of education and research within the context of the mission of their organizations.