September 11, 2023
As students of IIT Kharagpur, when we step out of the campus and chat with other students, parents, people from industry and stakeholders in every facet of India, or arguably even the world, we have the luxury of enjoying strong, pre-assigned credibility and trust. They trust that we have the nerve to accept the “tough problems” with excitement, that the term “work hours” will not apply to us, and that we will be willing to train ourselves on new things to get the job done. And this is true in a village in the North-East, at roundtables on Raisina Hill, at workstations in Bangalore or in glass-door cabins in Silicon Valley.
But how did we get here? It can’t possibly be just because we proved our mettle in solving PCM problems, right? It is because students like us, who once sat on the benches of Nalanda and our departments, went on to take up projects in every sector around the globe and achieved monumental results over the decades. The annual Young Alumni Achievers Awards recognise those torchbearers who, in the very early stages of their careers, have added to the trust we inherit. In this series of articles, we cover their journeys as students, the lustrous results they achieved after years of heat, and the feedback and vision they have for those of us who want to follow in their tall shadows.
In this series, the first alumnus we talked to was Anshumali Shrivastava, an RPian who graduated from the Integrated MSc course in Mathematics and Computing with an Institute Silver Medal in 2008. After working in the corporate sector for a while, he took up a PhD position at Cornell to work at the interface of Big Data, ML and probability theory. Currently, he is an Associate Professor in the CS Department at Rice University and also runs his company ThirdAI (to be pronounced “Third-Eye”) as its CEO. So, he is literally Batman. But what is more interesting is the feat that ThirdAI’s product achieves, how it is a hallmark of Indian innovation in cost-effectiveness, and how his academic trajectory fed into this market. So, let’s get straight to his journey, starting from the good old days.
Wide-angle and telephoto: Keeping both the snaps
When we asked Anshumali to share his key insights about his undergrad days and what kept him moving forward, the Institute Silver Medalist was humble and/or prodigious enough to say, “I was not a really hard-working one”. But his ideals were not confined to academics, and he advises the same for us. Finding the real charm of KGP in holistic development, he cites the importance of leadership, team dynamics and entrepreneurial spirit, which matter more as a profession grows in impact and in the capital involved. Meanwhile, he also emphasises why it is important to focus on a smaller set of things to master, keeping in mind that this set will keep changing as we shift sectors. Anshumali, after acing it as an undergrad, as a software engineer, as a PhD student and now simultaneously as a professor and an entrepreneur, can certainly vouch for having both focused excellence and an arsenal of soft skills.
Whiteboard vs Chalkboard
During his time at FICO, a company that provides credit scores, he was pulled towards data science and ML. However, given his strong background in mathematics, he was interested in the more fundamental and difficult problems in this space, the ones that demand deeper applied mathematics and were usually not touched in the offices in Bangalore. So, just as any other Silver Medalist would, he went to one of the best academic spaces in the world to carry out his research. During his PhD at Cornell, he worked on speeding up “hashing algorithms” by orders of magnitude, which would significantly improve large-scale ML and data mining systems. The Best Paper Award at NeurIPS 2014 (the most prestigious conference in the field), among other accolades, was rightfully earned.
But here comes the catch and curiosity...
It was 2015 when he defended his thesis, which was a very exciting time for the field, just like it is today. Around this time, generative AI was emerging, and DeepMind’s RL-driven AlphaGo shocked the world. The glass doors of Silicon Valley were open, with a red carpet rolled out for IIT-Ivy League grads like Anshumali who had achieved breakthroughs in AI. But this is when Anshumali chose the chalkboard over the whiteboard, accepting a faculty position at Rice University. Usually, PhD grads spend a few years as a post-doc or in industry, but he went straight to the lecture halls.
Obviously, we wanted to know why. He explained that the best setting for solving fundamentally tough problems is one where bright young minds work freely with experts, something that academia provides best. This is also evident today: industry usually addresses the large-scale problems while academia solves the deeper ones, and the two require different resources and talent. As a research group leader, when he gives a problem to a student, the student will brainstorm and work on it 24/7, as if life depended on it, and not in a 9-to-5 routine. So, he was elated to accept the faculty position at Rice University.
Reinventing the wheel and disrupting businesses
We went into the details of his work as a PhD student and later as a group leader at Rice University. What he had to explain was inspiring not only for an ML student but for any researcher, engineer or entrepreneur in general. During his PhD, he was investigating whether he could solve something called “inner-product search” using randomised algorithms. The widely accepted and mathematically proven answer was “no”. So today, when most student researchers just deploy working solutions to pick the low-hanging fruit, and a small fraction eye the “unsolved problems”, here was a KGPian going head to head with an “unsolvable problem”. But as Anshumali beautifully stated, “take impossibility with a grain of salt”. In his mathematical setting, this means that the problem is unsolvable only under certain assumptions. And what if he could tweak and perturb the assumptions themselves? This was his moonshot effort, and he cracked it, earning him the NeurIPS Best Paper award.
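For readers who want a feel for what “tweaking the assumptions” can look like, here is a minimal sketch. It is not the construction from Anshumali’s paper; it is a simpler, commonly described reduction that we assume purely for illustration. Data vectors and query vectors are transformed differently (asymmetrically), so that searching for the largest inner product becomes an ordinary nearest-neighbour search, a problem that randomised hashing schemes are known to handle. All function names are hypothetical, and the brute-force searches stand in for a real hashing index.

```python
import math
import random


def norm(vec):
    """Euclidean length of a vector."""
    return math.sqrt(sum(v * v for v in vec))


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def transform_data(vectors):
    """Scale all data into the unit ball, then append one extra coordinate
    so that every transformed data vector has length exactly 1."""
    max_norm = max(norm(vec) for vec in vectors)
    transformed = []
    for vec in vectors:
        scaled = [v / max_norm for v in vec]
        slack = max(0.0, 1.0 - dot(scaled, scaled))  # guard against rounding
        transformed.append(scaled + [math.sqrt(slack)])
    return transformed


def transform_query(query):
    """Normalise the query and append a zero. Note that this differs from
    the data transform, which is the asymmetric part of the trick."""
    length = norm(query)
    return [v / length for v in query] + [0.0]


def max_inner_product_index(vectors, query):
    """Brute-force answer to the original inner-product search."""
    return max(range(len(vectors)), key=lambda i: dot(vectors[i], query))


def nearest_neighbour_index(vectors, query):
    """Brute-force nearest neighbour (stand-in for a hashing index)."""
    return min(range(len(vectors)),
               key=lambda i: squared_distance(vectors[i], query))


# Illustrative random data: vectors with very different lengths.
random.seed(0)
data = [[random.uniform(-5, 5) for _ in range(8)] for _ in range(200)]
queries = [[random.uniform(-1, 1) for _ in range(8)] for _ in range(20)]
transformed_data = transform_data(data)

# Count how often the two searches pick the same data vector.
agreements = sum(
    max_inner_product_index(data, q)
    == nearest_neighbour_index(transformed_data, transform_query(q))
    for q in queries
)
print(f"{agreements} of {len(queries)} queries agree")
```

Because every transformed vector has unit length, the squared distance between a transformed data vector and a transformed query equals 2 minus twice the inner product of the scaled originals, so the two searches rank the candidates identically.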
But how he built on this fundamental dent was even more impactful. At Rice, he created the first algorithms for CPUs to train Deep Neural Networks faster than GPUs. At a time when people thought that such deep networks would take forever to train on CPUs, his algorithm allowed them to be trained faster than on the Nvidia V100 GPU, the then flagship. It must be clear to any reader that there is enormous business potential in this tech. Many consider it an existential threat to the likes of Nvidia, which recently crossed the trillion-dollar mark in valuation. So he raised capital to start his company ThirdAI as CEO, with his then PhD student Tharun Medini (IITB 2015, AIR 21 in JEE ’11) as CTO. This is how nearly a decade of research bore fruit in this highly disruptive company.
Jugaad Tech and Overriding the Infrastructure Gaps
Recently, in a public address, Sam Altman, CEO of OpenAI, the creator of GPT, was asked by Rajan Anandan, former head of Google India, how India could develop its own ChatGPT with an investment of around 10 million dollars, a fraction of what OpenAI had. To this, Altman said, “We will explicitly tell you that it’s completely hopeless to challenge us in training foundational models, and you shouldn’t even attempt it”. He later cited the large scale of resources and the years of built-up infrastructure that OpenAI utilised to achieve this feat, things that are inaccessible in India.
We bet that as Indians, and even more so as Indian engineering students, we all felt a sense of shame and anger, or even incompetence, when someone of the stature of Sam Altman said this. But he clearly doesn’t know whom he is talking to. Inaccessibility of resources is a truth we are born and raised with in every town of India. If we were supposed to do things exactly the way Americans or Europeans have done them, we could never have achieved all that we have so far. So inaccessibility of resources is not a bug; it is a feature, because we are trained to get the job done in the most cost-effective way.
Anshumali cites the success of Mangalyaan and Chandrayaan as examples of how frontier research and innovation, which we often call Jugaad tech, helped us achieve the best possible results with a fraction of the resources of our Western counterparts. He says that ChatGPT and other deep tech products can be and will be built in India only if we follow this story. If we tried to replicate the path OpenAI took, of course, we would fail, because it is true that there is an infra gap, such as the unavailability of large GPU servers. But after a decade of research, he has built systems that utilise CPUs much more productively than GPUs. He says that many similar “tough problems” in math, CS and logistics need to be solved in the Indian context to build something like ChatGPT, and that it surely can be done. To this end, he encourages students to identify such bottlenecks, write them into a plan, build teams and solve them. We believe that after the example Anshumali has set, many more KGPians and other students will roll up their sleeves and go head to head with the “unsolved” and the “unsolvables”.