Wednesday, December 26, 2018

Sanjeev Arora: Interview with John Urschel at Heidelberg Laureate Forum


The Heidelberg Laureate Forum (HLF) is an organization devoted to encouraging interaction between distinguished senior scholars and researchers in Computer Science and Mathematics (the Laureates) and early career academics (graduate students and postdocs). The Forum works with a number of other organizations in the events it hosts - including the Association for Computing Machinery (ACM), the International Mathematical Union, the Norwegian Academy of Science and Letters (which also awards the Abel Prize) and the Lindau Nobel Laureate Meetings.

This video shows Prof Sanjeev Arora in conversation with MIT graduate student John Urschel. Others who attended the Heidelberg Laureate Forum in 2018, to name just a few, included Michael Atiyah, Vinton Cerf, Jeff Dean, Richard Karp, Les Valiant and SRS Varadhan.
I am blogging about the video for the insight it provides into Prof Arora's working style and working philosophy (such as how to pick problems to work on). I hope this blog post will make the video more accessible to search engines, encouraging more people to watch the full video. In this post I quote some of the answers near-verbatim, again hoping that will whet people's appetite for the full video. There is some inevitable paraphrasing, so please do not rely on this post as a transcript! (Occasionally I have also added emphasis in italics, but this reflects only my own editorial judgment.) The video also contains some interjecting remarks from the video director behind the camera, to clarify a point or to seek additional explanation. These are very useful indeed, but I have not explicitly referred to them.

In pairing Urschel and Arora in conversation, the video director appears to have deliberately chosen two people with reasonably similar academic interests but fairly contrasting personal styles. There is some initial awkwardness in the conversation, but overall the pairing works very well.

The video begins with Urschel asking Arora what were some of the most interesting recent topics he had come across in the field of neural nets and deep learning.

Arora: (I have learned about) New architectures and algorithms that enable neural nets to learn and generalize better, such as batch normalization, residual nets, ReLU (rectified linear unit) networks, dense nets and similar topics...

The conversation moves along, and then Urschel asks Arora how he came to work in the field of neural nets (since Arora had earlier worked on computational complexity and optimization problems, which are more mainstream in Computer Science). I would also note that Prof Arora, through his work on machine learning and deep networks, has actually managed to mainstream such work within Computer Science (and to some extent within Mathematics).

Arora: I did not start with a preconceived idea of what to work on… but took up interesting issues as they came up – as opposed to colleagues who made a certain area their own for one or two decades… sometimes a newcomer can look at things in a new way… as happened with me and the Euclidean Traveling Salesman Problem (TSP)… if you do that often enough, you can get lucky… the main ingredient is not to be afraid of problems… in Computer Science most of our problems are very close to the source… and rather new… sometimes even the definition of the problem is not fixed… and since there’s no end to learning, one can learn on the job (rather than have to study everything that was done before)… and most of our stuff is not very deep mathematics…
The conversation again rolls along, and then Urschel asks Arora what he had learned specifically at the Heidelberg Laureate Forum...
Arora: I’ve learnt about some new problems here at the HLF… e.g., did you know that just by observing the power consumption in a chip, in some cases you can break its cryptosystem? (When it becomes difficult), and for similar but more complicated tasks, you can throw a deep learning network at it… you track the input-output behavior (of the chip), measure the power consumption, and the deep learning net develops correlations that work well empirically (in the decrypting task...). Also, (at the HLF) I’ve had conversations with people on a personal level that I normally do not have outside my own sub-discipline.
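
As an aside, for readers who want a feel for the power-consumption idea Arora mentions, here is a minimal toy sketch of my own. It is not from the video. It uses a plain statistical correlation instead of a deep learning net, and it assumes a simulated, noise-free "chip" whose power reading equals the number of 1 bits in (input XOR secret key). That leakage model, the one-byte key, and all function names are assumptions made purely for illustration.

```python
# Toy illustration only: simulated, noise-free "power" readings, not a real chip.
# Swapped-in technique: classical correlation analysis, not a deep learning net.


def hamming_weight(value: int) -> int:
    """Number of 1 bits in value."""
    return bin(value).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_power(inputs, secret_key):
    """Assumed leakage model: power = number of 1 bits in (input XOR key)."""
    return [hamming_weight(x ^ secret_key) for x in inputs]


def recover_key(inputs, traces):
    """Try every one-byte key; keep the guess whose predicted power
    correlates best with the observed power readings."""
    best_guess, best_corr = None, -2.0
    for guess in range(256):
        predicted = [hamming_weight(x ^ guess) for x in inputs]
        corr = pearson(predicted, traces)
        if corr > best_corr:
            best_guess, best_corr = guess, corr
    return best_guess


inputs = list(range(256))                       # known inputs fed to the "chip"
traces = simulate_power(inputs, secret_key=0x3C)  # observed power readings
recovered = recover_key(inputs, traces)
print(hex(recovered))
```

The attacker here never sees the key, only the inputs and the power readings. On a real chip the readings are noisy and the leakage is far messier, which is presumably where a learned model, as Arora describes, becomes attractive.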

In concluding this blog post, I will add only that Prof Arora's own career fully exemplifies the career philosophy he outlines here...and as a final quote (although from a different interview he did with the HLF) I leave you with this:
Arora: In Computer Science, things change so fast that if you don’t change your field (every few years) you can quickly get obsolete...