"The pinnacle of TECHNOLOGY is INNOVATION, that of SCIENCE is DISCOVERY, and more importantly that of EDUCATION is HUMANITY."


Bio

I am a research scientist at Facebook Research. I graduated with a Ph.D. in computer science from Dartmouth College and an M.S. in computer science from the University of Illinois at Urbana-Champaign. Before coming to Dartmouth, I was a research staff member at Nanyang Technological University.

My research interests are computer vision, machine learning and computer graphics, with specific interests in video understanding, representation learning, and vision and language.

Internship opportunity: I am looking for Ph.D. student(s) to work with me on exciting video understanding problems for summer 2020. To apply, send me an email with your resume and a brief description of your research interests.


Selected Publications [full]

2019

Video Classification with Channel-Separated Convolutional Networks.
Du Tran, Heng Wang, Lorenzo Torresani, and Matt Feiszli.
International Conference on Computer Vision (ICCV), 2019.
[pre-print] [code]

SCSampler: Sampling Salient Clips from Video for Efficient Action Recognition.
Bruno Korbar, Du Tran, and Lorenzo Torresani.
International Conference on Computer Vision (ICCV), 2019.
[pre-print] [code]

DistInit: Learning Video Representations without a Single Labeled Video.
Rohit Girdhar, Du Tran, Lorenzo Torresani, and Deva Ramanan.
International Conference on Computer Vision (ICCV), 2019.
[pre-print] [code]

Learning Temporal Pose Estimation from Sparsely-Labeled Videos.
Gedas Bertasius, Christoph Feichtenhofer, Du Tran, Jianbo Shi, and Lorenzo Torresani.
Neural Information Processing Systems (NeurIPS), 2019.
[pre-print] [code]

Leveraging the Present to Anticipate the Future in Videos.
Antoine Miech, Ivan Laptev, Josef Sivic, Heng Wang, Lorenzo Torresani, and Du Tran.
IEEE Computer Vision and Pattern Recognition (CVPR) Precognition Workshop, 2019.
[paper] [code]

Large-scale Weakly-Supervised Pre-training for Video Action Recognition.
Deepti Ghadiyaram, Matt Feiszli, Du Tran, Xueting Yan, Heng Wang, and Dhruv Mahajan.
IEEE Computer Vision and Pattern Recognition (CVPR), 2019.
[paper] [code]

2018

Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization.
Bruno Korbar, Du Tran, and Lorenzo Torresani.
Neural Information Processing Systems (NeurIPS), 2018.
[paper] [code]

Scenes-Objects-Actions: A Multi-Task, Multi-Label Video Dataset.
Jamie Ray, Heng Wang, Du Tran, Yufei Wang, Matt Feiszli, Lorenzo Torresani, and Manohar Paluri.
European Conference on Computer Vision (ECCV), 2018.
[paper] [code]

A Closer Look at Spatiotemporal Convolutions for Action Recognition.
Du Tran, Heng Wang, Lorenzo Torresani, Jamie Ray, Yann LeCun, and Manohar Paluri.
IEEE Computer Vision and Pattern Recognition (CVPR), 2018.
[paper] [code]

Detect-and-Track: Efficient Pose Estimation in Videos.
Rohit Girdhar, Georgia Gkioxari, Lorenzo Torresani, Manohar Paluri, and Du Tran.
IEEE Computer Vision and Pattern Recognition (CVPR), 2018.
[paper] [code]

2017

Simple, Efficient and Effective Keypoint Tracking.
Rohit Girdhar, Georgia Gkioxari, Lorenzo Torresani, Deva Ramanan, Manohar Paluri, and Du Tran.
International Conference on Computer Vision (ICCV) PoseTrack Workshop, 2017.
[paper] [code]

2016

Deep End2End Voxel2Voxel Prediction.
Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri.
IEEE Computer Vision and Pattern Recognition (CVPR) DeepVision Workshop, 2016.
[paper] [code]

EXMOVES: Mid-level Features for Efficient Action Recognition and Video Analysis.
Du Tran and Lorenzo Torresani.
International Journal of Computer Vision (IJCV), 2016.
[paper] [code]

2015

Learning Spatiotemporal Features with 3D Convolutional Networks.
Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri.
International Conference on Computer Vision (ICCV), 2015.
[paper] [code]

2014

EXMOVES: Classifier-based Features for Scalable Action Recognition.
Du Tran and Lorenzo Torresani.
International Conference on Learning Representations (ICLR), 2014.
[paper] [code]

Video Event Detection: from Subvolume Localization to Spatio-Temporal Path Search.
Du Tran, Junsong Yuan, and David Forsyth.
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2014.
[paper] [code]

2012 and before

Max-Margin Structured Output Regression for Spatio-Temporal Action Localization.
Du Tran and Junsong Yuan.
Neural Information Processing Systems (NIPS), 2012.
[paper]

Optimal Spatio-Temporal Path Discovery for Video Event Detection.
Du Tran and Junsong Yuan.
IEEE Computer Vision and Pattern Recognition (CVPR), 2011.
[paper] [code] [data]

Human Activity Recognition with Metric Learning.
Du Tran and Alexander Sorokin.
European Conference on Computer Vision (ECCV), 2008.
[paper] [code]


Misc

I serve as a reviewer for TPAMI, IJCV, CVPR, ICCV, ECCV, NeurIPS, and ICML.

I maintain a list of CV and ML conference acceptance rates here.