VCIP 2020 will feature excellent keynote speakers.
3D point cloud analysis and processing find numerous applications in computer-aided design, 3D printing, autonomous driving, etc. Most state-of-the-art point cloud processing methods are based on convolutional neural networks (CNNs). Although they outperform traditional methods in accuracy, they demand heavy supervision and higher training complexity. Moreover, they lack mathematical transparency. In this talk, I will present three interpretable and effective machine learning methods for 3D point cloud registration, classification and segmentation, respectively. First, an unsupervised registration method that extracts salient points for matching is presented. Second, an unambiguous way to order the points of a point cloud sequentially is developed; their spatial coordinates can then be treated as geometric attributes of a 1D data array, which facilitates the classification task. Third, for the segmentation task, we show how to leverage prior knowledge about point clouds to derive an intuitive and effective segmentation method. Extensive experiments are conducted to demonstrate the performance of the three new methods. I will also provide performance benchmarking between these interpretable methods and deep learning methods.
Dr. C.-C. Jay Kuo received his Ph.D. degree from the Massachusetts Institute of Technology in 1987. He is now with the University of Southern California (USC) as Director of the Media Communications Laboratory and Distinguished Professor of Electrical Engineering and Computer Science. His research interests are in the areas of visual computing and communication, multimedia computing, and data science and engineering. He is a Fellow of AAAS, IEEE and SPIE. He has received numerous awards for his outstanding research contributions, including the 2010 Electronic Imaging Scientist of the Year Award, the 2010-11 Fulbright-Nokia Distinguished Chair in Information and Communications Technologies, the 2011 Pan Wen-Yuan Outstanding Research Award, the 2019 IEEE Computer Society Edward J. McCluskey Technical Achievement Award, the 2019 IEEE Signal Processing Society Claude Shannon-Harry Nyquist Technical Achievement Award, and the 2020 IEEE TCMC Impact Award. Dr. Kuo has guided 155 students to their PhD degrees and supervised 30 postdoctoral research fellows. His educational achievements have earned a wide array of recognitions, such as the 2016 IEEE Computer Society Taylor L. Booth Education Award, the 2016 IEEE Circuits and Systems Society John Choma Education Award, the 2016 IS&T Raymond C. Bowman Award, the 2017 IEEE Leon K. Kirchmayer Graduate Teaching Award, the 2017 IEEE Signal Processing Society Carl Friedrich Gauss Education Award, and the 2018 USC Provost’s Mentoring Award.
Seven years after the development of the first version of the High Efficiency Video Coding (HEVC) standard, the major international organizations in the world of video coding have completed the next major generation, called Versatile Video Coding (VVC). The VVC standard, formally designated as ITU-T H.266 and ISO/IEC 23090-3, promises a major improvement in video compression relative to its predecessors. It can offer roughly double the coding efficiency – i.e., it can be used to encode video content to the same level of visual quality while using about 50% fewer bits than HEVC and thus using about 75% fewer bits than H.264/AVC, today’s most widely used format. Thus it can ease the burden on worldwide networks, where video now comprises about 80% of all internet traffic. Moreover, VVC has enhanced features in its syntax for supporting an unprecedented breadth of applications, giving meaning to the word “versatility” used in its title. Completed in July 2020, VVC has begun to emerge in practical implementations and is undergoing testing to characterize its subjective performance.
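The cited savings compound multiplicatively, which can be checked with a quick back-of-the-envelope calculation (idealized figures taken from the abstract, not measured results):

```python
# Idealized illustration of the compound savings cited above: if VVC needs
# roughly half the bits of HEVC, and HEVC roughly half the bits of H.264/AVC,
# then VVC needs about a quarter of AVC's bits, i.e. ~75% fewer.
avc_bits = 100.0             # arbitrary baseline bitrate for H.264/AVC
hevc_bits = avc_bits * 0.5   # HEVC roughly halves the AVC bitrate
vvc_bits = hevc_bits * 0.5   # VVC roughly halves the HEVC bitrate
savings_vs_avc = 1 - vvc_bits / avc_bits
print(savings_vs_avc)  # -> 0.75
```

In practice the actual savings depend on content, resolution, and the quality metric used, so these round numbers are only approximate.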
This talk will review the design of the VVC standard and its development history and provide recent news about its transition from a specification document to real-world products.
Gary J. Sullivan has been a chairman and co-chairman of various video and image coding standardization activities in ITU-T VCEG, ISO/IEC MPEG, ISO/IEC JPEG, and in their joint collaborative teams since 1996. He is best known for leading the development of the Advanced Video Coding (AVC) standard (ITU-T H.264 | ISO/IEC 14496-10), the High Efficiency Video Coding (HEVC) standard (ITU-T H.265 | ISO/IEC 23008-2), the Versatile Video Coding (VVC) standard (ITU-T H.266 | ISO/IEC 23090-3), and the various extensions of those standards. He has been the rapporteur of ITU-T VCEG since 1997 and has co-chaired the Joint Video Experts Team (JVET) for developing the VVC standard since October 2015. In June 2020 he was recommended as chair-elect of ISO/IEC JTC1 Subcommittee 29, the organization in charge of MPEG and JPEG.
He is a Video and Image Technology Architect at Microsoft Research. At Microsoft, he has also been the originator and lead designer of the DirectX Video Acceleration (DXVA) video decoding feature of the Microsoft Windows operating system.
The team efforts that he has led have been recognized by three Emmy Awards. He has received the SMPTE Digital Processing Medal, the IEEE Masaru Ibuka Consumer Electronics Award, the IEEE Consumer Electronics Engineering Excellence Award, two IEEE Trans. CSVT Best Paper awards, the INCITS Technical Excellence Award, the IMTC Leadership Award, and the University of Louisville J. B. Speed Professional Award in Engineering. He is a Fellow of the IEEE and SPIE.
Normalization methods are important for the effective and efficient optimization of deep neural networks (DNNs). Statistics such as the mean and variance can be used to normalize the network activations or weights to make the training process more stable. Among activation normalization techniques, batch normalization (BN) is the most popular. However, BN performs poorly when the training batch size is small. We found that the formulation of BN in the inference stage is problematic, and consequently presented a corrected one. Without any change to the training stage, the corrected BN significantly improves inference performance when training with small batch sizes.
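To make the training/inference mismatch concrete, the sketch below (plain NumPy; the function names and momentum value are hypothetical, and it shows only standard BN, not the speaker's corrected formulation) contrasts the batch statistics used during training with the running estimates used at inference:

```python
import numpy as np

def bn_train_step(x, running_mean, running_var, momentum=0.9, eps=1e-5):
    # Standard BN training: normalize with the current batch statistics,
    # then update the running (population) estimates used later at inference.
    mu = x.mean(axis=0)                     # per-feature batch mean
    var = x.var(axis=0)                     # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)
    running_mean = momentum * running_mean + (1 - momentum) * mu
    running_var = momentum * running_var + (1 - momentum) * var
    return x_hat, running_mean, running_var

def bn_inference(x, running_mean, running_var, eps=1e-5):
    # Standard BN inference: normalize with the running estimates instead of
    # batch statistics. With small training batches these estimates are noisy,
    # which is the regime where a corrected formulation is claimed to help.
    return (x - running_mean) / np.sqrt(running_var + eps)
```

With large batches the running estimates closely track the true population statistics; with very small batches each update is dominated by sampling noise, so the inference-time normalization drifts from what the network saw during training.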
Another line of popular normalization methods operates on weights, such as weight normalization (WN) and weight standardization (WS). We proposed a very simple yet effective DNN optimization technique, namely gradient centralization (GC), which operates directly on the gradients of the weights. GC simply centralizes the gradient vectors to have zero mean. It can be easily embedded into current gradient-based optimization algorithms with just one line of code. GC exhibits several desirable properties, such as accelerating the training process, improving generalization performance, and compatibility with fine-tuning pre-trained models.
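The "one line" of GC can be sketched as follows (plain NumPy; the function names are hypothetical, and it is assumed here that each row of a 2-D gradient corresponds to one weight vector being centralized):

```python
import numpy as np

def centralize_gradient(grad):
    # Gradient centralization: shift each weight vector's gradient to have
    # zero mean. For a 2-D gradient of shape (out_features, in_features),
    # the mean is removed along the input dimension of every row.
    return grad - grad.mean(axis=1, keepdims=True)

def sgd_step_with_gc(weight, grad, lr=0.01):
    # Hypothetical use inside a plain SGD update: GC is just one extra
    # operation applied to the gradient before the usual update rule.
    return weight - lr * centralize_gradient(grad)
```

The same centralization could be dropped into other gradient-based optimizers (momentum SGD, Adam, etc.) by transforming the raw gradient before the optimizer consumes it.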
Lei Zhang (M’04, SM’14, F’18) joined the Department of Computing, The Hong Kong Polytechnic University, as an Assistant Professor in 2006. Since July 2017, he has been a Chair Professor in the same department. Since 2018, he has also been with DAMO Academy, the Alibaba Group, as a Distinguished Engineer. His research interests include computer vision, image and video analysis, pattern recognition, and biometrics. Prof. Zhang has published more than 200 papers in these areas. As of 2020, his publications have been cited more than 54,000 times in the literature. Prof. Zhang is a Senior Associate Editor of IEEE Trans. on Image Processing, and is/was an Associate Editor of IEEE Trans. on Pattern Analysis and Machine Intelligence, SIAM Journal on Imaging Sciences, IEEE Trans. on CSVT, and Image and Vision Computing. He was named a Clarivate Analytics Highly Cited Researcher each year from 2015 to 2019. More information can be found on his homepage: http://www4.comp.polyu.edu.hk/~cslzhang/.