Academic News

Announcement: Lecture by Associate Professor Jiang Yang (杨将) of Southern University of Science and Technology

Published: October 16, 2025, 20:42

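At the invitation of the School of Mathematics and Computing Science, the Guangxi Center for Applied Mathematics, and the Guangxi Colleges and Universities Key Laboratory of Data Analysis and Computation, Associate Professor Jiang Yang of Southern University of Science and Technology, China, will give a lecture on campus on October 18, 2025. All faculty and students are welcome to attend. The details are as follows: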

Title: The Staircase Phenomenon and its applications in initialization of Neural Network Training Dynamics

Speaker: Associate Professor Jiang Yang

Time: Saturday, October 18, 2025, 10:00 a.m.

Venue: Lecture Hall 310, Building 4, Huajiang Huigu (花江慧谷)

Abstract: Understanding the training dynamics of deep neural networks (DNNs), particularly how they evolve low-dimensional features from high-dimensional data, remains a central challenge in deep learning theory. In this work, we introduce the concept of $\epsilon$-rank, a novel metric quantifying the effective feature of neuron functions in the terminal hidden layer. Through extensive experiments across diverse tasks, we observe a universal staircase phenomenon: during training with standard stochastic gradient descent methods, the decline of the loss function is accompanied by an increase in the $\epsilon$-rank and exhibits a staircase pattern. Theoretically, we rigorously prove a negative correlation between the loss lower bound and the $\epsilon$-rank, demonstrating that a high $\epsilon$-rank is essential for significant loss reduction. Moreover, numerical evidence shows that within the same deep neural network, the $\epsilon$-rank of a subsequent hidden layer is higher than that of the previous hidden layer. Based on these observations, to eliminate the staircase phenomenon, we propose a novel pre-training strategy on the initial hidden layer that elevates the $\epsilon$-rank of the terminal hidden layer. Numerical experiments validate its effectiveness in reducing training time and improving accuracy across various tasks. The newly introduced $\epsilon$-rank is therefore a computable quantity that serves as an intrinsic, effective metric for deep neural networks, providing a novel perspective for understanding the training dynamics of neural networks and offering a theoretical foundation for designing efficient training strategies in practical applications.

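About the speaker: Jiang Yang is a tenured associate professor in the Department of Mathematics at Southern University of Science and Technology. He received his bachelor's degree from Zhejiang University in 2010 and his Ph.D. from Hong Kong Baptist University in 2014. From 2014 to 2017 he held postdoctoral positions at Pennsylvania State University and then Columbia University, and he has been at Southern University of Science and Technology since 2017. He works in computational mathematics; his main research interests include the modeling, numerical methods, and applications of phase-field and nonlocal models, as well as the design and theory of deep learning algorithms. His work has appeared in journals including SIAM Review, SIAM Journal on Numerical Analysis, Mathematics of Computation, M3AS, SIAM Journal on Scientific Computing, and Journal of Computational Physics. His honors include the second prize of the Student Paper Prize of the East Asia Section of the Society for Industrial and Applied Mathematics (2014), the Distinguished Paper Award of the International Congress of Chinese Mathematicians (2024), the Frontiers of Science Award of the International Congress of Basic Science (2025), and inclusion in the Stanford-Elsevier list of the world's top 2% scientists (2025 single-year impact list). He has been selected for the youth program of the National High-Level Talent Program and for the Shenzhen Outstanding Young Scholars program, and he has served as principal investigator of one Tianyuan Mathematics interdisciplinary key special project, two General Program projects of the National Natural Science Foundation of China, and one Natural Science Foundation of Guangdong Province project.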