BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//132.216.98.100//NONSGML kigkonsult.se iCalcreator 2.20.4//
BEGIN:VEVENT
UID:20250918T093416EDT-7055oO33xj@132.216.98.100
DTSTAMP:20250918T133416Z
DESCRIPTION:Title: Tensor Programs and µP\n\nAbstract: We will discuss the limiting behaviour of large neural networks as the layer width goes to infinity. One of the factors that most affects limiting behaviour is the specific parametrization used\; apart from training stability\, this will determine whether or not the neural network can learn features. We show a technique for mechanically deriving the 'best' parametrization\, known as µP. As an additional empirical benefit\, we demonstrate that under µP\, hyperparameters transfer directly across different sizes of models\, allowing for running all experiments at a small scale.\n
DTSTART:20230927T170000Z
DTEND:20230927T180000Z
LOCATION:Room 1214\, Burnside Hall\, CA\, QC\, Montreal\, H3A 0B9\, 805 rue Sherbrooke Ouest
SUMMARY:Andrew Mackenzie (McGill University)
URL:/mathstat/channels/event/andrew-mackenzie-mcgill-university-351320
END:VEVENT
END:VCALENDAR