calculated | content

Thoughts on Data Science, Machine Learning, and AI

  • About
  • Contact
  • Home

Power Laws in Deep Learning

Why Does Deep Learning Work? If we could get a better handle on this, we could solve some very…

Self-Regularization in Deep Neural Networks: a preview

Why Deep Learning Works: Self-Regularization in DNNs. An early talk describing details in this paper: Implicit Self-Regularization in Deep…

Rethinking–or Remembering–Generalization in Neural Networks

I just got back from ICLR 2019, where I presented 2 posters (and Michael gave a great talk!) at the Theoretical…

Capsule Networks: A video presentation

Please enjoy my video presentation on Geoff Hinton’s Capsule Networks. What they are, why they are important, and how they…

Free Energies and Variational Inference

My Labor Day Holiday Blog: for those on email, I will add updates, answer questions, and make corrections over the…

What is Free Energy: Hinton, Helmholtz, and Legendre

Hinton introduced Free Energies in his 1994 paper, Autoencoders, minimum description length, and Helmholtz Free Energy. This paper, along with…

Interview with a Data Scientist

It’s always fun to be interviewed. Check out my recent chat with Max Mautner, the Accidental Engineer: http://theaccidentalengineer.com/charles-martin-principal-consultant-calculation-consulting/

Normalization in Deep Learning

A few days ago (Jun 2017), a 100-page paper on Self-Normalizing Networks appeared. An amazing piece of theoretical work, it…

Why Deep Learning Works 3: BackProp minimizes the Free Energy?

Deep Learning is presented as Energy-Based Learning. Indeed, we train a neural network by running BackProp, thereby minimizing the model error, which is…

Foundations: Mean Field Boltzmann Machines 1987

A friend from grad school pointed out a great foundational paper on Boltzmann Machines. It is a 1987 paper from…

Posts navigation

Older posts
Newer posts

Recent Posts

  • WW-PGD: Projected Gradient Descent optimizer
  • WeightWatcher, HTSR theory, and the Renormalization Group
  • Fine-Tuned Llama3.2: Bad Instructions?
  • What’s instructive about Instruct Fine-Tuning: a weightwatcher analysis
  • Describing Double Descent with WeightWatcher

Archives

  • December 2025
  • December 2024
  • October 2024
  • March 2024
  • February 2024
  • January 2024
  • March 2023
  • February 2023
  • July 2022
  • June 2022
  • October 2021
  • August 2021
  • July 2021
  • April 2021
  • November 2020
  • September 2020
  • February 2020
  • December 2019
  • April 2019
  • December 2018
  • November 2018
  • October 2018
  • September 2018
  • June 2018
  • April 2018
  • December 2017
  • September 2017
  • July 2017
  • June 2017
  • February 2017
  • January 2017
  • October 2016
  • September 2016
  • June 2016
  • February 2016
  • December 2015
  • April 2015
  • March 2015
  • January 2015
  • November 2014
  • September 2014
  • August 2014
  • November 2013
  • October 2013
  • August 2013
  • May 2013
  • April 2013
  • December 2012
  • November 2012
  • October 2012
  • September 2012
  • April 2012
  • February 2012
Blog at WordPress.com.