Recently, Meta released the Llama 3.2 1B and 3B Instruct fine-tuned LLMs, to mixed reviews. On the one hand, it’s ranking … More
Tag: machine learning
SVDSmoothing LLM Layers with WeightWatcher
Recently, Microsoft Research published the LASER method, “Layer-Selective Rank Reduction,” in the very popular paper The Truth is in There: … More
Evaluating LLMs with WeightWatcher Part III: The Magic of Mistral, a Story of Dragon Kings
Recently, the Mistral models have taken the LLM world by storm. The Mistral Mixture of Experts (MoE) 8x7B model outperforms other … More
Evaluating Fine-Tuned LLMs with WeightWatcher
If you are fine-tuning your own LLMs, you need a way to evaluate them. And while there are over a dozen … More
Why WeightWatcher Works
I am frequently asked, why does weightwatcher work? The weightwatcher tool uses power law fits to model the eigenvalue … More
WeightWatcher: Empirical Quality Metrics for Deep Neural Networks
We introduce weightwatcher (ww), a Python tool for computing quality metrics of trained, and … More
Heavy Tailed Self Regularization in Deep Neural Nets: 1 year of research
My talk at ICSI, the International Computer Science Institute at UC Berkeley. ICSI is a leading independent, nonprofit center for research … More
Foundations: The Partition Function.
We are going to examine the Partition function that arises in Deep Learning methods like Restricted Boltzmann Machines. We take … More
Music Recommendations and the Logistic Metric Embedding
In this post, we are going to see how to build our own music recommender, using the Logistic Metric Embedding … More
