• LM101-086: Ch8: How to Learn the Probability of Infinitely Many Outcomes
    Jul 21 2021

    This 86th episode of Learning Machines 101 discusses the problem of assigning probabilities to a possibly infinite set of outcomes in a space-time continuum which characterizes our physical world. Such a set is called an “environmental event”. The machine learning algorithm uses information about the frequency of environmental events to support learning. If we want to study statistical machine learning, then we must be able to discuss how to represent and compute the probability of an environmental event. It is essential that we have methods for communicating probability concepts to other researchers, methods for calculating probabilities, and methods for calculating the expectation of specific environmental events. This episode discusses the challenges of assigning probabilities to events when we allow for the case of events comprised of an infinite number of outcomes. Along the way we introduce essential concepts for representing and computing probabilities using measure theory mathematical tools such as sigma fields, and the Radon-Nikodym probability density function. Near the end we also briefly discuss the intriguing Banach-Tarski paradox and how it motivates the development of some of these special mathematical tools. Check out: www.learningmachines101.com and www.statisticalmachinelearning.com for more information!!!

    Show more Show less
    35 mins
  • LM101-085:Ch7:How to Guarantee your Batch Learning Algorithm Converges
    May 21 2021

    This 85th episode of Learning Machines 101 discusses formal convergence guarantees for a broad class of machine learning algorithms designed to minimize smooth non-convex objective functions using batch learning methods. In particular, a broad class of unsupervised, supervised, and reinforcement machine learning algorithms which iteratively update their parameter vector by adding a perturbation based upon all of the training data. This process is repeated, making a perturbation of the parameter vector based upon all of the training data until a parameter vector is generated which exhibits improved predictive performance. The magnitude of the perturbation at each learning iteration is called the “stepsize” or “learning rate” and the identity of the perturbation vector is called the “search direction”. Simple mathematical formulas are presented based upon research from the late 1960s by Philip Wolfe and G. Zoutendijk that ensure convergence of the generated sequence of parameter vectors. These formulas may be used as the basis for the design of artificially intelligent smart automatic learning rate selection algorithms. The material in this podcast is designed to provide an overview of Chapter 7 of my new book “Statistical Machine Learning” and is based upon material originally presented in Episode 68 of Learning Machines 101! Check out: www.learningmachines101.com for the show notes!!!

     

    Show more Show less
    31 mins
  • LM101-084: Ch6: How to Analyze the Behavior of Smart Dynamical Systems
    Jan 5 2021

    In this episode of Learning Machines 101, we review Chapter 6 of my book “Statistical Machine Learning” which introduces methods for analyzing the behavior of machine inference algorithms and machine learning algorithms as dynamical systems. We show that when dynamical systems can be viewed as special types of optimization algorithms, the behavior of those systems even when they are highly nonlinear and high-dimensional can be analyzed. Learn more by visiting: www.learningmachines101.com and www.statisticalmachinelearning.com .

    Show more Show less
    33 mins
  • LM101-083: Ch5: How to Use Calculus to Design Learning Machines
    Aug 29 2020

    This particular podcast covers the material from Chapter 5 of my new book “Statistical Machine Learning: A unified framework” which is now available! The book chapter shows how matrix calculus is very useful for the analysis and design of both linear and nonlinear learning machines with lots of examples. We discuss how to use the matrix chain rule for deriving deep learning descent algorithms and how it is relevant to software implementations of deep learning algorithms.  We also discuss how matrix Taylor series expansions are relevant to machine learning algorithm design and the analysis of generalization performance!!

    For additional details check out: www.learningmachines101.com and www.statisticalmachinelearning.com

     

    Show more Show less
    34 mins
  • LM101-082: Ch4: How to Analyze and Design Linear Machines
    Jul 23 2020

    The main focus of this particular episode covers the material in Chapter 4 of my new forthcoming book titled “Statistical Machine Learning: A unified framework.”  Chapter 4 is titled “Linear Algebra for Machine Learning.

    Many important and widely used machine learning algorithms may be interpreted as linear machines and this chapter shows how to use linear algebra to analyze and design such machines. In addition, these same techniques are fundamentally important for the development of techniques for the analysis and design of nonlinear machines. 

    This podcast provides a brief overview of Linear Algebra for Machine Learning for the general public as well as information
    for students and instructors regarding the contents of Chapter 4
    of Statistical Machine Learning. For more details, check out: www.statisticalmachinelearning.com

    Show more Show less
    29 mins
  • LM101-081: Ch3: How to Define Machine Learning (or at Least Try)
    Apr 9 2020

    This particular podcast covers the material in Chapter 3 of my new book “Statistical Machine Learning: A unified framework” with expected publication date May 2020. In this episode we discuss Chapter 3 of my new book which discusses how to formally define machine learning algorithms. Briefly, a learning machine is viewed as a dynamical system that is minimizing an objective function. In addition, the knowledge structure of the learning machine is interpreted as a preference relation graph which is implicitly specified by the objective function. In addition, this week we include in our book review section a new book titled “The Practioner’s Guide to Graph Data  by Denise Gosnell and Matthias Broecheler. To find out more information visit the website: www.learningmachines101.com .

    Show more Show less
    37 mins
  • LM101-080: Ch2: How to Represent Knowledge using Set Theory
    Feb 29 2020

    This particular podcast covers the material in Chapter 2 of my new book “Statistical Machine Learning: A unified framework” with expected publication date May 2020. In this episode we discuss Chapter 2 of my new book, which discusses how to represent knowledge using set theory notation. Chapter 2 is titled “Set Theory for Concept Modeling”.

    Show more Show less
    32 mins
  • LM101-079: Ch1: How to View Learning as Risk Minimization
    Dec 24 2019

    This particular podcast covers the material in Chapter 1 of my new (unpublished) book “Statistical Machine Learning: A unified framework”. In this episode we discuss Chapter 1 of my new book, which shows how supervised, unsupervised, and reinforcement learning algorithms can be viewed as special cases of a general empirical risk minimization framework. This is useful because it provides a framework for not only understanding existing algorithms but also for suggesting new algorithms for specific applications.

    Show more Show less
    26 mins