Smart machines based upon the principles of artificial intelligence and machine learning are now prevalent in our everyday life. For example, artificially intelligent systems recognize our voices, sort our pictures, make purchasing suggestions, and can automatically fly planes and drive cars. In this podcast series, we examine such questions such as: How do these devices work? Where do they come from? And how can we make them even smarter and more human-like? These are the questions that will be addressed in this podcast series!
In this 77th episode of www.learningmachines101.com , we explain the proper semantic interpretation of the Bayesian Information Criterion (BIC) and emphasize how this semantic interpretation is fundamentally different from AIC (Akaike Information Criterion) model selection methods. Briefly, BIC is used to estimate the probability of the training data given the probability model, while AIC is used to estimate out-of-sample prediction error. The probability of the training data given the model is called the “marginal likelihood”. Using the marginal likelihood, one can calculate the probability of a model given the training data and then use this analysis to support selecting the most probable model, selecting a model that minimizes expected risk, and support Bayesian model averaging. The assumptions which are required for BIC to be a valid approximation for the probability of the training data given the probability model are also discussed.
The precise semantic interpretation of the Akaike Information Criterion (AIC) and Generalized Akaike Information Criterion (GAIC) for selecting the best model are provided, explicit assumptions are provided for the AIC and GAIC to be valid, and explicit formulas are provided for the AIC and GAIC so they can be used in practice. AIC and GAIC provide a way of estimating the average prediction error of your learning machine on test data without using test data or cross-validation methods.
In this episode, we explore the question of what can computers do as well as what computers can’t do using the Turing Machine argument. Specifically, we discuss the computational limits of computers and raise the question of whether such limits pertain to biological brains and other non-standard computing machines.
The challenges of representing knowledge using rules are discussed. Specifically, these challenges include: issues of feature representation, having an adequate number of rules, obtaining rules that are not inconsistent, and having rules that handle special cases and situations. To learn more, visit:
This is a remix of the original second episode Learning Machines 101 which describes in a little more detail how the computer program that Arthur Samuel developed in 1959 learned to play checkers by itself without human intervention using a mixture of classical artificial intelligence search methods and artificial neural network learning algorithms. The podcast ends with a book review of Professor Nilsson’s book: “The Quest for Artificial Intelligence: A History of Ideas and Achievements”.
This podcast is basically a remix of the first and second episodes of Learning Machines 101 and is intended to serve as the new introduction to the Learning Machines 101 podcast series. The book "Computation as Done by Brains and Machines" by Professor James A. Anderson is briefly reviewed. For more information, please visit: www.learningmachines101.com
This episode of Learning Machines 101 explains how to use first-order logic and Markov logic nets to represent common sense knowledge in machine learning algorithms links to free software for implementing Markov logic nets and a free database of common-sense knowledge is provided.
This 70th episode of Learning Machines 101 we discuss how to identify facial emotion expressions in images using an advanced clustering technique called Stochastic Neighborhood Embedding for: improving online communications, identifying terrorists, improving lie detector tests, improving athletic performance, and designing smart advertising which looks at a customer’s face to determine if they are bored or interested. The machine learning text “Pattern Recognition and Machine Learning” is reviewed.
This 69th episode of Learning Machines 101 provides a short overview of the 2017 Neural Information Processing Systems conference with a focus on the development of methods for teaching learning machines rather than simply training them on examples. In addition, a book review of the book “Deep Learning” is provided.
Simple mathematical formulas are presented that ensure convergence of a generated sequence of parameter vectors which are updated using an iterative algorithm consisting of adding a stepsize number multiplied by a search direction vector to the current parameter values and repeating this process. These formulas may be used as the basis for the design of artificially intelligent smart automatic learning rate selection algorithms. Please visit: www.learningmachines101.com
In this episode we discuss how to learn to solve constraint satisfaction inference problems. The goal of the inference process is to infer the most probable values for unobservable variables. These constraints, however, can be learned from experience. Specifically, the important machine learning method for handling unobservable components of the data using Expectation Maximization is introduced. Check it out at:
In this episode of Learning Machines 101 (www.learningmachines101.com) we discuss how to solve constraint satisfaction inference problems where knowledge is represented as a large unordered collection of complicated probabilistic constraints among a collection of variables. The goal of the inference process is to infer the most probable values of the unobservable variables given the observable variables. Specifically, Monte Carlo Markov Chain ( MCMC ) methods are discussed.
In this episode rerun we introduce the concept of gradient descent which is the fundamental principle underlying learning in the majority of deep learning and neural network learning algorithms. Check out the website: www.learningmachines101.com to obtain a transcript of this episode!
In this rerun of episode 24 we explore the concept of evolutionary learning machines. That is, learning machines that reproduce themselves in the hopes of evolving into more intelligent and smarter learning machines. This leads us to the topic of stochastic model search and evaluation. Check out the blog with additional technical references at: www.learningmachines101.com
This 63rd episode of Learning Machines 101 discusses how to build reinforcement learning machines which become smarter with experience but do not use this acquired knowledge to modify their actions and behaviors. This episode explains how to build reinforcement learning machines whose behavior evolves as the learning machines become increasingly smarter. The essential idea for the construction of such reinforcement learning machines is based upon first developing a supervised learning machine. The supervised learning machine then “guesses” the desired response and updates its parameters using its guess for the desired response! Although the reasoning seems circular, this approach in fact is a variation of the important widely used machine learning method of Expectation-Maximization. Some applications to learning to play video games, control walking robots, and developing optimal trading strategies for the stock market are briefly mentioned as well. Check us out at: www.learningmachines101.com
This 62nd episode of Learning Machines 101 (www.learningmachines101.com) discusses how to design reinforcement learning machines using your knowledge of how to build supervised learning machines! Specifically, we focus on Value Function Reinforcement Learning Machines which estimate the unobservable total penalty associated with an episode when only the beginning of the episode is observable. This estimated Value Function can then be used by the learning machine to select a particular action in a given situation to minimize the total future penalties that will be received. Applications include: building your own robot, building your own automatic aircraft lander, building your own automated stock market trading system, and building your own self-driving car!!
This is the third of a short subsequence of podcasts providing a summary of events associated with Dr. Golden’s recent visit to the 2015 Neural Information Processing Systems Conference. This is one of the top conferences in the field of Machine Learning. This episode reviews and discusses topics associated with the Introduction to Reinforcement Learning with Function Approximation Tutorial presented by Professor Richard Sutton on the first day of the conference. This episode is a RERUN of an episode originally presented in January 2016 and lays the groundwork for future episodes on the topic of reinforcement learning! Check out: www.learningmachines101.com for more info!!
This 60th episode of Learning Machines 101 discusses how one can use novelty detection or anomaly detection machine learning algorithms to monitor the performance of other machine learning algorithms deployed in real world environments. The episode is based upon a review of a talk by Chief Data Scientist Ira Cohen of Anodot presented at the 2016 Berlin Buzzwords Data Science Conference. Check out: www.learningmachines101.com to hear the podcast or read a transcription of the podcast!
I discuss the concept of a “neural network” by providing some examples of recent successes in neural network machine learning algorithms and providing a historical perspective on the evolution of the neural network concept from its biological origins. For more details visit us at: www.learningmachines101.com
In this 58th episode of Learning Machines 101, I’ll be discussing an important new scientific breakthrough published just last week for the first time in the journal Econometrics in the special issue on model misspecification titled “Generalized Information Matrix Tests for Detecting Model Misspecification”. The article provides a unified theoretical framework for the development of a wide range of methods for determining if a learning machine is capable of learning its statistical environment. The article is co-authored by myself, Steven Henley, Halbert White, and Michael Kashner. It is an open-access article so the complete article can be downloaded for free! The download link can be found in the show notes of this episode at: www.learningmachines101.com . In 30 years everyone will be using these methods so you might as well start using them now!
In this 57th episode, we explain how to use unsupervised machine learning algorithms to catch internet criminals who try to steal your money electronically! Check it out at: www.learningmachines101.com
In this NEW episode we discuss Latent Semantic Indexing type machine learning algorithms which have a PROBABILISTIC interpretation. We explain why such a probabilistic interpretation is important and discuss how such algorithms can be used in the design of document retrieval systems, search engines, and recommender systems. Check us out at: www.learningmachines101.com and follow us on twitter at: @lm101talk
In this rerun of Episode 10, we discuss fundamental principles of learning in statistical environments including the design of learning machines that can use prior knowledge to facilitate and guide the learning of statistical regularities. In particular, the episode introduces fundamental machine learning concepts such as: probability models, model misspecification, maximum likelihood estimation, and MAP estimation. Check us out at: www.learningmachines101.com
Welcome to the 54th Episode of Learning Machines 101 titled "How to Build a Search Engine, Automatically Grade Essays, and Identify Synonyms using Latent Semantic Analysis" (rerun of Episode 40). The principles in this episode are also applicable to the problem of "Market Basket Analysis" and the design of Recommender Systems. Check it out at: www.learningmachines101.com and follow us on twitter: @lm101talk
In this 53rd episode of Learning Machines 101, we introduce the concept of a Swarm Intelligence with respect to Particle Swarm Optimization Algorithms. The essential idea of “Swarm Intelligence” is that you have a group of individual entities which behave in a coordinated manner yet there is no master control center providing directions to all of the individuals in the group. The global group behavior is an “emergent property” of local interactions among individuals in the group! We will analyze the concept of swarm intelligence as a Markov Random Field, discuss how it can be harnessed to enhance the performance of machine learning algorithms, and comment upon relevant mathematics for analyzing and designing “swarm intelligences” so they behave in an appropriate manner by viewing the Swarm as a nonlinear optimization algorithm. For more information check out: www.learningmachines101.com and also check us out on twitter (@lm101talk).
Today, we discuss a simple yet powerful idea which began popular in the machine learning literature in the 1990s which is called “The Kernel Trick”. The basic idea of the “Kernel Trick” is that you specify similarity relationships among input patterns rather than a recoding transformation to solve a nonlinear problem with a linear learning machine. It's a great magic trick...check it out at: www.learningmachines101.com where you can obtain transcripts of this episode and download free machine learning software! Also check out the "Statistical Machine Learning Forum" on Linked In and follow us on Twitter using the twitter handle: @lm101talk
This particular podcast is a RERUN of Episode 20 and describes step by step how to download free software which can be used to make predictions using a feedforward artificial neural network whose hidden units are radial basis functions. This is essentially a nonlinear regression modeling problem. We show the performance of this nonlinear learning machine is substantially better on test data set than the linear learning machine software presented in Episode 13. Basically performance for the linear learning machine was about 13% because the data set was specifically designed to be unlearnable by a linear learning machine, while the performance for the nonlinear machine learning software in this episode is about 70%. Again, I'm a little disappointed that only a few people have downloaded the software and tried things out. You can download windows executable, mac executable, or the MATLAB source code. It's important to actually experiment with real machine learning software if you want to learn about machine learning! Check out: www.learningmachines101.com to obtain transcripts of this podcast and download free machine learning software! Or tweet us at: @lm101talk
In this episode we will explain how to download and use free
machine learning software from the website: www.learningmachines101.com.
This podcast is concerned with the very practical issues
associated with downloading and installing machine learning
software on your computer. If you follow these instructions, by the
end of this episode you will have installed one of the simplest
(yet most widely used) machine learning algorithms on your
computer. You can then use the software to make virtually any kind
of prediction you like. Also follow us on
twitter at: lm101talk
In this episode we continue the discussion of learning when the actions of the learning machine can alter the characteristics of the learning machine’s statistical environment. We describe how to download free lunar lander software so you can experiment with an autopilot for a lunar lander module that learns from its experiences and describe the results of some simulation studies. To learn more, visit:
to download the free lunar lander software which illustrates principles of temporal reinforcement learning and nonlinear control theory. You will also have the opportunity to download free software which illustrates how a simple deep learning neural network with one layer of radial basis functions works and a simple linear regression model learning machine. Check it out!!!
In this episode we consider the problem of learning when the actions of the learning machine can alter the characteristics of the learning machine’s statistical environment. We illustrate the solution to this problem by designing an autopilot for a lunar lander module that learns from its experiences. For more information, check out:
and visit us a twitter: @lm101talk #machinelearning #statistics
We explain how to estimate the parameters of such machines to classify a pattern vector as a member of one of two categories as well as identify special pattern vectors called “support vectors” which are important for characterizing the Support Vector Machine decision boundary. The relationship of Support Vector Machine parameter estimation and logistic regression parameter estimation is also discussed.For more information..check us out at: www.learningmachines101.com
also check us out on twitter at: lm101talk
In this episode, we briefly review Item Response Theory and Bayesian Network Theory methods for the assessment and optimization of student learning and then describe a poster presented on the first day of the Neural Information Processing Systems conference in December 2015 in Montreal which describes a Recurrent Neural Network approach for the assessment and optimization of student learning called “Deep Knowledge Tracing”. For more details check out:
www.learningmachines101.com and follow us on Twitter at: @lm101talk
In this episode we discuss just one out of the 102 different posters which was presented on the first night of the 2015 Neural Information Processing Systems Conference. This presentation describes a system which can answer simple questions about images. Check out: www.learningmachines101.com for additional details!!
This is the third of a short subsequence of podcasts providing a summary of events associated with Dr. Golden’s recent visit to the 2015 Neural Information Processing Systems Conference. This is one of the top conferences in the field of Machine Learning. This episode reviews and discusses topics associated with the Introduction to Reinforcement Learning with Function Approximation Tutorial presented by Professor Richard Sutton on the first day of the conference. Check out: www.learningmachines101.com to learn more!! Also follow us at: "lm101talk" on twitter!
Welcome to the 43rd Episode of Learning Machines 101!We are currently presenting a subsequence of episodes covering the events of the recent Neural Information Processing Systems Conference. However, this weekwill digress with a rerun of Episode 22 which nicely complements our previous discussion of the Monte Carlo Markov Chain Algorithm Tutorial. Specifically, today wediscuss the problem of approaches for learning or equivalently parameter estimation in Monte Carlo Markov Chain algorithms. The topics covered in this episode include: What is the pseudolikelihood method and what are its advantages and disadvantages?What is Monte Carlo Expectation Maximization? And...as a bonus prize...a mathematical theory of "dreaming"!!! The current plan is to returnto coverage of the Neural Information Processing Systems Conference in 2 weeks on January 25!! Check out: www.learningmachines101.com for more details!
This is the second of a short subsequence of podcasts providing a summary of events associated with Dr. Golden’s recent visit to the 2015 Neural Information Processing Systems Conference. This is one of the top conferences in the field of Machine Learning. This episode reviews and discusses topics associated with the Monte Carlo Markov Chain (MCMC) Inference Methods Tutorial held on the first day of the conference. Check out: www.learningmachines101.com to listen or download this podcast episode or download the transcripts! Also visit us at LINKEDIN or TWITTER. The twitter handle is: LM101TALK
This is the first of a short subsequence of podcasts which provides a summary of events associated with Dr. Golden’s recent visit to the 2015 Neural Information Processing Systems Conference. This is one of the top conferences in the field of Machine Learning. This episode introduces the Neural Information Processing Systems Conference and reviews the content of the Morning Deep Learning Tutorial which took place on the first day of the conference. Check out: www.learningmachines101.comfor additional supplementary hyperlinks to the conference and conference papers!!
In this episode we introduce a very powerful approach for computing semantic similarity between documents. Here, the terminology “document” could refer to a web-page, a word document, a paragraph of text, an essay, a sentence, or even just a single word. Two semantically similar documents, therefore, will discuss many of the same topics while two semantically different documents will not have many topics in common. Machine learning methods are described which can take as input large collections of documents and use those documents to automatically learn semantic similarity relations. This approach is called Latent Semantic Indexing (LSI) or Latent Semantic Analysis (LSA). Visit us at: www.learningmachines101.com to learn more!
In this episode we discuss how to solve constraint satisfaction inference problems where knowledge is represented as a large unordered collection of complicated probabilistic constraints among a collection of variables. The goal of the inference process is to infer the most probable values of the unobservable variables given the observable variables. Concepts of Markov Random Fields and Monte Carlo Markov Chain methods are discussed. For additional details and technical notes, please visit the website: www.learningmachines101.com
Also feel free to visit us at twitter: @lm101talk
In this episode, we examine the problem of developing an advanced artificially intelligent technology which is capable of tracking knowledge growth in students in real-time, representing the knowledge state of a student a skill profile, and automatically defining the concept of a skill without human intervention! The approach can be viewed as a sophisticated state-of-the-art extension of the Item Response Theory approach to Computerized Adaptive Testing Educational Technology described in Episode 37. Both tutorial notes and advanced implementational notes can be found in the show notes at: www.learningmachines101.com.
In this episode, we discuss the problem of how to build a smart computerized adaptive testing machine using Item Response Theory (IRT). Suppose that you are teaching a student a particular target set of knowledge. Examples of such situations obviously occur in nursery school, elementary school, junior high school, high school, and college. However, such situations also occur in industry when top professionals in a particular field attend an advanced training seminar. All of these situations would benefit from a smart adaptive assessment machine which attempts to estimate a student’s knowledge in real-time. Such a machine could then use that information to optimize the choice and order of questions to be presented to the student in order to develop a customized exam for efficiently assessing the student’s knowledge level and possibly guiding instructional strategies. Both tutorial notes and advanced implementational notes can be found in the show notes at: www.learningmachines101.com .
In this episode, we discuss the problem of predicting the future from not only recent events but also from the distant past using Recurrent Neural Networks (RNNs). A example RNN is described which learns to label images with simple sentences. A learning machine capable of generating even simple descriptions of images such as these could be used to help the blind interpret images, provide assistance to children and adults in language acquisition, support internet search of content in images, and enhance search engine optimization websites containing unlabeled images. Both tutorial notes and advanced implementational notes for RNNs can be found in the show notes at: www.learningmachines101.com .
In this episode, we address the important questions of “What is a neural network?” and “What is a hot dog?” by discussing human brains, neural networks that learn to play Atari video games, and rat brain neural networks. Check out: www.learningmachines101.com for videos of a neural network that learns to play ATARI video games and transcripts of this podcast!!! Also follow us on twitter at: @lm101talk
See you soon!!
Welcome to the 34th podcast in the podcast series Learning Machines 101 titled
"How to Use Nonlinear Machine Learning Software to Make Predictions".
This particular podcast is a RERUN of Episode 20 and describes step by step how to download free software which can be used to make predictions using a feedforward artificial neural network whose hidden units are radial basis functions. This is essentially a nonlinear regression modeling problem. Check out: www.learningmachines101.comand follow us on twitter: @lm101talk
In this episode will explain how to download and use free machine learning software which can be downloaded from the website: www.learningmachines101.com. The software can be used to make predictions using your own data sets. Although we will continue to focus on critical theoretical concepts in machine learning in future episodes, it is always useful to actually experience how these concepts work in practice.This is a rerun of Episode 13.
In this 32nd episode of Learning Machines 101, we introduce the concept of a Support Vector Machine. We explain how to estimate the parameters of such machines to classify a pattern vector as a member of one of two categories as well as identify special pattern vectors called “support vectors” which are important for characterizing the Support Vector Machine decision boundary. The relationship of Support Vector Machine parameter estimation and logistic regression parameter estimation is also discussed. Check out this and other episodes as well as supplemental references to these episodes at the website: www.learningmachines101.com. Also follow us at twitter using the twitter handle: lm101talk.
In this rerun of Episode 16, we introduce the important concept of gradient descent which is the fundamental principle underlying learning mechanisms in a wide range of machine learning algorithms. Check out the transcripts of this episode and related references and software at: www.learningmachines101.com !!!
Deep learning machine technology has rapidly developed over the past five years due in part to a variety of actors such as: better technology, convolutional net algorithms, rectified linear units, and a relatively new learning strategy called "dropout" in which hidden unit feature detectors are temporarily deleted during the learning process. This article introduces and discusses the concept of "dropout" to support deep learning performance and makes connections of the "dropout" concept to concepts of regularization and model averaging. For more details and background references, check out: www.learningmachines101.com !
This podcast discusses talks, papers, and ideas presented at the recent International Conference on Learning Representations 2015 which was followed by the Artificial Intelligence in Statistics 2015 Conference in San Diego. Specifically, commonly used techniques shared by many successful deep learning algorithms such as: rectilinear units, convolutional filters, and max-pooling are discussed. For more details please visit our website at: www.learningmachines101.com!
This rerun of an earlier episode of Learning Machines 101 discusses the problem of how to evaluate the ability of a learning machine to make generalizations and construct abstractions given the learning machine is provided a finite limited collection of experiences. Check out: www.learningmachines101.com to obtain transcripts of this podcast and download free machine learning software!