    達飝: AI/ANN - Reinforcement Learning (Artificial Intelligence / Artificial Neural Networks: An Analysis of Reinforcement Learning Methods [Taught in English])
    2018-11-13
    Audience: European and American foreign-invested enterprises
    Objectives: See below
    Content:


    Artificial Intelligence / Artificial Neural Networks: An Analysis of Reinforcement Learning Methods [Taught in English]


    AI/ANN - Reinforcement Learning




    【Background & Goals】

    Reinforcement learning (RL) is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. Because of its generality, the problem is studied in many disciplines, such as game theory, control theory, operations research, information theory, simulation-based optimization, multi-agent systems, swarm intelligence, statistics and genetic algorithms. In this course of lectures, reinforcement learning is viewed as approximate dynamic programming. The approach is also studied in the theory of optimal control, though most studies there are concerned with the existence of optimal solutions and their characterization, and not with learning or approximation. In machine learning, the environment is typically formulated as a Markov decision process (MDP), as many reinforcement learning algorithms for this context use dynamic programming techniques.
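
    To make the agent-environment framing above concrete, here is a minimal sketch of the interaction loop and of the discounted cumulative reward it tries to maximize. The one-dimensional placeholder environment, the random policy and the discount factor of 0.9 are illustrative assumptions, not material from the course.

        import random

        # The core interaction loop: the agent picks an action, the environment
        # returns a reward and the next state, and the quantity being maximized
        # is the discounted cumulative reward G.
        GAMMA = 0.9  # discount factor (illustrative)

        def environment(state, action):
            # placeholder dynamics: action 1 pays reward 1, the episode ends
            # after the step taken from state 9
            reward = 1.0 if action == 1 else 0.0
            done = state >= 9
            return state + 1, reward, done

        def policy(state):
            # placeholder behaviour policy: act at random
            return random.choice([0, 1])

        state, G, discount, done = 0, 0.0, 1.0, False
        while not done:
            action = policy(state)
            state, reward, done = environment(state, action)
            G += discount * reward   # accumulate the discounted return
            discount *= GAMMA

        print("cumulative discounted reward G =", round(G, 3))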

    Reinforcement learning differs from standard supervised learning in that correct input/output pairs are never presented, nor are sub-optimal actions explicitly corrected. Instead the focus is on performance, which involves finding a balance between exploration (of uncharted territory) and exploitation (of current knowledge). The exploration vs. exploitation trade-off has been most thoroughly studied through the multi-armed bandit problem and in finite MDPs.
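
    The exploration/exploitation balance is easiest to see on the multi-armed bandit problem mentioned above. The sketch below uses epsilon-greedy action selection on a three-armed Bernoulli bandit; the payout probabilities and epsilon = 0.1 are invented for illustration.

        import random

        # Epsilon-greedy action selection on a 3-armed Bernoulli bandit:
        # with probability EPSILON explore a random arm, otherwise exploit
        # the arm with the best current value estimate.
        TRUE_PAYOUT = [0.2, 0.5, 0.7]            # hidden reward probabilities
        estimates = [0.0] * len(TRUE_PAYOUT)     # running value estimates
        counts = [0] * len(TRUE_PAYOUT)
        EPSILON = 0.1

        for step in range(10_000):
            if random.random() < EPSILON:        # explore
                arm = random.randrange(len(TRUE_PAYOUT))
            else:                                # exploit
                arm = max(range(len(TRUE_PAYOUT)), key=lambda a: estimates[a])
            reward = 1.0 if random.random() < TRUE_PAYOUT[arm] else 0.0
            counts[arm] += 1
            # incremental sample-average update of the chosen arm's estimate
            estimates[arm] += (reward - estimates[arm]) / counts[arm]

        print("value estimates:", [round(v, 2) for v in estimates])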

    The main difference between these classical techniques and reinforcement learning algorithms is that the latter do not need a model of the MDP (Markov decision process) and can target large MDPs where exact methods become infeasible.


    【Trainees】

    Programmers engaged in AI/ANN reinforcement learning applications, and the managers of the relevant business functions.

    Trainees must have a solid grounding in modern advanced mathematics.

    【Timing】 6 class hours (6 Class hrs/day)


    【General Content】

    PART 1  Necessary & Essential AI Knowledge

    PART 2  A Smart Robot in a Room: An Example

    PART 3  Defining a Markov Decision Process

    PART 4  Monte Carlo methods

    PART 5  RL Substantializing & Strengthening: Q-learning


    【Detailed Content】


    PART 1  Necessary & Essential AI Knowledge

    1.1 Supervised learning

         classification, regression

    1.2 Unsupervised learning

         clustering, dimensionality reduction

    1.3 Reinforcement learning

         generalization of supervised learning

         learn from interaction with the environment to achieve a goal (the three settings are contrasted in the sketch after this list)
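
    As a quick contrast of the three settings listed above, the snippet below shows the kind of data each one works from; the tiny example values are assumptions made for illustration only.

        # Contrasting the three settings by the data each one sees.
        # The tiny datasets below are invented purely for illustration.

        # Supervised: labelled pairs (input, target) -> learn a mapping input -> target
        supervised_data = [([1.0, 2.0], 0), ([3.0, 1.0], 1)]        # e.g. class labels

        # Unsupervised: inputs only -> discover structure (clusters, low-dim projections)
        unsupervised_data = [[1.0, 2.0], [3.0, 1.0], [2.9, 1.2]]

        # Reinforcement: no fixed dataset; a stream of (state, action, reward, next_state)
        # tuples generated by interacting with an environment in pursuit of a goal
        rl_experience = [("s0", "right", -0.04, "s1"), ("s1", "right", 1.0, "terminal")]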


    PART 2  A Smart Robot in a Room: An Example

    What’s the strategy to achieve max reward?

    What if the actions were deterministic?

    No teacher who would say “good” or “bad”

    Explore the environment and learn from the experience (a toy grid-world version of this example is sketched below)
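
    A minimal grid-world sketch of this example, assuming a 3x4 room with one goal cell, one penalty cell, one wall and a 20% chance of slipping to a random action; the layout and reward values are assumptions, not the course's own example.

        import random

        # A toy 3x4 "robot in a room" grid world: one goal cell, one penalty
        # cell, one wall, a small cost per step, and a 20% chance that the
        # robot slips to a random action (the actions are not deterministic).
        GRID_ROWS, GRID_COLS = 3, 4
        GOAL, PIT, WALL = (0, 3), (1, 3), (1, 1)
        STEP_REWARD = -0.04
        ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

        def step(state, action):
            if random.random() < 0.2:                       # slip: random action
                action = random.choice(list(ACTIONS))
            dr, dc = ACTIONS[action]
            r, c = state[0] + dr, state[1] + dc
            nxt = (r, c)
            # bounce off the boundary and the wall cell
            if not (0 <= r < GRID_ROWS and 0 <= c < GRID_COLS) or nxt == WALL:
                nxt = state
            if nxt == GOAL:
                return nxt, 1.0, True                       # "good" outcome, episode ends
            if nxt == PIT:
                return nxt, -1.0, True                      # "bad" outcome, episode ends
            return nxt, STEP_REWARD, False

        print(step((2, 0), "up"))    # e.g. ((1, 0), -0.04, False) most of the time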


    PART 3  Defining a Markov Decision Process

    3.1 Solving an MDP using dynamic programming

    states, actions and rewards

    solution and policy

    Markov Decision Process (MDP)

    maximize cumulative reward in the long run

    Computing return from rewards

    3.2 Value functions

    Optimal value functions

    Policy evaluation/improvement

    Policy/value iteration (see the value-iteration sketch after this list)
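
    To tie the items above together, here is a sketch of value iteration on a tiny tabular MDP (states, actions, stochastic transitions and rewards, discounted return, Bellman backup, greedy policy extraction). The two-state MDP and the discount factor are illustrative assumptions, not the course's worked example.

        # Value iteration on a tiny two-state, two-action MDP: repeatedly apply
        # the Bellman optimality backup until V stops changing, then read off a
        # greedy policy.  The MDP itself is a made-up toy example.
        GAMMA, THETA = 0.9, 1e-6
        STATES, ACTIONS = ["s0", "s1"], ["stay", "move"]
        # transitions[(s, a)] -> list of (probability, next_state, reward)
        transitions = {
            ("s0", "stay"): [(1.0, "s0", 0.0)],
            ("s0", "move"): [(0.8, "s1", 1.0), (0.2, "s0", 0.0)],
            ("s1", "stay"): [(1.0, "s1", 0.0)],
            ("s1", "move"): [(0.8, "s0", 0.0), (0.2, "s1", 0.0)],
        }

        def q_value(V, s, a):
            # expected one-step return of taking action a in state s
            return sum(p * (r + GAMMA * V[s2]) for p, s2, r in transitions[(s, a)])

        V = {s: 0.0 for s in STATES}
        while True:                                  # value-iteration sweep
            delta = 0.0
            for s in STATES:
                best = max(q_value(V, s, a) for a in ACTIONS)
                delta = max(delta, abs(best - V[s]))
                V[s] = best
            if delta < THETA:
                break

        # policy improvement: act greedily with respect to the converged V
        policy = {s: max(ACTIONS, key=lambda a: q_value(V, s, a)) for s in STATES}
        print(V, policy)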


    PART 4  Monte Carlo methods

    4.1 Monte Carlo methods

    don’t need full knowledge of environment

    averaging sample returns

    4.2 Monte Carlo policy evaluation

    want to estimate Vπ(s), the value of state s under policy π

    first-visit MC (see the sketch at the end of this part)

    4.3 Monte Carlo control

    4.4 Maintaining exploration

    4.5 Simulated experience

    4.6 Summary of Monte Carlo
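
    As a summary sketch of first-visit Monte Carlo policy evaluation: episodes are sampled by following a fixed policy, and Vπ(s) is estimated by averaging the returns observed after the first visit to each state; no model of the environment is needed. The chain environment, the move-right-with-probability-0.7 policy and the constants are assumptions made for illustration.

        import random
        from collections import defaultdict

        # First-visit Monte Carlo evaluation of Vπ on a small chain environment:
        # states 0..4, reaching state 4 ends the episode with reward +1, every
        # other step costs -0.01.  Only sample returns are averaged.
        GAMMA, TERMINAL = 0.9, 4

        def policy(state):
            # fixed stochastic policy to evaluate: move right with probability 0.7
            return 1 if random.random() < 0.7 else -1

        def generate_episode():
            state, episode = 0, []
            while state != TERMINAL:
                nxt = max(0, state + policy(state))
                reward = 1.0 if nxt == TERMINAL else -0.01
                episode.append((state, reward))
                state = nxt
            return episode

        returns = defaultdict(list)
        for _ in range(5_000):
            episode = generate_episode()
            first_visit = {}
            for t, (s, _) in enumerate(episode):     # index of each state's first visit
                first_visit.setdefault(s, t)
            G = 0.0
            for t in reversed(range(len(episode))):  # accumulate returns backwards
                s, r = episode[t]
                G = r + GAMMA * G
                if first_visit[s] == t:              # only the first visit contributes
                    returns[s].append(G)

        V = {s: sum(g) / len(g) for s, g in sorted(returns.items())}
        print(V)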


    PART 5  RL Substantializing & Strengthening: Q-learning

    5.1 Off-policy learning

    5.2 State representation

    5.3 Function approximation

    5.4 Features

    5.5 Splitting and aggregation

    5.6 Designing rewards

    5.7 Case study: Backgammon (a tabular Q-learning sketch follows this list)
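
    Q-learning, the off-policy method this part builds on, can be sketched in tabular form as below; the corridor environment and all hyper-parameters are assumptions for illustration. A realistic case study such as backgammon would need the state-representation and function-approximation ideas listed above rather than a table.

        import random
        from collections import defaultdict

        # Tabular Q-learning (off-policy TD control) on a tiny corridor task:
        # states 0..5, actions -1/+1, reaching state 5 pays +1 and ends the
        # episode.  Environment, learning rate, discount and epsilon are all
        # illustrative assumptions.
        ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
        GOAL, ACTIONS = 5, (-1, 1)
        Q = defaultdict(float)                 # Q[(state, action)] -> value estimate

        def env_step(state, action):
            nxt = min(max(state + action, 0), GOAL)
            return nxt, (1.0 if nxt == GOAL else -0.01), nxt == GOAL

        def choose_action(state):
            if random.random() < EPSILON:                      # explore
                return random.choice(ACTIONS)
            return max(ACTIONS, key=lambda a: Q[(state, a)])   # exploit

        for episode in range(2_000):
            state, done = 0, False
            while not done:
                action = choose_action(state)
                nxt, reward, done = env_step(state, action)
                # off-policy update: bootstrap from the best next action,
                # regardless of what the epsilon-greedy behaviour does next
                best_next = 0.0 if done else max(Q[(nxt, a)] for a in ACTIONS)
                Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
                state = nxt

        greedy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(GOAL)}
        print(greedy)    # expected: move right (+1) in every state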
