An informative and descriptive title for your literature survey

Name + matric num
November 11, 2017

1. Introduction
This paper describes reinforcement learning, its importance, the challenges agents face when
applying it, and the solutions proposed for those challenges. It shows how particular functions
of neural processes can be developed and implemented to solve computational problems. The
surveyed studies also provide insight into the function of the brain, into the algorithms used
to reproduce different brain functions, and into how neurobiologists measure brain function.
1.1. Complexity in using reinforcement learning
Agents face a complex task when they try to apply reinforcement learning effectively in
situations approaching real-world complexity. They must derive efficient representations of the
environment from high-dimensional sensory inputs and use these representations to generalize
past experience to new situations. In the brain, this processing is shaped by structures that
range from the complex geometry of dendritic trees to the enormous, diverse networks distributed
across the brain. The surveyed articles draw a series of parallels that describe how the brain
computes.
1.2. Limitation in applicability of reinforcement learning
Although reinforcement learning agents have achieved some success in a variety of domains, their
applicability has been restricted to domains in which useful features can be handcrafted, or to
domains with fully observed, low-dimensional state spaces. One solution discussed in this survey
is AlphaGo, which integrates its components, at scale, in a high-performance tree search engine.
Go exemplifies many of the difficulties faced by artificial intelligence: a challenging
decision-making task and an intractable search space.

2. Human-level control through deep reinforcement learning
Summary
Mnih et al. (2015) observe that the theory of reinforcement learning offers a normative account,
rooted in psychological and neuroscientific perspectives on animal behavior, of how agents can
optimize their control of an environment. To do so, agents must derive efficient representations
of the environment from sensory inputs and use them to generalize past experience to new
situations. According to Mnih et al. (2015), humans and other animals solve this problem by
combining reinforcement learning with hierarchical sensory processing systems. The authors
observe that even though reinforcement learning agents have achieved success in different
domains, their applicability has been limited to domains in which useful features can be
handcrafted.
Problem solved
The problem was addressed by using deep neural networks to train a novel artificial agent that
the authors call a deep Q-network (DQN). According to the authors, the agent can learn policies
directly from high-dimensional sensory inputs using reinforcement learning. The agent combines
reinforcement learning with a class of artificial neural networks, and the resulting algorithm
develops a wide range of competencies on varied, challenging tasks. Creating a single algorithm
of this kind is a central goal of general artificial intelligence.

Approach
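Reinforcement learning is known to be unstable when a nonlinear function approximator such as a
neural network represents the action-value (Q) function. The causes include correlations in the
sequence of observations and the fact that small updates to Q can significantly change the
policy and therefore the data distribution. The authors address these instabilities with a novel
variant of Q-learning that uses two key ideas: experience replay, which randomizes over the
stored data, and a target network that is only periodically updated. According to Mnih et al.
(2015), when DQN was tested it performed at a level comparable to that of a professional human
games tester.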
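The two ideas behind the Q-learning variant of Mnih et al. (2015), experience replay and a
separately held target for the update, can be illustrated with a minimal sketch. This is not
the authors' implementation: it replaces the deep network with a plain dictionary of Q-values,
and the names ReplayBuffer and q_learning_step, as well as the values of alpha and gamma, are
assumptions chosen for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity):
        # Oldest transitions are discarded automatically once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size, rng):
        # Uniform random sampling breaks the correlation between consecutive observations.
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)


def q_learning_step(q, q_target, batch, n_actions, alpha=0.1, gamma=0.99):
    """Move q toward reward + gamma * max_a q_target(next_state, a) for each transition.

    q and q_target are dictionaries keyed by (state, action); q_target is a frozen
    copy of q that the caller refreshes only every so often.
    """
    for state, action, reward, next_state, done in batch:
        if done:
            best_next = 0.0
        else:
            best_next = max(q_target.get((next_state, a), 0.0) for a in range(n_actions))
        target = reward + gamma * best_next
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + alpha * (target - old)


if __name__ == "__main__":
    rng = random.Random(0)
    buffer = ReplayBuffer(capacity=100)
    buffer.add((0, 1, 1.0, 1, False))
    buffer.add((1, 0, 0.0, 0, True))
    q, q_target = {}, {}
    q_learning_step(q, q_target, buffer.sample(2, rng), n_actions=2)
    print(q)
```

In a full agent, q_target would be overwritten with a copy of q at fixed intervals, so that the
targets used in the update change slowly compared with the values being learned.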
Related work
The authors' work relates to the study of neuroscience-inspired artificial intelligence by
Hassabis et al. (2017), which takes the same approach to show how artificial neural networks are
inspired by neuroscience. It is also consistent with the study by Silver et al. (2016) on the
game of Go, where neuroscience appears to play a significant role in inspiring machine learning,
which in turn provides insights that guide developments in neuroscience. The article introduces
a new artificial agent, the deep Q-network, whose algorithm is designed to train the Q-value
function efficiently and effectively. The model is a neural network trained with a reinforcement
learning algorithm that stores and reuses experienced transitions. The article demonstrates the
effectiveness of deep autoencoder neural networks in deep reinforcement learning, using a
framework that combines deep autoencoders with the proposed reinforcement learning model. The
research emphasizes the data efficiency gained from combining the two while studying the
features of deep autoencoders.

Methodology
The paper presents an experimental methodology in which artificial agents are trained and tested
using reinforcement learning. The study applies the deep Q-network, which combines reinforcement
learning with artificial neural networks. It relies on a single algorithm that has to develop
competence across a range of challenging tasks, as a test of general artificial intelligence.
Conclusion
The work on deep reinforcement learning bridges the gap between high-dimensional sensory inputs
and actions, yielding an artificial agent that can learn to succeed at a range of challenging
tasks. This has helped to motivate an algorithmic understanding of aspects of learning and
intelligence.
3. Neuroscience-inspired artificial intelligence
Summary
The fields of neuroscience and artificial intelligence are related, but communication between
them has been limited for a long time. According to Hassabis et al. (2017), a better
understanding of biological brains can help in building intelligent machines. At the beginning
of the computer age, work on artificial intelligence was intertwined with neuroscience, and the
combination proved productive. This interaction has since become less common, because both
fields have grown in complexity and their disciplinary boundaries have solidified. Hassabis et
al. (2017) say that developing human-level general artificial intelligence is a complex task,
since the search space of possible solutions is enormous and may be only sparsely populated. The
study involves two major terms, neuroscience and AI. In the article, the term neuroscience
covers all fields that involve the study of the brain and its cognitive behaviors. The term AI
refers to machine learning and other research aimed at building intelligent machines. The study
of human cognition and its neural implementation plays an important role because it provides
insight into different aspects of higher-level general intelligence.
Problem solved
The study explores how artificial intelligence has benefited from a variety of interactions with
neuroscience, and it organizes them into three periods. The past covers deep learning and
reinforcement learning. The present covers attention, episodic memory, working memory, and
continual learning. The future covers intuitive understanding of the physical world, transfer
learning, efficient learning, imagination and planning, and virtual brain analytics. All of
these are intended to show the role of neuroscience in supporting the development of artificial
intelligence. There are, however, also cases where neuroscience has benefited from artificial
intelligence, and various techniques have arisen from the interaction of the two fields. For
instance, machine learning techniques have been applied to neuroimaging datasets in the
multivariate analysis of fMRI and magnetoencephalography (MEG) data. New ideas discovered while
building intelligent algorithms can therefore offer insight into the intelligence of humans and
other animals.
Approach
According to Hassabis et al. (2017), closely examining biological intelligence is beneficial for
developing artificial intelligence. Neuroscience offers a rich source of inspiration for new
kinds of algorithms and architectures, independent of and complementary to the mathematical and
logic-based methods and ideas that have largely dominated traditional approaches to artificial
intelligence. It can also provide validation of artificial intelligence techniques that already
exist: an algorithm that is found to be implemented in the brain is considered a plausible
component of a general intelligence system. Hassabis et al. (2017) assert that developing
artificial intelligence does not require strict adherence to biological plausibility. What
matters most is building agents that work, so biological plausibility is treated as a guide and
not as a requirement.
Related work
This work is related to the study of human-level control through deep reinforcement learning by
Mnih et al. (2015). It follows the same principle: research the cognitive behavior of humans and
its neural implementation, and apply this knowledge to artificial intelligence. The research
also argues that supervised learning provides ideas that inspire recent advances in artificial
intelligence. This is in line with the study on the game of Go by Silver et al. (2016), which
shows that a great deal of information can be used in supervised learning to produce the right
behavior in machines. This paper explores different kinds of computable objects, including
computable predicates and computable variables, among others. All of these involve the same
fundamental problems and relate to computable numbers. The relationship between computable
numbers and computable functions can be obtained by developing the theory of functions of a real
variable expressed in terms of computable numbers. According to Alan Turing, modeling the brain
and the computer on each other prevents the generation of new insights in the field of
artificial intelligence. He used the model of a human computer manipulating numbers and figures
through calculation. These experiments indicated differences in biological brain systems due to
the environment and other factors. The actions of neurons and the nervous system indicated that
neurons could be modeled as logical propositions. The abstraction constructed by Alan Turing was
taken to show that the brain computes.
Methodology
The paper takes a traditional review approach to presenting the relationship between biological
brains and artificial machines. The study uses history to examine the fields of neuroscience and
artificial intelligence. It also covers current advances in neural computation and their role in
artificial intelligence.
Conclusion
The importance of neuroscience to the growth of artificial intelligence gives AI researchers a
strong reason to collaborate with neuroscientists. The effective transfer of insights from
neuroscience to the improvement of algorithms in artificial intelligence depends on the
relationship between researchers in both fields. Because scientists' ideas in their subject of
study are often vague, research in artificial intelligence can help to make them concrete. The
study draws on deep learning and other approaches to create solutions to scientific problems,
since such approaches have been used to achieve master-level play in complex games such as
chess, backgammon, and checkers. The work on artificial intelligence also shows that a value
function can be updated from episodes of real experience by combining it with future estimates
through value function approximation. If neuroscientists and AI researchers collaborate, they
can more easily share important information that enhances the effectiveness of their work.

4. Synthesis for mastering the game of Go with deep neural networks and tree search
Abstract
As Silver et al. (2016) put it, Go is a game of perfect information whose search tree contains
approximately b^d possible sequences of moves. Here b is the breadth of the game, the number of
legal moves per position, and d is its depth, the length of the game. The study starts by
explaining the optimal value function, which determines the outcome of the game from every
position under perfect play. Silver et al. (2016) explain how they developed a new approach to
computer Go that uses value networks to evaluate board positions and policy networks to select
moves. Like the agent discussed by Mnih et al. (2015), AlphaGo relies on neural networks, here
trained by a novel combination of supervised learning from human expert games and reinforcement
learning from games of self-play. Silver et al. (2016) add that they also introduced a new
search algorithm that combines Monte Carlo simulation with value and policy networks.
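To give a sense of scale for a search tree with b^d move sequences, the short sketch below
counts the decimal digits of b^d. The approximate values used (b of about 250 and d of about
150 for Go, b of about 35 and d of about 80 for chess) are the figures quoted by Silver et al.
(2016); the function name search_tree_digits is an assumption made for this illustration.

```python
def search_tree_digits(breadth, depth):
    """Return the number of decimal digits in breadth ** depth.

    Python integers have arbitrary precision, so the power is computed exactly.
    """
    return len(str(breadth ** depth))


if __name__ == "__main__":
    # Approximate breadth and depth values as quoted by Silver et al. (2016).
    print("Go:   ", search_tree_digits(250, 150), "digits")
    print("Chess:", search_tree_digits(35, 80), "digits")
```

Both numbers are far beyond what exhaustive search can cover, which is why the depth and breadth
of the search have to be reduced by other means.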
Problem statement
The solution is a search algorithm that proved more effective than earlier approaches: AlphaGo
achieved a far higher winning rate than other Go programs and defeated the human European Go
champion by five games to zero. The agent created by the authors was the first to beat a skilled
human player in a competitive game of Go. In this respect AlphaGo goes beyond the deep Q-network
of Mnih et al. (2015): the latter performed at a level comparable to human players, while the
former defeats a skilled human player.

Approach
The researchers train the neural networks using a pipeline consisting of several stages of
machine learning. The first stage is supervised learning of policy networks. It involves
training a convolutional policy network through supervised learning on games played by human
experts. Silver et al. (2016) explain that training starts with a supervised learning policy
network built from expert human moves. Supervised learning offers fast, efficient learning
updates with immediate feedback and high-quality gradients. The researchers also train a fast
policy that can rapidly sample actions during rollouts (Silver et al., 2016). The second stage
trains a reinforcement learning policy network that improves on the supervised learning policy
network. The third stage trains a value network that predicts the winner of games played by the
reinforcement learning policy network.
In the first stage, supervised learning of policy networks, the researchers build on prior work
on predicting expert moves in the game of Go using supervised learning. The supervised learning
policy network alternates between convolutional layers with learned weights and rectifier
nonlinearities (Silver et al., 2016). The second stage, reinforcement learning of policy
networks, improves the policy network through policy gradient reinforcement learning. The
reinforcement learning policy network is identical in structure to the supervised learning
policy network. The final stage of the training pipeline focuses on position evaluation by
estimating a value function, which predicts the outcome of the game from a given position.

In searching with policy and value networks, the AlphaGo program combines the policy and value
networks in an MCTS algorithm. Each edge (s, a) of the search tree stores an action value
Q(s, a), a prior probability P(s, a), and a visit count N(s, a). Silver et al. (2016) run an
internal tournament among variants of AlphaGo and several other Go programs to evaluate the
playing strength of AlphaGo. All of these programs are based on MCTS algorithms.
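The per-edge statistics described above can be sketched as a small data structure together with
a selection rule that picks the action maximizing the action value plus an exploration bonus.
The bonus used here is proportional to P(s, a) / (1 + N(s, a)), as in Silver et al. (2016); the
class name Edge, the functions select_action and backup, and the constant c_puct = 1.0 are
assumptions for illustration, not the AlphaGo implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class Edge:
    """Statistics stored on one edge (s, a) of the search tree."""
    prior: float              # P(s, a), supplied by a policy network
    visits: int = 0           # N(s, a)
    total_value: float = 0.0  # sum of the values backed up through this edge

    @property
    def q(self):
        # Q(s, a): mean value of the simulations that passed through this edge.
        return self.total_value / self.visits if self.visits else 0.0


def select_action(edges, c_puct=1.0):
    """Pick the action maximizing Q(s, a) + u(s, a).

    The bonus u grows with the prior and shrinks as the edge is visited more often.
    With no visits at all every score is zero and the first action is returned.
    """
    total_visits = sum(edge.visits for edge in edges.values())

    def score(action):
        edge = edges[action]
        bonus = c_puct * edge.prior * math.sqrt(total_visits) / (1 + edge.visits)
        return edge.q + bonus

    return max(edges, key=score)


def backup(edge, value):
    """Record the outcome of one simulation that traversed this edge."""
    edge.visits += 1
    edge.total_value += value


if __name__ == "__main__":
    edges = {"a": Edge(prior=0.5, visits=10, total_value=2.0), "b": Edge(prior=0.5)}
    chosen = select_action(edges)
    backup(edges[chosen], 1.0)
    print(chosen, edges[chosen])
```

The rule initially prefers moves with a high prior and few visits, and shifts toward moves with
a high action value as the visit counts grow.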
The most important part of the research is the evaluation of AlphaGo's playing strength. The
researchers ran a tournament among variants of AlphaGo and other programs, including commercial
ones. The programs used in this internal tournament are based on high-performance MCTS
algorithms (Silver et al., 2016). The outcome of the tournament indicated that AlphaGo is
stronger than previous Go programs, as shown by its wins against them (Silver et al., 2016). The
researchers also assessed variants of AlphaGo that evaluated positions using the value network.
Silver et al. (2016) indicate that AlphaGo exceeded the performance of the other programs,
showing that value networks offer a viable alternative to Monte Carlo evaluation in Go. The
researchers demonstrated that Go exemplifies, in various ways, the difficulties faced by
artificial intelligence. By combining tree search with policy and value networks, the
researchers brought AlphaGo to a professional level in Go. This success gives hope that
human-level performance is achievable in other artificial intelligence domains.

Methodology
This study covers various aspects of successfully playing and winning a game. Silver et al.
(2016) describe the DeepMind approach and also briefly describe perfect information games such
as chess, backgammon and the game of Go. The researchers present two deep neural networks,
policy networks and value networks, and combine them with the Monte Carlo tree search technique
used in modern computer Go programs. The article illustrates the role of computer science in the
field of artificial intelligence within the game-playing community.
Success in a game depends on the quality of the value function. For most games the search tree
is too large to search exhaustively, but it can be truncated by replacing the subtree below a
position with an approximate value function. This approach, however, had not succeeded in the
game of Go. Convolutional networks and temporal difference learning have therefore been applied
to approximate the optimal value function in Go. Silver et al. (2016) use the Monte Carlo tree
search (MCTS) algorithm. This method uses Monte Carlo rollouts to estimate the value of each
state in the search tree. The breadth of the search can be reduced by sampling actions from a
policy, which is a probability distribution over the possible moves in a given position. As more
simulations are executed, the search tree grows larger and the relevant values become more
accurate. The more simulations are performed, therefore, the more accurate the results.
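The claim that more simulations give more accurate values can be illustrated with a minimal
Monte Carlo estimate. The sketch below averages the outcomes of many independent playouts; the
function biased_coin_game is a hypothetical stand-in for a real game rollout, chosen so that the
true value (0.4 for a win probability of 0.7) is known in advance.

```python
import random


def rollout_value(simulate, n_simulations, rng):
    """Estimate the value of a position as the mean outcome of independent rollouts."""
    total = sum(simulate(rng) for _ in range(n_simulations))
    return total / n_simulations


def biased_coin_game(rng, win_probability=0.7):
    """Hypothetical playout: returns +1 for a win and -1 for a loss."""
    return 1.0 if rng.random() < win_probability else -1.0


if __name__ == "__main__":
    rng = random.Random(0)
    # The expected outcome is 0.7 * 1 + 0.3 * (-1) = 0.4.
    for n in (10, 100, 1000, 10000):
        print(n, rollout_value(biased_coin_game, n, rng))
```

The estimates scatter widely for small numbers of simulations and settle near the true value as
the number grows, since the standard error of the mean shrinks with the square root of the
number of rollouts.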
To incorporate large neural networks efficiently into the AlphaGo program, the researchers
implemented an asynchronous policy and value MCTS algorithm. Deep neural networks have achieved
exceptional performance in visual domains: they can now classify images, recognize faces, and
play Atari games (Silver et al., 2016). Such networks construct localized representations using
many layers of neurons arranged in overlapping tiles. The researchers use a similar deep
convolutional architecture for the game of Go. They use the neural networks to reduce the
effective depth and breadth of the search tree. According to Silver et al. (2016), they reduced
the depth and breadth of the search tree by evaluating positions with the value network and
sampling actions with a policy network.
The rollout policy is computed from patterns of moves in a game, covering both response and
non-response patterns. It also includes a few handcrafted local features that encode Go rules.
Previous studies exploited the symmetries of Go in the convolutional layers through rotationally
and reflectionally invariant filters. This, however, inhibits the performance of large neural
networks, because it prevents the identification of particular asymmetric patterns. The policy
network in this research was trained to classify positions according to the expert moves played
in the KGS dataset. The rotations and reflections of each position in the dataset were computed
and sampled randomly.
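A square board has eight rotations and reflections in total. The sketch below enumerates them
for a board stored as a list of rows, which is one simple way to generate the symmetric variants
of a position. The helper names rotate90, reflect and symmetries are assumptions for
illustration and do not come from the AlphaGo code.

```python
def rotate90(board):
    """Rotate a square board (a list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*board[::-1])]


def reflect(board):
    """Mirror a board left to right."""
    return [list(reversed(row)) for row in board]


def symmetries(board):
    """Return the eight rotations and reflections of a square board."""
    result = []
    current = [list(row) for row in board]
    for _ in range(4):
        result.append(current)
        result.append(reflect(current))
        current = rotate90(current)
    return result


if __name__ == "__main__":
    for variant in symmetries([[1, 2], [3, 4]]):
        print(variant)
```

Sampling one of these variants at random for each position lets a network see every orientation
of a pattern without forcing its filters to be symmetric.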
Policy network reinforcement learning trained the policy network through policy gradient
reinforcement learning. Training consisted of n games played between the policy network being
trained and opponents drawn from a selected sample, which increases the stability of training.
Value network regression trained the value network to estimate the value function of the
reinforcement learning policy network. This involved constructing a new dataset of uncorrelated
self-play positions. Each feature of the policy and value networks was pre-processed into a set
of 19x19 feature planes. The features describe every intersection of the Go board and were
obtained from the raw representation of the game rules. In the neural network architecture, the
input to the policy network is a 19x19x48 image stack consisting of 48 feature planes.
The Go programs were evaluated by running an internal tournament and measuring the Elo rating of
each program. All Go programs except AlphaGo were run on individual computers. The programs
competed against each other, each running on identical specifications and using the latest
version and the best hardware configuration that it could support. Their results were ranked
against the KGS scale.
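For readers unfamiliar with Elo ratings, the standard Elo model converts a rating difference
into an expected score, and updates a rating after each game in proportion to the difference
between the actual and expected result. The sketch below uses the conventional 400-point
logistic scale; the function names and the update factor k = 32 are assumptions for
illustration and are not taken from the tournament setup of Silver et al. (2016).

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B under the standard Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def updated_rating(rating, expected, actual, k=32.0):
    """Move a rating toward the observed result (1 = win, 0.5 = draw, 0 = loss)."""
    return rating + k * (actual - expected)


if __name__ == "__main__":
    # A 230 point gap corresponds to roughly a 79% expected score,
    # the correspondence quoted by Silver et al. (2016).
    e = expected_score(1730, 1500)
    print(round(e, 3))
    print(updated_rating(1730, e, actual=1.0))
```

Because the expected score depends only on the rating difference, ratings obtained in one
tournament are meaningful only relative to the other programs in the same pool.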
Related work
This research links directly to both the study of human-level control through deep reinforcement
learning by Mnih et al. (2015) and neuroscience-inspired artificial intelligence by Hassabis et
al. (2017). Together these studies form a body of research in which approaches from neuroscience
are used to test hypotheses about machine learning. The study by Mnih et al. (2015) shows how
specialized agents can be controlled using reinforcement learning and particular algorithms to
achieve certain levels of success. This connects to the study by Hassabis et al. (2017) of
biological brains as a basis for developing intelligent machines. The two studies confirm that
there are broad, established problems that should be addressed using neural computation derived
from the brain structures that match each problem. Move patterns are a crucial technique for
integrating domain knowledge into Go-playing programs. This work demonstrates a supervised
learning technique that learns game patterns from game records using a generalization of Elo
ratings. Every expert move in the training data is treated as a win for a team of pattern
features, and the Elo rating of each pattern feature is calculated from these wins. The ratings
can then be used to calculate a probability distribution over legal moves. The algorithm
outperforms most earlier pattern-learning algorithms. A Monte Carlo tree search program used
these patterns to reach the level of the strongest classical programs. AlphaGo is a program, and
also a software framework, for playing the game of Go. It became the first of its kind to win
against a top human professional player in the game of Go. This study shows the current status
of and future developments in the AlphaGo project.
Outcome
AlphaGo has met one of artificial intelligence's challenges: it can select moves and evaluate
positions effectively thanks to deep neural networks that are trained by a novel combination of
supervised and reinforcement learning, as explained by Silver et al. (2016). AlphaGo differs
from Deep Blue in that it is trained directly from gameplay by general-purpose supervised and
reinforcement learning methods, whereas Deep Blue relied on a handcrafted evaluation function
(Silver et al., 2016). The combination of tree search with policy and value networks has enabled
AlphaGo to reach a professional level in the game, offering hope that human-level performance is
achievable in other seemingly intractable artificial intelligence domains.
Conclusion
Reading this paper on the game of Go is valuable because it gives the reader an inside view of
the approach. The article illustrates the role of computer science in the field of artificial
intelligence within the game-playing community. Go exemplifies, in a variety of ways, the
challenges faced by artificial intelligence. The research uses statistical arguments about the
brain activity of an organism, based on aggregate measurements of that individual's brain, to
produce particular operational results. Understanding how the brain functions enables
researchers to apply these principles at a high-dimensional neural level and provides an
opportunity to advance applications that achieve artificial intelligence by imitating the
performance of an actual brain, such as neural network computers. It is therefore important to
understand how the brain operates at the individual level, in order to understand how brain
activity can deviate from the average and to apply this knowledge in solving computational
problems.
5. Conclusion
Together, these articles give a comprehensive account of how neuroscience research benefits the
world through artificial intelligence. The relevance of neuroscience is evident from the first
article, by Mnih et al. (2015), on deep reinforcement learning. It has helped to motivate
algorithmic-level questions about the aspects of organisms' intelligence that interest
researchers. The articles indicate how effectively insights from neuroscience can be transferred
to the development of algorithms in artificial intelligence. The article by Hassabis et al.
(2017) demonstrates that the development of artificial intelligence will help humans to develop
efficient technologies, since people will better understand the human mind and brain processes.
Comparing insights from artificial intelligence research with the human mind can help to develop
an understanding of the mind and some of its mysteries. The study by Silver et al. (2016) covers
various aspects of the success of AlphaGo. Such research can help people understand how the
human mind works, including functions such as dreams and creativity. The authors of these
articles present convincing arguments that it is possible to learn from the workings of the
brain when building artificial general intelligence, since the brain is the only existing
example of such intelligence.

The articles have been written and presented differently:
Human-Level Control through Deep Reinforcement Learning
The ideas in this paper are well articulated. The researchers support their work with formulae
and clearly illustrated figures. The article lacks citations throughout, which may prevent the
reader from finding further details for a better understanding. To comprehend the content of the
article, the reader needs relevant knowledge of neuroscience and psychology, because the article
demonstrates how animal and human behavior can be understood and modeled. To improve the
article, the writers should have distributed the figures more evenly throughout.
Neuroscience Inspired Artificial Intelligence
The article starts with a clear and comprehensive introduction. From the beginning, sufficient
citations are used to support the work. The writers support the arguments in most paragraphs
with examples, making them clear and easy to understand. The whole article is presented in a
well-organized manner, with clearly stated topics and points. The diagrams in the article are
well explained, although there are too few of them. For a better understanding of the article,
the reader needs sufficient knowledge of information technology and neuroscience. The writers
focus on teaching the reader how systems of neural activity can be applied in artificial
intelligence.
Mastering the Game of Go With Deep Neural Networks and Tree Search
The article has a complex introduction. It is informative, but it is presented with a large
number of diagrams and formulae, which may interfere with the reader's comprehension. No clear
examples are used to elaborate the statements in the article, and there are no citations
throughout. The reader needs sufficient knowledge of information technology and psychology to
understand the information presented in this study. The writers focus on the ability of
artificial agents to play a game under supervised conditions.

6. References
Hassabis, D., Kumaran, D., Summerfield, C., & Botvinick, M. (2017). Neuroscience-Inspired
Artificial Intelligence. Neuron, 95(2), 245-258.
http://dx.doi.org/10.1016/j.neuron.2017.06.011
Notes: Comparing research on artificial intelligence with insights from the human mind
can help in the development of understanding regarding the human mind and some of its
mysteries. The research can help people understand how the human mind works and some
of its functions, such as dreams and creativity.
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A., Veness, J., Bellemare, M., Graves, A.,
Riedmiller, M., Fidjeland, A., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A.,
Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., & Hassabis, D. (2015).
Human-level control through deep reinforcement learning. Nature, 518(7540), 529-533.
Notes: This article offers experiences in the field of neuroscience. It opens the
understanding of the brain activity of organisms.
Silver, D., Huang, A., Maddison, C., Guez, A., Sifre, L., van den Driessche, G., Schrittwieser, J.,
Antonoglou, I., Panneershelvam, V., Lanctot, M., Dieleman, S., Grewe, D., Nham, J.,
Kalchbrenner, N., Sutskever, I., Lillicrap, T., Leach, M., Kavukcuoglu, K., Graepel, T.,
& Hassabis, D. (2016). Mastering the game of Go with deep neural networks and tree
search. Nature, 529(7587), 484-489.
Notes: The field of artificial intelligence is highly dependent on neuroscience. Therefore
this can help to solve computational problems associated with this field.
