SRI RAMAKRISHNA ENGINEERING COLLEGE
[Educational Service : SNR Sons Charitable Trust]
[Autonomous Institution, Accredited by NAAC with ‘A’ Grade]
[Approved by AICTE and Permanently Affiliated to Anna University, Chennai]
[ISO 9001-2015 Certified and all eligible programmes Accredited by NBA]
VATTAMALAIPALAYAM, N.G.G.O. COLONY POST, COIMBATORE – 641 022.
Department of Electronics and Communication Engineering
20AI202 PRINCIPLES OF ARTIFICIAL INTELLIGENCE
Course Instructor: Mr. J. Judeson Antony Kovilpillai, Asst. Prof./ECE
No. of Credits: 3
VISION OF THE COLLEGE
• To develop into a leading world-class Technological University consisting of Schools of Excellence in various disciplines, with a co-existent Centre for Engineering Solutions Development for a world-wide clientele.
MISSION OF THE COLLEGE
To provide all necessary inputs for the students to grow into knowledge engineers and scientists, attaining:
• Excellence in domain knowledge, both practice and theory.
• Excellence in co-curricular and extracurricular talents.
• Excellence in character and personality.
VISION OF THE DEPARTMENT
• To develop Electronics and Communication Engineers by keeping pace with changing technologies, professionalism, creativity, research and employability.
MISSION OF THE DEPARTMENT
• To provide quality and contemporary education through an effective teaching-learning process that equips the students with adequate knowledge in Electronics and Communication Engineering for a successful career.
• To inculcate in the students problem-solving and lifelong learning skills that will enable them to pursue higher studies and a career in research.
• To produce engineers with effective communication skills, the ability to lead a team adhering to ethical values, and an inclination to serve society.
COURSE OUTCOMES
On successful completion of the course, students will be able to:
CO1 : Understand the basics of Artificial Intelligence and Intelligent Agents.
CO2 : Apply problem-solving strategies to real-life scenario applications.
CO3 : Make use of Machine Learning techniques for problem solving.
CO4 : Analyze the various applications of Artificial Intelligence.
Language Models
• A simple definition of a Language Model: an AI model that has been trained to predict the next word or words in a text based on the preceding words. It is part of the technology that predicts the next word you want to type on your mobile phone, allowing you to complete the message faster.
• The task of predicting the next word(s) is referred to as self-supervised learning: it does not need labels, it just needs lots of text.
• Language Models broadly fall into two main groups:
• Statistical Language Models: These models use traditional statistical techniques like N-grams, Hidden Markov Models (HMMs) and certain linguistic rules to learn the probability distribution of words.
• Neural Language Models: These are the newer players in NLP and have surpassed statistical language models in effectiveness. They use different kinds of Neural Networks to model language.
N-Gram Model
• Let’s begin with the task of computing P(w|H): the probability of word ‘w’ given some history ‘H’.
• Suppose ‘H’ is ‘its water is so transparent that’, and we want to know the probability that the next word is ‘the’: P(the | its water is so transparent that).
• One way to estimate this probability is with relative frequency counts.
• Take a large corpus, count the number of times ‘its water is so transparent that’ occurs, and count the number of times it is followed by ‘the’.
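A minimal sketch of this relative-frequency estimate in Python; the toy corpus and whitespace tokenization are assumptions for illustration:

# Estimate P(the | its water is so transparent that) by relative frequency:
# count(H followed by w) / count(H), over a toy corpus.
corpus = ("its water is so transparent that the fish are visible . "
          "its water is so transparent that the bottom shows . "
          "its water is so transparent that you can see for miles .").split()

history = "its water is so transparent that".split()
word = "the"
n = len(history)

count_h = count_hw = 0
for i in range(len(corpus) - n):
    if corpus[i:i + n] == history:
        count_h += 1
        if corpus[i + n] == word:
            count_hw += 1

print(count_hw / count_h)  # relative frequency estimate, here 2/3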
• While this method of estimating probabilities directly from counts works fine in many cases, it turns out that even the web isn’t big enough to give us good estimates in most cases. Why? Because language is creative: new sentences are created every day, and we won’t be able to count them all.
• For this reason, we need to introduce cleverer ways to calculate the probability of word w given history H.
• To represent the probability of a particular random variable Xi taking on the value “the”, or P(Xi = “the”), we will use the simplification P(the).
• We’ll represent a sequence of n words as w1 . . . wn or w1:n (so the expression w1:n−1 means the string w1, w2, …, wn−1). For the joint probability of each word in a sequence having a particular value, P(X1 = w1, X2 = w2, …, Xn = wn), we’ll use P(w1, w2, …, wn).
• By the chain rule of probability, P(w1, w2, …, wn) = P(w1) P(w2 | w1) … P(wn | w1:n−1); n-gram models approximate each of these factors.
• The intuition of the n-gram model is that instead of computing the probability of a word given its entire history, we can approximate the history by just the last few words.
• In the bigram (n = 2) language model the sentence “I saw the red house” is approximated as:
• P(I, saw, the, red, house) ≈
P(I | ‹s›) P(saw | I) P(the | saw) P(red | the) P(house | red) P(‹/s› | house)
• In a trigram (n = 3) language model it is approximated as:
• P(I, saw, the, red, house) ≈
P(I | ‹s›, ‹s›) P(saw | ‹s›, I) P(the | I, saw) P(red | saw, the)
P(house | the, red) P(‹/s› | red, house)
Note: ‹s› and ‹/s› are markers denoting the beginning and end of the sentence, respectively.
N-Gram Model Example
(The worked example on the original slides consists of figures, not reproduced here.)
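In their place, here is a minimal sketch of how a bigram model scores a sentence; the toy corpus and the ‹s›/‹/s› padding are assumptions for illustration:

from collections import Counter

# Toy corpus of tokenized sentences, padded with sentence markers.
sentences = [["<s>", "I", "saw", "the", "red", "house", "</s>"],
             ["<s>", "I", "saw", "the", "dog", "</s>"]]

unigrams = Counter()
bigrams = Counter()
for sent in sentences:
    unigrams.update(sent[:-1])          # contexts (everything that can precede a word)
    bigrams.update(zip(sent, sent[1:])) # adjacent word pairs

def p(word, prev):
    # Maximum-likelihood bigram estimate: count(prev, word) / count(prev)
    return bigrams[(prev, word)] / unigrams[prev]

# P(I saw the red house) under the bigram approximation
sent = ["<s>", "I", "saw", "the", "red", "house", "</s>"]
prob = 1.0
for prev, word in zip(sent, sent[1:]):
    prob *= p(word, prev)
print(prob)  # 0.5: every bigram has probability 1 except P(red | the) = 1/2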
Neural Language Models
• Neural Language Models do two things:
• Step 1: Process the context (model-specific)
 The main idea here is to get a vector representation of the previous context.
 Using this representation, the model predicts a probability distribution for the next token.
 This part can differ depending on the model architecture (e.g., RNN, CNN, whatever you want), but the main point is the same: to encode the context.
• Step 2: Generate a probability distribution for the next token
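The diagrams on the original slides are not reproduced; as a rough sketch of the two steps, assuming PyTorch is available (the RNN choice and all dimensions are illustrative assumptions, not the slides' specific model):

import torch
import torch.nn as nn

class TinyNeuralLM(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=32, hid_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.RNN(emb_dim, hid_dim, batch_first=True)  # Step 1: encode the context
        self.head = nn.Linear(hid_dim, vocab_size)              # Step 2: score the next token

    def forward(self, context_ids):
        _, h = self.rnn(self.embed(context_ids))   # h: vector representation of the context
        logits = self.head(h[-1])                  # one score per vocabulary word
        return torch.softmax(logits, dim=-1)       # probability distribution over the next token

model = TinyNeuralLM()
context = torch.tensor([[5, 42, 7]])               # token ids of the previous words
probs = model(context)
print(probs.shape)                                 # torch.Size([1, 1000]); rows sum to 1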
Information Retrieval (IR) System
• An information retrieval (IR) system is a set of algorithms that determine the relevance of displayed documents to searched queries.
• In simple words, it sorts and ranks documents based on a user’s query. The query and the text in the documents are represented uniformly so that they can be matched against each other.
• This also allows a matching function to be used effectively to rank documents formally by their Retrieval Status Value (RSV).
• The document contents are represented by a collection of descriptors, known as terms, that belong to a vocabulary V. An IR system also extracts feedback on the usability of the displayed results by tracking the user’s behaviour.
An information retrieval model comprises the following four key elements:
• D − Document Representation.
• Q − Query Representation.
• F − A framework to match and establish a relationship between D and Q.
• R(q, di) − A ranking function that determines the similarity between the query and a document, used to display relevant information in order.
There are three types of Information Retrieval (IR) models:
• Classical IR Model — It is designed upon basic mathematical concepts and is the most widely used of the IR models. Classical Information Retrieval models can be implemented with ease. Examples include the Vector Space, Boolean and Probabilistic IR models. In this system, the retrieval of information depends on documents containing the defined set of query terms. There is no ranking or grading of any kind. The different classical IR models take the Document Representation, Query Representation, and Retrieval/Matching function into account in their modelling.
• Non-Classical IR Model — These differ from classical models in that they are built upon propositional logic. Examples of non-classical IR models include Information Logic, Situation Theory, and Interaction models.
• Alternative IR Model — These take the principles of the classical IR model and enhance them to create more functional models, such as the Cluster model, Alternative Set-Theoretic models (e.g., the Fuzzy Set model), the Latent Semantic Indexing (LSI) model, and Alternative Algebraic models (e.g., the Generalized Vector Space model).
Classical IR Models
• Boolean Model — This model requires information to be translated into a Boolean expression and Boolean queries. The latter are used to determine the information needed, providing the right match when the Boolean expression is found to be true. It uses the Boolean operations AND, OR, NOT to create a combination of multiple terms based on what the user asks.
• Vector Space Model — This model takes documents and queries denoted as vectors and retrieves documents depending on how similar they are, ranking search results by a similarity measure, typically cosine similarity (see the sketch after this list).
• Probability Distribution Model — In this model, the documents are considered as distributions of terms, and queries are matched based on the similarity of these representations. This is made possible using entropy or by computing the probable utility of the document. They are of two types.
• Probabilistic Model — The probabilistic model is rather simple and uses probability ranking to display results. To put it simply, documents are ranked based on the probability of their relevance to a searched query.
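A minimal sketch of the vector space model using scikit-learn; the documents and query are made up for illustration:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = ["the cat sat on the mat",
        "dogs and cats are pets",
        "information retrieval ranks documents"]
query = ["cat on a mat"]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(docs)   # each document becomes a TF-IDF vector
query_vector = vectorizer.transform(query)     # the query is mapped into the same space

scores = cosine_similarity(query_vector, doc_vectors)[0]
for i in scores.argsort()[::-1]:               # rank documents by similarity to the query
    print(f"{scores[i]:.3f}  {docs[i]}")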
Prerequisites for an IR model:
• An indexing system, automated or manually operated, together with the search techniques and procedures used on it.
• A collection of documents in any one of the following formats: text, image or multimedia.
• A set of queries that serve as the input to the system, from a human or a machine.
• An evaluation metric to measure the system’s effectiveness (for instance, precision and recall), i.e., how useful the information displayed to the user is.
Components of an Information Retrieval Model
• Acquisition: The IR system sources documents and multimedia information from a variety of web resources. This data is compiled by web crawlers and sent to database storage systems.
• Representation: The free-text terms are indexed and the vocabulary is sorted, using automated or manual procedures. For instance, a document abstract will contain a summary, meta description, bibliography, and details of the authors or co-authors.
• File Organization: File organization is carried out in one of two ways, sequential or inverted. Sequential file organization stores the data document by document. An inverted file comprises a list of records, organized term by term (see the sketch after this list).
• Query: An IR system is initiated by entering a query. User queries can be formal or informal statements highlighting what information is required. In IR systems, a query does not identify a single object in the database system; it may match several objects, with varying degrees of relevance.
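A minimal sketch of an inverted file (index), mapping each term to the documents that contain it; the documents are made up for illustration:

from collections import defaultdict

docs = {1: "the cat sat on the mat",
        2: "the dog chased the cat",
        3: "dogs and cats are pets"}

inverted = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        inverted[term].add(doc_id)   # term -> set of documents containing it

print(sorted(inverted["cat"]))       # [1, 2]
print(sorted(inverted["dog"]))       # [2]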
Information Extraction
• Information extraction is the process of extracting specific (pre-specified) information from textual sources. One of the most trivial examples is when your email client extracts only the data from a message that you need in order to add an event to your calendar.
• Other free-flowing textual sources from which information extraction can distill structured information are legal acts, medical records, social media interactions and streams, online news, government documents, corporate reports and more.
• By gathering detailed structured data from texts, information extraction enables:
• The automation of tasks such as smart content classification, integrated search, management and delivery;
• Data-driven activities such as mining for patterns and trends, uncovering hidden relationships, etc.
• To elaborate a bit on this minimalist way of describing information extraction, the process involves transforming an unstructured text or a collection of texts into sets of facts (i.e., formal, machine-readable statements of the type “Bukowski is the author of Post Office”) that are then used to populate a database (like an American Literature database).
• Typically, for structured information to be extracted from unstructured texts, the following main subtasks are involved:
• Pre-processing of the text – this is where the text is prepared for processing with the help of computational linguistics tools such as tokenization, sentence splitting, morphological analysis, etc.
• Finding and classifying concepts – this is where mentions of people, things, locations, events and other pre-specified types of concepts are detected and classified.
• Connecting the concepts – this is the task of identifying relationships between the extracted concepts.
• Unifying – this subtask is about presenting the extracted data in a standard form.
• Getting rid of the noise – this subtask involves eliminating duplicate data.
• Enriching your knowledge base – this is where the extracted knowledge is ingested into your database for further use.
Tokenization
• Computers usually won't understand the language we speak or communicate with. Hence, we break the language, basically the words and sentences, into tokens and then load them into a program. The process of breaking down language into tokens is called tokenization.
• For example, consider a simple sentence: "NLP information extraction is fun". This could be tokenized into:
• One-word tokens (sometimes called unigrams): NLP, information, extraction, is, fun
• Two-word phrases (bigram tokens): NLP information, information extraction, extraction is, is fun
• Three-word phrases (trigram tokens): NLP information extraction, information extraction is, extraction is fun
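A minimal sketch of this n-gram tokenization in plain Python; whitespace splitting is an assumption, since real tokenizers also handle punctuation:

def ngrams(tokens, n):
    # Slide a window of size n over the token list.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "NLP information extraction is fun".split()
print(ngrams(tokens, 1))  # unigrams: ['NLP', 'information', 'extraction', 'is', 'fun']
print(ngrams(tokens, 2))  # bigrams:  ['NLP information', 'information extraction', ...]
print(ngrams(tokens, 3))  # trigrams: ['NLP information extraction', ...]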
POS Tagging
• Tagging parts of speech (POS) is crucial for information extraction from text, as it helps us understand the context of the text data. We usually refer to text from documents as "unstructured data": data with no defined structure or pattern.
• Hence, with POS tagging we can use techniques that provide the context of words or tokens and use it to categorise them in specific ways.
Dependency Graphs
• Dependency graphs help us find relationships between neighbouring words using directed graphs.
• Each relation provides details about the dependency type (e.g. subject, object, etc.).
• The original slides include a figure showing the dependency graph of a short sentence (not reproduced here).
• In that figure, the arrow directed from the word faster indicates that faster modifies moving, and the label `advmod` assigned to the arrow describes the exact nature of the dependency.
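As a rough sketch of POS tagging and dependency parsing together, assuming spaCy and its small English model are installed (python -m spacy download en_core_web_sm); the example sentence is made up:

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The dog is moving faster than the cat.")

for token in doc:
    # word, part of speech, dependency label, and the head word it attaches to
    print(f"{token.text:8} {token.pos_:6} {token.dep_:8} -> {token.head.text}")
# Expected, e.g.: 'faster' tagged ADV with dependency 'advmod' -> 'moving'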
Applications
– Event extraction: Given an input document, output zero or more event templates. For instance, a newspaper article might describe multiple terrorist attacks.
– Named entity recognition: recognition of known entity names (for people and organizations), place names, temporal expressions, and certain types of numerical expressions, by employing existing knowledge of the domain or information extracted from other sentences.
– Typically the recognition task involves assigning a unique identifier to the extracted entity. A simpler task is named entity detection, which aims at detecting entities without having any existing knowledge about the entity instances.
– For example, in processing the sentence "M. Smith likes fishing", named entity detection would mean detecting that the phrase "M. Smith" refers to a person, but without necessarily having (or using) any knowledge about a certain M. Smith who is (or "might be") the specific person that sentence is talking about.
– Coreference resolution: detection of coreference and anaphoric links between text entities. In IE tasks, this is typically restricted to finding links between previously extracted named entities. For example, "International Business Machines" and "IBM" refer to the same real-world entity. In the two sentences "M. Smith likes fishing. But he doesn't like biking", it would be beneficial to detect that "he" refers to the previously detected person "M. Smith".
– Relationship extraction: identification of relations between entities (a sketch follows after this list), such as:
• PERSON works for ORGANIZATION (extracted from the sentence "Bill works for IBM.")
• PERSON located in LOCATION (extracted from the sentence "Bill is in France.")
– Table extraction: finding and extracting tables from documents.
– Table information extraction: extracting information in a structured manner from tables. This is a more complex task than table extraction: table extraction is only the first step, while understanding the roles of the cells, rows and columns, linking the information inside the table, and understanding the information presented in the table are additional tasks necessary for table information extraction.
– Comments extraction: extracting comments from the actual content of an article in order to restore the link between the author and the content of each sentence.
– Terminology extraction: finding the relevant terms for a given corpus.
– Template-based music extraction: finding relevant characteristics in an audio signal taken from a given repertoire; for instance, time indexes of occurrences of percussive sounds can be extracted to represent the essential rhythmic component of a music piece.
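A rough sketch of rule-based relationship extraction for the PERSON-works-for-ORGANIZATION pattern, again assuming spaCy's small English model; a real system would use richer patterns or a trained classifier, and the entity labels the model assigns may vary:

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Bill works for IBM. Mary lives in France.")

for sent in doc.sents:
    people = [e.text for e in sent.ents if e.label_ == "PERSON"]
    orgs = [e.text for e in sent.ents if e.label_ == "ORG"]
    # Naive trigger-word rule: a PERSON and an ORG in a sentence containing "works for"
    if people and orgs and "works for" in sent.text:
        print(f"WORKS_FOR({people[0]}, {orgs[0]})")  # e.g. WORKS_FOR(Bill, IBM)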
NLP
• NLP stands for Natural Language Processing, a field at the intersection of Computer Science, human language, and Artificial Intelligence. It is the technology used by machines to understand, analyse, manipulate, and interpret human languages.
• It helps developers organize knowledge for performing tasks such as translation, automatic summarization, Named Entity Recognition (NER), speech recognition, relationship extraction, and topic segmentation.
• NLP helps users ask questions about any subject and get a direct response within seconds. It also offers exact answers to the question, without unnecessary or unwanted information.
• NLP helps computers communicate with humans in their own languages.
• Many companies use NLP to improve the efficiency and accuracy of documentation processes and to identify information in large databases.
• We, as humans, perform natural language processing (NLP) considerably well, but
even then, we are not perfect. We often misunderstand one thing for another, and we
often interpret the same sentences or words differently.
Applications
• Speech recognition, also called speech-to-text, is the task of reliably converting voice data into text data. Speech recognition is required for any application that follows voice commands or answers spoken questions. What makes speech recognition especially challenging is the way people talk: quickly, slurring words together, with varying emphasis and intonation, in different accents, and often using incorrect grammar.
• Part-of-speech tagging, also called grammatical tagging, is the process of determining the part of speech of a particular word or piece of text based on its use and context. Part-of-speech tagging identifies ‘make’ as a verb in ‘I can make a paper plane,’ and as a noun in ‘What make of car do you own?’
• Word sense disambiguation is the selection of the meaning of a word with multiple meanings through a process of semantic analysis that determines which sense makes the most sense in the given context. For example, word sense disambiguation helps distinguish the meaning of the verb ‘make’ in ‘make the grade’ (achieve) vs. ‘make a bet’ (place).
• Co-reference resolution is the task of identifying if and when two words refer to the same entity. The most common example is determining the person or object to which a certain pronoun refers (e.g., ‘she’ = ‘Mary’), but it can also involve identifying a metaphor or an idiom in the text (e.g., an instance in which ‘bear’ isn't an animal but a large hairy person).
• Sentiment analysis attempts to extract subjective qualities such as attitudes, emotions, sarcasm, confusion, and suspicion from text.
• Natural language generation is sometimes described as the opposite of speech recognition or speech-to-text; it is the task of putting structured information into human language.
• Named entity recognition (NER) identifies words or phrases as useful entities. NER identifies ‘Kentucky’ as a location or ‘Fred’ as a man's name.
• Sentiment analysis is the process of analyzing the emotions within a text and classifying them as positive, negative, or neutral.
• By running sentiment analysis on social media posts, product reviews, NPS surveys, and customer feedback, businesses can gain valuable insights into how customers perceive their brand. (The original slides illustrate this with Zoom customer and product reviews, not reproduced here.)
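A minimal sentiment-analysis sketch using NLTK's VADER analyzer (assumes nltk is installed); the review texts are made up, not the Zoom reviews from the slides:

import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download
sia = SentimentIntensityAnalyzer()

reviews = ["The video quality is fantastic and setup was easy!",
           "Constant disconnects made the meeting unusable."]
for text in reviews:
    scores = sia.polarity_scores(text)       # neg/neu/pos plus a compound score
    label = ("positive" if scores["compound"] > 0.05
             else "negative" if scores["compound"] < -0.05 else "neutral")
    print(f"{label:8} {scores['compound']:+.2f}  {text}")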
Use Cases
• Spam detection: You may not think of spam detection as an NLP solution, but the best spam detection technologies use NLP's text classification capabilities to scan emails for language that often indicates spam or phishing. These indicators can include overuse of financial terms, characteristic bad grammar, threatening language, inappropriate urgency, misspelled company names, and more. Spam detection is one of a handful of NLP problems that experts consider 'mostly solved' (although you may argue that this doesn’t match your email experience).
• Machine translation: Google Translate is an example of widely available NLP technology at work. Truly useful machine translation involves more than replacing words in one language with words of another. Effective translation has to capture accurately the meaning and tone of the input language and translate it to text with the same meaning and desired impact in the output language. Machine translation tools are making good progress in terms of accuracy. A great way to test any machine translation tool is to translate text to one language and then back to the original.
• Virtual agents and chatbots: Virtual agents such as Apple's Siri and Amazon's
Alexa use speech recognition to recognize patterns in voice commands and natural
language generation to respond with appropriate action or helpful comments.
Chatbots perform the same magic in response to typed text entries. The best of these
also learn to recognize contextual clues about human requests and use them to
provide even better responses or options over time. The next enhancement for these
applications is question answering, the ability to respond to our questions—
anticipated or not—with relevant and helpful answers in their own words.
• Social media sentiment analysis: NLP has become an essential business tool for
uncovering hidden data insights from social media channels. Sentiment analysis can
analyze language used in social media posts, responses, reviews, and more to extract
attitudes and emotions in response to products, promotions, and events; information
companies can use in product designs, advertising campaigns, and more.
• Text summarization: Text summarization uses NLP techniques to digest huge
volumes of digital text and create summaries and synopses for indexes, research
databases, or busy readers who don't have time to read full text. The best text
summarization applications use semantic reasoning and natural language generation
(NLG) to add useful context and conclusions to summaries.
Natural Language Processing is separated into two different approaches:
Rule-based Natural Language Processing:
• It uses common-sense reasoning for processing tasks. For instance, freezing temperatures can lead to death, or hot coffee can burn people's skin, along with other common-sense reasoning tasks. However, this process can take much time, and it requires manual effort.
Statistical Natural Language Processing:
• It uses large amounts of data and tries to derive conclusions from it. Statistical NLP uses machine learning algorithms to train NLP models. After successful training on large amounts of data, the trained model will have positive outcomes with deduction.
NLP Components
a. Lexical Analysis:
• With lexical analysis, we divide a whole chunk of text into paragraphs, sentences, and words. It involves identifying and analyzing the structure of words.
b. Syntactic Analysis:
• Syntactic analysis involves the analysis of the words in a sentence for grammar, and arranging the words in a manner that shows the relationships among them. For instance, the sentence “The shop goes to the house” does not pass.
c. Semantic Analysis:
• Semantic analysis draws the exact meaning of the words, and it analyzes whether the text is meaningful. Phrases such as “hot ice-cream” do not pass.
d. Discourse Integration:
• Discourse integration takes into account the context of the text: the meaning of a sentence may depend on the sentences that precede it. For example, in “He works at Google.”, “he” must be resolved using the sentence before it.
e. Pragmatic Analysis:
• Pragmatic analysis deals with the overall communication and interpretation of language. It deals with deriving the meaningful use of language in various situations.
Machine Translation (MT)
• Machine translation (MT) is automated translation: the process by which computer software is used to translate a text from one natural language (such as English) to another (such as Spanish).
• To process any translation, human or automated, the meaning of a text in the original (source) language must be fully rendered in the target language, i.e. the translation. While on the surface this seems straightforward, it is far more complex. Translation is not a mere word-for-word substitution. A translator must interpret and analyze all of the elements in the text and know how each word may influence another. This requires extensive expertise in grammar, syntax (sentence structure), semantics (meanings), etc., in the source and target languages, as well as familiarity with each local region.
• Human and machine translation each have their share of challenges. For example, no two individual translators will produce identical translations of the same text in the same language pair, and it may take several rounds of revisions to meet customer satisfaction. But the greater challenge lies in how machine translation can produce publishable-quality translations.
Rule-Based Machine Translation Technology
• Rule-based machine translation relies on countless built-in linguistic rules and millions of bilingual dictionaries for each language pair.
• The software parses text and creates a transitional representation from which the text in the target language is generated. This process requires extensive lexicons with morphological, syntactic, and semantic information, and large sets of rules. The software uses these complex rule sets and then transfers the grammatical structure of the source language into the target language.
• Translations are built on gigantic dictionaries and sophisticated linguistic rules. Users can improve the out-of-the-box translation quality by adding their terminology into the translation process: they create user-defined dictionaries which override the system's default settings.
• In most cases, there are two steps: an initial investment that significantly increases the quality at a limited cost, and an ongoing investment to increase quality incrementally. While rule-based MT brings companies to the quality threshold and beyond, the quality improvement process may be long and expensive.
• Rule-based MT, the earliest form of MT, has several serious disadvantages, including requiring significant amounts of human post-editing, the need to add languages manually, and low quality in general. It has some uses in very basic situations where a quick understanding of meaning is required.
Statistical Machine Translation
• Statistical machine translation utilizes statistical translation models whose parameters stem from the analysis of monolingual and bilingual corpora.
• Building statistical translation models is a quick process, but the technology relies heavily on existing multilingual corpora.
• A minimum of 2 million words for a specific domain, and even more for general language, are required. Theoretically it is possible to reach the quality threshold, but most companies do not have such large amounts of existing multilingual corpora to build the necessary translation models.
• Additionally, statistical machine translation is CPU-intensive and requires an extensive hardware configuration to run translation models at average performance levels.
Neural Machine Translation
• Neural machine translation (NMT) is an approach to machine translation that uses an artificial neural network to predict the likelihood of a sequence of words, typically modeling entire sentences in a single integrated model. It has the ability to persist sequential data over several time steps.
• Encoder: reads the input sequence of words from the source language and encodes that information into a real-valued vector, also known as the hidden state, thought vector or context vector. The thought vector encodes the “meaning” of the input sequence into a single vector. The encoder outputs are discarded and only the hidden or internal states are passed as initial inputs to the decoder.
• Decoder: takes the thought vector from the encoder as an input, along with the start-of-string <START> token as the initial input, to produce an output sequence.
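A minimal encoder-decoder sketch in PyTorch; the vocabulary sizes and dimensions are arbitrary assumptions, and training is omitted:

import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, src_vocab=1000, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(src_vocab, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)

    def forward(self, src_ids):
        # Encoder outputs are discarded; only the final hidden and cell
        # states (the "thought vector") are passed on to the decoder.
        _, (h, c) = self.lstm(self.embed(src_ids))
        return h, c

class Decoder(nn.Module):
    def __init__(self, tgt_vocab=1000, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(tgt_vocab, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, tgt_ids, state):
        # Starts from the <START> token, conditioned on the thought vector.
        output, state = self.lstm(self.embed(tgt_ids), state)
        return self.out(output), state

encoder, decoder = Encoder(), Decoder()
src = torch.tensor([[4, 17, 9, 2]])        # source-language token ids
start = torch.tensor([[1]])                # id of the <START> token
logits, _ = decoder(start, encoder(src))   # scores for the first output word
print(logits.shape)                        # torch.Size([1, 1, 1000])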
• LSTM stands for Long Short-Term Memory; it is capable of learning long-term dependencies quickly. LSTMs can learn to bridge time intervals in excess of 1000 steps.
• LSTMs remember information over long spans of time steps. They do this by deciding what to remember and what to forget.
• Cell states play a key role in LSTMs. An LSTM can decide whether to add or remove information from a cell state by determining how much information should flow through.
• The basic units of LSTM networks are LSTM layers that have multiple LSTM cells. Cells have an internal cell state, often abbreviated as "c", and a cell's output is what is called a "hidden state", abbreviated as "h".
Speech Recognition
• Speech recognition refers to a computer interpreting the words spoken by a person and converting them to a format that is understandable by a machine. Depending on the end goal, it is then converted to text or voice or another required format.
• For instance, Apple's Siri and Amazon's Alexa use AI-powered speech recognition to provide voice or text support, whereas voice-to-text applications like Google Dictate transcribe your dictated words to text. Voice recognition is another form of speech recognition where a source sound is recognized and matched to a person's voice.
• Speech recognition AI applications have seen significant growth in numbers in recent times as businesses increasingly adopt digital assistants and automated support to streamline their services. Voice assistants, smart home devices, search engines, etc. are a few examples where speech recognition has seen prominence.
• Using artificial intelligence and machine learning, speech recognition is fast overcoming the challenges of poor recording equipment and background noise, and of variations in people's voices, accents, dialects, semantics, contexts, etc.
• This also includes the challenges of understanding human disposition and varying human language elements like colloquialisms and acronyms.
• Furthermore, it is now an acceptable format of communication given the large
companies that endorse it and regularly employ speech recognition in their
operations. It is estimated that a majority of search engines will adopt voice
technology as an integral aspect of their search mechanism.
• This has been made possible because of improved AI and machine learning
(ML) algorithms which can process significantly large datasets and provide
greater accuracy by self-learning and adapting to evolving changes.
• Machines are programmed to “listen” to accents, dialects, contexts, emotions
and process sophisticated and arbitrary data that is readily accessible for mining
and machine learning purposes.
• From smart home devices and appliances that take instructions, and
can be switched on and off remotely, digital assistants that can set
reminders, schedule meetings, recognize a song playing in a pub, to
search engines that respond with relevant search results to user
queries, speech recognition has become an indispensable part of our
lives.
• Plenty of businesses now include speech-to-text software to enhance
their business applications and streamline the customer experience.
• Using speech recognition and natural language processing,
companies can transcribe calls, meetings, and even translate them.
• Apple, Google, Facebook, Microsoft, and Amazon are among the
tech giants who continue to leverage AI-backed speech recognition
applications to provide an exemplary user experience.
• The best systems also allow organizations to customize and adapt the technology to their specific requirements, everything from language and nuances of speech to brand recognition. For example:
• Language weighting: Improve precision by weighting specific words
that are spoken frequently (such as product names or industry jargon),
beyond terms already in the base vocabulary.
• Speaker labeling: Output a transcription that cites or tags each
speaker’s contributions to a multi-participant conversation.
• Acoustics training: Attend to the acoustical side of the business. Train
the system to adapt to an acoustic environment (like the ambient noise in
a call center) and speaker styles (like voice pitch, volume and pace).
• Profanity filtering: Use filters to identify certain words or phrases and
sanitize speech output.
USE CASES
• Voice-based speech recognition software is now used to initiate purchases, send emails, and transcribe meetings, doctor appointments, court proceedings, etc.
• Virtual assistants or digital assistants and smart home devices use voice recognition software to answer questions, provide weather news, play music, check traffic, place an order, and so on.
• Companies like Venmo and PayPal allow customers to make transactions using voice assistants. Several banks in North America and Canada also provide online banking using voice-based software.
• Ecommerce is significantly powered by voice-based assistants and allows users to make purchases quickly and seamlessly.
• Speech recognition is poised to impact transportation services and streamline scheduling, routing, and navigating across cities.
• It is also used to provide accurate subtitles for videos.
• Podcasts, meetings, and journalist interviews can be transcribed using voice recognition.
• There has been a huge impact on security through voice biometry, where the technology analyses the varying frequencies, tone and pitch of an individual's voice to create a voice profile.
• An example of this is Switzerland's telecom company Swisscom, which has enabled voice authentication technology in its call centres to prevent security breaches.
• Customer care services increasingly use AI-based voice assistants and chatbots to automate repeatable tasks.
• Healthcare: Doctors and nurses leverage dictation applications to
capture and log patient diagnoses and treatment notes.
• Sales: Speech recognition technology has a couple of applications in sales. It can help a call center transcribe thousands of phone calls between customers and agents to identify common call patterns and issues.
• Cognitive bots can also talk to people via a webpage, answering
common queries and solving basic requests without needing to wait for a
contact center agent to be available. In both instances speech recognition
systems help reduce time to resolution for consumer issues.
• Security: As technology integrates into our daily lives, security
protocols are an increasing priority. Voice-based authentication adds a
viable level of security.
• Natural language processing (NLP): While NLP isn't necessarily a specific algorithm used in speech recognition, it is the area of artificial intelligence that focuses on the interaction between humans and machines through language, via speech and text. Many mobile devices incorporate speech recognition into their systems to conduct voice search (e.g. Siri) or provide more accessibility around texting.
• N-grams: This is the simplest type of language model (LM), which assigns probabilities to sentences or phrases. An N-gram is a sequence of N words. For example, "order the pizza" is a trigram or 3-gram and "please order the pizza" is a 4-gram. Grammar and the probability of certain word sequences are used to improve recognition accuracy.
• Neural networks: Primarily leveraged for deep learning algorithms, neural networks
process training data by mimicking the interconnectivity of the human brain through
layers of nodes. Each node is made up of inputs, weights, a bias (or threshold) and an
output.
• If that output value exceeds a given threshold, it “fires” or activates the node, passing
data to the next layer in the network. Neural networks learn this mapping function
through supervised learning, adjusting based on the loss function through the process of
gradient descent. While neural networks tend to be more accurate and can accept more
data, this comes at a performance efficiency cost as they tend to be slower to train
compared to traditional language models.
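A minimal sketch of a single node (neuron) as described above; the input values, weights, bias, and step activation are illustrative assumptions:

# One node: weighted sum of inputs plus a bias; it "fires" (passes data to
# the next layer) only when the sum exceeds the threshold of 0.
inputs = [0.5, 0.3, 0.9]
weights = [0.4, -0.2, 0.7]
bias = -0.5

z = sum(x * w for x, w in zip(inputs, weights)) + bias
output = 1 if z > 0 else 0   # step activation: fire only above the threshold
print(z, output)             # 0.77 - 0.5 = 0.27 -> fires (1)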
More Related Content

PPTX
CS8691 – Artificial Intelligence unit questions
PDF
Language Models for Information Retrieval
PPTX
Tdm information retrieval
PPTX
Language Models for Information Retrieval
PPTX
Chapter 1.pptx
PDF
Chapter 1 Introduction to Information Storage and Retrieval.pdf
PPTX
Information retrival system and PageRank algorithm
PDF
ICDIM 06 Web IR Tutorial [Compatibility Mode].pdf
CS8691 – Artificial Intelligence unit questions
Language Models for Information Retrieval
Tdm information retrieval
Language Models for Information Retrieval
Chapter 1.pptx
Chapter 1 Introduction to Information Storage and Retrieval.pdf
Information retrival system and PageRank algorithm
ICDIM 06 Web IR Tutorial [Compatibility Mode].pdf

Similar to Unit_4- Principles of AI explaining the importants of AI (20)

PPT
Information Retrieval and Storage Systems
PPTX
Information retrieval introduction
PPTX
PPT Unit 5=software- engineering-21.pptx
PDF
Information Retrieval Fundamentals - An introduction
PPT
2-Chapter Two-N-gram Language Models.ppt
PPTX
Introduction to Information Retrieval (concepts and principles)
PPTX
Lecture 01: Machine Learning for Language Technology - Introduction
PPT
Language Modeling Putting a curve to the bag of words
PDF
Is this document relevant probably
PDF
Chapter 1 Introduction to ISR (1).pdf
DOCX
Language Modeling.docx
PPTX
Chapter 1 Intro Information Rerieval.pptx
PPT
Information retrival system it is part and parcel
PPT
information retirval system,search info insights in unsturtcured data
PDF
Natural_Language_processing_Unit_2_notes.pdf
PPTX
information retrival in natural language processing.pptx
PDF
Information_Retrieval_Models_Nfaoui_El_Habib
PDF
CS8080_IRT__UNIT_I_NOTES.pdf
PDF
Information Retrieval and Storage Systems
Information retrieval introduction
PPT Unit 5=software- engineering-21.pptx
Information Retrieval Fundamentals - An introduction
2-Chapter Two-N-gram Language Models.ppt
Introduction to Information Retrieval (concepts and principles)
Lecture 01: Machine Learning for Language Technology - Introduction
Language Modeling Putting a curve to the bag of words
Is this document relevant probably
Chapter 1 Introduction to ISR (1).pdf
Language Modeling.docx
Chapter 1 Intro Information Rerieval.pptx
Information retrival system it is part and parcel
information retirval system,search info insights in unsturtcured data
Natural_Language_processing_Unit_2_notes.pdf
information retrival in natural language processing.pptx
Information_Retrieval_Models_Nfaoui_El_Habib
CS8080_IRT__UNIT_I_NOTES.pdf
Ad

More from VijayAECE1 (7)

PDF
Answer key for pattern recognition and machine learning
PDF
Know this information design for department Library
PPTX
How to use SIMULINK.pptx
PPTX
UJT.pptx
PPT
Tunnel Diode.ppt
PPT
SEMICONDUCTOR PHYSICS.ppt
PPTX
Clipper Circuit.pptx
Answer key for pattern recognition and machine learning
Know this information design for department Library
How to use SIMULINK.pptx
UJT.pptx
Tunnel Diode.ppt
SEMICONDUCTOR PHYSICS.ppt
Clipper Circuit.pptx
Ad

Recently uploaded (20)

PPTX
UNIT 4 Total Quality Management .pptx
PDF
composite construction of structures.pdf
PPTX
CYBER-CRIMES AND SECURITY A guide to understanding
PDF
Human-AI Collaboration: Balancing Agentic AI and Autonomy in Hybrid Systems
PDF
PPT on Performance Review to get promotions
PPTX
Geodesy 1.pptx...............................................
PDF
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
PPTX
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
PPTX
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
DOCX
573137875-Attendance-Management-System-original
PPTX
Construction Project Organization Group 2.pptx
PDF
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
PDF
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
PDF
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026
PPTX
CARTOGRAPHY AND GEOINFORMATION VISUALIZATION chapter1 NPTE (2).pptx
PPTX
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
PDF
III.4.1.2_The_Space_Environment.p pdffdf
PPTX
Internet of Things (IOT) - A guide to understanding
PDF
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
PPTX
OOP with Java - Java Introduction (Basics)
UNIT 4 Total Quality Management .pptx
composite construction of structures.pdf
CYBER-CRIMES AND SECURITY A guide to understanding
Human-AI Collaboration: Balancing Agentic AI and Autonomy in Hybrid Systems
PPT on Performance Review to get promotions
Geodesy 1.pptx...............................................
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
573137875-Attendance-Management-System-original
Construction Project Organization Group 2.pptx
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026
CARTOGRAPHY AND GEOINFORMATION VISUALIZATION chapter1 NPTE (2).pptx
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
III.4.1.2_The_Space_Environment.p pdffdf
Internet of Things (IOT) - A guide to understanding
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
OOP with Java - Java Introduction (Basics)

Unit_4- Principles of AI explaining the importants of AI

  • 1. SRI RAMAKRISHNA ENGINEERING COLLEGE Course Instructors: Mr.J.Judeson Antony Kovilpillai, Asst. Prof./ECE No. of Credits : 3 20AI202 PRINCIPLES OF ARTIFICIAL INTELLIGENCE Department of Electronics and Communication Engineering [Educational Service : SNR Sons Charitable Trust] [Autonomous Institution, Accredited by NAAC with ‘A’ Grade] [Approved by AICTE and Permanently Affiliated to Anna University, Chennai] [ISO 9001-2015 Certified and all eligible programmes Accredited by NBA] VATTAMALAIPALAYAM, N.G.G.O. COLONY POST, COIMBATORE – 641 022. 2/23/2024 1
  • 2. VISION OF THE COLLEGE Vision of the College: • To develop into a leading world class Technological University consisting of Schools of Excellence in various disciplines with a co-existent Centre for Engineering Solutions Development for world-wide clientele. 2/23/2024 2
  • 3. MISSION OF THE COLLEGE Mission of the College: To provide all necessary inputs to the students for them to grow into knowledge engineers and scientists attaining. • Excellence in domain knowledge- practice and theory. • Excellence in co-curricular and Extra curricular talents. • Excellence in character and personality. 2/23/2024 3
  • 4. VISION OF THE DEPARTMENT Vision of the Department: • To develop Electronics and Communication Engineers by keeping pace with changing technologies, professionalism, creativity research and employability. 2/23/2024 4
  • 5. MISSION OF THE DEPARTMENT Mission of the Department: • To provide quality an contemporary education through effective teaching- learning process that equips the students with adequate knowledge in Electronics and Communication Engineering for a successful career. • To inculcate the students in problem solving and lifelong learning skills that will enable them to pursue higher studies and career in research. • To produce engineers with effective communication skills, the abilities to lead a team adhering to ethical values and inclination serve the society. 2/23/2024 5
  • 6. On successful completion of the course, students will be able to CO1 : Understand the basics of Artificial Intelligence and Intelligent Agents. CO2 : Apply the problem-solving strategies for real life scenario applications CO3 : Make use of Machine learning techniques for problem solving CO4 : Analyze the various applications of Artificial Intelligence COURSE OUTCOMES 2/23/2024 6
  • 7. • A simple definition of a Language Model is an AI model that has been trained to predict the next word or words in a text based on the preceding words, its part of the technology that predicts the next word you want to type on your mobile phone allowing you to complete the message faster. • The task of predicting the next word/s is referred to as self-supervised learning, it does not need labels it just needs lots of text • There is a broad classification of Language Models that fit into two main groups that are: • Statistical Language Models: These models use traditional statistical techniques like N-grams, Hidden Markov Models (HMM) and certain linguistic rules to learn the probability distribution of words. • Neural Language Models: These are new players in the NLP town and have surpassed the statistical language models in their effectiveness. They use different kinds of Neural Networks to model language. 2/23/2024 7 Language Models
  • 8. • Let’s begin with the task of computing P(w|H) — probability of word ‘w’, given some history ‘H’. • Suppose the ‘H’ is ‘its water is so transparent that’, and we want to know the probability of next word ‘the’: P(the|its water is so transparent that). • One way to estimate this probability — relative frequency counts. • Take a large corpus, count the number of time ‘its water is so transparent that’ and also count the number of times it has been followed by ‘the’. 2/23/2024 8 N- Gram Model
  • 9. • While this method of estimating probabilities directly from counts works fine in many cases, it turns out that even the web isn’t big enough to give us good estimates in most cases. Why? Because language is creative, and there are new sentences added everyday, which we wont be able to count. • For this reason’s we need to introduce cleverer ways to calculate the probability of word w given history H. • To represent the probability of a particular random variable Xi taking on the value “the”, or P(Xi = “the”), we will use the simplification P(the). • We’ll represent a sequence of N words either as w1 . . . wn or wn (so the expression wn−1 means the string w1,w2,…,wn−1). For the joint probability of each word in a sequence having a particular value P(X = w1,Y = w2,Z = w3,…,W = wn) we’ll use P(w1,w2,…,wn). 2/23/2024 9
  • 10. • The intuition of the n-gram model is that instead of computing the probability of a word given its entire history, we can approximate the history by just the last few words. • In the bigram (n = 2) language model the sentence “I saw the red house” is approximated as: • P(I, saw, the, red, house) ≈ P(I|‹s›)P(saw | I)P(the | saw)P(red | the)P(house | red)P(‹/s› | house) • In a trigram (n = 3) language model it will approximate as: • P(I, saw, the, red, house) ≈ P(I|‹s›, ‹s›)P(saw | ‹s›, I)P(the | I, saw)P(red | saw, the) P(house | the, red)P(‹/s› | red, house) Note: ‹s› is a marker denoting the beginning and end of the sentence. 2/23/2024 10
  • 18. • Neural Language Models do two things: • Step 1: Process context → model-specific  The main idea here is to get a vector representation for the previous context.  Using this representation, a model predicts a probability distribution for the next token.  This part could be different depending on model architecture (e.g., RNN, CNN, whatever you want), but the main point is the same - to encode context. • Step 2: Generate a probability distribution for the next token 2/23/2024 18 Neural Language Models
  • 26. • An information retrieval (IR) system is a set of algorithms that facilitate the relevance of displayed documents to searched queries. • In simple words, it works to sort and rank documents based on the queries of a user. There is uniformity with respect to the query and text in the document to enable document accessibility. • This also allows a matching function to be used effectively to rank a document formally using their Retrieval Status Value (RSV). • The document contents are represented by a collection of descriptors, known as terms, that belong to a vocabulary V. An IR system also extracts feedback on the usability of the displayed results by tracking the user’s behaviour. 2/23/2024 26 Information retrieval (IR) system
  • 27. An information retrieval comprises of the following four key elements: • D − Document Representation. • Q − Query Representation. • F − A framework to match and establish a relationship between D and Q. • R (q, di) − A ranking function that determines the similarity between the query and the document to display relevant information. 2/23/2024 27
  • 28. There are three types of Information Retrieval (IR) models: • Classical IR Model — It is designed upon basic mathematical concepts and is the most widely-used of IR models. Classic Information Retrieval models can be implemented with ease. Its examples include Vector-space, Boolean and Probabilistic IR models. In this system, the retrieval of information depends on documents containing the defined set of queries. There is no ranking or grading of any kind. The different classical IR models take Document Representation, Query representation, and Retrieval/Matching function into account in their modelling. • Non-Classical IR Model — They differ from classic models in that they are built upon propositional logic. Examples of non-classical IR models include Information Logic, Situation Theory, and Interaction models. • Alternative IR Model — These take principles of classical IR model and enhance upon to create more functional models like the Cluster model, Alternative Set-Theoretic Models Fuzzy Set model, Latent Semantic Indexing (LSI) model, Alternative Algebraic Models Generalized Vector Space Model, etc. 2/23/2024 28
  • 29. • Boolean Model — This model required information to be translated into a Boolean expression and Boolean queries. The latter is used to determine the information needed to be able to provide the right match when the Boolean expression is found to be true. It uses Boolean operations AND, OR, NOT to create a combination of multiple terms based on what the user asks. • Vector Space Model — This model takes documents and queries denoted as vectors and retrieves documents depending on how similar they are. This can result in two types of vectors which are then used to rank search results either • Probability Distribution Model — In this model, the documents are considered as distributions of terms and queries are matched based on the similarity of these representations. This is made possible using entropy or by computing the probable utility of the document. They are if two types: • Probabilistic Models — The probabilistic model is rather simple and takes the probability ranking to display results. To put it simply, documents are ranked based on the probability of their relevance to a searched query. 2/23/2024 29 Classical IR Models
  • 30. Prerequisites for an IR model: • An automated or manually-operated indexing system used to index and search techniques and procedures. • A collection of documents in any one of the following formats: text, image or multimedia. • A set of queries that serve as the input to a system, via a human or machine. • An evaluation metric to measure or evaluate a system’s effectiveness (for instance, precision and recall). For instance, to ensure how useful the information displayed to the user is. 2/23/2024 30
  • 31. • Acquisition The IR system sources documents and multimedia information from a variety of web resources. This data is compiled by web crawlers and is sent to database storage systems. • Representation The free-text terms are indexed, and the vocabulary is sorted, both using automated or manual procedures. For instance, a document abstract will contain a summary, meta description, bibliography, and details of the authors or co- authors. • File Organization File organization is carried out in one of two methods, sequential or inverted. Sequential file organization involves data contained in the document. The Inverted file comprises a list of records, in a term by term manner. • Query An IR system is initiated on entering a query. User queries can either be formal or informal statements highlighting what information is required. In IR systems, a query is not indicative of a single object in the database system. It could refer to several objects whichever match the query. However, their degrees of relevance may vary. 2/23/2024 31 Components of an Information Retrieval Model
  • 35. • Information extraction is the process of extracting specific (pre- specified) information from textual sources. One of the most trivial examples is when your email extracts only the data from the message for you to add in your Calendar. • Other free-flowing textual sources from which information extraction can distill structured information are legal acts, medical records, social media interactions and streams, online news, government documents, corporate reports and more. • Gathering detailed structured data from texts, information extraction enables: • The automation of tasks such as smart content classification, integrated search, management and delivery; • Data-driven activities such as mining for patterns and trends, uncovering hidden relationships, etc. 2/23/2024 35 Information extraction
  • 37. • To elaborate a bit on this minimalist way of describing information extraction, the process involves transforming an unstructured text or a collection of texts into sets of facts (i.e., formal, machine-readable statements of the type “Bukowski is the author of Post Office“) that are further populated (filled) in a database (like an American Literature database). • Typically, for structured information to be extracted from unstructured texts, the following main subtasks are involved: • Pre-processing of the text – this is where the text is prepared for processing with the help of computational linguistics tools such as tokenization, sentence splitting, morphological analysis, etc. • Finding and classifying concepts – this is where mentions of people, things, locations, events and other pre-specified types of concepts are detected and classified. 2/23/2024 37
  • 38. • Connecting the concepts – this is the task of identifying relationships between the extracted concepts. • Unifying – this subtask is about presenting the extracted data into a standard form. • Getting rid of the noise – this subtask involves eliminating duplicate data. • Enriching your knowledge base – this is where the extracted knowledge is ingested in your database for further use. 2/23/2024 38
  • 39. • Computers usually won't understand the language we speak or communicate with. Hence, we break the language, basically the words and sentences, into tokens and then load it into a program. The process of breaking down language into tokens is called tokenization. • For example, consider a simple sentence: "NLP information extraction is fun''. This could be tokenized into: • One-word (sometimes called unigram token): NLP, information, extraction, is, fun • Two-word phrase (bigram tokens): NLP information, information extraction, extraction is, is fun, fun NLP • Three-word sentence (trigram tokens): NLP information extraction, information extraction is, extraction is fun 2/23/2024 39 Tokenization
POS tagging
• Tagging parts of speech is crucial for information extraction from text, because it helps us understand the context of the text data. Text from documents is usually referred to as "unstructured data", data with no defined structure or pattern.
• With POS tagging we can capture the grammatical context of words or tokens and use it to categorise them in specific ways.
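As a quick illustration, NLTK's off-the-shelf tagger can be used (this assumes nltk is installed and the 'punkt' and 'averaged_perceptron_tagger' resources have been downloaded):

```python
import nltk

tokens = nltk.word_tokenize("NLP information extraction is fun")
print(nltk.pos_tag(tokens))
# e.g. [('NLP', 'NNP'), ('information', 'NN'), ('extraction', 'NN'), ('is', 'VBZ'), ('fun', 'NN')]
```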
Dependency graphs
• Dependency graphs help us find relationships between neighbouring words using directed graphs.
• Each relation carries a dependency type (e.g. subject, object, etc.).
• In the example dependency graph shown on the slide, an arrow directed from the word "faster" indicates that "faster" modifies "moving", and the label `advmod` assigned to the arrow describes the exact nature of the dependency.
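One way to inspect such dependencies in practice is spaCy's parser; a sketch, assuming the en_core_web_sm model has been installed (the sentence is illustrative):

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The car is moving faster")
for token in doc:
    # dep_ is the dependency label (e.g. 'advmod'); head is the governing word
    print(f"{token.text:>7} --{token.dep_}--> {token.head.text}")
```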
Applications
• Event extraction: given an input document, output zero or more event templates. For instance, a newspaper article might describe multiple terrorist attacks.
• Named entity recognition: recognition of known entity names (for people and organizations), place names, temporal expressions, and certain types of numerical expressions, employing existing knowledge of the domain or information extracted from other sentences. Typically the recognition task involves assigning a unique identifier to the extracted entity. A simpler task is named entity detection, which aims to detect entities without any existing knowledge about the entity instances. For example, in processing the sentence "M. Smith likes fishing", named entity detection would detect that the phrase "M. Smith" refers to a person, without necessarily having (or using) any knowledge about the specific M. Smith the sentence is talking about.
• Coreference resolution: detection of coreference and anaphoric links between text entities. In IE tasks, this is typically restricted to finding links between previously extracted named entities. For example, "International Business Machines" and "IBM" refer to the same real-world entity. Given the two sentences "M. Smith likes fishing. But he doesn't like biking", it is beneficial to detect that "he" refers to the previously detected person "M. Smith".
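A short named-entity-recognition sketch with spaCy (same en_core_web_sm assumption; the exact labels produced depend on the model):

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("M. Smith works for International Business Machines in New York.")
for ent in doc.ents:
    print(ent.text, ent.label_)  # e.g. PERSON, ORG, GPE
```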
• Relationship extraction: identification of relations between entities, such as:
• PERSON works for ORGANIZATION (extracted from the sentence "Bill works for IBM.")
• PERSON located in LOCATION (extracted from the sentence "Bill is in France.")
• Table extraction: finding and extracting tables from documents.
• Table information extraction: extracting information from tables in a structured manner. This is a more complex task than table extraction, which is only the first step; understanding the roles of the cells, rows and columns, linking the information inside the table, and understanding the information the table presents are additional tasks required for table information extraction.
• Comments extraction: separating comments from the actual content of an article in order to restore the link between each comment and its author.
• Terminology extraction: finding the relevant terms for a given corpus.
• Template-based music extraction: finding relevant characteristics in an audio signal taken from a given repertoire; for instance, the time indexes of occurrences of percussive sounds can be extracted to represent the essential rhythmic component of a music piece.
NLP
• NLP stands for Natural Language Processing, a field at the intersection of Computer Science, human language, and Artificial Intelligence. It is the technology used by machines to understand, analyse, manipulate, and interpret human languages.
• It helps developers organize knowledge for performing tasks such as translation, automatic summarization, Named Entity Recognition (NER), speech recognition, relationship extraction, and topic segmentation.
• NLP lets users ask questions about any subject and get a direct response within seconds. It offers exact answers to the question, without unnecessary or unwanted information.
• NLP helps computers communicate with humans in their own languages.
• Many companies use NLP to improve the efficiency and accuracy of documentation processes and to identify information in large databases.
• We, as humans, perform natural language processing (NLP) considerably well, but even then we are not perfect. We often mistake one thing for another, and we often interpret the same sentences or words differently.
Applications
• Speech recognition, also called speech-to-text, is the task of reliably converting voice data into text data. Speech recognition is required for any application that follows voice commands or answers spoken questions. What makes speech recognition especially challenging is the way people talk: quickly, slurring words together, with varying emphasis and intonation, in different accents, and often using incorrect grammar.
• Part-of-speech tagging, also called grammatical tagging, is the process of determining the part of speech of a particular word or piece of text based on its use and context. It identifies 'make' as a verb in 'I can make a paper plane,' and as a noun in 'What make of car do you own?'
• Word sense disambiguation is the selection of the intended meaning of a word with multiple meanings, through a process of semantic analysis that determines which sense makes the most sense in the given context. For example, word sense disambiguation helps distinguish the meaning of the verb 'make' in 'make the grade' (achieve) vs. 'make a bet' (place).
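Word sense disambiguation can be tried with the classic Lesk algorithm shipped in NLTK; a sketch, assuming the 'wordnet' and 'punkt' resources are available (the sentence is invented):

```python
from nltk.tokenize import word_tokenize
from nltk.wsd import lesk

context = word_tokenize("I went to the bank to deposit my money")
sense = lesk(context, "bank", "n")   # pick a noun sense of 'bank' given the context
print(sense, "->", sense.definition())
```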
• Co-reference resolution is the task of identifying if and when two words refer to the same entity. The most common example is determining the person or object to which a certain pronoun refers (e.g., 'she' = 'Mary'), but it can also involve identifying a metaphor or an idiom in the text (e.g., an instance in which 'bear' isn't an animal but a large hairy person).
• Sentiment analysis attempts to extract subjective qualities (attitudes, emotions, sarcasm, confusion, suspicion) from text.
• Natural language generation is sometimes described as the opposite of speech recognition or speech-to-text; it is the task of putting structured information into human language.
• Named entity recognition (NER) identifies words or phrases as useful entities: it identifies 'Kentucky' as a location or 'Fred' as a man's name.
• Sentiment analysis is the process of analyzing the emotions within a text and classifying them as positive, negative, or neutral.
• By running sentiment analysis on social media posts, product reviews, NPS surveys, and customer feedback, businesses can gain valuable insights into how customers perceive their brand. Customer and product reviews of a tool such as Zoom (shown on the slide) are a typical example.
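As an illustration, NLTK's VADER analyzer assigns positive/negative/neutral polarity to short texts (assumes the 'vader_lexicon' resource has been downloaded; the example reviews are invented):

```python
from nltk.sentiment import SentimentIntensityAnalyzer

sia = SentimentIntensityAnalyzer()
for review in ["The calls are smooth and reliable.",
               "The audio kept dropping, very frustrating."]:
    c = sia.polarity_scores(review)["compound"]
    label = "positive" if c > 0.05 else "negative" if c < -0.05 else "neutral"
    print(label, "-", review)
```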
Use Cases
• Spam detection: You may not think of spam detection as an NLP solution, but the best spam detection technologies use NLP's text classification capabilities to scan emails for language that often indicates spam or phishing. These indicators can include overuse of financial terms, characteristic bad grammar, threatening language, inappropriate urgency, misspelled company names, and more. Spam detection is one of a handful of NLP problems that experts consider 'mostly solved' (although you may argue that this doesn't match your email experience).
• Machine translation: Google Translate is an example of widely available NLP technology at work. Truly useful machine translation involves more than replacing words in one language with words of another. Effective translation has to accurately capture the meaning and tone of the input language and translate it into text with the same meaning and desired impact in the output language. Machine translation tools are making good progress in accuracy. A great way to test any machine translation tool is to translate text into one language and then back to the original.
• Virtual agents and chatbots: Virtual agents such as Apple's Siri and Amazon's Alexa use speech recognition to recognize patterns in voice commands and natural language generation to respond with appropriate actions or helpful comments. Chatbots perform the same magic in response to typed text entries. The best of these also learn to recognize contextual clues about human requests and use them to provide even better responses or options over time. The next enhancement for these applications is question answering: the ability to respond to our questions, anticipated or not, with relevant and helpful answers in their own words.
• Social media sentiment analysis: NLP has become an essential business tool for uncovering hidden data insights from social media channels. Sentiment analysis can analyze the language used in social media posts, responses, reviews, and more to extract attitudes and emotions in response to products, promotions, and events; companies can use this information in product designs, advertising campaigns, and more.
• Text summarization: Text summarization uses NLP techniques to digest huge volumes of digital text and create summaries and synopses for indexes, research databases, or busy readers who don't have time to read the full text. The best text summarization applications use semantic reasoning and natural language generation (NLG) to add useful context and conclusions to summaries.
Natural Language Processing is commonly separated into two different approaches:
Rule-based Natural Language Processing:
• It uses hand-crafted rules and common-sense reasoning for processing tasks, for instance encoding that freezing temperatures can lead to death, or that hot coffee can burn people's skin. However, this approach takes much time and requires manual effort.
Statistical Natural Language Processing:
• It uses large amounts of data and tries to derive conclusions from them. Statistical NLP uses machine learning algorithms to train NLP models; after successful training on large amounts of data, the trained model can draw sound conclusions from new inputs.
a. Lexical Analysis:
• Lexical analysis divides the whole chunk of text into paragraphs, sentences, and words. It involves identifying and analyzing the structure of words.
b. Syntactic Analysis:
• Syntactic analysis analyzes the words in a sentence for grammar and arranges them in a manner that shows the relationships among the words. For instance, the sentence "The shop goes to the house" does not pass.
c. Semantic Analysis:
• Semantic analysis draws the exact meaning of the words and checks the text for meaningfulness. Sentences such as "hot ice-cream" do not pass.
d. Discourse Integration:
• Discourse integration takes into account the context of the text: the meaning of a sentence may depend on the sentences that come before it. For example, in "He works at Google.", the pronoun "he" depends on a preceding sentence for its reference.
e. Pragmatic Analysis:
• Pragmatic analysis deals with the overall communication and interpretation of language: deriving the meaningful use of language in various situations.
Machine translation (MT)
• Machine translation (MT) is automated translation: the process by which computer software translates a text from one natural language (such as English) to another (such as Spanish).
• To process any translation, human or automated, the meaning of a text in the original (source) language must be fully conveyed in the target language, i.e. the translation. While on the surface this seems straightforward, it is far more complex. Translation is not a mere word-for-word substitution. A translator must interpret and analyze all of the elements in the text and know how each word may influence another. This requires extensive expertise in grammar, syntax (sentence structure), semantics (meanings), etc., in the source and target languages, as well as familiarity with each local region.
• Human and machine translation each have their share of challenges. For example, no two individual translators will produce identical translations of the same text in the same language pair, and it may take several rounds of revision to satisfy the customer. But the greater challenge lies in how machine translation can produce translations of publishable quality.
Rule-Based Machine Translation Technology
• Rule-based machine translation relies on countless built-in linguistic rules and millions of bilingual dictionary entries for each language pair.
• The software parses the text and creates a transitional representation from which the text in the target language is generated. This process requires extensive lexicons with morphological, syntactic, and semantic information, and large sets of rules. The software uses these complex rule sets to transfer the grammatical structure of the source language into the target language.
• Translations are built on gigantic dictionaries and sophisticated linguistic rules. Users can improve the out-of-the-box translation quality by adding their own terminology to the translation process: they create user-defined dictionaries which override the system's default settings.
• In most cases, there are two investment steps: an initial investment that significantly increases quality at a limited cost, and an ongoing investment to increase quality incrementally. While rule-based MT brings companies to the quality threshold and beyond, the quality improvement process may be long and expensive.
• Rule-based MT is the earliest form of MT. It has several serious disadvantages, including the need for significant amounts of human post-editing, the need to add languages manually, and low quality in general. It has some uses in very basic situations where a quick understanding of meaning is required.
Statistical machine translation
• Statistical machine translation utilizes statistical translation models whose parameters stem from the analysis of monolingual and bilingual corpora.
• Building statistical translation models is a quick process, but the technology relies heavily on existing multilingual corpora.
• A minimum of 2 million words is required for a specific domain, and even more for general language. Theoretically it is possible to reach the quality threshold, but most companies do not have such large amounts of existing multilingual corpora from which to build the necessary translation models.
• Additionally, statistical machine translation is CPU-intensive and requires an extensive hardware configuration to run translation models at average performance levels.
Neural machine translation
• Neural machine translation (NMT) is an approach to machine translation that uses an artificial neural network to predict the likelihood of a sequence of words, typically modeling entire sentences in a single integrated model. Such a network can persist information about the input sequence over several time steps.
• Encoder: reads the input sequence of words in the source language and encodes that information into a real-valued vector, also known as the hidden state, thought vector, or context vector. The thought vector encodes the "meaning" of the input sequence in a single vector. The encoder outputs are discarded; only the hidden (internal) states are passed as initial inputs to the decoder.
• Decoder: takes the thought vector from the encoder, along with the start-of-string <START> token as its initial input, and produces an output sequence.
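A minimal Keras sketch of this encoder-decoder wiring (the dimensions and vocabulary sizes are illustrative assumptions, and training code is omitted):

```python
from tensorflow import keras
from tensorflow.keras import layers

latent_dim, src_vocab, tgt_vocab = 256, 5000, 6000  # illustrative sizes

# Encoder: its outputs are discarded; only the hidden/cell states are kept
encoder_inputs = keras.Input(shape=(None, src_vocab))
_, state_h, state_c = layers.LSTM(latent_dim, return_state=True)(encoder_inputs)

# Decoder: initialised with the encoder states (the "thought vector")
decoder_inputs = keras.Input(shape=(None, tgt_vocab))
decoder_lstm = layers.LSTM(latent_dim, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=[state_h, state_c])
decoder_outputs = layers.Dense(tgt_vocab, activation="softmax")(decoder_outputs)

model = keras.Model([encoder_inputs, decoder_inputs], decoder_outputs)
```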
• LSTM stands for Long Short-Term Memory; LSTMs are capable of learning long-term dependencies quickly. An LSTM can learn to bridge time intervals in excess of 1,000 steps.
• LSTMs remember information over long stretches of time steps. They do this by deciding what to remember and what to forget.
• Cell states play a key role in an LSTM. An LSTM can decide to add or remove information from a cell state by determining how much information should flow through.
• The basic units of LSTM networks are LSTM layers, which contain multiple LSTM cells. Each cell has an internal cell state, often abbreviated as "c", and the cell's output is called the "hidden state", abbreviated as "h".
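The "h" and "c" states are directly visible in Keras when return_state is requested; a small sketch with made-up input shapes:

```python
import numpy as np
from tensorflow.keras import layers

x = np.random.rand(1, 10, 8).astype("float32")   # (batch, time steps, features)
output, h, c = layers.LSTM(4, return_state=True)(x)
print(h.shape, c.shape)  # hidden state and cell state, both (1, 4)
```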
Speech recognition
• Speech recognition refers to a computer interpreting the words spoken by a person and converting them into a machine-understandable format. Depending on the end goal, this is then converted to text, voice, or another required format.
• For instance, Apple's Siri and Amazon's Alexa use AI-powered speech recognition to provide voice or text support, whereas voice-to-text applications like Google Dictate transcribe your dictated words to text. Voice recognition is another form of speech recognition in which a source sound is recognized and matched to a person's voice.
• Speech recognition AI applications have grown significantly in number in recent times as businesses increasingly adopt digital assistants and automated support to streamline their services. Voice assistants, smart home devices, search engines, etc. are a few examples where speech recognition is prominent.
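A speech-to-text sketch using the third-party SpeechRecognition package (assumes it is installed, that an "audio.wav" file exists, and uses the free Google Web Speech API backend purely for illustration):

```python
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.AudioFile("audio.wav") as source:   # hypothetical recording
    audio = recognizer.record(source)
print(recognizer.recognize_google(audio))   # transcribed text
```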
• Using artificial intelligence and machine learning, speech recognition is fast overcoming challenges such as poor recording equipment, background noise, and variations in people's voices, accents, dialects, semantics, and contexts.
• This also includes the challenges of understanding human disposition and varying human language elements such as colloquialisms, acronyms, etc.
• Furthermore, it is now an accepted format of communication, given the large companies that endorse it and regularly employ speech recognition in their operations. It is estimated that a majority of search engines will adopt voice technology as an integral aspect of their search mechanism.
• This has been made possible by improved AI and machine learning (ML) algorithms which can process significantly larger datasets and provide greater accuracy by self-learning and adapting to evolving changes.
• Machines are programmed to "listen" to accents, dialects, contexts, and emotions, and to process sophisticated and arbitrary data that is readily accessible for mining and machine learning purposes.
• From smart home devices and appliances that take instructions and can be switched on and off remotely, to digital assistants that can set reminders, schedule meetings, or recognize a song playing in a pub, to search engines that respond to user queries with relevant results, speech recognition has become an indispensable part of our lives.
• Plenty of businesses now include speech-to-text software to enhance their business applications and streamline the customer experience.
• Using speech recognition and natural language processing, companies can transcribe calls and meetings, and even translate them.
• Apple, Google, Facebook, Microsoft, and Amazon are among the tech giants that continue to leverage AI-backed speech recognition applications to provide an exemplary user experience.
• The best systems also allow organizations to customize and adapt the technology to their specific requirements: everything from language and nuances of speech to brand recognition. For example:
• Language weighting: improve precision by weighting specific words that are spoken frequently (such as product names or industry jargon), beyond the terms already in the base vocabulary.
• Speaker labeling: output a transcription that cites or tags each speaker's contributions to a multi-participant conversation.
• Acoustics training: attend to the acoustic side of the business. Train the system to adapt to an acoustic environment (like the ambient noise in a call center) and speaker styles (like voice pitch, volume and pace).
• Profanity filtering: use filters to identify certain words or phrases and sanitize speech output.
USE CASES
• Voice-based speech recognition software is now used to initiate purchases, send emails, and transcribe meetings, doctor's appointments, court proceedings, etc.
• Virtual assistants (digital assistants) and smart home devices use voice recognition software to answer questions, provide weather news, play music, check traffic, place orders, and so on.
• Companies like Venmo and PayPal allow customers to make transactions using voice assistants. Several banks in North America and Canada also provide online banking using voice-based software.
• Ecommerce is significantly powered by voice-based assistants and allows users to make purchases quickly and seamlessly.
• Speech recognition is poised to impact transportation services and streamline scheduling, routing, and navigation across cities.
• It is also used to provide accurate subtitles for videos.
• Podcasts, meetings, and journalist interviews can be transcribed using voice recognition.
• Voice biometry has had a huge impact on security: the technology analyses the varying frequencies, tone and pitch of an individual's voice to create a voice profile.
• An example of this is Switzerland's telecom company Swisscom, which has enabled voice authentication technology in its call centres to prevent security breaches.
• Customer care services are increasingly handled by AI-based voice assistants and chatbots that automate repeatable tasks.
• Healthcare: Doctors and nurses leverage dictation applications to capture and log patient diagnoses and treatment notes.
• Sales: Speech recognition technology has a couple of applications in sales. It can help a call center transcribe thousands of phone calls between customers and agents to identify common call patterns and issues.
• Cognitive bots can also talk to people via a webpage, answering common queries and solving basic requests without needing to wait for a contact center agent to be available. In both instances, speech recognition systems help reduce the time to resolution for consumer issues.
• Security: As technology integrates into our daily lives, security protocols are an increasing priority. Voice-based authentication adds a viable level of security.
• Natural language processing (NLP): While NLP isn't necessarily a specific algorithm used in speech recognition, it is the area of artificial intelligence which focuses on the interaction between humans and machines through language, both speech and text. Many mobile devices incorporate speech recognition into their systems to conduct voice search (e.g. Siri) or to provide more accessibility around texting.
• N-grams: This is the simplest type of language model (LM), which assigns probabilities to sentences or phrases. An N-gram is a sequence of N words. For example, "order the pizza" is a trigram or 3-gram and "please order the pizza" is a 4-gram. Grammar and the probability of certain word sequences are used to improve recognition accuracy.
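A toy bigram model in this spirit, estimating P(word | previous word) from raw counts (the corpus is invented):

```python
from collections import Counter

corpus = "please order the pizza please order the salad".split()
bigram_counts = Counter(zip(corpus, corpus[1:]))
unigram_counts = Counter(corpus)

def p(word, prev):
    """Maximum-likelihood estimate of P(word | prev)."""
    return bigram_counts[(prev, word)] / unigram_counts[prev]

print(p("the", "order"))   # 1.0: 'order' is always followed by 'the' here
print(p("pizza", "the"))   # 0.5: 'the' is followed by 'pizza' half the time
```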
• Neural networks: Primarily leveraged for deep learning algorithms, neural networks process training data by mimicking the interconnectivity of the human brain through layers of nodes. Each node is made up of inputs, weights, a bias (or threshold), and an output.
• If the output value exceeds a given threshold, the node "fires" or activates, passing data to the next layer in the network. Neural networks learn this mapping function through supervised learning, adjusting the weights based on the loss function through gradient descent. While neural networks tend to be more accurate and can accept more data, this comes at a cost in performance efficiency, as they tend to be slower to train than traditional language models.
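The node computation described above, as a few lines of NumPy (a sigmoid activation is chosen here purely for illustration):

```python
import numpy as np

def node(inputs, weights, bias):
    """Weighted sum of inputs plus bias, squashed by a sigmoid activation."""
    z = np.dot(inputs, weights) + bias
    return 1.0 / (1.0 + np.exp(-z))

print(node(np.array([0.5, 0.3]), np.array([0.8, -0.2]), bias=0.1))  # ~0.61
```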