From the course: CompTIA Advanced Security Practitioner (CASP+) (CAS-004) Cert Prep
Artificial intelligence and machine learning
- In this lesson, we're going to talk about artificial intelligence and machine learning. First, let's talk about artificial intelligence. Artificial intelligence is the science of creating machines with the ability to develop problem-solving and analysis strategies without significant human direction or intervention. Essentially, we want a machine that can think for itself and simulate the human decision-making process. Now, there are a lot of great things that artificial intelligence can do, especially in the cybersecurity industry. Using artificial intelligence, we're able to create expert systems. The earliest attempts at artificial intelligence were really more akin to automation: the system was given a large set of if/then/else statements that told it exactly how to think based on a limited dataset, using knowledge bases and set rules. But modern AI can actually think for itself and create its own rules, and this is where things get really interesting in terms of cybersecurity. With an artificial intelligence system, the AI uses an algorithm with some training data to learn how to respond to different situations. Over time, it learns by adding more information to its own datasets, and it evolves and grows. Artificial intelligence can also learn from its past experiences, which can help it identify incidents faster and lower your overall incident response times. Unlike our human analysts, AI can go through hundreds, thousands, or even millions of items every single day, quickly deciding whether something is malicious or benign. But the real benefit is that AI isn't reliant on signature-based indicators of compromise; instead, it learns over time, just like the bad actors do. AI is truly the future, but it won't replace all of our human analysts, so don't worry about losing your job just yet. After all, machines make mistakes just like people do. The best AI deployments are the ones where AI is used in conjunction with our human analysts. The AI can take care of all the obviously malicious things and all the obviously benign things, but it leaves the difficult cases to a human to actually analyze and make the right decision on. There are some limitations and drawbacks to using AI in your security architectures, though, that you have to be aware of. First, there are resource issues involved when you're dealing with AI, because companies need to invest a ton of time and money to get a workable AI system that is properly trained, tuned, and ready for production-level work. Second, the datasets used for training can be a huge limitation for you. Your AI is only going to be as good as the training it receives from your datasets. To properly train the AI, your team needs to have a wide variety of datasets, including malicious code, malware, anomalies, insider threat examples, and more. Third, AI is not just used by the defenders; it's also beginning to be used by the attackers. As AI continues to develop, there are now AI-equipped attackers developing more and more advanced attacks against our enterprise networks, so you need to be aware of this. And fourth, neural fuzzing is also used for both attack and defense. Fuzzing is the process of feeding large amounts of random data as input to a piece of software to identify its vulnerabilities. With neural fuzzing, both defenders and attackers can leverage AI to test large amounts of randomized inputs to find zero-day vulnerabilities within your code.
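To make the fuzzing idea a little more concrete, here is a minimal sketch of classic (non-neural) fuzzing in Python. The parse_record function is just a hypothetical stand-in for the software under test; the point is to show how throwing large amounts of random input at a program can surface crashes that point to potential vulnerabilities.

```python
import random
import string

def parse_record(data: str) -> int:
    # Hypothetical target standing in for the software under test.
    # It assumes the input looks like "name:count" with a numeric count.
    name, count = data.split(":")   # raises if the format is wrong
    return len(name) * int(count)   # raises if count is not a number

def random_input(max_len: int = 20) -> str:
    # Build a random string to throw at the target.
    alphabet = string.ascii_letters + string.digits + ":;,- "
    return "".join(random.choice(alphabet)
                   for _ in range(random.randint(0, max_len)))

crashes = []
for _ in range(10_000):             # feed large amounts of random data
    sample = random_input()
    try:
        parse_record(sample)
    except Exception as exc:        # any unhandled exception is a potential bug
        crashes.append((sample, type(exc).__name__))

print(f"{len(crashes)} crashing inputs found, e.g. {crashes[:3]}")
```

Neural fuzzing applies this same basic loop, but uses a learned model to steer the generated inputs toward interesting cases instead of relying on pure randomness.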
So, like most things in life, AI can be a great thing or it can become your worst nightmare, depending on how much you invest into it and whether you're ready to defend against it as well. The second thing we're going to talk about is machine learning. Now, machine learning is really a subset of artificial intelligence. Machine learning is an application or subset of AI that allows machines to learn from data without being explicitly programmed. This is really a cool thing because we can give the machine learning algorithm a huge dataset, and it's going to begin to find its own patterns to match the dataset's existing labels. To use machine learning, you're first going to need to provide it with a labeled dataset, where you've already labeled things into categories. Over time, the machine learning application is going to modify itself to identify things based on patterns it determines from our existing training dataset. Then, as we feed it new data, it can label and categorize that data by itself based upon the patterns it already taught itself from the previous examples. This makes machine learning really adept at labeling and categorizing things. If I wanted to go through a dataset and say, this is malware and this isn't, and this is malware and this isn't, I can train the machine with a dataset to do that for me. The machine can then take over, using its behavioral engine and its machine learning to identify on its own what is and is not malware as it sees new code. Now, this is not a rule-based instruction set; instead, it's a pattern algorithm that the machine taught itself based upon the large dataset we provided and trained it on, and it uses what it learned to create its own rules moving forward. Let's take a moment and look at a real-world example of machine learning so you can get an idea of how these things work and why they can be really good or can lead us astray. Now, one of the earlier machine learning experiments that was performed was focused on training a machine to identify what was a party and what was not. So the researchers began showing the machine learning application a lot of different images, and they labeled and properly categorized them as a party or not a party. So for example, if I showed the computer an image like this one, I would categorize it and say, this is a party. There's a bunch of people there, they're having a good time, and they're playing with some confetti. Looking at it as a human, I can clearly see, yes, this is a party. Then the researchers showed it another image, and this one was categorized to show the computer what a party doesn't look like. So here you could see what looks like people at the office working. So no, that's not really a party. Now, as a human, this is really easy to see, but the computer needs to start making its own assumptions based on what it's seeing in these images. Maybe the computer decided that because there were people smiling, it was a party, or maybe because there were people frowning, it wasn't a party. We didn't tell the computer how to decide what is and is not a party. We just told it the first image was a party and the second image wasn't a party, but we left the machine with the task of creating its own patterns to decide how it makes its decision, just like we do with kids in the real world. So the researchers kept doing this with images. They did four or five thousand images. So for example, here's another one. Is this one a party? No, it looks like they're at a conference.
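To see what that training step looks like in practice, here is a minimal sketch of a supervised classifier, assuming scikit-learn is available. The feature names and the toy samples are invented purely for illustration; a real malware classifier would use far richer features and far more labeled examples.

```python
# A minimal sketch of supervised machine learning for malware labeling,
# assuming scikit-learn is installed. Feature names and values are invented toy data.
from sklearn.tree import DecisionTreeClassifier

# Each row: [file_size_kb, imports_crypto_api, writes_to_startup, entropy]
X_train = [
    [120, 1, 1, 7.8],   # labeled "malware" by an analyst
    [450, 1, 1, 7.5],   # labeled "malware"
    [ 80, 0, 0, 4.2],   # labeled "benign"
    [300, 0, 0, 5.1],   # labeled "benign"
]
y_train = ["malware", "malware", "benign", "benign"]

# The algorithm derives its own decision pattern from the labeled examples;
# we never hand it explicit if/then/else rules.
model = DecisionTreeClassifier()
model.fit(X_train, y_train)

# New, unlabeled samples are categorized using the pattern the model learned.
X_new = [
    [200, 1, 1, 7.9],   # resembles the malware examples
    [150, 0, 0, 4.8],   # resembles the benign examples
]
print(model.predict(X_new))   # expected: ['malware' 'benign']
```

Notice that nowhere do we write a rule saying what malware looks like; the model derives its own pattern from the labeled examples, which is exactly the difference from the old expert-system approach described earlier.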
They look like they're at a work event. They're smiling, which is usually a good sign of a party, but they're obviously not at a party. And so I, as a human, will say, no, this is not a party. Then we go on to the next one. What about this one? Do you think this is a party? Well, there are a couple of ladies dancing. That looks like a party to me, right? They're probably having a good time either at a club or at a friend's house, they have drinks in their hands, and they're smiling and having a good time at what looks to be a party. So I would say, yes, that's a party, and we keep going this way, right? Here's another one. Here's one where people are sitting around a table. They're eating, they're having a good time, and there are a lot of different people at this table, so it looks like they may be having a dinner party, which is a type of party, but not one that we're dancing at. So again, I can categorize this as a human and very clearly say this is a party. Now, in this example, I went through just five images, which is a very limited dataset, but let's say I continued to do this with four or five thousand images. That would be enough for the computer to start making decisions on what is and is not a party. Now, there is a slight problem with this, though, and this is what happens when you train the computer poorly. Do you see what we just did in our five images? We poorly trained this computer. The problem here is that with just these five images, or even with 5,000 images that look similar, the computer would have made some unsafe assumptions about what a party is and what a party isn't. And this is accidentally what happened in this machine learning experiment when they did it at scale with thousands of images. So what was the problem? Well, the problem is that with the training data we just gave this computer, we just trained it to be racist. That's right. If you look back at the five images we just went through, you can pause the video, rewind it, and go and watch them again. You're going to see that all the people who were at parties were white men and women. That's what we said was a party. In fact, the only image that had somebody of a darker complexion happened to be at the business conference, which we said was not a party. So this computer has now learned that for a party to exist, it has to have white people there. Now, this is one of the major problems with machine learning: the model is only as good, and as unbiased, as the training data we give it.
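To see in miniature how that kind of bias creeps in, here is a small sketch, again assuming scikit-learn, where an irrelevant attribute happens to appear in every "party" example in the training set. The feature encoding is invented for illustration, but the failure mode is the same one the researchers ran into.

```python
# A minimal sketch of how a spurious correlation in training data becomes the
# learned rule (scikit-learn assumed; the feature encoding is invented).
from sklearn.tree import DecisionTreeClassifier

# Column 0: genuine party cues (dancing, confetti, drinks).
# Column 1: an irrelevant attribute that happened to appear in every
#           "party" photo used for training.
X_train = [
    [1, 1],  # party: dancing, irrelevant attribute present
    [0, 1],  # party: dinner party, irrelevant attribute present
    [0, 0],  # not a party: office work, attribute absent
    [0, 0],  # not a party: conference, attribute absent
]
y_train = ["party", "party", "not_party", "not_party"]

model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# A new photo of a genuine party (column 0 is 1) that lacks the irrelevant
# attribute gets misclassified, because column 1 separated the training
# labels perfectly and that is the pattern the model latched onto.
print(model.predict([[1, 0]]))   # expected: ['not_party']
```

The model latches onto whatever attribute separates the training labels most cleanly, even if it has nothing to do with what a party actually is, so a genuine party that lacks that attribute gets misclassified.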