FireEye & Scylla:
Intel Threat Analysis
Using a Graph Database
Rahul Gaikwad, Staff DevOps Engineer
&
Krishna Palati, Senior DevOps Manager
Presenters
Rahul Gaikwad, Staff DevOps Engineer
❖ Role
➢ Database Administrator - SQL / NoSQL / Graph DB / Big Data
➢ Infrastructure & Cloud Operations
➢ DevOps Automation Engineer
❖ Education
➢ Master of Computer Applications (MCA) & Executive MBA
➢ Pursuing PhD Research in AIOps
❖ Certifications
➢ Scylla | OCP | CCAH | HDPCA | RHCSA | AWS - SA | AWS - SysOps | Confluent Kafka
Krishna Palati, Senior DevOps Manager
❖ Role
➢ Senior DevOps Manager
➢ Cloud Infrastructure, DevOps Automation and Database Systems
❖ Education
➢ Bachelor's and Master's degrees in Computer Science and Engineering
❖ Hobbies
➢ Running, Biking, Hiking and Playing tennis
Agenda
■ Background
■ Why ScyllaDB
■ ScyllaDB at FireEye
■ Conclusion
■ Q&A
Background
Introduction to FireEye
Solutions
■ Threat Intelligence
■ Helix Security Platform
■ Endpoint Security
■ Network Security and Forensics
■ Email Security
■ Managed Defense
Services
■ Breach Response
■ Security Assessment
■ Security Enhancement
■ Security Transformation
★ FireEye is an intelligence-led cybersecurity company
★ We offer solutions that blend security technologies, threat intelligence and consulting.
Forrester New Wave
Leading Threat Intel Services
FireEye Threat Intelligence
A portfolio of subscriptions and services designed to address all aspects of an
organization’s intelligence needs.
■ Intelligence Subscriptions
■ Intelligence Enablement
■ Intelligence Capability Development
■ Digital Threat Monitoring
■ Advanced Intelligence Access
Application Use Case
■ Homegrown custom graph database on Postgres
■ Centralizes, organizes and processes cyber threat intelligence data
■ Tracks threat groups by recording all of the analytic correlations
■ Provides analytic results by processing and analysing historical data
■ Data Objects - DNS data, RSS feeds, file md5s, FQDNs and URLs
■ Data Size: Nodes ~500M and Edges ~1.5B
Existing System as Graph DB
Structure of the Graph
■ Stores data as “nodes” or “edges”
■ Also allows storing tags
Nodes
■ Each node represents a single object, event or evidence
■ E.g. organizations, actors, hosts, files and FQDNs are represented as nodes in the graph
Edges
■ Edges represent the relationships between nodes.
■ E.g. an edge exists from a threat actor to their location
Existing System - Graph Example
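The diagram on this slide is not reproduced in the text, but the presenter notes describe it: a sender email node, a filemd5 node, a receiver email node and an ipv4addr node, with tags such as APT3 and victim. The sketch below is one possible in-memory model of that example, not FireEye's implementation. The class names, the edge labels (`sent`, `sent_to`, `associated_with`) and the tag spellings are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A single object, event or piece of evidence."""
    node_id: int
    kind: str  # e.g. "email", "filemd5", "ipv4addr"
    tags: set = field(default_factory=set)


@dataclass
class Edge:
    """A directed relationship between two nodes."""
    source: int
    target: int
    label: str  # hypothetical label names, not taken from the talk
    tags: set = field(default_factory=set)


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node_id, kind, *tags):
        self.nodes[node_id] = Node(node_id, kind, set(tags))

    def add_edge(self, source, target, label, *tags):
        # Refuse edges whose endpoints have not been added yet.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append(Edge(source, target, label, set(tags)))

    def neighbours(self, node_id):
        """Nodes reachable over one outgoing edge, in insertion order."""
        return [self.nodes[e.target] for e in self.edges if e.source == node_id]


# The four-node example described in the presenter notes.
graph = Graph()
graph.add_node(1, "email", "sender", "APT3")
graph.add_node(2, "filemd5", "phishing-campaign")
graph.add_node(3, "email", "receiver", "victim")
graph.add_node(4, "ipv4addr")
graph.add_edge(1, 2, "sent")
graph.add_edge(2, 3, "sent_to")
graph.add_edge(2, 4, "associated_with")

for node in graph.neighbours(2):
    print(node.node_id, node.kind, sorted(node.tags))
```

Keeping tags on both nodes and edges mirrors the slide's statement that the system "also allows storing tags"; a production store would index them rather than scan a list.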
Challenges of Existing System
Limitations:
■ Slow performance
■ Not easily scalable
■ Not stable
■ Not highly available
■ Not distributed
Objectives:
■ Replace the current system with a new scalable, highly available,
distributed system.
Tech Evaluation for Graph DB
Evaluation Targets - Multiple Graph DBs
■ OrientDB
■ Synapse
■ AWS Neptune
■ JanusGraph
Evaluation Criteria - Based on MoSCoW Model
■ Functional
■ Non-Functional
■ Supportability
Why JanusGraph?
Opinionated Selection Criteria for JanusGraph:
■ Indexing capabilities that can be controlled by the user.
■ Free / Full Text search
■ Embedded as well as Server mode setup capability
■ Schema Management
■ Triggers
■ OLAP Capabilities - Distributed Graph Processing
Result:
■ Based on our requirements, tech evaluation and test results, we selected JanusGraph.
JanusGraph is...
■ Distributed
■ Open source
■ Massively scalable
■ Graph Database
also...
■ Supports pluggable Backend Storage
● ScyllaDB
● Cassandra
● HBase
● Berkeley DB
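As a rough illustration of what "pluggable" means in practice: JanusGraph selects its storage and index backends through a properties file, and ScyllaDB is reached through the CQL storage backend. The helper below renders such a file. The function name, host addresses and keyspace name are hypothetical, and the Elasticsearch index backend is taken from the architecture described in the presenter notes; the deck does not show FireEye's actual configuration.

```python
def janusgraph_properties(scylla_hosts, keyspace, search_hosts):
    """Render a JanusGraph properties file that uses a CQL storage backend.

    All argument values are placeholders chosen by the caller; nothing here
    reflects the configuration used in the talk.
    """
    settings = {
        "gremlin.graph": "org.janusgraph.core.JanusGraphFactory",
        # ScyllaDB speaks CQL, so the generic "cql" backend is used.
        "storage.backend": "cql",
        "storage.hostname": ",".join(scylla_hosts),
        "storage.cql.keyspace": keyspace,
        # Mixed (full-text) indexes are delegated to a search backend.
        "index.search.backend": "elasticsearch",
        "index.search.hostname": ",".join(search_hosts),
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items())


if __name__ == "__main__":
    print(janusgraph_properties(
        scylla_hosts=["10.0.0.1", "10.0.0.2", "10.0.0.3"],  # hypothetical
        keyspace="threat_intel",                              # hypothetical
        search_hosts=["10.0.1.1"],                            # hypothetical
    ))
```

Swapping the storage backend would then be a matter of changing the `storage.*` keys, which is the property the slide highlights.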
Motivation for ScyllaDB
Why ScyllaDB?
Based on tech evaluations and tests, we determined that ScyllaDB is the right
storage backend.
Features:
■ Easy Cluster setup
■ Self Tuning
■ Equal Load distribution
■ Easy to Manage On Cloud
■ Less Administration
■ No GC
■ Compression
ScyllaDB Usage
ScyllaDB Usage for Threat Analysis
■ Since data represents threat activity, we can get answers to questions about:
● Threat actors
● Malware
● Threat activity
● Victims
● Various other things.
■ A graph DB tells a story about the data by connecting the dots
Graph Traversing Examples
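The Gremlin query shown on this slide is not captured in the text. According to the presenter notes, it selects a node by a specific property and then traverses the graph to find every node connected to it by edges. The sketch below reproduces that two-step pattern in plain Python over an in-memory adjacency map. It is not the query from the talk, and the node ids, property names and sample data are all hypothetical.

```python
from collections import deque


def find_nodes(properties, key, value):
    """Return ids of nodes whose property `key` equals `value`."""
    return [node_id for node_id, props in properties.items()
            if props.get(key) == value]


def connected_nodes(adjacency, start):
    """Breadth-first walk over edges; returns every node reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency.get(current, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    seen.discard(start)  # report only the other nodes
    return seen


# Hypothetical sample data, shaped like the email example from the earlier slide.
properties = {
    "n1": {"kind": "email", "role": "sender"},
    "n2": {"kind": "filemd5"},
    "n3": {"kind": "email", "role": "receiver"},
    "n4": {"kind": "ipv4addr"},
    "n5": {"kind": "fqdn"},  # not connected to anything
}
adjacency = {"n1": ["n2"], "n2": ["n3", "n4"]}

start = find_nodes(properties, "role", "sender")[0]
reachable = connected_nodes(adjacency, start)
print(start, sorted(reachable))
```

In a graph database the property lookup is served by an index and the edge expansion by adjacency lists stored with each vertex, which is why traversal speed was one of the evaluation criteria.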
Architecture
Graph DB with ScyllaDB
Environment
Configurations
■ Running on AWS Cloud
■ Single Region (Multi AZ) deployment
■ Using EC2 instances
■ AWS Instance - i3.8xlarge
■ Each Cluster has 7 nodes
■ Clusters - DEV, QA, STAGING, PROD.
H/W         Per Node   Per Cluster
CPU         32         224
RAM (GB)    244        1708
Disk (TB)   16         112
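The per-cluster column is the per-node figure multiplied by the seven nodes in each cluster. A small check of that arithmetic follows. The `usable_disk_tb` helper is an added illustration: the talk does not state a replication factor, so any value passed to it is an assumption.

```python
NODES_PER_CLUSTER = 7
PER_NODE = {"CPU": 32, "RAM (GB)": 244, "Disk (TB)": 16}


def cluster_totals(per_node, node_count):
    """Multiply each per-node resource by the number of nodes."""
    return {resource: amount * node_count for resource, amount in per_node.items()}


def usable_disk_tb(total_disk_tb, replication_factor):
    """Raw disk divided by the number of copies kept of each row.

    The replication factor is NOT given in the talk; pass your own value.
    """
    return total_disk_tb / replication_factor


totals = cluster_totals(PER_NODE, NODES_PER_CLUSTER)
for resource, amount in totals.items():
    print(f"{resource}: {amount}")
```

The totals match the table: 224 CPUs, 1708 GB of RAM and 112 TB of disk per cluster.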
Deployment
ScyllaDB - Infrastructure Management
Terraform is a tool for building, changing, and versioning infrastructure
safely and efficiently.
ScyllaDB - Configuration Management
Puppet is a configuration management tool that is used for deploying,
configuring and managing servers.
Comparison
Conclusion
FireEye Traversing with ScyllaDB
■ Very good experience and results observed so far
■ Cost Effective
■ Admin Friendly
■ Superfast
■ Looking at potential opportunities to use ScyllaDB in other projects
Thank You All ..!!
■ FireEye
● Architects
● Engineers: Developers, DevOps & QA
● Project and Program Managers
■ JanusGraph
■ ScyllaDB
● Scylla University
● Community
● Summit Organisers
Thank you Stay in touch
Any questions?
Rahul Gaikwad
rahul.gaikwad@fireeye.com
Krishna Palati
krishna.palati@FireEye.com
linkedin.com/in/rahul-gaikwad-2712b02a
linkedin.com/in/krishnapalati

Editor's Notes

  • #2: KP: Hello everyone, hope you are enjoying the CA weather. As you heard in the introduction video, today we will talk about how we at FireEye used ScyllaDB to redesign an existing product and build a new solution for our Intel product portfolio.
  • #3: KP: I am Krishna Palati. I manage the DevOps team for Solutions Engineering, comprising Intel, Managed Defense and Incident Response for FireEye. We are responsible for core DevOps, cloud infrastructure operations and database systems. In this presentation we will talk about how we used Scylla to implement a solution that is critical for our Intelligence product portfolio. RG: Hello, I am Rahul Gaikwad. I am a Staff DevOps Engineer at FireEye. I am responsible for continuous integration and deployment, administration of different databases, and cloud operations. I came from India to talk at Scylla Summit about how we do Intel threat analysis using a graph database. We will be talking about the challenges with the existing system and how ScyllaDB helps us solve some of them.
  • #4: KP
  • #5: KP
  • #6: KP: FireEye is a unique cybersecurity company in the sense that we bring our security appliances and intelligence capabilities together for our customers. Appliances can be physical or virtual and include a range of products such as Endpoint (HX), Network (NX) and Email (ETP). Solutions include Intel, Managed Defense and Incident Response.
  • #7: KP: As per the Forrester report, FireEye is the leader in cyber threat intelligence offerings, both for current content and for our strategy. We focus specifically on Intel here because, for the rest of this presentation, we will discuss the problems we encountered with the current technology and the solutions we implemented to address them.
  • #8: KP: As is evident here, we are an industry-recognized thought leader in cyber intelligence and are often called upon to provide our analysis and thoughts on this topic.
  • #9: KP: Subscription: access to published intelligence reports. Enablement: includes onboarding and provisioning, API integration with your security systems, analyst access and workshops. Digital Threat Monitoring: tailored, proactive monitoring and analysis of threats to your brand and your VIPs. Advanced Intelligence Access: this capability enables direct queries into global visibility, insights and intelligence from FireEye. https://www.fireeye.com/content/dam/fireeye-www/products/pdfs/pf/intel/ds-fireeye-threat-intelligence.pdf
  • #10: KP: Now that we have gone through the business aspects of why and how we do Threat Intel, let's briefly talk about our current application and what it does at a very high level.
  • #11: RG: Our customized graph system stores data as “nodes” or “edges”. It also allows analysts to define and apply tags to nodes and edges; we can think of these as attributes or characteristics. Each node represents a single object, event or piece of evidence. For example, organizations, actors, hackers, host computers, files and FQDNs are all represented as nodes in the graph database. Edges represent the relationships between nodes. For example, an edge exists from a threat actor to their location.
  • #12: RG: In the above diagram, blue circles indicate nodes, green arrows are edges, red labels are properties, and orange labels are aspects. Node 1 (email) is the sender's mail ID; node 2 (filemd5) is the email content or file attachment; node 3 (email) is the receiver's mail ID; node 4 (ipv4addr) is the IP address of the filemd5 node. The sender email node sent the filemd5 email to the receiver email node. Each node has properties in our intel system. For example, the sender email ID is associated with the APT3 actor, a known hacking group; the filemd5 has been associated with an email phishing campaign; the receiver email ID is tagged as a victim; and the filemd5 is associated with the IP address from which such phishing campaigns have been executed in the past.
  • #13: RG: Over time, our intel system became very effective and popular. Its usage grew from a handful of analysts to several hundred analysts spread across the globe. We became a victim of our own success as we started running into performance limitations.
  • #14: RG: Based on our objectives, we started evaluating graph database technologies such as OrientDB, Synapse, AWS Neptune and JanusGraph. Our evaluation criteria were: functional (traversal speed, full-text search, concurrent users); non-functional (pluggable storage backend, high availability and disaster recovery); and supportability (strong and active user community, already deployed in production, documentation).
  • #15: RG: Indexing capabilities: we can define indexes per use case. Free / full-text search: the system allows users to search for records that include one or more words within a free-text field. Embedded: we can embed JanusGraph in the application code layer. Schema management: allows us to define and change the schema, and it also validates incoming data against the schema. Triggers: the system generates events when certain actions are performed on the underlying database store. OLAP: online analytical processing using distributed graph processing.
  • #16: RG :
  • #17: RG
  • #18: RG: When we set up or scale the cluster, we just need to run the scylla_setup script, which sets up the configuration automatically. During data migration from the existing system to the new one we got an 80% compression rate.
  • #19: RG
  • #20: RG
  • #21: RG: Here is an example of how those questions are asked. We are showing a Gremlin query that selects a node with a specific property, then traverses the graph and finds all the other nodes it is connected to via edges. As shown in the red highlighted box, the query traversed 15,000 nodes and returned results in 322 ms, about 10 times faster than in our current system.
  • #22: KP-
  • #23: KP: This is a high-level overview of what we built in the cloud. It is an N-tier architecture with the following components: App UI; JanusGraph; ScyllaDB (primary storage) and Elasticsearch (search); App API. The system is designed with redundancy for each of these components for scalability and HA. They are built across multiple Availability Zones, so we are protected against AZ failures. Everything is in a private VPC with restricted access. Access comes in via Nginx. Authentication and authorization are handled via an Nginx/OpenResty combination against our internal IDAM server. All the business logic is abstracted in the application tier.
  • #24: RG
  • #25: RG: As Krishna mentioned, we have set up all system components in the AWS cloud. We went through several iterations to arrive at the optimal cluster size and resources for our goals, such as functionality and data migration from the current system.
  • #26: RG
  • #27: RG
  • #28: RG: Using these automation tools we can build the whole stack shown in the architecture diagram within minutes to an hour.
  • #29: RG: We ran a set of queries on the existing and new systems and found that the new system, based on Scylla, is 10 times faster than the existing one.
  • #30: KP
  • #31: KP: Our experience with ScyllaDB has been very good. It's cost-effective and performant. We are looking at opportunities to use ScyllaDB in other projects within FireEye.
  • #32: KP: Finally, a big thanks to our internal FireEye team of architects, developers, QA and DevOps. Architects and developers worked closely with DevOps to iterate on and improve this solution. Our teams are spread across Reston, VA; Amsterdam; and Pune, India, and we work very closely together to deliver world-class solutions. I would also like to extend our gratitude to JanusGraph and ScyllaDB for the excellent Scylla University resources, the community and the organizers of this Summit.
  • #33: KP