Beyond Big Data
Riding the Technology Wave

Ira A. (Gus) Hunt
Chief Technology Officer

Our Mission
We are the nation's first line of defense. We accomplish
what others cannot accomplish and go where others
cannot go. We carry out our mission by:

   Collecting information that reveals the plans, intentions and
   capabilities of our adversaries and provides the basis for
   decision and action.

   Producing timely analysis that provides insight, warning and
   opportunity to the President and decisionmakers charged with
   protecting and advancing America's interests.

   Conducting covert action at the direction of the President to
   preempt threats or achieve US policy objectives.

4 Big Bets
1  Revolutionize Big Data Exploitation
     –  Acquire, federate, secure and exploit. Grow the haystack, magnify the needles.

2  Accelerate Operational Excellence
     –  Innovate IT operations and run IT like a business.

3  Serve CIA by supporting the IC
     –  Assume a leadership role in IC activities that matter to CIA. Build to share.

4  Drive Performance through Talent Management
     –  Focus on continuous learning and diversity of thought, experience, and background.

6 Key Technology Enablers
0  Secure Mobility
     –  Immediate, secure and appropriate access to people, data and tools from anywhere at any time.

1  Advanced Mission Analytics—Analytics as a Service
     –  World-class abilities to discover patterns, correlate information, understand plans and intentions, and find and identify operational targets in a sea of data. Big Data analytics as a service.

2  Enterprise Widgets and Services
     –  A customizable, integrated and adaptive webtop that lets analysts, ops officers, and targeters “have it their way”. Personalization in context.

3  Security as a Service
     –  One environment, all data, protected and secure: ubiquitous encryption, enterprise authentication, audit, DRM, secure ID propagation, and Gold Version C&A.

4  Data Harbor—Data as a Service
     –  An ultra-high performance data environment that enables CIA missions to acquire, federate, position and securely exploit huge volumes of data. Data in context.

5  Cloud Computing—Infrastructure as a Service
     –  Capacity ahead of demand. Large scale, elastic, commodity hosting, storage, and compute.

It’s a Big Data World

Google
  > 100 PB
  > 1T indexed URLs
  > 3 million servers
  > 7.2B page-views/day

Facebook
  > 1 billion users
  > 300 PB; +> 500 TB/day
  > 35% of world’s photographs

YouTube
  > 1000 PB
  +> 72 hours/minute
  > 37 million hours/year
  > 4 billion views/day

World Population
  > 7,057,065,162

Twitter
  > 124B tweets/year
  > 390M/day
  ~4500/sec

Global Text Messages
  > 6.1T per year
  > 193,000 per second
  > 876 per person per year

US Cell Calls
  > 2.2T minutes/year
  > 19 minutes/person/day
  (uncompressed < 1 YouTube/year)

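The per-second and per-person rates on these slides follow from the yearly totals by simple division. The sketch below reruns that arithmetic in Python, assuming a 365-day year; the constant and function names are illustrative and not part of the original deck. It lands close to the slide figures, and the small differences are consistent with rounding or with a slightly different population base.

```python
# Rough sanity check of the slide figures. Assumes a 365-day year.
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY

# Figures copied from the slides above.
WORLD_POPULATION = 7_057_065_162
TEXTS_PER_YEAR = 6.1e12
TWEETS_PER_DAY = 390e6
US_CALL_MINUTES_PER_YEAR = 2.2e12
US_CALL_MINUTES_PER_PERSON_PER_DAY = 19


def per_second(per_year: float) -> float:
    """Convert a yearly total into an average per-second rate."""
    return per_year / SECONDS_PER_YEAR


texts_per_second = per_second(TEXTS_PER_YEAR)
texts_per_person = TEXTS_PER_YEAR / WORLD_POPULATION
tweets_per_second = TWEETS_PER_DAY / SECONDS_PER_DAY

# Population implied by the two US cell-call figures taken together.
implied_us_callers = US_CALL_MINUTES_PER_YEAR / (
    365 * US_CALL_MINUTES_PER_PERSON_PER_DAY
)

print(f"texts/second:       {texts_per_second:,.0f}")
print(f"texts/person/year:  {texts_per_person:,.0f}")
print(f"tweets/second:      {tweets_per_second:,.0f}")
print(f"implied US callers: {implied_us_callers:,.0f}")
```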
3 Driving Forces
Social, Mobile, Cloud

Social + Mobile + Cloud = Big Data

Social + Mobile + Cloud:
  Increases the velocity of innovation
  Accelerates social change
  Altered the flow of information

3 Emerging Forces
Nano, Bio, Sensors

Mobile Sensor Platform
       Microphone
       Image
       3-axis accelerometer
       Touch
       Light
       Proximity
       Geolocation

Communicator, Tricorder, Transporter
Mobile Health Platform

Pacemaker
Blood sugar tester
Insulin controller
Health monitor
Exercise coach
Remote tune-ups
Early warning system

Mobile Sensor Platform

Identity by 3-axis accelerometer
    Gender (71%)
    Height--tall or short (80%)
    Weight--heavy or light (80%)
    You by your gait (100%)

           Actitracker—Android App



Social + Mobile + Cloud + Nano + Bio + Sensors =
The inanimate becomes sentient

Social + Mobile + Cloud + Nano + Bio + Sensors =
Smarter Planet
  Cars drive themselves
  Machines know your needs

Social + Mobile + Cloud + Nano + Bio + Sensors =
  Drive radical efficiencies
  Enhance social engagement
  Improve information sharing
  Enable global reach
  Green (automatic routing)
  Improve our health
  Stop/prevent crime
  …

Sensors are Really Big
1  Sensors are unbounded
2  Sensors are promiscuous
3  Sensors are indiscriminate

The Internet of Things is Bigger
1  Everything is Connected
2  Everything Communicates
3  Everything is a Sensor

That’s the Really Big Data Challenge of the future

Why We Care

Impact of Big Data
1  Know what we know
2  Discover the gaps in our knowledge
3  Focus targeting to fill the gaps
4  More effective use of expensive or long-lead collection assets
5  Better global coverage to limit surprise
6  Enhance understanding and improve analysis

Implications

4 Rules of Big Data
1  It’s the data…
       - Apologies to James Carville
2  Power to the people
       - Apologies to the Black Panthers
3  Latency breeds contempt
       - Apologies to Aesop
4  Context, context, context
       - Apologies to Lord Harold Samuel

It’s the Data…

Data vs Tools—A History Lesson

•  Sophisticated tools without the data are useless
•  Mediocre tools with the data are frustrating
•  Analysts will always opt for frustration over futility, if that is their only option

Our Job
1  Leverage the Big Data world
2  Find the Information that Matters
3  Connect the Dots
4  Understand the Plans of our Adversaries
     Safeguard our national security

The Problem

Our Problem: Which 5K
1  Don’t know the future value of data
2  We cannot connect dots we don’t have
3  Traditional, requirements-driven collection fails in the Big Data world
     - Can’t task for data you don’t know you need
     - The few cannot know the needs of the many
     - Global Coverage requires Global Data

Characteristics of Big Data
1  More is always better
2  Signal to noise only gets worse
3  Enumeration not modeling
4  Requirements are usually hindsight

Data as a Service

•  Analysts and operators are not data engineers
•  Need insight and understanding
•  Ask a question and get a coherent answer
•  Cannot know what data sets contain
   information of value to them
•  Imbue data services and tools with those
   smarts
•  Smart Data, smart tools, smarter intelligence

Power to the People

Today

•  Analytics and tools are hard to use
•  Specialists are required to derive value
•  Skilled people are in short supply
•  Algorithms are dense and arcane
•  Require a lot of hand curation
•  Built for business not for intelligence

New Fields of Expertise
  Data Scientist
  Information Engineer

Data Science
* Data science combines elements from many fields:
      Math
      Statistics
      Data Engineering
      Pattern Recognition and Learning
      Advanced Computing
      Visualization
      Uncertainty Modeling
      Data Warehousing
      High Performance Computing

* Wikipedia

Big Data Democracy Wins

The power of big data can only be fully realized when it is in the hands of the average user

Tomorrow

•  Elegant, powerful and easy-to-use tools and visualizations
•  Machines to do more of the heavy lifting
•  Intelligent systems that learn from the user
•  Correlation not search
•  “Curiosity layer”: machines that are curious on your behalf

7 Universal Constructs for Analytics
  People
  Places
  Organizations
  Time
  Events
  Concepts
  Things

User Built Recipes

Keep it Simple

•  Data Scientists focus on hard problems
•  Build reusable components that anyone can apply—Recipes
•  Share them widely—Apps Store/Apps Mall—Recipe Book
•  Let users assemble components their way
  •    Experiment and fail quickly to succeed faster
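
One way to picture "reusable components that anyone can apply" is as small functions that users chain together. The Python sketch below is only an illustration of that idea, not the tooling described in the talk; `recipe`, `keep`, `sort_by`, `top` and the sample records are all hypothetical names and data.

```python
from typing import Callable, Dict, List

Records = List[Dict]
Step = Callable[[Records], Records]


def recipe(*steps: Step) -> Step:
    """Chain reusable components into one callable 'recipe'."""
    def run(records: Records) -> Records:
        for step in steps:
            records = step(records)
        return records
    return run


def keep(field: str, value) -> Step:
    """Component: keep only records whose field equals value."""
    return lambda records: [r for r in records if r.get(field) == value]


def sort_by(field: str) -> Step:
    """Component: order records by a field, ascending."""
    return lambda records: sorted(records, key=lambda r: r[field])


def top(n: int) -> Step:
    """Component: keep the first n records."""
    return lambda records: records[:n]


# A user assembles components "their way" without writing new algorithms.
sightings = [
    {"place": "A", "time": 3},
    {"place": "B", "time": 1},
    {"place": "A", "time": 2},
]
earliest_at_a = recipe(keep("place", "A"), sort_by("time"), top(1))
print(earliest_at_a(sightings))
```

Because each component takes records in and returns records out, a failed experiment costs little: swapping or reordering steps produces a new recipe without touching the components themselves.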
Latency Breeds Contempt

It’s All About Speed

•  Hadoop/MapReduce—batch
  •    Flexible, powerful, slow

•  Equivalent of real-time MapReduce
  •    Flexible, powerful and fast
  •    Dremel, Caffeine, Impala, Apache Drill, Spanner…

•  Recursive stream processing with complex analytics

•  In-memory—peta-scale RAM architectures
  •    Distributed, in-memory analytics
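
To make the batch model in the first bullet concrete, here is a minimal single-process sketch of the map, shuffle and reduce phases in Python, using word counting as the job. It is a toy illustration, not Hadoop's API, and the function names are hypothetical. The reduce step cannot start until the shuffle has seen every mapped pair, which is one reason batch jobs answer slowly.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(documents: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.lower().split():
            yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in order over the whole input."""
    return reduce_phase(shuffle(map_phase(documents)))


print(word_count(["big data", "big bets"]))
```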

Tectonic Technology Shifts
Traditional Processing     Mass Analytics/Big Data
Data on SAN                Data at processor
Move Data to Question      Move Question to Data
Backup                     Replication management
Vertical scaling           Horizontal scaling
Capacity after demand      Capacity ahead of demand
DR                         COOP
Size to peak load          Dynamic/elastic provisioning
Tape                       SAN
SAN                        Disk
Disk                       SSD
RAM limited                Peta-scale RAM

New Computing Architectures

•  Data close to compute
•  Power at the edge
•  Optical Computing/Optical Bus
•  End of the motherboard—shared pools of
   everything
•  Software defined everything—compute,
   storage, networking, data center
•  Network is the bottleneck and constraint
Context, Context, Context

Everything in Your Frame of Reference

•  Widgets—Webtop in context to business
•  Schema on Read—Data in context to your question
•  User assembled analytics—answers in context to your questions
•  Elastic computing—computing in context to your demand
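
"Schema on Read" means raw records are stored as they arrive and a structure is imposed only when a question is asked. The Python sketch below shows the general idea under stated assumptions: the records, field names and the `read_with_schema` helper are hypothetical, and a real data environment would apply the same principle at far larger scale.

```python
import json
from typing import Dict, Iterable, Iterator, List, Optional

# Raw records are kept exactly as they arrived; no schema is enforced on write.
RAW_STORE = [
    '{"who": "alice", "where": "Paris", "when": "2013-03-20"}',
    '{"who": "bob", "city": "Lima"}',
    '{"who": "carol", "where": "Oslo", "note": "free text"}',
]


def read_with_schema(
    raw_lines: Iterable[str], schema: Dict[str, List[str]]
) -> Iterator[Dict[str, Optional[str]]]:
    """Apply a schema at read time.

    `schema` maps each output field to the raw keys that may hold it;
    the first key present in a record wins, otherwise the field is None.
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {
            field: next((record[k] for k in candidates if k in record), None)
            for field, candidates in schema.items()
        }


# The question "who was where?" defines the schema, not the storage layer.
people_places = {"person": ["who"], "place": ["where", "city"]}
rows = list(read_with_schema(RAW_STORE, people_places))
print(rows)
```

A different question can reuse the same raw store with a different schema, for example `{"person": ["who"], "date": ["when"]}`, without reloading or migrating any data.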

Closing Thoughts

High Noon in the Information Age

It is nearly within our grasp to compute on all human-generated information

Facebook
  > 1 billion users
  > 35% of all photographs

The inanimate is rapidly becoming sentient

Smarter Planet
  Cars drive themselves
  Machines know your needs

3rd Wave of Computing
Cognitive Machines
  Watson

Social + Mobile + Cloud + Nano + Bio + Sensors =
  Moving faster than government can keep up
  The legal system is woefully behind
  What are your rights? Who owns your data?
  Driving the pace of social change
  Exponentially increasing cyber threats
THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013
