Elevator Pitch
An Innovative Big-Data Web Scraping Tech Company
BUSINESS IDEA
Innovative Big-Data Web Scraping Tech Company
HIGHLIGHTS
• What is WebRobot?
• The Problem
• How We Can Solve It
• Team
• Track Record
• Business Model
• Trends & Opportunities
• Main Competitors
• Target
• SWOT Analysis
• Some Numbers (sales, profit, clients)
• Investment Plan
1. THE PROJECT
Description
• WebRobot Ltd is a London-based company operating in the web scraping and web mining industry, in which it aims to become the leader.
• At WebRobot we are building a highly scalable data-acquisition infrastructure that customers can use as a web service. It builds on cloud computing and big-data technologies, as well as data-extraction and information-extraction algorithms.
• WebRobot will be a strong ally to every company that needs to acquire heterogeneous information from the web and wants to reduce its internal management costs. For such a company, WebRobot's services will be a strategic resource essential to its business success.
The problem
Every company that wants to achieve, keep, and improve its business success needs information (data) on the market, its customers, and its competitors, but obtaining it is challenging.
The data must be good, reliable, and well organized. In addition, the company needs to manage it properly.
The World Wide Web is made up of a huge amount of semi-structured and unstructured data. Furthermore, its structure changes constantly.
Collecting all of this data is often very expensive.
For all these reasons, we need robust and scalable algorithms that can reduce this onerous maintenance activity.
How We Can Solve the Problem
We can guarantee algorithmic and structural scalability with automatic extraction features.
We offer a powerful solution in the form of a web service.
We integrate cloud computing with big-data technologies, applied in the more general web mining context.
We provide visual support tools and an SDK for connecting to our stack.
WebRobot's goal is to become a complete ETL service covering data extraction, web mining, machine learning, and big-data analytics.
2. THE TEAM
Roger Giuffrè: CEO, CTO (71% of equity)
Denis Giuffrè: CCO, CMO (4% of equity)
Antonio Censabella: CFO
Mediterraneo Capital Ltd: 25% of equity
3. TRACK RECORD
We are finalizing the first version of the web service, which will include a serverless version built on AWS Lambda and Amazon EMR.
We need to integrate the wrapper induction algorithms directly into the Spark context. This will help us refine them with the latest academic findings.
The API implementation is essentially complete. We still have to complete the usability studies of the current interface.
We need to complete the dashboard, which will be released under an open-source license.
We have to design visual tools to support generating the ETL.
We have to define a new grammar for queries.
4. THE BUSINESS MODEL
The Strategy
We will release the service on the Amazon marketplace in three commercial packages: Entry-Level, Professional, and Enterprise.
Our average selling price could be around €0.0008 per page scraped, but we will distinguish between static pages and dynamic pages, which need more complex algorithms.
We have verified that execution costs in a serverless environment and on an EMR cluster can guarantee us a margin of at least 50%. This margin acts as a cost constraint in our pricing policy.
In the future, we will integrate a web agents marketplace and adopt a B2B2C paradigm to close the gap with end users and with actual use cases.
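The per-page price and the margin floor above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the company's pricing engine: the function names are hypothetical, and it assumes the 50% margin is measured against the selling price.

```python
from decimal import Decimal

# Figures taken from the slide; everything else is illustrative.
PRICE_PER_PAGE_EUR = Decimal("0.0008")  # average selling price per scraped page
MIN_MARGIN = Decimal("0.50")            # margin floor ("at least 50%")


def daily_revenue(pages_per_day: int) -> Decimal:
    """Revenue in EUR for a customer scraping `pages_per_day` pages."""
    return PRICE_PER_PAGE_EUR * pages_per_day


def max_cost_per_page() -> Decimal:
    """Highest execution cost per page that still respects the margin floor.

    Assumes margin = (price - cost) / price.
    """
    return PRICE_PER_PAGE_EUR * (1 - MIN_MARGIN)


if __name__ == "__main__":
    print("Daily revenue at 1M pages:", daily_revenue(1_000_000), "EUR")
    print("Max cost per page:", max_cost_per_page(), "EUR")
```

Under these assumptions, a customer scraping 1 million pages per day is worth €800 per day, and execution costs must stay at or below €0.0004 per page.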
5. THE MARKET AND COMPETITORS
Trends and opportunities
• Markets: Web Scraping, Web Mining, Data Analytics.
• Size: an estimated value of $2 billion in 2020 alone.
• Growth: based on market research, we expect further growth in the coming years, driven by (1) the ever-greater centrality of data in the entire business process, and (2) the growing tendency of companies to outsource the above-mentioned activities.
Main Competitors
• Diffbot: an API for data extraction that uses machine-learning heuristics and features to crawl pages. However, the results are not 100% precise.
• Scrapinghub: a cloud service focused on the Scrapy framework. It offers each service separately, plus automatic extraction functions that are still in beta. However, the results are not always compliant.
• Import.io: visual tools that customers can use to configure the extractors. However, it is particularly expensive.
6. TARGET
E-commerce companies that require algorithmic pricing and competition monitoring.
Big companies that produce press reviews and carry out social media analysis, opinion mining, and sentiment analysis.
Hedge funds and financial institutions, for which information such as financial data and sentiment indicators is extremely important.
Marketing agencies that need web scraping for SEO and web marketing automation purposes.
Established and startup companies that run or are developing any kind of vertical search engine.
Startups and small businesses that can benefit from building dedicated applications on our stack.
7. SWOT ANALYSIS
STRENGTHS
• Scalability.
• Self-service, fast big-data extraction solution.
WEAKNESSES
• We need PhD resources to reinforce the algorithmic extraction.
• Very specialized high-tech service that requires an effort to make it user-friendly (for non-technical users).
OPPORTUNITIES
• Global market with big expansion opportunities.
• Profitable niche with low competition.
RISKS
• Restrictive regulations on the use of personal data (in Europe), on data collection (in Asia), and on data referring to minors (worldwide).
8. THE NUMBERS
Our reference case is a medium-to-large customer that requires at least 1 million pages per day, at a price of €800.00 per day (the estimated global potential demand is 100 billion pages per day).
EUR (in thousands)    Year 2021    Year 2022    Year 2023    Year 2024
Sales                     2,880        7,200       13,248       20,160
Gross margin              1,440        3,600        6,624       10,080
Net margin                1,440        3,600        6,624       10,080
Num. customers               10           25           46           70
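The sales and gross-margin rows can be reproduced from the customer counts. The sketch below is illustrative only: it assumes every customer pays €800 per day for the whole year, that a year counts 360 billing days (an inference that happens to match the table, not something the slides state), and that gross margin is exactly 50% of sales. The function name is hypothetical.

```python
# Figures from the slides.
DAILY_PRICE_EUR = 800    # reference customer: 1M pages/day
GROSS_MARGIN_PCT = 50    # margin floor from the business model
CUSTOMERS = {2021: 10, 2022: 25, 2023: 46, 2024: 70}

# Assumption: 360 billing days per year (inferred, not stated in the deck).
BILLING_DAYS = 360


def project(customers: int) -> tuple[int, int]:
    """Return (sales, gross margin) in thousands of EUR for one year."""
    sales_k = customers * DAILY_PRICE_EUR * BILLING_DAYS // 1000
    gross_k = sales_k * GROSS_MARGIN_PCT // 100
    return sales_k, gross_k


if __name__ == "__main__":
    for year, n in CUSTOMERS.items():
        sales_k, gross_k = project(n)
        print(f"{year}: customers={n:>3} sales={sales_k:>7,}k gross={gross_k:>7,}k")
```

Integer arithmetic is used so that the results match the table without rounding noise. The table reports a net margin equal to the gross margin; the sketch does not model any costs below the gross-margin line.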
9. INVESTMENT PLAN
The investment strategy
First round: 9% in equity for €300k with a pre-money valuation of €3 million.
Second round: 9% in equity for €2 million.
Third round: 9% in equity for €10 million.
We plan to eventually go public on the stock exchange.
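The relation between investment, equity stake, and valuation in the rounds above can be sketched as follows. This is a hedged illustration: it assumes each 9% is the investor's stake after the round (post-money), and the function names are hypothetical. The slides state a valuation only for the first round, so the figures for the second and third rounds are implied by that assumption rather than taken from the deck.

```python
from fractions import Fraction


def stake_from_pre_money(investment: int, pre_money: int) -> Fraction:
    """Investor's post-round stake, given the pre-money valuation."""
    return Fraction(investment, pre_money + investment)


def implied_post_money(investment: int, stake: Fraction) -> Fraction:
    """Post-money valuation implied by paying `investment` for `stake`."""
    return investment / stake


if __name__ == "__main__":
    # First round: EUR 300k at EUR 3M pre-money (both from the slide).
    first = stake_from_pre_money(300_000, 3_000_000)
    print(f"First round stake: {float(first):.2%}")

    # Later rounds: only amount and stake are given, so valuation is implied.
    nine_pct = Fraction(9, 100)
    for label, amount in (("Second", 2_000_000), ("Third", 10_000_000)):
        post = implied_post_money(amount, nine_pct)
        print(f"{label} round implied post-money: EUR {round(post):,}")
```

With €300k on a €3 million pre-money valuation the stake is exactly 1/11, which the slide rounds to 9%.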
