Beckman abadi-5min-pres

@daniel_abadi
Yale University

* The Big Data phenomenon is the best thing that could have happened to the database community
* Despite other definitions related to the ‘3 Vs’, Big Data means BIG Data

* Which means we need scalable database systems

* Still two main components of Big Data
* Performing data analysis at scale
* Performing requests on data at scale

* Database community has won the battle

* Some thought that MapReduce might replace traditional database technology as the primary means to perform analysis at scale
* Just about every MapReduce vendor has abandoned this goal
* Hadapt, Impala, Tez, and several others are in a race to see who can add the most traditional database execution technology to Hadoop fastest
* Everyone is going in the direction of cost-based optimizers, traditional database operators, and push-based query execution

* The database community is losing the battle

* NoSQL systems still have very little traditional database technology inside (despite adding SQL interfaces)
* No race to add DB technology. Why?

* Don’t blame CAP: CAP is only relevant when there’s a network partition
* We never figured out how to do ACID and active replication at scale
* Many new proposals make simplifying assumptions in order to handle scale

* It’s been 30 years. Why can’t we build a distributed database that can handle distributed transactions over actively replicated data at scale?
