CS 542 Database Management Systems
Relational Database Programming
J Singh
January 24, 2011
Simple SQL Queries (p1)
  Relation: BROWSER_TABLE
      SELECT * FROM BROWSER_TABLE WHERE ENGINE = 'Gecko'
  1. Start with the relation
  2. Select (σ) rows

Simple SQL Queries (p2)
  Relation: BROWSER_TABLE
      SELECT BROWSER, PLATFORM FROM BROWSER_TABLE WHERE ENGINE = 'Gecko'
  1. Start with the relation
  2. Select (σ) rows
  3. Project (π) columns

Simple SQL Queries (p3)
  Relation: BROWSER_TABLE
      SELECT BROWSER, PLATFORM AS OS FROM BROWSER_TABLE WHERE ENGINE = 'Gecko'
  1. Start with the relation
  2. Select (σ) rows
  3. Project (π) columns
  4. Rename (ρ) columns
SQL Conditions
  In the WHERE clause:
    String1 = String2, String1 > String2, and other comparison operators
      Comparisons are controlled by 'collations', e.g.,
          COLLATE Latin1_General_CI_AS   (Latin1 collation, case insensitive, accent sensitive)
      For other available collations, check your database
      Collations can be specified at three levels:
        For the entire database
        For an attribute, in CREATE TABLE
        In the WHERE clause
    LIKE String (pattern matching; % matches any sequence of characters, _ matches exactly one), e.g.,
          'John Wayne' LIKE 'John%'
          'John Wayne' LIKE '% W_yne'
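The comparison and LIKE behaviour above can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in database; SQLite's NOCASE collation is an assumption substituted here for Latin1_General_CI_AS, which SQLite does not provide, and the helper name scalar is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def scalar(sql):
    """Run a query that yields a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

# LIKE patterns from the slide: % matches any run of characters, _ exactly one.
like_prefix = scalar("SELECT 'John Wayne' LIKE 'John%'")
like_single = scalar("SELECT 'John Wayne' LIKE '% W_yne'")

# The same comparison under two collations (specified inside the condition).
binary_eq = scalar("SELECT 'gecko' = 'Gecko'")                 # default BINARY
nocase_eq = scalar("SELECT 'gecko' = 'Gecko' COLLATE NOCASE")  # case-insensitive

print(like_prefix, like_single, binary_eq, nocase_eq)
```

SQLite reports TRUE as 1 and FALSE as 0, so only the default (binary) comparison comes back as 0.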
SQL Special Data Types (p1)
  Dates and Times (look them up)
  NULL values (⊥ in relational algebra)
    Can mean one of three things:
      Value is unknown
      Value is inapplicable (e.g., spouse name for a single person)
      Value is not shown, perhaps because of security concerns
    Regardless of the cause, NULL cannot be treated as a constant
  Operations with NULLs
      NULL + number  ->  NULL
      NULL × number  ->  NULL
      NULL = NULL    ->  UNKNOWN
      X IS NULL      ->  TRUE or FALSE (depending on X)
      NULL × 0       ->  NULL (not 0)
      NULL - NULL    ->  NULL (not 0)

SQL Special Data Types (p2)
  UNKNOWN values
    Result from comparisons with NULLs
    Other comparisons yield TRUE or FALSE
    UNKNOWN means neither TRUE nor FALSE
  Operations when combined with other logical values
      UNKNOWN AND TRUE   ->  UNKNOWN
      UNKNOWN AND FALSE  ->  FALSE
      UNKNOWN OR TRUE    ->  TRUE
      UNKNOWN OR FALSE   ->  UNKNOWN
      NOT UNKNOWN        ->  UNKNOWN
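The NULL and UNKNOWN rules can be checked mechanically. The sketch below assumes SQLite (through Python's sqlite3 module), which represents UNKNOWN as NULL; Python then shows it as None, with 1 for TRUE and 0 for FALSE. The dictionary name results is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def scalar(sql):
    """Run a query that yields a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

results = {
    "NULL + 1": scalar("SELECT NULL + 1"),
    "NULL * 0": scalar("SELECT NULL * 0"),
    "NULL = NULL": scalar("SELECT NULL = NULL"),
    "NULL IS NULL": scalar("SELECT NULL IS NULL"),
    "UNKNOWN AND TRUE": scalar("SELECT (NULL = NULL) AND 1"),
    "UNKNOWN AND FALSE": scalar("SELECT (NULL = NULL) AND 0"),
    "UNKNOWN OR TRUE": scalar("SELECT (NULL = NULL) OR 1"),
    "UNKNOWN OR FALSE": scalar("SELECT (NULL = NULL) OR 0"),
    "NOT UNKNOWN": scalar("SELECT NOT (NULL = NULL)"),
}

for expression, value in results.items():
    print(f"{expression:18} -> {value}")
```

A practical consequence: a WHERE clause keeps a row only when its condition is TRUE, so rows whose condition evaluates to UNKNOWN are dropped just like rows where it is FALSE.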
Ordering Results
  Relation: BROWSER_TABLE
      SELECT BROWSER, PLATFORM FROM BROWSER_TABLE
      WHERE ENGINE = 'Gecko'
      ORDER BY ENGINE_VERSION, BROWSER
  1. Start with the relation
  2. Select (σ) rows
  3. Order rows
  4. Project (π) columns
Detour: World Database
  A sample MySQL database, downloadable from the web
  Has 3 tables: City, Country, CountryLanguage
    City: ID, Name, CountryCode, District, Population
    Country: Code, Name, Continent, Region, SurfaceArea, IndepYear, Population, LifeExpectancy, GNP, GNPOld, LocalName, GovernmentForm, HeadOfState, Capital, Code2
    CountryLanguage: CountryCode, Language, IsOfficial, Percentage
  The three tables are 'connected' by the country code: City.CountryCode and CountryLanguage.CountryCode both refer to Country.Code.
Joins
  Find all cities in Estonia
      SELECT City.Name
      FROM City, Country
      WHERE Country.Name = 'Estonia'
        AND City.CountryCode = Country.Code;
  Find all countries where Dutch is the official language
      SELECT Country.Name
      FROM Country, CountryLanguage
      WHERE CountryLanguage.CountryCode = Country.Code
        AND CountryLanguage.Language = 'Dutch'
        AND CountryLanguage.IsOfficial = 'T';
Join Semantics – Nested Loops
  Find all cities in Estonia
      SELECT City.Name
      FROM City, Country
      WHERE Country.Name = 'Estonia' AND City.CountryCode = Country.Code
  Is equivalent to
      For each tuple t1 in City:
        For each tuple t2 in Country:
          If the WHERE clause is satisfied:
            Accumulate <t1, t2> into a result set
      Project City.Name from the accumulated result set
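The nested-loop reading can be executed directly and compared with what a database returns. This is only a sketch of the semantics, not of how an engine actually evaluates joins; the two tiny tables below are a hand-picked sample standing in for the World database, reduced to the columns the query needs.

```python
import sqlite3

# Miniature stand-ins for City(Name, CountryCode) and Country(Code, Name).
city = [("Tallinn", "EST"), ("Tartu", "EST"), ("Helsinki", "FIN")]
country = [("EST", "Estonia"), ("FIN", "Finland")]

# Nested-loop semantics: test the WHERE clause on every pair <t1, t2>.
accumulated = []
for t1 in city:
    for t2 in country:
        if t2[1] == "Estonia" and t1[1] == t2[0]:
            accumulated.append((t1, t2))
# Project City.Name from the accumulated pairs.
loop_result = sorted(t1[0] for t1, _ in accumulated)

# The same query, run by SQLite on the same rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE City (Name TEXT, CountryCode TEXT)")
conn.execute("CREATE TABLE Country (Code TEXT, Name TEXT)")
conn.executemany("INSERT INTO City VALUES (?, ?)", city)
conn.executemany("INSERT INTO Country VALUES (?, ?)", country)
sql_result = sorted(
    row[0]
    for row in conn.execute(
        "SELECT City.Name FROM City, Country "
        "WHERE Country.Name = 'Estonia' AND City.CountryCode = Country.Code"
    )
)

print(loop_result, sql_result)
```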
Join Semantics – Relational Algebra
  Find all cities in Estonia
      SELECT City.Name
      FROM City, Country
      WHERE Country.Name = 'Estonia'
        AND City.CountryCode = Country.Code
  Is equivalent to
      π_A1( σ_{B1 = 'Estonia' AND A2 = B2} (A × B) )
  where A = City, B = Country,
        A1 = City.Name, A2 = City.CountryCode,
        B1 = Country.Name, B2 = Country.Code
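Self-Joins
  Find all districts in Kenya that have more than one city
      SELECT DISTINCT c1.district
      FROM city c1, city c2, country
      WHERE c1.name != c2.name
        AND c1.district = c2.district
        AND country.code = c1.countrycode
        AND country.code = c2.countrycode
        AND country.name = 'Kenya';
  The same table (city) gets used with two names, c1 and c2
  The two copies must agree on the district; without that condition, every district would qualify as soon as Kenya had any two cities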
Set Operators
  Find all districts in Kenya that have exactly one city
      ( SELECT DISTINCT city.district
        FROM city, country
        WHERE country.code = city.countrycode
          AND country.name = 'Kenya' )
      EXCEPT
      ( SELECT DISTINCT c1.district
        FROM city c1, city c2, country
        WHERE c1.name != c2.name
          AND c1.district = c2.district
          AND country.code = c1.countrycode
          AND country.code = c2.countrycode
          AND country.name = 'Kenya' );
  Both sides must yield tuples with the same schema (same number and types of attributes)
  UNION and INTERSECT are used the same way
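A small runnable sketch of the self-join and of EXCEPT follows. The city and district names are invented placeholders, not World database rows, and the self-join includes the condition that both city rows share a district. SQLite is assumed as the engine; it does not accept parentheses around the operands of EXCEPT, so they are omitted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (name TEXT, district TEXT, countrycode TEXT)")
conn.execute("CREATE TABLE country (code TEXT, name TEXT)")
conn.executemany("INSERT INTO country VALUES (?, ?)",
                 [("KEN", "Kenya"), ("XXX", "Elsewhere")])
# Placeholder rows: district North has two cities, South has one.
conn.executemany("INSERT INTO city VALUES (?, ?, ?)", [
    ("CityA", "North", "KEN"),
    ("CityB", "North", "KEN"),
    ("CityC", "South", "KEN"),
    ("CityD", "West", "XXX"),
])

MORE_THAN_ONE = """
    SELECT DISTINCT c1.district
    FROM city c1, city c2, country
    WHERE c1.name != c2.name
      AND c1.district = c2.district
      AND country.code = c1.countrycode
      AND country.code = c2.countrycode
      AND country.name = 'Kenya'
"""
ALL_DISTRICTS = """
    SELECT DISTINCT city.district
    FROM city, country
    WHERE country.code = city.countrycode
      AND country.name = 'Kenya'
"""

several = sorted(r[0] for r in conn.execute(MORE_THAN_ONE))
# All Kenyan districts minus those with more than one city.
exactly_one = sorted(
    r[0] for r in conn.execute(ALL_DISTRICTS + " EXCEPT " + MORE_THAN_ONE)
)
print(several, exactly_one)
```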
Subqueries
  A different way to structure queries (without using joins)
      SELECT  ___________________
      FROM    _____Subquery 3____
      WHERE   _____Subquery 1____
              _____Subquery 2____
Subqueries Returning Scalars
  Find all cities in Estonia
      SELECT City.Name
      FROM City, Country
      WHERE Country.Name = 'Estonia'
        AND City.CountryCode = Country.Code
  Can also be written as
      SELECT Name
      FROM City
      WHERE CountryCode =
        (SELECT Code FROM Country WHERE Name = 'Estonia')
  The two forms are equivalent except when…
Conditions Returning Relations
  Find all countries where Dutch is the official language
      SELECT Country.Name
      FROM Country, CountryLanguage
      WHERE CountryLanguage.CountryCode = Country.Code
        AND CountryLanguage.Language = 'Dutch'
        AND IsOfficial = 'T';
  Can also be written as
      SELECT Name FROM Country WHERE Code IN
        ( SELECT CountryCode FROM CountryLanguage
          WHERE Language = 'Dutch' AND IsOfficial = 'T' );

Conditions Returning Tuples
  Find all countries where Dutch is the official language
      SELECT Name FROM Country WHERE Code IN
        ( SELECT CountryCode FROM CountryLanguage
          WHERE Language = 'Dutch' AND IsOfficial = 'T' );
  Can also be written as
      SELECT Name FROM Country WHERE (Code, 'T') IN
        ( SELECT CountryCode, IsOfficial FROM CountryLanguage
          WHERE Language = 'Dutch' );
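Subqueries in FROM clauses
  Total population of all countries with Dutch as the official language
      SELECT SUM(c.Population) AS TotalPopulation
      FROM Country c,
           ( SELECT CountryCode FROM CountryLanguage
             WHERE Language = 'Dutch' AND IsOfficial = 'T' ) AS d
      WHERE c.Code = d.CountryCode;
  The subquery in the FROM clause acts as a temporary relation and must be given an alias (here d)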
Cross Joins
  Populations of cities in Finland relative to Aruba & Singapore
      SELECT city.name AS City,
             city.population AS Population,
             cntry.name AS Country,
             (city.population * 100 / cntry.population) AS 'Percent'
      FROM (SELECT * FROM City WHERE CountryCode = 'FIN') AS city
           CROSS JOIN
           (SELECT * FROM Country WHERE Code = 'ABW' OR Code = 'SGP') AS cntry;
Theta Joins
  A cross join with a condition
  The most common form of JOIN
  All cities in Finland with a population more than double that of Aruba
      SELECT cty.name AS City,
             cty.population AS Population,
             cntry.name AS Country,
             (cty.population * 100 / cntry.population) AS 'Percent'
      FROM (SELECT * FROM City WHERE CountryCode = 'FIN') AS cty
           JOIN
           (SELECT * FROM Country WHERE Code = 'ABW') AS cntry
           ON cty.population > 2 * cntry.population;
Outer Joins
  Selecting elements of a table regardless of whether they have a match in the other table
  Cities starting with 'Tok' and countries whose code starts with 'J'
      SELECT c.*, r.name AS Country
      FROM (SELECT * FROM city WHERE city.name LIKE 'Tok%') AS c
           LEFT OUTER JOIN
           (SELECT * FROM country WHERE country.code LIKE 'J%') AS r
           ON (c.countrycode = r.code);
  Yields 6 cities, 5 in Japan and Tokat in Turkey
  What if we had done a RIGHT OUTER JOIN?
Review and Contrast Joins
  MySQL does not implement FULL OUTER JOIN
    How can we get it if we need it?
  Are CROSS JOIN and FULL OUTER JOIN the same thing?
  Table A has 3 rows, table B has 5 rows.
    How many rows does A CROSS JOIN B have?
    How many rows does A LEFT OUTER JOIN B have?
    How about A RIGHT OUTER JOIN B?
    A FULL OUTER JOIN B?
    A INNER JOIN B?
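One way to explore these questions is to count rows on a concrete example. In the sketch below, the tables, their single column k, and the join condition A.k = B.k are all assumptions made for illustration; apart from the cross join, the counts depend on that data and condition. The full outer join is emulated as a left outer join plus the unmatched rows of the other table, and the right outer join is written as a left outer join with the operands swapped, so the code does not rely on engine support for either.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (k INTEGER)")
conn.execute("CREATE TABLE B (k INTEGER)")
conn.executemany("INSERT INTO A VALUES (?)", [(1,), (2,), (3,)])              # 3 rows
conn.executemany("INSERT INTO B VALUES (?)", [(2,), (3,), (4,), (5,), (6,)])  # 5 rows

def count(sql):
    """Return the single COUNT(*) value produced by the query."""
    return conn.execute(sql).fetchone()[0]

cross = count("SELECT COUNT(*) FROM A CROSS JOIN B")
inner = count("SELECT COUNT(*) FROM A INNER JOIN B ON A.k = B.k")
left = count("SELECT COUNT(*) FROM A LEFT OUTER JOIN B ON A.k = B.k")
# A RIGHT OUTER JOIN B, written as B LEFT OUTER JOIN A.
right = count("SELECT COUNT(*) FROM B LEFT OUTER JOIN A ON A.k = B.k")
# Emulated FULL OUTER JOIN: the left outer join, plus B rows with no partner in A.
full = count("""
    SELECT COUNT(*) FROM (
        SELECT A.k AS ak, B.k AS bk FROM A LEFT OUTER JOIN B ON A.k = B.k
        UNION ALL
        SELECT A.k, B.k FROM B LEFT OUTER JOIN A ON A.k = B.k WHERE A.k IS NULL
    )
""")

print(cross, inner, left, right, full)
```

Only the cross join has a size fixed by the table sizes alone (3 × 5 = 15); the other counts change with the data and the join condition.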
Reading Assignment
  Section 6.4
  Section 6.5
    Keep timing considerations in mind
    SQL completely evaluates the query before making any changes
Transactions
  ACID
    Atomicity
      Sets of database operations that need to be accomplished atomically: either they all get done or none do
      E.g., during a money transfer, if money is taken out of one account, it must be added to the other
    Consistency
      Enforce constraints on types, values, foreign keys
      Maintain relationships among data elements (see Atomicity)
    Isolation
      Each transaction must appear to be executed as if no other transaction is executing at the same time
    Durability
      Once committed, the change is permanent
Detour: Transaction Scenario
  Real Time Bank (RTB) is an on-line bank.
    RTB executes money transfers as soon as requests are entered
    RTB shows up-to-the-minute account balances
    Transactions that would create a negative balance are denied
  Scenario
    Initially, Alice has $250, Bob has $100, Cathy has $150
    Transactions:
      1. Alice pays Bob $200
      2. Bob pays Cathy $150
      3. Cathy pays Alice $250
  Interesting aside: only with transaction order 1, 2, 3 will all three succeed
  At a Nightly Processing Bank, transaction order would be irrelevant
Transaction Atomicity
  Work by example: Alice pays Bob $200
      BEGIN TRANSACTION
      UPDATE Accounts
        SET balance = balance - 200
        WHERE Owner = 'Alice'
      IF (0 > (SELECT balance FROM Accounts WHERE Owner = 'Alice'))
        ROLLBACK TRANSACTION    -- Note: pidgin SQL syntax; the transfer ends here if rolled back
      UPDATE Accounts
        SET balance = balance + 200
        WHERE Owner = 'Bob'
      COMMIT TRANSACTION
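The pidgin SQL can be turned into something runnable. The sketch below assumes SQLite through Python's sqlite3 module, opened with isolation_level=None so that BEGIN, COMMIT and ROLLBACK are issued explicitly; the function names transfer and run and the Accounts schema are invented for illustration. It replays the Real Time Bank scenario in two different orders.

```python
import sqlite3

def transfer(conn, payer, payee, amount):
    """Move amount from payer to payee atomically; return True if committed."""
    conn.execute("BEGIN")
    try:
        conn.execute("UPDATE Accounts SET balance = balance - ? WHERE owner = ?",
                     (amount, payer))
        (balance,) = conn.execute(
            "SELECT balance FROM Accounts WHERE owner = ?", (payer,)).fetchone()
        if balance < 0:
            conn.execute("ROLLBACK")   # undo the debit; nothing is credited
            return False
        conn.execute("UPDATE Accounts SET balance = balance + ? WHERE owner = ?",
                     (amount, payee))
        conn.execute("COMMIT")
        return True
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise

def run(order):
    """Replay the three payments in the given order on fresh accounts."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE Accounts (owner TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO Accounts VALUES (?, ?)",
                     [("Alice", 250), ("Bob", 100), ("Cathy", 150)])
    payments = {1: ("Alice", "Bob", 200),
                2: ("Bob", "Cathy", 150),
                3: ("Cathy", "Alice", 250)}
    outcomes = [transfer(conn, *payments[i]) for i in order]
    balances = dict(conn.execute("SELECT owner, balance FROM Accounts"))
    return outcomes, balances

print(run([1, 2, 3]))
print(run([2, 1, 3]))
```

In the second ordering, Bob's payment is rolled back, so his balance is left untouched rather than going negative, and Cathy's later payment is denied for the same reason.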
Transaction Isolation
  Isolation levels and the problems they leave behind:
    READ UNCOMMITTED
      Dirty read: data written by an uncommitted transaction is visible to others
    READ COMMITTED: only committed data is visible
      Non-repeatable read: a transaction re-reads some data and finds that it has changed because another transaction committed
    REPEATABLE READ: place locks on all data that are used in the transaction
      Phantom read: a transaction re-executes a subquery returning a set of rows and finds a different set of rows
    SERIALIZABLE: as if all transactions occur in a completely isolated fashion
      Can be too restrictive to support a high transaction volume
  Note: Not every database offers each isolation level.
  Choose the isolation level with care!
CS 542 Database Management Systems
Database Logic – The Foundation of Datalog
About Datalog
  Intellectual debt to Prolog, the logic programming language
  Responsible for the addition of recursion to SQL-99
  Extends SQL but still leaves it Turing-incomplete
  Introductory example:
    Facts:
        Par(sally, john), Par(martha, mary), Par(mary, peter), Par(john, peter)
    Rules:
        Sib(x, y) <- Par(x, p) AND Par(y, p) AND x <> y
        Cousin(x, y) <- Sib(x, y)
        Cousin(x, y) <- Par(x, xp) AND Par(y, yp) AND Cousin(xp, yp)
    Conclusion:
        Cousin(sally, martha)
Why Data Logic?
  Why is SQL not sufficient?
    Deductive rules express things that go in both FROM and WHERE clauses
    Allow for stating general requirements that are more difficult to state correctly in SQL
    Allow us to take advantage of research in logic programming and AI
The Formalism of Rules
      Ancestor(x, y) <- Parent(x, z) AND Ancestor(z, y)
  Head = consequent, a single subgoal: Ancestor(x, y)
  Read the symbol "<-" as "if"
  Body = antecedent = AND of subgoals: Parent(x, z) AND Ancestor(z, y)
  The head is true if all the subgoals are true
  The rule applies for all values of its arguments
  A variable appearing in the head is distinguished; otherwise it is nondistinguished

Interpreting Rules
  The head is true for given values of the distinguished variables if there exist values of the non-distinguished variables that make all subgoals of the body true.
  For the rule to be safe, every variable must appear in some non-negated relational subgoal of the body
  Unsafe examples:
IDB/EDB
  Convention: predicates begin with a capital letter, variables begin with a lowercase letter
    e.g., Ancestor(x, y)
  Fact predicates are atoms represented as relations
    If a tuple exists, that fact is true
    Otherwise, false
  A predicate representing a stored relation is called an extensional database (EDB) predicate
  Subgoals of a rule may be facts or may themselves be defined by rules
    EDB when it is a fact
    Intensional database (IDB) when it is a "derived relation"
  Rule heads are always IDBs
Computing IDB Relations Bottom-up
      Empty out all IDB relations
      REPEAT
        FOR (each IDB predicate p) DO
          evaluate p using current values of all relations;
      UNTIL (no IDB relation is changed)
  As long as there is no negation of IDB subgoals, each IDB relation grows with each iteration
    At least, it does not shrink
  Since relations are finite, the loop eventually terminates
  For some rules there is no guarantee that the loop will ever terminate; such rules are considered unsafe
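The bottom-up loop can be sketched for the Sib/Cousin rules and the four Par facts from the introductory Datalog example. This is a naive fixpoint written for readability, with relations held as Python sets of tuples; the function names eval_sib and eval_cousin are invented for illustration.

```python
# EDB facts: Par(c, p) means p is a parent of c.
par = {("sally", "john"), ("martha", "mary"), ("mary", "peter"), ("john", "peter")}

def eval_sib(par):
    # Sib(x, y) <- Par(x, p) AND Par(y, p) AND x <> y
    return {(x, y) for (x, p) in par for (y, q) in par if p == q and x != y}

def eval_cousin(par, sib, cousin):
    # Cousin(x, y) <- Sib(x, y)
    base = set(sib)
    # Cousin(x, y) <- Par(x, xp) AND Par(y, yp) AND Cousin(xp, yp)
    step = {(x, y) for (x, xp) in par for (y, yp) in par if (xp, yp) in cousin}
    return base | step

# Empty out all IDB relations, then repeat until nothing changes.
sib, cousin = set(), set()
rounds = 0
while True:
    rounds += 1
    new_sib = eval_sib(par)
    new_cousin = eval_cousin(par, sib, cousin)  # uses the current IDB values
    if new_sib == sib and new_cousin == cousin:
        break
    sib, cousin = new_sib, new_cousin

print(sorted(sib))
print(sorted(cousin))
print("rounds:", rounds)
```

Each pass can only add tuples, and only finitely many tuples can be built from the constants in Par, which is why the loop is guaranteed to stop here.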
Computing IDB Relations Top-Down (p1)
  EDB: Par(c, p) = p is a parent of c.
  Generalized cousins: people with common ancestors one or more generations back:
      Sib(x, y) <- Par(x, p) AND Par(y, p) AND x <> y
      Cousin(x, y) <- Sib(x, y)
      Cousin(x, y) <- Par(x, xp) AND Par(y, yp) AND Cousin(xp, yp)
  Form a dependency graph whose nodes = IDB predicates.
  Arc X -> Y if and only if there is a rule with X in the head and Y in the body.
  Cycle = recursion; no cycle = no recursion.

Computing IDB Relations Top-Down (p2)
      FOR IDB predicate p(x, y, …)
        FOR EACH subgoal of p DO
          IF subgoal is IDB, recursive call;
          IF subgoal is EDB, look up
  The recursion eventually terminates unless:
    A distinguished variable
      does not appear in a subgoal,
      only appears in a negated subgoal, or
      only appears in an arithmetic subgoal
    Same 3 conditions for variables in an arithmetic subgoal
    Same 3 conditions for variables in a negated subgoal
Safe Rules
  A rule is safe if:
    Each distinguished variable,
    Each variable in an arithmetic subgoal, and
    Each variable in a negated subgoal
  also appears in a nonnegated, relational subgoal.
  Safe rules prevent infinite results.
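The safety condition is mechanical enough to check in a few lines. The sketch below uses an invented encoding, not a standard one: a rule is given as its head variables plus a list of subgoals, each a tuple (kind, negated, variables) where kind is "relational" or "arithmetic". The second rule tested, P(x, y) <- Q(x), is a hypothetical example made up to show a violation.

```python
def is_safe(head_vars, subgoals):
    """Return True if the rule satisfies the safety condition."""
    # Variables covered by some nonnegated, relational subgoal.
    covered = set()
    for kind, negated, variables in subgoals:
        if kind == "relational" and not negated:
            covered |= set(variables)
    # Variables that need cover: distinguished variables, plus every variable
    # in an arithmetic subgoal or in a negated subgoal.
    needed = set(head_vars)
    for kind, negated, variables in subgoals:
        if kind == "arithmetic" or negated:
            needed |= set(variables)
    return needed <= covered

# Sib(x, y) <- Par(x, p) AND Par(y, p) AND x <> y
sib_rule = (["x", "y"], [
    ("relational", False, ["x", "p"]),
    ("relational", False, ["y", "p"]),
    ("arithmetic", False, ["x", "y"]),
])

# Hypothetical unsafe rule: P(x, y) <- Q(x); y appears only in the head.
unsafe_rule = (["x", "y"], [
    ("relational", False, ["x"]),
])

sib_is_safe = is_safe(*sib_rule)
unsafe_is_safe = is_safe(*unsafe_rule)
print(sib_is_safe, unsafe_is_safe)
```

In the hypothetical rule, y is unconstrained, so the head would hold for infinitely many values of y; this is the kind of infinite result the safety condition rules out.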
Evaluating Datalog Programs
  As long as there is no recursion, we can pick an order in which to evaluate the IDB predicates, so that by the time a predicate is evaluated, all the predicates in the bodies of its rules have already been evaluated.
  If an IDB predicate has more than one rule, each rule contributes tuples to its relation.
Expressive Power of Datalog
  Without recursion, Datalog can express all and only the queries of core relational algebra.
    The same as SQL select-from-where, without aggregation and grouping.
  But with recursion, Datalog can express more than these languages.
  Yet still not Turing-complete.
SQL Rule Definitions & Usage
  Definition of Datalog rules:
      WITH [RECURSIVE] <RuleName> (<arguments>) AS (<query>)
  Invocation of Datalog rules, in the same statement, directly after the definitions:
      <SQL query about EDB, IDB>
SQL Recursion Example (p1)
  Find Sally's cousins
  Using the recursive definition introduced earlier
  Par(child, parent) is the EDB
  Expected SQL query:
      SELECT y
      FROM Cousin
      WHERE x = 'Sally';
  But first, we need to define the IDB Cousin

SQL Recursion Example (p2)
  WITH clause (non-recursive):
      WITH Sib(x, y) AS
        (SELECT p1.child, p2.child
         FROM Par p1, Par p2
         WHERE p1.parent = p2.parent
           AND p1.child <> p2.child),
  WITH clause (recursive):
      RECURSIVE Cousin(x, y) AS
        ((SELECT * FROM Sib)
           UNION
         (SELECT p1.child, p2.child
          FROM Par p1, Par p2, Cousin
          WHERE p1.parent = Cousin.x
            AND p2.parent = Cousin.y))
  Many implementations (e.g., PostgreSQL, SQLite, MySQL 8) write the keyword once, as WITH RECURSIVE, before the first definition
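The recursive definition can be run end to end. The sketch below assumes SQLite through Python's sqlite3 module, where the keyword is written once as WITH RECURSIVE ahead of both definitions, and it loads the four Par facts from the introductory Datalog example, with names in lowercase as in those facts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Par (child TEXT, parent TEXT)")
conn.executemany("INSERT INTO Par VALUES (?, ?)", [
    ("sally", "john"), ("martha", "mary"), ("mary", "peter"), ("john", "peter"),
])

QUERY = """
WITH RECURSIVE
  Sib(x, y) AS (
    SELECT p1.child, p2.child
    FROM Par p1, Par p2
    WHERE p1.parent = p2.parent
      AND p1.child <> p2.child
  ),
  Cousin(x, y) AS (
    SELECT x, y FROM Sib
    UNION
    SELECT p1.child, p2.child
    FROM Par p1, Par p2, Cousin
    WHERE p1.parent = Cousin.x
      AND p2.parent = Cousin.y
  )
SELECT y FROM Cousin WHERE x = 'sally'
"""

sally_cousins = [row[0] for row in conn.execute(QUERY)]
print(sally_cousins)
```

UNION (rather than UNION ALL) discards duplicate tuples, which is what lets the recursion stop once no new Cousin tuples appear.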
Next meeting
  January 31
  Sections 7.1 – 7.3
  Sections 8.1, 8.3 – 8.4
  Discussion of presentation topic proposals

CS 542 Overview of query processing
