For any help regarding Electrical Engineering Exam Help
Visit :- liveexamhelper.com
Email :- info@liveexamhelper.com
call us at :- +1 678 648 4277 liveexamhelper.com
I Short Answer
1. To reduce the number of plans the query optimizer must consider, the Selinger Optimizer
employs a number of heuristics to reduce the search space. List three:
– Push selections down to the leaves.
– Push projections down.
– Consider only left deep plans.
– Join tables without join predicates (cross product joins) as late as possible.
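To get a feel for how much the left-deep restriction shrinks the search space, the sketch below counts join orders over n relations. It assumes cross products are allowed and ignores the choice of join algorithm and access path; the function names are illustrative and do not come from the Selinger paper.

```python
from math import factorial

def left_deep_plans(n: int) -> int:
    """Number of left-deep join orders over n relations: n!."""
    return factorial(n)

def bushy_plans(n: int) -> int:
    """Number of ordered binary join trees over n relations:
    n! * Catalan(n-1) = (2(n-1))! / (n-1)!."""
    return factorial(2 * (n - 1)) // factorial(n - 1)

for n in (2, 4, 8):
    print(n, left_deep_plans(n), bushy_plans(n))
```

Even at 8 relations the left-deep space is several hundred times smaller than the full space of bushy trees, which is why the restriction pays off.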
2. Give one reason why the REDO pass of ARIES must use physical logging.
The pages on disk may be in any state between “no updates applied” and “all updates applied.”
Logical redo cannot be used on an indeterminate state, whereas physical redo logging simply
overwrites the data on disk with the logged values. In other words, physical redo is
idempotent whereas logical redo is not.
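The idempotence argument can be illustrated with a toy example. This is only a sketch: the one-entry "page" dictionary and the two redo functions are hypothetical stand-ins, not ARIES data structures.

```python
def physical_redo(page: dict, key: str, after_image: int) -> None:
    """Physical redo: overwrite the location with the logged after-image."""
    page[key] = after_image

def logical_redo(page: dict, key: str, delta: int) -> None:
    """Logical redo: re-execute the operation (here, an increment)."""
    page[key] += delta

# Assume the update "A = A + 5" (10 -> 15) already reached disk before the crash.
page_phys = {"A": 15}
page_log = {"A": 15}

physical_redo(page_phys, "A", 15)  # value stays 15, so repeating is harmless
logical_redo(page_log, "A", 5)     # value becomes 20: the update is applied twice
```

Because recovery cannot tell whether the page already contains the update, only the overwrite-style redo is safe to repeat.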
Electrical Engineering and Computer Science
II Optimistic Concurrency Control
For the following transaction schedules, indicate which transactions would commit and which would abort when
run using the parallel validation scheme described in the Kung and Robinson paper on Optimistic Concurrency
Control. Also give a brief justification for your answer. You may assume that if a transaction aborts, it does not
execute its write phase, rolling back instantly at the end of the validation phase.
3.
[Figure: timeline of the Read, Validate, and Write phases of T1 and T2]
T1
Read set: {A,B} Write set: {B,C}
T2
Read set: {A,D} Write set: {A,B}
Transactions that commit: T1,T2 or T1
Transactions that abort: {} or T2
Justification:
There are two correct solutions:
T1 and T2 both commit. T1 commits because it validates first. T2 will also commit because while its write set intersects T1’s write
set, T1 finishes its write phase before T2 starts its write phase (condition 2 in section 3.1 of the paper).
T1 commits, T2 aborts. T1 commits because it validates first. In the pseudocode from the paper (section 5), T1 is executing the line
marked “write phase” when T2 starts its validation phase. T2 finds T1 in the “finish active” set, and aborts due to a write/write
conflict. This is sort of a “false” conflict, since it would be safe for T2 to commit. However, the parallel algorithm assumes that T1 is
still writing at the time that T2 detects the conflict, so it must abort.
4.
[Figure: timeline of the Read, Validate, and Write phases of T1, T2, and T3]
T1
Read set: {A,B} Write set: {B,C}
T2
Read set: {A,B} Write set: {B,D}
T3
Read set: {D} Write set: {D}
Transactions that commit: T1,T3
Transactions that abort: T2
Justification:
T1 and T3 commit, T2 aborts. T1 commits because it validates first. T2 must abort because T1’s
write set intersects T2’s read set, and T2 started reading before T1 finished writing. T3 then
commits because, although it conflicts with T2, T2 has already aborted by the time T3 begins
validation (T2 finishes aborting at the end of its validation phase).
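The two checks used in these justifications can be written down as a small function. This is a simplified sketch of the parallel validation test, not the paper's pseudocode: the function name and parameters are hypothetical, and the caller must decide from the timeline which earlier transactions fall into which list (aborted transactions appear in neither).

```python
def validate(read_set, write_set, committed_during_read, still_writing):
    """Return True if the validating transaction may commit.

    committed_during_read: write sets of transactions that completed their
        write phase after this transaction started reading.
    still_writing: write sets of transactions that had validated but were
        still in their write phase when this validation began
        (the "finish active" set).
    """
    for ws in committed_during_read:
        if ws & read_set:                 # it may have read stale data
            return False
    for ws in still_writing:
        if ws & (read_set | write_set):   # read/write or write/write conflict
            return False
    return True

T1_WRITES = {"B", "C"}

# Question 3, first reading: T1 finished writing before T2 validated.
q3_commit = validate({"A", "D"}, {"A", "B"}, [T1_WRITES], [])
# Question 3, second reading: T1 is in T2's "finish active" set.
q3_abort = validate({"A", "D"}, {"A", "B"}, [], [T1_WRITES])
# Question 4: T2 read B, which T1 wrote; T3 ignores the aborted T2.
q4_t2 = validate({"A", "B"}, {"B", "D"}, [T1_WRITES], [])
q4_t3 = validate({"D"}, {"D"}, [T1_WRITES], [])
```

Under these assumptions the function reproduces the answers above: T2 commits or aborts in question 3 depending on the reading, and in question 4 T2 aborts while T3 commits.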
III Schema Design
Consider a relational table:
Professor(
professor name, professor id, professor office id,
student id, student name, student office id,
student designated refrigerator id, refrigerator owner id, refrigerator id,
refrigerator size, secretary name, secretary id, secretary office )
Suppose the data has the following properties:
A. Professors and secretaries have individual offices, students share offices.
B. Students can work for multiple professors.
C. Refrigerators are owned by one professor.
D. Professors can own multiple refrigerators.
E. Students can only use one refrigerator.
F. The refrigerator the student uses must be owned by one of the professors they work for.
G. Secretaries can work for multiple professors.
H. Professors only have a single secretary.
5. Put this table into 3rd normal form by writing out the decomposed tables; designate keys in
your tables by underlining them. Designate foreign keys by drawing an arrow from a foreign
key to the primary key it refers to. Note that some of the properties listed above may not be
enforced (i.e., guaranteed to be true) by a 3NF decomposition.
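prof: pid pname poffice psec (key: pid; psec → sec.secid)
student: sid sname soffice sfridge (key: sid; sfridge → fridge.fridgeid)
sec: secid secname secoffice (key: secid)
fridge: fridgeid fridgesize fridgeowner (key: fridgeid; fridgeowner → prof.pid)
sworksfor: sid profid (key: sid, profid; sid → student.sid, profid → prof.pid)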
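As a rough illustration, the decomposition can be declared with Python's built-in sqlite3 module. The column types, the sample rows, and the choice of SQLite are assumptions made for the sketch; only the table and column names come from the answer above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE sec (secid INTEGER PRIMARY KEY, secname TEXT, secoffice TEXT);
CREATE TABLE prof (
    pid INTEGER PRIMARY KEY, pname TEXT, poffice TEXT,
    psec INTEGER REFERENCES sec(secid));          -- one secretary per professor
CREATE TABLE fridge (
    fridgeid INTEGER PRIMARY KEY, fridgesize TEXT,
    fridgeowner INTEGER NOT NULL REFERENCES prof(pid));  -- one owner per fridge
CREATE TABLE student (
    sid INTEGER PRIMARY KEY, sname TEXT, soffice TEXT,
    sfridge INTEGER REFERENCES fridge(fridgeid)); -- one fridge per student
CREATE TABLE sworksfor (
    sid INTEGER REFERENCES student(sid),
    profid INTEGER REFERENCES prof(pid),
    PRIMARY KEY (sid, profid));                   -- many-to-many
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript(SCHEMA)

# Made-up sample rows.
conn.execute("INSERT INTO sec VALUES (1, 'Sam', 'S-1')")
conn.execute("INSERT INTO prof VALUES (10, 'Pat', 'P-1', 1)")
conn.execute("INSERT INTO fridge VALUES (100, 'small', 10)")
try:
    conn.execute("INSERT INTO fridge VALUES (101, 'large', 99)")  # no professor 99
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

The failed insert shows the kind of property that keys alone can guarantee: every refrigerator has exactly one owner, and that owner must exist.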
6. Which of the eight properties (A–H) of the data are enforced (i.e., guaranteed to be true)
by the 3NF decomposition and primary and foreign keys you gave above?
B,C,D,E,G,H; the schema does not enforce that professors and secretaries have individual offices (A)
or that students use the fridge of a professor they work for (F).
7. What could a database administrator do to make sure the properties not explicitly enforced
by your schema are enforced by the database?
Use triggers or stored procedures to ensure that these constraints are checked when data is inserted
or updated.
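A trigger for property F might look like the following sketch, again using sqlite3. The trigger name, the trimmed-down tables, and the sample data are hypothetical; a complete solution would also need matching triggers for inserts into student and for changes to sworksfor and fridge.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fridge (fridgeid INTEGER PRIMARY KEY, fridgeowner INTEGER);
CREATE TABLE student (sid INTEGER PRIMARY KEY, sfridge INTEGER);
CREATE TABLE sworksfor (sid INTEGER, profid INTEGER, PRIMARY KEY (sid, profid));
INSERT INTO fridge VALUES (100, 10), (200, 20);
INSERT INTO student VALUES (1, NULL);
INSERT INTO sworksfor VALUES (1, 10);   -- student 1 works only for professor 10
""")

# Reject a fridge assignment unless its owner is a professor the student works for.
conn.execute("""
CREATE TRIGGER check_student_fridge
BEFORE UPDATE OF sfridge ON student
WHEN NEW.sfridge IS NOT NULL AND NOT EXISTS (
    SELECT 1 FROM fridge f JOIN sworksfor w ON w.profid = f.fridgeowner
    WHERE f.fridgeid = NEW.sfridge AND w.sid = NEW.sid)
BEGIN
    SELECT RAISE(ABORT, 'fridge not owned by a professor the student works for');
END
""")

conn.execute("UPDATE student SET sfridge = 100 WHERE sid = 1")  # owner 10: allowed
try:
    conn.execute("UPDATE student SET sfridge = 200 WHERE sid = 1")  # owner 20
    rejected = False
except sqlite3.DatabaseError:
    rejected = True
current = conn.execute("SELECT sfridge FROM student WHERE sid = 1").fetchone()[0]
```

The second update is rolled back by the trigger, so the student keeps the fridge assigned by the first update.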
IV External Aggregation
Suppose you are implementing an aggregation operator for a database system, and you need
to support aggregation over very large data sets with more groups than can fit into memory.
Your friend Johnny Join begins writing out how he would implement the algorithm.
Unfortunately, he stayed up all night working on Lab 3 and so trails off in an incoherent stream
of obscenities and drool before finishing his code. His algorithm begins as follows:
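[Figure: Johnny Join's partial pseudocode for the hash-partitioned aggregation, ending in a for loop marked “Your code goes here”]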
8. Relative to |T|, how big must |M| be for the algorithm to function correctly? Justify your answer
with a sentence or brief analytical argument.
Because bufs should fit in memory, $n \le |M|$.
The number of hash buckets is computed as $n = |T| / |M|$.
Plugging in for $n$, we get that $|T| / |M| \le |M|$.
Therefore, $|T| \le |M|^2$.
It follows that $|M| \ge \sqrt{|T|}$.
9. Assuming the aggregate phase does only one pass over the data, and that fname
= ’AVG’, describe briefly what should go in the body of the for loop (in place of “Your code goes here”). You may
write pseudocode or a few short sentences. Suppose the system provides a function emit(group,val) to output
the value of a group.
All tuples with the same gbyf will end up in the same partition, but there may (very likely will) be multiple
gbyf values per partition.
The basic algorithm should be something like:
    H = new HashTable // stores (gbyf, <sum,cnt>) pairs
    for tuple t in f:
        if ((<sum,cnt> = H.get(t.gbyf)) == null)
            <sum,cnt> = <0,0>
        sum = sum + t.aggf
        cnt = cnt + 1
        H.put(t.gbyf, <sum,cnt>)
    for (group, <sum,cnt>) in H:
        emit(group, sum/cnt)
Any in-memory aggregation would work here, so if you prefer to sort the values in memory
and then bucket them as is done in the next question, that is also an acceptable answer.
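Putting both phases together, a minimal Python sketch of the hash-based approach could look like this. In-memory lists stand in for the on-disk partition files, tuples are assumed to be (gbyf, aggf) pairs, and the function name and sample rows are made up for illustration.

```python
from collections import defaultdict

def external_avg(tuples, num_partitions):
    """Two-phase hash aggregation for AVG; yields (group, average) pairs."""
    # Phase 1: partition on hash(gbyf). Each list stands in for an on-disk file.
    partitions = [[] for _ in range(num_partitions)]
    for gbyf, aggf in tuples:
        partitions[hash(gbyf) % num_partitions].append((gbyf, aggf))

    # Phase 2: aggregate one partition at a time with an in-memory hash table.
    for part in partitions:
        table = defaultdict(lambda: [0, 0])  # gbyf -> [sum, cnt]
        for gbyf, aggf in part:
            entry = table[gbyf]
            entry[0] += aggf
            entry[1] += 1
        for group, (total, cnt) in table.items():
            yield group, total / cnt

rows = [("a", 1), ("b", 10), ("a", 3), ("c", 5), ("b", 20)]
result = dict(external_avg(rows, num_partitions=2))
```

Each group hashes to exactly one partition, so emitting a group's average at the end of that partition is safe.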
10. Describe an alternative to this hashing-based algorithm. Your answer shouldn’t require
more than a sentence or two.
Remember that the tuples do not fit in memory.
One solution would be to use an external memory sort on the table, where the sort key for
each tuple is the gbyf field.
After sorting, sequentially scan the sorted file’s tuples, keeping a running average for each gbyf
grouping (they will appear in contiguous chunks in the file). The average should be calculated
on each aggf field.
Other solutions included using external memory indices, such as B+Trees, on gbyf to group the
tuples on which to calculate an average.
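The sort-based alternative can be sketched as follows. Here sorted() is only a stand-in for a real external merge sort, and tuples are again assumed to be (gbyf, aggf) pairs; the function name is illustrative.

```python
from itertools import groupby
from operator import itemgetter

def sorted_avg(tuples):
    """Sort on gbyf, then average each contiguous run of equal keys."""
    ordered = sorted(tuples, key=itemgetter(0))  # stand-in for an external sort
    for group, run in groupby(ordered, key=itemgetter(0)):
        total = cnt = 0
        for _, aggf in run:  # one sequential pass, constant state per group
            total += aggf
            cnt += 1
        yield group, total / cnt

rows = [("a", 1), ("b", 10), ("a", 3), ("c", 5), ("b", 20)]
averages = list(sorted_avg(rows))
```

Only the running sum and count of the current group are held in memory during the scan, which is what makes this approach work when the groups do not fit in memory.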
V Cost Estimation
Suppose you are given a database with the following tables and sizes, and that each
data page holds 100 tuples, that both leaf and non-leaf B+Tree pages are dense-packed
and hold 100 keys, and that you have 102 pages of memory. Assume that the buffer
pool is managed as described in the DBMIN Paper (“An Evaluation of Buffer
Management Strategies for Relational Database Systems.”, VLDB 1985.)
Table   Size (pages)
T1      100
T2      1000
T3      5000
11. Estimate the minimum number of I/Os required for the following join operations. Ignore the
difference between random and sequential I/O. Assume that B+Trees store pointers to records in heap
files in their leaves (i.e., B+Tree pages only store keys, not tuples.)
• Nested loops join between T1 and T2, no indices.
T1 fits in memory. Put it as the inner and read all of T1 once. Then scan through T2 as the outer.
Total: |T1| + |T2| = 1100 I/Os.
• Grace hash join between T2 and T3, no indices.
Grace join hashes all of T2 and T3 in one pass, outputting records as it goes. It then scans through the hashed output to do
the join. Therefore it reads all tuples twice and writes them once for a total cost of: 3(|T2| + |T3|) = 18000 I/Os.
• Index nested loops join between a foreign key of T2 and the primary key of T3, with a B+Tree index on
T3 and no index on T2.
Put T2 as the outer and T3 as the inner. Assume the matching tuples are always on a single B+Tree page
and that selectivity is 1 due to the foreign-key/primary-key join. Cache the upper levels of the B+Tree. T3 has
500,000 tuples, so the index needs 3 levels (top level = 100 pointers, second level = $100^2$ = 10,000 pointers,
third level = $100^3$ = 1,000,000 pointers). Cache the root plus all but one of the level 2 pages (100 pages). Read all of
T2 once (one page at a time). For each tuple in T2, we do a lookup in the B+Tree and read one B+Tree leaf
page (at level 3), then follow the pointer to fetch the actual tuple from the heap file (using the other page of
memory).
Writing |T2| for the number of pages in T2 (1,000) and {T2} for the number of tuples in T2 (100,000), the total cost is:
|T2| + 100 (top of B+Tree) + {T2} × (I/Os per B+Tree lookup)
For 99/100 of the B+Tree lookups, two levels will be cached, so each lookup costs 2 I/Os (leaf page plus heap page):
99/100 × (2 × {T2}) = 99/100 × 2 × 100,000 = 198,000 I/Os.
For 1/100 of the B+Tree lookups, only the root level will be cached, so each lookup costs 3 I/Os:
1/100 × (3 × {T2}) = 1/100 × 3 × 100,000 = 3,000 I/Os.
So the total is:
1,000 + 100 + 198,000 + 3,000 = 202,100 I/Os
We were flexible with the exact calculation here due to the complexity introduced by not being able to
completely cache both levels of the B+Tree.
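The arithmetic for the three joins can be checked with a few lines of Python. The variable names are illustrative; the page counts, the 100 tuples per page, and the 99/100 cache-hit split come from the problem statement and the answer above.

```python
PAGES = {"T1": 100, "T2": 1000, "T3": 5000}
TUPLES_PER_PAGE = 100

# Nested loops: T1 held in memory, one scan of each table.
nested_loops = PAGES["T1"] + PAGES["T2"]

# Grace hash join: read, write partitions, read again.
grace_hash = 3 * (PAGES["T2"] + PAGES["T3"])

# Index nested loops: T2 outer, B+Tree on T3.
t2_tuples = PAGES["T2"] * TUPLES_PER_PAGE          # 100,000 lookups
cached_index_pages = 100                           # root + 99 level-2 pages, read once
two_levels_cached = (t2_tuples * 99 // 100) * 2    # leaf page + heap page
root_only_cached = (t2_tuples * 1 // 100) * 3      # level-2 page + leaf + heap page
index_nested_loops = (PAGES["T2"] + cached_index_pages
                      + two_levels_cached + root_only_cached)

print(nested_loops, grace_hash, index_nested_loops)
```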
VI ARIES with CLRs
Suppose you are given the following log file.
LSN  TID  PrevLsn  Type    Data
1    1    -        SOT     -
2    1    1        UP      A
3    1    2        UP      B
4    2    -        SOT     -
5    2    4        UP      C
6    -    -        CP      dirty, trans
7    3    -        SOT     -
8    2    5        UP      D
9    3    7        UP      E
10   1    3        COMMIT  -
11   2    8        UP      B
12   2    8        CLR     B
13   3    7        CLR     E