1. Data Structure And Algorithm
Unit 1 Introduction
Recursion
Recursion is the action of a function calling itself either directly or indirectly,
and the associated function is known as a recursive function. Some problems can
be solved with relative ease using a recursive approach. Towers of Hanoi (TOH),
inorder/preorder/postorder tree traversals, and depth-first search (DFS) of a
graph are a few examples of such problems.
Need for Recursion
Recursion lets us shorten our code and makes some solutions easier to write and
understand. Compared to the iterative approach, it offers a few benefits that
will be covered later. Recursion is well suited to a task that can be described
in terms of smaller instances of the same task. The factorial of a number is
one such example, since n! = n * (n-1)!.
Characteristics of Recursion
Recursion's characteristics include:
o The same actions are repeated with different inputs.
o At each step, the function is called with a smaller input so that the
problem shrinks.
o A base condition must stop the recursion; otherwise, the function would
keep calling itself until the call stack overflows.
Steps in an Algorithm
The following algorithmic steps are used to implement recursion in a function:
1. Establish a base case: Choose the simplest situation for which the
answer is obvious or trivial. This is the recursion's halting condition,
which stops the function from calling itself indefinitely.
2. Define a recursive case: Describe the problem in terms of smaller
versions of itself, and call the function recursively to solve each
subproblem.
3. Verify that the recursion ends: Make sure every recursive call moves
towards the base case, so that the code does not recurse forever.
4. Combine the solutions: To answer the main problem, combine the
solutions to the subproblems.
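The four steps above can be sketched on a small problem. The following is a minimal illustration, not part of the original notes: the function name sum_array and its parameters are assumptions chosen for this example, which recursively adds up the first n elements of an array.

```c
#include <stddef.h>

/* Hypothetical example: recursively sums the first n elements of values. */
long sum_array(const int *values, size_t n)
{
    /* Step 1: base case. An empty range sums to 0. */
    if (n == 0)
        return 0;

    /* Steps 2 and 3: recurse on a smaller range (n - 1 elements), so n
       strictly decreases and must eventually reach the base case.
       Step 4: combine by adding the last element to the partial sum. */
    return values[n - 1] + sum_array(values, n - 1);
}
```

Calling sum_array on the array {1, 2, 3, 4} with n = 4 returns 10, and with n = 0 it returns 0 straight from the base case.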
Recursion consumes additional memory because each call pushes a new frame
(parameters, local variables, return address) onto the call stack, and that
frame stays there until the call returns.
SY BTECH Computer Science And Business Systems
Like the stack data structure, recursive calls follow LIFO (Last In, First Out)
order: the most recent call is the first one to return.
// Recursive Fibonacci function
int Fib(int num)
{
    if (num == 0)
        return 0;
    else if (num == 1)
        return 1;
    else
        return Fib(num - 1) + Fib(num - 2);
}

// Let's say we want to calculate the Fibonacci number for n = 5
Fib(5);
Now consider the recursive Fibonacci function above for n = 5. Each call pushes
a frame onto the call stack and stays suspended until its own recursive calls
return. When a call reaches a base case (num equal to 0 or 1), it returns 0 or
1 to its caller; the frames are then popped one at a time as the partial
results are added together on the way back up. Look at the diagram below to
comprehend the call stack hierarchy.
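To make the size of this call hierarchy concrete, the sketch below adds a counter to the same recursive definition. The names fib_counted and fib_call_count are hypothetical and introduced only for this illustration.

```c
/* Hypothetical counter: how many calls have been made so far. */
static long fib_call_count = 0;

/* Same recursion as Fib, but records every call (one stack frame each). */
int fib_counted(int num)
{
    fib_call_count++;
    if (num == 0)
        return 0;
    if (num == 1)
        return 1;
    return fib_counted(num - 1) + fib_counted(num - 2);
}
```

After resetting fib_call_count to 0, fib_counted(5) returns 5 and leaves the counter at 15, because the call tree for n = 5 contains 15 calls in total.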
Time Complexity
Time complexity measures how the number of operations an algorithm performs
grows in relation to the size of the input.
Big O notation (O()) is the notation most frequently used to express time
complexity.
The steps involved in finding the time complexity of an algorithm are:
o Find the number of statements with constant time complexity (O(1)).
o Find the number of statements with higher orders of complexity like O(N),
O(N^2), O(log N), etc.
o Express the total time complexity as the sum of these terms.
o Drop the non-dominant terms: anything you represent as O(N^2 + N) can be
written as O(N^2), keeping only the highest-order term.
o Ignore constant factors, if any; they do not affect the overall time
complexity.
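As a rough illustration of these steps, the sketch below counts the basic steps of a small routine made of one constant-time statement, one single loop, and one nested loop. The function count_steps is a hypothetical helper written for this example only.

```c
/* Hypothetical example: counts the basic steps executed for input size n. */
long count_steps(int n)
{
    long steps = 0;

    steps++;                        /* one O(1) statement */

    for (int i = 0; i < n; i++)     /* single loop: N steps, O(N) */
        steps++;

    for (int i = 0; i < n; i++)     /* nested loops: N * N steps, O(N^2) */
        for (int j = 0; j < n; j++)
            steps++;

    return steps;                   /* total: N^2 + N + 1 */
}
```

The total is N^2 + N + 1; dropping the non-dominant terms N and 1 gives O(N^2). For n = 4 the function returns 16 + 4 + 1 = 21.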
Best, Worst, and Average Case Complexity:
In analyzing algorithms, we consider three types of time complexity:
1. Best-case complexity (O(best)): This represents the minimum time
required for an algorithm to complete when given the optimal input. It
denotes an algorithm operating at its peak efficiency under ideal
circumstances.
2. Worst-case complexity (O(worst)): This denotes the maximum time an
algorithm will take to finish for any given input. It represents the scenario
where the algorithm encounters the most unfavourable input.
3. Average-case complexity (O(average)): This estimates the typical
running time of an algorithm when averaged over all possible inputs. It
provides a more realistic evaluation of an algorithm's performance.
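Linear search is a convenient way to see the difference between these cases. The sketch below is an illustration under assumed names (linear_search and its comparisons out-parameter are not from the original notes); it reports how many comparisons a search needed.

```c
#include <stddef.h>

/* Hypothetical example: returns the index of target, or -1 if absent.
   The number of element comparisons is stored in *comparisons. */
int linear_search(const int *values, size_t n, int target, size_t *comparisons)
{
    *comparisons = 0;
    for (size_t i = 0; i < n; i++) {
        (*comparisons)++;
        if (values[i] == target)
            return (int)i;
    }
    return -1;
}
```

For the array {7, 3, 9, 1, 5}, searching for 7 is the best case (1 comparison), while searching for 5 or for a missing value such as 8 is the worst case (5 comparisons, one per element).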
How to calculate time Complexity:
#include <stdio.h>
int main()
{
    printf("Hello World");
    return 0;
}
In the above code “Hello World” is printed only once on the screen.
So, the time complexity is constant: O(1) i.e. every time a constant amount of
time is required to execute code, no matter which operating system or which
machine configurations you are using.
#include <stdio.h>
int main(void)
{
    int i, n = 8;
    for (i = 1; i <= n; i++) {
        printf("Hello World !!!\n");
    }
    return 0;
}
In the above code “Hello World !!!” is printed n times on the screen, so the
amount of work grows in direct proportion to the value of n.
So, the time complexity is linear: O(n), i.e. the time required to execute the
code grows linearly with n.
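#include <stdio.h>
int main(void)
{
    int i, n = 8;
    for (i = 1; i <= n; i = i * 2) {
        printf("Hello World !!!\n");
    }
    return 0;
}
In the above code i doubles on every iteration (1, 2, 4, 8), so “Hello World !!!”
is printed only floor(log2(n)) + 1 times, which is 4 times for n = 8.
So, the time complexity is logarithmic: O(log n).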
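The number of iterations of the doubling loop above can be checked directly. The following sketch uses a hypothetical helper, count_doubling_iterations, that runs the same loop header and counts how often the body executes; the counter i is widened to long long only so that the doubling cannot overflow for large n.

```c
/* Hypothetical helper: counts the iterations of
   for (i = 1; i <= n; i = i * 2). */
int count_doubling_iterations(int n)
{
    int iterations = 0;
    for (long long i = 1; i <= n; i = i * 2)
        iterations++;
    return iterations;
}
```

For n = 8 the loop runs 4 times, and for n = 1000 it runs only 10 times, which is the slow growth expected of O(log n).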
Space Complexity
Space complexity refers to the amount of memory an algorithm uses to solve a
problem, including the memory needed to store its data and supporting
structures. It is usually expressed in terms of the size of the input. Space
complexity helps determine the efficiency and scalability of a solution and
the ability of a program to handle large amounts of data, so it is an
important factor to consider when choosing a data structure or designing an
algorithm.
How to Calculate Space Complexity?
The following steps describe how to find the space complexity of an algorithm:
o Find the number of variables of each data type declared in the
algorithm.
o Find the memory required by each data type (this can be done using the
sizeof operator).
o Multiply the memory required by each data type by the number of
variables of that data type. The result gives the memory required for
each data type.
o Add up the memory required for all data types.
o Ignore constants, if any, in the sum (as explained for time complexity)
to obtain the space complexity in Big O notation.
Evaluating the space complexity of an algorithm involves determining the
amount of memory used by various elements such as variables of different data
types, program instructions, constant values, and in some cases, function calls
and the recursion stack. The exact amount of memory used by each data type
may vary with the compiler and platform, but the method of calculating the
space complexity remains the same. Typical sizes are:
Data Type     Memory Space (in bytes)
int           4
float         4
double        8
char          1
short int     2
long int      4
int main()
{
    int a = 10;
    float b = 20.5;
    char c = 'A';
    int d[10];
    return 0;
}
To calculate the space complexity of this program, we need to determine the
amount of memory used by each of the variables. In this case:
a is an integer, which takes up 4 bytes of memory.
b is a float, which takes up 4 bytes of memory.
c is a character, which takes up 1 byte of memory.
d is an array of 10 integers, which takes up 40 bytes of memory (10 x 4).
So, the total amount of memory used by these variables is 4 + 4 + 1 + 40 = 49
bytes. This total does not depend on any input size, so the space complexity
is constant: O(1).
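The same sum can be obtained with the sizeof operator instead of a table of typical sizes. The sketch below is illustrative only: example_variable_bytes is a hypothetical function name, and the result ignores any padding the compiler may insert between variables.

```c
#include <stddef.h>

/* Hypothetical example: adds up the sizes of the four variables used in
   the program above. Padding added by the compiler is not counted. */
size_t example_variable_bytes(void)
{
    int a = 10;
    float b = 20.5f;
    char c = 'A';
    int d[10] = {0};

    /* sizeof on an array yields the size of the whole array. */
    return sizeof a + sizeof b + sizeof c + sizeof d;
}
```

On a platform where int and float are 4 bytes each, this returns 49, matching the hand calculation; on other platforms the value follows that platform's type sizes.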
int factorial(int n)
{
    if (n == 0)
        return 1;
    else
        return n * factorial(n - 1);
}
To calculate the space complexity of this algorithm, we need to determine the
amount of memory used by the variables and by the function calls. In this case:
n is an integer input parameter, which takes up 4 bytes of memory.
Each call to factorial also takes up some memory on the function call stack;
the exact amount is implementation-dependent.
Because factorial is recursive, it makes multiple nested calls, and each one
keeps its own frame on the call stack until it returns. The memory used is
proportional to the number of nested calls, which is directly proportional to
the value of n (n + 1 calls in total). The space complexity is therefore O(n).
When n is very large, this algorithm can use a significant amount of memory on
the function call stack and may even overflow it.
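For contrast, the same result can be computed with a loop, which keeps only a fixed number of variables regardless of n. This is a sketch under assumed names (factorial_iterative is not from the original notes), and it uses unsigned long long, which can hold factorials only up to 20!.

```c
/* Hypothetical example: iterative factorial using O(1) extra space.
   The result overflows unsigned long long for n > 20. */
unsigned long long factorial_iterative(unsigned int n)
{
    unsigned long long result = 1;
    for (unsigned int i = 2; i <= n; i++)
        result *= i;
    return result;
}
```

Here no call stack grows with n: the loop reuses the same two variables, so the space complexity is O(1) while the time complexity stays O(n).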
Time Complexity vs. Space Complexity
Parameter               | Time Complexity                    | Space Complexity
Definition              | The time taken by an algorithm to  | The space taken by an algorithm to
                        | solve a particular problem         | solve a particular problem
Unit of Measurement     | Measured in terms of steps         | Measured in terms of storage units
                        | processed                          | acquired
Denotation              | Big-O, Big-Omega (Ω), Big-Theta    | Big-O, Big-Omega (Ω), Big-Theta
                        | (Θ) notations                      | (Θ) notations
Preference              | Lower time complexity is preferred | Lower space complexity is preferred
                        | over higher time complexity        | over higher space complexity
Dependency              | Size of the input                  | Size of the variables
Example (Linear Search) | Worst case: O(n); best case: O(1)  | O(1)
Asymptotic Analysis
The efficiency of an algorithm depends on the amount of time, storage and other
resources required to execute the algorithm. The efficiency is measured with the
help of asymptotic notations.
An algorithm may not have the same performance for different types of inputs.
With the increase in the input size, the performance will change.
The study of change in performance of the algorithm with the change in the
order of the input size is defined as asymptotic analysis.
Asymptotic Notations
Asymptotic notations are the mathematical notations used to describe the
running time of an algorithm when the input tends towards a particular value or
a limiting value.
For example: In bubble sort (with an early exit when a pass makes no swaps),
when the input array is already sorted, the time taken by the algorithm is
linear, i.e. the best case.
But when the input array is in reverse order, the algorithm takes the
maximum time (quadratic) to sort the elements, i.e. the worst case.
When the input array is neither sorted nor in reverse order, it takes average
time. These durations are denoted using asymptotic notations.
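The best and worst cases of bubble sort can be observed by counting comparisons. The sketch below assumes the early-exit variant of bubble sort; the function name bubble_sort and its return value (the comparison count) are choices made for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical example: sorts values in ascending order and returns the
   number of comparisons performed. Stops early if a pass makes no swap. */
size_t bubble_sort(int *values, size_t n)
{
    size_t comparisons = 0;

    if (n < 2)
        return 0;

    for (size_t pass = 0; pass < n - 1; pass++) {
        bool swapped = false;
        for (size_t j = 0; j < n - 1 - pass; j++) {
            comparisons++;
            if (values[j] > values[j + 1]) {
                int tmp = values[j];
                values[j] = values[j + 1];
                values[j + 1] = tmp;
                swapped = true;
            }
        }
        if (!swapped)   /* already sorted: no further passes needed */
            break;
    }
    return comparisons;
}
```

For five elements, an already sorted array needs a single pass of 4 comparisons (linear), while a reversed array needs 4 + 3 + 2 + 1 = 10 comparisons (quadratic growth).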
There are mainly three asymptotic notations:
Big-O notation
Omega notation
Theta notation
Big oh Notation (O)
o Big O notation is an asymptotic notation that describes the performance
of an algorithm by giving the order of growth of its running-time
function.
o This notation provides an upper bound on a function, which ensures that
the function never grows faster than that bound. Any valid upper bound
qualifies, but in practice we quote the tightest one we can show.
It is the formal way to express the upper bound of an algorithm's running time.
It is most often used to describe the worst-case time complexity, i.e. the
longest time the algorithm can take to complete its operation. It is
represented as shown below:
If f(n) and g(n) are two functions defined for positive integers,
then f(n) = O(g(n)) (read as "f(n) is big oh of g(n)" or "f(n) is on the order
of g(n)") if there exist constants c > 0 and n0 such that:
f(n) ≤ c.g(n) for all n ≥ n0
This implies that f(n) does not grow faster than g(n), or that g(n) is an upper
bound on the function f(n). In this case, we are calculating the growth rate of
the function, which eventually gives the worst-case time complexity, i.e. how
badly an algorithm can perform.
Let's understand through examples
Example 1: f(n)=2n+3, g(n)=n
Now, we have to determine whether f(n)=O(g(n)).
To check f(n)=O(g(n)), it must satisfy the given condition:
f(n)<=c.g(n)
First, we will replace f(n) by 2n+3 and g(n) by n.
2n+3 <= c.n
Let's assume c=5, n=1 then
2*1+3<=5*1
5<=5
For n=1, the above condition is true.
If n=2
2*2+3<=5*2
7<=10
For n=2, the above condition is true.
In fact, the condition holds for every n starting from 1: since 3<=3n when
n>=1, we get 2n+3<=2n+3n=5n. So with c=5 and n0=1, the condition 2n+3<=c.n is
satisfied for all n>=n0. Because such constants c and n0 exist, f(n) is big oh
of g(n), or we can say that f(n) grows at most linearly. Therefore, c.g(n) is
an upper bound of f(n). It can be represented graphically as:
The idea of using big O notation is to give an upper bound on a particular
function, which in turn gives a worst-case time complexity. It provides an
assurance that the function will not suddenly grow in a quadratic or cubic
fashion; here, it grows at most linearly even in the worst case.
Omega Notation (Ω)
o It is commonly used to describe the best-case scenario, the opposite of
how big O notation is typically used.
o It is the formal way to represent the lower bound of an algorithm's
running time. It measures the least amount of time an algorithm can
possibly take to complete, or the best-case time complexity.
o It determines the fastest time in which an algorithm can run.
If we want to state that an algorithm takes at least a certain amount of time,
without giving an upper bound, we use big-Ω notation, i.e. the Greek letter
"omega". It is used to bound the growth of the running time from below for
large input sizes.
If f(n) and g(n) are two functions defined for positive integers,
then f(n) = Ω(g(n)) (read as "f(n) is omega of g(n)") if there exist
constants c > 0 and n0 such that:
f(n) ≥ c.g(n) for all n ≥ n0
Let's consider a simple example.
If f(n) = 2n+3, g(n) = n,
Is f(n) = Ω(g(n))?
It must satisfy the condition:
f(n)>=c.g(n)
To check the above condition, we first replace f(n) by 2n+3 and g(n) by n.
2n+3>=c*n
Suppose c=1
2n+3>=n (This inequality is true for any value of n starting from 1).
Therefore, it is proved that f(n) = 2n+3 is big omega of g(n) = n.
As we can see in the above figure, the function g(n) is a lower bound of
f(n) when the value of c is equal to 1. Therefore, this notation gives the
fastest running time. However, we are usually less interested in the fastest
running time than in the worst-case scenario: we want to know the longest time
the algorithm can take on large inputs so that we can make further design
decisions accordingly.
Theta Notation (θ)
o Theta notation is often loosely associated with the average-case
scenario.
o It represents the realistic time complexity of an algorithm. An
algorithm does not always perform at its worst or at its best; in
real-world problems, its running time mostly falls between the
worst case and the best case.
o Formally, big theta gives a tight bound: it applies when the same
function bounds the running time from above and from below, for example
when the worst case and the best case have the same order of growth.
o It is the formal way to express both the upper bound and the lower bound
of an algorithm's running time.
Let's understand the big theta notation mathematically:
Let f(n) and g(n) be functions of n, where n is the input size and f(n) is the
number of steps required to execute the program. Then:
f(n) = θ(g(n))
The above condition is satisfied if and only if there exist constants c1 > 0,
c2 > 0 and n0 such that, for all n >= n0:
c1.g(n)<=f(n)<=c2.g(n)
Here the function is bounded by two limits, i.e. an upper and a lower limit,
and f(n) lies in between: c1.g(n) is less than or equal to f(n), and c2.g(n) is
greater than or equal to f(n). The graphical representation of theta notation
is given below:
Let's consider the same example where
f(n)=2n+3
g(n)=n
Since c1.g(n) must be less than or equal to f(n), we can choose c1 = 1; since
c2.g(n) must be greater than or equal to f(n), we can choose c2 = 5. Then
c1.g(n) is the lower limit of f(n), while c2.g(n) is the upper limit of f(n).
c1.g(n)<=f(n)<=c2.g(n)
Replace g(n) by n and f(n) by 2n+3
c1.n <=2n+3<=c2.n
if c1=1, c2=5, n=1
1*1 <=2*1+3 <=5*1
1 <= 5 <= 5 // for n=1, it satisfies the condition c1.g(n)<=f(n)<=c2.g(n)
If n=2
1*2<=2*2+3<=5*2
2<=7<=10 // for n=2, it satisfies the condition c1.g(n)<=f(n)<=c2.g(n)
Therefore, we can say that for any value of n starting from 1, it satisfies the
condition c1.g(n)<=f(n)<=c2.g(n). Hence, it is proved that f(n) is big theta of
g(n). So g(n) = n is a tight bound on f(n), which gives a realistic picture of
its time complexity: f(n) grows linearly.
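The two spot checks above (n = 1 and n = 2) can be extended to a whole range of n with a small program. This is only a numeric sanity check over a finite range, not a proof; the function theta_bounds_hold and its parameters are hypothetical names introduced for this sketch.

```c
#include <stdbool.h>

/* Hypothetical helper for f(n) = 2n + 3 and g(n) = n: returns true if
   c1*g(n) <= f(n) <= c2*g(n) holds for every n with n0 <= n <= limit. */
bool theta_bounds_hold(long c1, long c2, long n0, long limit)
{
    for (long n = n0; n <= limit; n++) {
        long f = 2 * n + 3;
        long g = n;
        if (c1 * g > f || f > c2 * g)
            return false;
    }
    return true;
}
```

With c1 = 1, c2 = 5 and n0 = 1 the check passes for every n tested. With c2 = 2 it fails immediately, since 2n + 3 is never at most 2n; with c2 = 3 it passes only if n0 is at least 3, which shows how the choice of constants and n0 interact.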
Space-Time Trade-off
A trade-off is a situation where one thing increases as another decreases.
A space-time trade-off is a way to solve a problem:
Either in less time by using more space, or
In very little space by spending a longer amount of time.
The best algorithm is one that requires less space in memory and also takes
less time to generate the output. In general, however, it is not always
possible to achieve both of these goals at the same time. The most common
example is an algorithm that uses a lookup table, in which the answers for
every possible input value are written down in advance. One way of solving
such a problem is to store the entire lookup table, which lets you find
answers very quickly but uses a lot of space. Another way is to calculate the
answers without storing anything, which uses very little space but might take
a long time. Therefore, making an algorithm more time-efficient often makes it
less space-efficient, and vice versa.
Types of Space-Time Trade-off
Compressed or Uncompressed data
Re-rendering or Stored images
Smaller code or loop unrolling
Lookup tables or Recalculation
Compressed or Uncompressed data: A space-time trade-off can be
applied to the problem of data storage. If data is stored uncompressed,
it takes more space but less time to access. If the data is stored
compressed, it takes less space but more time, because the decompression
algorithm has to run. There are also instances where it is possible to
work directly with compressed data. One such case is compressed bitmap
indices, where it is faster to work with compression than without it.
Re-rendering or Stored images: Storing only the source and rendering it
as an image each time takes less space but more time. Storing the
rendered image in a cache is faster than re-rendering but requires more
space in memory.
Smaller code or Loop Unrolling: Smaller code occupies less space in
memory, but it needs extra computation time for jumping back to the
beginning of the loop at the end of each iteration. Loop unrolling can
optimize execution speed at the cost of increased binary size: it
occupies more space in memory but requires less computation time.
Lookup tables or Recalculation: An implementation can include the
entire lookup table, which reduces computing time but increases the
amount of memory needed. Alternatively, it can recalculate, i.e. compute
table entries as needed, which increases computing time but reduces
memory requirements.
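For example: In mathematical terms, the sequence F_n of the Fibonacci
numbers is defined by the recurrence relation:
F_n = F_{n-1} + F_{n-2},
where F_0 = 0 and F_1 = 1.
A program can recompute F_n recursively on every request, which needs almost
no extra memory but takes exponential time, or it can store the computed values
in a table, which needs O(n) memory but makes each later lookup immediate.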
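The two sides of this trade-off can be sketched for the Fibonacci recurrence as follows. All names here (FIB_TABLE_SIZE, fib_recalculate, fib_from_table) are assumptions made for this illustration; the table is limited to F_0 through F_90, which fit in a 64-bit unsigned integer.

```c
#define FIB_TABLE_SIZE 91   /* hypothetical limit: entries F_0 .. F_90 */

/* Recalculation: no stored table, but exponential running time. */
unsigned long long fib_recalculate(int n)
{
    if (n < 2)
        return (unsigned long long)n;
    return fib_recalculate(n - 1) + fib_recalculate(n - 2);
}

/* Lookup table: extra memory for FIB_TABLE_SIZE entries, filled once. */
static unsigned long long fib_lookup[FIB_TABLE_SIZE];
static int fib_lookup_ready = 0;

/* Returns F_n from the table. Requires 0 <= n < FIB_TABLE_SIZE. */
unsigned long long fib_from_table(int n)
{
    if (!fib_lookup_ready) {
        fib_lookup[0] = 0;
        fib_lookup[1] = 1;
        for (int i = 2; i < FIB_TABLE_SIZE; i++)
            fib_lookup[i] = fib_lookup[i - 1] + fib_lookup[i - 2];
        fib_lookup_ready = 1;
    }
    return fib_lookup[n];
}
```

Both functions return 55 for n = 10. For n = 50, fib_from_table answers with a single array access after the table is built, whereas fib_recalculate would have to make billions of recursive calls.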
Programming styles
A programming style is a set of guidelines used to format programming
instructions. It is useful to follow a style as it makes it easier for programmers to
understand the code, maintain it, and assists in reducing the likelihood of
introducing errors.
Naming conventions
When naming variables, functions/methods, classes, files etc. it is important to
follow a naming convention, and use correct English spelling (this assists with
search/find/replace operations). Naming conventions are used to improve visual
appearance and reduce the effort needed to read and understand the code. They
can vary in different programming languages. The following items are a good
guide:
1. All naming should be descriptive where appropriate.
2. Avoid abbreviations where possible, but keep names a sensible balance
between easy to understand and not too long.
3. Spaces must not be used. Some languages use dashes in names
(e.g. total-height in Lisp), while other languages use an underscore
(e.g. total_height in Python, SQL). Java uses mixed case for variables,
starting with lowercase, e.g. totalHeight.
4. Constants are usually defined in all uppercase, using underscores to
separate words, e.g. MAX_HEIGHT.
Commenting
Comments improve program readability. As with naming conventions, it is
important to use correct English spelling. Please note that different
programming languages allow or disallow different commenting styles.
1. Be consistent with your use of commenting syntax, for example:
    // This is a line comment
    /* This is a block comment
       over two lines */
    -- This is a comment
2. Start each file with a comment at the top describing its contents.
3. Start each class with an accompanying comment that describes what it is
for and how it should be used.
4. Start each function with a comment describing use of the function.
5. Variable names should be descriptive enough not to require commenting.
If this is difficult, provide a brief comment at declaration.
6. Only comment tricky, non-obvious, interesting or important parts of your
code throughout implementation.
7. Follow normal grammatical conventions to ensure readability of
comments.
Indenting and Whitespaces
1. Indenting and Whitespace requirements for particular languages
supersede any general guidelines (such as Python requirements)
2. Blank lines should be used to separate logical groupings of code
3. Minimum of 2 blank lines should be used to separate functions
4. Be consistent with the choice of tabs or spaces for indenting. Adjust tab
sizing so that it is not too large.
5. Aim for the best readability. What matters most is that the code is easy
to read and follow logically, for example:
    if (hours < 24 && minutes < 60 && seconds < 60) {
        return true;
    } else {
        return false;
    }
General formatting
1. Keep line lengths to 80 characters or less where appropriate.
2. Adopt the same formatting conventions as previously written code if
adding to or modifying an existing code base.
3. BE CONSISTENT!
Abstraction
Abstraction means displaying only essential information and hiding the details.
Data abstraction refers to providing only essential information about the data
to the outside world, hiding the background details or implementation.
Consider a real-life example of a man driving a car. The man only knows that
pressing the accelerator will increase the speed of the car and that applying
the brakes will stop it. He does not know how pressing the accelerator
actually increases the speed, nor does he know the inner mechanism of the car
or the implementation of the accelerator, brakes, etc. This is what
abstraction is.
Abstraction using Classes
We can implement Abstraction in C++ using classes. The class helps us to
group data members and member functions using available access specifiers.
A Class can decide which data member will be visible to the outside world and
which is not.
Abstraction in Header files
One more type of abstraction in C++ can be header files. For example,
consider the pow() function declared in the math.h header file. Whenever we
need to calculate the power of a number, we simply call pow() and pass the
numbers as arguments, without knowing the underlying algorithm by which the
function actually calculates the power.
Abstraction using Access Specifiers
Access specifiers are the main pillar of implementing abstraction in C++. We
can use access specifiers to enforce restrictions on class members. For
example:
Members declared as public in a class can be accessed from anywhere in
the program.
Members declared as private in a class, can be accessed only from within
the class. They are not allowed to be accessed from any part of the code
outside the class.
// C++ Program to Demonstrate the
// working of Abstraction
#include <iostream>
using namespace std;

class implementAbstraction {
private:
    int a, b;

public:
    // method to set values of
    // private members
    void set(int x, int y)
    {
        a = x;
        b = y;
    }

    void display()
    {
        cout << "a = " << a << endl;
        cout << "b = " << b << endl;
    }
};

int main()
{
    implementAbstraction obj;
    obj.set(10, 20);
    obj.display();
    return 0;
}
Testing:
Testing data structures and algorithms is an essential skill for any programmer
who wants to write efficient, robust, and scalable code. Data structures and
algorithms are the building blocks of many applications and systems, and they
affect the performance, memory usage, and complexity of your code. Testing
them helps you verify their correctness, identify bugs, optimize their design, and
compare different solutions. This section outlines some of the best ways to
test data structures and algorithms in your preferred programming language,
with tools named for Python, Java, and C++.
1. Choose appropriate test cases
The first step to test data structures and algorithms is to choose appropriate test
cases that cover different scenarios and edge cases. Test cases are inputs and
outputs that you use to check if your code works as expected. You should select
test cases that are representative, diverse, and challenging, and that cover both
normal and abnormal situations. For example, if you are testing a sorting
algorithm, you should include test cases with empty, sorted, reversed, random,
duplicate, and large arrays. You should also consider the limits and constraints
of your data structures and algorithms, such as the range of values, the size of
inputs, the time and space complexity, and the error handling.
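The kinds of test cases listed above for a sorting algorithm can be sketched as follows. This is an illustration under stated assumptions: sort_ints stands in for the function under test and simply wraps the standard qsort, while is_sorted and run_sort_test_cases are hypothetical helpers. The check only confirms ascending order; a fuller test would also confirm that the output is a permutation of the input.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for qsort: ascending order of int values. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Stand-in for the function under test (assumption: wraps qsort). */
void sort_ints(int *values, size_t n)
{
    if (n > 1)
        qsort(values, n, sizeof values[0], compare_ints);
}

/* Returns true if values[0..n-1] is in ascending order. */
bool is_sorted(const int *values, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (values[i - 1] > values[i])
            return false;
    return true;
}

/* Runs empty, single, sorted, reversed, and duplicate cases.
   Returns the number of failing cases. */
int run_sort_test_cases(void)
{
    int failures = 0;
    int single[] = {7};
    int sorted[] = {1, 2, 3, 4};
    int reversed[] = {4, 3, 2, 1};
    int duplicates[] = {2, 2, 1, 1, 2};

    sort_ints(NULL, 0);                 /* empty input must not crash */
    failures += !is_sorted(NULL, 0);

    sort_ints(single, 1);
    failures += !is_sorted(single, 1);

    sort_ints(sorted, 4);
    failures += !is_sorted(sorted, 4);

    sort_ints(reversed, 4);
    failures += !is_sorted(reversed, 4);

    sort_ints(duplicates, 5);
    failures += !is_sorted(duplicates, 5);

    return failures;
}
```

A return value of 0 from run_sort_test_cases means every case passed; random and large arrays could be added in the same pattern.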
2. Use built-in or external testing tools
The second step to test data structures and algorithms is to use built-in or
external testing tools that automate the testing process and provide feedback and
reports. Testing tools are software applications or libraries that help you write,
run, and evaluate test cases. Depending on your programming language and
preferences, you can use different testing tools, such as unittest, pytest, or
doctest for Python, JUnit, TestNG, or Hamcrest for Java, or Google Test,
Catch2, or Boost.Test for C++. Testing tools usually offer features such as
assertions, test suites, test runners, test fixtures, test discovery, test coverage,
and test reporting.
3. Write clear and concise code
The third step to test data structures and algorithms is to write clear and concise
code that follows the best practices and conventions of your programming
language. Clear and concise code is easier to read, understand, debug, and
maintain, and it reduces the chances of errors and bugs. You should write code
that is consistent, modular, readable, documented, and formatted. You should
also use meaningful names, avoid hard-coded values, follow the principle of
least astonishment, and adhere to the coding standards and style guides of your
programming language.
4. Apply debugging techniques
The fourth step to test data structures and algorithms is to apply debugging
techniques that help you find and fix errors and bugs in your code. Debugging
techniques are methods or tools that help you locate, inspect, and modify the
state and behavior of your code. You can use different debugging techniques,
such as print statements, breakpoints, watch expressions, step-by-step
execution, stack traces, error messages, logging, or debugging tools. Debugging
tools are software applications or features that help you debug your code in a
graphical or interactive way, such as PyCharm, Eclipse, or Visual Studio Code.