Data Structures in C#
SIVASANKAR GORANTLA
Asymptotic notation
 Before writing a program, we write a blueprint for it, which is called an algorithm.
 A problem can have many candidate algorithms: A1, A2, A3, and so on.
 We analyze each algorithm in terms of time and space complexity and select the best one on that basis.
 Asymptotic notation is the standard terminology for expressing these complexities simply.
Types:
 Big O notation (O notation) – denotes an upper bound on the growth of the algorithm's cost. It is most often quoted for the worst case, which is usually the case we are interested in.
 Omega notation (Ω notation) – denotes a lower bound; it is often quoted for the best case.
 Theta notation (Θ notation) – denotes a tight bound (both upper and lower); it is often used when stating the average case.
 Ex: searching the array 5, 4, 2, 6, 8, 9 for a value takes Ω(1) in the best case and O(n) in the worst case; the average is about n/2 comparisons, and Θ(n/2) = Θ(n).
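The array example above can be made concrete by counting comparisons in a linear search. This is a minimal sketch; the class and method names (LinearSearchDemo, CountComparisons) are invented for illustration and are not part of the slides.

```csharp
using System;

public static class LinearSearchDemo
{
    // Returns how many elements were compared before the target was found,
    // or values.Length if the target is not present.
    public static int CountComparisons(int[] values, int target)
    {
        int comparisons = 0;
        foreach (int value in values)
        {
            comparisons++;
            if (value == target)
            {
                return comparisons;
            }
        }
        return comparisons;
    }

    public static void Main()
    {
        int[] data = { 5, 4, 2, 6, 8, 9 };
        Console.WriteLine(CountComparisons(data, 5)); // best case: 1 comparison
        Console.WriteLine(CountComparisons(data, 9)); // worst case: 6 comparisons (n)
        Console.WriteLine(CountComparisons(data, 7)); // absent: also 6 comparisons
    }
}
```

The best case does not depend on n, while the worst case grows linearly with the array length.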
Mostly used Asymptotic notations
constant − Ο(1)
logarithmic − Ο(log n)
linear − Ο(n)
n log n − Ο(n log n)
quadratic − Ο(n^2)
cubic − Ο(n^3)
polynomial − n^Ο(1)
exponential − 2^Ο(n)
What is an ADT?
 To manage the complexity of problems and the problem-solving process,
computer scientists use abstractions to allow them to focus on the “big
picture” without getting lost in the details.
 An abstract data type, sometimes abbreviated ADT, is a logical description of
how we view the data and the operations that are allowed without regard to
how they will be implemented.
 Example : List, Map
 One ADT can have several implementations
Example of ADT
 Let's consider the interface System.Collections.IList.
 The basic operations it defines are:
 int Add(object) – adds an element at the end of the list
 void Insert(int, object) – inserts an element at a given position in the list
 void Clear() – removes all elements from the list
 bool Contains(object) – checks whether the list contains the element
 void Remove(object) – removes the first occurrence of the element from the list
 void RemoveAt(int) – removes the element at a given position
 int IndexOf(object) – returns the position of the element
 this[int] – indexer, gives access to the element at a given position
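To illustrate that one ADT can have several implementations, the sketch below writes a method against IList only and then runs it on two different classes that implement it. The names AdtDemo and FillAndFind are hypothetical, chosen just for this example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class AdtDemo
{
    // Works against the IList abstraction only; it does not know or care
    // which concrete class stores the elements.
    public static int FillAndFind(IList list)
    {
        list.Clear();
        list.Add(10);        // append at the end
        list.Add(30);
        list.Insert(1, 20);  // list is now 10, 20, 30
        list.Remove(10);     // list is now 20, 30
        return list.IndexOf(30);
    }

    public static void Main()
    {
        Console.WriteLine(FillAndFind(new ArrayList()));  // 1
        Console.WriteLine(FillAndFind(new List<int>()));  // 1
    }
}
```

Both calls give the same answer because the caller depends only on the operations the interface promises, not on how they are implemented.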
What is a data structure and why do we need it?
 A data structure is a systematic way of organizing data so that it can be used efficiently.
 Choosing the right data structure makes a program much more efficient: we can save memory and execution time, and sometimes even reduce the amount of code we write.
Need:
 As applications grow more complex, the amount of data they handle grows too. This leads to three common problems:
 Data search
 Processing speed
 Multiple requests
Basic data structures in programming.
 Linear – these include arrays (Array), lists (ArrayList, List<T>), stacks (Stack<T>), queues (Queue<T>) and linked lists (LinkedList<T>)
 Non-linear:
 Dictionaries – key-value pairs organized in hash tables (Hashtable and Dictionary<TKey, TValue>)
 Tree-like – tree, binary tree, AVL tree, spanning tree and heap
 Sets – unordered collections of unique elements
 Others – multi-sets, bags, multi-bags, priority queues, graphs…
Motivation behind inventing the array
 Suppose you need to store 100 values in memory. How could you store that many values without using arrays?
 What is the basic thing required to store a value in memory in a high-level language?
A variable, which names a memory location.
 To store 100 values, would we need to create 100 variables in the program?
 100 variables might be manageable, but what if you need to store and access 10,000 elements?
Array
 Arrays are one of the simplest and most commonly used data structures in computer programming.
 All the elements of an array must be of the same type. Hence arrays are homogeneous. (Why?)
 The contents of an array are stored in a contiguous memory block. (Why?)
 All the elements can be accessed directly by index. (How?)
 Let's take an example to understand how an array is stored on the heap.
 Ex:
bool[] booleanArray;
FileInfo[] files;
booleanArray = new bool[10];
files = new FileInfo[10];
Array memory representation
Two dimensional arrays
 For example, a two-dimensional array with m × n values is stored in memory as one contiguous block, row after row (row-major order).
 3D array: the same layout extends to three dimensions.
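The layout of a rectangular array can be observed without looking at raw memory: enumerating a C# multidimensional array visits the rightmost index first, which is the row-after-row order. The sketch below flattens a 2 × 3 array that way and checks it against the usual offset formula row * columns + col. RowMajorDemo, Flatten and Offset are names made up for this example.

```csharp
using System;

public static class RowMajorDemo
{
    // Flattens a rectangular array in enumeration order, which for C# arrays
    // advances the rightmost index first (row after row).
    public static int[] Flatten(int[,] grid)
    {
        int[] flat = new int[grid.Length];
        int i = 0;
        foreach (int value in grid)
        {
            flat[i++] = value;
        }
        return flat;
    }

    // Position of element (row, col) in the flattened sequence.
    public static int Offset(int row, int col, int columns)
    {
        return row * columns + col;
    }

    public static void Main()
    {
        int[,] grid = { { 1, 2, 3 }, { 4, 5, 6 } };  // m = 2 rows, n = 3 columns
        int[] flat = Flatten(grid);
        Console.WriteLine(string.Join(", ", flat));              // 1, 2, 3, 4, 5, 6
        Console.WriteLine(flat[Offset(1, 2, 3)] == grid[1, 2]);  // True
    }
}
```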
Basic operation on Array
 Read an element by index: O(1)
Ex: int valueAtIndexTwo = array[2];
 Write an element at a given index: O(1)
Ex: array[10] = 12;
 Search for an element by value: O(n)
 Search for an element by value using binary search: O(log n), only when the array is sorted
 http://eli.thegreenplace.net/2015/memory-layout-of-multi-dimensional-arrays/
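The four operations above can be tried with the built-in Array helpers. This is a small sketch; the wrapper names (ArrayOperationsDemo, LinearIndexOf, SortedIndexOf) are illustrative, while Array.IndexOf, Array.Sort and Array.BinarySearch are the standard .NET methods.

```csharp
using System;

public static class ArrayOperationsDemo
{
    // Linear search: scans from the start, O(n).
    public static int LinearIndexOf(int[] values, int target)
    {
        return Array.IndexOf(values, target);
    }

    // Binary search: O(log n), but the array must already be sorted.
    // Returns a negative number when the target is absent.
    public static int SortedIndexOf(int[] sortedValues, int target)
    {
        return Array.BinarySearch(sortedValues, target);
    }

    public static void Main()
    {
        int[] array = { 9, 2, 6, 5, 8, 4 };

        int valueAtIndexTwo = array[2];   // read by index, O(1)
        array[5] = 12;                    // write by index, O(1)
        Console.WriteLine(valueAtIndexTwo);           // 6

        Console.WriteLine(LinearIndexOf(array, 8));   // 4

        Array.Sort(array);                            // 2, 5, 6, 8, 9, 12
        Console.WriteLine(SortedIndexOf(array, 8));   // 3
        Console.WriteLine(SortedIndexOf(array, 7) < 0); // True: not found
    }
}
```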
Array analysis
 Ordering – Guaranteed
 Contiguous – Yes
 Direct access – Yes, via index: O(1)
 Lookup efficiency – O(1) by index
 ArrayList has O(n) time complexity for add/remove at arbitrary indices, but O(1) for the operation at the end of the list.
 The running time of an array access is denoted O(1) because it is constant. That is, regardless of how many elements are stored in the array, it takes the same amount of time to look up an element.
 This constant running time is possible because an array's elements are stored contiguously, so a lookup only requires the array's starting location in memory, the size of each array element, and the index of the element.
 The .NET Framework automatically checks on each element access whether the index is valid or out of the range of the array; an invalid index raises IndexOutOfRangeException.
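The constant-time lookup comes down to one multiplication and one addition. The sketch below shows that arithmetic on made-up numbers (it does not read real memory addresses) and also shows the bounds check in action. ArrayAccessDemo, ElementAddress and TryRead are hypothetical names.

```csharp
using System;

public static class ArrayAccessDemo
{
    // Illustrative arithmetic only: address = start + index * elementSize.
    // The cost is the same whatever the array length is.
    public static long ElementAddress(long startAddress, int elementSize, int index)
    {
        return startAddress + (long)index * elementSize;
    }

    // Reads array[index], reporting failure instead of letting the
    // runtime's IndexOutOfRangeException escape.
    public static bool TryRead(int[] array, int index, out int value)
    {
        try
        {
            value = array[index];
            return true;
        }
        catch (IndexOutOfRangeException)
        {
            value = 0;
            return false;
        }
    }

    public static void Main()
    {
        // Assumed start address 1000 and 4-byte elements.
        Console.WriteLine(ElementAddress(1000, 4, 3));        // 1012

        int[] numbers = { 10, 20, 30 };
        Console.WriteLine(TryRead(numbers, 1, out int ok));   // True
        Console.WriteLine(TryRead(numbers, 5, out int bad));  // False
    }
}
```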
Limitations of Array
 The size of an array is fixed when it is created.
 An array can store only items of the same type.
Array List
 The ArrayList maintains an internal object array and automatically resizes it as the number of elements added to the ArrayList grows.
 Because the ArrayList uses an object array, developers can add any type: strings, integers, FileInfo objects, Form instances, anything.
 Therefore, even if an ArrayList stores nothing but value types, each ArrayList element is a reference to a boxed value type.
 The boxing and unboxing, along with the extra level of indirection, that come with using value types in an ArrayList can hamper the performance of your application when using large ArrayLists with many reads and writes.
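The difference between boxed and unboxed storage shows up in the code you have to write. In this sketch (BoxingDemo, SumArrayList and SumList are names invented here), the ArrayList version needs an explicit cast to unbox each element, while the List<int> version works on plain ints.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingDemo
{
    // Every int was boxed when added; the cast unboxes it again.
    public static int SumArrayList(ArrayList list)
    {
        int sum = 0;
        foreach (object item in list)
        {
            sum += (int)item;  // throws InvalidCastException if item is not an int
        }
        return sum;
    }

    // The generic list stores ints directly: no boxing, no casts.
    public static int SumList(List<int> list)
    {
        int sum = 0;
        foreach (int item in list)
        {
            sum += item;
        }
        return sum;
    }

    public static void Main()
    {
        ArrayList boxed = new ArrayList { 1, 2, 3 };
        List<int> typed = new List<int> { 1, 2, 3 };
        Console.WriteLine(SumArrayList(boxed));  // 6
        Console.WriteLine(SumList(typed));       // 6
    }
}
```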
ArrayList memory representation
Basic operation on ArrayList
 Add(object) – adding a new element
 Insert(int, object) – adding a new element at a specified position
(index)
 Count – returns the count of elements in the list
 Remove(object) – removes a specified element
 RemoveAt(int) – removes the element at a specified position
 Clear() – removes all elements from the list
 this[int] – an indexer, allows accessing the elements by a given
position (index)
 ArrayList.Insert():
if (_size == _items.Length)
{
    EnsureCapacity(_size + 1);
}
if (index < _size)
{
    Array.Copy(_items, index, _items, index + 1, _size - index);
}
_items[index] = value;
_size++;
 Array.Copy copies a range of elements from a System.Array starting at the specified source index and pastes them to another System.Array starting at the specified destination index. The length and the indexes are specified as 32-bit integers.
 ArrayList.RemoveAt():
_size--;
if (index < _size)
{
    Array.Copy(_items, index + 1, _items, index, _size - index);
}
 Array.Copy in turn calls:
Copy(sourceArray, sourceIndex, destinationArray, destinationIndex, length, false);
 Copies a range of elements from a System.Array starting at the specified source index and pastes them to another System.Array starting at the specified destination index. The length and the indexes are specified as 32-bit integers.
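The two excerpts above can be put together into a small self-contained class to see the shifting at work. This is a simplified sketch, not the actual ArrayList source: SimpleArrayList is a hypothetical name, the initial capacity of 4 and the doubling growth policy are assumptions made for the example, and versioning is left out.

```csharp
using System;

// A simplified sketch of an array-backed list; not the real ArrayList.
public class SimpleArrayList
{
    private object[] _items = new object[4];  // assumed initial capacity
    private int _size;

    public int Count => _size;

    public object this[int index]
    {
        get
        {
            if (index < 0 || index >= _size) throw new ArgumentOutOfRangeException(nameof(index));
            return _items[index];
        }
    }

    public void Insert(int index, object value)
    {
        if (index < 0 || index > _size) throw new ArgumentOutOfRangeException(nameof(index));
        if (_size == _items.Length)
        {
            // Grow by doubling (an assumed policy for this sketch).
            object[] bigger = new object[_items.Length * 2];
            Array.Copy(_items, 0, bigger, 0, _size);
            _items = bigger;
        }
        if (index < _size)
        {
            // Shift the tail one slot to the right to open a gap at index.
            Array.Copy(_items, index, _items, index + 1, _size - index);
        }
        _items[index] = value;
        _size++;
    }

    public void RemoveAt(int index)
    {
        if (index < 0 || index >= _size) throw new ArgumentOutOfRangeException(nameof(index));
        _size--;
        if (index < _size)
        {
            // Shift the tail one slot to the left over the removed element.
            Array.Copy(_items, index + 1, _items, index, _size - index);
        }
        _items[_size] = null;  // release the reference in the vacated slot
    }
}

public static class SimpleArrayListDemo
{
    public static void Main()
    {
        var list = new SimpleArrayList();
        list.Insert(0, "b");
        list.Insert(0, "a");   // shifts "b" right: a, b
        list.Insert(2, "d");   // append: a, b, d
        list.Insert(2, "c");   // shifts "d" right: a, b, c, d
        list.Insert(4, "e");   // array is full, so it grows first
        list.RemoveAt(1);      // shifts c, d, e left: a, c, d, e
        Console.WriteLine(list.Count);  // 4
        Console.WriteLine(list[1]);     // c
    }
}
```

Inserting or removing near the front moves almost every element, which is where the O(n) cost for arbitrary indices comes from; at the end nothing has to move.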
Analysis of ArrayList
 Ordering – Guaranteed
 Contiguous –Yes
 Direct access –Yes via index O(1)
 Look up efficiency – O(1)
 ArrayList has O(n) time complexity for arbitrary indices of
add/remove, but O(1) for the operation at the end of the list
Limitations of ArrayList
 The main problem with ArrayList is that it uses object, which means you have to cast to and from whatever type you are storing.
 Implicit boxing happens whenever you use a value type: the value is boxed when put into the ArrayList and unboxed when read back.
 Since generics were introduced, ArrayList has been largely superseded by List<T> and is mainly needed in .NET 1.0/1.1 code.
List<T>
 The List<T> class was introduced in the .NET Framework 2.0 as part of the new set of generic collections.
 The List<T> class is the generic equivalent of ArrayList.
 It implements the IList<T> generic interface by using an array whose size is dynamically increased as required.
 It keeps its elements in memory as an array.
 It can be an extremely efficient data structure when you need to add elements quickly and access them by index. However, inserting and removing elements is slow unless they are at the last position.
 It represents a strongly typed list of objects that can be accessed by index, and provides methods to search, sort, and manipulate lists.
 Elements in this collection are accessed using a zero-based integer index.
Operations on List<T>
 The List<T> class uses an inner array for keeping the elements, and the array doubles its size when it gets overfilled. This implementation has the following good and bad sides:
 - Access by index is very fast: we can reach each element with equal speed, regardless of the count of elements.
 - Searching for an element by value takes as many comparisons as the count of elements (in the worst case), i.e. it is slow.
 - Inserting and removing elements is a slow operation: when we add or remove elements that are not at the end of the array, we have to shift the rest of the elements.
 - When adding a new element, sometimes we have to increase the capacity of the array, which is a slow operation, but it happens seldom, so the average (amortized) cost of adding at the end of a List<T> does not depend on the count of elements, i.e. it is very fast.
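How seldom the capacity increase happens can be measured through the public Capacity property. The sketch below counts capacity changes while appending; ListGrowthDemo and CountResizes are illustrative names, and the exact growth steps are an implementation detail of the runtime, so the code only reports the count rather than assuming particular sizes.

```csharp
using System;
using System.Collections.Generic;

public static class ListGrowthDemo
{
    // Counts how many times Capacity changes while appending `count` items.
    public static int CountResizes(int count)
    {
        var list = new List<int>();
        int resizes = 0;
        int lastCapacity = list.Capacity;
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                resizes++;
                lastCapacity = list.Capacity;
            }
        }
        return resizes;
    }

    public static void Main()
    {
        // With a doubling strategy the number of resizes grows roughly with
        // log2(count), so it stays tiny compared with the number of Add calls.
        Console.WriteLine(CountResizes(1000));
        Console.WriteLine(CountResizes(1000000));
    }
}
```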
Analysis of List
 Ordering – Guaranteed
 Contiguous –Yes
 Direct access –Yes via index O(1)
 Look up efficiency – O(1)
 Best for small list where direct access is required
Linked List
 A linked list is a sequence of nodes (links) connected together by references.
 Each link stores an item and a connection to another link. The linked list is the second most used data structure after the array.
 The following terms are important for understanding linked lists:
 Link − each link of a linked list stores a piece of data called an element.
 Next − each link of a linked list contains a reference to the next link, called Next.
 LinkedList − a LinkedList holds a reference to the first link, called First.
Advantages and disadvantages of LinkedList<T>
 The append operation is very fast, because the list always knows its last element (tail).
 Inserting a new element at an arbitrary position in the list is very fast (unlike List<T>) if we already have a reference to that position, e.g. if we insert at the list start or at the list end.
 Searching for elements by index or by value in a LinkedList is a slow operation, as we have to scan the elements one after another, beginning from the start of the list.
 Removing an element by value is a slow operation, because it includes searching.
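The points above map directly onto the LinkedList<T> API: operations that take a node are cheap, operations that take a value have to search first. A minimal sketch follows (LinkedListDemo and BuildList are names made up for this example).

```csharp
using System;
using System.Collections.Generic;

public static class LinkedListDemo
{
    public static string BuildList()
    {
        var list = new LinkedList<string>();
        list.AddLast("a");                 // append: the list knows its tail
        list.AddLast("c");

        LinkedListNode<string> first = list.First;
        list.AddAfter(first, "b");         // fast: we already hold the node
        list.AddFirst("start");            // fast: insert at the head

        LinkedListNode<string> found = list.Find("c");  // slow: scans from the start
        list.Remove(found);                // fast once the node is known

        return string.Join(", ", list);
    }

    public static void Main()
    {
        Console.WriteLine(BuildList());    // start, a, b
    }
}
```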
Analysis of LinkedList
 Ordering – The user has precise control over element ordering
 Contiguous – No
 Direct access – No
 Look up efficiency – O(n)
 Best for lists where inserting/deleting in middle is common and no
direct access required
Queue
 A queue is an abstract data type in which elements are inserted at one end, called the REAR (also called the tail), and removed from the other end, called the FRONT (also called the head).
 This makes the queue a FIFO (first in, first out) data structure: the element inserted first is also removed first.
 Adding an element to the queue is called Enqueue.
 Removing an element from the queue is called Dequeue.
 Reading the element at the head without removing it is called Peek.
The Queue – Basic Operations
 Queue<T> class provides the basic operations, specific for the data
structure queue. Here are some of the most frequently used:
 - Enqueue(T) – inserts an element at the end of the queue
 - Dequeue() – retrieves the element from the beginning of the
queue and removes it
 - Peek() – returns the element from the beginning of the queue
without removing it
 - Clear() – removes all elements from the queue
 - Contains(T) – checks if the queue contains the element
 - Count – returns the number of elements in the queue
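A short usage sketch of these operations on Queue<T> (QueueDemo and ServeInOrder are hypothetical names used only here):

```csharp
using System;
using System.Collections.Generic;

public static class QueueDemo
{
    // Enqueues every arrival, then dequeues until empty.
    // The result has the same order as the input: first in, first out.
    public static List<string> ServeInOrder(IEnumerable<string> arrivals)
    {
        var queue = new Queue<string>();
        foreach (string name in arrivals)
        {
            queue.Enqueue(name);
        }

        var served = new List<string>();
        while (queue.Count > 0)
        {
            served.Add(queue.Dequeue());
        }
        return served;
    }

    public static void Main()
    {
        var queue = new Queue<string>();
        queue.Enqueue("Ann");
        queue.Enqueue("Bob");
        Console.WriteLine(queue.Peek());           // Ann (still in the queue)
        Console.WriteLine(queue.Contains("Bob"));  // True
        Console.WriteLine(queue.Dequeue());        // Ann
        Console.WriteLine(queue.Count);            // 1

        Console.WriteLine(string.Join(", ", ServeInOrder(new[] { "x", "y", "z" })));  // x, y, z
    }
}
```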
.NET implementation of the Queue
 In C#, the queue is implemented using a circular buffer.
 Circular buffer: a circular buffer is a memory allocation scheme where memory is reused (reclaimed) when an index, incremented modulo the buffer size, writes over a previously used location.
 Internally it uses an array to implement the queue.
.NET implementation of the Queue
 The head and tail indices wrap around using modulo arithmetic:
Enqueue: _tail = (_tail + 1) % _array.Length;
Dequeue: _head = (_head + 1) % _array.Length;
 Is Empty:
Analysis on Queue
 Enqueue : O(1)
 Dequeue : O(1)
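The circular buffer idea described above can be sketched as a small fixed-capacity queue. This is an illustration, not the .NET source: RingQueue is a hypothetical class, it stores ints only, and it throws when full, whereas the real Queue<T> grows its array instead. It tracks a separate element count to tell "full" from "empty", since head and tail are equal in both cases.

```csharp
using System;

// A fixed-capacity sketch of a circular-buffer queue.
public class RingQueue
{
    private readonly int[] _array;
    private int _head;  // index of the next element to dequeue
    private int _tail;  // index where the next element will be written
    private int _size;  // number of stored elements

    public RingQueue(int capacity)
    {
        _array = new int[capacity];
    }

    public bool IsEmpty => _size == 0;
    public bool IsFull => _size == _array.Length;

    public void Enqueue(int item)
    {
        if (IsFull) throw new InvalidOperationException("Queue is full.");
        _array[_tail] = item;
        _tail = (_tail + 1) % _array.Length;  // wrap around at the end
        _size++;
    }

    public int Dequeue()
    {
        if (IsEmpty) throw new InvalidOperationException("Queue is empty.");
        int item = _array[_head];
        _head = (_head + 1) % _array.Length;  // wrap around at the end
        _size--;
        return item;
    }
}

public static class RingQueueDemo
{
    public static void Main()
    {
        var queue = new RingQueue(3);
        queue.Enqueue(1);
        queue.Enqueue(2);
        queue.Enqueue(3);
        Console.WriteLine(queue.IsFull);     // True
        Console.WriteLine(queue.Dequeue());  // 1
        queue.Enqueue(4);                    // reuses the slot freed by 1
        Console.WriteLine(queue.Dequeue());  // 2
        Console.WriteLine(queue.Dequeue());  // 3
        Console.WriteLine(queue.Dequeue());  // 4
        Console.WriteLine(queue.IsEmpty);    // True
    }
}
```

Neither operation moves any existing element, which is why both Enqueue and Dequeue are O(1) here.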
Stack
 A stack is an abstract data type, a linear data structure in which the last element added is the first one removed.
 This makes the stack a LIFO (last in, first out) data structure: the element inserted last is removed first.
 Adding an element to the stack is called Push.
 Removing an element from the stack is called Pop.
Stack<T> – Basic Operations
 Push(T) – adds a new element on the top of the stack
 Pop() – returns the top element and removes it from the stack
 Peek() – returns the top element without removing it
 Count – returns the count of elements in the stack
 Clear() – removes all elements from the stack
 Contains(T) – checks whether the stack contains the element
 ToArray() – returns an array containing all elements of the stack
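A short usage sketch of these operations on Stack<T>: pushing a sequence and popping it back yields the reverse order. StackDemo and Reverse are names chosen for this example.

```csharp
using System;
using System.Collections.Generic;

public static class StackDemo
{
    // Reverses a sequence by pushing every item and then popping until empty.
    public static List<T> Reverse<T>(IEnumerable<T> items)
    {
        var stack = new Stack<T>();
        foreach (T item in items)
        {
            stack.Push(item);
        }

        var reversed = new List<T>();
        while (stack.Count > 0)
        {
            reversed.Add(stack.Pop());
        }
        return reversed;
    }

    public static void Main()
    {
        var stack = new Stack<int>();
        stack.Push(1);
        stack.Push(2);
        Console.WriteLine(stack.Peek());   // 2 (still on the stack)
        Console.WriteLine(stack.Pop());    // 2
        Console.WriteLine(stack.Count);    // 1

        Console.WriteLine(string.Join(", ", Reverse(new[] { 1, 2, 3 })));  // 3, 2, 1
    }
}
```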
.NET implementation of Stack
Push:
// Pushes an item to the top of the stack.
//
public virtual void Push(Object obj) {
    //Contract.Ensures(Count == Contract.OldValue(Count) + 1);
    if (_size == _array.Length) {
        Object[] newArray = new Object[2*_array.Length];
        Array.Copy(_array, 0, newArray, 0, _size);
        _array = newArray;
    }
    _array[_size++] = obj;
    _version++;
}
.NET implementation of Stack
Pop:
// Pops an item from the top of the stack. If the stack is empty, Pop
// throws an InvalidOperationException.
public virtual Object Pop() {
    if (_size == 0)
        throw new InvalidOperationException(Environment.GetResourceString("InvalidOperation_EmptyStack"));
    //Contract.Ensures(Count == Contract.OldValue(Count) - 1);
    Contract.EndContractBlock();
    _version++;
    Object obj = _array[--_size];
    _array[_size] = null; // Free memory quicker.
    return obj;
}
Dictionary data structures
 Hashtable
 Dictionary<TKey, TValue>
What is a hash table?
 What is the problem with ordinal indexing?
 A hash table combines the random-access ability of an array with the dynamism of a linked list, i.e. insertion, deletion and lookup can be done with O(1) average complexity if it is implemented correctly.
 To achieve this, we create a data structure where, on insertion, the data itself gives us a clue about where to store it.
 A hash table is a combination of two things:
 First, a hash function, which returns a non-negative value called the hash code.
 Second, an array capable of storing the data that we want to place into the structure.
 The idea is that we run our data through the hash function and then store the data in the array element indicated by the returned hash code.
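The idea can be tried with a deliberately simple hash function. In the sketch below the hash code is the sum of the character codes of the key, and the array index is that sum modulo the table size. Both the function and the table size of 5 are assumptions picked for illustration (they are not how .NET hashes strings), and collisions are ignored for now.

```csharp
using System;

public static class ToyHashDemo
{
    // Toy hash function: sum of the character codes. Always non-negative
    // for ordinary short strings, and the same input always gives the same result.
    public static int HashCode(string key)
    {
        int sum = 0;
        foreach (char c in key)
        {
            sum += c;
        }
        return sum;
    }

    // Maps the hash code onto a valid array index.
    public static int Index(string key, int tableSize)
    {
        return HashCode(key) % tableSize;
    }

    public static void Main()
    {
        string[] table = new string[5];

        // Insert: the data itself tells us where it goes.
        table[Index("John", table.Length)] = "John";

        // Lookup: hash the key again and look only at that one slot.
        Console.WriteLine(Index("John", table.Length));                     // 4
        Console.WriteLine(table[Index("John", table.Length)] == "John");    // True
        Console.WriteLine(table[Index("Jane", table.Length)] == "Jane");    // False
    }
}
```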
 As elements are added to a Hashtable, the actual load factor of
the Hashtable increases. When the actual load factor reaches the specified
load factor, the number of buckets in the Hashtable is automatically increased
to the smallest prime number that is larger than twice the current number
of Hashtable buckets.
 For very large Hashtable objects, you can increase the maximum capacity to 2 billion elements on a 64-bit system by setting the enabled attribute of the <gcAllowVeryLargeObjects> configuration element to true in the run-time environment.
 How insertion happens in a Hashtable
 How lookup works in a hash table
 Ex: to search for “John” in the hash table, we pass the key; the table hashes it and gets the same hash code that was generated when “John” was inserted, which is 4.
 It looks for “John” at index 4 of the hash table and returns true, as “John” is present at that index.
 Each element is a key/value pair stored in a DictionaryEntry object.
 private struct DictionaryEntry {
    public TKey key;
    public TValue value;
    public int hashCode;
    public int next;
}
How to define the Hash function?
 There is no limit on the number of possible hash functions.
 However, some characteristics are expected of an efficient hash function:
 Deterministic – every time we pass exactly the same piece of data into the hash function, we get the same hash code.
 Uniformly distributed – different values should rarely produce the same hash code; the hash codes should be spread evenly across the table.
 Ex of hash function
What if we came across this situation
 Do you see any problem in the following hash table?
 We call this a collision.
 A collision occurs when two pieces of data run through the hash function and get the same hash code.
 We want to store both pieces of data and don't want to overwrite the existing one with the new one.
Collision resolution techniques
 Linear probing: if a collision occurs, we try to place the data in the next consecutive index until we find a vacancy. It suffers from clustering.
 Quadratic probing: if slot s is taken, rather than checking slot s + 1, then s + 2, and so on as in linear probing, quadratic probing checks slot s + 1^2 first, then s - 1^2, then s + 2^2, then s - 2^2, then s + 3^2, and so on. However, even quadratic probing can lead to clustering.
 Chaining (used in Dictionary<TKey, TValue>): here the linked list comes into the picture. Instead of storing one value in each element of the hash table, each element of the array holds a pointer to the head of a linked list of entries.
 Rehashing (used in Hashtable): a set of different hash functions (H1, H2, ..., Hn) is used; when a collision occurs, the next function in the set is tried.
 Ex: Hk(key) =
[GetHash(key) + k * (1 + (((GetHash(key) >> 5) + 1) % (hashsize - 1)))] % hashsize
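Of the techniques above, chaining is the easiest to sketch. The class below keeps a linked list per bucket so that colliding keys can live side by side. It is a teaching sketch under stated assumptions: ChainedTable is a hypothetical class, the hash function is a toy (sum of character codes modulo the bucket count) chosen so that "John" and "Mary" collide in a 5-bucket table, and LinkedList<T> is used for clarity rather than to mirror the internal layout of Dictionary<TKey, TValue>.

```csharp
using System;
using System.Collections.Generic;

// A minimal chained hash table from string keys to int values.
public class ChainedTable
{
    private readonly LinkedList<KeyValuePair<string, int>>[] _buckets;

    public ChainedTable(int bucketCount)
    {
        _buckets = new LinkedList<KeyValuePair<string, int>>[bucketCount];
    }

    // Toy hash: sum of character codes modulo the number of buckets.
    public int BucketIndex(string key)
    {
        int sum = 0;
        foreach (char c in key)
        {
            sum += c;
        }
        return sum % _buckets.Length;
    }

    // Appends to the bucket's chain. For brevity this sketch does not
    // check whether the key is already present.
    public void Add(string key, int value)
    {
        int index = BucketIndex(key);
        if (_buckets[index] == null)
        {
            _buckets[index] = new LinkedList<KeyValuePair<string, int>>();
        }
        _buckets[index].AddLast(new KeyValuePair<string, int>(key, value));
    }

    // Walks only the one chain the key hashes to.
    public bool TryGetValue(string key, out int value)
    {
        LinkedList<KeyValuePair<string, int>> chain = _buckets[BucketIndex(key)];
        if (chain != null)
        {
            foreach (KeyValuePair<string, int> entry in chain)
            {
                if (entry.Key == key)
                {
                    value = entry.Value;
                    return true;
                }
            }
        }
        value = 0;
        return false;
    }
}

public static class ChainedTableDemo
{
    public static void Main()
    {
        var table = new ChainedTable(5);
        Console.WriteLine(table.BucketIndex("John"));  // 4
        Console.WriteLine(table.BucketIndex("Mary"));  // 4: a collision

        table.Add("John", 31);
        table.Add("Mary", 27);  // stored in the same chain, nothing is overwritten

        Console.WriteLine(table.TryGetValue("John", out int john) ? john : -1);  // 31
        Console.WriteLine(table.TryGetValue("Mary", out int mary) ? mary : -1);  // 27
        Console.WriteLine(table.TryGetValue("Jane", out int jane));              // False
    }
}
```

Lookups stay fast as long as the chains stay short, which is why a good hash function and a sensible load factor matter.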
Data structures in c#
When to use what?
 Do you need a sequential list where the element is typically discarded after its
value is retrieved?
 If yes, consider using the Queue class or the Queue<T> generic class if you need first-in, first-out (FIFO) behavior. Consider using the Stack class or the Stack<T> generic class if you need last-in, first-out (LIFO) behavior. For safe access from multiple threads, use the concurrent versions ConcurrentQueue<T> and ConcurrentStack<T>.
 If not, consider using the other collections.
 Do you need to access the elements in a certain order, such as FIFO, LIFO, or
random?
 The Queue class and the Queue<T> or ConcurrentQueue<T> generic class offer FIFO
access. For more information, see When to Use a Thread-Safe Collection.
 The Stack class and the Stack<T> or ConcurrentStack<T> generic class offer LIFO
access. For more information, see When to Use a Thread-Safe Collection.
 The LinkedList<T> generic class allows sequential access either from the head to the tail,
or from the tail to the head.
 Do you need to access each element by index?
 The ArrayList and StringCollection classes and the List<T> generic class offer access
to their elements by the zero-based index of the element.
 The Hashtable, SortedList, ListDictionary, and StringDictionary classes, and
the Dictionary<TKey, TValue> and SortedDictionary<TKey, TValue> generic classes
offer access to their elements by the key of the element.
 The NameObjectCollectionBase and NameValueCollection classes, and the KeyedCollection<TKey, TItem> and SortedList<TKey, TValue> generic classes offer access to their elements by either the zero-based index or the key of the element.
 Will each element contain one value, a combination of one key and one
value, or a combination of one key and multiple values?
 One value: Use any of the collections based on the IList interface or
the IList<T> generic interface.
 One key and one value: Use any of the collections based on
the IDictionary interface or the IDictionary<TKey, TValue> generic interface.
 One value with embedded key: Use the KeyedCollection<TKey, TItem> generic
class.
 One key and multiple values: Use the NameValueCollection class.
 Do you need to sort the elements differently from how they were entered?
 The Hashtable class sorts its elements by their hash codes.
 The SortedList class and the SortedDictionary<TKey, TValue> and SortedList<TKey,
TValue> generic classes sort their elements by the key, based on implementations
of the IComparer interface and the IComparer<T> generic interface.
 ArrayList provides a Sort method that takes an IComparer implementation as a
parameter. Its generic counterpart, the List<T> generic class, provides
a Sort method that takes an implementation of the IComparer<T> generic
interface as a parameter.
 Do you need fast searches and retrieval of information?
 ListDictionary is faster than Hashtable for small collections (10 items or fewer). The Dictionary<TKey, TValue> generic class provides faster lookup than the SortedDictionary<TKey, TValue> generic class. The multi-threaded implementation is ConcurrentDictionary<TKey, TValue>. ConcurrentBag<T> provides fast multi-threaded insertion for unordered data. For more information about both multi-threaded types, see When to Use a Thread-Safe Collection.

Data structures in c#

  • 1. Data Structures in C# SIVASANKAR GORANTLA
  • 2. Asymptotic notation  Before writing any program, we write some blueprint which is called as an algorithm.  We can have many solutions for each algorithm like A1, A2, A3 … etc  Analyze the algorithm in terms of Time and Space complexity. Based on that we will select the best algorithm.  There are some notations created by scientists in order to denote these complexities in simple terminology called as Asymptotic notation. Types:  Big oh notation (O notation) – Used to denote the worst case / upper bound of the algorithm. We are always interested in this.  Omega notation (Ω notation) – Used to denote the best case/ lower bound of the algorithm  Theta notation ( notation) – Used to denote average case of the algorithm  Ex with array : 5,4,2,6,8,9 best case Ω(1), worst case O(n) , average analysis (n/2) = (n)
  • 3. Mostly used Asymptotic notations constant − Ο(1) logarithmic − Ο(log n) linear − Ο(n) n log n − Ο(n log n) quadratic − Ο(n 2 ) cubic − Ο(n 3 ) polynomial − n Ο(1) exponential − 2 Ο(n)
  • 4. What is ADT ?  To manage the complexity of problems and the problem-solving process, computer scientists use abstractions to allow them to focus on the “big picture” without getting lost in the details.  An abstract data type, sometimes abbreviated ADT, is a logical description of how we view the data and the operations that are allowed without regard to how they will be implemented.  Example : List, Map  One ADT can have several implementations
  • 5. Example of ADT  Lets consider the interface System.Collections.IList  The basic operations, which it defines, are:  int Add(object) – adds element in the end of the list  void Insert(int, object) – inserts element on a preliminary chosen position in the list  void Clear() – removes all elements in the list  bool Contains(object) – checks whether the list contains the element  void Remove(object) – removes the element from the list  void RemoveAt(int) – removes the element on a given position  int IndexOf(object) – returns the position of the element  this[int] – indexer, allows access to the elements on a set position
  • 6. What is data structure and it’s need?  Data structure is a systematic way of organizing data in order to use it efficiently.  Choosing right data structure makes program much more efficient – We could save memory and execution time. Sometimes even the amount of code that we write. Need:  As applications are getting complex, data also getting increased. Due to this, below are the three common problems that we are facing today.  Data Search  Processing Speed  Multiple requests
  • 7. Basic data structures in programming.  Linear – these include arrays(Array), lists(ArrayList, List<T>), stacks(Stack<T>), queues(Queue<T>) and linked lists(LinkedList<T>)  Non-Linear:  Dictionaries – key-value pairs organized in hash tables (HashTable and Dictionary<T>)  Tree-like – Tree, Binary tree, AVL tree, Spanning tree and Heap  Sets – unordered bunches of unique elements  Others – multi-sets, bags, multi-bags, priority queues, Graphs…
  • 8. Motivation behind inventing the array  Let’s say you have a requirement to store 100 values into the memory. How can we store these many values into the memory with out using arrays.  What is the basic thing required to store some value into the memory in high level languages? A variable, which holds the address location of the memory.  In order to store 100 values into the memory, we need to create 100 variables in the program ?  100 variable is fine, what if you want to store/access 10000 elements ?
  • 9. Array  Arrays are one of the simplest and most commonly used data structure in computer programming.  All the elements of array must be of same type. Hence arrays are homogenous (Why?)  The contents of the array is stored in contiguous memory block.(Why?)  All the elements can be directly accessed with index. (How?)  Let’s take an example to understand how array stored into the heap.  Ex: bool [] booleanArray; FileInfo [] files; booleanArray = new bool[10]; files = new FileInfo[10];
  • 11. Two dimensional arrays  Two dimensional arrays.  For example , if I create multi dimensional array with mxn values then this is how it is going to store the data in memory   3D array :
  • 12. Basic operation on Array  Read elements by index O(1) Ex: int valueAtIndexTwo = array[2];  Write element by specifying the index Ex: array[10] = 12; O(1)  Search for an element by value O(n)  Search for an element by value using Binary search O(log n) only when array is sorted  http://eli.thegreenplace.net/2015/memory-layout-of-multi- dimensional-arrays/
  • 13. Array analysis  Ordering – Guaranteed  Contiguous – Yes  Direct access – Yes, via index, O(1)  Look up efficiency – O(1)  ArrayList has O(n) time complexity for arbitrary indices of add/remove, but O(1) for the operation at the end of the list.  The running time of an array access is denoted O(1) because it is constant. That is, regardless of how many elements are stored in the array, it takes the same amount of time to look up an element.  This constant running time is possible because an array's elements are stored contiguously, so a lookup only requires the array's starting location in memory, the size of each element, and the index of the element: address = start + index × element size.  The .NET Framework automatically checks every element access and throws an IndexOutOfRangeException if the index is outside the bounds of the array.
  • 14. Limitations of Array  The size of an array is fixed when it is created and cannot be changed afterwards.  An array can store only items of the same type.
  • 15. Array List  The ArrayList maintains an internal object array and automatically resizes it as the number of elements added to the ArrayList grows.  Because the ArrayList uses an object array, developers can add any type: strings, integers, FileInfo objects, Form instances, anything.  As a consequence, even if an ArrayList stores nothing but value types, each ArrayList element is a reference to a boxed value type.  The boxing and unboxing, along with the extra level of indirection, that come with using value types in an ArrayList can hamper the performance of your application when using large ArrayLists with many reads and writes.
  • 17. Basic operation on ArrayList  Add(object) – adding a new element  Insert(int, object) – adding a new element at a specified position (index)  Count – returns the count of elements in the list  Remove(object) – removes a specified element  RemoveAt(int) – removes the element at a specified position  Clear() – removes all elements from the list  this[int] – an indexer, allows accessing the elements by a given position (index)
  • 18.  ArrayList.Insert(): if (_size == _items.Length) { EnsureCapacity(_size + 1); } if (index < _size) { Array.Copy(_items, index, _items, index + 1, _size - index); } _items[index] = value; _size++;  Array.Copy copies a range of elements from one System.Array, starting at the specified source index, and pastes them into another System.Array, starting at the specified destination index. The length and the indexes are specified as 32-bit integers. Here the source and destination are the same array, so the elements from index onward are shifted one position to the right.
  • 19.  ArrayList.RemoveAt(): _size--; if (index < _size) { Array.Copy(_items, index + 1, _items, index, _size - index); }  Internally, Array.Copy calls: Copy(sourceArray, sourceIndex, destinationArray, destinationIndex, length, false);  It copies a range of elements from a System.Array, starting at the specified source index, and pastes them into another System.Array, starting at the specified destination index. The length and the indexes are specified as 32-bit integers. Here the elements after index are shifted one position to the left.
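To see the two shifting operations working together, here is a simplified dynamic array modeled on the Insert and RemoveAt snippets above. `SimpleArrayList` is a hypothetical class for illustration; the doubling growth policy and the argument checks are assumptions, and the real `EnsureCapacity` handles more cases.

```csharp
using System;

// Simplified, hypothetical dynamic array; not the framework's ArrayList.
public class SimpleArrayList
{
    private object[] _items = new object[4];
    private int _size;

    public int Count { get { return _size; } }

    public object this[int index]
    {
        get
        {
            if (index < 0 || index >= _size) throw new ArgumentOutOfRangeException("index");
            return _items[index];
        }
    }

    public void Insert(int index, object value)
    {
        if (index < 0 || index > _size) throw new ArgumentOutOfRangeException("index");
        if (_size == _items.Length)
        {
            // Assumed growth policy: double the backing array and copy everything over.
            object[] bigger = new object[_items.Length * 2];
            Array.Copy(_items, 0, bigger, 0, _size);
            _items = bigger;
        }
        if (index < _size)
        {
            // Shift elements [index, _size) one slot to the right.
            Array.Copy(_items, index, _items, index + 1, _size - index);
        }
        _items[index] = value;
        _size++;
    }

    public void RemoveAt(int index)
    {
        if (index < 0 || index >= _size) throw new ArgumentOutOfRangeException("index");
        _size--;
        if (index < _size)
        {
            // Shift the elements after index one slot to the left.
            Array.Copy(_items, index + 1, _items, index, _size - index);
        }
        _items[_size] = null; // drop the reference left in the vacated slot
    }
}
```

Inserting or removing at the end skips the `Array.Copy` call entirely, which is why those operations are O(1) while operations at arbitrary indices are O(n).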
  • 20. Analysis of ArrayList  Ordering – Guaranteed  Contiguous –Yes  Direct access –Yes via index O(1)  Look up efficiency – O(1)  ArrayList has O(n) time complexity for arbitrary indices of add/remove, but O(1) for the operation at the end of the list
  • 21. Limitations of ArrayList  The main problem with ArrayList is that it uses object, which means you have to cast to and from whatever type you are storing.  Implicit boxing happens whenever you use a value type: the value is boxed when it is put into the ArrayList and unboxed when it is read back.  Since generics were introduced, this class has become obsolete and is only needed in .NET 1.0/1.1 code.
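The casting and boxing cost can be seen by writing the same loop twice. This is a small sketch with hypothetical names (`BoxingDemo`, `SumArrayList`, `SumList`); it only contrasts the two APIs and makes no claim about measured performance.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingDemo
{
    // Non-generic ArrayList: every int was boxed when added,
    // so reading it back needs an explicit cast (unboxing).
    public static int SumArrayList(ArrayList items)
    {
        int total = 0;
        foreach (object item in items)
        {
            total += (int)item; // throws InvalidCastException if item is not an int
        }
        return total;
    }

    // Generic List<int>: elements are stored as ints, no boxing and no casts.
    public static int SumList(List<int> items)
    {
        int total = 0;
        foreach (int item in items)
        {
            total += item;
        }
        return total;
    }
}
```

With `ArrayList`, nothing stops a caller from adding a string next to the integers; the mistake only surfaces at run time when the cast fails. With `List<int>` the compiler rejects it.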
  • 22. List<T>  The List<T> data structure was introduced in the .NET Framework 2.0 as part of the new set of generic collections.  The List<T> class is the generic equivalent of ArrayList.  It implements the IList<T> generic interface by using an array whose size is dynamically increased as required.  It keeps its elements in memory as an array.  It is a very efficient data structure when elements are mostly appended at the end and accessed by index. However, inserting and removing elements is slow unless they are at the last position.  It represents a strongly typed list of objects that can be accessed by index, and provides methods to search, sort, and manipulate lists.  Elements in this collection are accessed using an integer index. Indexes are zero-based.
  • 23. Operations on List<T>  The List<T> class uses an inner array to keep its elements, and the array doubles its size when it becomes full. This implementation has the following advantages and disadvantages:  - Access by index is very fast: every element can be reached at the same speed, regardless of the number of elements.  - Searching for an element by value takes as many comparisons as there are elements (in the worst case), i.e. it is slow.  - Inserting and removing elements is slow: when the affected element is not at the end of the array, the rest of the elements have to be shifted.  - When adding a new element, the capacity of the array sometimes has to be increased, which is a slow operation. It happens seldom, however, so the average cost of appending to a List<T> does not depend on the number of elements, i.e. appending is very fast.
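The claim that resizing "happens seldom" can be checked by watching the `Capacity` property while appending. The sketch below uses a hypothetical helper, `GrowthDemo.CountResizes`; the exact capacities are an implementation detail, so the code only counts how often the capacity changes.

```csharp
using System;
using System.Collections.Generic;

public static class GrowthDemo
{
    // Appends `count` integers and counts how many times Capacity changes.
    // Each change means a new inner array was allocated and the elements copied.
    public static int CountResizes(int count)
    {
        var list = new List<int>();
        int resizes = 0;
        int lastCapacity = list.Capacity;
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                resizes++;
                lastCapacity = list.Capacity;
            }
        }
        return resizes;
    }
}
```

Because the capacity grows geometrically, the number of resizes grows roughly with the logarithm of the element count, which is what makes the average cost of `Add` constant. When the final size is known in advance, passing it to the `List<T>(int capacity)` constructor avoids the intermediate resizes altogether.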
  • 24. Analysis of List  Ordering – Guaranteed  Contiguous –Yes  Direct access –Yes via index O(1)  Look up efficiency – O(1)  Best for small list where direct access is required
  • 25. Linked List  A linked list is a sequence of data structures that are connected together via links.  In other words, a linked list is a sequence of links, each of which contains an item.  Each link contains a connection to another link. The linked list is the second most used data structure after the array.  The following terms are important for understanding linked lists.  Link − Each link of a linked list stores a piece of data called an element.  Next − Each link of a linked list contains a reference to the next link, called Next.  LinkedList − A LinkedList contains a reference to the first link, called First.
  • 26. Advantages of LinkedList<T>  The append operation is very fast, because the list always knows its last element (tail).  Inserting a new element at a random position in the list is very fast (unlike List<T>) if we have a pointer to this position, e.g. if we insert at the list start or at the list end.  Searching for elements by index or by value in LinkedList is a slow operation, as we have to scan all elements consecutively by beginning from the start of the list.  Removing elements is a slow operation, because it includes searching.
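A short sketch of these trade-offs using the framework's `LinkedList<T>`. The wrapper `LinkedListDemo.Build` is a hypothetical name; the `AddFirst`, `AddLast`, `Find` and `AddAfter` calls are the standard members of the class.

```csharp
using System;
using System.Collections.Generic;

public static class LinkedListDemo
{
    public static List<string> Build()
    {
        var list = new LinkedList<string>();
        list.AddLast("b");   // append at the tail: O(1)
        list.AddFirst("a");  // insert at the head: O(1)
        list.AddLast("d");

        // Find scans from the head, so locating the node is O(n)...
        LinkedListNode<string> nodeB = list.Find("b");

        // ...but once the node is known, inserting next to it is O(1):
        // only the neighbouring links are updated, nothing is shifted.
        list.AddAfter(nodeB, "c");

        return new List<string>(list); // a, b, c, d
    }
}
```

The slow part is always finding the position; the insertion itself never moves other elements, unlike `List<T>.Insert`.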
  • 27. Analysis of LinkedList  Ordering – The user has precise control over element ordering  Contiguous – No  Direct access – No  Look up efficiency – O(n)  Best for lists where inserting/deleting in the middle is common and no direct access is required
  • 28. Queue  A queue is an abstract data type in which new elements are inserted at one end, called the REAR (or tail), and existing elements are removed from the other end, called the FRONT (or head).  This makes the queue a FIFO (first in, first out) data structure: the element inserted first is also removed first.  Adding an element to the queue is called Enqueue.  Removing an element from the queue is called Dequeue.  Reading the element at the head without removing it is called Peek.
  • 29. The Queue – Basic Operations  The Queue<T> class provides the basic operations specific to the queue data structure. Here are some of the most frequently used:  - Enqueue(T) – inserts an element at the end of the queue  - Dequeue() – retrieves the element from the beginning of the queue and removes it  - Peek() – returns the element from the beginning of the queue without removing it  - Clear() – removes all elements from the queue  - Contains(T) – checks if the queue contains the element  - Count – returns the number of elements in the queue
  • 30. .NET implementation of the Queue  In .NET the queue is implemented using a circular buffer.  Circular buffer: a memory allocation scheme where memory is reused (reclaimed) when an index, incremented modulo the buffer size, writes over a previously used location.  Internally it uses an array to implement the queue. [figure not preserved in the transcript]
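The operations listed above can be exercised in a few lines. `QueueDemo.Run` is a hypothetical wrapper; the calls inside it are the standard `Queue<T>` members.

```csharp
using System;
using System.Collections.Generic;

public static class QueueDemo
{
    public static string Run()
    {
        var queue = new Queue<string>();
        queue.Enqueue("first");
        queue.Enqueue("second");
        queue.Enqueue("third");

        string peeked = queue.Peek();      // reads the head, leaves it in place
        string removed = queue.Dequeue();  // reads the head and removes it

        // After one Dequeue, two elements remain and "second" is now at the head.
        return peeked + "," + removed + "," + queue.Count + "," + queue.Peek();
    }
}
```

`Run()` returns `first,first,2,second`: `Peek` and `Dequeue` both see the oldest element, which is the FIFO behaviour described earlier.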
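  • 31. .NET implementation of the Queue  Enqueue advances the tail with wrap-around: _tail = (_tail + 1) % _array.Length;  Dequeue advances the head the same way: _head = (_head + 1) % _array.Length;  Is full: the number of stored elements equals _array.Length.  Is empty: the number of stored elements is 0.
  • 32. Analysis on Queue  Enqueue : O(1) (amortized; O(n) when the internal array has to grow)  Dequeue : O(1)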
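A minimal sketch of a circular-buffer queue built around the two index updates on this slide. `RingQueue` is a hypothetical fixed-capacity class for illustration; unlike the framework's `Queue<T>`, it throws when full instead of growing its array.

```csharp
using System;

// Hypothetical fixed-capacity circular-buffer queue, for illustration only.
public class RingQueue
{
    private readonly int[] _array;
    private int _head; // index of the next element to dequeue
    private int _tail; // index of the next free slot
    private int _size; // number of stored elements

    public RingQueue(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException("capacity");
        _array = new int[capacity];
    }

    public bool IsEmpty { get { return _size == 0; } }
    public bool IsFull { get { return _size == _array.Length; } }

    public void Enqueue(int item)
    {
        if (IsFull) throw new InvalidOperationException("Queue is full.");
        _array[_tail] = item;
        _tail = (_tail + 1) % _array.Length; // wraps back to slot 0 after the last slot
        _size++;
    }

    public int Dequeue()
    {
        if (IsEmpty) throw new InvalidOperationException("Queue is empty.");
        int item = _array[_head];
        _head = (_head + 1) % _array.Length;
        _size--;
        return item;
    }
}
```

Tracking `_size` separately is a design choice: when head and tail point at the same slot, the queue could be either full or empty, and the counter removes that ambiguity.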
  • 33. Stack  A stack is an abstract data type, or linear data structure, in which the last element added is the first one removed.  This makes the stack a LIFO (last in, first out) data structure: the element inserted last is removed first.  Adding an element to the stack is called Push.  Removing an element from the stack is called Pop.
  • 34. Stack<T> – Basic Operations  Push(T) – adds a new element on the top of the stack  Pop() – returns the topmost element and removes it from the stack  Peek() – returns the topmost element without removing it  Count – returns the number of elements in the stack  Clear() – removes all elements from the stack  Contains(T) – checks whether the stack contains the element  ToArray() – returns an array containing all elements of the stack
  • 35. .NET implementation of Stack Push :
    // Pushes an item to the top of the stack.
    //
    public virtual void Push(Object obj) {
        //Contract.Ensures(Count == Contract.OldValue(Count) + 1);
        if (_size == _array.Length) {
            Object[] newArray = new Object[2*_array.Length];
            Array.Copy(_array, 0, newArray, 0, _size);
            _array = newArray;
        }
        _array[_size++] = obj;
        _version++;
    }
  • 36. .NET implementation of Stack Pop :
    // Pops an item from the top of the stack. If the stack is empty, Pop
    // throws an InvalidOperationException.
    public virtual Object Pop() {
        if (_size == 0)
            throw new InvalidOperationException(Environment.GetResourceString("InvalidOperation_EmptyStack"));
        //Contract.Ensures(Count == Contract.OldValue(Count) - 1);
        Contract.EndContractBlock();
        _version++;
        Object obj = _array[--_size];
        _array[_size] = null; // Free memory quicker.
        return obj;
    }
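From the caller's side, the Push and Pop behaviour shown above looks like this. The sketch uses the generic `Stack<T>`; `StackDemo`, `Reverse` and `PopOnEmptyThrows` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class StackDemo
{
    // Reverses a sequence by pushing every item and popping them back (LIFO).
    public static List<int> Reverse(IEnumerable<int> items)
    {
        var stack = new Stack<int>();
        foreach (int item in items)
        {
            stack.Push(item);
        }

        var reversed = new List<int>();
        while (stack.Count > 0)
        {
            reversed.Add(stack.Pop());
        }
        return reversed;
    }

    // Returns true when Pop on an empty stack throws InvalidOperationException.
    public static bool PopOnEmptyThrows()
    {
        try
        {
            new Stack<int>().Pop();
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

Checking `Count > 0` before calling `Pop`, as the loop does, is the usual way to avoid the exception.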
  • 37. Dictionary data structures  Hashtable  Dictionary<TKey, TValue>
  • 38. What is a Hash Table?  Problem with ordinal indexing?
  • 39.  A hash table combines the random-access ability of an array with the dynamism of a linked list, i.e. insertion, deletion and lookup can be done with O(1) complexity on average if it is implemented correctly.  To achieve this, we create a data structure in which the data itself gives us a clue about where to store it.  A hash table is a combination of two things:  First, a hash function, which returns a non-negative value called the hash code.  Second, an array capable of storing the data that we want to place into the structure.  The idea is that we run our data through the hash function and then store the data in the array element identified by the returned hash code.
  • 40.  As elements are added to a Hashtable, the actual load factor of the Hashtable increases. When the actual load factor reaches the specified load factor, the number of buckets in the Hashtable is automatically increased to the smallest prime number that is larger than twice the current number of Hashtable buckets.  For very large Hashtable objects, you can increase the maximum capacity to 2 billion elements on a 64-bit system by setting the enabled attribute of the <gcAllowVeryLargeObjects> configuration element to true in the run-time environment.
  • 41.  How insertion happens in a Hashtable  How lookup works in a hash table  Ex: to search for “John” in the hash table, we pass the key; the table hashes it and gets the same hash code that was generated when “John” was inserted, which is 4.  It then looks for “John” at index 4 of the hash table and returns true, because “John” is present at that index.  Each element is a key/value pair stored in a DictionaryEntry object.  private struct DictionaryEntry{ public TKey key; public TValue value; public int hashCode; public int next; }
  • 42. How to define the Hash function?  There is no limit to the number of possible hash functions.  However, a hash function is expected to have certain characteristics to qualify as efficient.  Deterministic – Every time we pass exactly the same piece of data into the hash function, we always get the same hash code.  Uniformly distributed – Different values should rarely produce the same hash code; the codes should be spread evenly over the available range.  Ex of hash function: [figure not preserved in the transcript]
  • 43. What if we came across this situation  Do you see any problem in the following hash table? [figure not preserved in the transcript]  We call this a collision.  A collision occurs when two pieces of data run through the hash function and get the same hash code.  We want to store both pieces of data and don’t want to overwrite the existing one with the new one.
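Since the slide's own example is missing, here is a deliberately simple stand-in. `HashDemo.ToyHash` is a hypothetical function and not the algorithm .NET uses: it sums the character codes of a string and maps the sum onto the table size. It is deterministic, but it distributes poorly because any two anagrams produce the same code.

```csharp
using System;

public static class HashDemo
{
    // Toy hash function (an assumption for illustration, not the .NET algorithm):
    // adds up the character codes and maps the result onto [0, tableSize).
    public static int ToyHash(string key, int tableSize)
    {
        if (key == null) throw new ArgumentNullException("key");
        if (tableSize <= 0) throw new ArgumentOutOfRangeException("tableSize");

        int sum = 0;
        foreach (char c in key)
        {
            // Reducing at every step keeps the value small and non-negative.
            sum = (sum + c) % tableSize;
        }
        return sum;
    }
}
```

Calling `ToyHash("abc", 10)` and `ToyHash("cba", 10)` gives the same index, which shows why determinism alone does not make a good hash function: the second property, uniform distribution, matters just as much.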
  • 44. Collision resolution techniques  Linear probing: if a collision occurs, we try to place the data in the next consecutive index until we find a vacant slot. It suffers from clustering.  Quadratic probing: if slot s is taken, rather than checking slot s + 1, then s + 2, and so on as in linear probing, quadratic probing checks slot s + 1² first, then s – 1², then s + 2², then s – 2², then s + 3², and so on. However, even quadratic probing can lead to clustering.  Chaining (used in Dictionary<TKey, TValue>): here the linked list comes into the picture. Instead of storing one value in each element of the hash table, each element of the array holds a pointer to the head of a linked list of entries.  Rehashing (used in Hashtable): a set of different hash functions (H1, H2, ..., Hn) is used; when one of them produces a collision, the next one is tried.  Ex: Hk(key) = [GetHash(key) + k * (1 + (((GetHash(key) >> 5) + 1) % (hashsize – 1)))] % hashsize
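The chaining technique can be sketched in a few dozen lines. `ChainedTable` is a hypothetical class for illustration; it uses a `List<T>` per bucket rather than a hand-written linked list to keep the code short, and `Dictionary<TKey, TValue>` itself is considerably more elaborate.

```csharp
using System;
using System.Collections.Generic;

// Minimal chained hash table from string keys to int values (illustration only).
public class ChainedTable
{
    private readonly List<KeyValuePair<string, int>>[] _buckets;

    public ChainedTable(int bucketCount)
    {
        if (bucketCount <= 0) throw new ArgumentOutOfRangeException("bucketCount");
        _buckets = new List<KeyValuePair<string, int>>[bucketCount];
    }

    private int BucketIndex(string key)
    {
        if (key == null) throw new ArgumentNullException("key");
        // Mask off the sign bit so the index is never negative.
        return (key.GetHashCode() & 0x7FFFFFFF) % _buckets.Length;
    }

    public void Set(string key, int value)
    {
        int index = BucketIndex(key);
        if (_buckets[index] == null)
        {
            _buckets[index] = new List<KeyValuePair<string, int>>();
        }
        List<KeyValuePair<string, int>> chain = _buckets[index];
        for (int i = 0; i < chain.Count; i++)
        {
            if (chain[i].Key == key)
            {
                chain[i] = new KeyValuePair<string, int>(key, value); // same key: update
                return;
            }
        }
        // New key: append to the chain, even if other keys already live in this bucket.
        chain.Add(new KeyValuePair<string, int>(key, value));
    }

    public bool TryGet(string key, out int value)
    {
        List<KeyValuePair<string, int>> chain = _buckets[BucketIndex(key)];
        if (chain != null)
        {
            foreach (KeyValuePair<string, int> pair in chain)
            {
                if (pair.Key == key)
                {
                    value = pair.Value;
                    return true;
                }
            }
        }
        value = 0;
        return false;
    }
}
```

Creating the table with a single bucket forces every key to collide, which is a convenient way to confirm that colliding entries coexist instead of overwriting each other.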
  • 46. When to use what?  Do you need a sequential list where the element is typically discarded after its value is retrieved?  If yes, consider using the Queue class or the Queue<T> generic class if you need first-in, first-out (FIFO) behavior. Consider using the Stack class or the Stack<T> generic class if you need last-in, first-out (LIFO) behavior. For safe access from multiple threads, use the concurrent versions ConcurrentQueue<T> and ConcurrentStack<T>.  If not, consider using the other collections.  Do you need to access the elements in a certain order, such as FIFO, LIFO, or random?  The Queue class and the Queue<T> or ConcurrentQueue<T> generic class offer FIFO access. For more information, see When to Use a Thread-Safe Collection.  The Stack class and the Stack<T> or ConcurrentStack<T> generic class offer LIFO access. For more information, see When to Use a Thread-Safe Collection.  The LinkedList<T> generic class allows sequential access either from the head to the tail, or from the tail to the head.
  • 47.  Do you need to access each element by index?  The ArrayList and StringCollection classes and the List<T> generic class offer access to their elements by the zero-based index of the element.  The Hashtable, SortedList, ListDictionary, and StringDictionary classes, and the Dictionary<TKey, TValue> and SortedDictionary<TKey, TValue> generic classes offer access to their elements by the key of the element.  The NameObjectCollectionBase and NameValueCollection classes, and the KeyedCollection<TKey, TItem> and SortedList<TKey, TValue> generic classes offer access to their elements by either the zero-based index or the key of the element.  Will each element contain one value, a combination of one key and one value, or a combination of one key and multiple values?  One value: Use any of the collections based on the IList interface or the IList<T> generic interface.  One key and one value: Use any of the collections based on the IDictionary interface or the IDictionary<TKey, TValue> generic interface.  One value with embedded key: Use the KeyedCollection<TKey, TItem> generic class.  One key and multiple values: Use the NameValueCollection class.
  • 48.  Do you need to sort the elements differently from how they were entered?  The Hashtable class sorts its elements by their hash codes.  The SortedList class and the SortedDictionary<TKey, TValue> and SortedList<TKey, TValue> generic classes sort their elements by the key, based on implementations of the IComparer interface and the IComparer<T> generic interface.  ArrayList provides a Sort method that takes an IComparer implementation as a parameter. Its generic counterpart, the List<T> generic class, provides a Sort method that takes an implementation of the IComparer<T> generic interface as a parameter.  Do you need fast searches and retrieval of information?  ListDictionary is faster than Hashtable for small collections (10 items or fewer). The Dictionary<TKey, TValue> generic class provides faster lookup than the SortedDictionary<TKey, TValue> generic class. The multi-threaded implementation is ConcurrentDictionary<TKey, TValue>. ConcurrentBag<T> provides fast multi-threaded insertion for unordered data. For more information about both multi-threaded types, see When to Use a Thread-Safe Collection.
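Two of the sorting options mentioned above, sketched with hypothetical wrapper names (`SortedDemo`, `SortedKeys`, `SortDescending`): a `SortedDictionary<TKey, TValue>` that keeps its keys ordered at all times, and `List<T>.Sort` with a custom comparison.

```csharp
using System;
using System.Collections.Generic;

public static class SortedDemo
{
    // SortedDictionary enumerates in ascending key order,
    // regardless of the order in which entries were added.
    public static List<string> SortedKeys()
    {
        var sorted = new SortedDictionary<string, int>();
        sorted["pear"] = 3;
        sorted["apple"] = 1;
        sorted["mango"] = 2;
        return new List<string>(sorted.Keys); // apple, mango, pear
    }

    // List<T>.Sort with a custom comparison: descending order.
    // The input list is copied first so the caller's list is left unchanged.
    public static List<int> SortDescending(List<int> values)
    {
        var copy = new List<int>(values);
        copy.Sort((a, b) => b.CompareTo(a));
        return copy;
    }
}
```

The trade-off matches the guidance on this slide: `SortedDictionary` pays for ordering on every insertion and lookup, so `Dictionary<TKey, TValue>` remains the better choice when only fast lookup by key is needed.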