Information Storage and
Management
By AKASH BADONE
(UNIT I - Introduction to Storage Technology)
Core elements of a data center
 Application: A computer program that provides the logic for computing operations.
 Database: An organized collection of data.
 Server and operating system: A computing platform that runs applications and databases.
 Network: A data path that facilitates communication between clients and servers or between
servers and storage.
 Storage array: A device that stores data persistently for subsequent use.
Key Requirements for Data Center
Elements
 Availability: Access to information should be ensured whenever the business needs it.
 Security: Policies, procedures, and proper integration of the data center core elements that
will prevent unauthorized access to information must be established.
 Scalability: Additional processing capability or storage should be available on demand,
without interrupting business operations.
 Performance: All the core elements of the data center should be able to provide optimal
performance and service all processing requests at high speed.
 Data integrity: Mechanisms such as error correction codes and parity bits ensure
that data is written to disk exactly as it was received.
 Capacity: It should be possible to increase or decrease the capacity of the core elements on demand.
 Manageability: A data center should perform all operations and activities in the most
efficient manner. Manageability can be achieved through automation and the reduction of
human (manual) intervention in common tasks.
Evolution of Storage Technology and
Architecture
Direct-attached storage (DAS)
Just a Bunch Of Disks (JBOD)
Redundant Array of Independent Disks (RAID)
Network-attached storage (NAS)
Storage area network (SAN)
Internet Protocol SAN (IP-SAN)
…Briefly described in UNIT III
Information Lifecycle Management (ILM)
Information lifecycle management (ILM) is a proactive strategy that enables an IT organization
to effectively manage the data throughout its lifecycle, based on predefined business policies.
This allows an IT organization to optimize the storage infrastructure for maximum return on
investment. An ILM strategy should include the following characteristics:
 Business-centric: It should be integrated with key processes, applications, and initiatives of
the business to meet both current and future growth in information.
 Centrally managed: All the information assets of a business should be under the purview of
the ILM strategy.
 Policy-based: The implementation of ILM should not be restricted to a few departments.
ILM should be implemented as a policy and encompass all business applications, processes,
and resources.
 Heterogeneous: An ILM strategy should take into account all types of storage platforms
and operating systems.
 Optimized: Because the value of information varies, an ILM strategy should consider the
different storage requirements and allocate storage resources based on the information’s
value to the business.
ILM Implementation
 Classifying data and applications on the basis of
business rules and policies to enable
differentiated treatment of information
 Implementing policies by using information
management tools, starting from the creation of
data and ending with its disposal
 Managing the environment by using integrated
tools to reduce operational complexity
 Organizing storage resources in tiers to align
the resources with data classes, and storing
information in the right type of infrastructure
based on the information’s current value
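The classification and tiering steps above can be sketched in a few lines of Python. This is only an illustration: the `DataSet` fields, the tier names, and the 30-day and 365-day thresholds are hypothetical and are not taken from the slides. A real ILM policy would come from the business rules of the organization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSet:
    """Hypothetical description of a data set to be classified."""
    name: str
    days_since_last_access: int
    business_critical: bool


def assign_tier(ds: DataSet) -> str:
    """Map a data set to a storage tier using an illustrative policy.

    The thresholds and tier names are assumptions made for this sketch.
    """
    if ds.business_critical and ds.days_since_last_access <= 30:
        return "tier1-high-performance"
    if ds.days_since_last_access <= 365:
        return "tier2-capacity"
    return "tier3-archive"


datasets = [
    DataSet("orders-current", 1, True),
    DataSet("orders-last-year", 200, True),
    DataSet("logs-2015", 2000, False),
]

# Re-running the classification periodically moves data between tiers
# as its value to the business changes over time.
placement = {ds.name: assign_tier(ds) for ds in datasets}
for name, tier in placement.items():
    print(f"{name}: {tier}")
```

Because the policy is a single function, changing a threshold changes the placement of every data set consistently, which is the point of a policy-based approach.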
ILM Benefits
 Improved utilization by using tiered storage platforms and increased visibility of all
enterprise information.
 Simplified management by integrating process steps and interfaces with individual
tools and by increasing automation.
 A wider range of options for backup and recovery to balance the need for business
continuity.
 Maintaining compliance by knowing what data needs to be protected for what
length of time.
 Lower Total Cost of Ownership (TCO) by aligning the infrastructure and
management costs with information value. As a result, resources are not wasted,
and complexity is not introduced by managing low-value data at the expense of
high-value data.
Key Challenges in Managing Information
 Exploding digital universe: The rate of information growth is increasing
exponentially. Duplication of data to ensure high availability and repurposing has
also contributed to the multifold increase of information growth.
 Increasing dependency on information: The strategic use of information plays an
important role in determining the success of a business and provides competitive
advantages in the marketplace.
 Changing value of information: Information that is valuable today may become less
important tomorrow. The value of information often changes over time. Framing a
policy to meet these challenges involves understanding the value of information
over its lifecycle.
(UNIT II - Storage Systems Architecture)
 Key components of a disk drive are the platter, spindle, read/write (R/W) head,
actuator arm assembly, and controller.
 The set of rotating platters is sealed in a
case, called a Head Disk Assembly (HDA).
 A typical HDD consists of one or more
flat circular disks called platters.
Key components of a disk drive
 When the spindle is rotating, there is a microscopic air gap between the R/W heads
and the platters, known as the head flying height.
 This air gap is removed when the spindle stops rotating and the R/W head rests on
a special area on the platter near the spindle. This area is called the landing zone.
 If the drive malfunctions and the R/W head accidentally touches the surface of the
platter outside the landing zone, a head crash occurs.
Figures: Data Transfer Rate; Zoned Bit Recording
Disk Drive Performance Based on Time
 Disk service time is the time taken by a disk to complete an I/O request. Components that
contribute to service time on a disk drive are seek time, rotational latency, and data transfer
rate.
 Seek Time : The time taken to reposition and settle the arm and the head over the correct track is
known as Seek Time. The lower the seek time, the faster the I/O operation. Disk vendors publish the
following seek time specifications:
 Full Stroke: The time taken by the R/W head to move across the entire width of the disk, from the innermost
track to the outermost track.
 Average: The average time taken by the R/W head to move from one random track to another, normally listed
as the time for one-third of a full stroke.
 Track-to-Track: The time taken by the R/W head to move between adjacent tracks.
 Rotational Latency: To access data, the actuator arm moves the R/W head over the platter to a particular
track while the platter spins to position the requested sector under the R/W head. The time taken by the
platter to rotate and position the data under the R/W head is called rotational latency.
 Data Transfer Rate: The data transfer rate (also called transfer rate) refers to the average amount of data
per unit time that the drive can deliver to the host bus adapter (HBA). In a read operation, the data first
moves from disk platters to R/W heads, and then it moves to the drive’s internal buffer. Finally, data moves
from the buffer through the interface to the host HBA. In a write operation, the data moves from the HBA to
the internal buffer of the disk drive through the drive’s interface. The data then moves from the buffer to
the R/W heads. Finally, it moves from the R/W heads to the platters.
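The three components of disk service time can be added up in a short calculation. The sketch below assumes that the average rotational latency is the time for half a revolution and that transfer time is the block size divided by the transfer rate. The function names and the example figures (5 ms seek, 15,000 rpm, 32 KiB block, 40 MiB/s) are illustrative and do not come from the slides.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in ms, assumed to be half a revolution."""
    ms_per_revolution = 60_000.0 / rpm
    return 0.5 * ms_per_revolution


def transfer_time_ms(block_bytes: int, rate_bytes_per_s: float) -> float:
    """Time in ms to move one block at the given transfer rate."""
    return block_bytes / rate_bytes_per_s * 1000.0


def disk_service_time_ms(seek_ms: float, rpm: float,
                         block_bytes: int, rate_bytes_per_s: float) -> float:
    """Service time = seek time + rotational latency + transfer time."""
    return (seek_ms
            + avg_rotational_latency_ms(rpm)
            + transfer_time_ms(block_bytes, rate_bytes_per_s))


# Illustrative drive: all four figures are assumptions for this example.
service_ms = disk_service_time_ms(
    seek_ms=5.0,
    rpm=15_000,
    block_bytes=32 * 1024,
    rate_bytes_per_s=40 * 1024 * 1024,
)

# Upper bound on I/Os per second if requests are served one at a time.
max_iops = 1000.0 / service_ms
print(f"service time: {service_ms:.2f} ms, max IOPS: {max_iops:.0f}")
```

For small random I/Os the seek time and rotational latency dominate, which is why a lower seek time gives faster I/O operations.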
Intelligent Storage System
Components of an Intelligent Storage System
 Front End
 Cache
 Back End
 Physical Disk
Logical Unit Number (LUN)
Physical drives or RAID-protected drive sets can be logically split into volumes. These are
called logical volumes, and each one is identified by a logical unit number (LUN).
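One way to picture this split is as consecutive block ranges carved out of a RAID set. The sketch below is a simplified model with hypothetical names (`Lun`, `carve_luns`, `to_raid_set_block`). Real arrays use more elaborate layouts, so treat it only as an illustration of the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lun:
    """A logical volume described as a block range within a RAID set."""
    lun_id: int
    start_block: int
    block_count: int


def carve_luns(total_blocks: int, sizes: list[int]) -> list[Lun]:
    """Split a RAID set of total_blocks into consecutive LUNs."""
    if sum(sizes) > total_blocks:
        raise ValueError("requested LUNs exceed the capacity of the RAID set")
    luns = []
    start = 0
    for lun_id, size in enumerate(sizes):
        luns.append(Lun(lun_id, start, size))
        start += size
    return luns


def to_raid_set_block(lun: Lun, lba: int) -> int:
    """Translate a block address inside a LUN to a block in the RAID set."""
    if not 0 <= lba < lun.block_count:
        raise IndexError("LBA outside the LUN")
    return lun.start_block + lba


# A RAID set of 1,000,000 blocks split into two LUNs (sizes are made up).
luns = carve_luns(1_000_000, [400_000, 600_000])
print(luns[1])
print(to_raid_set_block(luns[1], 10))
```

The host addresses each LUN from block 0, and the array translates that address to a location on the underlying drives.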
Intelligent Storage Array
Intelligent storage systems generally fall into one of the following two categories:
■ High-end storage systems: High-end storage systems, referred to as active-active arrays, are
generally aimed at large enterprises for centralizing corporate data. These arrays are designed with
a large number of controllers and cache memory. An active-active array implies that the host can
perform I/Os to its LUNs across any of the available paths.
■ Midrange storage systems: Midrange storage systems are also referred to as active-passive
arrays and they are best suited for small- and medium-sized enterprises. In an active-passive array,
a host can perform I/Os to a LUN only through the paths to the owning controller of that LUN.
These paths are called active paths. The other paths are passive with respect to this LUN. As shown
in Figure 4-8, the host can perform reads or writes to the LUN only through the path to controller
A, as controller A is the owner of that LUN. The path to controller B remains passive and no I/O
activity is performed through this path.
Midrange storage systems are typically designed with two controllers, each of which contains host
interfaces, cache, RAID controllers, and disk drive interfaces.
Active-Active/Active-Passive Configurations
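The difference between the two configurations comes down to which paths a host may use for a given LUN. The following sketch models that rule. The `Path` class, the `usable_paths` function, and the path names are hypothetical and only illustrate the behaviour described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """A host-to-array path that ends at one controller."""
    name: str
    controller: str


def usable_paths(paths: list[Path], lun_owner: str,
                 active_active: bool) -> list[Path]:
    """Return the paths on which the host may perform I/O to a LUN.

    Active-active: any available path may be used.
    Active-passive: only paths to the controller that owns the LUN.
    """
    if active_active:
        return list(paths)
    return [p for p in paths if p.controller == lun_owner]


# Two paths, one per controller; controller A owns the LUN in this example.
paths = [Path("path-1", "A"), Path("path-2", "B")]

active_passive = usable_paths(paths, lun_owner="A", active_active=False)
active_active = usable_paths(paths, lun_owner="A", active_active=True)

print([p.name for p in active_passive])
print([p.name for p in active_active])
```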
Redundant Array of Independent Disks (RAID)
Comparison in RAID Levels
RAID Levels (0, 1, 3)
RAID Levels (5 & 6)
Nested RAID (0+1 / 1+0)
Parity
Parity is a method of protecting striped data
from HDD failure without the cost of mirroring.
An additional HDD is added to the stripe width
to hold parity, a mathematical construct that
allows re-creation of the missing data. Parity is
a redundancy check that protects the data
against a drive failure without maintaining a
full set of duplicate data.
Parity RAID
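The re-creation of missing data can be shown with a small example. The slides describe parity only as a mathematical construct. The sketch below assumes the common bitwise XOR form, where the parity strip is the XOR of all data strips. The function name `xor_strips` and the byte values are made up for illustration.

```python
from functools import reduce


def xor_strips(strips: list[bytes]) -> bytes:
    """XOR equal-length strips together, byte by byte."""
    if len({len(s) for s in strips}) != 1:
        raise ValueError("all strips must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


# Three data strips, one per data drive (values are arbitrary).
data = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xff\x00\x0f"]

# The parity strip is stored on the additional drive.
parity = xor_strips(data)

# Simulate losing the second drive: XOR of the surviving strips and
# the parity strip yields the missing strip.
rebuilt = xor_strips([data[0], data[2], parity])

print(rebuilt == data[1])
```

This works because XOR is its own inverse: combining the parity with every surviving strip cancels them out and leaves only the lost strip. With a single parity strip, only one failed drive can be rebuilt at a time.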
(UNIT III - Introduction to Networked Storage)
Evolution of Storage Technology and Architecture
 Direct-attached storage (DAS) : This type of storage
connects directly to a server (host) or a group of servers
in a cluster. Storage can be either internal or external to
the server. External DAS alleviated the challenges of
limited internal storage capacity.
 Just a Bunch Of Disks (JBOD): An enclosure full of
disks with no internal controller. In a JBOD, the hard
disks are permanently fitted into the enclosure, which
provides their power supply and external connections.
 Redundant Array of Independent Disks (RAID) : This
technology was developed to address the cost,
performance, and availability requirements of data. It
continues to evolve today and is used in all storage
architectures such as DAS, SAN, and so on.
 Network-attached storage (NAS) : This is dedicated storage for file
serving applications. Unlike a SAN, it connects to an existing
communication network (LAN) and provides file access to
heterogeneous clients. Because it is purpose-built for providing
storage to file server applications, it offers higher scalability,
availability, performance, and cost benefits compared to general
purpose file servers.
 Storage area network (SAN): This is a dedicated, high-performance
Fibre Channel (FC) network to facilitate block-level communication
between servers and storage. Storage is partitioned and assigned to a
server for accessing its data. SAN offers scalability, availability,
performance, and cost benefits compared to DAS.
 Internet Protocol SAN (IP-SAN) : One of the latest evolutions in storage
architecture, IP-SAN is a convergence of technologies used in SAN and
NAS. IP-SAN provides block-level communication across a local or
wide area network (LAN or WAN), resulting in greater consolidation
and availability of data.