VTOP MODULE 2
COMPUTER VISION
MODULE 2
• Depth Estimation and Multi-Camera Views:
Perspective, Binocular Stereopsis: Camera and
Epipolar Geometry; Homography, Rectification,
DLT, RANSAC, 3-D reconstruction framework;
Auto-calibration.
 Depth Estimation is the task of measuring the distance of each pixel relative to
the camera. Depth is extracted from either monocular (single) or stereo (multiple
views of a scene) images. Traditional methods use multi-view geometry to find
the relationship between the images. Newer methods can directly estimate depth
by minimizing the regression loss, or by learning to generate a novel view from a
sequence.
 It is an important task in computer vision and has various applications such as
3D reconstruction, augmented reality, autonomous navigation, and more.
 There are several techniques for depth estimation, and one commonly used
approach is stereo vision. Stereo vision involves using a pair of cameras, known
as a stereo camera setup, to capture images of a scene from slightly different
viewpoints. The disparity between corresponding pixels in the left and right
images can be used to calculate the depth information.
Depth Estimation
 To estimate depth using stereo vision, the following steps are typically involved:
Camera calibration: Accurate calibration of the stereo camera setup is necessary to
determine the intrinsic and extrinsic parameters of each camera. This calibration
process establishes the relationship between the 3D world coordinates and the
corresponding 2D image points.
Image rectification: Rectification transforms the stereo image pair so that
corresponding epipolar lines become aligned horizontal scanlines. This simplifies the
matching process by reducing it to a 1D search along each scanline.
Disparity calculation: Matching algorithms are used to find correspondences
between the left and right images. These algorithms aim to identify the pixel
disparities, i.e., the horizontal shift of a point between the two images. Common
techniques include block matching, semi-global matching, and graph cuts.
Depth Estimation
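As a rough illustration of the disparity-calculation step described above, the sketch below uses OpenCV's semi-global block matcher. It assumes the stereo pair has already been calibrated and rectified; the filenames and matcher settings are placeholders, not values from these slides.

```python
import cv2

# Assumes a rectified grayscale stereo pair; the filenames are hypothetical.
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Semi-global matching: searches along each horizontal scanline for the best match.
matcher = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=128,   # search range in pixels; must be a multiple of 16
    blockSize=5,          # size of the matching window
)

# OpenCV returns fixed-point disparities scaled by 16, so divide to get pixels.
disparity = matcher.compute(left, right).astype("float32") / 16.0
```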
Depth computation: Once the disparity map is obtained, the depth can be calculated
using triangulation. By knowing the baseline distance (distance between the two
camera centers) and the focal length of the cameras, the depth at each pixel can be
computed using simple geometry.
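For example, using the standard stereo relation Z = f·B/d (depth equals focal length times baseline divided by disparity), a disparity map converts directly to depth. The focal length and baseline below are made-up values for illustration only.

```python
import numpy as np

f_px = 700.0       # focal length in pixels (assumed known from calibration)
baseline_m = 0.12  # distance between the two camera centers, in metres (assumed)

# Z = f * B / d; invalid (zero) disparities are mapped to infinite depth.
disparity = np.array([[35.0, 70.0],
                      [0.0, 17.5]])
depth_m = np.where(disparity > 0, f_px * baseline_m / disparity, np.inf)
# e.g. a 35-pixel disparity gives 700 * 0.12 / 35 = 2.4 m.
```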
Apart from stereo vision, there are other methods for depth estimation, including
structured light, time-of-flight, and monocular depth estimation using a single
camera. Monocular depth estimation relies on various cues, such as texture, motion,
perspective, and object size, to infer depth information. Deep learning-based
approaches, especially convolutional neural networks (CNNs), have shown
promising results in monocular depth estimation by learning from large-scale
datasets.
Depth Estimation
Multi-camera views refer to the use of multiple cameras positioned at different
locations or angles to capture a scene simultaneously. By combining the views from
multiple cameras, it becomes possible to obtain a more comprehensive
understanding of the scene, including depth information and different perspectives.
Here are some key points about multi-camera views:
Enhanced Coverage: With multiple cameras, it is possible to cover a larger area of
the scene compared to a single camera. Each camera can capture a different portion
or angle of the scene, providing a wider field of view.
Improved Depth Perception: By utilizing multiple cameras, depth information can
be extracted using techniques like stereo vision or structure from motion. By
comparing the views from different cameras, it becomes possible to estimate the
depth of objects in the scene, enabling 3D reconstruction and depth-based
applications.
Redundancy and Robustness: Having multiple camera views provides redundancy
in capturing the scene. If one camera fails or its view is obstructed, other cameras
can still provide information about the scene. This redundancy enhances the
robustness and reliability of the system.
Multi Camera View
Viewpoint Diversity: Each camera in a multi-camera setup can have a different
perspective or viewpoint of the scene. This diversity of viewpoints can be beneficial
for various applications, such as object tracking, activity recognition, or scene
understanding. By combining different perspectives, a more comprehensive
representation of the scene can be obtained.
Multi-Modal Information: Multi-camera views can also capture different
modalities of the scene, such as visible light, infrared, depth sensors, or thermal
imaging. By combining these different modalities, richer and more detailed
information about the scene can be obtained, leading to improved understanding and
analysis.
Applications of multi-camera views include surveillance systems, autonomous
vehicles, virtual reality, augmented reality, robotics, sports analysis, and many more.
The synchronized and coordinated use of multiple cameras enables a deeper
understanding of the scene, enhances accuracy and robustness, and opens up new
possibilities in computer vision and imaging applications.
Multi Camera View
Multi-camera views refer to the use of multiple cameras to capture different
perspectives simultaneously. These multiple camera angles are then often edited
together to create a dynamic and engaging visual experience for the audience. Each
camera provides a unique perspective, allowing viewers to see different angles,
details, and reactions.
Multi-camera setups are commonly used in various media productions, including
television shows, live events, sports broadcasts, and films. Here are some key
perspectives achieved through multi-camera views:
Wide Shots: A wide shot provides an overall view of the scene, capturing the entire
set or location. It establishes the context, shows the spatial relationships between
characters or objects, and sets the stage for more detailed shots.
Medium Shots: Medium shots focus on characters or objects from a medium
distance. They offer a balanced view, showing the subject from the waist up or from
the knees up. Medium shots are often used for dialogue scenes and allow viewers to
see facial expressions and body language.
Perspective
Close-ups: Close-up shots zoom in on a specific subject, such as a person's face or
an object. They highlight details and emotions, creating an intimate connection
between the viewer and the subject. Close-ups are particularly effective for
conveying emotions or emphasizing important story elements.
Over-the-Shoulder Shots: Over-the-shoulder shots are commonly used in dialogue
scenes. They capture the back of one person's shoulder and part of their head, with
the main focus on the person they are facing. This perspective provides a sense of
depth and helps viewers feel like they are part of the conversation.
Reaction Shots: Reaction shots capture the emotional responses or reactions of
characters to a particular event or dialogue. They are usually close-ups of a
character's face, emphasizing their expressions and adding depth to the scene.
Point-of-View Shots: Point-of-view shots provide the audience with the perspective
of a particular character. The camera becomes the character's eyes, showing what
they see and their subjective experience of the situation. These shots can create a
sense of immersion and empathy.
Perspective
By combining and switching between these different camera perspectives, directors
and editors can create engaging visual narratives that enhance the storytelling
experience. Multi-camera views provide flexibility in post-production, allowing for
the selection of the best shots and angles to convey the intended message and evoke
the desired emotions from the audience.
Perspective
Binocular stereopsis is the ability of humans (and some animals) to perceive depth
and three-dimensional structure by exploiting the binocular disparity that results
from having two horizontally separated eyes. Each eye captures a slightly different
view of the world, and the brain fuses these two images into a single percept with a
vivid sense of depth.
The process of binocular stereopsis involves several steps:
Binocular Disparity: Binocular disparity refers to the differences in the retinal
images between the two eyes. Because the eyes are horizontally separated, they
receive slightly different perspectives of the same scene. These disparities are due to
the parallax effect and provide important depth cues.
 The parallax effect is a phenomenon that occurs due to the displacement or
difference in the apparent position of an object when viewed from different
angles. It is a visual cue that helps perceive depth and distance in a scene.
Binocular Stereopsis
 The parallax effect is closely related to binocular disparity, which is the primary
mechanism behind binocular stereopsis (the ability to perceive depth using two
eyes). When we view objects with binocular vision, each eye has a slightly
different perspective, resulting in a disparity between the images captured by
each eye. The brain processes these disparities to compute depth information and
create a perception of three-dimensional space.
Here's an example to illustrate the parallax effect:
Hold your finger in front of your face and look at it first with your left eye and then
with your right eye, alternating between the two. You will notice that the finger
appears to shift its position relative to the background. This apparent shift is the
parallax effect in action. The amount of shift or displacement is greater when the
object is closer to you and smaller when it is farther away.
Binocular Stereopsis
Correspondence Matching: The brain's visual processing system compares the
images from each eye and matches corresponding points or features between them. It
searches for similar patterns, textures, or edges in both images to establish
correspondences.
Binocular Stereopsis
Disparity Calculation: Once the corresponding points are identified, the brain
measures the horizontal displacement, or disparity, between them. The magnitude of
the disparity depends on the object's distance from the observer.
Depth Perception: By analyzing the magnitude of the disparity, the brain estimates
the relative depth of objects in the visual scene. Objects that appear closer will have
a larger disparity, while objects farther away will have a smaller disparity.
Binocular Stereopsis
Fusion and 3D Perception: The brain combines the information from both eyes,
integrating the two slightly different perspectives into a single perception. This
fusion of the images creates the perception of depth, allowing us to see the world in
three dimensions.
Binocular stereopsis is an important component of human vision and provides us
with valuable depth cues, allowing us to navigate and interact with the environment
effectively. It enables us to judge distances, perceive the relative positions of objects,
and experience a sense of depth and solidity in our visual perception.
In addition to human vision, binocular stereopsis has applications in fields such as
computer vision and robotics. By using stereo cameras or other depth-sensing
techniques, machines can replicate the principles of binocular stereopsis to perceive
depth and reconstruct three-dimensional representations of the world around them.
Binocular Stereopsis
Camera geometry refers to the mathematical and physical properties that describe the
behavior and characteristics of a camera. It encompasses both intrinsic and extrinsic
parameters that define how the camera captures and projects the 3D world onto a 2D
image.
Intrinsic Parameters: Intrinsic parameters are internal to the camera and define its
internal optical characteristics. These parameters include:
Camera Geometry
 Focal Length: The focal length determines the camera's field of view and the
degree of magnification. It represents the distance between the camera's lens and the
image sensor when the subject is in focus.
 Principal Point: The principal point is the point where the optical axis intersects
the image plane; it serves as the optical center of the image.
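The focal length and principal point are commonly collected into the intrinsic matrix K of the pinhole model. A minimal sketch is shown below; the numerical values are purely illustrative, not calibration results.

```python
import numpy as np

fx, fy = 800.0, 800.0   # focal lengths in pixels (illustrative values)
cx, cy = 320.0, 240.0   # principal point, roughly the image center (illustrative)

# Intrinsic matrix: maps 3D points in camera coordinates to homogeneous pixels.
K = np.array([
    [fx, 0.0, cx],
    [0.0, fy, cy],
    [0.0, 0.0, 1.0],
])

# Projecting a 3D point (X, Y, Z) given in camera coordinates to pixel coordinates.
point_3d = np.array([0.1, -0.05, 2.0])
u, v, w = K @ point_3d
print(u / w, v / w)   # pixel coordinates after perspective division
```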
 Lens Distortion: Lens distortion refers to the imperfections in the camera lens
that can cause image distortions. Common types of distortion include radial
distortion (barrel or pincushion distortion) and tangential distortion.
Camera Geometry
Tangential Distortion: Tangential distortion is a different type of distortion that
occurs due to misalignments or irregularities in the lens elements. It causes the
image to appear skewed or stretched asymmetrically, typically in a non-linear
manner. Tangential distortion can result from factors such as slight tilting or
displacement of the lens elements or inconsistencies in lens manufacturing.
Radial Distortion: Radial distortion refers to the distortion that occurs when straight lines
near the edges of an image appear curved or bent. It is caused by imperfections in the lens that
cause light rays to refract differently depending on their distance from the center of the lens.
Radial distortion is typically classified into two subtypes:
Camera Geometry
Barrel Distortion: Barrel distortion causes straight lines to bow outward,
resembling the shape of a barrel. It occurs when magnification decreases toward the
edges of the image, i.e., the center is magnified more than the outer portions. This
distortion is commonly observed in wide-angle lenses.
Pincushion Distortion: Pincushion distortion causes straight lines to bend inward,
resembling the shape of a pincushion. It occurs when magnification increases toward
the edges, i.e., the outer portions are magnified more than the center. Pincushion
distortion is often observed in telephoto lenses.
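When the distortion coefficients are known (typically from calibration), both radial and tangential distortion can be removed in software. The sketch below uses OpenCV; the intrinsic matrix and coefficient values are placeholders, not real calibration data.

```python
import cv2
import numpy as np

# Intrinsic matrix and distortion coefficients (k1, k2, p1, p2, k3) are placeholders;
# in practice they come from camera calibration.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
dist = np.array([-0.25, 0.08, 0.001, 0.0005, 0.0])  # radial + tangential terms

img = cv2.imread("distorted.png")          # hypothetical input image
undistorted = cv2.undistort(img, K, dist)  # straight lines become straight again
```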
Extrinsic Parameters: Extrinsic parameters describe the position and orientation of
the camera in the 3D world. These parameters include:
Camera Center: The camera center, also known as the optical center or camera
position, is the 3D location of the camera's center of projection, the point through
which all viewing rays pass.
Camera Pose: The camera pose describes the position (translation) and orientation
(rotation) of the camera relative to a reference coordinate system.
Projection Model: The projection model defines how the 3D world is projected
onto the 2D image plane. The most common projection model used is the pinhole
camera model, which assumes a perspective projection. It assumes that light rays
pass through a single point (pinhole) in the camera and project onto the image plane.
Camera Calibration: Camera calibration is the process of determining the intrinsic
and extrinsic parameters of a camera. It involves capturing calibration images with
known calibration patterns, such as a chessboard, and using mathematical algorithms
to estimate the camera parameters.
Camera Geometry
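A minimal calibration sketch using OpenCV's chessboard routine is given below. The board dimensions, square size, and image folder are assumptions made for illustration; the procedure itself (detect corners in several views, then solve for the parameters) follows the description above.

```python
import cv2
import numpy as np
import glob

pattern = (9, 6)      # inner corners of the chessboard (assumed)
square_size = 0.025   # 25 mm squares (assumed)

# 3D coordinates of the corners in the board's own frame (the board lies at Z = 0).
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square_size

obj_points, img_points = [], []
for path in glob.glob("calib/*.png"):   # hypothetical calibration images
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Estimates the intrinsic matrix K, distortion coefficients, and per-image pose
# (rotation and translation of the board relative to the camera).
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
```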
Understanding camera geometry and its parameters is crucial for various
applications, including computer vision, 3D reconstruction, camera calibration,
augmented reality, and robotics. By accurately modeling the camera's behavior, it
becomes possible to interpret and manipulate images and accurately estimate the
position and geometry of objects in the 3D world.
Camera Geometry
Epipolar geometry is a fundamental concept in computer vision and stereo imaging
that describes the geometric relationship between two camera views observing the
same scene. It provides constraints on the possible locations of corresponding points
in the two images, enabling depth estimation and 3D reconstruction.
The key elements of epipolar geometry include:
Epipolar Geometry
Epipole: The epipole is the point where the projection of one camera's center falls
on the image plane of the other camera. Equivalently, it is the intersection of the
baseline (the line connecting the two camera centers) with the image plane. Each
camera has its own epipole in the other camera's image. Here 𝑒𝑙 and 𝑒𝑟 denote the
epipoles of the left and right images, respectively.
Epipolar Plane: The epipolar plane is a 3D plane that contains the baseline (the line
connecting the camera centers) and any point in the 3D scene. It represents the
possible locations of corresponding points in the two camera views.
Epipolar Geometry
Epipolar Line: The epipolar line is the straight line of intersection of the epipolar
plane with the image plane. It is the image, in one camera, of the ray passing through
the other camera's optical center and image point. All epipolar lines in an image pass
through that image's epipole.
Epipolar Geometry
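In practice, this geometry is captured by the fundamental matrix F, which satisfies the epipolar constraint x_rᵀ F x_l = 0 for corresponding points, and which can be estimated from point matches (RANSAC is commonly used to reject mismatches). The sketch below uses OpenCV; the correspondences are synthetic placeholders standing in for the output of a real feature matcher.

```python
import cv2
import numpy as np

# Synthetic placeholder correspondences: points in the left image and the same
# points shifted horizontally in the right image (as in a rectified stereo pair).
pts_left = (np.random.rand(30, 2) * [600, 440]).astype(np.float32) + 20
pts_right = pts_left.copy()
pts_right[:, 0] -= 15.0   # simulate a purely horizontal disparity

# Fundamental matrix F satisfies x_r^T F x_l = 0; RANSAC discards outliers.
F, inlier_mask = cv2.findFundamentalMat(pts_left, pts_right, cv2.FM_RANSAC)

# Epipolar lines in the right image corresponding to points in the left image;
# each line is returned as coefficients (a, b, c) of a*x + b*y + c = 0.
lines_right = cv2.computeCorrespondEpilines(pts_left.reshape(-1, 1, 2), 1, F)
```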