Data-driven Ophthalmology
Introduction
●
The purpose of this presentation is to
provide a light visual literature
review on “big data” or deep
learning / artificial intelligence
solutions to come for
ophthalmology and vision sciences.
– More with an idea to introduce
topics that you might not have
thought of before, without
going too deeply into details
Some of the background needed to
understand this presentation
better is covered in my previous
presentation →
●
The presentation itself is quite dense,
and better suited to being read on
a tablet/desktop rather than as a
slideshow projected somewhere
Shallow introduction for Deep Learning Retinal
Image Analysis
Published on Aug 20, 2016
https://www.slideshare.net/PetteriTeikariPhD/shallow-introduction
-for-deep-learning-retinal-image-analysis
“Old-school” unimodal model
Image classification for retinal pathologies
Ophthalmic IMAGING 2D Fundus → 3D OCT
Examples of color and high-dynamic-range (HDR) disc
photographs of 2 normal controls (a, b and c, d) and 2
glaucoma patients (e, f and g, h). Left column (a, c, e,
and g): color disc photographs; right column (b, d, f,
and h): HDR concept disc photographs.
https://doi.org/10.1155/2017/8209270
Linear-scale adaptive optics (AO)-Optical Coherence Tomography (OCT) volume acquired with three different AO focus depths (RNFL, OPL, and
IS/OS) and combined to display the appearance of retinal layers in AO-OCT images. En face images are projections of the subvolumes shown in
the middle, demonstrating the fine depth-sectioning ability of AO-OCT. (Jonnal et al., 2016)
Optical Coherence Tomography (OCT) and its variants, the de facto standard for eye diagnostics
Multispectral imaging going beyond RGB channels and laser-based OCTs (Figure from Annidis)
Ophthalmic IMAGING (A)SLO and multimodal systems
(2015) https://doi.org/10.1364/BOE.6.001407
(2016) https://doi.org/10.1364/BOE.7.001783
https://doi.org/10.1007/s00417-016-3361-7
Fundus autofluorescence, microperimetry and
hyperreflective intraretinal spots (HRS) analysis using OCT
Ophthalmic IMAGING Functional Imaging
http://dx.doi.org/10.1167/iovs.16-21389
http://dx.doi.org/10.1167/iovs.16-20598
Model of the retinal vasculature represented by a binary tree. The
vessels bifurcate in a dichotomous manner except for the
precapillaries, which are the point of origin of four capillaries. Adapted
from Takahashi et al. (2009)
http://dx.doi.org/10.1111/aos.13365
http://dx.doi.org/10.1080/02713683.2016.1217544
KEYWORDS: Hyperspectral retinal camera, primary open-angle glaucoma, retinal oxygen saturation
http://dx.doi.org/10.1167/iovs.13-12124
The average arteriolar (left) and venular (right) OD values at each given (5-nm) imaged wavelength
from 500 to 600 nm for all of the volunteers.
In summary, this article has described a novel hyperspectral prototype for
spectral imaging of the retina that can potentially be used in the future to
acquire retinal vessel blood oxygen saturation values. By considering the
limitations of ocular imaging encountered by other retinal oximetry studies,
namely longer acquisition and exposure times, flash exposure, and limited
wavelength intervals, this new instrument may be promising for acquiring more
refined and faster nonflash retinal oximetry measurements in vivo that can
potentially be applied to human retinal vascular disease.
Ophthalmic IMAGING portable imaging
Human Factor and Usability Testing of a
Binocular OCT System - EASE Study
Reena Chopra 1, Padraig J. Mulholland 1,2, Adam M. Dubis 1, Roger S. Anderson 1,2, Pearse A. Keane 1
1 NIHR Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS
Foundation Trust and UCL Institute of Ophthalmology, London, United Kingdom; 2 Optometry
and Vision Science Research Group, School of Biomedical Sciences, Ulster University,
Coleraine, Northern Ireland, United Kingdom
Automated quantitative pupillometry using the Binocular OCT
Purpose: A prototype binocular optical coherence
tomography (OCT) device has recently been developed that
performs ‘whole-eye’ OCT imaging in an automated manner
(Envision Diagnostics, Inc. USA). The inclusion of ‘smart
technology’ such as customizable display screens and voice
recognition also permits the quantitative assessment of
visual acuity (VA), visual fields, ocular motility, and
pupillometry (Fig. 1). As this device will primarily be used in
elderly and visually impaired populations, we performed
prospective usability testing of an early prototype with a
view to predicting function in a clinical setting and to
identifying any potential user errors – EASE Study
(ClinicalTrials.gov Identifier: NCT02822612).
ARVO 2017 Annual Meeting Abstracts
Session 516: Advancements in OCT
Ophthalmologica 2017;238:89-99. https://doi.org/10.1159/000475773
http://dx.doi.org/10.15761/NFO.1000102
Fundus Photography in the 21st Century—A Review of Recent Technological Advances
and Their Implications for Worldwide Healthcare
Panwar Nishtha, Huang Philemon, Lee Jiaying, Keane Pearse A., Chuan Tjin Swee, Richhariya Ashutosh, Teoh
Stephen, Lim Tock Han, and Agrawal Rupesh. Telemedicine and e-Health. March 2016, 22(3): 198-208.
https://doi.org/10.1089/tmj.2015.0068
iCam, 3nethra, CenterVue, iOptics EasyScan, Topcon TRC-NW8FPLUS, Zeiss Visucam 200, Kowa Nonmyd7, Canon CR-2, Oculus Imagecam,
iExaminer, PanOptic, Volk Pictor, VersaCam, JedMed Horus Scope, Optomed Smartscope, Kowa Genesis-D, Riester, Ocular Cellscope, PEEK,
dEye
Retinal Layer Segmentation Pathological retinas still challenging
https://arxiv.org/abs/1704.02161
https://arxiv.org/abs/1707.04931
Branch Residual U-Network (BRU-net)
https://doi.org/10.1364/BOE.8.003292
https://doi.org/10.1364/BOE.8.001926
Voxeleron Awarded NIH SBIR
Grant for Device-independent
Retinal OCT Image Analysis
Software
February 8, 2017 Daniel Russakoff
Voxeleron will collaborate with Professor Pablo
Villoslada of UCSF/IDIBAPS and Dr. Pearse Keane of
Moorfields Eye Hospital to validate the algorithms
and ensure clinical utility.
https://www.voxeleron.com/orion/
Vascular segmentation
http://dx.doi.org/10.1136/bmjophth-2016-000032
https://doi.org/10.1007/978-3-319-59876-5_56
https://doi.org/10.1007/s10916-017-0719-2
https://arxiv.org/abs/1704.03743
Other Retinal segmentation & Detection
Christos Bergeles, Adam M. Dubis, Benjamin Davidson,
Melissa Kasilian, Angelos Kalitzeos, Joseph Carroll, Alfredo
Dubra, Michel Michaelides, and Sebastien Ourselin
Biomedical Optics Express Vol. 8, Issue 6, pp. 3081-3094
(2017) https://doi.org/10.1364/BOE.8.003081
https://arxiv.org/abs/1706.03008
(2017) https://doi.org/10.1109/ISBI.2017.7950704
Suman Sedai, Ruwan Tennakoon, Pallab Roy, Khoa Cao and Rahil Garnavi
IBM Research - Australia, Melbourne, VIC, Australia
The first stage provides a coarse localization of the fovea; the second stage produces an accurate
segmentation of the fovea region.
We present an algorithm that automatically detects cones in
AOSLO split-detection images without supervision. Our
algorithm is among the first to use machine learning to
develop and use a photoreceptor model on the fly. Compared
to Cunefare et al. (2016), specifically, the approach presented
here can tackle both densely and sparsely populated
photoreceptor images, as it is independent of the spatial
arrangement of cones. Further, it introduces contrast
enhancement filters, which improve the quality of low signal-to-
noise ratio (SNR) images.
Optic disc and Cup segmentation or detection
https://arxiv.org/abs/1704.00979
Visual comparison of the predicted results and the correct
segmentation on RIM-ONE v.3 for the optic disc (a)-(c), (g)-(i)
and cup (d)-(f), (j)-(l). In (d)-(f), (j)-(l), the region of the optic disc is
shown as the input image.
https://doi.org/10.1109/TPAMI.2016.2577031
https://arxiv.org/abs/1707.06397
We propose a simple yet
effective method, termed
Deep Descriptor
Transforming (DDT), for
evaluating the correlations of
descriptors and then
obtaining the category-
consistent regions, which can
accurately locate the common
object in a set of unlabeled
images, i.e., unsupervised
object discovery.
IMAGE CLASSIFICATION #1
July–August, 2017 Volume 1, Issue 4, Pages 322–327
Cecilia S. Lee, MD, Doug M. Baughman, BS, Aaron Y. Lee, MD, MSCI
Department of Ophthalmology, University of Washington School of Medicine,
Seattle, Washington. http://dx.doi.org/10.1016/j.oret.2016.12.009
Examples of identification of pathology by the deep learning algorithm. Optical coherence
tomography images showing age-related macular degeneration (AMD) pathology
(A, B, C) are used as input images, and hotspots (D, E, F) are identified using an occlusion
test from the deep learning algorithm. The intensity of the color is determined by the drop
in the probability of being labeled AMD when occluded.
An occlusion test (Zeiler and Fergus, 2014) was performed to identify the
areas contributing most to the neural network's assigning the category of
AMD. A blank 20 × 20-pixel box was systematically moved across every
possible position in the image and the probabilities were recorded. The
highest drop in the probability represents the region of interest that
contributed the highest importance to the deep learning algorithm.
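A minimal sketch of that occlusion test, assuming a generic `predict_fn` that maps an image to a vector of class probabilities (the function name and the stride parameter are hypothetical; the study slid a 20 × 20 box over every position):

```python
import numpy as np

def occlusion_map(image, predict_fn, label_idx, box=20, stride=10):
    """Slide a blank box over the image and record the drop in the
    predicted probability of class `label_idx` (Zeiler & Fergus style).
    `image` is an HxWxC float array; `predict_fn` returns probabilities."""
    h, w = image.shape[:2]
    baseline = predict_fn(image)[label_idx]
    heat = np.zeros(((h - box) // stride + 1, (w - box) // stride + 1))
    for i, y in enumerate(range(0, h - box + 1, stride)):
        for j, x in enumerate(range(0, w - box + 1, stride)):
            occluded = image.copy()
            occluded[y:y + box, x:x + box] = 0.0  # blank out one patch
            heat[i, j] = baseline - predict_fn(occluded)[label_idx]
    return heat  # largest drops mark the regions the classifier relies on
```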
Varun Gulshan, PhD1; Lily Peng, MD, PhD1; Marc Coram, PhD1; et al
JAMA. 2016;316(22):2402-2410. doi:10.1001/jama.2016.17216
Validation Set Performance for All-Cause Referable Diabetic
Retinopathy in the EyePACS-1 Data Set (9946 Images) Performance of
the algorithm (black curve) and ophthalmologists (colored circles) for all-cause
referable diabetic retinopathy, defined as moderate or worse diabetic
retinopathy, diabetic macular edema, or ungradable image. The black
diamonds highlight the performance of the algorithm at the high-sensitivity
and high-specificity operating points.
IMAGE CLASSIFICATION #2
Stefanos Apostolopoulos, Carlos Ciller, Sandro I. De Zanet,
Sebastian Wolf, Raphael Sznitman
https://arxiv.org/abs/1610.03628
Ahmed ElTanboly, Marwa Ismail, Ahmed Shalaby, Andy Switala, Ayman El-Baz, Shlomit
Schaal, Georgy Gimel’farb, Magdi El-Azab
First published: 17 March 2017
DOI: 10.1002/mp.12071
https://doi.org/10.1146/annurev-bioeng-071516-044442
IMAGE Quality in image classification
Image Restoration: From Sparse and Low-rank Priors to Deep Priors
Learning Deep CNN Denoiser Prior for Image Restoration
Lei Zhang, Wangmeng Zuo
The Hong Kong Polytechnic University, Harbin Institute of Technology
Example performance of quality-resilient networks on various quality
distortions. This table shows the class prediction for an image under several
different types of distortions (from top to bottom: clean, Gaussian noise, and
Gaussian blur). The original VGG16 network (M_clean) fails on distorted images.
Networks fine-tuned on different types of distortions perform well on that
particular distortion, but not on other distortion types (M_noise and M_blur). Our
mixture-of-experts based model (M_mix) performs well over all distortion types as
well as on the original clean image.
https://arxiv.org/abs/1703.08119
https://arxiv.org/abs/1611.05760
State-of-the-art image classification networks like VGG-16 perform poorly on blurred input (left),
when using model weights trained on high-quality sharp image datasets (center). While
they often make erroneous predictions in terms of the most likely classes for a blurred image, they
do so with lower confidence—producing distributions that are higher-entropy than those for sharp
images. This drop in performance is largely an artifact of being trained without any
blurred examples. We find that fine-tuning the model on a mix of blurred and sharp images
for just a few epochs allows it to perform well on both sharp and blurred inputs (right).
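A minimal sketch of that remedy, assuming a PyTorch/torchvision setup (the kernel size, sigma range, and learning rate are illustrative, not the paper's exact recipe):

```python
import torch
import torchvision
from torchvision import transforms

# Each training image is blurred with probability 0.5, so the network
# sees a mix of sharp and Gaussian-blurred inputs during fine-tuning.
train_tf = transforms.Compose([
    transforms.RandomApply(
        [transforms.GaussianBlur(kernel_size=9, sigma=(0.5, 3.0))], p=0.5),
    transforms.ToTensor(),
])

# Fine-tune a pretrained classifier for a few epochs at a low learning rate.
model = torchvision.models.vgg16(weights="IMAGENET1K_V1")
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
```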
IMAGE Restoration enhancement
Deep Bilateral Learning for Real-Time Image Enhancement
MICHAËL GHARBI, MIT CSAIL; JIAWEN CHEN, Google Research; JONATHAN T. BARRON,
Google Research; SAMUEL W. HASINOFF, Google Research; FRÉDO DURAND, MIT
CSAIL / Inria, Université Côte d’Azur, http://dx.doi.org/10.1145/3072959.3073592
https://arxiv.org/abs/1707.02880
Our novel neural network architecture can reproduce sophisticated
image enhancements with inference running in real time at full HD
resolution on mobile devices. It can not only be used to dramatically
accelerate reference implementations, but can also learn subjective
effects from human retouching.
Image Restoration: From Sparse and Low-rank Priors to Deep Priors
Lei Zhang, Wangmeng Zuo
The Hong Kong Polytechnic University, Harbin Institute of Technology
https://arxiv.org/abs/1704.03264
Kai Zhang ; Wangmeng Zuo ; Yunjin Chen ; Deyu Meng ; Lei Zhang
https://doi.org/10.1109/TIP.2017.2662206
An example showing the capacity of our proposed model for three different tasks (denoising, super-resolution, JPEG image deblocking). The input image is composed of noisy images with noise level 15
(upper left) and 25 (lower left), bicubically interpolated low-resolution images with upscaling factors 2 (upper middle) and 3 (lower middle), and JPEG images with quality factors 10 (upper right) and 30 (lower right).
Note that the white lines in the input image are just used for distinguishing the six regions, and the residual image is normalized into the range of [0, 1] for visualization. Even though the input image is corrupted with
different distortions in different regions, the restored image looks natural and does not have obvious artifacts.
IMAGE CLASSIFICATION Jointly with image restoration
https://arxiv.org/abs/1706.04284
https://arxiv.org/abs/1701.06487
(a) The whole ground truth image 0051x4 from the DIV2K dataset. We show the
comparison of the zoom-in region between: (b) the ground truth; (c) the noisy image
with i.i.d. Gaussian noise of zero mean and σ = 30; (d) the denoised image by BM3D;
the denoising result of our proposed denoising network (e) without the guidance of
high-level vision information; (f) with the guidance of high-level vision information
Our experimental results demonstrate that the proposed architecture
not only yields superior image denoising results preserving fine
details, but also overcomes the performance degradation of different
high-level vision tasks, e.g., image classification and semantic
segmentation, due to image noise or artifacts caused by conventional
denoising approaches such as over-smoothing.
We propose a novel end-to-end differentiable architecture for joint denoising,
deblurring, and classification that makes classification robust to realistic noise and
blur. The proposed architecture dramatically improves the accuracy of a
classification network in low light and other challenging conditions,
outperforming alternative approaches such as retraining the network on noisy and
blurry images and preprocessing raw sensor inputs with conventional denoising
and deblurring algorithms
UNCERTAINTY in image enhancement
https://arxiv.org/abs/1705.00664
In this work, we investigate the value of uncertainty modelling in 3D super-
resolution with convolutional neural networks (CNNs). However, the highly ill-
posed nature of such problems results in inevitable ambiguity in the learning of
networks. We propose to account for intrinsic uncertainty through a per-patch
heteroscedastic noise model and for parameter uncertainty through approximate
Bayesian inference in the form of variational dropout. We demonstrate through
experiments on both healthy and pathological brains the potential utility of such
an uncertainty measure in the risk assessment of the super-resolved images for
subsequent clinical use.
This paper proposes a new implementation of a supervised image quality
enhancement method referred to as Bayesian image quality transfer (IQT) via
CNNs. This involves two key innovations in CNN-based models: 1) we extend
the subpixel CNNs previously limited to 2D images to 3D volumes,
outperforming previous models in accuracy and speed on a DTI SR task; 2)
we devise new architectures enabling estimates of different components of
the uncertainty in the SR mapping.
Data-driven Ophthalmology
Sparsity and Model compressibility
We thoroughly explored the granularity of sparsity with experiments on the detailed
accuracy-density relationship. Due to the advantage of index saving, coarse-grained
pruning is able to achieve a higher model compression ratio, which is desirable for mobile
implementation. We also analyzed the hardware implementation advantages and show
that coarse-grained sparsity saves ∼2× output memory access compared with fine-
grained sparsity, and ∼3× compared with a dense implementation. Given the advantages
of simplicity and efficiency from a hardware perspective, coarse-grained sparsity enables
more efficient hardware architecture design of deep neural networks.
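A toy numpy sketch of the two granularities being compared (magnitude-based pruning; the shapes and the density parameter are illustrative):

```python
import numpy as np

def fine_grained_prune(w, density):
    """Keep the largest-magnitude individual weights: every surviving
    weight needs its own index, hence the larger index overhead."""
    k = max(1, int(w.size * density))
    thresh = np.sort(np.abs(w), axis=None)[-k]
    return w * (np.abs(w) >= thresh)

def coarse_grained_prune(w, density):
    """Keep whole output filters (w: [out, in, kh, kw]): one index per
    filter, so the sparse format is much cheaper to store and fetch."""
    norms = np.abs(w).reshape(w.shape[0], -1).sum(axis=1)
    keep = np.argsort(norms)[-max(1, int(w.shape[0] * density)):]
    mask = np.zeros(w.shape[0], dtype=bool)
    mask[keep] = True
    return w * mask[:, None, None, None]
```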
Towards multimodal models
Combining structural and functional data
Future of OCT and retinal biomarkers
From Schmidt-Erfurth et al. (2016): “The therapeutic efficacy of VEGF inhibition in combination with the potential of
OCT-based quantitative biomarkers to guide individualized treatment may shift the medical need from CNV treatment
towards other and/or additional treatment modalities. Future therapeutic approaches will likely focus on early and/or
disease-modifying interventions aiming to protect the functional and structural integrity of the morphologic complex
that is primarily affected in AMD, i.e. the choriocapillary - RPE – photoreceptor unit. Obviously, new biomarkers
tailored towards early detection of the specific changes in this functional unit will be required as well as follow-up
features defining the optimal therapeutic goal during extended therapy, i.e. life-long in neovascular AMD. Three novel
additions to the OCT armamentarium are particularly promising in their capability to identify the biomarkers of the
future:”
Polarization-sensitive OCT | OCT angiography | Adaptive optics imaging
“this modality is particularly appropriate to highlight early
features during the pathophysiological development of
neovascular AMD
Findings from studies using adaptive optics implied that
decreased photoreceptor function in early AMD may be
possible, suggesting that eyes with pseudodrusen appearance
may experience decreased retinal (particularly scotopic) function
in AMD independent of CNV or RPE atrophy.”
“...the specific patterns of RPE plasticity
including RPE atrophy, hypertrophy, and
migration can be assessed and quantified.
Moreover, polarization-sensitive OCT allows
precise quantification of RPE-driven disease
at the early stage of drusen”,
“Angiographic OCT with its potential
to capture choriocapillary, RPE, and
neuroretinal features provides novel
types of biomarkers identifying
disease pathophysiology rather than
late consecutive features during
advanced neovascular AMD.”
Schlanitz et al. (2011)
zmpbmt.meduniwien.ac.at
See also Leitgeb et al. (2014)
Zayit-Soudry et al. (2013)
Multimodal models in general in medicine
https://dx.doi.org/10.1097%2FWCO.0000000000000460
Imaging plus X: multimodal models of neurodegenerative disease
Neil P. Oxtoby and Daniel C. Alexander, for the EuroPOND consortium
Old paradigm disease progression models. (a) The hypothetical
model of Jack et al. (2010), which illustrates qualitative sigmoid evolution in
AD of scalar biomarkers such as CSF Aβ level, cognitive test scores and
hippocampal volume or atrophy. The lack of quantitative information prevents
direct diagnostic usage. (b) A traditional longitudinal model of AD
atrophy (Scahill et al., 2002), built by binning individuals a priori into ‘mild’,
‘moderate’ and ‘severe’ classes based on cognitive test scores. The model
can potentially match new individuals to the same stages using imaging data,
but must exclude cognitive scores to avoid circularity. AD, Alzheimer's disease.
The temporally continuous self-modelling regression approach of Jedynak et al. (2012).
The model shows the characteristic trajectories of a diverse set of biomarkers against a
common continuous disease stage variable learned from the ADNI and PAQUID (Personnes
Agées Quid) data sets. The model can potentially estimate the disease stage of a new
patient by identifying the position along the trajectory set that best matches their data.
ADNI, Alzheimer's disease neuroimaging initiative.
We have reviewed data-driven model-based analyses of neurodegenerative disease. We have argued the
potential for generative data-driven models to take centre stage in the study and management of
neurodegenerative diseases if we are to generate new avenues for disease understanding in the earliest,
preclinical stages. This is necessitated by the challenges in monitoring any neurological disease over its
full time course, coupled with overlapping phenotypes and lack of a single biomarker that is dynamic
across the full disease time course.
The main focus of development and application to date has been in Alzheimer's disease, but various efforts
including the EuroPOND project are expanding the application to other dementias, multiple sclerosis, prion
diseases, normal ageing and development, and even non-brain applications. These techniques have the
potential for widespread impact in realising precision medicine across many such domains.
Retina as deep learning network
LIGHT → Photoreceptor layer → Horizontal cells → Bipolar cells → Amacrine cells → Ganglion cell layer → BRAIN
DL Layer 1 → DL Layer 2 → DL Layer 3 → DL Layer 4 → DL Layer 5
With enough data, we can use a densely
connected (i.e. every layer is connected to
every other layer) feedforward network (or
even a recurrent one), not having to constrain the
network, as all the modulatory pathways are
not well known
https://arxiv.org/abs/1608.06993; Cited by 29
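A minimal PyTorch sketch of such dense connectivity, in the spirit of DenseNet (arXiv:1608.06993); the channel counts are illustrative:

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Every layer receives the concatenation of all earlier feature maps:
    analogous to leaving all the lateral/modulatory connections in place
    and letting training decide which ones matter."""
    def __init__(self, in_ch=16, growth=12, n_layers=5):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Sequential(
                nn.BatchNorm2d(in_ch + i * growth),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_ch + i * growth, growth, 3, padding=1))
            for i in range(n_layers)])

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)
```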
Joint training of all layers with layer-wise targets derived from ERG and pupillometry
OPN4
https://arxiv.org/abs/1409.5185; Cited by 292
For example, glaucoma affects ganglion cell function, whereas retinitis pigmentosa affects photoreceptors
DL - Deep learning
OPN4 - Melanopsin (ipRGC)
Retina (and V1) as deep learning network
DOI: 10.13140/RG.2.2.27438.72003 12/2016, Conference: NIPS 2016 Workshop -
Brains and Bits: Neuroscience Meets Machine Learning,
Riccardo Volpi, Istituto Italiano di Tecnologia; Matteo Zanotto; Diego Sona; Vittorio Murino
International Work-Conference on the Interplay Between Natural and Artificial Computation
IWINAC 2017: Natural and Artificial Computation for Biomedicine and Neuroscience pp 464-472
Towards a Deep Learning Model of Retina: Retinal Neural Encoding of
Color Flash Patterns
Antonio Lozano, Javier Garrigós, J. Javier Martínez, J. Manuel Ferrández, Eduardo Fernández
https://doi.org/10.1007/978-3-319-59740-9_46
https://arxiv.org/abs/1702.01825
Visualizing the internal activity of a CNN
in response to a natural scene stimulus.
(A-C) Time series of the CNN activity
(averaged over space) for the first
convolutional layer (8 units, A), the
second convolutional layer (16 units, B),
and the final predicted response for an
example cell (C, cyan trace). The
recorded (true) response is shown below
the model prediction (C, gray trace) for
comparison. (D) Spatial activation of
example CNN filters at a particular time
point. The selected stimulus frame (top,
grayscale) is represented by parallel
pathways encoding spatial information
in the first (purple) and second (green)
convolutional layers (a subset of the
activation maps is shown for brevity). (E)
Autocorrelation of the temporal activity
in (A-C). The correlation in the recorded
firing rates is shown in gray
https://doi.org/10.1101/120956
Furthermore, the composite nonlinear computation performed by retinal
circuitry corresponds to a boolean OR function applied to bipolar cell feature
detectors. Our general computational framework may aid in extracting
principles of nonlinear hierarchical sensory processing across diverse
modalities from limited data.
https://arxiv.org/abs/1706.06208
Retina Model synthesis as Deep learning architecture
Indirect inference on retinal circuit: hard to record every intermediate step in humans
INPUT
Light
OUTPUT
Pupil size
McDougal and Gamlin 2008
AUXILIARY OUTPUT
functional MRI (fMRI)
Temporal transfer functions for the postreceptoral cone
pathways. Spitschan et al. (2016). See also Hung et al. (2016). The original responses from the achromatic luminance experiments and their
derived PCA waveforms. The results of the component analysis illustrate that the
pupil response can be described quite well as a linear sum of a sustained and a
transient component. - Young et al. (1993)
Maynard et al. (2015)
INTERMEDIATE
OUTPUT
Electroretinography (ERG)
(left) Proposed neural pathways and synaptic mechanisms underlying ipRGC
influence on light adaptation (right) M1 ipRGCs modulate the light-adapted
ERG b-wave via D4 dopamine receptors – Prigge et al. (2016)
Multifocal Electroretinogram (UC Davis)
The relative spectral sensitivities of the five
photoreceptors in the human retina, including S-, M-,
L-cones, rods, and ipRGCs (A), LED spectral
distributions (B), and LED chromaticities in 1964
CIE 10° space (C). - Cao et al. (2015)
Deep learning framework for phototransduction studies, and clinical diagnosis decision support systems
Retina Model synthesis Photoreceptor contributions #1: ERG
INPUT
Light
OUTPUT
Pupil size ?
Not done in the study by
Allen et al. (2016)
INTERMEDIATE OUTPUT
Electroretinography (ERG)
Vary the light parameters (intensity, wavelength, modulation) to probe what the 'normal' responses are
either in visual processing/phototransduction in 'basic science' paradigms, or alternatively employ the light
parameters that best discriminate between retinal pathologies.
Note! In an optimally constructed model with more parameters (more explicit retinal circuitry), one could infer all possible outcomes
(pathological or not) from the framework. But in practice we are limited to the data available.
For example, if glaucoma is shown to be detected well using PLR, we could extend that dataset by using the same protocol while
simultaneously recording ERG, visual fields, etc., and then have a more complete model, and then have “good” predictive power with
ERG and visual field alone if PLR is not possible to do.
Rod and cone ERGs over mesopic irradiances. Allen et al. (2016)
Stimulus design and quantification. The output of a three-primary LED light source (peak emission at 354, 460, and 600 nm) was used to generate four spectra, with precise excitation of melanopsin, rod, SWS, and LWS opsins. Allen et al. (2016)
Normalized b-wave amplitudes (G), implicit times (H), and OP amplitudes (I) for light-adapted cone ERGs in Opn1mwR mice for pairs of rod-divergent stimuli (black filled circles are rod/mel-low and gray open circles are rod/mel-high) with stimulus intensity quantified in terms of rod effective photons/cm2/s. - Allen et al. (2016)
We now have the 'pure photoreceptor' response (well,
you know Ray), and if these responses are normal but
the PLR is abnormal, we could assume that the problem is
downstream, giving hints about the given pathology.
ERG Methodological background #1
Bingyao Tan; Erik Mason; Benjamin MacLellan; Kostadinka K. Bizheva
IOVS March 2017, Vol.58, 1673-1681. doi:10.1167/iovs.17-21543
Comparison of the changes in the total axial retinal blood flow (RBF) and the ERG b-wave
magnitude resulting from 200-ms single flash and 1-second, 10 Hz, 20% duty cycle flicker stimuli of
the same illumination intensity. (A) Representative ERG traces. The pink and gray shaded areas mark
the duration of the visual stimuli. Original time recordings of the total axial RBF in response to the
single flash and flicker stimuli.
Pedro Monsalve; Giacinto Triolo; Jonathon Toft-Nielsen; Jorge Bohorquez; Amanda D. Henderson;
Rafael Delgado; Edward Miskiel; Ozcan Ozdamar; William J. Feuer; Vittorio Porciatti
Translational Vision Science & Technology May 2017, Vol.6, 5.
doi:10.1167/tvst.6.3.5
A new PERG method with increased dynamic range allows recording of retinal
ganglion cell function in advanced stages of optic nerve disorders. It also
quantifies the response decline during the test, an autoregulatory
adaptation to metabolic challenge that decreases with age and presence of
disease.
Here we describe a new method for steady-state PERG recording in humans
based on a visual display unit built with Light-Emitting Diode (LED) technology,
skin electrodes, and optimized signal processing to quantify response
adaptation (dubbed PERGx as a contraction of PERGnext). We show that,
compared to a validated method, the PERGx has a very high signal-to-noise ratio
(SNR); this suggests that meaningful responses can be recorded in advanced
stages of diseases such as nonarteritic ischemic optic neuropathy (NAION).
PERGx temporal dynamics and intrinsic variability in a representative
normal subject. (A) The amplitude of PERGx samples (blue circles, 16
consecutive partial averages of 64 epochs each over 2 minutes) progressively
declined (adapted) with a slope of −0.031 μV/sample (R2 = 0.48), whereas the
PERGx phase (red circles) was stationary. (B) Polar diagram displaying combined
amplitude and phase of PERG samples (open black circles) and noise samples
(open grey triangles). The PERG amplitude (1.65 μV) is represented by the
length of vector connecting the origin of the axes with the cluster centroid. The
PERG phase (63.6°) is represented by the angle Φ between the vector and the
x-axis.
ERG Methodological background #2
https://doi.org/10.1007/s10633-017-9593-y
Discrete Wavelet Transform (DWT) analysis applied to the mfERG response from a control (left)
and a patient (right). Top: graphical representation of the 2F-mfERG M-sequence used here
(MOFOFO), with frames displaced in time in order to better correspond visually to the recorded
response. The original signal from one hexagon of the mfERG (waveform inside box on top) can
be decomposed into many frequency levels, depending on the length of the time series. The first
level (1211 Hz) corresponds to high frequencies (noise), while the highest level (11 Hz)
corresponds to the lowest frequencies. For each frequency level, the vertical lines represent
individual wavelet coefficients. For each level, the variance between these coefficients is
computed and subjected to further analysis as the WVA (wavelet variance). Legend: DC direct
component; IC1 first induced component; IC2 second induced component
The entire process of retinal visual processing involves
the phototransduction cascade with different groups of
cells and circuits from the photoreceptors to the
ganglion cells. Thus, electrical signals produced by
different biological structures contribute to the retinal
response of the mfERG that is recorded from the cornea
[Hood et al. (2002); Luo et al. (2011)]
. In the standard mfERG, amplitude
and implicit time are often analyzed [Hood et al. (2012)]
.
Early glaucoma Dilru C Amarasekera BS, Arthur F Resende MD, Michael Waisbourd MD, Sanjeev Puri MD, Marlene R Moster MD,
Lisa A Hark PhD, L Jay Katz MD, Scott J Fudemberg MD, Anand V Mantravadi MD
First published: 20 July 2017 DOI: 10.1111/ceo.13006
Unreliable test results were excluded.
Abbreviations: ss-PERG = Steady-State Pattern Electroretinogram; SD-tVEP = Short-Duration
transient Visual Evoked Potentials; Lc = Low Contrast; Hc = High Contrast; SNR = Signal-to-
Noise Ratio.
Electrophysiological techniques thus play a valuable role in a diagnostic environment dominated by
highly effective tools such as OCT via the addition of an objective functional perspective to the
diagnosis of glaucoma. Although the use of PERG and VEP as a measure of retinal ganglion cell and
visual pathway dysfunction has been established, few studies have measured the potential clinical
utility of the novel rapid testing platform of ss-PERG and SD-tVEP in patients with glaucoma.
ss-PERG was effectively able to discern between glaucomatous and healthy eyes. The diagnostic
ability of ss-PERG was superior to that of SD-tVEP. ss-PERG may thus have a role as a clinically useful
electrophysiological diagnostic tool.
Retina Model synthesis Photoreceptor contributions #2: PLR
INPUT
Light
OUTPUT
Pupil size
INTERMEDIATE OUTPUT
Electroretinography (ERG)?
ERG not done this time
Experimental design. (A, Left) L, M, and S cones and melanopsin-containing
ipRGCs mediate vision at daytime light levels. (Center) Photoreceptor spectral
sensitivities. (Right) Physiological measurements of ipRGCs find excitatory L
and M cone inputs and inhibitory S-cone inputs (12). (B) A digital spectral
integrator produces sinusoidal photoreceptor-directed modulations that pass
through an artificial pupil into the pharmacologically dilated left eye. The
consensual pupil response of the right eye is recorded. (C) Photoreceptor-
directed modulations. Balanced changes in the spectrum of light around a
background spectrum nominally isolate targeted photoreceptors. -
Spitschan et al. (2014)
Group PLR data are well fit by the two-component linear filter model. (A) The mean response across all subjects
(01–16) is shown at 0.05 and 0.5 Hz, for L+M-, melanopsin-, and S-cone-directed modulations. Fit values are
derived from those found for subject 01, with only amplitude parameters adjusted (Table S2). This is because the
average data are available at only two temporal frequencies and do not sufficiently constrain all parameters of the
model. To obtain the average data plotted, amplitudes and phases were averaged separately (i.e., average
amplitude obtained without consideration of phase, average phase obtained without consideration of amplitude).
The model was fit to the data as plotted. (B) Polar-plot representations of the group data with model fit points,
following conventions as in Fig 3. The data are normalized separately for each temporal frequency. Error bars (± 2
SEM across subjects) are smaller than the plot points for the data. -Spitschan et al. (2014)
Now as we are feeding in more data, we are in theory learning how the light parameters should be designed to have the best photoreceptor response isolation, and have representations for the corresponding ERG and PLR responses.
It would also help if all the studies were from humans :P
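The "designed light parameters" idea can be made concrete with the silent-substitution principle behind these photoreceptor-directed modulations: solve a small linear system for primary modulations that excite one receptor class while silencing the others. A hedged numpy sketch with placeholder numbers (the real receptor × primary matrix comes from measured spectral sensitivities and LED spectra):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((4, 5))  # rows: L, M, S cones, melanopsin; cols: 5 LED primaries
target = np.array([0.0, 0.0, 0.0, 1.0])  # silence the cones, drive melanopsin

# Minimum-norm primary modulation that nominally isolates melanopsin
delta_primaries, *_ = np.linalg.lstsq(A, target, rcond=None)
print(A @ delta_primaries)  # ~[0, 0, 0, 1]: nominal photoreceptor isolation
```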
Retina Model synthesis further downstream
INPUT
Light
OUTPUT
Pupil size
INTERMEDIATE OUTPUT
Electroretinography (ERG)
“KNOWN BEHAVIOR”
Auxiliary OUTPUT
dLGN
Build on top of previous models. We “know” how a specific light stimulus is processed by the retina (ERG), and how
this is reflected in pupil behavior (PLR) via the olivary pretectal nucleus (OPN). So, using the same parameters, record the
activity of the LGN, for example, which is nice at least for basic science, not necessarily for pathology screening.
A: LED spectral power densities and in vivo photoreceptor spectral sensitivity
(normalised). The output of blue and yellow LEDs was adjusted to produce
equivalent effects on rods (black line). By contrast, the blue LED, always
appeared brighter for melanopsin (green line). B: Protocol 1. Melanopsin-
isolating steps in dLGN and retina, respectively) presentations of the blue LED
were interleaved with 210 or 180 sec of the (dLGN and retina, respectively)
yellow to produce a ‘step’ visible only to melanopsin. C: Protocol 2. Irradiance
slowly ramped up (0.5 ND per 200 seconds) before remaining at a steady state
for 10 seconds. D: The effective change in photon flux for melanopsin (green)
and rods (black) across a full repeat of Protocol 2. Settings of ND filter at the
point of each melanopsin isolating step are provided above.
- Davis et al. (2015)
INTERMEDIATE OUTPUT #2
Ganglion cell firing rates
Responses to melanopsin-isolating steps and gradual irradiance ramps in retina.
- Davis et al. (2015)
Responses to melanopsin-steps in the dLGN.
- Davis et al. (2015)
Retina Model synthesis
INPUT
Light
INTERMEDIATE OUTPUT
Electroretinography (ERG)
OUTPUT
Pupil size
INTERMEDIATE OUTPUT #2
Ganglion cell firing rates
Auxiliary OUTPUT
dLGN
So now we know how the retina works in a data-driven deep learning sense (no explicit modelling of the retina in the biological
sense). We can heuristically cheat and define the connections as defined from the literature.
So as we feed in data from studies, the interactions between blocks are “automagically” quantified by adjusting the
convolutional weights in the deep learning model. At some point, if we have enough data, we could also start to relax the
circuit constraints and hypothesize that there could be recurrent feedback from dLGN to OPN (controlling pupil size), and do
'blind causality analysis' (Nikola is probably an expert on that)
https://arxiv.org/abs/1601.03610
We have proposed a novel framework for
causal analysis in time-series which does not
require any assumptions about the statistical
relationships among the variables of the study,
i.e., it is model-free.
Our results show that Twitter data polarity
does indeed have a causal impact on the
stock market prices of the examined
companies. Hence, we believe social media
data could represent a valuable source of
information for understanding the dynamics
of stock market movements
http://dx.doi.org/10.1534/genetics.114.165704
Retina Model synthesis Pathologies?
INPUT
Light
INTERMEDIATE OUTPUT
Electroretinography (ERG)
OUTPUT
Pupil size
INTERMEDIATE OUTPUT #2
Ganglion cell firingrates
Auxiliary OUTPUT
dLGN
In the case of glaucoma, one would expect the peripheral retina to be destroyed first
(A) Schematic diagram showing the flash stimulation sequence of
the slow-sequence (slow flickering stimulation, MOOO) multifocal
electroretinogram (mfERG). (B) The first-order kernel of the slow-
sequence mfERG from the central (rings 1 to 2) and peripheral (rings
3 to 6) regions - Chan et al. (2011).
Overlapping visual field test-region layout and luminance
characteristics of the multifocal pupillographic objective
perimetry stimuli for all protocols. - Carle et al. (2014)
Now we can define normal and pathological cases as classes, as you would in typical image classification tasks ('dogs',
'cats', etc.), but instead of just using a single image (whether it be fundus or OCT (SD/SS/AO/Angiography)), we can
combine both the image and the behavioral response for better quantification of the retinal pathology.
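A hedged PyTorch sketch of that idea: a two-branch network that late-fuses an imaging branch with a functional time-series branch (e.g. a PLR trace); all layer sizes are illustrative, not a validated architecture:

```python
import torch
import torch.nn as nn

class MultimodalNet(nn.Module):
    """Fuse an image (fundus/OCT slice) with a 1-D functional response."""
    def __init__(self, n_classes=3):
        super().__init__()
        self.img = nn.Sequential(                      # tiny CNN image branch
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.fn = nn.GRU(input_size=1, hidden_size=32, batch_first=True)
        self.head = nn.Linear(32 + 32, n_classes)      # late fusion

    def forward(self, image, response):                # response: [B, T, 1]
        _, h = self.fn(response)
        return self.head(torch.cat([self.img(image), h[-1]], dim=1))
```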
Retina Model synthesis VISUAL FIELD
An old-school psychophysical functional measure that patients often find stressful
https://doi.org/10.1016/j.ophtha.2017.04.021
De Moraes CG, Hood DC, Thenappan A, Girkin CA, Medeiros FA,
Weinreb RN, Zangwill LM, Liebmann JM.
Central visual field damage seen on the 10-2 test is
often missed with the 24-2 strategy in all groups. This
finding has implications for the diagnosis of glaucoma
and classification of severity.
JAMA Ophthalmol. 2017;135(7):783-788.
doi: 10.1001/jamaophthalmol.2017.1659
JAMA Ophthalmol. 2017;135(7):742-747.
doi: 10.1001/jamaophthalmol.2017.1396
A deep-learning based automatic
glaucoma identification
ARVO 2017: 320 Visual Fields, Vision Function, Psychophysics I
Serife Seda S. Kucur, Mathias Abegg, Sebastian Wolf, Raphael Sznitman.
ARTORG Center, University of Bern, Bern, Switzerland; Department of Ophthalmology,
Inselspital Bern, Bern, Switzerland.
The inherent local and global characteristics of visual fields (VFs)
can be exploited in a strong data-driven sense and could provide
better understanding of VFs with regards to glaucoma. Ultimately,
this may help to efficiently automatize the diagnosis process.
Our hypothesis is that alternative representations of raw VFs, in
terms of different spatial scales, could be learned by computers
using machine learning techniques towards an effective
automatized glaucoma identification task. Accordingly, we present
a Convolutional Neural Network (CNN)-based approach for
classification of VFs as being glaucomatous or non-glaucomatous.
Conclusions: These results support the fact that processing VFs
through a CNN generates different representation of data in
terms of its hidden characteristics and patterns that are efficient
to discriminate between glaucomatous and non-glaucomatous
VFs in an automated way. The performance could be further
improved with a different CNN architecture. The trained CNNs
have the potential to be utilized for glaucoma progression
analysis as well
https://doi.org/10.1016/j.ophtha.2017.01.027
http://dx.doi.org/10.1097/IJG.0000000000000710
Retina Model synthesis beyond retinopathies #1
What to diagnose from the eye, e.g. neurodegenerative diseases such as Alzheimer’s disease
Is the Eye an Extension of the Brain in
Central Nervous System Disease?
Lies De Groef 1,2 and Maria Francesca Cordeiro 1,3,4
Journal of Ocular Pharmacology and Therapeutics. June 2017,
https://doi.org/10.1089/jop.2016.0180
1 Glaucoma and Retinal Neurodegenerative Disease Research Group, Institute of Ophthalmology, University
College London, London, United Kingdom.
2 Neural Circuit Development and Regeneration Research Group, Department of Biology, University of Leuven,
Leuven, Belgium.
3 Western Eye Hospital, Imperial College Healthcare NHS Trust, London, United Kingdom.
4 ICORG, Department of Surgery and Cancer, Imperial College London, London, United Kingdom.
Compilation of examples to illustrate the concept ‘‘the eye as a window to the brain’’.
Typical ocular diseases, such as uveitis, glaucoma, and AMD, have in common several
pathological mechanisms with CNS diseases, for example, MS and AD. Both in vivo and post
mortem examinations of the eye can therefore be used to study the disease mechanisms
underlying these pathologies in the eye and brain. (1) fluorescein angiography; (2)
intraocular pressure measurement (copyright iCare, TonoLab); (3) optical coherence
tomography scan; (4) confocal scanning laser ophthalmoscopy imaging of curcumin-labeled
protein aggregates; (5) retinal oximetry; (6) ZO-1 tight junction immunostaining on
wholemounted retina; (7) transmission electron microscopy image of trabecular meshwork;
(8) Iba-1 microglia immunostaining on retinal section; (9) Brn3a retinal ganglion cell
immunostaining on wholemounted retina; (10) β-amyloid immunostaining on retinal section;
and (11) concanavalin A vessel labeling on wholemounted retina. AD, Alzheimer’s; AMD,
age-related macular degeneration; MS, multiple sclerosis
Front Aging Neurosci. 2017; 9: 214.
Published online 2017 Jul 6. doi: 10.3389/fnagi.2017.00214
The Role of Microglia in Retinal Neurodegeneration:
Alzheimer's Disease, Parkinson, and Glaucoma
Ana I. Ramirez,1,2 Rosa de Hoz,1,2 Elena Salobrar-Garcia,1,3 Juan J. Salazar,1,2 Blanca Rojas,1,3 Daniel Ajoy,1
Inés López-Cuenca,1 Pilar Rojas,1,4 Alberto Triviño,1,3 and José M. Ramírez1,3,*
Front Neurol. 2017; 8: 162.
Published online 2017 May 4. doi: 10.3389/fneur.2017.00162
Retinal Ganglion Cells and Circadian Rhythms in
Alzheimer’s Disease, Parkinson’s Disease, and
Beyond
Chiara La Morgia,1,2,* Fred N. Ross-Cisneros,3 Alfredo A. Sadun,3,4 and Valerio Carelli1,2
Summary of circadian rhythm
abnormalities in AD, PD, and HD.
AD, Alzheimer’s disease; PD, Parkinson’s disease;
HD, Huntington’s disease; IV, intra-daily variability;
IS, inter-daily stability; RA, relative amplitude; BP,
blood pressure; HR, heart rate.
Schematic representation of the hypothetical events
associated with the neuroinflammation in AD (A),
PD (B), and glaucoma (C). AD, Alzheimer's Disease; PD,
Parkinson's Disease; ILM, inner limiting membrane; NFL,
nerve fiber layer; GCL, ganglion cell layer; IPL, inner
plexiform layer; INL, inner nuclear layer; OPL, outer
plexiform layer; ONL, outer nuclear layer; OLM, outer
limiting membrane; PL, photoreceptor layer; RPE, retinal
pigment epithelium; BM, Bruch membrane; C, choroid;
Aβ, beta-amyloid; pTau, phosphorylated tau.
Health Economics for Medical
Startups | Background
Business Models focus
●
Often technical founders focus too much on the technology, and do not achieve
product-market fit
– In medical startups, it is often very useful to do proper health economics
calculations to sell your idea to customers and investors.
●
In other words, how much can your solution make healthcare more efficient
economically while improving the quality of care for the patient.
– Another common problem in the long run is reimbursement, as in most
countries the patient does not fully pay for the healthcare that they
receive, and market access is complicated by varying regulations/policies in
each country.
http://startupheretoronto.com
www.smi-online.co.uk
Business Models Innovations on the model
https://hbr.org/2016/10
Healx: A Case Study
Informed by our business model framework, we advised (and Cambridge
Judge Business School’s business accelerator supported) the tech venture
Healx, which focuses on the treatment of patients with rare diseases in the
emerging field of personalized medicine. A big challenge for pharmaceutical
companies in this domain is that rare-disease markets are very small, so
companies usually have to charge astronomical prices. (One drug, Soliris,
used in the treatment of paroxysmal nocturnal hemoglobinuria, costs about
$500,000 per patient-year.)
Enter Healx, with a platform that leverages big data technology and analytics
across multiple databases owned by various organizations within global life
sciences and health care to efficiently match treatments to rare-disease patients.
Its initial business model hit three of our six key features. First, Healx’s value
proposition was about asset sharing (for example, making available clinical-trial
databases that record the effectiveness of most drugs across therapeutic areas
and diseases, including rare ones). Second, the business promised
more personalization by revealing drugs with high potential for treating the rare
diseases covered. Finally, Healx’s model would, in theory, create a collaborative
ecosystem by bringing together big pharma (which has the treatment and trial
data) and health care providers (which have data about effectiveness and
incompatibility reactions and also personal genome descriptions).
https://healx.io/
More recently, Healx has developed a machine-learning algorithm that can use a
patient’s biological information not only to match drugs to disease symptoms but
also to predict exactly which drug will achieve what level of effectiveness for
that particular patient. The latest version of its business model
brings personalization to the maximum possible level and adds agility, because
the treating clinician—armed with the biological data and the algorithm—can make
better treatment decisions directly with the patient and doesn’t have to rely on
fixed rules of thumb about which of the few available off-label drugs to use. In
this way, Healx is able to support decentralized, real-time, accurate decision
making.
This version of the Healx model has even more transformation potential—it
exhibits four of the six features; it has already generated revenue from
customers; and in the long term it could empower patients by giving them much
more information before they consult a medical practitioner. Although it is still
too early to tell whether that potential will be realized, Healx is clearly a venture
to watch. It has earned a number of prizes (including the 2015 Life Science
Business of the Year and the 2016 Graduate Business of the Year in the
Cambridge cluster) and sizable investments from several global funds.
LOSS Function performance quantification
●
In medical studies, the ROC curve and especially Area Under the Curve (AUC) is used as an
easy scalar to describe the performance of the classifier.
TensorFlow allows direct
optimization of ROC
http://dx.doi.org/10.1093/bib/bbr008
http://arxiv.org/abs/1605.06652
Conclusion: The AUC is an unreliable
measure of screening performance because
in practice the standard deviation of a
screening or diagnostic test in affected and
unaffected individuals can differ. The
problem is avoided by not using the AUC at all,
and instead specifying detection rates
(DRs) for given false positive rates (FPRs) or
FPRs for given DRs.
http://dx.doi.org/10.1177/0969141313517497
http://tflearn.org/objectives/
Mozer, Michael C. "Optimizing classifier
performance via an approximation to
the Wilcoxon-Mann-Whitney statistic."
(2003). aaai.org/Papers
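A minimal sketch of the Wilcoxon-Mann-Whitney style relaxation these references build on: replace the non-differentiable indicator 1[s_pos > s_neg] over all positive/negative score pairs with a sigmoid, giving a surrogate that gradient descent can optimize directly (TensorFlow; the beta parameter is illustrative):

```python
import tensorflow as tf

def soft_auc_loss(scores_pos, scores_neg, beta=10.0):
    """Differentiable surrogate for 1 - AUC: mean sigmoid over all
    pairwise margins between positive and negative example scores."""
    diff = scores_pos[:, None] - scores_neg[None, :]  # all pos/neg pairs
    return 1.0 - tf.reduce_mean(tf.sigmoid(beta * diff))
```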
Front Public Health. 2015; 3: 57.
Published online 2015 Apr 20.
doi: 10.3389/fpubh.2015.00057
PMCID: PMC4403252
Threshold-Free Measures for
Assessing the Performance of
Medical Screening Tests
HEALTH ECONOMICAL Loss function
wikipedia.org
Analogies from churn prediction?
http://dx.doi.org/10.1186/s40165-015-0014-6
“Nevertheless, current state-of-the-art classification algorithms are not well
aligned with commercial goals, in the sense that, the models miss to include
the real financial costs and benefits during the training and evaluation phases.
In the case of churn, evaluating a model based on a traditional measure such
as accuracy or predictive power, does not yield to the best results when
measured by the actual financial cost, ie. investment per subscriber on a
loyalty campaign and the financial impact of failing to detect a real churner
versus wrongly predicting a non-churner as a churner”
What are the economic costs of each block in the contingency table, and how could one optimize for medical economics?
- It is more expensive to have false negatives, as patients will not be diagnosed, both in terms of economic cost and reduced quality of life for the patients
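A hedged numpy sketch of such a health-economic loss: price each cell of the contingency table (the monetary values below are placeholders) and pick the operating threshold that minimizes expected cost rather than maximizing accuracy or AUC:

```python
import numpy as np

COST = {"TP": 50.0, "FP": 200.0, "TN": 0.0, "FN": 5000.0}  # placeholder EUR/patient

def expected_cost(y_true, y_score, thresh):
    """Average per-patient cost at a given decision threshold."""
    y_pred = y_score >= thresh
    tp = np.sum(y_pred & (y_true == 1)); fp = np.sum(y_pred & (y_true == 0))
    fn = np.sum(~y_pred & (y_true == 1)); tn = np.sum(~y_pred & (y_true == 0))
    return (COST["TP"] * tp + COST["FP"] * fp +
            COST["FN"] * fn + COST["TN"] * tn) / len(y_true)

# e.g. best = min(np.linspace(0, 1, 101), key=lambda t: expected_cost(y, s, t))
```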
Health economics models
https://dx.doi.org/10.3310/hta11410
Screening in UK for Glaucoma, NHS Setting
Published: Ann Intern Med. 2013;159(7):484-489
DOI: 10.7326/0003-4819-159-6-201309170-00686
Estimate of needed duration and number of subjects by Steve Kymes needed for
proper health economical study for glaucoma screening program. Presented by John
Boland at “Should we screen for glaucoma?” session at World Glaucoma Congress
2017 in Helsinki, Finland.
Indian J Ophthalmol. 2011 Jan; 59(Suppl1): S24–S30.
doi: 10.4103/0301-4738.73684 PMCID: PMC3038514
Cost-effectiveness of screening for open angle
glaucoma in developed countries
Anja Tuulonen
Clin Ophthalmol. 2017; 11: 337–346.
doi: 10.2147/OPTH.S120398 PMCID: PMC5317344
Cost and detection rate of glaucoma screening
with imaging devices in a primary care center
Alfonso Anton,1,2,3,4 Monica Fallon,3,5 Francesc Cots,2 María A Sebastian,6
Antonio Morilla-Grasa,4 Sergi Mojal,3 and Xavier Castells2
RISK STRATIFICATION & Screening
Target screening for high-risk cases (family history, age, ethnicity, gender)
https://doi.org/10.1016/j.ajo.2017.05.017
(2016) https://doi.org/10.1109/TMI.2016.2608782
We introduce a novel Bayesian nonparametric model that uses
the concept of disease trajectories for disease subtype
identification. We investigate several models with our
algorithm, and show that one with age, pack years (a measure of
cigarette exposure), and smoking status as predictors gives the
best compromise between estimated predictive performance
and model complexity.
https://arxiv.org/abs/1705.07674
The proposed risk score incorporates both the patients’ non-stationary temporal
physiological information and their individual baseline co-variates in order to accurately
describe the patients’ physiological trajectories.
Aaron Zalewski ; William Long ; Alistair E. W. Johnson ; Roger G. Mark ; Li-wei H. Lehman
Date of Conference: 16-19 Feb. 2017, https://doi.org/10.1109/BHI.2017.7897302
https://arxiv.org/abs/1704.08797
RISK factors
For example for Glaucoma
“Overview of ethnicity and race” by M. Roy Wilson (United States)
at Risk Profiling symposium at World Glaucoma Congress 2017, Helsinki, Finland
http://dx.doi.org/10.1001/jamaophthalmol.2015.1478
http://dx.doi.org/10.1126/science.aam7935
“Doctor AI” Systems | Introduction
AI Doctor
https://arxiv.org/abs/1512.03542
http://arxiv.org/abs/1602.00357
http://arxiv.org/abs/1511.02554
Longitudinal analysis → try to diagnose pathologies as early as possible.
Incorporate disease progression measurements and treatment interventions for
optimal personalized treatment.
Feature engineering remains a major bottleneck when creating predictive systems from electronic
medical records. At present, an important missing element is detecting predictive regular clinical motifs
from irregular episodic records. We present Deepr (short for Deep record), a new end-to-end deep
learning system that learns to extract features from medical records and predicts future risk
automatically. Deepr transforms a record into a sequence of discrete elements separated by coded
time gaps and hospital transfers. On top of the sequence is a convolutional neural net that detects and
combines predictive local clinical motifs to stratify the risk. Deepr permits transparent inspection and
visualization of its inner working. We validate Deepr on hospital data to predict unplanned readmission
after discharge. Deepr achieves superior accuracy compared to traditional techniques, detects
meaningful clinical motifs, and uncovers the underlying structure of the disease and intervention
space.
http://arxiv.org/abs/1607.07519
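A hedged PyTorch sketch of the Deepr idea as described above (embedded code sequence → 1-D convolution over local "clinical motifs" → pooled risk score); the vocabulary and layer sizes are illustrative:

```python
import torch
import torch.nn as nn

class DeeprSketch(nn.Module):
    """Medical codes (diagnoses, procedures, coded time-gap tokens) are
    embedded and scanned by a 1-D convolution; max-pooling over time keeps
    the strongest motif responses, which a linear layer turns into a risk."""
    def __init__(self, vocab=1000, emb=64, motifs=100):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb, padding_idx=0)
        self.conv = nn.Conv1d(emb, motifs, kernel_size=3, padding=1)
        self.out = nn.Linear(motifs, 1)   # e.g. risk of unplanned readmission

    def forward(self, codes):                     # codes: [B, T] int tokens
        x = self.embed(codes).transpose(1, 2)     # -> [B, emb, T]
        x = torch.relu(self.conv(x)).max(dim=2).values  # motif pooling
        return torch.sigmoid(self.out(x))
```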
Condition dynamics Long short-term memory (LSTM)
C memory of LSTM
x diagnoses (feature vector)
p procedures, medications
f illness "forgetting" (curing or toxicity)
m planned/unplanned admission flag
h weighted "illness pooling"
i input gate (new information updated to memory)
o output gate (disease state)
http://arxiv.org/abs/1511.03677
https://arxiv.org/abs/1510.07641
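For reference, the vanilla LSTM cell that these clinical variants modify (DeepCare reinterprets the gates with the domain-specific quantities listed above; the standard equations are):

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)}\\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)}\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)}\\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(memory update)}\\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```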
Condition dynamics always missing data in clinical time series
TREATING MISSING DATA Various options (a combined sketch of options 1-3 follows after the links below)
1. ZERO-IMPUTATION Set the value to zero when data is missing
2. FORWARD-FILLING Use the previous observed values
3. MISSINGNESS Treat the missing value as a signal, as the lack of a value
measured e.g. in an ICU can carry information itself (Lipton et al. 2016)
4. BAYESIAN STATE-SPACE MODELING to fill in the missing data (Luttinen et al. 2016, BayesPy package)
5. GENERATIVE MODELING Train the deep network to generate the missing samples (Im et al. 2016, RNN GAN; see also github: sequence_gan)
http://arxiv.org/abs/1606.01865
https://arxiv.org/abs/1606.04130
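A combined numpy sketch of options 1-3 above (zero-imputation for leading gaps, forward-filling, and an appended missingness mask in the spirit of Lipton et al. 2016); the array layout is illustrative:

```python
import numpy as np

def impute_with_missingness(x):
    """x: [timesteps, features] with NaN marking missing entries.
    Returns [timesteps, 2*features]: filled values + missingness mask."""
    mask = np.isnan(x).astype(float)            # option 3: missingness signal
    filled = x.copy()
    for t in range(1, len(filled)):             # option 2: forward-filling
        gap = np.isnan(filled[t])
        filled[t, gap] = filled[t - 1, gap]
    filled = np.nan_to_num(filled, nan=0.0)     # option 1: zero for leading gaps
    return np.concatenate([filled, mask], axis=1)
```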
Condition dynamics-based Individualized treatment
●
Schmidt-Erfurth and Waldstein (2016): There is a critical unmet medical need to identify, characterize, and
validate biomarkers that could provide solid guidance for an efficient individualized treatment with regards to
optimal functional outcome and disease management. Such biomarkers would enable the treating physician to
tailor personalized treatment to each patient's individual disease and need, in order to provide adequate disease
control, minimize recurrence and neurosensory damage, and limit the number of invasive and costly
interventions.
Relationship between initial visual acuity, visual acuity
change and final visual acuity during therapy of
neovascular age-related macular degeneration (i.e., the
ceiling effect). The interpolation curves illustrate final
visual acuity levels dependent on baseline visual acuity
in the controlled trials CATT and IVAN as well as in the
real-world UK neovascular AMD database study
Role of subretinal fluid as a treatment-modifying imaging
biomarker. In patients with subretinal fluid at baseline (blue
graphs), antiangiogenic therapy leads to identical visual
acuity outcomes, regardless of treatment regimen (monthly
versus every 12 weeks dosing). In contrast, patients without
subretinal fluid at baseline (red graphs) demonstrate
unfavourable outcomes if treatment was not administered
on a monthly basis.
Pigment-epithelial detachment as risk factor for vision loss
during individualized dosing. In the VIEW studies, patients
received continuous anti-VEGF therapy during the first 48
weeks. At 52 weeks, a discontinuous, “as-needed” dosing
regimen was introduced. Only in a precisely defined patient
population, i.e. eyes with pigment-epithelial detachments
developing secondary intraretinal cystoid fluid (IRC, red graph),
did the reactive dosing regimen lead to pronounced vision loss.
Future therapeutic approaches will likely focus on early and/or disease modifying interventions aiming to protect
the functional and structural integrity of the morphologic complex that is primarily affected in AMD, i.e. the
choriocapillaris-RPE-photoreceptor unit.
Multimodal innovative imaging technologies, such as PS-OCT, OCT angiography, and adaptive optics allow access
to yet unidentified biomarkers representing the origin of neovascular AMD as well as functionally relevant
therapeutic aims. Improved big-data applicability and reproducibility aided by computerized OCT analysis will likely
allow personalized antiangiogenic therapy with minimal interventions, while providing maximum disease
control, using advanced imaging software and hardware. It is the responsibility of the scientific and clinical community
to follow the open path of advanced imaging together with
ophthalmologists, biologists, physicists, and computer scientists in an efficient interdisciplinary approach.
Condition dynamics risk factors for glaucoma progression
https://doi.org/10.1016/j.ajo.2017.06.003
To determine the intraocular and systemic risk factor differences
between a cohort of rapid glaucoma disease progressors and non-
rapid disease progressors.
Conclusion: Cardiovascular disease is an important risk factor for
rapid glaucoma disease progression irrespective of IOP control.
Condition dynamics Disease progression #1
Clin Ophthalmol. 2017; 11: 1015–1020. May 23.
doi: 10.2147/OPTH.S116265 PMCID: PMC5449101
Automated retinal imaging and trend
analysis – a tool for health monitoring
Karin Roesch, Tristan Swedish, and Ramesh Raskar
MIT Media Lab, Massachusetts Institute of Technology, Cambridge, MA, USA
The future of health diagnostics. Current diagnostics are based on a
“snapshot” in time and limited data points. In the future, large datasets
acquired over time through constant monitoring will be analyzed to
establish baselines and trends, enabling preventative interventions.
Knowing when a feature occurred is key. For example, the microaneurysm
(MA) population is dynamic, and changes occur in a matter of months. For
diabetic retinopathy (DR), it has been established that MAs
are the earliest lesions visible. Additionally, MA turnover rates
are indicative of early-stage DR as well as the likelihood of DR
progression to macular edema.
Po-Hsiang Chiu, George Hripcsak
Department of Biomedical Informatics, Columbia University, 622 W. 168th Street, New York, NY, USA
https://doi.org/10.1016/j.jbi.2017.04.009
Learning statistical models of phenotypes using noisy
labeled training data
Vibhu Agarwal, Tanya Podchiyska, Juan M Banda, Veena Goel, Tiffany I Leung, Evan P Minty, Timothy E Sweeney, Elsie Gyang, Nigam H Shah
J Am Med Inform Assoc (2016) 23 (6): 1166-1173.
DOI: https://doi.org/10.1093/jamia/ocw028
Condition dynamics Disease progression #2
Hrvoje Bogunović; Alessio Montuoro; Magdalena Baratsits;
Maria G. Karantonis; Sebastian M. Waldstein; Ferdinand Schlanitz;
Ursula Schmidt-Erfurth
Investigative Ophthalmology & Visual Science June 2017,
Vol.58, BIO141-BIO150. DOI: 10.1167/iovs.17-21789
Observations at baseline and the first follow-up are used for predicting
drusen regression in the future, for example, the following 1-year period.
Examples of drusen thickness maps and the drusen regression prediction within 1-year
period. Last column shows true positives (green), false positives (orange), and false negatives
(blue). Each row represents one example eye.
http://dx.doi.org/10.1001/jamaophthalmol.2016.5111
http://dx.doi.org/10.1002/sim.7300
Application of our approach using linear mixed models to Alzheimer’s Disease Neuroimaging Initiative data with
bootstrapped 95% CI including boxplots of neocortical Aβ burden (standard uptake value ratio (SUVR)) for each
diagnosis group, separately for amyloid–β positive and negative individuals. It takes 24.47 years to progress from an
SUVR of 0.79 to 1.01. This is equivalent to a rate of 0.009 increase in SUVR per year. Similarly, it takes 10.76 years
to progress from an SUVR of 0.73 to 0.79. See the text for further details. HC, healthy control; MCI, mild cognitively
impaired; AD, Alzheimer’s disease
Text Analysis | Introduction
Condition dynamics Natural Language processing (NLP)
http://arxiv.org/abs/1602.05568
http://arxiv.org/abs/1602.03686
http://homepages.inf.ed.ac.uk/ballison/pdf/lrec_skipgrams.pdf
http://www.bioscience.ai/schedule
http://arxiv.org/abs/1508.04112
Text analysis for clinical notes #1
http://dx.doi.org/10.3233/978-1-61499-753-5-201
Medical Text Classification using Convolutional Neural
Networks
Mark Hughes, Irene Li, Spyros Kotoulas, Toyotaro Suzumura (Submitted on
22 Apr 2017). https://arxiv.org/abs/1704.06841
We present an approach to automatically classify clinical text at a sentence level. We are
using deep convolutional neural networks to represent complex features. We train the
network on a dataset providing a broad categorization of health information. Through a
detailed evaluation, we demonstrate that our method outperforms several approaches
widely used in natural language processing tasks by about 15%.
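As a rough sketch of the sentence-level text-CNN idea (not the authors' exact architecture; vocabulary size, filter widths, and class count below are placeholders):

```python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    """Embed tokens, convolve with several filter widths, max-pool over time, classify."""
    def __init__(self, vocab=10000, emb=128, n_classes=5, widths=(3, 4, 5), n_filters=100):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.convs = nn.ModuleList(nn.Conv1d(emb, n_filters, w) for w in widths)
        self.fc = nn.Linear(n_filters * len(widths), n_classes)

    def forward(self, tokens):                    # tokens: (batch, seq_len) integer ids
        x = self.emb(tokens).transpose(1, 2)      # (batch, emb, seq_len)
        pooled = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.fc(torch.cat(pooled, dim=1))  # (batch, n_classes) logits

logits = TextCNN()(torch.randint(0, 10000, (8, 40)))  # 8 sentences of 40 tokens
```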
Text analysis for clinical notes #2
13 April 2017. https://doi.org/10.1109/BHI.2017.7897302
https://doi.org/10.1016/j.jbi.2017.07.006
We proposed the first models based on recurrent neural networks (more
specifically Long Short-Term Memory - LSTM) for classifying relations from
clinical notes.
We also evaluated the impact of word embeddings on the performance of
LSTM models and showed that medical domain word embeddings help
improve relation classification. These results support the use of LSTM
models for classifying relations between medical concepts, as they show
comparable performance to previously published systems while requiring
no manual feature engineering.
In this work, we explore the use of Hierarchical Dirichlet Processes (HDP)
as a Bayesian nonparametric framework to infer patients' states of health
by combining multiple sources of data. In particular, we employ HDP to
combine clinical time series and text from the nursing progress notes in a
probabilistic topic modeling framework for patient risk stratification.
iDoctor: Personalized and professionalized medical
recommendations based on hybrid matrix factorization
Future Generation Computer Systems
Volume 66, January 2017, Pages 30-35
https://doi.org/10.1016/j.future.2015.12.001
Personalized Medicine | Introduction
Precision / personalized medicine #1
re-work.co/blog
http://dx.doi.org/10.1101/070490
“For the first time, we demonstrate that DLNN trained on a large pharmacogenomic data set can effectively
predict the therapeutic response of specific drugs in specific cancer types, from a large panel of both drugs and
cancer cell lines. These findings serve as a proof of concept for the application of DLNN to predict therapeutic
responsiveness, a milestone in precision medicine.”
http://dx.doi.org/10.1056/NEJMp1500523
http://dx.doi.org/10.3389%2Ffpsyt.2016.00034
http://dx.doi.org/10.1016/j.media.2016.06.024
Precision / personalized medicine #2
We introduce an IoT driven architecture and discuss how non-
invasive, affordable, unobtrusive sensing using mobile phones,
wearables and nearables is making physiological and pathological
data collection from human body possible in thus far unimaginable
ways. We also introduce breakthrough technologies in the form
of exosomes and 3D organ printing that have the potential to disrupt
the future healthcare landscape.
http://dx.doi.org/10.1007/978-3-319-42141-4_9
https://doi.org/10.1109/TMM.2016.2614225
To facilitate the intensive computation required for interactive analytics, we design an efficient
sparse principal component analysis (SPCA) solver based on a variance reduced stochastic
gradient technique. The benefits of our method are demonstrated by analyzing two different
EHR patient cohorts, a public and a private dataset containing EHRs of 101 767 and 223 076
patients, respectively. Our evaluations show that PHENOTREE can detect clinically meaningful
hierarchical phenotypes.
http://dx.doi.org/10.3390/ijms17091555
Precision / personalized medicine #3
Multimorbidity space and dynamic disease progression.
http://dx.doi.org/10.1038/nrg.2016.87
The co-occurrence of diseases can inform the underlying network biology of shared and multifunctional genes and pathways. In addition,
comorbidities help to elucidate the effects of external exposures, such as diet, lifestyle and patient care. With worldwide health transaction data
now often being collected electronically, disease co-occurrences are starting to be quantitatively characterized.
Linking network dynamics to the real-life, non-ideal patient in whom diseases co-occur and interact provides a valuable basis for generating
hypotheses on molecular disease mechanisms, and provides knowledge that can facilitate drug repurposing and the development of targeted
therapeutic strategies.
Example Clinical AI Pipelines
Glaucoma decision support tools
Old-school methods for multimodal and structural features
Development of machine learning models
for diagnosis of glaucoma
Seong Jae Kim, Kyong Jin Cho, Sejong Oh
Published: May 23, 2017.
https://doi.org/10.1371/journal.pone.0177726
We used 100 cases of data as a test dataset and 399 cases
of data as a training and validation dataset. To develop the
glaucoma prediction model, we considered four machine
learning algorithms: C5.0, random forest (RF), support vector
machine (SVM), and k-nearest neighbor (KNN).
Color-fundus and red-free fundus photography (A),
peripapillary RNFL thickness measured by SD-OCT (B),
and automated 30–2 visual field test (C). The presence of
a tigroid fundus and peripapillary atrophy was observed, and
there was a decrease in the RNFL thickness on the
peripapillary RNFL thickness scan. In the visual field test,
the abnormalities were judged to be of no clinical
significance.
Computers in Biology and Medicine
Volume 8, Issue 1, January 1978, Pages 25-40
Glaucoma consultation by computer
Sholom Weiss, Casimir A. Kulikowski, Aran Safir
https://doi.org/10.1016/0010-4825(78)90011-2
Automated detection of glaucoma using structural and
non structural features
SpringerPlus December 2016, 5:1519
Anum A. Salam, Tehmina Khalil, M. Usman Akram, Amina Jameel, Imran Basit
First Online: 09 September 2016
https://doi.org/10.1186/s40064-016-3175-4
Tensor Networks Inspiration from quantum networks #1
Supervised Learning with Quantum-
Inspired Tensor Networks
E. Miles Stoudenmire, David J. Schwab last revised 18 May 2017
https://arxiv.org/abs/1605.05775
Deep Learning and Quantum Entanglement:
Fundamental Connections with Implications to
Network Design
Yoav Levine, David Yakira, Nadav Cohen, Amnon Shashua last revised 10 Apr 2017
https://arxiv.org/abs/1704.01552
Neural networks for computing best
rank-one approximations of tensors
and its applications
Maolin Che, Andrzej Cichocki, Yimin Wei. 22 May 2017
https://doi.org/10.1016/j.neucom.2017.04.058
This paper presents the neural dynamical network
to compute a best rank-one approximation of a
real-valued tensor. We implement the neural
network model by the ordinary differential
equations (ODE), which is a class of continuous-
time recurrent neural network. Finally, we
generalize the proposed neural networks to the
computation of the restricted singular values and
the associated restricted singular vectors of real-
valued tensors. We illustrate and validate
theoretical results via numerical simulations.
Keywords: Neural network, Ordinary differential equations, Lyapunov function, Lyapunov stability theory, Rank-one tensor, Best rank-one
approximation, Z-eigenpair, Symmetric-definite tensor pair, H-eigenpair, The local maximal generalized eigenpair, The local minimal
generalized eigenpair, Generalized tensor eigenpair, Local optimal rank-one approximation, Restricted singular value, Restricted singular
vector
We theoretically analyze convolutional arithmetic circuit (ConvACs), and empirically validate
our findings on more common ConvNets which involve ReLU activations and max pooling.
Beyond the results described above, the description of a deep convolutional network in well-
defined graph-theoretic tools and the formal connection to quantum entanglement, are two
interdisciplinary bridges that are brought forth by this work.
Neural-network representation of the
many-body ground states.
convolutional neural networks, can constitute the
basis of more advanced NQS and therefore have
the potential for increasing their expressive
power.
Tensor Networks Inspiration from quantum networks #2
Low-Rank Tensor Networks for Dimensionality
Reduction and Large-Scale Optimization
Problems: Perspectives and Challenges PART 1
A. Cichocki, N. Lee, I.V. Oseledets, A.-H. Phan, Q. Zhao, D. Mandic last revised 19 Jul
2017 (this version, v2)
https://arxiv.org/abs/1609.00893
Tensor Networks for Dimensionality Reduction and
Large-scale Optimization: Part 2 Applications and
Future Perspectives
A. Cichocki, N. Lee, I.V. Oseledets, A.-H. Phan, Q. Zhao, D. Mandic Foundations and
Trends® in Machine Learning (2017): Vol. 9: No. 6, pp 431-673.
http://dx.doi.org/10.1561/2200000067
“Tensor decompositions and tensor network algorithms require sophisticated software libraries, which are being rapidly
developed. The TT Toolbox, developed by Oseledets and coworkers, (http://github.com/oseledets/TT-Toolbox) for
MATLAB and (http://github.com/oseledets/ttpy) for PYTHON is currently the most complete software for the TT
(MPS/MPO) and QTT networks. The TT toolbox supports advanced applications, which rely on solving sets of linear
equations (including the AMEn algorithm), symmetric eigenvalue decomposition (EVD), and inverse/pseudoinverse of
huge matrices.”
Keywords: Tensor networks, Function-related tensors, CP decomposition, Tucker models, tensor train (TT) decompositions, matrix product states (MPS), matrix product operators (MPO), basic tensor operations, multiway component analysis, multilinear blind source
separation, tensor completion, linear/multilinear dimensionality reduction, large-scale optimization problems, symmetric eigenvalue decomposition (EVD), PCA/SVD, huge systems of linear equations, pseudo-inverse of very large matrices, Lasso and Canonical
Correlation Analysis (CCA)
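For intuition about what such toolboxes compute, below is a bare-bones TT-SVD in NumPy (sequential SVDs with a fixed maximum rank; real applications should use the libraries above, which add truncation tolerances, rounding, and solvers).

```python
import numpy as np

def tt_svd(tensor, max_rank=8):
    """Decompose a dense tensor into tensor-train (MPS) cores via sequential SVD."""
    shape = tensor.shape
    cores, r_prev = [], 1
    mat = tensor.reshape(r_prev * shape[0], -1)
    for k in range(len(shape) - 1):
        U, S, Vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(S))                      # hard rank truncation
        cores.append(U[:, :r].reshape(r_prev, shape[k], r))
        mat = (S[:r, None] * Vt[:r]).reshape(r * shape[k + 1], -1)
        r_prev = r
    cores.append(mat.reshape(r_prev, shape[-1], 1))
    return cores

cores = tt_svd(np.random.rand(4, 5, 6, 7))
print([c.shape for c in cores])  # [(1, 4, 4), (4, 5, 8), (8, 6, 7), (7, 7, 1)]
```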
Tensor Networks in Healthcare
SCH: INT: Collaborative Research: High-throughput Phenotyping on Electronic Health
Records using Multi-Tensor Factorization
Jimeng Sun, Bradley Malin, Joshua Denny, Joydeep Ghosh, Abel Kho
Funding Source: NSF Smart Connect Health Integrated Grant: Award Number 1418511
http://www.sunlab.org/research/phenotyping/
Techniques
Task 1: Phenotyping Generation: How to turn EHR data into meaningful clinical concepts (phenotypes)?
Task 2: Phenotyping Refinement: How to incorporate feedback to ensure the generated phenotypes are clinically meaningful?
Task 3: Phenotyping Adaptation: How to port phenotypes from one institution to another?
Applications
App 1: Cohort Construction: Validate that the generated phenotypes recover some existing phenotypes (from PheKB)
App 2: GWAS: Develop genome-wide association studies using the generated phenotypes (as target or control variables)
App 3: Predictive Modeling: Use generated phenotypes as features to facilitate predictive modeling https://arxiv.org/abs/1704.03141
Tensor Networks in Industry
Animashree Anandkumar
Associate Professor (with tenure)
University of California Irvine
I am a faculty at CS department within ICS at University of California Irvine since December 2016. Before that I was
a faculty at EECS department at UCIrvine since August 2010. I am a member of the center for pervasive
communications and computing (CPCC).
I am currently a principal scientist at Amazon Web Services (AWS) and on leave from UCI.
My research focus is in the high-dimensional learning of probabilistic graphical models and latent variable models.
Broadly I am interested in machine learning, high-dimensional statistics, tensor methods, statistical physics,
information theory and signal processing.
https://youtu.be/gEFaLKzrKYc?t=6m52s
https://youtu.be/KmvZu9qJNzg?t=7m15s
https://youtu.be/B4YvhcGaafw?t=5m40s
https://www.oreilly.com/ideas/lets-build-open-source-tensor-
libraries-for-data-science
“Model Refinement” Techniques
UNCERTAINTY ANALYSIS
’Layperson’ background
development at internet giants like Google and Facebook.
https://www.wired.com/2016/12/uber-buys-mysterious-startup-make-ai-company/
UNCERTAINTY ANALYSIS
In practice for retinal imaging
https://doi.org/10.1101/084210
Here we propose to estimate the uncertainty of DNNs in medical
diagnosis based on a recent theoretical insight on the link between
dropout networks and approximate Bayesian inference. Using the example
of detecting diabetic retinopathy (DR) from fundus photographs, we
show that uncertainty informed decision referral improves diagnostic
performance. Experiments across different networks, tasks and datasets
showed robust generalization.
Depending on network capacity and task/dataset difficulty, we surpass
85% sensitivity and 80% specificity as recommended by the NHS when
referring 0%-20% of the most uncertain decisions for further inspection.
We analyse causes of uncertainty by relating intuitions from 2D
visualizations to the high-dimensional image space, showing that it is in
particular the difficult decisions that the networks consider uncertain.
bioRxiv preprint first posted online
Oct. 28, 2016
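The underlying recipe is easy to sketch: keep dropout stochastic at test time, run several forward passes, and treat the spread of the softmax outputs as uncertainty. A minimal PyTorch sketch; the toy model and the 20% referral fraction are placeholders (the fraction matches the 0%-20% range quoted above).

```python
import torch
import torch.nn as nn

def mc_dropout_predict(model, x, n_samples=50):
    """Monte Carlo dropout: sample predictions with dropout left active."""
    model.train()  # keeps nn.Dropout stochastic; beware of BatchNorm side effects
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=1) for _ in range(n_samples)])
    mean = probs.mean(dim=0)                           # predictive mean per class
    uncertainty = probs.std(dim=0).max(dim=1).values   # one scalar per input
    return mean, uncertainty

# stand-in for a DR-grading network
model = nn.Sequential(nn.Flatten(), nn.Linear(64, 32), nn.ReLU(),
                      nn.Dropout(0.5), nn.Linear(32, 2))
mean, unc = mc_dropout_predict(model, torch.randn(4, 64))
referred = unc.argsort(descending=True)[: int(0.2 * len(unc))]  # refer most uncertain 20%
```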
Interpretability | Background
Visualizing disease Clinicians want answers
Mitigating the resistance from clinical community, put effort in explaining the diagnosis
Roth et al. (2015)
Ribeiro et al. (2016), Baskaran et al. (2012)
Clinical Heuristic Glaucoma decision tree
"Clinicians need the data-driven model predictions to align with their domain knowledge"
Dr. Jenna Wiens @ NIPS 2016, “NIPS 2016 Workshop on Machine Learning for Health”
http://www.nipsml4hc.ws/jenna-wiens
Essentially, the causal decision tree now becomes a “hard-
to-interpret” deep learning model. How to communicate
this paradigm shift to clinicians?
Visualization state-of-the-art techniques in General
DOI: 10.1111/cgf.13210
An example of modeling with visual analytics. BaobabView [Van den Elzen and van Wijk (2011)] uses a tree-like interactive view to support a manually controlled decision tree construction process.
An example of model selection. Squares [Ren et al. (2017)] uses small multiples
composed of grids of different colors and visual textures to display the
distribution of probabilities in classification
© VADER Lab at ASU 2017.
All rights for the techniques and
images belong to their respective
owners.
Visualization high-dimensional visualization #1
Shusen Liu ; Dan Maljovec ; Bei Wang ; Peer-Timo Bremer ; Valerio Pascucci
(2016) https://doi.org/10.1109/TVCG.2016.2640960
Dominik Sacha ; Leishi Zhang ; Michael Sedlmair ; John A. Lee ; Jaakko Peltonen ;
Daniel Weiskopf ; Stephen C. North ; Daniel A. Keim
(2016) https://doi.org/10.1109/TVCG.2016.2598495
Visualization high-dimensional visualization #2
http://dx.doi.org/10.1111/cgf.13237
Dimensionality reduction provides a scalable alternative to create visualizations
(projections) that enable insight into the structure of such datasets. However, applying
dimensionality reduction independently for each dataset in a sequence may introduce
unnecessary variability in the resulting sequence of projections, which makes tracking
the evolution of the data significantly more challenging. We show that this issue
affects t-SNE, a widely used dimensionality reduction technique. In this context, we
propose dynamic t-SNE, an adaptation of t-SNE that introduces a controllable trade-
off between temporal coherence and projection reliability. Our evaluation in two
time-dependent datasets shows that dynamic t-SNE eliminates unnecessary temporal
variability and encourages smooth changes between projections.
https://doi.org/10.2312/eurovisshort.20161164
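Dynamic t-SNE adds an explicit temporal-coherence term to the t-SNE objective; with off-the-shelf scikit-learn one can only approximate the idea by warm-starting each frame from the previous projection, as in this illustrative sketch.

```python
import numpy as np
from sklearn.manifold import TSNE

def coherent_tsne_sequence(datasets, perplexity=30.0):
    """Project a sequence of datasets (same rows over time), seeding each step
    with the previous embedding to damp arbitrary frame-to-frame jumps.
    This only approximates dynamic t-SNE's explicit coherence penalty."""
    projections, init = [], "pca"
    for X in datasets:
        emb = TSNE(n_components=2, perplexity=perplexity,
                   init=init, random_state=0).fit_transform(X)
        projections.append(emb)
        init = emb  # warm-start the next time step
    return projections

frames = [np.random.rand(100, 20) + 0.05 * t for t in range(5)]
projections = coherent_tsne_sequence(frames)
```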
Visualization ”unboxing” ConvNet black box #1
https://arxiv.org/abs/1311.2901; Cited by 2,133 articles
https://doi.org/10.1109/TVCG.2016.2598838
To enable a more intuitive exploration process, we are open-sourcing the Embedding Projector, a
web application for interactive visualization and analysis of high-dimensional data recently
shown as an A.I. Experiment, as part of TensorFlow. We are also releasing a standalone version
at projector.tensorflow.org, where users can visualize their high-dimensional data without the
need to install and run TensorFlow.
Visualization ”unboxing” ConvNet black box #2
HILDA’17, Chicago, IL, USA
http://dx.doi.org/10.1145/3077257.3077260
https://arxiv.org/abs/1704.01942
“ACTIVIS has been deployed on Facebook’s machine learning platform. We present case studies with
Facebook researchers and engineers, and usage scenarios of how ACTIVIS may work with different models.”
Minsuk Kahng is with Georgia Tech; Pierre Andrews is with Facebook; Aditya Kalro is with Facebook; Duen Horng (Polo) Chau.
DARVIZ: deep abstract representation, visualization,
and verification of deep learning models
ICSE-NIER '17 Proceedings of the 39th International Conference on Software Engineering:
New Ideas and Emerging Results Track. https://doi.org/10.1109/ICSE-NIER.2017.13
ShapeShop: Towards Understanding Deep Learning
Representations via Interactive Experimentation
CHI EA '17 Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors
in Computing Systems https://doi.org/10.1145/3027063.3053103
Visualization ”unboxing” recurrent/Sequence black box #1
https://arxiv.org/abs/1705.08153
Uninterpretable examples. Left: Illustration of an arbitrary set of parameters for
an LSTM trained on the MIT-BIH dataset. Numbers indicate different connections
for the input weight vector (rectangle) and the hidden layer weight matrix (square).
Right: The memory values c for arbitrary units in the LSTM trained on the MIT-
BIH data
LSTM hidden unit outputs compared to wavelet coefficients. The top of each column is the
original sample that was correctly classified using the respective LSTM model. The following
two pairs of rows are the cherry-picked pairs of wavelet coefficients and hidden unit outputs
that are roughly similar. The type of wavelet coefficient and the specific hidden unit are
indicated above each plot. The Daubechies wavelet coefficients are 108 time steps long
(instead of 216) because it makes use of the discrete wavelet transform. The wavelet
coefficients were computed using the PyWavelets package in Python
The sample saliencies for the ECG data using different techniques depicted in each column.
The occlusion width is the number of time steps that are occluded per instance. All the
samples shown have a length of 216 time steps (x-axis) and were correctly classified by the
model. The importance of each input step is shown on a scale of 0 to 1, with 1 being the
most important. The type of ECG signal is indicated on the left with LBBB – left bundle
branch block beat, RBBB – right bundle branch block beat, Paced – paced beat, and V-fib –
ventricular fibrillation.
Class mode visualizations. The optimized class modes for the ECG data (left) and the MNIST data (right). Here the input is optimized with respect to each class in order to find the most likely input for each class. The class for each plot is indicated on the left of the image. This technique did not yield interpretable results.
Visualization Medical deep learning models #1
https://arxiv.org/abs/1707.02485
Overall illustration of MDNet. We use a bladder image with its diagnostic report as an example. The
image model generates an image feature to pass to LSTM in the form of a task tuple and a Conv
feature embedding (for the attention model) computed by the AAS module (defined in the method).
LSTM executes prediction tasks according to the specified image feature type
The illustration of class-specific attention. From top to bottom: test images, pathologist annotations, and class attention maps. Like the pathologist annotations, the attention maps are most activated in urothelial regions, largely ignoring stromal or background regions. Best viewed in color.
http://dx.doi.org/10.1016/j.oret.2016.12.009
An occlusion test (Zeiler and Fergus, 2014) was performed to identify the areas
contributing most to the neural network's assigning the category of AMD. A blank 20
× 20-pixel box was systematically moved across every possible position in the image
and the probabilities were recorded. The highest drop in the probability represents
the region of interest that contributed the highest importance to the deep learning
algorithm.
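A sketch of that occlusion test in NumPy (a stride parameter is added for speed, whereas the description above slides the box over every position; the probability function is a placeholder for the trained network):

```python
import numpy as np

def occlusion_map(predict_proba, image, box=20, stride=10):
    """Slide a blank box over the image and record the drop in the model's
    probability for the class of interest (Zeiler-and-Fergus-style occlusion)."""
    h, w = image.shape[:2]
    baseline = predict_proba(image)
    heat = np.zeros(((h - box) // stride + 1, (w - box) // stride + 1))
    for i, y in enumerate(range(0, h - box + 1, stride)):
        for j, x in enumerate(range(0, w - box + 1, stride)):
            occluded = image.copy()
            occluded[y:y + box, x:x + box] = 0.0  # blank 20x20 patch
            heat[i, j] = baseline - predict_proba(occluded)
    return heat  # largest values mark regions the model relied on most

heat = occlusion_map(lambda img: float(img.mean()), np.random.rand(128, 128))
```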
Visualization Medical deep learning models #2
https://arxiv.org/abs/1703.10757
Inspired by Zhou et al. (2016), we present in
this section the idea of generating the
Regression Activation Maps (RAM) of an
input image to localize the discriminative region
towards the regression outcomes. It is known
that the convolutional units of each layer of a
CNN act as visual concept detectors, identifying
low-level concepts like textures or materials up to
high-level concepts like objects or scenes.
Deeper into the network, the units become
increasingly discriminative. However, the fully-
connected layers will make it difficult to
identify the importance of different units for
identifying the output labels (regression values,
in our networks). Instead, using global
average pooling (GAP) and the linear
output unit, we can directly visualize the region
of interest (ROI) that are most discriminative
for a given regression value. As we use
regression for the purpose of classification,
each single RAM obtained for each single
image explicitly depicts the ROI at different
clinical levels.
In this work, we provided a deep learning model that
includes regression activation maps layer (RAM). The
RAM layer can provide the robust interpretability of the
proposed detection model by monitoring the
pathogenesis, so that the proposed model can be taken
as an assistant for clinicians.
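The RAM/CAM computation itself reduces to a weighted sum of the final convolutional feature maps, with the weights taken from the linear output unit; a minimal sketch with illustrative array shapes:

```python
import numpy as np

def activation_map(conv_maps, output_weights):
    """GAP-based visualization (CAM/RAM style).
    conv_maps: (channels, H, W) activations of the last conv layer.
    output_weights: (channels,) weights from the GAP features to the output unit."""
    ram = np.tensordot(output_weights, conv_maps, axes=([0], [0]))  # (H, W)
    ram -= ram.min()
    return ram / (ram.max() + 1e-8)  # normalize to [0, 1] for overlaying on the image

ram = activation_map(np.random.rand(64, 16, 16), np.random.rand(64))
```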
Interpretability to EHR Mining and decision making #1
https://youtu.be/co3lTOSgFlA
The source code of RETAIN is publicly available at https://github.com/mp2893/retain
Model Interpretation for Heart Failure Prediction: We demonstrate the interpretability of RETAIN by
studying its behavior in the HF prediction task. We choose a HF patient from the test set and calculate the contribution of
the variables (medical codes in this case) for making the binary prediction. Figure 3a is the visualization of the
contributions of the variables in each visit. The patient suffered from skin problems, skin disorder (SD), benign neoplasm
(BN), excision of skin lesion (ESL), for some time before showing symptoms of HF, cardiac dysrhythmia (CD), heart valve
disease (HVD) and coronary atherosclerosis (CA), then being diagnosed with HF at the end. We can see that skin-related
codes from the earlier visits made little contribution to HF prediction as expected. RETAIN properly puts much attention
to the HF-related codes that occurred in recent visits.
Interpretability to EHR Mining and decision making #2
GRAM: Graph-based Attention Model for Healthcare
Representation Learning
Edward Choi, Mohammad Taha Bahadori, Le Song, Walter F. Stewart, Jimeng Sun‘
last revised 1 Apr 2017 (this version, v3)
https://arxiv.org/abs/1611.07012
“Deep learning methods exhibit promising performance for predictive modeling in healthcare, but
two important challenges remain:
- Data insufficiency: Often in healthcare predictive modeling, the sample size is insufficient for deep learning methods to achieve satisfactory results.
- Interpretation: The representations learned by deep learning methods should align with medical knowledge.
To address these challenges, we propose a GRaph-based Attention Model, GRAM, that
supplements electronic health records (EHR) with hierarchical information inherent to medical
ontologies.”
https://jkulas12.github.io/GRAM_Visualization/
Datasets | Primer
Dataset Size How many samples?
The more the better, but there are obvious problems with obtaining huge medical datasets
https://arxiv.org/abs/1511.06348
(A) The number of misclassified images on each body part class and (B) of total
misclassified ones on whole body in increasing number of training data sets.
Classification accuracy results according to increasing size of training data sets
There is a rule-of-thumb (#1) stating that one should have
10x the number of samples as parameters in the
network (for a more formal approach, see VC dimension);
for example, the ResNet (He et al. 2015) in the
ILSVRC2015 challenge had around 1.7M parameters,
thus requiring 17M images with this rule-of-thumb.
https://www.researchgate.net/post/What_is_the_minimum_sample_size_required_
to_train_a_Deep_Learning_model-CNN
Dataset Size How many samples?
More is always better if you train higher-capacity models
https://arxiv.org/abs/1707.02968
Since 2012, there have been significant advances in
representation capabilities of the models and
computational capabilities of GPUs. But the size of the
biggest dataset has surprisingly remained constant. What
will happen if we increase the dataset size by 10× or
100×?
Our experiments yield some surprising (and some
expected) findings:
Better Representation Learning Helps! Our first observation is that large-scale
data helps in representation learning as evidenced by improvement in performance
on each and every vision task we study. This suggests that collection of a larger-
scale dataset to study pretraining may greatly benefit the field. Our findings also
suggest a bright future for unsupervised or self-supervised [10, 42]
representation learning approaches. It seems the scale of data can overpower
noise in the label space.
Performance increases linearly with orders of magnitude of training data!
Perhaps the most surprising element of our finding is the relationship between
performance on vision tasks and the amount of training data (log-scale) used for
representation learning. We find that this relationship is still linear! Even with
300M training images, we do not observe any plateauing effect for the tasks
studied.
Capacity is Crucial: We also observe that to fully exploit 300M images, one needs
higher capacity models. For example, in case of ResNet-50 the gain on COCO
object detection is much smaller (1.87%) compared to (3%) when using ResNet-
152.
Training with Long-tail: Our data has quite a long tail and yet the representation
learning seems to work. This long-tail does not seem to adversely affect the
stochastic training of ConvNets (training still converges).
New state of the art results: Finally, our paper presents new state-of-the-art
results on several benchmarks using the models learned from JFT-300M. For
example, a single model (without any bells and whistles) can now achieve 37.4 AP
as compared to 34.3 AP on the COCO detection benchmark.
Dataset Size data augmentation #1
Images from:
ftp://ftp.dca.fee.unicamp.br/pub/docs/vonzuben/ia353_1s15/topico10_IA353_1s2015.pdf |
Wu et al. (2015)
Synthetically increase the number of training samples by distorting them in ways expected from the dataset (random
xy-shifts, left-right flips, added Gaussian noise, blur, etc.) → this has been shown to reduce overfitting (a minimal sketch follows below).
As noted in the previous slides on image quality, it is useful to train the model with various image quality levels
Köhler et al. (2013)
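One plausible augmentation pipeline along these lines, assuming a recent torchvision; the specific transforms and magnitudes are illustrative, not taken from any of the cited papers.

```python
import torch
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),                            # left-right flips
    transforms.RandomAffine(degrees=10, translate=(0.05, 0.05)),  # small xy-shifts/rotations
    transforms.ColorJitter(brightness=0.2, contrast=0.2),         # illumination variability
    transforms.GaussianBlur(kernel_size=5),                       # quality degradation
    transforms.Lambda(lambda t: t + 0.01 * torch.randn_like(t)),  # additive Gaussian noise
])

img = torch.rand(3, 256, 256)                  # stand-in for a fundus photograph
augmented = [augment(img) for _ in range(8)]   # eight distorted copies of one sample
```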
The most successful convolutional architectures are developed starting from ImageNet, a large
scale collection of images of object categories downloaded from the Web. These images are
very different from the situated and embodied visual experience of robots deployed in
unconstrained settings. To reduce the gap between these two visual experiences, this paper
proposes a simple yet effective data augmentation layer that zooms in on the object of interest
and simulates the object detection outcome of a robot vision system. The layer, which can be used
with any convolutional deep architecture, leads to an increase in object recognition performance
of up to 7% in experiments performed over three different benchmark databases.
https://arxiv.org/abs/1705.02139
Dataset Size data augmentation #2
Apply domain-specific perturbations
Dataset Augmentation in Feature Space
Terrance DeVries, Graham W. Taylor(Submitted on 17 Feb 2017)
https://arxiv.org/abs/1702.05538
Dreaming More Data: Class-dependent Distributions over
Diffeomorphisms for Learned Data Augmentation
Søren Hauberg, Oren Freifeld, Anders Boesen Lindbo Larsen, John Fisher, Lars
Hansen ; Proceedings of the 19th International Conference on Artificial Intelligence and Statistics,
PMLR 51:342-350, 2016.
http://proceedings.mlr.press/v51/hauberg16.html
Our approach is, however, not limited to MNIST:
●
Image alignment and registration is a routine task in many medical imaging tasks,
such as the analysis of MRI.
●
We make similar observations for time-series data such as acoustic signals. Here
dynamic time warping (DTW) is often used as preprocessing to remove
differences in the temporal speed of individual signals.
●
Mesh alignment is also a standard pre-processing step in the analysis of three-
dimensional meshes. As deep models are beginning to appear for three-
dimensional data it would be interesting to combine them with learned
augmentation schemes.
https://doi.org/10.1016/j.neucom.2016.12.025
In this paper, we propose five data augmentation methods dedicated to face images,
including landmark perturbation and four synthesis methods (hairstyles, glasses, poses,
illuminations). The proposed methods effectively enlarge the training dataset, which
alleviates the impacts of misalignment, pose variance, illumination changes and partial
occlusions, as well as the overfitting during training.
Dataset Size Generative synthetic data
Augmentation through generative adversarial models (GAN)
the CVPR 2017 awards are out! The two winners are
Densely Connected Convolutional Networks by Facebook and
Improving the Realism of Synthetic Images
https://arxiv.org/abs/1612.07828
https://machinelearning.apple.com/2017/07/07/GAN.html
https://github.com/wayaai/SimGAN
https://arxiv.org/abs/1706.02071
https://github.com/val-iisc/deligan
TextureGAN: Controlling Deep Image Synthesis with
Texture Patches
Wenqi Xian, Patsorn Sangkloy, Jingwan Lu, Chen Fang, Fisher Yu, James Hays
(Submitted on 9 Jun 2017)
https://arxiv.org/abs/1706.02823
Dataset Size semi-supervised training #1
Jointly use labeled and unlabeled data
https://arxiv.org/abs/1705.08850
Our empirical results show that using the tangents of the data manifold (as estimated by the
generator of the GAN) to inject invariances in the classifier improves the performance on semi-
supervised learning tasks.
https://arxiv.org/abs/1706.00400
N. Siddharth, Brooks Paige, Jan-Willem Van de Meent, Alban Desmaison, Frank Wood,
Noah D. Goodman, Pushmeet Kohli, Philip H.S. Torr
Here we are interested in learning disentangled representations that encode
distinct aspects of the data into separate variables. We propose to learn such
representations using model architectures that generalize from standard
Variational autoencoders (VAEs) employing a general graphical model structure
in the encoder and decoder. This allows us to train partially-specified models
that make relatively strong assumptions about a subset of interpretable
variables and rely on the flexibility of neural networks to learn representations
for the remaining variables. We further define a general objective for semi-
supervised learning in this model class, which can be approximated using an
importance sampling procedure.
Dataset Size semi-supervised training #2
https://arxiv.org/abs/1707.03631
https://arxiv.org/abs/1705.09783
In this work, we present a semi-supervised learning framework that
uses generated data to boost task performance. Under this
framework, we characterize the properties of various generators
and theoretically prove that a complementary (i.e. bad) generator
improves generalization. Empirically our proposed method improves
the performance of image classification on several benchmark
datasets.
Our proposed method, adversarial dropout, can be viewed from the
dropout and from the adversarial training perspectives. Our
proposed adversarial dropout can be interpreted as dropout masks
whose direction is counter-optimized, adversarially, to the model’s
label assignment. However, it should be noted that adversarial
dropout and traditional adversarial training with additive
perturbation are different, because adversarial dropout induces a
sparse structure in the neural network while the latter does not
change the neural network directly.
Dataset Size Active learning and “smart” labeling #1
When labeling is very time-consuming, active learning can help us choose which unlabeled samples to label (a minimal sketch follows below)
Active Learning and Proofreading
for Delineation of Curvilinear
Structures
Mosinska, Agata Justyna; Tarnawski, Jakub; Fua, Pascal
Presented at: MICCAI, Quebec City, Canada, September 10-14, 2017
https://infoscience.epfl.ch/record/229472
https://arxiv.org/abs/1704.07433
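A minimal uncertainty-sampling loop (a generic least-confidence strategy with scikit-learn, not the boundary-based MICCAI method above; data shapes are illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def uncertainty_sampling(X_pool, X_seed, y_seed, n_queries=10):
    """Pick the unlabeled samples the current model is least confident about;
    an annotator would label exactly these next."""
    model = LogisticRegression(max_iter=1000).fit(X_seed, y_seed)
    confidence = model.predict_proba(X_pool).max(axis=1)  # certainty of the top class
    return np.argsort(confidence)[:n_queries]             # indices to send for labeling

rng = np.random.default_rng(0)
X_pool = rng.normal(size=(500, 16))                              # unlabeled pool
X_seed, y_seed = rng.normal(size=(20, 16)), np.tile([0, 1], 10)  # small labeled seed set
to_label = uncertainty_sampling(X_pool, X_seed, y_seed)
```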
Dataset Size Transfer learning
Leveraging features learned from bigger non-medical datasets
Our approach fine-tunes a pre-trained convolutional neural network (CNN),
GoogLeNet. The fine-tuned CNN could effectively identify pathologies in
comparison to classical learning. Our algorithm aims to demonstrate that
models trained on non-medical images can be fine-tuned for classifying OCT
images with limited training data.
Biomedical Optics Express Vol. 8, Issue 2, pp. 579-592 (2017)
https://doi.org/10.1364/BOE.8.000579
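A generic fine-tuning recipe of this kind in PyTorch; ResNet-18 stands in for the backbone here, and the class count and freeze-everything policy are placeholders rather than the paper's exact setup.

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)       # ImageNet-pretrained backbone
for p in model.parameters():
    p.requires_grad = False                    # freeze the pretrained features
model.fc = nn.Linear(model.fc.in_features, 4)  # new head for OCT classes (placeholder count)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
x, y = torch.randn(8, 3, 224, 224), torch.randint(0, 4, (8,))  # dummy batch
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```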
International Workshop on Large-Scale Annotation of Biomedical Data and Expert Label Synthesis
International Workshop on Deep Learning in Medical Image Analysis
LABELS 2016, DLMIA 2016: Deep Learning and Data Labeling for Medical Applications pp 188-196
Understanding the Mechanisms of Deep
Transfer Learning for Medical Images
https://doi.org/10.1007/978-3-319-46976-8_20
Hariharan Ravishankar, Prasad Sudhakar, Rahul Venkataramani, Sheshadri Thiruvenkadam, Pavan Annangi, Narayanan
Babu, Vivek Vaidya
Deep Learning and Convolutional Neural Networks for Medical Image Computing Pp 181-193
Part of the Advances in Computer Vision and Pattern Recognition book series (ACVPR)
On the Necessity of Fine-Tuned Convolutional Neural Networks for
Medical Imaging
https://doi.org/10.1007/978-3-319-42999-1_11
Nima Tajbakhsh, Jae Y. Shin, Suryakanth R. Gurudu, R. Todd Hurst, Christopher B. Kendall,
Michael B. Gotway, Jianming Liang
In this paper, we studied the necessity of fine-tuning and the effective level
of knowledge transfer to 4 medical imaging applications. Our experiments
demonstrated medical imaging applications were conducive to transfer
learning and that fine-tuned CNNs were necessary to achieve high
performance particularly with limited training datasets. We also showed that
the desired level of fine-tuning differed from one application to another.
While deeper levels of fine-tuning were suitable for polyp and PE detection,
intermediate fine-tuning worked the best for interface segmentation and
colonoscopy frame classification. Our findings led us to conclude that layer-
wise fine-tuning is a practical way to reach the best performance based on
the amount of available data.
Dataset Quality Beyond
A giant with feet of clay: on the validity of the
data that feed machine learning in medicine
Federico Cabitza, Davide Ciucci, Raffaele Rasoini last revised 26 Jun 2017
https://arxiv.org/abs/1706.06838
We point out how uncertainty is so ingrained in medicine that it
biases also the representation of clinical phenomena, that is the
very input of ML models, thus undermining the clinical
significance of their output. Recognizing this can motivate both
medical doctors, in taking more responsibility in the
development and use of these decision aids, and the
researchers, in pursuing different ways to assess the value of
these systems. In so doing, both designers and users could take
this intrinsic characteristic of medicine more seriously and
consider alternative approaches that do not "sweep uncertainty
under the rug" within an objectivist fiction, which everyone can
come up by believing as true.
5 Garbage in, Gospel out
The question of the quality of medical records and of the data
extracted from them is still understudied
[Cabitza and Batini, 2016; Stetson et al. 2012], let alone in
regard to machine learning projects [Feldman et al. 2017]. The
assumption that medical data could support secondary uses
has been challenged since almost 25 years ago, and also
strongly so, e.g., by Reiser 1991, who described several cases
of erroneous, missing and ambiguous data, and by
Burnum (1989), who provocatively wrote that “all medical
record information should be regarded as suspect; much of it is
fiction” (p. 484).
JAMA. Published online July 20, 2017. doi: 10.1001/jama.2017.7797
https://doi.org/10.1177/0272989X12465490
Conclusions: Our exploratory analysis method reveals
unexpected effects. It indicates that, despite the original
study detecting no significant average effect, computer-
aided detection (CAD) helped the less discriminating
readers but hindered the more discriminating readers.
Such differential effects, although subtle, may be clinically
significant and important for improving both computer
algorithms and protocols for their use. They should be
assessed when evaluating CAD and similar warning
systems.
EXTRA
Neuroscience
RESOURCES
Retina | Background
RETINA
A schematic view of the retina showing the organization of different neuronal populations and
their synaptic connections. Rods and cones are confined to the photoreceptor layer. Light
detected by rods and cones is processed and signalled to retinal ganglion cells (RGCs) through
horizontal, amacrine and bipolar cells. RGCs are the only output neurons from the
retina to the brain. A subset of RGCs (4–5% of the total number of RGCs) are intrinsically
photosensitive RGCs (ipRGCs) containing the photopigment melanopsin. There are at least
five subtypes of ipRGCs (M1–M5) with different morphological and electrophysiological
properties, which show widespread projection patterns throughout the brain.
LeGates et al. (2014):
“Light as a central modulator of circadian rhythms, sleep and affect”
Retinal circuits. (a) The cellular and synaptic (i.e., plexiform) layers of the retina. Some of the various
cell types composing the five classes of neurons are shown: rod and cone photoreceptors, horizontal
cells (HCs), ON and OFF cone bipolar cells (BCs), rod BCs, AII and wide-field (WF) amacrine cells
(ACs), and ON and OFF ganglion cells (GCs). The ON and OFF BC axon terminals and GC dendrites
stratify in separate halves of the inner plexiform layer. (b) Several cell types from panel a, redrawn to
illustrate how rod signals pass through the inner retina. Excitatory (+) and inhibitory (−) synapses are
shown. A gap junction (denoted by the resistor symbol) allows bidirectional current flow between AII
ACs and ON cone BCs. The AII AC splits the ON rod BC signal into ON and OFF components using
either electrical (gap junction, ON) or chemical (glycinergic, OFF) synapses. Note that in daylight
conditions, cone-mediated drive to the AII influences the OFF pathway as follows: cone → ON cone
BC → AII AC → OFF BC and GC.
Example of circuit switching. (a) The
excitatory input to an ON ganglion cell (GC) is
driven by both rod and cone circuits. The rod
circuits actually signal via the cone bipolar cell
terminal. The inhibition from the surround is
mediated by a wide-field amacrine cell (WF AC)
driven exclusively by cone circuits. (b) When the
rod circuit is active, the ON GC has a receptive field
with an excitatory center component only. When
the cone circuit is active, the inhibitory surround
component switches on.
Synaptic motifs. (a) From the perspective of a bipolar cell (pipette
attached), inhibition arising from amacrine cells (ACs) occurs via multiple
synaptic motifs. Excitatory (+) and inhibitory (−) synapses are indicated;
feedback and feedforward synapses can occur in both ON and OFF
systems, and crossover inhibition acts between ON and OFF systems. The
illustrated circuit is an ON → OFF inhibitory one, but the opposite pattern
(OFF → ON) could also occur. (b) From the perspective of a ganglion cell
(GC) (pipette attached), inhibition from ACs occurs via multiple synaptic
motifs. This panel follows the same conventions as used in panel a.
Note! Melanopsin-containing retinal ganglion cells (pRGC, ipRGC,
mRGC, the same thing) were discovered only recently in 2002 by
Berson et al. [Cited by 1956], thus you might find them missing from
textbook versions of retinal circuits
Initially they were thought to contribute mainly to sleep/alertness
and circadian rhythm regulation, but recently it has been shown that
they contribute to image-forming vision as well.
RETINA response characteristics: Spectral #1
SPECTRAL PROPERTIES
Teikari thesis (2012)
Enezi et al. 2011.
Stockmann And Sharpe (2000), CVRL
Govardovskii et al. 2000
van de Kraats and van Norren 2007
Walraven 2003 CIE Report
“For environmental light”
“At retinal level
if you would not have
ocular media”
The absorbance spectrum of an exemplary vertebrate rhodopsin (λmax
~ 500 nm), considered as a sum of absorbance bands, indicated by
alpha (a), beta (b), gamma (g), sigma (s) and epsilon (e) normalized to
the peak absorbance of the alpha-band (after
Stavenga and van Barneveld 1975, from Stavenga 2010).
The sidelobe on the short-wave side
comes from the beta band (see
template from Govardovskii et al. 2000)
Self-screening effect changes the
width/peak of the absorption spectrum. (A)
Percentage absorption spectra of various concentrations of
photopigment (OD - optical density in log units). (B) An
illustration of self-screening at various photoreceptor lengths.
The human rod photoreceptor is ~25 µm (Pugh and Lamb 2000)
and the cone photoreceptor ~13 µm (Baylor et al. 1984). The
longest known photoreceptor has been found in the dragonfly, the
length being 1,100 µm (Labhart and Nilsson 1995).
“The human crystalline lens
strongly absorbs blue light
and UV”
V′(λ) is the spectral sensitivity for night vision, and V(λ) for
daytime vision. Not shown is mesopic vision VM(λ), which is a
nonlinear combination of daytime and night vision operating in
dim-light color vision.
Quantally defined daytime sensitivity
(2º central vision, Sharpe et al., 2005):
V*(λ) = [1.891·l(λ) + m(λ)]/2.80361
where l(λ) is the long-wavelength ('red') cone sensitivity,
and m(λ) the medium-wavelength ('green') cone sensitivity
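As a sketch, this combination can be evaluated directly from tabulated cone fundamentals; the arrays below are placeholders, and the real quantal l(λ) and m(λ) tables are available from CVRL (www.cvrl.org).

```python
import numpy as np

def v_star(lbar, mbar):
    """V*(lambda) = [1.891*l(lambda) + m(lambda)] / 2.80361 (Sharpe et al., 2005)."""
    return (1.891 * lbar + mbar) / 2.80361

wavelengths = np.arange(390, 831, 1)           # nm grid used by CVRL tabulations
lbar = np.ones_like(wavelengths, dtype=float)  # placeholder: replace with the real
mbar = np.ones_like(wavelengths, dtype=float)  # quantal cone fundamentals
sensitivity = v_star(lbar, mbar)
```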
Note!
Melanopsin and S-cones do not seem to
contribute to central vision luminance perception
vs.
RGB Luminance
Stockman, A., & Sharpe, L. T. (2008).
Spectral sensitivity In The Senses: A
Comprehensive Reference, Volume 2: Vision
II (pp. 87-100)
Without the crystalline lens (aphakic eye), visual sensitivity would extend to ultraviolet (Goodeve et al., 1942)
RETINA response characteristics: Spectral #2
Dominance of L cones over S cones across species. Measured S cone proportion is shown for a
variety of animals. For some animals, two measurements at different locations on the retina are shown.
Large variation in L cone proportion indicates dorso-ventral asymmetries, like those discussed in
Szél et al. (2000).
Science  10 Jun 2011:Vol. 332, Issue 6035, pp. 1307-1312 DOI: 10.1126/science.1200172
For a short wavelength–sensitive pigment, although
its noise literally disappears at λmax < 400 nm (Fig. C),
nonspecific light absorption by proteins, peaking at
~280 nm, becomes a limiting factor. These
considerations probably explain, at least partially, why
the λmax values of native visual pigments are
confined to the narrow bandwidth of ~360 to
620 nm, limiting color vision accordingly.
Predicted thermal-noise rate constant
as a function of λmax. Black circles,
rhodopsins; red squares, cone
pigments.
http://dx.doi.org/10.1016/S0896-6273(00)80845-4
Present-day vertebrates vary enormously in the sophistication of their color vision, the density and spatial distribution of
cone classes, and the number and absorption maxima of their cone pigments (38, 30, 31 and 100). At one extreme, most
mammals have only three pigments: the two ancestral cone pigments and rhodopsin. At the other evolutionary extreme,
chickens possess six pigments: four cone pigments, one rhodopsin, and a pineal visual pigment, pinopsin.
In this evolutionary comparison, humans and their closest primate relatives represent an intermediate level of
complexity. Humans have four visual pigments (1999): a single member of the <500 nm family of cone pigments (the
blue or short-wave pigment, with an absorption maximum at ~425 nm), two highly homologous members of the >500 nm
family (the green or middle-wave pigment, and red or long-wave pigment, with absorption maxima at ~530 and ~560 nm,
respectively), and rhodopsin. The presence of only a single gene encoding a >500 nm pigment in almost all New World
primates, and in all nonprimate mammals studied to date, places the red/green visual pigment gene duplication in the Old
World primate lineage at ~30–40 million years ago, shortly after the geologic split between Africa and South America
(Jacobs 1993)
DOI:10.1098/rstb.2009.0050
The spectral tuning of vertebrate opsins will also be influenced by their evolutionary
history (Goldsmith 1990). Melanopsin acts as a bistable pigment able to regenerate
(recycle) its chromophore (11-cis-retinal) using all-trans-retinal and long-wavelength light
in a manner reminiscent of the invertebrate photopigments (Melyan et al. 2005).
In this regard melanopsin may be unique among mammalian photopigments in
forming a stable association with all-trans-retinal.
Interestingly, the melanopsins appear to share some of the key characteristics of an
invertebrate-like signal transduction pathway. Both pRGCs and cells transfected
with melanopsin show depolarizing responses to light and display chromophore
bistability/tristability (Emanuel et al. 2015), another feature of the invertebrate
photopigments. Amino acid sequence features of the melanopsin protein result in delayed
deactivation and temporal integration of the light signal (Mure et al., 2016).
RETINA response characteristics: Spectral #3
Transducing intermediate pigment states
Schematic representation of the photochemical invertebrate rhodopsin
cycle of the blowfly (Calliphora). Rhodopsin R excited by light absorption converts
to bathorhodopsin B. Thermal decay via lumirhodopsin L to metarhodopsin M
follows. The back reaction proceeds via putative intermediates K and possibly
N. Time constants of the conversion steps are indicated (Kruizinga et al. 1983).
Vertebrate rhodopsin intermediates. (A) Decay of the activated Meta II state
to Meta III. Illumination of rhodopsin’s dark state (λmax = 500 nm) produces the
Meta I/Meta II photoproduct equilibrium. By applying a second illumination, the
decay product Meta III of the second pathway can be converted back to Meta
I/Meta II (again consisting mostly of Meta II), while the decay products of the first
pathway, opsin and all-trans retinal, remain largely unreactive. (B, Bovine)
rhodopsin transduction. Activation of rhodopsin is achieved by light-dependent
isomerization of the chromophore and subsequent thermal relaxation of the
receptor on the millisecond time scale to the active receptor conformation
(Bartl and Vogel 2007).
A State Model for Tristable Melanopsin. (A) State diagram of
melanopsin (top) based on parameters measured biochemically from
purified pigment (Matsuyama et al., 2012). Shown are melanopsin (R),
metamelanopsin (M), and extramelanopsin (E) with chromophores
designated. Below are plotted the relative photosensitivities (i.e., products of
the extinction coefficients and quantum efficiencies) of these states as a
function of wavelength. (B) Predicted equilibrium fraction of each pigment
state as a function of wavelength. Lines show the R state (black), M state
(blue), and E state (red). Emanuel et al. 2015
Photoreversal of vertebrate rhodopsin (Williams 1964). Both the test flash
and the bleaching light consisted of long wavelengths primarily absorbed by
rhodopsin. The blue, photoregenerating flash contained wavelengths
absorbed by the longer-lived intermediates of the bleaching process. This
photoreversal might in practice enhance the Blue Light Hazard (Grimm et al. 2000).
Regeneration of pigment to a responsive state by a
second illumination occurs both with 'invertebrate'-like melanopsin
and vertebrate rhodopsin.
DOI: 10.1042/bj3301201
http://dx.doi.org/10.1016/j.visres.2005.12.017 | Cited by 26
Time courses of amounts of photolysis products in goldfish cones
normalized to bleached visual pigment.
Decomposition of final T- and L-spectra
of rod outer segments at 1800 s
postbleach (noisy curves) into
components. It reveals, in addition to
dehydroretinal and P480, a generation
of a small amount of dehydroretinol.
The sum of RAL, ROL, and P480 (bold
curves) provides a good approximation
of the experimental spectra.
RETINA response characteristics: Spectral #4
http://dx.doi.org/10.1038/13185
Genetic and psychophysical results from the latter class indicated
that limited red–green discrimination can be achieved with
pigments that have the same peak wavelength sensitivity and that
differ only in optical density.
Types of color blindness with their prevalence
faculty.montgomerycollege.edu
http://dx.doi.org/10.1016/j.visres.2011.08.016
sensationalcolor.com/understanding-color
www.npr.org/2014/11/16
http://www.bbc.co.uk/news/entertainment-arts-27884975
By Colin Schultz | smithsonian.com | August 20, 2012
RETINA response characteristics: Spectral #5
A tiny group of people can see ‘invisible’ colours (“tetrachromacy” - four cones,
instead of three cones) that no-one else can perceive, discovers David Robson.
How do they do it?
“Jordan’s “acid test” involved coloured discs showing different mixtures of
pigment, such as a green made of yellow and blue. The mixtures were
too subtle for most people to notice: almost all people would see the same
shade of olive green, but each combination should give out a subtly different
spectrum of light that would be perceptible to someone with a fourth cone.
Sure enough, Jordan’s subject was able to differentiate between the different
mixtures each time. “
http://www.bbc.com/future/story/20140905-the-women-with-super-human-vision
While tetrachromacy is so rare that it makes headlines every time a new case
emerges, it might come as a surprise that women with four cone types in their
retinas are actually more common than we think. Researchers estimate that they
represent as much as 12% of the female population (4). So why aren’t we
surrounded by women with extraordinary colour vision? Researchers have found
that only a small fraction of women who possess an extra cone type actually get to
enjoy more colours. So what does it take to be a true tetrachromat? How does the
human retina come to produce four cone types, and why does it only concern
women? More importantly, why don’t all women fulfil their genetic potential? And
how do we find the special women who do?
theneurosphere.com/2015/12/17
[4] Jordan, G. et al. (2010). The dimensionality of color vision in carriers of anomalous trichromacy.
 Journal of Vision, 10.
Tetrachromats are rare enough, but Concetta Antico is particularly remarkable, since, as an artist, she is able to
give us a rare view into that world. “Her artwork might tap into a structure that all of us can appreciate,” says
Kimberly Jameson at the University of California, Irvine, who has studied Antico extensively. It’s even possible
that she might suggest ways for more people to see the same way.
RETINA response characteristics: Intensity
Illumination levels. Typical ambient light levels are compared with photopic luminance (log cd m⁻²), pupil diameter (mm), photopic and scotopic retinal illuminance (log photopic and scotopic trolands, respectively) and visual function. The scotopic, mesopic and photopic regions are defined according to whether rods alone, rods and cones, or cones alone operate. The conversion from photopic to scotopic values assumed a standard CIE D65 white illuminant. (Stockman and Sharpe 2006)
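A concrete anchor for these units: retinal illuminance in trolands is, by definition, luminance (cd m⁻²) multiplied by pupil area (mm²). A minimal sketch of the conversion; the unit definition is standard, but the function name and example values are mine:

    import math

    def trolands(luminance_cd_m2: float, pupil_diameter_mm: float) -> float:
        # Retinal illuminance in (photopic) trolands:
        # T = L [cd/m^2] * pupil area [mm^2]
        pupil_area_mm2 = math.pi * (pupil_diameter_mm / 2.0) ** 2
        return luminance_cd_m2 * pupil_area_mm2

    # Example: a ~100 cd/m^2 display viewed through a ~4 mm pupil
    print(math.log10(trolands(100.0, 4.0)))  # ~3.1 log photopic trolands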
How these four separate mechanisms — photopigment
depletion, pupil contraction, cellular adaptation and response
compression — coordinate luminance adaptation is not yet
known. However, Peter Kaiser and Robert Boynton provide a
quantitative illustration of how the four principal processes
might interact, as shown below.
http://www.handprint.com/HP/WCL/color4.html
Top right: Spectral response of the eye for point sources. Peak cone sensitivity is over 200 times lower than peak rod sensitivity. Relative sensitivities of S, L and M cones are shown within the photopic mode; by combining their inputs, the brain creates colors. Bottom left: Exposed to low-light conditions in full photopic mode, cone sensitivity increases 30-100 times within ~10 minutes, reaching its maximum sensitivity level (the darker it is, the faster the transition from cone to rod function; in near-complete darkness, the cones shut down almost instantly). At the cone-rod break, rods become dominant, gaining in sensitivity some 200-1000 times over peak cone sensitivity within the next ~20 minutes (individual sensitivity varies within the shown approximate range: by a factor of ~3 and ~10 for the cones and rods, respectively). In the process, peak sensitivity shifts from ~555 nm (photopic) to ~507 nm (scotopic), and the response range shifts from ~400-730 nm to ~370-650 nm. Dark-to-light eye adaptation takes considerably less time: only about 7 minutes.
Maximum sensitivity level, after ~10 min in darkness; maximum bright-light
cone sensitivity is 30-100 times lower.
http://www.telescope-optics.net/eye_spectral_response.htm
RETINA response characteristics: Circuit
(A) Time course of the Early Receptor Potential (ERP) and Late Receptor Potential (LRP) in monkey retina compared to the ERG a-wave (redrawn from Brown and Murakami 1964). (B) Intensity dependence of the human ERP, illustrating the log-linear relationship between light intensity and pigment-level responses, and the non-linear relationship between light intensity and the a-wave response (redrawn from Debecker and Zanen 1975). Graph from Teikari thesis (2012).
The cells of the retina and their response to a spot light flash. The photoreceptors are the rods and
cones in which a negative receptor potential is elicited. This drives the bipolar cell to become either
depolarized or hyperpolarized. The amacrine cell has a negative feedback effect. The ganglion cell
fires an action pulse so that the resulting spike train is proportional to the light stimulus level. (bem.fi)
The classical photoreceptors, cones and rods, are not designed to encode absolute light levels (unlike melanopsin RGCs), and non-linearity is introduced into visual processing very early, already at the retinal level. The pigment conformational change (cis to trans) is linear in relation to light intensity, but the photoreceptor response (a-wave) is already nonlinear.
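A standard way to describe this early compressive nonlinearity is the Naka-Rushton saturation function routinely fitted to ERG intensity-response series. A hedged sketch contrasting it with the linear pigment stage; the equation is the textbook form, while the parameter values are invented for illustration:

    # Linear pigment isomerization vs saturating (Naka-Rushton) response:
    # R/Rmax = I^n / (I^n + sigma^n). Parameter values are illustrative only.
    def naka_rushton(intensity, r_max=1.0, sigma=100.0, n=1.0):
        return r_max * intensity**n / (intensity**n + sigma**n)

    for i in (10.0, 100.0, 1000.0):
        linear = i / 1000.0  # pigment activation ~ proportional to intensity
        print(f"I={i:7.1f}  linear={linear:.3f}  response={naka_rushton(i):.3f}")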
The dependence of the b-wave amplitude (solid squares) and the a-wave amplitude (open squares) on the log intensity of the light stimulus. The data points describe the mean ± 2 SD of the responses obtained in the dark-adapted state from 40 eyes of 20 volunteers with normal vision.
The relationship between the b-wave amplitude and the a-wave amplitude obtained from responses evoked in the dark-adapted state. The continuous line describes the mean relationship, while the 2 dashed lines bound the normal range (mean ± 2 SD). Open and solid triangles represent normal ERG data obtained, respectively, from papers by Berson and Weleber. Data from 2 patients are also illustrated; one patient suffers from high myopia (open circles), while the other complained of nyctalopia (solid circles).
Relationship between the amplitudes of the b-wave and the a-wave as a useful index for evaluating the electroretinogram. I. Perlman. Br J Ophthalmol 1983;67:443-448. doi:10.1136/bjo.67.7.443. Cited by 69
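Perlman's index amounts to checking a measured (a-wave, b-wave) amplitude pair against the normal band of b-wave predicted from a-wave, mean ± 2 SD. A minimal sketch of such a check; the linear band parameters below are placeholders, not the published fit:

    # Flag b-wave amplitudes falling outside the normal band (mean +/- 2 SD
    # of b predicted from a). Slope/intercept/SD here are invented numbers.
    def b_wave_in_normal_range(a_uV, b_uV, slope=2.0, intercept=0.0, sd_uV=40.0):
        predicted_b = slope * a_uV + intercept
        return abs(b_uV - predicted_b) <= 2.0 * sd_uV

    print(b_wave_in_normal_range(a_uV=150.0, b_uV=310.0))  # True: inside band
    print(b_wave_in_normal_range(a_uV=150.0, b_uV=150.0))  # False: b-wave too small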
Retina advanced processing
(2010) http://dx.doi.org/10.1016/j.neuron.2009.12.009, Cited by 266
Computations Performed by the Retina and Their Underlying Microcircuits:
(A) Detection of dim light flashes in the rod-to-rod bipolar pathway. Each photoreceptor output is sent through a band-pass temporal filter followed by a thresholding operation before summation by the rod bipolar cell (see the code sketch after this list).
(B) Sensitivity to texture motion. The bipolar
cells have biphasic dynamics and thus
respond transiently. Only the depolarized
bipolar cells communicate to the ganglion cell,
because of rectification in synaptic
transmission.
(C) Detection of differential motion. Polyaxonal amacrine cells in the periphery are excited by
the same motion-sensitive circuit and send
inhibitory inputs to the center. If motion in the
periphery is synchronous with that in the
center, the excitatory transients will coincide
with the inhibitory ones, and firing is
suppressed.
(D) Detection of approaching motion. The
circuit that generates this approach sensitivity is
composed of excitation from OFF bipolar cells
and inhibition from amacrine cells that are
activated by ON bipolar cells, at least partly via
gap junction coupling. Importantly, these inputs
are nonlinearly rectified before integration by the
ganglion cell.
(E) Rapid encoding of spatial structures with spike latencies. The responses result from a circuit that combines synaptic inputs from both ON and OFF bipolar cells whose signals are individually rectified. The timing differences in the responses follow from a delay (∆t) in the ON pathway.
(F) Switching circuit. A control signal selectively
gates one of two potential input signals. (Right) In
the retina, such a control signal is driven by certain
wide-field amacrine cells (A1), which are activated
during rapid image shifts in the periphery. Their
activation leads to a suppression of OFF bipolar
signals and, through a putative local amacrine cell
(A2), to disinhibition of ON bipolar signals.
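The motif in (A) (band-pass filter, threshold, sum) recurs in (B), (D) and (E) as nonlinear subunit pooling. A toy sketch of the pattern; the filter kernel, threshold and signals are assumptions for illustration only:

    # Filter-threshold-sum motif: each subunit is band-pass filtered and
    # rectified before the pooling cell sums them. All parameters invented.
    import numpy as np

    def bandpass(signal, kernel=np.array([0.25, 0.5, -0.75])):
        return np.convolve(signal, kernel, mode="same")  # crude biphasic filter

    def pooled_response(subunit_signals, threshold=0.1):
        rectified = [np.maximum(bandpass(s) - threshold, 0.0)
                     for s in subunit_signals]
        return np.sum(rectified, axis=0)  # nonlinear subunit pooling

    rng = np.random.default_rng(0)
    subunits = [rng.normal(0.0, 0.05, 100) for _ in range(4)]
    subunits[2][50] += 1.0                      # a dim "flash" on one subunit
    print(pooled_response(subunits).argmax())   # peak lands near sample 50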
Journal of Vision May 2008, Vol.8, 15
doi: 10.1167/8.5.15
Cited by 42
Basic data from Hofer, Singer, et al. (2005). (Top
panels) Schematics of retinal mosaics used for
the 5 observers. These are subsets of the full
regions characterized for each observer. For
each observer, the imaging and densitometry
data were insufficient to assign a class or exact
location to some cones. These parameters were
filled in according to the procedure described by
Hofer, Singer, et al. (2005). In the schematics, L
cones are colored red, M cones green, and S
cones blue. L:M ratios of mosaics used: HS 1:3.1; YY 1.2:1; AP 1.3:1; MD 1.6:1; BS 14.7:1.
Roorda (2011): Instead of eliciting three classes of response generated from the stimulation of the
three cone classes, they found that subjects demanded as many as 7 color categories. Analysis
of the responses suggested that the color appearance generated by a single cone is more a function of
how it is situated with respect to other cones rather than by its spectral subtype. Cones that are in a
position to provide strong chromatic cues generate colored percepts, whereas cones that are not in a
good position to do so generate achromatic, or white percepts. Given the random arrangement of the
three cone classes in the retina, it is sensible that the visual system would develop in this way to best
handle the dual role that retina has in conveying both spatial and color vision.
An adaptive optics system was used to measure and correct for aberrations in the optics of individual observers. This enabled resolution of individual cones in acquired fundus images.
Science Advances 14 Sep 2016:
Vol. 2, no. 9, e1600797
Retina already recurrent as well
Published: May 3, 2011
http://dx.doi.org/10.1371/journal.pbio.1001058
Published: May 3, 2011 | http://dx.doi.org/10.1371/journal.pbio.1001057
A conceptual model of positive feedback in the outer retina. (A)
Diagram depicting the differential spread of positive and negative feedback within
an HC. The top bar denotes the illumination pattern. A cone depolarized in darkness
will release glutamate, activating AMPA receptors (AMPARs), causing depolarization
and Ca2+ influx. The rise in Ca2+ is restricted to the specific dendrite that contacts
the cone, and the resulting positive feedback is localized to that cone. The
depolarization spreads electrotonically through the HC, resulting in negative
feedback from all of the dendrites. (B) Model simulations of the effect of feedback
on synaptic release from a linear array of cones exposed to a dark spot on a non-
saturating light background (see Methods). The positive feedback signal (blue)
is localized to HC dendrites in contact with dark cones while the negative
feedback signal (red) electrotonically spreads through the HCs. Traces show
simulated cone release with no feedback (green), with negative feedback, (red), and
with equally weighted negative and positive feedback (blue).
Spatial circuitry models in dark- and light-adapted conditions. A: in
dark-adapted conditions, OFF bipolar cells receive wide spatial inhibition from wide-
field GABAergic amacrine cells. Coupling between both AII and other glycinergic
amacrine cells likely contribute to increasing the wide spatial spread of glycinergic
signals to OFF bipolar cells. B: in light-adapted conditions, OFF bipolar cells receive
spatially narrow glycinergic input, likely due to uncoupling of AII and other glycinergic
amacrine cells. Light stimuli distant from the bipolar cell likely activate serial inhibitory
connections between GABAergic amacrine cells, which would shorten spatial
GABAergic signals to OFF bipolar cells. C: functional schematic of changing bipolar
cell center-surround sizes. In dark-adapted conditions, OFF bipolar cells receive
wide and strong inhibition, so their inhibitory surrounds are large. If 2 small spots
of light are presented to the retina, spot A stimulates excitatory output from the
center of one OFF bipolar cell, whereas spot B stimulates surround inhibitory
connections to that same cell. Overall output is reduced in this instance due to the
addition of inhibitory input. In light-adapted conditions, OFF bipolar cells receive
narrow and weaker inhibition, so their inhibitory surrounds are small. In these
conditions, spot B does not stimulate the inhibitory surround, and there is no
reduction in excitatory bipolar cell output from spot A. Thus the bipolar cell output in the light-adapted case is stronger.
doi:10.1152/jn.00948.2015
Neuroscience Deep Learning | Background
fMRI + EEG + behavioral multimodal data
http://dx.doi.org/10.1016/j.neuroimage.2015.12.030
Specifically, we show how combining either EEG or fMRI with a behavioral model can perform substantially better than a behavioral-data-only model in both generative and predictive modeling analyses. We then show how a trivariate model – a model including EEG, fMRI, and behavioral data – outperforms bivariate models in both generative and predictive modeling analyses.
Graphical diagram derived from Turner et al. (2016) [see previous slides for EEG+fMRI]: Observable data are represented as gray boxes, whereas unknown (latent) variables are represented as empty circles. The orange plate represents the behavioral data/model, the green plate represents the EEG data/model, and the blue plate represents the fMRI data/model. The method allows for any behavioral model to be combined with multiple neural measures.
Generative Deep Network: improve the existing generative models
MEG Visual processing with deep learning
http://dx.doi.org/10.1016/j.neuroimage.2016.03.063
Magnetoencephalography (MEG)
Image set and single-image decoding. (A) The stimulus
set comprised 48 indoor scene images differing in the size of the
space depicted (small vs. large), as well as clutter, contrast, and
luminance level; here each experimental factor combination is
exemplified by one image. The image set was based on
behaviorally validated images of scenes differing in size and
clutter level, de-correlating factors size and clutter explicitly by
experimental design (Park et al., 2015).
The deep neural network architecture “AlexNet” was implemented
following Krizhevsky et al. (2012). We chose this particular
architecture because it was the best-performing model in object classification in the ImageNet 2012 competition (Russakovsky et al., 2014).
The deep scene model accounts for more of the MEG size
signal than other models. (A) We combined representational
similarity with partial correlation analysis to determine which
computational models explained emerging representations of
scene size in the brain.
Together our data provide a first description of an electrophysiological signal for layout
processing in humans and suggest that deep neural networks are a promising
framework to investigate how spatial layout representations emerge in the human brain.
Future studies using image sets optimized to drive low- and high-level visual cortex equally are necessary to test whether layer-specific representations in deep neural networks can be mapped in both time and space onto processing stages in the human brain.
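The analysis named above, representational similarity with partial correlation, reduces to correlating a model RDM with the MEG RDM while partialling out competing model RDMs. A minimal numpy sketch under that reading; the data and variable names are placeholders, not the study's pipeline:

    # RSA with partial correlation: correlate a model RDM with the MEG RDM
    # after regressing out a competing model RDM. Random placeholder data.
    import numpy as np

    def residualize(y, x):
        # Residuals of y after least-squares regression on x (with intercept)
        X = np.column_stack([np.ones_like(x), x])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return y - X @ beta

    def partial_corr(meg_rdm, model_rdm, competing_rdm):
        return np.corrcoef(residualize(meg_rdm, competing_rdm),
                           residualize(model_rdm, competing_rdm))[0, 1]

    rng = np.random.default_rng(1)
    n = 48 * 47 // 2                      # upper triangle of a 48-image RDM
    deep_scene, gist, meg = rng.normal(size=(3, n))
    meg = meg + 0.5 * deep_scene          # deep scene model partly explanatory
    print(partial_corr(meg, deep_scene, gist))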
Sidenote! AlexNet was indeed revolutionary in its time, but the 2015 ImageNet winner, Microsoft's ResNet, achieved a top-5 error (3.57%) below the estimated human error (~5%) on the classification task.
Brain Circuit feed-forward vs recurrent
a | Feedforward network. The
diagram shows a multilayer perceptron,
consisting of three sequential layers of
neurons (represented by circles), in which
every neuron from each layer is
connected to every neuron of the next
layer. In this network, inputs are
sequentially processed layer by layer in a
unidirectional fashion, from the input
layer on the left, to the ‘hidden’ layer in
the middle, to the output layer on the
right. The simple addition of synaptic
weights in the output layer results in the
generation of selective responses. The
computation is an emergent property of
the activity of the entire network.
b | Recurrent network: an
example of an attractor (feedback)
neural network in which four
pyramidal neurons (blue) are
connected to themselves through
recurrent axons (thin lines) with
synaptic weights (wij) that change
owing to a learning rule. The
network receives an external set of
inputs (top connections) and
generates an output (bottom
arrows). In networks with recurrent
and symmetric connectivity the
activity becomes ‘attracted’ to
particular stable patterns.
http://dx.doi.org/10.1038/nrn3962, Cited by 49
Nature Reviews Neuroscience 11, 615-627
(September 2010)
doi:10.1038/nrn2886
The feedforward network as a model of
information processing in the brain. a | A
schematic of hierarchical processing in the visual
systems of primates. Similar schematic models
have also been described for other sensory and
motor areas. b | Each module in part a can be
considered as a recurrent network of excitatory
and inhibitory neurons. Each of the rectangular
boxes represents a recurrent random network. The
hierarchical structure of the brain is conceived
here as a network of recurrent networks
with forward and backward excitatory
connections. So far, only the feedforward part
(shown in black) of such a network of networks
has been investigated in a systematic manner.
Recurrent excitation and inhibition within one
group and excitatory synapses that do not
contribute to the feedforward hierarchy of
subsequent groups (shown in grey) have not been
considered yet.
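The contrast between the two wiring schemes can be written out in a few lines. A toy sketch; the sizes, nonlinearity and iteration count are arbitrary choices of mine:

    # a) Feedforward: each weight matrix is applied once, layer by layer.
    # b) Recurrent: one (here symmetric) matrix is iterated; with symmetric
    #    connectivity the state tends toward stable "attractor" patterns.
    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=8)                             # external input

    W1, W2 = rng.normal(size=(5, 8)), rng.normal(size=(3, 5))
    y_feedforward = np.tanh(W2 @ np.tanh(W1 @ x))      # one pass through layers

    W = rng.normal(size=(8, 8)); W = (W + W.T) / 2.0   # symmetric recurrent weights
    h = np.zeros(8)
    for _ in range(50):
        h = np.tanh(W @ h + x)                         # input re-injected each step
    print(y_feedforward, h[:3])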
Residual variants: state-of-the-art deep feedforward networks
https://arxiv.org/abs/1512.03385; Cited by 578
https://arxiv.org/abs/1603.05027
https://arxiv.org/abs/1602.07261
https://arxiv.org/abs/1602.07360
http://dx.doi.org/10.1007/978-3-319-46976-8_19
Skip connections
https://arxiv.org/abs/1604.08671
The framework of the proposed DEGREE network. The recurrent residual network recovers sub-bands of the HR image features iteratively, and edge features are utilized as guidance in image super-resolution (SR) for preserving sharp details.
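The common core of the residual variants cited above is the identity skip connection, output = input + F(input). A minimal sketch; F and the layer sizes are toy choices, not any specific paper's architecture:

    # Minimal residual block with an identity shortcut: y = x + F(x).
    # F here is a toy linear-ReLU-linear transform; sizes are arbitrary.
    import numpy as np

    def residual_block(x, W1, W2):
        f = W2 @ np.maximum(W1 @ x, 0.0)   # residual branch F(x)
        return x + f                       # identity skip connection

    rng = np.random.default_rng(0)
    x = rng.normal(size=16)
    W1 = rng.normal(size=(16, 16)) * 0.1
    W2 = rng.normal(size=(16, 16)) * 0.1
    print(residual_block(x, W1, W2)[:4])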
Circuit design deep networks vs Human brain #1
Center for Data Science, New York University
Department of Brain and Cognitive Sciences, MIT
Department of Psychology and Center for Brain Science, Harvard University
Center for Brains Minds and Machines
https://arxiv.org/abs/1604.00289; Cited by 13
Science 11 Dec 2015:
Vol. 350, Issue 6266, pp. 1332-1338
DOI: 10.1126/science.aab3050; Cited by 70
Circuit design deep networks vs Human brain #2
https://arxiv.org/abs/1604.03640
Center for Brains, Minds and Machines, McGovern Institute, MIT
How similar is an ultra-deep residual network to the primate cortex? A notable difference is the depth. While a residual network can have as many as 1202 layers, biological systems seem to have two orders of magnitude fewer. In fact, there are about half a dozen areas in the ventral stream of visual cortex from the retina to the Inferior Temporal cortex. Notice that it takes on the order of 10 ms for neural activity to propagate from one area to the next. The evolutionary advantage of having fewer layers is apparent: it supports rapid (100 ms from image onset to meaningful information in the IT neural population) visual recognition, which is a key ability of human and non-human primates. It is intriguingly possible to account for this discrepancy by taking into account recurrent connections within each visual area. Areas in visual cortex comprise six different layers with lateral and feedback connections, which are believed to mediate some attentional effects and even learning (such as backpropagation). "Unrolling" in time the recurrent computations carried out by the visual cortex provides an equivalent "ultra-deep" feedforward network, which might represent a more appropriate comparison with the state-of-the-art computer vision models.
In addition, we conjecture that the effectiveness of recent "ultra-deep" neural networks primarily comes from the fact that they can efficiently model the recurrent computations required by the recognition task. We show compelling evidence for this conjecture by demonstrating that 1. a deep residual network is formally equivalent to a shallow RNN; 2. such an RNN with weight sharing, and thus with orders of magnitude fewer parameters (depending on the unrolling depth), can retain most of the performance of the corresponding deep residual network. Furthermore, we generalize such an RNN into a class of models that are more biologically plausible models of cortex and show their effectiveness on CIFAR-10.
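Point 1 of the conjecture, that a residual network is formally a shallow RNN unrolled in time, can be made concrete by reusing one residual transform across all "layers". A sketch of that equivalence, not the authors' code; all values are toy:

    # ResNet <-> shallow-RNN equivalence: unrolling h_{t+1} = h_t + F(h_t)
    # for T steps with ONE shared F gives a T-layer residual network whose
    # blocks all share weights. Toy F, toy sizes.
    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.normal(size=(16, 16)) * 0.02   # single weight matrix, reused

    def F(h):
        return W @ np.maximum(h, 0.0)      # shared residual branch

    h = rng.normal(size=16)                # input state
    for _ in range(30):                    # unrolling depth = network depth
        h = h + F(h)                       # identical update to a residual block
    print(h[:4])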
The transition matrices used in the paper. "BN" denotes Batch Normalization and "Conv" denotes convolution. A deconvolution layer (denoted by "Deconv") is used [34] as a transition function from a spatially small state to a spatially large one. BRCx2/BRDx2 denotes a BN-ReLU-Conv/Deconv-BN-ReLU-Conv/Deconv pipeline (similar to a residual module). There is always a 2x2 subsampling/upsampling between nearby states (e.g., V1/h1: 32x32, V2/h2: 16x16, V4/h3: 8x8, IT: 4x4). Stride 2 (convolution) or upsampling 2 (deconvolution) is used in transition functions to match the spatial sizes of input and output states. The intermediate feature sizes of transition functions BRCx2/BRDx2 or BRCx3/BRDx3 are chosen to be the average feature size of the input and output states. "+I" denotes an identity shortcut mapping. The design of transition functions could be an interesting topic for future research.
Circuit design deep networks vs Human brain #3
HYPOTHESIS & THEORY ARTICLE
Front. Comput. Neurosci., 14 September 2016 | http://dx.doi.org/10.3389/fncom.2016.00094; Cited by 5
Putative differences between conventional and brain-like neural network
designs. (A) In conventional deep learning, supervised training is based on externally-supplied,
labeled data. (B) In the brain, supervised training of networks can still occur via gradient descent
on an error signal, but this error signal must arise from internally generated cost functions.
(C) Internally generated cost functions and error-driven training of cortical deep networks form
part of a larger architecture containing several specialized systems. Although the trainable cortical
areas are schematized as feedforward neural networks here, LSTMs or other types of recurrent
networks may be a more accurate analogy, and many neuronal and network properties such as
spiking, dendritic computation, neuromodulation, adaptation and homeostatic plasticity, timing-
dependent plasticity, direct electrical connections, transient synaptic dynamics,
excitatory/inhibitory balance, spontaneous oscillatory activity, axonal conduction delays (
Izhikevich, 2006) and others, will influence what and how such networks learn.
This article is part of the Research Topic: Artificial Neural Networks as Models of Neural Information Processing
Machine learning and neuroscience speak different languages today. Brain science has
discovered a dazzling array of brain areas (Solari and Stoner, 2011), cell types, molecules,
cellular states, and mechanisms for computation and information storage. Machine
learning, in contrast, has largely focused on instantiations of a single principle: function
optimization.
We will argue here, however, that neuroscience and machine learning are again ripe for
convergence. Three aspects of machine learning are particularly important in the context
of this paper.
Hypothesis 1 – The Brain Optimizes Cost Functions
Hypothesis 2 – Cost Functions Are Diverse across Areas and Change over
Development
Hypothesis 3 – Specialized Systems Allow Efficient Solution of Key Computational
Problems
Machine learning may be equally transformed by neuroscience. Within the brain, a
myriad of subsystems and layers work together to produce an agent that exhibits general
intelligence.
Hypothesis 1 – Existence of Cost Functions
Hypothesis 2 – Biological Fine-structure of Cost Functions
Hypothesis 3 – Embedding within a Pre-structured Architecture
Hypothesis 1 – Did Evolution Separate Cost Functions from Optimization Algorithms?
We hypothesize that the brain also acquired such a separation between
optimization mechanisms and cost functions. When did the division
between cost functions and optimization algorithms occur? How is this
separation implemented? How did innovations in cost functions and
optimization algorithms evolve? And how do our own cost functions and
learning algorithms differ from those of other animals?
Data-driven Ophthalmology

  • 2. Introduction ● Purpose of this presentation is to provide a light visual literature review on “big data” or deep learning / artificial intelligence solutions to come for ophthalmology and vision sciences. – More with an idea to introduce topics that you might have not thought of before without going to deeply to details Some of the background in order to understand this presentation better are covered in my previous presentation → ● Presentation itself is quite dense, and better suitable to be read from a tablet/desktop rather than as a slideshow projected somewhere Shallow introduction for Deep Learning Retinal Image Analysis Published on Aug 20, 2016 https://www.slideshare.net/PetteriTeikariPhD/shallow-introduction -for-deep-learning-retinal-image-analysis
  • 4. Ophthalmic IMAGING 2D Fundus 3D OCT→ Examples of color and high-dynamic-range (HDR) disc photographs of 2 normal controls (a, b and c, d) and 2 glaucoma patients (e, f and g, h). Left column (a, c, e, and g) color disc photograph and right column (b, d, f, and h) high-dynamic-range concept disc photograph. https://doi.org/10.1155/2017/8209270 Linear-scale adaptive optics (AO)-Optical Coherence Tomography (OCT) volume acquired with three different AO focus depths (RNFL, OPL, and IS/OS) and combined for displaying appearance of retinal layers in AO-OCT images. En face images are projections of subvolumes shown in the middle, demonstrating the fine-depth sectioning ability of AO-OCT. (Jonnal et al., 2016) Optical Coherence Tomography (OCT) and its variants, the de facto standard for eye diagnostics Multispectral imaging going beyond RGB channels and laser-based OCTs (Figure from Annidis)
  • 5. Ophthalmic IMAGING (A)SLO and multimodal systems (2015) https://doi.org/10.1364/BOE.6.001407 (2016) https://doi.org/10.1364/BOE.7.001783 https://doi.org/10.1007/s00417-016-3361-7 Fundus autofluorescence, microperimetry and hyperreflective intraretinal spor (HRS) analysis using OCT
  • 6. Ophthalmic IMAGING Functional Imaging http://dx.doi.org/10.1167/iovs.16-21389 http://dx.doi.org/10.1167/iovs.16-20598 Model of the retinal vasculature represented by a binary tree. The vessels bifurcate in a dichotomous manner except for the precapillaries, which are point of origin of four capillaries. Adapted from Takahashi et al. (2009) http://dx.doi.org/10.1111/aos.13365 http://dx.doi.org/10.1080/02713683.2016.1217544 KEYWORDS: Hyperspectral retinal camera, primary open-angle glaucoma, retinal oxygen saturation http://dx.doi.org/10.1167/iovs.13-12124 The average arteriolar (left) and venular (right) OD values at each given (5-nm) imaged wavelength from 500 to 600 nm for all of the volunteers. In summary, this article has described a novel hyperspectral prototype for spectral imaging of the retina that can potentially be used in the future to acquire retinal vessel blood oxygen saturation values. By considering the limitations of ocular imaging encountered by other retinal oximetry studies, namely longer acquisition and exposure times, flash exposure, and limited wavelength intervals, this new instrument may be promising in acquiring more refined and faster measurements of nonflash exposure retinal oximetry measurements in vivo that can potentially be applied to human retinal vascular disease.
  • 7. Ophthalmic IMAGING portable imaging Human Factor and Usability Testing of a Binocular OCT System - EASE Study Reena Chopra1 , Padraig J. Mulholland1, 2 , Adam M. Dubis1 , Roger S. Anderson1, 2 , Pearse A. Keane1 1 NIHR Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, United Kingdom; 2 Optometry and Vision Science Research Group, School of Biomedical Sciences, Ulster University, Coleraine, Northern Ireland, United Kingdom Automated quantitative pupillometry using the Binocular OCT Purpose: A prototype binocular optical coherence tomography (OCT) device has recently been developed that performs ‘whole-eye’ OCT imaging in an automated manner (Envision Diagnostics, Inc. USA). The inclusion of ‘smart technology’ such as customizable display screens and voice recognition also permits the quantitative assessment of visual acuity (VA), visual fields, ocular motility, and pupillometry (Fig. 1). As this device will primarily be used in elderly and visually impaired populations, we performed prospective usability testing of an early prototype with a view to predicting function in a clinical setting, and to identify any potential user errors – EASE Study (ClinicalTrials.gov Identfier: NCT02822612). ARVO 2017 Annual Meeting Abstracts Session 516: Advancements in OCT Ophthalmologica 2017;238:89-99https://doi.org/10.1159/000475773 http://dx.doi.org/10.15761/NFO.1000102 Fundus Photography in the 21st Century—A Review of Recent Technological Advances and Their Implications for Worldwide Healthcare Panwar Nishtha, Huang Philemon, Lee Jiaying, Keane Pearse A., Chuan Tjin Swee, Richhariya Ashutosh, Teoh Stephen, Lim Tock Han, and Agrawal Rupesh. Telemedicine and e-Health. March 2016, 22(3): 198-208. https://doi.org/10.1089/tmj.2015.0068 iCam, 3nethra, CenterVue, iOptics EasyScan, Topcon TRC-NW8FPLUS, Zeiss Visucam 200, Kowa Nonmyd7, Canon CR-2, Oculus Imagecam, iExaminer, PanOptic, Volk Pictor, VersaCam, JedMed Horus Scope, Optomed Smartscope, Kowa Genesis-D, Riester, Ocular Cellscope, PEEK, dEye
  • 8. Retinal Layer Segmentation Pathological retina challenging still https://arxiv.org/abs/1704.02161 https://arxiv.org/abs/1707.04931 Branch Residual U-Network (BRU-net) https://doi.org/10.1364/BOE.8.003292 https://doi.org/10.1364/BOE.8.001926 Voxeleron Awarded NIH SBIR Grant for Device-independent Retinal OCT Image Analysis Software February 8, 2017 Daniel Russakoff Voxeleron will collaborate with Professor Pablo Villoslada of UCSF/IDIBAPS and Dr. Pearse Keane of Moorfields Eye Hospital to validate the algorithms and ensure clinical utility. in the choriocapillaris is shown. https://www.voxeleron.com/orion/
  • 10. Other Retinal segmentation & Detection Christos Bergeles, Adam M. Dubis, Benjamin Davidson, Melissa Kasilian, Angelos Kalitzeos, Joseph Carroll, Alfredo Dubra, Michel Michaelides, and Sebastien Ourselin Biomedical Optics Express Vol. 8, Issue 6, pp. 3081-3094 (2017) https://doi.org/10.1364/BOE.8.003081 https://arxiv.org/abs/1706.03008 (2017) https://doi.org/10.1109/ISBI.2017.7950704 Suman Sedai, Ruwan Tennakoon, Pallab Roy Khoa Cao and Rahil Garnavi IBM Research - Australia, Melbourne, VIC, Australia localization of the fovea, second stage produces an accurate segmentation of the fovea region. We present an algorithm that automatically detects cones in AOSLO split-detection images without supervision. Our algorithm is among the first that use machine learning to develop and use a photoreceptor model on-the-fly. Comparing to Cunefare et al. (2016), specifically, the approach presented here can tackle both densely and sparsely populated photoreceptor images as it is independent of the spatial arrangement of cones. Further, it introduces contrast enhancement filters, which improve the quality of low signal-to- noise ratio (SNR) images. m
  • 11. Optic disc and Cup segmentation or detection https://arxiv.org/abs/1704.00979 Visual comparison of the predicted results and correct segmentation on RIM-ONE v.3 for the optic disc (a)-(c), (g)-(i) and cup (d)-(f), (j)-(l). On (d)-(f), (j)-(l) region of the optic disc is shown as an input image. https://doi.org/10.1109/TPAMI.2016.2577031 https://arxiv.org/abs/1707.06397 We propose a simple yet effective method, termed Deep Descriptor Transforming (DDT), for evaluating the correlations of descriptors and then obtaining the category- consistent regions, which can accurately locate the common object in a set of unlabeled images, i.e., unsupervised object discovery.
  • 12. IMAGE CLASSIFICATION #1 July–August, 2017 Volume 1, Issue 4, Pages 322–327 Cecilia S. Lee, MD, Doug M. Baughman, BS, Aaron Y. Lee, MD, MSCI Department of Ophthalmology, University of Washington School of Medicine, Seattle, Washington. http://dx.doi.org/10.1016/j.oret.2016.12.009 Examples of identification of pathology by the deep learning algorithm. Optical coherence tomography images showing age-related macular degeneration (AMD) pathology (A, B, C) are used as input images, and hotspots (D, E, F) are identified using an occlusion test from the deep learning algorithm. The intensity of the color is determined by the drop in the probability of being labeled AMD when occluded. An occlusion test (Zeiler and Fergus, 2016) was performed to identify the areas contributing most to the neural network's assigning the category of AMD. A blank 20 × 20-pixel box was systematically moved across every possible position in the image and the probabilities were recorded. The highest drop in the probability represents the region of interest that contributed the highest importance to the deep learning algorithm. Varun Gulshan, PhD1; Lily Peng, MD, PhD1; Marc Coram, PhD1; et al JAMA. 2016;316(22):2402-2410. doi:10.1001/jama.2016.17216 Validation Set Performance for All-Cause Referable Diabetic Retinopathy in the EyePACS-1 Data Set (9946 Images) Performance of the algorithm (black curve) and ophthalmologists (colored circles) for all-cause referable diabetic retinopathy, defined as moderate or worse diabetic retinopathy, diabetic macular edema, or ungradable image. The black diamonds highlight the performance of the algorithm at the high-sensitivity and high-specificity operating points.
  • 13. IMAGE CLASSIFICATION #2 Stefanos Apostolopoulos, Carlos Ciller, Sandro I. De Zanet, Sebastian Wolf, Raphael Sznitman https://arxiv.org/abs/1610.03628 Ahmed ElTanboly,Marwa Ismail, Ahmed Shalaby, Andy Switala, Ayman El-Baz, Shlomit Schaal, Georgy Gimel’farb,Magdi El-Azab First published: 17 March 2017 DOI: 10.1002/mp.12071 https://doi.org/10.1146/annurev-bioeng-071516-044442
  • 14. IMAGE Quality in image classification Image Restoration: From Sparse and Low-rank Priors to Deep Priors Learning Deep CNN Denoiser Prior for Image Restoration Lei Zhang,, Wangmeng Zuo The Hong Kong Polytechnic University, Harbin Institute of Technology CLEAN GAUSSIAN NOISE GAUSSIAN BLUR Example performance of quality resilient networks on various quality distortions. This table shows the class prediction for an image under several different types of distortions (from top to bottom: clean, Gaussian noise and Gaussian blur). The original VGG16 network (Mclean ) fails on distorted images. Networks fine-tuned on different types of distortions perform well on that particular distortion, but not on other distortion types (Mnoise and Mblur ). Our mixture of experts based model (Mmix ) performs well over all distortion types as well as the original clean image. https://arxiv.org/abs/1703.08119 https://arxiv.org/abs/1611.05760 State-of-the-art image classification networks like VGG-16 perform poorly on blurred input (left), when using model weights trained on high-quality sharp image datasets (center). However, while they often make erroneous predictions in terms of the most likely classes for a blurred image, they do so with lower confidence—producing distributions that are higher-entropy than those for sharp images. However, this drop in performance is largely an artifact of being trained without any blurred examples. We find that by fine-tuning the model on a mix of blurred and sharp images for just a few epochs, allows it to perform well on both sharp and blurred inputs (right).
  • 15. IMAGE Restoration enhancement Deep Bilateral Learning for Real-Time Image Enhancement MICHAËL GHARBI, MIT CSAIL; JIAWEN CHEN, Google Research; JONATHAN T. BARRON, Google Research; SAMUEL W. HASINOFF, Google Research; FRÉDO DURAND, MIT CSAIL / Inria, Université Côte d’Azur, http://dx.doi.org/10.1145/3072959.3073592 https://arxiv.org/abs/1707.02880 Our novel neural network architecture can reproduce sophisticated image enhancements with inference running in real time at full HD resolution on mobile devices. It can not only be used to dramatically accelerate reference implementations, but can also learn subjective effects from human retouching. Image Restoration: From Sparse and Low-rank Priors to Deep Priors Lei Zhang,, Wangmeng Zuo The Hong Kong Polytechnic University, Harbin Institute of Technology https://arxiv.org/abs/1704.03264 Kai Zhang ; Wangmeng Zuo ; Yunjin Chen ; Deyu Meng ; Lei Zhang https://doi.org/10.1109/TIP.2017.2662206 An example to show the capacity of our proposed model for three different tasks (denoising, super-resolution, JPEG image deblocking). The input image is composed by noisy images with noise level 15 (upper left) and 25 (lower left), bicubically interpolated low-resolution images with upscaling factor 2 (upper middle) and 3 (lower middle), JPEG images with quality factor 10 (upper right) and 30 (lower right). Note that the white lines in the input image are just used for distinguishing the six regions, and the residual image is normalized into the range of [0, 1] for visualization. Even the input image is corrupted with different distortions in different regions, the restored image looks natural and does not have obvious artifacts.
  • 16. IMAGE CLASSIFICATION Jointly with image restoration https://arxiv.org/abs/1706.04284 https://arxiv.org/abs/1701.06487 (a) The whole ground truth image 0051x4 from DIV2K dataset. We show the comparison of the zoom-in region between: (b) the ground truth; (c) the noisy image with i.i.d. Gaussian noise of zero mean and σ = 30; (d) the denoised image by BM3D ; the denoising result of our proposed denoising network (e) without the guidance of high-level vision information; (f) with the guidance of high-level vision information Our experimental results demonstrate that the proposed architecture not only yields superior image denoising results preserving fine details, but also overcomes the performance degradation of different high-level vision tasks, e.g., image classification and semantic segmentation, due to image noise or artifacts caused by conventional denoising approaches such as over-smoothing. We propose a novel end-to-end differentiable architecture for joint denoising, deblurring, and classification that makes classification robust to realistic noise and blur. The proposed architecture dramatically improves the accuracy of a classification network in low light and other challenging conditions, outperforming alternative approaches such as retraining the network on noisy and blurry images and preprocessing raw sensor inputs with conventional denoising and deblurring algorithms
  • 17. UNCERTAINTY in image enhancement https://arxiv.org/abs/1705.00664 In this work, we investigate the value of uncertainty modelling in 3D super- resolution with convolutional neural networks (CNNs). However, the highly ill- posed nature of such problems results in inevitable ambiguity in the learning of networks. We propose to account for intrinsic uncertainty through a per-patch heteroscedastic noise model and for parameter uncertainty through approximate Bayesian inference in the form of variational dropout. We demonstrate through experiments on both healthy and pathological brains the potential utility of such an uncertainty measure in the risk assessment of the super-resolved images for subsequent clinical use. This paper proposes a new implementation of supervised image quality enhancement method referred as Bayesian image quality transfer (IQT). via CNNs. This involves two key innovations in CNN-based models: 1) we extend the subpixel CNNs previously limited to 2D images, to 3D volumes, outperforming previous models in accuracy and speed on a DTI SR task; 2) we devise new architectures enabling estimates of different components of the uncertainty in the SR mapping
  • 19. Sparsity and Model compressability We thoroughly explored the granularity of sparsity with experiments on detailed accuracy-density relationship. Due to the advantage of index saving, coarse-grained pruning is able to achieve a higher model compression ratio, which is desirable for mobile implementation. We also analyzed the hardware implementation advantages and show that coarse-grained sparsity saves 2× output∼ memory access compared with fine- grained sparsity, and ∼ 3× compared with dense implementation. Given the advantages of simplicity and efficiency from a hardware perspective, coarse-grained sparsity enables more efficient hardware architecture design of deep neural networks.
  • 20. Towards multimodal models Combining structuralandfunctionaldata
  • 21. Future of OCT and retinal biomarkers From Schmidt-Erfurth et al. (2016): “The therapeutic efficacy of VEGF inhibition in combination with the potential of OCT-based quantitative biomarkers to guide individualized treatment may shift the medical need from CNV treatment towards other and/or additional treatment modalities. Future therapeutic approaches will likely focus on early and/or disease-modifying interventions aiming to protect the functional and structural integrity of the morphologic complex that is primarily affected in AMD, i.e. the choriocapillary - RPE – photoreceptor unit. Obviously, new biomarkers tailored towards early detection of the specific changes in this functional unit will be required as well as follow-up features defining the optimal therapeutic goal during extended therapy, i.e. life-long in neovascular AMD. Three novel additions to the OCT armamentarium are particularly promising in their capability to identify the biomarkers of the future:” Polarization-sensitive OCT OCT angiography Adaptiveopticsimaging “this modality is particularly appropriate to highlight early features during the pathophysiological development of neovascular AMD Findings from studies using adaptive optics implied that decreased photoreceptor function in early AMD may be possible, suggesting that eyes with pseudodrusen appearance may experience decreased retinal (particularly scotopic) function in AMD independent of CNV or RPE atrophy.” “...the specific patterns of RPE plasticity including RPE atrophy, hypertrophy, and migration can be assessed and quantified). Moreover, polarization-sensitiv e OCT allows precise quantification of RPE-driven disease at the early stage of drusen”, “Angiographic OCT with its potential to capture choriocapillary, RPE, and neuroretinal fetures provides novel types of biomarkers identifying disease pathophysiology rather than late consecutive features during advanced neovascular AMD.”” Schlanitz et al. (2011) zmpbmt.meduniwien.ac.at See also Leitgeb et al. (2014) Zayit-Soudry et al. (2013)
  • 22. Multimodal models in general in medicine https://dx.doi.org/10.1097%2FWCO.0000000000000460 Imaging plus X: multimodal models of neurodegenerative disease Neil P. Oxtoby and Daniel C. Alexander, for the EuroPOND consortium Old paradigm disease progression models. (a) It shows the hypothetical model of Jack et al. (2010), which illustrates qualitative sigmoid evolution in AD of scalar biomarkers such as CSF Aβ level, cognitive test scores and hippocampal volume or atrophy. The lack of quantitative information prevents direct diagnostic usage. (b) It shows a traditional longitudinal model of AD atrophy Scahill et al. (2002) by binning individuals a-priori into ‘mild’, ‘moderate’ and ‘severe’ classes based on cognitive test scores. The model can potentially match new individuals to the same stages using imaging data, but must exclude cognitive scores to avoid circularity. AD, Alzheimer's disease. The temporally continuous self-modelling regression approach of Jedynak et al. (2012). The model shows the characteristic trajectories of a diverse set of biomarkers against a common continuous disease stage variable learned from the ADNI and PAQUID (Personnes Agées Quid) data sets. The model can potentially estimate the disease stage of a new patient by identifying the position along the trajectory set that best matches their data. ADNI, Alzheimer's disease neuroimaging initiative. We have reviewed data-driven model-based analyses of neurodegenerative disease. We have argued the potential for generative data-driven models to take centre stage in the study and management of neurodegenerative diseases if we are to generate new avenues for disease understanding in the earliest, preclinical stages. This is necessitated by the challenges in monitoring any neurological disease over its full time course, coupled with overlapping phenotypes and lack of a single biomarker that is dynamic across the full disease time course. The main focus of development and application to date has been in Alzheimer's disease, but various efforts including the EuroPOND project are expanding the application to other dementias, multiple-sclerosis, prion diseases, normal ageing and development, and even non-brain applications. These techniques have the potential for widespread impact in realising precision medicine across many such domains.
  • 23. Retina as deep learning network Photoreceptor layer Horizontal Cells BipolarCells AmacrineCells GanglionCell layer DL Layer1 DLLayer2 DL Layer3 DL Layer4 DLLayer5 LIGHT BRAIN With enough data, we can do densely connected (i.e. every layer is connected to every other layer) feedforward network (or even recurrent) not having to constrain the network as all the modulatory pathways are notwell known https://arxiv.org/abs/1608.06993; Cited by 29 Joint training of alllayers with layer-wise targets derived from ERGand pupillometry OPN4 https://arxiv.org/abs/1409.5185;Citedby292  Forexample, glaucoma affectsganglion cell function, whereas retinitis pigmentosa affects photoreceptors DL-Deeplearning OPN4- Melanopsin (ipRGC)
  • 24. Retina (and V1) as deep learning network DOI: 10.13140/RG.2.2.27438.72003 12/2016, Conference: NIPS 2016 Workshop - Brains and Bits: Neuroscience Meets Machine Learning, Riccardo Volpi, Istituto Italiano di Tecnologia; Matteo Zanotto; Diego Sona,: Vittorio Murino International Work-Conference on the Interplay Between Natural and Artificial Computation IWINAC 2017: Natural and Artificial Computation for Biomedicine and Neuroscience pp 464-472 Towards a Deep Learning Model of Retina: Retinal Neural Encoding of Color Flash Patterns Antonio Lozano. Javier Garrigós, J. Javier Martínez, J. Manuel Ferrández, Eduardo Fernández https://doi.org/10.1007/978-3-319-59740-9_46 https://arxiv.org/abs/1702.01825 Visualizing the internal activity of a CNN in response to a natural scene stimulus. (A-C) Time series of the CNN activity (averaged over space) for the first convolutional layer (8 units, A), the second convolutional layer (16 units, B), and the final predicted response for an example cell (C, cyan trace). The recorded (true) response is shown below the model prediction (C, gray trace) for comparison. (D) Spatial activation of example CNN filters at a particular time point. The selected stimulus frame (top, grayscale) is represented by parallel pathways encoding spatial information in the first (purple) and second (green) convolutional layers (a subset of the activation maps is shown for brevity). (E) Autocorrelation of the temporal activity in (A-C). The correlation in the recorded firing rates is shown in gray https://doi.org/10.1101/120956 Furthermore, the composite nonlinear computation performed by retinal circuitry corresponds to a boolean OR function applied to bipolar cell feature detectors. Our general computational framework may aid in extracting principles of nonlinear hierarchical sensory processing across diverse modalities from limited data. https://arxiv.org/abs/1706.06208
  • 25. Retina Model synthesis as Deep learning architecture Indirectinferenceonretinalcircuit: Hardtorecordeveryintermediatestepin humans INPUT Light OUTPUT Pupil size McDougal and Gamlin 2008 AUXILIARY OUTPUT functional MRI (fMRI) Temporal transfer functions for the postreceptoral cone pathways.Spitschanet al. (2016). Seealso Hung etal. (2016).The original responses from the achromatic luminance experiments and their derived PCA waveforms. The results of the component analysis illustrate that the pupil response can be described quite well as a linear sum of a sustained and a transientcomponent. - Young etal. (1993) Maynard et al. (2015) INTERMEDIATE OUTPUT Electroretinography (ERG) (left) Proposed neural pathways andsynapticmechanisms underlying ipRGC influence on light adaptation (right) M1 ipRGCs modulate the light-adapted ERG b-wave viaD4dopaminereceptors– Priggeetal. (2016) Multifocal Electroretinogram (UC Davis) The relative spectral sensitivities of the five photoreceptors in human retina, including S-, M-, L-cones, rods, and ipRGCs (A), LED spectral distributions (B), and LED chromaticities in 1964 CIE 10°space(C).- Cao etal. (2015) Deeplearningframework forphototransduction studies, and clinicaldiagnosisdecisionsupportsystems
  • 26. Retina Model synthesis Photoreceptor contributions #1: ERG INPUT Light OUTPUT Pupil size ? Not done in the study by Allen et al. (2016)  INTERMEDIATE OUTPUT Electroretinography (ERG) Vary the light parameters (intensity, wavelight, modulation) to probe what are the 'normal' responses either in visual processing/phototransduction in 'basic science' paradigms, or alternatively employ light parametersthatbestdiscriminate between retinalpathologies. Note! In optimally constructed model with more parameters (more explicit retinal circuitry), one could infer all possible outcomes (pathologicalornot)fromtheframework.Butinpracticewearelimitedtothedataavailable. For example if glaucoma is shown to be detected well using PLR, we could extend that dataset with using same protocol and simultaneously record ERG, visual fields, etc, and then have more complete model, and then have “good” predictive power with ERGandvisualfieldaloneifPLRisnotpossible todo. Rod and cone ERGsover mesopic irradiances. Allenetal.(2016)  Stimulusdesignand quantification. The output of athree-primaryLED light source (peak emission at 354, 460, and 600 nm) wasused to generatefour spectra, with precise excitation of melanopsin, rod, SWS, and LWSopsins. Allenetal.(2016)  Normalized b-wave amplitudes(G), implicit times(H), and OP amplitudes (I) for light-adapted cone ERGsin Opn1mwR mice for pairsof rod-divergent stimuli(blackfilled circles are rod/mel- lowand grayopen circles are rod/mel-high) withstimulusintensityquantified intermsof rod effective photons/cm2/s. - Allen et al. (2016)  We now have the 'pure photoreceptor' response (well, you know Ray), and if these responses are normal but PLR abnormal, we could assume that the problem is downstream giving hints about the given pathology
• 27. ERG Methodological background #1 Bingyao Tan; Erik Mason; Benjamin MacLellan; Kostadinka K. Bizheva. IOVS March 2017, Vol.58, 1673-1681. doi:10.1167/iovs.17-21543. Comparison of the changes in the total axial retinal blood flow (RBF) and the ERG b-wave magnitude resulting from 200-ms single flash and 1-second, 10 Hz, 20% duty cycle flicker stimuli of the same illumination intensity. (A) Representative ERG traces. The pink and gray shaded areas mark the duration of the visual stimuli. Original time recordings of the total axial RBF in response to the single flash and flicker stimuli. Pedro Monsalve; Giacinto Triolo; Jonathon Toft-Nielsen; Jorge Bohorquez; Amanda D. Henderson; Rafael Delgado; Edward Miskiel; Ozcan Ozdamar; William J. Feuer; Vittorio Porciatti. Translational Vision Science & Technology May 2017, Vol.6, 5. doi:10.1167/tvst.6.3.5. A new PERG method with increased dynamic range allows recording of retinal ganglion cell function in advanced stages of optic nerve disorders. It also quantifies the response decline during the test, an autoregulatory adaptation to metabolic challenge that decreases with age and presence of disease. Here we describe a new method for steady-state PERG recording in humans based on a visual display unit built with Light-Emitting Diode (LED) technology, skin electrodes, and optimized signal processing to quantify response adaptation (dubbed PERGx as a contraction of PERGnext). We show that, compared to a validated method, the PERGx has a very high signal-to-noise ratio (SNR); this suggests that meaningful responses can be recorded in advanced stages of diseases such as nonarteritic ischemic optic neuropathy (NAION). PERGx temporal dynamics and intrinsic variability in a representative normal subject. (A) The amplitude of PERGx samples (blue circles, 16 consecutive partial averages of 64 epochs each over 2 minutes) progressively declined (adapted) with a slope of −0.031 μV/sample (R2 = 0.48), whereas the PERGx phase (red circles) was stationary. (B) Polar diagram displaying combined amplitude and phase of PERG samples (open black circles) and noise samples (open grey triangles). The PERG amplitude (1.65 μV) is represented by the length of the vector connecting the origin of the axes with the cluster centroid. The PERG phase (63.6°) is represented by the angle Φ between the vector and the x-axis.
• 28. ERG Methodological background #2 https://doi.org/10.1007/s10633-017-9593-y Discrete Wavelet Transform (DWT) analysis applied to the mfERG response from a control (left) and a patient (right). Top: graphical representation of the 2F-mfERG M-sequence used here (MOFOFO), with frames displaced in time in order to better correspond visually to the recorded response. The original signal from one hexagon of the mfERG (waveform inside box on top) can be decomposed into many frequency levels, depending on the length of the time series. The first level (1211 Hz) corresponds to high frequencies (noise), while the highest level (11 Hz) corresponds to the lowest frequencies. For each frequency level, the vertical lines represent individual wavelet coefficients. For each level, the variance between these coefficients is computed and subjected to further analysis as the WVA (wavelet variance). Legend: DC direct component; IC1 first induced component; IC2 second induced component. The entire process of retinal visual processing involves the phototransduction cascade with different groups of cells and circuits from the photoreceptors to the ganglion cells. Thus, electrical signals produced by different biological structures contribute to the retinal response of the mfERG that is recorded from the cornea [Hood et al. (2002); Luo et al. (2011)]. In the standard mfERG, amplitude and implicit time are often analyzed [Hood et al. (2012)]. Early glaucoma: Dilru C Amarasekera BS, Arthur F Resende MD, Michael Waisbourd MD, Sanjeev Puri MD, Marlene R Moster MD, Lisa A Hark PhD, L Jay Katz MD, Scott J Fudemberg MD, Anand V Mantravadi MD. First published: 20 July 2017, DOI: 10.1111/ceo.13006. Unreliable test results were excluded. Abbreviations: ss-PERG = Steady-State Pattern Electroretinogram; SD-tVEP = Short-Duration transient Visual Evoked Potentials; Lc = Low Contrast; Hc = High Contrast; SNR = Signal-to-Noise Ratio. Electrophysiological techniques thus play a valuable role in a diagnostic environment dominated by highly effective tools such as OCT via the addition of an objective functional perspective to the diagnosis of glaucoma. Although the use of PERG and VEP as a measure of retinal ganglion cell and visual pathway dysfunction has been established, few studies have measured the potential clinical utility of the novel rapid testing platform of ss-PERG and SD-tVEP in patients with glaucoma. ss-PERG was effectively able to discern between glaucomatous and healthy eyes. The diagnostic ability of ss-PERG was superior to that of SD-tVEP. ss-PERG may thus have a role as a clinically useful electrophysiological diagnostic tool.
• 29. Retina Model synthesis Photoreceptor contributions #2: PLR INPUT Light OUTPUT Pupil size INTERMEDIATE OUTPUT Electroretinography (ERG)? ERG not done this time. Experimental design. (A, Left) L, M, and S cones and melanopsin-containing ipRGCs mediate vision at daytime light levels. (Center) Photoreceptor spectral sensitivities. (Right) Physiological measurements of ipRGCs find excitatory L and M cone inputs and inhibitory S-cone inputs (12). (B) A digital spectral integrator produces sinusoidal photoreceptor-directed modulations that pass through an artificial pupil into the pharmacologically dilated left eye. The consensual pupil response of the right eye is recorded. (C) Photoreceptor-directed modulations. Balanced changes in the spectrum of light around a background spectrum nominally isolate targeted photoreceptors. - Spitschan et al. (2014). Group PLR data are well fit by the two-component linear filter model. (A) The mean response across all subjects (01–16) is shown at 0.05 and 0.5 Hz, for L+M-, melanopsin-, and S-cone-directed modulations. Fit values are derived from those found for subject 01, with only amplitude parameters adjusted (Table S2). This is because the average data are available at only two temporal frequencies and do not sufficiently constrain all parameters of the model. To obtain the average data plotted, amplitudes and phases were averaged separately (i.e., average amplitude obtained without consideration of phase, average phase obtained without consideration of amplitude). The model was fit to the data as plotted. (B) Polar-plot representations of the group data with model fit points, following conventions as in Fig 3. The data are normalized separately for each temporal frequency. Error bars (± 2 SEM across subjects) are smaller than the plot points for the data. - Spitschan et al. (2014). Now as we feed in more data, we are in theory learning how the light parameters should be designed to achieve the best photoreceptor response isolation, and to have representations for the corresponding ERG and PLR responses. It would also help if all the studies were from humans :P
• 30. Retina Model synthesis further downstream INPUT Light OUTPUT Pupil size INTERMEDIATE OUTPUT Electroretinography (ERG) "KNOWN BEHAVIOR" Auxiliary OUTPUT dLGN. Build on top of previous models. We "know" how a specific light stimulus is processed by the retina (ERG), and how this is reflected in pupil behavior (PLR) via the olivary pretectal nucleus (OPN). So, using the same parameters, record the activity of the dLGN for example, which is nice at least for basic science, not necessarily for pathology screening. A: LED spectral power densities and in vivo photoreceptor spectral sensitivity (normalised). The output of blue and yellow LEDs was adjusted to produce equivalent effects on rods (black line). By contrast, the blue LED always appeared brighter for melanopsin (green line). B: Protocol 1. Melanopsin-isolating steps (in dLGN and retina, respectively): presentations of the blue LED were interleaved with 210 or 180 sec of the (dLGN and retina, respectively) yellow to produce a 'step' visible only to melanopsin. C: Protocol 2. Irradiance slowly ramped up (0.5 ND per 200 seconds) before remaining at a steady state for 10 seconds. D: The effective change in photon flux for melanopsin (green) and rods (black) across a full repeat of Protocol 2. Settings of the ND filter at the point of each melanopsin-isolating step are provided above. - Davis et al. (2015). INTERMEDIATE OUTPUT #2 Ganglion cell firing rates. Responses to melanopsin-isolating steps and gradual irradiance ramps in retina. - Davis et al. (2015). Responses to melanopsin steps in the dLGN. - Davis et al. (2015)
• 31. Retina Model synthesis INPUT Light INTERMEDIATE OUTPUT Electroretinography (ERG) OUTPUT Pupil size INTERMEDIATE OUTPUT #2 Ganglion cell firing rates Auxiliary OUTPUT dLGN. So now we know how the retina works in a data-driven deep learning sense (no explicit modelling of the retina in the biological sense). We can heuristically cheat and define the connections as described in the literature. So as we feed in data from studies, the interactions between blocks are "automagically" quantified by adjusting the convolutional weights in the deep learning model (see the sketch below). At some point, if we have enough data, we could also start to relax the circuit constraints and hypothesize that there could be recurrent feedback from the dLGN to the OPN (controlling pupil size), and do 'blind causality analysis' (Nikola probably an expert on that). https://arxiv.org/abs/1601.03610 We have proposed a novel framework for causal analysis in time-series which does not require any assumptions about the statistical relationships among the variables of the study, i.e., it is model-free. Our results show that Twitter data polarity does indeed have a causal impact on the stock market prices of the examined companies. Hence, we believe social media data could represent a valuable source of information for understanding the dynamics of stock market movements. http://www.slideshare.net http://dx.doi.org/10.1534/genetics.114.165704
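As a hedged illustration of the block diagram above, here is a minimal multi-head sketch in tf.keras; all layer sizes, the 200-step stimulus encoding, and the loss weights are my own assumptions, not a published architecture. A shared "retinal circuitry" trunk maps the light stimulus to an ERG head, a pupil-size head, and an auxiliary dLGN head, so that pooled studies jointly adjust the shared weights:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Light stimulus encoded as a time series of (intensity, wavelength, modulation).
stimulus = layers.Input(shape=(200, 3), name="light")

# Shared "retinal circuitry" trunk: temporal convolutions over the stimulus.
x = layers.Conv1D(16, 9, padding="same", activation="relu")(stimulus)
x = layers.Conv1D(32, 9, padding="same", activation="relu")(x)

erg = layers.Conv1D(1, 9, padding="same", name="erg")(x)          # intermediate output: ERG trace
rgc = layers.Conv1D(8, 9, padding="same", activation="relu")(x)   # ganglion-cell firing-rate block
plr = layers.Dense(1, name="pupil")(layers.GlobalAveragePooling1D()(rgc))   # output: pupil size
dlgn = layers.Dense(1, name="dlgn")(layers.GlobalAveragePooling1D()(rgc))   # auxiliary output: dLGN

model = Model(stimulus, [erg, plr, dlgn])
# Per-head loss weights let a study that lacks a given recording zero out that head.
model.compile(optimizer="adam", loss="mse",
              loss_weights={"erg": 1.0, "pupil": 1.0, "dlgn": 0.5})
```

Studies that never recorded a given output can be trained with that head's loss weight set to zero, which is one pragmatic way to pool heterogeneous datasets.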
• 32. Retina Model synthesis Pathologies? INPUT Light INTERMEDIATE OUTPUT Electroretinography (ERG) OUTPUT Pupil size INTERMEDIATE OUTPUT #2 Ganglion cell firing rates Auxiliary OUTPUT dLGN. In the case of glaucoma, one would expect that the peripheral retina gets destroyed first. (A) Schematic diagram showing the flash stimulation sequence of the slow-sequence (slow flickering stimulation, MOOO) multifocal electroretinogram (mfERG). (B) The first-order kernel of the slow-sequence mfERG from the central (rings 1 to 2) and peripheral (rings 3 to 6) regions. - Chan et al. (2011). Overlapping visual field test-region layout and luminance characteristics of the multifocal pupillographic objective perimetry stimuli for all protocols. - Carle et al. (2014). Now we can define normal and pathologies as classes as you would in typical image classification tasks ('dogs', 'cats', etc.), but instead of just using a single image (whether fundus or OCT (SD/SS/AO/Angiography)), we can combine both the image and the behavioral response for better quantification of the retinal pathology, as in the fusion sketch below.
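A minimal sketch of the image-plus-behavior fusion suggested above, assuming a fundus image and a pupillometry trace as the two inputs; shapes, layer sizes, and the three-class output are illustrative assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

image = layers.Input(shape=(224, 224, 3), name="fundus")   # or an OCT slice
trace = layers.Input(shape=(500, 1), name="plr_trace")     # behavioral response over time

ix = layers.Conv2D(32, 3, activation="relu")(image)
ix = layers.MaxPooling2D(4)(ix)
ix = layers.Conv2D(64, 3, activation="relu")(ix)
ix = layers.GlobalAveragePooling2D()(ix)                   # image feature vector

tx = layers.Conv1D(16, 11, activation="relu")(trace)
tx = layers.GlobalAveragePooling1D()(tx)                   # behavioral feature vector

merged = layers.concatenate([ix, tx])
out = layers.Dense(3, activation="softmax", name="pathology")(merged)  # e.g. normal/glaucoma/other
model = Model([image, trace], out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```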
• 33. Retina Model synthesis VISUAL FIELD An old-school psychophysical functional measure that patients often find stressful. https://doi.org/10.1016/j.ophtha.2017.04.021 De Moraes CG, Hood DC, Thenappan A, Girkin CA, Medeiros FA, Weinreb RN, Zangwill LM, Liebmann JM. Central visual field damage seen on the 10-2 test is often missed with the 24-2 strategy in all groups. This finding has implications for the diagnosis of glaucoma and classification of severity. JAMA Ophthalmol. 2017;135(7):783-788. doi: 10.1001/jamaophthalmol.2017.1659 JAMA Ophthalmol. 2017;135(7):742-747. doi: 10.1001/jamaophthalmol.2017.1396 A deep-learning based automatic glaucoma identification. ARVO 2017: 320 Visual Fields, Vision Function, Psychophysics I. Serife Seda S. Kucur, Mathias Abegg, Sebastian Wolf, Raphael Sznitman. ARTORG Center, University of Bern, Bern, Switzerland; Department of Ophthalmology, Inselspital Bern, Bern, Switzerland. The inherent local and global characteristics of visual fields (VFs) can be exploited in a strong data-driven sense and could provide better understanding of VFs with regards to glaucoma. Ultimately, this may help to efficiently automatize the diagnosis process. Our hypothesis is that alternative representations of raw VFs, in terms of different spatial scales, could be learned by computers using machine learning techniques towards an effective automatized glaucoma identification task. Accordingly, we present a Convolutional Neural Network (CNN)-based approach for classification of VFs as being glaucomatous or non-glaucomatous. Conclusions: These results support the fact that processing VFs through a CNN generates a different representation of the data in terms of its hidden characteristics and patterns that is efficient for discriminating between glaucomatous and non-glaucomatous VFs in an automated way. The performance could be further improved with a different CNN architecture. The trained CNNs have the potential to be utilized for glaucoma progression analysis as well. https://doi.org/10.1016/j.ophtha.2017.01.027 http://dx.doi.org/10.1097/IJG.0000000000000710
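As a sketch of the Kucur et al. abstract above (their exact architecture is not given here, so the 8x9 grid encoding of a 24-2 field and all layer sizes are assumptions), a visual field can be treated as a tiny image for a small CNN:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(8, 9, 1)),          # 24-2 threshold sensitivities laid out as a grid
    layers.Conv2D(16, 3, padding="same", activation="relu"),
    layers.Conv2D(32, 3, padding="same", activation="relu"),
    layers.Flatten(),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # P(glaucomatous)
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])
```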
• 34. Retina Model synthesis beyond retinopathies #1 What to diagnose from the eye, e.g. neurodegenerative diseases such as Alzheimer's disease. Is the Eye an Extension of the Brain in Central Nervous System Disease? Lies De Groef1,2 and Maria Francesca Cordeiro1,3,4. Journal of Ocular Pharmacology and Therapeutics. June 2017, https://doi.org/10.1089/jop.2016.0180 1 Glaucoma and Retinal Neurodegenerative Disease Research Group, Institute of Ophthalmology, University College London, London, United Kingdom. 2 Neural Circuit Development and Regeneration Research Group, Department of Biology, University of Leuven, Leuven, Belgium. 3 Western Eye Hospital, Imperial College Healthcare NHS Trust, London, United Kingdom. 4 ICORG, Department of Surgery and Cancer, Imperial College London, London, United Kingdom. Compilation of examples to illustrate the concept "the eye as a window to the brain". Typical ocular diseases, such as uveitis, glaucoma, and AMD, share several pathological mechanisms with CNS diseases, for example, MS and AD. Both in vivo and post mortem examinations of the eye can therefore be used to study the disease mechanisms underlying these pathologies in the eye and brain. (1) fluorescein angiography; (2) intraocular pressure measurement (copyright iCare, TonoLab); (3) optical coherence tomography scan; (4) confocal scanning laser ophthalmoscopy imaging of curcumin-labeled protein aggregates; (5) retinal oximetry; (6) ZO-1 tight junction immunostaining on wholemounted retina; (7) transmission electron microscopy image of trabecular meshwork; (8) Iba-1 microglia immunostaining on retinal section; (9) Brn3a retinal ganglion cell immunostaining on wholemounted retina; (10) β-amyloid immunostaining on retinal section; and (11) concanavalin A vessel labeling on wholemounted retina. AD, Alzheimer's disease; AMD, age-related macular degeneration; MS, multiple sclerosis. Front Aging Neurosci. 2017; 9: 214. Published online 2017 Jul 6. doi: 10.3389/fnagi.2017.00214 The Role of Microglia in Retinal Neurodegeneration: Alzheimer's Disease, Parkinson, and Glaucoma. Ana I. Ramirez, Rosa de Hoz, Elena Salobrar-Garcia, Juan J. Salazar, Blanca Rojas, Daniel Ajoy, Inés López-Cuenca, Pilar Rojas, Alberto Triviño, and José M. Ramírez. Front Neurol. 2017; 8: 162. Published online 2017 May 4. doi: 10.3389/fneur.2017.00162 Retinal Ganglion Cells and Circadian Rhythms in Alzheimer's Disease, Parkinson's Disease, and Beyond. Chiara La Morgia, Fred N. Ross-Cisneros, Alfredo A. Sadun, and Valerio Carelli. Summary of circadian rhythm abnormalities in AD, PD, and HD. AD, Alzheimer's disease; PD, Parkinson's disease; HD, Huntington's disease; IV, intra-daily variability; IS, inter-daily stability; RA, relative amplitude; BP, blood pressure; HR, heart rate. Schematic representation of the hypothetical events associated with neuroinflammation in AD (A), PD (B), and glaucoma (C). AD, Alzheimer's disease; PD, Parkinson's disease; ILM, inner limiting membrane; NFL, nerve fiber layer; GCL, ganglion cell layer; IPL, inner plexiform layer; INL, inner nuclear layer; OPL, outer plexiform layer; ONL, outer nuclear layer; OLM, outer limiting membrane; PL, photoreceptor layer; RPE, retinal pigment epithelium; BM, Bruch membrane; C, choroid; Aβ, beta-amyloid; pTau, phosphorylated tau.
  • 35. Health Economics for Medical Startups | Background
• 36. Business Models focus ● Often technical founders focus too much on the technology and do not achieve product-market fit. – In medical startups, it is often very useful to do proper health-economics calculations to sell your idea to customers and investors. ● In other words, how much can your solution make healthcare more efficient economically while improving the quality of care for the patient? – Another common problem in the long run is reimbursement: in most countries, the patient does not fully pay for the healthcare they receive, and market access is complicated by varying regulations/policies in each country. http://startupheretoronto.com www.smi-online.co.uk
  • 37. Business Models Innovations on the model https://hbr.org/2016/10 Healx: A Case Study Informed by our business model framework, we advised (and Cambridge Judge Business School’s business accelerator supported) the tech venture Healx, which focuses on the treatment of patients with rare diseases in the emerging field of personalized medicine. A big challenge for pharmaceutical companies in this domain is that rare-disease markets are very small, so companies usually have to charge astronomical prices. (One drug, Soliris, used in the treatment of paroxysmal nocturnal hemoglobinuria, costs about $500,000 per patient-year.) Enter Healx, with a platform that leverages big data technology and analytics across multiple databases owned by various organizations within global life sciences and health care to efficiently match treatments to rare-disease patients. Its initial business model hit three of our six key features. First, Healx’s value proposition was about asset sharing (for example, making available clinical-trial databases that record the effectiveness of most drugs across therapeutic areas and diseases, including rare ones). Second, the business promised more personalization by revealing drugs with high potential for treating the rare diseases covered. Finally, Healx’s model would, in theory, create a collaborative ecosystem by bringing together big pharma (which has the treatment and trial data) and health care providers (which have data about effectiveness and incompatibility reactions and also personal genome descriptions). https://healx.io/ More recently, Healx has developed a machine-learning algorithm that can use a patient’s biological information not only to match drugs to disease symptoms but also to predict exactly which drug will achieve what level of effectiveness for that particular patient. The latest version of its business model brings personalization to the maximum possible level and adds agility, because the treating clinician—armed with the biological data and the algorithm—can make better treatment decisions directly with the patient and doesn’t have to rely on fixed rules of thumb about which of the few available off-label drugs to use. In this way, Healx is able to support decentralized, real-time, accurate decision making. This version of the Healx model has even more transformation potential—it exhibits four of the six features; it has already generated revenue from customers; and in the long term it could empower patients by giving them much more information before they consult a medical practitioner. Although it is still too early to tell whether that potential will be realized, Healx is clearly a venture to watch. It has earned a number of prizes (including the 2015 Life Science Business of the Year and the 2016 Graduate Business of the Year in the Cambridge cluster) and sizable investments from several global funds.
• 38. Loss Function performance quantification ● In medical studies, the ROC curve and especially the Area Under the Curve (AUC) is used as an easy scalar to describe the performance of a classifier. TensorFlow allows direct optimization of ROC-based objectives (see the surrogate-loss sketch below). http://dx.doi.org/10.1093/bib/bbr008 http://arxiv.org/abs/1605.06652 Conclusion: The AUC is an unreliable measure of screening performance because in practice the standard deviation of a screening or diagnostic test in affected and unaffected individuals can differ. The problem is avoided by not using AUC at all, and instead specifying detection rates (DRs) for given false positive rates (FPRs) or FPRs for given DRs. http://dx.doi.org/10.1177/0969141313517497 http://tflearn.org/objectives/ Mozer, Michael C. "Optimizing classifier performance via an approximation to the Wilcoxon-Mann-Whitney statistic." (2003). aaai.org/Papers Front Public Health. 2015; 3: 57. Published online 2015 Apr 20. doi: 10.3389/fpubh.2015.00057 PMCID: PMC4403252 Threshold-Free Measures for Assessing the Performance of Medical Screening Tests
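A hedged sketch of what "direct optimization of ROC" can look like in practice, in the spirit of the Mozer reference above: the non-differentiable indicator 1[s_pos > s_neg] inside the Wilcoxon-Mann-Whitney statistic is replaced by a smooth sigmoid, giving a trainable surrogate for 1 - AUC (the temperature value is an assumption, and each batch must contain both classes):

```python
import tensorflow as tf

def soft_auc_loss(y_true, y_pred, temperature=0.1):
    """1 - soft AUC over a batch; minimizing pushes positive scores above negative ones.
    Assumes flat score/label tensors and at least one sample of each class per batch."""
    pos = tf.boolean_mask(y_pred, tf.equal(y_true, 1))
    neg = tf.boolean_mask(y_pred, tf.equal(y_true, 0))
    diff = tf.expand_dims(pos, 1) - tf.expand_dims(neg, 0)  # all pos-neg score differences
    return 1.0 - tf.reduce_mean(tf.sigmoid(diff / temperature))
```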
• 39. HEALTH ECONOMIC Loss function wikipedia.org Analogies from churn prediction? http://dx.doi.org/10.1186/s40165-015-0014-6 "Nevertheless, current state-of-the-art classification algorithms are not well aligned with commercial goals, in the sense that, the models miss to include the real financial costs and benefits during the training and evaluation phases. In the case of churn, evaluating a model based on a traditional measure such as accuracy or predictive power, does not yield to the best results when measured by the actual financial cost, ie. investment per subscriber on a loyalty campaign and the financial impact of failing to detect a real churner versus wrongly predicting a non-churner as a churner" What are the economic costs of each block in the contingency table, and how do we optimize for medical economics? It is more expensive to have false negatives, as patients will not be diagnosed, both in terms of economic cost and reduced quality of life for patients; a hedged loss sketch follows below.
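The point above can be made concrete with a cost-sensitive loss: price each cell of the contingency table and minimize expected cost instead of plain cross-entropy. The cost values below are placeholders, not actual health-economic figures:

```python
import tensorflow as tf

COST_FN = 10.0  # missed diagnosis: treatment delay plus lost quality of life
COST_FP = 1.0   # unnecessary referral / follow-up examination
COST_TP = 0.5   # cost of a confirmed-case workup
COST_TN = 0.0

def economic_loss(y_true, y_prob):
    """Expected monetary cost of decisions; differentiable in the predicted probability."""
    y_true = tf.cast(y_true, y_prob.dtype)
    cost = (y_true * y_prob * COST_TP
            + y_true * (1.0 - y_prob) * COST_FN
            + (1.0 - y_true) * y_prob * COST_FP
            + (1.0 - y_true) * (1.0 - y_prob) * COST_TN)
    return tf.reduce_mean(cost)
```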
• 40. Health economics models https://dx.doi.org/10.3310/hta11410 Screening in the UK for glaucoma, NHS setting. Published: Ann Intern Med. 2013;159(7):484-489, DOI: 10.7326/0003-4819-159-6-201309170-00686. Estimate by Steve Kymes of the duration and number of subjects needed for a proper health-economic study of a glaucoma screening program. Presented by John Boland at the "Should we screen for glaucoma?" session at the World Glaucoma Congress 2017 in Helsinki, Finland. Indian J Ophthalmol. 2011 Jan; 59(Suppl1): S24–S30. doi: 10.4103/0301-4738.73684 PMCID: PMC3038514 Cost-effectiveness of screening for open angle glaucoma in developed countries. Anja Tuulonen. Clin Ophthalmol. 2017; 11: 337–346. doi: 10.2147/OPTH.S120398 PMCID: PMC5317344 Cost and detection rate of glaucoma screening with imaging devices in a primary care center. Alfonso Anton, Monica Fallon, Francesc Cots, María A Sebastian, Antonio Morilla-Grasa, Sergi Mojal, and Xavier Castells
• 41. RISK STRATIFICATION & Screening Target screening for high-risk cases (family history, age, ethnicity, gender) https://doi.org/10.1016/j.ajo.2017.05.017 (2016) https://doi.org/10.1109/TMI.2016.2608782 We introduce a novel Bayesian nonparametric model that uses the concept of disease trajectories for disease subtype identification. We investigate several models with our algorithm, and show that one with age, pack years (a measure of cigarette exposure), and smoking status as predictors gives the best compromise between estimated predictive performance and model complexity. https://arxiv.org/abs/1705.07674 The proposed risk score incorporates both the patients' non-stationary temporal physiological information and their individual baseline covariates in order to accurately describe the patients' physiological trajectories. Aaron Zalewski; William Long; Alistair E. W. Johnson; Roger G. Mark; Li-wei H. Lehman. Date of Conference: 16-19 Feb. 2017, https://doi.org/10.1109/BHI.2017.7897302 https://arxiv.org/abs/1704.08797
• 42. RISK factors For example, for glaucoma: "Overview of ethnicity and race" by M. Roy Wilson (United States) at the Risk Profiling symposium, World Glaucoma Congress 2017, Helsinki, Finland. http://dx.doi.org/10.1001/jamaophthalmol.2015.1478 http://dx.doi.org/10.1126/science.aam7935
  • 43. “Doctor AI” Systems | Introduction
  • 44. AI Doctor https://arxiv.org/abs/1512.03542 http://arxiv.org/abs/1602.00357 http://arxiv.org/abs/1511.02554 Longitudinal analysis → try to diagnose pathologies as early as possible. Incorporate disease progression measurements and treatment interventions for optimal personalized treatment. Feature engineering remains a major bottleneck when creating predictive systems from electronic medical records. At present, an important missing element is detecting predictive regular clinical motifs from irregular episodic records. We present Deepr (short for Deep record), a new end-to-end deep learning system that learns to extract features from medical records and predicts future risk automatically. Deepr transforms a record into a sequence of discrete elements separated by coded time gaps and hospital transfers. On top of the sequence is a convolutional neural net that detects and combines predictive local clinical motifs to stratify the risk. Deepr permits transparent inspection and visualization of its inner working. We validate Deepr on hospital data to predict unplanned readmission after discharge. Deepr achieves superior accuracy compared to traditional techniques, detects meaningful clinical motifs, and uncovers the underlying structure of the disease and intervention space. http://arxiv.org/abs/1607.07519
• 45. Condition dynamics Long short-term memory (LSTM). Legend (a minimal sketch of this setup follows below):
C: memory of LSTM
x: diagnoses (feature vector)
p: procedures, medications
f: illness "forgetting" (curing or toxicity)
m: planned/unplanned admission flag
h: weighted "illness pooling"
i: input gate (new information updated to memory)
o: output gate (disease state)
http://arxiv.org/abs/1511.03677 https://arxiv.org/abs/1510.07641
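A minimal sketch matching the legend above (vocabulary size, visit cap, and hidden size are assumptions): the LSTM consumes one multi-hot code vector per visit and emits a disease-state risk, with the forget gate playing the "curing or toxicity" role:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

n_codes, max_visits = 2000, 50                    # medical-code vocabulary, visit cap
model = models.Sequential([
    layers.Input(shape=(max_visits, n_codes)),    # one multi-hot code vector x per visit
    layers.Masking(mask_value=0.0),               # skip zero-padded visits
    layers.LSTM(128),                             # cell state c = "illness memory", f = forgetting
    layers.Dense(1, activation="sigmoid"),        # o -> current disease-state risk
])
model.compile(optimizer="adam", loss="binary_crossentropy")
```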
• 46. Condition dynamics There are always missing data in clinical time series. TREATING MISSING DATA, various options (a minimal pandas sketch follows below):
1. ZERO-IMPUTATION: set to zero when data are missing
2. FORWARD-FILLING: carry forward the previous values
3. MISSINGNESS: treat the missing value itself as a signal, as the lack of a measured value e.g. in an ICU can carry information in itself (Lipton et al. 2016)
4. BAYESIAN STATE-SPACE MODELING to fill in the missing data (Luttinen et al. 2016, BayesPy package)
5. GENERATIVE MODELING: train the deep network to generate the missing samples (Im et al. 2016, RNN GAN; see also github: sequence_gan)
http://arxiv.org/abs/1606.01865 https://arxiv.org/abs/1606.04130
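The first three options in a few lines of pandas (the `ts` DataFrame and its columns are hypothetical):

```python
import numpy as np
import pandas as pd

ts = pd.DataFrame({"hr": [72, np.nan, np.nan, 88],
                   "lactate": [np.nan, 2.1, np.nan, np.nan]})

zero_imputed = ts.fillna(0.0)            # 1. zero-imputation
forward_filled = ts.ffill()              # 2. forward-filling: carry the last observed value
missing_mask = ts.isna().astype(float)   # 3. missingness as a signal in its own right
# Concatenating values and mask lets the model learn from absence itself:
model_input = pd.concat([forward_filled.fillna(0.0),
                         missing_mask.add_suffix("_missing")], axis=1)
```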
• 47. Condition dynamics -based Individualized treatment ● Schmidt-Erfurth and Waldstein (2016): There is a critical unmet medical need to identify, characterize, and validate biomarkers that could provide solid guidance for efficient individualized treatment with regard to optimal functional outcome and disease management. Such biomarkers would enable the treating physician to tailor personalized treatment to each patient's individual disease and need, in order to provide adequate disease control, minimize recurrence and neurosensory damage, and limit the number of invasive and costly interventions. Relationship between initial visual acuity, visual acuity change and final visual acuity during therapy of neovascular age-related macular degeneration (i.e., the ceiling effect). The interpolation curves illustrate final visual acuity levels dependent on baseline visual acuity in the controlled trials CATT and IVAN as well as in the real-world UK neovascular AMD database study. Role of subretinal fluid as a treatment-modifying imaging biomarker. In patients with subretinal fluid at baseline (blue graphs), antiangiogenic therapy leads to identical visual acuity outcomes, regardless of treatment regimen (monthly versus every-12-weeks dosing). In contrast, patients without subretinal fluid at baseline (red graphs) demonstrate unfavourable outcomes if treatment was not administered on a monthly basis. Pigment-epithelial detachment as a risk factor for vision loss during individualized dosing. In the VIEW studies, patients received continuous anti-VEGF therapy during the first 48 weeks. At 52 weeks, a discontinuous, "as-needed" dosing regimen was introduced. Only in a precisely defined patient population, i.e. eyes with pigment-epithelial detachments developing secondary intraretinal cystoid fluid (IRC, red graph), did the reactive dosing regimen lead to pronounced vision loss. Future therapeutic approaches will likely focus on early and/or disease-modifying interventions aiming to protect the functional and structural integrity of the morphologic complex that is primarily affected in AMD, i.e. the choriocapillaris-RPE-photoreceptor unit. Multimodal innovative imaging technologies, such as PS-OCT, OCT angiography, and adaptive optics, allow access to yet unidentified biomarkers representing the origin of neovascular AMD as well as functionally relevant therapeutic aims. Improved big-data applicability and reproducibility, aided by computerized OCT analysis, will likely allow personalized antiangiogenic therapy with minimal interventions while providing maximum disease control, using advanced imaging software and hardware. It is the responsibility of the scientific and clinical community to follow the open path of advanced imaging together with ophthalmologists, biologists, physicists, and computer scientists in an efficient collaborative and interdisciplinary approach.
• 48. Condition dynamics risk factors for glaucoma progression https://doi.org/10.1016/j.ajo.2017.06.003 To determine the intraocular and systemic risk factor differences between a cohort of rapid glaucoma disease progressors and non-rapid disease progressors. Conclusion: Cardiovascular disease is an important risk factor for rapid glaucoma disease progression irrespective of IOP control.
• 49. Condition dynamics Disease progression #1 Clin Ophthalmol. 2017; 11: 1015–1020. May 23. doi: 10.2147/OPTH.S116265 PMCID: PMC5449101 Automated retinal imaging and trend analysis – a tool for health monitoring. Karin Roesch, Tristan Swedish, and Ramesh Raskar, MIT Media Lab, Massachusetts Institute of Technology, Cambridge, MA, USA. The future of health diagnostics: current diagnostics are based on a "snapshot" in time and limited data points. In the future, large datasets acquired over time through constant monitoring will be analyzed to establish baselines and trends, enabling preventative interventions. Knowing when a feature occurred is key. For diabetic retinopathy (DR), it has been established that microaneurysms (MAs) are the earliest lesions visible. The MA population is dynamic, with changes occurring in a matter of months, and MA turnover rates are indicative of early-stage DR as well as of the likelihood of DR progression to macular edema. Po-Hsiang Chiu, George Hripcsak, Department of Biomedical Informatics, Columbia University, 622 W. 168th Street, New York, NY, USA https://doi.org/10.1016/j.jbi.2017.04.009 Learning statistical models of phenotypes using noisy labeled training data. Vibhu Agarwal, Tanya Podchiyska, Juan M Banda, Veena Goel, Tiffany I Leung, Evan P Minty, Timothy E Sweeney, Elsie Gyang, Nigam H Shah. J Am Med Inform Assoc (2016) 23 (6): 1166-1173. DOI: https://doi.org/10.1093/jamia/ocw028
  • 50. Condition dynamics Disease progression #2 Hrvoje Bogunović; Alessio Montuoro; Magdalena Baratsits; Maria G. Karantonis; Sebastian M. Waldstein; Ferdinand Schlanitz; Ursula Schmidt-Erfurth Investigative Ophthalmology & Visual Science June 2017, Vol.58, BIO141-BIO150. DOI: 10.1167/iovs.17-21789 Observations at baseline and the first follow-up are used for predicting drusen regression in the future, for example, the following 1-year period. Examples of drusen thickness maps and the drusen regression prediction within 1-year period. Last column shows true positives (green), false positives (orange), and false negatives (blue). Each row represents one example eye. http://dx.doi.org/10.1001/jamaophthalmol.2016.5111 http://dx.doi.org/10.1002/sim.7300 Application of our approach using linear mixed models to Alzheimer’s Disease Neuroimaging Initiative data with bootstrapped 95% CI including boxplots of neocortical Aβ burden (standard uptake value ratio (SUVR)) for each diagnosis group, separately for amyloid–β positive and negative individuals. It takes 24.47 years to progress from an SUVR of 0.79 to 1.01. This is equivalent to a rate of 0.009 increase in SUVR per year. Similarly, it takes 10.76 years to progress from an SUVR of 0.73 to 0.79. See the text for further details. HC, healthy control; MCI, mild cognitively impaired; AD, Alzheimer’s disease
  • 51. Text Analysis | Introduction
  • 52. Condition dynamics Natural Language processing (NLP) http://arxiv.org/abs/1602.05568 http://arxiv.org/abs/1602.03686 http://homepages.inf.ed.ac.uk/ballison/pdf/lrec_skipgrams.pdf http://www.bioscience.ai/schedule http://arxiv.org/abs/1508.04112
  • 53. Text analysis for clinical notes #1 http://dx.doi.org/10.3233/978-1-61499-753-5-201 Medical Text Classification using Convolutional Neural Networks Mark Hughes, Irene Li, Spyros Kotoulas, Toyotaro Suzumura (Submitted on 22 Apr 2017). https://arxiv.org/abs/1704.06841 We present an approach to automatically classify clinical text at a sentence level. We are using deep convolutional neural networks to represent complex features. We train the network on a dataset providing a broad categorization of health information. Through a detailed evaluation, we demonstrate that our method outperforms several approaches widely used in natural language processing tasks by about 15%.
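A hedged sketch of sentence-level classification with a 1D ConvNet over word embeddings, in the spirit of the Hughes et al. paper above; vocabulary size, embedding dimension, sequence length, and class count are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(100,), dtype="int32"),        # token ids, padded to length 100
    layers.Embedding(input_dim=20000, output_dim=128),
    layers.Conv1D(128, 5, activation="relu"),         # n-gram feature detectors
    layers.GlobalMaxPooling1D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),           # clinical sentence category
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```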
  • 54. Text analysis for clinical notes #2 13 April 2017. https://doi.org/10.1109/BHI.2017.7897302 https://doi.org/10.1016/j.jbi.2017.07.006 We proposed the first models based on recurrent neural networks (more specifically Long Short-Term Memory - LSTM) for classifying relations from clinical notes. We also evaluated the impact of word embedding on the performance of LSTM models and showed that medical domain word embedding help improve the relation classification. These results support the use of LSTM models for classifying relations between medical concepts, as they show comparable performance to previously published systems while requiring no manual feature engineering. In this work, we explore the use of Hierarchical Dirichlet Processes (HDP) as a Bayesian nonparametric framework to infer patients' states of health by combining multiple sources of data. In particular, we employ HDP to combine clinical time series and text from the nursing progress notes in a probabilistic topic modeling framework for patient risk stratification iDoctor: Personalized and professionalized medical recommendations based on hybrid matrix factorization Future Generation Computer Systems Volume 66, January 2017, Pages 30-35 https://doi.org/10.1016/j.future.2015.12.001
  • 55. Personalized Medicine | Introduction
  • 56. Precision / personalized medicine #1 re-work.co/blog http://dx.doi.org/10.1101/070490 “For the first time, we demonstrate that DLNN trained on a large pharmacogenomic data set can effectively predict the therapeutic response of specific drugs in specific cancer types, from a large panel of both drugs and cancer cell lines. These findings serve as a proof of concept for the application of DLNN to predict therapeutic responsiveness, a milestone in precision medicine.” http://dx.doi.org/10.1056/NEJMp1500523 http://dx.doi.org/10.3389%2Ffpsyt.2016.00034 http://dx.doi.org/10.1016/j.media.2016.06.024
• 57. Precision / personalized medicine #2 We introduce an IoT-driven architecture and discuss how non-invasive, affordable, unobtrusive sensing using mobile phones, wearables and nearables is making physiological and pathological data collection from the human body possible in thus far unimaginable ways. We also introduce breakthrough technologies in the form of exosomes and 3D organ printing that have the potential to disrupt the future healthcare landscape. http://dx.doi.org/10.1007/978-3-319-42141-4_9 https://doi.org/10.1109/TMM.2016.2614225 To facilitate the intensive computation required for interactive analytics, we design an efficient sparse principal component analysis (SPCA) solver based on a variance-reduced stochastic gradient technique. The benefits of our method are demonstrated by analyzing two different EHR patient cohorts, a public and a private dataset containing EHRs of 101 767 and 223 076 patients, respectively. Our evaluations show that PHENOTREE can detect clinically meaningful hierarchical phenotypes. http://dx.doi.org/10.3390/ijms17091555
  • 58. Precision / personalized medicine #3 Multimorbidity space and dynamic disease progression. http://dx.doi.org/10.1038/nrg.2016.87 The co-occurrence of diseases can inform the underlying network biology of shared and multifunctional genes and pathways. In addition, comorbidities help to elucidate the effects of external exposures, such as diet, lifestyle and patient care. With worldwide health transaction data now often being collected electronically, disease co-occurrences are starting to be quantitatively characterized. Linking network dynamics to the real-life, non-ideal patient in whom diseases co-occur and interact provides a valuable basis for generating hypotheses on molecular disease mechanisms, and provides knowledge that can facilitate drug repurposing and the development of targeted therapeutic strategies.
  • 59. Example Clinical AI Pipelines
• 60. Glaucoma decision support tools Old-school methods for multimodal and structural features. Development of machine learning models for diagnosis of glaucoma. Seong Jae Kim, Kyong Jin Cho, Sejong Oh. Published: May 23, 2017. https://doi.org/10.1371/journal.pone.0177726 We used 100 cases of data as a test dataset and 399 cases of data as a training and validation dataset. To develop the glaucoma prediction model, we considered four machine learning algorithms: C5.0, random forest (RF), support vector machine (SVM), and k-nearest neighbor (KNN). Color-fundus and red-free fundus photography (A), peripapillary RNFL thickness measured by SD-OCT (B), and automated 30–2 visual field test (C). The presence of a tigroid fundus and peripapillary atrophy was observed, and there was a decrease in the RNFL thickness on the peripapillary RNFL thickness scan. In the visual field test, the abnormalities were judged to be of no clinical significance. Computers in Biology and Medicine, Volume 8, Issue 1, January 1978, Pages 25-40, Glaucoma consultation by computer, Sholom Weiss, Casimir A. Kulikowski, Aran Safir https://doi.org/10.1016/0010-4825(78)90011-2 Automated detection of glaucoma using structural and non-structural features. SpringerPlus December 2016, 5:1519. Anum A. Salam, Tehmina Khalil, M. Usman Akram, Amina Jameel, Imran Basit. First Online: 09 September 2016 https://doi.org/10.1186/s40064-016-3175-4
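A sketch of the Kim et al. workflow above using scikit-learn stand-ins (C5.0 has no direct scikit-learn equivalent, so only RF, SVM, and KNN are compared; the feature matrix is a random placeholder for the 399-case training set):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

# Placeholder for 399 training cases with e.g. RNFL-thickness and visual-field features.
X, y = np.random.rand(399, 12), np.random.randint(0, 2, 399)

for name, clf in [("RF", RandomForestClassifier(n_estimators=200)),
                  ("SVM", SVC(kernel="rbf", probability=True)),
                  ("KNN", KNeighborsClassifier(n_neighbors=5))]:
    scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: AUC = {scores.mean():.3f} +/- {scores.std():.3f}")
```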
• 61. Tensor Networks Inspiration from quantum networks #1 Supervised Learning with Quantum-Inspired Tensor Networks. E. Miles Stoudenmire, David J. Schwab, last revised 18 May 2017 https://arxiv.org/abs/1605.05775 Deep Learning and Quantum Entanglement: Fundamental Connections with Implications to Network Design. Yoav Levine, David Yakira, Nadav Cohen, Amnon Shashua, last revised 10 Apr 2017 https://arxiv.org/abs/1704.01552 Neural networks for computing best rank-one approximations of tensors and its applications. Maolin Che, Andrzej Cichocki, Yimin Wei. 22 May 2017 https://doi.org/10.1016/j.neucom.2017.04.058 This paper presents the neural dynamical network to compute a best rank-one approximation of a real-valued tensor. We implement the neural network model by the ordinary differential equations (ODE), which is a class of continuous-time recurrent neural network. Finally, we generalize the proposed neural networks to the computation of the restricted singular values and the associated restricted singular vectors of real-valued tensors. We illustrate and validate theoretical results via numerical simulations. Keywords: Neural network, Ordinary differential equations, Lyapunov function, Lyapunov stability theory, Rank-one tensor, Best rank-one approximation, Z-eigenpair, Symmetric-definite tensor pair, H-eigenpair, The local maximal generalized eigenpair, The local minimal generalized eigenpair, Generalized tensor eigenpair, Local optimal rank-one approximation, Restricted singular value, Restricted singular vector. We theoretically analyze convolutional arithmetic circuits (ConvACs), and empirically validate our findings on more common ConvNets which involve ReLU activations and max pooling. Beyond the results described above, the description of a deep convolutional network in well-defined graph-theoretic tools and the formal connection to quantum entanglement are two interdisciplinary bridges that are brought forth by this work. Neural-network representation of the many-body ground states. Convolutional neural networks can constitute the basis of more advanced NQS and therefore have the potential for increasing their expressive power.
• 62. Tensor Networks Inspiration from quantum networks #2 Low-Rank Tensor Networks for Dimensionality Reduction and Large-Scale Optimization Problems: Perspectives and Challenges PART 1. A. Cichocki, N. Lee, I.V. Oseledets, A.-H. Phan, Q. Zhao, D. Mandic, last revised 19 Jul 2017 (this version, v2) https://arxiv.org/abs/1609.00893 Tensor Networks for Dimensionality Reduction and Large-scale Optimization: Part 2 Applications and Future Perspectives. A. Cichocki, N. Lee, I.V. Oseledets, A.-H. Phan, Q. Zhao, D. Mandic. Foundations and Trends® in Machine Learning (2017): Vol. 9: No. 6, pp 431-673. http://dx.doi.org/10.1561/2200000067 "Tensor decompositions and tensor network algorithms require sophisticated software libraries, which are being rapidly developed. The TT Toolbox, developed by Oseledets and coworkers (http://github.com/oseledets/TT-Toolbox for MATLAB and http://github.com/oseledets/ttpy for PYTHON), is currently the most complete software for the TT (MPS/MPO) and QTT networks. The TT toolbox supports advanced applications, which rely on solving sets of linear equations (including the AMEn algorithm), symmetric eigenvalue decomposition (EVD), and inverse/pseudoinverse of huge matrices." Keywords: Tensor networks, Function-related tensors, CP decomposition, Tucker models, tensor train (TT) decompositions, matrix product states (MPS), matrix product operators (MPO), basic tensor operations, multiway component analysis, multilinear blind source separation, tensor completion, linear/multilinear dimensionality reduction, large-scale optimization problems, symmetric eigenvalue decomposition (EVD), PCA/SVD, huge systems of linear equations, pseudo-inverse of very large matrices, Lasso and Canonical Correlation Analysis (CCA)
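Independent of any particular toolbox, the core TT-SVD idea behind these libraries fits in a short NumPy sketch: peel off one 3-way core at a time with truncated SVDs (a fixed `max_rank` per bond is a simplifying assumption; real implementations truncate by error tolerance):

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Return TT cores G_1..G_d (each r_prev x n_k x r_k) with bond ranks capped at max_rank."""
    shape = tensor.shape
    cores, r_prev = [], 1
    mat = tensor.reshape(shape[0], -1)
    for k in range(len(shape) - 1):
        U, S, Vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(S))
        cores.append(U[:, :r].reshape(r_prev, shape[k], r))
        mat = (S[:r, None] * Vt[:r]).reshape(r * shape[k + 1], -1)  # carry the remainder forward
        r_prev = r
    cores.append(mat.reshape(r_prev, shape[-1], 1))
    return cores

cores = tt_svd(np.random.rand(4, 5, 6, 7), max_rank=3)  # 4 small cores instead of a 4-way tensor
```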
• 63. Tensor Networks in Healthcare SCH: INT: Collaborative Research: High-throughput Phenotyping on Electronic Health Records using Multi-Tensor Factorization. Jimeng Sun, Bradley Malin, Joshua Denny, Joydeep Ghosh, Abel Kho. Funding Source: NSF Smart Connect Health Integrated Grant: Award Number 1418511 http://www.sunlab.org/research/phenotyping/ Techniques Task 1: Phenotype Generation: How to turn EHR data into meaningful clinical concepts (phenotypes)? Task 2: Phenotype Refinement: How to incorporate feedback to ensure the generated phenotypes are clinically meaningful? Task 3: Phenotype Adaptation: How to port phenotypes from one institution to another? Applications App 1: Cohort Construction: Validate that the generated phenotypes recover some existing phenotypes (from PheKB). App 2: GWAS: Develop genome-wide association studies using the generated phenotypes (as target or control variables). App 3: Predictive modeling: Use generated phenotypes as features to facilitate predictive modeling https://arxiv.org/abs/1704.03141
• 64. Tensor Networks in Industry Animashree Anandkumar, Associate Professor (with tenure), University of California, Irvine. I have been a faculty member in the CS department within ICS at the University of California, Irvine since December 2016. Before that I was on the faculty of the EECS department at UC Irvine from August 2010. I am a member of the Center for Pervasive Communications and Computing (CPCC). I am currently a principal scientist at Amazon Web Services (AWS) and on leave from UCI. My research focus is on high-dimensional learning of probabilistic graphical models and latent variable models. Broadly I am interested in machine learning, high-dimensional statistics, tensor methods, statistical physics, information theory and signal processing. https://youtu.be/gEFaLKzrKYc?t=6m52s https://youtu.be/KmvZu9qJNzg?t=7m15s https://youtu.be/B4YvhcGaafw?t=5m40s https://www.oreilly.com/ideas/lets-build-open-source-tensor-libraries-for-data-science
• 66. UNCERTAINTY ANALYSIS 'Layperson' background: development at internet giants like Google and Facebook. https://www.wired.com/2016/12/uber-buys-mysterious-startup-make-ai-company/
• 67. UNCERTAINTY ANALYSIS In practice for retinal imaging https://doi.org/10.1101/084210 Here we propose to estimate the uncertainty of DNNs in medical diagnosis based on a recent theoretical insight on the link between dropout networks and approximate Bayesian inference. Using the example of detecting diabetic retinopathy (DR) from fundus photographs, we show that uncertainty-informed decision referral improves diagnostic performance. Experiments across different networks, tasks and datasets showed robust generalization. Depending on network capacity and task/dataset difficulty, we surpass 85% sensitivity and 80% specificity as recommended by the NHS when referring 0%-20% of the most uncertain decisions for further inspection. We analyse causes of uncertainty by relating intuitions from 2D visualizations to the high-dimensional image space, showing that it is in particular the difficult decisions that the networks consider uncertain. bioRxiv preprint first posted online Oct. 28, 2016
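A hedged sketch of the dropout-as-approximate-Bayesian-inference recipe used in the paper above: keep dropout active at test time and read uncertainty off the spread of repeated stochastic forward passes (`model` is any Keras network containing Dropout layers; T=50 passes is an arbitrary choice):

```python
import numpy as np

def mc_dropout_predict(model, x, T=50):
    """Mean prediction and its standard deviation over T dropout-active forward passes."""
    preds = np.stack([model(x, training=True).numpy() for _ in range(T)])
    return preds.mean(axis=0), preds.std(axis=0)

# Decision referral: send the most uncertain fraction to a human grader, e.g.
# mean, std = mc_dropout_predict(model, fundus_batch)
# refer = std.ravel() > np.quantile(std, 0.8)   # refer the 20% most uncertain cases
```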
• 69. Visualizing disease Clinicians want answers. Mitigating the resistance from the clinical community: put effort into explaining the diagnosis. Roth et al. (2015); Ribeiro et al. (2016); Baskaran et al. (2012). Clinical heuristic glaucoma decision tree. "Clinicians need the data-driven model predictions to align with their domain knowledge" - Dr. Jenna Wiens @ NIPS 2016, "NIPS 2016 Workshop on Machine Learning for Health" http://www.nipsml4hc.ws/jenna-wiens Essentially, the causal decision tree now becomes a "hard-to-interpret" deep learning model. How to communicate this paradigm shift to clinicians?
  • 70. Visualization state-of-the-art techniques in General DOI: 10.1111/cgf.13210 An example of modeling with visual analytics. BaobabView [Van den Elzen and van Wijk (2011)] uses a tree-like interactive view to support a manually controlled decision tree construction process An example of model selection. Squares [Ren et al. (2017)] uses small multiples composed of grids of different colors and visual textures to display the distribution of probabilities in classification © VADER Lab at ASU 2017. All rights for the techniques and images belong to their respective owners.
  • 71. Visualization high-dimensional visualization #1 Shusen Liu ; Dan Maljovec ; Bei Wang ; Peer-Timo Bremer ; Valerio Pascucci (2016) https://doi.org/10.1109/TVCG.2016.2640960 Dominik Sacha ; Leishi Zhang ; Michael Sedlmair ; John A. Lee ; Jaakko Peltonen ; Daniel Weiskopf ; Stephen C. North ; Daniel A. Keim (2016) https://doi.org/10.1109/TVCG.2016.2598495
• 72. Visualization high-dimensional visualization #2 http://dx.doi.org/10.1111/cgf.13237 Dimensionality reduction provides a scalable alternative to create visualizations (projections) that enable insight into the structure of such datasets. However, applying dimensionality reduction independently for each dataset in a sequence may introduce unnecessary variability in the resulting sequence of projections, which makes tracking the evolution of the data significantly more challenging. We show that this issue affects t-SNE, a widely used dimensionality reduction technique. In this context, we propose dynamic t-SNE, an adaptation of t-SNE that introduces a controllable trade-off between temporal coherence and projection reliability. Our evaluation in two time-dependent datasets shows that dynamic t-SNE eliminates unnecessary temporal variability and encourages smooth changes between projections. https://doi.org/10.2312/eurovisshort.20161164
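The problem dynamic t-SNE addresses is easy to reproduce with standard scikit-learn t-SNE: projecting each time step independently yields projections that can jump even when the data barely change (the synthetic `snapshots` below are purely illustrative):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
base = rng.normal(size=(300, 50))
# Three nearly identical snapshots of the same high-dimensional data.
snapshots = [base + 0.01 * t * rng.normal(size=base.shape) for t in range(3)]

projections = [TSNE(n_components=2, random_state=0).fit_transform(X) for X in snapshots]
# Even with a fixed seed, small input changes can reorder clusters between frames;
# dynamic t-SNE adds a temporal-coherence penalty to suppress exactly this jitter.
```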
  • 73. Visualization ”unboxing” ConvNet black box #1 https://arxiv.org/abs/1311.2901; Cited by 2,133 articles https://doi.org/10.1109/TVCG.2016.2598838 To enable a more intuitive exploration process, we are open-sourcing the Embedding Projector, a web application for interactive visualization and analysis of high-dimensional data recently shown as an A.I. Experiment, as part of TensorFlow. We are also releasing a standalone version at projector.tensorflow.org, where users can visualize their high-dimensional data without the need to install and run TensorFlow.
  • 74. Visualization ”unboxing” ConvNet black box #2 HILDA’17, Chicago, IL, USA http://dx.doi.org/10.1145/3077257.3077260 https://arxiv.org/abs/1704.01942 “ACTIVIS has been deployed on Facebook’s machine learning platform. We present case studies with Facebook researchers and engineers, and usage scenarios of how ACTIVIS may work with different models.” Minsuk Kahng is with Georgia Tech; Pierre Andrews is with Facebook; Aditya Kalro is with Facebook; Duen Horng (Polo) Chau. DARVIZ: deep abstract representation, visualization, and verification of deep learning models ICSE-NIER '17 Proceedings of the 39th International Conference on Software Engineering: New Ideas and Emerging Results Track. https://doi.org/10.1109/ICSE-NIER.2017.13 ShapeShop: Towards Understanding Deep Learning Representations via Interactive Experimentation CHI EA '17 Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems https://doi.org/10.1145/3027063.3053103
• 75. Visualization ”unboxing” recurrent/Sequence black box #1 https://arxiv.org/abs/1705.08153 Uninterpretable examples. Left: Illustration of an arbitrary set of parameters for an LSTM trained on the MIT-BIH dataset. Numbers indicate different connections for the input weight vector (rectangle) and the hidden layer weight matrix (square). Right: The memory values c for arbitrary units in the LSTM trained on the MIT-BIH data. LSTM hidden unit outputs compared to wavelet coefficients. The top of each column is the original sample that was correctly classified using the respective LSTM model. The following two pairs of rows are the cherry-picked pairs of wavelet coefficients and hidden unit outputs that are roughly similar. The type of wavelet coefficient and the specific hidden unit are indicated above each plot. The Daubechies wavelet coefficients are 108 time steps long (instead of 216) because they make use of the discrete wavelet transform. The wavelet coefficients were computed using the PyWavelets package in Python. The sample saliencies for the ECG data using different techniques are depicted in each column. The occlusion width is the number of time steps that are occluded per instance. All the samples shown have a length of 216 time steps (x-axis) and were correctly classified by the model. The importance of each input step is shown on a scale of 0 to 1, with 1 being the most important. The type of ECG signal is indicated on the left with LBBB – left bundle branch block beat, RBBB – right bundle branch block beat, Paced – paced beat, and V-fib – ventricular fibrillation. Class mode visualizations. The optimized class modes for the ECG data (left) and the MNIST data (right). Here the input is optimized with respect to each class in order to find the most likely input for each class. The class for each plot is indicated on the left of the image. This technique did not yield interpretable results.
• 76. Visualization Medical deep learning models #1 https://arxiv.org/abs/1707.02485 Overall illustration of MDNet. We use a bladder image with its diagnostic report as an example. The image model generates an image feature to pass to LSTM in the form of a task tuple and a Conv feature embedding (for the attention model) computed by the AAS module (defined in the method). LSTM executes prediction tasks according to the specified image feature type. The illustration of class-specific attention. From top to bottom: test images, pathologist annotations, and class attention maps. Like the pathologist annotations, the attention maps are most activated in urothelial regions, largely ignoring stromal or background regions. Best viewed in color. http://dx.doi.org/10.1016/j.oret.2016.12.009 An occlusion test (Zeiler and Fergus, 2014) was performed to identify the areas contributing most to the neural network's assigning the category of AMD. A blank 20 × 20-pixel box was systematically moved across every possible position in the image and the probabilities were recorded. The highest drop in probability marks the region of interest that contributed the most importance to the deep learning algorithm.
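The occlusion test described above takes only a few lines; this sketch follows the text's 20-pixel blank box, while the stride and the zero fill value are simplifying assumptions:

```python
import numpy as np

def occlusion_map(model, image, box=20, stride=10):
    """Heatmap of the probability drop when each image region is blanked out."""
    h, w, _ = image.shape
    p_ref = float(model(image[None])[0, 0])          # unoccluded AMD probability
    heat = np.zeros(((h - box) // stride + 1, (w - box) // stride + 1))
    for i, y in enumerate(range(0, h - box + 1, stride)):
        for j, x in enumerate(range(0, w - box + 1, stride)):
            occluded = image.copy()
            occluded[y:y + box, x:x + box, :] = 0.0  # blank 20x20 box
            heat[i, j] = p_ref - float(model(occluded[None])[0, 0])
    return heat  # large values mark regions the network relied on
```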
• 77. Visualization Medical deep learning models #2 https://arxiv.org/abs/1703.10757 Inspired by Zhou et al. (2016), we present in this section the idea of generating the Regression Activation Maps (RAM) of an input image to localize the discriminative region towards the regression outcomes. It is known that the convolutional units of each layer of a CNN act as visual concept detectors, identifying low-level concepts like textures or materials up to high-level concepts like objects or scenes. Deeper into the network, the units become increasingly discriminative. However, fully-connected layers make it difficult to identify the importance of different units for identifying the output labels (regression values, in our networks). Instead, using global average pooling (GAP) and the linear output unit, we can directly visualize the region of interest (ROI) that is most discriminative for a given regression value. As we use regression for the purpose of classification, each single RAM obtained for each single image explicitly depicts the ROI at a different clinical level. In this work, we provided a deep learning model that includes a regression activation map (RAM) layer. The RAM layer can provide robust interpretability of the proposed detection model by monitoring the pathogenesis, so that the proposed model can serve as an assistant for clinicians.
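Because RAM/CAM relies on global average pooling feeding a single linear unit, the map is just a weighted sum of the last conv layer's feature maps. A hedged sketch (the layer names are hypothetical placeholders for whatever the actual model uses):

```python
import numpy as np
import tensorflow as tf

def regression_activation_map(model, image, last_conv="last_conv", out_layer="ram_out"):
    """Weighted sum of final conv feature maps, weighted by the GAP->output weights."""
    feat_model = tf.keras.Model(model.input,
                                [model.get_layer(last_conv).output, model.output])
    fmaps, _ = feat_model(image[None])
    fmaps = fmaps[0].numpy()                                  # (H, W, C) feature maps
    w = model.get_layer(out_layer).get_weights()[0].ravel()   # (C,) linear weights
    ram = np.tensordot(fmaps, w, axes=([2], [0]))             # sum over channels
    return np.maximum(ram, 0)                                 # keep positive contributions
```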
  • 78. Interpretability to EHR Mining and decision making #1 https://youtu.be/co3lTOSgFlA The source code of RETAIN is publicly available at https://github.com/mp2893/retain Model Interpretation for Heart Failure Prediction We demonstrate the interpretability of RETAIN by studying its behavior in the HF prediction task. We choose a HF patient from the test set and calculate the contribution of the variables (medical codes in this case) for making the binary prediction. Figure 3a is the visualization of the contributions of the variables in each visit. The patient suffered from skin problems, skin disorder (SD), benign neoplasm (BN), excision of skin lesion (ESL), for some time before showing symptoms of HF, cardiac dysrhythmia (CD), heart valve disease (HVD) and coronary atherosclerosis (CA), then being diagnosed with HF at the end. We can see that skin-related codes from the earlier visits made little contribution to HF prediction as expected. RETAIN properly puts much attention to the HF-related codes that occurred in recent visits.
  • 79. Interpretability to EHR Mining and decision making #2 GRAM: Graph-based Attention Model for Healthcare Representation Learning. Edward Choi, Mohammad Taha Bahadori, Le Song, Walter F. Stewart, Jimeng Sun. Last revised 1 Apr 2017 (this version, v3) https://arxiv.org/abs/1611.07012 “Deep learning methods exhibit promising performance for predictive modeling in healthcare, but two important challenges remain: - Data insufficiency: Often in healthcare predictive modeling, the sample size is insufficient for deep learning methods to achieve satisfactory results. - Interpretation: The representations learned by deep learning methods should align with medical knowledge. To address these challenges, we propose a GRaph-based Attention Model, GRAM, that supplements electronic health records (EHR) with hierarchical information inherent to medical ontologies.” https://jkulas12.github.io/GRAM_Visualization/
  • 81. Dataset Size How many samples? The more the better, but there are obvious problems with obtaining huge medical datasets https://arxiv.org/abs/1511.06348 (A) The number of misclassified images for each body-part class and (B) the total number of misclassified images over the whole body, as the number of training data sets increases. Classification accuracy results for increasing training-set sizes. There is a rule of thumb stating that one should have 10x as many samples as parameters in the network (for a more formal approach, see VC dimension); for example, the ResNet (He et al. 2015) in the ILSVRC2015 challenge had around 1.7M parameters, thus requiring 17M images by this rule of thumb. https://www.researchgate.net/post/What_is_the_minimum_sample_size_required_to_train_a_Deep_Learning_model-CNN
  • 82. Dataset Size How many samples? More is always better if you train higher-capacity models https://arxiv.org/abs/1707.02968 Since 2012, there have been significant advances in the representation capabilities of models and the computational capabilities of GPUs. But the size of the biggest dataset has surprisingly remained constant. What will happen if we increase the dataset size by 10× or 100×? Our experiments yield some surprising (and some expected) findings: Better representation learning helps! Our first observation is that large-scale data helps in representation learning, as evidenced by improved performance on each and every vision task we study. This suggests that collecting a larger-scale dataset to study pretraining may greatly benefit the field. Our findings also suggest a bright future for unsupervised or self-supervised [10, 42] representation learning approaches. It seems the scale of data can overpower noise in the label space. Performance increases linearly with orders of magnitude of training data! Perhaps the most surprising element of our findings is the relationship between performance on vision tasks and the amount of training data (log-scale) used for representation learning. We find that this relationship is still linear! Even with 300M training images, we do not observe any plateauing effect for the tasks studied. Capacity is crucial: We also observe that to fully exploit 300M images, one needs higher-capacity models. For example, with ResNet-50 the gain on COCO object detection is much smaller (1.87%) compared to the gain (3%) when using ResNet-152. Training with a long tail: Our data has quite a long tail and yet the representation learning seems to work. This long tail does not seem to adversely affect the stochastic training of ConvNets (training still converges). New state-of-the-art results: Finally, our paper presents new state-of-the-art results on several benchmarks using the models learned from JFT-300M. For example, a single model (without any bells and whistles) can now achieve 37.4 AP, compared to 34.3 AP, on the COCO detection benchmark.
  • 83. Dataset Size data augmentation #1 Images from: ftp://ftp.dca.fee.unicamp.br/pub/docs/vonzuben/ia353_1s15/topico10_IA353_1s2015.pdf | Wu et al. (2015) Synthetically increase the number of training samples by distorting them in ways expected from the dataset (random xy-shifts, left-right flips, added Gaussian noise, blur, etc.) → this has been shown to reduce overfitting; see the sketch after this slide. As noted in the previous slides on image quality, it is useful to train the model with various image-quality levels, Köhler et al. (2013). The most successful convolutional architectures are developed starting from ImageNet, a large-scale collection of images of object categories downloaded from the Web. These images are very different from the situated and embodied visual experience of robots deployed in unconstrained settings. To reduce the gap between these two visual experiences, this paper proposes a simple yet effective data augmentation layer that zooms in on the object of interest and simulates the object-detection outcome of a robot vision system. The layer, which can be used with any convolutional deep architecture, brings an increase in object recognition performance of up to 7% in experiments performed over three different benchmark databases. https://arxiv.org/abs/1705.02139
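A minimal numpy/scipy sketch of the classic perturbations listed above, for an (H, W, C) float image scaled to [0, 1]:

import numpy as np
from scipy.ndimage import shift, gaussian_filter

def augment(image, rng=None):
    # Random label-preserving perturbations: xy-shift, left-right flip,
    # mild Gaussian blur, and additive Gaussian noise.
    if rng is None:
        rng = np.random.default_rng()
    out = shift(image, (rng.integers(-5, 6), rng.integers(-5, 6), 0), mode="nearest")
    if rng.random() < 0.5:
        out = out[:, ::-1, :]  # left-right flip
    out = gaussian_filter(out, sigma=(rng.uniform(0, 1), rng.uniform(0, 1), 0))
    out = out + rng.normal(0.0, 0.01, out.shape)  # additive Gaussian noise
    return np.clip(out, 0.0, 1.0)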
  • 84. Dataset Size data augmentation #2 Apply domain-specific perturbations. Dataset Augmentation in Feature Space. Terrance DeVries, Graham W. Taylor (Submitted on 17 Feb 2017) https://arxiv.org/abs/1702.05538 Dreaming More Data: Class-dependent Distributions over Diffeomorphisms for Learned Data Augmentation. Søren Hauberg, Oren Freifeld, Anders Boesen Lindbo Larsen, John Fisher, Lars Hansen; Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:342-350, 2016. http://proceedings.mlr.press/v51/hauberg16.html Our approach is, however, not limited to MNIST: ● Image alignment and registration is a routine task in many medical imaging pipelines, such as the analysis of MRI. ● We make similar observations for time-series data such as acoustic signals. Here dynamic time warping (DTW) is often used as preprocessing to remove differences in the temporal speed of individual signals. ● Mesh alignment is also a standard pre-processing step in the analysis of three-dimensional meshes. As deep models are beginning to appear for three-dimensional data, it would be interesting to combine them with learned augmentation schemes. https://doi.org/10.1016/j.neucom.2016.12.025 In this paper, we propose five data augmentation methods dedicated to face images, including landmark perturbation and four synthesis methods (hairstyles, glasses, poses, illuminations). The proposed methods effectively enlarge the training dataset, which alleviates the impact of misalignment, pose variance, illumination changes and partial occlusions, as well as overfitting during training.
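The feature-space idea from DeVries & Taylor is compact enough to sketch: extrapolate each feature vector away from its same-class nearest neighbour. A sketch under the assumption that features have already been extracted by some encoder:

import numpy as np

def extrapolate_features(feats, labels, lam=0.5):
    # For each feature vector c_i, find its same-class nearest neighbour c_j
    # and generate c' = (c_i - c_j) * lam + c_i (extrapolation in feature space).
    new_feats, new_labels = [], []
    for i, c_i in enumerate(feats):
        same = np.where(labels == labels[i])[0]
        same = same[same != i]
        if len(same) == 0:
            continue
        d = np.linalg.norm(feats[same] - c_i, axis=1)
        c_j = feats[same[np.argmin(d)]]  # nearest same-class neighbour
        new_feats.append((c_i - c_j) * lam + c_i)
        new_labels.append(labels[i])
    return np.array(new_feats), np.array(new_labels)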
  • 85. Dataset Size Generative synthetic data Augmentation through generative adversarial networks (GANs) The CVPR 2017 awards are out! The two winners are Densely Connected Convolutional Networks by Facebook and Improving the Realism of Synthetic Images by Apple. https://arxiv.org/abs/1612.07828 https://machinelearning.apple.com/2017/07/07/GAN.html https://github.com/wayaai/SimGAN https://arxiv.org/abs/1706.02071 https://github.com/val-iisc/deligan TextureGAN: Controlling Deep Image Synthesis with Texture Patches. Wenqi Xian, Patsorn Sangkloy, Jingwan Lu, Chen Fang, Fisher Yu, James Hays (Submitted on 9 Jun 2017) https://arxiv.org/abs/1706.02823
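The SimGAN recipe behind the Apple paper trains a refiner network with an adversarial realism term plus a self-regularization term that keeps the refined image close to the synthetic input, so annotations carry over. A hedged PyTorch sketch of that refiner objective (the discriminator here is assumed to output one logit per image, 1 = real):

import torch
import torch.nn.functional as F

def refiner_loss(discriminator, synthetic, refined, lam=0.5):
    # Adversarial term: fool the discriminator into calling refined images real.
    pred = discriminator(refined)
    adv = F.binary_cross_entropy_with_logits(pred, torch.ones_like(pred))
    # Self-regularization: preserve the annotation-bearing content of the input.
    self_reg = F.l1_loss(refined, synthetic)
    return adv + lam * self_reg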
  • 86. Dataset Size semi-supervised training #1 Jointly use labeled and unlabeled data https://arxiv.org/abs/1705.08850 Our empirical results show that using the tangents of the data manifold (as estimated by the generator of the GAN) to inject invariances into the classifier improves performance on semi-supervised learning tasks. https://arxiv.org/abs/1706.00400 N. Siddharth, Brooks Paige, Jan-Willem Van de Meent, Alban Desmaison, Frank Wood, Noah D. Goodman, Pushmeet Kohli, Philip H.S. Torr Here we are interested in learning disentangled representations that encode distinct aspects of the data into separate variables. We propose to learn such representations using model architectures that generalize from standard variational autoencoders (VAEs), employing a general graphical model structure in the encoder and decoder. This allows us to train partially-specified models that make relatively strong assumptions about a subset of interpretable variables and rely on the flexibility of neural networks to learn representations for the remaining variables. We further define a general objective for semi-supervised learning in this model class, which can be approximated using an importance sampling procedure.
  • 87. Dataset Size semi-supervised training #2 https://arxiv.org/abs/1707.03631 https://arxiv.org/abs/1705.09783 In this work, we present a semi-supervised learning framework that uses generated data to boost task performance. Under this framework, we characterize the properties of various generators and theoretically prove that a complementary (i.e., bad) generator improves generalization. Empirically, our proposed method improves the performance of image classification on several benchmark datasets. Our proposed method, adversarial dropout, can be viewed both from the dropout and from the adversarial-training perspectives. Adversarial dropout can be interpreted as dropout masks whose direction is counter-optimized, adversarially, against the model's label assignment. However, it should be noted that adversarial dropout and traditional adversarial training with additive perturbation are different, because adversarial dropout induces a sparse structure in the neural network, while the latter does not change the network directly.
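The methods above go beyond it, but the simplest semi-supervised baseline worth keeping in mind is pseudo-labeling: train on the labeled set, adopt the model's confident predictions on unlabeled data as extra labels, retrain. A minimal sketch assuming a scikit-learn-style classifier:

import numpy as np

def pseudo_label(model, X_lab, y_lab, X_unlab, threshold=0.95, rounds=3):
    # Iteratively grow the training set with confident pseudo-labels.
    X, y = X_lab, y_lab
    for _ in range(rounds):
        model.fit(X, y)
        proba = model.predict_proba(X_unlab)
        keep = proba.max(axis=1) >= threshold  # only very confident samples
        if not keep.any():
            break
        X = np.vstack([X_lab, X_unlab[keep]])
        y = np.concatenate([y_lab, proba[keep].argmax(axis=1)])
    return model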
  • 88. Dataset Size Active learning and “smart” labeling #1 When labeling is very time-consuming, active learning can help us choose which unlabeled samples to label. Active Learning and Proofreading for Delineation of Curvilinear Structures. Mosinska, Agata Justyna; Tarnawski, Jakub; Fua, Pascal. Presented at: MICCAI, Quebec City, Canada, September 10-14, 2017 https://infoscience.epfl.ch/record/229472 https://arxiv.org/abs/1704.07433
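The simplest active-learning criterion is uncertainty sampling: query the pool samples the current model is least sure about. A minimal sketch with a scikit-learn-style classifier:

import numpy as np

def least_confident_query(model, X_pool, n_queries=10):
    # Pick the samples whose most probable class has the lowest predicted
    # probability and hand them to the expert for labeling.
    proba = model.predict_proba(X_pool)
    uncertainty = 1.0 - proba.max(axis=1)
    return np.argsort(uncertainty)[-n_queries:]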
  • 89. Dataset Size Transfer learning Leveraging features learned from bigger non-medical datasets. Our approach fine-tunes a pre-trained convolutional neural network (CNN), GoogLeNet. The fine-tuned CNN could effectively identify pathologies in comparison to classical learning. Our algorithm aims to demonstrate that models trained on non-medical images can be fine-tuned for classifying OCT images with limited training data. Biomedical Optics Express Vol. 8, Issue 2, pp. 579-592 (2017) https://doi.org/10.1364/BOE.8.000579 International Workshop on Large-Scale Annotation of Biomedical Data and Expert Label Synthesis; International Workshop on Deep Learning in Medical Image Analysis. LABELS 2016, DLMIA 2016: Deep Learning and Data Labeling for Medical Applications, pp 188-196. Understanding the Mechanisms of Deep Transfer Learning for Medical Images https://doi.org/10.1007/978-3-319-46976-8_20 Hariharan Ravishankar, Prasad Sudhakar, Rahul Venkataramani, Sheshadri Thiruvenkadam, Pavan Annangi, Narayanan Babu, Vivek Vaidya. Deep Learning and Convolutional Neural Networks for Medical Image Computing, pp 181-193. Part of the Advances in Computer Vision and Pattern Recognition book series (ACVPR). On the Necessity of Fine-Tuned Convolutional Neural Networks for Medical Imaging https://doi.org/10.1007/978-3-319-42999-1_11 Nima Tajbakhsh, Jae Y. Shin, Suryakanth R. Gurudu, R. Todd Hurst, Christopher B. Kendall, Michael B. Gotway, Jianming Liang. In this paper, we studied the necessity of fine-tuning and the effective level of knowledge transfer in 4 medical imaging applications. Our experiments demonstrated that medical imaging applications were conducive to transfer learning and that fine-tuned CNNs were necessary to achieve high performance, particularly with limited training datasets. We also showed that the desired level of fine-tuning differed from one application to another. While deeper levels of fine-tuning were suitable for polyp and PE detection, intermediate fine-tuning worked best for interface segmentation and colonoscopy frame classification. Our findings led us to conclude that layer-wise fine-tuning is a practical way to reach the best performance based on the amount of available data.
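In PyTorch/torchvision terms, layer-wise fine-tuning amounts to freezing a prefix of the pretrained network and retraining the rest; how deep the frozen prefix goes is exactly the knob Tajbakhsh et al. tune per application. A hedged sketch (ResNet-18 used for illustration rather than GoogLeNet):

import torch.nn as nn
from torchvision import models

def build_finetune_model(n_classes, n_frozen_children=6):
    # Load an ImageNet-pretrained network, freeze the first few child modules
    # (generic early features) and replace the classifier head for the new task.
    net = models.resnet18(pretrained=True)
    for child in list(net.children())[:n_frozen_children]:
        for p in child.parameters():
            p.requires_grad = False
    net.fc = nn.Linear(net.fc.in_features, n_classes)  # new task-specific head
    return net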
  • 90. Dataset Quality Beyond A giant with feet of clay: on the validity of the data that feed machine learning in medicine. Federico Cabitza, Davide Ciucci, Raffaele Rasoini. Last revised 26 Jun 2017 https://arxiv.org/abs/1706.06838 We point out how uncertainty is so ingrained in medicine that it also biases the representation of clinical phenomena, that is, the very input of ML models, thus undermining the clinical significance of their output. Recognizing this can motivate both medical doctors, in taking more responsibility in the development and use of these decision aids, and researchers, in pursuing different ways to assess the value of these systems. In so doing, both designers and users could take this intrinsic characteristic of medicine more seriously and consider alternative approaches that do not "sweep uncertainty under the rug" within an objectivist fiction that everyone can conveniently believe to be true. Garbage in, Gospel out: The question of the quality of the medical record, and of the data extracted from it, is still understudied [Cabitza and Batini, 2016; Stetson et al. 2012], let alone in regard to machine learning projects [Feldman et al. 2017]. The assumption that medical data could support secondary uses has been challenged for almost 25 years, and strongly so, e.g., by Reiser (1991), who described several cases of erroneous, missing and ambiguous data, and by Burnum (1989), who provocatively wrote that "all medical record information should be regarded as suspect; much of it is fiction" (p. 484). JAMA. Published online July 20, 2017. doi:10.1001/jama.2017.7797 https://doi.org/10.1177/0272989X12465490 Conclusions: Our exploratory analysis method reveals unexpected effects. It indicates that, despite the original study detecting no significant average effect, computer-aided detection (CAD) helped the less discriminating readers but hindered the more discriminating readers. Such differential effects, although subtle, may be clinically significant and important for improving both computer algorithms and protocols for their use. They should be assessed when evaluating CAD and similar warning systems.
  • 93. RETINA A schematic view of the retina showing the organization of different neuronal populations and their synaptic connections. Rods and cones are confined to the photoreceptor layer. Light detected by rods and cones is processed and signalled to retinal ganglion cells (RGCs) through horizontal, amacrine and bipolar cells. RGCs are the only output neurons from the retina to the brain. A subset of RGCs (4–5% of the total number of RGCs) are intrinsically photosensitive RGCs (ipRGCs) containing the photopigment melanopsin. There are at least five subtypes of ipRGCs (M1–M5) with different morphological and electrophysiological properties, which show widespread projection patterns throughout the brain. LeGates et al. (2014): “Light as a central modulator of circadian rhythms, sleep and affect” Retinal circuits. (a) The cellular and synaptic (i.e., plexiform) layers of the retina. Some of the various cell types composing the five classes of neurons are shown: rod and cone photoreceptors, horizontal cells (HCs), ON and OFF cone bipolar cells (BCs), rod BCs, AII and wide-field (WF) amacrine cells (ACs), and ON and OFF ganglion cells (GCs). The ON and OFF BC axon terminals and GC dendrites stratify in separate halves of the inner plexiform layer. (b) Several cell types from panel a, redrawn to illustrate how rod signals pass through the inner retina. Excitatory (+) and inhibitory (−) synapses are shown. A gap junction (denoted by the resistor symbol) allows bidirectional current flow between AII ACs and ON cone BCs. The AII AC splits the ON rod BC signal into ON and OFF components using either electrical (gap junction, ON) or chemical (glycinergic, OFF) synapses. Note that in daylight conditions, cone-mediated drive to the AII influences the OFF pathway as follows: cone → ON cone BC → AII AC → OFF BC and GC. Example of circuit switching. (a) The excitatory input to an ON ganglion cell (GC) is driven by both rod and cone circuits. The rod circuits actually signal via the cone bipolar cell terminal. The inhibition from the surround is mediated by a wide-field amacrine cell (WF AC) driven exclusively by cone circuits. (b) When the rod circuit is active, the ON GC has a receptive field with an excitatory center component only. When the cone circuit is active, the inhibitory surround component switches on. Synaptic motifs. (a) From the perspective of a bipolar cell (pipette attached), inhibition arising from amacrine cells (ACs) occurs via multiple synaptic motifs. Excitatory (+) and inhibitory (−) synapses are indicated; feedback and feedforward synapses can occur in both ON and OFF systems, and crossover inhibition acts between ON and OFF systems. The illustrated circuit is an ON → OFF inhibitory one, but the opposite pattern (OFF → ON) could also occur. (b) From the perspective of a ganglion cell (GC) (pipette attached), inhibition from ACs occurs via multiple synaptic motifs. This panel follows the same conventions as used in panel a. Note! Melanopsin-containing retinal ganglion cells (pRGC, ipRGC, mRGC: the same thing) were discovered only recently, in 2002, by Berson et al. [Cited by 1956], thus you might find them missing from textbook versions of retinal circuits. Initially they were thought to contribute mainly to sleep/alertness and circadian rhythm regulation, but they have since been shown to contribute to image-forming vision as well.
  • 94. RETINA response characteristics: Spectral #1 SPECTRAL PROPERTIES Teikari thesis (2012), Enezi et al. 2011, Stockman and Sharpe (2000), CVRL, Govardovskii et al. 2000, van de Kraats and van Norren 2007, Walraven 2003 CIE Report. “For environmental light” / “At the retinal level, if you did not have the ocular media”. The absorbance spectrum of an exemplary vertebrate rhodopsin (λmax ≈ 500 nm), considered as a sum of absorbance bands, indicated by alpha (α), beta (β), gamma (γ), sigma (σ) and epsilon (ε), normalized to the peak absorbance of the alpha band (after Stavenga and van Barneveld 1975, from Stavenga 2010). The sidelobe on the short-wave side comes from the beta band (see the template from Govardovskii et al. 2000). The self-screening effect changes the width/peak of the absorption spectrum. (A) Percentage absorption spectra for various concentrations of photopigment (OD = optical density in log units). (B) An illustration of self-screening at various photoreceptor lengths. The human rod photoreceptor is ~25 μm long (Pugh and Lamb 2000) and the cone photoreceptor ~13 μm (Baylor et al. 1984). The longest known photoreceptor has been found in the dragonfly, with a length of 1,100 μm (Labhart and Nilsson 1995). “The human crystalline lens strongly absorbs blue light and UV.” V′(λ) is the spectral sensitivity for night vision, and V(λ) for daytime vision. Not shown is the mesopic sensitivity VM(λ), a nonlinear combination of daytime and night vision operating in dim-light color vision. Quantally defined daytime sensitivity (2° central vision, Sharpe et al., 2005): V*(λ) = [1.891·l(λ) + m(λ)] / 2.80361, where l(λ) is the long-wavelength ('red') cone sensitivity and m(λ) the medium-wavelength ('green') cone sensitivity. Note! Melanopsin and S-cones do not seem to contribute to central-vision luminance perception vs. RGB luminance. Stockman, A., & Sharpe, L. T. (2008). Spectral sensitivity. In The Senses: A Comprehensive Reference, Volume 2: Vision II (pp. 87-100). Goodeve et al., 1942: Without the crystalline lens (aphakic eye), visual sensitivity would extend into the ultraviolet.
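The quantal luminous efficiency formula above is trivial to evaluate once the Stockman-Sharpe cone fundamentals have been downloaded from CVRL; a minimal sketch with hypothetical input arrays:

import numpy as np

def v_star(l_bar, m_bar):
    # Quantal 2-deg luminous efficiency V*(lambda), Sharpe et al. (2005):
    # V*(lambda) = [1.891 * l(lambda) + m(lambda)] / 2.80361
    # l_bar, m_bar: L- and M-cone quantal sensitivities sampled at the same
    # wavelengths (e.g., the CVRL cone fundamentals).
    return (1.891 * np.asarray(l_bar) + np.asarray(m_bar)) / 2.80361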
  • 95. RETINA response characteristics: Spectral #2 Dominance of L cones over S cones across species. Measured S cone proportion is shown for a variety of animals. For some animals, two measurements at different locations on the retina are shown. Large variation in L cone proportion indicates dorso-ventral asymmetries, like those discussed in Szél et al. (2000). Science 10 Jun 2011: Vol. 332, Issue 6035, pp. 1307-1312. DOI: 10.1126/science.1200172 For a short wavelength-sensitive pigment, although its noise literally disappears at λmax < 400 nm (Fig. C), nonspecific light absorption by proteins, peaking at ~280 nm, becomes a limiting factor. These considerations probably explain, at least partially, why the λmax values of native visual pigments are confined to the narrow bandwidth of ~360 to 620 nm, limiting color vision accordingly. Predicted thermal-noise rate constant as a function of λmax. Black circles, rhodopsins; red squares, cone pigments. http://dx.doi.org/10.1016/S0896-6273(00)80845-4 Present-day vertebrates vary enormously in the sophistication of their color vision, the density and spatial distribution of cone classes, and the number and absorption maxima of their cone pigments (38, 30, 31 and 100). At one extreme, most mammals have only three pigments: the two ancestral cone pigments and rhodopsin. At the other evolutionary extreme, chickens possess six pigments: four cone pigments, one rhodopsin, and a pineal visual pigment, pinopsin. In this evolutionary comparison, humans and their closest primate relatives represent an intermediate level of complexity. Humans have four visual pigments (1999): a single member of the <500 nm family of cone pigments (the blue or short-wave pigment, with an absorption maximum at ~425 nm), two highly homologous members of the >500 nm family (the green or middle-wave pigment and the red or long-wave pigment, with absorption maxima at ~530 and ~560 nm, respectively), and rhodopsin. The presence of only a single gene encoding a >500 nm pigment in almost all New World primates, and in all nonprimate mammals studied to date, places the red/green visual pigment gene duplication in the Old World primate lineage at ~30-40 million years ago, shortly after the geologic split between Africa and South America (Jacobs 1993). DOI: 10.1098/rstb.2009.0050 The spectral tuning of vertebrate opsins will also be influenced by their evolutionary history (Goldsmith 1990). Melanopsin acts as a bistable pigment able to regenerate (recycle) its chromophore (11-cis-retinal) using all-trans-retinal and long-wavelength light, in a manner reminiscent of the invertebrate photopigments (Melyan et al. 2005). In this regard melanopsin may be unique among mammalian photopigments in forming a stable association with all-trans-retinal. Interestingly, the melanopsins appear to share some of the key characteristics of an invertebrate-like signal transduction pathway. Both pRGCs and cells transfected with melanopsin show depolarizing responses to light and display chromophore bistability/tristability (Emanuel et al. 2015), another feature of the invertebrate photopigments. Amino acid sequence features of the melanopsin protein result in delayed deactivation and temporal integration of the light signal (Mure et al., 2016).
  • 96. RETINA response characteristics: Spectral #3 Transducing intermediate pigment states. Schematic representation of the photochemical invertebrate rhodopsin cycle of the blowfly (Calliphora). Rhodopsin R, excited by light absorption, converts to bathorhodopsin B. Thermal decay via lumirhodopsin L to metarhodopsin M follows. The back reaction proceeds via putative intermediates K and possibly N. Time constants of the conversion steps are indicated (Kruizinga et al. 1983). Vertebrate rhodopsin intermediates. (A) Decay of the activated Meta II state to Meta III. Illumination of rhodopsin's dark state (λmax = 500 nm) produces the Meta I/Meta II photoproduct equilibrium. By applying a second illumination, the decay product Meta III of the second pathway can be converted back to Meta I/Meta II (again consisting mostly of Meta II), while the decay products of the first pathway, opsin and all-trans retinal, remain largely unreactive. (B) Bovine rhodopsin transduction. Activation of rhodopsin is achieved by light-dependent isomerization of the chromophore and subsequent thermal relaxation of the receptor on the millisecond time scale to the active receptor conformation (Bartl and Vogel 2007). A State Model for Tristable Melanopsin. (A) State diagram of melanopsin (top) based on parameters measured biochemically from purified pigment (Matsuyama et al., 2012). Shown are melanopsin (R), metamelanopsin (M), and extramelanopsin (E) with chromophores designated. Below are plotted the relative photosensitivities (i.e., products of the extinction coefficients and quantum efficiencies) of these states as a function of wavelength. (B) Predicted equilibrium fraction of each pigment state as a function of wavelength. Lines show the R state (black), M state (blue), and E state (red). (Emanuel et al. 2015) Photoreversal of vertebrate rhodopsin (Williams 1964). Both the test flash and the bleaching light consisted of long wavelengths primarily absorbed by rhodopsin. The blue, photoregenerating flash contained wavelengths absorbed by the longer-lived intermediates of the bleaching process. This photoreversal might in practice enhance the blue-light hazard (Grimm et al. 2000). Regeneration of pigment to the responsive state by a second illumination occurs both with 'invertebrate'-like melanopsin and vertebrate rhodopsin. DOI: 10.1042/bj3301201 http://dx.doi.org/10.1016/j.visres.2005.12.017 | Cited by 26 Time courses of the amounts of photolysis products in goldfish cones, normalized to bleached visual pigment. Decomposition of the final T- and L-spectra of rod outer segments at 1800 s post-bleach (noisy curves) into components. It reveals, in addition to dehydroretinal and P480, generation of a small amount of dehydroretinol. The sum of RAL, ROL, and P480 (bold curves) provides a good approximation of the experimental spectra.
  • 97. RETINA response characteristics: Spectral #4 http://dx.doi.org/10.1038/13185 Genetic and psychophysical results from the latter class indicated that limited red–green discrimination can be achieved with pigments that have the same peak wavelength sensitivity and that differ only in optical density. Types of color blindness with their prevalence faculty.montgomerycollege.edu http://dx.doi.org/10.1016/j.visres.2011.08.016 sensationalcolor.com/understanding-color www.npr.org/2014/11/16 http://www.bbc.co.uk/news/entertainment-arts-27884975 By Colin Schultz | smithsonian.com | August 20, 2012
  • 98. RETINA response characteristics: Spectral #5 A tiny group of people can see ‘invisible’ colours (“tetrachromacy”: four cone types instead of three) that no-one else can perceive, discovers David Robson. How do they do it? “Jordan’s ‘acid test’ involved coloured discs showing different mixtures of pigment, such as a green made of yellow and blue. The mixtures were too subtle for most people to notice: almost all people would see the same shade of olive green, but each combination should give out a subtly different spectrum of light that would be perceptible to someone with a fourth cone. Sure enough, Jordan’s subject was able to differentiate between the different mixtures each time.” http://www.bbc.com/future/story/20140905-the-women-with-super-human-vision While tetrachromacy is so rare that it makes headlines every time a new case emerges, it might come as a surprise that women with four cone types in their retinas are actually more common than we think. Researchers estimate that they represent as much as 12% of the female population (4). So why aren’t we surrounded by women with extraordinary colour vision? Researchers have found that only a small fraction of women who possess an extra cone type actually get to enjoy more colours. So what does it take to be a true tetrachromat? How does the human retina come to produce four cone types, and why does it only concern women? More importantly, why don’t all women fulfil their genetic potential? And how do we find the special women who do? theneurosphere.com/2015/12/17 [4] Jordan, G. et al. (2010). The dimensionality of color vision in carriers of anomalous trichromacy. Journal of Vision, 10. Tetrachromats are rare enough, but Concetta Antico is particularly remarkable, since, as an artist, she is able to give us a rare view into that world. “Her artwork might tap into a structure that all of us can appreciate,” says Kimberly Jameson at the University of California, Irvine, who has studied Antico extensively. It’s even possible that she might suggest ways for more people to see the same way.
  • 99. RETINA response characteristics: Intensity Illumination levels. Typical ambient light levels are compared with photopic luminance (log cd·m⁻²), pupil diameter (mm), photopic and scotopic retinal illuminance (log photopic and scotopic trolands, respectively) and visual function. The scotopic, mesopic and photopic regions are defined according to whether rods alone, rods and cones, or cones alone operate. The conversion from photopic to scotopic values assumed a standard white CIE D65 illumination. (Stockman and Sharpe 2006) How these four separate mechanisms (photopigment depletion, pupil contraction, cellular adaptation and response compression) coordinate luminance adaptation is not yet known. However, Peter Kaiser and Robert Boynton provide a quantitative illustration of how the four principal processes might interact, as shown below. http://www.handprint.com/HP/WCL/color4.html Top right: Spectral response of the eye for point sources. Peak cone sensitivity is over 200 times lower than peak rod sensitivity. Relative sensitivities of S, L and M cones are shown within the photopic mode; by combining their inputs, the brain creates colors. Bottom left: Exposed to low-light conditions in full photopic mode, cone sensitivity increases 30-100 times within ~10 minutes, reaching its maximum sensitivity level (the darker it is, the faster the transition from cone to rod function; in near-complete darkness, the cones shut down almost instantly). At the cone-rod break, rods become dominant, gaining in sensitivity some 200-1000 times over peak cone sensitivity within the next ~20 minutes (individual sensitivity varies within the shown approximate range: by a factor of ~3 and ~10 for the cones and rods, respectively). In the process, peak sensitivity shifts from ~555 nm (photopic) to ~507 nm (scotopic). The response range shifts from ~400-730 nm to ~370-650 nm, respectively. Dark-to-light eye adaptation takes considerably less time: only about 7 minutes. (a) Maximum sensitivity level, after ~10 min in darkness; maximum bright-light cone sensitivity is 30-100 times lower. http://www.telescope-optics.net/eye_spectral_response.htm
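Several of the quantities in this chart are linked by the standard troland conversion: retinal illuminance T (in photopic trolands) is luminance times pupil area. A minimal sketch:

import math

def trolands(luminance_cd_m2, pupil_diameter_mm):
    # T = L * A, with L the luminance in cd/m^2 and A the pupil area in mm^2.
    area_mm2 = math.pi * (pupil_diameter_mm / 2.0) ** 2
    return luminance_cd_m2 * area_mm2

# Example: 100 cd/m^2 seen through a 3 mm pupil is about 707 trolands.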
  • 100. RETINA response characteristics: Circuit (A) Time course of the Early Receptor Potential (ERP) and Late Receptor Potential (LRP) in monkey retina compared to the ERG a-wave (redrawn from Brown and Murakami 1964). (B) Intensity dependence of the human ERP, illustrating the log-linear relationship between light intensity and pigment-level responses, and the non-linear relationship between light intensity and the a-wave response (redrawn from Debecker and Zanen 1975). Graph from Teikari thesis (2012). The cells of the retina and their response to a spot light flash. The photoreceptors are the rods and cones, in which a negative receptor potential is elicited. This drives the bipolar cell to become either depolarized or hyperpolarized. The amacrine cell has a negative feedback effect. The ganglion cell fires an action pulse so that the resulting spike train is proportional to the light stimulus level. (bem.fi) The classical photoreceptors, cones and rods, are not designed to encode absolute light levels (unlike melanopsin RGCs), and non-linearity is introduced very early in visual processing, already at the retinal level. The pigment conformational change (cis to trans) is linear in relation to light intensity, but the photoreceptor response is already nonlinear (a-wave). The dependence of the b-wave amplitude (solid squares) and the a-wave amplitude (open squares) on the log intensity of the light stimulus. The data points describe the mean ± 2 SD of the responses obtained in the dark-adapted state from 40 eyes of 20 volunteers with normal vision. The relationship between the b-wave amplitude and the a-wave amplitude obtained from responses evoked in the dark-adapted state. The continuous line describes the mean relationship, while the 2 dashed lines bound the normal range (mean ± 2 SD). Open and solid triangles represent normal ERG data obtained, respectively, from papers by Berson and Weleber. Data from 2 patients are also illustrated; one patient suffers from high myopia (open circles), while the other complained of nyctalopia (solid circles). Relationship between the amplitudes of the b wave and the a wave as a useful index for evaluating the electroretinogram. I. Perlman, Br J Ophthalmol 1983;67:443-448. doi:10.1136/bjo.67.7.443. Cited by 69
  • 101. Retina advanced processing (2010) http://dx.doi.org/10.1016/j.neuron.2009.12.009, Cited by 266 (A) Detection of dim light flashes in the rod-to-rod bipolar pathway. Each photoreceptor output is sent through a band-pass temporal filter followed by a thresholding operation before summation by the rod bipolar cell. Computations Performed by the Retina and Their Underlying Microcircuits. (B) Sensitivity to texture motion. The bipolar cells have biphasic dynamics and thus respond transiently. Only the depolarized bipolar cells communicate to the ganglion cell, because of rectification in synaptic transmission. (C) Detection of differential motion. Polyaxonal amacrine cells in the periphery are excited by the same motion-sensitive circuit and send inhibitory inputs to the center. If motion in the periphery is synchronous with that in the center, the excitatory transients will coincide with the inhibitory ones, and firing is suppressed. (D) Detection of approaching motion. The circuit that generates this approach sensitivity is composed of excitation from OFF bipolar cells and inhibition from amacrine cells that are activated by ON bipolar cells, at least partly via gap junction coupling. Importantly, these inputs are nonlinearly rectified before integration by the ganglion cell. (E) Rapid encoding of spatial structures with spike latencies. The responses result from a circuit that combines synaptic inputs from both ON and OFF bipolar cells whose signals are individually rectified. The timing differences in the responses follow from a delay (∆t) in the ON pathway. (F) Switching circuit. A control signal selectively gates one of two potential input signals. (Right) In the retina, such a control signal is driven by certain wide-field amacrine cells (A1), which are activated during rapid image shifts in the periphery. Their activation leads to a suppression of OFF bipolar signals and, through a putative local amacrine cell (A2), to disinhibition of ON bipolar signals. Journal of Vision May 2008, Vol. 8, 15. doi: 10.1167/8.5.15 Cited by 42 Basic data from Hofer, Singer, et al. (2005). (Top panels) Schematics of retinal mosaics used for the 5 observers. These are subsets of the full regions characterized for each observer. For each observer, the imaging and densitometry data were insufficient to assign a class or exact location to some cones. These parameters were filled in according to the procedure described by Hofer, Singer, et al. (2005). In the schematics, L cones are colored red, M cones green, and S cones blue. L:M ratios of mosaics used: HS 1:3.1; YY 1.2:1; AP 1.3:1; MD 1.6:1; BS 14.7:1. Roorda (2011): Instead of eliciting three classes of response generated from the stimulation of the three cone classes, they found that subjects demanded as many as 7 color categories. Analysis of the responses suggested that the color appearance generated by a single cone is more a function of how it is situated with respect to other cones rather than of its spectral subtype. Cones that are in a position to provide strong chromatic cues generate colored percepts, whereas cones that are not in a good position to do so generate achromatic, or white, percepts. Given the random arrangement of the three cone classes in the retina, it is sensible that the visual system would develop in this way to best handle the dual role that the retina has in conveying both spatial and color vision. An adaptive optics system was used to measure and correct for aberrations in the optics of individual observers.
This enabled resolution of individual cones in acquired fundus images. Science Advances 14 Sep 2016: Vol. 2, no. 9, e1600797
  • 102. Retina already recurrent as well Published: May 3, 2011 http://dx.doi.org/10.1371/journal.pbio.1001058 Published: May 3, 2011 | http://dx.doi.org/10.1371/journal.pbio.1001057 A conceptual model of positive feedback in the outer retina. (A) Diagram depicting the differential spread of positive and negative feedback within an HC. The top bar denotes the illumination pattern. A cone depolarized in darkness will release glutamate, activating AMPA receptors (AMPARs), causing depolarization and Ca2+ influx. The rise in Ca2+ is restricted to the specific dendrite that contacts the cone, and the resulting positive feedback is localized to that cone. The depolarization spreads electrotonically through the HC, resulting in negative feedback from all of the dendrites. (B) Model simulations of the effect of feedback on synaptic release from a linear array of cones exposed to a dark spot on a non-saturating light background (see Methods). The positive feedback signal (blue) is localized to HC dendrites in contact with dark cones, while the negative feedback signal (red) spreads electrotonically through the HCs. Traces show simulated cone release with no feedback (green), with negative feedback (red), and with equally weighted negative and positive feedback (blue). Spatial circuitry models in dark- and light-adapted conditions. A: in dark-adapted conditions, OFF bipolar cells receive wide spatial inhibition from wide-field GABAergic amacrine cells. Coupling between both AII and other glycinergic amacrine cells likely contributes to increasing the wide spatial spread of glycinergic signals to OFF bipolar cells. B: in light-adapted conditions, OFF bipolar cells receive spatially narrow glycinergic input, likely due to uncoupling of AII and other glycinergic amacrine cells. Light stimuli distant from the bipolar cell likely activate serial inhibitory connections between GABAergic amacrine cells, which would shorten spatial GABAergic signals to OFF bipolar cells. C: functional schematic of changing bipolar cell center-surround sizes. In dark-adapted conditions, OFF bipolar cells receive wide and strong inhibition, so their inhibitory surrounds are large. If 2 small spots of light are presented to the retina, spot A stimulates excitatory output from the center of one OFF bipolar cell, whereas spot B stimulates surround inhibitory connections to that same cell. Overall output is reduced in this instance due to the addition of inhibitory input. In light-adapted conditions, OFF bipolar cells receive narrow and weaker inhibition, so their inhibitory surrounds are small. In these conditions, spot B does not stimulate the inhibitory surround, and there is no reduction in excitatory bipolar cell output from spot A. Thus the bipolar cell output in the light-adapted case is stronger. doi:10.1152/jn.00948.2015
  • 103. Neuroscience Deep Learning | Background
  • 104. fMRI+EEG+Behavioral data multimodal data http://dx.doi.org/10.1016/j.neuroimage.2015.12.030 Specifically, we show how combining either EEG or fMRI with a behavioral model can perform substantially better than a behavioral-data-only model in both generative and predictive modeling analyses. We then show how a trivariate model – a model including EEG, fMRI, and behavioral data – outperforms bivariate models in both generative and predictive modeling analyses. Graphical diagram derived from Turner et al. (2016) [see previous slides for EEG+fMRI]: Observable data are represented as gray boxes, whereas unknown (latent) variables are represented as empty circles. The orange plate represents the behavioral data/model, the green plate represents the EEG data/model, and the blue plate represents the fMRI data/model. The method allows for any behavioral model to be combined with multiple neural measures. Generative Deep Network: improve the existing generative models.
  • 105. MEG Visual processing with deep learning http://dx.doi.org/10.1016/j.neuroimage.2016.03.063 Magnetoencephalography (MEG) Image set and single-image decoding. (A) The stimulus set comprised 48 indoor scene images differing in the size of the space depicted (small vs. large), as well as clutter, contrast, and luminance level; here each experimental factor combination is exemplified by one image. The image set was based on behaviorally validated images of scenes differing in size and clutter level, de-correlating the factors size and clutter explicitly by experimental design (Park et al., 2015). The deep neural network architecture “AlexNet” was implemented following Krizhevsky et al. (2012). We chose this particular architecture because it was the best performing model in object classification in the ImageNet 2012 competition (Russakovsky et al., 2014). Supplementary Movie 1. The deep scene model accounts for more of the MEG size signal than other models. (A) We combined representational similarity with partial correlation analysis to determine which computational models explained emerging representations of scene size in the brain. Together our data provide a first description of an electrophysiological signal for layout processing in humans and suggest that deep neural networks are a promising framework to investigate how spatial layout representations emerge in the human brain. Future studies using image sets optimized to drive low- and high-level visual cortex equally are necessary to test whether layer-specific representations in deep neural networks can be mapped in both time and space onto processing stages in the human brain. Sidenote! AlexNet was indeed revolutionary at the time, but the 2015 winner, ResNet from Microsoft, surpassed human-level performance on the ImageNet classification task.
  • 106. Brain Circuit feed-forward vs recurrent a | Feedforward network. The diagram shows a multilayer perceptron, consisting of three sequential layers of neurons (represented by circles), in which every neuron from each layer is connected to every neuron of the next layer. In this network, inputs are sequentially processed layer by layer in a unidirectional fashion, from the input layer on the left, to the ‘hidden’ layer in the middle, to the output layer on the right. The simple addition of synaptic weights in the output layer results in the generation of selective responses. The computation is an emergent property of the activity of the entire network. b | Recurrent network: an example of an attractor (feedback) neural network in which four pyramidal neurons (blue) are connected to themselves through recurrent axons (thin lines) with synaptic weights (wij) that change owing to a learning rule. The network receives an external set of inputs (top connections) and generates an output (bottom arrows). In networks with recurrent and symmetric connectivity the activity becomes ‘attracted’ to particular stable patterns. http://dx.doi.org/10.1038/nrn3962, Cited by 49 Nature Reviews Neuroscience 11, 615-627 (September 2010) doi:10.1038/nrn2886 The feedforward network as a model of information processing in the brain. a | A schematic of hierarchical processing in the visual systems of primates. Similar schematic models have also been described for other sensory and motor areas. b | Each module in part a can be considered as a recurrent network of excitatory and inhibitory neurons. Each of the rectangular boxes represents a recurrent random network. The hierarchical structure of the brain is conceived here as a network of recurrent networks with forward and backward excitatory connections. So far, only the feedforward part (shown in black) of such a network of networks has been investigated in a systematic manner. Recurrent excitation and inhibition within one group and excitatory synapses that do not contribute to the feedforward hierarchy of subsequent groups (shown in grey) have not been considered yet
  • 107. Residual variants state-of-the-art Deep feedforward network https://arxiv.org/abs/1512.03385; Cited by 578 https://arxiv.org/abs/1603.05027 https://arxiv.org/abs/1602.07261 https://arxiv.org/abs/1602.07360 http://dx.doi.org/10.1007/978-3-319-46976-8_19 Skip connections https://arxiv.org/abs/1604.08671 The framework of the proposed DEGREE network. The recurrent residual network recovers sub-bands of the HR image features iteratively, and edge features are utilized as guidance in image SR to preserve sharp details.
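The common building block behind all of these variants is the identity-shortcut residual module; a minimal PyTorch sketch of the pre-activation form:

import torch.nn as nn

class ResidualBlock(nn.Module):
    # Pre-activation residual block (He et al., 2016): y = x + F(x),
    # where F is BN-ReLU-Conv-BN-ReLU-Conv.
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
        )

    def forward(self, x):
        return x + self.body(x)  # identity skip connection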
  • 108. Circuit design deep networks vs Human brain #1 Center for Data Science, New York University; Department of Brain and Cognitive Sciences, MIT; Department of Psychology and Center for Brain Science, Harvard University; Center for Brains, Minds and Machines https://arxiv.org/abs/1604.00289; Cited by 13 Science 11 Dec 2015: Vol. 350, Issue 6266, pp. 1332-1338. DOI: 10.1126/science.aab3050; Cited by 70
  • 109. Circuit design deep networks vs Human brain #2 https://arxiv.org/abs/1604.03640 Center for Brains, Minds and Machines, McGovern Institute, MIT How similar is an ultra-deep residual network to the primate cortex? A notable difference is the depth. While a residual network can have as many as 1202 layers, biological systems seem to have two orders of magnitude fewer. In fact, there are about half a dozen areas in the ventral stream of visual cortex from the retina to the inferior temporal (IT) cortex. Notice that it takes on the order of 10 ms for neural activity to propagate from one area to another. The evolutionary advantage of having fewer layers is apparent: it supports rapid (100 ms from image onset to meaningful information in the IT neural population) visual recognition, which is a key ability of human and non-human primates. It is intriguingly possible to account for this discrepancy by taking into account recurrent connections within each visual area. Areas in visual cortex comprise six different layers with lateral and feedback connections, which are believed to mediate some attentional effects and even learning (such as backpropagation). “Unrolling” in time the recurrent computations carried out by the visual cortex provides an equivalent “ultra-deep” feedforward network, which might represent a more appropriate comparison with state-of-the-art computer vision models. In addition, we conjecture that the effectiveness of recent “ultra-deep” neural networks primarily comes from the fact that they can efficiently model the recurrent computations required by the recognition task. We show compelling evidence for this conjecture by demonstrating that 1. a deep residual network is formally equivalent to a shallow RNN; 2. such an RNN with weight sharing, thus with orders of magnitude fewer parameters (depending on the unrolling depth), can retain most of the performance of the corresponding deep residual network. Furthermore, we generalize such an RNN into a class of models that are more biologically-plausible models of cortex and show their effectiveness on CIFAR-10. The transition matrices used in the paper: “BN” denotes Batch Normalization and “Conv” denotes convolution. A deconvolution layer (denoted by “Deconv”) [34] is used as a transition function from a spatially small state to a spatially large one. BRCx2/BRDx2 denotes a BN-ReLU-Conv/Deconv-BN-ReLU-Conv/Deconv pipeline (similar to a residual module). There is always a 2x2 subsampling/upsampling between nearby states (e.g., V1/h1: 32x32, V2/h2: 16x16, V4/h3: 8x8, IT: 4x4). Stride 2 (convolution) or upsampling 2 (deconvolution) is used in transition functions to match the spatial sizes of input and output states. The intermediate feature sizes of transition functions BRCx2/BRDx2 or BRCx3/BRDx3 are chosen to be the average feature size of input and output states. “+I” denotes an identity shortcut mapping. The design of transition functions could be an interesting topic for future research.
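The claimed equivalence is easy to see in code: unrolling one residual transition K times with shared weights gives a K-layer residual network whose layers share parameters. A minimal PyTorch sketch:

import torch.nn as nn

class SharedResidualRNN(nn.Module):
    # h_{t+1} = h_t + F(h_t), with one shared transition function F.
    # Unrolled for `steps` iterations, this is formally a deep residual
    # network with weight sharing across layers (cf. Liao & Poggio).
    def __init__(self, channels, steps):
        super().__init__()
        self.steps = steps
        self.f = nn.Sequential(
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
        )

    def forward(self, h):
        for _ in range(self.steps):  # recurrence in "time" plays the role of depth
            h = h + self.f(h)
        return h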
  • 110. Circuit design deep networks vs Human brain #3 HYPOTHESIS & THEORY ARTICLE Front. Comput. Neurosci., 14 September 2016 | http://dx.doi.org/10.3389/fncom.2016.00094; Cited by 5 Putative differences between conventional and brain-like neural network designs. (A) In conventional deep learning, supervised training is based on externally-supplied, labeled data. (B) In the brain, supervised training of networks can still occur via gradient descent on an error signal, but this error signal must arise from internally generated cost functions. (C) Internally generated cost functions and error-driven training of cortical deep networks form part of a larger architecture containing several specialized systems. Although the trainable cortical areas are schematized as feedforward neural networks here, LSTMs or other types of recurrent networks may be a more accurate analogy, and many neuronal and network properties such as spiking, dendritic computation, neuromodulation, adaptation and homeostatic plasticity, timing-dependent plasticity, direct electrical connections, transient synaptic dynamics, excitatory/inhibitory balance, spontaneous oscillatory activity, axonal conduction delays (Izhikevich, 2006) and others will influence what and how such networks learn. This article is part of the research topic “Artificial Neural Networks as Models of Neural Information Processing”. Machine learning and neuroscience speak different languages today. Brain science has discovered a dazzling array of brain areas (Solari and Stoner, 2011), cell types, molecules, cellular states, and mechanisms for computation and information storage. Machine learning, in contrast, has largely focused on instantiations of a single principle: function optimization. We will argue here, however, that neuroscience and machine learning are again ripe for convergence. Three aspects of machine learning are particularly important in the context of this paper. Hypothesis 1 – The Brain Optimizes Cost Functions. Hypothesis 2 – Cost Functions Are Diverse across Areas and Change over Development. Hypothesis 3 – Specialized Systems Allow Efficient Solution of Key Computational Problems. Machine learning may be equally transformed by neuroscience. Within the brain, a myriad of subsystems and layers work together to produce an agent that exhibits general intelligence. Hypothesis 1 – Existence of Cost Functions. Hypothesis 2 – Biological Fine-structure of Cost Functions. Hypothesis 3 – Embedding within a Pre-structured Architecture. Hypothesis 1 – Did Evolution Separate Cost Functions from Optimization Algorithms? We hypothesize that the brain also acquired such a separation between optimization mechanisms and cost functions. When did the division between cost functions and optimization algorithms occur? How is this separation implemented? How did innovations in cost functions and optimization algorithms evolve? And how do our own cost functions and learning algorithms differ from those of other animals?