CFA CODE COFFEE | HARVARD-SMITHSONIAN CENTER FOR ASTROPHYSICS | JUNE 4, 2014.
LSST/DM: Building a Next Generation Survey Data Processing System
Mario Juric
LSST Data Management Project Scientist

CFA CODE COFFEE
June 4, 2014
Robyn Allsman, Yusra AlSayyad, Tim Axelrod, Jacek Becla, Andrew Becker, Steve Bickerton, Jim Bosch, Bill Chickering, Andy Connolly, Greg Daues, Gregory Dubois-Felsmann, Mike Freemon, Andy Hanushevsky, Fabrice Jammes, Lynne Jones, Jeff Kantor, Kian-Tat Lim, Dustin Lang, Ron Lambert, Robert Lupton (the Good), Simon Krughoff, Serge Monkewitz, Jon Myers, Russell Owen, Steve Pietrowicz, Ray Plante, Paul Price, Andrei Salnikov, Dick Shaw, Schuyler Van Dyk, Daniel Wang, and the LSST Project Team

A Dedicated Survey Telescope
−  A wide (half the sky), deep (24.5/27.5 mag), fast (image the sky once every 3 days) survey telescope. Beginning in 2022, it will repeatedly image the sky for 10 years.
−  The LSST is an integrated survey system. The Observatory, Telescope, Camera and Data Management system are all built to support the LSST survey. There’s no PI mode, proposals, or time.
−  The ultimate deliverable of LSST is not the telescope, nor the instruments; it is the fully reduced data.
•  All science will come from survey catalogs and images
Telescope → Images → Catalogs

Open Data, Open Source: A Community Resource
−  LSST data, including images and catalogs, will be available with no proprietary period to the astronomical community of the United States, Chile, and International Partners
−  Alerts to variable sources (“transient alerts”) will be available world-wide within 60 seconds, using standard protocols
−  LSST data processing stack will be free software (licensed under the GPL, v3-or-later)
−  All science will be done by the community (not the Project!), using LSST’s data products

Why LSST: The Science

LSST: Evolution of Design and Purpose
History
The evolution of LSST design
1996-2000 “Dark Matter Telescope”: This project began as a quest to understand cosmology and the Solar System.
2000 - … “LSST”: Emphasizes a broad range of science from the same multi-wavelength survey data, including unique time domain exploration.
A single telescope, a single data set, can serve to answer a wide swath of science questions.

LSST: A Deep, Wide, Fast, Optical Sky Survey
8.4m telescope    18000+ deg^2    10mas astrom.    r<24.5 (<27.5@10yr)
ugrizy    0.5-1% photometry
3.2Gpix camera    30sec exp/4sec rd    15TB/night    37 B objects
Imaging the visible sky, once every 3 days, for 10 years (825 revisits)
http://lsst.org

Frontiers of Survey Astronomy
−  Time domain science
•  Novae, supernovae, GRBs
•  Source characterization
•  Instantaneous discovery
−  Census of the Solar System
•  NEOs, MBAs, Comets
•  KBOs, Oort Cloud
−  Mapping the Milky Way
•  Tidal streams
•  Galactic structure
−  Dark energy and dark matter
•  Strong lensing
•  Weak lensing
•  Constraining the nature of dark energy

Funding Status
−  December 6th, 2013: Passed the NSF Final Design Review; declared ready for Construction!
−  January 17th, 2014: FY2014 budget signed, with NSF appropriation allowing for LSST start.
−  May 8th, 2014: NSB authorizes NSF Director to start the project.
−  Expecting the signing of cooperative agreement and start of construction in July 2014!

Location: Cerro Pachón, Chile
Leveling of El Peñón (the summit of Cerro Pachón)

LSST Observatory (circa late 2018)

Combined Primary/Tertiary Mirror
−  Primary-Tertiary was cast in the spring of 2008.
−  Fabrication underway at the Steward Observatory Mirror Lab; completion by the end of 2014.
Thin Meniscus Secondary
−  Secondary substrate fabricated by Corning in 2009.
−  Currently in storage waiting for construction.

LSST Camera
Parameter    Value
Diameter     1.65 m
Length       3.7 m
Weight       3000 kg
F.P. Diam    634 mm
Figure label: 1.65 m (5’-5”)
–  3.2 Gigapixels
–  0.2 arcsec pixels
–  9.6 square degree FOV
–  2 second readout
–  6 filters
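
As a rough consistency check on the camera numbers above, the pixel count and pixel scale together imply a field of view close to the quoted one. The sketch below assumes square pixels tiling the focal plane with no gaps between sensors, which is a simplification and not something the slide states.

```python
# Rough consistency check: pixel count x pixel area vs. quoted field of view.
# Assumption (not from the slides): square pixels with no gaps between sensors.
N_PIXELS = 3.2e9            # 3.2 Gigapixels
PIXEL_SCALE_ARCSEC = 0.2    # arcsec per pixel side
ARCSEC_PER_DEG = 3600.0


def implied_fov_sq_deg(n_pixels, pixel_scale_arcsec):
    """Sky area covered by n_pixels square pixels, in square degrees."""
    pixel_area_sq_deg = (pixel_scale_arcsec / ARCSEC_PER_DEG) ** 2
    return n_pixels * pixel_area_sq_deg


fov = implied_fov_sq_deg(N_PIXELS, PIXEL_SCALE_ARCSEC)
print(f"Implied field of view: {fov:.1f} square degrees")
```

The result comes out just under 10 square degrees, in line with the 9.6 square degree figure quoted on the slide.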
Bandpasses: u,g,r,i,z,y

Community: LSST Science Collaborations
Next meeting: August 11-15th 2014, Phoenix, AZ (http://ls.st/hf9)
2012 All Hands Meeting Group Photo, Aug 13-17 2012, Marana, AZ

LSST From the Astronomer’s Perspective
−  A stream of ~10 million time-domain events per night, detected and transmitted to event distribution networks within 60 seconds of observation.
−  A catalog of orbits for ~6 million bodies in the Solar System.
−  A catalog of ~37 billion objects (20B galaxies, 17B stars), ~7 trillion observations (“sources”), and ~30 trillion measurements (“forced sources”), produced annually, accessible through online databases.
−  Deep co-added images.
−  Services and computing resources at the Data Access Centers to enable user-specified custom processing and analysis.
−  Software and APIs enabling development of analysis codes.
Slide labels: Level 1, Level 2, Level 3
LSST Data Management System
(from readout to delivery to the user)

LSST Data Management: Roles
−  Archive Raw Data: Receive the incoming stream of images that the Camera system generates to archive the raw images.
−  Process to Data Products: Detect and alert on transient events within one minute of visit acquisition. Approximately once per year create and archive a Data Release, a static self-consistent collection of data products generated from all survey data taken from the date of survey initiation to the cutoff date for the Data Release.
−  Publish: Make all LSST data available through an interface that uses community-accepted standards, and facilitate user data analysis and production of user-defined data products at Data Access Centers (DACs) and external sites.

HQ Site
•  Science Operations
•  Observatory Management
•  Education and Public Outreach
Archive Site
•  Archive Center: Alert Production, Data Release Production, Calibration Products Production, EPO Infrastructure, Long-term Storage (copy 2)
•  Data Access Center: Data Access and User Services
Summit and Base Sites
•  Telescope and Camera
•  Data Acquisition
•  Crosstalk Correction
•  Long-term storage (copy 1)
•  Chilean Data Access Center
Dedicated Long Haul Networks
•  Two redundant 40 Gbit links from La Serena to Champaign, IL (existing fiber)

Infrastructure: Petascale Computing, Gbit Networks
Long Haul Networks to transport data from Chile to the U.S.
•  200 Gbps from Summit to La Serena (new fiber)
•  2x40 Gbit (minimum) for La Serena to Champaign, IL (protected, existing fiber)
Archive Site and U.S. Data Access Center: NCSA, Champaign, IL
Base Site and Chilean Data Access Center: La Serena, Chile
The computing cluster at the LSST Archive (at NCSA) will run the processing pipelines.
•  Single-user, single-application, dedicated data center
•  Process images in real-time to detect changes in the sky
•  Produce annual data releases
“Applications”: Scientific Core of LSST DM
−  Applications carry core scientific algorithms that process or analyze raw LSST data to generate output Data Products
−  Variety of processing
•  Image processing
•  Measurement of source properties
•  Associating sources across space and time, e.g. for tracking solar system objects
−  Applications framework layer (afw; not shown) allows them to be written in a high-level language

Middleware Layer: Isolating Hardware, Orchestrating Software
Enabling execution of science pipelines on hundreds of thousands of cores.
•  Frameworks to construct pipelines out of basic algorithmic components
•  Orchestration of execution on thousands of cores
•  Control and monitoring of the whole DM System
Isolating the science pipelines from details of underlying hardware
•  Services used by applications to access/produce data and communicate
•  "Common denominator" interfaces handle changing underlying technologies
Database and Science UI: Delivering to Users
Massively parallel, distributed, fault-tolerant relational database.
•  To be built on existing, robust, well-understood technologies (MySQL and xrootd)
•  Commodity hardware, open source
•  Advanced prototype in existence (qserv)
Science User Interface to enable the access to and analysis of LSST data
•  Web and machine interfaces to LSST databases
•  Visualization and analysis capabilities

Going Where the Talent is: One Distributed Team
Infrastructure
Middleware
Core Algorithms (“Apps”)
Database
UI
Mgmt, I&T, and Science QA

The LSST Software Stack
(science pipelines, middleware, database, user interfaces)
“Enabling LSST science by creating a well documented, state-of-the-art, high-performance, scalable, multi-camera, open source, O/IR survey data processing and analysis system.”

LSST Science Pipelines
Data Management Applications Design (LDM-151)
−  02C.01.02.01/02. Data Quality Assessment Pipelines (slides by Juric)
−  02C.01.[02.01.04,04.01,04.02] Calibration Pipelines (slides by Axelrod, Yoachim)
−  02C.03.01. Single-Frame Processing Pipeline (slides by Krughoff, Lupton)
−  02C.03.02. Association pipeline (slides by Lupton)
−  02C.03.03. Alert Generation Pipeline (slides by Becker)
−  02C.03.04. Image Differencing Pipeline (slides by Becker)
−  02C.03.06. Moving Object Pipeline (slides by Jones)
−  02C.04.03. PSF Estimation Pipeline (slides by Lupton)
−  02C.04.04. Image Coaddition Pipeline (slides by AlSayyad)
−  02C.04.05. Deep Detection Pipeline (slides by Lupton)
−  02C.04.06. Object Characterization Pipeline (slides by Lupton, Bosch)
−  02C.01.02.03. Science Pipeline Toolkit (slides by Dubois-Felsmann)
−  02C.03.05/04.07 Application Framework (slides by Lupton)
Calibration reviewed in July ’13, by Wood-Vasey et al.
Pipelines reviewed in Sep. ’13, by Magnier et al.
Slide labels: Level 1, Level 2, L3

Implementation Strategy: Transfer Know-how, not Code
−  Difficulty adapting existing public codes to LSST requirements (AstroMatic suite, PHOTO, Elixir, IRAF-based pipelines, etc.)
•  Need to run efficiently at scale
•  Need to be flexible (plugging/unplugging of algorithms at runtime)
•  Need to have it developed by a large team (20+ scientists and programmers)
•  Need to be maintainable over ~25 years of R&D, Construction, and Survey Operations
•  Need to run on a variety of hardware and software platforms
•  Need to have logging and provenance built into the design
−  Early on (~2006), a decision was made to (largely) transfer the scientific know-how, but not code.
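
The requirement to plug and unplug algorithms at runtime can be pictured as a registry that maps configuration strings to functions. The sketch below is a generic Python pattern written for illustration; the `ALGORITHMS` registry, the `register` decorator, and the configuration key are hypothetical and do not describe how the LSST stack actually implements this.

```python
# Illustrative only: a hypothetical registry, not the LSST stack's mechanism.
import statistics

ALGORITHMS = {}


def register(name):
    """Decorator that files a function in the registry under `name`."""
    def wrap(func):
        ALGORITHMS[name] = func
        return func
    return wrap


@register("mean")
def background_mean(pixels):
    return statistics.mean(pixels)


@register("median")
def background_median(pixels):
    return statistics.median(pixels)


def estimate_background(pixels, config):
    """Look up the algorithm named in the run-time configuration."""
    return ALGORITHMS[config["background"]](pixels)


pixels = [10, 11, 9, 10, 500]   # one bright outlier
print(estimate_background(pixels, {"background": "mean"}))
print(estimate_background(pixels, {"background": "median"}))
```

Swapping the configuration value changes the algorithm without touching the calling code: here the mean is pulled up to 108 by the outlier while the median stays at 10.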
  
Maintainable Design / Language Choices
−  LSST software stack is largely written from scratch, in Python, unless computational demands require the use of C++
•  C++:
-  Computationally intensive code
-  Made available to Python via SWIG
•  Python:
-  All high-level code
-  Prefer Python to C++ unless performance demands otherwise
−  Modularity
•  Virtually everything is a Python module.
•  ~60 packages (git repositories, ~corresponding to python packages)
−  Build system: scons    Version control: git    Package management: EUPS

Modular Architecture
•  Application Framework (comp. intensive C++, SWIG-wrapped into Python)
•  Middleware (I/O, configuration, …)
•  External C/C++ Libraries (Boost, FFTW, Eigen, CUDA ..)
•  External Python Modules (numpy, pyfits, matplotlib, …)
•  Camera Abstraction Layer (obs_* packages)
•  Measurement Algorithms (meas_*)
•  Tasks (ISR, Detection, Co-adding, …)
•  Command-line driver scripts
•  Cluster execution middleware
•  …
Red: Mostly C++ (but Python wrapped); Blue: Mostly Python; Black: External Libraries

Module Dependency Tree
eigen xpa fftw minuit2 afwdata cuda_toolkit pysqlite mysqlclient libpng freetype astrometry_net suprime_ddata zlib tcltk
cfitsio doxygen gsl python swig
boost mysqlpython numpy scons wcslib
matplotlib pyfits
sconsUtils
base
ndarray pex_exceptions
utils
daf_base geom
pex_logging pex_policy
daf_persistence pex_config
afw obs_test
coadd_utils pipe_base skymap skypix testing_displayQA
coadd_chisquared daf_butlerUtils meas_algorithms
ip_diffim ip_isr meas_astrom meas_extensions_photometryKron meas_extensions_rotAngle meas_extensions_shapeHSM obs_lsstSim obs_subaru
pipe_tasks

Module Dependency Tree (same tree, annotated by group)
•  External Tools and Libraries
•  AFW
•  Camera abstractions
•  Measurement Algorithms
•  Top-level scripts

(Very Basic) SExtractor with lsst primitives (1/2)
(Very Basic) SExtractor with lsst primitives (2/2)
Current Status: Advanced Prototypes
−  8-year prototyping effort
•  8 software releases (Data Challenges)
•  Status: A rapidly maturing state-of-the-art astronomical data reduction system
-  ~SDSS/SExtractor level quality of reductions
-  Most recently tested by building co-adds using SDSS Stripe 82 data
-  Used in commissioning of the Hyper Suprime-Cam Survey on Subaru
−  Prototyped Features:
•  Instrumental signature removal
•  Single-frame processing
•  Point source photometry
•  Extended source photometry (model fitting)
•  Deblender
•  Co-addition of images
•  Image differencing
•  Object characterization on multi-epoch data (StackFit/MultiFit)
•  …
Callout: Planning to begin addressing it over the next few months.

Figure: 5 sq. deg. background-matched coadd composite (g,r,i), ~55 epochs. Region: Aqr, Galactic lat = -35.0
New Algorithms: Background-matched co-add of SDSS Stripe 82 in the vicinity of M2.
Background matching preserves diffuse structures.
Generated with LSST pipeline prototypes.
http://moe.astro.washington.edu/sdss/
Slide: Yusra AlSayyad

Streams in LSST-reprocessed SDSS Stripe 82
Stripe 82 background-matched coadds built with LSST Data Management stack (http://moe.astro.washington.edu)
http://moe.astro.washington.edu/sdss/

Example: Forced Photometry on SDSS Stripe 82
Forced Photometry
For every detection in the deep co-add, perform PSF photometry on individual frames (ugriz). Note that the majority of these will be below the single-frame SNR detection threshold.
Averaging those fluxes allows one to go deeper.
Left: comparison of Ivezic et al. (2004) w and y color loci; single frame vs. deep catalog.
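
Why averaging forced fluxes goes deeper can be seen in a short numerical sketch. The example below combines per-epoch fluxes with an inverse-variance weighted mean, which is a standard estimator but not necessarily the one used in the analysis shown on the slide; the flux values are made up for illustration.

```python
# Sketch: combining per-epoch forced fluxes by inverse-variance weighting.
# The flux values below are made up for illustration.
import math


def weighted_mean_flux(fluxes, sigmas):
    """Inverse-variance weighted mean flux and its uncertainty."""
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * f for w, f in zip(weights, fluxes)) / total
    return mean, math.sqrt(1.0 / total)


# A faint source measured on ten frames, each with noise sigma = 1:
# every single-frame measurement has SNR of order 1, so none would be
# detected on its own.
fluxes = [1.3, 0.2, 1.9, 0.8, -0.4, 1.6, 1.1, 0.7, 2.0, 0.8]
sigmas = [1.0] * len(fluxes)

mean, err = weighted_mean_flux(fluxes, sigmas)
print(f"mean flux = {mean:.2f} +/- {err:.2f}, SNR = {mean / err:.1f}")
```

For equal per-epoch errors the uncertainty of the mean shrinks as 1/sqrt(N), so ten epochs give a gain of about 3.2 and ~55 epochs, as in the Stripe 82 coadd described earlier, would give a gain of about sqrt(55) ≈ 7.4.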
  
	
  
Winter 2014 Software Release
Installing
curl -O http://sw.lsstcorp.org/eupspkg/newinstall.sh
bash newinstall.sh
•  Supported platforms (platforms we regularly build on; generally builds on any Linux/BSD)
•  RHEL 6
•  OS X 10.8 Mountain Lion
•  OS X 10.9 Mavericks

WARNING! ADVERTENCIA! AVERTISSEMENT!
THIS IS STILL NOT A FINISHED, POLISHED, READY-TO-USE END-USER PRODUCT! BEFORE DOWNLOADING, PLEASE MAKE SURE TO READ THE DM STACK FAQ:
http://dev.lsstcorp.org/trac/wiki/DM/Policy/UsingDMCode/FAQ
KEY POINTS:
-  POOR DOCUMENTATION
-  YOU’RE DOWNLOADING UNSUPPORTED, PROTOTYPE, CODE
-  THIS CODE WILL NOT WORK OUT OF THE BOX FOR CAMERAS OTHER THAN LSST (AND SDSS).
-  EXPECT TO WRITE SOME PYTHON CODE

The Big Picture: Preparing for the Data Driven Astronomy of the Next Decade (and beyond)

“Astro 2020”: Rise of the Machines
−  We’re witnessing a change in how astronomy is done, and the technical knowledge and tools needed to do it.
•  The rise of big projects and end to data scarcity
•  The rise of systematics limited science
•  The rise of open (source), (massively) collaborative, science
−  Consequences
•  Ability to collect data has outstripped the ability to analyze it
-  Extraction of features from the data (“image processing”)
-  Mining of knowledge from the data (“data mining”)
•  We are critically dependent on computing infrastructure and software/algorithm research for astronomical progress
-  Yet we don’t generally acknowledge, encourage, or teach it
−  Challenges
•  Elevating software engineering to a footing equal to mathematics?
-  Learn-by-osmosis not sufficient any more
•  T(construction) >> T(discovery)
-  Research becoming more data driven
-  Broad interests in astrophysics
-  Statistics, CS, software engineering, etc.
•  Software reusability
-  Increasing complexity makes perpetual wheel reinventions infeasible (and, honestly, silly…)

LSST: Helping Build the Common Codebase for the Next Quarter Century
−  LSST software will be general purpose and highly reusable by design.
•  Necessary to deal with real-world hardware
•  Necessary to be able to process precursor data
•  Necessary to enable science (“Level 3”) software to be written on top of it
−  Opportunities for using LSST-derived code on other data sets
•  More work ahead, but becoming a state of the art, well supported, codebase
•  Possibilities: SDSS, CFHT-LS, PanSTARRS, HSC, DES, WFIRST, Euclid, …
•  Good basis for analysis frameworks (LSST DESC)
•  Leveraging a 100M+ NSF investment in large survey data management
−  The benefits feed back to LSST: more users, fewer bugs, better understanding, shorter path to science.

LSST: A Piece of the Puzzle
−  LSST can help position us for the future in two ways
•  With code (see previous slide)
•  With people/culture
−  Software Development Culture
•  We will run the software effort as an open source project with reusability in mind
-  A source tarball at the very end is not useful!
-  Open bug trackers, mailing lists, repositories
-  Still have a job to do! But that doesn’t mean we must do it in a closed, insulated, manner!
•  Think Fedora Project/RedHat, Android/Google, Debian/Ubuntu/Mint
•  Use what works: numpy, scipy, astropy, etc…
-  Improve upstream rather than fork!
-  Where we run into problems: poor software engineering, performance issues, licenses
•  Startup mentality: excellence wins, agile process, continuous change & learning, collaborative spirit, sense of urgency and excitement.
−  People
•  We will have 40+ people working on LSST Data Management over (1)8+ yrs
-  Creating a career path for software instrumentalists
•  We can help train a whole generation of “data driven astronomers”
-  Imparting the know-how needed to make the best use of the next generation of surveys

@LSST  @mjuric
