Operationalizing Clojure Confidently
Prasanna Gautam
Staples-SparX

02/19/2015
Image: http://www.rohitnair.net/img/design/core.jpg
“We don't want to focus on the trees (or their leaves)
at the expense of the forest.”
- Douglas Hofstadter (“I Am a Strange Loop”)
My Clojure Story
Introduced to Clojure with no prior Lisp experience.
Did my senior project on simulating mobile ad-hoc networks
using Clojure at Trinity College in 2011.
Started working at ESPN Innovation
Worked in a variety of other languages - Java, Ruby, Python,
JavaScript, C++
Clojure was my primary interface to the JVM for experimentation
Decided to use Clojure to deliver ESPN programming to the
International Space Station
SparX
Timeline: 2009, 2011, 2011-2013, 2013, 2015
Requirements
Cmdr. Chris Cassidy reached out to request
regular ESPN programming.
200 MB file limit
Had to be ready every day at noon Central
Time
Obvious choice:
Let’s hire people to clip and send videos
every day!
But it’s 2013
Why not automate?
Also, let’s remove ads.
Motive: Validating the video services and interfaces we
had been working on.
Ok, so why Clojure?
Why Clojure?
Two weeks to deadline
Not all the pieces were clear
No guarantees from upstream services
Human errors abound
Source of data was people pressing buttons
And, systems failing would result in similar behavior
Why Clojure?
Immutability
I could keep the system as a “constant” in an ever-changing world
Idempotency - re-run if failed, resume at any point in pipeline.
Java Interop
Even when I had APIs that weren’t written by my group, they
were SOAP and XML based. Yay!
Inherently refactorable if designed correctly
Post-mortem
Still in production since September 2013
Strictly enforced the “naïve” approach that “should”
work
Learned a lot of lessons that go beyond Clojure
This talk is about these lessons
“When you're forced to be simple, you're forced to
face the real problem.”
- Paul Graham
(“Hackers & Painters: Big Ideas from the Computer Age”)
Parts of the stack
Core Assumptions
Operations
Familiar Interfaces
Overrides
State
Logging
Error Handling
Iterative Development
Core: Timestamps
Programs — items that have a name and “start” and
“end” times
Program Segments, Breaks — blocks within a program
that “start” and “end” at particular times.
It’s just a map and reduce operation now!!
Take only program segments and make them into a
video.
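The "map and reduce" framing can be sketched roughly as follows. The data shape (maps with :type, :start and :end, in seconds from the start of the program) and all function names are assumptions made for illustration; the slides do not show the real representation.

```clojure
;; Hypothetical blocks of one program: segments are content, breaks are ads.
(def blocks
  [{:type :segment :start 0    :end 600}
   {:type :break   :start 600  :end 780}
   {:type :segment :start 780  :end 1500}
   {:type :break   :start 1500 :end 1680}
   {:type :segment :start 1680 :end 1800}])

(defn segment? [block] (= :segment (:type block)))

(defn duration [{:keys [start end]}] (- end start))

(defn cut-list
  "Keep only the program segments, dropping the breaks."
  [blocks]
  (filter segment? blocks))

(defn total-duration
  "Map each block to its duration, then reduce to a total in seconds."
  [blocks]
  (reduce + 0 (map duration blocks)))

(total-duration (cut-list blocks)) ; 600 + 720 + 120 = 1440
```

In a real pipeline the reduce step would presumably concatenate clips rather than sum durations, but the shape of the computation is the same.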
Why was it a good idea?
A bare set of functionality to bind everything together.
Everything else is a good signal and would make the
system “better”, but is not dependable.
Aligning timestamps in a UI makes it dead easy to see where
things are not aligned.
TV Programs are events too.
Core: Dependency Graph
Your tasks are dependent on previous tasks
What’s the plan when they fail to execute?
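One possible answer to "what's the plan when they fail" is to record a status per task and skip anything downstream of a failure. The sketch below assumes a hypothetical four-step pipeline listed in run order; the task names and the :ok / :failed / :skipped statuses are illustrative, not taken from the talk.

```clojure
(def tasks
  "Hypothetical pipeline: [task-name dependencies], already in run order."
  [[:fetch-schedule []]
   [:download       [:fetch-schedule]]
   [:transcode      [:download]]
   [:upload         [:transcode]]])

(defn run-tasks
  "Runs tasks in order. A task is skipped unless every dependency finished
  with :ok. `actions` maps a task name to a zero-argument function."
  [tasks actions]
  (reduce
    (fn [status [task deps]]
      (assoc status task
             (if (every? #(= :ok (status %)) deps)
               (try
                 ((actions task))
                 :ok
                 (catch Exception _ :failed))
               :skipped)))
    {}
    tasks))

;; Example: the download step fails, so everything after it is skipped.
(def actions
  {:fetch-schedule (fn [] :done)
   :download       (fn [] (throw (ex-info "upstream unavailable" {})))
   :transcode      (fn [] :done)
   :upload         (fn [] :done)})

(run-tasks tasks actions)
;; => {:fetch-schedule :ok, :download :failed, :transcode :skipped, :upload :skipped}
```

The returned map doubles as a report of where a rerun would need to resume.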
Core: Loose Coupling/Lazy Execution
Separate data gathering and execution
You can expose the data to the user with no side-effects.
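A minimal sketch of this separation, assuming hypothetical step names (:download, :stitch, :upload): a pure function builds a plan as plain data, and a separate function is the only place where effects run. The plan can be printed or returned to a user without anything happening.

```clojure
(defn plan
  "Pure: turns gathered segment data into a description of the work.
  Nothing is downloaded, written or uploaded here."
  [segments]
  (vec
    (concat
      (for [{:keys [id]} segments] {:step :download :segment id})
      [{:step :stitch :segments (mapv :id segments)}
       {:step :upload}])))

(defn execute!
  "The only place side effects happen. `handlers` maps a step keyword to a
  one-argument function that performs that step."
  [handlers steps]
  (doseq [step steps]
    ((handlers (:step step)) step)))

;; Inspecting the plan is free of side effects:
(plan [{:id "a"} {:id "b"}])
;; => [{:step :download, :segment "a"} {:step :download, :segment "b"}
;;     {:step :stitch, :segments ["a" "b"]} {:step :upload}]
```

A dry run then amounts to calling `plan` and never calling `execute!`.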
On Operations
Functional Programs still need
Operational expertise
If you’re in a big enough company with
an ops team
They don’t care about your FP
patterns - they shouldn’t have to.
Make configurations declarative
and readable
On Familiar Interfaces
Use standard configuration formats: readable, parseable by anything
I picked YAML
Familiar scheduling
Used cron strings thanks to Quartz
Everything in UTC internally
Timezones treated as side-effects
programs:
  - name: AROUND THE HORN
    short_name: ATH
    start_time: "20:00:00"
  - name: PARDON THE INTERRUPTION
    short_name: PTI
    start_time: "20:30:00"
  - name: SPORTSCENTER
    short_name: SportsCenter
    start_time: "14:00:00"
run:
  cron: 0 0 14 1/1 * ? *

final_tz: America/Anchorage
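"Everything in UTC internally, timezones as side-effects" might look like the sketch below, which uses java.time through Java interop. Reading start_time as UTC and the function names are assumptions for illustration; the slides do not say how the original tool interpreted these fields.

```clojure
(import '(java.time LocalDate LocalTime ZonedDateTime ZoneId ZoneOffset))

(defn utc-start
  "Combines a date with a start_time string such as \"20:00:00\".
  Assumption for this sketch: start_time is expressed in UTC."
  [date start-time]
  (ZonedDateTime/of date (LocalTime/parse start-time) ZoneOffset/UTC))

(defn in-zone
  "Converts to a display zone only at the edge, e.g. the final_tz setting.
  The instant itself is unchanged."
  [t zone-name]
  (.withZoneSameInstant t (ZoneId/of zone-name)))

(def start (utc-start (LocalDate/of 2015 2 19) "20:00:00"))
(def local (in-zone start "America/Anchorage"))

(.getHour local) ; 11, since Anchorage is UTC-9 in February
```

All comparisons and arithmetic stay on the UTC value; only rendering touches the zone.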
On Familiar Interfaces
Started with a solid command line interface.
Took the Config and Options abstractions and exposed
them as a REST API.
Options
 Switches                         Default        Desc
 --------                         -------        ----
 -c, --config                     nasamatic.yml  Use this config file path
 -h, --no-help, --help            false          Show Help
 -f, --no-force, --force          false          Force run now instead of using Cron
 -u, --no-upload, --upload        true           Upload or not
 -t, --no-transcode, --transcode  true           Transcode or not
 -B, --hours-before-now           0              How many hours before now to look at
 -d, --no-dry-run, --dry-run      false          Dry Run mode
On Familiar Interfaces
Also wrote a Web UI in AngularJS for the Operations team
to use in cases of failed runs
The system failed rarely enough that I had to retrain
people all the time.
Just gave up and used the CLI tool most of the time
UI breakage due to JavaScript issues
Exposing the API to Slack was more popular
On Familiar Interfaces
One-to-one correspondence between CLI and JSON
Key | Switch | Type | Default | Description
upload | -u, --[no-]upload | flag | TRUE | Upload to the FTP server
transcode | -t, --[no-]transcode | flag | TRUE | Pass the files through transcoder
qc | -q, --[no-]qc | flag | FALSE | Submit file to be QC’d by Pulsar
hours-before-now | -B, --hours-before-now | int | 0 | Number of hours before to look
dry-run | -d, --dry-run | flag | FALSE | Run without affecting filesystem/uploading
filter-by-program-tag | -p, --[no-]filter-by-program-tag | flag | TRUE | Select contiguous programTags from Authnet or not
short-names | -s, --short-names | string | | Programs to select as declared in the configuration file under programs. Default behavior is to run all programs declared in configuration.
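One way to keep the CLI and the JSON API in step is to derive both from a single table of option specs. The sketch below covers a subset of the options listed above and uses a hand-rolled parser for long switches only; it is an illustration of the idea, not the parser the original tool used, and `option-specs`, `from-json` and `from-cli` are hypothetical names.

```clojure
(require '[clojure.string :as str])

(def option-specs
  "A subset of the option table, kept as data shared by both front ends."
  [{:key "upload"           :type :flag :default true}
   {:key "transcode"        :type :flag :default true}
   {:key "dry-run"          :type :flag :default false}
   {:key "hours-before-now" :type :int  :default 0}])

(def defaults (into {} (map (juxt :key :default)) option-specs))

(defn from-json
  "Options from an already-decoded JSON object (string keys).
  Unknown keys are dropped."
  [body]
  (merge defaults (select-keys body (keys defaults))))

(defn from-cli
  "Options from long switches: --key and --no-key for flags,
  --key VALUE for ints."
  [args]
  (loop [opts defaults, [arg & more] args]
    (if (nil? arg)
      opts
      (let [negated? (str/starts-with? arg "--no-")
            k        (subs arg (if negated? 5 2))
            spec     (first (filter #(= k (:key %)) option-specs))]
        (case (:type spec)
          :flag (recur (assoc opts k (not negated?)) more)
          :int  (recur (assoc opts k (Long/parseLong (first more))) (rest more))
          (throw (IllegalArgumentException. (str "Unknown switch: " arg))))))))

;; The same job described two ways yields the same options map:
(= (from-cli ["--no-upload" "--hours-before-now" "3"])
   (from-json {"upload" false "hours-before-now" 3}))
;; => true
```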
On Overrides
Core Abstractions - Config and Options
Config: A static set of parameters that defines the
general behavior of the program. Doesn’t change too
often.
Options: A dynamic set of parameters that can override
config per-run.
Every job gets defined entirely by them.
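With both abstractions represented as maps, the override rule can be as small as a merge. This is a sketch under that assumption; the keys shown are illustrative rather than the tool's real settings.

```clojure
(def config
  "Static settings, as they might be loaded from the YAML file."
  {:upload true :transcode true :dry-run false :final-tz "America/Anchorage"})

(defn job-settings
  "Per-run options win over config; anything not overridden falls through."
  [config options]
  (merge config options))

(job-settings config {:dry-run true :upload false})
;; => {:upload false, :transcode true, :dry-run true, :final-tz "America/Anchorage"}
```

Because the result is a single map, it fully describes the job and can be logged or replayed as-is.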
On State
Keep the least amount of state possible
The system used no database at all for operations.
Intermediate files that were effects of steps were
relied upon
Have to keep only last-seen state for live operation.
Re-running is trivial.
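Relying on intermediate files as the only state could be sketched like this: each step checks whether its output already exists and does work only if it does not. `ensure-file!` is a hypothetical helper, not something shown in the talk.

```clojure
(require '[clojure.java.io :as io])

(defn ensure-file!
  "Runs `produce!` (a one-argument function taking the target File) only when
  the file is missing. Returns :created or :skipped."
  [path produce!]
  (let [f (io/file path)]
    (if (.exists f)
      :skipped
      (do (produce! f) :created))))

;; Running the same step twice does the work once:
(let [path   (str (System/getProperty "java.io.tmpdir")
                  "/segment-" (System/nanoTime) ".txt")
      write! (fn [f] (spit f "transcoded bytes"))]
  [(ensure-file! path write!) (ensure-file! path write!)])
;; => [:created :skipped]
```

One caveat with this approach: a step that dies halfway can leave a partial file behind, so a real implementation would likely write to a temporary name and rename on success.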
On Logging
Timestamp, state, key=value
Parseable by anything! (It was Splunk’s weirdness that
led to this)
Can generate metrics from on-going operations
without instrumenting further.
Wired to PagerDuty directly
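A log line in the "timestamp, state, key=value" shape can be produced by a small pure function. This is a sketch: `kv-line` is a hypothetical name, and the logger name that appears in the real log lines is left out for brevity.

```clojure
(require '[clojure.string :as str])

(defn kv-line
  "Formats a level and an ordered sequence of key/value pairs as
  `timestamp LEVEL k1=v1, k2=v2`. The timestamp is passed in so the
  function stays pure and easy to test."
  [timestamp level & kvs]
  (str timestamp " " level " "
       (str/join ", " (for [[k v] (partition 2 kvs)] (str k "=" v)))))

(kv-line "2014-05-20 00:28:26" "INFO"
         "trace-id" "trace-94295" "status" "Started" "job" "sleeps")
;; => "2014-05-20 00:28:26 INFO trace-id=trace-94295, status=Started, job=sleeps"
```

Since every field is a key=value pair, a log indexer can extract metrics such as time taken per job without any extra instrumentation.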
On Error Handling
Find out about error, try to fix it — if not possible, system
should try the whole process next day/job
Parent form generates random trace-id for a job
Passed to all children for that job
Any exceptions are passed via the chain and logged
Back off and Retry — if all else fails, let humans figure it
out.
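"Back off and retry" can be sketched as a small higher-order function. The doubling delay and the names below are assumptions; the slides do not describe the actual retry policy.

```clojure
(defn with-retry
  "Calls `f` up to `max-attempts` times, sleeping base-ms, then 2*base-ms,
  and so on between attempts. Rethrows the last exception so that a human
  (or the next scheduled run) can take over."
  [max-attempts base-ms f]
  (loop [attempt 1]
    (let [result (try
                   {:value (f)}
                   (catch Exception e
                     (if (< attempt max-attempts)
                       {:error e}
                       (throw e))))]
      (if (contains? result :error)
        (do (Thread/sleep (long (* base-ms (bit-shift-left 1 (dec attempt)))))
            (recur (inc attempt)))
        (:value result)))))

;; Example: fails twice, succeeds on the third attempt.
(def calls (atom 0))
(with-retry 3 1
  (fn [] (if (< (swap! calls inc) 3)
           (throw (ex-info "flaky upstream" {}))
           :ok)))
;; => :ok
```

The `recur` sits outside the `try` because Clojure does not allow recurring across a try boundary.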
;; infoAm and errorAm are the author's logging helpers; their definitions are
;; not shown in the slides. They appear to take alternating keys and values.
(defmacro do-with-log
  "Works functionally like a do block -- more or less, it runs all the given forms in order and returns the output of the last form it ran. It logs when the job started, ended or when it runs into any problems. It logs the error and rethrows the Throwable upstream."
  ([[job-name name & {:keys [trace-id] :or {trace-id (str "trace-" (rand-int 100000))}}] & body]
   (if-not name
     (throw (IllegalArgumentException. "You want to provide a name for the block you want to run.")))
   `(let [out# (atom nil)
          start-time# (System/currentTimeMillis)
          ~job-name (str ~name)
          ;; trace-id is bound unhygienically so nested blocks can pass it on
          ~'trace-id (str ~trace-id)]
      (infoAm "job" ~job-name "status" "Started" "trace-id" ~trace-id)
      (reset! out# (try
                     ~@body
                     (catch Throwable e#
                       ;; the "job=" key here is why ERROR lines show "job=="
                       (errorAm "job=" ~job-name "status" "Error" "trace-id" ~trace-id "message" e#)
                       (throw e#))))
      (infoAm "job" ~job-name "status" "Ended" "trace-id" ~trace-id "time_taken" (str (- (System/currentTimeMillis) start-time#) "ms"))
      @out#)))
2014-05-20 00:28:26 INFO utils-verify:1 - trace-id=trace-94295, status=Started, job=sleeps
2014-05-20 00:28:27 INFO utils-verify:1 - trace-id=trace-94295, status=Started, job=throws-error
2014-05-20 00:28:27 ERROR utils-verify:1 - job==throws-error, trace-id=trace-94295, message=java.lang.Throwable: Boo! I errored Out, status=Error
2014-05-20 00:28:27 ERROR utils-verify:1 - job==sleeps, trace-id=trace-94295, message=java.lang.Throwable: Boo! I errored Out, status=Error
Only Macro I needed
Iterative Development
Used “lein ns-deps-graph” to see the inter-relations
between namespaces
Operational Clojure
Builds on simple concepts
they’re the units of composition
Sparingly depends on global state, if at all
Leverages existing infrastructure and people
Adapts to changes in scope and requirements
Loosely couples data and execution
Future
I had a great time coming up with some of these
patterns
Particularly - config and options for jobs
Thinking about open source re-implementations
More Clojure-y things at SparX coming soon. ;)
Questions/Comments?
