Clustering In The Wild

Clustering in the wild. Ugo Landini, CTO, Sourcesense. Sergio Bossa, Software Architect, Sourcesense.

Agenda: Why clustering? Clustering J(2)EE. Terracotta in a nutshell. Jira clustering issues: Files and indexes. Stateful applications and home grown caches. Threads and services. HTTP session. Summary.

Why clustering? Horizontal scalability: scale out. More computers, to improve throughput when a single one is not enough or costs too much. High availability: more computers, to improve uptime. If you unplug a network cable, the system should remain up and running. 24/7, or close to it. Usually more important than scalability.

Clustering J(2)EE. In an ideal world: a <distributable /> tag in your web.xml and Serializable objects in your HTTP session. True if and only if the application is J(2)EE compliant. Basically, no arbitrary use of resources and state: Files. Threads. Sockets. ... ?
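A minimal sketch of the "Serializable objects in your HTTP session" requirement. The class UserPreferences and the helper roundTrip are hypothetical names introduced for illustration; the round trip through Java serialization only approximates what a replicating container might do with a session attribute.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical session attribute: it implements Serializable and so do all its fields.
class UserPreferences implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String userName;
    private final int pageSize;

    UserPreferences(String userName, int pageSize) {
        this.userName = userName;
        this.pageSize = pageSize;
    }

    String getUserName() { return userName; }
    int getPageSize() { return pageSize; }
}

class SessionReplicationSketch {
    // Serializes a value and reads it back, standing in for a copy sent to another node.
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            // A non-serializable field would surface here as NotSerializableException.
            throw new IllegalStateException("attribute cannot be replicated", e);
        }
    }

    public static void main(String[] args) {
        UserPreferences copy = roundTrip(new UserPreferences("ugo", 50));
        System.out.println(copy.getUserName() + " " + copy.getPageSize());
    }
}
```

An attribute holding a file handle, a thread, or a socket would fail this round trip, which is the practical meaning of "no arbitrary use of resources and state".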

Clustering J(2)EE. What do I do with my files? java.io.tmpdir. JNDI lookup. What do I do with the state of my application (caches, conversational state, etc.)? Stateful Enterprise JavaBeans. Well established caching frameworks: EHCache, OSCache, JBossCache. JSR 107.

Clustering J(2)EE. What do I do with my threads/services? JMS (MDBs and topics, mostly). CommonJ (BEA and IBM effort). What do I do with my HTTP session? Serializable objects. Use a good load balancer.

Wake up! Almost all successful J(2)EE applications around won't pass the Sun AVK (Application Verification Kit). Most people go straight for the simple solution, and that one could be a clustering antipattern: home grown caches, Lucene indexes, Quartz jobs, singletons ... add your favourite quickie here.

Enter Terracotta. Transparent (Translucid? ...) clustering. Very few changes to existing code. Low development effort. Open Source, free for any use. Emerging (and cool!) technology. Did I mention that we are a Terracotta partner? :)

The quest for antipatterns. Jira is NOT easily clusterable, so it is a nice testbed. Jira is a bug tracking, issue tracking, and project management application developed to make this process easier. Jira is the leading issue tracker in the open source world (though it is not strictly open source). People are asking for a clustered Jira! http://jira.atlassian.com/browse/JRA-7330 Did I mention that we are an Atlassian partner?

Terracotta magic. Terracotta moves around only the bytes changed in shared objects. No serialization. Superstatic objects! Same semantics, only new() behaves differently. Demarcating transactions with guarded blocks essentially moves multi-threaded application semantics to cluster level. For performance reasons, for certain objects it moves behaviour and not data (logically managed vs physically managed objects). You can do the same thing if you need to (distributed methods).

Terracotta in a nutshell Features, part one: Transparent JVM-level clustering. Transparently works inside your JVM as an infrastructure service. Plugs into your code thanks to bytecode injection. No API, no code changes! Hub-and-Spoke architecture. Central server based architecture. All nodes talk only to the central server. Linear scalability. No split-brain problem.

Terracotta in a nutshell. Features, part two: Active/Passive mode. One central active server, n passive servers. Network Attached Memory. Shares your object graph with the central server. Virtual Heap (on disk, with Berkeley DB). Maintains your object graph in the memory heap. Preserved Java semantics. Object equality (equals, hashCode). Concurrency (synchronized, java.util.concurrent). Thread communication (wait, notify).

Terracotta in a nutshell. Main concepts: Roots. Define where your shared object graph starts. Locks. Ensure data consistency. Enable Terracotta intra-node communication. All code changing parts of the shared object graph must be guarded by locks. Distributed methods. Enable plain old Java methods to be called simultaneously on all cluster nodes.
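A minimal sketch of what "guarded by locks" looks like in plain Java. SharedIssueLog and RootsAndLocksSketch are hypothetical names; the sketch runs in a single JVM with threads and contains no Terracotta code. The assumption is that, in a Terracotta setup, a field holding such an object would be declared as a root in configuration (not shown here), and the same synchronized blocks would then mark the clustered lock boundaries.

```java
import java.util.ArrayList;
import java.util.List;

// Plain Java object graph. Assumption: with Terracotta, a field holding an instance
// of this class would be declared as a root in configuration (not shown).
class SharedIssueLog {
    private final List<String> entries = new ArrayList<String>();

    // Every change to the shared graph happens inside a guarded block.
    void append(String entry) {
        synchronized (entries) {
            entries.add(entry);
        }
    }

    // Reads take the same lock so they observe a consistent state.
    int size() {
        synchronized (entries) {
            return entries.size();
        }
    }
}

class RootsAndLocksSketch {
    // Starts several writer threads against one log and returns the final entry count.
    static int runWriters(int threadCount, int entriesPerThread) {
        final SharedIssueLog log = new SharedIssueLog();
        List<Thread> writers = new ArrayList<Thread>();
        for (int t = 0; t < threadCount; t++) {
            final int id = t;
            Thread writer = new Thread(() -> {
                for (int i = 0; i < entriesPerThread; i++) {
                    log.append("writer-" + id + "-entry-" + i);
                }
            });
            writers.add(writer);
            writer.start();
        }
        for (Thread writer : writers) {
            try {
                writer.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for writers", e);
            }
        }
        return log.size();
    }

    public static void main(String[] args) {
        System.out.println(runWriters(4, 250));
    }
}
```

Code that is already correct under multiple threads is the starting point; unguarded writes to a shared graph are the first thing to fix before clustering it.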

Out in the wild How did we actually cluster the beast?

Clustering Lucene indexes : Problems. Lucene indexes are typically stored in files. Do you remember? Clustering antipattern. Used to improve data access speed. How to cluster them? Network based solution: SAN or NFS. Not a viable solution due to locks. Messaging based solution: JMS. Complicated! Indexes should improve performance, rather than make it worse!

Clustering Lucene indexes : Solution. Let's store indexes in memory! Lucene: Provides support for memory-based indexes. Just use org.apache.lucene.store.RAMDirectory. Terracotta: Just a matter of configuration. And you can share your Lucene indexes.

Clustering Jira caches : Problems. Guess what ... Jira uses home grown caches! Do you remember? Clustering antipattern. From bad to worse: No unified API! Just a lot of HashMaps and HashSets. Very poor locking policies. This makes configuration-only Terracotta clustering impossible! Unfeasible to use an existing caching framework.

Clustering Jira caches : Solution. Write a new, ad-hoc, unified caching API. Goals: Simplicity. As simple as using a HashMap. Thread safety. Cache consistency. Terracotta ready. Efficiency. No bottlenecks. No liveness failures.

Caching API : Striving for simplicity. No strange methods. No cluster related configuration. Just the usual GET/PUT methods, and the like. Terracotta makes the clustering work! When choosing how to cluster the cache: Distribute behaviour, rather than data. Jira puts heavyweight objects in cache. Distribute cache invalidation, rather than cache updates. Lower hit ratio but ... Lower network traffic! Higher simplicity!
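A minimal sketch of "distribute cache invalidation, rather than cache updates". All names here (NodeLocalCache, InvalidationBroadcaster) are hypothetical and do not come from the actual caching API. The broadcaster is a single-JVM stand-in for whatever carries the call to every node (for example a Terracotta distributed method, which is an assumption); the point is that only the key travels, never the heavyweight value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical cache kept locally on each node, with the usual get/put methods.
class NodeLocalCache<K, V> {
    private final Map<K, V> entries = new HashMap<K, V>();

    synchronized V get(K key) { return entries.get(key); }
    synchronized void put(K key, V value) { entries.put(key, value); }

    // Drops the entry; the next get misses and the caller reloads the value locally.
    synchronized void invalidate(K key) { entries.remove(key); }
}

// Single-JVM stand-in for the cluster: forwards an invalidation to every node's cache.
class InvalidationBroadcaster<K, V> {
    private final List<NodeLocalCache<K, V>> nodes = new ArrayList<NodeLocalCache<K, V>>();

    void register(NodeLocalCache<K, V> cache) { nodes.add(cache); }

    // Only the key is sent around, not the cached object.
    void invalidateEverywhere(K key) {
        for (NodeLocalCache<K, V> cache : nodes) {
            cache.invalidate(key);
        }
    }
}

class InvalidationSketch {
    public static void main(String[] args) {
        NodeLocalCache<String, String> nodeA = new NodeLocalCache<String, String>();
        NodeLocalCache<String, String> nodeB = new NodeLocalCache<String, String>();
        InvalidationBroadcaster<String, String> cluster =
            new InvalidationBroadcaster<String, String>();
        cluster.register(nodeA);
        cluster.register(nodeB);

        nodeA.put("JRA-1", "old summary");
        nodeB.put("JRA-1", "old summary");

        // The issue changed somewhere: every node forgets its stale copy.
        cluster.invalidateEverywhere("JRA-1");
        System.out.println(nodeA.get("JRA-1") + " " + nodeB.get("JRA-1"));
    }
}
```

The trade-off named on the slide is visible here: after an invalidation every node takes a cache miss, but no node ever receives a large object over the network.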

Caching API : Striving for thread safety. Carefully use Java locks (ok, this was obvious ...). Due to how Jira works: The caching API must be able to group more than one cache under the same lock. The caching API must be able to execute a code block atomically under the same lock. Not so obvious ... Use what we call “owner based locking.”
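The slides do not show how "owner based locking" is implemented, so the following is only one possible reading of the two requirements above: several caches share the lock of a common owner object, and the owner can run a code block atomically under that lock. CacheOwner, OwnedCache, and atomically are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical owner: its monitor is the single lock for every cache it creates.
class CacheOwner {
    <K, V> OwnedCache<K, V> newCache() {
        return new OwnedCache<K, V>(this);
    }

    // Runs the block while holding the owner's lock. Java monitors are reentrant,
    // so cache calls made inside the block do not deadlock.
    void atomically(Runnable block) {
        synchronized (this) {
            block.run();
        }
    }
}

// Hypothetical cache that synchronizes on its owner instead of on itself.
class OwnedCache<K, V> {
    private final Object owner;
    private final Map<K, V> entries = new HashMap<K, V>();

    OwnedCache(Object owner) {
        this.owner = owner;
    }

    V get(K key) {
        synchronized (owner) { return entries.get(key); }
    }

    void put(K key, V value) {
        synchronized (owner) { entries.put(key, value); }
    }

    void remove(K key) {
        synchronized (owner) { entries.remove(key); }
    }
}

class OwnerLockingSketch {
    public static void main(String[] args) {
        CacheOwner issueManager = new CacheOwner();
        OwnedCache<Long, String> keyById = issueManager.newCache();
        OwnedCache<String, Long> idByKey = issueManager.newCache();

        // Both caches change together; no other thread can observe only one update.
        issueManager.atomically(() -> {
            keyById.put(1L, "JRA-1");
            idByKey.put("JRA-1", 1L);
        });
        System.out.println(keyById.get(1L) + " " + idByKey.get("JRA-1"));
    }
}
```

Grouping related caches under one owner keeps them mutually consistent at the cost of some concurrency between them, which is why the choice of lock granularity matters.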

Caching API : Striving for efficiency. Choose the right balance between too fine grained and too coarse grained locks. Do not use complex lock constructs. Use plain synchronized blocks. Use lock striping techniques.
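A minimal sketch of lock striping with plain synchronized blocks. StripedCache is a hypothetical name and not the actual caching API: keys are spread over a fixed number of segments by hash code, and each segment has its own lock, so threads touching different stripes do not contend.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical striped cache: one lock and one map per stripe.
class StripedCache<K, V> {
    private final Object[] locks;
    private final Map<K, V>[] segments;

    @SuppressWarnings("unchecked")
    StripedCache(int stripes) {
        locks = new Object[stripes];
        segments = (Map<K, V>[]) new Map[stripes];
        for (int i = 0; i < stripes; i++) {
            locks[i] = new Object();
            segments[i] = new HashMap<K, V>();
        }
    }

    // Maps a key to a stripe; the mask keeps the hash non-negative.
    private int stripeFor(Object key) {
        return (key.hashCode() & 0x7fffffff) % locks.length;
    }

    V get(K key) {
        int i = stripeFor(key);
        synchronized (locks[i]) {
            return segments[i].get(key);
        }
    }

    void put(K key, V value) {
        int i = stripeFor(key);
        synchronized (locks[i]) {
            segments[i].put(key, value);
        }
    }

    // Sums the segments one lock at a time; the result is exact only when no writer is active.
    int size() {
        int total = 0;
        for (int i = 0; i < locks.length; i++) {
            synchronized (locks[i]) {
                total += segments[i].size();
            }
        }
        return total;
    }
}

class StripedCacheSketch {
    public static void main(String[] args) {
        StripedCache<Integer, String> cache = new StripedCache<Integer, String>(8);
        for (int i = 0; i < 100; i++) {
            cache.put(i, "issue-" + i);
        }
        System.out.println(cache.size() + " " + cache.get(42));
    }
}
```

One stripe behaves like a single coarse lock and one stripe per key like the finest possible locking; the stripe count is the knob for the balance the slide asks for.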

Threads and services. Jira periodically triggers threads. Do you remember? Clustering antipattern. Threaded Jira services: Mail sending. Backup export. Index optimization.

Clustering threads and services : Problems Threads cannot be clustered. We have to cluster the launched services. Some services must be shared among cluster nodes. Other services must be distributed. How to distinguish them?

Clustering threads and services : Solution Shared services. Clustered through Terracotta XML configuration. A shared service is executed only on a single node. The default. Distributed services. Distributed through Terracotta XML configuration. A distributed service is executed on every node. Just implement com.atlassian.jira.service.JiraDistributedService
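A minimal sketch of the shared versus distributed semantics only. The source states that the real mechanism is Terracotta XML configuration plus com.atlassian.jira.service.JiraDistributedService, whose signature is not shown in the slides; the interfaces ClusterService and DistributedClusterService below are hypothetical stand-ins, and the service names and their classification are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical base type for a periodically triggered service.
interface ClusterService {
    String name();
}

// Hypothetical marker, standing in for the role of JiraDistributedService.
interface DistributedClusterService extends ClusterService {
}

class ServiceSchedulingSketch {
    // Simulates one trigger round and returns "node:service" for every execution.
    static List<String> runOnce(List<String> nodes, List<ClusterService> services) {
        List<String> executions = new ArrayList<String>();
        for (ClusterService service : services) {
            if (service instanceof DistributedClusterService) {
                // Distributed: every node runs its own copy.
                for (String node : nodes) {
                    executions.add(node + ":" + service.name());
                }
            } else {
                // Shared (the default): a single node runs it for the whole cluster.
                executions.add(nodes.get(0) + ":" + service.name());
            }
        }
        return executions;
    }

    public static void main(String[] args) {
        ClusterService mail = () -> "mail-sending";                 // assumed shared
        DistributedClusterService cleanup = () -> "local-cleanup";  // assumed distributed
        List<String> executions = runOnce(
            Arrays.asList("node-1", "node-2"),
            Arrays.<ClusterService>asList(mail, cleanup));
        System.out.println(executions);
    }
}
```

Keeping "shared" as the default means a service that nobody classified runs once per cluster rather than once per node.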

HTTP Session Two choices: Cluster it through Terracotta. Very hard. Again, Jira puts a lot of heavyweight objects into session. Leave it unclustered. Use a load balancer with sticky sessions enabled. Jira is not a mission critical application. More simplicity, less complexity. Guess what we chose ... Please give me that shiny new load balancer ...

Dealing with external code. Applications are often pluggable. Jira has a rich plugin architecture. External plugins must fit and work in the cluster. It is necessary to provide simple APIs or configuration options for making cluster-ready plugins. Practical example: com.atlassian.jira.service.JiraDistributedService

Summary. Terracotta is a transparent clustering solution but ... You have to make a lot of decisions and trade-offs. If you have to access files in a clustered environment: Slow access: network filesystem, database system. Fast access: use Terracotta network attached memory. If you have to cluster your application state: Carefully make it thread safe. Choose between distributing data or behaviour.

Summary. If you have application services: Choose services to share. A shared service runs once per cluster. Choose services to distribute. A distributed service runs once per node. If you have to cluster the HTTP session state: Consider not clustering it! If you have to deal with application plugins: Provide API hooks or configuration options.

Terracotta + Jira = Scarlet Scarlet. Clusters Jira through Terracotta. Published as a Jira extension. http://confluence.atlassian.com/x/woQuBg Open Source. We want you! Actively developed: November 06, 2007 : 1.0 Beta 1. Very soon : 1.0 Beta 2.
