Google App Engine Development: Java, Data Models, and Other Things You Should Know
Navin Kumar, Socialwok
Introduction to Google App Engine
  • Google App Engine is an on-demand cloud platform for rapidly developing and scaling web applications.
  • Advantages:
      • You use the same architecture and tools that Google uses to scale its own applications.
      • Applications are easy to develop in Java and Python.
      • Free quotas get you started immediately.
Java Support on Google App Engine
  • Java support was introduced in April 2009.
  • A remarkable milestone for several reasons:
      • It brought the Java Servlet development model to Google App Engine.
      • You can now use your favorite Java IDE to develop your applications (Eclipse, NetBeans, IntelliJ).
      • Database development is easy with JDO and JPA.
      • It is not limited to the Java language: any JVM-supported language can be used (JRuby, Groovy, Scala, even JavaScript (Rhino), PHP, etc.).
Eclipse Support and GWT
  • Eclipse is the premier open source Java IDE, and the Google Plugin for Eclipse makes developing Google AppEngine apps very easy.
  • Eclipse automatically lays out your web application for you and provides 1-click deployment.
  • GWT is also supported by the Eclipse plugin and can be used alongside your Google AppEngine codebase.
  • The result is end-to-end Java development of powerful Java-based web applications.
Google Plugin for Eclipse (GWT and AppEngine)
BigTable: Behind Google's Datastore
  • BigTable: A Distributed Storage System for Structured Data ( http://labs.google.com/papers/bigtable.html )
  • Built on top of GFS (Google File System) ( http://labs.google.com/papers/mapreduce.html )
  • Strongly consistent and uses optimistic concurrency control
  • But it's not a relational database:
      • No joins or true OR queries
      • "!=" is not implemented
      • Limitations on the use of "<" and ">"
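The "optimistic concurrency control" bullet can be illustrated with a small in-memory sketch. It is not the datastore's implementation: the class name, the version counter, and the retry limit of 3 are all assumptions made for illustration. The idea is that a writer reads a snapshot, computes the new value, and commits only if nobody else has written in the meantime, retrying a fixed number of times otherwise.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

/** In-memory stand-in for a single datastore entity; not the App Engine API. */
final class OptimisticUpdateSketch {
    /** Immutable snapshot: entity data plus a version that grows on every write. */
    static final class Versioned {
        final long version;
        final String data;

        Versioned(long version, String data) {
            this.version = version;
            this.data = data;
        }
    }

    private static final int MAX_ATTEMPTS = 3; // assumed limit, for illustration only
    private final AtomicReference<Versioned> current;

    OptimisticUpdateSketch(String initialData) {
        current = new AtomicReference<Versioned>(new Versioned(0, initialData));
    }

    /** Applies the change if nobody else wrote in between; retries a fixed number of times. */
    boolean update(UnaryOperator<String> change) {
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            Versioned seen = current.get();
            Versioned next = new Versioned(seen.version + 1, change.apply(seen.data));
            if (current.compareAndSet(seen, next)) {
                return true; // commit succeeded, no conflicting write happened
            }
            // Another writer committed first: re-read the snapshot and try again.
        }
        return false; // gave up after MAX_ATTEMPTS conflicts
    }

    String data() { return current.get().data; }

    long version() { return current.get().version; }

    public static void main(String[] args) {
        OptimisticUpdateSketch entity = new OptimisticUpdateSketch("draft");
        entity.update(s -> s + " v2");
        System.out.println(entity.data() + " @ version " + entity.version());
    }
}
```

No lock is held while the new value is computed, which is why contention on a single entity shows up as retries rather than blocking.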
Data Models
  • DataNucleus ( http://www.datanucleus.org ) handles the Java persistence frameworks on AppEngine.
  • 2 choices: JDO (Java Data Objects) or JPA (Java Persistence API). JPA will be very familiar to those who have used Hibernate or EJB persistence frameworks.
  • Both involve very similar coding styles.
  • This talk focuses on JDO, but the same concepts apply to JPA.
  • There is also a low-level datastore API that we will touch on as well.
Defining Your Data Model

    package com.socialwok.server.data;

    import java.io.Serializable;
    import javax.jdo.annotations.*;
    import com.google.appengine.api.datastore.Text;

    @PersistenceCapable(identityType = IdentityType.APPLICATION)
    public class Post implements Serializable {
        private static final long serialVersionUID = 1L;

        @PrimaryKey
        @Persistent(valueStrategy=IdGeneratorStrategy.IDENTITY)
        @Extension(vendorName="datanucleus", key="gae.encoded-pk", value="true")
        private String id;
        public String getId() { return id; }

        @Persistent
        private String title;
        public String getTitle() { .. }
        public void setTitle(String title) { .. }

        @Persistent
        private Text content;
        public String getContent() { .. }
        public void setContent(String content) { .. }
        ..
    }
Creating, Deleting, and Querying
  • At the heart of everything is the PersistenceManager:

    PersistenceManager pm = PMF.get().getPersistenceManager();
    Post post = new Post();
    post.setTitle("Title");
    post.setContent("Google AppEngine for Java");
    try {
        pm.makePersistent(post);
    } finally {
        pm.close();   // always release the PersistenceManager
    }
    ...
    Post deleteMe = pm.getObjectById(Post.class, deleteId);
    try {
        pm.deletePersistent(deleteMe);
    } finally {
        pm.close();
    }
    ...

  • Build queries using JDOQL:

    Query query = pm.newQuery(Post.class);
    query.setFilter("title == titleParam");
    query.declareParameters("String titleParam");
    query.setUnique(true);   // expect at most one result
    Post post = (Post) query.execute("Title");
Relationships
  • Owned relationships (one-to-one and one-to-many) use the @Persistent(mappedBy="field") annotation syntax.
  • Unowned relationships (one-to-one, one-to-many, many-to-many) store keys directly:
        @Persistent Key otherEntity;
        @Persistent List<Key> otherEntities;
  • Owned relationships create a parent-child relationship.
      • Parent and child entities are stored in the same entity group.
      • An entity group defines a location in the datastore.
      • This is important because a transaction on the datastore can only be applied over a single entity group.
Other APIs you should be aware of
  • UsersService: don't write a login, use Google's!
  • ImagesService: Picasa image manipulation web services
  • Memcache: distributed cache for objects. Very useful! More on this later...
  • URL Fetch
  • Mail service: send outbound emails, with some restrictions
  • All APIs (except UsersService) are subject to quota limitations.
And now for the fun stuff...  
Socialwok
  • Enterprise social collaboration application built on Google App Engine.
  • Uses a social concept of feeds (also referred to as presence and activity streams).
  • Combines querying of reasonably complex data with the privacy requirements of social networking.
  • Uses tons of Google App Engine APIs, Google APIs, and GWT.
  • While building it, we learned several things about Google App Engine that allowed us to make the app reasonably fast and responsive.
Lesson 1: Utilization of Memcache
  • The data structure of each feed is relatively complex.
  • At least 3 explicit unowned relationships:
        @Persistent Key user
        @Persistent Key network
        @Persistent List<Key> attachments
  • Each of these objects must be queried for explicitly when it is represented in the feed.
  • The feed is fetched repeatedly by many (hundreds of) concurrent users.
  • The feed display needs to be reasonably responsive for all of these users.
Lesson 1 (cont.): Solution: Memcache
  • Distributed in-memory cache
  • Uses the javax.cache.* APIs
  • There is also a low-level API: com.google.appengine.api.memcache.*
  • Basic uses:
      • Speed up existing common datastore queries
      • Session data, user preferences
  • Cache data is retained as long as possible if no expiration is set.
  • Data is not stored on any persistent storage, so you must be sure your app can handle a "cache miss".
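The "cache miss" requirement above can be sketched with the usual cache-aside pattern. This is a minimal sketch under stated assumptions: a ConcurrentHashMap stands in for Memcache, a plain function stands in for the slow datastore query, and the class and method names are hypothetical rather than part of any App Engine API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Cache-aside sketch: a local map stands in for Memcache (assumption, not the real API). */
final class CacheAsideSketch<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<K, V>();
    private final Function<K, V> slowLoader; // stands in for the datastore queries
    private int slowLoads = 0;               // demo counter only, not thread-safe

    CacheAsideSketch(Function<K, V> slowLoader) {
        this.slowLoader = slowLoader;
    }

    /** Returns the cached value, falling back to the slow loader on a cache miss. */
    V get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached; // cache hit: no datastore work
        }
        V loaded = slowLoader.apply(key); // cache miss: do the expensive work
        slowLoads++;
        if (loaded != null) {
            cache.put(key, loaded); // populate the cache for the next caller
        }
        return loaded;
    }

    /** Simulates the cache dropping an entry, which can happen at any time. */
    void evict(K key) { cache.remove(key); }

    int slowLoads() { return slowLoads; }

    public static void main(String[] args) {
        CacheAsideSketch<String, String> feeds =
                new CacheAsideSketch<String, String>(user -> "feed for " + user);
        feeds.get("navin");
        feeds.get("navin"); // served from the cache
        System.out.println("datastore reads: " + feeds.slowLoads());
    }
}
```

Because every read path can fall through to the loader, an eviction costs latency but never correctness, which is the property the slide asks for.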
Lesson 1: Memcache conclusions
  • Works really well!
  • Responsive requests: response time dropped from 2 s to ~800 ms (a 60% decrease).
  • Cache data is generally retained for a very long time.
  • The distributed nature of the cache benefits every user on the system.
  • The more people who use your app, the better your app performs**
  • Even the free quota for Memcache is quite generous: ~8.6 million API calls.
Lesson 2: Message Delivery Fanout
  • Adapted from Building Scalable, Complex Apps... from Google I/O by Brett Slatkin: http://code.google.com/events/io/sessions/BuildingScalableComplexApps.html
  • Deals with the problem of fan-out.
  • Socialwok has a concept of "following" (basically a subscription between users).
  • In our case, one user posts a single message that needs to be "delivered" to all of that user's subscribers.
  • How do we show the message efficiently to all subscribers?
  • We can deliver the message by reference to its recipients.
Lesson 2 (cont.): RDBMS version
  • 2 primary tables, 2 join tables

    User ID    Name
    1          Navin
    2          John
    3          Vikram

    Message ID    Message            User ID
    1             Hello world        1
    2             Another message    3

    Follower ID    Following ID
    1              2
    1              3
    2              1

    Recipient ID    Message ID
    1               34
    1               67

  • To get the messages to display for the current user:

    SELECT * FROM Messages
    INNER JOIN UserMessages USING (message_id)
    WHERE UserMessages.user_id = 'current_user_id'

  • But there aren't any joins on AppEngine!
Lesson 2: List Properties to the Rescue
  • A list property is a property in the datastore that has multiple values:
        @Persistent private Collection<String> values;
  • Represented in Java using Collection fields (Set, List, etc.)
  • Indexed in the same way that normal fields are
  • Densely packs information
  • Query it like you query any single-valued property:
        query.setFilter("values == 2");

    values index
    key=1, values=1
    key=2, values=2
    key=2, values=1
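The index rows on this slide can be reproduced with a small in-memory sketch. It is only a model of the behaviour described, not the datastore's index format: the class and method names are hypothetical. Each value of a multi-valued property produces its own index row, so an equality filter matches an entity if any one of its values matches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** In-memory model of a list-property index (illustrative; not the datastore format). */
final class ListPropertyIndexSketch {
    // One index row per (value, entity key) pair, grouped by value.
    private final Map<String, SortedSet<Long>> index = new TreeMap<String, SortedSet<Long>>();

    /** Stores an entity: every element of the list property gets its own index row. */
    void put(long entityKey, Collection<String> values) {
        for (String value : values) {
            SortedSet<Long> keys = index.get(value);
            if (keys == null) {
                keys = new TreeSet<Long>();
                index.put(value, keys);
            }
            keys.add(entityKey);
        }
    }

    /** Equivalent of the filter "values == x": matches if any element equals x. */
    List<Long> whereValuesEquals(String value) {
        SortedSet<Long> keys = index.get(value);
        return keys == null ? new ArrayList<Long>() : new ArrayList<Long>(keys);
    }

    /** Total number of index rows, which grows with the length of each list. */
    int indexRows() {
        int rows = 0;
        for (SortedSet<Long> keys : index.values()) {
            rows += keys.size();
        }
        return rows;
    }

    public static void main(String[] args) {
        ListPropertyIndexSketch idx = new ListPropertyIndexSketch();
        idx.put(1L, Arrays.asList("1"));      // row: key=1, values=1
        idx.put(2L, Arrays.asList("2", "1")); // rows: key=2, values=2 and key=2, values=1
        System.out.println(idx.whereValuesEquals("1"));
    }
}
```

The row count is worth watching: an entity with a long list writes one index row per element, which is the cost behind the dense packing.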
Lesson 2: Our new data definition
  • Now we can define a collection field to store the list of recipients:

    public class Message {
        @Persistent private String msg;
        @Persistent private List<String> recipients;
        ...
    }

  • Query on the collection field:

    Query query = pm.newQuery(Message.class);
    query.setFilter("recipients == recptParam");
    query.declareParameters("String recptParam");
    List<Message> msgs =
        (List<Message>) query.execute(currentUserId);

  • But there is one issue with this: serialization overhead when fetching the messages.
  • We don't really care about the contents of the recipients field when displaying the messages.
  • So we will take advantage of another trick.
Lesson 3: Keys-only Queries and AppEngine Key Structure
  • We can perform queries whose return values are restricted to the keys of the matching entities.
  • Currently only supported in the low-level datastore API.
  • AppEngine keys are structured in a very special way:
      • Stored in protocol buffers
      • Consist of an app ID and a series of type-id_or_name pairs
      • Each pair is an entity type name plus an autogenerated integer ID or a user-provided name
      • Root entities have exactly one of these pairs; child entities have one for each ancestor plus their own
  • This makes it possible to retrieve a parent entity's key from the child entity's key.
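The key structure described above can be modelled in a few lines. This is a hedged sketch, not the real com.google.appengine.api.datastore.Key class or its protocol-buffer encoding: the class name, the "Kind:idOrName" string form, and the app ID "socialwok" are all assumptions for illustration. It shows why a parent key can be derived from a child key without touching the datastore.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative model of a key: an app ID plus a path of kind/id pairs, root first. */
final class KeyPathSketch {
    private final String appId;
    private final List<String> pairs; // each element is "Kind:idOrName"

    private KeyPathSketch(String appId, List<String> pairs) {
        this.appId = appId;
        this.pairs = Collections.unmodifiableList(pairs);
    }

    /** A root entity's key has exactly one pair. */
    static KeyPathSketch root(String appId, String kind, String idOrName) {
        List<String> path = new ArrayList<String>();
        path.add(kind + ":" + idOrName);
        return new KeyPathSketch(appId, path);
    }

    /** A child's key is the parent's full path plus the child's own pair. */
    KeyPathSketch child(String kind, String idOrName) {
        List<String> extended = new ArrayList<String>(pairs);
        extended.add(kind + ":" + idOrName);
        return new KeyPathSketch(appId, extended);
    }

    /** Dropping the last pair yields the parent's key; a root has no parent. */
    KeyPathSketch parent() {
        if (pairs.size() == 1) {
            return null;
        }
        return new KeyPathSketch(appId, new ArrayList<String>(pairs.subList(0, pairs.size() - 1)));
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof KeyPathSketch)) {
            return false;
        }
        KeyPathSketch other = (KeyPathSketch) o;
        return appId.equals(other.appId) && pairs.equals(other.pairs);
    }

    @Override
    public int hashCode() { return 31 * appId.hashCode() + pairs.hashCode(); }

    @Override
    public String toString() { return appId + "!" + String.join("/", pairs); }

    public static void main(String[] args) {
        KeyPathSketch message = KeyPathSketch.root("socialwok", "Message", "34");
        KeyPathSketch recipients = message.child("MessageRecipients", "1");
        System.out.println(recipients + " has parent " + recipients.parent());
    }
}
```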
Lesson 3: A solution to our Serialization Problem
  • Now we can store the recipients, which are irrelevant for display, in a child entity.
  • Here's the process:
      1. Define a child entity with the recipients field.
      2. Store the recipients of the message in the child entity.
      3. Create a keys-only query on the child entity that filters on the recipients field.
      4. Get a list of parent keys from the list of child keys.
      5. Bulk-fetch the parents from the datastore.
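The five steps above can be walked through with an in-memory sketch. Everything here is a stand-in: two maps play the role of the datastore, keys are plain strings of the form "Message:id/MessageRecipients:1", and the class and method names are hypothetical. The point is only the data flow: the recipient lists are matched but never handed back to the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** In-memory walk-through of the child-entity fan-out query (not the datastore API). */
final class FanOutQuerySketch {
    // Parent entities: message key -> message body.
    private final Map<String, String> messagesByKey = new LinkedHashMap<String, String>();
    // Child entities: child key -> recipients list (steps 1 and 2).
    private final Map<String, List<String>> recipientsByChildKey =
            new LinkedHashMap<String, List<String>>();

    void post(String messageId, String body, List<String> recipients) {
        String parentKey = "Message:" + messageId;
        messagesByKey.put(parentKey, body);
        recipientsByChildKey.put(parentKey + "/MessageRecipients:1",
                new ArrayList<String>(recipients));
    }

    /** Step 3: keys-only query, returns child keys and never the recipient lists. */
    List<String> childKeysFor(String userId) {
        List<String> keys = new ArrayList<String>();
        // A real datastore would answer this from an index instead of scanning.
        for (Map.Entry<String, List<String>> e : recipientsByChildKey.entrySet()) {
            if (e.getValue().contains(userId)) {
                keys.add(e.getKey());
            }
        }
        return keys;
    }

    /** Step 4: the parent key is the child key minus its last path element. */
    static String parentOf(String childKey) {
        return childKey.substring(0, childKey.lastIndexOf('/'));
    }

    /** Step 5: bulk-fetch the parent messages by key. */
    List<String> messagesFor(String userId) {
        List<String> bodies = new ArrayList<String>();
        for (String childKey : childKeysFor(userId)) {
            bodies.add(messagesByKey.get(parentOf(childKey)));
        }
        return bodies;
    }

    public static void main(String[] args) {
        FanOutQuerySketch store = new FanOutQuerySketch();
        store.post("1", "Hello world", Arrays.asList("john", "vikram"));
        store.post("2", "Another message", Arrays.asList("john"));
        System.out.println(store.messagesFor("john"));
    }
}
```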
Lesson 3 (contd.): Solution (Data Def.)

    public class MessageRecipients {
        @PrimaryKey private Key id;
        @Persistent private List<String> recipients;
        @Persistent private Date date;
        @Persistent(mappedBy="msgRecpt") private Message msg;
        ...
    }

    public class Message {
        ...
        @Persistent private Date date;
        @Persistent private String msg;
        @Persistent private MessageRecipients msgRecpt;
        ...
    }
Lesson 3 (contd.): Solution (Querying)

    DatastoreService dataSvc = ...;

    Query query = new Query("MessageRecipients")
        .addFilter("recipients", FilterOperator.EQUAL, userid)
        .addSort("date", SortDirection.DESCENDING)
        .setKeysOnly();   // <-- Only fetch keys!

    List<Entity> msgRecpts = dataSvc.prepare(query).asList();

    // Derive each parent (Message) key from the child key
    List<Key> parents = new ArrayList<Key>();
    for (Entity recep : msgRecpts) {
        parents.add(recep.getParent());
    }

    // Bulk fetch parents using key list
    Map<Key,Entity> msgs = dataSvc.get(parents);
Cool Trick: Lite Full Text Search
  • Most web applications nowadays need some form of full-text search.
  • Well, we are on Google AppEngine, aren't we!
  • Google did release a basic searchable model implementation:
      • Limited to Python ( google.appengine.ext.search )
      • More info: http://www.billkatz.com/2008/8/A-SearchableModel-for-App-Engine
  • Proper full-text search is on the AppEngine roadmap.
  • Some of our earlier lessons apply here.
How do we build it
  • First, it helps to understand how a basic full-text search index works:
      • Break up the text into terms using lexical analysis.
      • Then store the terms in a lookup table keyed to the message they came from.
  • List fields on Google AppEngine give us this lookup table.
  • We build queries using the same tricks.
  • We also apply the same child-entity and keys-only-query tricks to avoid the serialization overhead.
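The two indexing steps above can be sketched in plain Java. This is a minimal model, not the guestbook code from the talk: the class and method names are hypothetical, and an in-memory map stands in for the list-property index. The tokenizer uses the same kind of regular-expression split the talk describes and drops the empty token that split() yields when the text starts with a separator.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Minimal inverted index: term -> keys of the messages containing it (illustrative). */
final class TermIndexSketch {
    private final Map<String, Set<String>> keysByTerm = new HashMap<String, Set<String>>();

    /** Step 1: break the text into lower-cased terms on non-word characters. */
    static Set<String> terms(String text) {
        Set<String> terms = new LinkedHashSet<String>();
        for (String token : text.toLowerCase().split("[^\\w]+")) {
            if (!token.isEmpty()) { // split() yields "" when the text starts with a separator
                terms.add(token);
            }
        }
        return terms;
    }

    /** Step 2: record each term against the key of the message it came from. */
    void add(String messageKey, String text) {
        for (String term : terms(text)) {
            Set<String> keys = keysByTerm.get(term);
            if (keys == null) {
                keys = new TreeSet<String>();
                keysByTerm.put(term, keys);
            }
            keys.add(messageKey);
        }
    }

    /** Single-term lookup; returns a copy so callers cannot modify the index. */
    Set<String> lookup(String term) {
        Set<String> keys = keysByTerm.get(term.toLowerCase());
        return keys == null ? new TreeSet<String>() : new TreeSet<String>(keys);
    }

    public static void main(String[] args) {
        TermIndexSketch index = new TermIndexSketch();
        index.add("m1", "Hello, world!");
        index.add("m2", "Hello again");
        System.out.println(index.lookup("hello"));
    }
}
```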
Live example
  • I have deployed a modified version of the Google AppEngine guestbook example: http://searchguestbook.appspot.com
  • If anyone wants to "sign" it right now, please go ahead.
  • We will now search the data.
  • Limited to 1-2 word queries.
How it works
  • Applies the lessons from list fields and keys-only queries:
        @Persistent Set<String> searchTerms;
  • Our "lexical analysis" is a Java regular expression:
        String[] tokens = content.toLowerCase().split("[^\\w]+");
  • A full-text search library like Lucene can improve this part.
  • Another cool feature of list properties: merge-join.
      • Think about organizing your data in a Venn-diagram fashion and finding the intersection of your data.
      • Watch your indexes!
  • This implementation can be improved by using Memcache to cache common search queries.
  • Code will be made available after the talk, so you can take a good look for yourself!
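The Venn-diagram intersection behind a two-word query can be sketched as a merge of two sorted key lists. This is an illustration of the general technique only, under the assumption that each term's matching keys arrive in ascending order; it is not a description of how the datastore executes merge-join internally, and the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Two-pointer intersection of sorted key lists, one list per search term (illustrative). */
final class MergeJoinSketch {
    /** Both inputs must be sorted ascending; returns the keys present in both. */
    static List<Long> intersect(List<Long> a, List<Long> b) {
        List<Long> common = new ArrayList<Long>();
        int i = 0;
        int j = 0;
        while (i < a.size() && j < b.size()) {
            long x = a.get(i);
            long y = b.get(j);
            if (x == y) {
                common.add(x); // key matches both terms
                i++;
                j++;
            } else if (x < y) {
                i++; // advance the list that is behind
            } else {
                j++;
            }
        }
        return common;
    }

    public static void main(String[] args) {
        List<Long> keysForApp = Arrays.asList(1L, 3L, 5L, 7L);    // hypothetical matches for one term
        List<Long> keysForEngine = Arrays.asList(3L, 4L, 5L, 8L); // hypothetical matches for another
        System.out.println(intersect(keysForApp, keysForEngine));
    }
}
```

Each list is walked at most once, so the cost grows with the combined length of the two lists rather than their product.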
Conclusions
  • Google AppEngine for Java provides a standardized way to build applications for Google AppEngine.
  • In building Socialwok, we have learned several lessons that apply to building any scalable application on Google App Engine.
  • Get the Searchable Guestbook code here: http://searchguestbook.appspot.com/searchguestbook.tar.gz
  • In short, Google AppEngine development has never been easier or more interesting!
  • Get started by visiting: http://code.google.com/appengine
Q & A

Talk 1: Google App Engine Development: Java, Data Models, and other things you should know (Navin Kumar, CTO of Socialwok.com)


Editor's Notes

  • #7: An update of an entity occurs in a transaction that is retried a fixed number of times if other processes are trying to update the same entity simultaneously. Your application can execute multiple datastore operations in a single transaction which either all succeed or all fail, ensuring the integrity of your data.
  • #15: Before this slide, switch to application.