Tuesday, December 6, 2011

Dynamic Groovy Edges and Regression Isolation

In JEE application architecture, components are grouped into functional layers.  In a typical multi-tier enterprise web application, there are N layers, N-1 layer junctions, and (N-1) x 2 layer edges.  In the diagram below, for example, there are three layers, two layer junctions, and four layer edges.

Designing and implementing interfaces between application layers can be painful, especially if you don't have a span of control or even influence design and/or implementation.  Interfaces are notorious change zones.  Even with well-defined interface contracts, development teams are routinely confronted with interface changes that require quick turnaround times, compounded by development, test, and deployment overhead.

If we examine why we have interfaces in the first place, we soon realize that we need interoperability (data sharing) between different systems.  It would be desirable to design systems with intrinsic interoperability, but that is a lofty goal that is seldom realized outside of SOA boundaries.  No, we have different datasets in different systems.  So we have interfaces.

It is not only the data that is different, the pace with which different systems and layers evolve is also different.  Junctions are by default, separations between different layers changing at different paces.  For example, the web presentation layer changes rapidly, as the services layer more slowly and the data layer even slower still.  In architecture, we design with "pace layering" in mind, so that we can provision the appropriate interfaces to allow for change, for evolution, at different paces.

If the reader is interested in "pace layering", here is one of the best descriptions of Pace Layering, from Donna Maurer, that I have ever read:
Complex systems can be decomposed into multiple layers, where the layers change at different rates. The “fast layers” learn, absorb shocks and get attention; the “slow layers” remember, constrain and have power. One of the implications of this model is that information architects can do what they have always done–slow, deep, rich work; while tagging can spin madly on the surface. If it is worth keeping, it will seep down into the lower layers.
I agree with Isaac Asimov:  "The only constant is change..."  And, it is difficult to design for constant change.  You just can't code for every event that could possibly happen.  Applications would be woefully bloated and impossible to maintain.  So, we must embrace change, and plan the best methods for reacting to change.  In JEE applications, the best method is the one that introduces the least amount of change to the overall application.  We must also avoid regression loads, during change, if possible.

Lately our teams have been looking at creative ways to use existing Java technologies to create more nimble application interface edges.  The first step was to ask ourselves, what Java tool allows us to quickly change code without having to redeploy our applications?

My first thought was JRebel from Zero Turnaround.  Using JRebel, developers can "instantly see any code change made to an app without redeploying."  However, unless I am mistaken, JRebel is primarily for non-production use.  There may be valid reasons to use JRebel in production systems.  However, I have never worked on a project where that type of JVM manipulation was allowed in production, especially triggered from developer desktops.  There is usually so much bureaucracy and change control processes to tangle with.

Redeploying an application contained in an EAR or WAR to production is the choice of last resort.  On top of the overhead of following the proper change control processes (ITIL anyone), there is the large regression analysis and testing that needs to happen.  It would be great if we could isolate the most dynamic areas of our application (interfaces), so that we could change them in reaction to external changes beyond our control.  It would also be desirable to make these changes in an isolated manner so as not to require an EAR or WAR redeployment.

To solve this dilemma, the initial knee-jerk reaction is to write so-called "smarter" interfaces to account for known possible changes.  That approach will only get you so far.  Nay, the solution to this muddle is to step back and unlearn what we thought we learned earlier in our careers. Actually, the approach that I am getting ready to espouse is similar to the Dale Carnegie approach to handling problems.  The one thing that stuck with me from my class on Human Relations was to consider the worst scenario, accept that scenario if you could not prevent it, and then plan on a mitigation strategy.

So, the worst case scenario is that our interfaces will change; systems will evolve at different paces and our junctions will be affected.  Accept that which we cannot change.  Now, how do we make this situation better?  Well, do what we already know how to do:  write code to satisfy the change.  For JEE applications, this means writing Java...or does it?  What if we could write code that would execute as Java byte code in the JVM, but would not require us to redeploy our entire application.  That would be Groovy.  Yes, the pun was intended.

No matter what you think about the Groovy language, it does offer a solution to our dilemma, provided that we also include certain other "standard" components.  First, the solution is seen below.  we are talking about creating dynamic groovy edges.  With "Groovy Edges" our interfaces are now bracketed by dynamic code components that can actually be isolated from the rest of the application stack.
What is it about Groovy that best suits it for this task?
  1. Groovy can be written as a script and parsed at run-time and compiled into Java byte-code and then loaded by the Groovy class loader.
  2. Groovy scripts can contain Groovy and Java syntax in the script file (*.groovy).
  3. Groovy can use the same Java APIs that are used by traditional Java code.
OK, so how do we write Groovy to be dynamic?  Well, you might first jump to conclusions and recommend Compile-Time Meta-programming with AST Transformations or the Expando Meta Class.  While you might use these methods, the actual solution is even less technical provided the right infrastructure is in place, as seen below.

The solution starts with a JRockit JVM.  Since we will be parsing and compiling Groovy scripts at run-time to execute in the JVM, there will be some impact to class-loading resources.  Unfortunately, for this solution, Hotspot is not adequate, unless of course we define huge heaps with huge permgens in a 64 bit environment.  However, the Java 7 JVM shows promise with how it manages class-loading resources.

Next we need Spring; that should be no issue as many JEE applications use Spring.  Spring allows us to leverage Spring Bean Factory configurations to point to where our "Groovy Beans" scripts are stored.  For this solution, we will store these Groovy scripts outside of the JEE EAR (or WAR) files to preclude the need for deployments when the Groovy changes, and reduce application-wide regression testing and analysis.

 <!-- insert the following for debugging - refresh-check-delay="1000" -->  
      <lang:groovy id="GroovyInterfaceEdge" script-source="${groovyInterfaceEdge}" refresh-check-delay="1000" >  
        <lang:property name="key" value="value"/>  
         </lang:groovy>  

In the code snippet above, the SpringBean "GroovyEdgeObject" is loaded from a URL that is resolved by the Spring EL in the "script-source" attribute.  The "refresh-check-delay" attribute is set to refresh the object in Spring's IOC container every second if changes in the script are detected between object method calls.  With this approach we could store the remote scripts in a CMS or even Amazon Simple Storage Service (S3).  The key point is that the Groovy script is stored outside of our typical regression boundary.  And, Spring will take care of updating the IOC container when changes are detected in the script.

There is additional magic happening with the Spring EL value in the "script-source" attribute above.  This argument, "${groovyInterfaceEdge}", is known as a property placeholder.  Spring uses the "PropertyPlaceholderConfigurer" to load a properties file and then get values from that properties file that match the placeholder keys in bean definitions.

 <!-- Read service related properties from the classpath -->  
    <bean id="services.propertyConfigurer"  
              class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">  
              <property name="location">  
                   <value>classpath:services.properties</value>  
              </property>  
         </bean>   

In the code just above, the services.properties file is loaded from the classpath and the key/value pairs are made available to bean definition attributes using Spring EL placeholders.  So, "${groovyInterfaceEdge}" is resolved to a value in the services.properties file found on the application classpath.  In this example, the "${groovyInterfaceEdge}" would resolve to a URI that can be accessed via the application to load the Groovy script stored in the remote script storage facility. 

In this example, the Groovy script is accessible via a web URI.  Depending on how the CMS is configured in your environment, it may not provide this URI access to all resources.  In this case, we would build and configure a "Content Proxy" that could use the API supplied by the CMS vendor to get at the script on the behalf of the Spring Bean Factory.  When a content proxy is used, the URI resolved by the Spring EL placeholder and PropertyPlaceholderConfigurer API would be an endpoint.

 package com.ironworks.restwebservice.content;  
 import java.io.InputStream;  
 import java.text.SimpleDateFormat;  
 import javax.servlet.http.HttpServletResponse;  
 import javax.ws.rs.GET;  
 import javax.ws.rs.Path;  
 import javax.ws.rs.PathParam;  
 import javax.ws.rs.Produces;  
 import javax.ws.rs.core.Context;  
 import javax.ws.rs.core.MediaType;  
 import javax.ws.rs.core.UriInfo;  
 import org.apache.commons.logging.Log;  
 import org.apache.commons.logging.LogFactory;  
 import com.bea.content.AuthenticationException;  
 import com.bea.content.AuthorizationException;  
 import com.bea.content.ContentContext;  
 import com.bea.content.NoSuchPropertyException;  
 import com.bea.content.Node;  
 import com.bea.content.Property;  
 import com.bea.content.RepositoryException;  
 import com.bea.content.expression.Search;  
 import com.bea.content.federated.ContentManagerFactory;  
 import com.bea.content.federated.INodeManager;  
 import com.bea.content.federated.ISearchManager;  
 import com.bea.content.paging.ContentListKeys;  
 import com.bea.content.paging.ISortableFilterablePagedList;  
 //NEXT 999 LINES SUPPRESS LineLength  
 @Path("/contentProxy")  
 public class ContentProxy {  
       @Context  
       UriInfo uriInfo;  
       private static final Log log = LogFactory.getLog(ContentProxy.class);  
       private static SimpleDateFormat dateFormat = new SimpleDateFormat("EEE, dd MMM yyy HH:mm:ss z");  
      @GET  
      @Path("/{directory}/{nodename}")  
      @Produces(MediaType.TEXT_HTML)  
      public final InputStream contentLookup(  
                @PathParam("directory") final String directory,  
                @PathParam("nodename") final String nodename,  
                @Context HttpServletResponse response) {  
           final Search search =  
          new Search("/WLP Repository/" + directory, -1, "",  
              " cm_nodeName= '" + nodename + "'");  
        final ISearchManager searchManager =  
         ContentManagerFactory.getSearchManager();  
        final INodeManager nodeManager = ContentManagerFactory.getNodeManager();  
        final ContentContext cc = new ContentContext();  
        cc.setParameter(ContentListKeys.BATCH_LOADING_SIZE_KEY, -1);  
        ISortableFilterablePagedList<Node> results = null;  
        try {  
         results = searchManager.search(cc, search);  
        } catch (Exception anything) {  
         // fatal. We cannot handle it.  
         throw new RuntimeException(anything);  
        }  
        for (Node oneNode : results) {  
         try {  
          if (oneNode.getProperty("copy").getType() == Property.BINARY) {  
               response.setHeader("Last-Modified", dateFormat.format(oneNode.getModifiedDate().getTime()));  
           return nodeManager.getStream(new ContentContext(),  
             oneNode.getId(), "copy");  
          }  
         } catch (AuthorizationException e) {  
          e.printStackTrace();  
         } catch (AuthenticationException e) {  
          e.printStackTrace();  
         } catch (NoSuchPropertyException e) {  
          e.printStackTrace();  
         } catch (RepositoryException e) {  
          e.printStackTrace();  
         }  
        }    
        return null;  
       }  
      }  

The code just above is an example of just such a content proxy, though mileage may vary depending on what flavor of CMS is used.  This class, along with the Jersey REST API creates a RESTful web service endpoint that will take in content proxy requests and return the script file from the CMS.  The REST approach is well suited for this type of service, where we simply want to parse the URI components to get at the parameters of the service call.

 groovyInterfaceEdge=http://<HOST>/restServices/contentProxy/scripts/groovyInterfaceEdge  

Putting it altogether, the services.properties file entry above shows how we would configure the URI to call the RESTful content proxy endpoint.  This URI is then returned to the Spring Bean Factory when the Groovy Bean is asked for by the application.  The content proxy returns the Groovy script file (*.groovy) to the application where the Groovy class loader takes care of loading into the JVM for use by the application.

With this solution, we have used convention and well known frameworks to build a sustainable and dynamic interface edge framework.  However, this solution opens up many other possibilities for our applications.  There may be other areas where we would like to load Spring Beans that are dynamic, so that we can change key behavior of volatile areas within our application without having to incur the load or redeployment and major regression testing.

Thursday, December 1, 2011

Richmond Times Dispatch - 2011 state employee salary database - AGAIN!

I received this email from the RTD:
The Times-Dispatch and TimesDispatch.com will publish our annual report on state employee salaries this Sunday, Dec. 4. This year's searchable database includes salaries for all 104,554 state employees, including names and positions for all employees who earn above the state's average of $52,547.
They did the same thing last year.  Here is how I felt last year when I was a state employee caught up in this database:


On LinkedIn.com you refer to your company as “the leading provider of high-quality news…in central Virginia.”  I think at best that claim is suspect, and at worst, you are providing false panoply of episodic capabilities to the public, to hide the depth of your desperation to remain a viable news source.  The Richmond Times-Dispatch has a few redeeming qualities, but taken in aggregate you offer very little to your customers that they could not more freely obtain from a multitude of other online outlets.  There are times when publishing FOIA information is warranted; however, this is not one of them, at least not in its current form.  It is especially unwarranted when the purpose is to fuel a negative agenda.  In economically challenging times, Federal and State employees are easy targets for efficiency and effectiveness ratings.  However, I submit that there are many employees that work very hard and that would and could make more money should they move to the private sector and leave public service behind.  Your irresponsible action in publishing the salaries of state employees, SEARCHABLE BY NAME, smacks of arrogance and sensationalism and is bereft of any dignity or integrity that your newspaper claims to still retain.  As far as I am concerned, my name in the context of my agency and position is “Personally Identifiable Information” and as such should not be used. 

The damage of your actions far outweighs the value delivered to your readers and the general public.  There is absolutely no constructive reasoning that justifies listing employees names in your database.  From your actions it is clear that you completely disregarded the negative impact that such a database would have on state employees.  These same state employees are also your customers.  With this contemptible action, your organization is morally and ethically bankrupt, and you owe your readers an apology for wasting their time with such twaddle.

I would be very interested to know the salaries of each of your employees making over $50K, as well as their job titles. And of course, I would need the names of said employees to more easily target my public disdain for your obviously poor value proposition and the gross mismanagement of your financial resources.  If that makes no sense to you, then you now realize the situation in which thousands of state employees find themselves, without redress.  I am sure that your employees would not mind; after all, your customers are paying you for your information services.  Do they not have the right to get the best value for their money?  Are not the salaries of your employees just more information about which someone, somewhere would want to read?  Is that not how you choose to publish these days?  Of course you need not answer these rhetorical questions; your actions speak volumes about your motivations.
Just because you can, should not be a license to pursue a course of action that has no real probative or news value and genuinely provides a divisive tool for someone’s personal agenda.  Your misguided exploit is completely irresponsible and in my opinion an abuse of said state employees, your customers, and the spirit of the Freedom of Information Act.  It is also an abuse of your power and a misuse of your first amendment rights.  Furthermore, your actions underpin my long held belief that nothing useful comes from your disgraceful and at times, Yellow-Journalistic tabloid.  You may have temporarily boosted your circulation or at best your web site traffic, but you have undoubtedly isolated yet another large customer segment and undermined the public’s trust.  Moreover, if more of your readers were affected by this debauched deed you would undoubtedly apologize or even redact said folly.  Indeed, you misunderstand, in entirety, the meaning of cultivating long-term customer value and loyalty.  This entire episode has been a tactless and colossal waste of our time and attention.  You may be the press, but you are also a business.  I think you should cease acting as the self-anointed herald of worthless hokum and actually cover creditable news that makes our lives better and provides at least a modicum of value to your customers.  However, if you are disinclined to act appropriately, one can only hope that your readers recognize how shallow, foul, and jaded you have become and move on to better outlets of information.  For myself, I am almost giddy with the thought that I do not have a subscription to your worthless paper.  I now find it less than suitable to line the bottom of my dog’s crate!