Sunday, March 20, 2016

A Prototyping Platform with Jenkins Pipeline

So, I have been using the Jenkins Pipeline Plugin (formerly known as Workflow) for a few months now.  I like being able to write Groovy and Java code directly in Jenkins jobs as Pipeline scripts.  Though I have not used it yet, I can also store the scripts in an SCM.  I think I will eventually transition to that mode, having used inline scripting to prototype the jobs first.
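When I do make that move, something like the sketch below should do it: check the script out of an SCM and run it with the load step.  (The repository URL and script name are hypothetical placeholders, not my actual setup.)

node {
    // Hypothetical repo and script name, for illustration only
    git url: 'https://github.com/<YOUR_ORG>/pipeline-scripts.git'
    load 'parseData.groovy'
}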

MongoDB Pipeline
My latest Pipeline script parses a JSON file from an upstream job, munges the data, and then writes a new JSON document into MongoDB.  For MongoDB integration, I chose NOT to use the existing Jenkins MongoDB plugins; I needed more flexibility.  Since I know my way around Mongo and Java integration (MongoDB and Spring Data), and I have admin rights to my Jenkins instance, I simply added the MongoDB Java Driver jar file (mongo-java-driver-3.0.4.jar) to the Jenkins classpath via the WEB-INF/lib directory.  This lets me use the MongoDB Java Driver classes in my Groovy pipeline scripts, as seen below.
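As a quick smoke test (my own sanity check, not something the driver requires), a throwaway pipeline job like this one prints where the MongoClient class was loaded from, confirming the jar is visible:

import com.mongodb.MongoClient

node {
    // If the WEB-INF/lib install worked, this prints the path to the driver jar;
    // otherwise the script fails to compile with an "unable to resolve class" error
    println "MongoClient loaded from: " + MongoClient.class.protectionDomain.codeSource.location
}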

import com.mongodb.*

stage 'parseData'
node {
    // Path to the data file in the upstream job's workspace
    String path = env.HOME + "/Home/jobs/DataAPI/jobs/" + upstreamJob + "/workspace/" + file
    if (fileExists(path)) {
        println "File Exists"

        // Read the upstream job's output and parse the JSON text
        // (renamed from "file" to avoid shadowing the build parameter used above)
        def content = readFile path
        def jsonSlurper = new groovy.json.JsonSlurper()
        def object = jsonSlurper.parseText(content)
        
        def target = object.get(0).target
        def dataPoints = object.get(0).datapoints
        
        if (dataPoints.size() == 0) {
            error 'No data found.'
        } else {
            println "Datapoints:  " + dataPoints.size()
            
            Map<String,Object> dataMap = new HashMap<String,Object>()
            List<Integer> seriesData = new ArrayList<Integer>()
            List<List<Integer>> seriesList = new ArrayList<List<Integer>>()
        
            dataMap.put("target",target)

            // Each datapoint is a [value, timestamp] pair; skip entries with a null value
            for (Object x : dataPoints) {
                if (x[0] != null) {
                    seriesData.add(Integer.valueOf(x[0].intValue()))
                    seriesData.add(x[1])
                    seriesList.add(seriesData)
                    seriesData = new ArrayList<Integer>()
                }
            }
            
            dataMap.put("series", seriesList)
            
            // writeToMongo is a string build parameter ("true"/"false")
            if (Boolean.parseBoolean(writeToMongo)) {
                def mongoClient = new MongoClient("localhost", 29009)
                def collection = mongoClient.getDB("jenkins").getCollection("apiLogs")
                collection.insert(new BasicDBObject(dataMap))
                mongoClient.close()
            }
        }
        
    } else {
        error 'Data file does not exist at ' + path
    }
}

In the above script, I used Groovy's JsonSlurper to parse the JSON from the file and build an object.  Then I munged the data into a more suitable Java structure (a Map holding a List of Lists) that could be persisted directly to MongoDB via the Java API.
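For reference, the upstream file looks roughly like this: a Graphite-style render result, where each datapoint is a [value, timestamp] pair.  (The field names match what the script reads; the values themselves are made up for illustration.)

def sample = '''[ {
    "target": "app.requests.count",
    "datapoints": [ [ 42.0, 1458432000 ], [ null, 1458432060 ], [ 37.0, 1458432120 ] ]
} ]'''

// Same access pattern the script above uses
def parsed = new groovy.json.JsonSlurper().parseText(sample)
assert parsed.get(0).target == "app.requests.count"
assert parsed.get(0).datapoints[1][0] == null   // null values are skipped by the loop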

As a developer, I see this as a very strong case for Jenkins Pipeline scripting.  Without the ability to write Groovy and Java code directly in the Pipeline project, I would be at the mercy of integrating other Jenkins plugins to make this work, probably spanning multiple jobs.

Now, I get it; part of the strength of Jenkins is its collection of plugins.  However, as a longtime Jenkins user and developer, I have had my share of plugin issues.  It's a freeing experience to be able to "roll my own" customization.  And never has it been easier to integrate Groovy and Java than with the Pipeline plugin.

As a matter of fact, this project is part of an orchestration: two parameterized projects triggered by a third.  The "master" project is also a Pipeline project; the script is below.

stage 'collect'
node {
    build job: 'CollectData', parameters: [
        [$class: 'StringParameterValue', name: 'target', value: '<TOPIC_VALUE>'],
        [$class: 'StringParameterValue', name: 'from', value: '-15min'],
        [$class: 'StringParameterValue', name: 'format', value: 'json']
    ]
}

stage 'parse'
node {
    build job: 'ParseData', parameters: [
        [$class: 'StringParameterValue', name: 'file', value: 'data.json'],
        [$class: 'StringParameterValue', name: 'upstreamJob', value: 'CollectData']
    ]
}

Of course, this is made very easy by using the Snippet Generator that is part of every pipeline project.


You can also use the DSL reference, found here: http://<JENKINS_URL>/workflow-cps-snippetizer/dslReference, and the introduction found on GitHub:  https://github.com/jenkinsci/workflow-plugin.  The plugin page is found here:  https://wiki.jenkins-ci.org/display/JENKINS/Pipeline+Plugin.  And, finally, Andy Pemberton has written a reference card, found here:  https://dzone.com/refcardz/continuous-delivery-with-jenkins-workflow.

A Paradigm Shift
In my opinion, the ability to freely program in the Pipeline project is a game changer for Jenkins users.  With this functionality, Jenkins is now a prototyping platform for CI/CD/DevOps as well as integration and monitoring.  Sure, we will still use and write plugins.  For example, in my orchestration, I used the HTTP Request Plugin in my parameterized build.

I used this plugin to make an HTTP API call, passing the build parameters directly to the HTTP GET call as query-string arguments.  Now, you may ask: if I am so stoked about Jenkins Pipeline, why did I not use "curl" in a shell block in the pipeline?  Simple: I did not want blocking IO in the pipeline script.  Instead, I chose to isolate this call in a separate upstream job, and use a downstream pipeline script to munge the downloaded data.
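For comparison, the inline approach I avoided would look something like the sketch below (the URL is a placeholder, and the parameters mirror the ones above); the sh step ties up an executor while curl waits on the network.

node {
    // The approach I decided against: the node blocks while curl runs
    sh "curl -sS -G 'http://<API_HOST>/render' " +
       "--data-urlencode 'target=<TOPIC_VALUE>' " +
       "--data-urlencode 'from=-15min' " +
       "--data-urlencode 'format=json' " +
       "-o data.json"
}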

Security
Using the Jenkins Pipeline plugin does not mean that we abandon all we know about Jenkins security and best practices.  In fact, users without the Overall/Run Scripts permission run their scripts in the Groovy Sandbox, which limits them to a whitelist of pre-approved method calls.  Of course, users can elect not to use the sandbox.  However, doing so means that their scripts require admin approval before they run.

The Jenkins Reactor
In the example above, I have used Jenkins as a "batch reactor".  With the Pipeline Plugin and orchestrated jobs, I have built a reactor that allows me to run multiple processes without leaving the context of the Jenkins environment.  Who knows, in the future this orchestration may move to its own application space.  However, for now I am incubating the prototype in my "Jenkins Reactor".  Using Jenkins this way provides me with the container and services I need to quickly integrate with other systems and build a prototype application.