Guillaume Laforge's "Mars Rover" tutorial on Groovy DSLs

Guillaume Laforge, VMware's project manager for Codehaus Groovy, has just released a tutorial on building a DSL in Groovy, titled "Going to Mars with Groovy". Because the sample code in the Slideshare slides can't be copied and pasted, I've transcribed it here (without the pictures) for those interested. I've tested the examples in Groovy 1.8.6.

Going to Mars with Groovy

Introduction

Definition

Wikipedia: A Domain-Specific Language is a programming language or executable specification language that offers, through appropriate notations and abstractions, expressive power focused on, and usually restricted to, a particular problem domain.

In contrast to General Purpose Languages

Also known as: fluent / human interfaces, language oriented programming, little or mini languages, macros, business natural languages...

Technical examples

SQL: SELECT * FROM TABLE WHERE NAME LIKE '%SMI' ORDER BY NAME
Regex: "x.z?z{1,3}y"
other examples: Glade, XSLT, fetchmail

Where?

  • Antimalaria drug resistance simulation
  • Insurance policy risk calculation engine
  • HR skills representation
  • Nuclear safety simulations
  • Market data feeds analysis
  • Loan acceptance rules engine

Goals of DSLs

  • Use a more expressive language than a general-purpose one
  • Share a common metaphor of understanding between developers and subject matter experts
  • Have domain experts help with the design of the business logic of an application
  • Avoid cluttering business code with too much boilerplate technical code thanks to a clean separation
  • Let business rules have their own lifecycle

Pros and cons

Pros
  • Domain experts can help, validate, modify, and often develop DSL programs
  • Somewhat self-documenting
  • Enhance quality, productivity, reliability, maintainability, portability, reusability
  • Safety; as long as the language constructs are safe, any DSL sentence can be considered safe

Cons
  • Learning cost vs. limited applicability
  • Cost of designing, implementing & maintaining DSLs as well as tools/IDEs
  • Attaining proper scope
  • Trade-offs between domain specificity and general purpose language constructs
  • Efficiency cost
  • Proliferation of similar non-standard DSLs

Groovy provides...

  • A flexible and malleable syntax: scripts, native syntax constructs (list, map, ranges), closures, less punctuation...
  • Compile-time and runtime meta-programming: metaclasses, AST transformations, also operator overloading
  • The ability to easily integrate into Java / Spring apps; also security and safety

Let’s get started on your mission: build a DSL for a Mars robot

We need a robot that moves in a direction...
class Robot{
  void move(String dir){println "Rover: moving $dir"}
}

A more explicit direction...
//explore.groovy
package mars
class Robot{
  void move(Direction dir){println "Rover: moving $dir"}
}
enum Direction{
  left, right, forward, backward
}

Now how can we control it?
@echo off
set JAVA_EXE="C:/Program Files/Java/jdk1.7.0/bin/java.exe"
set JAR_EXE="C:/Program Files/Java/jdk1.7.0/bin/jar.exe"
set GVY_HOME="C:/Program Files/Groovy/groovy-1.8.6"
set EXT_DIRS=-Djava.ext.dirs=%GVY_HOME%/lib;.
%JAVA_EXE% %EXT_DIRS% org.codehaus.groovy.tools.FileSystemCompiler mars/explore.groovy
%JAR_EXE% cf Mars.jar mars/*.class
%JAVA_EXE% %EXT_DIRS% groovy.lang.GroovyShell integration.groovy

import static mars.Direction.*;
import mars.Robot;
public class Command{
  public static void main(String[] args){
    Robot robot= new Robot();
    robot.move(left);
  }
}

We can remove this syntactic noise:
  • semicolons
  • class header and close curly
  • method header and close curly
  • parentheses around "left" parameter
  • replace "Robot" type with "def", i.e. optional typing

Scripts vs Classes
Optional semicolons & parens
import static mars.Direction.*
import mars.Robot
def robot= new Robot()
robot.move left

But I don't want to compile a script for every command!

Integration

GroovyShell to the rescue. Two files, integration.groovy for the DSL developer and command.groovy for the DSL user...

//integration.groovy
def shell= new GroovyShell()
shell.evaluate(
  new File("command.groovy")
)

//command.groovy
import static mars.Direction.*
import mars.Robot
def robot= new Robot()
robot.move left

Integration mechanisms

Different solutions available:
  • Groovy's own mechanisms
    • GroovyScriptEngine, Eval, GroovyClassLoader, GroovyShell
  • Java 6: javax.script.* / JSR-223
    • Groovy provides a JSR-223 implementation
  • Spring's lang namespace
Groovy provides the highest level of flexibility and customization, but JSR-223 is a standard...
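
As a rough illustration of two of Groovy's own mechanisms listed above, the sketch below evaluates small snippets with Eval and GroovyShell. The expressions are made up for this example; only Eval.me, Eval.x and GroovyShell.evaluate are Groovy API.

```groovy
// Eval: handy for one-liners, optionally with a single bound variable named x
def sum = Eval.me('1 + 2')
def doubled = Eval.x(21, 'x * 2')

// GroovyShell: evaluates script text (or a File, as integration.groovy does)
def shell = new GroovyShell()
def joined = shell.evaluate('["Mars", "rover"].join(" ")')

println "$sum $doubled $joined"
```

GroovyShell is the mechanism the rest of this tutorial builds on, because it accepts a Binding and a CompilerConfiguration.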

What’s wrong with our DSL?
import static mars.Direction.*
import mars.Robot
def robot= new Robot()
robot.move left

Can't we hide those imports?
Can’t we inject the robot?
Do we really need to repeat 'robot'?

What we really want is...
move left

Let's inject a robot!
  • We can pass data in/out of scripts through the Binding
  • it's basically just like a map of variable name keys and their associated values

//integration.groovy
import mars.Robot
def binding= new Binding([
  robot: new Robot()
])
def shell= new GroovyShell(binding)
shell.evaluate(
  new File("command.groovy")
)
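
To see the Binding carry data in both directions, here is a minimal self-contained sketch. The variable names (distance, hours, speed) are assumptions for illustration and are not part of the tutorial's robot code.

```groovy
// Data going in: the script sees 'distance' and 'hours' as plain variables
def binding = new Binding([distance: 100, hours: 4])
def shell = new GroovyShell(binding)

// Data coming out: a variable assigned without 'def' lands in the binding
shell.evaluate('speed = distance / hours')

def speed = binding.getVariable('speed')
println "speed = $speed"
```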

Better?
  • Robot import removed
  • Robot injected, no 'new' needed
//command.groovy
import static mars.Direction.*
robot.move left

How to inject the direction?
Using the binding...
//integration.groovy
import mars.Direction
import mars.Robot
def binding= new Binding([
  robot:    new Robot(),
  left:     Direction.left,
  right:    Direction.right,
  backward: Direction.backward,
  forward:  Direction.forward
])
def shell= new GroovyShell(binding)
shell.evaluate(
  new File("command.groovy")
)

//command.groovy
robot.move left

Fragile in case of new directions, so:
import mars.Direction
import mars.Robot
def binding= new Binding([
  robot: new Robot(),
  *: Direction.values()
              .collectEntries{
                [(it.name()): it]
              }
])
def shell= new GroovyShell(binding)
shell.evaluate(
  new File("command.groovy")
)
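
The spread-map trick above can be tried in isolation. This sketch assumes a hypothetical Heading enum standing in for mars.Direction, so that it runs without the compiled mars package.

```groovy
// Hypothetical stand-in for mars.Direction
enum Heading { north, south }

// One map entry per enum constant, keyed by the constant's name
def entries = Heading.values().collectEntries { [(it.name()): it] }

def binding = new Binding([
    probe: 'Rover-1',   // an ordinary entry
    *: entries          // spread-map operator merges the generated entries
])

// The evaluated script can now refer to 'north' with no import at all
def result = new GroovyShell(binding).evaluate('north')
println result
```

Adding a constant to the enum makes it available to scripts with no change to the integration code.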

Using compiler customizers...
  • Let’s have a look at them

Compilation customizers
  • Ability to apply some customization to the Groovy compilation process
  • Three available customizers:
    • ImportCustomizer: add transparent imports
    • ASTTransformationCustomizer: injects an AST transform
    • SecureASTCustomizer: restrict the groovy language to an allowed subset
  • But you can implement your own

ImportCustomizer

new GroovyShell(new Binding([robot: new Robot()]))
  .evaluate("import static mars.Direction.*\n" +
            "robot.move left")
Cheat with string concatenation? Bad!

import mars.Robot
import org.codehaus.groovy.control.CompilerConfiguration
import org.codehaus.groovy.control.customizers.ImportCustomizer
def configuration= new CompilerConfiguration()
def imports= new ImportCustomizer()
imports.addStaticStars(mars.Direction.name)
configuration.addCompilationCustomizers(imports)

new GroovyShell(new Binding([robot: new Robot()]),
                configuration)
  .evaluate("robot.move left")
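
ImportCustomizer can add other kinds of imports as well. The sketch below uses JDK classes only, so it runs without the mars package; the alias AI and the evaluated expressions are assumptions chosen for the example.

```groovy
import org.codehaus.groovy.control.CompilerConfiguration
import org.codehaus.groovy.control.customizers.ImportCustomizer

def imports = new ImportCustomizer()
imports.addStaticStars('java.lang.Math')   // import static java.lang.Math.*
imports.addImport('AI', 'java.util.concurrent.atomic.AtomicInteger')   // aliased import

def config = new CompilerConfiguration()
config.addCompilationCustomizers(imports)
def shell = new GroovyShell(config)

// Neither script contains an import statement
def area = shell.evaluate('PI * pow(2, 2)')
def counter = shell.evaluate('new AI(41).incrementAndGet()')
println "$area $counter"
```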

AST transformation customizer

import mars.Robot
import org.codehaus.groovy.control.CompilerConfiguration
import org.codehaus.groovy.control.customizers.ImportCustomizer
import org.codehaus.groovy.control.customizers.ASTTransformationCustomizer
import groovy.util.logging.Log
def configuration= new CompilerConfiguration()
def imports= new ImportCustomizer()
imports.addStaticStars(mars.Direction.name)
configuration.addCompilationCustomizers(
  imports,
  new ASTTransformationCustomizer(Log))

new GroovyShell(new Binding([robot: new Robot()]),
                configuration)
  .evaluate("robot.move left" + "\n" +
            "log.info 'Robot moved left'")

@Log injects a logger in scripts and classes
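
Annotation parameters can be passed to the customizer too. A minimal sketch, assuming you want the injected field to be called LOGGER instead of the default log:

```groovy
import groovy.util.logging.Log
import org.codehaus.groovy.control.CompilerConfiguration
import org.codehaus.groovy.control.customizers.ASTTransformationCustomizer

def config = new CompilerConfiguration()
// Roughly what @Log('LOGGER') on the script would do
config.addCompilationCustomizers(
    new ASTTransformationCustomizer(Log, value: 'LOGGER'))

def shell = new GroovyShell(config)
// The script never declares LOGGER; the transform injected it
def isLogger = shell.evaluate('LOGGER instanceof java.util.logging.Logger')
println isLogger
```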

Secure the onboard trajectory calculator

Secure AST customizer

Idea: Secure the rocket's onboard trajectory calculation system by allowing only math expressions to be evaluated by the calculator

Let's set up our environment
  • an import customizer to import java.lang.Math.*
  • prepare a secure AST customizer

Disallow closures and methods
Black / white list imports

import org.codehaus.groovy.control.customizers.ImportCustomizer
import org.codehaus.groovy.control.customizers.SecureASTCustomizer
import static org.codehaus.groovy.syntax.Types.*

def imports= new ImportCustomizer().addStaticStars('java.lang.Math')
def secure= new SecureASTCustomizer()
secure.with{
  //disallow closure creation
  closuresAllowed= false
  //disallow method definitions
  methodDefinitionAllowed= false
  //empty white list => forbid imports
  importsWhitelist= []
  staticImportsWhitelist= []
  //only allow the java.lang.Math.* static import
  staticStarImportsWhitelist= ['java.lang.Math']

  //language tokens allowed
  tokensWhitelist= [
    PLUS, MINUS, MULTIPLY, DIVIDE, MOD, POWER, PLUS_PLUS, MINUS_MINUS,
    COMPARE_EQUAL, COMPARE_NOT_EQUAL, COMPARE_LESS_THAN, COMPARE_LESS_THAN_EQUAL,
    COMPARE_GREATER_THAN, COMPARE_GREATER_THAN_EQUAL
  ]

  //types allowed to be used  (including primitive types)
  constantTypesClassesWhiteList= [
    Integer, Float, Long, Double, BigDecimal,
    Integer.TYPE, Long.TYPE, Float.TYPE, Double.TYPE
  ]

  //classes that are allowed to be receivers of method calls
  receiversClassesWhiteList= [
    Math, Integer, Float, Double, Long, BigDecimal]
}
...

You can build a subset of the Groovy syntax!
Black / white list usage of classes

Ready to evaluate our flight equations!
import org.codehaus.groovy.control.CompilerConfiguration
def config= new CompilerConfiguration()
config.addCompilationCustomizers(imports, secure)
def shell= new GroovyShell(config)
println shell.evaluate ('cos PI/3')

But the following would have failed:
shell.evaluate 'System.exit(0)'
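
To watch a rejection happen without calling System.exit, here is a reduced sketch that only forbids closures and method definitions. The evaluated snippets are invented for the example. Depending on the Groovy version, the failure may surface directly as a SecurityException or wrapped in a compilation error, so both are caught.

```groovy
import org.codehaus.groovy.control.CompilerConfiguration
import org.codehaus.groovy.control.MultipleCompilationErrorsException
import org.codehaus.groovy.control.customizers.SecureASTCustomizer

def secure = new SecureASTCustomizer()
secure.closuresAllowed = false
secure.methodDefinitionAllowed = false

def config = new CompilerConfiguration()
config.addCompilationCustomizers(secure)
def shell = new GroovyShell(config)

// Plain arithmetic passes the customizer
def allowed = shell.evaluate('6 * 7')

// A closure literal is rejected at compile time, before anything runs
def rejected = false
try {
    shell.evaluate('def twice = { it * 2 }; twice(21)')
} catch (SecurityException e) {
    rejected = true
} catch (MultipleCompilationErrorsException e) {
    rejected = true
}
println "allowed=$allowed rejected=$rejected"
```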

Back to our robot...
robot.move left

Still need to get rid of the robot prefix!

How to remove the 'robot'?
Instead of calling the move() method on the robot instance, we should be able to call the move() method directly from within the script

Inject a closure in the binding
import mars.Direction
import mars.Robot
def robot= new Robot()
def binding= new Binding([
  robot: robot,
  *: Direction.values()
              .collectEntries{
                [(it.name()): it]
              },
  move: robot.&move
])
def shell= new GroovyShell(binding)
shell.evaluate(
  new File("command.groovy")
)

Method pointer (a closure) on robot's move instance method
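
The same idea in a form that runs on its own. The Probe class is a hypothetical stand-in for mars.Robot, and 'left' is bound to a plain string just to keep the sketch short.

```groovy
// Hypothetical stand-in for mars.Robot
class Probe {
    String move(String dir) { "Probe: moving $dir" }
}

def probe = new Probe()
def move = probe.&move   // method pointer: a Closure bound to this instance

def binding = new Binding([move: move, left: 'left'])
// 'move left' finds the closure named 'move' in the binding and calls it
def result = new GroovyShell(binding).evaluate('move left')
println result
```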

Ready for lift off!
move left

How do you define the speed?
What we could do now is...
move left, at: 3.km/h

Mix of named and normal parameters
How to add a km property to numbers?

Adding properties to numbers
We need to:
  • define units, distance and speed
  • have a nice notation for them: that’s where we add properties to numbers!

Unit enum and Distance class
enum DistanceUnit{
  centimeter('cm', 0.01),
  meter     ('m',  1),
  kilometer ('km', 1000)

  String abbreviation
  double multiplier

  DistanceUnit(String abbr, double mult){
    this.abbreviation= abbr
    this.multiplier= mult
  }
  String toString(){ abbreviation }
}

import groovy.transform.TupleConstructor
@TupleConstructor
class Distance{
  double amount
  DistanceUnit unit
  String toString(){
    "$amount $unit"
  }
}

Different techniques
To add dynamic methods or properties, there are several approaches at your disposal:
  • ExpandoMetaClass
  • custom MetaClass
  • Categories
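
For comparison, the Categories option might look like the sketch below. DistanceCategory and its string results are assumptions made for this example; the tutorial itself goes with ExpandoMetaClass.

```groovy
// A category is a class of static methods whose first parameter is the receiver
class DistanceCategory {
    static String getKm(Number self) { "$self km" }
    static String getM(Number self)  { "$self m" }
}

// The extra properties exist only inside the 'use' block
def labels = use(DistanceCategory) {
    [3.km, 2.5.m]
}
println labels
```

The scoping is the main design difference: a category is active only within its block, while ExpandoMetaClass changes stay in place once registered.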

Let’s have a look at the ExpandoMetaClass

Using ExpandoMetaClass
import mars.Distance
import mars.DistanceUnit
Number.metaClass.getCm= {->
  new Distance(delegate, DistanceUnit.centimeter)
}
Number.metaClass.getM= {->
  new Distance(delegate, DistanceUnit.meter)
}
Number.metaClass.getKm= {->
  new Distance(delegate, DistanceUnit.kilometer)
}

Add that to integration.groovy
'delegate' is the current number

Usage in your DSLs
//integration.groovy
import mars.Direction
import mars.Robot
def binding= new Binding([
  robot: new Robot(),
  *: Direction.values()
              .collectEntries{
                [(it.name()): it]
              }
])
def shell= new GroovyShell(binding)
shell.evaluate(
  new File("command.groovy")
)

//command.groovy
40.cm
3.5.m
4.km

Distance okay, but speed?
For distance, we just added a property access after the number, but for speed we now need to divide the distance by a time (the 'div' operator method), as in 2.km/h

The div() method on Distance
An h duration instance in the binding

First, let’s look at time
enum TimeUnit{
  hour      ('h',   3600),
  minute    ('min',   60),
  second    ('s',      1)

  String abbreviation
  double multiplier

  TimeUnit(String abbr, double mult){
    this.abbreviation= abbr
    this.multiplier= mult
  }
  String toString(){ abbreviation }
}

import groovy.transform.TupleConstructor
@TupleConstructor
class Duration{
  double amount
  TimeUnit unit
  String toString(){
    "$amount $unit"
  }
}

Inject the ‘h’ hour constant in the binding
import mars.Duration
import mars.TimeUnit
import mars.Direction
import mars.Robot
def binding= new Binding([
  robot: new Robot(),
  *: Direction.values()
              .collectEntries{
                [(it.name()): it]
              },
  h: new Duration(1, TimeUnit.hour)
])


Now at (light!) speed
import groovy.transform.TupleConstructor
@TupleConstructor
class Speed{
  Distance distance
  Duration duration
  String toString(){
    "$distance/$duration"
  }
}

Operator overloading
a + b  // a.plus(b)
a - b  // a.minus(b)
a * b  // a.multiply(b)
a / b  // a.div(b) 
a % b  // a.mod(b)
a ** b // a.power(b) 
a | b  // a.or(b) 
a & b  // a.and(b) 
a ^ b  // a.xor(b) 
a[b]   // a.getAt(b) 
a << b // a.leftShift(b) 
a >> b // a.rightShift(b) 
a >>> b // a.rightShiftUnsigned(b)
+a     // a.positive()
-a     // a.negative()
~a     // a.bitwiseNegate()

Currency amounts
  15.euros + 10.dollars
Distance handling
  10.km - 10.m
Workflow, concurrency
  taskA | taskB & taskC
Credit an account
  account << 10.dollars
  account += 10.dollars
  account.credit 10.dollars
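
A small sketch of the convention in action. Money and Account are hypothetical types invented for this example; only the method names plus, minus and leftShift come from Groovy's operator mapping.

```groovy
// Hypothetical value type supporting + and -
class Money {
    BigDecimal amount
    Money plus(Money other)  { new Money(amount: amount + other.amount) }
    Money minus(Money other) { new Money(amount: amount - other.amount) }
}

// Hypothetical account supporting <<
class Account {
    BigDecimal balance = 0
    Account leftShift(Money m) {
        balance += m.amount
        this   // returning 'this' lets << calls be chained
    }
}

def total = new Money(amount: 15) + new Money(amount: 10) - new Money(amount: 5)
def account = new Account()
account << total << new Money(amount: 1)
println "${total.amount} ${account.balance}"
```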

Operator overloading

Update the Distance class with a div() method following the naming convention for operators

class Distance{
  ...
  Speed div(Duration t){
    new Speed(this, t)
  }
  ...
}

Optional return

Equivalence of notation

Those two notations are actually equivalent:
2.km/h
2.getKm().div(h) //This one might be slightly more verbose!
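
The equivalence can be checked with a condensed version of the classes. This sketch assumes plain strings in place of the DistanceUnit and TimeUnit enums to stay short; everything else follows the code above.

```groovy
import groovy.transform.TupleConstructor

@TupleConstructor
class Distance {
    double amount
    String unit
    // '/' on a Distance is dispatched to div()
    Speed div(Duration t) { new Speed(this, t) }
    String toString() { "$amount $unit" }
}

@TupleConstructor
class Duration {
    double amount
    String unit
    String toString() { "$amount $unit" }
}

@TupleConstructor
class Speed {
    Distance distance
    Duration duration
    String toString() { "$distance/$duration" }
}

Number.metaClass.getKm = { -> new Distance(delegate, 'km') }
def h = new Duration(1, 'h')

def viaOperator = 2.km / h
def viaMethods  = 2.getKm().div(h)
println viaOperator
```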

Named parameters usage
//explore.groovy
...
class Robot{
  void move(Map m, Direction dir){
    println "Rover: moving $dir ${m.collect{k,v-> "$k $v"}.join(", ")}"
  }
}
...

//integration.groovy
import mars.Distance
import mars.DistanceUnit
import mars.Duration
import mars.TimeUnit
import mars.Direction
import mars.Robot

Number.metaClass.getKm= {->
  new Distance(delegate, DistanceUnit.kilometer)
}

def robot= new Robot()
def binding= new Binding([
  robot: robot,
  *: Direction.values()
              .collectEntries{
                [(it.name()): it]
              },
  move: robot.&move,
  h: new Duration(1, TimeUnit.hour)
])

def shell= new GroovyShell(binding)
shell.evaluate(
  new File("command.groovy")
)

//command.groovy
move left, at: 3.km/h

'left' is a normal parameter
'at: 3.km/h' is a named parameter

Will call:
void move(Map m, Direction dir)
All named parameters go into the map argument
Positional parameters come afterwards
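
A quick way to convince yourself of this rule. The Probe class and its option names are hypothetical, and the method returns a map only so the result is easy to inspect.

```groovy
// Hypothetical class: the leading Map parameter collects the named arguments
class Probe {
    Map move(Map options, String dir) {
        [direction: dir] + options
    }
}

def probe = new Probe()
// Named arguments may appear anywhere in the call; they all end up in the map
def call1 = probe.move('left', at: 3, mode: 'slow')
def call2 = probe.move(at: 3, 'left', mode: 'slow')
println call1
```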

Can we get rid of the comma?
What about the colon too?

Command chains

A grammar improvement in Groovy 1.8 allowing you to drop dots & parens when chaining method calls
  • an extended version of top-level statements like println
Fewer dots and fewer parens allow you to
  • write more readable business rules
  • in almost plain English sentences (or any language, of course)

move left at 3.km/h
Enable alternation of method names 'move' and 'at'
and parameters (even named ones) 'left' and '3.km/h'

Equivalent to:
move(left).at(3.km/h)

//Java fluent API approach
class Robot{
  ...
  Direction dir
  Speed speed

  def move(Direction dir){
    this.dir= dir
    return this
  }
  def at(Speed speed){
    this.speed= speed
    return this
  }
  ...
}


  def move(Direction dir){
    [at: {Speed speed->
           println "Rover: moving $dir at $speed"
         }]
  }

Nested maps and closures
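
Here is the nested map and closure variant as a standalone script. It is a sketch under assumptions: move is a script method instead of a Robot method, strings and integers replace Direction and Speed, and journal is a made-up binding variable that records what happened.

```groovy
journal = []   // no 'def': stored in the script binding so the method below can see it

def move(String dir) {
    // The returned map's 'at' key holds a closure, so 'at 3' ends up calling it
    [at: { speed -> journal << "moving $dir at $speed".toString() }]
}

move 'left' at 3        // command chain
move('right').at(5)     // the same call shape, written out in full

println journal
```

Compared with the fluent API version, no state is kept on the object: each word in the sentence is either a method call or a key into the map returned by the previous call.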

Usage in your DSLs
move left at 3.km/h

Command chains

//methods with multiple arguments (commas)
take coffee  with sugar, milk  and liquor
take(coffee).with(sugar, milk).and(liquor)

//leverage named-args as punctuation
check that: vodka  tastes good
check(that: vodka).tastes(good)

//closure parameters for new control structures
given {}  when {}  then {}
given({}).when({}).then({})

//zero-arg methods require parens
select all  unique() from names
select(all).unique().from(names)

//possible with an odd number of terms
deploy left  arm
deploy(left).arm

Final result
move forward at 3.km/h

What about security and safety?

  • Security and safety
  • JVM Security Managers
  • SecureASTCustomizer
  • Sandboxing
  • Controlling scripts execution

Play it safe in a sandbox
Playing it safe...
You have to think carefully about what DSL users are allowed to do with your DSL
Forbid things which are not allowed
  • leverage the JVM's Security Managers: this might have an impact on performance
  • use a Secure AST compilation customizer: not so easy to think about all possible cases
  • avoid long running scripts with *Interrupt transformations

Security Managers

Groovy is just a language living on the JVM, so you have access to the usual Security Managers mechanism
  • Nothing Groovy specific here
  • Please check the documentation on Security Managers and how to design policy files

SecureASTCustomizer

def secure= new SecureASTCustomizer()
secure.with{
  //disallow closure creation
  closuresAllowed= false
  //disallow method definitions
  methodDefinitionAllowed= false
  //empty white list => forbid certain imports
  importsWhitelist= [...]
  staticImportsWhitelist= [...]
  //only allow some static import
  staticStarImportsWhitelist= [...]
  //language tokens allowed
  tokensWhitelist= [...]
  //types allowed to be used
  constantTypesClassesWhiteList= [...]
  //classes that are allowed to be receivers of method calls
  receiversClassesWhiteList= [...]
}
def config= new CompilerConfiguration()
config.addCompilationCustomizers(secure)
def shell= new GroovyShell(config)

Controlling code execution

Your application may run users' code
  • what if the code runs in infinite loops or for too long?
  • what if the code consumes too many resources?

3 new transforms to your rescue
  • @ThreadInterrupt: adds Thread#isInterrupted checks so your executing thread stops when interrupted
  • @TimedInterrupt: adds checks in method and closure bodies to detect when the code has run longer than expected
  • @ConditionalInterrupt: adds checks with your own conditional logic to break out from the user code

@ThreadInterrupt

@ThreadInterrupt
import groovy.transform.ThreadInterrupt
while(true){
  //the kind of check @ThreadInterrupt adds for you (shown for illustration)
  if(Thread.currentThread().isInterrupted())
    throw new InterruptedException()
  //Any extraterrestrial around?
}

@TimedInterrupt

@TimedInterrupt(10)
import groovy.transform.TimedInterrupt
while(true){
  move left
  //circle forever
}

A TimeoutException (the default) is thrown when checks indicate code ran longer than desired

@ConditionalInterrupt

Specify your own condition to be inserted at the start of method and closure bodies
  • check for available resources, number of times run, etc.
Leverages closure annotation parameters from Groovy 1.8

@ConditionalInterrupt({ battery.level < 0.1 })
import groovy.transform.ConditionalInterrupt
100.times{
  move forward at 10.km/h
}

Using compilation customizers

In our previous three examples, the usage of the interrupts was explicit, and users had to type them. If they want to deplete the battery of your robot, they won’t use interrupts, so you have to impose interrupts yourself

With compilation customizers you can inject those interrupts thanks to the ASTTransformationCustomizer
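
A minimal sketch of that idea, assuming a one-second budget and a made-up laps counter in the binding. The user script contains no annotation at all; the customizer applies @TimedInterrupt on its behalf.

```groovy
import groovy.transform.TimedInterrupt
import java.util.concurrent.TimeoutException
import org.codehaus.groovy.control.CompilerConfiguration
import org.codehaus.groovy.control.customizers.ASTTransformationCustomizer

def config = new CompilerConfiguration()
// Same effect as the user writing @TimedInterrupt(1); the default unit is seconds
config.addCompilationCustomizers(
    new ASTTransformationCustomizer(TimedInterrupt, value: 1L))

def binding = new Binding([laps: 0])
def shell = new GroovyShell(binding, config)

def stopped = false
try {
    // Would circle forever without the injected timeout checks
    shell.evaluate('while (true) { laps++ }')
} catch (TimeoutException e) {
    stopped = true
}
println "stopped=$stopped after ${binding.getVariable('laps')} laps"
```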

What have we learnt?

Groovy Power!™
  • A flexible and malleable syntax
    • scripts vs classes, optional typing, colons and parens
  • Groovy offers useful dynamic features for DSLs
    • operator overloading, ExpandoMetaClass
  • Can write almost plain natural language sentences
    • for readable, concise and expressive DSLs
  • Groovy DSLs are easy to integrate, and can be secured to run safely in your own sandbox

Groovy is a great fit for DSLs!

And there’s more!

We haven’t dived into...
  • How to implement your own control structures with the help of closures
  • How to create Groovy "builders"
  • How to hijack the Groovy syntax to develop our own language extensions with AST Transformations
  • Source preprocessing for custom syntax
  • How to use the other dynamic metaprogramming techniques available
  • How to improve error reporting with customizers
  • IDE support with DSL descriptors (GDSL and DSLD)

Credit

Guillaume Laforge
Groovy Project Manager at VMware
Initiator of the Grails framework
Creator of the Gaelyk and Caelyf toolkits
Co-author of Groovy in Action
Follow me on...
  • My blog: http://glaforge.appspot.com
  • Twitter: @glaforge
  • Google+: http://gplus.to/glaforge
  • Email: glaforge@gmail.com

Last edited Apr 29, 2012 at 6:05 AM by gavingrover, version 1
