Examples from 'The openapi project' in conduit

By Paul Murrell and Ashley Noel Hinton

25 September 2014

Paul Murrell's talk, The openapi Project, introduced the goals and ideas of openapi and demonstrated some pipeline examples in a prototype glue system, oaglue. This page details the same pipeline examples run in another prototype glue system, conduit. Both prototypes are R packages.

As 'oaglue' and 'conduit' are both under active development, the pipeline XML scripts from the talk needed to be modified to run in 'conduit'. A comparison of the pipeline XML file for the first example follows.

Original pipeline XML file:

## <pipeline xmlns="http://www.openapi.org/2014/" version="0.1">
##   <component name="brsource"/>
##   <component name="birthrate"/>
##   <component name="brplot-R"/>
##   <pipe>
##     <start component="brsource" name="brsrcfile"/>
##     <end component="birthrate" name="brsrcfile"/>
##   </pipe>
##   <pipe>
##     <start component="birthrate" name="brfile"/>
##     <end component="brplot-R" name="brfile"/>
##   </pipe>
## </pipeline>

Modified pipeline XML file:

## <pipeline xmlns="http://www.openapi.org/2014/">
##   <component name="brsource" ref="brsource.xml" type="module"/>
##   <component name="birthrate" ref="birthrate.xml" type="module"/>
##   <component name="brplot-R" ref="brplot-R.xml" type="module"/>
##   <pipe>
##     <start component="brsource" output="brsrcfile"/>
##     <end component="birthrate" input="brsrcfile"/>
##   </pipe>
##   <pipe>
##     <start component="birthrate" output="brfile"/>
##     <end component="brplot-R" input="brfile"/>
##   </pipe>
## </pipeline>

The differences between the two documents mostly relate to the naming of XML nodes and attributes. The basic structure of a list of components and a list of pipes connecting components is visible in both versions.
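The shared structure can be made concrete by reading the modified pipeline file with the 'XML' package (listed in the session information at the end of this page). The following is only a sketch and is not how conduit itself loads pipelines: the XML is inlined as a string so the code is self-contained, the names pipelineXML, components and pipes are illustrative, and it assumes each pipe holds exactly one start and one end element.

```r
library(XML)

# The modified pipeline from above, inlined so the sketch runs on its own.
pipelineXML <- '<pipeline xmlns="http://www.openapi.org/2014/">
  <component name="brsource" ref="brsource.xml" type="module"/>
  <component name="birthrate" ref="birthrate.xml" type="module"/>
  <component name="brplot-R" ref="brplot-R.xml" type="module"/>
  <pipe>
    <start component="brsource" output="brsrcfile"/>
    <end component="birthrate" input="brsrcfile"/>
  </pipe>
  <pipe>
    <start component="birthrate" output="brfile"/>
    <end component="brplot-R" input="brfile"/>
  </pipe>
</pipeline>'

doc <- xmlParse(pipelineXML, asText = TRUE)

# The file uses a default namespace, so XPath queries need a prefix for it.
# The prefix "oa" is an arbitrary choice made for this sketch.
ns <- c(oa = "http://www.openapi.org/2014/")

# The list of components.
components <- as.character(
    xpathSApply(doc, "//oa:component", xmlGetAttr, "name", namespaces = ns))

# The list of pipes, one row per pipe (assumes one start and one end each).
starts <- getNodeSet(doc, "//oa:pipe/oa:start", namespaces = ns)
ends <- getNodeSet(doc, "//oa:pipe/oa:end", namespaces = ns)
pipes <- data.frame(from = sapply(starts, xmlGetAttr, "component"),
                    output = sapply(starts, xmlGetAttr, "output"),
                    to = sapply(ends, xmlGetAttr, "component"),
                    input = sapply(ends, xmlGetAttr, "input"),
                    stringsAsFactors = FALSE)
```

Printing components and pipes shows the two lists described above. Reading the original file the same way would only require asking for the name attribute on the start and end elements instead of output and input.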

The module and pipeline XML files are found in the scripts sub-directory, which has a further sub-directory for each of the Wiki NZ and Internet Party examples. Data and source files for each example are found in the respective sub-directories.

The results of running the pipelines can be found in the pipelines sub-directory. These pipelines were run on an Ubuntu 14.04 64-bit machine, using R 3.1.1. Modules requiring the Python platform were run in Python 2.7 with the NumPy package and matplotlib library available.

A tarball of the conduit package as it was when this page was created is provided. Several other R packages are required, a list of which follows the examples.
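Before running the examples it may help to check that the required packages are available. The sketch below is only a convenience and is not part of conduit. The package list is an assumption drawn from the attached packages in the session information at the end of this page, plus 'XML', and may not be complete.

```r
# Assumed package list, taken from the session information on this page.
required <- c("conduit", "gridGraphviz", "gridSVG", "Rgraphviz", "graph",
              "XML")

# requireNamespace() returns FALSE, rather than failing, when a package
# cannot be loaded.
available <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missingPackages <- required[!available]

if (length(missingPackages) > 0) {
    message("Missing packages: ", paste(missingPackages, collapse = ", "))
}
```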

Each example title is a link to the example in the original talk slides.

Examples

openapi is a Glue System

birthrate <- loadPipeline("scripts/wikinz/birthrate-pipe.xml",
                          "birthrate")
runPipeline(birthrate)
list.files("pipelines/birthrate/modules", recursive=TRUE)
## [1] "birthrate/birthrate.csv"  "birthrate/script.R"      
## [3] "brplot-R/birthrate-R.svg" "brplot-R/script.R"

openapi is...

library(gridGraphviz)
## Loading required package: grid
## Loading required package: graph
## Loading required package: Rgraphviz
library(gridSVG)
## 
## Attaching package: 'gridSVG'
## 
## The following object is masked from 'package:grDevices':
## 
##     dev.off
gridsvg("birthrate-pipe.svg", width=4, height=3)
grid.graph(agopenTrue(conduit:::graphPipeline(birthrate), "birthrate",
                      attrs=list(node=list(shape="ellipse"))))
dev.off()

Graph of birthrate pipeline

An openapi example

birthrate_custom <- loadPipeline("scripts/wikinz/birthrate-pipe-custom.xml",
                                 "birthrate_custom")
runPipeline(birthrate_custom)
list.files("pipelines/birthrate_custom", recursive=TRUE)
## [1] "modules/birthrate/birthrate.csv"        
## [2] "modules/birthrate/script.R"             
## [3] "modules/brplot-R/birthrate-R.svg"       
## [4] "modules/brplot-R-custom/birthrate-R.svg"
## [5] "modules/brplot-R-custom/script.R"       
## [6] "modules/brplot-R/script.R"
gridsvg("birthrate-custom-pipe.svg", width=4, height=3)
grid.graph(agopenTrue(conduit:::graphPipeline(birthrate_custom),
                      "birthrate_custom",
                      attrs=list(node=list(shape="ellipse"))))
grid.edit("box-brplot-R-custom", gp=gpar(fill="green"))
dev.off()

Graph of customised birthrate pipeline

Birthrate plot generated by pipeline

Another openapi example

birthrate_python <- loadPipeline("scripts/wikinz/birthrate-pipe-python.xml",
                                 "birthrate_python")
runPipeline(birthrate_python)
list.files("pipelines/birthrate_python", recursive=TRUE)
## [1] "modules/birthrate/birthrate.csv"   
## [2] "modules/birthrate/script.R"        
## [3] "modules/brplot-py/birthrate-py.svg"
## [4] "modules/brplot-py/script.py"       
## [5] "modules/brplot-R/birthrate-R.svg"  
## [6] "modules/brplot-R/script.R"
library(gridSVG)
gridsvg("birthrate-pipe-python.svg", width=4, height=3)
grid.graph(agopenTrue(conduit:::graphPipeline(birthrate_python),
                      "birthrate_python",
                      attrs=list(node=list(shape="ellipse"))))
grid.edit("box-brplot-py", gp=gpar(fill="green"))
dev.off()

Plot of Python birthrate pipeline

Birthrate plot from Python birthrate pipeline

Yet another openapi example

plotPipe <- loadPipeline("scripts/internetparty/plotPipe.xml", "plotPipe")
library(gridSVG)
gridsvg("plotPipe.svg", width=4, height=4)
grid.graph(agopenTrue(conduit:::graphPipeline(plotPipe),
                      "plotPipe",
                      attrs=list(node=list(shape="ellipse"))))
dev.off()
## Warning in grabDL(warn, wrap, ...): one of more grobs overwritten (grab
## WILL not be faithful; try 'wrap = TRUE')

Graph of Internet Party plot pipeline

Yet another openapi example

piePipe <- loadPipeline("scripts/internetparty/piePipe.xml")
runPipeline(piePipe)
list.files("pipelines/piePipe", recursive=TRUE)
## [1] "modules/pie/pie.svg"             "modules/pie/script.R"           
## [3] "modules/plot/non-voters.svg"     "modules/plot/population.svg"    
## [5] "modules/plot/script.R"           "modules/tidy/nonvoters.rds"     
## [7] "modules/tidy/pop2013grouped.rds" "modules/tidy/pop2013.rds"       
## [9] "modules/tidy/script.R"
library(gridSVG)
gridsvg("piePipe.svg", width=4, height=4)
grid.graph(agopenTrue(conduit:::graphPipeline(piePipe),
                      "piePipe",
                      attrs=list(node=list(shape="ellipse"))))
grid.edit("box-pie", gp=gpar(fill="green"))
dev.off()
## Warning in grabDL(warn, wrap, ...): one of more grobs overwritten (grab
## WILL not be faithful; try 'wrap = TRUE')

Graph of pie graph pipeline

reportPipe <- loadPipeline("scripts/internetparty/reportPipe.xml", "reportPipe")
runPipeline(reportPipe)
list.files("pipelines/reportPipe", recursive=TRUE)
##  [1] "modules/calculate/script.R"                 
##  [2] "modules/calculate/youngNonvotersPercent.rds"
##  [3] "modules/plot/non-voters.svg"                
##  [4] "modules/plot/population.svg"                
##  [5] "modules/plot/script.R"                      
##  [6] "modules/report/report.html"                 
##  [7] "modules/report/script.R"                    
##  [8] "modules/tidy/nonvoters.rds"                 
##  [9] "modules/tidy/pop2013grouped.rds"            
## [10] "modules/tidy/pop2013.rds"                   
## [11] "modules/tidy/script.R"
library(gridSVG)
gridsvg("reportPipe.svg", width=4, height=4)
grid.graph(agopenTrue(conduit:::graphPipeline(reportPipe),
                      "reportPipe",
                      attrs=list(node=list(shape="ellipse"))))
dev.off()
## Warning in grabDL(warn, wrap, ...): one of more grobs overwritten (grab
## WILL not be faithful; try 'wrap = TRUE')

Graph of report pipeline


sessionInfo()
## R version 3.1.2 (2014-10-31)
## Platform: x86_64-pc-linux-gnu (64-bit)
## 
## locale:
##  [1] LC_CTYPE=en_NZ.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_NZ.UTF-8        LC_COLLATE=en_NZ.UTF-8    
##  [5] LC_MONETARY=en_NZ.UTF-8    LC_MESSAGES=en_NZ.UTF-8   
##  [7] LC_PAPER=en_NZ.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_NZ.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] grid      stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
## [1] gridSVG_1.4-2    gridGraphviz_0.3 Rgraphviz_2.10.0 graph_1.44.1    
## [5] conduit_0.1-0    knitr_1.8       
## 
## loaded via a namespace (and not attached):
##  [1] BiocGenerics_0.12.1 bitops_1.0-6        codetools_0.2-10   
##  [4] digest_0.6.8        evaluate_0.5.5      formatR_1.0        
##  [7] highr_0.4           parallel_3.1.2      RBGL_1.42.0        
## [10] RCurl_1.95-4.5      RJSONIO_1.3-0       stats4_3.1.2       
## [13] stringr_0.6.2       tools_3.1.2         XML_3.98-1.1

This page was produced using the 'knitr' package in R. The page source can be found at index.Rhtml.