Training

  • We provide training in the major analytical tools (R / Python / SQL databases + Big Data), adapted to the needs of your organisation. We give this training at your site, with your infrastructure, using concrete examples from your business environment. Some of these courses are also given regularly at the University of Leuven (LStat), through RBelgium or the Data Science Innovation Hub, and courses on the use of Oracle R Enterprise are given with our Oracle Partner.

    The courses are unique in the world because of their breadth and depth. They range from the basics and traditional data manipulation & visualisation to statistical learning and deploying your analytical solutions using R packages, automation, in-database routines or web services.

    All offered courses can be found below or in full detail in our brochure, which you can download here. If you want to schedule one of these courses or if you are interested in a specific R course suited to your needs, let us know.

    New since 2020: the courses Text Mining with R and Advanced R programming are available online through our online school. Let us know if you want to obtain access.

      • Basic

        Module: R for starters

        Course Duration 2 days

        Prerequisites None

        Course Content

        • What is R, packages available (CRAN, R-Forge, ...), R documentation search, finding help, RStudio editor, syntax
        • Data types (numeric/character/factor/logicals/NA/Dates/Times)
        • Data structures (vector/data.frame/matrix/lists and standard operations on these)
        • Saving (RData) & importing data from flat files, csv, Excel, Oracle, MS SQL Server, SAS, SPSS
        • Creating functions, data manipulation (subsetting, adding variables, ifelse, control flow, recoding, rbind, cbind) and aggregating and reshaping
        • Plotting in R using base and lattice functionality (dot plots, barcharts, graphical parameters, legends, devices)
        • Basic statistics in R (mean, variance, crosstabs, quantile, correlation, distributions, densities, histograms, boxplot, t-tests, wilcoxon test, non-parametric tests)

      • Basic

        Module: Common data manipulation & Programming in R

        Course Duration 1 day

        Prerequisites R experience from 1 month up to 2 years

        Course Content

        This module helps you become a better programmer by writing your own functions and by getting acquainted with commonly used R functions for basic data manipulation and with the R object-oriented programming environment.
        • with, within, by, apply family of functions & split-apply-combine strategy
        • vectorisation, parallel execution of code
        • data.table - fast group by, joining and data.table programming tricks
        • basic regular expressions
        • writing your own functions
        • do.call
        • reshaping from wide to long format
        • environments
        • S3 classes, generics and basic S4 methodology
        • handling of errors and exceptions, debugging code
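
        To give a flavour of the split-apply-combine strategy and do.call listed above, here is a minimal sketch in base R. The sales data frame and its column names are invented for this illustration and are not taken from the course material.

```r
# Hypothetical sales data, invented for this illustration
sales <- data.frame(
  region = c("north", "south", "north", "south", "north"),
  amount = c(10, 20, 30, 40, 50)
)

# Split: one data.frame per region
pieces <- split(sales, sales$region)

# Apply: summarise each piece into a one-row data.frame
summaries <- lapply(pieces, function(piece) {
  data.frame(
    region = piece$region[1],
    total  = sum(piece$amount),
    mean   = mean(piece$amount)
  )
})

# Combine: do.call passes the list elements as separate arguments to rbind
result <- do.call(rbind, summaries)
print(result)

# The same totals computed in one call with tapply
totals <- tapply(sales$amount, sales$region, sum)
print(totals)
```

        The explicit split/lapply/do.call version makes each of the three steps visible, while tapply is the compact form for simple summaries.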

      • Basic

        Module: Reporting with R

        Course Duration 1 day

        Prerequisites R experience from 1 month up to 2 years

        Course Content

        R provides several tools for creating reports, whether static, dynamic or in the form of an interactive web page. This module teaches you the basics of building reports with R. It covers the following topics.
        • Sweave & knitr
        • Markdown & pandoc
        • integration with MS Office & presentations
        • making R package vignettes
        • An introduction to Shiny and interactive HTML reporting

      • Basic

        Module: Visualisation with R

        Course Duration 2 days

        Prerequisites R experience from 1 month up to 2 years

        Course Content

        This course gives you an introduction to the 4 main graphic systems in R using standard day-to-day graphics. It covers the following topics.
        • R base graphics
        • Graphical devices
        • Lattice graphics
        • ggplot2
        • Interactive graphics using htmltools

      • Basic

        Module: Data Connectivity with R

        Course Duration 1 day

        Prerequisites R experience from 1 month up to 2 years

        Course Content

        This course shows how to connect to different data sources (Excel, SQL databases, XML, JSON, Web scraping). It covers the following topics.
        • Read/Write data from/to Excel
        • Work efficiently with SQL databases
        • dplyr and SQL databases
        • Read in XML data
        • JSON and YAML from R
        • Web scraping from R and online data
        • Open Data from Belgium

      • Analytics

        Module: Statistical Machine Learning with R

        Course Duration 2 days

        Prerequisites At least knowledge of data manipulation with R. Knowledge of standard regression models (lm/glm). Basic statistical knowledge.

        Course Content

        This hands-on course covers the use of statistical machine learning methods available in R. The following basic learning methods will be covered and used on common datasets.
        • naive bayes
        • trees (recursive partitioning)
        • feed-forward neural networks
        • penalized regression modelling (lasso/ridge/elasticnet regularized generalized linear models)
        • bagging for classification and regression
        • random forests
        • adaboost & general boosting for classification & regression
        • if time permits: graphical lasso / penalised generalized additive models / model-based-recursive partitioning or support vector machines
        • model evaluation logic & hyperparameter tuning. Training and evaluation will be done through the use of the caret and ROCR packages

      • Analytics

        Module: Text mining with R

        Course Duration 2 days

        Prerequisites At least knowledge of data manipulation with R. Knowledge of standard regression models (lm/glm). Basic statistical knowledge.

        Course Content

        This hands-on course covers the use of text mining tools for data analysis. It covers basic text handling, natural language engineering and statistical modelling on top of textual data. The following items are covered.
        • Text encodings
        • Cleaning of text data, regular expressions
        • String distances
        • Graphical displays of text data
        • Natural language processing: stemming, parts-of-speech tagging, tokenization, lemmatisation
        • Sentiment analysis
        • Statistical topic detection modelling and visualization (latent Dirichlet allocation)
        • Visualisation of correlations & topics
        • Word embeddings
        • Document similarities & Text alignment
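
        As an illustration of the first items in this list (cleaning with regular expressions, tokenization and string distances), here is a minimal sketch that uses only base R. The two comments are invented for this example, and the course itself may rely on dedicated text mining packages instead.

```r
# Hypothetical raw comments, invented for this illustration
comments <- c("  The delivery was GREAT!!  ", "Great delivry, thanks.")

# Cleaning: lower-case, strip punctuation, squeeze repeated whitespace
clean <- tolower(comments)
clean <- gsub("[[:punct:]]", "", clean)
clean <- trimws(gsub("[[:space:]]+", " ", clean))

# Tokenization: split each cleaned document on single spaces
tokens <- strsplit(clean, " ", fixed = TRUE)

# Term frequencies over all documents
term_freq <- sort(table(unlist(tokens)), decreasing = TRUE)
print(term_freq)

# String distance: Levenshtein edit distance between a word and a misspelling
d <- adist("delivery", "delivry")
print(d)
```

        An edit distance like this one is a common way to match misspelled terms against a dictionary before any modelling is done.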

      • Analytics

        Module: Applied Spatial modelling with R

        Course Duration 2 days

        Prerequisites At least knowledge of data manipulation with R, S3/S4 classes and standard R visualisation. Knowledge of standard regression models (lm/glm). Basic statistical knowledge.

        Course Content

        This course is useful for data scientists and data analysts who frequently work with data that has a spatial component (data with latitude/longitude information). It gives an introduction to the numerous spatial facilities of R and some standard spatial statistical models. The following items are covered during the course.
        • Importing spatial data and setting the spatial projection
        • Plotting spatial data on static and interactive maps
        • Adding graphical components to spatial maps
        • Manipulation of geospatial data, geocoding, distances, ...
        • Density estimation and spatial point pattern analysis
        • Spatial regression
        • Kriging and spatial predictions

      • Analytics

        Module: Computer Vision with R and Python

        Course Duration 2 days

        Prerequisites At least knowledge of data manipulation with R and standard R visualisation. Some basic Python knowledge. Basic statistical knowledge.

        Course Content

        This hands-on course covers image analysis: basic image manipulation, feature engineering techniques and finding patterns in images. The following items are covered.
        • image manipulation & adjustments
        • finding blobs, corners, gradients, edges & lines
        • optical character recognition
        • feature & object detection
        • applying filters
        • deep learning for image analysis
        • image segmentation

      • Analytics

        Module: Big data analytics with R on top of Spark, Hadoop and HAWQ

        Course Duration 3 days

        Prerequisites At least knowledge of data manipulation with R, S3 classes. Knowledge of Statistical Machine Learning with R.

        Course Content

        This course is useful for data scientists and data analysts who frequently work on big datasets in Spark/Hadoop. The following items are covered during the course.
        • Overview of the big data ecosystem for data scientists
        • Linux system commands for data scientists
        • Work with the hadoop file system (read/write, directories). Typical Hadoop files
        • Map-Reduce & mapply
        • Spark & R: SparkSQL using package sparklyr & dplyr. Spark Machine Learning using package sparklyr (data preparation, regression, random forest & boosted trees) + Spark extensions
        • HAWQ & R: Running PL/R stored procedures using Apache HAWQ. Classification & regression using MADlib & PivotalR

      • Advanced / Deploy

        Module: Creating R packages and R repositories

        Course Duration 1 day

        Prerequisites At least knowledge of data manipulation with R, S3/S4 classes.

        Course Content

        R is well known for its flexibility, with more than 6000 packages available for direct use. If you want to create your own package to distribute code to others inside your organization, this module teaches you how to build your own package and set up an enterprise R package repository.
        • structure of an R package
        • documenting your code and your R package using roxygen in RStudio
        • check, build, install your R package
        • unit-testing your R code
        • creating your own R package repository

      • Advanced / Deploy

        Module: Managing R processes

        Course Duration 1 day

        Prerequisites At least knowledge of data manipulation with R. Not afraid of the shell.

        Course Content

        R is a programming language and can be launched from the command line. This module teaches you how to launch R properly in order to automate processes.
        • R and Rscript and the options
        • Startup scripts and settings
        • Handling of command line arguments
        • Understanding package library folders
        • Automating R processes in Windows & Linux
        • Handling and logging of error/warning messages
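
        As an illustration of handling command line arguments and logging error/warning messages, here is a minimal sketch of a script that could be launched with Rscript. The helper names parse_args and run_logged, the key=value argument convention and the file name job.R are assumptions made for this example, not part of the course material.

```r
# Hypothetical helper: turn "key=value" command line arguments into a named list
parse_args <- function(args) {
  parts <- strsplit(args, "=", fixed = TRUE)
  values <- vapply(parts, function(p) paste(p[-1], collapse = "="), character(1))
  names(values) <- vapply(parts, function(p) p[1], character(1))
  as.list(values)
}

# Hypothetical helper: run a task and log warnings and errors with a timestamp
run_logged <- function(task, log = stderr()) {
  stamp <- function(level, msg) {
    cat(format(Sys.time(), "%Y-%m-%d %H:%M:%S"), level, msg, "\n", file = log)
  }
  withCallingHandlers(
    tryCatch(task(), error = function(e) {
      # An error ends the task; log it and return NULL to the caller
      stamp("ERROR", conditionMessage(e))
      NULL
    }),
    warning = function(w) {
      # A warning is logged and muffled so the task keeps running
      stamp("WARNING", conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
}

# commandArgs(trailingOnly = TRUE) returns only the arguments after the script name
args <- parse_args(commandArgs(trailingOnly = TRUE))
n <- if (is.null(args$n)) 5L else as.integer(args$n)

result <- run_logged(function() sum(seq_len(n)))
print(result)
```

        Saved as job.R, the script could be started from a scheduler with a command such as Rscript job.R n=10. Writing the log to stderr keeps messages separate from the regular output on stdout.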

      • Advanced / Deploy

        Module: R code management, Git and Continuous Integration

        Course Duration 1 day

        Prerequisites At least knowledge of data manipulation with R. Not afraid of the shell. Knowledge of Sweave/knitr. Knowledge of package building. Knowledge of R process automation.

        Course Content

        If you are an R developer and want to make sure your analysis is reproducible and traceable, you need to learn how to use a code repository. In this session, you will learn how to use RStudio for code maintenance. The following items are covered:
        • R Coding guidelines
        • Good practices in management & structuring of R code
        • Git & code repositories
          • Setup
          • Pushing, pulling and cloning code
          • Handling of conflicts
          • Setting branches and releases
          • Use of RStudio’s IDE to integrate with your code repository
        • R package repositories: Build a local CRAN repository with your own packages
        • Doing continuous integration with GitLab and Travis CI

      • Advanced / Deploy

        Module: Integration of R into web applications

        Course Duration 2 days

        Prerequisites At least knowledge of data manipulation with R. Not afraid of the shell. Knowledge of Sweave/knitr. Knowledge of R process automation and R package building.

        Course Content

        Learn how to create basic web applications and web services in R. The following elements are covered in the course:
        • Setting up Shiny applications
        • RApache & OpenCPU
        • using R alongside JavaScript, htmlwidgets
        • setting up web services using R

      • Advanced / Deploy

        Module: Shiny

        Course Content

        • This course can be supplied directly by RStudio. We don't provide a home-made course for Shiny.

    • Oracle has built "Oracle R Enterprise" as part of the Oracle Advanced Analytics Option to make the Open-Source statistical programming language R ready for the Enterprise and Big Data. Designed for problems involving large amounts of data, Oracle R Enterprise integrates R with Oracle Database. R users can run R commands and scripts for statistical and graphical analyses on data stored in the Oracle Database.

      • Oracle R

        Module: ROracle and Oracle R Enterprise - transparency layer

        Course Duration 1 day

        Prerequisites At least knowledge of data manipulation with R, S3 classes.

        Course Content

        In this session you will learn how to interface R with Oracle and use the transparency layer provided by ORE. You will be given access to an Oracle database where ORE is installed so you can use the ORE suite during the exercises. The course covers the following topics.
        • ROracle - sending SQL queries to Oracle and getting the results back
        • Installing Oracle R Enterprise (ORE)
        • Basic database connectivity: ore.exec, ore.ls, ore.synch, ore.push, ore.pull, ore.create, ore.drop, ore.get
        • ORE data types: ore.character, ore.factor, ore.logical, ore.number, ore.datetime, ore.numeric. Conversion between data types
        • ORE data structures: ore.matrix, ore.frame, ore.vector
        • ORE transparency layer data operations on ore.frame/ore.vector (subset, ncol, nrow, head, ifelse, paste, is.na, sd, mean, tapply, by, c, %in%, ...) and indexing and overwriting in-database ore.vectors
        • Save R objects in Oracle ore.save, ore.load, ore.datastore and ORE data store handling
        • Basic statistics with ORE (ore.univariate, ore.summary, ore.crosstab, ore.corr, exponential smoothing, t.test, wilcoxon, IQR)

      • Oracle R

        Module: Oracle R Enterprise - advanced data manipulation

        Course Duration 0.5 days

        Prerequisites At least knowledge of data manipulation with R, S3 classes. Understanding Oracle R Enterprise and the transparency layer.

        Course Content

        You will be given access to an Oracle database where ORE is installed so you can use the ORE suite during the exercises. The session covers the following topics:
        • Running R functions in parallel inside the database: ore.doEval, ore.groupApply, ore.indexApply, ore.rowApply, ore.tableApply
        • Creating R scripts inside the database and accessing ORE stored procedures
        • Embedding R scripts in production database applications
        • Embedded (parallel) R execution within ORE using the R Interface as well as the SQL Interface

      • Oracle R

        Module: Data mining models inside Oracle R Enterprise and Oracle Data Mining

        Course Duration 1 day

        Prerequisites At least knowledge of data manipulation with R, S3 classes. Understanding Oracle R Enterprise and the transparency layer. Knowledge of statistical modelling and machine learning.

        Course Content

        Data Mining (aka Machine Learning) refers to a set of statistical and mathematical techniques that reveal relationships and patterns in data. In contrast to ‘classical’ statistical methods, there is no need to formulate a hypothesis in advance. Applications of data mining methods include forecasting models, market basket analysis, target group analysis and more. In this session you will become acquainted with some of the most common data mining methods and learn how to use these algorithms in ORE. You will be given access to an Oracle database where ORE is installed so you can use the ORE suite during the exercises. Note that if you are unfamiliar with data science algorithms, it is advised to also follow the module ‘Statistical Machine Learning with R’. The following algorithms will be covered.
        • principal component analysis and factor analysis
        • kmeans clustering and orthogonal partitioning
        • data reduction using Minimum Description Length attribute importance
        • linear models and generalized linear models
        • naive bayes, neural networks, decision tree and support vector machines
        • market basket analysis / recommendation engines (apriori)
        • bagging

  • Upcoming public courses.

      • 2019-10-17&18: Statistical Machine Learning with R: Subscribe here
      • 2019-11-14&15: Text Mining with R: Subscribe here
      • 2019-12-17&18: Applied Spatial Modelling with R: Subscribe here
      • 2020-02-19&20: Advanced R programming: Subscribe here
      • 2020-03-12&13: Computer Vision with R and Python: Subscribe here
      • 2020-03-16&17: Deep Learning/Image recognition: Subscribe here
      • 2020-04-22&23: Text Mining with R: Subscribe here
      • 2020-05-06&07: Text Mining with Python: Subscribe here

    Online courses.

    The following courses are available online anytime:
      • Text Mining with R
      • Advanced R programming
    If you want to have access, let us know by filling out the email form.

    Private courses.

    If you want to schedule a private course or if you are interested in a specific course suited to your needs, let us know by filling out the email form.