Workshops 2017

Sunday, 30 April 2017 - Tuesday, 02 May 2017

Software/Data Carpentry Instructor Training Workshop


Time:

  • 30 April: 14:00 - 17:00
  • 01 - 02 May: 09:00 - 16:30

Cost: R1000

Venue: 12th Floor, 10 Rua Vasco Da Gama Plain
            Foreshore, Cape Town (map)

Registration: Please note that this is a co-located event. Please complete the separate online application form.

Presented by: 

  • Dr Kari Jordan, Deputy Director of Assessment (Data Carpentry)
  • Anelda van der Walt, eResearch Consultant (NWU/Talarify)

Enquiries: eresearch@nwu.ac.za

Prerequisites: This is training for teaching and not technical training. Applicants will have to demonstrate experience in using at least one of the tools taught by the Carpentries. The lessons taught at Carpentry workshops are available at www.datacarpentry.org/lessons/ and http://software-carpentry.org/lessons.

Requirements: Participants are required to bring a laptop that has internet connectivity and a functioning browser. It is also recommended that you bring an audio and video recording device (mobile phones and laptops are OK) to record some sessions.

This workshop is aimed at researchers, postgraduates, postdoctoral fellows, and research support staff affiliated with South African academic or research institutes who are interested in becoming better teachers. In particular, this training is aimed at those who want to become Software and Data Carpentry instructors, run workshops, and contribute to the Carpentry training materials. Successful applicants will be expected to teach a Carpentry workshop within 12 months of completing the training.

The registration fee will cover the instructors' travel and accommodation costs. Sponsorship may be available upon request, but you will be required to provide a motivation. Participants will be responsible for their own travel and accommodation. Lunch and coffee will be provided daily.

For more information and to view the workshop schedule, go to https://nwu-eresearch.github.io/2017-04-30-eResearchAfrica-ttt/.

Please note that while eResearch Africa supports this workshop, it is not an official conference workshop. It is jointly organised by North-West University, DIRISA, and Talarify.


Tuesday, 02 May 2017

Containerisation for Research IT

Data storage management workshop

Time: 09:00 - 13:00

Venue: Snape Building, Upper Campus, University of Cape Town
            (View map)

Cost: R350

Presented by: Heine de Jager

Prerequisites: Linux proficiency, system administration experience

Requirements: Attendees will need to bring a laptop or tablet that has a web browser and an SSH client (e.g. PuTTY, JuiceSSH)

Containerisation has the potential to drastically reduce the administration overhead of deploying and maintaining services and applications. By offering system interoperability and removing dependency requirements, containers make applications easy to deploy and share. The modular nature of containers also allows for compartmentalisation of data and applications, permitting non-disruptive upgrades and the simple implementation of highly available services.

In this tutorial, containerisation will be discussed in the context of research application support and research data services. The practical aspect will cover installing Docker, setting up an S3 object storage service, launching the service, and finally uploading data using a client.

This workshop is ideal for IT staff involved in research support and researchers involved in scientific computing.

Outline of syllabus:

  1. Introduction to containerisation (15 min)
  2. Tutorial (3 hrs)
    • Setting up your environment
    • Installing Docker
    • Downloading containers
    • Launching
    • Connecting to services

Wednesday, 03 May 2017

Data Transfer & Operating Innovative Networks Workshop

Science DMZs, perfSONAR and Data Transfer Nodes

Time: 10:00 - 17:00

Venue: Snape Building, Upper Campus, University of Cape Town
           (View map)


Cost:

  • Free for selected attendees from eligible R&E beneficiaries
  • R500 otherwise

Enquiries: pert@sanren.ac.za

Presented by: Jason Zurawski (LBL/ESnet), Scott Chevalier (Indiana University International Networks), Roderick Mooi, Kasandra Pillay, Sakhi Hadebe, Kevin Draai (SANReN/CSIR)

Prerequisites: IT networking experience will be advantageous but is not required, particularly if you are attending as an interested researcher or scientist.

Requirements: None

This workshop will offer presentations and demonstrations of the Science DMZ architecture, data transfer nodes (DTNs), data transfer tools (like GridFTP and Globus Online) and perfSONAR for network performance measurement and monitoring. "Combined, these technologies are proven to support high-performance, big data science applications, while ensuring the security and availability that modern campuses and laboratories need" [1]. By the end of the event, attendees will have a better understanding of the requirements for supporting the scientific use of the network, architectural strategies that can simplify these interactions, and knowledge of tools that can mitigate problems users may encounter.

Who should attend?

  1. IT network engineers/technicians interested in optimising the network, eResearch / science engagement officers, researchers and scientists with big data transfer needs/challenges.
  2. Anyone interested in participating in a SANReN Science DMZ and/or DTN proof of concept (following on from the presentation by Roderick Mooi and Kasandra Pillay on Tuesday).


10:00 - 10:30 – Introduction and Motivation
10:45 - 12:15 – Science DMZ Architecture and Security
13:15 - 14:15 – Data Transfer Nodes, Tools & Globus
15:45 - 17:00 – perfSONAR small nodes, debugging, @SANReN; Q&A

For more information:
1. http://fasterdata.es.net/science-dmz/
2. http://oinworkshop.com/
3. http://perfsonar.net

The power of effective visualisation, or why you should never use a pie-chart? 

Data Visualisation Workshop

Time: 13:15 - 15:30

Venue: Hlanganani Junction
           (View map)

Cost: R350

Presented by: Associate Professor Michelle Kuttel (UCT)

Prerequisites: N/A

Requirements: N/A

In the age of “Big Data”, it is increasingly important to pay careful attention to the design of data visualisations. This workshop covers the field of visual thinking, outlining the current understanding of visual perception and demonstrating how this knowledge can be used to design more effective multi-dimensional data graphics. No programming background is required.


Thursday, 04 May 2017

Introduction and Hands-On Experience with iRODS: Managing Distributed Big Data Using Policies

Time: 09:00 - 15:30

Venue: University of Stellenbosch Business School, Bellville
            (View map)

Cost: R500

Presented by: Professor Arcot Rajasekar

Prerequisites: Linux proficiency, system administration experience

Requirements: Attendees will need to bring a laptop or tablet that has a web browser and an SSH client (e.g. PuTTY, JuiceSSH)

The Integrated Rule-Oriented Data System (iRODS) is open source data management software used by research organisations and government agencies worldwide. iRODS is released as a production-level distribution aimed at deployment in mission critical environments. It virtualises data storage resources, so users can take control of their data, regardless of where and on what device the data is stored. As data volumes grow and data services become more complex, iRODS is increasingly important in data management.

This course is designed for those who are new to iRODS or who have limited experience with iRODS but want to learn more. We will also look at a few tools that have been integrated with iRODS. Experience with the Unix command line and familiarity with the basic constructs of programming languages (e.g., variables, strings, loops) will be helpful to training participants.

IT staff involved in large-scale data management, researchers involved in scientific computing, and staff and researchers involved in digital curation and digital libraries will benefit from this workshop.

Outline of syllabus:

1. What is iRODS

2. Case Studies

3. Installing iRODS

4. Using iCommands

5. Virtualisation

6. Data Discovery

7. Workflow Automation

8. Access Tools

Friday, 05 May 2017

Library Carpentry Workshop

Time: 09:00 - 15:30

Venue: Hlanganani Junction
           (View map)

Cost: R500

Presented by: Kayleigh Roos, Isak van der Walt, Erika Mias

Prerequisites: Maximum 30 participants. The workshop is suited to beginners and no previous experience is required.

Requirements: Attendees will need to bring a laptop. The following software will need to be downloaded onto the participant's laptop prior to the workshop (a detailed mailer will be sent to participants with installation instructions):

OpenRefine - http://openrefine.org/download.html

Java - http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

The workshop will cover 2 of the 5 Library Carpentry Modules (https://librarycarpentry.github.io/), with the aim of facilitating follow-up workshops to cover the remaining modules. This introductory workshop will help librarians and other research support staff to understand the basic terms, phrases and concepts in data science. The workshop will equip attendees with the knowledge and skills required to assist researchers with data management matters and will consist of a hands-on training session in OpenRefine, an open-source software tool that assists with cleaning, editing and, to some extent, analysing data.

Who should attend?

The workshop is aimed at librarians and/or research support staff who assist researchers with data management and data cleaning, or who create, collect or work with messy data. Researchers and academic staff who are interested in learning about the basics of data science and data cleaning will also benefit and are encouraged to attend.

Outline of syllabus:

Library Carpentry (a branch of Software and Data Carpentry) is software skills training that is aimed at the needs and requirements of library professionals. Library Carpentry workshops consist of face-to-face sessions where a core set of modules (lessons developed by librarians for librarians) is taught. The modules are maintained via an open access GitHub repository.

The workshops aim to help participants with automating repetitive, error-prone tasks and to create, maintain and analyse sustainable, reusable data. The workshops also aim to help participants to better understand the use of software in research and to work more effectively with IT colleagues.

The Data Intro for Librarians and OpenRefine modules that will be taught at the eRA 2017 workshop will introduce software terms and concepts relevant to librarians, with a preference for open source and widely used software. The data that will be used in the hands-on OpenRefine session will be library-related data.

Data Intro for Librarians is an introduction to working with data and terminology. It will help participants to understand the terms, phrases and concepts in software development and data science, as well as assist participants with identifying and using best practice in data structures.

OpenRefine for Librarians will give participants the opportunity to install the necessary software and do hands-on exercises to clean messy data. The module will give insight into, and exposure to, working with messy data, along with common tips and tricks. Librarians will be able to see and experience the data challenges that researchers face and will gain a better understanding of best practices when working with data.
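
As an illustration of the kind of clean-up the hands-on session deals with, the sketch below mimics in Python the idea behind OpenRefine's key-collision clustering: each value is reduced to a normalised "fingerprint", and values that share a fingerprint are grouped as likely variants of the same entry. This is a simplified approximation and not OpenRefine's own implementation; the function names and the sample publisher names are hypothetical and are not taken from the workshop material.

```python
import string
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Reduce a string to a normalised key so near-duplicate spellings collide."""
    cleaned = value.strip().lower()
    # Drop ASCII punctuation, then split on whitespace.
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    # De-duplicate and sort the tokens so word order does not matter.
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values by fingerprint; keep only groups with more than one variant."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return {key: sorted(members) for key, members in groups.items() if len(members) > 1}


# Hypothetical publisher names as they might appear in a messy catalogue export.
publishers = [
    "Oxford University Press",
    "oxford university press ",
    "Press, Oxford University",
    "Juta",
    "Juta.",
    "HSRC Press",
]

clusters = cluster(publishers)
for key, variants in clusters.items():
    print(key, "<-", variants)
```

Each group printed by the loop is a set of spellings that a person would then review and merge into one preferred form, which is roughly the decision OpenRefine asks the user to make for each cluster.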

The Library Carpentry workshop at eRA 2017 provides an opportunity for those who are interested in becoming certified Library Carpentry trainers to volunteer as 'helpers' at the workshop. There are currently no official Library Carpentry trainers in South Africa, so the workshop will also serve to build a local community from which skilled trainers might emerge.

View slides

OpenStack Research Cloud Infrastructure Workshop

Learn how to build, operate, secure and support a national cloud

Time: 09:00 - 16:00

Venue: University of Stellenbosch Business School, Bellville
            (View map)

Cost: R500

Presented by: Sam Morrison and Justin Mammarella, NeCTAR Research Cloud Core Services, University of Melbourne, Australia 

Prerequisites: Linux proficiency, system administration experience

Requirements: Attendees will need to bring a laptop or tablet

Outline of syllabus:

This day-long workshop will focus on developing technical, operational and architectural perspectives on building and running a multi-node OpenStack cloud for research. It is targeted at systems administrators, DevOps engineers, architects and cloud application developers.

The workshop will be delivered through lectures, demonstrations and hands-on sessions, with plenty of time to dive deep into particular areas as required. All attendees will be provided with access to the NeCTAR research cloud.

Workshop topics covered:
1. Introduction to the NeCTAR cloud 
2. Architectural overview

2.1. Core services and NeCTAR nodes

2.2. Federation of OpenStack services

a. Nova, cells, etc.
b. Neutron
c. Glance
d. Swift
e. Cinder
f. Others 

3. Node-specific overview - the Melbourne Node

3.1. Infrastructure & service overview:

a. Compute Cells
b. Storage Clusters
c. Networking
d. HPC 

4. Identity and Access management

4.1. Federation (SAML & other if relevant) 

5. Monitoring physical infrastructure & cloud resources

5.1. Tempest Checks

5.2. Automated host monitoring using Nagios and PuppetDB

5.3. Ganglia

5.4. Cloud metrics - Grafana/Graphite/Carbon/collectd, Gnocchi integration, inbuilt plugins, roll your own

6. Continuous integration / continuous deployment workflows. How we make and deploy changes to the cloud.

6.1. Test cloud

6.2. Gerrit  

6.3. Jenkins  

7. Networking in the NeCTAR cloud

7.1. Neutron (private networking in the cloud)

8. Automation (Tools to make cloud support life easier)

8.1. User Notification tools.

8.2. Hivemind (Compromised instances, automated takedown procedures)  

8.3. Procedures and scripts  

9. Bonus topic(s)

9.1. Upgrades


Presenter bios: 

Sam Morrison is the Technical Lead for the NeCTAR Research Cloud, which spans 10 different data centres across Australia. In this role he is responsible for coordinating the technical team's activities in each location, in addition to OpenStack customisation work. As happy managing government funding milestone expectations as diving into Python code, Sam is a contributor to many OpenStack projects including nova, horizon, cinder, keystone, glance, ceilometer and trove. Sam has been with NeCTAR from almost the beginning, bringing it from a small proof of concept to what it is today.

Justin Mammarella is a Development and Operations engineer working at the University of Melbourne. He is part of a team that delivers the cloud infrastructure forming the Melbourne Node component of the NeCTAR research cloud. His team services 10,000+ cores and 12+ petabytes of data distributed across two data centres. With a background in Engineering and Computer Science, he specialises in integrating the hardware and software components integral to deploying and running an OpenStack research cloud. At a glance, his day-to-day job draws on the following skills: Linux System Administration (RHEL, Ubuntu), Virtualisation Technologies (QEMU, KVM, Libvirt), OpenStack (Nova, Neutron, Cinder, Glance, RabbitMQ), Server Deployment, Networking Technologies (MidoNet), Configuration Management (Puppet), Storage Clusters (Ceph, Swift, Gluster, NetApp ONTAP), Monitoring (Nagios, Ganglia, Graphite/Grafana), Software Development (C, C++, Python, Bash).