Commit fa0fbbd7 authored by Warris, Sven

Update course information for the November session

parent 045ae6d6
R-Big data
Effective Analysis of Big Data in R
==========
Welcome to the R Big Data course repository. This contains information for a number
Welcome to the Effective Analysis of Big Data in R course repository. This contains information for a number
of editions of the course, under several names and with slightly
different content:
* [Slides](https://git.wageningenur.nl/warri004/r-big-data/tree/master/slides)
* [Course documentation PE&RC edition, 2021](https://git.wageningenur.nl/warri004/r-big-data/raw/master/courseInfo_PERC_2021.pdf)
* [Course documentation, edition: 2021](https://git.wageningenur.nl/warri004/r-big-data/raw/master/courseInfo.pdf)
* [Data](https://git.wageningenur.nl/warri004/r-big-data/tree/master/data)
* [Examples](https://git.wageningenur.nl/warri004/r-big-data/tree/master/examples)
......
---
title: "Course information"
author: "Sven Warris"
date: "October 14/15, 2021"
date: "November 4/5, 2021"
output: pdf_document
---
Welcome to the R Big Data course. This document contains some basic information on the course.
Welcome to the *Effective Analysis of Big Data in R* course. This document contains some basic information on the course.
## Course introduction
......@@ -25,12 +25,11 @@ plenty of time for hands-on exercises.
## RStudio server
There is an RStudio server available at [warris.wurnet.nl](http://warris.wurnet.nl:8787), only during the course.
There is an RStudio server available at [rstudio.containers.wur.nl](https://rstudio.containers.wur.nl), only during the course.
## Course code repository
[Main Wageningen GIT repository R Big
Data](https://git.wageningenur.nl/warri004/r-big-data)
[Main Wageningen GIT repository Effective Analysis of Big Data in R](https://git.wageningenur.nl/warri004/r-big-data)
## The lecturers
......@@ -39,33 +38,38 @@ Data](https://git.wageningenur.nl/warri004/r-big-data)
## Programme
Wageningen University Campus, Orion Building, Room B3040. Please note that you have to bring your own laptop. Lunch is
provided and there will be coffee breaks. The schedule is as follows:
### October 14
* 9.00 Welcome and introduction
* 9.30 The Big Data issue
* 10.30 Coffee break
* 11.00 Profiling and efficient R programming
* 12.30 Lunch
* 13.30 Profiling and efficient R programming (contd.)
* 15.00 Coffee break
* 15.30 Reproducible research
* 17.00 Q&A
* 17.30 Closing
### October 15
* 9.00 R memory management and big data
* 10.30 Coffee break
* 11.00 HPC in R
* 12.30 Lunch
* 13.30 Big data analyses: a case of machine learning
* 15.00 Coffee break
* 15.30 Big data analyses: a case of machine learning (contd.)
* 16.30 Q&A
* 17.00 Evaluation and closing
Location: *Wageningsche Berg*. Please note that you have to bring your own laptop. Lunch and dinner are provided and there will be coffee breaks. The schedule is as follows:
### November 4
* 08:30 Registration and welcome with coffee and tea
* 09:00 Get to know each other & introduction
* 09:30 The Big Data issue by Ron Wehrens
* 10:30 Break
* 11:00 Profiling and efficient R programming by Ron Wehrens
* 12:30 Lunch
* 13:30 Profiling & efficient R programming (contd.) by Ron Wehrens
* 15:00 Break
* 16:00 Reproducible research by Sven Warris
* 17:00 Q&A
* 17:30 Closing day 1
* 18:00 Joint dinner at course venue
### November 5
* 08:45 Welcome with coffee and tea
* 09:00 R memory management and big data by Ron Wehrens
* 10:30 Break
* 11:00 HPC in R by Sven Warris
* 12:30 Lunch
* 13:30 Big data analyses: a case of machine learning by Sven Warris
* 15:00 Break
* 15:30 Big data analyses: a case of machine learning (contd.) by Sven Warris
* 16:30 Q&A
* 17:00 Evaluation
* 17:15 Closing course
## Course information
......
---
title: "High performance computing with R"
author: "Sven Warris"
date: "October 2021"
date: "November 2021"
output: beamer_presentation
---
......@@ -138,6 +138,18 @@ Cons:
# Exercise
Let's go shopping!
a) one person, one list
b) group of 5 people, one list
c) group of 5 people, random items from the list
d) group of 5 people, items from the list per section of the shop
How will this work out in practice with 1, 3, 5 or 7 lanes open?
![Floor plan](./images/floorplan.jpg){width=50%}
# Exercise
On my computer (_20 cores, 0.5TB RAM_), I have 5 HiSeq read files (10Gb each,
2 million reads per file) on a network share. The first step in processing these
reads consists of trimming low quality base calls at the end of each read. Then
......@@ -154,6 +166,8 @@ files at once without problems
cluster. The files are stored in files of 100,000 reads each. Trimming is done
through jobs streaming the data to local SSDs.
# Algorithmic approaches
Basic idea:
......@@ -430,17 +444,6 @@ c(12,41,53,8,9,2,11,7)
* Design and implement an algorithm that finds the _max_ value in _log2(N)_ steps.
* Time _%do%_ and _%dopar%_ for several lengths of the list
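
One possible starting point for this exercise is a pairwise ("tournament") reduction: every round compares neighbouring pairs and keeps the larger value, so the number of candidates halves each round. The sketch below is an assumption about what a solution could look like, not the course's reference answer. The function name `tree_max` is hypothetical, and the comparisons within a round are done with base R's vectorised `pmax` as a stand-in for the independent iterations that a `%do%` or `%dopar%` loop would run.

```r
# Pairwise ("tournament") reduction: each round compares neighbouring pairs,
# so N values are reduced to one in ceiling(log2(N)) rounds.
# tree_max is a hypothetical name used for illustration only.
tree_max <- function(x) {
  stopifnot(length(x) > 0)
  rounds <- 0
  while (length(x) > 1) {
    n <- length(x)
    left  <- x[seq(1, n - 1, by = 2)]  # elements 1, 3, 5, ...
    right <- x[seq(2, n, by = 2)]      # elements 2, 4, 6, ...
    # The comparisons in one round are independent of each other;
    # this is the part a parallel loop could distribute over workers.
    winners <- pmax(left, right)
    # With an odd number of candidates, the last one advances unchanged.
    if (n %% 2 == 1) winners <- c(winners, x[n])
    x <- winners
    rounds <- rounds + 1
  }
  list(max = x, rounds = rounds)
}

values <- c(12, 41, 53, 8, 9, 2, 11, 7)
result <- tree_max(values)
print(result$max)     # 53
print(result$rounds)  # 3, which equals log2(8)
```

For short lists the cost of starting workers will probably outweigh any gain from running the comparisons in parallel, which is what the timing part of the exercise is meant to show.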
# Exercise
Let's go shopping!
a) one person, one list
b) group of 5 people, one list
c) group of 5 people, random items from the list
d) group of 5 people, items from the list per section of the shop
How will this work out in practice with 1, 3, 5 or 7 lanes open?
![Floor plan](./images/floorplan.jpg){width=50%}
# Serial vs. Parallel
......@@ -485,6 +488,8 @@ scatter3D(x, y, z, theta=45, xlab=xLabel, ylab=yLabel,
Chloroplast vs Chloroplast
# End of HPC
This is the end of the part on High Performance Computing
\ No newline at end of file
---
title: "Big data and machine learning"
author: "Sven Warris & Christina Papastolopoulou"
date: "October 2021"
date: "November 2021"
output: beamer_presentation
---
......
---
title: "Reproducible Research"
author: "Sven Warris & Maikel Verouden"
date: "October 2021"
date: "November 2021"
output: beamer_presentation
---
......@@ -72,7 +72,7 @@ Examples:
* Co-develop
* Develop-test-release cycle support
* Many, many tools available
* Integrated into R
* Integrated into R=
* [WUR Git lab](https://git.wur.nl)
* [github](http://www.github.com)
* Allows for **DOI**!
......