Comments and answers to frequently asked questions and opinions:
Amazon Cloud
Q: Is it necessary to start/stop the instance?
A: Yes, as we pay “by the hour”, we ask you to stop (stop! not terminate!) your instance after the last exercise of the day.
Participant intro
Q: This took too long!!
A: Right! Before the break everybody used only about a minute, so we stopped being strict about the time limit and expected to finish early, but we didn’t. Maybe we will try speed dating next time.
Q: Could you provide a list of participant names with institutes and contact details?
A: We’ve made the list of participants available here, but without contact details, as participants in previous years objected to having their email addresses published online. A PDF of all slides (except those of participants who asked to be excluded) is provided here.
File Formats, VCFtools, PLINK
Q: Where was the link to the morning theory lecture?
A: Agreed, that could be improved! We aimed to keep the lecture and the practical on each topic together, and to arrange the topics and tasks in a logical order. However, due to the faculty’s time constraints this was not always possible.
Q: Why did we not start with the mapping?
A: Learning how to prepare genomic data is covered by the genomics workshop. If you’re interested in that, you may look up their exercises here. During this workshop we will mainly focus on the analysis of genomic data sets, rather than on their generation. However, in order to prepare your data for any biological interpretation, some initial adjustments are almost always necessary and we tried to address this with this exercise.
R introduction
Q: Can you provide a solution for the color plot?
A: The data.frame “si” holds information on the samples. One of its columns is named “population” and contains numbers indicating which population each individual was sampled from (in this case, different geographic locations). You can use this column as input for the “col=” argument of the “plot()” function like this: plot(pca$scores[,1], pca$scores[,2], col=si[,"population"]). Note that R needs plain straight quotes around the column name; curly quotes copied from a web page will cause a syntax error. You can also specify “pch=16” to make the points solid, and “cex=2” to make them larger.
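If you want to try the colouring without the workshop data at hand, here is a minimal sketch. It makes several assumptions: the toy “si” table and the simulated “geno” matrix are invented stand-ins, not the exercise data, and “pca” is built with princomp(), which may differ from the function used in the practical (princomp() is simply one function whose result has a “$scores” element).

```r
# Toy stand-in for the workshop objects (hypothetical values, NOT the real data)
set.seed(1)
si <- data.frame(sample     = paste0("ind", 1:30),
                 population = rep(1:3, each = 10))

# Simulated numeric matrix, 30 individuals x 5 variables;
# adding the population code shifts each population slightly
geno <- matrix(rnorm(30 * 5), nrow = 30) + si$population

# Assumption: princomp() is used here because its result has $scores
pca <- princomp(geno)

plot(pca$scores[, 1], pca$scores[, 2],
     col  = si[, "population"],  # population codes 1, 2, 3 pick palette colours
     pch  = 16,                  # solid points
     cex  = 2,                   # larger points
     xlab = "PC1", ylab = "PC2")

# Optional legend so the colours can be read back as populations
pops <- sort(unique(si$population))
legend("topright", legend = pops, col = pops, pch = 16, title = "population")
```

The only requirement is that the rows of “si” are in the same order as the rows of the PCA scores; otherwise the colours will be assigned to the wrong individuals.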