The following analysis examines the 114 survey responses from our EDAV class.
The distribution of text editor preferences shows that RStudio is heavily favored.
Breaking text editor preferences down by program, few statistics students prefer an editor other than RStudio, while 40% of data-science students do not use RStudio.
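The per-program percentages above amount to a row-normalized cross-tabulation of program against editor. The following is a minimal sketch of that calculation, assuming each response is a (program, editor) pair; the records and the function name `editor_share_by_program` are hypothetical illustrations, not the survey data or the original analysis code.

```python
from collections import Counter, defaultdict

# Hypothetical toy records (program, preferred editor); NOT the survey data.
responses = [
    ("Statistics", "RStudio"),
    ("Statistics", "RStudio"),
    ("Statistics", "RStudio"),
    ("Statistics", "Sublime"),
    ("Data Science", "RStudio"),
    ("Data Science", "RStudio"),
    ("Data Science", "RStudio"),
    ("Data Science", "Jupyter"),
    ("Data Science", "vim"),
]


def editor_share_by_program(rows):
    """Return {program: {editor: share}}; shares sum to 1 within each program."""
    counts = defaultdict(Counter)
    for program, editor in rows:
        counts[program][editor] += 1
    return {
        program: {editor: n / sum(c.values()) for editor, n in c.items()}
        for program, c in counts.items()
    }


shares = editor_share_by_program(responses)

# Share of each program's students who do not use RStudio.
non_rstudio = {p: 1 - s.get("RStudio", 0.0) for p, s in shares.items()}
```

Normalizing within each program, rather than over the whole class, keeps the comparison fair when programs differ in size.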
Generally, each skill has few experts. Most people (more than 2/3) know nothing or only a little about R advanced, R reproducible, Matlab and Github, while more people are confident about R manipulation and R graphics. Overall, the class has a good mix of skill sets.
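A claim such as "more than 2/3 know nothing or only a little" is a cumulative share over ordered experience levels. The sketch below shows one way to compute it, assuming the four levels used in the survey; the ratings and the helper `share_at_most` are hypothetical and chosen only for illustration.

```python
# Experience levels, ordered from lowest to highest.
LEVELS = ["None", "A little", "Confident", "Expert"]
RANK = {level: i for i, level in enumerate(LEVELS)}

# Hypothetical toy ratings per tool; NOT the survey data.
ratings = {
    "R advanced": ["None", "A little", "A little", "None", "Confident", "A little"],
    "R graphics": ["Confident", "A little", "Confident", "Expert", "Confident", "A little"],
}


def share_at_most(values, ceiling):
    """Fraction of ratings at or below `ceiling` in the LEVELS order."""
    cutoff = RANK[ceiling]
    return sum(RANK[v] <= cutoff for v in values) / len(values)


# Share of respondents who rate themselves "None" or "A little" for each tool.
low_share = {tool: share_at_most(v, "A little") for tool, v in ratings.items()}
```

Mapping the levels to ranks makes the ordering explicit, so the threshold can be moved (for example to "Confident") without changing the function.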
Tool expertise by program:
- Generally, Data Science students know less about the first five tools, but they are fairly good at using Github.
- Students in master's programs other than Data Science are good at R manipulation and R graphics; none of them reports knowing nothing about these two tools.
- Across all six tools, experts are a minority, and most of them are statistics majors.
- Most students, whatever their major, know only a little about these tools, which suggests that we are still far from being qualified data scientists.
The figure above shows the distribution of experience by program, categorized into four ordered levels (None < A little < Confident < Expert). A few inferences can be drawn from the graph:
A majority of the class is confident in data manipulation in R, with very few individuals in the “None” and “Expert” categories. It is noteworthy, however, that data manipulation has the highest “Expert” count of all the tools.
A majority of students have “little” to “none” expertise in reproducible research and Matlab, which suggests that the instructor could focus on these areas.
The number of students who are confident in Github is almost equal to the number who know “none” about it.
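The inferences above all come from reading a tool-by-level count table. As a rough sketch of how such a table could be built and queried, assuming each response contributes one (tool, level) pair, consider the following; the records and the function `level_table` are hypothetical and do not reproduce the survey counts.

```python
from collections import Counter

# Experience levels, ordered from lowest to highest.
LEVELS = ["None", "A little", "Confident", "Expert"]

# Hypothetical toy records (tool, level); NOT the survey data.
records = [
    ("R manipulation", "Confident"), ("R manipulation", "Confident"),
    ("R manipulation", "Expert"), ("R manipulation", "A little"),
    ("Github", "Confident"), ("Github", "None"),
    ("Github", "Confident"), ("Github", "None"),
    ("Matlab", "None"), ("Matlab", "A little"),
    ("Matlab", "None"), ("Matlab", "A little"),
]


def level_table(rows):
    """Return {tool: [count per level, in LEVELS order]}."""
    counts = Counter(rows)
    tools = sorted({tool for tool, _ in rows})
    return {tool: [counts[(tool, lvl)] for lvl in LEVELS] for tool in tools}


table = level_table(records)

# Tool with the highest number of self-reported experts.
expert_idx = LEVELS.index("Expert")
most_experts = max(table, key=lambda tool: table[tool][expert_idx])
```

In this toy table, Github has equal "None" and "Confident" counts, which is the kind of pattern described for Github above.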
This chart shows the relation between program, tool and experience level. We removed the associations between program and tool where the level of experience is None. Among Data Science master's (IDSE) students, Github and Matlab are the least known tools, and certification program students have little experience with R.reproducible. Overall, however, 30% of all students have no knowledge of Matlab, and 91% have worked with R.Manipulation.
Analysis on Tools: