Social Network Analysis with Gephi

The next software package that we were introduced to in the DALMOOC was Gephi, which is an open source tool for conducting social network analysis. I found Gephi an easier tool to use than Tableau, and it was fairly straightforward to load the sample data that was provided and start analysing it.

These were the results of my analysis to determine the density and centrality measures of each dataset :

For the example_1 dataset:

Example_1

For the example_2 dataset:

Example_2

For the CCK11 dataset (Twitter network):

CCK_Twitter

For the CCK11 dataset (blog network):

CCK_blogs

These were the results of using the Giant Component filter, and then determining the modularity for each dataset:

For the example_1 dataset:

Example_1 modularity

For the example_2 dataset:

Example_2 modularity

For the CCK11 dataset (Twitter network):

CCK Twitter modularity

For the CCK11 dataset (blog network):

CCK blogs modularityIt was also fun to play around with the various network representations, and the options for partitioning and highlighting various properties of the network. This is the example_1 network with a few changes made to it: it’s in the Fruchterman Reingold representation, nodes are sized according to betweenness centrality, labels are turned on, and each community is a different colour

Example_1 extra

Here’s the example_2 network with similar changes:

Example_2_extra

And for the CCK dataset (Twitter network):

CCK_Twitter_extra

And finally the CCK dataset (blogs network):

CCK_blogs extraI found these exercises a useful way to get some experience with social network analysis, and I have some ideas of how I could use Gephi in a project that I’m working on.

 

Data wrangling with Tableau

The first hands-on assignment for the Data, Analytics and Learning MOOC was designed to give us some experience with using the Tableau software package to analyse and visualise data. It was a straightforward process to download and install the software, then it was time to find some data to analyse. I decided to use the data about overseas students who had come to study in Australia, for the period 2004-1013, from the Australian Higher Education Statistics. Before I could import the data into Tableau, I had to do a bit of cleanup on it. I had to combine the data from each year into a single spreadsheet, and I also had to delete countries which were not listed in the data every year. I wanted to compare the number of students coming from each country to see which countries had grown and which had shrunk. One of the tables had a column for “Country of permanent residence”, so that’s what I used. The source data is limited to countries with more than 20 students, which is why there is variation in the number of countries which are included from year to year.

After a bit of fiddling with the dimensions, measures and table calculations, I managed to produce the map I was after.

Overseas students mapIn order to create the map, I used the “Table Calculation” function to calculate the percentage difference between 2004 and 2013. This produced a map for each year, so I used the “Hide” command to hide the results for every year except 2013, and bingo – I had my map.

All in all I found working with Tableau fairly straightforward, although I did find it took a bit of trial and error to produce the analysis I was after. However, the aim of the assignment (and DALMOOC in general) wasn’t to turn us into Tableau experts, but to expose us to some of the tools which can be used for data analysis and visualisation. I now have enough of an idea of how Tableau works to be able to consider how I can use it in the future. It will be interesting to see how my introduction to Tableau compares to the other software packages that we’ll use over the next few weeks.