Digital Tools for Research

This guide provides information about digital tools that can be useful for research data management and analysis.

Gephi

Getting Started with Gephi

When the program is launched, a welcome window appears where you need to select "Open graph file." Gephi ships with a few sample datasets, but this guide uses an external one: a co-occurrence network of characters in the first book of the Game of Thrones series, which you can download from this repository or from Kaggle.

After importing, Gephi will display a report with the graph's characteristics, as well as the number of nodes and edges.
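If you want to cross-check the numbers in the import report, the node and edge counts can be computed directly from the edge list. The following is a minimal sketch, assuming the file is a CSV with "Source" and "Target" columns; the column names and the three sample rows are illustrative assumptions, not taken from the dataset itself.

```python
import csv
import io

# Hypothetical three-row sample in the assumed Source,Target,weight layout.
# Replace io.StringIO(SAMPLE) with open("your_file.csv", newline="") for real data.
SAMPLE = """Source,Target,weight
Arya-Stark,Jon-Snow,7
Arya-Stark,Eddard-Stark,12
Eddard-Stark,Robert-Baratheon,20
"""


def summarise_edge_list(handle):
    """Return (node_count, edge_count) for an edge-list CSV."""
    nodes = set()
    edge_count = 0
    for row in csv.DictReader(handle):
        # Every distinct name in either column counts as one node.
        nodes.add(row["Source"])
        nodes.add(row["Target"])
        edge_count += 1
    return len(nodes), edge_count


node_count, edge_count = summarise_edge_list(io.StringIO(SAMPLE))
print(f"Nodes: {node_count}, Edges: {edge_count}")
```

Counting distinct names this way treats each row as one edge, so duplicate rows would be counted twice.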

Immediately after loading, the graph will open in the "Overview" tab. It will look like this.

NB! Gephi doesn't have an "Undo" button, so be careful with changes to avoid having to redo everything from scratch!

Data Laboratory

Before working with the graph, you may need to make some adjustments. This can be done in the "Data Laboratory" tab, which is designed for manipulating source data. For example:

  • adding nodes and edges manually;
  • adding, deleting and merging columns;
  • creating new columns based on regular expressions;
  • editing text and numbers in the cells;
  • importing and exporting data as a spreadsheet.

When we imported the Game of Thrones dataset from CSV, the first column, which contains the character names, was interpreted as the ID, but no separate "Label" column was created. We will need one later to display node labels. To create it, duplicate the contents of the ID column into a new column called "Label" in the "Data Laboratory" tab. You can also replace all dashes with spaces in the "Label" column for better readability.

First, choose the column from which you'd like to copy data. Then, in the pop-up window, choose the destination column.

      

After this manipulation, the table will look like this.
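The same transformation can be sketched outside Gephi, for example if you prefer to prepare the node table before importing it. This is only an illustration: the "Id" key and the sample names are assumptions about how the rows are represented.

```python
def add_label_column(rows):
    """Copy each row's Id into a new Label field, with dashes turned into spaces."""
    return [{**row, "Label": row["Id"].replace("-", " ")} for row in rows]


# Hypothetical node rows, shaped like the ID column described above.
nodes = [{"Id": "Jon-Snow"}, {"Id": "Arya-Stark"}]
for node in add_label_column(nodes):
    print(node["Id"], "->", node["Label"])
```

The ID itself is left untouched, because the edges refer to it; only the display label changes.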

Node Size and Colour

Now, let's go back to the "Overview" tab. To make the graph more visually informative, you can adjust the colour and the size of the nodes and edges according to their attributes. Let's start by calculating the modularity to divide the graph into communities. This can be done in the "Statistics" tab on the workspace panel on the right.
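To give an intuition for what the modularity score measures, here is a minimal sketch that computes it for a given division of a small undirected, unweighted graph. It does not search for communities the way Gephi does; it only scores a partition you hand to it. The toy graph and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs; community: dict mapping node -> community id.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends inside a community
    degree_sum = defaultdict(int)  # summed degree of a community's nodes
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    # Q = sum over communities of (share of internal edges - expected share).
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy graph: two triangles joined by a single bridge edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

print(round(modularity(edges, two_groups), 3))
print(round(modularity(edges, one_group), 3))
```

A higher score means more edges fall inside communities than would be expected by chance, which is why splitting the toy graph into its two triangles scores better than keeping everything in one group.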

Now let's move on to changing the colour of the nodes and edges. This is done using the palette icon in the "Appearance" tab on the left workspace panel. By default, all nodes and edges are coloured in one colour (Unique), but we will colour them according to the communities they belong to (Partition > Modularity Class). You can also choose any other attribute you have in your data, or colour the nodes according to their degree (Ranking > Degree).

After changing the colours, your graph will look like this.

In the bottom-right corner, you will see a small label "Palette." By clicking on it, you can choose the colours that the nodes will be coloured in. Default palettes have only 8 colours, but if you have a larger graph, you may want more colours. For such cases, Gephi provides the option to generate a palette: to have as many colours as there are different values for the selected attribute, simply uncheck "Limit number of colours," then click "Generate" and "OK". You can also choose the style of the palette (under the Presets parameter): pastel tones, dark colours, vivid colours, etc.
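The idea of generating one colour per attribute value can be sketched as follows. This is not Gephi's palette generator; it simply spaces hues evenly around the colour wheel, and the saturation and brightness defaults are arbitrary assumptions.

```python
import colorsys


def generate_palette(n, saturation=0.65, value=0.9):
    """Return n hex colours with evenly spaced hues."""
    colours = []
    for i in range(n):
        r, g, b = colorsys.hsv_to_rgb(i / n, saturation, value)
        colours.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colours


# Hypothetical modularity classes, one colour for each distinct value.
classes = [0, 1, 2, 0, 3, 1]
distinct = sorted(set(classes))
palette = dict(zip(distinct, generate_palette(len(distinct))))
for modularity_class, colour in palette.items():
    print(modularity_class, colour)
```

Lowering the saturation gives pastel tones and lowering the value gives darker ones, which roughly corresponds to choosing a different preset.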

    

Now you can adjust the node sizes — by default, they are all the same. To do this, click on the icon with circles (to the right of the palette) in the "Appearance" window on the left panel. The remaining two icons control the colour and size of the labels.
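Sizing nodes by a ranking such as degree amounts to mapping each node's value onto a range between a minimum and a maximum size. Here is a minimal sketch, assuming a simple linear mapping; the size limits and the toy edges are illustrative values, not Gephi defaults.

```python
from collections import Counter


def degree_sizes(edges, min_size=10.0, max_size=40.0):
    """Map each node's degree linearly onto the range [min_size, max_size]."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    if not degree:
        return {}
    lo, hi = min(degree.values()), max(degree.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all degrees are equal
    return {node: min_size + (d - lo) / span * (max_size - min_size)
            for node, d in degree.items()}


# Toy star graph: "a" is connected to three other nodes.
sizes = degree_sizes([("a", "b"), ("a", "c"), ("a", "d")])
for node, size in sorted(sizes.items()):
    print(node, size)
```

The best-connected node gets the maximum size and the least-connected nodes get the minimum, with everything else in between.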

After these adjustments, you should get a visualisation similar to this. You can zoom in and out using the mouse wheel. The magnifying glass button at the bottom of the toolbar to the left of the workspace centers the graph.

Graph Layout

By default, the graph layout is random, i.e. the position of nodes and their proximity to each other do not convey any meaning. Let's make the visualization more meaningful using a layout algorithm. The "Layout" menu is located in the bottom left corner. Here is a summary of some of the layout algorithms available in Gephi 0.10 (you can download the official tutorial in PDF for a more detailed explanation).

  1. Contraction: brings nodes closer together
  2. Expansion: the opposite of contraction, increases space between nodes
  3. ForceAtlas: a force-directed algorithm, ideal for most small world/scale-free networks
  4. ForceAtlas 2: an improved version of ForceAtlas, offering faster performance and better scalability
  5. Fruchterman-Reingold: another force-directed layout that distributes nodes in a way that minimizes edge crossings; works with undirected graphs
  6. Label Adjust: stops labels from overlapping each other
  7. Noverlap: stops nodes from overlapping each other
  8. OpenOrd: a layout similar to Fruchterman-Reingold that emphasizes community structures (clusters) within the network; works with undirected graphs
  9. Random Layout: randomises the position of nodes
  10. Rotate: rotates the graph by 90 degrees clockwise
  11. YifanHu: a fast algorithm that’s particularly useful for large networks; combines a force-directed model with a graph coarsening technique (multilevel algorithm) to reduce the complexity
  12. YifanHu Proportional: similar to YifanHu, but node sizes are proportional to a specific metric or attribute value
  13. *Circular Layout: orders the nodes by any metric or attribute and positions them in a circle; useful for emphasizing cyclic structures
  14. *Geo Layout: if your nodes have geographical attributes, this layout positions them based on their real-world locations

*You have to install plugins from Tools > Plugins > Available Plugins to use these layout algorithms.

Let's run the Fruchterman-Reingold algorithm and observe how nodes of the same colour, belonging to the same community, are being drawn closer together.
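To show why connected nodes end up close together, here is a simplified sketch of a force-directed layout in the spirit of Fruchterman-Reingold: every pair of nodes repels, connected nodes attract, and a decreasing "temperature" limits how far a node may move per step. It is not Gephi's implementation; the parameter names, defaults and cooling rate are assumptions, and refinements such as keeping nodes inside a frame are left out.

```python
import math
import random


def fruchterman_reingold(nodes, edges, iterations=100, size=1.0, seed=42):
    """Return {node: [x, y]} after a simplified force-directed layout run."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(0, size), rng.uniform(0, size)] for n in nodes}
    k = size / math.sqrt(len(nodes))  # ideal distance between nodes
    temperature = size / 10           # maximum move per iteration
    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Every pair of nodes repels with force k^2 / distance.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                disp[u][0] += dx / dist * force
                disp[u][1] += dy / dist * force
                disp[v][0] -= dx / dist * force
                disp[v][1] -= dy / dist * force
        # Connected nodes attract with force distance^2 / k.
        for u, v in edges:
            dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            disp[u][0] -= dx / dist * force
            disp[u][1] -= dy / dist * force
            disp[v][0] += dx / dist * force
            disp[v][1] += dy / dist * force
        # Move each node along its net force, capped by the temperature.
        for n in nodes:
            length = math.hypot(*disp[n]) or 1e-9
            step = min(length, temperature)
            pos[n][0] += disp[n][0] / length * step
            pos[n][1] += disp[n][1] / length * step
        temperature *= 0.95  # cool down so the layout settles
    return pos


# Toy graph: two triangles joined by a single bridge edge.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
for node, (x, y) in fruchterman_reingold(nodes, edges).items():
    print(f"{node}: ({x:.2f}, {y:.2f})")
```

Because densely connected groups pull on each other more than on the rest of the graph, communities tend to form visible clusters, which is the effect you see in Gephi while the layout is running.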

    

You don’t have to wait for the layout process to finish. Some algorithms, like ForceAtlas, do not stop on their own and would otherwise run indefinitely: press "Stop" once you're happy with the result. Others, like Fruchterman-Reingold, finish by themselves, but you can cancel them earlier.

If you feel that the nodes are too close together, you can use the "Expansion" layout to increase the space between them.
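Conceptually, expansion and contraction are the same operation with different scale factors. Here is a minimal sketch, assuming the positions are scaled about their centroid; the factor of 1.2 is an illustrative default, not necessarily the one Gephi uses.

```python
def expand(positions, factor=1.2):
    """Scale node positions away from (or towards) their centroid.

    positions: dict mapping node -> (x, y). A factor above 1 expands
    the layout; a factor below 1 contracts it.
    """
    cx = sum(x for x, _ in positions.values()) / len(positions)
    cy = sum(y for _, y in positions.values()) / len(positions)
    return {node: (cx + (x - cx) * factor, cy + (y - cy) * factor)
            for node, (x, y) in positions.items()}


# Hypothetical positions of two nodes, expanded by a factor of 2.
positions = {"a": (0, 0), "b": (2, 0)}
print(expand(positions, factor=2))
```

The relative arrangement of the nodes stays the same; only the distances between them change.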

Node Labels

The image has become more representative, but it clearly lacks node labels. There is a toolbar below the graph area for working with them. To make the labels appear, click on the black letter T; to the right, you can select the colour, font and size.

Now, the graph will look like this.

After enabling labels, you will likely notice that they overlap. To avoid this, you need to run the "Label Adjust" algorithm in the "Layout" window. The images below show the graph before and after applying the layout.

Preview Tab

Finally, in the last tab, "Preview," you can see a nicely rendered version of the graph instead of the working version. The only thing to keep in mind is that you will need to enable the labels again, but this time using the toolbar on the left within this tab.

If you're working with a large graph, it's best to uncheck the "Proportional size" option and slightly increase the font size. The rest of the label settings can be adjusted to your preference.

Most likely, at first, you will just see a blank white field without the graph. To render it, you need to click the "Refresh" button at the bottom. The same applies after any changes—if you want to see the updates, you must click "Refresh" again.

Export Graph as an Image

The graph can be saved as a PNG, SVG or PDF. If you need a small file and detail is not important, it's better to choose PNG. However, if you want to examine the graph at any zoom level without losing quality, it's better to choose PDF or SVG. The graph can be exported in two ways: using the corresponding button in the bottom left corner of the "Preview" tab or via the "File > Export" menu.

Export Graph as a Dynamic Visualisation

Additionally, you can export a dynamic graph and upload it online. To do this, you need to install the Sigma Exporter plugin from the "Tools > Plugins > Available Plugins" menu and then restart Gephi. Don't forget to save your project ("File > Save Project" or Ctrl + S) before restarting, to avoid losing your work!

After that, you can export the graph as a Sigma.js template by selecting "File > Export > Sigma.js template".

In the window that appears, you need to specify the folder path where you want to export the project and fill in the details for the legend: the title, a brief description of the graph, and what the nodes, edges and their colours represent.

After this, a folder named "network" will appear in the specified directory. 

Hosting Dynamic Graph on GitHub

You can upload your interactive graph to GitHub to make it publicly available as a simple one-page website. Please register here and have a look at this guide to get familiar with GitHub.

All the files and subfolders from the "network" folder need to be uploaded to a new repository on GitHub. If you forgot to include something during the export, you can manually edit the config.json file by opening it in a text editor. Here is a list of the files that should be in your repository.
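After editing config.json by hand, it is worth checking that the file is still valid JSON before uploading it, since a missing comma or quote is an easy mistake to make. The following is a minimal sketch; the function name is hypothetical and the check does not validate the meaning of any individual setting.

```python
import json
from pathlib import Path


def load_config(folder):
    """Parse config.json in the exported folder; raise ValueError if it is broken."""
    path = Path(folder) / "config.json"
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as err:
        raise ValueError(f"{path} is not valid JSON: {err}") from err


# Only runs the check if the exported folder is in the current directory.
if (Path("network") / "config.json").exists():
    load_config("network")
    print("config.json parsed without errors")
```

If the file is broken, the error message includes the line and column where parsing failed, which usually points close to the typo.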

Once you have uploaded all the necessary files to your GitHub repository, go to its settings and scroll down to the "GitHub Pages" section. Then, in the "Source" dropdown, select the branch that holds your files (called "master" in older repositories and "main" in newer ones) and click the "Save" button. When you refresh the page, you will see the link to your interactive visualisation.

It should look something like this. If you want, you can experiment with the index.html file to remove elements you don't like (for example, the empty space at the top of the legend).

Please check that all the selectors in the published visualisation still work!