Chapter 15 Web Topics
15.1 Software for Network Measures
Introduction
A number of different fields have developed network analysis tools independently of each other. Often, the same measure has been developed for different users and given different names. The recent perception that different fields were often interested in the same network properties has encouraged general reviews that clarify equivalences and similarities (Albert and Barabási 2002; Barabási and Bonabeau 2003; Dorogovtsev and Mendes 2003; Newman 2003; Boccaletti et al. 2006; Croft et al. 2008; Dorogovtsev et al. 2008; Krause et al. 2009). Even with a correction for synonyms, network measures remain diverse. A network can have structure at any of multiple hierarchical levels: one can extract separate measures for each level independently, or apply measures that examine the degree to which the structures at different levels are correlated. In this regard, network measurement tools face the same challenges as tools used to characterize spatial pattern (Bivand et al. 2008).
Below, we provide general background on network measurement software and URLs to a variety of network analysis packages:
Software environments
There are several standard environments in which one can run network analysis software. Before listing options, it is important to decide which environment may be best for you:
- Platform specific software: Programs written to run natively on computers are often platform specific (e.g., Linux, Mac, Windows). In general, there are more network software programs written for Windows, but the increasing availability of platform-independent software (see below) is changing that bias rapidly.
- Java-based software: Programs written in the Java programming language can usually run on any platform as long as Java interpreting tools are loaded on the host computer. Most modern computers now come with Java interfacing when purchased. If yours does not, check with relevant websites to see what you need to download the suitable Java software and install it.
- Matlab routines: Matlab is a programming environment originally designed for matrix mathematics. Matlab comes in separate versions for Linux, Macs, and Windows computers and is fairly expensive (although many universities and colleges have site licenses and reduced rates for faculty and students). Matlab provides thousands of very useful built-in functions that one can then combine, using a simple Matlab language, into very complex sequences of procedures. These can be saved and distributed widely. Once you have Matlab installed on your computer, there are thousands of Matlab procedures that you can download (usually free). Several of the network packages listed below run in Matlab. You can find out more about Matlab at their website (http://www.mathworks.com/products/matlab/).
- R: R is a programming environment similar to Matlab except that it was originally aimed at statistics. Like Matlab, it comes with many useful functions that can be combined using a simple R language into very complex procedures. Also like Matlab, there are thousands of useful procedures that others have written and are happy to share. Unlike Matlab, R is free. It runs on all platforms. Check out the R site for details (http://www.r-project.org/).
- GRAPHML (http://en.wikipedia.org/wiki/GraphML): XML defines standards for the format and syntax in which data can be shared over the internet (http://www.w3.org/TR/2001/REC-xmlschema-0-20010502/). Such standards are needed if multiple programs are to be able to exchange data. GRAPHML (http://graphml.graphdrawing.org/) is a subset of the overall XML schema that specifies how network data should be formatted for exchange and use in XML-based analysis programs. To understand these standards, one should be roughly familiar with the general XML principles and syntax. GXL (http://en.wikipedia.org/wiki/GXL) is a similar XML standard for graphical data used by some network graphics programs.
Overview sites
- Wikipedia (http://en.wikipedia.org/wiki/Social_network_analysis_software): The Wikipedia article on “Social network analysis software” begins with a brief overview of the kinds of network software available, and then provides a table with a list of different software packages, for what purposes they can be used, the available platforms and environments, costs, and relevant details. Links in the table can take you directly to the source site.
Data setup
The data needed to construct a network usually include a set of nodes (usually individuals), a set of links between nodes (which may be unweighted or weighted, signed or unsigned, fixed or dynamically valued, etc.), and a list of attributes for each node (e.g., sex, age, morph, etc.). Many network analyses begin with the construction of an association matrix. This is a two-dimensional matrix that codes the relationships between each node (usually an individual) and each other node. Codes can be discrete (usually 1’s and 0’s) or continuous (weighted links). There are many pitfalls and complexities in setting up association matrices properly. Alternatively, one can provide a program with node and edge lists: if the network is sparse, node and edge lists can be much more economical than setting up the full association matrix. Instead, one generates separate lists of nodes, links, and attributes. Such lists are usually provided as simple ASCII files but one has to be careful to follow any syntax or format restrictions for the particular program that will use the lists.
Some relevant resources for setting up network data:
- SOCPROG (http://myweb.dal.ca/hwhitehe/social.htm): This extensive series of association matrix tools (and some network analysis measures) was developed by Hal Whitehead at Dalhousie University. The package is free, but note that it consists entirely of Matlab procedures. You thus would need to have Matlab (particularly the Statistics Toolbox) installed on your computer to use this package. Because the cell values in an association matrix are not independent of each other, one cannot use standard statistics to test hypotheses (e.g., whether the distribution of values in the matrix differs significantly from random assignments). Whitehead thus provides a number of tools for performing such tests (e.g., Mantel tests) that are valid despite the dependence of values. He also describes methods that extract latent orthogonal variables from similarity/distance matrices which can be used in standard parametric statistics (e.g., multidimensional scaling, principal coordinates analysis).
- NETWORK (http://www.jstatsoft.org/v24/i02/): This site provides access to a package written for use in R that describes the format and syntax in which network data can be summarized into lists before feeding it to various network analysis programs in R.
- GRAPHML (http://graphml.graphdrawing.org/primer/graphml-primer.html): This site provides a very illustrative primer on how to format network data in GRAPHML. A number of platform-independent programs can import GRAPHML data files.
- PAJEK (http://pajek.imfm.si/doku.php?id=pajek): PAJEK is both an analysis package (see below) and has its own format for importing and exporting network data. Some other analysis programs will import PAJEK data (e.g., GELPHI).
Network analysis software
Most modern software packages provide various tools for importing data (association matrices or data lists), applying filters, various modes of visualizing the input networks, a variety of network connectivity, centrality, and attribute assortativity measures, and tools for identifying communities, modules, and cliques. Where statistics are feasible, suitable statistical test tools may also be included. Most programs allow one to export images of the drawn network and various measure values. Note that network analysis continues to be a very dynamic field: most of the packages listed below are continually being upgraded and extended. Here is a sampling of some of the current software packages:
- UCINET ( http://www.analytictech.com/ucinet6/ucinet.htm): This program is probably the easiest to master and most of the tools that most people will need. The program is currently available for the Windows platform only (although it works just as well in Windows emulator environments on Macintosh and Linux machines). It can be downloaded for free for 60 days and reduced rates on licenses are available for faculty and students. Data can be entered as an association matrix or as a list of nodes and links. It has been widely used in a variety of disciplines including behavioral ecology (Croft et al. 2008).
- MULTINET (http://www.sfu.ca/personal/archives/richards/Multinet/Pages/multinet.htm): This package has not been updated in some time, but has some nice functions, particularly eigen-analysis tools. It only runs on Windows platforms.
- PAJEK (http://pajek.imfm.si/doku.php?id=pajek): This is another free Windows program that has been widely used in biological contexts. Books are available explaining how to extract and interpret PAJEK measures. Although it has been available for some time, it was recently updated.
- STATNET (http://statnetproject.org/): STATNET is a free suite of procedures written for R. You must have R (or an equivalent derivative such as SPLUS) installed on your computer to use this software. However, because R can be installed on any platform, this means the units are platform independent. A nice feature is that the possible tasks one might consider are largely broken down into separate modules. These are easily interfaced, but one need download only those units that are relevant to your particular interests. There are good references available for most modules.
- TNET (http://toreopsahl.com/tnet/): These free units are also written to run in R and thus can be used on any platform that has R installed. They focus particularly on weighted networks and networks where there are two concurrent sets of attributes present.
- JUNG (http://jung.sourceforge.net/index.html): This free package is written in JAVA and so should run on any platform with suitable JAVA interpretation tools installed. Its introductory page says that it implements “a number of algorithms from graph theory, data mining, and social network analysis, such as routines for clustering, decomposition, optimization, random graph generation, statistical analysis, and calculation of network distances, flows, and importance measures (centrality, PageRank, HITS, etc.).” JUNG also provides a number of network visualization tools. It will handle both weighted and unweighted networks.
- GEPHI (http://gephi.org/): GEPHI is a free JAVA package that will run on any platform. It accepts a wide variety of input data formats and provides a wide range of tools for coloring, rotating, expanding and annotating images of networks. It also provides a standard set of measurement tools.
- NETWORKBENCH (http://nwb.cns.iu.edu/): NETWORKBENCH is a free JAVA-based package that should run on any platform. However, check the FAQ for extra items that may need to be installed to run these modules on a Mac. This program accepts a wide variety of import data files include GRAPHML, PAJEK, and various list files. The package is now quite extensive and it may take one a bit of time to master some of the more sophisticated tools.
- VISONE (http://visone.info/html/about.html): This free package (for non-commercial users) is also written in JAVA and should run on any platform. Some documentation and relevant publications are available. As with GEPHI, the package emphasizes visual inspection of networks.
Visualization tools
Some available software largely focuses on visualizing networks. Some useful sources:
- NETDRAW (https://sites.google.com/site/netdrawsoftware/home): This is a free Windows program that draws network structures given a variety of input data file types. It complements the UCINET program that is produced by the same source.
- Y-ED (http://www.yworks.com/en/products_yed_about.html): This is part of an extensive JAVA library called yFiles (http://www.yworks.com/en/products_yfiles_about.html). The free yED graphic editor has its own GUI for importing and manipulating data. The program accepts a number of import file types, though favoring GraphML, and can save the resulting images in various formats.
- GRAPHVIZ (http://graphviz.org/About.php): These routines are largely designed for implementation in other programs and packages. Readers who want to import them into their own programs or procedures may want to consult this site.
Extensions
Here are a few additional sites that may be of interest:
- CFINDER (http://cfinder.org/): This free JAVA program has versions for any platform. It is designed to identify communities and modules in networks. Input files are simple text file lists of links with an optional modifier indicating link weights.
- NETWORK FRACTALS: As discussed in Chapter 15, one can use a box-covering algorithm to compare the number of boxes needed to cover a network at various box sizes to determine the relative fractal versus small world nature of that network (Song et al. 2005, 2006; Song et al. 2007; Rozenfeld and Makse 2009; Rozenfeld et al. 2010). The algorithms used by these authors can be obtained at H.A. Makse’s website (http://www-levich.engr.ccny.cuny.edu/~hmakse/soft_data.html). Go to the bottom of the page to the section "Research on Fractal Complex Netorks" for links. Note that these must be run in the Python NetworkX environment (http://networkx.lanl.gov/). There may be other routines at NetworkX of interest. For another algorithm for determining fractal dimension of networks see Locci et al. (http://portal.acm.org/citation.cfm?id=1852511).
Literature cited
Albert, R. and A.L. Barabási. 2002. Statistical mechanics of complex networks. Reviews of Modern Physics 74: 47–97.
Barabási, A.L. and E. Bonabeau. 2003. Scale-free networks. Scientific American 288: 60–69.
Bivand, R.S., E.J. Pebesma and V. Gómez-Rubio. 2008. Applied Spatial Data Analysis with R. New York NY: Springer Science+Business Media.
Boccaletti, S., V. Latora, Y. Moreno, M. Chavez and D.U. Hwang. 2006. Complex networks: Structure and dynamics. Physics Reports—Review Section of Physics Letters 424: 175–308.
Croft, D.P., R. James and J. Krause, eds. 2008. Exploring Animal Social Networks. Princeton, NJ: Princeton University Press.
Dorogovtsev, S.N., A.V. Goltsev and J.F.F. Mendes. 2008. Critical phenomena in complex networks. Reviews of Modern Physics 80: 1275–1335.
Dorogovtsev, S.N. and J.F.F. Mendes. 2003. Evolution of Networks: From Biological Nets to the Internet and WWW. Oxford UK: Oxford University Press.
Krause, J., D. Lusseau and R. James. 2009. Animal social networks: an introduction. Behavioral Ecology and Sociobiology 63: 967–973.
Newman, M. 2003. The structure and function of complex networks. SIAM Review 45: 167–256.
Rozenfeld, H.D. and H.A. Makse. 2009. Fractality and the percolation transition in complex networks. Chemical Engineering Science 64: 4572–4575.
Rozenfeld, H.D., C.M. Song and H.A. Makse. 2010. Small-world to fractal transition in complex networks: a renormalization group approach. Physical Review Letters 104(2): article #025701.
Song, C.M., L.K. Gallos, S. Havlin and H.A. Makse. 2007. How to calculate the fractal dimension of a complex network: the box covering algorithm. Journal of Statistical Mechanics-Theory and Experiment
Song, C.M., S. Havlin and H.A. Makse. 2005. Self-similarity of complex networks. Nature 433: 392–395.
Song, C.M., S. Havlin and H.A. Makse. 2006. Origins of fractality in the growth of complex networks. Nature Physics 2: 275–281.
15.2 Additional Information on Evolutionary Graph Theory
Although the field is new, a number of websites provide additional background on evolutionary processes on graphs and help explain some of the models outlined in Chapter 15. Suggested links include:
Virtual Labs in Evolutionary Game Theory
This site contains a suite of evolutionary game models. In most cases, the authors present a well-mixed classical game theory or adaptive dynamics case and then examine how the outcomes might change if the game is played on a structured network (lattice) with local interactions only.
(http://www.univie.ac.at/virtuallabs/)
Moran Process Demonstration
(http://demonstrations.wolfram.com/TheMoranProcess/)
Cooperation in Heterogeneous Populations
(http://demonstrations.wolfram.com/CooperationInHeterogeneousPopulations/)
Lecture on evolutionary graph theory methods
PDF of Powerpoint slides by Gregory Puleo (www.math.uiuc.edu/~puleo/mighty.pdf)
Other
A search on Google for “evolutionary graph theory” will turn up a large number of PDFs and journal article abstracts that go beyond the general background at the sites above. Many of these are cited in the bibliography of our text, but others will surely be posted after the publication of the text.