FAQS

*****************
* miniVite FAQs *
*****************
----------------------------------------------------
FYI, typical "How to run" queries are addressed Q5
onward.

Please send your suggestions for improving this FAQ
to zsayanz at gmail dot com OR hala at pnnl dot gov.
----------------------------------------------------

-------------------------------------------------------------------------
Q1. What is graph community detection?
-------------------------------------------------------------------------

A1. In most real-world graphs/networks, the nodes/vertices tend to be 
organized into tightly-knit modules known as communities or clusters,
such that nodes within a community are more likely to be "related" to 
one another than they are to the rest of the network. The goodness of 
partitioning into communities is typically measured using a metric 
called modularity. Community detection is the method of identifying 
these clusters or communities in graphs.

[References]

Fortunato, Santo. "Community detection in graphs." Physics reports 
486.3-5 (2010): 75-174. https://arxiv.org/pdf/0906.0612.pdf

--------------------------------------------------------------------------
Q2. What is miniVite? 
--------------------------------------------------------------------------

A2. miniVite is a distributed-memory code (or mini application) that 
performs partial graph community detection using the Louvain method. 
Louvain method is a multi-phase, iterative heuristic that performs 
modularity optimization for graph community detection. miniVite only 
performs the first phase of Louvain method.

[Code]

https://github.com/Exa-Graph/miniVite
http://hpc.pnl.gov/people/hala/grappolo.html

[References]

[Louvain method] Blondel, Vincent D., et al. "Fast unfolding of 
communities in large networks." Journal of statistical mechanics: 
theory and experiment 2008.10 (2008): P10008.

[miniVite] Ghosh S, Halappanavar M, Tumeo A, Kalyanaraman A, 
Gebremedhin AH. miniVite: A Graph Analytics Benchmarking Tool for 
Massively Parallel Systems.

---------------------------------------------------------------------------
Q3. What is the parent application of miniVite? How are they different?
---------------------------------------------------------------------------

A3. miniVite is derived from Vite, which implements the multi-phase 
Louvain method. Apart from a parallel baseline version, Vite provides 
a number of heuristics (such as early termination, threshold cycling and 
incomplete coloring) that can improve the scalability and quality of 
community detection. In contrast, miniVite just provides a parallel 
baseline version, and, has option to select different MPI communication 
methods (such as send/recv, collectives and RMA) for one of the most 
communication intensive portions of the code. miniVite also includes an 
in-memory random geometric graph generator, making it convenient for 
users to run miniVite without any external files. Vite can also convert 
graphs from different native formats (like matrix market, SNAP, edge 
list, DIMACS, etc) to the binary format that both Vite and miniVite 
requires.

[Code]

http://hpc.pnl.gov/people/hala/grappolo.html

[References]

Ghosh S, Halappanavar M, Tumeo A, Kalyanaraman A, Lu H, Chavarria-Miranda D, 
Khan A, Gebremedhin A. Distributed louvain algorithm for graph community 
detection. In 2018 IEEE International Parallel and Distributed Processing 
Symposium (IPDPS) 2018 May 21 (pp. 885-895). IEEE.

Ghosh S, Halappanavar M, Tumeo A, Kalyanaraman A, Gebremedhin AH. 
Scalable Distributed Memory Community Detection Using Vite. 
In 2018 IEEE High Performance extreme Computing Conference (HPEC) 2018 
Sep 25 (pp. 1-7). IEEE.

-----------------------------------------------------------------------------
Q4. Is there a shared-memory equivalent of Vite/miniVite?
-----------------------------------------------------------------------------

A4. Yes, Grappolo performs shared-memory community detection using Louvain 
method. Apart from community detection, Grappolo has routines for matrix 
reordering as well. 

[Code]

http://hpc.pnl.gov/people/hala/grappolo.html

[References]

Lu H, Halappanavar M, Kalyanaraman A. Parallel heuristics for scalable 
community detection. Parallel Computing. 2015 Aug 1;47:19-37.

Halappanavar M, Lu H, Kalyanaraman A, Tumeo A. Scalable static and dynamic 
community detection using grappolo. In High Performance Extreme Computing 
Conference (HPEC), 2017 IEEE 2017 Sep 12 (pp. 1-6). IEEE.

------------------------------------------------------------------------------
Q5. How does one perform strong scaling analysis using miniVite? How to 
determine 'good' candidates (input graphs) that can be used for strong 
scaling runs? How much time is approximately spent in performing I/O?
------------------------------------------------------------------------------

A5. Use a large graph as an input, preferably over a billion edges. Not all 
large graphs have a good community structure. You should be able to identify 
one that serves your purpose, hopefully after few trials. Graphs can be 
obtained various websites serving as repositories, such as Sparse TAMU 
collection[1], SNAP repository[2] and MIT Graph Challenge website[3], to name 
a few of the prominent ones. You can convert graphs from their native format to 
the binary format that miniVite requires, using the converters in Vite (please 
see README). 

If your graph is in Webgraph[4] format, you can easily convert it to an edge list 
first (example code snippet below), before passing it on to Vite for subsequent 
binary conversion, using the C++ version of Webgraph library[5]. Since the C++
Webgraph library is not actively developed, you may use the original Webgraph
JAVA library too.

#include "offline_edge_iterator.hpp"
...
using namespace webgraph::ascii_graph;

// read in input/output file 
std::ofstream ofile(argv[2]);
offline_edge_iterator itor(argv[1]), end;

// read edges
while( itor != end ) {
    ofile << itor->first << " " << itor->second << std::endl;
    ++itor;
}
ofile.close();
...

miniVite takes about 2-4s to read a 55GB binary file if you use Burst buffer 
(Cray DataWarp) or Lustre striping (about 25 OSTs, default 1M blocks). The overall 
I/O time for most cases is observed to be within 1/2% of the overall execution time. 
This is assuming the simple vertex-based distribution (which is the default), if you 
pass "-b" (only valid when you pass an input graph), then miniVite tries to balance 
the number of edges (as a result, processes may own dissimilar number of vertices). 
In such cases, there will be a serial overhead that may increase the I/O time and 
subsequent distributed graph creation time significantly.

[1] https://sparse.tamu.edu/
[2] http://snap.stanford.edu/data
[3] http://graphchallenge.mit.edu/data-sets
[4] http://webgraph.di.unimi.it/
[5] http://cnets.indiana.edu/groups/nan/webgraph/

-----------------------------------------------------------------------------------
Q6. How does one perform weak scaling analysis using miniVite? How does one scale
the graphs with processes?
-----------------------------------------------------------------------------------

A6. miniVite has an in-memory random geometric graph generator (please see
README) that can be used for weak-scaling analysis. An n-D random geometric graph 
(RGG), is generated by randomly placing N vertices in an n-D space and connecting 
pairs of vertices whose Euclidean distance is less than or equal to d. We only 
consider 2D RGGs contained within a unit square, [0,1]^2. We distribute the domain 
such that each process receives N/p vertices (where p is the total 
number of processes). 

Each process owns (1 * 1/p) portion of the unit square and d is computed as (please 
refer to Section 4 of miniVite paper for details): 

d = (dc + dt)/2;
where, dc = sqrt(ln(N) / pi*N); dt = sqrt(2.0736 / pi*N)

Therefore, the number of vertices (N) passed during miniVite execution on p
processes must satisfy the condition -- 1/p > d.

Please note, the default distribution of graph generated from the in-built random 
geometric graph generator causes a process to only communicate with its two 
immediate neighbors. If you want to increase the communication intensity for 
generated graphs, please use the "-p" option to specify an extra percentage of edges 
that will be generated, linking random vertices. As a side-effect, this option 
significantly increases the time required to generate the graph.

------------------------------------------------------------------------------
Q7. Does Vite (the parent application to miniVite) have an in-built graph 
generator?
------------------------------------------------------------------------------

A7. At present, Vite does not have an in-built graph generator that we have in 
miniVite, so we rely on users providing external graphs for Vite (strong/weak 
scaling) analysis. However, Vite has bindings to NetworKit[5], and users can use 
those bindings to generate graphs of their choice from Vite (refer to the 
README). Generating large graphs in this manner can take a lot of time, since 
there are intermediate copies and the graph generators themselves may be serial 
or may use threads on a shared-memory system. We do not plan on supporting the 
NetworKit bindings in future.

[5] https://networkit.github.io/

------------------------------------------------------------------------------
Q8. Does providing a larger input graph translate to comparatively larger 
execution times? Is it possible to control the execution time for a particular
graph?
------------------------------------------------------------------------------

A8. No. A relatively small graph can run for many iterations, as compared to
a larger graph that runs for a few iterations to convergence. Since miniVite is 
iterative, the final number of iterations to convergence (and hence, execution 
time) depends on the structure of the graph. It is however possible to exit 
early by passing a larger threshold (using the "-t <...>" option, the default 
threshold or tolerance is 1.0E-06, a larger threshold can be passed, for e.g, 
"-t 1.0E-03"), that should reduce the overall execution time for all graphs in 
general (at least w.r.t miniVite, which only executes the first phase of Louvain 
method).

------------------------------------------------------------------------------
Q9. Is there an option to add some noise in the generated random geometric 
graphs?
------------------------------------------------------------------------------

A9. Yes, the "-p <percent>" option allows extra edges to be added between 
random vertices (see README). This increases the overall communication, but 
affects the structure of communities in the generated graph (lowers the 
modularity). Therefore, adding extra edges in the generated graph will 
most probably reduce the global modularity, and the number of iterations to 
convergence shall decrease. 
The maximum number of edges that can be added is bounded by INT_MAX, at 
present, we do not handle data ranges more than INT_MAX.

------------------------------------------------------------------------------
Q10. What are the steps required for using real-world graphs as an input to
miniVite?
------------------------------------------------------------------------------

A10. First, please download Vite (parent application of miniVite) from: 
http://hpc.pnl.gov/people/hala/grappolo.html

Graphs/Sparse matrices come in several native formats (matrix market, SNAP, 
DIMACS, etc.) Vite has several options to convert graphs from native to the 
binary format that miniVite requires (please take a look at Vite README).

As an example, you can download the Friendster file from: 
https://sparse.tamu.edu/SNAP/com-Friendster
The option to convert Friendster to binary using Vite's converter is as follows 
(please note, this part is serial): 

$VITE_BIN_PATH/bin/./fileConvertDist -f $INPUT_PATH/com-Friendster.mtx 
    -m -o $OUTPUT_PATH/com-Friendster.bin

After the conversion, you can run miniVite with the binary file obtained
from the previous step:

mpiexec -n <...> $MINIVITE_PATH/./dspl -r <processes-per-node> 
    -f $FILE_PATH/com-Friendster.bin

--------------------------------------------------------------------------------
Q11. miniVite is scalable for a particular input graph, but not for another 
similar sized graph, why is that and what can I do to improve the situation?
--------------------------------------------------------------------------------

A11. Presently, our default distribution is vertex-based. That means a process 
owns N/p vertices and all the edges connected to those N/p vertices (including 
ghost vertices). Load imbalances are very probable in this type of distribution, 
depending on the graph structure. Instead, it may be favorable to use an 
edge-centric distribution, in which processes own approximately equal number of 
edges. When the "-b" option is passed, miniVite attempts to distribute the 
vertices across processes such each process owns approximately similar number 
of edges. This edge-balanced distribution may reduce the overall communication,
improving the performance.

As an example, lets say there is a large (real-world) graph, and its structure 
is such that only a few processes end up owning a majority of edges, as per 
miniVite graph data distribution. Also, lets assume that the graph has either a 
very poor community structure (modularity closer to 0) or very stable community 
structure (modularity close to 1 after a few iterations, that means not many 
vertices are migrating to neighboring communities). In both these cases, 
community detection in miniVite will run for relatively less number of 
iterations, which may affect the overall scalability.

--------------------------------------------------------------------------------
Q12. Can miniVite return some statistics on vertex/edge distribution of the
underlying graph?
--------------------------------------------------------------------------------
Yes, please pass -DPRINT_DIST_STATS while building miniVite. When miniVite is 
executed, it returns some basic information, such as:

-------------------------------------------------------
Graph edge distribution characteristics
-------------------------------------------------------
Number of vertices: 34
Number of edges: 156
Maximum number of edges: 80
Average number of edges: 78
Expected value of X^2: 6088
Variance: 4
Standard deviation: 2
-------------------------------------------------------