Page 1 of 2
[SOLVED] Best Layout Algorithm for Massive Network
Posted: 01 Jul 2011 14:19
by PMR3349
I have a network with ~100,000 nodes and ~1,600,000 edges.
Clearly such a monstrosity is going to take some time, but after letting a force-atlas attempt run overnight I check it out this morning to find the program crashed. Is there a better algorithm to use for a gigantic network like this one?
I still haven't ruled out the possibility that the hardware could be the limiting factor, but in case it's not, I'm wondering if one of the algorithms is best. Thanks
Re: Best Layout Algorithm for Massive Newtork
Posted: 01 Jul 2011 16:55
by jacomyma
Hi, I think ForceAtlas2 might be able to deal with networks of this size but I actually didn't test above 50K nodes. If you tried only ForceAtlas "version 1" it might be a good idea to test the second version (available in Gephi 0.8).
By the way, the layout considered as the most adapted to big networks is OpenOrd, which is available as a plug-in in the catalog.
And also, I think that Gephi in 64bits allows more RAM and can handle bigger graphs (if you have a 64bits CPU and at least 4Go RAM).
Re: Best Layout Algorithm for Massive Newtork
Posted: 01 Jul 2011 17:42
by PMR3349
Thanks, I'll give that a try
I got the memory allocation up by increasing it for Java, which helped. It seems that Gephi isn't made to use multiple cores on the CPU though since I can't get it above 25% CPU usage on my quad-core
Re: Best Layout Algorithm for Massive Newtork
Posted: 01 Jul 2011 18:01
by mbastian
Gephi takes advantage of your multi-quores when you are doing several things at the same time. For layout, the
OpenOrd is able to use several threads.
Re: Best Layout Algorithm for Massive Newtork
Posted: 01 Jul 2011 20:41
by PMR3349
Thanks for the suggestion. Just thought I should follow up and say that the OpenOrd algorithm worked amazingly well. After waiting hours and finally crashing with other layouts, OpenOrd finished in about 5 minutes! Thanks again for the tip

Re: Best Layout Algorithm for Massive Newtork
Posted: 01 Jul 2011 20:48
by pbittner
I planned to write about that a little later but since you are talking multi-core, I’ll say it here. I have recently implemented a multi-core version of the Force Atlas layout in Fortran using OpenMP for the parallelism.
Since the repulsion in O(n2) is where most of the time is spent, I wrote a small function in Fortran that performs the repulsion on all nodes at once. This Fortran function is wrapped in a C++ function which is compiled as a DLL and the DLL is called from Java using JNI.
I have tested it with graphs up to 75000 nodes and it performs very well (around 50x faster than the Java implementation on an 8 cores workstation). However, it obviously doesn’t outperform the O(nlogn) Barnes-Hut repulsion of the new Force Atlas 2 layout. I have therefore started working on a parallel version of the Force Atlas 2 and I’ll release a plug-in containing the two parallel implementations when it is finished.
I think that the performance of the parallel Force Atlas 2 will increase by the number of cores you have (4 times faster with 4 cores) which would be great. It doesn’t solve the issue with the amount of RAM required to handle very large graphs but running Force Atlas 2 on a 100,000 nodes graph will hopefully be quite smooth.
My main concern for the plug-in is portability since the DLL has to be compiled separately for each architecture. The first version of the plug-in may only be available on Windows 64bits.
Then, I may have a look at the OpenOrd layout to make it even faster or I may try to use CUDA to speedup Force Atlas using GPGPU. I have very little experience with CUDA but the O(n2) repulsion of Force Atlas looks like a perfect candidate.
Any suggestion?
Re: Best Layout Algorithm for Massive Newtork
Posted: 02 Jul 2011 09:44
by admin
pbittner, THAT sounds amazing!

Re: Best Layout Algorithm for Massive Newtork
Posted: 03 Jul 2011 16:21
by seinecle
Just great, I can't wait for the release of the plugin!
Re: Best Layout Algorithm for Massive Newtork
Posted: 04 Jul 2011 23:07
by seinecle
A follow-up: how can I make sure that I installed the 64 bit version of Gephi? (there is no 32 vs 64 versions in the download page).
Thx,
Clement
Re: Best Layout Algorithm for Massive Newtork
Posted: 04 Jul 2011 23:45
by eduramiba
Parallel layout algorithms sound great

!
Clement, to make sure you run Gephi under a 64 bit virtual machine you have to install a 64 bit jre/jdk.
Then you can edit {gephi installation}/etc/gephi.conf file and set larger maximum memory.
If you have problems because other jre/jdk versions are installed then uncomment the jdkhome and put, for example:
Code: Select all
jdkhome="C:\Program Files\Java\jdk1.6.0_25"
Here
http://gephi.org/users/install there is some more info.
Eduardo