I am working on a huge graph.
I'm interested in focusing on a small part of the graph and performing some optimization on it.
All other "not interesting" nodes will be packed into one "supernode".
I am going to run a number of iterations; each iteration will focus on a different part of the graph, so I then need to unpack the existing supernode and pack a new one.
I am looking for existing packages/algorithms that can do the packing/unpacking for me.
I am working in C++ and using Boost BGL.
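To make the operation concrete, here is a rough sketch of what I mean by packing (my own names and an undirected adjacency_list; it only rewires the boundary edges, so a real unpack step would also need to keep the folded subgraph somewhere):

#include <set>
#include <boost/graph/adjacency_list.hpp>

// Sketch only: fold every vertex in `fold` into one new supernode.
using Graph  = boost::adjacency_list<boost::setS, boost::listS,
                                     boost::undirectedS>;
using Vertex = boost::graph_traits<Graph>::vertex_descriptor;

Vertex pack_supernode(Graph& g, const std::set<Vertex>& fold)
{
    // Collect the neighbours of the folded set that lie outside it.
    std::set<Vertex> boundary;
    for (Vertex v : fold) {
        for (auto it = boost::out_edges(v, g); it.first != it.second; ++it.first) {
            Vertex u = boost::target(*it.first, g);
            if (fold.count(u) == 0)
                boundary.insert(u);
        }
    }
    // Create the supernode and connect it to the boundary.
    Vertex super = boost::add_vertex(g);
    for (Vertex u : boundary)
        boost::add_edge(super, u, g);
    // Remove the folded vertices; listS keeps the remaining descriptors valid.
    for (Vertex v : fold) {
        boost::clear_vertex(v, g);
        boost::remove_vertex(v, g);
    }
    return super;
}

Ideally I'd like a library that already does this bookkeeping (and the reverse unpacking) for me.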
I'd like to use Caffe to extract image features. However, it takes too long to process an image, so I'm looking for ways to optimize for speed.
One thing I noticed is that the network definition I'm using has four extra layers on top of the one from which I'm reading the result (and there are no feedback signals, so they should be safe to delete).
I tried to delete them from the definition file, but it had no effect at all. I guess I might need to remove the corresponding part of the file that contains the pre-trained weights, too. That is, however, a binary file (a protocol buffer), so editing it is not that easy.
Do you think that removing the four layers might have a profound effect on the net's performance?
If so, how do I get familiar with the file contents so that I can edit it, and how do I know which parts to remove?
First, I don't think removing the binary weights will have any effect.
Second, you can do it easily using the Python interface; see this tutorial.
Last but not least, have you tried running caffe time to measure the performance of your net? This may help you identify the bottlenecks in your computation.
PS,
You might find this thread relevant as well.
Caffemodel stores data as key-value pairs. Caffe only copies weights for those layers (in train.prototxt) whose names exactly match layer names in the caffemodel. Hence I don't think removing the binary weights will work. If you want to change the network structure, just modify train.prototxt and deploy.prototxt.
If you insist on removing weights from the binary file, follow this Caffe example.
And to make sure you delete the right part, this visualization tool should help.
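For what it's worth, here is a rough C++ sketch (file and blob names are placeholders, and the exact Net API varies a bit between Caffe versions) of why editing the prototxt alone should be enough: weights are matched to layers by name when they are copied in, and layers that no longer exist in the prototxt are simply skipped.

#include <caffe/net.hpp>

int main() {
    // Load the trimmed network definition, then copy in the pre-trained
    // weights; layers are matched by name, extra weights are ignored.
    caffe::Net<float> net("deploy_trimmed.prototxt", caffe::TEST);
    net.CopyTrainedLayersFrom("model.caffemodel");

    net.Forward();  // forward pass up to the (now last) layer
    const auto features = net.blob_by_name("fc7");  // "fc7" is a placeholder name
    return 0;
}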
I would retrain on a smaller input size, change strides, etc. However, if you want to reduce the file size, I'd suggest quantizing the weights (https://github.com/yuanyuanli85/CaffeModelCompression) and then using something like LZMA compression (xz on Unix). We do this so we can deploy to mobile devices; 8-bit weights compress nicely.
I want to use BFS in BGL to find all the trees of a forest, which is basically finding connected components from multiple source vertices. For example, this could be used to find the different connected regions of an image for image segmentation (that is just one use case). How can I use breadth_first_search in BGL to do this? Any pointers to examples/sources would be greatly appreciated! I have surveyed the BGL documentation and have not been able to find what I want.
Just use the connected_components algorithm that already exists in Boost. There is a useful example. Afterwards every vertex in the graph will have a mapping to its component. If you really want to run BFS on the individual parts, just use a visitor and push nodes into a vector.
You can specify your start node with:
breadth_first_search(graph, start_vertex, boost::visitor(vis));
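For the connected-components route, something along these lines (a minimal, self-contained sketch) labels every vertex with the tree it belongs to:

#include <iostream>
#include <vector>
#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/connected_components.hpp>

int main()
{
    using Graph = boost::adjacency_list<boost::vecS, boost::vecS,
                                        boost::undirectedS>;
    Graph g(6);               // a small forest: {0,1,2}, {3,4}, {5}
    boost::add_edge(0, 1, g);
    boost::add_edge(1, 2, g);
    boost::add_edge(3, 4, g);

    std::vector<int> component(boost::num_vertices(g));
    int trees = boost::connected_components(g, &component[0]);

    std::cout << trees << " trees\n";
    for (std::size_t v = 0; v < component.size(); ++v)
        std::cout << "vertex " << v << " is in tree " << component[v] << '\n';
}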
I've inherited a large C++ project which I need to port to Linux. There are over 200,000 lines of source in this project spread across more than 300 files. It would be tremendously helpful to have a visual dependency/include tree to refer to for this project so that I can get a general feel for the application's internal structure. This would also help me to locate the "fault lines" between the core modules and Windows header files so that I can stub them out later.
The class viewer in Visual Studio simply isn't cutting it. I was reading around and learned that Doxygen is a commonly used tool for listing dependencies. I'm much more of a visual person, though, and found that this wasn't so helpful. Fortunately, I learned about the Graphviz plugin, using something called "dot", which has enabled me to generate dependency trees for parts of the project. Unfortunately, hundreds of smaller dependency trees are generated for specific files, rather than the one large tree I'd hoped for. Here are a couple of examples:
As you can see (I hope), Doxygen/GraphViz seem to give up when the graph gets too large and gray out the child nodes. I then have to go to the graph for that specific node if I want to see what's further down the tree. Not only does this limit the visual helpfulness of the graph, but if the child node depends on any of the nodes from the original graph, these nodes will be shown again. This is leading to lots of duplicate connections that make it very hard to conceptually isolate the graph from any given file. As a result, I feel like I'm "zoomed in" and still can't see the whole picture.
I've tried playing around with the DOT_GRAPH_MAX_NODES setting in the Expert view in Doxygen, but this doesn't seem to affect the scope of the graphs that are being generated. From the output generated from any given run, it seems like Doxygen itself is generating hundreds of graph files, and Graphviz is just faithfully generating graphs for each one. Is there any known way to make Doxygen generate one large graph file instead of hundreds of smaller ones?
Alternatively, are there any free visual graphing solutions out there which know how to handle complicated C++ project files with nested pre-processor directives, MIDL interfaces, and manually defined include paths the way Doxygen does?
My searches are finding general graphing utilities (or questions about them), but nothing specific to large C++ projects. Surely with all the coding that's been done over the years somebody must have such a tool!
You can use the XML files generated by Doxygen, merge them into a single giant dot-format graph file (using an XML stylesheet or similar), and then run Graphviz on it.
Having Doxygen invoke Graphviz automatically is most useful when the number of graphs is high. For a single graph, automatically creating the content is important, but automatically calling dot is not.
Hello everyone. I want to ask whether there is any way I can pick values from a plotted graph and statistically analyze them using C++. The graph is a moving (real-time) graph.
Thank you, any help would be appreciated.
I am using C++ on Linux, and the graph is plotted using Qwt and Qt.
Analyzing the plot itself doesn't look necessary for what you describe (if I understood you correctly) and would be too costly. You simply need to create an observer system: when a node value changes, send a message with the new value. You only need to check whether the value is zero. Add special events for the extreme nodes and, when their value changes, check whether they match.
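A minimal sketch of that observer idea, assuming you can hook into the code that owns the plotted values (ValueModel and the other names are illustrative, not part of Qt/Qwt):

#include <cstddef>
#include <functional>
#include <vector>

class ValueModel {
public:
    using Listener = std::function<void(std::size_t index, double value)>;

    void subscribe(Listener l) { listeners_.push_back(std::move(l)); }

    // Call this from the code that already feeds the real-time plot.
    void set_value(std::size_t index, double value) {
        if (index >= values_.size())
            values_.resize(index + 1);
        values_[index] = value;
        for (const auto& l : listeners_)
            l(index, value);                    // notify every observer
    }

private:
    std::vector<double> values_;
    std::vector<Listener> listeners_;
};

// Usage: accumulate your statistics as values arrive instead of reading them
// back off the plot, e.g.
//   model.subscribe([](std::size_t i, double v) { if (v == 0.0) { /* react */ } });

With Qt you could achieve the same thing with a signal/slot connection instead of the std::function list.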
I'm experimenting with my distributed clustering algorithm (implemented with MPI) on 24 computers that I set up as a cluster using BCCD (Bootable Cluster CD), which can be downloaded at http://bccd.net/.
I've written a batch program to run my experiment, which consists of running my algorithm several times while varying the number of nodes and the size of the input data.
I want to know the amount of data used in the MPI communications for each run of my algorithm so I can see how it changes when varying the previously mentioned parameters. And I want to do all this automatically using a batch program.
Someone told me to use tcpdump, but I found some difficulties in this approach.
First, I don't know how to call tcpdump from my batch program (which is written in C++ and uses the system command to make calls) before each run of my algorithm, since tcpdump needs to run in another terminal in parallel with my application. And I can't run tcpdump on another computer since the network uses a switch, so I need to run it on the master node.
Second, I watched the traffic with tcpdump while my experiment was running and I couldn't figure out which port MPI was using; it seems to use many ports. I wanted to know that in order to filter the packets.
Third, I tried capturing whole packets and saving them to a file with tcpdump, and in a few seconds the file was 3.5 MB. But my whole experiment takes 2 days, so the final log file would be huge with this approach.
The ideal approach would be to capture just the size field in the header of each packet and sum these up to obtain the total amount of data transmitted. That way the log file would be much smaller than if I were capturing whole packets. But I don't know how to do it.
Another restriction is that I don't have access to the computers' disks. So I only have the RAM and my 4 GB USB flash drive, which means I can't keep huge log files.
I have already thought about using an MPI tracing or profiling tool such as those mentioned at http://www.open-mpi.org/faq/?category=perftools. I have only tested the Sun Performance Analyzer so far. The problem is that I guess it will be difficult, maybe even impossible, to install those tools on BCCD. In addition, such a tool will make my experiment take longer to finish, since it adds overhead. But if someone is familiar with BCCD and thinks one of those tools is a good choice, please let me know.
I hope someone has a solution.
An approach like tcpdump won't work anyway if there are multi-core nodes that use shared memory to communicate.
Using something like MPE is almost certainly the way to go. Those tools add very little overhead, and some overhead is always going to be necessary if you want to count messages. You can use mpitrace to write out every MPI call and parse the resulting text file yourself. By the way, note that MPE is explicitly discussed on the BCCD website. MPICH2 comes with MPE built in, but it can be compiled for any implementation. I've only found a very modest overhead for MPE.
IPM is another nice tool that counts messages and sizes; you should be able either to parse the XML output, or to use the post-processing tools and just manually integrate the graphs (say, either bytes_rx/bytes_tx by rank, or the message buffer size/count graph). The overhead for IPM is even less than for MPE, and mostly comes after the program has finished running, to do the file I/O.
If you were really worried about the overhead of either of these approaches, you could always write your own MPI wrappers using the profiling interface, wrapping MPI_Send, MPI_Recv, etc., counting the number of bytes sent and received in each process, and outputting only that total at the end.
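For example, a minimal wrapper for MPI_Send could look like the sketch below (the PMPI_ entry points are the standard MPI profiling interface; extend the same pattern to MPI_Isend, MPI_Recv, etc. Note that pre-MPI-3 headers declare buf as void* rather than const void*, so adjust the signature to match your mpi.h):

#include <cstdio>
#include <mpi.h>

// Counts bytes passed to MPI_Send in this process and prints the total
// at MPI_Finalize. Compile/link this with your program so it intercepts
// the calls; the real work is forwarded to the PMPI_ entry points.
static long long g_bytes_sent = 0;

extern "C" int MPI_Send(const void* buf, int count, MPI_Datatype datatype,
                        int dest, int tag, MPI_Comm comm)
{
    int type_size = 0;
    MPI_Type_size(datatype, &type_size);                 // bytes per element
    g_bytes_sent += static_cast<long long>(count) * type_size;
    return PMPI_Send(buf, count, datatype, dest, tag, comm);
}

extern "C" int MPI_Finalize(void)
{
    int rank = 0;
    PMPI_Comm_rank(MPI_COMM_WORLD, &rank);
    std::printf("rank %d sent %lld bytes via MPI_Send\n", rank, g_bytes_sent);
    return PMPI_Finalize();
}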