AutoCAD DXF files displayed in OpenGL/OpenGL ES

I'm wondering if there is a way to extract the necessary data out of an AutoCAD .dxf file, so I can visualize the structure in OpenGL?
I've found some old code snippets for Windows written in C++, but since the standard has changed, I assume 15-year-old code is a little outdated.
Also, there is a book about the .dxf file standard, but it's also from the '90s and, aside from that, rarely available.
Another way might be to convert it to some other file format and then extract the data I need.
Looking into the .dxf files themselves didn't give much insight either, since even a simple cuboid contains a lot of data!
Can anyone give me a hint on how to approach this?

The references are a good place to start, but if you are doing heavy 3D work it may not be possible to accomplish what you are attempting.
We recently wrote a DXF converter in Java based entirely on the references. Although many of the entities are relatively straightforward, many other entities (3DSOLID, BODY, REGION, SURFACE, Swept Surface) are not really possible to translate, since the reference states that their groups are primarily proprietary data. Other objects (Extruded Surface, Revolved Surface, Swept Surface (again)) have significant chunks of binary data which may hold important information you need.
These entities were not vital for our efforts, but if you are looking to convert to OpenGL, these may be the entities you were particularly concerned with.

Autodesk has references for the DXF formats used by recent revisions of AutoCAD. I'd probably take a second look at that 15-year-old code, though. Even if you can't or don't use it as-is, it may provide a decent starting point. The DXF specification is sufficiently large and complex that having something to start from, and just adding new bits and pieces where needed, can be a big help. As an interchange format, DXF has to be pretty conservative anyway, only including elements that essentially all programs can interpret reasonably directly.
I'd probably be more concerned about the code itself than about changes in the DXF format. A lot of code that old uses deep, monolithic class hierarchies that are quite a bit different from what you'd expect in modern C++.
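As a starting point, the ASCII flavour of DXF is just a long sequence of (group code, value) line pairs, so a first-pass reader is quite small. A minimal, hypothetical sketch (it only demonstrates the pairing and one common group code, with no error handling):

```cpp
#include <fstream>
#include <iostream>
#include <string>

int main(int argc, char** argv) {
    if (argc < 2) return 1;
    std::ifstream in(argv[1]);
    std::string codeLine, value;
    // ASCII DXF alternates one line holding a numeric group code
    // with one line holding the corresponding value.
    while (std::getline(in, codeLine) && std::getline(in, value)) {
        int code = std::stoi(codeLine);
        if (code == 0)  // group code 0 introduces a section marker or entity type
            std::cout << "marker/entity: " << value << '\n';
        // Group codes 10/20/30 carry X/Y/Z coordinates for many entities;
        // what each code means depends on the current entity (see the reference).
    }
}
```

Walking the ENTITIES section this way and collecting the vertices of LINE, 3DFACE, and POLYLINE entities already gets you geometry you can hand to OpenGL.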

Related

Strings vs binary for storing variables inside the file format

We aim to use HDF5 for our data format. HDF5 has been selected because it is a hierarchical, filesystem-like, cross-platform data format and it supports large amounts of data.
The file will contain arrays and some parameters. The question is about how to store the parameters (which are not made up of large amounts of data), considering also file versioning issues and the effort to build the library. Parameters inside the HDF5 file could be stored as either (A) human-readable attribute/value pairs or (B) binary data in the form of HDF5 compound data types.
Just as an example, let's consider as a parameter a polygon with three vertices. Under case A we could have a variable named Polygon with the string representation of the series of vertices, for instance (1, 2); (3, 4); (4, 1). Under case B, we could instead have a variable named Polygon made up of a [2 x 3] matrix.
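For concreteness, the two variants might look like this with the plain C HDF5 API (a minimal sketch; the file name and the attribute name are illustrative):

```cpp
#include <string>
#include "hdf5.h"

int main() {
    hid_t file = H5Fcreate("params.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

    // Case B: the polygon as a [2 x 3] matrix of doubles
    // (row 0 holds the x coordinates, row 1 the y coordinates).
    double verts[2][3] = {{1, 3, 4}, {2, 4, 1}};
    hsize_t dims[2] = {2, 3};
    hid_t space = H5Screate_simple(2, dims, NULL);
    hid_t dset = H5Dcreate2(file, "Polygon", H5T_NATIVE_DOUBLE, space,
                            H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, verts);

    // Case A: the same polygon as a human-readable string attribute.
    std::string text = "(1, 2); (3, 4); (4, 1)";
    hid_t strtype = H5Tcopy(H5T_C_S1);
    H5Tset_size(strtype, text.size() + 1);
    hid_t aspace = H5Screate(H5S_SCALAR);
    hid_t attr = H5Acreate2(file, "PolygonText", strtype, aspace,
                            H5P_DEFAULT, H5P_DEFAULT);
    H5Awrite(attr, strtype, text.c_str());

    H5Aclose(attr); H5Sclose(aspace); H5Tclose(strtype);
    H5Dclose(dset); H5Sclose(space); H5Fclose(file);
}
```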
We have some idea, but it would be great to have inputs from people who have already worked with something similar. More precisely, could you please list pro/cons of A and B and also say under what circumstances which would be preferable?
Speaking as someone who's had to do exactly what you're talking about a number of times, rr got it basically right, but I would change the emphasis a little.
For file versioning, text is basically the winner.
Since you're using an HDF5 library, I assume serializing and parsing require equivalent human effort.
text files are more portable. You can transfer the files across generations of hardware with minimal risk.
text files are easier for humans to work with. If you want to extract a subset of the data and manipulate it, you can do that with many programs on many computers. If you are working with binary data, you will need a program that allows you to do so. Depending on how you see people working with your data, this can make a huge difference to the accessibility of the data and maintenance costs. You'll be able to sed, grep, and even edit the data in Excel.
input and output of binary data (for large data sets) will be vastly faster than text (see the toy comparison after this list).
working with those binary files in a new environment (e.g. a 128-bit little-endian computer in some sci-fi future) will require some engineering.
similarly, if you write applications in other languages, you'll need to handle the encoding identically between applications. This will mean either engineering effort or having the same libraries available on all platforms. With plain text this is easier...
If you want others to write applications that work with your data, plain text is simpler. If you're providing binary files, you'll have to provide a file specification which they can follow. With plain text, anyone can just look at the file and figure out how to parse it.
you can archive the text files with compression, so space concerns are primarily an issue for the data you are actively working with.
debugging binary data storage is significantly more work than debugging plain-text storage.
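To make the speed point above concrete, a toy comparison in plain C++ (file names are illustrative; the same contrast applies inside HDF5 between numeric datasets and string attributes):

```cpp
#include <cstdio>
#include <vector>

int main() {
    std::vector<double> data(1000000, 3.14159);

    // Binary: a single bulk write; bytes go to disk as-is.
    FILE* fb = std::fopen("data.bin", "wb");
    std::fwrite(data.data(), sizeof(double), data.size(), fb);
    std::fclose(fb);

    // Text: one formatting call per value -- much slower and larger,
    // but the result is greppable and editable anywhere.
    FILE* ft = std::fopen("data.txt", "w");
    for (double d : data) std::fprintf(ft, "%.17g\n", d);
    std::fclose(ft);
}
```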
So in the end it depends a little on your use case. Is it meaningful to look at the data in the myriad tools that handle plain text? Is it only meaningful to look at it with big-data HDF5 viewers? Will writing plain text be onerous to you in terms of time and space?
In general, when I'm faced with this issue, I basically always do the same thing: I store the data in plain text until I realize the speed problems are more irritating than working with binary would be, and then I switch. If you don't know in advance whether you'll cross that threshold, start with plain text, and write the interface to your persistence layer in such a way that it will be easy to switch later. This is a tiny bit of additional work, which you will probably get back thanks to plain text being easier to debug.
If you expect to edit the file by hand often (as with XML or JSON), then go with a human-readable format.
Otherwise go with binary - it's much easier to create a parser for it and it will run faster than any grammar parser.
Also note how there's nothing that prevents you from creating a converter between binary and human-readable form later.
Versioning files might sound nice, but are you really going to inspect the diffs for files "containing large arrays"?

Best approach to storing scientific data sets on disk C++

I'm currently working on a project that requires working with gigabytes of scientific data sets. The data sets are in the form of very large arrays (30,000 elements) of integers and floating point numbers. The problem here is that they are too large to fit into memory, so I need an on-disk solution for storing and working with them. To make this problem even more fun, I am restricted to using a 32-bit architecture (as this is for work) and I need to try to maximize performance for this solution.
So far, I've worked with HDF5, which worked okay, but I found it a little too complicated to work with. So, I thought the next best thing would be to try a NoSQL database, but I couldn't find a good way to store the arrays in the database short of casting them to character arrays and storing them like that, which caused a lot of bad pointer headaches.
So, I'd like to know what you guys recommend. Maybe you have a less painful way of working with HDF5 while at the same time maximizing performance. Or maybe you know of a NoSQL database that works well for storing this type of data. Or maybe I'm going in the totally wrong direction with this and you'd like to smack some sense into me.
Anyway, I'd appreciate any words of wisdom you guys can offer me :)
Smack some sense into yourself and use a production-grade library such as HDF5. So you found it too complicated, but did you find its high-level APIs?
If you don't like that answer, try one of the emerging array databases such as SciDB, rasdaman or MonetDB. I suspect though, that if you have baulked at HDF5 you'll baulk at any of these.
In my view, and experience, it is worth the effort to learn how to properly use a tool such as HDF5 if you are going to be working with large scientific data sets for any length of time. If you pick up a tool such as a NoSQL database, which was not designed for the task at hand, then, while it may initially be easier to use, eventually (before very long would be my guess) it will lack features you need or want and you will find yourself having to program around its deficiencies.
Pick one of the right tools for the job and learn how to use it properly.
Assuming your data sets really are large enough to merit it (e.g., instead of 30,000 elements, a 30,000x30,000 array of doubles), you might want to consider STXXL. It provides interfaces that are intended to imitate those of the collections in the C++ standard library (and largely succeed at doing so), but are designed to work with data too large to fit in memory.
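For a sense of what that looks like in practice, a minimal STXXL sketch (assuming STXXL is installed and its disk file configured, e.g. via a .stxxl config file; the sizes are illustrative):

```cpp
#include <iostream>
#include <stxxl/vector>

int main() {
    // External-memory vector: elements live on disk in blocks and are
    // paged into a small RAM cache on demand.
    typedef stxxl::VECTOR_GENERATOR<double>::result vector_type;
    vector_type v;
    for (unsigned long long i = 0; i < 100000000ULL; ++i)  // ~800 MB of doubles,
        v.push_back(static_cast<double>(i));               // well beyond a 32-bit heap
    std::cout << v[12345] << '\n';  // reads like std::vector from the outside
}
```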
I have been working in scientific computing for years, and I think HDF5 or NetCDF is a good data format for you to work with. It can provide efficient parallel read/write, which is important for dealing with big data.
An alternative solution is to use an array database, like SciDB, MonetDB, or rasdaman. However, it will be kind of painful if you try to load HDF5 data into an array database. I once tried to load HDF5 data into SciDB, but it required a series of data transformations. You need to know whether you will query the data often or not. If not, then the time-consuming loading may not be worth it.
You may be interested in this paper; it allows you to query HDF5 data directly using SQL.

Which library for voxel data structure?

I'm working in C++ with large voxel grids in a scientific context and I'm trying to decide which library to use. Only a fraction of the voxel grid holds values, but a voxel that does may hold several (e.g. a struct), determined by raytracing. I'm not trying to render anything, but I have to determine the potential number of rays passing through the entire target area, so an awful lot of ray-box computations will have to be calculated, preferably very fast...
So far, I found
OpenVDB http://www.openvdb.org/
Field3d http://sites.google.com/site/field3d/
The latter appeals a bit more, because it seems simpler/easier to use.
My question is: which of them is better suited to tasks that are not aimed at rendering/visualization? Which one is faster/better when computing a lot of ray-box intersections (no viewpoint-dependent culling possible)? Suggestions, anyone?
In any case, I want to use an existing C++ library and not write a kdTree/Octree etc. myself. I don't have the time to reinvent the wheel.
I would advise
OpenSceneGraph
Ogre3D
VTK
I have personally used the first two. However, VTK is also a popular alternative. All three of them support voxel-based rendering.
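Whichever library you pick, the inner loop you describe is the standard slab test for ray-box intersection. A minimal sketch (assuming inverse ray directions are precomputed, as is usual for batched tests; rays exactly parallel to an axis need extra care):

```cpp
#include <algorithm>
#include <array>
#include <limits>

struct Ray { std::array<double, 3> origin, invDir; };  // invDir[i] = 1 / direction[i]
struct Box { std::array<double, 3> lo, hi; };          // axis-aligned bounds

// Slab method: clip the ray against the three axis-aligned slabs; the ray
// hits the box iff the intersection of the three intervals is non-empty.
bool intersects(const Ray& r, const Box& b) {
    double tmin = 0.0;
    double tmax = std::numeric_limits<double>::max();
    for (int i = 0; i < 3; ++i) {
        double t1 = (b.lo[i] - r.origin[i]) * r.invDir[i];
        double t2 = (b.hi[i] - r.origin[i]) * r.invDir[i];
        if (t1 > t2) std::swap(t1, t2);
        tmin = std::max(tmin, t1);
        tmax = std::min(tmax, t2);
    }
    return tmin <= tmax;
}
```

Libraries like OpenVDB organize the grid hierarchically, so much of the performance win typically comes from skipping empty regions before ever reaching per-voxel tests like this one.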

Offline embedded realtime routing

I am currently working on a senior design project for school and have come across a design issue that I do not know how to solve. I need real-time, offline routing for an embedded walking application.
I have not been able to find any libraries that suit my needs. I understand I may have to make either my own vectorized map of my local town or my own routing algorithm. I will not go into much detail about what my project entails, but it does not require a large map, maybe a 5x5 mile grid. The maps can be loaded from SD if they need to be changed.
I see there are GpsMid, YOURs, and others, all using OpenStreetMap data.
We will have a TI microcontroller for processing and a GPS card for real-time lat/lon. I just do not know how to take the real-time info and route with it using a static map.
Thanks,
Matt
I'm not well versed in what is typically used for real-time routing with GPS and vectorized maps, but I can recommend some general algorithms that can be used as tools to help you get your project done.
A* search is a pretty typical pathfinding algorithm. http://en.wikipedia.org/wiki/A_star
Depending on how you organize your data, you may also find Dijkstra's algorithm to be helpful. http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm
These algorithms are popular enough that you should be able to find example code in whatever language you want, although I'd be very skeptical of the quality. I'd recommend writing your own, since you are in school, as it'd be beneficial for you to have written and debugged them on your own at least once in your career. When you are done, you'll have a tried and true implementation to call your own.
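For reference, a minimal Dijkstra sketch over an adjacency list, as something to check your own implementation against rather than a substitute for writing it (node IDs and weights are illustrative; a 5x5 mile walking graph should be small enough for this to run quickly):

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

struct Edge { int to; double w; };  // w: edge cost, e.g. path length in metres

// Returns shortest-path distances from src to every node (IDs 0..n-1).
std::vector<double> dijkstra(const std::vector<std::vector<Edge>>& g, int src) {
    const double INF = std::numeric_limits<double>::infinity();
    std::vector<double> dist(g.size(), INF);
    using Item = std::pair<double, int>;  // (tentative distance, node)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    dist[src] = 0.0;
    pq.push({0.0, src});
    while (!pq.empty()) {
        auto [d, u] = pq.top();
        pq.pop();
        if (d > dist[u]) continue;  // stale queue entry, skip it
        for (const Edge& e : g[u]) {
            if (dist[u] + e.w < dist[e.to]) {
                dist[e.to] = dist[u] + e.w;
                pq.push({dist[e.to], e.to});
            }
        }
    }
    return dist;
}
```

Snapping the live GPS fix to the nearest graph node gives you the source; the destination comes from the user, and you can re-run the search (or A* with a straight-line-distance heuristic) whenever the walker strays off the computed route.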
Seems to me there are two parts to this:
1 - Identifying map data that tells you what's a road/path (a potential route). I would expect this is already in the data in some way. It could be as simple as which colour any given line is.
2 - Calculating a route over those paths. This is well documented/discussed and there are plenty of algorithms etc. out there on the problem. These days it's hardly worth trying very hard for elegance/efficiency, you can just throw CPU cycles at it until an answer pops out.
Also, should this be tagged [homework]?

What makes object representation and recognition hard?

Intuitively, it would seem that given a dozen or so 2d images from different angles of almost any object, it should be easy to construct a 3d representation of that object. Subsequently a library of 3d representations attained in this way could be used to identify new 2d images.
What literature is there along these lines, and why has it not yet produced strong object recognition?
It is your word "intuitively" that is causing you trouble here. Your brain is not designed to be very good at certain tasks, like multiplying thousands of numbers in an instant. However, for raw computational power your brain makes the fastest computer look like mere tiddlywinks (neural response times are only about 10 milliseconds, but all those 10^11 or so neurons working in parallel totally beat any modern machine). It's just that your brain is designed to solve problems that are vastly more computationally complex, like recognizing objects in a picture, parsing sound data and picking out individual speakers amidst background noise, and learning to classify and deal with tens of thousands of types of objects.
The incredibly computationally intense things your brain is designed to do really well are the things that, to a person, seem "intuitive". The things it isn't designed to do really well seem "unintuitive" or difficult. But the raw computation needed for strong object recognition (because there are just so MANY kinds of objects, many of which really have subobjects, multiple classifications, and non-rigid forms, e.g. "trousers", "water", "dog") is WAY more than what is needed to accomplish the things one considers possible only for a computer. Things like using "common sense" to solve an everyday problem are similarly trivial for a person, but computationally incredibly complex.
What you want to do is indeed possible, but (there are quite a few buts)
for the 3D reconstruction:
For anything but the simplest shapes you need more than just a few dozen images.
The shape you are reconstructing needs to have a lot of recognizable features that look similar enough from different angles so that you can match them.
Lighting needs to be fairly constant over your entire set of images, otherwise shadows will throw you off (or you need even more images)
even with very feature-rich objects (i.e. a lot of variation in colour and shape), 3D reconstruction accuracy from any matched pair of features is going to be terrible if you do not have full knowledge of the parameters (position, view direction and opening angle) of the camera used to take each picture.
These are all problems that can be solved, so suppose you did, and now you have a new picture of the object that you want to match to your 3D shape.
You could of course try to find a 2D projection of your shape that fit the new picture, but the search space there is enormous. It would probably be a lot easier and faster to use the feature finding and matching system you built for the initial 3D reconstruction to directly match the new picture to the existing set, and find where it fits on the object that way.
So once you've solved the problem of creating the initial 3D reconstruction your second step is basically done as well.
Photosynth is a brilliant example of these two steps. Browse the site, try to find some of the references they have there.
As for your final step, strong object recognition, just imagine the search space! What you need for strong object recognition, apart from a good representation of the objects you want to recognize, is a good way to search the space of objects you know, and a good way to represent your new object (the image of an object in this case) in that space. This is something I know nearly nothing about.
For just matching the same object in different 2D images there are SIFT features. But I don't think this translates well to 3D.
Note that what you're describing is instance recognition. Computers can indeed do a good job of instance recognition these days. For example, Google Goggles is very good at recognizing landmarks like the Golden Gate Bridge and the Eiffel Tower.
However, computers are less good at doing category recognition and classification. Creating dozens of 2D snapshots for all possible objects under all types of lighting conditions etc. becomes intractable very quickly. The fact that certain objects such as a dog can move around makes the space of possibilities even bigger. Computers become much worse at this.
Also, from the biological standpoint, our visual field is around 100 million pixels. Graphics cards have only now started to become capable of rendering that much data in real-time. Making sense of that much data is even more computationally intensive.
One often talks about having a machine reach a 5-year-old's ability to process information. But let's think about how much data that is: 100 million pixels with 3 color channels at 1 byte each is 300 MB per frame. Multiply that by 30 frames per second, 31,556,926 seconds per year, and 5 years, and you end up with roughly 1.4 exabytes (1.4x10^18 bytes).