What's the most effective way to store a chemical formula?

I currently have a HashSet of NElement objects. Each NElement object has a unique Element field and an integer n.
Here are the 2 operations I need to perform on the data:
1. Iterate over all the values in the collection.
2. Given an Element e, search the collection for the NElement instance that has e and process it.
Here's an example of #2:
public void Add(NElement ne) {
foreach(NElement ne2 in elements) { //elements is the HashSet
if(ne2.element == ne.element) {
ne2.Number += ne.Number; //Number is the integer
return;
}
}
elements.Add(ne);
}
I think there is a better way to accomplish this using a collection other than a List or Set. Any suggestions?

A possible solution would be a slightly different design. A molecular formula consists of a set of elements along with how many of each element there are. So a possible solution is to have a MolecularFormula class that wraps this information, backed by a Map<Element, Integer>.
A possible example:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MolecularFormula
{
    // Java generics need the boxed type Integer, not int
    private Map<Element, Integer> elements = new HashMap<Element, Integer>();
    //... Constructors etc
    //A list to iterate through all values
    public List<NElement> getElements()
    {
        List<NElement> retList = new ArrayList<NElement>();
        for (Map.Entry<Element, Integer> entry : elements.entrySet())
        {
            retList.add(new NElement(entry.getKey(), entry.getValue()));
        }
        return retList;
    }
    //To add something
    public void add(Element e, int num)
    {
        // put() overwrites an existing entry, so no remove() is needed
        Integer current = elements.get(e);
        elements.put(e, current == null ? num : current + num);
    }
}
This is hastily thrown together and not very efficient at all, but it should give you an idea of a possible option.

Try using SMARTS, SMILES, InChI or ASL. The first two are open source, I believe. InChI is maintained by the IUPAC, and is nicely hashable for database use. ASL is proprietary to Schrödinger, Inc, though if you are already using Schrödinger software, I'd recommend using their Python API directly.
Using any of these tools, you could find functional groups (or atoms) described by a specific SMARTS/SMILES/ASL string within a molecule described by SMARTS/SMILES/ASL.

How to for loop with iterators if vector type is parent of two child types filling vector

I have this problem:
This is my loop for the previously used child type:
std::vector<Coin>::iterator coin;
for (coin = coinVec.begin(); coin != coinVec.end(); ++coin)
{
sf::FloatRect coinBounds = coin->getGlobalBounds();
if (viewBox.intersects(coinBounds))
{
if (playerBounds.intersects(coinBounds))
{
coin = coinVec.erase(coin);
break;
}
else coin->setTextureRect(currentFrame);
}
}
And there is a similar one for std::vector<Heal>.
I restructured my code so that Coin is now a child of Collectables.
There is now only one vector, std::vector<Collectables>, which contains objects of all the collectable child classes, like Coin, Heal etc. Is there a way to make the code work with only one loop and iterators? If yes, how would you do that?
Thanks
I found out that my code is so universal that I don't need to separate the vector elements at all.
My solution is here:
std::vector<Collectables>::iterator collect;
for (collect = collectVec.begin(); collect != collectVec.end(); ++collect)
{
sf::FloatRect collectBounds = collect->getGlobalBounds();
if (viewBox.intersects(collectBounds))
{
if (playerBounds.intersects(collectBounds))
{
collect = collectVec.erase(collect);
break;
}
else collect->setTextureRect(currentFrame);
}
}
And if you really need to separate elements, try this:
if (collect->getType() == ResourceManager::enums::textures::coin)
{
} else
if (collect->getType() == ResourceManager::enums::textures::health)
{
}
where getType simply returns an int id matching the enum value.
Although it seems that you have solved the problem, it is probably worth explaining how you would solve the initial problem.
As mentioned in the comments, storing derived objects by value in a vector of the base class results in object slicing, e.g.
std::vector<BaseClass> vect;
The elements of this vector can only store the part of each object that is common to all the classes, i.e. what is defined in BaseClass. If you want polymorphic behaviour (which is what you are trying to achieve in the example), you should store a vector of pointers, like this:
std::vector<BaseClass*> vect;
Each pointer points to the address of the complete object, and thus object slicing won't happen.
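To make the difference concrete, here is a minimal sketch. The Collectable, Coin and Heal types and their name() method are hypothetical stand-ins, not the classes from the question, and std::unique_ptr is used in place of raw pointers so the vector owns its elements (this needs C++14 for std::make_unique).

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the Collectables hierarchy in the question.
struct Collectable {
    virtual ~Collectable() = default;
    virtual std::string name() const { return "collectable"; }
};

struct Coin : Collectable {
    std::string name() const override { return "coin"; }
};

struct Heal : Collectable {
    std::string name() const override { return "heal"; }
};

// Storing by value copies only the Collectable part of each object (slicing),
// so every element reports the base-class name.
std::vector<std::string> namesByValue() {
    std::vector<Collectable> items;
    items.push_back(Coin{});
    items.push_back(Heal{});
    std::vector<std::string> out;
    for (const Collectable& c : items) out.push_back(c.name());
    return out;
}

// Storing owning pointers keeps the dynamic type, so virtual calls dispatch
// to Coin::name() and Heal::name().
std::vector<std::string> namesByPointer() {
    std::vector<std::unique_ptr<Collectable>> items;
    items.push_back(std::make_unique<Coin>());
    items.push_back(std::make_unique<Heal>());
    std::vector<std::string> out;
    for (const auto& c : items) out.push_back(c->name());
    return out;
}
```

With these definitions, namesByValue() yields "collectable" twice, while namesByPointer() yields "coin" and "heal".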

Dijkstra's Algorithm in String-type Graph

I am making an inter-city route planning program where the graph that is formed has string-type nodes (e.g. LHR, ISB, DXB). It's undirected but weighted, and is initialized as:
map<pair<string, string>, int> city;
and then I can add edges by for example:
Graph g;
g.addEdge("DXB", "LHR", 305);
g.addEdge("HTR", "LHR", 267);
g.addEdge("HTR", "ISB", 543);
and the resultant output will be:
ISB LHR
DXB 0 305
HTR 543 267
Now, the question... I'm trying to implement Dijkstra's algorithm on this graph, but so far I have been unable to run it correctly on string-type nodes, as opposed to learning and doing it on int-type nodes. Can someone guide me through the correct steps of implementing it in the most efficient way possible?
The data structure used by a graph application has a big impact on the efficiency and ease of coding.
Many designs start off with the nodes. I guess the nodes, in the problems that are being modelled, often have a physical reality while the links can be abstract relationships. So it is more natural to start writing a node class, and add on the links later.
However, when coding algorithms that solve problems in graph theory, it becomes clear that the links are the real focus. So, let's start with a link class.
class cLink
{
public:
cLink(int c = 1)
: myCost(c)
{
}
int myCost; // a constraint e.g. distance of a road, max capacity of a pipe
int myValue; // a calculated value, e.g. actual flow through a pipe
};
If we store the out edges of node in a map keyed by the destination node index, then the map will be an attribute of the source node.
class cNode
{
public:
cNode(const std::string &name = "???")
: myName(name)
{
}
std::string myName;
std::map<int, cLink> myLink; // out edges of node, keyed by destination
};
We have links and nodes, so we are ready to construct a class to store the graph
class cGraph {
public:
std::map<int, cNode> myG; // the graph, keyed by internal node index
};
Where did the node index come from? Humans are terrible at counting, so it is better for the computer to generate the index when the node is added.
void cGraph::createNode( const std::string& name )
{
int n = myG.size();
myG.insert(std::make_pair(n, cNode(name)));
}
Don't implement this! It has a snag - it can create two nodes with the same name. We need to be able to check if node with a specified name exists.
int cGraph::find(const std::string &name)
{
for (auto n : myG)
{
if (n.second.myName == name)
{
return n.first;
}
}
return -1;
}
This is inefficient. However, it only needs to be done once when the node is added. Then the algorithms that search through the graph can use fast lookup of nodes by index number.
Now we can prevent two nodes being created with the same name
int cGraph::findoradd(const std::string &name)
{
// search among the existing nodes
int n = find(name);
if (n < 0)
{
// node does not exist, create a new one
// with a new index and add it to the graph
n = myG.size();
myG.insert(std::make_pair(n, cNode(name)));
}
return n;
}
Humans, in addition to being terrible counters, are also overconfident in their counting prowess. When they specify a graph like this
1 -> 2
1 -> 3
Let’s not be taken in. Let’s regard these numbers as names and continue to use our own node index system.
/** Add costed link between two nodes
*
* If the nodes do not exist, they will be added.
*
*/
void addLink(
const std::string & srcname,
const std::string & dstname,
int cost = 1) // int, to match the cLink(int) constructor above
{
int u = findoradd(srcname);
int v = findoradd(dstname);
myG.find(u)->second.myLink.insert(
std::make_pair(v, cLink(cost)));
if (!myfDirected)
myG.find(v)->second.myLink.insert(
std::make_pair(u, cLink(cost)));
}
With the addition of a few getters and setters, we are ready to start implementing graph algorithms!
To see a complete implementation, including Dijkstra, using these ideas, check out PathFinder.
The core problem is that when we work on graphs with integer vertices, the index into the adjacency list represents the node (since the indexes are also numbers). Now, instead of using an adjacency list like vector<pair<int, int> > adj[N], we can use map<string, vector<pair<string, int> > > adj. Now adj["DXB"] will contain a vector of pairs of the form <string, int>, which is the <name, weight> for each city connected to "DXB".
If this approach seems too complex, you can use some extra memory to map each city to a number, and then code everything as if the graph had integer vertices.
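As a rough sketch of the string-keyed approach, Dijkstra's algorithm could look something like this. The Graph alias and the addEdge and dijkstra helpers are names made up for this example (they are not the asker's Graph class), and the code assumes non-negative edge weights.

```cpp
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Adjacency list keyed by city name: name -> list of (neighbour, weight).
using Graph = std::map<std::string, std::vector<std::pair<std::string, int>>>;

// Undirected edge: store it in both directions.
void addEdge(Graph& g, const std::string& a, const std::string& b, int w) {
    g[a].push_back({b, w});
    g[b].push_back({a, w});
}

// Returns the shortest distance from source to every reachable city.
// Unreachable cities are simply absent from the result.
std::map<std::string, int> dijkstra(const Graph& g, const std::string& source) {
    std::map<std::string, int> dist;
    using Item = std::pair<int, std::string>;  // (distance so far, city)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;

    dist[source] = 0;
    pq.push({0, source});
    while (!pq.empty()) {
        Item top = pq.top();
        pq.pop();
        const int d = top.first;
        const std::string& u = top.second;
        if (d > dist[u]) continue;  // stale queue entry, a shorter path was found already

        auto it = g.find(u);
        if (it == g.end()) continue;  // city without outgoing edges
        for (const auto& edge : it->second) {
            const int candidate = d + edge.second;
            auto found = dist.find(edge.first);
            if (found == dist.end() || candidate < found->second) {
                dist[edge.first] = candidate;
                pq.push({candidate, edge.first});
            }
        }
    }
    return dist;
}
```

With the three edges from the question (DXB-LHR 305, HTR-LHR 267, HTR-ISB 543), dijkstra(g, "DXB") gives 305 for LHR, 572 for HTR and 1115 for ISB. Swapping std::map for std::unordered_map would likely speed up lookups, at the cost of unordered iteration.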

Object oriented design for hotel reservation system

I am practicing object oriented design for an upcoming interview. My question is about the design for a hotel reservation system:
- The system should be able to return an open room of a specific type or return all the open rooms in the hotel.
- There are many types of rooms in hotel like regular, luxury, celebrity and so on.
So far I have come up with following classes:
class Room{
public:
    //Information about room
    virtual string getSpecifications(Room *room){ return ""; }
    virtual ~Room(){}
};
class regularRoom: public Room{
    //get specifications for regular room
};
class luxuryRoom: public Room{
    //get specifications for luxury room
};
//Similarly create as many specialized rooms as you want
class hotel{
    vector<Room *> openRooms; //These are all the open rooms (stored as Room pointers)
public:
    Room* search(Room *aRoom){ //Search room of a specific type
        for(size_t i=0;i<openRooms.size();i++){
            if(typeid(*aRoom)==typeid(*openRooms[i])) return openRooms[i];
        }
        return nullptr; //no open room of that type
    }
    vector<Room*> allOpenRooms(){//Return all open rooms
        ...
    }
};
I am confused about the implementation of hotel.search() method where I am checking the type (which I believe should be handled by polymorphism in some way). Is there a better way of designing this system so that the search and allOpenRooms methods can be implemented without explicitly checking the type of the objects?
Going through the sub-class objects asking what type they are isn't really a good illustration of o-o design. You really need something you want to do to all rooms without being aware of what type each one is. For example print out the daily room menu for the room (which might be different for different types).
Deliberately looking for the sub-class object's type, while not being wrong, is not great o-o style. If you just want to do that, as the other respondents have said, just have "rooms" with a set of properties.
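To illustrate the "do something to all rooms without knowing their type" idea, here is a small sketch. The class names, the dailyMenu() method and the menu texts are all invented for the example; the point is only that the caller never asks a room what type it is.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical room hierarchy; the menu texts are made up for illustration.
class Room {
public:
    virtual ~Room() = default;
    // Behaviour that genuinely differs per room type.
    virtual std::string dailyMenu() const = 0;
};

class RegularRoom : public Room {
public:
    std::string dailyMenu() const override { return "regular menu"; }
};

class LuxuryRoom : public Room {
public:
    std::string dailyMenu() const override { return "luxury menu"; }
};

// The caller treats every room the same way: no typeid, no casts.
std::vector<std::string> collectMenus(const std::vector<std::unique_ptr<Room>>& rooms) {
    std::vector<std::string> menus;
    for (const auto& room : rooms) {
        menus.push_back(room->dailyMenu());
    }
    return menus;
}
```

Adding another room type then only means adding another subclass; collectMenus() does not change.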
You could always let a room carry its real type, instead of comparing the object type:
enum RoomType
{
RegularRoom,
LuxuryRoom
};
class Room{
public:
explicit Room(RoomType room_type) : room_type_(room_type) { }
virtual ~Room(){}
RoomType getRoomType() const { return room_type_; }
private:
RoomType room_type_; // carries room type
};
class regularRoom: public Room{
public:
regularRoom() : Room(RegularRoom){ }
};
Room* search(Room *aRoom)
{
//Search room of a specific type
for(size_t i=0;i<openRooms.size();i++)
{
if (openRooms[i]->getRoomType() == aRoom->getRoomType()) // <<-- compare room type
{
return openRooms[i];
}
}
return nullptr; // no open room of that type
}
Do the different types of rooms have different behavior? From the description you give, this is not a case where inheritance should be used. Each room simply has an attribute, type, which is, in its simplest form, simply an enum.
The simplest way is to have a Room type enumeration, as @billz suggests. The problem with this approach is that you must remember to add a value to the enumeration, and use it, every time you add a new type of Room to the system. You also have to make sure each enum value is used only once, one per class.
But, on the other hand, inheritance-based designs only make sense if the types in the hierarchy share common behaviour. In other words, you want to use them in the same way, regardless of their type. IMO, an OO/inheritance design is not the best way to do this.
The freaky but scalable way I do this kind of thing is through typelists.
Normally, you have different search criteria for every type in your system. And, in many cases, the results of the search are not the same for different types (searching for a luxury room is not the same as searching for a normal room: you could have different search criteria and/or want different search result data).
For this purpose, the system has three typelists: one containing the data types, one containing the search criteria types, and one containing the search result types:
using system_data_types = type_list<NormalRoom,LuxuryRoom>;
using search_criteria_types = type_list<NormalRoomCriteria,LuxuryRoomCriteria>;
using search_results_types = type_list<NormalRoomSearchResults,LuxuryRoomSearchResults>;
Note that type_lists are sorted in the same manner. This is important, as I show below.
So the implementation of the search engine is:
class SearchEngine
{
private:
std::vector<VectorWrapper*> _data_lists; //A vector containing each system data type in its own vector. (One vector for NormalRoom, one for LuxuryRoom, etc)
//This function returns the vector that stores the system data type passed.
template<typename T>
std::vector<T>& _get_vector() {...} //Implementation explained below.
public:
SearchEngine() {...}//Explanation below.
~SearchEngine() {...}//Explanation below.
//This function adds an instance of a system data type to the "database".
template<typename T>
void addData(const T& data) { _get_vector<T>().push_back( data ); }
//The magic starts here:
template<typename SEARCH_CRITERIA_TYPE>//This template parameter is deduced by the compiler from the function argument, so you can omit it.
typename search_results_types::type_at<search_criteria_types::index_of<SEARCH_CRITERIA_TYPE>> //Return type: the search result that corresponds to the passed criteria. THIS IS WHY THE TYPELISTS MUST BE SORTED IN THE SAME ORDER.
search( const SEARCH_CRITERIA_TYPE& criteria)
{
using system_data_type = system_data_types::type_at<search_criteria_types::index_of<SEARCH_CRITERIA_TYPE>>; //The type of the data to be searched.
std::vector<system_data_type>& data = _get_vector<system_data_type>(); //A reference to the vector where that type of data is stored.
//blah, blah, blah (Search along the vector using the criteria parameter....)
}
};
And the search engine can be used as follows:
int main()
{
SearchEngine engine;
engine.addData(LuxuryRoom());
engine.addData(NormalRoom());
auto luxury_search_results = engine.search(LuxuryRoomCriteria()); //Search LuxuryRooms with the specific criteria and returns a LuxuryRoomSearchResults instance with the results of the search.
auto normal_search_results = engine.search(NormalRoomCriteria()); //Search NormalRooms with the specific criteria and returns a NormalRoomSearchResults instance with the results of the search.
}
The engine stores one vector for every system data type, and keeps those vectors in a vector of its own.
We cannot have a polymorphic reference/pointer to vectors of different types, so we use a wrapper around std::vector:
struct VectorWrapper
{
virtual ~VectorWrapper() {} //Virtual destructor so wrappers can be deleted through a base pointer
};
template<typename T>
struct GenericVectorWrapper : public VectorWrapper
{
std::vector<T> vector;
~GenericVectorWrapper() {};
};
//This template class "builds" the search engine set (vector) of system data types vectors:
template<int type_index>
struct VectorBuilder
{
static void append_data_type_vector(std::vector<VectorWrapper*>& data)
{
data.push_back( new GenericVectorWrapper< system_data_types::type_at<type_index> >() ); //Pushes back a vector that stores the indexth type of system data.
VectorBuilder<type_index+1>::append_data_type_vector(data); //Recursive call
}
};
//Base case (End of the list of system data types)
template<>
struct VectorBuilder<system_data_types::size>
{
static void append_data_type_vector(std::vector<VectorWrapper*>& data) {}
};
So the implementation of SearchEngine::_get_vector<T> is as follows:
template<typename T>
std::vector<T>& _get_vector()
{
GenericVectorWrapper<T>* data; //Pointer to the corresponding vector
data = dynamic_cast<GenericVectorWrapper<T>*>(_data_lists[system_data_types::index_of<T>]); //We try a cast from pointer of wrapper-base-class to the expected type of vector wrapper
if( data )//If the cast succeeds, return a reference to the std::vector<T>
return data->vector;
else
throw std::bad_cast(); //(Needs <typeinfo>.) The cast only fails if T is not a system data type. Note that in that case index_of<T> returns -1, so the indexing above already reads out of bounds.
}
}
The constructor of SearchEngine only uses VectorBuilder to build the list of vectors:
SearchEngine()
{
VectorBuilder<0>::append_data_type_vector(_data_lists);
}
And the destructor only iterates over the list deleting the vectors:
~SearchEngine()
{
for(unsigned int i = 0 ; i < system_data_types::size ; ++i)
delete _data_lists[i];
}
The advantages of this design are:
- The search engine uses exactly the same interface for different searches (searches with different system data types as target). And the process of "linking" a data type to its corresponding search criteria and results is done at compile-time.
- That interface is type safe: a call to SearchEngine::search() returns a type of results based only on the search criteria passed. Assigning the results to the wrong type is detected at compile-time. For example, assigning the result of engine.search(LuxuryRoomCriteria()) to a NormalRoomSearchResults variable generates a compilation error (engine.search<LuxuryRoomCriteria> returns LuxuryRoomSearchResults).
- The search engine is fully scalable: to add a new data type to the system, you only have to add the types to the typelists. The implementation of the search engine does not change.
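The type_list template itself is not shown in this answer. One possible minimal sketch (C++14) that supports the size, type_at and index_of members used above is given below. This is an assumption about how such a typelist could be written, not the answerer's implementation; in particular, index_of here yields size for a missing type rather than -1, and index_of_impl, results_for and data_for are helper names invented for the example.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

// Position of T within Ts..., or sizeof...(Ts) when T is not in the list.
template<typename T, typename... Ts>
constexpr std::size_t index_of_impl() {
    // Trailing 'false' keeps the array non-empty even for an empty list.
    constexpr bool matches[] = { std::is_same<T, Ts>::value..., false };
    std::size_t i = 0;
    while (i < sizeof...(Ts) && !matches[i]) ++i;
    return i;
}

template<typename... Ts>
struct type_list {
    static constexpr std::size_t size = sizeof...(Ts);

    // The I-th type of the list (a compile error if I is out of range).
    template<std::size_t I>
    using type_at = typename std::tuple_element<I, std::tuple<Ts...>>::type;

    // The index of T in the list.
    template<typename T>
    static constexpr std::size_t index_of = index_of_impl<T, Ts...>();
};

// Empty placeholder types standing in for the real room, criteria and result classes.
struct NormalRoom {};
struct LuxuryRoom {};
struct NormalRoomCriteria {};
struct LuxuryRoomCriteria {};
struct NormalRoomSearchResults {};
struct LuxuryRoomSearchResults {};

using system_data_types     = type_list<NormalRoom, LuxuryRoom>;
using search_criteria_types = type_list<NormalRoomCriteria, LuxuryRoomCriteria>;
using search_results_types  = type_list<NormalRoomSearchResults, LuxuryRoomSearchResults>;

// Map a criteria type to the result type and data type at the same position.
template<typename Criteria>
using results_for = search_results_types::type_at<search_criteria_types::index_of<Criteria>>;

template<typename Criteria>
using data_for = system_data_types::type_at<search_criteria_types::index_of<Criteria>>;

static_assert(std::is_same<results_for<LuxuryRoomCriteria>, LuxuryRoomSearchResults>::value,
              "criteria and results are linked by position");
```

Because the lookup is purely positional, the three lists have to stay in the same order, which is the constraint the answer points out.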
Room Class
class Room{
public:
enum Type {
Regular,
Luxury,
Celebrity
};
Room(Type rt):roomType(rt), isOpen(true) { }
Type getRoomType() { return roomType; }
bool getRoomStatus() { return isOpen; }
void setRoomStatus(bool isOpen) { this->isOpen = isOpen; }
private:
Type roomType;
bool isOpen;
};
Hotel Class
class Hotel{
std::map<Room::Type, std::vector<Room*>> openRooms;
//std::map<Room::Type, std::vector<Room*>> reservedRooms;
public:
void addRooms(Room &room) { openRooms[room.getRoomType()].push_back(&room); }
auto getOpenRooms() {
std::vector<Room*> allOpenRooms;
for(auto rt : openRooms)
for(auto r : rt.second)
allOpenRooms.push_back(r);
return allOpenRooms;
}
auto getOpenRoomsOfType(Room::Type rt) {
std::vector<Room*> OpenRooms;
for(auto r : openRooms[rt])
OpenRooms.push_back(r);
return OpenRooms;
}
int totalOpenRooms() {
int roomCount=0;
for(auto rt : openRooms)
roomCount += rt.second.size();
return roomCount;
}
};
Client UseCase:
Hotel Marigold;
Room RegularRoom1(Room::Regular);
Room RegularRoom2(Room::Regular);
Room LuxuryRoom(Room::Luxury);
Marigold.addRooms(RegularRoom1);
Marigold.addRooms(RegularRoom2);
Marigold.addRooms(LuxuryRoom);
auto allRooms = Marigold.getOpenRooms();
auto LRooms = Marigold.getOpenRoomsOfType(Room::Luxury);
auto RRooms = Marigold.getOpenRoomsOfType(Room::Regular);
auto CRooms = Marigold.getOpenRoomsOfType(Room::Celebrity);
cout << " TotalOpenRooms : " << allRooms.size()
<< "\n Luxury : " << LRooms.size()
<< "\n Regular : " << RRooms.size()
<< "\n Celebrity : " << CRooms.size()
<< endl;
Output:
TotalOpenRooms : 3
Luxury : 1
Regular : 2
Celebrity : 0
If you really want to check that a room is of the same type as some other room, then typeid() is as good as any other method - and it's certainly "better" (from a performance perspective, at least) than calling a virtual method.
The other option is to not have separate classes at all, and store the roomtype as a member variable (and that is certainly how I would design it, but that's not a very good design for learning object orientation and inheritance - you don't get to inherit when the base class fulfils all your needs).

Maps and Arrays

I'm working on a project for school and I'm trying to create a map that uses an array of size 2 as the key. I'm not even sure if this is possible, since I don't know how I could access the elements of the map (I really don't know how I could reference an entire array by value). Basically I'm trying to use the map key as a coordinate system that maps to strings. If anyone could let me know whether this is possible and, if it is, what the syntax would be, that would be a great help. Thanks!
I'm doing this project in C++.
If using Java, one approach is to wrap your array in a class and then implement the hashCode and equals methods. These methods are the mechanism by which other objects identify an instance of that class. For example, HashMap uses hashCode to choose the bucket for a key and equals to confirm the match when storing and retrieving.
Here's an example of your wrapper class.
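import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class Point {
    private final int[] coordinates;
    public Point(int x, int y){
        this.coordinates = new int[]{x, y};
    }
    @Override
    public boolean equals(Object o){
        // two points are equal when both coordinates match
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        return Arrays.equals(coordinates, ((Point) o).coordinates);
    }
    @Override
    public int hashCode(){
        // must be consistent with equals: derived from coordinates[0] and coordinates[1]
        return Arrays.hashCode(coordinates);
    }
}
class App {
    public static void main(String[] args){
        Map<Point, String> map = new HashMap<Point, String>();
        map.put(new Point(1,2), "some string");
        // etc...
    }
}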
Well, the easiest way is to just concatenate the values into a string (assuming it's something simple). If you are using ints or floats, just represent {1.2, 4.3} as the string "1.2,4.3" and make your map keys of type string.
ggreiner's answer is a good Java implementation and I've included a comment on his for the C# implementation, but I wouldn't be able to help you on generating a hash code in C++. However, if this is for homework, converting the array to a string will work and is probably what your instructor is expecting.
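Since the question is about C++, it may be worth noting that no hash code is needed there if an ordered std::map is used: std::array (like std::pair) already provides operator<, so it can serve as a key directly. A minimal sketch, where the Coord alias and the makeGrid and lookup helpers are names invented for this example:

```cpp
#include <array>
#include <map>
#include <string>

// A fixed-size array of two ints used as the map key.
using Coord = std::array<int, 2>;

// Build a small map from coordinates to strings.
std::map<Coord, std::string> makeGrid() {
    std::map<Coord, std::string> grid;
    grid[Coord{{1, 2}}] = "some string";
    grid[Coord{{0, 0}}] = "origin";
    return grid;
}

// Look up a coordinate; returns an empty string when nothing is stored there.
std::string lookup(const std::map<Coord, std::string>& grid, int x, int y) {
    auto it = grid.find(Coord{{x, y}});
    return it == grid.end() ? std::string() : it->second;
}
```

A built-in array such as int[2] cannot be used as the key type because it cannot be copied or compared by value, which is why std::array is used here. std::unordered_map would additionally need a user-supplied hash for the key.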

serializing objects in C++ and storing as a blob type in mysql

I am using mysql/C++ connector to connect to a mysql database. I have some complex data structures so I need to serialize those and save in the database.
I tried something like the following.
vector<int> vectorTest(10,100);
istream *blob = NULL;
ostringstream os;
int size_data = sizeof(vector<int>);
blob = new istringstream((char*)&vectorTest, istringstream::in | istringstream::binary);
string qry = "INSERT INTO vector(id,object) VALUES (?,?)";
prep_stmt = con->prepareStatement(qry);
prep_stmt->setInt(1,1);
prep_stmt->setBlob(2,blob);
prep_stmt->execute();
I just tried a small example here. However the vector object is not getting saved.
Alternatively, can I use the following approach?
ostringstream os;
int size_data = sizeof(vector<int>);
os.write((char*)&vectorTest, size_data);
However I don't know how to redirect the outputstream to an inputstream, because the setBlob() method needs an istream as the input parameter.
Can I know how to get any of these examples working? If my approach is incorrect, can anyone provide a code example or improve the given code segment? Your immediate response is greatly appreciated.
Thanks
You're going about this completely the wrong way. This isn't "serialization", in fact it's quite possibly the opposite of serialization -- it's just trying to write out a raw memory dump of a vector into the database. Imagine for a second that vector looked something like this:
struct vector_int {
unsigned int num_elements;
int* elements;
};
Where elements is a dynamically allocated array that holds the elements of the vector.
What you would end up writing out to your database is the value of num_elements and then the value of the pointer elements. The element data would not be written to the database, and if you were to load the pointer location back into a vector on a different run of your program, the location it points to would contain garbage. The same sort of thing will happen with std::vector, since it contains dynamically allocated memory that will be written out as pointer values in your case, and other internal state that may not be valid if reloaded.
The whole point of "serialization" is to avoid this. Serialization means turning a complex object like this into a sequence of bytes that contains all of the information necessary to reconstitute the original object. You need to iterate through the vector and write out each integer that's in it. And moreover, you need to devise a format where, when you read it back in, you can determine where one integer ends and the next begins.
For example, you might whitespace-delimit the ints, and write them out like this:
1413 1812 1 219 4884 -57 12
And then when you read this blob back in you would have to parse this string back into seven separate integers and insert them into a newly-created vector.
Example code to write out:
vector<int> vectorTest(10,100);
ostringstream os;
for (vector<int>::const_iterator i = vectorTest.begin(); i != vectorTest.end(); ++i)
{
os << *i << " ";
}
// Then insert os.str() into the DB as your blob
Example code to read in:
// Say you have a blob string called "blob"
vector<int> vectorTest;
istringstream is(blob);
int n;
while(is >> n) {
vectorTest.push_back(n);
}
Now, this isn't necessarily the most efficient approach, space-wise, since this turns your integers into strings before inserting them into the database, which will take much more space than if you had just inserted them as binary-coded integers. However, the code to write out and read in would be more complex in that case because you would have to concern yourself with how you pack the integers into a byte sequence and how you parse a byte sequence into a bunch of ints. The code above uses strings so that the standard library streams can make this part easy and give a more straightforward demonstration of what serialization entails.
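For comparison, a binary encoding could be sketched roughly as follows. The format is an assumption chosen for this example (a 32-bit element count followed by each value as four bytes, least significant byte first), and the function names are made up; it is one possible layout, not a standard one.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Append a 32-bit value as four bytes, least significant byte first.
void putU32(std::string& out, std::uint32_t v) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<char>((v >> shift) & 0xFF));
    }
}

// Read four bytes starting at pos back into a 32-bit value.
std::uint32_t getU32(const std::string& in, std::size_t pos) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) {
        v |= static_cast<std::uint32_t>(static_cast<unsigned char>(in[pos + i])) << (8 * i);
    }
    return v;
}

// Layout: [count][value 0][value 1]..., each field 4 bytes wide.
std::string serialize(const std::vector<std::int32_t>& values) {
    std::string out;
    putU32(out, static_cast<std::uint32_t>(values.size()));
    for (std::int32_t v : values) {
        putU32(out, static_cast<std::uint32_t>(v));
    }
    return out;
}

// Rebuild the vector; a truncated blob yields only the complete values.
std::vector<std::int32_t> deserialize(const std::string& blob) {
    std::vector<std::int32_t> values;
    if (blob.size() < 4) return values;
    const std::uint32_t count = getU32(blob, 0);
    for (std::uint32_t i = 0; i < count; ++i) {
        const std::size_t pos = 4 + 4 * static_cast<std::size_t>(i);
        if (pos + 4 > blob.size()) break;
        values.push_back(static_cast<std::int32_t>(getU32(blob, pos)));
    }
    return values;
}
```

Writing the bytes in a fixed order keeps the blob readable on machines with a different native byte order. To hand the bytes to setBlob(), which according to the question expects an istream, the resulting string could be wrapped in a std::istringstream.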
My solution to writing to a MySQL database was to use the Visitor design pattern and an abstract base class. I did not use the BLOB data structure, instead used fields (columns):
struct Field
{
// Every field has a name.
virtual const std::string get_field_name(void) = 0;
// Every field value can be converted to a string (except Blobs)
virtual const std::string get_value_as_string(void) = 0;
// {Optional} Every field knows its SQL type.
// This is used in creating the table.
virtual unsigned int get_sql_type(void) = 0;
// {Optional} Every field has a length
virtual size_t get_field_length(void) = 0;
};
I built a hierarchy including fields for numbers, bool, and strings. Given a Field pointer or reference, an SQL INSERT and SELECT statement can be generated.
A Record would be a container of fields. Just provide a for_each() method with a visitor:
struct Field_Functor
{
virtual void operator()(const Field& f) = 0;
};
struct Record
{
void for_each(Field_Functor& functor)
{
//...
functor(field_container[i]); // or something similar
}
};
By using a more true Visitor design pattern, the SQL specifics are moved into the visitor. The visitor knows the field attributes due to the method called. This reduces the Field structure to having only get_field_name and get_value_as_string methods.
struct Field_Integer;
struct Field_String;
struct Field_Double;
struct Visitor_Base
{
    virtual void process(const Field_Integer& fi) = 0;
    virtual void process(const Field_String& fs) = 0;
    virtual void process(const Field_Double& fd) = 0;
};
struct Field_With_Visitor
{
    virtual void accept_visitor(Visitor_Base& vb) = 0;
};
struct Field_Integer : Field_With_Visitor
{
    void accept_visitor(Visitor_Base& vb)
    {
        vb.process(*this);
    }
};
The record using the `Visitor_Base`:
struct Record_Using_Visitor
{
    void accept_visitor(Visitor_Base& visitor)
    {
        // m_fields is assumed to be a container of Field_With_Visitor pointers
        Field_Container::iterator iter;
        for (iter = m_fields.begin();
             iter != m_fields.end();
             ++iter)
        {
            (*iter)->accept_visitor(visitor);
        }
    }
};
My current hurdle is handling BLOB fields with MySQL C++ Connector and wxWidgets.
You may also want to add the tags: MySQL and database to your next questions.
Boost has a serialization library (I have never used it, though). Alternatively, use XML or JSON.