MongoDB GridFS with C++

I want to insert a BSON object from C++ into MongoDB GridFS.
I cannot find useful documentation about the C++ GridFS API.
Can you give an example of how to insert or update a BSON object in a GridFS structure in C++?
Suppose that I have a document like the one given below:
{ "_id" : "123", "data" : [ { "C1" : "value1", "C2" : "value1" } ] }
How can I insert this as GridFS in MongoDB?
P.S.: I tried to insert the data as ordinary collection documents, but I got an error because the document exceeds the maximum document size ("document to be inserted exceeds maxBsonObjectSize").
E.g. in the document above, the "data" array sometimes has more than 500,000 rows, with additional columns as well.
Thanks in advance

The MongoDB C++ driver has a class called GridFS that will allow you to insert documents into your gridded system. GridFS is designed to operate on things abstracted as files: it reads its input as a stream of bytes rather than as some structured thing like a BSON object. You could convert your large BSON objects into strings to store them:
GridFS grid(*client, "database_name");
std::string data = myLargeBSONObj.toString();  // serialize the BSON object to a string
BSONObj result = grid.storeFile(data.c_str(), data.size(), "filename");
Or, you can store your large objects in a file and stream them into GridFS instead:
BSONObj result = grid->storeFile("path/to/file", "filename");
GridFS will not allow you to update a file once it has been inserted into the system. There is no safe way within the defined API to allow updates without encountering race conditions. To "update" a GridFS file, you'll have to read it out of the system, alter it in your application, and re-insert it into GridFS as a new file. This is slow and tedious. If you need to perform updates, I suggest that you instead re-design your schema so that your documents are smaller than 16 MB.
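For completeness, a rough sketch of that read-modify-reinsert cycle, continuing from the snippet above, might look like the following. It assumes the legacy driver's GridFile helpers (findFile, exists, write) and removeFile, so verify the exact method names against the driver version you are using:
GridFile file = grid.findFile("filename");
if (file.exists()) {
    std::stringstream buffer;
    file.write(buffer);                          // read the stored bytes back out
    std::string contents = buffer.str();
    // ... alter 'contents' in your application ...
    grid.removeFile("filename");                 // drop the old version
    grid.storeFile(contents.c_str(), contents.size(), "filename");
}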

Related

What are the differences between object storage (for example S3) and a columnar-based technology?

I was thinking about the difference between those two approaches.
Imagine you must handle information about pattern calls, which should later be displayed to the user. A pattern call is a tuple consisting of a unique integer identifier ("id"), a user-defined name ("name"), a project-relative path to the so-called pattern file ("patternFile"), and a convenience flag which states whether the pattern should be called or not. The number of tuples is not known in advance, and they won't be modified after initialization.
I thought that in this case a column-based approach, with BigQuery for example, would be better in terms of I/O and performance as well as the evolution of the schema. But actually I can't understand why. I would appreciate any help.
Amazon S3 is like a large key-value store. The Key is the filename (with full path) and the Value is the contents of the file. It's just a blob of data.
A columnar data store organizes data in such a way that specific data can be "jumped to", and only desired values need to be read from disk.
If you want to search the data, then some form of query logic is required. This could be done by storing the data in a database (typically a proprietary format) or by using a columnar storage format such as Parquet or ORC plus a query engine that understands this format (e.g. Amazon Athena).
The difference between S3 and columnar data stores is like the difference between a disk drive and an Oracle database.

How to insert multiple values (e.g. JSON encoding) in the value parameter of the put() method in LevelDB in C++

I have been trying to insert key-value pairs into a database using LevelDB, and it works fine with simple strings. However, if I want to store multiple attributes for a key, or for example use JSON encoding, how can it be done in C++? In the Node.js LevelDB package it can be done by specifying the encoding. I really can't figure this out.
JSON is just a string, so I'm not completely sure where you're coming from here…
If you have some sort of in-memory representation of JSON you'll need to serialize it before writing it to the database and parse it when reading.
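As a rough illustration of that serialize-then-Put / Get-then-parse round trip, here is a minimal sketch. It uses the nlohmann/json header-only library purely as an example serializer (that choice is an assumption, not something LevelDB requires); any library that turns your structure into a string and back works the same way:
#include <cassert>
#include <string>
#include <leveldb/db.h>
#include <nlohmann/json.hpp>   // example JSON library; any serializer works

int main() {
    leveldb::DB* db = nullptr;
    leveldb::Options options;
    options.create_if_missing = true;
    leveldb::Status s = leveldb::DB::Open(options, "/tmp/testdb", &db);
    assert(s.ok());

    // Build a structured value and serialize it to a JSON string before writing.
    nlohmann::json value = { {"C1", "value1"}, {"C2", "value2"} };
    s = db->Put(leveldb::WriteOptions(), "key123", value.dump());
    assert(s.ok());

    // Read the raw string back and parse it into a structure again.
    std::string raw;
    s = db->Get(leveldb::ReadOptions(), "key123", &raw);
    nlohmann::json parsed = nlohmann::json::parse(raw);

    delete db;
    return 0;
}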

C++ persistence in database

I would like to persist some of my objects in a database (this could be relational (postgresql or MariaDB) or MongoDB). I have found a number of libraries that seem potentially useful, but I am missing the overall picture.
I have used boost::serialization to serialize C++ objects to XML / binary, but it is not clear to me how to get this into the database (do I use the binary or the XML format?).
How do I get this into my mongoDB or postgresql?
You'd serialize to binary, as it is smaller and much faster. Also, the XML format isn't really pretty/easy to use outside of Boost Serialization anyway.
WARNING: Use Boost Portable Archive (EPA http://epa.codeplex.com/) if you need to use the format across different machines.
You'd usually store it in one of two kinds of column:
- text or CLOB (character large object): encode the binary archive in base64 and store that in the database's native charset (base64 is safe even for ASCII).
- BLOB (binary large object): no encoding is needed, and it can be more efficient storage-wise.
Note: if you need to index, store the indexed properties in normal database columns.
Finally, if you like, I have recently made a stream buffer that allows you to stream data directly into a SQLite BLOB column. Perhaps you can glean some ideas from it:
How to boost::serialize into a sqlite::blob?
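To make the binary-archive route concrete, here is a minimal sketch of serializing an object to an in-memory byte string with Boost.Serialization; the resulting string is what you would bind as a BLOB parameter (or base64-encode for a text column) with whichever database driver you use. MyObject is just a placeholder type:
#include <sstream>
#include <string>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/binary_iarchive.hpp>

struct MyObject {                       // placeholder for your own type
    int id;
    std::string name;

    template <class Archive>
    void serialize(Archive& ar, const unsigned int /*version*/) {
        ar & id;
        ar & name;
    }
};

// Serialize to an in-memory buffer that can be bound as a BLOB parameter.
std::string toBlob(const MyObject& obj) {
    std::ostringstream oss(std::ios::binary);
    boost::archive::binary_oarchive oa(oss);
    oa << obj;
    return oss.str();
}

// The reverse direction, for when you read the BLOB back out of the database.
MyObject fromBlob(const std::string& blob) {
    std::istringstream iss(blob, std::ios::binary);
    boost::archive::binary_iarchive ia(iss);
    MyObject obj;
    ia >> obj;
    return obj;
}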

Visual C++, CMap object save to blob column

I have a Microsoft Foundation Class (MFC) CMap object where each instance stores roughly 160K entries of long data.
I need to store it on Oracle SQL.
We decided to save it as a BLOB since we do not want to make an additional table. We thought about saving it as local file and point the SQL column to that file, but we'd rather just keep it as BLOB on the server and clear the table every couple of weeks.
The table has a sequential key ID, and 2 columns of date/time. I need to add the BLOB column in order to store the CMap object.
Can you recommend a guide for doing so (reading/writing the map to a BLOB, or maybe a CLOB)?
How do I create a BLOB field in Oracle, and how can I read and write my object to the BLOB? Perhaps using a CLOB?
A CMap cannot be inserted into a BLOB/CLOB directly, since internally it uses pointers.
First of all, use a CLOB, and store an array/vector of the entries instead of the CMap itself.
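As one way to follow that advice, here is a rough sketch (assuming long keys and long values) that flattens the CMap into a contiguous vector of entries; the vector's raw bytes, or a text encoding of them, are what you would then bind to the BLOB/CLOB column:
#include <afxtempl.h>   // CMap
#include <vector>

struct Entry { long key; long value; };

// Copy the map's contents into a flat buffer suitable for binding as a BLOB.
std::vector<Entry> Flatten(const CMap<long, long, long, long>& map)
{
    std::vector<Entry> out;
    out.reserve(static_cast<size_t>(map.GetCount()));
    POSITION pos = map.GetStartPosition();
    while (pos != NULL)
    {
        Entry e;
        map.GetNextAssoc(pos, e.key, e.value);
        out.push_back(e);
    }
    return out;   // &out[0], out.size() * sizeof(Entry) is the byte range to bind
}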

MySQL, C++: Need Blob size to read blob data

How do I get the size of data in a BLOB field in the Result Set? (Using C++ and MySQL Connector C++)
In order to read the data from the result set, I have to allocate memory for it first. In order to allocate memory, I need to know the size of the blob data in the result set.
Searching the web and StackOverflow, I have found two methods: OCTET_LENGTH() and the BLOB stream.
One method to find the BLOB size is to use the OCTET_LENGTH() function, which requires a new query and produces a new result set. I would rather not use this method.
Another method is to use the blob stream, seek to the end, and get the file position. However, I don't know if the stream can be rewound to the beginning in order to read the data. This method requires an additional read of the entire stream.
The ResultSet and ResultSetMetaData interfaces of MySQL Connector C++ 1.0.5 do not provide a method for obtaining the size of the data in a field (column).
Is there a process for obtaining the size of the data in a BLOB field given only the result set and a field name?
I am using MySQL Connector C++ 1.0.5, C++, Visual Studio 2008, Windows Vista / XP and "Server version: 5.1.41-community MySQL Community Server (GPL)".
You could do a select like:
select LENGTH(content), content from your_table where id = 123;
where content is the BLOB field and your_table stands for the table you are reading from.
Regards.
see: LENGTH(str)
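Putting that together with Connector/C++, a rough sketch of reading the length and the BLOB in one round trip might look like the following. The table and column names are placeholders, and it assumes the JDBC-style 1.0.x API (getUInt64, and getBlob() returning a std::istream*):
#include <istream>
#include <vector>
#include <cppconn/connection.h>
#include <cppconn/statement.h>
#include <cppconn/resultset.h>

// 'con' is an already-open sql::Connection*; table and column names are placeholders.
std::vector<char> readBlob(sql::Connection* con)
{
    sql::Statement* stmt = con->createStatement();
    sql::ResultSet* res = stmt->executeQuery(
        "SELECT LENGTH(content) AS len, content FROM my_table WHERE id = 123");

    std::vector<char> buffer;
    if (res->next()) {
        size_t len = static_cast<size_t>(res->getUInt64("len"));
        buffer.resize(len);
        if (len > 0) {
            std::istream* blob = res->getBlob("content");   // stream over the BLOB bytes
            blob->read(&buffer[0], static_cast<std::streamsize>(len));
        }
    }

    delete res;
    delete stmt;
    return buffer;
}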