How to handle long strings with ODBC? - C++

I'm using ODBC SQLGetData to retrieve string data, using a 256-byte buffer by default. If the buffer is too short, I'm allocating a new buffer large enough for the string and calling SQLGetData() again.
It seems that calling this a second time only returns what was left after the last call, and not the whole field.
Is there any way to 'reset' this behaviour so SQLGetData returns the whole field into the second buffer?
char buffer[256];
SQLLEN sizeNeeded = 0;
SQLRETURN ret = SQLGetData(_statement, _columnIndex, SQL_C_CHAR, (SQLCHAR*)buffer, sizeof(buffer), &sizeNeeded);
if(ret == SQL_SUCCESS)
{
    return std::string(buffer);
}
else if(ret == SQL_SUCCESS_WITH_INFO)
{
    std::auto_ptr<char> largeBuffer(new char[sizeNeeded + 1]);
    // Doesn't return the whole field, only what was left...
    SQLGetData(_statement, _columnIndex, SQL_C_CHAR, (SQLCHAR*)largeBuffer.get(), sizeNeeded, &sizeNeeded);
}
Thanks for any help!

It is the caller's responsibility to put the data together; the limitation on returning the data in chunks could be due to the database provider and not your code, so you need to be able to handle the case either way.
Also, your code has a logic flaw: you might have to call SQLGetData multiple times, and each call can return an additional chunk of data with SQL_SUCCESS_WITH_INFO/SQLSTATE 01004 that needs to be appended in a loop.

If you are interested in "resetting" the fetch buffer, I believe that the position in the column is only preserved if the column name/index is the same for two consecutive calls. In other words, calling SQLGetData with a different column name/index should reset the position in the original column. Here's a snippet from MSDN:
Successive calls to SQLGetData will retrieve data from the last column requested; prior offsets become invalid. For example, when the following sequence is performed:
SQLGetData(icol=n), SQLGetData(icol=m), SQLGetData(icol=n)
the second call to SQLGetData(icol=n) retrieves data from the start of the n column. Any offset in the data due to earlier calls to SQLGetData for the column is no longer valid.
I don't have the ODBC spec handy, but MSDN seems to indicate that this is the expected behavior. Personally, I have always accumulated the result of multiple calls directly into a string using a fixed size buffer.
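For illustration, here is a minimal sketch of that accumulate-in-a-loop approach, assuming the same _statement and _columnIndex variables as in the question, with error handling trimmed:
std::string result;
char buffer[256];
SQLLEN indicator = 0;
for (;;) {
    SQLRETURN ret = SQLGetData(_statement, _columnIndex, SQL_C_CHAR, buffer, sizeof(buffer), &indicator);
    if (ret != SQL_SUCCESS && ret != SQL_SUCCESS_WITH_INFO)
        break;                 // SQL_NO_DATA or an error: nothing more to read
    if (indicator == SQL_NULL_DATA)
        break;                 // the column is NULL
    result.append(buffer);     // each chunk is NUL-terminated by the driver
    if (ret == SQL_SUCCESS)
        break;                 // that was the final chunk
    // SQL_SUCCESS_WITH_INFO / SQLSTATE 01004 means the chunk was truncated: loop again
}
return result;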

Related

Capnp: Move to previous position in BufferedInputStreamWrapper

I have a binary file with multiple Capnp messages which I want to read. Reading sequentially works well, but I have a use case where I want to jump to a previously known position.
The data consists of sequential images with metadata, including their timestamps. I would like to be able to jump back and forth (like in a video player).
This is what I have tried:
int fd = open(filePath.c_str(), O_RDONLY);
kj::FdInputStream fdStream(fd);
kj::BufferedInputStreamWrapper bufferedStream(fdStream);
for (;;) {
    kj::ArrayPtr<const kj::byte> framePtr = bufferedStream.tryGetReadBuffer();
    if (framePtr != nullptr) {
        capnp::PackedMessageReader message(bufferedStream);
        // This should reset the buffer to the last read message?
        bufferedStream.read((void*)framePtr.begin(), framePtr.size());
        // ...
    }
    else {
        // reset to beginning
    }
}
But I get this error:
capnp/serialize.c++:186: failed: expected segmentCount < 512; Message has too many segments
I was assuming that tryGetReadBuffer() returns the position and size of the next packed message. But then again, how would the BufferedInputStream know what "a message" is?
Question: How can I get position and size of messages and read these messages later on from the BufferedInputStreamWrapper?
Alternative: Reading the whole file once, take ownership of the data and save it to a vector. Such as described here (https://groups.google.com/forum/#!topic/capnproto/Kg_Su1NnPOY). Better solution all along?
BufferedInputStream is not seekable. In order to seek backwards, you will need to destroy bufferedStream and then seek the underlying file descriptor, e.g. with lseek(), then create a new buffered stream.
Note that reading the current position (in order to pass to lseek() later to go back) is also tricky if a buffered stream is present, since the buffered stream will have read past the position in order to fill the buffer. You could calculate it by subtracting off the buffer size, e.g.:
// Determine current file position, so that we can seek to it later.
off_t messageStartPos = lseek(fd, 0, SEEK_CUR) -
    bufferedStream.tryGetReadBuffer().size();

// Read a message
{
    capnp::PackedMessageReader message(bufferedStream);
    // ... do stuff with `message` ...

    // Note that `message` is destroyed at this }. It's important that this
    // happens before querying the buffered stream again, because
    // PackedMessageReader updates the buffer position in its destructor.
}

// Determine the end position of the message (if you need it?).
off_t messageEndPos = lseek(fd, 0, SEEK_CUR) -
    bufferedStream.tryGetReadBuffer().size();
bufferedStream.read((void*)framePtr.begin(), framePtr.size());
FWIW, the effect of this line is "advance past the current buffer and on to the next one". You don't want to do this when using PackedMessageReader, as it will already have advanced the stream itself. In fact, because PackedMessageReader might have already advanced past the current buffer, framePtr may now be invalid, and this line might segfault.
Alternative: Reading the whole file once, take ownership of the data and save it to a vector. Such as described here (https://groups.google.com/forum/#!topic/capnproto/Kg_Su1NnPOY). Better solution all along?
If the file fits comfortably in RAM, then reading it upfront is usually fine, and probably a good idea if you expect to be seeking back and forth a lot.
Another option is to mmap() it. This makes it appear as if the file is in RAM, but the operating system will actually read in the contents on-demand when you access them.
However, I don't think this will actually simplify the code much. Now you'll be dealing with an ArrayInputStream (a subclass of BufferedInputStream). To "seek" you would create a new ArrayInputStream based on a slice of the buffer starting at the point where you want to start.
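To make that concrete, here is a rough sketch of the read-it-all-upfront variant (the same slicing idea works over an mmap()ed region). It assumes you have recorded the byte offsets of message starts, e.g. with the messageStartPos technique above; readWholeFile and readMessageAt are just illustrative names.
#include <capnp/serialize-packed.h>
#include <kj/io.h>
#include <fcntl.h>
#include <unistd.h>
#include <vector>

std::vector<kj::byte> readWholeFile(const char* path) {
    int fd = open(path, O_RDONLY);
    off_t size = lseek(fd, 0, SEEK_END);
    lseek(fd, 0, SEEK_SET);
    std::vector<kj::byte> data((size_t)size);
    kj::FdInputStream(fd).read(data.data(), data.size());
    close(fd);
    return data;
}

void readMessageAt(const std::vector<kj::byte>& data, size_t offset) {
    // Constructing a new ArrayInputStream over the tail of the buffer acts as a "seek".
    kj::ArrayInputStream stream(kj::arrayPtr(data.data() + offset, data.size() - offset));
    capnp::PackedMessageReader message(stream);
    // ... do stuff with `message` ...
}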

How to buffer efficiently when writing to 1000s of files in C++

I am quite inexperienced when it comes to C++ I/O operations especially when dealing with buffers etc. so please bear with me.
I have a programme that has a vector of objects (1000s - 10,000s). At each time-step the state of the objects is updated. I want to have the functionality to log a complete state time history for each of these objects.
Currently I have a function that loops through my vector of objects, updates the state, and then calls a logging function which opens the file (ASCII) for that object, writes the state to file, and closes the file (using std::ofstream). The problem is this significantly slows down my run time.
I've been recommended a couple things to do to help speed this up:
Buffer my output to prevent extensive I/O calls to the disk
Write to binary, not ASCII, files
My question mainly concerns 1. Specifically, how would I actually implement this? Would each object effectively require its own buffer? Or would this be a single buffer that somehow knows which file to send each bit of data to? If the latter, what is the best way to achieve this?
Thanks!
Maybe the simplest idea first: instead of logging to separate files, why not log everything to an SQLite database?
Given the following table structure:
create table iterations (
    id integer not null,
    iteration integer not null,
    value text not null
);
At the start of the program, prepare a statement once:
sqlite3_stmt *stmt;
sqlite3_prepare_v3(db, "insert into iterations values(?,?,?)", -1, SQLITE_PREPARE_PERSISTENT, &stmt, NULL);
The question marks here are placeholders for future values.
After every iteration of your simulation, you could walk your state vector and execute the stmt a number of times to actually insert rows into the database, like so:
for (int i = 0; i < objects.size(); i++) {
    sqlite3_reset(stmt);
    // Fill in the three placeholders and execute the query.
    sqlite3_bind_int(stmt, 1, i);
    sqlite3_bind_int(stmt, 2, current_iteration); // Could be done once, but here for illustration.
    std::string state = objects[i].get_state();
    // SQLITE_STATIC: the buffer stays valid until sqlite3_step below, so SQLite need not copy it.
    sqlite3_bind_text(stmt, 3, state.c_str(), state.size(), SQLITE_STATIC);
    sqlite3_step(stmt); // Execute the query.
}
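One detail worth keeping in mind: outside an explicit transaction, SQLite treats every INSERT as its own transaction and syncs it to disk, which would bring back the I/O cost you are trying to avoid. Wrapping each time-step's batch of inserts in a single transaction keeps it fast; a minimal sketch:
sqlite3_exec(db, "BEGIN TRANSACTION", NULL, NULL, NULL);
for (int i = 0; i < objects.size(); i++) {
    // ... reset, bind and step the prepared statement as above ...
}
sqlite3_exec(db, "COMMIT", NULL, NULL, NULL);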
You can then easily query the history of each individual object using the SQLite command-line tool or any database manager that understands SQLite.

Implementing an ODBC wrapper for SQL Server: reading database data as characters, or asking the driver to convert the data to a C type

I have written a database wrapper using ODBC to communicate with a SQL Server database.
It works, but the way I am doing it is by reading all data types as characters (the number of characters is specified while binding the column using SQLBindCol) and converting the returned characters to the required data type in my application.
I know this method is not very efficient, as I am converting the returned characters to the required data type every time in my application; I can imagine this takes extra time for conversion.
I see the Microsoft reference for SQLBindCol stating:
When it is retrieving data from the data source with SQLFetch, SQLFetchScroll, SQLBulkOperations, or SQLSetPos, the driver converts the data to this type
which is what I need, and I think it would be more efficient than my code (reading everything as characters).
Below is the order of ODBC API function calls:
SQLAllocHandle (to allocate environment handle)
SQLSetEnvAttr
SQLAllocHandle (to allocate database handle)
SQLDriverConnect
SQLAllocHandle (to allocate statement handle)
SQLExecDirect
SQLBindCol
SQLFetch.
Every time I bind a column, I specify the TargetType as SQL_C_CHAR and the number of characters to be pushed into the application buffer (void*) when SQLFetch is called.
When I want to read data that is a big string (for example, XML data) of unknown size, this method does not work.
If I keep reading all the data types as characters the way I do now, how can I read all the data in the returned result column without specifying up front the number of characters to be pushed into the buffer?
I read in the documentation that we can ask the driver to convert the data to the specified C type in SQLBindCol. How do I achieve this?
My structure to store the column information is
struct ColValInfo
{
    ColValInfo() : pValue(0) {}
    SQLPOINTER pValue;        // typedef void * SQLPOINTER;
    SQLINTEGER StrLen_or_Ind; // typedef long SQLINTEGER;
};
pValue is a void pointer. If the driver is to do the conversion and return the data through pValue, what are all the necessary things to be done?
I think you are missing a call to SQLDescribeCol before the SQLBindCol call.
It will give you both the column type and the column size.
You can then allocate your buffers accordingly, using a C type corresponding to the SQL type of the column, and will not have to convert any data.
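A rough sketch of what that looks like (this is not the asker's wrapper code; hStmt and columnIndex are assumed names, only a few SQL types are shown, and error checking is omitted). Note that the indicator passed to SQLBindCol should be a SQLLEN, not the SQLINTEGER used in the struct above:
// Requires <sql.h> and <sqlext.h>.
SQLCHAR colName[256];
SQLSMALLINT nameLen = 0, sqlType = 0, decimalDigits = 0, nullable = 0;
SQLULEN columnSize = 0;
SQLDescribeCol(hStmt, columnIndex, colName, sizeof(colName), &nameLen,
               &sqlType, &columnSize, &decimalDigits, &nullable);

SQLPOINTER pValue = NULL;
SQLLEN indicator = 0;
switch (sqlType) {
case SQL_INTEGER:
    pValue = new SQLINTEGER();
    SQLBindCol(hStmt, columnIndex, SQL_C_SLONG, pValue, sizeof(SQLINTEGER), &indicator);
    break;
case SQL_DOUBLE:
case SQL_FLOAT:
    pValue = new SQLDOUBLE();
    SQLBindCol(hStmt, columnIndex, SQL_C_DOUBLE, pValue, sizeof(SQLDOUBLE), &indicator);
    break;
default: // character data (and a fallback for other types): size the buffer from columnSize
    pValue = new SQLCHAR[columnSize + 1];
    SQLBindCol(hStmt, columnIndex, SQL_C_CHAR, pValue, columnSize + 1, &indicator);
    break;
}
// After SQLFetch, pValue holds the value already converted to the bound C type,
// and indicator holds its length (or SQL_NULL_DATA).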

WriteFile returning error 1784

I am creating a program to populate a disk with a dummy file system.
Currently, I am writing files of variable sizes using WriteFile.
WriteFile(hFile, FileData, i * 1024, &dwWrote, NULL);
err = GetLastError();
err returns #1784 which translates to
The supplied user buffer is not valid for the requested operation. ERROR_INVALID_USER_BUFFER
So for the first 24 files, the write operation works. For file #25 on, the write operation fails.
The files are still created but the WriteFile function does not populate the files.
Any ideas on how to get past ERROR_INVALID_USER_BUFFER?
Every reference I can find to the error is limited to crashing programs and I cannot figure out how it relates to the issue I am experiencing.
EDIT:
FileData = (char *) malloc(sizeof(char) * (size_t)k * 1024);
memset(FileData, 245, sizeof(char) * (size_t)k * 1024);
FileData is allocated to the size of the maximum anticipated buffer.
i is the loop variable that increments until it reaches the maximum size (k).
My guess is that FileData is not large enough for you to write i * 1024 bytes from it. Is i the loop control variable for your list of files? If so, you need the write buffer FileData to grow 1K at a time as you loop through your files.
This is an unusual construct. Are you sure the logic is correct here? Post more code (specifically, all usage of FileData and i) for better accuracy in the answers.
Note that you should not always be checking GetLastError here - you need to check WriteFile's return code before you rely on that being meaningful. Otherwise you could be picking up an error from some unrelated part of your code - whatever failed last.
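For example, a minimal sketch of that check, using the same variables as in the question:
DWORD dwWrote = 0;
if (!WriteFile(hFile, FileData, i * 1024, &dwWrote, NULL))
{
    // Only now is GetLastError() guaranteed to refer to this WriteFile call.
    DWORD err = GetLastError();
    // handle or log err; dwWrote tells you how many bytes were actually written
}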
I got Error 1784, and it was because I opened the file without specifying the record size and then did block reads on the file.
Reset( FileHandle );
Should be
Reset( FileHandle, 1 );

String issue with assert on erase

I am developing a program in C++ that uses std::string to store network data from the socket (this part is peachy). I receive the data in frames of at most 1452 bytes at a time; the protocol uses a fixed 20-byte header that contains the length of the data portion of the packet. My problem is that a string operation is giving me an unknown debug assertion: it asserts, but I get NO message about the string. Now, considering I can receive more than a single packet in a frame at any time, I place all received data into the string, reinterpret_cast it to my data struct, calculate the total length of the packet, and then copy the data portion of the packet into a string for regex processing. At this point I do a string erase, as in mybuff.erase(totalPackLen); <~ THIS is what's calling the assert, but totalPackLen is less than the string's size.
Is there some convention I am missing here? Or is std::string really an inappropriate choice here? Ty.
Fixed it on my own. Rolled my own VERY simple buffer with a few C calls :)
int ret = recv(socket, m_buff, sizeof(m_buff), 0);
if(ret > 0)
{
    BigBuff.append(m_buff, ret);
    while(BigBuff.size() > 16){
        Header *hdr = reinterpret_cast<Header*>(&BigBuff[0]);
        if(ntohs(hdr->PackLen) <= BigBuff.size() - 20){
            hdr->PackLen = ntohs(hdr->PackLen);
            string lData;
            lData.append(BigBuff.begin() + 20, BigBuff.begin() + 20 + hdr->PackLen);
            Parse(lData); //regex parsing helper function
            BigBuff.erase(hdr->PackLen + 20); //assert here when len is packlen is 235 and string len is 1458;
        }
    }
}
From the code snippet you provided it appears that your packet comprises a fixed-length binary header followed by a variable length ASCII string as a payload. Your first mistake is here:
BigBuff.append(m_buff,ret);
There are at least two problems here:
1. Why the append? You presumably have dispatched with any previous messages. You should be starting with a clean slate.
2. Mixing binary and string data can work, but more often than not it doesn't. It is usually better to keep the binary and ASCII data separate. Don't use std::string for non-string data.
Append adds data to the end of the string. The very next statement after the append is a test for a length of 16, which says to me that you should have started fresh. In the same vein you do that reinterpret cast from BigBuff[0]:
Header *hdr = reinterpret_cast<Header*>(&BigBuff[0]);
Because of your use of append, you are perpetually dealing with the header from the first packet received rather than the current packet. Finally, there's that erase:
BigBuff.erase(hdr->PackLen + 20);
Many problems here:
- If the packet length and the return value from recv are consistent the very first call will do nothing (the erase is at but not past the end of the string).
- There is something very wrong if the packet length and the return value from recv are not consistent. It might mean, for example, that multiple physical frames are needed to form a single logical frame, and that in turn means you need to go back to square one.
- Even if the physical and logical frames are one and the same, you're still going about this all wrong. As noted, the first time around you are erasing exactly nothing. That append at the start of the loop is exactly what you don't want to do.
Serialization oftentimes is a low-level concept and is best treated as such.
Your comment doesn't make sense:
BigBuff.erase(hdr->PackLen + 20); //assert here when len is packlen is 235 and string len is 1458;
BigBuff.erase(hdr->PackLen + 20) will erase from position hdr->PackLen + 20 onwards to the end of the string. From the description of the code, it seems to me that you're erasing beyond the end of the content data. See the reference for std::string::erase().
Needless to say, std::string is entirely inappropriate here; it should be std::vector.
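For what it's worth, here is a rough sketch of the framing loop both answers are hinting at, using a std::vector<char> as the receive buffer and consuming each complete packet from the front. Header, Parse and the 20-byte header size are taken from the question; sock and bigBuff are just illustrative names.
std::vector<char> bigBuff;   // accumulates bytes across recv() calls
char m_buff[1452];

int ret = recv(sock, m_buff, sizeof(m_buff), 0);
if (ret > 0) {
    bigBuff.insert(bigBuff.end(), m_buff, m_buff + ret);

    // Consume as many complete packets as the buffer currently holds.
    while (bigBuff.size() >= 20) {
        const Header* hdr = reinterpret_cast<const Header*>(bigBuff.data());
        size_t packLen = ntohs(hdr->PackLen);
        if (bigBuff.size() < 20 + packLen)
            break; // wait for the rest of this packet to arrive

        std::string lData(bigBuff.begin() + 20, bigBuff.begin() + 20 + packLen);
        Parse(lData); // regex parsing helper from the question

        // Erase the packet we just handled from the FRONT of the buffer.
        bigBuff.erase(bigBuff.begin(), bigBuff.begin() + 20 + packLen);
    }
}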