Byte offset greater than Byte Length in BufferView - c++

I'm trying to read data from scene.bin files using the Microsoft::glTF SDK. TinyGLTF is not an option. When I try to read the MeshPrimitive attribute called TEXCOORD_0, I get a situation where the BufferView byteOffset is greater than its byteLength, so I don't know how to read the data correctly and my program crashes.
I read the data using IStreamReader, which is part of the SDK and is required when reading .bin files with it. I calculate the data offset as accessor.byteOffset + bufferView.byteOffset, which ends up greater than byteLength.
struct BuffersAccessors {
    Microsoft::glTF::Accessor accessor;
    Microsoft::glTF::BufferView view;
    Microsoft::glTF::Buffer buffer;

    void operator=(BuffersAccessors accessors);
};

template<typename T> struct BufferInfo {
    BuffersAccessors buffersAccessors;
    std::vector<T> bufferData;

    BufferInfo<T>();
    BufferInfo<T>(BuffersAccessors buffersAccessors, std::vector<T> bufferData);

    const void operator=(const BufferInfo<T> &info) {
        buffersAccessors = info.buffersAccessors;
        bufferData = info.bufferData;
    };
};
template<typename T>
std::vector<T> readBufferData(Microsoft::glTF::Document document, BufferInfo<T> bufferInfo, std::filesystem::path path) {
    std::vector<T> stream;

    if (bufferInfo.buffersAccessors.buffer.uri.length() > 0 || bufferInfo.buffersAccessors.buffer.byteLength > 0) {
        Microsoft::glTF::Buffer buffer = bufferInfo.buffersAccessors.buffer;
        path += bufferInfo.buffersAccessors.buffer.uri;
        path = std::filesystem::absolute(path);
        buffer.uri = path.string();

        std::shared_ptr<StreamReader> streamReader = std::make_shared<StreamReader>(path);
        Microsoft::glTF::GLTFResourceReader reader(streamReader);

        stream = reader.ReadBinaryData<T>(buffer, bufferInfo.buffersAccessors.view);
    }

    return stream;
}
template<typename T>
BufferInfo<T> getFullBufferData(Microsoft::glTF::Document document, std::string accessorKey, std::filesystem::path path) {
    BufferInfo<T> bufferInfo{};
    BuffersAccessors mainPart = getBufferAccessorFromDocument(document, accessorKey);
    bufferInfo.buffersAccessors = mainPart;

    std::vector<T> bufferData = vkglTF::readBufferData<T>(document, bufferInfo, path);
    const size_t bufferDataOffset = mainPart.accessor.byteOffset + mainPart.view.byteOffset; //How to properly calculate offset?
    bufferData.erase(bufferData.begin(), bufferData.begin() + bufferDataOffset);

    bufferInfo.bufferData = bufferData;
    return bufferInfo;
}
I expect data in formats like uint8 and uint16, but my program crashes when trying to do bufferData.erase(...).
Edit: This happens while reading WEIGHTS_0 too.

I think the most likely error in your code is that it mixes byte offsets with vector element indices. Have you tried dividing bufferDataOffset by sizeof(T)?
Second, if you only want to read an accessor's data then try using the ReadBinaryData overload that accepts an Accessor parameter instead. That way the glTF SDK will handle all of the offset calculations for you.
There is no documentation but the deserialize sample demonstrates the basic code structure recommended when using the glTF SDK.
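For illustration, a minimal sketch of that accessor-based overload, assuming the same StreamReader as above and a MeshPrimitive/Document obtained elsewhere (the names primitive, document and path are placeholders, not from the question):

// Sketch: let the SDK resolve accessor.byteOffset + bufferView.byteOffset itself.
auto streamReader = std::make_shared<StreamReader>(path);
Microsoft::glTF::GLTFResourceReader reader(streamReader);

std::string accessorId;
if (primitive.TryGetAttributeAccessorId(Microsoft::glTF::ACCESSOR_TEXCOORD_0, accessorId))
{
    const Microsoft::glTF::Accessor& accessor = document.accessors.Get(accessorId);
    // For a FLOAT/VEC2 accessor this yields 2 * accessor.count floats,
    // with no manual offset arithmetic or erase() needed.
    std::vector<float> uvs = reader.ReadBinaryData<float>(document, accessor);
}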

Related

How can I create optimized graphical multicolor logger?

I am setting up a logger class and want to optimize logging.
It has to be a multicolor logger, so std::string::append(...) is not an option.
Adding each new log to a vector of strings is not a good idea either, because with every push_back memory use grows and the FPS drops. I thought about creating a Log struct that holds the message string and a color (or a flag describing the kind of message) and double-buffering it: write to the first buffer, pass it to the second, draw from the second, clear the first, and so on. I tried to implement that, but it does not work the way I want.
For now I've kept a vector of Logs:
class Logger
{
public:
    struct Log {
        std::string text;
        Uint color;
    };

    void Draw() {
        for (const auto& log : logs) {
            renderer->DrawString(log.text, log.color);
        }
    }

    void AddLog(const std::string& text, Uint color) {
        logs.push_back({text, color}); // emplace_back(text, color) only works for aggregates since C++20
    }

    std::vector<Log> logs;
};
int main() {
    // window stuff, opengl context, etc.
    Logger logger;
    while (!quit) {
        // Do not flood logger with logs, just add one sometimes
        static double lastTime = -1.0;
        if (time - lastTime >= 0.20f) {
            logger.AddLog("Log", 0xff00ffff);
            lastTime = time;
        }
        logger.Draw();
    }
    return 0;
}
We do not pass a position to renderer->DrawString(...) because each string is automatically moved down to the next line.
This approach works, but performance is extremely poor.
How could I optimize it? I would like something like the CS:GO console, which is also a multicolor logger and can log massive numbers of messages with no FPS drops.
In order to avoid reallocations of your vector when you push log messages to your logger, you can use a ring buffer structure. It preallocates memory for N messages and stores only the latest N messages pushed. A simple implementation can look like this:
#include <array>
#include <cstdint>
#include <cstdio>
#include <string>

template <std::size_t bufferSize>
class Logger {
public:
    struct Data {
        std::string msg = " - ";
        uint32_t color = 0x00000000;
    };

    void addLog(Data&& item) {
        buffer_[head_] = std::move(item);
        if (++head_ >= bufferSize) head_ -= bufferSize;
    }

    void draw() const {
        for (std::size_t i = 0; i < bufferSize; ++i) {
            auto idx = head_ + i;
            if (idx >= bufferSize) idx -= bufferSize;
            // print the log whatever way you like, for example:
            printf("%s\n", buffer_[idx].msg.c_str());
        }
    }

private:
    std::size_t head_ = 0;
    std::array<Data, bufferSize> buffer_;
};
Here, the template parameter bufferSize specifies the size of the ring buffer. The draw() method processes the oldest messages first, so your newest logs will be at the bottom.
Here is a live example: link
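For completeness, a minimal usage sketch of the class above (the color value is just an example):

int main() {
    Logger<128> logger;   // keeps only the latest 128 messages
    for (int i = 0; i < 1000; ++i) {
        logger.addLog({ "Log " + std::to_string(i), 0xff00ffff });
    }
    logger.draw();        // oldest surviving message first, newest last
    return 0;
}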

Import CSV into Vertica using Rfc4180CsvParser and exclude header row

Is there a way to exclude the header row when importing data via the Rfc4180CsvParser? The COPY command has a SKIP option but the option doesn't seem to work when using the CSV parsers provided in the Vertica SDK.
Background
As background, the COPY command does not read CSV files by itself. For simple CSV files, one can say COPY schema.table FROM '/data/myfile.csv' DELIMITER ',' ENCLOSED BY '"'; but this fails on data files whose string values contain embedded quotes.
Adding ESCAPE AS '"' generates ERROR 3169: ENCLOSED BY and ESCAPE AS can not be the same value. This is a problem, as CSV values are both enclosed and escaped by ".
Vertica SDK CsvParser extensions to the rescue
Vertica provides an SDK under /opt/vertica/sdk/examples with C++ programs that can be compiled into extensions. One of these is /opt/vertica/sdk/examples/ParserFunctions/Rfc4180CsvParser.cpp.
This works great as follows:
cd /opt/vertica/sdk/examples
make clean
vsql
==> CREATE LIBRARY Rfc4180CsvParserLib AS '/opt/vertica/sdk/examples/build/Rfc4180CsvParser.so';
==> COPY myschema.mytable FROM '/data/myfile.csv' WITH PARSER Rfc4180CsvParser();
Problem
The above works great except that it imports the header row of the data file as a data row. The COPY command has a SKIP 1 option, but it does not work with this parser.
Question
Is it possible to edit Rfc4180CsvParser.cpp to skip the first row, or better yet, take a parameter specifying the number of rows to skip?
The program is just 135 lines, but I don't see where or how to make this change. Hints?
Copying the entire program below as I don't see a public repo to link to...
Rfc4180CsvParser.cpp
/* Copyright (c) 2005 - 2012 Vertica, an HP company -*- C++ -*- */
#include "Vertica.h"
#include "StringParsers.h"
#include "csv.h"

using namespace Vertica;

// Note, the class template is mostly for demonstration purposes,
// so that the same class can use each of two string-parsers.
// Custom parsers can also just pick a string-parser to use.

/**
 * A parser that parses something approximating the "official" CSV format
 * as defined in IETF RFC-4180: <http://tools.ietf.org/html/rfc4180>
 * Oddly enough, many "CSV" files don't actually conform to this standard
 * for one reason or another. But for sources that do, this parser should
 * be able to handle the data.
 * Note that the CSV format does not specify how to handle different
 * data types; it is entirely a string-based format.
 * So we just use standard parsers based on the corresponding column type.
 */
template <class StringParsersImpl>
class LibCSVParser : public UDParser {
public:
    LibCSVParser() : colNum(0) {}

    // Keep a copy of the information about each column.
    // Note that Vertica doesn't let us safely keep a reference to
    // the internal copy of this data structure that it shows us.
    // But keeping a copy is fine.
    SizedColumnTypes colInfo;

    // An instance of the class containing the methods that we're
    // using to parse strings to the various relevant data types
    StringParsersImpl sp;

    /// Current column index
    size_t colNum;

    /// Parsing state for libcsv
    struct csv_parser parser;

    // Format strings
    std::vector<std::string> formatStrings;

    /**
     * Given a field in string form (a pointer to the first character and
     * a length), submit that field to Vertica.
     * `colNum` is the column number from the input file; how many fields
     * it is into the current record.
     */
    bool handleField(size_t colNum, char* start, size_t len) {
        if (colNum >= colInfo.getColumnCount()) {
            // Ignore column overflow
            return false;
        }
        // Empty columns are null.
        if (len == 0) {
            writer->setNull(colNum);
            return true;
        } else {
            return parseStringToType(start, len, colNum, colInfo.getColumnType(colNum), writer, sp);
        }
    }

    static void handle_record(void *data, size_t len, void *p) {
        static_cast<LibCSVParser*>(p)->handleField(static_cast<LibCSVParser*>(p)->colNum++, (char*)data, len);
    }

    static void handle_end_of_row(int c, void *p) {
        // Ignore 'c' (the terminating character); trust that it's correct
        static_cast<LibCSVParser*>(p)->colNum = 0;
        static_cast<LibCSVParser*>(p)->writer->next();
    }

    virtual StreamState process(ServerInterface &srvInterface, DataBuffer &input, InputState input_state) {
        size_t processed;
        while ((processed = csv_parse(&parser, input.buf + input.offset, input.size - input.offset,
                                      handle_record, handle_end_of_row, this)) > 0) {
            input.offset += processed;
        }
        if (input_state == END_OF_FILE && input.size == input.offset) {
            csv_fini(&parser, handle_record, handle_end_of_row, this);
            return DONE;
        }
        return INPUT_NEEDED;
    }

    virtual void setup(ServerInterface &srvInterface, SizedColumnTypes &returnType);

    virtual void destroy(ServerInterface &srvInterface, SizedColumnTypes &returnType) {
        csv_free(&parser);
    }
};

template <class StringParsersImpl>
void LibCSVParser<StringParsersImpl>::setup(ServerInterface &srvInterface, SizedColumnTypes &returnType) {
    csv_init(&parser, CSV_APPEND_NULL);
    colInfo = returnType;
}

template <>
void LibCSVParser<FormattedStringParsers>::setup(ServerInterface &srvInterface, SizedColumnTypes &returnType) {
    csv_init(&parser, CSV_APPEND_NULL);
    colInfo = returnType;
    if (formatStrings.size() != returnType.getColumnCount()) {
        formatStrings.resize(returnType.getColumnCount(), "");
    }
    sp.setFormats(formatStrings);
}

template <class StringParsersImpl>
class LibCSVParserFactoryTmpl : public ParserFactory {
public:
    virtual void plan(ServerInterface &srvInterface,
                      PerColumnParamReader &perColumnParamReader,
                      PlanContext &planCtxt) {}

    virtual UDParser* prepare(ServerInterface &srvInterface,
                              PerColumnParamReader &perColumnParamReader,
                              PlanContext &planCtxt,
                              const SizedColumnTypes &returnType)
    {
        return vt_createFuncObj(srvInterface.allocator,
                                LibCSVParser<StringParsersImpl>);
    }
};

typedef LibCSVParserFactoryTmpl<StringParsers> LibCSVParserFactory;
RegisterFactory(LibCSVParserFactory);

typedef LibCSVParserFactoryTmpl<FormattedStringParsers> FormattedLibCSVParserFactory;
RegisterFactory(FormattedLibCSVParserFactory);
The quick and dirty way would be to just hardcode it. The parser uses a callback, handle_end_of_row. Track the row number and simply don't process the first row. Something like:
static void handle_end_of_row(int c, void *ptr) {
    // Ignore 'c' (the terminating character); trust that it's correct
    LibCSVParser *p = static_cast<LibCSVParser*>(ptr);
    p->colNum = 0;
    // rowcnt, bad_field and currSrvInterface are members you would add to LibCSVParser
    if (p->rowcnt <= 0) {
        p->bad_field = "";
        p->rowcnt++;
    } else if (p->bad_field.empty()) {
        p->writer->next();
    } else {
        // libcsv doesn't give us the whole row to reject.
        // So just write to the log.
        // TODO: Come up with something more clever.
        if (p->currSrvInterface) {
            p->currSrvInterface->log("Invalid CSV field value: '%s' Row skipped.",
                                     p->bad_field.c_str());
        }
        p->bad_field = "";
    }
}
Also, it's best to reset rowcnt to 0 in process(), since I think that gets called for each file in your COPY statement. There might be more clever ways of doing this; basically, this just parses the first record and then discards it.
As for supporting SKIP generically: look at TraditionalCSVParser for how to handle parameter passing. You'd have to add the parameter to the parser factory's prepare(), override getParameterType(), and pass the value into the LibCSVParser class. LibCSVParser then needs to accept the parameter in its constructor, and process()/handle_end_of_row has to skip the first skip rows, using that value instead of the hardcoded 0 above.
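A rough sketch of that parameter plumbing, assuming the standard Vertica UDx parameter API (getParameterType() on the factory plus ServerInterface::getParamReader()); the parameter name skip and the member skipRows are made up for illustration, and the value is passed here by setting a public member rather than through the constructor:

// In LibCSVParserFactoryTmpl: declare an optional integer parameter.
virtual void getParameterType(ServerInterface &srvInterface,
                              SizedColumnTypes &parameterTypes)
{
    parameterTypes.addInt("skip");
}

virtual UDParser* prepare(ServerInterface &srvInterface,
                          PerColumnParamReader &perColumnParamReader,
                          PlanContext &planCtxt,
                          const SizedColumnTypes &returnType)
{
    // Value supplied by: COPY ... WITH PARSER LibCSVParser(skip=1);
    vint skip = 0;
    ParamReader params = srvInterface.getParamReader();
    if (params.containsParameter("skip"))
        skip = params.getIntRef("skip");

    LibCSVParser<StringParsersImpl>* parser =
        vt_createFuncObj(srvInterface.allocator, LibCSVParser<StringParsersImpl>);
    parser->skipRows = skip;   // hypothetical public member checked against rowcnt
    return parser;
}

handle_end_of_row would then only call writer->next() once rowcnt has passed skipRows.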

Array copy in parallel_for_each context

I'm very new to C++ AMP. Everything works fine if I use memcpy inside the parallel_for_each function, but I know that is not best practice. I tried to use copy_to, but it raises an exception. Below is a simplified version of the code that shows the issue. Thanks in advance.
typedef std::vector<DWORD> CArrDwData;

class CdataMatrix
{
public:
    CdataMatrix(int nChCount) : m_ChCount(nChCount)
    {
    }

    void SetSize(UINT uSize)
    {
        // MUST be multiple of m_ChCount*DWORD
        ASSERT(uSize % sizeof(DWORD) == 0);
        m_PackedLength = uSize / sizeof(DWORD);
        m_arrChannels.resize(m_ChCount * m_PackedLength);
    }

    UINT GetChannelPackedLen() const
    {
        return m_PackedLength;
    }

    const LPBYTE GetChannelBuffer(UINT uChannel) const
    {
        CArrDwData::const_pointer cPtr = m_arrChannels.data() + m_PackedLength * uChannel;
        return (const LPBYTE)cPtr;
    }

public:
    CArrDwData m_arrChannels;

protected:
    UINT m_ChCount;
    UINT m_PackedLength;
};
void CtypDiskHeader::ParalelProcess()
{
    const int nJobs = 6;
    const int nChannelCount = 3;
    UINT uAmount = 250000;
    int vch;
    CArrDwData arrCompData;

    // Check buffer sizes
    ASSERT((~uAmount & 0x00000003) == 3);           // DWORD aligned
    const UINT uInDWSize = uAmount / sizeof(DWORD); // input size given in DWORDs

    CdataMatrix arrChData(nJobs);
    arrCompData.resize(nJobs * uInDWSize);

    vector<int> a(nJobs);
    for (vch = 0; vch < nJobs; vch++)
        a[vch] = vch;

    arrChData.SetSize(uAmount + 16); // note: 16 bytes or 4 DWORDs larger than uInDWSize

    accelerator_view acc_view = accelerator().default_view;
    Concurrency::extent<2> eIn(nJobs, uInDWSize);
    Concurrency::extent<2> eOut(nJobs, arrChData.GetChannelPackedLen());

    array_view<DWORD, 2> viewOut(eOut, arrChData.m_arrChannels);
    array_view<DWORD, 2> viewIn(eIn, arrCompData);

    concurrency::parallel_for_each(begin(a), end(a), [&](int vch)
    {
        vector<DWORD>::pointer ptr = (LPDWORD)viewIn(vch).data();
        LPDWORD bufCompIn = (LPDWORD)ptr;
        ptr = viewOut(vch).data();
        LPDWORD bufExpandedIn = (LPDWORD)ptr;

        if (ConditionNotOk())
        {
            // Copy raw data bufCompIn to bufExpandedIn
            // Works fine, but not the best way, I suppose:
            memcpy(bufExpandedIn, bufCompIn, uAmount);
            // Raises exception:
            //viewIn(vch).copy_to(viewOut(vch));
        }
        else
        {
            // Some data processing here
        }
    });
}
It was my fault. In the original code, the extent of viewOut(vch) is slightly larger than the extent of viewIn(vch). Used that way, copy_to raises a runtime_exception; when caught, it reports xcp.what() = "Failed to copy because extents do not match".
I fixed the code by replacing the original call with: viewIn(vch).copy_to(viewOut(vch).section(viewIn(vch).extent));
It copies only the source extent, which is what I need, but it only compiles outside restrict(amp) code.
This has nothing to do with the parallel_for_each; it looks like a known bug with array_view::copy_to. See the following post:
Curiosity about concurrency::copy and array_view projection interactions
You can fix this using an explicit view_as() instead. I believe in your case your code should look something like this.
viewIn(vch).copy_to(viewOut(vch));
// Becomes...
viewIn[vch].view_as<1>(concurrency::extent<1>(uInDWSize)).copy_to(viewOut(vch));
I can't compile your example, so I was unable to verify this, but I was able to get an exception from similar code and fix it using view_as().
If you want to copy data within a C++ AMP kernel then you need to do it as assignment operations on a series of threads. The following code copies the first 500 elements of source into the smaller dest array.
array<int, 1> source(1000);
array<int, 1> dest(500);

parallel_for_each(source.extent, [=, &source, &dest](index<1> idx) restrict(amp)
{
    if (dest.extent.contains(idx))
        dest[idx] = source[idx];
});

Libjpeg write image to memory data

I would like to save an image into memory (a std::vector) using the libjpeg library.
I found these functions:
init_destination
empty_output_buffer
term_destination
My question is how to do this safely and properly in a parallel program, since my function may be executed from different threads.
I am using C++ and Visual Studio 2010.
Other libraries with callback functionality usually have an extra function parameter for passing user data, but here I don't see any way to pass additional data, e.g. a pointer to my local vector instance, to the callbacks.
Edit:
The nice solution to my question is here: http://www.christian-etter.de/?cat=48
The nice solution is described here: http://www.christian-etter.de/?cat=48
typedef struct _jpeg_destination_mem_mgr
{
    jpeg_destination_mgr mgr;
    std::vector<unsigned char> data;
} jpeg_destination_mem_mgr;
Initialization:
static void mem_init_destination( j_compress_ptr cinfo )
{
    jpeg_destination_mem_mgr* dst = (jpeg_destination_mem_mgr*)cinfo->dest;
    dst->data.resize( JPEG_MEM_DST_MGR_BUFFER_SIZE );
    cinfo->dest->next_output_byte = dst->data.data();
    cinfo->dest->free_in_buffer = dst->data.size();
}
When compression has finished, we need to shrink the buffer to the actual output size:
static void mem_term_destination( j_compress_ptr cinfo )
{
    jpeg_destination_mem_mgr* dst = (jpeg_destination_mem_mgr*)cinfo->dest;
    dst->data.resize( dst->data.size() - cinfo->dest->free_in_buffer );
}
When the buffer runs out of space, we need to enlarge it:
static boolean mem_empty_output_buffer( j_compress_ptr cinfo )
{
    jpeg_destination_mem_mgr* dst = (jpeg_destination_mem_mgr*)cinfo->dest;
    size_t oldsize = dst->data.size();
    dst->data.resize( oldsize + JPEG_MEM_DST_MGR_BUFFER_SIZE );
    cinfo->dest->next_output_byte = dst->data.data() + oldsize;
    cinfo->dest->free_in_buffer = JPEG_MEM_DST_MGR_BUFFER_SIZE;
    return true;
}
Callbacks configuration:
static void jpeg_mem_dest( j_compress_ptr cinfo, jpeg_destination_mem_mgr * dst )
{
    cinfo->dest = (jpeg_destination_mgr*)dst;
    cinfo->dest->init_destination = mem_init_destination;
    cinfo->dest->term_destination = mem_term_destination;
    cinfo->dest->empty_output_buffer = mem_empty_output_buffer;
}
And sample usage:
jpeg_destination_mem_mgr dst_mem;
jpeg_compress_struct_wrapper cinfo;
j_compress_ptr pcinfo = cinfo;
jpeg_mem_dest( cinfo, &dst_mem);
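Putting it together, a rough end-to-end sketch without the jpeg_compress_struct_wrapper class (the buffer size constant, image dimensions and RGB input are assumptions). The cast in the callbacks works because mgr is the first member of the struct, and the approach is thread-safe as long as every thread uses its own jpeg_compress_struct and destination manager:

#include <vector>
#include <jpeglib.h>

static const size_t JPEG_MEM_DST_MGR_BUFFER_SIZE = 8 * 1024; // assumed chunk size

std::vector<unsigned char> compressToJpeg(const unsigned char* rgb, int width, int height)
{
    jpeg_compress_struct cinfo;
    jpeg_error_mgr jerr;
    cinfo.err = jpeg_std_error(&jerr);
    jpeg_create_compress(&cinfo);

    jpeg_destination_mem_mgr dst_mem;   // lives on this thread's stack
    jpeg_mem_dest(&cinfo, &dst_mem);    // install the in-memory destination from above

    cinfo.image_width = width;
    cinfo.image_height = height;
    cinfo.input_components = 3;         // RGB
    cinfo.in_color_space = JCS_RGB;
    jpeg_set_defaults(&cinfo);
    jpeg_set_quality(&cinfo, 90, TRUE);

    jpeg_start_compress(&cinfo, TRUE);
    while (cinfo.next_scanline < cinfo.image_height) {
        JSAMPROW row = const_cast<JSAMPROW>(rgb + cinfo.next_scanline * width * 3);
        jpeg_write_scanlines(&cinfo, &row, 1);
    }
    jpeg_finish_compress(&cinfo);       // calls mem_term_destination
    jpeg_destroy_compress(&cinfo);

    return dst_mem.data;                // the encoded JPEG bytes
}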

std::fstream with multiple buffers?

You can specify one buffer for your file stream like this:
char buf[BUFFER_SIZE];
std::ofstream file("file", std::ios_base::binary | std::ios_base::out);
if (file.is_open())
{
    file.rdbuf()->pubsetbuf(buf, BUFFER_SIZE);
    file << "abcd";
}
What I want to do now is use more than just one buffer:
char* buf[] = { new char[BUFFER_SIZE], new char[BUFFER_SIZE], new char[BUFFER_SIZE], };
Is it possible without creating a custom derivation of std::streambuf?
EDIT:
I think I need to explain in more detail what I want to do. Please consider the following situation:
- The file(s) I want to read won't fit into memory
- The file will be accessed by some kind of binary jump search
So, if you split the file into logical pages of a specific size, I would like to provide multiple buffers representing specific pages. This would improve performance when a file location is read and the related page is already in a buffer.
I gather from the comment that you want to do a kind of scatter-gather I/O. I'm pretty sure there's no support for that in the C++ standard I/O streams library, so you'll have to roll your own.
If you want to do this efficiently, you can use OS support for scatter-gather. E.g., POSIX/Unix-like systems have writev for this purpose.
There's nothing like this provided by the Standard. However, depending on your platform, you can use memory-mapped files, which provide the same functionality. Windows and Linux both support them.
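For illustration, a minimal POSIX sketch of that approach (error handling omitted; on Windows the equivalents are CreateFileMapping/MapViewOfFile):

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int fd = open("file", O_RDONLY);
struct stat st;
fstat(fd, &st);

// The whole file becomes addressable; pages are faulted in lazily,
// so a binary jump search only touches the pages it actually reads.
const char* data = static_cast<const char*>(
    mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0));

char c = data[st.st_size / 2];   // e.g. probe the middle of the file

munmap(const_cast<char*>(data), st.st_size);
close(fd);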
I will take a look at boost::iostreams::mapped_file, but I think my requirement is much simpler. I've created a custom class derived from basic_filebuf.
template<typename char_type>
class basic_filemultibuf : public std::basic_filebuf<char_type/*, std::char_traits<char_type>*/>
{
private:
    char_type**     m_buffers;
    std::ptrdiff_t  m_buffer_count,
                    m_curent_buffer;
    std::streamsize m_buffer_size;

protected:
    virtual int_type overflow(int_type meta = traits_type::eof())
    {
        if (this->m_buffer_count > 0)
        {
            if (this->m_curent_buffer == this->m_buffer_count)
                this->m_curent_buffer = 0;
            this->basic_filebuf::setbuf(this->m_buffers[this->m_curent_buffer++], this->m_buffer_size);
        }
        return this->basic_filebuf::overflow(meta);
    }

public:
    basic_filemultibuf(basic_filebuf const& other)
        : basic_filebuf(other),
          m_buffers(NULL),
          m_buffer_count(0),
          m_curent_buffer(-1),
          m_buffer_size(0)
    {
    }

    basic_filemultibuf(basic_filemultibuf const& other)
        : basic_filebuf(other),
          m_buffers(other.m_buffers),
          m_buffer_count(other.m_buffer_count),
          m_curent_buffer(other.m_curent_buffer),
          m_buffer_size(other.m_buffer_size)
    {
    }

    basic_filemultibuf(FILE* f = NULL)
        : basic_filemultibuf(basic_filebuf(f))
    {
    }

    basic_filemultibuf* pubsetbuf(char** buffers, std::ptrdiff_t buffer_count, std::streamsize buffer_size)
    {
        if ((this->m_buffers = buffers) != NULL)
        {
            this->m_buffer_count  = buffer_count;
            this->m_buffer_size   = buffer_size;
            this->m_curent_buffer = 0;
        }
        else
        {
            this->m_buffer_count  = 0;
            this->m_buffer_size   = 0;
            this->m_curent_buffer = -1;
        }
        this->basic_filebuf::setbuf(NULL, 0);
        return this;
    }
};
Example usage:
typedef basic_filemultibuf<char> filemultibuf;

std::fstream file("file", std::ios_base::binary | std::ios_base::in | std::ios_base::out);

const int n = 2; // number of buffers
char** buffers = new char*[n];
for (int i = 0; i < n; ++i)
    buffers[i] = new char[4096];

filemultibuf multibuf(*file.rdbuf());
multibuf.pubsetbuf(buffers, n, 4096);
file.set_rdbuf(&multibuf);
//
// do awesome stuff with file ...
//
for (int i = 0; i < n; ++i)
    delete[] buffers[i];
delete[] buffers;
That's pretty much it. The only thing I would really like to do is offer this functionality for other streambufs as well, because the use of multiple buffers should not be restricted to filebuf. But it seems to me that isn't possible without rewriting the file-specific functions.
What do you think about that?