I'm having an issue reading some bytes from a YUV file (it's 1280x720, if that matters) and was hoping someone could point out what I'm doing wrong. I'm getting different results when using the read member function versus an istream_iterator. Here's some example code of what I'm trying to do:
void readBlock(std::ifstream& yuvFile, YUVBlock& destBlock, YUVConfig& config, const unsigned int x, const unsigned int y, const bool useAligned = false)
{
    //Calculate luma offset
    unsigned int YOffset = (useAligned ? config.m_alignedYFileOffset : config.m_YFileOffset) +
        (destBlock.yY * (useAligned ? config.m_alignedYUVWidth : config.m_YUVWidth) + destBlock.yX);// *config.m_bitDepth;

    //Copy Luma data
    //yuvFile.seekg(YOffset, std::istream::beg);
    for (unsigned int lumaY = 0; lumaY < destBlock.m_YHeight && ((lumaY + destBlock.yY) < config.m_YUVHeight); ++lumaY)
    {
        yuvFile.seekg(YOffset + ((useAligned ? config.m_alignedYUVWidth : config.m_YUVWidth)/* * config.m_bitDepth*/) * (lumaY), std::istream::beg);

        int copySize = destBlock.m_YWidth;
        if (destBlock.yX + copySize > config.m_YUVWidth)
        {
            copySize = config.m_YUVWidth - destBlock.yX;
        }

        if (destBlock.yX >= 1088 && destBlock.yY >= 704)
        {
            char* test = new char[9];
            yuvFile.read(test, 9);
            delete[] test;
            yuvFile.seekg(YOffset + ((useAligned ? config.m_alignedYUVWidth : config.m_YUVWidth)/* * config.m_bitDepth*/) * (lumaY));
        }

        std::istream_iterator<uint8_t> start = std::istream_iterator<uint8_t>(yuvFile);
        std::copy_n(start, copySize, std::back_inserter(destBlock.m_yData));
    }
}
struct YUVBlock
{
    std::vector<uint8_t> m_yData;
    std::vector<uint8_t> m_uData;
    std::vector<uint8_t> m_vData;

    unsigned int m_YWidth;
    unsigned int m_YHeight;
    unsigned int m_UWidth;
    unsigned int m_UHeight;
    unsigned int m_VWidth;
    unsigned int m_VHeight;

    unsigned int yX;
    unsigned int yY;
    unsigned int uX;
    unsigned int uY;
    unsigned int vX;
    unsigned int vY;
};
This error only seems to be happening at X = 1088 and Y = 704 in the image. I'm expecting to see a byte value of 10 as the first byte I read back. When I use
yuvFile.read(test, 9);
I get 10 as my first byte. When I use the istream_iterator:
std::istream_iterator<uint8_t> start = std::istream_iterator<uint8_t>(yuvFile);
std::copy_n(start, copySize, std::back_inserter(destBlock.m_yData));
The first byte I read back is 17. In the file, 17 is the byte after 10, so it seems the istream_iterator skips the first byte.
Any help would be appreciated.
There is a major difference between istream::read and std::istream_iterator.
std::istream::read performs unformatted read.
std::istream_iterator performs formatted read.
From http://en.cppreference.com/w/cpp/iterator/istream_iterator
std::istream_iterator is a single-pass input iterator that reads successive objects of type T from the std::basic_istream object for which it was constructed, by calling the appropriate operator>>.
If your file was created using std::ostream::write or fwrite, you must use std::istream::read or fread to read the data.
If your file was created using any of the methods that produce formatted output, such as std::ostream::operator<<() or fprintf, then you can read the data back with std::istream_iterator.
Your file is binary, and formatted extraction skips whitespace by default: the byte value 10 is '\n', so the istream_iterator silently discards it and the first byte you see is 17.
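For this binary YUV data, either of the following keeps the read unformatted (a rough sketch only, reusing the names from the question and assuming yuvFile was opened in binary mode):
// Option 1: read the whole line of bytes with the unformatted read(), then append
std::vector<uint8_t> lineBuf(copySize);
yuvFile.read(reinterpret_cast<char*>(lineBuf.data()), copySize);
destBlock.m_yData.insert(destBlock.m_yData.end(), lineBuf.begin(), lineBuf.end());

// Option 2: istreambuf_iterator (from <iterator>) reads raw characters and never skips whitespace
std::istreambuf_iterator<char> rawStart(yuvFile);
std::copy_n(rawStart, copySize, std::back_inserter(destBlock.m_yData));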
Consider the following C++ code:
unsigned char* data = readData(...);      // let's say data consists of 12 characters
unsigned int dataSize = getDataSize(...); // the size in bytes of the data is also known (say, 12 bytes)
struct Position
{
    float pos_x;  // remember that float is 4 bytes
    double pos_y; // remember that double is 8 bytes
};
Now I want to fill a Position variable/instance with data.
Position pos;
pos.pos_x = ? // data[0:4[ : the first 4 bytes of data should go into pos_x, since pos_x is a float (4 bytes)
pos.pos_y = ? // data[4:12[ : the remaining 8 bytes of data should go into pos_y, which is a double (8 bytes)
I know that in data the first bytes correspond to pos_x and the rest to pos_y. That means the first 4 bytes/characters of data should be used to fill pos_x and the remaining 8 bytes should fill pos_y, but I don't know how to do that.
Any idea? Thanks. PS: I'm limited to C++11.
You can use plain memcpy as another answer advises. I suggest packing memcpy into a function that also does error checking for you, for the most convenient and type-safe usage.
Example:
#include <cstring>
#include <stdexcept>
#include <type_traits>
struct ByteStreamReader {
    unsigned char const* begin;
    unsigned char const* const end;

    template<class T>
    operator T() {
        static_assert(std::is_trivially_copyable<T>::value,
            "The type you are using cannot be safely copied from bytes.");
        if(end - begin < static_cast<decltype(end - begin)>(sizeof(T)))
            throw std::runtime_error("ByteStreamReader");
        T t;
        std::memcpy(&t, begin, sizeof t);
        begin += sizeof t;
        return t;
    }
};
struct Position {
    float pos_x;
    double pos_y;
};

int main() {
    unsigned char data[12] = {};
    unsigned dataSize = sizeof data;
    ByteStreamReader reader{data, data + dataSize};

    Position p;
    p.pos_x = reader;
    p.pos_y = reader;
}
One thing that you can do is to copy the data byte by byte. There is a standard function to do that: std::memcpy. Example usage:
assert(sizeof pos.pos_x == 4);
std::memcpy(&pos.pos_x, data, 4);
assert(sizeof pos.pos_y == 8);
std::memcpy(&pos.pos_y, data + 4, 8);
Note that simply copying the data only works if the data is in the same representation as the CPU uses. Understand that different processors use different representations. Therefore, if your readData receives the data over the network, for example, a simple copy is not a good idea. The least you would have to do in that case is convert the endianness of the data to the native endianness (probably from big endian, which is conventionally used as the network byte order). Converting from one floating-point representation to another is much trickier, but luckily IEEE-754 is fairly ubiquitous.
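For illustration only (my sketch, not part of the original answer): if the 4-byte field arrived in big-endian order, you could reassemble the bits yourself before interpreting them as a float, assuming the host uses IEEE-754:
#include <cstdint>
#include <cstring>

// Reassemble a 32-bit big-endian value from raw bytes, then reinterpret it as float.
float readBigEndianFloat(const unsigned char* bytes) {
    std::uint32_t bits = (std::uint32_t(bytes[0]) << 24) |
                         (std::uint32_t(bytes[1]) << 16) |
                         (std::uint32_t(bytes[2]) << 8)  |
                          std::uint32_t(bytes[3]);
    float f;
    std::memcpy(&f, &bits, sizeof f); // assumes the host float format is IEEE-754
    return f;
}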
I'm using Microsoft's cpprest sdk to read binary data over the internet.
My variable stream below is of type concurrency::streams::istream. I'm trying to read a million rows of type struct row and process them. I see that I don't get all the bytes I request. I suspect there is a good way of coding this but I haven't been able to figure it out. I also suspect that my casting to extract a row from the buffer is not the right way to do things. Any help would be appreciated.
struct row {
    unsigned long long tag_id : 32, day : 32;
    unsigned long long time;
    double value;
};

size_t row_count = 1000000;
concurrency::streams::container_buffer<vector<uint8_t>> buffer;
size_t bytes_requested = sizeof(row) * row_count;
size_t bytes_received = stream.read(buffer, bytes_requested).get();
// bytes_received does not always match bytes_requested

for (size_t i = 0; i < row_count; ++i) {
    row &r = *(row *) &buffer.collection()[i * sizeof(row)];
    // do something with row here
}
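(Not part of the original question, but for context: with most stream APIs a single read may legitimately return fewer bytes than requested, so the usual pattern is to repeat the read until the requested count is satisfied, and to memcpy each row out of the buffer rather than casting into its storage. A rough, untested sketch with the names above; it assumes repeated reads into the same container_buffer keep appending to its underlying vector:)
// Keep reading until we have everything or the stream ends.
size_t total_received = 0;
while (total_received < bytes_requested) {
    size_t n = stream.read(buffer, bytes_requested - total_received).get();
    if (n == 0) break; // end of stream
    total_received += n;
}

// Copy each complete row out of the accumulated bytes instead of casting.
const std::vector<uint8_t>& bytes = buffer.collection();
size_t rows_available = total_received / sizeof(row);
for (size_t i = 0; i < rows_available; ++i) {
    row r;
    std::memcpy(&r, bytes.data() + i * sizeof(row), sizeof(row));
    // process r here
}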
I am trying to copy data from an array of characters into a member of my class using memcpy. I set a breakpoint in the debugger right before the memcpy. I checked all of the variables that I will be using and calculated how much space is left in the destination, and it looks like it should work.
#include <iostream>
#include <cstdlib>
#include <cstring>

class bigNum
{
    unsigned int dataLength;
    unsigned long long int *data;
public:
    bigNum(){
        //long long int is 8 bytes so (2 * 1024) long long int = 16 KiB
        // if that's not enough, we can always add more later
        dataLength = 2048;
        data = new unsigned long long [dataLength];
    };
    virtual ~bigNum(){delete[] data;};

    //bigNum& operator=(const bigNum& other);

    bigNum& set(char chars[], unsigned int charsLength) {
        //calculate where we will start writing the data
        void *writeStart = (void*)(
            (unsigned long long)data + dataLength*64 - charsLength*8
        );
        //DEBUG -- set a couple of the array elements to watch in debugger
        data[2047] = data[2046] = ~((unsigned long long)0);
        //zero out the space before writeStart
        std::memset(data, 0, dataLength*8 - charsLength);
        //write the data starting at writeStart
        std::memcpy(writeStart, chars, charsLength);
        return *this;
    }
};
using namespace std;

int main()
{
    bigNum myNum;
    char chars[9] = {'a'};
    myNum.set(chars, 9);

    system("PAUSE");
    return 0;
}
The bug is in this statement here:
void *writeStart = (void*)(
    (unsigned long long)data + dataLength*64 - charsLength*8
);
That is, it appears you're trying to get the right write location in bits, when you should be doing this in bytes.
When you have statements like dataLength*64 and charsLength*8, you're multiplying by their sizes in bits, when what you're dealing with - pointers - refers to bytes.
But still, it seems you're playing loose with integer sizes. Don't do that! It means your code will break on other machine architectures. Instead of assuming that unsigned long long is 64 bits, find out exactly how many bytes it is by using sizeof(unsigned long long), or if you want a fixed-width integer, use the corresponding type like uint64_t.
Then again, with C++, you should stick to standard containers like std::vector.
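For illustration, here is a sketch of the set() member with the offset math done in bytes (my adaptation, keeping the original "write at the end of the buffer" layout; it is not necessarily the poster's intended semantics):
bigNum& set(char chars[], unsigned int charsLength) {
    // total size of the buffer in bytes, computed instead of assumed
    const size_t totalBytes = dataLength * sizeof(unsigned long long);

    // start writing charsLength bytes before the end of the buffer
    char* writeStart = reinterpret_cast<char*>(data) + totalBytes - charsLength;

    // zero everything before writeStart, then copy the payload
    std::memset(data, 0, totalBytes - charsLength);
    std::memcpy(writeStart, chars, charsLength);
    return *this;
}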
I'm trying to teach myself C++AMP, and would like to start with a very simple task from my field, that is image processing. I'd like to convert a 24 bit-per-pixel RGB image (a Bitmap) to an 8 bit-per-pixel grayscale one. The image data is available in unsigned char arrays (obtained from Bitmap::LockBits(...) etc.)
I know that C++AMP for some reason cannot deal with char or unsigned char data via array or array_view, so I tried to use textures according to that blog post. There it is explained how 8bpp textures are written to, although Visual Studio 2013 tells me writeonly_texture_view is deprecated.
My code throws a runtime exception, saying "Failed to dispatch kernel." The complete text of the exception is lengthy:
ID3D11DeviceContext::Dispatch: The Unordered Access View (UAV) in slot 0 of the Compute Shader unit has the Format (R8_UINT). This format does not support being read from a shader as a UAV. This mismatch is invalid if the shader actually uses the view (e.g. it is not skipped due to shader code branching). It was unfortunately not possible to have all hardware implementations support reading this format as a UAV, despite that the format can be written to as a UAV. If the shader only needs to perform reads but not writes to this resource, consider using a Shader Resource View instead of a UAV.
The code I use so far is this:
namespace gpu = concurrency;

gpu::extent<3> inputExtent(height, width, 3);
gpu::graphics::texture<unsigned int, 3> inputTexture(inputExtent, eight);
gpu::graphics::copy((void*)inputData24bpp, dataLength, inputTexture);
gpu::graphics::texture_view<unsigned int, 3> inputTexView(inputTexture);

gpu::graphics::texture<unsigned int, 2> outputTexture(width, height, eight);
gpu::graphics::writeonly_texture_view<unsigned int, 2> outputTexView(outputTexture);

gpu::parallel_for_each(outputTexture.extent,
    [inputTexView, outputTexView](gpu::index<2> pix) restrict(amp) {
        gpu::index<3> indR(pix[0], pix[1], 0);
        gpu::index<3> indG(pix[0], pix[1], 1);
        gpu::index<3> indB(pix[0], pix[1], 2);
        unsigned int sum = inputTexView[indR] + inputTexView[indG] + inputTexView[indB];
        outputTexView.set(pix, sum / 3);
    });

gpu::graphics::copy(outputTexture, outputData8bpp);
What's the reason for this exception, and what can I do for a workaround?
I've also been learning C++AMP on my own and faced a very similar problem to yours, but in my case I needed to deal with a 16-bit image.
The issue can likely be solved using textures, although I can't help you there due to a lack of experience.
So, what I did is basically based on bit masking.
First off, trick the compiler into letting your code compile:
unsigned int* sourceData = reinterpret_cast<unsigned int*>(source);
unsigned int* destData = reinterpret_cast<unsigned int*>(dest);
Next, your array_view has to see all your data. Be aware that the view really thinks your data is 32 bits wide, so you have to adjust the element count (divide by 2 for 16-bit data; use 4 for 8-bit data).
concurrency::array_view<const unsigned int> source((size + 7) / 2, sourceData);
concurrency::array_view<unsigned int> dest((size + 7) / 2, destData);
Now you are able to write a typical parallel_for_each block.
typedef concurrency::array_view<const unsigned int> OriginalImage;
typedef concurrency::array_view<unsigned int> ResultImage;

bool Filters::Filter_Invert()
{
    const int size = k_width * k_height;
    const int maxVal = GetMaxSize();

    OriginalImage& im_original = GetOriginal();
    ResultImage& im_result = GetResult();
    im_result.discard_data();

    parallel_for_each(
        concurrency::extent<2>(k_width, k_height),
        [=](concurrency::index<2> idx) restrict(amp)
        {
            const int pos = GetPos(idx);
            const int val = read_int16(im_original, pos);
            write_int16(im_result, pos, maxVal - val);
        });

    return true;
}
int Filters::GetPos(const concurrency::index<2>& idx) restrict(amp, cpu)
{
    return idx[0] * Filters::k_height + idx[1];
}
And here comes the magic:
template <typename T>
unsigned int read_int16(T& arr, int idx) restrict(amp, cpu)
{
    // idx & 0x1 selects which 16-bit half of the 32-bit word holds element idx
    return (arr[idx >> 1] & (0xFFFF << ((idx & 0x1) << 4))) >> ((idx & 0x1) << 4);
}

template<typename T>
void write_int16(T& arr, int idx, unsigned int val) restrict(amp, cpu)
{
    // clear the target 16-bit half, then xor the new value into place
    atomic_fetch_xor(&arr[idx >> 1], arr[idx >> 1] & (0xFFFF << ((idx & 0x1) << 4)));
    atomic_fetch_xor(&arr[idx >> 1], (val & 0xFFFF) << ((idx & 0x1) << 4));
}
Notice that these methods are for 16 bits and won't work for 8 bits as-is, but it shouldn't be too difficult to adapt them to 8 bits; a possible adaptation is sketched below. In fact, this was based on an 8-bit version, but unfortunately I couldn't find the reference.
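A possible 8-bit adaptation of the same masking idea (my untested sketch, assuming four 8-bit pixels are packed into each 32-bit element of the array_view):
template <typename T>
unsigned int read_uint8(T& arr, int idx) restrict(amp, cpu)
{
    // idx >> 2 picks the 32-bit word; (idx & 0x3) << 3 is the bit shift of the byte inside it
    const int shift = (idx & 0x3) << 3;
    return (arr[idx >> 2] >> shift) & 0xFF;
}

template <typename T>
void write_uint8(T& arr, int idx, unsigned int val) restrict(amp, cpu)
{
    const int shift = (idx & 0x3) << 3;
    // clear the target byte, then xor the new value into place
    atomic_fetch_xor(&arr[idx >> 2], arr[idx >> 2] & (0xFF << shift));
    atomic_fetch_xor(&arr[idx >> 2], (val & 0xFF) << shift);
}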
Hope it helps.
David
.h file:
#define VECTOR_SIZE 1024
.cpp file:
int main()
{
    unsigned int* A;
    A = new unsigned int[VECTOR_SIZE];
    CopyToDevice(A);
}
.cu file:
void CopyToDevice(unsigned int *A)
{
    ulong4 *UA;
    unsigned int VectorSizeUlong4 = VECTOR_SIZE / 4;
    unsigned int VectorSizeBytesUlong4 = VectorSizeUlong4 * sizeof(ulong4);
    cudaMalloc((void**)&UA, VectorSizeBytesUlong4);

    // how to use cudaMemcpy to copy data from A to UA?
    // I tried to do the following but it gave an access violation error:
    for (int i = 0; i < VectorSizeUlong4; ++i)
    {
        UA[i].x = A[i*4 + 0];
        UA[i].y = A[i*4 + 1];
        UA[i].z = A[i*4 + 2];
        UA[i].w = A[i*4 + 3];
    }
    // I also tried to copy *A to the device and then work on it there instead of
    // going back to the CPU to access *A every time, but this did not work either
}
The CUDA ulong4 is a 16 byte aligned structure defined as
struct __builtin_align__(16) ulong4
{
    unsigned long int x, y, z, w;
};
This means that the stream of four consecutive 32-bit unsigned source integers you want to use to populate a stream of ulong4 is the same size. The simplest solution is contained right in the text on the image you posted - just cast (either implicitly or explicitly) the unsigned int pointer to a ulong4 pointer, use cudaMemcpy directly on the host and device memory, and pass the resulting device pointer to whatever kernel function you have that requires a ulong4 input. Your device transfer function could look something like:
ulong4* CopyToDevice(unsigned int* A)
{
    ulong4 *UA, *UA_h;
    size_t VectorSizeUlong4 = VECTOR_SIZE / 4;
    size_t VectorSizeBytesUlong4 = VectorSizeUlong4 * sizeof(ulong4);

    cudaMalloc((void**)&UA, VectorSizeBytesUlong4);

    UA_h = reinterpret_cast<ulong4*>(A); // not necessary but increases transparency
    cudaMemcpy(UA, UA_h, VectorSizeBytesUlong4, cudaMemcpyHostToDevice);

    return UA;
}
[Usual disclaimer: written in browser, not tested or compiled, use at own risk]
This should raise all alarm bells:
cudaMalloc( (void**)&UA, VectorSizeBytesUlong4 );
// ...
UA[i].x = A[i*4 + 0];
You are allocating UA on the device and then using it in host code. Don't ever do that. You will need to use cudaMemcpy to copy arrays to the device. This tutorial shows you a basic program that uses cudaMemcpy to copy things over. The length argument to cudaMemcpy is the length of your array in bytes, which in your case is VECTOR_SIZE * sizeof(unsigned int).
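For example (a minimal sketch reusing the names from the question; UA is assumed to be the pointer returned by cudaMalloc above):
// Copy the host array A into the device allocation UA (same byte count on both sides).
cudaMemcpy(UA, A, VECTOR_SIZE * sizeof(unsigned int), cudaMemcpyHostToDevice);

// ...launch kernels that read UA (or a ulong4* view of it) on the device...

// When results are needed back on the host:
cudaMemcpy(A, UA, VECTOR_SIZE * sizeof(unsigned int), cudaMemcpyDeviceToHost);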