I'm trying to create a Vector3 struct, but every time I use operator* it throws an exception:
"read access violation".
I'm relatively new to C++ and have no idea what is causing this.
VS 2017, Debug, x86
Code
struct Vec3 {
float x, y, z;
Vec3 operator+(Vec3 d) {
return { x + d.x, y + d.y, z + d.z };
}
Vec3 operator-(Vec3 d) {
return { x - d.x, y - d.y, z - d.z };
}
Vec3 operator*(float d) {
return { x * d, y * d, z * d }; // throwing an exception
/*
Unhandled exception thrown: read access violation.
this was 0x302C.*/
}
void Normalize() {
while (y < -180) {
y += 360;
};
while (y > 180) {
y -= 360;
};
if (x > 89) {
x = 89;
};
if (x < -89) {
x = -89;
};
}
};
// example code
uintptr_t c= *(uintptr_t*)(ModuleHandle+ 0x2);
Vec3* b= (Vec3*)(c+ 0x1);
Vec3 a= *b* 2;
You have a star before b:
Vec3 a= *b* 2;
Edit, regarding the edited question: the other line of code above it now seems to be the issue:
Vec3* b= (Vec3*)(c+ 0x1);
If c doesn't point to an array of at least 2 properly initialized Vec3 instances, dereferencing b is going to cause the access violation you're seeing.
And I mean literally of Vec3* or Vec3[] type, not a char* or something. Because you're adding 1 to it, the pointer has to have the type of an individual item so the compiler knows how many bytes to move the pointer forward by.
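The point about the step size can be checked directly. This is only a minimal sketch, assuming a three-float Vec3 like the one in the question; the helper names integerStep and pointerStep are made up for illustration. Adding 1 to an address held in a uintptr_t moves it by one byte, while adding 1 to a Vec3* moves it by sizeof(Vec3) bytes.

```cpp
#include <cstddef>
#include <cstdint>

struct Vec3 { float x, y, z; };

// Byte distance covered by adding 1 to an address stored as an integer.
inline std::size_t integerStep(const Vec3* p) {
    const std::uintptr_t raw = reinterpret_cast<std::uintptr_t>(p);
    return static_cast<std::size_t>((raw + 1) - raw);   // plain integer arithmetic
}

// Byte distance covered by adding 1 to a typed pointer.
inline std::size_t pointerStep(const Vec3* p) {
    const Vec3* next = p + 1;                            // scaled by sizeof(Vec3)
    return static_cast<std::size_t>(
        reinterpret_cast<const char*>(next) - reinterpret_cast<const char*>(p));
}
```

With the code in the question, c is a uintptr_t, so c + 0x1 is the integer case: the resulting address is one byte further on, not one Vec3 further on.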
Given two integers X and Y, what's the most efficient way of converting them into an X.Y float value in C++?
E.g.
X = 3, Y = 1415 -> 3.1415
X = 2, Y = 12 -> 2.12
Here are some cocktail-napkin benchmark results, on my machine, for all solutions converting two ints to a float, as of the time of writing.
Caveat: I've now added a solution of my own, which seems to do well, and am therefore biased! Please double-check my results.
| Test | Iterations | ns / iteration |
|---|---|---|
| #aliberro's conversion v2 | 79,113,375 | 13 |
| #3Dave's conversion | 84,091,005 | 12 |
| #einpoklum's conversion | 1,966,008,981 | 0 |
| #Ripi2's conversion | 47,374,058 | 21 |
| #TarekDakhran's conversion | 1,960,763,847 | 0 |
CPU: Quad Core Intel Core i5-7600K speed/min/max: 4000/800/4200 MHz
Devuan GNU/Linux 3
Kernel: 5.2.0-3-amd64 x86_64
GCC 9.2.1, with flags: -O3 -march=native -mtune=native
Benchmark code (Github Gist).
float sum = x + y / pow(10,floor(log10(y)+1));
log10 returns log (base 10) of its argument. For 1234, that'll be 3 point something.
Breaking this down:
log10(1234) = 3.091315159697223
floor(log10(1234)+1) = 4
pow(10,4) = 10000.0
3 + 1234 / 10000.0 = 3.1234.
But, as #einpoklum pointed out, log10(0) does not give a usable digit count (it evaluates to -infinity), so you have to check for that.
#include <iostream>
#include <cmath>
#include <vector>
using namespace std;
float foo(int x, unsigned int y)
{
if (0==y)
return x;
float den = pow(10,-1 * floor(log10(y)+1));
return x + y * den;
}
int main()
{
vector<vector<int>> tests
{
{3,1234},
{1,1000},
{2,12},
{0,0},
{9,1}
};
for(auto& test: tests)
{
cout << "Test: " << test[0] << "," << test[1] << ": " << foo(test[0],test[1]) << endl;
}
return 0;
}
See runnable version at:
https://onlinegdb.com/rkaYiDcPI
With test output:
Test: 3,1234: 3.1234
Test: 1,1000: 1.1
Test: 2,12: 2.12
Test: 0,0: 0
Test: 9,1: 9.1
Edit
Small modification to remove division operation.
(reworked solution)
Initially, my thoughts were improving on the performance of power-of-10 and division-by-power-of-10 by writing specialized versions of these functions, for integers. Then there was #TarekDakhran's comment about doing the same for counting the number of digits. And then I realized: That's essentially doing the same thing twice... so let's just integrate everything. This will, specifically, allow us to completely avoid any divisions or inversions at runtime:
inline float convert(int x, int y) {
float fy (y);
if (y == 0) { return float(x); }
if (y >= 1e9) { return float(x + fy * 1e-10f); }
if (y >= 1e8) { return float(x + fy * 1e-9f); }
if (y >= 1e7) { return float(x + fy * 1e-8f); }
if (y >= 1e6) { return float(x + fy * 1e-7f); }
if (y >= 1e5) { return float(x + fy * 1e-6f); }
if (y >= 1e4) { return float(x + fy * 1e-5f); }
if (y >= 1e3) { return float(x + fy * 1e-4f); }
if (y >= 1e2) { return float(x + fy * 1e-3f); }
if (y >= 1e1) { return float(x + fy * 1e-2f); }
return float(x + fy * 1e-1f);
}
Additional notes:
This will work for y == 0, but not for negative x or y values. Adapting it for negative values is pretty easy and not very expensive, though.
Not sure if this is absolutely optimal. Perhaps a binary-search for the number of digits of y would work better?
A loop would make the code look nicer; but the compiler would need to unroll it. Would it unroll the loop and compute all those floats beforehand? I'm not sure.
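For comparison, here is roughly what the loop form mentioned in the last note could look like. It is an untested-for-speed sketch, not part of the benchmark above; the name convert_loop and the table of reciprocals are assumptions made for illustration. Like the unrolled version, it assumes non-negative x and y.

```cpp
#include <cstdint>

// Loop form of the comparison chain: count the decimal digits of y,
// then multiply by the precomputed reciprocal of the matching power of ten.
// Assumes x >= 0 and y >= 0.
inline float convert_loop(int x, int y) {
    static const float inverse[] = {
        1e-1f, 1e-2f, 1e-3f, 1e-4f, 1e-5f,
        1e-6f, 1e-7f, 1e-8f, 1e-9f, 1e-10f
    };
    if (y == 0) { return float(x); }
    std::int64_t bound = 10;   // 64-bit so that 10^10 does not overflow
    int digits = 1;
    while (y >= bound) {
        bound *= 10;
        ++digits;
    }
    return float(x) + float(y) * inverse[digits - 1];
}
```

Whether a given compiler unrolls this into something as fast as the hand-written chain would have to be measured.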
I put some effort into optimizing my previous answer and ended up with this.
#include <cstdint>
inline uint32_t digits_10(uint32_t x) {
return 1u
+ (x >= 10u)
+ (x >= 100u)
+ (x >= 1000u)
+ (x >= 10000u)
+ (x >= 100000u)
+ (x >= 1000000u)
+ (x >= 10000000u)
+ (x >= 100000000u)
+ (x >= 1000000000u)
;
}
inline uint64_t pow_10(uint32_t exp) {
uint64_t res = 1;
while(exp--) {
res *= 10u;
}
return res;
}
inline double fast_zip(uint32_t x, uint32_t y) {
return x + static_cast<double>(y) / pow_10(digits_10(y));
}
double IntsToDbl(int ipart, int decpart)
{
//The decimal part:
double dp = (double) decpart;
while (dp >= 1) // >= so that 1, 10, 100, ... are also scaled below 1
{
dp /= 10;
}
//Join both parts
return ipart + dp;
}
A simple and very fast solution is to convert both values x and y to strings, concatenate them, and then parse the result back into a floating-point number, as follows:
#include <string>
#include <iostream>
int main() {
int x = 3, y = 1415; // example inputs
std::string x_string = std::to_string(x);
std::string y_string = std::to_string(y);
// concatenate the parts, then parse the text back into a float
float result = std::stof(x_string + "." + y_string);
std::cout << result << '\n';
}
(Answer based on the fact that OP has not indicated what they want to use the float for.)
The fastest (most efficient) way is to do it implicitly, but not actually do anything (after compiler optimizations).
That is, write a "pseudo-float" class, whose members are integers of x and y's types before and after the decimal point; and have operators for doing whatever it is you were going to do with the float: operator+, operator*, operator/, operator- and maybe even implementations of pow(), log2(), log10() and so on.
Unless you literally need to save a 4-byte float somewhere for later use, it will almost certainly be faster to work with x and y directly once you have the next operand than to first build a float from them, which already loses precision and wastes time.
Try this
#include <iostream>
#include <math.h>
using namespace std;
float int2Float(int integer,int decimal)
{
float sign = integer < 0 ? -1.0f : 1.0f; // avoids dividing by zero when integer == 0
float tm = abs(integer), tm2 = abs(decimal);
int base = decimal == 0 ? -1 : log10(abs(decimal));
tm2/=pow(10,base+1);
return (tm+tm2)*sign;
}
int main()
{
int x,y;
cin >>x >>y;
cout << int2Float(x,y);
return 0;
}
version 2, try this out
#include <iostream>
#include <cmath>
using namespace std;
float getPlaces(int x)
{
unsigned char p=0;
while(x!=0)
{
x/=10;
p++;
}
// an int has at most 10 decimal digits, so the table needs entries up to 10^10
static const float pow10[] = {1.0f,1e1f,1e2f,1e3f,1e4f,1e5f,1e6f,1e7f,1e8f,1e9f,1e10f};
return pow10[p];
}
float int2Float(int x,int y)
{
if(y == 0) return x;
float sign = x != 0 ? x/abs(x) : 1;
float tm = abs(x), tm2 = abs(y);
tm2/=getPlaces(y);
return (tm+tm2)*sign;
}
int main()
{
int x,y;
cin >>x >>y;
cout << int2Float(x,y);
return 0;
}
If you want something that is simple to read and follow, you could try something like this:
float convertToDecimal(int x)
{
float y = (float) x;
while( y >= 1 ){ // >= so that 1, 10, 100, ... are also scaled below 1
y = y / 10;
}
return y;
}
float convertToDecimal(int x, int y)
{
return (float) x + convertToDecimal(y);
}
This simply scales one integer down to the first floating-point value less than 1 and adds it to the other one.
This becomes a problem if you ever want to represent a number like 1.0012 as 2 integers, because the leading zeros of the fractional part are lost. But that isn't part of the question. To solve it, I would use a third integer to hold the negative power of 10 by which the second number is multiplied, i.e. 1.0012 would be 1, 12, 4. This would then be coded as follows:
float convertToDecimal(int num, int e)
{
return ((float) num) / pow(10, e);
}
float convertToDecimal(int x, int y, int e)
{
return (float) x + convertToDecimal(y, e);
}
It's a little more concise with this answer, but it doesn't help to answer your question. It might help show a problem with using only 2 integers if you stick with that data model.
What is the best practice in this case:
Should I get variables before running a for loop like this:
void Map::render(int layer, Camera* pCam)
{
int texture_index(m_tilesets[layer]->getTextureIndex());
int tile_width(m_size_of_a_tile.getX());
int tile_height(m_size_of_a_tile.getY());
int camera_x(pCam->getPosition().getX());
int camera_y(pCam->getPosition().getY());
int first_tile_x(pCam->getDrawableArea().getX());
int first_tile_y(pCam->getDrawableArea().getY());
int map_max_x( (640 / 16) + first_tile_x );
int map_max_y( (360 / 16) + first_tile_y );
if (map_max_x > 48) { map_max_x = 48; }
if (map_max_y > 28) { map_max_y = 28; }
Tile* t(nullptr);
for (int y(first_tile_y); y < map_max_y; ++y) {
for (int x(first_tile_x); x < map_max_x; ++x) {
// move map relative to camera
m_dst_rect.x = (x * tile_width) + camera_x;
m_dst_rect.y = (y * tile_height) + camera_y;
t = getTile(layer, x, y);
if (t) {
pTextureManager->draw(texture_index, getTile(layer, x, y)->src, m_dst_rect);
}
}
}
}
or is it better to get it directly in the loop like this (in this case the code is shorter but less readable):
void Map::render(int layer, Camera* pCam)
{
int first_tile_x(pCam->getDrawableArea().getX());
int first_tile_y(pCam->getDrawableArea().getY());
for (int y(first_tile_y); y < (360 / 16) + first_tile_y; ++y) {
for (int x(first_tile_x); x < (640 / 16) + first_tile_x; ++x) {
// move map relative to camera
m_dst_rect.x = (x * m_size_of_a_tile.getX()) + pCam->getPosition().getX();
m_dst_rect.y = (y * m_size_of_a_tile.getY()) + pCam->getPosition().getY();
Tile* t(getTile(layer, x, y));
if (t) {
pTextureManager->draw(m_tilesets[layer]->getTextureIndex(), getTile(layer, x, y)->src, m_dst_rect);
}
}
}
}
Is there an impact on performance using one method over another?
Stylistically, the second version is preferable because it keeps each object in the scope where it is used instead of leaking it into a wider context. Performance-wise you will need to profile, but I'd be surprised if there was any difference at all: a compiler will often notice that the results don't change, at least for simple functions, and do this optimization for you.
For functions that are more complex or potentially dynamic, but you know they will not change their result during the for loop it makes sense to define them before the loop.
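To make the "define it before the loop" case concrete, here is a small sketch. The Camera type below is a hypothetical stand-in, not the class from the question: its getter increments a counter, which is exactly the kind of side effect that stops a compiler from hoisting the call on its own.

```cpp
#include <vector>

// Hypothetical stand-in for an object with a non-trivial getter.
struct Camera {
    int offset;
    mutable int calls;                 // counts how often the getter runs
    explicit Camera(int o) : offset(o), calls(0) {}
    int getOffset() const { ++calls; return offset; }
};

// Calls the getter on every iteration.
inline long sumInLoop(const std::vector<int>& xs, const Camera& cam) {
    long total = 0;
    for (int x : xs) { total += x + cam.getOffset(); }
    return total;
}

// Calls the getter once before the loop and reuses the value.
inline long sumHoisted(const std::vector<int>& xs, const Camera& cam) {
    const int offset = cam.getOffset();
    long total = 0;
    for (int x : xs) { total += x + offset; }
    return total;
}
```

Both functions return the same sum; only the number of getter calls differs (one per element versus one in total). For trivial inline getters the generated code is likely to be the same either way, so this only matters when the call is expensive or opaque to the optimizer.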
How to find the smallest float/double number x which satisfies x + d = y given d and y?
(IIUC, this is theoretically solved by calling fesetround(FE_DOWNWARD) and just computing y - d, but in clang/Xcode I got a warning that FENV_ACCESS isn't supported, and in practice I found that it didn't work.)
So, so far I made this:
// Find minimum x value so that x + d = y
template<typename T, bool supportDenormals = false>
T subtractMost (const T y, const T d)
{
T x = y - d;
while (true)
{
const T nextX =
x == 0 && !supportDenormals
? -std::numeric_limits<T>::min()
: nextafter (x, -std::numeric_limits<T>::infinity());
if (nextX + d != y)
return x;
T step = x - nextX;
while (true)
{
const T nextStep = step + step;
if (x - nextStep + d != y)
break;
step = nextStep;
}
x -= step;
}
}
This performs quite a lot of operations to find the result, so I wonder:
Is there a more efficient solution or a more standard way to achieve this?
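To make the problem concrete: because x + d is rounded, many different x values can satisfy x + d == y, and y - d is just one of them. The sketch below only demonstrates that premise; it is not a faster solution. The helper names solves and below are made up for illustration, and it assumes the default round-to-nearest mode with denormals enabled.

```cpp
#include <cmath>
#include <limits>

// True when x is one of the solutions of x + d == y in float arithmetic.
inline bool solves(float x, float d, float y) {
    const float sum = x + d;   // rounded to float
    return sum == y;
}

// The next representable float below x.
inline float below(float x) {
    return std::nextafter(x, -std::numeric_limits<float>::infinity());
}
```

For example, with y = 1.0f and d = 1.0f, y - d is 0, but below(0.0f) and -1e-10f also satisfy the equation, because the sum rounds back up to 1.0f. A value such as -1e-3f does not.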
I am trying to push data to my vector, but I'm met with an error:
expression must have class type
This is my code:
float calcX(float u, float v)
{
return (((-90.0*pow(u, 5) + 225.0*pow(u, 4) - 270.0*pow(u, 3) + 180.0*pow(u, 2) - 45.0*u)*cos(pi*v)));
}
float calcY(float u, float v)
{
return (160.0*pow(u, 4) - 320.0*pow(u, 3) + 160.0*pow(u, 2) - 5.0f);
}
float calcZ(float u, float v)
{
return (((-90.0*pow(u, 5) + 225.0*pow(u, 4) - 270.0*pow(u, 3) + 180.0*pow(u, 2) - 45.0*u)*sin(pi*v)));
}
typedef float point3[3];
std::vector <point3*> createEggBuffor(int n=20)
{
std::vector <point3*> egg;
for (int u = 0; u < n; u++) {
for (int v = 0; v < n; v++) {
egg[u][v][0].push_back(calcX(static_cast<float>(u) / (n - 1), static_cast<float>(v) / (n - 1)));
egg[u][v][1].push_back(calcY(static_cast<float>(u) / (n - 1), static_cast<float>(v) / (n - 1)));
egg[u][v][2].push_back(calcZ(static_cast<float>(u) / (n - 1), static_cast<float>(v) / (n - 1)));
}
}
return egg;
}
What does this error mean?
egg[u] is a point3*. egg[u][v] is a point3 (which is an array of 3 floats). So egg[u][v][N] is a float. float is a built-in type, and does not have a member function named push_back, or any members at all. The error is telling you that, since it is not a class type, you can't use the dot operator to access members of it (since it doesn't have any).
If you're trying to push back elements onto your egg vector, it would look like this:
egg.push_back(something);
Where something is an expression of type point3*.
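If the goal is simply a list of 3D points, one option is to store each point by value instead of through a pointer. The following is a sketch under that assumption, not the asker's exact design: Point3 as a std::array, the name createEggBuffer, and the placeholder makePoint (standing in for calcX, calcY and calcZ) are all introduced here for illustration.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Assumption: each point is stored by value as three floats.
using Point3 = std::array<float, 3>;

// Placeholder for calcX / calcY / calcZ from the question.
inline Point3 makePoint(float u, float v) {
    return Point3{ {u, v, u + v} };
}

// Builds n * n points; each push_back appends one whole point to the vector.
inline std::vector<Point3> createEggBuffer(int n = 20) {
    std::vector<Point3> egg;
    if (n < 2) { return egg; }         // (n - 1) is used as a divisor below
    egg.reserve(static_cast<std::size_t>(n) * n);
    for (int u = 0; u < n; ++u) {
        for (int v = 0; v < n; ++v) {
            const float fu = static_cast<float>(u) / (n - 1);
            const float fv = static_cast<float>(v) / (n - 1);
            egg.push_back(makePoint(fu, fv));
        }
    }
    return egg;
}
```

Here push_back is called on the vector itself, which is a class type, so the "expression must have class type" error does not arise. The point for grid position (u, v) ends up at index u * n + v.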
For a videogame I'm implementing in my spare time, I've tried implementing my own versions of sinf(), cosf(), and atan2f(), using lookup tables. The intent is to have implementations that are faster, although with less accuracy.
My initial implementation is below. The functions work, and return good approximate values. The only problem is that they are slower than calling the standard sinf(), cosf(), and atan2f() functions.
So, what am I doing wrong?
// Geometry.h includes definitions of PI, TWO_PI, etc., as
// well as the prototypes for the public functions
#include "Geometry.h"
#include <cassert> // assert
#include <cmath>   // sinf, atanf, fabsf, signbit
namespace {
// Number of entries in the sin/cos lookup table
const int SinTableCount = 512;
// Angle covered by each table entry
const float SinTableDelta = TWO_PI / (float)SinTableCount;
// Lookup table for Sin() results
float SinTable[SinTableCount];
// This object initializes the contents of the SinTable array exactly once
class SinTableInitializer {
public:
SinTableInitializer() {
for (int i = 0; i < SinTableCount; ++i) {
SinTable[i] = sinf((float)i * SinTableDelta);
}
}
};
static SinTableInitializer sinTableInitializer;
// Number of entries in the atan lookup table
const int AtanTableCount = 512;
// Interval covered by each Atan table entry
const float AtanTableDelta = 1.0f / (float)AtanTableCount;
// Lookup table for Atan() results
float AtanTable[AtanTableCount];
// This object initializes the contents of the AtanTable array exactly once
class AtanTableInitializer {
public:
AtanTableInitializer() {
for (int i = 0; i < AtanTableCount; ++i) {
AtanTable[i] = atanf((float)i * AtanTableDelta);
}
}
};
static AtanTableInitializer atanTableInitializer;
// Lookup result in table.
// Preconditions: y > 0, x > 0, y < x
static float AtanLookup2(float y, float x) {
assert(y > 0.0f);
assert(x > 0.0f);
assert(y < x);
const float ratio = y / x;
const int index = (int)(ratio / AtanTableDelta);
return AtanTable[index];
}
}
float Sin(float angle) {
// If angle is negative, reflect around X-axis and negate result
bool mustNegateResult = false;
if (angle < 0.0f) {
mustNegateResult = true;
angle = -angle;
}
// Normalize angle so that it is in the interval [0.0, TWO_PI)
while (angle >= TWO_PI) {
angle -= TWO_PI;
}
const int index = (int)(angle / SinTableDelta);
const float result = SinTable[index];
return mustNegateResult? (-result) : result;
}
float Cos(float angle) {
return Sin(angle + PI_2);
}
float Atan2(float y, float x) {
// Handle x == 0 or x == -0
// (See atan2(3) for specification of sign-bit handling.)
if (x == 0.0f) {
if (y > 0.0f) {
return PI_2;
}
else if (y < 0.0f) {
return -PI_2;
}
else if (signbit(x)) {
return signbit(y)? -PI : PI;
}
else {
return signbit(y)? -0.0f : 0.0f;
}
}
// Handle y == 0, x != 0
if (y == 0.0f) {
return (x > 0.0f)? 0.0f : PI;
}
// Handle y == x
if (y == x) {
return (x > 0.0f)? PI_4 : -(3.0f * PI_4);
}
// Handle y == -x
if (y == -x) {
return (x > 0.0f)? -PI_4 : (3.0f * PI_4);
}
// For other cases, determine quadrant and do appropriate lookup and calculation
bool right = (x > 0.0f);
bool top = (y > 0.0f);
if (right && top) {
// First quadrant
if (y < x) {
return AtanLookup2(y, x);
}
else {
return PI_2 - AtanLookup2(x, y);
}
}
else if (!right && top) {
// Second quadrant
const float posx = fabsf(x);
if (y < posx) {
return PI - AtanLookup2(y, posx);
}
else {
return PI_2 + AtanLookup2(posx, y);
}
}
else if (!right && !top) {
// Third quadrant
const float posx = fabsf(x);
const float posy = fabsf(y);
if (posy < posx) {
return -PI + AtanLookup2(posy, posx);
}
else {
return -PI_2 - AtanLookup2(posx, posy);
}
}
else { // right && !top
// Fourth quadrant
const float posy = fabsf(y);
if (posy < x) {
return -AtanLookup2(posy, x);
}
else {
return -PI_2 + AtanLookup2(x, posy);
}
}
return 0.0f;
}
"Premature optimization is the root of all evil" - Donald Knuth
Nowadays compilers provide very efficient intrinsics for trigonometric functions that get the best from modern processors (SSE etc.), which explains why you can hardly beat the built-in functions. Don't lose too much time on these parts and instead concentrate on the real bottlenecks that you can spot with a profiler.
Remember you have a co-processor ... you would have seen an increase in speed if it were 1993 ... however today you will struggle to beat native intrinsics.
Try viewing the disassembly of sinf.
Someone has already benchmarked this, and it looks as though the Trig.Math functions are already optimized, and will be faster than any lookup table you can come up with:
http://www.tommti-systems.de/go.html?http://www.tommti-systems.de/main-Dateien/reviews/languages/benchmarks.html
(They didn't use anchors on the page so you have to scroll about 1/3 of the way down)
This part worries me:
// Normalize angle so that it is in the interval [0.0, TWO_PI)
while (angle >= TWO_PI) {
angle -= TWO_PI;
}
But you can measure it:
Add timers to all the functions, write dedicated performance tests, run them, and print a report of the timings. I think you will know the answer after these tests.
You could also use a profiling tool such as AQTime.
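A timing test along those lines could be sketched as follows. This is a rough harness, not a rigorous benchmark; the name timeTrig, the angle sweep, and the sink parameter are assumptions made for illustration.

```cpp
#include <chrono>
#include <cmath>

// Runs f over a sweep of angles and returns the elapsed time in nanoseconds.
// The accumulated sum is written to *sink so the calls are not optimized away.
template <typename F>
long long timeTrig(F f, int iterations, float* sink) {
    const auto start = std::chrono::steady_clock::now();
    float sum = 0.0f;
    for (int i = 0; i < iterations; ++i) {
        sum += f(static_cast<float>(i) * 0.001f);
    }
    const auto stop = std::chrono::steady_clock::now();
    *sink = sum;
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling it once with a lambda wrapping sinf and once with the table-based Sin from the question, using the same iteration count and an optimized build, gives two numbers that can be compared directly. Repeating the runs a few times helps smooth out noise.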
The built-in functions are very well optimized already, so it's going to be REALLY tough to beat them. Personally, I'd look elsewhere for places to gain performance.
That said, one optimization I can see in your code:
// Normalize angle so that it is in the interval [0.0, TWO_PI)
while (angle >= TWO_PI) {
angle -= TWO_PI;
}
Could be replaced with:
angle = fmod(angle, TWO_PI);
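The two forms can be put side by side as below. This is a sketch only: kTwoPi stands in for TWO_PI from Geometry.h, the function names are made up, and both versions assume angle >= 0, which holds in Sin() because negative angles are negated first.

```cpp
#include <cmath>

const float kTwoPi = 6.28318530718f;   // stand-in for TWO_PI from Geometry.h

// Repeated subtraction, as in the question: one iteration per full turn.
inline float wrapWithLoop(float angle) {
    while (angle >= kTwoPi) { angle -= kTwoPi; }
    return angle;
}

// Single call; the amount of work does not grow with the size of the angle.
inline float wrapWithFmod(float angle) {
    return std::fmod(angle, kTwoPi);
}
```

The results agree up to rounding, since each subtraction in the loop can introduce a small error. For angles that are usually below TWO_PI the loop never runs, so whether fmod is actually a win depends on the inputs and should be measured.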