What is the reason to store IPv6 as big endian - c++

I am unsure how to store IPv6 addresses on my system. My system accepts a packet from the network, runs statistics on it, and sends it forward.
I have read that the common way to store it is in big-endian (network order), regardless of the endianness of the CPU. Although several colleagues of mine said they are also familiar with this convention, no one, including Google, could give a concrete explanation of why it's the customary tradition.
The way I see it, the data will be accessed many times in the system, so isn't it simpler to change it to host order (in my case it's little-endian) and use it throughout the system without having to run ntoh calls each time I want to run some arithmetic operation on it, and simply change it back to network order if I want to use this address in a packet I send back on the network?

The reason is as simple as: because RFC-1700 says so.
Following a three-decade dispute over which endianness is the single "correct one", Danny Cohen published On Holy Wars and a Plea for Peace in an attempt to shed light on the problem (which is deeper than just "byte order") and with the futile hope that the industry would agree on one consistent order.
The bottom line was that as long as you transmit a message as "one message", there is no such problem as order, but as soon as you transmit sub-parts of the message (words, bytes, or bits) you need to decide on one order. It does not matter much which one you choose as long as you stick to your decision. The debate about which order is more correct than the other was as unproductive and silly as the debate on how to break an egg in "Gulliver's Travels", which Cohen referred to and took the name "endianness" from.
In 1994, the authors of RFC-1700 decided to end the debate, at least as far as the IP suite was concerned, by stating:
The convention in the documentation of Internet Protocols is to express numbers in decimal and to picture data in "big-endian" order. That is, fields are described left to right, with the most significant octet on the left and the least significant octet on the right.
Every RFC thereafter has followed (explicitly or silently) that convention, which of course includes IPv6.
Practically speaking, it doesn't matter what byte order IP addresses are. They could be words or bytes in little-endian or big-endian, or nibbles, or they could be counting Quatloos if the implementors deemed that practical.
For 99% of all people, 99% of the time, it makes zero difference because neither are you supposed to look at or understand the address, nor do you need to remember "magic values", nor do you (normally) need to modify an address.
You usually get an opaque block of memory that is an IP address and port from "somewhere" (say, getaddrinfo or recvfrom) and you use that blob as-is, e.g. with connect or sendto.
You do not normally need to perform "math" on the address, or anything similar. At best, you might want to compare two addresses for equality.
Yes, applications that need to perform more complicated things with addresses exist, but they are by far the minority of applications.

The way I see it, the data will be accessed many times in the system, so isn't it simpler to change it to host order (in my case it's little-endian) and use it throughout the system without having to run ntoh calls each time I want to run some arithmetic operation on it, and simply change it back to network order if I want to use this address in a packet I send back on the network?
Well, you're right to wonder about this, but I believe that what you call a customary tradition is just a design choice you make when writing an application.
I might reverse your question: when you don't need to do any arithmetic on the addresses, why go to the trouble of reversing the order twice just for the sake of keeping it in the host's order?
If your application does a lot of maths on the IP address, it might indeed be smart to convert it to little-endian and convert it back before sending it. But if you do not do any maths on it, then just keep it big-endian. And don't forget that x86 is not the only CPU around: there are other host architectures, such as ARM and PPC, that are bi-endian (PPC was traditionally big-endian).
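If you do end up needing arithmetic, here is a minimal sketch of the round trip, assuming the address arrives as 16 raw bytes in network order. The names Ipv6Host, from_network, to_network and next_address are made up for illustration. It uses shifts rather than ntoh-style calls, so the same code gives the same result on little-endian and big-endian hosts.

```cpp
#include <array>
#include <cstdint>

// Hypothetical host-order representation: two 64-bit halves.
struct Ipv6Host {
    std::uint64_t hi;  // most significant 64 bits
    std::uint64_t lo;  // least significant 64 bits
};

// Build host-order integers from 16 network-order (big-endian) bytes.
// Shifts operate on values, not on memory layout, so this does not
// depend on the endianness of the CPU.
inline Ipv6Host from_network(const std::array<std::uint8_t, 16>& bytes) {
    Ipv6Host a{0, 0};
    for (int i = 0; i < 8; ++i) {
        a.hi = (a.hi << 8) | bytes[i];
        a.lo = (a.lo << 8) | bytes[i + 8];
    }
    return a;
}

// Write the two halves back out, most significant byte first.
inline std::array<std::uint8_t, 16> to_network(const Ipv6Host& a) {
    std::array<std::uint8_t, 16> bytes{};
    for (int i = 0; i < 8; ++i) {
        bytes[i]     = static_cast<std::uint8_t>(a.hi >> (56 - 8 * i));
        bytes[i + 8] = static_cast<std::uint8_t>(a.lo >> (56 - 8 * i));
    }
    return bytes;
}

// Example arithmetic: the next address, carrying from lo into hi.
inline Ipv6Host next_address(Ipv6Host a) {
    if (++a.lo == 0) ++a.hi;
    return a;
}
```

Whether this is worth it depends on how much arithmetic you really do; for plain equality checks, comparing the 16 bytes as they are is enough.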

Related

Why we don't care about bit order?

Recently I've learned some Windows socket programming to achieve some socket connection stuff.
Inside the code we use functions like htonl() and htons() to convert our data from host byte order to network byte order (so-called big-endian), since some machines, such as Intel CPUs as far as I know, store data as little-endian.
But what confuses me is this: isn't the really important thing the bit order rather than the byte order?
Since bit is the minimum unit that computers used, not byte.
Let's say we want to pass a u_short u=258 (0x0102) to another machine from a little-endian machine.
On our machine, u's first byte, the least significant one, is 2 and the second byte, the most significant one, is 1. Let me express this as [2][1].
So we need to call htons(u) first to convert [2][1] into [1][2] and send it to the network.
The remote machine will receive the byte 2 as a sequence of bits, 0000-0010. How does the machine know this byte is 2? Won't it think 0000-0010 represents 64 (0100-0000)? Do all machines store bytes in the same way?
"Bit Ordering" is specified by the protocols used to transfer data from the hardware (like your network card/port) to host memory.
Technically, this applies in many, many cases. For instance, you might ask the same question about a hard drive from a machine that stores bits in one order (01234567) being moved to a machine that stores bits in the opposite order (76543210): will it read the data correctly or not?
But the simple answer is that it always reads correctly, because the protocols used to map the hard drives to the system bus specify the exact ordering of the bits as they are "presented" to host memory.
Network cards and networking hardware have a similar behavior: They have a standardized "bit ordering" they use in hardware, and as part of the hardware protocol, it "presents" those bits to the Host in whatever form the host expects them.
"Byte Ordering", of course, is a separate thing, and more difficult to deal with, because network hardware and storage hardware only recognize "streams of bytes" (Yes, I'm oversimplifying), and don't much care for what those bytes actually mean.
But in terms of "Bit Ordering", unless you're writing code "On the Metal", so to speak, you don't need to think about it.
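To make the byte-level part concrete, here is a small sketch using the value 0x0102, whose bytes are 1 and 2. The function names to_wire16 and from_wire16 are made up for illustration; on a sockets API you would normally call htons and ntohs instead. Note that the code only ever deals with whole bytes: the order of bits inside each byte never shows up at this level.

```cpp
#include <array>
#include <cstdint>

// Write a 16-bit value most significant byte first (network order).
inline std::array<std::uint8_t, 2> to_wire16(std::uint16_t v) {
    return {static_cast<std::uint8_t>(v >> 8),
            static_cast<std::uint8_t>(v & 0xFF)};
}

// Rebuild the value from two bytes received in network order.
inline std::uint16_t from_wire16(const std::array<std::uint8_t, 2>& b) {
    return static_cast<std::uint16_t>((b[0] << 8) | b[1]);
}
```

Each byte arrives on the other side with the same numeric value it was sent with, so the receiver only has to agree on which byte comes first.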

Is it possible to change a Byte Array (4) in a Compiled Application

My friend and I have been creating an advanced C++ TCP client. He created the client, and I created the server. The client has a static IP inside the code and we lost the code for the client. I am wondering whether it is possible to decompile it in IDA and change the IP. I have been scanning through IDA and I have not found the IP anywhere. Does anyone know if this is possible?
It's not as simple as just recreating the client; it is a bit more complex than just placing a listener and client.
Yes, it is certainly possible, and probably not too hard.
Suppose the IP address is 10.11.12.13. Search through the binary for 0D 0C 0B 0A and 0A 0B 0C 0D... the IP address might be stored in network byte order (big endian), or in host byte order (little endian), depending on how it was written and how it was optimized. Note that this may be more difficult if you are on another architecture. On some architectures (not x86), if you want to load a 32-bit constant like an IP address, you'll do it by loading two 16-bit constants.
Decompilation might not even be necessary. You just have to make sure that the new IP address is added using the same byte order.
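A rough sketch of that search, assuming the executable has already been read into a byte vector (for example with std::ifstream in binary mode). The names find_all, find_ipv4 and IpHits are made up for illustration. It reports the offsets of the four address bytes in both orders.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Return every offset in `image` where the 4-byte `pattern` occurs.
inline std::vector<std::size_t> find_all(const std::vector<std::uint8_t>& image,
                                         const std::array<std::uint8_t, 4>& pattern) {
    std::vector<std::size_t> hits;
    auto it = image.begin();
    while ((it = std::search(it, image.end(),
                             pattern.begin(), pattern.end())) != image.end()) {
        hits.push_back(static_cast<std::size_t>(it - image.begin()));
        ++it;  // keep looking after this match
    }
    return hits;
}

// Offsets of an IPv4 address stored in either byte order.
struct IpHits {
    std::vector<std::size_t> network_order;  // e.g. 0A 0B 0C 0D
    std::vector<std::size_t> reversed;       // e.g. 0D 0C 0B 0A
};

inline IpHits find_ipv4(const std::vector<std::uint8_t>& image,
                        std::array<std::uint8_t, 4> ip) {
    IpHits h;
    h.network_order = find_all(image, ip);
    std::reverse(ip.begin(), ip.end());
    h.reversed = find_all(image, ip);
    return h;
}
```

If exactly one offset comes back across both lists, that is a good candidate to patch; several hits mean you need more context before changing anything.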
I am assuming here that your IP address isn't stored as a string, which is also possible, in which case the new address would have to be shorter.
(Of course, the lesson here is that you should always use host names instead of hard-coding IP addresses into your code, but you'll do better next time, right? If you use a host name, you can always just change DNS records when your server moves, or modify /etc/hosts (C:\Windows\System32\Drivers\etc\hosts on Windows).)
If the address is in fact stored in an array of 4 bytes (regardless of how it's declared), then it's quite possible to change it in the executable image.
Finding it, with confidence, is another story. Depending on how the code was written, the bytes may be in ascending or descending order of significance. Let's say the address is 12.34.56.78: if you search the bytes of the executable for those four bytes in either order and find exactly one instance, it's pretty likely that's them, and depending on how brave you are, you can just change them and see if it works.
If you find more than one instance (in either order), things get significantly trickier.
If you have a recollection of what the code looked like where the address was stored and used it'll make it much easier to find. In particular, if the address was actually stored in a data segment, especially if referenced from another module, that narrows down where you need to search.
Because IPv4 addresses fit comfortably in 32-bit integers, it's entirely possible to use them in a manner where they'll only appear in actual machine instructions, which takes you into the code segment, a much more dangerous place to be playing around.
I'd only do this for a one-off check - without the source code, the software is un-maintainable, so for anything beyond the most minimal usage, I'd say you really need to rewrite it ... and keep the source!

What is the de facto standard for sharing variables between programs in different languages?

I've never had formal training in this area so I'm wondering what do they teach in school (if they do).
Say you have two programs written in two different languages, C++ and Python or some other combination, and you want to share a constantly updated variable on the same machine. What would you use and why? The information need not be secured but must be isochronous and should be reliable.
Eg. Program A will get a value from a hardware device and update variable X every 0.1ms, I'd like to be able to access this X from Program B as often as possible and obtain the latest values. Program A and B are written and compiled in two different (robust) languages. How do I access X from program B? Assume I have the source code from A and B and I do not want to completely rewrite or port either of them.
The methods I've seen used thus far include:
File Buffer - Read and write to a single file (eg C:\temp.txt).
Create a wrapper - From A to B or B to A.
Memory Buffer - Designate a specific memory address (mutex?).
UDP packets via sockets - Haven't tried it yet but looks good. Firewall?
Sorry for just throwing this out there, I don't know what the name of this technique is so I have trouble searching.
Well, you can write XML and use some basic message queuing (like RabbitMQ) to pass messages around.
Don't know if this will be helpful, but I'm also a student, and this is what I think you mean.
I've used marshalling to get a Java class and import it into a C# program.
With marshalling you use XML to transfer data in a form that can be read by other programming environments.
When asking particular questions, you should aim at providing as much information as possible. You have added a use case, but the use case is incomplete.
Your particular use case seems to involve a very small amount of data that has to be available at a high frequency (10 kHz). I would first try to determine whether I can actually make both pieces of code part of a single process, rather than two different processes. Depending on the languages (missing from the question) it might even be simple, and it might turn the impossible into the possible: depending on the OS (missing from the question), the scheduler might not switch from one process to another fast enough, which would affect the availability of the latest reading. Switching between threads is usually much faster.
If you cannot turn them into a single process, then you will have to use some sort of IPC (inter-process communication). Given the frequency, I would rule out most heavyweight protocols (avoid XML, CORBA), as the overhead will probably be too high. If the receiving end only needs access to the latest value, and may read less often than every 0.1 ms, then you don't want any protocol that includes queueing: you do not want to read the next element in the queue, you only care about the last one. If you did not read an element while it was fresh, avoid the cost of processing it once it is stale; it does not make sense to loop extracting from the queue and discarding.
I would be inclined to use shared memory or a memory-mapped shared file (they are probably quite similar, depending on the platform, which is missing from the question). Depending on the size of the element and the exact hardware architecture (missing from the question) you may be able to avoid locking with a mutex. As an example, on current Intel processors, read/write access to 32-bit integers in memory is guaranteed to be atomic if the variable is correctly aligned, so in that case you would not need a lock.
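As a rough sketch of the "latest value only" idea on the C++ side, assuming the value fits in 32 bits: the names SharedSlot, init_slot, publish and read_latest are made up for illustration. The region pointer would come from whatever shared-memory mechanism the platform offers (for example shm_open plus mmap on POSIX, or CreateFileMapping plus MapViewOfFile on Windows); here it is just a caller-provided buffer. Using std::atomic this way across processes assumes the type is lock-free on the target, which the static_assert checks, and that the other program reads the same four bytes with a matching layout.

```cpp
#include <atomic>
#include <cstdint>
#include <new>

// Check the lock-free assumption at compile time instead of trusting it.
static_assert(std::atomic<std::uint32_t>::is_always_lock_free,
              "32-bit atomics are expected to be lock-free on this target");

// Hypothetical layout of the shared region: one slot, no queue.
struct SharedSlot {
    std::atomic<std::uint32_t> latest{0};
};

// Construct the slot inside memory provided by the caller
// (in real use, the mapped shared-memory region).
inline SharedSlot* init_slot(void* region) {
    return new (region) SharedSlot();
}

// Writer side: overwrite the previous value; stale values are simply lost.
inline void publish(SharedSlot& s, std::uint32_t v) {
    s.latest.store(v, std::memory_order_release);
}

// Reader side: always sees the most recently published value.
inline std::uint32_t read_latest(const SharedSlot& s) {
    return s.latest.load(std::memory_order_acquire);
}
```

Because there is only one slot, a slow reader never has to drain a backlog; it just picks up whatever was written last.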
At my school they teach CORBA. They shouldn't: it's an ancient, hideous technology from the eon of mainframes and a classic case of design-by-committee. Every possible feature that you don't want is included, and some that you probably do want (asynchronous calls?) aren't. If you think the C++ specification is big, think again.
Don't use it.
That said though, it does have a nice, easy-to-use interface for doing simple things.
But don't use it.
It almost always goes through a C binding.

Permanent Memory Address

With my basic knowledge of C++, I've managed to whip together a simple program that reads some data from a program (using ReadProcessMemory) and sends it to my web server every five minutes, so I can see the status of said program while I'm not at home.
I found the memory addresses to read from using a program designed to hack games called "Memory Hacking Software." The problem is, the addresses change whenever I move the program to another machine.
My question is: is there a way to find a 'permanent' address that is the same on any machine? Or is this simply impossible? Excuse me if this is a dumb question, but I don't know a whole lot about the subject. Or is there perhaps another means to access information from a running program?
Thanks for any and all help!
There are ways to do it such as being able to recognise memory patterns around the thing you're looking for. Crackers can use this to find memory locations to patch even with software that "moves around", so to speak (as with operating systems that provide randomisation of address spaces).
For example, if you know that there are fixed character strings always located X bytes beyond the area of interest, you can scan the whole address space to find them, then calculate the area of interest from that.
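A minimal sketch of that calculation, assuming you already have a snapshot of the memory in a byte vector (for example filled by ReadProcessMemory) and that you know the marker string and its distance from the data. The function name locate_by_marker and its parameters are made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

// Find `marker` in a memory snapshot and step back `distance` bytes
// to get the offset of the area of interest. Returns nothing if the
// marker is missing or sits too close to the start of the snapshot.
inline std::optional<std::size_t> locate_by_marker(
        const std::vector<std::uint8_t>& snapshot,
        std::string_view marker,
        std::size_t distance) {
    auto it = std::search(
        snapshot.begin(), snapshot.end(), marker.begin(), marker.end(),
        [](std::uint8_t a, char b) { return a == static_cast<std::uint8_t>(b); });
    if (it == snapshot.end()) return std::nullopt;
    const std::size_t marker_pos = static_cast<std::size_t>(it - snapshot.begin());
    if (marker_pos < distance) return std::nullopt;
    return marker_pos - distance;
}
```

This only takes the first occurrence of the marker; if the string can appear more than once, you would need extra checks before trusting the result.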
However, it's not always as reliable as you might think.
I would instead be thinking of another way to achieve your ends, one that doesn't involve battling the features that are protecting such software from malicious behaviour.
Think of questions like:
Why exactly do you need access to the address space at all?
Does the program itself provide status information in a more workable manner?
If the program is yours, can you modify it to provide that information?
If you only need to know if the program is doing its job, can you simply "ping" the program (e.g., for a web server, send an HTTP request and ensure you get a valid response)?
As a last resort, can you convince the OS to load your program without address space randomisation then continue using your (somewhat dubious) method?
Given your comment that:
I use the program on four machines and I have to "re-find" the addresses (8 of them) on all of them every time they update the program.
I would simply opt for automating this process. This is what some cracking software does. It scans files or in-memory code and data looking for markers that it can use for locating an area of interest.
If you can do it manually, you should be able to write a program that can do it. Have that program locate the areas of interest (by reading the process address space) and, once they're found, just read your required information from there. If the methods of finding them change with each release (instead of just the actual locations), you'll probably need to update your locator routines with each release of their software but, unfortunately, that's the price you pay for the chosen method.
It's unlikely the program you're trying to read will be as secure as some - I've seen some move their areas of interest around as the program is running, to try and confuse crackers.
What you are asking for is impossible by design. ASLR is designed specifically to prevent this kind of snooping.
What kind of information are you getting from the remote process?
Sorry, this isn't possible. The memory layout of processes isn't going to be reliably consistent.
You can achieve your goal in a number of ways:
Add a client/server protocol that you can connect to and ask "what's your status?" (this also lends itself nicely to asking for more info).
Have the process periodically touch a file; the "monitor" can check the modification time of that file to see if the process is dead.
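The file-touching variant could look roughly like this in C++17, assuming both sides agree on a path and on how old the timestamp may get. The names touch_heartbeat and is_alive are made up for illustration.

```cpp
#include <chrono>
#include <filesystem>
#include <fstream>
#include <system_error>

namespace fs = std::filesystem;

// Called by the monitored process: create the file if it is missing,
// then set its modification time to "now".
inline void touch_heartbeat(const fs::path& p) {
    { std::ofstream out(p, std::ios::app); }  // creates the file if needed
    fs::last_write_time(p, fs::file_time_type::clock::now());
}

// Called by the monitor: true if the file was touched within `max_age`.
inline bool is_alive(const fs::path& p, std::chrono::seconds max_age) {
    std::error_code ec;
    const auto t = fs::last_write_time(p, ec);
    if (ec) return false;  // a missing or unreadable file counts as dead
    return fs::file_time_type::clock::now() - t <= max_age;
}
```

The monitored program would call touch_heartbeat on a timer, and the monitor would pick a max_age comfortably larger than that interval.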

Generating a Hardware-ID on Windows

What is the best way to generate a unique hardware ID on Microsoft Windows with C++ that is not easily spoofable (with for example changing the MAC Address)?
Windows stores a unique Guid per machine in the registry at:
HKEY_LOCAL_MACHINE\Software\Microsoft\Cryptography\MachineGuid
The usual answer used to be the CPU serial number, but today there are many types of motherboards and this factor is not accurate. The MAC address can be easily forged. That leaves us with the internal hard drive serial number. See also: http://www.codeproject.com/Articles/319181/Haephrati-Searching-for-a-reliable-Hardware-ID
There are a variety of "tricks", but the only real "physical answer" is "no, there is no solution".
A "machine" is nothing more than a passive bus with some hardware around.
Although each piece of iron can provide a somewhat usable identifier, every piece of iron can be replaced by the user for whatever good or bad reason, which you can never be fully aware of (so if you base your functionality on this, you create problems for your user, and hence for yourself, every time a piece of hardware has to be replaced, reinitialized, reconfigured, etc.).
Now, if your problem is to identify a machine in a context where many machines have to interoperate, this is a role well played by MAC or IP addresses or hostnames. But be prepared for the idea that they are not necessarily constant over long time periods (so avoid hard-coding them; instead, "discover them" on each start-up).
If your problem is instead to identify a software instance or a licence, you had probably better concentrate on another kind of solution: you sell licences to "users" (it is the user that has the money, not their computer!), not to their "machines" (which users must be free to change whenever they need or like without your permission, since you didn't license the hardware or the OS). Hence your problem is not to identify a machine, but a USER. Consider that the same machine can host many users and the same user can work on a variety of machines; you cannot assume or impose a 1:1 relation without running into problems sooner or later, when that assumption is found to no longer fit.
The idea should be to register the users on a reachable site, give them keys you generate, and check that the same user/key pair is not used concurrently more than an agreed number of times within a given time period. When violations exceed the limit, or keys become old, just block and wait for the user to renew.
As you can see, the answer mostly depends on the reason behind your question, more than from the question itself.
There are various IDs assigned to hardware that can be read and combined to form a machine key. For example, you could get the ID of the hard drive where the software is stored, the proc ID, etc. Some of these can be set more easily than others, but part of the strength is in combining multiple pieces together that are not necessarily strong enough by themselves.
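A sketch of the combining step only, assuming you have already read the individual IDs as strings by whatever means you trust. The function name machine_key and the sample ID strings in the usage note are placeholders, and FNV-1a is just a convenient hash for the illustration; it is not cryptographic, so for anything licence-related a real cryptographic hash would be the safer choice.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Combine several hardware ID strings into one 64-bit key using FNV-1a.
// A separator byte is mixed in after each component so that
// {"ab", "c"} and {"a", "bc"} do not produce the same key.
inline std::uint64_t machine_key(const std::vector<std::string>& components) {
    std::uint64_t h = 14695981039346656037ULL;     // FNV-1a 64-bit offset basis
    const std::uint64_t prime = 1099511628211ULL;  // FNV-1a 64-bit prime
    for (const auto& c : components) {
        for (unsigned char ch : c) {
            h ^= ch;
            h *= prime;
        }
        h ^= 0x1F;  // unit separator between components
        h *= prime;
    }
    return h;
}
```

For example, machine_key({"disk-123", "cpu-abc"}) stays the same from run to run as long as both inputs are unchanged, and changes as soon as either one does.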
Here is a program (also available as DLL) that can read and show your computer/hardware ID: http://www.soft.tahionic.com/download-hdd_id/index.html
Use Win32 System HDS APIs.
Don't read the registry; it makes no sense at all.