How to read output of hexdump of a file? - c++

I wrote a program in C++ that compresses a file.
Now I want to see the contents of the compressed file.
I used hexdump, but I don't know what the hex numbers mean.
For example I have:
0000000 00f8
0000001
How can I convert that back to something that I can compare with the original file contents?

If you implemented a well-known compression algorithm you should be able to find a tool that performs the same kind of compression and compare its results with yours. Otherwise you need to implement an uncompressor for your format and check that the result of compressing and then uncompressing is identical to your original data.
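The round-trip check described above can be sketched as a byte-for-byte file comparison. This is a minimal sketch: `same_contents` is a hypothetical helper name, and it assumes you have already written the decompressed output to a second file.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Returns true only if both files can be opened and hold identical bytes.
// Binary mode keeps the runtime from translating line endings.
bool same_contents(const std::string& a, const std::string& b) {
    std::ifstream fa(a, std::ios::binary), fb(b, std::ios::binary);
    if (!fa || !fb) return false;
    std::istreambuf_iterator<char> ia(fa), ib(fb), end;
    for (; ia != end && ib != end; ++ia, ++ib)
        if (*ia != *ib) return false;
    // Equal only when both streams ran out at the same time.
    return ia == end && ib == end;
}
```

Calling `same_contents("original.txt", "roundtrip.txt")` (both file names are placeholders) after compressing and then decompressing tells you whether the pair of operations is lossless.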

That looks like a file containing the single byte 0xf8. I say that because hexdump appears to behave like od under UNIX-like operating systems: the last line contains the length, and the contents are padded to a word boundary (you can use od -t x1 to get rid of the padding, assuming your od is advanced enough).
As for how to recreate the original, you need to run the file through a decompression process that matches the compression used.
Given that the compressed file is that short, you either started with a very small file, your compression process is broken, or it's incredibly efficient.
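If you would rather inspect the bytes from your own C++ code, a per-byte dump avoids the word grouping entirely. By default hexdump prints 16-bit words in host byte order, which is why a lone f8 byte padded with a zero shows up as 00f8 on a little-endian machine. The sketch below uses a hypothetical helper, `to_hex`, that formats bytes one at a time in file order.

```cpp
#include <cstdio>
#include <string>

// Format each byte as two lowercase hex digits, separated by spaces,
// in the order the bytes appear in the input.
std::string to_hex(const std::string& bytes) {
    std::string out;
    char buf[4];
    for (unsigned char c : bytes) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(c));
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}
```

Feeding it the contents of the compressed file (read in binary mode) should give the same byte sequence that od -t x1 shows.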

Related

How to read .inp file in c++?

I have a dataset in a ".inp" file, and I need to read it in C++. However, fopen() and fread() seem to read the wrong data (e.g. the first integer should be 262144, but fread yields a much larger integer).
To be more specific, my ".inp" file contains a few integers and floating-point numbers. How can I read them successfully in C++?
This is the screenshot of the "*.inp" file from Notepad++. Basically this is a text file.
I solved it by copying the data into a .txt file. However, I still do not know how to read "*.inp".
I found some info about the INP file extension. It seems there are multiple variants of it, each meant for a different purpose. Where is your file coming from? As for a solution, if you can't open the file using fopen/fstream normally, you could treat it as binary and read each value in the way you specify. Other than that, I could think of calling system functions to get the file contents (like cat on Linux, for example); then, if there are some random characters, you could parse your string to omit them.
Here is example of how to call cat in C++:
Simple way to call 'cat' from c++?
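Since the screenshot shows that the file is plain text, the extension itself should not matter: fread() copies raw bytes, so the characters of "262144" get reinterpreted as the bytes of an integer, which would explain the large value. A minimal sketch of formatted reading follows, assuming the values are separated by whitespace (the actual layout of the asker's file is not known, and `read_numbers` is a hypothetical helper name).

```cpp
#include <istream>
#include <sstream>
#include <vector>

// Parse whitespace-separated numbers from a text stream.
// Reading stops at end of input or at the first token that is not a number.
std::vector<double> read_numbers(std::istream& in) {
    std::vector<double> values;
    double v;
    while (in >> v) values.push_back(v);
    return values;
}
```

The same function works on a `std::ifstream` opened on the ".inp" file, because `std::ifstream` is a `std::istream`.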

How to read large files in C++ with mixed text and binary

I need to read a large file of either text, binary, or combination, such as a JPEG file, encrypt it, and write it to a file. At some later time I will need to read the encrypted data, and decrypt it.
The end goal is to verify that the decrypted data matches the original data.
My problem is that with large files greater than 1Meg, I don't want to read and write character by character. I am targeting this code for a phone and I/O will cause too long a delay for the user.
With a pure text file, using fread() and fwrite() convert the data to binary, and the result is different than the original. With a jpeg image, it appears that there is some textual content mixed in with the binary data.
Is there a way to efficiently read in an arbitrary type of file and write it back in the original format?
Or is character by character the only option?
Or am I still out of luck?
After debugging it turned out that the decrypt function had the plain text and cipher text buffers assigned backwards. After swapping the buffer assignments, the decrypted results matched the original data. I originally thought that maybe reading the text as binary and then rewriting as binary would not appear as text, but I was wrong.
Reading the entire file as binary works just fine.
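For files too large to hold comfortably in memory, the same binary approach works in blocks rather than character by character. The sketch below copies a file in fixed-size chunks; `copy_in_chunks` and the 64 KiB default are illustrative choices, not something from the original post, and the comment marks where an encrypt or decrypt step would operate on each chunk.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Copy src to dst in binary mode, one chunk at a time.
// Returns false if either file cannot be opened or a write fails.
bool copy_in_chunks(const std::string& src, const std::string& dst,
                    std::size_t chunk = 64 * 1024) {
    std::ifstream in(src, std::ios::binary);
    std::ofstream out(dst, std::ios::binary);
    if (!in || !out) return false;
    std::vector<char> buf(chunk);
    while (in) {
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        std::streamsize got = in.gcount();  // bytes actually read this round
        // A per-chunk transform (e.g. encryption) would go here.
        if (got > 0) out.write(buf.data(), got);
    }
    out.flush();
    return in.eof() && out.good();
}
```

Because both streams are opened with `std::ios::binary`, text and image files are treated identically and come back byte for byte.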

Add/edit string in compiled C program?

I have a strange question, I am wondering if there is a way to add/edit a string (or something that could be accessed via the C program (inside, ie not an external file)) after it has been compiled?
The purpose is to change a URL in a Windows program via PHP on Linux (obviously I cannot just compile it).
Many POSIX platforms come with the program strings, which reads through a binary file searching for strings. There is an option to print out the offset of each string. For example:
strings -td myexec
From there you can use a hex editor but the main problem is that you wouldn't be able to make a string bigger than it already is.
A Hex Editor is probably your best bet.
A hex editor will work, but you have to be careful not to alter the size of the executable. If the string happens to be in the .res file, you can use ResEdit.
There are specialized tools to modify existing executable files. A notable tool is
Resource Tuner, which can be used to edit all sorts of resources in an executable.
Another option is to use a hex editor, like Hex Workshop, to edit the characters in the strings of an executable. However, bear in mind that with this method you can only edit existing strings in an executable, and the replacement strings must be the same length as or shorter than the original ones; otherwise you'll end up overwriting executable code.
As others have suggested, you can use a binary file editor (hex editor) to change the string in the executable file. You will want to embed into the string a marker (unique sequence of bytes) so that you can find the string in your file. And you will want to ensure that you are reading/writing the file at correct offsets.
As OP stated plans to use PHP on linux to rewrite the file, you will need to use fseek to position the file pointer to the starting location of this URL string, ensure you stay within the size of the string as you replace bytes, and then use fseek/rewind and fwrite to change the file.
This technique can be used to change a URL embedded in a binary file, and it can also be used to embed a license key into a binary, or to embed an application checksum value into a binary so that one can detect when the binary has changed.
As some posters have suggested, you may need to recompute a checksum or re-sign a binary file. A quick way to check for this behavior would be to compile two versions of your binary with different URL values. Then compare the files and see if there are differences other than in the URL values.
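The marker technique can be sketched on an in-memory copy of the executable. Everything named here is hypothetical: `patch_slot`, the marker text, and the assumption that the program was compiled with a fixed-size character array that begins with the marker. The answer's PHP version would perform the same steps with fseek and fwrite on the file itself.

```cpp
#include <cstddef>
#include <string>

// Overwrite the value stored after `marker` inside a fixed-size slot.
// The slot (marker + value + NUL padding) is slot_size bytes long, so the
// image never changes size. Returns false if the marker is missing or the
// new value would not fit with a terminating NUL.
bool patch_slot(std::string& image, const std::string& marker,
                std::size_t slot_size, const std::string& new_value) {
    std::size_t pos = image.find(marker);
    if (pos == std::string::npos) return false;
    if (marker.size() >= slot_size || pos + slot_size > image.size()) return false;
    std::size_t capacity = slot_size - marker.size();
    if (new_value.size() >= capacity) return false;  // keep room for the NUL
    std::size_t start = pos + marker.size();
    image.replace(start, capacity, capacity, '\0');      // clear the old value
    image.replace(start, new_value.size(), new_value);   // write the new one
    return true;
}
```

Refusing values that do not fit is the important design choice: writing past the slot would overwrite whatever data or code follows it in the binary.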
To properly edit a string in a compiled program you need to:
Read in the file's bytes.
Search the .rdata section for strings and record the address of the first occurrence of the string.
Convert that address to a virtual address using some of the data in the file header.
Write a new .rdata section onto the executable and write your new string into it, recording its address and getting its virtual address.
Search the .text section for references to the virtual address of the old string and replace them with references to your new string.
Fortunately, I made a program that does this on Windows (it only works on 32-bit programs): here
Not unless you want to poke around in the generated hex or assembly code.

Converting WAV file audio input into plain ASCII characters

I am working on a project where we need to convert WAV file audio input into plain ASCII characters. The input WAV file will contain a single short alphanumeric code, e.g. asdrty543, and each character will be pronounced one by one when you play the WAV file. Our requirement is that when a single character of the code is pronounced, we need to convert it into its equivalent ASCII code. The implementation will be done in C/C++ as an unmanaged Win32 DLL. We are open to using third-party libraries. I am already googling for directions. However, I would really appreciate directions/pointers from an experienced programmer who has already worked on a similar requirement. Thank you in advance for your help.
ASCII characters like Az09 are only a portion of the ASCII table. A WAV file, like any other file, is stored and accessed as bytes.
One byte has 256 different values. Therefore one can't simply convert bytes into Az09, since there are not enough Az09 characters.
You'll have to find a library which opens WAV files and creates the wave format for you. In relation to the wave's intensity and length, a chain of Az or Az09 characters can be produced.
I believe you're trying to convert the wave to a series of notes. That's possible too, using the same approach.

How to identify compressed/uncompressed bit groups?

I'm using a static dictionary file with some words and values for these words. These values are not of fixed size; for example, the is 1, love is 01, kill is 101, etc. When I compress a group of words, I traverse every word and look it up in the dictionary to see if a value exists for it. If one exists, I replace the word with the value; if it doesn't, I encode the word as bytes. After compression I have a chunk of bits, and because the dictionary values and uncompressed words are not of fixed size, I cannot group the bits and decode them.
I have thought about using a 1-bit flag for every group of bits to indicate whether it is compressed or uncompressed, but I can't detect the flag bit because the length of a codeword or regular word is unknown.
If I use a 1-byte delimiter, it still has problems. Let's say my delimiter is 00000000, and before the delimiter I have 100 and after the delimiter I have 001, so we have 10000000000001. How am I supposed to know which group of these bits is my delimiter?
Can I use some other method to group these compressed/uncompressed bits to decode them? Thank you.
First off, what language and system are you intending to deploy this on? Many languages provide their own libraries and tools for compression, which may suit your needs without major low-level design effort.
The answer here is to establish some more rigorous bookkeeping and file formatting to be able to undo the compression. Most compression systems have some amount of overhead in their file format which is why when you compress something twice you don't necessarily save anything and can actually increase the size of the file.
Often files use a header at the start of the file to provide key information, which would be a good place to define any rules that are specific to the compressed file.
Create a fixed-size delimiter to use between code words only. This can be determined after analyzing the file but before actually writing out the compressed data.
If you generate your delimiter rather than using a fixed known value, include it as one of your header items.
Keep your header in a simple ASCII format so that you can easily extract it with standard tools like sscanf and fscanf.
If you want a header that can contain extra information, you may need a consistent way to tell where the header ends and the data begins. Including something to the effect of "ENDHEADER" should be enough and still easily identifiable.
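One way to make the bit stream decodable without searching for a delimiter is to make every token self-describing. The sketch below swaps in a different technique from the delimiter suggested above: a 1-bit flag, then either a fixed-width dictionary index or an 8-bit length followed by the literal bytes. The names and limits (`kIndexBits`, at most 256 dictionary entries, literals under 256 bytes) are assumptions for illustration. Note also that the codes in the question (1, 01, 101) are not prefix-free, since 1 is a prefix of 101, so they could not be decoded unambiguously even with a flag bit.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

const int kIndexBits = 8;  // assumption: the dictionary has at most 256 words

// Append `width` bits of `value`, most significant bit first.
void put_bits(std::vector<bool>& bits, std::uint32_t value, int width) {
    for (int i = width - 1; i >= 0; --i) bits.push_back(((value >> i) & 1u) != 0);
}

// Read `width` bits starting at `pos` and advance `pos`.
// vector::at throws std::out_of_range if the stream is truncated.
std::uint32_t get_bits(const std::vector<bool>& bits, std::size_t& pos, int width) {
    std::uint32_t v = 0;
    for (int i = 0; i < width; ++i) v = (v << 1) | (bits.at(pos++) ? 1u : 0u);
    return v;
}

std::vector<bool> encode(const std::vector<std::string>& words,
                         const std::vector<std::string>& dict) {
    std::vector<bool> bits;
    for (const std::string& word : words) {
        auto it = std::find(dict.begin(), dict.end(), word);
        if (it != dict.end()) {
            put_bits(bits, 1, 1);  // flag 1: dictionary index follows
            put_bits(bits, static_cast<std::uint32_t>(it - dict.begin()), kIndexBits);
        } else {
            put_bits(bits, 0, 1);  // flag 0: length byte, then literal bytes
            put_bits(bits, static_cast<std::uint32_t>(word.size()), 8);
            for (unsigned char c : word) put_bits(bits, c, 8);
        }
    }
    return bits;
}

std::vector<std::string> decode(const std::vector<bool>& bits,
                                const std::vector<std::string>& dict) {
    std::vector<std::string> out;
    std::size_t pos = 0;
    while (pos < bits.size()) {
        if (get_bits(bits, pos, 1)) {
            out.push_back(dict.at(get_bits(bits, pos, kIndexBits)));
        } else {
            std::uint32_t len = get_bits(bits, pos, 8);
            std::string word;
            for (std::uint32_t i = 0; i < len; ++i)
                word += static_cast<char>(get_bits(bits, pos, 8));
            out.push_back(word);
        }
    }
    return out;
}
```

The decoder always knows how many bits to read next, so no delimiter is needed. A fixed-width index gives up some compression. A prefix-free code such as a Huffman code could replace it while keeping the same flag-bit framing. When the bits are packed into bytes for a file, a header field holding the token count would let the decoder ignore padding bits at the end.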