How to replace gzip compressed data with other data and still have a valid .gz file?

I'm experimenting. I created a .txt file containing the word hi and ran gzip test.txt to compress it.
This gives me a file test.txt.gz with the following bytes:
1F 8B 08 08 E6 E8 3F 60 00 03 62 2E 74 78 74 00
CB C8 04 00 AC 2A 93 D8 02 00 00 00
Using the 010 Editor software, I found out that the first line is the header.
CB C8 04 00 is the compressed data,
AC 2A 93 D8 is the "CRC of the data section", and
02 00 00 00 is the "size of the uncompressed input".
What I'm trying to do (I don't know if it is even possible): I want to have my own characters as the "compressed" data, but I want the .gz file to still be valid.
I tried replacing CB C8 04 00 with 62 62 62 62 (the letter 'b' four times), but then the file is invalid. I then also replaced AC 2A 93 D8 with the CRC32 value of "bbbb", but the file is still invalid: I can't decompress it, and running gzip -d test.txt.gz returns "unexpected end of file".
Is it possible what I'm trying to do? If yes: what am I doing wrong?

CB C8 04 00 is a valid deflate stream. 62 62 62 62 is not. A gzip member is a gzip header, a valid deflate stream, and a gzip trailer. If you want arbitrary bytes to appear as the "compressed" data, they must themselves form a valid deflate stream – the simplest way is a single stored (uncompressed) block.
Deflate streams are defined in RFC 1951.
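If you want to see it work end to end, here is a minimal sketch (assuming zlib is available for its crc32() function) that wraps four arbitrary bytes in a single stored block and writes a complete gzip member; gzip -d on the result should print bbbb:

#include <cstdio>
#include <zlib.h>

int main() {
    const unsigned char data[] = { 'b', 'b', 'b', 'b' };
    const unsigned len = sizeof(data);

    FILE *f = fopen("test.txt.gz", "wb");
    if (!f) return 1;

    // gzip header: magic, CM=8 (deflate), no flags (FNAME skipped),
    // zero mtime, XFL=0, OS=255 (unknown)
    const unsigned char hdr[10] = {0x1F, 0x8B, 0x08, 0x00, 0, 0, 0, 0, 0x00, 0xFF};
    fwrite(hdr, 1, sizeof(hdr), f);

    // deflate stored block: BFINAL=1, BTYPE=00, then LEN and its one's
    // complement NLEN, both little-endian (RFC 1951, section 3.2.4)
    const unsigned char blk[5] = {0x01,
        (unsigned char)(len & 0xFF), (unsigned char)(len >> 8),
        (unsigned char)(~len & 0xFF), (unsigned char)((~len >> 8) & 0xFF)};
    fwrite(blk, 1, sizeof(blk), f);
    fwrite(data, 1, len, f);

    // gzip trailer: CRC-32 of the uncompressed data, then its length,
    // both little-endian (RFC 1952)
    const unsigned long crc = crc32(0L, data, len);
    unsigned char trl[8];
    for (int i = 0; i < 4; i++) trl[i]     = (crc >> (8 * i)) & 0xFF;
    for (int i = 0; i < 4; i++) trl[4 + i] = (len >> (8 * i)) & 0xFF;
    fwrite(trl, 1, sizeof(trl), f);
    fclose(f);
}

A stored block costs 5 bytes of overhead but places your bytes in the file verbatim, which is exactly what the question asks for.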

Related

Zlib inflate unexpected decompression errors

I have been struggling for a few weeks with the zlib inflate algorithm.
I would like to decompress packets from the popular game Tibia. They are compressed with zlib's deflate algorithm, but it seems something has been changed. Can you check it? Maybe you will spot something I am missing.
Compressed packet (cannot be inflated directly; it's probably raw fixed-Huffman coding):
DA 22 A6 CB 10 99 5F AA 50 9C 9A AA 90 A8 90 05 B4 2F B5 44 41 C3 B1 28 D7 CA 50 53 8F CB B3 44 A1 3C 35 33 3D A3 58 C1 C8 44 CF C0 40 21 BF 4A 0F
Packet decompressed using Reverse Engineering:
B4 16 2D 00 59 6F 75 20 73 65 65 20 61 20 6A 61 63 6B 65 74 20 28 41 72 6D 3A 31 29 2E 0A 49 74 20 77 65 69 67 68 73 20 32 34 2E 30 30 20 6F 7A 2E
The decompressed packet re-compressed with zlib deflate, using the CyberChef / PHP deflate functions:
db 22 a6 cb 10 99 5f aa 50 9c 9a aa 90 a8 90 95 98 9c 9d 5a a2 a0 e1 58 94 6b 65 a8 a9 c7 e5 59 a2 50 9e 9a 99 9e 51 ac 60 64 a2 67 60 a0 90 5f a5 07 00
The data matches in a few places, but generally it's different. Do you know what the cause could be?
I attach a screenshot from reverse engineering the inflate function: Screenshot from IdaPro
Here are packets in correct order:
http://wklej.org/hash/6aee9e223f0/txt/ - inflated correctly
http://wklej.org/hash/bd371e7f510/txt/ - inflated correctly
http://wklej.org/hash/8f15935dc15/txt/ - inflated correctly
And here is the packet that cannot be inflated...
CA059BC6043619009FC9FFFFE831
Your packet that cannot be inflated is likely part of a longer stream of compressed data, with other packets preceding and following it. You need to decompress all of them as a single stream for the decompression to succeed.
Your first example is a portion of a deflate stream that references data that preceded it, so it is part of a larger deflate stream: you need all of the compressed data that preceded that piece in order to decompress it. Your last example (CA05...) also references preceding data, so it too is part of a larger stream with compressed data before it.
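If it helps, here is a minimal sketch of that approach (the packets container is hypothetical, and windowBits = -15 assumes the packets form a raw deflate stream, as the question suggests): every packet is fed, in order, through one z_stream, without re-initializing between packets, so back-references into earlier packets keep working.

#include <cstdio>
#include <string>
#include <vector>
#include <zlib.h>

int main() {
    // All packets of the stream, in the order they were received
    std::vector<std::vector<unsigned char>> packets = { /* ... */ };

    z_stream zs = {};                                // zero-init: default allocators
    if (inflateInit2(&zs, -15) != Z_OK) return 1;    // -15 = raw deflate, no header

    std::string out;
    unsigned char buf[4096];
    for (auto &pkt : packets) {
        zs.next_in  = pkt.data();
        zs.avail_in = (uInt)pkt.size();
        do {
            zs.next_out  = buf;
            zs.avail_out = sizeof(buf);
            int ret = inflate(&zs, Z_SYNC_FLUSH);
            if (ret != Z_OK && ret != Z_STREAM_END && ret != Z_BUF_ERROR) {
                std::fprintf(stderr, "inflate error: %d\n", ret);
                inflateEnd(&zs);
                return 1;
            }
            out.append((const char *)buf, sizeof(buf) - zs.avail_out);
        } while (zs.avail_in > 0);
    }
    inflateEnd(&zs);
    std::printf("%u bytes inflated\n", (unsigned)out.size());
}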

Wrong CRLF in UTF-16 stream?

Here is a problem I could not solve despite all my efforts. So I am totally stuck, please help!
In regular "ASCII" mode, the following simplified file and stream outputs
FILE *fa = fopen("utfOutFA.txt", "w");
fprintf(fa, "Line1\nLine2");
fclose(fa);
ofstream sa("utfOutSA.txt");
sa << "Line1\nLine2";
sa.close();
result, naturally, in exactly the same text files (hex dump):
00000000h: 4C 69 6E 65 31 0D 0A 4C 69 6E 65 32 ; Line1..Line2
where the new line \n is expanded to CRLF: 0D 0A – typical for Windows.
Now, we do the same for Unicode output, namely UTF-16 LE which is a sort of “default”. File output
FILE *fu = fopen("utfOutFU.txt", "w, ccs=UNICODE");
fwprintf(fu, L"Line1\nLine2");
fclose(fu);
results in this contents:
00000000h: FF FE 4C 00 69 00 6E 00 65 00 31 00 0D 00 0A 00 ; ÿþL.i.n.e.1.....
00000010h: 4C 00 69 00 6E 00 65 00 32 00 ; L.i.n.e.2.
which looks perfectly correct considering BOM and endianness, including CRLF: 0D 00 0A 00. However, the similar stream output
wofstream su("utfOutSU.txt");
su.imbue(locale(locale::empty(), new codecvt_utf16<wchar_t, 0x10ffffUL,
    codecvt_mode(generate_header + little_endian)>));
su << L"Line1\nLine2";
su.close();
results in one byte less and an overall incorrect text file:
00000000h: FF FE 4C 00 69 00 6E 00 65 00 31 00 0D 0A 00 4C ; ÿþL.i.n.e.1....L
00000010h: 00 69 00 6E 00 65 00 32 00 ; .i.n.e.2.
The reason is the wrong expansion of \n to CRLF: 0D 0A 00 instead of 0D 00 0A 00. Is this a bug? Or have I done something wrong?
I use the Microsoft Visual Studio compiler (14.0 and others). I tried using the stream manipulator endl instead of \n – same result! I tried calling su.imbue() first and then su.open() – all the same! I also checked UTF-8 output (ccs=UTF-8 for the file and codecvt_utf8 for the stream) – no problem there, as CRLF stays the same as in ASCII mode: 0D 0A.
I appreciate any ideas and comments on the issue.
When you are imbue()'ing a new locale into the std::wofstream, you are wiping out its original locale. Don't use locale::empty(), use su.getloc() instead, so the new locale copies the old locale before modifying it.
Also, on a side note, the last template parameter of codecvt_utf16 is a bitmask, so the values really should be combined with | rather than +, and the result cast back: codecvt_mode(std::generate_header | std::little_endian).
su.imbue(std::locale(su.getloc(), new codecvt_utf16<wchar_t, 0x10ffffUL,
    codecvt_mode(std::generate_header | std::little_endian)>));
I've discovered that this problem comes from the fact that you are writing the file in text mode. What you want to do is open your output file in binary mode, and the issue will be solved, like so:
wofstream su("utfOutSU.txt", ofstream::out | ofstream::binary);
su.imbue(locale(su.getloc(), new codecvt_utf16<wchar_t, 0x10ffffUL,
    codecvt_mode(generate_header + little_endian)>));
su << L"Line1\r\nLine2"; // binary mode: write \r\n explicitly
su.close();
Something "clever" is probably done behinde the scenes when writing text.

Error `invalid distance too far back` when inflating gzipped HTML content

I want to inflate HTML webpages. I am using zlib functions
inflateInit2(&zstream,15+32);
and then
inflate(&zstream,Z_SYNC_FLUSH);
It works correctly for lots of webpages, but for "www.tabnak.ir" it does not.
invalid distance too far back is the error I get for this website.
This webpage is also gzip-compressed and UTF-8 encoded.
How should I deal with that?
This is for bing.com, which works fine:
1f 8b 08 00 ef 8c 77 56 00 ff ec 5a eb 73 9c 46
12 ff 9e aa fc 0f 04 d5 9d ad 78 1f c0 3e b4 0b
96 52 b2 24 2b ba 73 1c 9d 2d 27 b9 8a af b6 06
This is for tabnak.ir, which results in the invalid distance too far back error:
1f 8b 08 00 00 00 00 00 00 03 ed fd db 73 5b d7
99 2f 8a 3e ab ab d6 ff 30 ac ae ac d8 3b 82 80
39 71 a7 6d 55 39 89 7b 75 f7 4a d2 7d 92 74 af
The zlib/gzip format performs compression by recording things like "the next 22 bytes are the same as the 22 bytes we saw 1013 bytes ago."
In this case, the record describing the repetition points back further than the size of the 'window'.
Given that you have specified a maximum window size, the likelihood is that the data format has changed a bit, or that the data you received is not the same as what was sent.
Some things to check:
You are using the latest zlib library.
Standard utilities (e.g. gunzip, winzip) can decompress the data.
The data you are getting is not being mangled by a text filter ('rb' vs 'rt')
If that hasn't helped, try walking through the data and understanding what the failure in gzip is.
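A minimal diagnostic sketch (it assumes body already holds the complete, unmodified response body): inflate everything as one stream with automatic gzip/zlib header detection, and on failure report how far the decoder got, which helps separate "truncated or mangled input" from "wrong format".

#include <cstdio>
#include <string>
#include <zlib.h>

bool inflate_body(const std::string &body, std::string &out) {
    z_stream zs = {};
    if (inflateInit2(&zs, 15 + 32) != Z_OK)   // 15+32: auto-detect gzip/zlib
        return false;

    zs.next_in  = (Bytef *)body.data();
    zs.avail_in = (uInt)body.size();

    unsigned char buf[16384];
    int ret;
    do {
        zs.next_out  = buf;
        zs.avail_out = sizeof(buf);
        ret = inflate(&zs, Z_NO_FLUSH);
        out.append((const char *)buf, sizeof(buf) - zs.avail_out);
    } while (ret == Z_OK);

    if (ret != Z_STREAM_END)   // e.g. Z_DATA_ERROR: "invalid distance too far back"
        std::fprintf(stderr, "inflate failed (%d: %s) after %lu input bytes\n",
                     ret, zs.msg ? zs.msg : "?", zs.total_in);
    inflateEnd(&zs);
    return ret == Z_STREAM_END;
}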
It would seem that the data you are trying to "inflate" (decompress using zlib) is not a valid gzip stream. Since bing.com is most likely not serving a plain zlib stream, it might be pure coincidence that you found something quite early that prevented decompression.

SMPP client is adding a ¿ to the end of the message

I'm trying to make a Windows desktop SMPP client, and it's connecting and sending well, aside from a bug where an extra character (¿) is added to the end of the message content that I receive on my phone.
So I'm sending "test" but my phone receives "test¿". Here are the contents of the PDU object, just before it gets sent:
size :58
sequence :2
cmd id :0x4
cmd status:0x0 : No Error
00000000 00 00 00 3a 00 00 00 04 00 00 00 00 00 00 00 02 |...:............|
00000010 00 05 00 74 65 73 74 66 72 6f 6d 00 01 01 34 34 |...testfrom...44|
00000020 37 37 37 37 37 37 37 37 37 37 00 00 00 00 00 00 |7777777777......|
00000030 00 00 00 00 05 74 65 73 74 00 |.....test.|
0000003a
I'm using this c++ smpp library as a base:
https://github.com/onlinecity/cpp-smpp
I had to make some slight changes to get it working on Windows, but I don't think anything I changed could have affected this.
Someone else ran a test using a different account on the SMPP server, and their test added a # symbol instead.
Any ideas what could be causing this? Thanks!
Found the problem in the end: it was due to an option in the SMPP library that defaults to true, called nullTerminateOctetStrings.
It was adding the 00 to the end of the message. It sounds like this is required by the SMPP 3.4 standard, but our SMSC didn't like it. I suppose ideally I would fix the SMSC, but that's provided by a third party, so I've just switched off the null termination instead.
Someone with a similar problem and more info here: https://www.mail-archive.com/devel@kannel.3glab.org/msg06765.html
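For illustration only (this is not the library's API, just a sketch of what changes on the wire): with null termination enabled, sm_length counts a trailing 0x00 octet, which is exactly the extra byte visible at the end of the hex dump above.

#include <cstdint>
#include <string>
#include <vector>

std::vector<uint8_t> encode_short_message(const std::string &msg,
                                          bool nullTerminate) {
    std::vector<uint8_t> field;
    field.push_back((uint8_t)(msg.size() + (nullTerminate ? 1 : 0))); // sm_length
    field.insert(field.end(), msg.begin(), msg.end());
    if (nullTerminate)
        field.push_back(0x00);   // the stray octet the SMSC rendered as ¿
    return field;
}

// encode_short_message("test", true)  -> 05 74 65 73 74 00  (what was sent)
// encode_short_message("test", false) -> 04 74 65 73 74     (after the fix)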

How to programmatically load a Java Card applet (a .cap file) using Visual C++/PCSC

I am currently on a project that requires me to load a Java Card applet (a .cap file) onto a JavaCard. Our framework is based on Visual C++ and PC/SC, and we need to load the same applet onto a series of JavaCards. Does anyone know how this can be done? I mean, where should I start? Thanks!
You are correct that this is not a trivial job.
There are differences between different JavaCards, but generally you need to do four things:
initialize secure communications with the card (because many JavaCards are "GlobalPlatform" cards, they require a secure channel)
send a command saying "I want to install an applet"
send the binary data for the applet to be installed
send a command to "instantiate" the applet after the binary data is sent
I'd recommend using the Eclipse plugin to install the applet initially, because you can see the APDUs the plugin generates to perform the steps above. Once you know the APDU commands you must send to install your applet, you can send these commands directly over the PC/SC interface from your C++ code to automate installation on a large number of cards, as in the sketch below.
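As a starting point, a minimal WinSCard sketch (the reader name is a placeholder, and the APDU shown is just a generic SELECT; substitute the APDUs you captured): once you know the sequence, automating it per card is a loop of SCardTransmit calls. Link against winscard.lib.

#include <windows.h>
#include <winscard.h>
#include <cstdio>

int main() {
    SCARDCONTEXT ctx;
    if (SCardEstablishContext(SCARD_SCOPE_USER, NULL, NULL, &ctx) != SCARD_S_SUCCESS)
        return 1;

    SCARDHANDLE card;
    DWORD proto;
    // Replace with your reader's actual name (see SCardListReaders)
    if (SCardConnectA(ctx, "ACME Reader 0", SCARD_SHARE_SHARED,
                      SCARD_PROTOCOL_T0 | SCARD_PROTOCOL_T1,
                      &card, &proto) != SCARD_S_SUCCESS)
        return 1;

    // Placeholder APDU: SELECT with no AID (selects the default application)
    BYTE apdu[] = { 0x00, 0xA4, 0x04, 0x00, 0x00 };
    BYTE resp[258];
    DWORD respLen = sizeof(resp);
    const SCARD_IO_REQUEST *pci =
        (proto == SCARD_PROTOCOL_T1) ? SCARD_PCI_T1 : SCARD_PCI_T0;
    if (SCardTransmit(card, pci, apdu, sizeof(apdu), NULL, resp, &respLen)
            == SCARD_S_SUCCESS && respLen >= 2)
        std::printf("SW: %02X %02X\n", resp[respLen - 2], resp[respLen - 1]);

    SCardDisconnect(card, SCARD_LEAVE_CARD);
    SCardReleaseContext(ctx);
}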
My company makes a web browser plugin called Card Boss for doing this kind of thing (card communications via PC/SC) from a browser – there's a web page where you can type your own APDUs and send them to the card at the following URL:
https://cardboss.cometway.com/content.agent?page_name=Card+Boss+Lab
If you use our tool, your applet installation script should look something like this (note that this is a script for a JCOP card using the default JCOP keys):
MESSAGE BOX Installing applets...
INIT CHANNEL 40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f, 40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f
// INSTALL CAP:
SEND 80 E6 02 00 1D 10 A0 00 00 00 09 00 03 FF FF FF FF 89 10 71 00 01 08 A0 00 00 00 03 00 00 00 00 00 00
// LOADING CAP:
SEND 80 E8 00 00 FA C4 82 01 03 01 00 25 DE CA FF (snip – I removed a bunch of binary data representing the cap file to shorten this post; you might need multiple SEND commands because of limits on the size of APDUs)
// INSTANTIATING Applet
SEND 80 E6 0C 00 1E 05 63 6F 6D 65 74 07 63 6F 6D 65 74 00 01 05 00 00 00 00 00 01 00 06 C9 04 68 2C 00 03 00 00