Splunk Data preview - Timestamp in milliseconds, Regex problems - regex

I'm trying to parse out the timestamp in milliseconds with this regex:
\d{7}/
Any idea why it's not working?
9281736 : COUNT IN 1003
Tx: 01 04 00 71 00 02 21 d0 ...q..!.
Rx: 01 04 04 00 08 0a 28 7c f8 ......(|.
9282136 : COUNT IN 1003
Tx: 01 04 00 c9 00 02 a1 f5 ........
Rx: 01 04 04 00 08 00 00 7a 46 .......zF
9282536 : COUNT IN 1003
Tx: 01 04 01 2d 00 02 e0 3e ...-...>
Rx: 01 04 04 00 00 ff ff fa 34 ........4
9282936 : COUNT IN 1003
Tx: 01 04 01 f5 00 02 60 05 ......`.
Rx: 01 04 04 00 23 00 00 0a 4e
When I preview with "Unsorted data", I get this timestamp error message: "Failed to parse timestamp. Defaulting to file modtime."

I think the Regex should be \d{7}, not /\d{7}. Note the slashes!
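To check the pattern outside Splunk, here is a minimal sketch in C++. It uses std::regex (ECMAScript syntax), not Splunk's own regex engine, so it only illustrates how the pattern behaves on the sample lines. The helper names leading_timestamp and contains_match are made up for this example.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: returns the 7-digit counter at the start of an event
// line, or -1 when the line does not start with one (e.g. "Tx:" / "Rx:" lines).
long long leading_timestamp(const std::string& line) {
    static const std::regex pattern(R"(^(\d{7})\b)");
    std::smatch match;
    if (!std::regex_search(line, match, pattern)) return -1;
    assert(match.size() == 2);  // whole match plus one capture group
    return std::stoll(match[1].str());
}

// Hypothetical helper: true when pattern_text matches anywhere in line.
bool contains_match(const std::string& pattern_text, const std::string& line) {
    return std::regex_search(line, std::regex(pattern_text));
}
```

Calling contains_match with the pattern \d{7} on "9281736 : COUNT IN 1003" finds a match, while the same call with \d{7}/ does not, because no slash follows the digits in the data.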

Extract compressed image from memo field

I have a DBF file whose memo field contains a #SIXPIC# entry, and a DBT file that contains binary data with #SIXPIC# tags,
for example:
0000000000: 00 1A 0B 0C CE 9F 01 00 │ 49 49 2A 00 C0 9A 01 00
0000000010: FF FF C0 04 00 40 FF FC │ A6 C9 0C F1 53 4F FF 80
0000000020: 08 00 80 29 B5 51 9B 17 │ FF FF 6A 23 F9 4D 96 08
0000000030: 53 85 35 39 6D 12 85 09 │ D8 40 CC 81 85 2A A2 41
0000000040: 10 20 D8 57 A0 F4 E5 7A │ F3 F9 78 B5 F4 02 FF 0B
0000000050: E5 70 C3 2D 07 F0 FC 8A │ 04 E9 FC 2F BB 84 1F FF
0000000060: 50 01 00 10 2B E8 19 45 │ AE E1 A4 4D 80 83 9A C3
0000000070: 92 03 99 08 0A 55 02 91 │ 40 9E FE 17 DF 4F FF AA
...
0000019F50: 00 00 21 03 00 00 68 02 │ 00 00 2E 02 00 00 FF 01
0000019F60: 00 00 13 02 00 00 F7 01 │ 00 00 06 02 00 00 D9 01
0000019F70: 00 00 7C 01 00 00 1B 01 │ 00 00 2A 01 00 00 17 01
0000019F80: 00 00 ED 00 00 00 DF 00 │ 00 00 04 01 00 00 8C 01
0000019F90: 00 00 76 01 00 00 09 01 │ 00 00 E9 00 00 00 E9 00
0000019FA0: 00 00 83 00 00 00 77 00 │ 00 00 78 00 00 00 8C 00
0000019FB0: 00 00 47 00 00 00 15 00 │ 00 00 88 00 00 00 06 00
0000019FC0: 00 00 05 00 00 00 00 00 │ 80 25 00 00 20 00 00 00
0000019FD0: 80 25 00 00 20 00 1A 01 │ 73 80 84 30 C1 B0 C1 40
0000019FE0: 39 E0 E4 80 84 80 A7 69 │ 03 48 EA 1A 47 40 68 24
0000019FF0: 04 35 83 59 38 52 70 68
How can I get an image from this data?

Formatting a file using regex

I spent some time trying to format a file with roughly 5,000 hex values, but with no luck. For example:
1b 00 10 50 a3 bb 0e b7 ff ff 00 00 00 00 09 00
01 02 00 01 00 85 03 0e 00 00 00 55 0e 04 66 03
2a 38 32 80 00 0e 00 2f c2
1b 00 10 50 a3 bb 0e b7 ff ff 00 00 00 00 09 00
01 02 00 01 00 85 03 2b 00 00 00 55 2b 04 58 28
2a 39 32 80 00 01 00 12 57 4d 32 34 30 20 41 43
20 56 65 72 2e 41 00 00 23 06 00 0a 23 06 00 0a
01 00 00 c0 14 56
1b 00 30 a6 59 b8 0e b7 ff ff 00 00 00 00 09 00
00 02 00 01 00 04 03 0d 00 00 00 55 0d 04 33 2a
03 3a 32 40 00 0e be 40
1b 00 f0 01 f1 b6 0e b7 ff ff 00 00 00 00 09 00
00 02 00 01 00 04 03 0e 00 00 00 55 0e 04 66 2a
00 3b 32 40 00 01 05 c9 b1
and so on..
I need to format it in the following manner:
1b 00 10 50 a3 bb 0e b7 ff ff 00 00 00 00 09 00 01 02 00 01 00 85 03 0e 00 00 00 55 0e 04 66 03 2a 38 32 80 00 0e 00 2f c2
1b 00 10 50 a3 bb 0e b7 ff ff 00 00 00 00 09 00 01 02 00 01 00 85 03 2b 00 00 00 55 2b 04 58 28 2a 39 32 80 00 01 00 12 57 4d 32 34 30 20 41 43 20 56 65 72 2e 41 00 00 23 06 00 0a 23 06 00 0a 01 00 00 c0 14 56
1b 00 30 a6 59 b8 0e b7 ff ff 00 00 00 00 09 00 00 02 00 01 00 04 03 0d 00 00 00 55 0d 04 33 2a 03 3a 32 40 00 0e be 40
1b 00 f0 01 f1 b6 0e b7 ff ff 00 00 00 00 09 00 00 02 00 01 00 04 03 0e 00 00 00 55 0e 04 66 2a 00 3b 32 40 00 01 05 c9 b1
Basically, put each hex block on one line, still separated by spaces. For the life of me, I cannot figure out the regular expression to do this. Everything I have tried either removes the line that separates the hex blocks or grabs the last character of the line instead of the actual \n.
Maybe there is a better way of formatting files than using regex?
Please try regex: (?!\n\n)\n
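As a non-regex alternative, here is a minimal C++ sketch. It assumes the hex blocks are separated by blank lines, as the question's mention of "the line that separates hex blocks" suggests; the function name join_blocks is made up for this example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: joins the lines of every blank-line-separated block
// into one line, with the original values separated by single spaces.
std::string join_blocks(const std::string& text) {
    std::istringstream in(text);
    std::string line, current, out;
    // Emits the block collected so far (if any) as one output line.
    auto flush = [&]() {
        if (!current.empty()) {
            out += current;
            out += '\n';
            current.clear();
        }
    };
    while (std::getline(in, line)) {
        // Strip trailing spaces, tabs and a possible '\r' from CRLF files.
        const auto last = line.find_last_not_of(" \t\r");
        if (last == std::string::npos) {  // blank line: the block ends here
            flush();
            continue;
        }
        line.erase(last + 1);
        if (!current.empty()) current += ' ';
        current += line;
    }
    flush();
    assert(current.empty());  // every collected block has been emitted
    return out;
}
```

Reading the whole file into a string, passing it to join_blocks, and writing the result back should give one line per block. If the blocks are not separated by blank lines, the block boundary test would have to change.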

crtdbg dumps a memory leak when sf::Text::setOutlineThickness is used

Working with SFML 2.4.2 on 64-bit Windows 7, I've noticed an issue with sf::Text::setOutlineThickness(float). Once it is called with anything other than the default value 0, crtdbg dumps memory leaks of various byte sizes, but always the same amount. I believe this depends on the size of the string, on whether the text gets drawn, and on whether the parameter of setOutlineThickness is accepted, as demonstrated here:
// Initial set-up
sf::Text test;
test.setString("A");
// ... Set character size, font, fill color, etc. ...
test.setOutlineThickness(1);
test.setOutlineColor(sf::Color::Black);

// Make a draw call for test later in the program
void Game::draw(sf::RenderTarget& target, sf::RenderStates states) const
{
    target.draw(test, states);
}
This produces a leak:
{8601} normal block at 0x0000000005CA5C90, 60 bytes long.
Data: < > 03 00 07 00 0B 00 0F 00 13 00 17 00 1B 00 1F 00
{8600} normal block at 0x0000000005E03A20, 120 bytes long.
Data: < > 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01
{8599} normal block at 0x0000000005E2A680, 960 bytes long.
Data: < > 00 00 00 00 80 07 00 00 0D 01 00 00 80 07 00 00
{8598} normal block at 0x0000000005CA36B0, 72 bytes long.
Data: < h > F0 1A 9D 05 00 00 00 00 68 AE 83 DB FE 07 00 00
With test.setString("B"); there are still four blocks, but the byte sizes differ, since the string uses another character:
68 bytes, 136 bytes, 1088 bytes, 72 bytes.
Finally, with test.setString("AB"); there are eight blocks with the expected sizes:
{8667} normal block at 0x0000000005C35D10, 68 bytes long.
Data: < > 03 00 07 00 0B 00 0F 00 13 00 17 00 1B 00 1F 00
{8666} normal block at 0x0000000005C61310, 136 bytes long.
Data: < > 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01
{8665} normal block at 0x000000000325CDE0, 1088 bytes long.
Data: < > 00 00 00 00 80 07 00 00 0D 01 00 00 80 07 00 00
{8664} normal block at 0x0000000005C340D0, 72 bytes long.
Data: < h > F0 1A 96 05 00 00 00 00 68 AE B5 DB FE 07 00 00
{8601} normal block at 0x0000000005C35C90, 60 bytes long.
Data: < > 03 00 07 00 0B 00 0F 00 13 00 17 00 1B 00 1F 00
{8600} normal block at 0x0000000005D93A20, 120 bytes long.
Data: < > 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01
{8599} normal block at 0x0000000005DBA680, 960 bytes long.
Data: < > 00 00 00 00 80 07 00 00 0D 01 00 00 80 07 00 00
{8598} normal block at 0x0000000005C336B0, 72 bytes long.
Data: < h > F0 1A 96 05 00 00 00 00 68 AE B5 DB FE 07 00 00
I use sf::Text as a private member of a class, so it should be destroyed along with the class, but that doesn't seem to be the case. What am I missing?
I use _CrtSetDbgFlag(_CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF); could this be a false positive?
Glancing at the function sf::Text::setOutlineThickness and its brief documentation, I don't see an issue there.
The leaks differing with the size of the string is more of a symptom, really. What I'm clueless about is why the draw call combined with a non-default outline thickness causes them.
It looks like there is a real leak in SFML, at Font.cpp#L561.
The call there looks like this:
FT_Glyph_Stroke(&glyphDesc, stroker, false);
But according to the documentation of FT_Glyph_Stroke, it should actually be this, so that the source glyph is destroyed:
FT_Glyph_Stroke(&glyphDesc, stroker, true);
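To illustrate why the flag matters, here is a small mock of the ownership pattern. It does not use FreeType. Glyph, stroke_in_place and leaked_after_stroke are hypothetical stand-ins that only mimic an in/out handle which is replaced by a new object, with the old object freed only when the destroy flag is set.

```cpp
#include <cassert>

// Hypothetical stand-in for a glyph object; counts how many are alive.
struct Glyph {
    static int live;
    Glyph() { ++live; }
    ~Glyph() { --live; }
};
int Glyph::live = 0;

// Mock of an in/out handle: *handle is replaced by a newly allocated object.
// The object it pointed to before is freed only when destroy is true.
void stroke_in_place(Glyph** handle, bool destroy) {
    assert(handle != nullptr && *handle != nullptr);
    Glyph* source = *handle;
    *handle = new Glyph();
    if (destroy) delete source;
}

// Returns how many Glyph objects remain alive after the caller releases the
// only handle it still holds.
int leaked_after_stroke(bool destroy) {
    const int before = Glyph::live;
    Glyph* glyph = new Glyph();
    stroke_in_place(&glyph, destroy);
    delete glyph;  // frees only the object the handle points to now
    return Glyph::live - before;
}
```

In this mock, leaked_after_stroke(false) leaves one object behind because the caller no longer has a pointer to the original, while leaked_after_stroke(true) leaves none. That is the same shape as the per-glyph leak reported above, under the assumption that the caller never keeps its own copy of the source handle.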

std::fstream::tellg() outputs file cursor pointer incorrectly?

I have a std::fstream, which I opened using
std::fstream myFile { "C:/path/to/file.txt" };
When I want to read the first byte, I use
char c;
cout << myFile.tellg() << endl; // Correctly outputs 0 (beginning of file)
myFile.read(&c, 1);
cout << myFile.tellg() << endl; // Should output 1, but it outputs
// FFFFFFFFFFFFFFFA
myFile.read(&c, 1);
cout << myFile.tellg() << endl; // Should output 2, but it outputs
// FFFFFFFFFFFFFFFB
What's happening here?
I tried putting
myFile.seekg(0, ios_base::beg);
or
myFile.seekg(0, myFile.beg);
but the cursor still moves to FFFFFFFFFFFFFFFA whenever I try to read a byte.
EDIT:
I don't know if it is related, but I did an endianness test and these are the results:
bool endianness = *reinterpret_cast<short*>("10") & 1; // Outputs 1
EDIT 2:
The file itself seems to be the cause, as the output is not the same with another file, but why?
Here is the byte data of the file (a .midi file), taken from HxD:
4D 54 68 64 00 00 00 06 00 01 00 03 00 04 4D 54
72 6B 00 00 00 A1 00 C0 69 00 90 3C 5A 01 41 5A
01 45 5A 01 48 5A 01 49 5A 01 48 5A 01 45 5A 01
41 5A 01 3C 5A 01 37 5A 01 33 5A 01 30 5A 01 30
5A 01 30 5A 01 33 5A 01 37 5A 01 3C 5A 01 41 5A
01 45 5A 01 48 5A 01 49 5A 01 48 5A 01 45 5A 01
41 5A 01 3C 5A 01 37 5A 01 33 5A 01 30 5A 01 30
5A 01 30 5A 01 33 5A 01 37 5A 01 3C 5A 01 41 5A
01 45 5A 01 48 5A 01 49 5A 01 48 5A 01 45 5A 01
41 5A 01 3C 5A 01 37 5A 01 33 5A 01 30 5A 01 30
5A 01 30 5A 01 33 5A 01 37 5A 01 3C 5A 01 41 5A
01 45 00 00 FF 2F 00 4D 54 72 6B 00 00 00 41 00
C1 72 05 91 3C 5A 00 40 5A 00 43 5A 00 48 5A 0A
35 5A 00 41 5A 00 44 5A 00 49 5A 0A 37 5A 00 40
5A 00 43 5A 00 48 5A 0A 41 5A 00 47 5A 0A 30 5A
00 40 5A 00 43 5A 00 48 5A 05 32 00 00 FF 2F 00
4D 54 72 6B 00 00 00 26 00 C2 47 0A 92 50 64 01
52 64 09 50 78 00 52 78 0A 50 00 01 52 00 09 50
78 01 50 00 0A 52 00 00 50 00 00 FF 2F 00
EDIT 3:
Here is the full code of this test:
#include <fstream>
#include <iomanip>
#include <iostream>
using namespace std;

int main() {
    cout << std::hex << std::setfill('0') << std::uppercase;
    fstream midi_file{ "D:/Descargas/OutFile.midi" };
    cout << midi_file.good() << endl; // Outputs 1
    char c;
    cout << midi_file.tellg() << endl; // Correctly outputs 0 (beginning of file)
    midi_file.read(&c, 1);
    cout << midi_file.tellg() << endl; // Erroneously outputs FFFFFFFFFFFFFFFA
    midi_file.read(&c, 1);
    cout << midi_file.tellg() << endl; // Erroneously outputs FFFFFFFFFFFFFFFB
    // Endianness test:
    cout << (*reinterpret_cast<short*>("10") & 1) << endl; // Outputs 1
    return 0;
}
The return value of tellg is NOT a number. It is a pos_type, which has some rules it needs to follow, but being understandable when printed isn't one of them. See http://en.cppreference.com/w/cpp/io/fpos
Its primary purpose is to allow a seek operation to return to a saved position.
Also, your "endianness test" is very, very messed up. Reinterpreting a character string as a short? C doesn't work that way. Maybe if you had used "\x01\x00".
The file must be opened in binary mode (ios::binary). In text mode the stream may translate or specially interpret certain bytes, such as line endings, which distorts both the data read and the reported position.
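A minimal sketch of the binary-mode approach follows. The function name positions_after_reads is made up for this example, and converting the position to std::streamoff before storing it is my choice for getting a plain integer out of pos_type.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: opens `path` in binary mode, reads up to `count` bytes
// one at a time, and records the stream position after each successful read.
std::vector<long long> positions_after_reads(const std::string& path, int count) {
    assert(count >= 0);
    std::ifstream in(path, std::ios::binary);  // no text-mode translation
    std::vector<long long> positions;
    char c;
    for (int i = 0; i < count && in.read(&c, 1); ++i) {
        // pos_type converts to std::streamoff, a signed integer offset.
        positions.push_back(static_cast<std::streamoff>(in.tellg()));
    }
    return positions;
}
```

For a file opened this way, the recorded positions should simply be 1, 2, 3 and so on, regardless of which byte values the file contains.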

How to read a UTF-16 file and compare its contents to a wchar_t* string literal defined with hex values

I have a file in UTF-16 (or UCS-2, doesn't really matter since it is UTF-16 LE as far as I know) encoding which I have downloaded from here: http://www.humancomp.org
I'd like to read the contents of that file into a std::wstring, which is my first problem: I haven't been able to read the file correctly yet. The read data always seems to be messed up.
Secondly, I'd like to compare the read std::wstring to a const wchar_t* string literal. And here, I am experiencing my second problem: How do I specify the wchar_t content via hex values?
The file which I want to turn into a const wchar_t* string literal has the following bytes (copied out of a hex editor)
FE FF 05 31 05 65 05 81 05 65 05 70 05 6B 00 20 05 6B 05 74 00 20 05 6C 05 61 05 7E 00 20 00 3F 05 82 05 72 05 6B 05 65 00 20 05 6C 05 61 05 7E 05 61 05 80 05 61 05 80 00 2C 00 0D 00 0A 05 3F 05 75 05 61 05 65 05 62 05 7D 00 20 05 79 05 7F 05 61 05 75 05 6B 00 20 05 6F 05 61 05 7D 05 6F 05 61 05 6E 05 6B 00 20 05 74 05 70 05 63 05 6B 05 65 00 2E 00 2E 00 2E 00 0D 00 0A 05 31 05 75 05 65 05 7A 05 70 05 7D 00 20 05 6F 00 3F 05 82 05 66 05 70 05 6B 00 20 05 74 05 70 05 6F 05 65 00 20 05 6B 05 65 05 6E 00 20 00 3F 05 61 05 7E 05 61 05 7F 05 80 00 2C 00 0D 00 0A 05 31 05 75 05 65 05 7A 05 70 05 7D 00 20 05 6F 00 3F 05 82 05 66 05 70 05 6B 00 20 00 3F 05 61 05 7E 05 61 05 7F 05 61 05 6C 00 20 05 74 05 70 05 6F 05 6B 05 65 05 89
Of course, I can't initialize the string literal with that. I tried to turn it into hex values and apply a reinterpret_cast to get a const wchar_t*
reinterpret_cast<const wchar_t*>("\xFE\xFF\x05\x31\x05\x65\x05\x81\x05\x65\x05\x70\x05\x6B\x00\x20\x05\x6B\x05\x74\x00\x20\x05\x6C\x05\x61\x05\x7E\x00\x20\x00\x3F\x05\x82\x05\x72\x05\x6B\x05\x65\x00\x20\x05\x6C\x05\x61\x05\x7E\x05\x61\x05\x80\x05\x61\x05\x80\x00\x2C\x00\x0D\x00\x0A\x05\x3F\x05\x75\x05\x61\x05\x65\x05\x62\x05\x7D\x00\x20\x05\x79\x05\x7F\x05\x61\x05\x75\x05\x6B\x00\x20\x05\x6F\x05\x61\x05\x7D\x05\x6F\x05\x61\x05\x6E\x05\x6B\x00\x20\x05\x74\x05\x70\x05\x63\x05\x6B\x05\x65\x00\x2E\x00\x2E\x00\x2E\x00\x0D\x00\x0A\x05\x31\x05\x75\x05\x65\x05\x7A\x05\x70\x05\x7D\x00\x20\x05\x6F\x00\x3F\x05\x82\x05\x66\x05\x70\x05\x6B\x00\x20\x05\x74\x05\x70\x05\x6F\x05\x65\x00\x20\x05\x6B\x05\x65\x05\x6E\x00\x20\x00\x3F\x05\x61\x05\x7E\x05\x61\x05\x7F\x05\x80\x00\x2C\x00\x0D\x00\x0A\x05\x31\x05\x75\x05\x65\x05\x7A\x05\x70\x05\x7D\x00\x20\x05\x6F\x00\x3F\x05\x82\x05\x66\x05\x70\x05\x6B\x00\x20\x00\x3F\x05\x61\x05\x7E\x05\x61\x05\x7F\x05\x61\x05\x6C\x00\x20\x05\x74\x05\x70\x05\x6F\x05\x6B\x05\x65\x05\x89");
but this doesn't work. It gives me bogus data.
I've also tried to create a wchar_t string literal directly:
L"\xFEFF\x0531\x0565\x0581\x0565\x0570\x056B\x0020\x056B\x0574\x0020\x056C\x0561\x057E\x0020\x003F\x0582\x0572\x056B\x0565\x0020\x056C\x0561\x057E\x0561\x0580\x0561\x0580\x002C\x000D\x000A\x053F\x0575\x0561\x0565\x0562\x057D\x0020\x0579\x057F\x0561\x0575\x056B\x0020\x056F\x0561\x057D\x056F\x0561\x056E\x056B\x0020\x0574\x0570\x0563\x056B\x0565\x002E\x002E\x002E\x000D\x000A\x0531\x0575\x0565\x057A\x0570\x057D\x0020\x056F\x003F\x0582\x0566\x0570\x056B\x0020\x0574\x0570\x056F\x0565\x0020\x056B\x0565\x056E\x0020\x003F\x0561\x057E\x0561\x057F\x0580\x002C\x000D\x000A\x0531\x0575\x0565\x057A\x0570\x057D\x0020\x056F\x003F\x0582\x0566\x0570\x056B\x0020\x003F\x0561\x057E\x0561\x057F\x0561\x056C\x0020\x0574\x0570\x056F\x056B\x0565\x0589"
This, again, ends up in bogus data. I'm not even sure if this is the correct way of specifying wchar_t data - combining 2 bytes?
Here is the solution which was achieved with the help of the comment by Remy Lebeau:
// BOM: \xFEFF
auto utf16raw = L"\x0531\x0565\x0581\x0565\x0570\x056B\x0020\x056B\x0574\x0020\x056C\x0561\x057E\x0020\x003F\x0582\x0572\x056B";
std::wstring utf16str{utf16raw};
The BOM must be left out of the string.
The UTF-16 string utf16str can be converted into a UTF-8 encoded string (and vice versa) with, for instance, the UTF-8 CPP library available on SourceForge.
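For the reading half of the question, here is a minimal sketch that decodes the raw bytes by hand. Note that the dump above starts with FE FF, which is the big-endian byte order mark. The sketch uses char16_t and std::u16string instead of wchar_t (my substitution, since wchar_t is 16 bits wide on Windows but commonly 32 bits elsewhere). The names read_bytes and decode_utf16 are made up for this example, and defaulting to big-endian when there is no BOM is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: reads a whole file as raw bytes (binary mode).
std::vector<unsigned char> read_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::vector<char> raw((std::istreambuf_iterator<char>(in)),
                          std::istreambuf_iterator<char>());
    return std::vector<unsigned char>(raw.begin(), raw.end());
}

// Hypothetical helper: turns UTF-16 bytes into code units, honouring and
// dropping a leading BOM. Precondition: the byte count is even.
std::u16string decode_utf16(const std::vector<unsigned char>& bytes) {
    assert(bytes.size() % 2 == 0);
    std::size_t i = 0;
    bool big_endian = true;  // assumption: big-endian when no BOM is present
    if (bytes.size() >= 2) {
        if (bytes[0] == 0xFE && bytes[1] == 0xFF) { big_endian = true; i = 2; }
        else if (bytes[0] == 0xFF && bytes[1] == 0xFE) { big_endian = false; i = 2; }
    }
    std::u16string out;
    for (; i + 1 < bytes.size(); i += 2) {
        const unsigned hi = big_endian ? bytes[i] : bytes[i + 1];
        const unsigned lo = big_endian ? bytes[i + 1] : bytes[i];
        out.push_back(static_cast<char16_t>((hi << 8) | lo));
    }
    return out;
}
```

The result of decode_utf16(read_bytes(path)) can then be compared with operator== against a literal such as u"\u0531\u0565\u0581", written without the BOM, as in the solution above.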