I'm trying to solve a regex problem with rsyslog, but I'm having a hard time getting the desired output.
I've been testing my patterns on https://www.rsyslog.com/regex/
The logs have this form:
{"trace_indices":["psno_elastic"],"utc_timestamp":"2022-02-01T21:18:57.214+00:00","request_headers":"{Acess=Low 2j3k4uuuuuuuui=}","trace_id":null,"audit_category":"AUTHENTICATED","aud_reason":"Action: indices:data/write/index","#version":"1","ad_principal":null,"format_version":2,"remote_address":"10.135.99.176:52634","request_user":"superuser","audit_trace_source":"{\"Timestamp\" : \"1643749681989\",\"Index\" : \"hive_black_d\",\"Index_Class\" : \"hive_qxs_netfixa\",\"Shards\" : \"389\",\"MBytes\" : \"0.236956\"}","ad_date":"Tue Feb 01 21:10:57 WET 2022","audit_details":"indices:data/write/index","#timestamp":"2022-02-01T21:12:00.665Z","audit_request_type":"transport","request_class":"class org.elasticsearch.action.index.IndexRequest","trace_index_types":["default"],"trace_resolved_indices":["psno_elastic"]}
And I want to get this, removing only the extra text next to Low:
{"trace_indices":["psno_elastic"],"utc_timestamp":"2022-02-01T21:18:57.214+00:00","request_headers":"{Acess=Low}","trace_id":null,"audit_category":"AUTHENTICATED","aud_reason":"Action: indices:data/write/index","#version":"1","ad_principal":null,"format_version":2,"remote_address":"10.135.99.176:52634","request_user":"superuser","audit_trace_source":"{\"Timestamp\" : \"1643749681989\",\"Index\" : \"hive_black_d\",\"Index_Class\" : \"hive_qxs_netfixa\",\"Shards\" : \"389\",\"MBytes\" : \"0.236956\"}","ad_date":"Tue Feb 01 21:10:57 WET 2022","audit_details":"indices:data/write/index","#timestamp":"2022-02-01T21:12:00.665Z","audit_request_type":"transport","request_class":"class org.elasticsearch.action.index.IndexRequest","trace_index_types":["default"],"trace_resolved_indices":["psno_elastic"]}
I tried the following regex pattern:
"(.*Access=?)(?>=(Low 2j3k4uuuuuuuui=(.*)$))"
And the results were:
0: {"trace_indices":["psno_elastic"],"utc_timestamp":"2022-02-01T21:18:57.214+00:00","request_headers":"{Access=Low 2j3k4uuuuuuuui=}","trace_id":null,"audit_category":"AUTHENTICATED","aud_reason":"Action: indices:data/write/index","#version":"1","ad_principal":null,"format_version":2,"remote_address":"10.135.99.176:52634","request_user":"superuser","audit_trace_source":"{\"Timestamp\" : \"1643749681989\",\"Index\" : \"hive_black_d\",\"Index_Class\" : \"hive_qxs_netfixa\",\"Shards\" : \"389\",\"MBytes\" : \"0.236956\"}","ad_date":"Tue Feb 01 21:10:57 WET 2022","audit_details":"indices:data/write/index","#timestamp":"2022-02-01T21:12:00.665Z","audit_request_type":"transport","request_class":"class org.elasticsearch.action.index.IndexRequest","trace_index_types":["default"],"trace_resolved_indices":["psno_elastic"]}
1: {"trace_indices":["psno_elastic"],"utc_timestamp":"2022-02-01T21:18:57.214+00:00","request_headers":"{Access
2: Low 2j3k4uuuuuuuui=}","trace_id":null,"audit_category":"AUTHENTICATED","aud_reason":"Action: indices:data/write/index","#version":"1","ad_principal":null,"format_version":2,"remote_address":"10.135.99.176:52634","request_user":"superuser","audit_trace_source":"{\"Timestamp\" : \"1643749681989\",\"Index\" : \"hive_black_d\",\"Index_Class\" : \"hive_qxs_netfixa\",\"Shards\" : \"389\",\"MBytes\" : \"0.236956\"}","ad_date":"Tue Feb 01 21:10:57 WET 2022","audit_details":"indices:data/write/index","#timestamp":"2022-02-01T21:12:00.665Z","audit_request_type":"transport","request_class":"class org.elasticsearch.action.index.IndexRequest","trace_index_types":["default"],"trace_resolved_indices":["psno_elastic"]}
3: }","trace_id":null,"audit_category":"AUTHENTICATED","aud_reason":"Action: indices:data/write/index","#version":"1","ad_principal":null,"format_version":2,"remote_address":"10.135.99.176:52634","request_user":"superuser","audit_trace_source":"{\"Timestamp\" : \"1643749681989\",\"Index\" : \"hive_black_d\",\"Index_Class\" : \"hive_qxs_netfixa\",\"Shards\" : \"389\",\"MBytes\" : \"0.236956\"}","ad_date":"Tue Feb 01 21:10:57 WET 2022","audit_details":"indices:data/write/index","#timestamp":"2022-02-01T21:12:00.665Z","audit_request_type":"transport","request_class":"class org.elasticsearch.action.index.IndexRequest","trace_index_types":["default"],"trace_resolved_indices":["psno_elastic"]}
The numbers (0, 1, 2, 3) are the values for "Submatch to Use".
I don't know what the right regex is to produce the desired output.
Thank you.
I used regex101.com in PCRE2 mode with substitution.
The regex string
/^([\w\W]*)2j3k4uuuuuuuui=([\w\W]*)$/gm
with the source string
":"Action: indices:data/write/index","#version":"1","ad_principal":null,"format_version":2,"remote_address":"10.135.99.176:52634","request_user":"superuser","audit_trace_source":"{\"Timestamp\" : \"1643749681989\",\"Index\" : \"hive_black_d\",\"Index_Class\" : \"hive_qxs_netfixa\",\"Shards\" : \"389\",\"MBytes\" : \"0.236956\"}","ad_date":"Tue Feb 01 21:10:57 WET 2022","audit_details":"indices:data/write/index","#timestamp":"2022-02-01T21:12:00.665Z","audit_request_type":"transport","request_class":"class org.elasticsearch.action.index.IndexRequest","trace_index_types":["default"],"trace_resolved_indices":["psno_elastic"]}
and the substitution $1$2 gave the desired result:
{"trace_indices":["psno_elastic"],"utc_timestamp":"2022-02-01T21:18:57.214+00:00","request_headers":"{Acess=Low }","trace_id":null,"audit_category":"AUTHENTICATED","aud_reason":"Action: indices:data/write/index","#version":"1","ad_principal":null,"format_version":2,"remote_address":"10.135.99.176:52634","request_user":"superuser","audit_trace_source":"{\"Timestamp\" : \"1643749681989\",\"Index\" : \"hive_black_d\",\"Index_Class\" : \"hive_qxs_netfixa\",\"Shards\" : \"389\",\"MBytes\" : \"0.236956\"}","ad_date":"Tue Feb 01 21:10:57 WET 2022","audit_details":"indices:data/write/index","#timestamp":"2022-02-01T21:12:00.665Z","audit_request_type":"transport","request_class":"class org.elasticsearch.action.index.IndexRequest","trace_index_types":["default"],"trace_resolved_indices":["psno_elastic"]}
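As a rough cross-check of the capture-and-drop idea outside regex101, here is a minimal Python sketch. It assumes a shortened sample record, and the names line, pattern and cleaned are mine. Python's re module is not the engine rsyslog uses, so treat this only as an illustration of the approach. Instead of hard-coding the token, it drops everything between Low and the closing brace, which also removes the leftover space.

```python
import re

# Shortened sample record (assumption); only request_headers matters here.
line = (
    '{"trace_indices":["psno_elastic"],'
    '"request_headers":"{Acess=Low 2j3k4uuuuuuuui=}",'
    '"trace_id":null}'
)

# Group 1 keeps the text up to and including "Low".
# The trailing [^}]* consumes whatever follows, up to (not including) "}".
pattern = re.compile(r'("request_headers":"\{[^}]*?=Low)[^}]*')

# Replace the whole match with group 1 only, once per record.
cleaned = pattern.sub(r'\1', line, count=1)
print(cleaned)
```

Because the pattern is anchored on the request_headers key, other fields that happen to contain "Low" are left alone.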
/*
typedef struct _HRFS_VOLUME_CONTROL_BLOCK
{
FSRTL_ADVANCED_FCB_HEADER VolumeFileHeader;
ULONG nodeType;
FAST_MUTEX AdvancedFcbHeaderMutex;
....
};
*/
DumpFileObject(*(pVolDev->fileObject));
Vcb = (HRFS_VOLUME_CONTROL_BLOCK_PTR)ExAllocatePool(PagedPool, sizeof(HRFS_VOLUME_CONTROL_BLOCK));
pVolDev->fileObject->SectionObjectPointer =
    (PSECTION_OBJECT_POINTERS)ExAllocatePool(PagedPool, sizeof(SECTION_OBJECT_POINTERS));
pVolDev->fileObject->WriteAccess = TRUE;
pVolDev->fileObject->ReadAccess = TRUE;
pVolDev->fileObject->DeleteAccess = TRUE;
pVolDev->fileObject->FsContext = &HrfsData.gVolume;
pVolDev->fileObject->Vpb = Vpb;
CC_FILE_SIZES fileSize;
fileSize.AllocationSize.QuadPart = fileSize.FileSize.QuadPart = sizeof(PACKED_BOOT_SECTOR);
fileSize.ValidDataLength.QuadPart = 0xFFFFFFFFFFFFFFFF;
CcInitializeCacheMap(pVolDev->fileObject,
                     &fileSize,
                     TRUE,
                     &HrfsData.CacheManagerNoOpCallbacks,
                     Vcb);
In this code segment, a crash occurs when I call the CcInitializeCacheMap function.
The FILE_OBJECT dump information is below:
fileObject.Size : d8
fileObject.DeviceObject : c2221670
fileObject.Vpb : c39302e0
fileObject.FsContext : 32166f0
fileObject.FsContext2 : 0
fileObject.SectionObjectPointer : 0
fileObject.PrivateCacheMap : 0
fileObject.FinalStatus : 0
fileObject.RelatedFileObject : 0
fileObject.LockOperation : 0
fileObject.DeletePending : 0
fileObject.ReadAccess : 1
fileObject.WriteAccess : 1
fileObject.DeleteAccess : 1
fileObject.SharedRead : 0
fileObject.SharedWrite : 0
fileObject.SharedDelete : 0
fileObject.Flags : 40100
fileObject.FileName : 247bb70
fileObject.CurrentByteOffset : 0
fileObject.Waiters : 0
fileObject.Busy : 0
fileObject.LastLock : 0
fileObject.FileObjectExtension : 0
The stack trace is below:
fffff880`0247bac0 fffff880`03241c78 : fffff880`00000000 00000000`00000000 00000000`00000001 fffff880`032166c8 : nt!CcInitializeCacheMap+0xd3
fffff880`0247bba0 fffff880`0323e095 : fffffa80`c303b010 fffffa80`c2222040 fffffa80`c39302e0 fffffa80`c3d56a40 : fastfatDemo!FatMountVolume+0xaf8 [G:\BaiduNetdiskDownload\fastfat_V1G13\fastfat_File_System_Driver\FsCtrl.c # 1460]
fffff880`0247c2f0 fffff880`0323ecb7 : fffffa80`c303b010 fffffa80`c259bb40 00000000`00000065 00000000`00000003 : fastfatDemo!FatCommonFileSystemControl+0xe5 [G:\BaiduNetdiskDownload\fastfat_V1G13\fastfat_File_System_Driver\FsCtrl.c # 1053]
fffff880`0247c340 fffff880`0113d4bc : fffffa80`c3d56a40 fffffa80`c259bb40 00000000`00000000 00000000`00000000 : fastfatDemo!FatFsdFileSystemControl+0x127 [G:\BaiduNetdiskDownload\fastfat_V1G13\fastfat_File_System_Driver\FsCtrl.c # 969]
fffff880`0247c3a0 fffff880`01138971 : fffffa80`c3d56450 00000000`00000000 fffffa80`c3024200 fffffa80`c3129cb0 : fltmgr!FltpFsControlMountVolume+0x28c
fffff880`0247c470 fffff800`04334e6b : fffffa80`c3d56450 00000000`00000000 fffffa80`c3d56450 fffffa80`c259bb40 : fltmgr!FltpFsControl+0x101
fffff880`0247c4d0 fffff800`040789e7 : fffff880`0247c7c0 fffff880`0247c701 fffffa80`c2221600 00000000`00000000 : nt!IopMountVolume+0x28f
fffff880`0247c590 fffff800`044fac6d : 00000000`00000025 00000000`00000000 fffff880`0247c7c0 fffff880`0247c768 : nt!IopCheckVpbMounted+0x1b7
fffff880`0247c600 fffff800`044229a4 : fffffa80`c2221670 00000000`00000000 fffffa80`c31dbb10 fffff8a0`00000001 : nt!IopParseDevice+0xb4d
fffff880`0247c760 fffff800`042fd756 : 00000000`00000000 fffff880`0247c8e0 00000000`00000040 fffffa80`c15c07b0 : nt!ObpLookupObjectName+0x784
fffff880`0247c860 fffff800`044c9d88 : fffffa80`c3d20cb0 00000000`00000000 00000000`00000401 fffff800`043fdef6 : nt!ObOpenObjectByName+0x306
fffff880`0247c930 fffff800`0435d7f4 : fffffa80`c629f870 fffff8a0`80100080 00000000`0029f4f8 00000000`0029f448 : nt!IopCreateFile+0xa08
fffff880`0247c9e0 fffff800`040b4bd3 : fffffa80`c3539b00 00000000`00000001 fffffa80`c629f870 fffff800`042fe1e4 : nt!NtCreateFile+0x78
fffff880`0247ca70 00000000`77629dda : 000007fe`fd3760d6 00000000`00000000 00000000`80000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x13
00000000`0029f428 000007fe`fd3760d6 : 00000000`00000000 00000000`80000000 00000000`00000000 00000000`000c0000 : ntdll!ZwCreateFile+0xa
00000000`0029f430 00000000`773b0add : 00000000`0034bec0 00000000`80000000 00000000`00000003 00000000`0029f892 : KERNELBASE!CreateFileW+0x2cd
00000000`0029f590 000007fe`f1971c1e : 00000000`00000000 00000000`00000000 00000000`01d14280 00000000`0029f830 : kernel32!CreateFileWImplementation+0x7d
00000000`0029f5f0 00000000`00000000 : 00000000`00000000 00000000`01d14280 00000000`0029f830 00000000`00000003 : FVEAPI+0x1c1e
I traced the address to nt!CcInitializeCacheMap+0xd3
and found a compare instruction there.
So what in my program caused the crash in CcInitializeCacheMap?
The problem is the pool type: these two allocations should not use PagedPool (use nonpaged pool instead).
// Erroneous code:
Vcb = (HRFS_VOLUME_CONTROL_BLOCK_PTR)ExAllocatePool(PagedPool, sizeof(HRFS_VOLUME_CONTROL_BLOCK));
pVolDev->fileObject->SectionObjectPointer =
    (PSECTION_OBJECT_POINTERS)ExAllocatePool(PagedPool, sizeof(SECTION_OBJECT_POINTERS));
I'm trying to tighten the security on a Windows process by overriding the process owner's ability to further alter the DACL on the process.
Having created the process with CreateProcessAsUser(),
I then get the existing DACL from it like so:
CDacl procDacl;
if (AtlGetDacl(hProcess, SE_KERNEL_OBJECT, &procDacl))
{
//..
}
After this, I construct an OWNER RIGHTS ACE and add it to the DACL so that the owner only has read access to the permissions (this removes the default WRITE_DAC access):
PSID OwnerRightsSid;
if (ConvertStringSidToSid(OWNER_RIGHTS_SID_STRING, &OwnerRightsSid))
{
CSid sidOwnerRights(*static_cast<SID*>(OwnerRightsSid));
LocalFree(OwnerRightsSid);
procDacl.AddAllowedAce(sidOwnerRights, READ_CONTROL);
}
Then I set the DACL back on the process:
AtlSetDacl(hProcess, SE_KERNEL_OBJECT, procDacl, PROTECTED_DACL_SECURITY_INFORMATION);
If I use AtlGetDacl() at this point, I can enumerate the ACEs and see that all of them have flags set to 0. However, looking at the process in WinDBG, I see that the owner rights SID has gained flag 0x8 (INHERIT_ONLY_ACE), which means it doesn't actually apply to the process itself. What's weird is that:
All the other ACEs I've added in the same way have flag 0 as expected.
But if I use Process Explorer or Process Hacker, they're able to set the OWNER RIGHTS SID without this inheritance problem... (Note the owner rights SID: S-1-3-4)
Here's my ACEs in WinDBG:
0: kd> !sd (0xffff8903`70d9d5e7 & 0xffffffff`fffffff0)
->Revision: 0x1
->Sbz1 : 0x0
->Control : 0x8014
SE_DACL_PRESENT
SE_SACL_PRESENT
SE_SELF_RELATIVE
->Owner : S-1-5-21-2264418099-4034413657-3176887289-1001
->Group : S-1-5-21-2264418099-4034413657-3176887289-513
->Dacl :
->Dacl : ->AclRevision: 0x2
->Dacl : ->Sbz1 : 0x0
->Dacl : ->AclSize : 0x64
->Dacl : ->AceCount : 0x4
->Dacl : ->Sbz2 : 0x0
->Dacl : ->Ace[0]: ->AceType: ACCESS_ALLOWED_ACE_TYPE
->Dacl : ->Ace[0]: ->AceFlags: 0x0
->Dacl : ->Ace[0]: ->AceSize: 0x14
->Dacl : ->Ace[0]: ->Mask : 0x001fffff
->Dacl : ->Ace[0]: ->SID: S-1-5-18
->Dacl : ->Ace[1]: ->AceType: ACCESS_ALLOWED_ACE_TYPE
->Dacl : ->Ace[1]: ->AceFlags: 0x0
->Dacl : ->Ace[1]: ->AceSize: 0x18
->Dacl : ->Ace[1]: ->Mask : 0x001fffff
->Dacl : ->Ace[1]: ->SID: S-1-5-32-544
->Dacl : ->Ace[2]: ->AceType: ACCESS_ALLOWED_ACE_TYPE
->Dacl : ->Ace[2]: ->AceFlags: 0x0
->Dacl : ->Ace[2]: ->AceSize: 0x1c
->Dacl : ->Ace[2]: ->Mask : 0x00120410
->Dacl : ->Ace[2]: ->SID: S-1-5-5-0-165627
->Dacl : ->Ace[3]: ->AceType: ACCESS_ALLOWED_ACE_TYPE
->Dacl : ->Ace[3]: ->AceFlags: 0x8
->Dacl : ->Ace[3]: INHERIT_ONLY_ACE
->Dacl : ->Ace[3]: ->AceSize: 0x14
->Dacl : ->Ace[3]: ->Mask : 0x00020000
->Dacl : ->Ace[3]: ->SID: S-1-3-4
->Sacl :
->Sacl : ->AclRevision: 0x2
->Sacl : ->Sbz1 : 0x0
->Sacl : ->AclSize : 0x1c
->Sacl : ->AceCount : 0x1
->Sacl : ->Sbz2 : 0x0
->Sacl : ->Ace[0]: ->AceType: SYSTEM_MANDATORY_LABEL_ACE_TYPE
->Sacl : ->Ace[0]: ->AceFlags: 0x0
->Sacl : ->Ace[0]: ->AceSize: 0x14
->Sacl : ->Ace[0]: ->Mask : 0x00000003
->Sacl : ->Ace[0]: ->SID: S-1-16-8192
I've already tried looking at the Process Hacker source code, and I can't see that they're doing anything differently (except that they seem to be using SetObjectInfo() rather than the ATL wrapper functions). Does anyone have a good understanding of how these inheritance flags work, and why my ACE seems to have been altered from the flags I set?
Quick summary: a C++ app loads data from SQL Server using OTL4 and writes it to Mongo using mongocxx bulk_write. The strings seem to be getting mangled somehow, so they don't work in the aggregation pipeline (but appear fine otherwise).
I have a simple Mongo collection which doesn't seem to behave as expected with an aggregation pipeline when I'm projecting multiple fields. It's a trivial document, no nesting, fields are just doubles and strings.
First 2 queries work as expected:
> db.TemporaryData.aggregate( [ { $project : { ParametersId:1 } } ] )
{ "_id" : ObjectId("5c28f751a531251fd0007c72"), "ParametersId" : 526988617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c73"), "ParametersId" : 526988617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c74"), "ParametersId" : 526988617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c75"), "ParametersId" : 526988617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c76"), "ParametersId" : 526988617 }
> db.TemporaryData.aggregate( [ { $project : { Col1:1 } } ] )
{ "_id" : ObjectId("5c28f751a531251fd0007c72"), "Col1" : 575 }
{ "_id" : ObjectId("5c28f751a531251fd0007c73"), "Col1" : 579 }
{ "_id" : ObjectId("5c28f751a531251fd0007c74"), "Col1" : 616 }
{ "_id" : ObjectId("5c28f751a531251fd0007c75"), "Col1" : 617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c76"), "Col1" : 622 }
But combining them doesn't return both fields as expected:
> db.TemporaryData.aggregate( [ { $project : { ParametersId:1, Col1:1 } } ] )
{ "_id" : ObjectId("5c28f751a531251fd0007c72"), "ParametersId" : 526988617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c73"), "ParametersId" : 526988617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c74"), "ParametersId" : 526988617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c75"), "ParametersId" : 526988617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c76"), "ParametersId" : 526988617 }
It seems to be specific to the ParametersId field, for instance if I choose 2 other fields it's OK.
> db.TemporaryData.aggregate( [ { $project : { Col1:1, Col2:1 } } ] )
{ "_id" : ObjectId("5c28f751a531251fd0007c72"), "Col1" : 575, "Col2" : "1101-2" }
{ "_id" : ObjectId("5c28f751a531251fd0007c73"), "Col1" : 579, "Col2" : "1103-2" }
{ "_id" : ObjectId("5c28f751a531251fd0007c74"), "Col1" : 616, "Col2" : "1300-3" }
{ "_id" : ObjectId("5c28f751a531251fd0007c75"), "Col1" : 617, "Col2" : "1300-3" }
{ "_id" : ObjectId("5c28f751a531251fd0007c76"), "Col1" : 622, "Col2" : "1400-3" }
For some reason when I include ParametersId field, all hell breaks loose in the pipeline:
> db.TemporaryData.aggregate( [ { $project : { ParametersId:1, Col2:1, Col1:1, Col3:1 } } ] )
{ "_id" : ObjectId("5c28f751a531251fd0007c72"), "ParametersId" : 526988617, "Col1" : 575 }
{ "_id" : ObjectId("5c28f751a531251fd0007c73"), "ParametersId" : 526988617, "Col1" : 579 }
{ "_id" : ObjectId("5c28f751a531251fd0007c74"), "ParametersId" : 526988617, "Col1" : 616 }
{ "_id" : ObjectId("5c28f751a531251fd0007c75"), "ParametersId" : 526988617, "Col1" : 617 }
{ "_id" : ObjectId("5c28f751a531251fd0007c76"), "ParametersId" : 526988617, "Col1" : 622 }
DB version and the data:
> db.version()
4.0.2
> db.TemporaryData.find()
{ "_id" : ObjectId("5c28f751a531251fd0007c72"), "CellId" : 998909269, "ParametersId" : 526988617, "Order" : 1, "Col1" : 575, "Col2" : "1101-2", "Col3" : "CHF" }
{ "_id" : ObjectId("5c28f751a531251fd0007c73"), "CellId" : 998909269, "ParametersId" : 526988617, "Order" : 1, "Col1" : 579, "Col2" : "1103-2", "Col3" : "CHF" }
{ "_id" : ObjectId("5c28f751a531251fd0007c74"), "CellId" : 998909269, "ParametersId" : 526988617, "Order" : 1, "Col1" : 616, "Col2" : "1300-3", "Col3" : "CHF" }
{ "_id" : ObjectId("5c28f751a531251fd0007c75"), "CellId" : 998909269, "ParametersId" : 526988617, "Order" : 36, "Col1" : 617, "Col2" : "1300-3", "Col3" : "CHF" }
{ "_id" : ObjectId("5c28f751a531251fd0007c76"), "CellId" : 998909269, "ParametersId" : 526988617, "Order" : 1, "Col1" : 622, "Col2" : "1400-3", "Col3" : "CHF" }
Update: quoting the field names makes no difference. I'm typing all of the above in the mongo.exe command line, but I see the same behavior in my C++ application with a slightly more complex pipeline (projecting all fields to guarantee order).
This same app is actually creating the data in the first place. Does anyone know what could be going wrong? Everything uses the mongocxx lib.
Update: It turns out there's something wrong with my handling of strings. Without the string fields in the data, everything is fine. So I've somehow broken my strings: they look and behave correctly in other ways, but they don't play nicely with the aggregation pipeline. I'm using mongocxx::collection.bulk_write to write standard std::strings that are loaded from SQL Server through the OTL4 header. In between, there's a strncpy_s when they get stored internally. I can't seem to create a simple reproducible example.
To rule out a conflict with anything else, try the projection with strictly formatted JSON (add quotes to the keys):
db.TemporaryData.aggregate( [ { $project : { "ParametersId":1, "Col1":1 } } ] )
I finally found that the issue was corrupt documents. Because I was using bulk_write for the insert, they were getting into the database but causing this strange behavior. I switched to insert_many, which reported that the document was corrupt, and then I could track down the bug.
The documents were corrupt because I was writing the same field-value data multiple times, which seems to break the bsoncxx::builder::stream::document I was using to construct them.
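One way to catch this kind of mistake early is to reject a field name the second time it is appended. The sketch below is plain Python, not mongocxx or bsoncxx: DocumentBuilder is a hypothetical stand-in I made up to illustrate the guard, and the row data is copied from the sample documents with Col2 deliberately written twice.

```python
class DocumentBuilder:
    """Hypothetical stand-in for a streaming document builder (not bsoncxx)."""

    def __init__(self):
        self._fields = []   # ordered (name, value) pairs
        self._seen = set()  # field names appended so far

    def append(self, name, value):
        # Refuse to write the same field twice instead of silently
        # producing a document with duplicate keys.
        if name in self._seen:
            raise ValueError(f"duplicate field name: {name!r}")
        self._seen.add(name)
        self._fields.append((name, value))
        return self

    def build(self):
        return dict(self._fields)


# Col2 appears twice on purpose to trigger the guard.
row = [("ParametersId", 526988617), ("Col1", 575),
       ("Col2", "1101-2"), ("Col2", "1101-2")]

builder = DocumentBuilder()
error = None
try:
    for name, value in row:
        builder.append(name, value)
except ValueError as exc:
    error = str(exc)

print(error)
print(builder.build())
```

The same check could presumably be done in the C++ code with a std::set of field names kept next to the builder, but I have not tried that here.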
I want to extract the addressId for a given house number from a response containing a long array. The array response looks like this (snippet):
: : "footprint":null,
: : "type":null,
: : "addressId":"0011442239",
: : "streetName":"solitudestr.",
: : "streetNrFirstSuffix":null,
: : "streetNrFirst":null,
: : "streetNrLastSuffix":null,
: : "streetNrLast":null,
: : "houseNumber":"25",
: : "houseName":null,
: : "city":"stuttgart",
: : "postcode":"70499",
: : "stateOrProvince":null,
: : "countryName":null,
: : "poBoxNr":null,
: : "poBoxType":null,
: : "attention":null,
: : "geographicAreas":
: : [
: : ],
: : "firstName":null,
: : "lastName":null,
: : "title":null,
: : "region":"BW",
: : "additionalInfo":null,
: : "properties":
: : [
: : ],
: : "extAddressId":null,
: : "entrance":null,
: : "district":null,
: : "addressLine1":null,
: : "addressLine2":null,
: : "addressLine3":null,
: : "addressLine4":null,
: : "companyName":null,
: : "contactName":null,
: : "houseNrExt":null,
: : "derbyStack":false
: },
: {
: : "footprint":null,
: : "type":null,
: : "addressId":"0011442246",
: : "streetName":"solitudestr.",
: : "streetNrFirstSuffix":null,
: : "streetNrFirst":null,
: : "streetNrLastSuffix":null,
: : "streetNrLast":null,
: : "houseNumber":"26",
: : "houseName":null,
: : "city":"stuttgart",
: : "postcode":"70499",
: : "stateOrProvince":null,
: : "countryName":null,
: : "poBoxNr":null,
: : "poBoxType":null,
: : "attention":null,
: : "geographicAreas":
: : [
: : ],
: : "firstName":null,
: : "lastName":null,
: : "title":null,
: : "region":"BW",
: : "additionalInfo":null,
: : "properties":
: : [
: : ],
: : "extAddressId":null,
: : "entrance":null,
: : "district":null,
: : "addressLine1":null,
: : "addressLine2":null,
: : "addressLine3":null,
: : "addressLine4":null,
: : "companyName":null,
: : "contactName":null,
: : "houseNrExt":null,
: : "derbyStack":false
: },
I only show two house numbers in this response as an example, but the original response is bigger.
Q: How can I match the addressId for a specific houseNumber (I have these house numbers in my CSV data set)? I could write a regex that extracts all addressIds, but then I'd have to use the correct match number in JMeter. However, I cannot assume that the ordering will remain the same across the different environments we test the script against.
I would recommend reconsidering using regular expressions to deal with JSON data.
Starting from JMeter 3.0 you have a JSON Path PostProcessor. With it you can execute arbitrary JSONPath queries, so extracting the addressId for a given houseNumber is as simple as:
`$..[?(@.houseNumber == '25')].addressId`
You can use a JMeter Variable instead of the hard-coded 25 value, like:
$..[?(@.houseNumber == '${houseNumber}')].addressId
If for some reason you have to use JMeter < 3.0, you can still get JSON Path postprocessing capabilities using the JSON Path Extractor from JMeter Plugins.
See the "Advanced Usage of the JSON Path Extractor in JMeter" article, in particular the "Conditional Select" chapter, for more information.
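To make it clearer what that filter expression selects, here is a small plain-Python approximation. It is not JMeter or a JSONPath library: the function name find_address_ids and the trimmed two-record response are my own assumptions, and the recursive walk only roughly mirrors what `$..` does.

```python
import json

# Trimmed stand-in for the real response (assumption: a JSON array of records).
response_text = """[
  {"addressId": "0011442239", "streetName": "solitudestr.", "houseNumber": "25"},
  {"addressId": "0011442246", "streetName": "solitudestr.", "houseNumber": "26"}
]"""


def find_address_ids(node, house_number):
    """Collect addressId from every object whose houseNumber matches."""
    found = []
    if isinstance(node, dict):
        if node.get("houseNumber") == house_number and "addressId" in node:
            found.append(node["addressId"])
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        children = []
    # Descend into nested objects and arrays, similar in spirit to "$..".
    for child in children:
        found.extend(find_address_ids(child, house_number))
    return found


records = json.loads(response_text)
print(find_address_ids(records, "25"))
```

The key point is that the lookup depends on the houseNumber value and not on the position of the record, so reordering between environments does not matter.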
You may use a regex that will capture the digits after addressId and before a specific houseNumber if you use an unrolled tempered greedy token (for better efficiency) in between them to make sure the regex engine does not overflow to another record.
"addressId":"(\d+)"(?:[^\n"]*(?:\n(?!: +: +\[)[^\n"]*|"(?!houseNumber")[^\n"]*)*"houseNumber":"25"|$)
See the regex demo (replace 25 with the necessary house number)
Details:
"addressId":" - literal string
(\d+) - Group 1 ($1$ template value) capturing 1+ digits
" - a quote
(?:[^\n"]*(?:\n(?!: +: +\[)[^\n"]*|"(?!houseNumber")[^\n"]*)*"houseNumber":"25"|$) - a non-capturing group with 2 alternatives, one being $ (end of string) or:
[^\n"]* - zero or more chars other than newline and "
(?: - then come 2 alternatives:
\n(?!: +: +\[)[^\n"]* - a newline not followed by a : : [ like string, followed by 0+ chars other than a newline and "
| - or
"(?!houseNumber")[^\n"]* - a " not followed by houseNumber", followed by 0+ chars other than a newline and "
)* - that may repeat 0 or more times
"houseNumber":"25" - the house number literal string.