How to send a regular expression in a MongoDB query using mongocxx?

Following is a piece of code where I am trying to query mongodb using mongocxx driver.
/*Code for finding all that match the filter */
mongocxx::cursor cursor = collection.find(
document{} << "Vehicle_Registration" << vReg
<< "Vehicle_Make" << vMake
<< "Vehicle_Model" << vModel
<< "Vehicle_Owner" << vOwner
<< finalize);
Here Vehicle_Registration, Vehicle_Make, Vehicle_Model and Vehicle_Owner are fields of the collection.
The values of vReg, vMake, vModel and vOwner are whatever the user entered on the screen. If the user specifies only some of these values, the rest remain NULL. To keep NULL values out of the search, I try to set them to the regular expression { $regex: /./ } so that they match anything and don't affect the results.
This regular expression works in the mongo shell: every field set to it is effectively ignored and doesn't affect the search.
But in the code, I set the regular expression like this:
if (vReg == NULL) { vReg = "{ $regex: /./ }"; }
and then pass vReg into the document{} as shown in the code at the top:
document{} << "Vehicle_Registration" << vReg
Here vReg gets passed as a string "{ $regex: /./ }" (with quotes) and not as { $regex: /./ } (without quotes). As a result it is considered a string and not evaluated as a regular expression in the query and hence there are no search results.
Can somebody please tell me how to pass it as a regular expression?
Thank you!

You'll need to make the regular expression clause a proper BSON subdocument, not a string. Fortunately, there's a shortcut for you with the bsoncxx::types::b_regex type, which expands to the subdocument when added to a BSON document.
Updated: To see how you could toggle the way you describe, here's an example that uses a ternary operator and then wraps the query string or the regular expression in a bsoncxx::types::value, which is a union type:
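#include <iostream>
#include <string>

#include <bsoncxx/builder/stream/document.hpp>
#include <bsoncxx/json.hpp>
#include <bsoncxx/types.hpp>
#include <bsoncxx/types/value.hpp>
#include <mongocxx/client.hpp>
#include <mongocxx/instance.hpp>
#include <mongocxx/uri.hpp>

using namespace bsoncxx::builder::stream;
using namespace bsoncxx::types;

int main() {
    auto inst = mongocxx::instance{};
    auto client = mongocxx::client{mongocxx::uri{}};
    auto coll = client["test"]["regex_question"];
    coll.drop();
    // Left empty to simulate a field the user did not fill in.
    std::string vModel;
    // Matches any string containing at least one (non-newline) character.
    auto empty_regex = b_regex(".", "");
    coll.insert_one(document{} << "Vehicle_Make"
                               << "Toyota"
                               << "Vehicle_Model"
                               << "Highlander" << finalize);
    // Use the regex when the field is empty, otherwise the literal string.
    auto filter = document{} << "Vehicle_Make"
                             << "Toyota"
                             << "Vehicle_Model"
                             << (vModel.empty() ? value{empty_regex} : value{b_utf8{vModel}})
                             << finalize;
    std::cout << bsoncxx::to_json(filter) << std::endl;
    std::cout << "Found " << coll.count(filter.view()) << " document(s)" << std::endl;
}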
That gives this output:
{ "Vehicle_Make" : "Toyota", "Vehicle_Model" : { "$regex" : ".", "$options" : "" } }
Found 1 document(s)

Here is another example: a case-insensitive find using b_regex, as xdg's answer suggests:
collection.find(
bsoncxx::builder::stream::document{}
<< KEY_VALUE
<< open_document
<< "$regex"
<< bsoncxx::types::value{bsoncxx::types::b_regex{"^" + tag, "i"}}
<< close_document
<< finalize
);
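It annoys me that the MongoDB manual gives the form { <field>: { $regex: /pattern/, $options: '<options>' } }, when the slashes around /pattern/ are not actually needed here: they are JavaScript regex-literal syntax used by the mongo shell, whereas b_regex takes the bare pattern string.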

Related

Use regex to validate string not starting with specific character and string length

I have a function with the following if statements:
if (name.length() < 10 || name.length() > 64)
{
return false;
}
if (name.front() == L'.' || name.front() == L' ')
{
return false;
}
I was curious to see if I can do this using the following regular expression:
^(?!\ |\.)([A-Za-z]{10,46}$)
To dissect the above expression: the first part, ^(?!\ |\.), performs a negative lookahead to assert that the string cannot start with a space or a dot (.), and the second part should take care of the string length condition. I wrote the following to test the expression:
std::string randomStrings [] = {" hello",
" hellllloooo",
"\.hello",
".zoidbergvddd",
"asdasdsadadasdasdasdadasdsad"};
std::regex txt_regex("^(?!\ |\.)([A-Za-z]{10,46}$)");
for (const auto& str : randomStrings)
{
std::cout << str << ": " << std::boolalpha << std::regex_match(str, txt_regex) << '\n';
}
I expected the last one to match, since it does not start with a space or a dot (.) and it meets the length criteria. However, this is what I got:
hello: false
hellllloooo: false
.hello: false
.zoidbergvddd: false
asdasdsadadasdasdasdadasdsad: false
Did I miss something trivial here, or is this not possible using regex? It seems like it should be.
Feel free to suggest a better title, I tried to be as descriptive as possible.
Change your regular expression to "^(?![\\s.])([A-Za-z]{10,46}$)" and it will work.
\s matches any whitespace character, and the backslash itself must be escaped inside a C++ string literal, which is why it is written \\s.
You need to turn on compiler warnings: they would have told you that the string literal holding your regex contains unknown escape sequences (\ followed by a space, and \.). I recommend using a raw string literal.
#include <iostream>
#include <regex>
#include <string>
int main() {
    std::string randomStrings[] = { " hello", " hellllloooo", ".hello",
                                    ".zoidbergvddd", "asdasdsadadasdasdasdadasdsad" };
    // Raw string literal: the backslashes reach the regex engine unchanged.
    std::regex txt_regex(R"foo(^(?!\ |\.)([A-Za-z]{10,46}$))foo");
    for (const auto& str : randomStrings) {
        std::cout << str << ": " << std::boolalpha
                  << std::regex_match(str, txt_regex) << '\n';
    }
}
clang++-3.8 gives
hello: false
hellllloooo: false
.hello: false
.zoidbergvddd: false
asdasdsadadasdasdasdadasdsad: true

MongoDB 3.2 c++ drivers , using $exists

bsoncxx::builder::stream::document search_builder;
mongocxx::options::find img_find; // This speeds up the queries
search_builder_images.clear();
search_builder_images << "_id" << "abc" << "data" << open_document <<"$exists" << true << close_document ;
for (bsoncxx::document::view doc : cursor_cal) {
std::cout << bsoncxx::to_json(doc) << std::endl;
}
auto cursor_cal = dbMongo[collectionName].find(search_builder.view());
Here, roughly half the time I get the output I expect, and the other half I get a segmentation fault.
What am I doing wrong? (I am trying to build this search_builder to query the MongoDB database for documents where the data field exists.)
This is a bit old, but I was getting a segfault when constructing the document; I'm not sure whether it's what you were facing. I had to break the query document construction into multiple statements, like this:
auto queryDoc = document{};
queryDoc << "_id" << "abc";
queryDoc << "data" << open_document;
queryDoc << "$exists" << true;
queryDoc << close_document;
auto query = queryDoc << finalize;
Hope this helps someone else.

Processing array of generic BSON documents with MongoDB C++ driver

I have the following document in my MongoDB test database:
> db.a.find().pretty()
{
"_id" : ObjectId("5113d680732fb764c4464fdf"),
"x" : [
{
"a" : 1,
"b" : 2
},
{
"a" : 3,
"b" : 4
}
]
}
I'm trying to access and process the elements in the "x" array. However, the Mongo driver seems to identify it not as an array of JSON documents but as a Date type, as shown by the following code:
auto_ptr<DBClientCursor> cursor = c.query("test.a", BSONObj());
while (cursor->more()) {
BSONObj r = cursor->next();
cout << r.toString() << std::endl;
}
whose output is:
{ _id: ObjectId('51138456732fb764c4464fde'), x: new Date(1360233558334) }
I'm trying to follow the documentation at http://api.mongodb.org/cplusplus and http://docs.mongodb.org/ecosystem/drivers/cpp-bson-array-examples/, but it is quite thin. I have found other examples of processing arrays, but always with simple element types (e.g. an array of integers), never with elements that are BSON documents themselves.
Do you have a code example of processing arrays whose elements are generic BSON documents, please?
You could use the .Array() method or the getFieldDotted() method, as in the following:
Query query = Query();
auto_ptr<DBClientCursor> cursor = myConn.query("test.a", query);
while( cursor->more() ) {
BSONObj nextObject = cursor->next();
cout << nextObject["x"].Array()[0]["a"] << endl;
cout << nextObject.getFieldDotted("x.0.a") << endl;
}
In the end, it seems that the embeddedObject() method was the key:
auto_ptr<DBClientCursor> cursor = c.query("test.a", BSONObj());
while (cursor->more()) {
    BSONObj r = cursor->next();
    cout << "Processing JSON document: " << r.toString() << std::endl;
    std::vector<BSONElement> be = r.getField("x").Array();
    for (unsigned int i = 0; i<be.size(); i++) {
        cout << "Processing array element: " << be[i].toString() << std::endl;
        cout << " of type: " << be[i].type() << std::endl;
        BSONObj bo = be[i].embeddedObject();
        cout << "Processing a field: " << bo.getField("a").toString() << std::endl;
        cout << "Processing b field: " << bo.getField("b").toString() << std::endl;
    }
}
I was wrongly retrieving a different ObjectId and a different type (Date instead of array) because I was looking at a different collection.
Sorry for the noise. I hope the fragment above helps others figure out how to manipulate arrays using the MongoDB C++ driver.

Regular Expression or Not (Getting not all text that satisfies regexp)

I want to use a regex to find the part of a string (or QString) that is between " (quotes).
My simple string is x="20.51167" and I want 20.51167.
Is this possible with regular expressions?
I started with a string like this:
<S id="1109" s5="1" nr="1183" n="Some text" test=" " x="20.53843" y="50.84443">
Using regexps like (nr=\"[0-9]+\") (y=\"[0-9 .^\"]+\")" etc., I get my simple string, e.g. x="20.51167". Maybe this is the wrong way, and I could get the value between the quotes in one step?
For your particular example, this will work:
#include <QRegExp>
#include <QString>
#include <iostream>
int main()
{
    //Here's your regexp.
    QRegExp re("\"[^\"^=]+\"");
    //Here's your sample string:
    QString test ="<S id=\"1109\" s5=\"1\" nr=\"1183\" n=\"Some text\" test=\" \" x=\"20.53843\" y=\"50.84443\">";
    int offset = 0;
    while( offset = re.indexIn( test, offset + 1 ) )
    {
        if(offset == -1)
            break;
        QString res = re.cap().replace("\"", "");
        bool ok;
        int iRes;
        float fRes;
        if( res.toInt( &ok ) && ok )
        {
            iRes = res.toInt();
            std::cout << "int: " << iRes << std::endl;
        }
        else if ( res.toFloat( &ok ) && ok )
        {
            fRes = res.toFloat();
            std::cout << "float: " << fRes << std::endl;
        }
        else
            std::cout << "string: " << res.toStdString() << std::endl;
    }
}
The output will be:
int: 1109
int: 1
int: 1183
string: Some text
string:
float: 20.5384
float: 50.8444
Try this (untested):
="([^"]+)"
The above captures anything that is in-between =" "
In this expression : (nr=\"[0-9]+\") (y=\"[0-9 .^\"]+\")"
Delete the last quote after )
For your regular expression, try:
x="[0-9]+\.[0-9]{5}"
If you want to find anything in quotes, I guess the regex should read:
"([^"]*)"
(anything that is not a quote between quotes)
You just have to move your capturing group inside the quotes:
x=\"([0-9.]+)\"

Boost regex don't match tabs

I'm using boost regex_match and I have a problem matching only text that contains no tab characters.
My test application looks as follows:
#include <iostream>
#include <string>
#include <boost/spirit/include/classic_regex.hpp>
int
main(int args, char** argv)
{
boost::match_results<std::string::const_iterator> what;
if(args == 3) {
std::string text(argv[1]);
boost::regex expression(argv[2]);
std::cout << "Text : " << text << std::endl;
std::cout << "Regex: " << expression << std::endl;
if(boost::regex_match(text, what, expression, boost::match_default) != 0) {
int i = 0;
std::cout << text;
if(what[0].matched)
std::cout << " matches with regex pattern!" << std::endl;
else
std::cout << " does not match with regex pattern!" << std::endl;
for(boost::match_results<std::string::const_iterator>::const_iterator it=what.begin(); it!=what.end(); ++it) {
std::cout << "[" << (i++) << "] " << it->str() << std::endl;
}
} else {
std::cout << "Expression does not match!" << std::endl;
}
} else {
std::cout << "Usage: $> ./boost-regex <text> <regex>" << std::endl;
}
return 0;
}
If I run the program with these arguments, I don't get the expected result:
$> ./boost-regex "`cat file`" "(?=.*[^\t]).*"
Text : This text includes some tabulators
Regex: (?=.*[^\t]).*
This text includes some tabulators matches with regex pattern!
[0] This text includes some tabulators
In this case I would have expected that what[0].matched is false, but it's not.
Is there any mistake in my regular expression?
Or do I have to use other format/match flag?
Thank you in advance!
I am not sure what you want to do. My understanding is that you want the regex to fail as soon as there is a tab in the text.
Your positive lookahead assertion (?=.*[^\t]) is true as soon as it finds a non-tab character, and there are plenty of those in your text.
If you want it to fail when there is a tab, go the other way round and use a negative lookahead assertion:
(?!.*\t).*
This assertion fails as soon as it finds a tab.