How can I avoid matching unwanted blocks with a regex?
For example, I only want to match the static prototypes in C code. How do I make the match not include the main function?
/**
* #brief
*
* #param dsc
*/
static void func1(const char *dsc);
/**
* #brief
*
* #param argc
* #param argv
* #return int
*/
int main(int argc, char *argv[]);
/**
* #brief
*
* #param dsc
*/
static void fun2(const char *dsc);
My regex is
\/\*[\w\W]*?\*\/\n+^static.*\);
but it always matches the main function prototype as well when I search in VS Code.
You can use
/\*(?:(?!/\*|\*/)[\w\W])*?\*/\n+static.*\);
Note that you do not need to escape the / char in the search and replace field since the regex is defined with a mere string, not a regex literal notation where / chars are used as regex delimiters.
The (?:(?!/\*|\*/)[\w\W])*? part makes it match any char, zero or more but as few as possible times, that does not start a /* or */ char sequence.
You do not need ^ as a start of a line anchor here since it makes no sense after \n+, it is implied there.
I may have found a bug in std::regex_replace.
The following code should write "1a b2" with length 5, but it writes "1a2" with length 3.
Am I right? If not, why not?
#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main()
{
string a = regex_replace("1<sn>2", std::regex("<sn>"), string("a\0b", 3));
cout << "a: " << a << "\n";
cout << a.length();
return 0;
}
This does seem to be a bug in libstdc++. Using a debugger I stepped into regex_replace, until getting to this part:
// std [28.11.4] Function template regex_replace
/**
* @brief Search for a regular expression within a range for multiple times,
and replace the matched parts through filling a format string.
* @param __out [OUT] The output iterator.
* @param __first [IN] The start of the string to search.
* @param __last [IN] One-past-the-end of the string to search.
* @param __e [IN] The regular expression to search for.
* @param __fmt [IN] The format string.
* @param __flags [IN] Search and replace policy flags.
*
* @returns __out
* @throws an exception of type regex_error.
*/
template<typename _Out_iter, typename _Bi_iter,
typename _Rx_traits, typename _Ch_type,
typename _St, typename _Sa>
inline _Out_iter
regex_replace(_Out_iter __out, _Bi_iter __first, _Bi_iter __last,
const basic_regex<_Ch_type, _Rx_traits>& __e,
const basic_string<_Ch_type, _St, _Sa>& __fmt,
regex_constants::match_flag_type __flags
= regex_constants::match_default)
{
return regex_replace(__out, __first, __last, __e, __fmt.c_str(), __flags);
}
Going by the write-up at cppreference.com, this implements the first overload, the one that takes a std::string for the replacement string, by calling its c_str() and forwarding to the second overload, the one that takes a const char * parameter, which does the actual work. Because a const char * carries no length, everything from the embedded \0 onwards is lost, and that explains the observed behavior. I can't find anything that requires this approach.
Stepping further into the actual implementation:
auto __len = char_traits<_Ch_type>::length(__fmt);
__out = __i->format(__out, __fmt, __fmt + __len, __flags);
So, it determines the length of the replacement string and passes the replacement string, as a beginning and an ending iterator, into format().
This seems like it should be the other way around, with __fmt preserved as a std::basic_string, and passing iterators directly derived from it into format().
In a multibyte project (vs2017):
#ifndef _TCHAR_DEFINED
typedef char TCHAR;
typedef char * PTCHAR;
typedef unsigned char TBYTE;
typedef unsigned char * PTBYTE;
#define _TCHAR_DEFINED
#endif
struct _getopt_data
{
/* These have exactly the same meaning as the corresponding global
variables, except that they are used for the reentrant
versions of getopt. */
int optind;
int opterr;
int optopt;
TCHAR *optarg;
/* Internal members. */
/* True if the internal members have been initialized. */
int __initialized;
/* The next char to be scanned in the option-element
in which the last option character we returned was found.
This allows us to pick up the scan where we left off.
If this is zero, or a null string, it means resume the scan
by advancing to the next ARGV-element. */
TCHAR *__nextchar;
/* Describe how to deal with options that follow non-option ARGV-elements.
If the caller did not specify anything,
the default is REQUIRE_ORDER if the environment variable
POSIXLY_CORRECT is defined, PERMUTE otherwise.
REQUIRE_ORDER means don't recognize them as options;
stop option processing when the first non-option is seen.
This is what Unix does.
This mode of operation is selected by either setting the environment
variable POSIXLY_CORRECT, or using `+' as the first character
of the list of option characters.
PERMUTE is the default. We permute the contents of ARGV as we
scan, so that eventually all the non-options are at the end.
This allows options to be given in any order, even with programs
that were not written to expect this.
RETURN_IN_ORDER is an option available to programs that were
written to expect options and other ARGV-elements in any order
and that care about the ordering of the two. We describe each
non-option ARGV-element as if it were the argument of an option
with character code 1. Using `-' as the first character of the
list of option characters selects this mode of operation.
The special argument `--' forces an end of option-scanning regardless
of the value of `ordering'. In the case of RETURN_IN_ORDER, only
`--' can cause `getopt' to return -1 with `optind' != ARGC. */
enum
{
REQUIRE_ORDER, PERMUTE, RETURN_IN_ORDER
} __ordering;
/* If the POSIXLY_CORRECT environment variable is set. */
int __posixly_correct;
/* Handle permutation of arguments. */
/* Describe the part of ARGV that contains non-options that have
been skipped. `first_nonopt' is the index in ARGV of the first
of them; `last_nonopt' is the index after the last of them. */
int __first_nonopt;
int __last_nonopt;
#if defined _LIBC && defined USE_NONOPTION_FLAGS
int __nonoption_flags_max_len;
int __nonoption_flags_len;
# endif
};
int
_getopt_internal_r(int argc, TCHAR *const *argv, const TCHAR *optstring,
const struct option *longopts, int *longind,
int long_only, struct _getopt_data *d, int posixly_correct)
{
...
TCHAR c = *d->__nextchar++;
TCHAR *temp = _tcschr(optstring, c); // <= cannot convert from 'const char *' to 'TCHAR *' (first parameter)
...
}
Everything looks correct; in tchar.h:
#define _PUC unsigned char *
#define _CPUC const unsigned char *
#define _PC char *
#define _CRPC _CONST_RETURN char *
#define _CPC const char *
#define _UI unsigned int
/* String functions */
__inline _CRPC _tcschr(_In_z_ _CPC _s1,_In_ _UI _c) {return (_CRPC)_mbschr((_CPUC)_s1,_c);}
Why is _tcschr() complaining that a parameter is not a const char*, when it is?
In C++, _tcschr() is overloaded to take either a TCHAR* or a const TCHAR* as input. Since optstring is a const TCHAR*, the const overload is selected, and its const TCHAR* result cannot be assigned to a non-const TCHAR*. To get a non-const TCHAR* back, you have to call the non-const overload, which means casting the const away from optstring, e.g.:
TCHAR *temp = _tcschr(const_cast<TCHAR*>(optstring), c);
Or else define _CONST_RETURN, per the documentation:
In C, these functions take a const pointer for the first argument. In C++, two overloads are available. The overload taking a pointer to const returns a pointer to const; the version that takes a pointer to non-const returns a pointer to non-const. The macro _CRT_CONST_CORRECT_OVERLOADS is defined if both the const and non-const versions of these functions are available. If you require the non-const behavior for both C++ overloads, define the symbol _CONST_RETURN.
The C++ Primer book explains type aliases with this example:
typedef char *pstring;
const pstring cstr = 0; // cstr is a constant pointer to char
They say that the following is a wrong interpretation:
const char *cstr = 0;
However, it makes sense to me to replace the typedef alias with its original meaning.
In a normal scenario without type aliasing, a constant pointer is defined as:
char *const cstr = 0;
Why is it a constant pointer rather than a pointer to const?
Can anyone explain this in clear terms? The book doesn't seem to clarify it much.
2 * 3 + 1 is 7. But how come if I do int i = 3 + 1; and then 2 * i it gives 8? Shouldn't the variable be replaced with its original meaning?
It's because 2 * 3 + 1 is interpreted as (2 * 3) + 1, while 2 * i is the same as 2 * (3 + 1). These mean different things and work out to different numbers. When you give 3 + 1 a name, when you use the name it doesn't break up the number back into 3 + 1 in order to only multiply the 3.
The reason that const char * is different from const pstring is very similar. const char * is interpreted as (const char) * i.e. a pointer to a constant char. But const pstring is the same as const (char *) i.e. a constant pointer to a char. pstring is a whole type by itself, and when you do const pstring it doesn't split up the char * in order to make the char part const.
Note: if you did #define pstring char * then const pstring would be the same as const char *, because macros (#defines) are just treated as text replacements.
I want to ask how to use boost::iostreams::mapped_file with wchar_t *. Currently, I found the following:
boost::iostreams::mapped_file reader("input.txt" , mapped_file::readonly);
char const * it = reader.const_begin();
char const * endit = reader.const_end();
As far as I can see, the interface only allows char *, but I need to read a Vietnamese corpus (encoded as UTF-16 LE). I ask because I used this approach in all my previous assignments, so I would like to know whether there is some way to reuse that code.
Where can I find documentation for MiniUPnP?
For example, documentation that explains what the parameters of this function are.
LIBSPEC int
UPNP_AddPortMapping(const char * controlURL, const char * servicetype,
const char * extPort,
const char * inPort,
const char * inClient,
const char * desc,
const char * proto,
const char * remoteHost,
const char * leaseDuration);
These are methods defined by the UPnP standard.
Have a look at http://upnp.org/specs/gw/UPnP-gw-WANIPConnection-v2-Service.pdf
for UPNP_AddPortMapping.
To see more specifically how to use them with libminiupnpc, see the upnpc.c sample program.
https://github.com/miniupnp/miniupnp/blob/master/miniupnpc/upnpc.c