I am looking for a template literals feature like the one introduced in ES6 JavaScript. Is there something comparable in C++?
JavaScript:
for (let i = 0; i < 10; i++) {
console.log(`Liftoff in ${i} seconds`)
}
I am looking for a clean way to iterate through several directories using a for loop.
If you have C++20 available, you could use std::format(). Here's a usage example from the linked page:
#include <iostream>
#include <format>
int main() {
std::cout << std::format("Hello {}!\n", "world");
}
If you don't have C++20 yet, Boost offers a similar feature in Boost.Format.
I am modifying a Node native extension that spawns native threads to do some processing. I'd like the JavaScript code to provide a filter so that the processing can exclude some data.
At this point, I'm passing a JS RegExp string from JS to C++, creating a std::regex instance from it, and passing it around the different structures down to the native thread logic.
My issue now is that despite std::regex using what seems to be the same syntax as ECMAScript regular expressions, the behavior is not the same :(
My original plan was to rely on V8's RegExp engine somehow but trigger the C++ bits directly instead of going from C++ to JS and back. But I wasn't able to find how to do this.
As an example, see the following programs, which use the same regex but yield different results:
#include <stdio.h>
#include <regex>
#include <string>
int main() {
    std::regex re("^(?:(?:(?!(?:\\/|^)\\.).)*?\\/c)$");
    std::smatch match;
    std::string input("a.b/c");
    // regex_match returns a bool: true only if the whole input matches
    bool result = std::regex_match(input, match, re);
    if (result) {
        printf("ok");
    } else {
        printf("nok");
    }
    return 0;
}
The equivalent JS code:
const re = new RegExp("^(?:(?:(?!(?:\\/|^)\\.).)*?\\/c)$");
const match = re.exec("a.b/c");
if (match) {
console.log("ok");
} else {
console.log("nok");
}
My question then is: What can I do to get the same results I would in JS but in C++? Is it possible to run V8's RegExp from a pure C++ context?
In a data processing project, I need to detect and split words in Chinese (words in Chinese are not separated by spaces).
Is there a way to detect Chinese characters using a native C++ feature or the Boost.Locale library?
Generally speaking, if you want full Unicode support in C++, there is little to no way around ICU. Boost provides some access to its features (through Boost.Locale and Boost.Regex), but it requires Boost to be compiled with ICU support for this. So instead of making sure the Boost of the target platform is compiled thusly you are probably better off using the ICU API directly.
If you are looking for word boundaries, icu::BreakIterator (more specifically, icu::BreakIterator::createWordInstance) is the starting point. You then pass the text to be iterated over via setText and move the iterator via next et al. (yes, ICU is a bit non-idiomatic this way, as it originated in Java land).
Alternatively, if you don't want to go for the full C++ API, there's ublock_getCode which will tell you the UBlockCode of the code point in question.
Here is my attempt using only boost and standard library:
#include <iostream>
#include <string>
#include <boost/regex/pending/unicode_iterator.hpp>
#include <functional>
#include <algorithm>
using Iter = boost::u8_to_u32_iterator<std::string::const_iterator>;
template <::boost::uint32_t a, ::boost::uint32_t b>
class UnicodeRange
{
static_assert(a <= b, "Proper range");
public:
constexpr bool operator()(::boost::uint32_t x) const noexcept
{
return x >= a && x <= b;
}
};
using UnifiedIdeographs = UnicodeRange<0x4E00, 0x9FFF>;
using UnifiedIdeographsA = UnicodeRange<0x3400, 0x4DBF>;
using UnifiedIdeographsB = UnicodeRange<0x20000, 0x2A6DF>;
using UnifiedIdeographsC = UnicodeRange<0x2A700, 0x2B73F>;
using UnifiedIdeographsD = UnicodeRange<0x2B740, 0x2B81F>;
using UnifiedIdeographsE = UnicodeRange<0x2B820, 0x2CEAF>;
using CompatibilityIdeographs = UnicodeRange<0xF900, 0xFAFF>;
using CompatibilityIdeographsSupplement = UnicodeRange<0x2F800, 0x2FA1F>;
constexpr bool isChinese(::boost::uint32_t x) noexcept
{
    return UnifiedIdeographs{}(x)
        || UnifiedIdeographsA{}(x) || UnifiedIdeographsB{}(x) || UnifiedIdeographsC{}(x)
        || UnifiedIdeographsD{}(x) || UnifiedIdeographsE{}(x)
        || CompatibilityIdeographs{}(x) || CompatibilityIdeographsSupplement{}(x);
}
int main()
{
    std::string s;
    while (std::getline(std::cin, s))
    {
        // Print the first run of consecutive Chinese characters on each line
        auto start = std::find_if(Iter{s.cbegin()}, Iter{s.cend()}, isChinese);
        auto stop = std::find_if_not(start, Iter{s.cend()}, isChinese);
        std::cout << std::string{start.base(), stop.base()} << '\n';
    }
    return 0;
}
https://wandbox.org/permlink/FtxKa8D2LtR3ko9t
You should be able to polish that approach into something fully functional.
I do not know how to cover this properly with tests, and I am not sure which characters should be included in this check.
Can anyone give me an example of how I can use segmented stacks with boost coroutines? Do I have to annotate every function that is called from the coroutine with a special split-stack attribute?
When I try and write a program that should use segmented stacks, it just segfaults.
Here is what I have done so far: https://wandbox.org/permlink/TltQwGpy4hRoHgDY
The code segfaults very quickly, after 35 iterations. If segmented stacks were in use, I would expect it to handle more iterations.
#include <boost/coroutine2/all.hpp>
#include <iostream>
#include <array>
using std::cout;
using std::endl;
class Int {
int a{2};
};
void foo(int num) {
cout << "In iteration " << num << endl;
std::array<Int, 1000> arr;
static_cast<void>(arr);
foo(num + 1);
}
int main() {
using Coroutine_t = boost::coroutines2::coroutine<int>::push_type;
auto coro = Coroutine_t{[&](auto& yield) {
foo(yield.get());
}};
coro(0);
}
Compiling that code with -fsplit-stack solves the problem. Annotations are not required. All functions are by default treated as split stacks. Example - https://wandbox.org/permlink/Pzzj5gMoUAyU0h7Q
Easy as that.
Compile Boost (Boost.Context and Boost.Coroutine) with the b2 property segmented-stacks=on; this enables special code inside Boost.Coroutine and Boost.Context.
Your application has to be compiled with -DBOOST_USE_SEGMENTED_STACKS and -fsplit-stack (required by the Boost.Coroutine headers).
See the documentation: http://www.boost.org/doc/libs/1_65_1/libs/coroutine/doc/html/coroutine/stack/segmented_stack_allocator.html
Boost.Coroutine contains an example that demonstrates segmented stacks (in the directory coroutine/example/asymmetric/, call b2 toolset=gcc segmented-stacks=on).
Please note: while LLVM supports segmented stacks, clang seems not to provide the __splitstack_<xyz> functions.
I have 4000 strings and I want to create a perfect hash table with these strings. The strings are known in advance, so my first idea was to use a series of if statements:
if (name=="aaa")
return 1;
else if (name=="bbb")
return 2;
.
.
.
// 4000th `if' statement
However, this would be very inefficient. Is there a better way?
gperf is a tool that does exactly that:
GNU gperf is a perfect hash function generator. For a given list of strings, it produces a hash function and hash table, in form of C or C++ code, for looking up a value depending on the input string. The hash function is perfect, which means that the hash table has no collisions, and the hash table lookup needs a single string comparison only.
According to the documentation, gperf is used to generate the reserved keyword recogniser for lexers in GNU C, GNU C++, GNU Java, GNU Pascal, GNU Modula 3, and GNU indent.
The way it works is described in GPERF: A Perfect Hash Function Generator by Douglas C. Schmidt.
Better late than never; I believe this now finally answers the OP's question:
Simply use https://github.com/serge-sans-paille/frozen, a compile-time (constexpr) library of immutable containers for C++ (using a "perfect hash" under the hood).
In my tests, it performed on par with GNU's well-known gperf perfect hash C code generator.
In terms of your pseudo-code:
#include <frozen/unordered_map.h>
#include <frozen/string.h>
constexpr frozen::unordered_map<frozen::string, int, 2> olaf = {
{"aaa", 1},
{"bbb", 2},
.
.
.
// 4000th element
};
return olaf.at(name);
This responds in O(1) time rather than the OP's O(n) (O(n) assuming the compiler doesn't optimize the if chain, which it might do).
Since the question is still unanswered and I'm about to add the same functionality to my HFT platform, I'll share my inventory of perfect hash algorithms in C++. It is harder than I thought to find an open, flexible and bug-free implementation, so I'm sharing the ones I haven't dropped yet:
The CMPH library, with a collection of papers and such algorithms -- https://git.code.sf.net/p/cmph/git
BBHash, one more implementation from a paper's author -- https://github.com/rizkg/BBHash
Ademakov's -- another implementation from the paper above -- https://github.com/ademakov/PHF
wahern/phf -- I'm currently inspecting this one and trying to solve some allocation bugs it has when dealing with C++ Strings on huge key sets -- https://github.com/wahern/phf.git
emphf -- seems unmaintained -- https://github.com/ot/emphf.git
I believe @NPE's answer is very reasonable, and I doubt it is too much for your application as you seem to imply.
Consider the following example: suppose you have your "engine" logic (that is: your application's functionality) contained in a file called engine.hpp:
// this is engine.hpp
#pragma once
#include <iostream>
void standalone() {
std::cout << "called standalone" << std::endl;
}
struct Foo {
static void first() {
std::cout << "called Foo::first()" << std::endl;
}
static void second() {
std::cout << "called Foo::second()" << std::endl;
}
};
// other functions...
and suppose you want to dispatch the different functions based on the map:
"standalone" dispatches void standalone()
"first" dispatches Foo::first()
"second" dispatches Foo::second()
# other dispatch rules...
You can do that using the following gperf input file (I called it "lookups.gperf"):
%{
#include "engine.hpp"
struct CommandMap {
const char *name;
void (*dispatch) (void);
};
%}
%ignore-case
%language=C++
%define class-name Commands
%define lookup-function-name Lookup
struct CommandMap
%%
standalone, standalone
first, Foo::first
second, Foo::second
Then you can use gperf to create a lookups.hpp file using a simple command:
gperf -tCG lookups.gperf > lookups.hpp
Once I have that in place, the following main subroutine will dispatch commands based on what I type:
#include <iostream>
#include <string>
#include "engine.hpp" // this is my application engine
#include "lookups.hpp" // this is gperf's output
int main() {
std::string command;
while(std::cin >> command) {
auto match = Commands::Lookup(command.c_str(), command.size());
if(match) {
match->dispatch();
} else {
std::cerr << "invalid command" << std::endl;
}
}
}
Compile it:
g++ main.cpp -std=c++11
and run it:
$ ./a.out
standalone
called standalone
first
called Foo::first()
Second
called Foo::second()
SECOND
called Foo::second()
first
called Foo::first()
frst
invalid command
Notice that once you have generated lookups.hpp, your application has no dependency whatsoever on gperf.
Disclaimer: I took inspiration for this example from this site.
I would like to have an easy to use way to write code like:
#include <iostream>
int main (){
std::cout << "hello, world!\n";
}
but that supports i18n. Here is an example using gettext():
#include <libintl.h>
#include <iostream>
int main (){
std::cout << gettext("hello, world!\n");
}
This can then be processed by xgettext to produce a message catalog file that can be used
by translators to create various versions. These extra files can be handled on target
systems to allow the user to interact in a preferred language.
I would like to write the code something like this instead:
#include <i18n-iostream>
int main (){
i18n::cout << "hello, world!\n";
}
At build time the quoted strings would be examined by a program like xgettext to produce the
base message catalog file. << operator with argument i18n::cout would take a string
literal as the key to lookup the run-time text to use from a message catalog.
Does it exist somewhere?
At build time the quoted strings would be examined by a program like xgettext to produce the base message catalog file. << operator with argument i18n::cout would take a string literal as the key to lookup the run-time text to use from a message catalog.
You are trying to treat each string as a single, self-contained unit, but it isn't.
The point is that you don't want something like this. Think of:
if (n == 1)
    i18n::cout << "I need one apple";
else
    i18n::cout << "I need " << n << " apples";
This would not work because a test like "n == 1" versus "n != 1" is only valid for English; many other languages have more than one plural form. It also requires "I need X apples" to be translated as a single unit.
I suggest you just learn to deal with gettext; it is quite simple and powerful, and many people have thought about these problems.
Another point: you usually do not call gettext directly, but through a short alias:
#include <libintl.h>
#include <iostream>
#define _(x) gettext(x)
int main (){
std::cout << _("hello, world!\n");
}
This makes the code much cleaner; using "_" as an alias for gettext is also a fairly standard convention.
Just learn how to use it before you try to make a "nicer" API. It is worth mentioning that the gettext API is a de facto standard for many languages, not only C.
The short answer is "No" :)
Seriously, which aspects of internationalization are you interested in? ICU provides pretty much everything but does not feel like standard C++. There are other libraries, smaller in scope, that provide some i18n functionality, e.g. UTF-CPP for handling UTF-8 encoded strings.
Personally I would go with this answer, but it might be possible to use a bit of streambuf magic to do this as the text is written to the stream. If you're really interested in doing this though, please take a look at Standard C++ IOStreams and Locales by Langer and Kreft, it's the bible of iostreams.
The following assumes that everything written to the buffer is to be translated, and that each full line can be translated completely:
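// Placeholder translation function: it returns its input unchanged.
// A real implementation would look the text up in a message catalog.
std::string xgettext (std::string const & s)
{
    return s;
}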
The following transbuf class overrides the "overflow" function and
translates the buffer every time it sees a newline.
class transbuf : public std::streambuf {
public:
    transbuf (std::streambuf * realsb) : std::streambuf (), m_realsb (realsb)
    , m_buf () {}
    ~transbuf () {
        // ... flush m_buf if necessary
    }
    virtual std::streambuf::int_type overflow (std::streambuf::int_type c) {
        // EOF carries no character to buffer; just report success
        if (traits_type::eq_int_type (c, traits_type::eof ()))
            return traits_type::not_eof (c);
        m_buf.push_back (traits_type::to_char_type (c));
        if (c == '\n') {
            // We have a complete line, translate it and write it to our stream:
            std::string transtext = xgettext (m_buf);
            for (std::string::const_iterator i = transtext.begin ()
            ; i != transtext.end ()
            ; ++i) {
                m_realsb->sputc (*i);
                // ... check that overflow returned the correct value...
            }
            m_buf.clear ();
        }
        return c;
    }
    std::streambuf * get () { return m_realsb; }
    // data
private:
    std::streambuf * m_realsb;
    std::string m_buf;
};
And here's an example of how that might be used:
int main ()
{
transbuf * buf = new transbuf (std::cout.rdbuf ());
std::ostream trans (buf);
trans << "Hello"; // Added to m_buf
trans << " World"; // Added to m_buf
trans << "\n"; // Causes m_buf to be written
trans << "Added to buffer\neach new line causes\n"
"the string to be translated\nand written" << std::endl;
delete buf;
}
You mean you just want another API? You could write a small wrapper, shouldn't be too hard and it would give you the possibility to use the best API you can think of :)