range-v3: Joining piped ranges with a delimiter - c++

I'm trying to build a basic demo of the range-v3 library: take some integers, filter out odd values, stringify them, then join those into a comma-separated list. For example, { 8, 6, 7, 5, 3, 0, 9 } becomes "8, 6, 0". From reading the docs and going through examples, it seems like the naïve solution would resemble:
string demo(const vector<int>& v)
{
return v |
ranges::view::filter([](int i) { return i % 2 == 0; }) |
ranges::view::transform([](int i) { return to_string(i); }) |
ranges::view::join(", ");
}
but building on Clang 7 fails with a static assertion: "Cannot get a view of a temporary container". Since I'm collecting the result into a string, I can use the eager version - action::join - instead:
string demo(const vector<int>& v)
{
return v |
ranges::view::filter([](int i) { return i % 2 == 0; }) |
ranges::view::transform([](int i) { return to_string(i); }) |
ranges::action::join;
}
but the eager version doesn't seem to have an overload that takes a delimiter.
Interestingly, the original assertion goes away if you collect join's inputs into a container first. The following compiles and runs fine:
string demo(const vector<int>& v)
{
vector<string> strings = v |
ranges::view::filter([](int i) { return i % 2 == 0; }) |
ranges::view::transform([](int i) { return to_string(i); });
return strings | ranges::view::join(", ");
}
but this totally defeats the principle of lazy evaluation that drives so much of the library.
Why is the first example failing? If it's not feasible, can action::join be given a delimiter?

action::join should accept a delimiter. Feel free to file a feature request. The actions need a lot of love.
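Until action::join accepts a delimiter, one workaround is to do the whole pipeline eagerly by hand. The following is only a minimal sketch using the standard library rather than range-v3: it folds the filter, transform and join steps into a single loop, so no intermediate vector<string> is needed. Nothing here is a range-v3 API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Keeps the even values of `v`, stringifies them, and joins them with ", ".
// Plain standard C++; this is a hand-written stand-in, not range-v3 code.
std::string demo(const std::vector<int>& v)
{
    std::string result;
    bool first = true;
    for (int i : v) {
        if (i % 2 != 0) {
            continue;  // filter: skip odd values
        }
        if (!first) {
            result += ", ";  // delimiter goes between elements only
        }
        result += std::to_string(i);  // transform: int -> string
        first = false;
    }
    return result;
}
```

With the input from the question, { 8, 6, 7, 5, 3, 0, 9 }, this returns "8, 6, 0".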

Tweaking clang-format for C++20 ranges pipelines

C++20 (and 23 with std::ranges::to<T>()) makes idiomatic the use of operator| to make a pipeline of transformations such as this:
return numbers
| std::views::filter([](int n) { return n % 2 == 0; })
| std::views::transform([](int n) { return n * 2; })
| std::ranges::to<std::vector>();
With my project's current .clang-format, that looks something like
return numbers | std::views::filter([](int n) { return n % 2 == 0; }) |
std::views::transform([](int n) { return n * 2; }) | std::ranges::to<std::vector>();
which I find pretty hard to read. If I set BreakBeforeBinaryOperators: All I get
return numbers | std::views::filter([](int n) { return n % 2 == 0; })
| std::views::transform([](int n) { return n * 2; }) | std::ranges::to<std::vector>();
which is better, but I'd really like the original version with one pipeline operation on each line.
I can adjust the column limit, but that is a major change and also starts to line-break my lambdas, which I don't like:
return numbers | std::views::filter([](int n) {
return n % 2 == 0;
})
| std::views::transform(
[](int n) { return n * 2; })
| std::ranges::to<std::vector>();
I can manually use empty comments to force a newline:
return numbers //
| std::views::filter([](int n) { return n % 2 == 0; }) //
| std::views::transform([](int n) { return n * 2; }) //
| std::ranges::to<std::vector>();
but again, not ideal knowing that pipelines will be pretty common. Am I missing settings? Or is this more of a feature request I should direct to clang-format, like "Add an option so when more than n operator| appears in an expression, put each subexpression on its own line."
There's a feature request for AllowBreakingBinaryOperators. Until that feature lands, you can only compromise:
As you've said, use // comments to force line breaks.
Use clang-format off/on to disable clang-format and format it yourself.
Here's a more complex solution which combines both:
void function() {
return numbers | std::views::filter([](int n) { return n % 2 == 0; })
| std::views::transform([](int n) { return n * 2; })
| std::views::take(3) | std::ranges::to<std::vector>();
}
First, use // to split and then clang-format the code.
void function() {
return numbers
//
| std::views::filter([](int n) { return n % 2 == 0; })
| std::views::transform([](int n) { return n * 2; })
| std::views::take(3)
//
| std::ranges::to<std::vector>();
}
Next, remove //, use clang-format off/on to disable clang-format.
void function() {
// clang-format off
return numbers
| std::views::filter([](int n) { return n % 2 == 0; })
| std::views::transform([](int n) { return n * 2; })
| std::views::take(3)
| std::ranges::to<std::vector>();
// clang-format on
}
As for matrix, the option AlignArrayOfStructures might help.

Making groups (combinations) of objects using their min/max values

First of all, this is my first question, you can tell me how to improve it and what tags to use.
I have a bunch of objects, each with a minimum and a maximum value. From those values you can tell whether two objects overlap, and objects that overlap can be put together in a group.
This question might need dynamic programming to solve.
example objects:
1 ( min: 0, max: 2 )
2 ( min: 1, max: 3 )
3 ( min: 2, max: 4 )
4 ( min: 3, max: 5 )
object 1 can be grouped with objects 2, 3
object 2 can be grouped with objects 1, 3, 4
object 3 can be grouped with objects 1, 2, 4
object 4 can be grouped with objects 2, 3
as you can see there are multiple ways to group those elements
[1, 2]
[3, 4]
[1]
[2, 3]
[4]
[1]
[2, 3, 4]
[1, 2, 3]
[4]
now there should be some sort of rule to deduce which of the solutions is the best solution
for example least amount of groups
[1, 2]
[3, 4]
or
[1]
[2, 3, 4]
or
[1, 2, 3]
[4]
or most objects in one group
[1]
[2, 3, 4]
or
[1, 2, 3]
[4]
or any other rule that uses another attribute of said objects to compare the solutions
what I have now:
$objects = [...objects...];
$numberOfObjects = count($objects);
$groups = [];
for ($i = 0; $i < $numberOfObjects; $i++) {
$MinA = $objects[$i]['min'];
$MaxA = $objects[$i]['max'];
$groups[$i] = [$i];
for ($j = $i + 1; $j < $numberOfObjects; $j++) {
$MinB = $objects[$j]['min'];
$MaxB = $objects[$j]['max'];
if (($MinA >= $MinB && $MinA <= $MaxB) || ($MaxA >= $MinB && $MaxA <= $MaxB) || ($MinB >= $MinA && $MinB <= $MaxA)) {
array_push($groups[$i], $j);
}
}
}
This basically creates an array with the indexes of objects that can be grouped together.
From this point, I don't know how to proceed: how to generate all the solutions, check how good each of them is, and then pick the best one.
Or maybe there is an even better solution that doesn't use any of this?
PHP solutions are preferred, although this problem is not PHP-specific.
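As a side note on the overlap condition in the loop above: assuming every object satisfies min <= max and the ranges are closed, the three-clause test can be reduced to two comparisons. The sketch below is in C++ rather than PHP, and the function names are made up for illustration; it keeps the original condition next to the shorter one so the two can be compared.

```cpp
#include <cassert>

// Closed ranges [minA, maxA] and [minB, maxB] overlap exactly when each one
// starts no later than the other one ends (assumes minA <= maxA, minB <= maxB).
bool overlaps(int minA, int maxA, int minB, int maxB)
{
    return minA <= maxB && minB <= maxA;
}

// The three-clause condition from the question, kept only for comparison.
bool overlapsOriginal(int minA, int maxA, int minB, int maxB)
{
    return (minA >= minB && minA <= maxB) ||
           (maxA >= minB && maxA <= maxB) ||
           (minB >= minA && minB <= maxA);
}
```

For the example objects, 1 (0..2) and 3 (2..4) overlap because they share the value 2, while 1 (0..2) and 4 (3..5) do not.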
When I was first looking at your algorithm, I was impressed by how efficient it is :)
Here it is rewritten in JavaScript, because I moved away from PHP a good while ago:
function setsOf(objects) {
  const numberOfObjects = objects.length
  const groups = []
  for (let i = 0; i < numberOfObjects; i++) {
    const MinA = objects[i]['min']
    const MaxA = objects[i]['max']
    groups[i] = [i] // every object starts in a group of its own
    for (let j = i + 1; j < numberOfObjects; j++) {
      const MinB = objects[j]['min']
      const MaxB = objects[j]['max']
      if ((MinA >= MinB && MinA <= MaxB) || (MaxA >= MinB && MaxA <= MaxB) ||
          (MinB >= MinA && MinB <= MaxA)) {
        groups[i].push(j) // object j overlaps object i
      }
    }
  }
  return groups
}
if you happen to also think well in javascript, you might find this form more direct (it is identical, however):
function setsOf(objects) {
  const groups = []
  objects.forEach((left, i) => {
    groups[i] = [i]
    // only compare against the objects that come after index i
    objects.slice(i + 1).forEach((right, j) => {
      if ((left.min >= right.min && left.min <= right.max) ||
          (left.max >= right.min && left.max <= right.max) ||
          (right.min >= left.min && right.min <= left.max))
        groups[i].push(j + i + 1)
    })
  })
  return groups
}
so if we run it, we get:
a = setsOf([{min:0, max:2}, {min:1, max:3}, {min:2, max:4}, {min:3, max: 5}])
[Array(3), Array(3), Array(2), Array(1)]
JSON.stringify(a)
"[[0,1,2],[1,2,3],[2,3],[3]]"
and it does impressively catch the compound groups :) A weakness is that it captures groups containing more objects than necessary, without capturing all available objects. You seem to have very custom selection criteria. To me, it seems like the groups should either be every last intersecting subset, or only subsets where each element in the group provides unique coverage: [0,1], [0,2], [1,2], [1,3], [2,3], [0,1,3]
The algorithm for that is perhaps more involved. This was my approach, and it is nowhere near as terse and elegant as yours, but it works:
function intersectingGroups (mmvs) {
const min = []
const max = []
const muxo = [...mmvs]
mmvs.forEach(byMin => {
mmvs.forEach(byMax => {
if (byMin.min === byMax.min && byMin.max === byMax.max) {
console.log('rejecting identity', byMin, byMax)
return // identity
}
if (byMax.min > byMin.max) {
console.log('rejecting non-overlapping objects', byMin, byMax)
return // non-overlapping objects
}
if ((byMax.max <= byMin.max) || (byMin.min >= byMax.min)) {
console.log('rejecting non-expansive coverage or inversed order',
byMin, byMax)
return // non-expansive coverage or inversed order
}
const entity = {min: byMin.min, max: byMax.max,
compositeOf: [byMin, byMax]}
if(muxo.some(mv => mv.min === entity.min && mv.max === entity.max))
return // enforcing Set
muxo.push(entity)
console.log('adding', byMin, byMax, muxo)
})
})
if(muxo.length === mmvs.length) {
return muxo.filter(m => 'compositeOf' in m)
// solution
} else {
return intersectingGroups(muxo)
}
}
now there should be some sort of rule to deduce which of the solutions is the best solution
Yeah, so, usually for puzzles or for a specification you are fulfilling, that would be given as part of the problem. As it is, you want a general method that is adaptable. It's probably best to make an object that can be configured with the results and accepts rules, then load the rules you are interested in, and the results from the search, and see what rules match where. For example, using your algorithm and sample criteria:
least amount of groups
start with code like:
let reviewerFactory = {
getReviewer (specification) { // generate a reviewer
return {
matches: [], // place to load sets to
criteria: specification,
review (objects) { // review the sets already loaded
let group
let results = {}
this.matches.forEach(mset => {
group = [] // gather each object from the initial set for each match in the result set
mset.forEach(m => {
group.push(objects[m])
})
results[mset] = this.criteria.scoring(group) // score the match relative to the specification
})
return this.criteria.evaluation(results) // pick the best score
}
}
},
specifications: {}
}
now you can add specifications like this one for least amount of groups:
reviewerFactory.specifications['LEAST GROUPS'] = {
scoring: function (set) { return set.length },
evaluation: function (res) { return Object.keys(res).sort((a,b) => res[a] - res[b])[0] }
}
then you can use that in the evaluation of a set:
mySet = [{min:0, max:2}, {min:1, max:3}, {min:2, max:4}, {min:3, max: 5}]
rf = reviewerFactory.getReviewer(reviewerFactory.specifications['LEAST GROUPS'])
Object {matches: Array(0), criteria: Object, review: function}
rf.matches = setsOf(mySet)
[Array(3), Array(3), Array(2), Array(1)]
rf.review(mySet)
"3"
or, most objects:
reviewerFactory.specifications['MOST GROUPS'] = {
scoring: function (set) { return set.length },
evaluation: function (res) { return Object.keys(res).sort((a,b) => res[a] - res[b]).reverse()[0] }
}
mySet = [{min:0, max:2}, {min:1, max:3}, {min:2, max:4}, {min:3, max: 5}]
reviewer = reviewerFactory.getReviewer(reviewerFactory.specifications['MOST GROUPS'])
reviewer.matches = setsOf(mySet)
reviewer.review(mySet)
"1,2,3"
Of course this is arbitrary, but so are the criteria, by definition in the OP. Likewise, you would have to change the algorithms here to work with my intersectingGroups function because it doesn't return indices. But this is what you are looking for I believe.
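For the "least amount of groups" rule specifically, exhaustive generation may not be needed. The sketch below rests on an assumption that the question does not state outright: a group is valid only when all of its members overlap pairwise, which for closed ranges means they share at least one common value. Under that assumption a greedy sweep over the objects sorted by max gives a minimum number of groups. It is written in C++ rather than PHP or JavaScript, and Interval and fewestGroups are illustrative names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

struct Interval {
    int min;
    int max;
};

// Returns groups of indices into `objects`. All members of a group contain
// one common value, and the number of groups is as small as possible under
// the pairwise-overlap assumption described above.
std::vector<std::vector<std::size_t>> fewestGroups(const std::vector<Interval>& objects)
{
    // Visit the objects in order of increasing max.
    std::vector<std::size_t> order(objects.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
        return objects[a].max < objects[b].max;
    });

    std::vector<std::vector<std::size_t>> groups;
    std::vector<bool> used(objects.size(), false);
    for (std::size_t k = 0; k < order.size(); ++k) {
        if (used[order[k]]) {
            continue;
        }
        // The smallest max still ungrouped is the value the new group shares.
        const int point = objects[order[k]].max;
        std::vector<std::size_t> group;
        for (std::size_t m = k; m < order.size(); ++m) {
            const std::size_t idx = order[m];
            // Every remaining object has max >= point, so min <= point
            // means that it contains `point`.
            if (!used[idx] && objects[idx].min <= point) {
                used[idx] = true;
                group.push_back(idx);
            }
        }
        groups.push_back(group);
    }
    return groups;
}
```

For the four sample objects this yields the zero-based groups [0, 1, 2] and [3], which is the [1, 2, 3] / [4] solution from the question. Other rules, such as "most objects in one group", would still need a separate scoring step like the reviewer above.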

C++ to VBA (Excel)

So, basically, in Excel, I have 4 columns of data (all with strings) that I want to process, and want to have the results in another column, like this (never mind the square brackets, they just represent cells):
Line Column1 Column2 Column3 Column4 Result
1: [a] [b] [k] [YES] [NO]
2: [a] [c] [l] [YES] [NO]
3: [b] [e] [] [YES] [NO]
4: [c] [e] [f] [NO] [NO]
5: [d] [h] [b] [NO] [NO]
6: [d] [] [w] [NO] [NO]
7: [e] [] [] [YES] [NO]
8: [j] [m] [] [YES] [YES]
9: [j] [] [] [YES] [YES]
10: [] [] [] [YES] [YES]
The process that I want the data to go through is this:
Assume that CheckingLine is the Line for which I currently want to calculate the value of Result, and that CurrentLine is any Line (except CheckingLine) that I am using to calculate the value of Result, at a given moment.
1. If Column4[CheckingLine] is "NO", Result is "NO" (simple enough, no help needed);
Example: CheckingLine = 4 -> Column4[4] = "NO" -> Result = "NO";
2. Else, I want to make sure that all Lines that share a common value with CheckingLine (in any Column between 1 and 3) also have Column4 as "YES" (doing that would be simple enough even without VBA - in fact, I started by doing it in plain Excel and realised that it wasn't what I wanted) - if that happens, Result is "YES";
Example: CheckingLine = 8 -> Only shared value is "j" -> CurrentLine = 9 -> Column4[9] = "YES" -> Result = "YES";
3. Here's the tricky part: If one of those lines has any value (again, in any Column between 1 and 3) that IS NOT shared with CheckingLine, I want to do the whole process (restart at 1.), but checking the CurrentLine instead.
Example: CheckingLine = 2, "a" is shared with Line 1, "c" is shared with Line 4 -> CurrentLine = 1 -> Column4[1] = "YES", but "b" and "k" are not shared with CheckingLine -> CheckingLine' = 1 -> "b" is shared with Line 5 -> Column4[5] = "NO" -> Result = "NO";
I have written the corresponding C++ code, which works. It could have been in any other language; C++ was just the one I was using at the moment. The code HAS NOT been optimized in any way, because its purpose was to be AS CLEAR about its functionality AS POSSIBLE. The table above is the actual result of running it:
#include <iostream>
#include <string>
#include <vector>
std::vector<std::string> column1, column2, column3, column4, contentVector;
unsigned int location, columnsSize;
void InsertInVector(std::string Content)
{
if(Content == "")
{
return;
}
for(unsigned int i = 0; i < contentVector.size(); i++)
{
if(contentVector[i] == Content)
{
return;
}
}
contentVector.push_back(Content);
}
std::string VerifyCurrentVector(unsigned int Start)
{
std::string result = "";
if(contentVector.size() == 0)
{
result = "YES";
}
else
{
unsigned int nextStart = contentVector.size();
for(unsigned int i = 0; i < columnsSize; i++)
{
if(i != location)
{
for(unsigned int j = Start; j < nextStart; j++)
{
if(column1[i] == contentVector[j])
{
InsertInVector(column2[i]);
InsertInVector(column3[i]);
}
else if(column2[i] == contentVector[j])
{
InsertInVector(column1[i]);
InsertInVector(column3[i]);
}
else if(column3[i] == contentVector[j])
{
InsertInVector(column1[i]);
InsertInVector(column2[i]);
}
}
}
}
if(nextStart == contentVector.size())
{
for(unsigned int i = 0; i < columnsSize; i++)
{
if(i != location)
{
for(unsigned int j = 0; j < nextStart; j++)
{
if(column1[i] == contentVector[j] || column2[i] ==
contentVector[j] || column3[i] == contentVector[j])
{
if(column4[i] == "NO")
{
result = "NO";
return result;
}
}
}
}
}
result = "YES";
}
else
{
result = VerifyCurrentVector(nextStart);
}
}
return result;
}
std::string VerifyCell(unsigned int Location)
{
std::string result = "";
location = Location - 1;
if(column4.size() < Location)
{
result = "Error";
}
else if(column4[location] == "NO")
{
result = "NO";
}
else
{
contentVector.clear();
InsertInVector(column1[location]);
InsertInVector(column2[location]);
InsertInVector(column3[location]);
result = VerifyCurrentVector(0);
}
return result;
}
void SetUpColumns(std::vector<std::string> &Column1, std::vector<std::string> &Column2,
std::vector<std::string> &Column3, std::vector<std::string> &Column4)
{
if(Column4.size() > Column1.size())
{
for(unsigned int i = Column1.size(); i < Column4.size(); i++)
{
Column1.push_back("");
}
}
if(Column4.size() > Column2.size())
{
for(unsigned int i = Column2.size(); i < Column4.size(); i++)
{
Column2.push_back("");
}
}
if(Column4.size() > Column3.size())
{
for(unsigned int i = Column3.size(); i < Column4.size(); i++)
{
Column3.push_back("");
}
}
column1 = Column1;
column2 = Column2;
column3 = Column3;
column4 = Column4;
columnsSize = Column4.size();
}
int main()
{
std::vector<std::string> Column1, Column2, Column3, Column4;
Column1.push_back("a");
Column1.push_back("a");
Column1.push_back("b");
Column1.push_back("c");
Column1.push_back("d");
Column1.push_back("d");
Column1.push_back("e");
Column1.push_back("j");
Column1.push_back("j");
Column2.push_back("b");
Column2.push_back("c");
Column2.push_back("e");
Column2.push_back("e");
Column2.push_back("h");
Column2.push_back("");
Column2.push_back("");
Column2.push_back("m");
Column3.push_back("k");
Column3.push_back("l");
Column3.push_back("");
Column3.push_back("f");
Column3.push_back("b");
Column3.push_back("w");
Column4.push_back("YES");
Column4.push_back("YES");
Column4.push_back("YES");
Column4.push_back("NO");
Column4.push_back("NO");
Column4.push_back("NO");
Column4.push_back("YES");
Column4.push_back("YES");
Column4.push_back("YES");
Column4.push_back("YES");
SetUpColumns(Column1, Column2, Column3, Column4);
std::cout << "Line\t" << "Column1\t" << "Column2\t" << "Column3\t" << "Column4\t" <<
std::endl;
for(unsigned int i = 0; i < Column4.size(); i++)
{
std::cout << i + 1 << ":\t" << "[" << column1[i] << "]\t[" << column2[i] <<
"]\t[" << column3[i] << "]\t[" << column4[i] << "]\t[" << VerifyCell(i + 1)
<< "]" << std::endl;
}
return 0;
}
So, after this lengthy explanation, what I want to know is this:
Is there any way to do this in Excel's VBA (or even better, in plain Excel without VBA)?
If not, how can I have my code (which I can easily translate to another C-like language and/or optimise) get the data from, and deliver the results to, Excel?
Is there any way to do this in Excel's VBA?
Yes, you can surely do this with VBA; it is a complete and powerful programming language.
(or even better, in plain Excel without VBA)?
Nope. The calculation seems too complicated to fit with Excel formulae without any VBA code.
If not, how can I have my code (which I can easily translate to another C-like language and/or optimise) get the data from, and deliver the results to, Excel?
You can access Excel from C++ in many ways. Using ATL is one of them. Another, easier way would be to import/export your Excel file in CSV format, which is easy to parse and write from C++.
Also consider C#, it has complete COM inter-operability to access office components.
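The CSV route could look roughly like this. It is a minimal sketch, assuming the sheet was saved as plain comma-separated text with exactly the four input columns and no quoted fields or embedded commas; Row, parseCsvLine and readCsv are illustrative names, not part of any library.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One worksheet row: Column1..Column4 from the question.
using Row = std::array<std::string, 4>;

// Splits one CSV line on commas. Simplifying assumption: no quoted fields.
// Cells that are missing at the end of the line are left empty.
Row parseCsvLine(const std::string& line)
{
    Row row;
    std::istringstream cells(line);
    std::string cell;
    for (std::size_t i = 0; i < row.size() && std::getline(cells, cell, ','); ++i) {
        row[i] = cell;
    }
    return row;
}

// Reads every line of `in` into a row of four cells.
std::vector<Row> readCsv(std::istream& in)
{
    std::vector<Row> rows;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();  // tolerate Windows line endings
        }
        rows.push_back(parseCsvLine(line));
    }
    return rows;
}
```

The rows could then be copied into the column1..column4 vectors of the original program, and the computed Result values written back out as a fifth comma-separated field for Excel to import.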
Ok, if you like to have "whipped the code in a rush" then you'll love VBA. Next time, please try to ask a more specific question. Based on your code and comments, @MikeAscended, you're a relatively good programmer, with a grasp of functions/recursion, variables/parameters, conditions, loops, data structures, etc. Re: "I have only touched VBA once in my life and ran away from it": my intent is to get you started and give you syntax here, not necessarily a working solution. I'm happy to answer any further specific questions you may have.
Strategy-wise,
I recommend plain VBA which is easy to use in Excel. Obviously your problem can be solved in many ways including formulas, however VBA is a powerful tool that any programmer will benefit from using.
Code-wise,
To start access the editor from Excel press [Alt-F11], or from Design Mode insert and double-click an ActiveX button. To run a macro press [Alt-F8], or in VBA click the green play button.
One last note: if you want those line numbers in column 1 in Excel, then yours will become Column 2-5 or B-F. I'm assuming you'll use the row numbers in Excel so that Column 1 is A, but row 1 will still have titles, so you are starting your data on row 2.
sub processResults_Col5()
' Run This Script as Main()
dim rowCount as long, i as long 'rowCount = columnsSize
with sheets(1)
.Range("A1:D1") = Array("a", "b", "k", "YES")
' finish init here
' SetUpColumns not necessary in excel
if .cells(2,1).value <> "" then 'do not use .end(xldown) if data is missing
rowCount = .cells(1,1).end(xldown).row
for i = 1 to rowCount
.cells(i,5) = verifyCell(i + 1, rowCount)
next i
endif 'space will be added :p
end with
end sub
function verifyCell(rowLocation as long, size as long, optional wSh as excel.worksheet) as string
' the rest should be easy for you to figure out based on C-code
with wSh
if wsh is nothing then set wsh = activesheet 'let VBA capitalize stuff so you know you typed it correctly
if size < rowlocation then
verifyCell = "Error" 'the function name is the return value
'msgbox "Error" ' you can uncomment this line to see error
elseif cells(rowLocation, 4).value = "NO" then
cells(rowLocation, 5) = "NO" 'set result
else
call InsertInVector(rowLocation) 'CheckingLine
' edit the current rowLocation with for loops
verifyCell = VerifyCurrentVector(0) 'whatever you're doing here
endif
end with
end function
sub InsertInVector()
end sub
sub VerifyCurrentVector() 'function returns a value
end sub
Some tips:
Generally, Comment Your Code!
Generally, The first word/acronym of Variable and Object names should start in lowercase, then continue in camel-case. This helps distinguish them from library types.
In VBA always put [option explicit] in the beginning of every sheet/module, this requires you to [dim varName as Type] which will help debugging and make your code more explicit so it's easy to understand.
In VBA for numbers use type Long, learn early vs late-binding. If you're instantiating any object that requires a reference/library, always state it explicitly. This includes Excel.Worksheet, Excel.Workbook, etc. (eg. you may want your code in MS Access)
In Office, one of the first settings you're going to want to disable is the popup error window. Also use debug.print and the immediate box a few times.
Generally, as you know from C++, take your time and try to write correct code on your first try, as this will save you debugging time. Try not to rush, and keep coffee & healthy snacks on hand. Good luck and have fun :)

Iterate through multiple linked lists at the same time in c++

I am doing a task that requires calculating a metric from a linked list that contains multiple linked lists of char (each row is a single linked list, as shown in the graph). So I need to iterate through every node that contains a space, starting from the second row, to check how many spaces are surrounded by four other spaces (top, bottom, left, right). For instance, referring to the graph below, the second space in the third row is surrounded by four spaces, so count++. (The "H" simply means a non-space character; sorry that I don't have enough reputation to post a real picture.)
I am allowed to use the STL linked list library. I was trying to use three iterators to iterate through three rows at the same time. However, the code gets really messy and does not even work correctly, as each row has a different length. I have been thinking about the solution for two days, but as I've been practicing C++ for only two months, what I can think of is pretty limited. So I am wondering if anyone could give me a hint or a smarter solution, please.
space | space | --H -- | --H -- | -- H -- | NULL| NULL
--- H --| --H ---| space | space | --- H- | -- H -- | NULL
--- H --| space | space | space | -- H-- | space| NULL
space | --H -- | space | space | -- H -- | NULL | NULL
The following may help (in C++11):
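#include <algorithm> // std::all_of
#include <cstddef>   // std::size_t
#include <iterator>  // std::next, std::begin, std::end
#include <list>
std::size_t countSpaceSurroundBySpace(const std::list<std::list<char>>& l)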
{
if (l.size() < 3u) {
return 0u;
}
auto it1 = l.begin();
auto it2 = std::next(it1);
auto it3 = std::next(it2);
std::size_t count = 0u;
for (; it3 != l.end(); ++it1, ++it2, ++it3) {
// pointers on the 5 characters
std::list<char>::const_iterator its[5] = {
it1->begin(),
it2->begin(),
it2->begin(),
it2->begin(),
it3->begin()
};
if (its[0] == it1->end()) { continue; }
++its[0];
if (its[2] == it2->end()) { continue; }
++its[2];
++its[3];
if (its[3] == it2->end()) { continue; }
++its[3];
if (its[4] == it3->end()) { continue; }
++its[4];
for (; its[0] != it1->end() && its[3] != it2->end() && its[4] != it3->end();) {
if (std::all_of(std::begin(its), std::end(its), [](std::list<char>::const_iterator it) { return *it == ' '; })) {
++count;
}
for (auto& it : its) {
++it;
}
}
}
return count;
}
You can use a std::vector of std::list::iterators. You'd have an inner loop that just does whatever computation you want at each step, if the iterator isn't already at the corresponding list's end(). This will be a lot easier if your incoming lists are in turn provided as an array or a vector.
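One possible reading of that suggestion, sketched under assumptions: for each window of three consecutive rows, keep one cursor per row in a std::vector, advance all cursors one column at a time, and stop advancing a cursor once it reaches its own row's end(). The function name countSurroundedSpaces is made up for this example, and a missing neighbour (shorter row, first or last column) is treated as a non-space.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Counts spaces whose top, bottom, left and right neighbours are all spaces.
// Rows may have different lengths; a missing neighbour counts as non-space.
std::size_t countSurroundedSpaces(const std::list<std::list<char>>& rows)
{
    using Cursor = std::list<char>::const_iterator;
    if (rows.size() < 3) {
        return 0;
    }
    std::size_t count = 0;
    auto top = rows.begin();
    auto mid = std::next(top);
    auto bot = std::next(mid);
    for (; bot != rows.end(); ++top, ++mid, ++bot) {
        // One cursor per row of the window, all at the same column.
        std::vector<Cursor> cur{top->begin(), mid->begin(), bot->begin()};
        const std::vector<Cursor> end{top->end(), mid->end(), bot->end()};
        char left = 'H';  // nothing exists to the left of column 0
        while (cur[1] != end[1]) {
            const Cursor right = std::next(cur[1]);
            const bool surrounded =
                *cur[1] == ' ' && left == ' ' &&
                right != end[1] && *right == ' ' &&
                cur[0] != end[0] && *cur[0] == ' ' &&
                cur[2] != end[2] && *cur[2] == ' ';
            if (surrounded) {
                ++count;
            }
            left = *cur[1];
            // Advance every cursor that has not run off its own row yet.
            for (std::size_t k = 0; k < cur.size(); ++k) {
                if (cur[k] != end[k]) {
                    ++cur[k];
                }
            }
        }
    }
    return count;
}
```

On the four-row grid from the question this counts exactly one cell, the second space in the third row.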

Does Go have "if x in" construct similar to Python?

How can I check if x is in an array without iterating over the entire array, using Go? Does the language have a construct for this?
Like in Python:
if "x" in array:
# do something
There is no built-in operator to do it in Go. You need to iterate over the array. You can write your own function to do it, like this:
func stringInSlice(a string, list []string) bool {
for _, b := range list {
if b == a {
return true
}
}
return false
}
Or in Go 1.18 or newer, you can use slices.Contains (from golang.org/x/exp/slices; since Go 1.21 it is also in the standard library's slices package).
If you want to be able to check for membership without iterating over the whole list, you need to use a map instead of an array or slice, like this:
visitedURL := map[string]bool {
"http://www.google.com": true,
"https://paypal.com": true,
}
if visitedURL[thisSite] {
fmt.Println("Already been here.")
}
Another solution if the list contains static values.
eg: checking for a valid value from a list of valid values:
func IsValidCategory(category string) bool {
switch category {
case
"auto",
"news",
"sport",
"music":
return true
}
return false
}
This is a quote from the book "Programming in Go: Creating Applications for the 21st Century":
Using a simple linear search like this is the only option for unsorted
data and is fine for small slices (up to hundreds of items). But for
larger slices—especially if we are performing searches repeatedly—the
linear search is very inefficient, on average requiring half the items
to be compared each time.
Go provides a sort.Search() method which uses the binary search
algorithm: This requires the comparison of only log2(n) items (where n
is the number of items) each time. To put this in perspective, a
linear search of 1000000 items requires 500000 comparisons on average,
with a worst case of 1000000 comparisons; a binary search needs at
most 20 comparisons, even in the worst case.
files := []string{"Test.conf", "util.go", "Makefile", "misc.go", "main.go"}
target := "Makefile"
sort.Strings(files)
i := sort.Search(len(files),
func(i int) bool { return files[i] >= target })
if i < len(files) && files[i] == target {
fmt.Printf("found \"%s\" at files[%d]\n", files[i], i)
}
https://play.golang.org/p/UIndYQ8FeW
Just had a similar question and decided to try out some of the suggestions in this thread.
I've benchmarked best and worst-case scenarios of 3 types of lookup:
using a map
using a list
using a switch statement
Here's the function code:
func belongsToMap(lookup string) bool {
list := map[string]bool{
"900898296857": true,
"900898302052": true,
"900898296492": true,
"900898296850": true,
"900898296703": true,
"900898296633": true,
"900898296613": true,
"900898296615": true,
"900898296620": true,
"900898296636": true,
}
if _, ok := list[lookup]; ok {
return true
} else {
return false
}
}
func belongsToList(lookup string) bool {
list := []string{
"900898296857",
"900898302052",
"900898296492",
"900898296850",
"900898296703",
"900898296633",
"900898296613",
"900898296615",
"900898296620",
"900898296636",
}
for _, val := range list {
if val == lookup {
return true
}
}
return false
}
func belongsToSwitch(lookup string) bool {
switch lookup {
case
"900898296857",
"900898302052",
"900898296492",
"900898296850",
"900898296703",
"900898296633",
"900898296613",
"900898296615",
"900898296620",
"900898296636":
return true
}
return false
}
Best-case scenarios pick the first item in lists, worst-case ones use nonexistent value.
Here are the results:
BenchmarkBelongsToMapWorstCase-4 2000000 787 ns/op
BenchmarkBelongsToSwitchWorstCase-4 2000000000 0.35 ns/op
BenchmarkBelongsToListWorstCase-4 100000000 14.7 ns/op
BenchmarkBelongsToMapBestCase-4 2000000 683 ns/op
BenchmarkBelongsToSwitchBestCase-4 100000000 10.6 ns/op
BenchmarkBelongsToListBestCase-4 100000000 10.4 ns/op
Switch wins all the way; its worst case is surprisingly quicker than its best case.
Maps are the slowest, and the list is close to the switch.
So the moral is:
If you have a static, reasonably small list, a switch statement is the way to go.
The above example using sort is close, but in the case of strings simply use SearchStrings:
files := []string{"Test.conf", "util.go", "Makefile", "misc.go", "main.go"}
target := "Makefile"
sort.Strings(files)
i := sort.SearchStrings(files, target)
if i < len(files) && files[i] == target {
fmt.Printf("found \"%s\" at files[%d]\n", files[i], i)
}
https://golang.org/pkg/sort/#SearchStrings
This is as close as I can get to the natural feel of Python's "in" operator. You have to define your own type. Then you can extend the functionality of that type by adding a method like "has" which behaves like you'd hope.
package main
import "fmt"
type StrSlice []string
func (list StrSlice) Has(a string) bool {
for _, b := range list {
if b == a {
return true
}
}
return false
}
func main() {
var testList = StrSlice{"The", "big", "dog", "has", "fleas"}
if testList.Has("dog") {
fmt.Println("Yay!")
}
}
I have a utility library where I define a few common things like this for several types of slices, like those containing integers or my own other structs.
Yes, it runs in linear time, but that's not the point. The point is to ask and learn what common language constructs Go has and doesn't have. It's a good exercise. Whether this answer is silly or useful is up to the reader.
Another option is to use a map as a set. You use just the keys and make the value something like a boolean that's always true. Then you can easily check whether the map contains the key or not. This is useful if you need the behavior of a set, where a value that is added multiple times appears in the set only once.
Here's a simple example where I add random numbers as keys to a map. If the same number is generated more than once it doesn't matter, it will only appear in the final map once. Then I use a simple if check to see if a key is in the map or not.
package main
import (
"fmt"
"math/rand"
)
func main() {
var MAX int = 10
m := make(map[int]bool)
for i := 0; i <= MAX; i++ {
m[rand.Intn(MAX)] = true
}
for i := 0; i <= MAX; i++ {
if _, ok := m[i]; ok {
fmt.Printf("%v is in map\n", i)
} else {
fmt.Printf("%v is not in map\n", i)
}
}
}
In Go 1.18+, you can declare a generic Contains function, which is also implemented in the experimental slices package. It works for any comparable type:
func Contains[T comparable](arr []T, x T) bool {
for _, v := range arr {
if v == x {
return true
}
}
return false
}
and use it like this:
if Contains(arr, "x") {
// do something
}
// or
if slices.Contains(arr, "x") {
// do something
}
try lo: https://github.com/samber/lo#contains
present := lo.Contains[int]([]int{0, 1, 2, 3, 4, 5}, 5)