The codebase where I work has an object called Pair<A, B>, where A and B are the types of the first and second values in the Pair. I find this object to be offensive, because it gets used instead of an object with clearly named members. So I find this:
List<Pair<Integer, Integer>> productIds = blah();
// snip many lines and method calls
void doSomething(Pair<Integer, Integer> id) {
Integer productId = id.first();
Integer quantity = id.second();
}
Instead of
class ProductsOrdered {
int productId;
int quantityOrdered;
// accessor methods, etc
}
List<ProductsOrdered> productsOrdered = blah();
Many other uses of the Pair in the codebase are similarly bad-smelling.
I Googled tuples and they seem to be often misunderstood or used in dubious ways. Is there a convincing argument for or against their use? I can appreciate not wanting to create huge class hierarchies but are there realistic codebases where the class hierarchy would explode if tuples weren't used?
First of all, a tuple is quick and easy: instead of writing a class for every time you want to put 2 things together, there's a template that does it for you.
Second of all, they're generic. For example, in C++ the std::map uses an std::pair of key and value. Thus ANY pair can be used, instead of having to make some kind of wrapper class with accessor methods for every permutation of two types.
Finally, they're useful for returning multiple values. There's really no reason to make a class specifically for a function's multiple return values, and they shouldn't be treated as one object if they're unrelated.
To be fair, the code you pasted is a bad use of a pair.
Tuples are used all the time in Python where they are integrated into the language and very useful (they allow multiple return values for starters).
Sometimes you really just need to pair things, and creating a real, honest-to-god class is overkill. On the other hand, using tuples when you should really be using a class is just as bad an idea as the reverse.
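As a rough illustration of the multiple-return-value case in Python (the helper name min_max is made up for this sketch):

```python
from typing import Iterable, Tuple


def min_max(values: Iterable[int]) -> Tuple[int, int]:
    """Return the smallest and largest value in one pass (hypothetical helper)."""
    iterator = iter(values)
    lowest = highest = next(iterator)  # raises StopIteration on empty input
    for value in iterator:
        if value < lowest:
            lowest = value
        elif value > highest:
            highest = value
    return lowest, highest  # packed into a tuple implicitly


# The caller unpacks the tuple straight into two names.
low, high = min_max([3, 9, 1, 7])
```

The two results only travel together for the duration of the call, which is the kind of ad hoc grouping where a dedicated class would add little.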
The code example has a few different smells:
Reinventing the wheel
There is already a tuple available in the framework; the KeyValuePair structure. This is used by the Dictionary class to store pairs, but you can use it anywhere it fits. (Not saying that it fits in this case...)
Making a square wheel
If you have a list of pairs it's better to use the KeyValuePair structure than a class with the same purpose, as it results in fewer memory allocations.
Hiding the intention
A class with properties clearly shows what the values mean, while a Pair<int,int> class doesn't tell you anything about what the values represent (only that they are probably related somehow). To make the code reasonably self-explanatory with a list like that you would have to give the list a very descriptive name, like productIdAndQuantityPairs...
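The same point can be sketched in Python, where a named tuple keeps tuple behaviour but gives each slot a name. The type and field names here are hypothetical, chosen to mirror the question:

```python
from typing import List, NamedTuple, Tuple


class ProductOrdered(NamedTuple):
    """Hypothetical named replacement for an anonymous (int, int) pair."""
    product_id: int
    quantity_ordered: int


# Anonymous pairs: the reader has to guess which int is which.
anonymous_orders: List[Tuple[int, int]] = [(1001, 2), (1002, 5)]

# Named fields: the intent is visible at every use site.
named_orders = [ProductOrdered(*pair) for pair in anonymous_orders]

total_quantity = sum(order.quantity_ordered for order in named_orders)
```

With the named version, the list no longer needs a name like productIdAndQuantityPairs to be understood.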
For what it's worth, the code in the OP is a mess not because it uses tuples, but because the values in the tuple are too weakly typed. Compare the following:
List<Pair<Integer, Integer>> products_weak = blah1();
List<Pair<Product, Integer>> products_strong = blah2();
I'd get upset too if my dev team were passing around IDs rather than class instances, because an integer can represent anything.
With that being said, tuples are extremely useful when you use them right:
Tuples exist to group ad hoc values together. They are certainly better than creating excessive numbers of wrapper classes.
Useful alternative to out/ref parameters when you need to return more than one value from a function.
However, tuples in C# make my eyes water. Many languages like OCaml, Python, Haskell, F#, and so on have a special, concise syntax for defining tuples. For example, in F#, the Map module defines a constructor as follows:
val of_list : ('key * 'a) list -> Map<'key,'a>
I can create an instance of a map using:
(* val values : (int * string) list *)
let values =
[1, "US";
2, "Canada";
3, "UK";
4, "Australia";
5, "Slovenia"]
(* val dict : Map<int, string> *)
let dict = Map.of_list values
The equivalent code in C# is ridiculous:
var values = new Tuple<int, string>[] {
new Tuple<int, string>(1, "US"),
new Tuple<int, string>(2, "Canada"),
new Tuple<int, string>(3, "UK"),
new Tuple<int, string>(4, "Australia"),
new Tuple<int, string>(5, "Slovenia")
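};
// Dictionary has no constructor taking Tuple[]; ToDictionary needs "using System.Linq;"
var dict = values.ToDictionary(t => t.Item1, t => t.Item2);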
I don't believe there is anything wrong with tuples in principle, but C#'s syntax is too cumbersome to get the most use out of them.
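For comparison, here is roughly the same thing in Python, one of the languages listed above as having concise tuple syntax. The variable names are only illustrative:

```python
# A list of (int, str) tuples; no type names or constructors needed.
values = [
    (1, "US"),
    (2, "Canada"),
    (3, "UK"),
    (4, "Australia"),
    (5, "Slovenia"),
]

# dict() accepts any iterable of key/value pairs.
country_by_id = dict(values)
```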
It's code reuse. Rather than writing Yet Another Class With Exactly The Same Structure As The Last 5 Tuple-Like Classes We Made, you make... a Tuple class, and use that whenever you need a tuple.
If the only significance of the class is "to store a pair of values", then I'd say using tuples is an obvious idea. I'd say it was a code smell (as much as I hate the term) if you started implementing multiple identical classes just so that you could rename the two members.
This is just prototype-quality code that likely was smashed together and has never been refactored. Not fixing it is just laziness.
The real use for tuples is for generic functionality that really doesn't care what the component parts are, but just operates at the tuple level.
Scala has tuple-valued types, all the way from 2-tuples (Pairs) up to tuples with 20+ elements. See First Steps to Scala (step 9):
val pair = (99, "Luftballons")
println(pair._1)
println(pair._2)
Tuples are useful if you need to bundle together values for some relatively ad hoc purpose. For example, if you have a function that needs to return two fairly unrelated objects, rather than create a new class to hold the two objects, you return a Pair from the function.
I completely agree with other posters that tuples can be misused. If a tuple has any sort of semantics important to your application, you should use a proper class instead.
The obvious example is a coordinate pair (or triple). The labels are irrelevant; using X and Y (and Z) is just a convention. Making them uniform makes it clear that they can be treated in the same way.
Many things have already been mentioned, but I think it is also worth mentioning that some programming styles differ from OOP, and for those, tuples are quite useful.
Functional programming languages like Haskell, for example, don't have classes in the OOP sense at all.
If you're doing a Schwartzian transform to sort by a particular key (which is expensive to compute repeatedly), or something like that, a class seems to be a bit overkill:
val transformed = data map {x => (x.expensiveOperation, x)}
val sortedTransformed = transformed sortWith {(x, y) => x._1 < y._1}
val sorted = sortedTransformed map {case (_, x) => x}
Having a class DataAndKey or whatever seems a bit superfluous here.
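The same decorate-sort-undecorate idea can be sketched in Python. The function name and the extra index slot (used only to break ties) are assumptions of this sketch, not part of the Scala snippet above:

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def sort_by_expensive_key(data: Iterable[T], expensive_key: Callable[[T], int]) -> List[T]:
    """Decorate-sort-undecorate: expensive_key is called once per element."""
    # Decorate: pair each item with its key; the index breaks ties,
    # so the items themselves are never compared.
    decorated = [(expensive_key(item), index, item) for index, item in enumerate(data)]
    # Sort: tuples compare element by element, key first.
    decorated.sort()
    # Undecorate: throw the key and index away again.
    return [item for _, _, item in decorated]
```

In everyday Python you would normally just write sorted(data, key=expensive_key), which also computes each key only once; the explicit version is shown only to make the throwaway tuples visible.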
Your example was not a good example of a tuple, though, I agree.
I'm making a class that is supposed to be able to store a 20 element array with each element being a tuple of four predefined types. Another catch is, I can't use parameters.
I can't find good online sources for this and the material provided from my university is honestly insufficient. I'm preparing for an exam and I'm stumped when it comes to objects in OCaml.
I was thinking of doing something like
val mutable arr = Array.make 20 (input 20 values)
but this seems too simplistic and inefficient to be a correct solution.
The fields of a class can have any type. This certainly includes an array type. Arrays, in turn, can contain any type, which includes tuples.
Any given mutable field and any given array is, of course, restricted to always contain values of the same type. This is what it means to have "strong" typing.
OCaml is a high-level language, so there's no need (or opportunity, really) to be concerned with too many details of representation. If you want a class with a field like you say, your proposed type sounds perfectly fine.
type mytuple = int * float * char
class myclass = object
val mutable myfield : mytuple array = [||]
end
You can find good documentation on OCaml at realworldocaml.org. There are more resources listed at ocaml.org.
At SO, I have seen questions that compare Array with Seq, List with Seq and Vector with well, everything. I do not understand one thing though. When should I actually use a Seq over any of these? I understand when to use a List, when to use an Array and when to use a Vector. But when is it a good idea to use Seq rather than any of the above listed collections? Why should I use a trait that extends Iterable rather than all the concrete classes listed above?
You should usually use Seq as the input parameter type for a method or class that is defined for sequences in general (general, not necessarily generic):
def mySort[T](seq: Seq[T]) = ...
case class Wrapper[T](seq: Seq[T])
implicit class RichSeq[T](seq: Seq[T]) { def mySort = ...}
So now you can pass any sequence (like Vector or List) to mySort.
If you care about algorithmic complexity, you can specialize it to IndexedSeq (fast random element access) or LinearSeq (efficient head/tail access). In any case, you should prefer the most general type if you want your function to be more polymorphic in its input parameter, as Seq is a common interface for all sequences. If you need something even more generic, you may use Traversable or Iterable.
The principle here is the same as in a number of languages (e.g. in Java you should often use List instead of ArrayList, or Map instead of HashMap). If you can deal with the more abstract concept of a Seq, you should, especially when they are parameters to methods.
2 main reasons that come to mind:
1) reuse of your code. E.g. if you have a method foo(s: Seq), it can be reused for lists and arrays.
2) the ability to change your mind easily. E.g. If you decide that List is working well, but suddenly you realise you need random access, and want to change it to an Array, if you have been defining List everywhere, you'll be forced to change it everywhere.
Note #1: there are times where you could say Iterable over Seq, if your method supports it, in which case I'd be inclined to be as abstract as possible.
Note #2: Sometimes, I might be inclined to not say Seq (or be totally abstract) in my work libraries, even if I could. E.g. if I were to do something which would be highly non-performant with the wrong collection. Such as doing Random Access - even if I could write my code to work with a List, it would result in major inefficiency.
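The "accept the most general type" advice is not Scala-specific. As a loose Python analogue (not the Scala API), collections.abc.Sequence plays a role similar to Seq; the helper name below is made up:

```python
from collections.abc import Sequence


def middle_element(items: Sequence):
    """Works for any sequence type: list, tuple, str, range (hypothetical helper)."""
    if not items:
        raise ValueError("empty sequence")
    return items[len(items) // 2]
```

Because the parameter is only required to be a sequence, callers can switch from a list to a tuple or a range without touching this function. The caveat from Note #2 still applies: indexing is cheap for these types, but the annotation alone does not guarantee that for every possible sequence.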
Is there any particular reason for the inconsistent return types of the functions in Dart's ListBase class?
Some of the functions do what (as a functional programmer) I would expect, that is: List -> (apply function) -> List. These include: take, skip, reversed.
Others do not: thus l.removeLast() returns just the final element of the list; to get the List without the final element, you have to use a cascade: l..removeLast().
Others return a lazy Iterable, which requires further work to retrieve the list: newl = l.map(f).toList().
Some functions operate more like properties (l.last), as opposed to functions (l.removeLast()).
Is there some subtle reason for these choices?
mbmcavoy is right. Dart is an imperative language and many List members modify the list in-place. The most prominent is the operator []=, but sort, shuffle, add, removeLast, etc. fall into the same category.
In addition to these imperative members, List inherits some functional-style members from Iterable: skip, take, where, map, etc. These are lazy and do not modify the List in place. They are backed by the original list. Modifying the backing list will change the result of iterating over the iterable. List furthermore adds a few lazy members, like reversed.
To avoid confusion, lazy members always return an Iterable and not an object implementing the List interface. Some of the iterables guarantee fast length and index-operators (like take, skip and reversed) and could easily implement the List interface. However, this would inevitably lead to bugs, since they are lazy and backed by the original list.
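The hazard of a lazy result backed by a mutable list exists outside Dart too. As a loose analogy in Python (this is not Dart's API, and the names are illustrative):

```python
numbers = [1, 2, 3]

# map() is lazy: nothing is computed yet, and it reads from `numbers` on demand.
doubled_lazy = map(lambda n: n * 2, numbers)

# An eager copy is a snapshot and is no longer tied to `numbers`.
doubled_snapshot = [n * 2 for n in numbers]

numbers.append(4)  # mutate the backing list before the lazy result is consumed

lazy_result = list(doubled_lazy)  # sees the appended element
```

If the lazy object looked exactly like a list, it would be easy to forget that its contents can still change underneath you.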
(Disclaimer: I have not yet used Dart specifically, but hope to soon.)
Dart is not a functional programming language, which may be the source of your confusion.
Methods, such as .removeLast() are intended to change the state of the object they are called upon. The operation performed by l.removeLast() is to modify l so that it no longer contains the last item. You can access the resulting list by simply using l in your next statement.
(Note they are called "methods" rather than "functions", as they are not truly functions in the mathematical sense.)
The choice to return the removed item rather than the remaining list is a convenience. Most frequently, the program will need to do something with the removed item (like move it to a different list).
For other methods, the returned data will relate to a common usage scenario, but it isn't always necessary to capture it.
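Python's list.pop() follows the same convention, so the "move it to a different list" case can be sketched like this (Python rather than Dart, purely as an analogy):

```python
pending = ["a", "b", "c"]
done = []

# pop() mutates `pending` in place and hands back the removed element,
# so moving an item between lists is a single expression.
done.append(pending.pop())
```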
Scala is new to me so I'm not sure the best way to go about this.
I need to simply take the strings within a single list and join them.
So, concat(List("a","b","c")) returns abc.
Should I first see how many strings there are in the list, that way I can just loop through and join them all? I feel like that needs to be done first, that way you can use the lists just like an array and do list[1] append list[2] append list[3], etc..
Edit:
Here's my idea, of course with compile errors..
def concat(l: List[String]): String = {
var len = l.length
var i = 0
while (i < len) {
val result = result :: l(i) + " "
}
result
}
How about this, on REPL
List("a","b","c") mkString("")
or in script file
List("a","b","c").mkString("")
Some options to explore for you:
1) imperative: for-loop; use methods from the List object to determine loop length or use for-each List item
2) classical functional: recursive function, one element at the time
3) using higher-order functions: look at fold.
Given the basic level of the problem, I think you're looking at learning some fundamentals in programming. If the language of choice is Scala, probably the focus is on functional programming, so I'd put effort on solving #2, then solve #1. #3 for extra credits.
This exercise is designed to encourage you to think about the problem from a functional perspective. You have a set of data over which you wish to move, performing a set of identical operations. You've already identified the imperative, looping construct (for). Simple enough. Now, how would you build that into a functional construct, not relying on "stateful" looping?
In functional programming, fold ... is a family of higher-order
functions that iterate an arbitrary function over a data structure in
some order and build up a return value.
http://en.wikipedia.org/wiki/Fold_%28higher-order_function%29
That sounds like something you could use.
As string concatenation is associative (to be exact, it forms a monoid having the empty String as neutral element), the "direction" of the fold doesn't matter (at least if you're not bothered by performance).
Speaking of performance: In real life, it would be a good idea to use a StringBuilder for the intermediate steps, but it's up to you if you want to use it.
A bit longer than mkString but more efficient:
s.foldLeft(new StringBuilder())(_ append _).toString()
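For readers more at home in Python, the fold and the builder-style variant look roughly like this. The function names are made up for the sketch, and reduce stands in for foldLeft:

```python
from functools import reduce
from typing import List


def concat_fold(strings: List[str]) -> str:
    """Left fold with '' as the neutral element (hypothetical helper name)."""
    return reduce(lambda acc, s: acc + s, strings, "")


def concat_join(strings: List[str]) -> str:
    """Builder-style alternative: avoids the intermediate strings of the fold."""
    return "".join(strings)
```

Passing the empty string as the initial value is what makes the fold well defined for an empty list.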
I'm just assuming here that you are not only new to Scala, but also new to programming in general. I'm not saying SO is not made for newbies, but I'm sure there are many other places, which are better suited for your needs. For example books...
I'm also assuming that your problem doesn't have to be solved in a functional, imperative or some other way. It just has to be solved as a homework assignment.
So here are the list of things you should consider / ask yourself:
If you want to concat all elements of the list do you really need to know how many there are?
If you think you do, fine, but after having solved this problem using this approach try to fiddle around with your solution a little bit to find out if there is another way.
Appending the elements to a resulting list is a thought in the right direction, but think about this: in addition to being object-oriented, Scala is also a full-blown functional language. You might not know what this means, but all you need to know for now is this: it is pretty darn good with things like lists (LISP is the best-known functional language and it stands for LISt Processing, which has to be an indication of some kind, don't you think? ;)). So maybe there is some magical (maybe even Scala-idiomatic) way to accomplish such a concatenation without defining the resulting list yourself.
Why does nobody seem to use tuples in C++, either the Boost Tuple Library or the standard library for TR1? I have read a lot of C++ code, and very rarely do I see the use of tuples, but I often see lots of places where tuples would solve many problems (usually returning multiple values from functions).
Tuples allow you to do all kinds of cool things like this:
tie(a,b) = make_tuple(b,a); //swap a and b
That is certainly better than this:
temp=a;
a=b;
b=temp;
Of course you could always do this:
swap(a,b);
But what if you want to rotate three values? You can do this with tuples:
tie(a,b,c) = make_tuple(b,c,a);
Tuples also make it much easier to return multiple variables from a function, which is probably a much more common case than swapping values. Using references to return values is certainly not very elegant.
Are there any big drawbacks to tuples that I'm not thinking of? If not, why are they rarely used? Are they slower? Or is it just that people are not used to them? Is it a good idea to use tuples?
A cynical answer is that many people program in C++, but do not understand and/or use the higher level functionality. Sometimes it is because they are not allowed, but many simply do not try (or even understand).
As a non-boost example: how many folks use functionality found in <algorithm>?
In other words, many C++ programmers are simply C programmers using C++ compilers, and perhaps std::vector and std::list. That is one reason why the use of boost::tuple is not more common.
Because it's not yet standard. Anything non-standard has a much higher hurdle. Pieces of Boost have become popular because programmers were clamoring for them. (hash_map leaps to mind). But while tuple is handy, it's not such an overwhelming and clear win that people bother with it.
The C++ tuple syntax can be quite a bit more verbose than most people would like.
Consider:
typedef boost::tuple<MyClass1,MyClass2,MyClass3> MyTuple;
So if you want to make extensive use of tuples you either get tuple typedefs everywhere or you get annoyingly long type names everywhere. I like tuples. I use them when necessary. But it's usually limited to a couple of situations, like an N-element index or when using multimaps to tie the range iterator pairs. And it's usually in a very limited scope.
It's all very ugly and hacky looking when compared to something like Haskell or Python. When C++0x gets here and we get the 'auto' keyword tuples will begin to look a lot more attractive.
The usefulness of tuples is inversely proportional to the number of keystrokes required to declare, pack, and unpack them.
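To see what the low-keystroke end of that scale looks like, here is the three-value rotation from the question written in Python, where packing and unpacking need no library calls:

```python
a, b, c = 1, 2, 3

# The right-hand side is packed into a tuple first, then unpacked,
# so this rotates the three values without a temporary variable.
a, b, c = b, c, a
```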
For me, it's habit, hands down: Tuples don't solve any new problems for me, just a few I can already handle just fine. Swapping values still feels easier the old fashioned way -- and, more importantly, I don't really think about how to swap "better." It's good enough as-is.
Personally, I don't think tuples are a great solution to returning multiple values -- sounds like a job for structs.
But what if you want to rotate three values?
swap(a,b);
swap(b,c); // I knew those permutation theory lectures would come in handy.
OK, so with 4 etc values, eventually the n-tuple becomes less code than n-1 swaps. And with default swap this does 6 assignments instead of the 4 you'd have if you implemented a three-cycle template yourself, although I'd hope the compiler would solve that for simple types.
You can come up with scenarios where swaps are unwieldy or inappropriate, for example:
tie(a,b,c) = make_tuple(b*c,a*c,a*b);
is a bit awkward to unpack.
Point is, though, there are known ways of dealing with the most common situations that tuples are good for, and hence no great urgency to take up tuples. If nothing else, I'm not confident that:
tie(a,b,c) = make_tuple(b,c,a);
doesn't do 6 copies, making it utterly unsuitable for some types (collections being the most obvious). Feel free to persuade me that tuples are a good idea for "large" types, by saying this ain't so :-)
For returning multiple values, tuples are perfect if the values are of incompatible types, but some folks don't like them if it's possible for the caller to get them in the wrong order. Some folks don't like multiple return values at all, and don't want to encourage their use by making them easier. Some folks just prefer named structures for in and out parameters, and probably couldn't be persuaded with a baseball bat to use tuples. No accounting for taste.
As many people pointed out, tuples are just not as useful as other features.
The swapping and rotating gimmicks are just gimmicks. They are utterly confusing to those who have not seen them before, and since it is pretty much everyone, these gimmicks are just poor software engineering practice.
Returning multiple values using tuples is much less self-documenting than the alternatives -- returning named types or using named references. Without this self-documentation, it is easy to confuse the order of the returned values, if they are mutually convertible, and not be any wiser.
Not everyone can use boost, and TR1 isn't widely available yet.
When using C++ on embedded systems, pulling in Boost libraries gets complex. They couple to each other, so library size grows. You return data structures or use parameter passing instead of tuples. When returning tuples in Python, the data structure is implied by the order and types of the returned values; it's just not explicit.
You rarely see them because well-designed code usually doesn't need them: there are not too many cases in the wild where using an anonymous struct is superior to using a named one.
Since all a tuple really represents is an anonymous struct, most coders in most situations just go with the real thing.
Say we have a function "f" where a tuple return might make sense. As a general rule, such functions are usually complicated enough that they can fail.
If "f" CAN fail, you need a status return- after all, you don't want callers to have to inspect every parameter to detect failure. "f" probably fits into the pattern:
struct ReturnInts { int y, z; };
bool f(int x, ReturnInts& vals);
int x = 0;
ReturnInts vals;
if(!f(x, vals)) {
..report error..
..error handling/return...
}
That isn't pretty, but look at how ugly the alternative is. Note that I still need a status value, but the code is no more readable and not shorter. It is probably slower too, since I incur the cost of 1 copy with the tuple.
std::tuple<int, int, bool> f(int x);
int x = 0;
std::tuple<int, int, bool> result = f(x); // or "auto result = f(x)"
if(!std::get<2>(result)) {
... report error, error handling ...
}
Another significant downside is hidden in here: with "ReturnInts" I can alter "f"'s return by modifying "ReturnInts" WITHOUT ALTERING "f"'s INTERFACE. The tuple solution does not offer that critical feature, which makes it the inferior answer for any library code.
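The extensibility argument can be sketched in Python as well. Everything here (Result, f_named, f_tuple_v2, the detail field) is a hypothetical stand-in for "f" and "ReturnInts", assuming a fourth value gets added to the return after callers already exist:

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Result:
    """Hypothetical named return type; new fields get defaults."""
    y: int
    z: int
    ok: bool = True
    detail: str = ""  # added later; existing callers are unaffected


def f_named(x: int) -> Result:
    return Result(y=x + 1, z=x * 2)


def f_tuple_v2(x: int) -> Tuple[int, int, bool, str]:
    # Same data as a positional tuple after the fourth value was added.
    return x + 1, x * 2, True, ""


# A caller written against the named result keeps working.
result = f_named(3)
old_caller_sum = result.y + result.z

# A caller written against the original 3-tuple breaks on unpacking.
try:
    y, z, ok = f_tuple_v2(3)
    old_tuple_caller_works = True
except ValueError:
    old_tuple_caller_works = False
```

A tuple caller that only indexes into the result would survive the change, but any caller that unpacks all the values does not, which is the maintenance cost being described.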
Certainly tuples can be useful, but as mentioned there's a bit of overhead and a hurdle or two you have to jump through before you can even really use them.
If your program consistently finds places where you need to return multiple values or swap several values, it might be worth it to go the tuple route, but otherwise sometimes it's just easier to do things the classic way.
Generally speaking, not everyone already has Boost installed, and I certainly wouldn't go through the hassle of downloading it and configuring my include directories to work with it just for its tuple facilities. I think you'll find that people already using Boost are more likely to find tuple uses in their programs than non-Boost users, and migrants from other languages (Python comes to mind) are more likely to simply be upset about the lack of tuples in C++ than to explore methods of adding tuple support.
As a data-store std::tuple has the worst characteristics of both a struct and an array; all access is nth position based but one cannot iterate through a tuple using a for loop.
So if the elements in the tuple are conceptually an array, I will use an array and if the elements are not conceptually an array, a struct (which has named elements) is more maintainable. ( a.lastname is more explanatory than std::get<1>(a)).
This leaves the transformation mentioned by the OP as the only viable usecase for tuples.
I have a feeling that many use Boost.Any and Boost.Variant (with some engineering) instead of Boost.Tuple.