In the following code snippet, when the <string, vector> pair is inserted into the map, is it necessary for the contents of the string and vector to be duplicated, or does this just shuffle pointers around?
Is the full 100 bytes of the string s duplicated and then the original freed? Since the local variables s and v are no longer needed after the end of the scope immediately following the insert, it seems quite unnecessary to duplicate and then free the originals. Can this be avoided? Do any implementations of STL implement copy-on-write semantics for string or vector?
> In the following code snippet, when the <string, vector> pair is inserted into the map, is it necessary for the contents of the string and vector to be duplicated, or does this just shuffle pointers around?
> Is the full 100 bytes of the string s duplicated and then the original freed? Since the local variables s and v are no longer needed after the end of the scope immediately following the insert, it seems quite unnecessary to duplicate and then free the originals. Can this be avoided? Do any implementations of STL implement copy-on-write semantics for string or vector?
make_pair creates a new pair object and returns it by value, so "in principle" s and v will be copied. Of course, the compiler is allowed to perform any optimization that doesn't change the observable behavior, so there's no "guarantee" that the copies will be made.
In C++0x you will be able to use make_pair(T1&&, T2&&) and "move" s and v, avoiding the copies. As far as I know, the syntax will probably be m.insert(make_pair(move(s), move(v))) (with all std:: omitted).
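The original snippet is not shown here, so the following is only a sketch under assumptions: the map is taken to be a std::map<std::string, std::vector<int> >, and the helper name insert_moved is made up for illustration. It shows what the move-based insert could look like once std::move and the rvalue-aware make_pair are available (C++11).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Assumed map type; the original snippet's exact types are not shown.
typedef std::map<std::string, std::vector<int> > Map;

// Hypothetical helper: inserts the pair by moving from s and v.
// Returns true if a new element was inserted.
bool insert_moved(Map& m, std::string& s, std::vector<int>& v)
{
    // std::move only casts s and v to rvalues. make_pair then builds its
    // pair with the move constructors, and insert moves that temporary
    // pair into the map node. No character or element data is duplicated.
    return m.insert(std::make_pair(std::move(s), std::move(v))).second;
}
```

After the call, s and v are left in a valid but moved-from state, so they should only be reassigned or destroyed. If the key was already present, they may still have been moved from.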
> make_pair creates a new pair object and returns it by value, so "in principle" s and v will be copied. Of course, the compiler is allowed to perform any optimizations that doesn't change the observable behavior, so there's no "guarantee" that the copies will be made.
> In C++0x you will be able to use make_pair(T1&&, T2&&) and "move" s and v, avoiding the copies. As far as I know, the syntax will probably be m.insert(make_pair(move(s), move(v))) (with all std:: omitted).
"In principle" then, would m.insert(pair<string, vector<int>>(s, v)); avoid making a copy? I had been treating make_pair as a syntactic nicety, but completely equivalent to the constructor of pair.
> On Nov 6, 1:21 am, ShaunJ <sjack...@gmail.com> wrote:
>> Do any implementations of STL implement copy-on-write semantics for string or vector?
> I'm not 100% sure, but I think STLPort implements string with CoW.
Current STLPort-5.2.1 and the STLPort 4 bundled with the latest Sun C++ compilers do not use CoW. One particularly disappointing feature of STLPort's std::string is that the default constructor allocates storage; in other words, a memory allocation is performed even for empty strings.
GNU std::string uses CoW. I hear that the standard interface of std::string cannot possibly be satisfied by a CoW implementation; nevertheless, I find CoW implementations of std::string the most practical.
The GNU C++ library also provides another string class, __gnu_cxx::__versa_string, which does not do CoW. I've heard that there are plans to make it the default std::string implementation in the future, although the CoW std::string implementation will still be available under a different name.
> copy-on-write doesn't work well with threads so nobody uses it much anymore.
It is perfectly possible to combine CoW and threads when the CoW classes use atomic-reference counting. This is precisely how the Qt framework provides fully reentrant implementations of string, container, and other implicitly shared classes.
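As a rough illustration of the idea (not Qt's or libstdc++'s actual code), here is a minimal sketch of an implicitly shared string. The class name CowString and its members are hypothetical. It leans on std::shared_ptr, whose reference count is updated atomically, so copies of the same value can be handed to different threads as long as each individual object is used by one thread at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical class, for illustration only.
class CowString {
public:
    explicit CowString(const std::string& text)
        : data_(std::make_shared<std::string>(text)) {}

    // The compiler-generated copy operations share the buffer; shared_ptr
    // increments its reference count atomically.

    // Read access never copies.
    char at(std::size_t i) const { return data_->at(i); }
    std::size_t size() const { return data_->size(); }

    // Write access detaches first, so other owners never see the change.
    void set(std::size_t i, char c)
    {
        detach();
        data_->at(i) = c;
    }

    // True if both objects currently refer to the same buffer.
    bool shares_buffer_with(const CowString& other) const
    {
        return data_ == other.data_;
    }

private:
    // If anyone else holds the buffer, make a private copy of it.
    void detach()
    {
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::string>(*data_);
    }

    std::shared_ptr<std::string> data_;
};
```

Copying a CowString is then a pointer copy plus an atomic increment; the character data is duplicated only on the first write to a shared buffer.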
> > make_pair creates a new pair object and returns it by value, so "in principle" s and v will be copied. Of course, the compiler is allowed to perform any optimizations that doesn't change the observable behavior, so there's no "guarantee" that the copies will be made.
Since s and v are lvalues, they will be copied. Whether that means that all the characters of the string object are duplicated is another question, and it depends on whether the implementation uses CoW or not. Recent libstdc++ versions still use CoW (in a thread-safe way) for std::string. I'm not sure about std::vector. Probably not.
> > In C++0x you will be able to use make_pair(T1&&, T2&&) and "move" s and v, avoiding the copies. As far as I know, the syntax will probably be m.insert(make_pair(move(s), move(v))) (with all std:: omitted).
> "In principle" then, would m.insert(pair<string, vector<int>>(s, v)); avoid making a copy?
No. They have to be copied since s and v are lvalues. But again, that doesn't imply that the string's elements are copied (due to CoW). CoW is not mandated, though; it is only a possibility.
In C++0x you will be able to write
m.emplace(move(s), move(v));
without having to worry about any copying. The pair object will be directly constructed "into the map" using its templated constructor, which forwards the rvalue references produced by move(s) and move(v) to the constructors of std::string and std::vector.
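One way to check that emplace really just "shuffles pointers" is to compare the vector's buffer address before and after the call. The sketch below assumes a C++11 library that provides map::emplace; the alias Dict and the helper name emplace_reuses_buffer are made up for this example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Assumed map type; the original snippet's exact types are not shown.
typedef std::map<std::string, std::vector<int> > Dict;

// Hypothetical helper: emplaces (s, v) into m and returns true if the
// vector stored in the map uses the very buffer that v owned before the
// call, i.e. the elements were handed over rather than duplicated.
bool emplace_reuses_buffer(Dict& m, std::string s, std::vector<int> v)
{
    const int* before = v.data();   // address of v's element storage

    // The pair is constructed inside the map node; the string and the
    // vector are move-constructed from s and v.
    std::pair<Dict::iterator, bool> r =
        m.emplace(std::move(s), std::move(v));

    return r.second && r.first->second.data() == before;
}
```

The parameters are taken by value here only so that the helper owns objects it can safely move from; in the original scenario the locals s and v would be moved directly.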
> I had been treating make_pair as a syntatic nicety, but completely equivalent to the constructor of pair.
I think it's safe to say that it is, if your compiler supports RVO (return value optimization) and inlining. You can expect make_pair to be as efficient as a direct constructor call. If you're concerned about the performance, you could do something like this:
if (m.find(s)==m.end()) m[s].swap(v);
This will create a pair with a copied string as key and a default-constructed vector. The new vector is immediately swapped with v.
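Wrapped in a function, the swap idiom might look like the sketch below. The alias Registry and the helper name insert_by_swap are made up for illustration, and the map type is again assumed to be std::map<std::string, std::vector<int> >. This version needs no C++0x features.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Assumed map type; the original snippet's exact types are not shown.
typedef std::map<std::string, std::vector<int> > Registry;

// Hypothetical helper: stores v under key s without copying v's elements.
// Returns false and leaves v untouched if the key is already present.
bool insert_by_swap(Registry& m, const std::string& s, std::vector<int>& v)
{
    if (m.find(s) != m.end())
        return false;

    // operator[] copies the key and default-constructs an empty vector;
    // swap then exchanges the two vectors' internal pointers.
    m[s].swap(v);
    return true;
}
```

The key string is still copied once, and the map is searched twice (find, then operator[]). After a successful call, v holds the empty vector that was created in the map.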
On Nov 9, 10:27 pm, SG <s.gesem...@gmail.com> wrote:
> In C++0x you will be able to write
> m.emplace(move(s), move(v));
While I agree that this is the best solution, I think it would also make sense to have container::insert(const T &&value), which makes use of the move constructor, making m.insert(make_pair(move(s), move(v))) almost as efficient. (I do not remember if C++0x has this or not.)
Jeff Schwab wrote:
> Jeff Flinn wrote:
>>> Do any implementations of STL implement copy-on-write semantics for string or vector?
>> MSVC(VC6)/Dinkumware used to do COW for std::string, but found that small string optimization provided better overall performance.
> How does that preclude CoW for non-small strings?
It doesn't; it is a separate optimization, but with an overlap in effect for short strings.
The original use of CoW for std::string was shown not to work in practice. Due to the semantics it becomes copy-on-potential-write, which is most often a net loss - especially for multi-threaded use.
Herb Sutter did the measurements, which surprised most people:
tohava wrote:
> On Nov 9, 10:27 pm, SG <s.gesem...@gmail.com> wrote:
>> In C++0x you will be able to write
>> m.emplace(move(s), move(v));
> While I agree that this is the best solution, I think it would also make sense to have container::insert(const T &&value), which makes use of the move constructor, making m.insert(make_pair(move(s), move(v)) almost as efficient. (I do not remember if C++0X has this or not).
The latest draft, N2960, has insert member functions that take rvalue references (P&&, not const P&&, though) for all containers that have insert member functions.