C++: Suggestions about a hash function for a sequence of strings where the order of the strings is irrelevant -


let's have these 2 sequences of strings

abc cba bc

bc abc cba

i'm trying create mapping such sequences(the sequence string) above 2 sequences mapped same bucket.

my initial thought add results of hashing function applied each string separately. in way order won't matter. if applied hashing function sequence string whole, of course hash result different.

however i'm new world of string hashing functions , have no idea whether approach efficient.

in website http://www.partow.net/programming/hashfunctions/index.html

i found many different implementations string hashing, i'm not sure 1 "best" needs.

some technical details each string in sequence each of them won't have more 25 characters. each sequence won't have more 3 strings.

questions

1. approach of adding results of string hashing function each string of sequence work?

2. if yes string hashing function should use give low amount of collisions , time efficient?

thank in advance

just idea demonstration (very inefficient string copying), complexity o(nlogn) n size of key (=== o(1) if keys have constant length known @ compile time), don't think can better complexity:

#include <boost/functional/hash.hpp> #include <set> #include <algorithm>  std::size_t make_hash(   std::string const& a,   std::string const& b,   std::string const& c) {     std::string input[] = {a,b,c};     std::sort(input, input + (sizeof(input)/sizeof(*input)));     return boost::hash_range(input, input + (sizeof(input)/sizeof(*input))); }  #include <iostream> // g++ -i.../boost_1_47_0 string_set_hash.cpp int main() {     std::cout << make_hash("abc", "bcd", "def") << std::endl; // 46247451276990640     std::cout << make_hash("bcd", "def", "abc") << std::endl; // 46247451276990640 } 

a fragment of boost/functional/hash.hpp reference:

template <class t> inline void hash_combine(std::size_t& seed, t const& v)  {     boost::hash<t> hasher;     seed ^= hasher(v) + 0x9e3779b9 + (seed<<6) + (seed>>2); }  template <class it> inline std::size_t hash_range(it first, last) {     std::size_t seed = 0;      for(; first != last; ++first)     {         hash_combine(seed, *first);     }      return seed; } 

Comments

Popular posts from this blog

Delphi XE2 Indy10 udp client-server interchange using SendBuffer-ReceiveBuffer -

Qt ActiveX WMI QAxBase::dynamicCallHelper: ItemIndex(int): No such property in -

Enable autocomplete or intellisense in Atom editor for PHP -