Extracting Unique Elements From a List

From Erlang Community

(Difference between revisions)
Revision as of 22:40, 3 September 2006 (edit)
Bfulgham (Talk | contribs)

← Previous diff
Revision as of 13:54, 24 September 2006 (edit) (undo)
Ayrnieu (Talk | contribs)
(buh!? This certainly isn't the best solution for preserving any order, as it clearly doesn't preserve the order.)
Next diff →
Line 34: Line 34:
</code> </code>
-Note that the final list produced by this function is not sorted, so this is the best approach if you want the set to contain the data in the order in which it was inserted. +Note that sets:to_list(sets:from_list(L)) produces an unreliably arranged list.
Using a General Balanced Tree Set (gb_set) Using a General Balanced Tree Set (gb_set)

Revision as of 13:54, 24 September 2006

Problem

You want to eliminate duplicate values from a list.

Solution

There are several possible solutions for this problem. Here are some of these:

Using lists:usort

The lists module contains a wealth of list processing functionality. One possible solution to this problem is to use the lists:usort function, which takes a list and returns a sorted copy of the original list, with all duplicates removed:

1> UL = [1,2,8,7,8,10,3,12,3,99,188,3,2,1,3,5,15,72].
[1,2,8,7,8,10,3,12,3,99,188,3,2,1,3,5,15,72]
2> lists:usort(UL).
[1,2,3,5,7,8,10,12,15,72,99,188]

Using the sets Module

Erlang standard libraries includes a module, sets, with a variety of functions related to generating, creating, and manipulating mathematical sets.

10> Set = sets:from_list(UL).
{sets,12,
      16,
      16,
      8,
      80,
      48,
      {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
      {{[],[99,3],[],[],"\274\f",[15],[2],[5],"H\b",[],[],[1],[],[7],"\n",[]}}}
11> sets:to_list(Set).
[3,99,12,188,15,2,5,8,72,1,7,10]

Note that sets:to_list(sets:from_list(L)) produces an unreliably arranged list.

Using a General Balanced Tree Set (gb_set)

Erlang's standard libraries includes an implementation of Professor Arne Andersson's General Balanced Trees. These structures are more costly than sorting lists for small sets, but this is a much more efficient implementation when working with large sets of data.

The gb_set:from_list function will produce an ordered set of elements (dropping duplicates). The set can then be extracted back to a list for other use:

3> GBSet = gb_sets:from_list(UL).
{12,
 {10,
  {5,{2,{1,nil,nil},{3,nil,nil}},{8,{7,nil,nil},nil}},
  {72,{15,{12,nil,nil},nil},{188,{99,nil,nil},nil}}}}
4> gb_sets:to_list(GBSet).
[1,2,3,5,7,8,10,12,15,72,99,188]

Erlang/OTP Projects
Personal tools