Erlang/OTP Forums

Author Message

<  Erlang  ~  length(LIST) and lists:nth in O(1) or O(N)?

Michael Nischt
Posted: Sat Aug 25, 2007 11:03 am Reply with quote
Joined: 25 Aug 2007 Posts: 4 Location: Berlin, DE
As an Erlang newbie, I was wondering whether the operations 'length(List)' and 'lists:nth(N,List)' are performed in constant time or depend on the number of elements in the list ?


The reason I'm asking is because programmed two functions:

One to compute a list K unique integers uniformly distributed either within 1..N or From..To:
Code:

> test:random_uniforms( 4, 12 ).
[1,11,3,4]
> test:random_uniforms( 4, 5, 12 ).
[6,7,8,10]


And another one to select K elements randomly from a list:
Code:

> test:random_select( 2, [[1,2],[3,4],[5,6],[7,8]] ).
[[5,6],[7,8]]


Probably, I'm still thinking to much in a imperative way, which is why my implementation uses both length and nth. however if the complexity is different from O(1), I bet the overall complexity won't be good:

Code:

-module(test).
-author('Michael Nischt').

-export([random_uniforms/2,random_uniforms/3,random_select/2]).

% K random elements of the List
random_select ( K, List ) -> random_uniforms( K, length(List), List, [] ).

% returns a random List with k unique values between [1,..,N]
random_uniforms ( K, N ) -> random_uniforms ( K, N, lists:seq( 1, N ), [] ).

% returns a random List with k unique values between [From,..,To]
random_uniforms ( K, From, To ) -> random_uniforms ( K, To-From, lists:seq( From, To ), [] ).
   
% help functions, which moves K elements from 'From' to 'To'
random_uniforms ( K, N, From, To ) ->
   case K of
      0 -> To;
      _ when N > 0 -> E = lists:nth( random:uniform(N), From) , random_uniforms ( K-1, N-1,  lists:delete (E, From ), [E|To] )
   end.
View user's profile Send private message Visit poster's website
dsmathers
Posted: Sun Aug 26, 2007 5:31 pm Reply with quote
Joined: 20 Aug 2007 Posts: 2 Location: Atlanta, GA
Well, let's do an experiment:

Code:

-module(listop).
-compile([export_all]).

time_len(_, 0) -> void;
time_len(L, N) ->
    statistics(runtime),
    statistics(wall_clock),
    Len = length(L),
    {_,T1} = statistics(runtime),
    {_,T2} = statistics(wall_clock),
    io:format("computing length ~p took ~p (~p) ms~n", [Len, T1, T2]),
    time_len(L, N-1).

mklist(0) -> [];
mklist(N) -> [thing|mklist(N-1)].


Yeah, perhaps it's O(N), but it's barely even phases the system for lists of a reasonable size: 2,4,16,256,65536. All of these time from 0-1 ms. I can only observe it's O(N) with ridiculously large lists:

Code:

4> listop:time_len(L1, 1).
computing length 10000000 took 40 (41) ms
void
5> listop:time_len(L2, 1).
computing length 20000000 took 70 (73) ms
void


And at this time my system is about to crash because I'm using nearly 1.5 GB of memory. A similar profiling test can be construed for lists:nth - but your larger worry should be making sure your application doesn't assume memory is infinite Wink
View user's profile Send private message
Michael Nischt
Posted: Sun Aug 26, 2007 8:17 pm Reply with quote
Joined: 25 Aug 2007 Posts: 4 Location: Berlin, DE
thanks for answering and the nice micro benchmark.

Indeed it seems to behave exactly as the following Java code

Code:

class ListOp
{
   public static void main(String[] test)
   {
      int size = Integer.valueOf(test[0]);
      int loops = Integer.valueOf(test[1]);
      
      Object[] array = new Object[size];
      
      for(int l=0; l<loops; l++)
      {
         long start = System.currentTimeMillis();
         int len=0;
         for(Object o : array) len++;
         System.out.printf("Length: %d \tElapsed Time: %d\n", len, System.currentTimeMillis()-start);
      }
   }
}


Code:

java ListOp 10000000 10
java ListOp 20000000 10

*use 10 to get real results
** you may need to set the max memory, e.g. with java -Xmx4048M ..


So I guess it is in O(n) always, not only for huge numbers - the last would be fine as when exceeding 2^32-1 bits as arrays are often limited to that dimension.
Of course the scaling factor is very small, but even the 0ms will get noticable when inside a loop also with O(n) complexity, because then we get O(n^2) - same for 'lists:nth/2'.

Regarding the memory resources, I deploy apps on a grid, each node with at least 8GByte RAM - actually that's why I'm evaluating Erlang/OTC.
Although I really enjoy writing Erlang code, my main problem is that I'm still unable to program efficiently:
My random_uniforms/2 is in O(n^2) assuming lists:delete/2 to be in O(n). Even if I could get rid of it, a lists:nth/2 in O(n) would bring it to O(k*n) but I can easily write a Java/C version in O(k) and k is always <= n.

I mean it's ok that these operations are in O(n), but I cannot convince others (and myself Wink) to use Erlang not being able to write functions in a "good" complexity.
I feel stuck with implementing some simple algorithms efficiently, although tried hard. To get my head up again, it would be great if anyone could give me a hint how to optimize random_uniforms to O(k) - I mean this should be possible in Erlang too, right?

If it helps anyone, Java Code is as follows:

Code:

import java.util.Random;

class RandomUniforms
{
    public static int[] randomUniforms(int k, int n)
    {
        assert (k > 0) && (k <= n);

        Random random = new Random();

        int[] shuffle = new int[n];
        int[] uniforms = new int[k];

        for (int i = 0; i < shuffle.length; i++)
        {
            shuffle[i] = i;
        }
        for (int i = 0, r = n; i < k; i++, r--)
        {
            int j = i + random.nextInt(r);
            uniforms[i] = shuffle[j];
            shuffle[j]  = shuffle[i];
        }
        return uniforms;
    }
}


besides the allocation, like lists:seq(1,N), there is almost no difference in

Code:

randomUniforms(100, 20000)
and
randomUniforms(100, 20000000)


Thanks for your help!
View user's profile Send private message Visit poster's website
Michael Nischt
Posted: Mon Aug 27, 2007 12:44 am Reply with quote
Joined: 25 Aug 2007 Posts: 4 Location: Berlin, DE
[! Updated !]

Still not being to think of an efficient solution without data manipulation, I wrote an implementation using ets and translated the Java code from above (more or less).

Code:

-module(test_ets).
-author('Michael Nischt').

-compile([export_all]).

% returns a random List with k unique values between [1,..,N]
random_uniforms ( K, N ) -> random_uniforms ( K, 1, N ).

% returns a random List with k unique values between [From,..,To]
random_uniforms ( K, From, To ) -> Tab = ets:new(uniforms, [private]), Uniforms = random_uniforms_help(K, From, To, Tab, []), ets:delete(Tab), Uniforms.

random_uniforms_help ( K, From, To, Shuffle, Uniforms ) ->
   if
      K > 0  -> Index = From + random:uniform(To-From+1)-1,
      {Index,New} =
         case ets:lookup(Shuffle,Index) of
            [] -> {Index,Index};
            [{Index,Value}]  -> {Index,Value}
         end,
      ets:insert(Shuffle, {Index,From}),
      random_uniforms_help ( K-1, From+1, To, Shuffle, [New|Uniforms]);
      K == 0 -> Uniforms
   end.   


random_uniforms is now in O(k), which is great! Very Happy
Also it uses less memory compared to the Java version (at least when 2*k < n) - ok I could also rewrite the latter one to utilize a java.util.HashMap. Wink

However, I have to admit I'd have loved to write a version without ets - although everything should thread-safe since Shuffle is kept local within a single process when random_uniforms_help is not exported.

-> I still hoping someone comes up with a really clever implementation Smile
View user's profile Send private message Visit poster's website
francesco
Posted: Tue Aug 28, 2007 12:22 pm Reply with quote
User Joined: 07 Jul 2006 Posts: 249 Location: London
As a side node, all lists in Erlang are represented by linked lists in the VM. Inserting elements in the list is constant while searching for an element in a list is proportional to the size of the list. Also worth mentioning that in the later releases, many of the lists operations were reimplemented as BIFs to speed them up, bringing them to the same speed (or faster) as ets tables for smaller sets. Reason being that the lists module is the most used one in any Erlang app. Take a look at the lists.erl source code in the last few releases.

Regards,
Francesco
--
http://www.erlang-consulting.com
View user's profile Send private message Visit poster's website
daniello
Posted: Wed Oct 10, 2007 2:16 pm Reply with quote
User Joined: 03 Apr 2007 Posts: 15
I think that dicts are what you are looking for the following micro-benchmark shows that dict lookups operates in O(1) time and inserting K elements to dict operates in O(K*log(K)).

Benchmark
Code:
time_lookup(_, 0, M) -> void;
time_lookup(S, N, M) ->
    statistics(runtime),
    statistics(wall_clock),
    Len = dict:find(random:uniform(M), S),
    {_,T1} = statistics(runtime),
    {_,T2} = statistics(wall_clock),
    io:format("computing r ~p took ~p (~p) ms~n", [Len, T1, T2]),
    time_lookup(S, N-1, M).

mkdic(0, Dict) -> Dict;
mkdic(N, Dict) ->
  NewDict = dict:store(N, N, Dict),
  mkdic(N-1, NewDict).

mkdic(N) ->
  mkdic(N, dict:new()).

test_mkdic(N) ->
  statistics(runtime),
  statistics(wall_clock),
  mkdic(N),
  {_,T1} = statistics(runtime),
  {_,T2} = statistics(wall_clock),
  io:format("Creating dictionary of size ~p took ~p (~p) ms~n", [N, T1, T2]).


Random uniforms with dict
Code:
% returns a random List with k unique values between [1,..,N]
random_uniforms ( K, N ) -> random_uniforms ( K, 1, N ).

% returns a random List with k unique values between [From,..,To]
random_uniforms ( K, From, To ) -> Dict = dict:new(), Uniforms = random_uniforms_help(K, From, To, Dict, []), Uniforms.

random_uniforms_help ( K, From, To, Shuffle, Uniforms ) ->
   if
      K > 0  -> Index = From + random:uniform(To-From+1)-1,
      {Index,New} =
         case dict:find(Index, Shuffle) of
          {ok, Value}  -> {Index,Value};
          _            -> {Index,Index}
         end,
      dict:store(Index, From, Shuffle),
      random_uniforms_help ( K-1, From+1, To, Shuffle, [New|Uniforms]);
      K == 0 -> Uniforms
   end.
View user's profile Send private message

Display posts from previous:  

All times are GMT
Page 1 of 1
This forum is locked: you cannot post, reply to, or edit topics.

Jump to:  

You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
You cannot attach files in this forum
You cannot download files in this forum