| Author |
Message |
< Erlang ~ length(LIST) and lists:nth in O(1) or O(N)? |
| Michael Nischt |
Posted: Sat Aug 25, 2007 11:03 am |
|
|
|
Joined: 25 Aug 2007
Posts: 4
Location: Berlin, DE
|
As an Erlang newbie, I was wondering whether the operations 'length(List)' and 'lists:nth(N,List)' are performed in constant time or depend on the number of elements in the list ?
The reason I'm asking is because programmed two functions:
One to compute a list K unique integers uniformly distributed either within 1..N or From..To:
Code:
> test:random_uniforms( 4, 12 ).
[1,11,3,4]
> test:random_uniforms( 4, 5, 12 ).
[6,7,8,10]
And another one to select K elements randomly from a list:
Code:
> test:random_select( 2, [[1,2],[3,4],[5,6],[7,8]] ).
[[5,6],[7,8]]
Probably, I'm still thinking to much in a imperative way, which is why my implementation uses both length and nth. however if the complexity is different from O(1), I bet the overall complexity won't be good:
Code:
-module(test).
-author('Michael Nischt').
-export([random_uniforms/2,random_uniforms/3,random_select/2]).
% K random elements of the List
random_select ( K, List ) -> random_uniforms( K, length(List), List, [] ).
% returns a random List with k unique values between [1,..,N]
random_uniforms ( K, N ) -> random_uniforms ( K, N, lists:seq( 1, N ), [] ).
% returns a random List with k unique values between [From,..,To]
random_uniforms ( K, From, To ) -> random_uniforms ( K, To-From, lists:seq( From, To ), [] ).
% help functions, which moves K elements from 'From' to 'To'
random_uniforms ( K, N, From, To ) ->
case K of
0 -> To;
_ when N > 0 -> E = lists:nth( random:uniform(N), From) , random_uniforms ( K-1, N-1, lists:delete (E, From ), [E|To] )
end. |
|
|
| Back to top |
|
| dsmathers |
Posted: Sun Aug 26, 2007 5:31 pm |
|
|
|
Joined: 20 Aug 2007
Posts: 2
Location: Atlanta, GA
|
Well, let's do an experiment:
Code:
-module(listop).
-compile([export_all]).
time_len(_, 0) -> void;
time_len(L, N) ->
statistics(runtime),
statistics(wall_clock),
Len = length(L),
{_,T1} = statistics(runtime),
{_,T2} = statistics(wall_clock),
io:format("computing length ~p took ~p (~p) ms~n", [Len, T1, T2]),
time_len(L, N-1).
mklist(0) -> [];
mklist(N) -> [thing|mklist(N-1)].
Yeah, perhaps it's O(N), but it's barely even phases the system for lists of a reasonable size: 2,4,16,256,65536. All of these time from 0-1 ms. I can only observe it's O(N) with ridiculously large lists:
Code:
4> listop:time_len(L1, 1).
computing length 10000000 took 40 (41) ms
void
5> listop:time_len(L2, 1).
computing length 20000000 took 70 (73) ms
void
And at this time my system is about to crash because I'm using nearly 1.5 GB of memory. A similar profiling test can be construed for lists:nth - but your larger worry should be making sure your application doesn't assume memory is infinite  |
|
|
| Back to top |
|
| Michael Nischt |
Posted: Sun Aug 26, 2007 8:17 pm |
|
|
|
Joined: 25 Aug 2007
Posts: 4
Location: Berlin, DE
|
thanks for answering and the nice micro benchmark.
Indeed it seems to behave exactly as the following Java code
Code:
class ListOp
{
public static void main(String[] test)
{
int size = Integer.valueOf(test[0]);
int loops = Integer.valueOf(test[1]);
Object[] array = new Object[size];
for(int l=0; l<loops; l++)
{
long start = System.currentTimeMillis();
int len=0;
for(Object o : array) len++;
System.out.printf("Length: %d \tElapsed Time: %d\n", len, System.currentTimeMillis()-start);
}
}
}
Code:
java ListOp 10000000 10
java ListOp 20000000 10
*use 10 to get real results
** you may need to set the max memory, e.g. with java -Xmx4048M ..
So I guess it is in O(n) always, not only for huge numbers - the last would be fine as when exceeding 2^32-1 bits as arrays are often limited to that dimension.
Of course the scaling factor is very small, but even the 0ms will get noticable when inside a loop also with O(n) complexity, because then we get O(n^2) - same for 'lists:nth/2'.
Regarding the memory resources, I deploy apps on a grid, each node with at least 8GByte RAM - actually that's why I'm evaluating Erlang/OTC.
Although I really enjoy writing Erlang code, my main problem is that I'm still unable to program efficiently:
My random_uniforms/2 is in O(n^2) assuming lists:delete/2 to be in O(n). Even if I could get rid of it, a lists:nth/2 in O(n) would bring it to O(k*n) but I can easily write a Java/C version in O(k) and k is always <= n.
I mean it's ok that these operations are in O(n), but I cannot convince others (and myself ) to use Erlang not being able to write functions in a "good" complexity.
I feel stuck with implementing some simple algorithms efficiently, although tried hard. To get my head up again, it would be great if anyone could give me a hint how to optimize random_uniforms to O(k) - I mean this should be possible in Erlang too, right?
If it helps anyone, Java Code is as follows:
Code:
import java.util.Random;
class RandomUniforms
{
public static int[] randomUniforms(int k, int n)
{
assert (k > 0) && (k <= n);
Random random = new Random();
int[] shuffle = new int[n];
int[] uniforms = new int[k];
for (int i = 0; i < shuffle.length; i++)
{
shuffle[i] = i;
}
for (int i = 0, r = n; i < k; i++, r--)
{
int j = i + random.nextInt(r);
uniforms[i] = shuffle[j];
shuffle[j] = shuffle[i];
}
return uniforms;
}
}
besides the allocation, like lists:seq(1,N), there is almost no difference in
Code:
randomUniforms(100, 20000)
and
randomUniforms(100, 20000000)
Thanks for your help! |
|
|
| Back to top |
|
| Michael Nischt |
Posted: Mon Aug 27, 2007 12:44 am |
|
|
|
Joined: 25 Aug 2007
Posts: 4
Location: Berlin, DE
|
[! Updated !]
Still not being to think of an efficient solution without data manipulation, I wrote an implementation using ets and translated the Java code from above (more or less).
Code:
-module(test_ets).
-author('Michael Nischt').
-compile([export_all]).
% returns a random List with k unique values between [1,..,N]
random_uniforms ( K, N ) -> random_uniforms ( K, 1, N ).
% returns a random List with k unique values between [From,..,To]
random_uniforms ( K, From, To ) -> Tab = ets:new(uniforms, [private]), Uniforms = random_uniforms_help(K, From, To, Tab, []), ets:delete(Tab), Uniforms.
random_uniforms_help ( K, From, To, Shuffle, Uniforms ) ->
if
K > 0 -> Index = From + random:uniform(To-From+1)-1,
{Index,New} =
case ets:lookup(Shuffle,Index) of
[] -> {Index,Index};
[{Index,Value}] -> {Index,Value}
end,
ets:insert(Shuffle, {Index,From}),
random_uniforms_help ( K-1, From+1, To, Shuffle, [New|Uniforms]);
K == 0 -> Uniforms
end.
random_uniforms is now in O(k), which is great!
Also it uses less memory compared to the Java version (at least when 2*k < n) - ok I could also rewrite the latter one to utilize a java.util.HashMap.
However, I have to admit I'd have loved to write a version without ets - although everything should thread-safe since Shuffle is kept local within a single process when random_uniforms_help is not exported.
-> I still hoping someone comes up with a really clever implementation  |
|
|
| Back to top |
|
| francesco |
Posted: Tue Aug 28, 2007 12:22 pm |
|
|
|
User
Joined: 07 Jul 2006
Posts: 249
Location: London
|
As a side node, all lists in Erlang are represented by linked lists in the VM. Inserting elements in the list is constant while searching for an element in a list is proportional to the size of the list. Also worth mentioning that in the later releases, many of the lists operations were reimplemented as BIFs to speed them up, bringing them to the same speed (or faster) as ets tables for smaller sets. Reason being that the lists module is the most used one in any Erlang app. Take a look at the lists.erl source code in the last few releases.
Regards,
Francesco
--
http://www.erlang-consulting.com |
|
|
| Back to top |
|
| daniello |
Posted: Wed Oct 10, 2007 2:16 pm |
|
|
|
User
Joined: 03 Apr 2007
Posts: 15
|
I think that dicts are what you are looking for the following micro-benchmark shows that dict lookups operates in O(1) time and inserting K elements to dict operates in O(K*log(K)).
Benchmark
Code: time_lookup(_, 0, M) -> void;
time_lookup(S, N, M) ->
statistics(runtime),
statistics(wall_clock),
Len = dict:find(random:uniform(M), S),
{_,T1} = statistics(runtime),
{_,T2} = statistics(wall_clock),
io:format("computing r ~p took ~p (~p) ms~n", [Len, T1, T2]),
time_lookup(S, N-1, M).
mkdic(0, Dict) -> Dict;
mkdic(N, Dict) ->
NewDict = dict:store(N, N, Dict),
mkdic(N-1, NewDict).
mkdic(N) ->
mkdic(N, dict:new()).
test_mkdic(N) ->
statistics(runtime),
statistics(wall_clock),
mkdic(N),
{_,T1} = statistics(runtime),
{_,T2} = statistics(wall_clock),
io:format("Creating dictionary of size ~p took ~p (~p) ms~n", [N, T1, T2]).
Random uniforms with dict
Code: % returns a random List with k unique values between [1,..,N]
random_uniforms ( K, N ) -> random_uniforms ( K, 1, N ).
% returns a random List with k unique values between [From,..,To]
random_uniforms ( K, From, To ) -> Dict = dict:new(), Uniforms = random_uniforms_help(K, From, To, Dict, []), Uniforms.
random_uniforms_help ( K, From, To, Shuffle, Uniforms ) ->
if
K > 0 -> Index = From + random:uniform(To-From+1)-1,
{Index,New} =
case dict:find(Index, Shuffle) of
{ok, Value} -> {Index,Value};
_ -> {Index,Index}
end,
dict:store(Index, From, Shuffle),
random_uniforms_help ( K-1, From+1, To, Shuffle, [New|Uniforms]);
K == 0 -> Uniforms
end. |
|
|
| Back to top |
|
|
|
All times are GMT
|
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum You cannot attach files in this forum You cannot download files in this forum
|
|
|