Erlang questions mailing list: Not an Erlang fan

Guest
Posted: Sun Sep 23, 2007 11:58 am
http://www.tbray.org/ongoing/When/200x/2007/09/22/Erlang

Tim Bray might raise some valid points here, even if he's slightly
biased by his background.

--
Didier
_______________________________________________
erlang-questions mailing list
erlang-questions@erlang.org
http://www.erlang.org/mailman/listinfo/erlang-questions
Post received from mailing list
Guest
Posted: Sun Sep 23, 2007 4:22 pm
He definitely seems kind of biased at different points, but still it would be great to find out where he went wrong!

-Alex
Guest
Posted: Sun Sep 23, 2007 4:34 pm
On 9/23/07, Alex Alvarez <eajam@hotmail.com> wrote:
> He definitely seems kind of biased at different points, but still it would be great to find out where he went wrong!

Isn't the problem that he's basing his whole analysis on file I/O and, what's more, on a single file, which doesn't lend itself to parallelism?
Guest
Posted: Sun Sep 23, 2007 5:38 pm
> > > http://www.tbray.org/ongoing/When/200x/2007/09/22/Erlang
> > >
> > > Tim Bray might raise some valid points here, even if he's slightly
> > > biased by his background.

The good news is that speeding up the I/O in Erlang should be easier than
introducing better concurrency to another language.

-Patrick
Guest
Posted: Sun Sep 23, 2007 6:10 pm
As with any language, it's hard to draw firm conclusions when you barely know anything about it.
Thomas Lindgren
Posted: Sun Sep 23, 2007 6:22 pm
User Joined: 09 Mar 2005 Posts: 279
--- Keith Irwin <keith.irwin@gmail.com> wrote:

> On 9/23/07, Alex Alvarez <eajam@hotmail.com> wrote:
> >
> > He definitely seems kind of biased at different points, but still it
> > would be great to find out where he went wrong!
> >
>
> Isn't it that he's basing his whole analysis on file io, what's more, a
> single file which doesn't lend itself to parallelism? Had he started by
> writing a client/server or p2p application around some domain other than
> system-admin stuff, perhaps he'd be much more favorable towards the
> multitude of strengths Erlang has to offer.

He's also using the obvious, tempting but very slow
io:get_line. Reading the entire file into a binary
takes 7 ms (sic) using file:read_file, not 34 seconds
using io:get_line as he reports. For a beginner it's
not obvious what to use, though, so life could be
easier.
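
A minimal sketch of the two approaches being compared, for anyone who wants to time them on their own data. The module and function names (read_compare, lines_get_line/1, lines_read_file/1, time_both/1) are made up for illustration and are not from Tim Bray's code. Both functions simply count lines. A final line without a trailing newline is counted by the first function but not by the second.

```erlang
-module(read_compare).
-export([lines_get_line/1, lines_read_file/1, time_both/1]).

%% Count lines by asking the io server for one line at a time.
lines_get_line(FileName) ->
    {ok, Io} = file:open(FileName, [read]),
    try
        count(Io, 0)
    after
        file:close(Io)
    end.

count(Io, N) ->
    case io:get_line(Io, '') of
        eof -> N;
        {error, Reason} -> erlang:error(Reason);
        _Line -> count(Io, N + 1)
    end.

%% Count lines by reading the whole file into one binary and
%% counting the newline bytes in it.
lines_read_file(FileName) ->
    {ok, Bin} = file:read_file(FileName),
    length(binary:matches(Bin, <<"\n">>)).

%% Return {Label, Microseconds, LineCount} for each approach.
time_both(FileName) ->
    {T1, N1} = timer:tc(fun() -> lines_get_line(FileName) end),
    {T2, N2} = timer:tc(fun() -> lines_read_file(FileName) end),
    [{get_line, T1, N1}, {read_file, T2, N2}].
```

Calling read_compare:time_both("some.log") returns the two timings in microseconds. The actual numbers depend on the machine, the file size, and whether the file is already in the OS cache.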

My own experience with parsing XML in Erlang vs Ruby
is that xmerl parsing about 4 MB of XML handily beat
"the obvious" Ruby library the other guy used
(REXML?), being 10+ times faster or more -- xmerl
needed 10 seconds versus "a few minutes" for Ruby. So
I wouldn't say Erlang is inherently slow w.r.t.
parsing, but again, one may need some experience to
get it right.

Best,
Thomas



Guest
Posted: Sun Sep 23, 2007 6:46 pm
On Sep 23, 2007, at 10:20 AM, Thomas Lindgren wrote:
>
> My own experience with parsing XML in Erlang vs Ruby
> is that xmerl parsing about 4 MB of XML handily beat
> "the obvious" Ruby library the other guy used
> (REXML?), being 10+ times faster or more -- xmerl
> needed 10 seconds versus "a few minutes" for Ruby. So
> I wouldn't say Erlang is inherently slow w.r.t.
> parsing, but again, one may need some experience to
> get it right.
>

Ruby is known to be very slow. REXML is a pure Ruby XML parser. It's
the slowest XML parser I've ever used.

You've set the bar too low for xmerl to pass. :) Now, if xmerl
beats a C XML parser, I'd be impressed. :)

Chris
Guest
Posted: Sun Sep 23, 2007 7:21 pm
On 9/23/07, Thomas Lindgren <thomasl_erlang@yahoo.com> wrote:

> He's also using the obvious, tempting but very slow
> io:get_line. Reading the entire file into a binary
> takes 7 ms (sic) using file:read_file, not 34 seconds
> using io:get_line as he reports.

He reports 34 seconds for the whole log file of about 1 million lines.
Not for the reduced sample of about 20000 lines that he made
available on the site (a difference of 50x in size).

Cheers
P.
dcaoyuan
Posted: Sun Sep 23, 2007 7:24 pm
User Joined: 28 Mar 2007 Posts: 34
Tim's example is not about I/O. Reading the whole file into a binary is
very quick, but simply traversing that binary byte by byte takes a lot
of time. I wrote a simple test module; please take a look at test2/1,
a function that does nothing but traverse a binary. When the binary is
read from a 200 MB file, traversing it takes about 30 s.

-module(widefinder).

-export([test/1,
         test1/1,
         test2/1,
         test3/1]).

%% test/1: read the file line by line with io:get_line/2 and count the
%% lines that contain a matching "/ongoing/When/" path.
test(FileName) ->
    statistics(wall_clock),
    {ok, IO} = file:open(FileName, [read]),
    {Matched, Total} = scan_line(IO),
    ok = file:close(IO),
    {_, Duration} = statistics(wall_clock),
    io:format("Duration ~pms~n Matched:~B, Total:~B",
              [Duration, Matched, Total]).

%% Total starts at -1 because the first "line" is the empty seed string.
scan_line(IO) -> scan_line("", IO, 0, -1).

scan_line(eof, _, Matched, Total) -> {Matched, Total};
scan_line(Line, IO, Matched, Total) ->
    NewCount = Matched + process_match(Line),
    scan_line(io:get_line(IO, ''), IO, NewCount, Total + 1).

%% Return 1 if the line contains "/ongoing/When/" and no '.' appears
%% between that prefix and the next space; otherwise return 0.
process_match([]) -> 0;
process_match("/ongoing/When/" ++ Rest) ->
    case parse_until_space(Rest, false) of
        true -> 0;
        false -> 1
    end;
process_match([_H|Rest]) -> process_match(Rest).

%% test1/1: read the whole file into a binary and count lines by
%% walking it one byte at a time. Matching is disabled here.
test1(FileName) ->
    statistics(wall_clock),
    {ok, Bin} = file:read_file(FileName),
    {Matched, Total} = scan_line1(Bin),
    {_, Duration} = statistics(wall_clock),
    io:format("Duration ~pms~n Matched:~B, Total:~B",
              [Duration, Matched, Total]).

scan_line1(Bin) -> scan_line1(Bin, [], 0, 0).

scan_line1(<<>>, _Line, Matched, Total) -> {Matched, Total};
scan_line1(<<$\n, Rest/binary>>, _Line, Matched, Total) ->
    %% Disabled: Line1 = lists:reverse(Line),
    scan_line1(Rest, [], Matched, Total + 1);
scan_line1(<<C:1/binary, Rest/binary>>, Line, Matched, Total) ->
    %% Disabled: NewCount = Matched + process_match(Line),
    %% Note that C is a 1-byte binary here, not a character code.
    scan_line1(Rest, [C|Line], Matched, Total).

%% test2/1: read the whole file into a binary and do nothing but
%% traverse it byte by byte, counting the bytes.
test2(FileName) ->
    statistics(wall_clock),
    {ok, Bin} = file:read_file(FileName),
    Total = travel_bin(Bin),
    {_, Duration} = statistics(wall_clock),
    io:format("Duration ~pms~n Total:~B", [Duration, Total]).

travel_bin(Bin) -> travel_bin(Bin, 0).

travel_bin(<<>>, ByteCount) -> ByteCount;
travel_bin(<<_C:1/binary, Rest/binary>>, ByteCount) ->
    travel_bin(Rest, ByteCount + 1).

%% test3/1: same as test2/1, but convert the binary to a list first
%% and traverse the list.
test3(FileName) ->
    statistics(wall_clock),
    {ok, Bin} = file:read_file(FileName),
    Total = travel_list(binary_to_list(Bin)),
    {_, Duration} = statistics(wall_clock),
    io:format("Duration ~pms~n Total:~B", [Duration, Total]).

travel_list(List) -> travel_list(List, 0).

travel_list([], CharCount) -> CharCount;
travel_list([_C|Rest], CharCount) ->
    travel_list(Rest, CharCount + 1).

%% Scan up to the next space: return true if a '.' is seen first,
%% otherwise return Bool. End of input is treated like a space so that
%% a line ending right after the path does not crash.
parse_until_space([], Bool) -> Bool;
parse_until_space([$\040|_Rest], Bool) -> Bool;
parse_until_space([$.|_Rest], _Bool) -> true;
parse_until_space([_H|Rest], Bool) -> parse_until_space(Rest, Bool).



--
- Caoyuan
Thomas Lindgren
Posted: Sun Sep 23, 2007 8:02 pm
User Joined: 09 Mar 2005 Posts: 279
--- Chris Wong <chris@chriswongstudio.com> wrote:

>
> On Sep 23, 2007, at 10:20 AM, Thomas Lindgren wrote:
> >
> > My own experience with parsing XML in Erlang vs Ruby
> > is that xmerl parsing about 4 MB of XML handily beat
> > "the obvious" Ruby library the other guy used
> > (REXML?), being 10+ times faster or more -- xmerl
> > needed 10 seconds versus "a few minutes" for Ruby. So
> > I wouldn't say Erlang is inherently slow w.r.t.
> > parsing, but again, one may need some experience to
> > get it right.
> >
>
> Ruby is known to be very slow. REXML is a pure Ruby XML parser. It's
> the slowest XML parser I've ever used.
>
> You've set the bar too low for xmerl to pass. :) Now, if xmerl
> beats a C XML parser, I'd be impressed. :)

Oh yeah :)

But just to be clear: my point is not that xmerl is
beating highly optimized parsers written in portable
assembly language nor that REXML is the gold standard
of Ruby XML parsing, but that one shouldn't jump to
conclusions based on simple examples.

I think Erlang and xmerl are not world-beaters but
pretty competitive in this instance, and that it's
thus premature to say parsing or file processing in
Erlang is inherently slow. (As far as I can tell, Tim
Bray doesn't say that either.) I'll grant that getting
there can be much messier than with scripting
languages because there is less support for those
kinds of operations.

Best,
Thomas




Guest
Posted: Sun Sep 23, 2007 9:16 pm
On 9/24/07, Patrick Logan <patrickdlogan@gmail.com> wrote:
> > > > http://www.tbray.org/ongoing/When/200x/2007/09/22/Erlang
> > > >
> > > > Tim Bray might raise some valid points here, even if he's slightly
> > > > biased by his background.
>
> The good news is speeding up the i/o in erlang should be easier than
> introducing better concurrency to another language.
>

I've never had a problem with Erlang's general I/O performance; it's
probably just some implementation detail of direct file I/O that is
the loser here. The obvious Erlang fast path to read lines is to spawn
cat and let the port machinery do all of the work for you. Here's an
example (including a copy of Tim's dataset):

http://undefined.org/erlang/o10k.zip
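
For readers who cannot fetch the archive, here is a rough sketch of the "spawn cat" idea. It is not the code from the zip file. The module name port_lines and the 4096-byte line limit are assumptions, and it presumes a Unix-like system with cat on the PATH. The {line, N} port option makes the port deliver one message per line.

```erlang
-module(port_lines).
-export([count_lines/1]).

%% Count the newline-terminated lines of FileName by letting an
%% external `cat` process stream the file to us through a port.
count_lines(FileName) ->
    Cat = os:find_executable("cat"),   % returns false if cat is missing
    Port = open_port({spawn_executable, Cat},
                     [{args, [FileName]}, in, binary, eof, {line, 4096}]),
    loop(Port, 0).

loop(Port, Count) ->
    receive
        %% A complete line, or the last piece of a long line.
        {Port, {data, {eol, _Line}}} ->
            loop(Port, Count + 1);
        %% A piece of a line longer than 4096 bytes, or trailing data
        %% with no newline at the end of the file.
        {Port, {data, {noeol, _Part}}} ->
            loop(Port, Count);
        %% cat closed its output: we have seen everything.
        {Port, eof} ->
            port_close(Port),
            Count
    end.
```

A real log scanner would inspect _Line in the first clause instead of only counting it.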

-bob
Thomas Lindgren
Posted: Sun Sep 23, 2007 9:22 pm
User Joined: 09 Mar 2005 Posts: 279
--- Pierpaolo Bernardi <olopierpa@gmail.com> wrote:

> On 9/23/07, Thomas Lindgren <thomasl_erlang@yahoo.com> wrote:
>
> > He's also using the obvious, tempting but very slow
> > io:get_line. Reading the entire file into a binary
> > takes 7 ms (sic) using file:read_file, not 34 seconds
> > using io:get_line as he reports.
>
> He reports 34 seconds for the whole log file of about 1 million lines.
> Not for the reduced sample of about 20000 lines that he made
> available on the site (a difference of 50x in size).

Oops, sorry about missing that. Even so, the I/O as
such does not appear very costly. Chunk the processing
into 50 reads of 2MB each, or whatever size is
suitable. The total cost of these operations should
then be on the order of 50*7=350 ms (let's say around
a second, because it really depends on whether the
data have to be fetched from disk, seeks, bandwidth,
memory management, etc). At a guess, most of the time
will instead be spent in scanning and processing the
binaries.
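
A minimal sketch of such chunked reading, assuming lines end in "\n". The module name chunk_lines and its functions are hypothetical. Each chunk is scanned for complete lines, and the unterminated tail is carried over to the next read. An unterminated last line is not counted.

```erlang
-module(chunk_lines).
-export([count_lines/2, scan_chunk/1]).

%% Count newline-terminated lines, reading ChunkSize bytes at a time.
count_lines(FileName, ChunkSize) when ChunkSize > 0 ->
    {ok, Fd} = file:open(FileName, [read, raw, binary]),
    try
        loop(Fd, ChunkSize, <<>>, 0)
    after
        file:close(Fd)
    end.

loop(Fd, ChunkSize, Carry, Count) ->
    case file:read(Fd, ChunkSize) of
        {ok, Data} ->
            %% Prepend the partial line left over from the last chunk.
            {N, Rest} = scan_chunk(<<Carry/binary, Data/binary>>),
            loop(Fd, ChunkSize, Rest, Count + N);
        eof ->
            Count
    end.

%% Return {NumberOfCompleteLines, BytesAfterTheLastNewline}.
scan_chunk(Bin) ->
    case binary:matches(Bin, <<"\n">>) of
        [] ->
            {0, Bin};
        Matches ->
            {Last, 1} = lists:last(Matches),
            Start = Last + 1,
            {length(Matches),
             binary:part(Bin, Start, byte_size(Bin) - Start)}
    end.
```

With the figures above, the call would be chunk_lines:count_lines(File, 2 * 1024 * 1024), which issues about 50 reads for a file of roughly 100 MB.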

Regarding parallelism, it looks to me like the reading
and processing can be overlapped. You have to
special-case lines or data that span chunks, but apart
from that, it looks as if you could process each chunk
independently, at least when you are doing
map/filter/reduce style operations (where the output
is combined incrementally as chunks are processed).
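
As a rough illustration of processing chunks independently, the sketch below splits a binary at newline boundaries, counts pattern occurrences in each chunk in its own process, and sums the results. The module name par_count and its functions are hypothetical. It assumes the pattern is non-empty and contains no newline, so a match can never span two chunks.

```erlang
-module(par_count).
-export([count_matches/3, split_at_newlines/2]).

%% Count occurrences of Pattern in Bin using one process per chunk.
count_matches(Bin, Pattern, ChunkSize) when ChunkSize > 0 ->
    Parent = self(),
    Refs = [start_worker(Parent, Chunk, Pattern)
            || Chunk <- split_at_newlines(Bin, ChunkSize)],
    %% Reduce step: collect one partial count per worker and add them up.
    lists:sum([receive {Ref, N} -> N end || Ref <- Refs]).

start_worker(Parent, Chunk, Pattern) ->
    Ref = make_ref(),
    spawn_link(fun() ->
                   Parent ! {Ref, length(binary:matches(Chunk, Pattern))}
               end),
    Ref.

%% Split Bin into chunks of at least ChunkSize bytes, each ending just
%% after a newline (the last chunk may be shorter or unterminated).
split_at_newlines(<<>>, _ChunkSize) ->
    [];
split_at_newlines(Bin, ChunkSize) when byte_size(Bin) =< ChunkSize ->
    [Bin];
split_at_newlines(Bin, ChunkSize) ->
    Scope = {ChunkSize, byte_size(Bin) - ChunkSize},
    case binary:match(Bin, <<"\n">>, [{scope, Scope}]) of
        nomatch ->
            [Bin];
        {Pos, 1} ->
            {Chunk, Rest} = split_binary(Bin, Pos + 1),
            [Chunk | split_at_newlines(Rest, ChunkSize)]
    end.
```

This version works on a binary that is already in memory; overlapping the reading with the processing, as suggested above, would mean starting a worker as soon as each chunk has been read.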

Best,
Thomas




Guest
Posted: Mon Sep 24, 2007 1:37 am
But what he is parsing here is plain text, not XML -- and his
complaints are about I/O and regexes, which are indeed slow. However, I
think that he's mocking some of Erlang's best features without really
using them, and I think we should prove to him -- he asks for it, after
all! -- that Erlang can do the job, and that if he can't wrap his old
Perl brain around it, that's not our fault...

--
Didier

Thomas Lindgren
Posted: Mon Sep 24, 2007 7:11 am
User Joined: 09 Mar 2005 Posts: 279
--- dda <headspin@gmail.com> wrote:

> But what he is parsing here is plain text, not XML

Indeed, but the XML example shows that parsing as such
is not inherently slower in Erlang than in Ruby. The
point is that simple examples and benchmarks may not
generalize to broader statements.

> -- and his
> complaints are about i/o and regexes, which are
> indeed slow.

I wouldn't say I/O is inherently slow; for example,
reading large chunks of files into memory with
file:read_file and similar functions is fast enough.

Regexps are currently not a strength and could be sped
up. (Robert, here's your time to shine. :)

Also, when I consider the use of io:get_line/2, it
seems it may be "too easy" to choose a suboptimal
solution to some problems. That can be a problem in
situations like this, when the user base grows quickly
outside the core community and the new users thus
can't rely on helpful insiders.

Best,
Thomas
uwiger
Posted: Mon Sep 24, 2007 7:25 am
User Joined: 03 Jul 2006 Posts: 567 Location: Sweden
Thomas Lindgren wrote:
>
> Also, when I consider the use of io:get_line/2, it
> seems it may be "too easy" to choose a suboptimal
> solution to some problems. That can be a problem in
> situations like this, when the user base grows quickly
> outside the core community and the new users thus
> can't rely on helpful insiders.

There were some suggestions on how to perform line-oriented
I/O much faster in Erlang, in this post and the replies
that followed:

http://www.erlang.org/pipermail/erlang-questions/2007-June/027557.html

The problem I ran into was actually lack of flow control:
data flowed in so fast that the receiving end couldn't
keep up, even with minimal processing.

BR,
Ulf W