Erlang/OTP Forums

Erlang questions mailing list ~ Keeping massive concurrency when interfacing with C

Posted: Sun Oct 02, 2011 9:10 pm
Hi everyone,

From my understanding, there are four main ways to interface Erlang
with C:

* C Node
* Port
* Linked-In Driver
* NIF (Native Implemented Function)

My problem is if I have, for example, spawned 20,000 Erlang processes
and I want them all to execute concurrently but they need to call C
code, how can I have that C code run concurrently without having to
spawn 20,000 threads in C (which would probably crash the OS) or using
obscene amounts of memory?

I've been reading over the examples of a C node, and it seems if
20,000 processes all send a message to the C node, the node will
process them one-by-one and not concurrently, so it becomes a
serialized bottleneck. Spawning 20,000 C nodes on a single machine
isn't feasible, because of the amount of memory that would require.

A port suffers from the same problem, since the Erlang processes would
be communicating with a single external program, and again, I can't
create 20,000 instances of that program.

Reading the documentation for a linked-in driver, it says:
http://www.erlang.org/doc/tutorial/c_portdriver.html

"Just as with a port program, the port communicates with a Erlang
process. All communication goes through one Erlang process that is the
connected process of the port driver. Terminating this process closes
the port driver."

But on the driver documentation page:
http://www.erlang.org/doc/man/erl_driver.html

"A driver is a library with a set of function that the emulator calls,
in response to Erlang functions and message sending. There may be
multiple instances of a driver, each instance is connected to an
Erlang port. Every port has a port owner process. Communication with
the port is normally done through the port owner process."

So this also seems to have the same problem as C nodes and ports,
since in order to maintain concurrency I would need 20,000 instances
of the same driver.

Finally, we have NIFs. These have potential, but when I read the
documentation:
http://www.erlang.org/doc/man/erl_nif.html

"Avoid doing lengthy work in NIF calls as that may degrade the
responsiveness of the VM. NIFs are called directly by the same
scheduler thread that executed the calling Erlang code. The calling
scheduler will thus be blocked from doing any other work until the NIF
returns."

So if one Erlang process calls a NIF, does this mean the other 19,999
processes are blocked until the NIF returns (or the subset of
processes a scheduler manages)? If so, this won't work either.

Does anyone have a solution to this that still allows you to use C
(I'm using C for the parts that are intensive number crunching)? Or
will I have to implement everything in Erlang?

Thanks!
_______________________________________________
erlang-questions mailing list
erlang-questions@erlang.org
http://erlang.org/mailman/listinfo/erlang-questions
Post received from mailinglist
Posted: Mon Oct 03, 2011 1:44 am
Thanks for the reply, Kresten!

I definitely would not be doing any disk I/O in the C code. It would
be intense number crunching, so it would be CPU (and perhaps memory)
bound. Everything I've read states Erlang is not good at number
crunching (Cesarini mentions this in his "Erlang Programming" book) so
I'm considering writing the code to do that in C.

If I call a NIF, only the particular scheduler that manages that
Erlang process would be blocked and no other scheduler, right? So for
example, if I have a CPU with eight cores, and an Erlang scheduler
thread is running on each core, and say the third scheduler is
executing an Erlang process that calls a NIF (and so blocks), only
that scheduler would be blocked until the NIF finishes executing,
correct?

I'm debating which solution would be better. Erlang would be slower at
number crunching, but is extremely efficient at managing concurrently
executing processes, meaning each would gradually make progress every
X units of time since they'll all get a turn to execute. But I wonder
if having a single process execute NIF code until it finishes (and so
all the processes managed by a single scheduler execute serially)
would be faster than implementing it all in Erlang and having
processes execute concurrently within a single scheduler (albeit the
code would be slower to execute). There would be less overhead of
Erlang process context switching (although admittedly that isn't much
to begin with) and the C code would be faster at number crunching. I
suppose there's only one way to find out! :)

I was also thinking about writing the number crunching code in some
other language than C, such as OCaml. OCaml has a reputation for being
as fast as C, yet not nearly as low-level. Maybe that would be a good
fit with Erlang.

Example benchmarks:
http://shootout.alioth.debian.org/u32q/benchmark.php?test=all&lang=hipe&lang2=gpp

http://shootout.alioth.debian.org/u32q/benchmark.php?test=all&lang=ocaml

The Erlang benchmark was using HiPE as well.

Thanks for the suggestion!

Posted: Mon Oct 03, 2011 6:32 am
Hi John,

There are a few things I can add here.

As Kresten said, the number of threads that can truly run in parallel on
a machine depends on the number of cores. Nevertheless, running your
code in threads may bring an advantage in certain cases.

Now, what I missed in your e-mails is how you would like the
information from Erlang to be processed. That is, is the information
processed by one Erlang thread linked to that of another running thread?
Or does each thread have its own distinct information? These questions
need to be answered before you proceed further in designing your
application.

If the information is linked, then serializing the data does not seem
such a bad idea (depending on how closely the threads' data are related).
Otherwise, in the case of independent data per thread, you don't need to
worry about creating 20k threads in C (using NIFs); just create a
dynamic library (.so, shared object) and load it before starting your
Erlang threads. Linux will take care of the rest (it will create as many
data instances within your library as required). You just need to make
sure that your library is thread safe (mainly, no memory leaks and no
attempt to use more buffer memory than you physically have).
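
What keeps per-thread data separate in practice is that a thread-safe function holds its working state in its arguments and local variables, so every calling thread gets its own stack frame while the code itself is loaded only once. A minimal sketch in plain C with POSIX threads follows; the names `job_t`, `sum_of_squares` and `run_jobs` are made up for illustration and are not part of any Erlang API:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum { MAX_JOBS = 64 };  /* illustrative upper bound, not a real limit */

/* Hypothetical unit of work: each job carries its own input and output,
 * so the worker touches no global or static state. */
typedef struct {
    const double *values;  /* input, owned by the caller */
    size_t count;          /* number of input values */
    double result;         /* output, written by the worker */
} job_t;

/* Reentrant: all working state lives in locals and in the arguments. */
static double sum_of_squares(const double *values, size_t count)
{
    double acc = 0.0;
    for (size_t i = 0; i < count; i++)
        acc += values[i] * values[i];
    return acc;
}

/* Thread entry point: unpack the job, compute, store the result. */
static void *job_main(void *arg)
{
    job_t *job = arg;
    job->result = sum_of_squares(job->values, job->count);
    return NULL;
}

/* Run every job on its own thread and wait for all of them.
 * Returns 0 on success, -1 if a thread could not be created. */
static int run_jobs(job_t *jobs, size_t njobs)
{
    pthread_t threads[MAX_JOBS];
    size_t started = 0;
    int status = 0;

    assert(njobs <= MAX_JOBS);
    for (; started < njobs; started++) {
        if (pthread_create(&threads[started], NULL, job_main,
                           &jobs[started]) != 0) {
            status = -1;
            break;
        }
    }
    /* Join only the threads that were actually started. */
    for (size_t i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return status;
}
```

All threads execute the same copy of `sum_of_squares`; only the stack frames and the `job_t` records differ. A function that kept its accumulator in a `static` variable would not be safe to call this way.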

If you wonder whether to use Erlang or plain C (or any programming
language, for that matter), think first about what you need in the end.
All of us would like to have super-speedy applications that squeeze the
maximum computational power from our hardware, but we all need to make
some compromises. My impression from using Erlang is that it is not
suited to regular desktop applications; instead, it's a very handy tool
for developing applications such as non-blocking complex data processing
and fast network applications (and maybe more, but so far I have used
Erlang only for those). It's not that you cannot achieve all of this by
writing your applications in C, but why reinvent the wheel when you can
just have it? Erlang is robust enough to give you a nice environment for
these kinds of applications.

In conclusion, using Erlang is just a matter of taste and of how
comfortable you feel with such a programming language. Searching for
benchmarks of a programming language doesn't help much, because they are
usually made under specific conditions which, in 90% of cases, do not
fit your needs. In this case, since you need high concurrency, I suggest
you prefer more cores at a lower frequency over fewer cores at a higher
frequency (or, if you can afford it, a GPU instead of a CPU). Keep in
mind that whatever you choose, you will always be restricted by your
hardware, and for the few milliseconds you may gain per process you will
need to work hours, if not days.

Good luck!

Cheers,
CGS


Posted: Mon Oct 03, 2011 11:42 am
John,

When I see a number like 20k processes, my mind automatically skips to: what is the load per process?

Do I need a full CPU for each, or a fractional CPU, or maybe I need 20k cores at peak?

If the issue is that you have 20k of anything that needs to be scheduled, you need 20k x (process time + switching cost per process). Hardware architecture determines a lot about both. On Linux you can get away with hundreds of native processes on Intel, but you may not be able to eke out enough processing time per core to do useful work.

If you are running on multicore ARM you can eke out maybe 10 processes per core before the switching cost alone kills your CPU. This is why we are seeing a move towards 128 core and 256 core ARM processors. If you need 20k cores and can afford around $32-64k in hardware, there are a couple of companies that will have products shipping next year.

Generally my preferred solution to this problem is event driven C talking over a socket connection to Erlang. Using kqueue or epoll you can easily handle a few thousand socket connections per core on the C side, and Erlang can easily scale out as a command and control infrastructure.
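
The event-driven C side can be sketched with POSIX `poll()`, which stands in here for kqueue or epoll only because it is portable; the shape of the loop is the same. The function name `serve_once` and the "reply with the square of the number received" protocol are assumptions made purely for illustration:

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

enum { MAX_CONNS = 1024 };  /* illustrative cap on watched sockets */

/* One pass of a single-threaded event loop: wait until any connection is
 * readable, then answer each pending request with the square of the
 * number it sent. Returns the number of requests served, -1 on error. */
static int serve_once(const int *fds, int nfds, int timeout_ms)
{
    struct pollfd watched[MAX_CONNS];
    int served = 0;

    assert(nfds > 0 && nfds <= MAX_CONNS);
    for (int i = 0; i < nfds; i++) {
        watched[i].fd = fds[i];
        watched[i].events = POLLIN;
        watched[i].revents = 0;
    }

    if (poll(watched, (nfds_t)nfds, timeout_ms) < 0)
        return -1;

    for (int i = 0; i < nfds; i++) {
        double x;

        if (!(watched[i].revents & POLLIN))
            continue;
        /* Assumes a whole request arrives in one read; a real server
         * would buffer partial reads per connection. */
        if (recv(watched[i].fd, &x, sizeof x, 0) != (ssize_t)sizeof x)
            continue;
        x *= x;
        if (send(watched[i].fd, &x, sizeof x, 0) == (ssize_t)sizeof x)
            served++;
    }
    return served;
}
```

A real server would call something like this in a loop from one thread per core, with the Erlang side holding the other end of each socket. No thread is ever parked waiting on an idle connection, which is why a few thousand connections per core is plausible.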

If you are clever, using consistent hashing to manage system memory across your nodes and to schedule jobs can make a handful of cores (24ish) perform like a 20k node cluster.

But the specifics of your project and budget will determine if it is even possible :)

Dave

-=-=- dave@nexttolast.com -=-=-

Posted: Mon Oct 03, 2011 1:19 pm
On Mon, Oct 3, 2011 at 3:44 AM, John Smith <emailregaccount@gmail.com> wrote:
> Thanks for the reply, Kresten!
>
> I definitely would not be doing any disk I/O in the C code. It would
> be intense number crunching, so it would be CPU (and perhaps memory)
> bound. Everything I've read states Erlang is not good at number
> crunching (Cesarini mentions this in his "Erlang Programming" book) so

This may not be true - I wrote some crypto stuff in Erlang with bignums
and it turned out to be faster than some C I had. I guess this was because
I could write a more advanced algorithm than in C - but I never
investigated why. I would expect small fixed types and array type algorithms
to be faster in C, but not necessarily bignum computations.

Also bear in mind that the C and the Erlang will not be solving the same
problem. In C you might have to protect the code from buffer overflow attacks,
but in Erlang this would not be necessary. Also Erlang is slower *by design*
to allow for code-changes on-the-fly which C cannot do.

So saying Erlang is "not good at number crunching" is only a first
approximation to the truth ... true for most things, but not a universal
truth - you have to read the small print ...


> I'm considering writing the code to do that in C.

Just curious - what type of "intense number crunching"? There are
different types of number crunching - things like digital image
processing involve identical computations on a grid - so could be done
on a GPU - other operations may or may not be suited to a GPU. If
the CPU demands are highly variable, upping the number of cores and
changing to a Tilera might help.

The time and memory properties of the C are also interesting - do the
C tasks always take the same time/memory or are they highly variable?
This can affect the scheduling strategy - you might get CPU or memory
starvation.

Although number crunching might be faster in C than Erlang the round-trip
times become important if you do relatively little work in C. You might spend
more time in communication than the time you save in being faster in C.
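
That trade-off can be put into numbers with a very simple cost model. The sketch below assumes (and this is only an assumption for illustration) that pure Erlang costs a fixed time per item, while calling out to C costs a fixed round trip per call plus a smaller time per item; `break_even_batch` is a hypothetical helper, not an existing function:

```c
#include <assert.h>

/* Smallest batch size (items per call) at which calling out to C is
 * expected to beat pure Erlang, or 0 if C never wins.
 * Model: Erlang costs n * erl_us_per_item,
 *        C costs round_trip_us + n * c_us_per_item. */
static long break_even_batch(double erl_us_per_item, double c_us_per_item,
                             double round_trip_us)
{
    double saving = erl_us_per_item - c_us_per_item;

    assert(round_trip_us >= 0.0);
    if (saving <= 0.0)
        return 0;  /* C is not faster per item, so it can never catch up */
    /* C wins as soon as n * saving > round_trip_us. */
    return (long)(round_trip_us / saving) + 1;
}
```

With made-up figures of 2 us per item in Erlang, 0.5 us per item in C and a 30 us round trip, `break_even_batch(2.0, 0.5, 30.0)` returns 21: sending fewer than 21 items per call would spend more on communication than it saves. The real per-item and round-trip costs have to be measured.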

I'd start by making a pure Erlang solution and then measuring to see where
the problems are - guessing where the time goes is notoriously difficult - even
if the pure Erlang solution is not fast enough the code can provide
a useful reference implementation to start with and should be up-and-running
quicker than if you start coding NIFs etc.

Virtually every time I've had a program that was too slow, and I've guessed
where the problem was I've been wrong - so I'd build a reference
implementation first then measure - then optimize.

Cheers

/Joe


Posted: Tue Oct 04, 2011 3:05 am
Sorry, I should've explained in more detail what we're trying to do.
That would help, eh? :)

In a nutshell, our goal is to take a portfolio of securities (namely
bonds and derivatives), and calculate a risk/return analysis for each
security. For risk, that means interest rate shock; for return, future
cash flows. There are different kinds of analyses you could perform.

Here's a more concrete example. Pretend you're an insurance company.
You have to pay out benefits to your customers, so you take their
money and make investments with it, hoping for a (positive) return, of
course. Quite often insurance companies will buy bonds, especially if
there are restrictions on what they can invest in (e.g., AAA only).

You need to have an idea of what your risk and return are. What's
going to happen to the value of your portfolio if yields rise or fall?
Ideally you want to know what your cash flows will look like in the
future, so you can have a reasonable idea of what shape you'll be in
depending on the outcome.

One such calculation would involve shocking the yield curve (yields
plotted against maturity). If yields rise 100 basis points, what
happens to your portfolio? If they fall, how far would they need to
fall before any of your callable bonds started being redeemed?
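
To give a feel for the size of one such calculation, here is a minimal sketch in C of repricing a single plain bond after a parallel shift in yields. It deliberately simplifies: a single flat annual yield replaces the full curve, coupons are annual, and call features are ignored. The names `bond_price` and `shock_pnl` are hypothetical:

```c
#include <assert.h>

/* Present value of a bond paying an annual coupon, discounted at a flat
 * annual yield. Rates are decimals (0.05 means 5%). */
static double bond_price(double face, double coupon_rate, double yield,
                         int years)
{
    double price = 0.0;
    double discount = 1.0;

    assert(years > 0 && yield > -1.0);
    for (int t = 1; t <= years; t++) {
        discount /= 1.0 + yield;                 /* 1 / (1 + y)^t */
        price += face * coupon_rate * discount;  /* coupon paid in year t */
    }
    return price + face * discount;              /* principal at maturity */
}

/* Change in price when yields move in parallel by shock_bp basis points
 * (100 basis points = 1 percentage point). */
static double shock_pnl(double face, double coupon_rate, double yield,
                        int years, double shock_bp)
{
    double shocked = yield + shock_bp / 10000.0;

    return bond_price(face, coupon_rate, shocked, years)
         - bond_price(face, coupon_rate, yield, years);
}
```

A 10-year 5% bond priced at a 5% yield comes out at par, and `shock_pnl(100.0, 0.05, 0.05, 10, 100.0)` is negative because prices fall when yields rise. Each security and scenario is one such independent call, which is what makes the problem embarrassingly parallel.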

Part of the reason why I think Erlang would work out well is the
calculations for each security are independent of each other -- it's
an embarrassingly parallel problem. My goal was to spawn a process for
each scenario of a security. Depending on how many securities and
scenarios you want to calculate, there could be tens or hundreds of
thousands, hence why I would be spawning so many processes (I would
distribute these across multiple machines of course, but we would have
only a few servers at most to start off with).

Because Erlang is so efficient at creating and executing thousands of
processes, I thought it would be feasible to create that many to do
real work, but the impression I get is maybe it's not such a great
idea when you have only a few dozen cores available to you.

CGS, could you explain how the dynamic library would work in more
detail? I was thinking it could work like that, but I wasn't actually
sure how it would be implemented. For example, if two Erlang processes
invoke the same shared library, does the OS simply copy each function
call to its own stack frame so the data is kept separate, and only one
copy of the code is used? I could see in that case then how 20,000
Erlang processes could all share the same library, since it minimizes
the amount of memory used.

David, the solution you described is new to me. Are there any
resources I can read to learn more?

Joe (your book is sitting on my desk as well =]), that's rather
interesting Erlang was purposely slowed down to allow for on-the-fly
code changes. Could you explain why? I'm curious.

We are still in the R&D phase (you could say), so I'm not quite sure
yet which specific category the number crunching will fall into (I
wouldn't be surprised if there are matrices, however). I think what
I'll do is write the most intensive parts in both Erlang and C, and
compare the two. I'd prefer to stick purely with Erlang though!

We have neither purchased any equipment yet nor written the final
code, so I'm pretty flexible to whatever the best solution would be
using Erlang. Maybe next year I can pick up one of those 20K core
machines =)

Posted: Tue Oct 04, 2011 3:28 am

Given your description above, I'd probably just write the first
version of your application in Erlang. Normally I'm all for NIFs,
but your scenario doesn't strike me as the best fit (at least not
without first measuring the native Erlang, which will be easier to
code and maintain initially).

The reason here is that there's a noticeable cost to passing data
across the Erlang/(Driver|NIF|CNode) boundary so anything you're doing
on the C side should be fast enough to more than make up for this. A
good example here is from Kevin Smith's talk at the last Erlang
Factory SF on using CUDA cards for numerical computations (he's
illustrating the CUDA memory transfer overhead, but the same basic
idea applies to passing data from Erlang to C).

Given that your examples sound (to my non-financial brain) like
small calculations on lots of data, you might be pleasantly
surprised by the performance you'll get just from using Erlang across
a large number of cores. And even if you find out in the future that
you can write a small NIF that does your calculation in C using a
request queue, that's just as well, because you'll have tested that you
need it and will know exactly how much you're saving by using C and so
on.
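
The request-queue idea can be sketched without any Erlang-specific API: a fixed, small pool of worker threads drains a shared queue, so 20,000 requests never need 20,000 threads. This is plain C with POSIX threads and shows only the C half; how requests arrive from Erlang (NIF, port or socket) is left out. `request_t`, `crunch` and `process_all` are hypothetical names:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum { MAX_WORKERS = 64 };  /* illustrative cap on pool size */

/* Hypothetical request: one number in, one number out. */
typedef struct {
    double input;
    double output;
} request_t;

/* Queue shared by all workers: an array plus the index of the next
 * unclaimed request, protected by a mutex. */
typedef struct {
    request_t *items;
    size_t count;
    size_t next;
    pthread_mutex_t lock;
} queue_t;

/* Stand-in for the real number crunching. */
static double crunch(double x)
{
    return x * x + 1.0;
}

/* Each worker repeatedly claims the next request until none are left. */
static void *worker_main(void *arg)
{
    queue_t *q = arg;

    for (;;) {
        pthread_mutex_lock(&q->lock);
        size_t i = q->next;
        if (i < q->count)
            q->next++;
        pthread_mutex_unlock(&q->lock);

        if (i >= q->count)
            return NULL;
        /* Outside the lock: each index is claimed by exactly one worker. */
        q->items[i].output = crunch(q->items[i].input);
    }
}

/* Process all requests with at most nworkers threads.
 * Returns 0 on success, -1 if the pool could not be set up. */
static int process_all(request_t *items, size_t count, size_t nworkers)
{
    queue_t q = { .items = items, .count = count, .next = 0 };
    pthread_t workers[MAX_WORKERS];
    size_t started = 0;

    assert(nworkers > 0 && nworkers <= MAX_WORKERS);
    if (pthread_mutex_init(&q.lock, NULL) != 0)
        return -1;
    while (started < nworkers &&
           pthread_create(&workers[started], NULL, worker_main, &q) == 0)
        started++;
    for (size_t i = 0; i < started; i++)
        pthread_join(workers[i], NULL);
    pthread_mutex_destroy(&q.lock);
    /* One running worker is enough to drain the whole queue. */
    return started > 0 ? 0 : -1;
}
```

Sizing `nworkers` to roughly the number of cores keeps the C side busy without oversubscribing the machine, and the Erlang processes simply wait for their replies.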

Paul

Posted: Tue Oct 04, 2011 11:21 am
On Tue, Oct 4, 2011 at 5:05 AM, John Smith <emailregaccount@gmail.com> wrote:
> Joe (your book is sitting on my desk as well =]), it's rather
> interesting that Erlang was purposely slowed down to allow for
> on-the-fly code changes. Could you explain why? I'm curious.

I said "slow by design" - perhaps an unfortunate choice of words -
What I meant was that there was a design decision to allow code changes
on the fly, and that a consequence of this design decision is
that all intermodule calls have one extra level of indirection,
which makes them slightly slower than calls to code which
cannot be changed on the fly.

Suppose you have some module x executing some long-lived code
(typically a telephony transaction) - you discover a bug in x. So you
fix the bug. Now you have two versions of x. The x that is still
currently executing, and the modified x that you will use when you
start new transactions.

We want to allow all the old processes running the old version of x to
"run to completion" - new processes will get the next version of x.

This is achieved as follows: if you call x:foo/2 you always call the
latest version of the code, but local calls (made without the module
name) call the current version of the code.

Let me give an example:

Imagine the following:

-module(foo).

fix_loop(N) ->
    ...
    fix_loop(N+1).          % local call: stays in the version now running


dynamic_loop(N) ->
    ...
    foo:dynamic_loop(N+1).  % qualified call: enters the latest loaded version


In the above, fix_loop and dynamic_loop have *entirely different behaviors*.

If we compile and reload a new version of foo, then any existing processes
running fix_loop/1 inside foo will continue running the old code.

Any old processes running dynamic_loop/1 will jump into the new
version of the code when they make the (tail) call to
foo:dynamic_loop/1

To implement this requires one level of indirection in making the subroutine
call. We can't just jump to the address of the code for the loop; we have to
call the function via a pointer. The ability to change code on the fly
introduces a slight overhead in all function calls where you call the
function with an explicit module name - if you omit the module name then
the call will be slightly faster, since the address cannot be changed
later. So calling fix_loop/1 in the above is slightly faster than calling
dynamic_loop/1.
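
The difference between the two call styles can be tried out with a small module. This is only a minimal sketch: the module name hot_demo, the start/0 helper and the {get, From} / stop messages are made up for illustration and are not from the post above.

```erlang
-module(hot_demo).
-export([start/0, fix_loop/1, dynamic_loop/1]).

%% Spawn one process per call style and return both pids.
start() ->
    {spawn(?MODULE, fix_loop, [0]), spawn(?MODULE, dynamic_loop, [0])}.

%% Local tail call: the process stays in the module version it started in.
fix_loop(N) ->
    receive
        {get, From} ->
            From ! {count, N},
            fix_loop(N + 1);
        stop ->
            ok
    end.

%% Fully qualified tail call: every iteration re-enters the newest
%% loaded version of the module.
dynamic_loop(N) ->
    receive
        {get, From} ->
            From ! {count, N},
            ?MODULE:dynamic_loop(N + 1);
        stop ->
            ok
    end.
```

After hot_demo:start(), editing a loop body and recompiling from the shell with c(hot_demo) should change what the dynamic_loop/1 process does once it has handled its next message, while the fix_loop/1 process carries on with the old code. Note that the runtime keeps at most two versions of a module, so a process still running the old version is killed when that version is purged by a later reload.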

Why do we want to do all this anyway?

We designed Erlang for telecomms applications - we deploy applications that
run for years and want to upgrade the software without disrupting services.

If a user runs some code in a transaction that takes a few minutes and
we change the code we don't want to kill ongoing transactions using
the old code - nor can we wait until all transactions are over before
introducing new code (this will never happen).

Banks turn off their transaction systems while upgrading the software -
(apart from Klarna :-) ) - aircraft upgrade the software while the
planes are on the ground (I hope) - but we do it as we run the system
(we don't want to lose calls just because we are upgrading the
software)

Now suppose you discover a fault in your software that causes you to
buy or sell shares at a catastrophically bad rate - what do you do -
wait for everything to stop before changing the code? - or pump in new
code to fix the bug in mid session. Just killing everything might
leave (say) a database in an inconsistent state and make restarting
time-consuming.

Dynamic code change is useful to have under your feet just in case you need
it one day - in the case of online banking, companies like Klarna use
this for commercial advantage :-)

/Joe



>
> We are still in the R&D phase (you could say), so I'm not quite sure
> yet which specific category the number crunching will fall into (I
> wouldn't be surprised if there are matrices, however). I think what
> I'll do is write the most intensive parts in both Erlang and C, and
> compare the two. I'd prefer to stick purely with Erlang though!
>
> We have neither purchased any equipment yet nor written the final
> code, so I'm pretty flexible to whatever the best solution would be
> using Erlang. Maybe next year I can pick up one of those 20K core
> machines =)
Guest
Posted: Tue Oct 04, 2011 4:43 pm Reply with quote
Guest
On Mon, Oct 3, 2011 at 10:05 PM, John Smith wrote:
> We are still in the R&D phase (you could say), so I'm not quite sure
> yet which specific category the number crunching will fall into (I
> wouldn't be surprised if there are matrices, however). I think what
> I'll do is write the most intensive parts in both Erlang and C, and
> compare the two. I'd prefer to stick purely with Erlang though!

From time to time the claim that "Erlang is poor at number crunching" can
be heard. Mainly this revolves around Erlang being bad at the kind of
number crunching needed for linear algebra, image processing, etc.

Having a similar requirement to John's for a current project, I have been
thinking a lot recently about how to use Erlang together with another
language or system for this (the other requirements of my project are
quite in favor of Erlang).

When pondering this I noticed that if Erlang had a flexible n-dim array
type with well-performing matrix/vector manipulation functions, I would
not need to integrate some external system, with all the complexity
required to make the concurrency impedances match.

There are several systems where efficient matrix manipulation has been
added to languages that would not be considered for numerical
calculations without it. Examples are NumPy and pdl.perl.org.
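
To make the idea of an array type inside Erlang a little more concrete, here is a rough sketch of one-dimensional vectors packed into binaries as 64-bit floats. The module name binvec and all of its functions are hypothetical; this is not an existing library and not the design from any paper mentioned in this thread, and it says nothing about whether the performance would be good enough.

```erlang
-module(binvec).
-export([from_list/1, to_list/1, dot/2, scale/2]).

%% Pack a list of numbers into a binary, 8 bytes (one float) per element.
from_list(Xs) ->
    << <<X:64/float>> || X <- Xs >>.

%% Unpack the binary back into a list of floats.
to_list(Bin) ->
    [X || <<X:64/float>> <= Bin].

%% Dot product of two vectors of equal length.
dot(A, B) when byte_size(A) =:= byte_size(B) ->
    dot(A, B, 0.0).

dot(<<X:64/float, RestA/binary>>, <<Y:64/float, RestB/binary>>, Acc) ->
    dot(RestA, RestB, Acc + X * Y);
dot(<<>>, <<>>, Acc) ->
    Acc.

%% Multiply every element by the scalar K, producing a new binary.
scale(K, Bin) ->
    << <<(K * X):64/float>> || <<X:64/float>> <= Bin >>.
```

For example, binvec:dot(binvec:from_list([1.0, 2.0, 3.0]), binvec:from_list([4.0, 5.0, 6.0])) walks both binaries in step and accumulates the products. A real n-dim type would also need shape and stride information, which this sketch leaves out.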
Guest
Posted: Tue Oct 04, 2011 10:17 pm Reply with quote
Guest
On 5/10/2011, at 5:37 AM, Peer Stritzinger wrote:
[how about a linear algebra library built on top of binaries, not entirely
unlike NumPy]

Hasn't something like this already been done? I'm sure I remember reading
about it.


Guest
Posted: Tue Oct 04, 2011 10:34 pm Reply with quote
Guest
# Richard O'Keefe 2011-10-04:
> On 5/10/2011, at 5:37 AM, Peer Stritzinger wrote:
> [how about a linear algebra library built on top of binaries, not entirely
> unlike NumPy]
>
> Hasn't something like this already been done? I'm sure I remember reading
> about it.

Yeah, I remember reading the paper with keen interest, but not sure the
code was ever published:

"High-Performance Technical Computing with Erlang"
http://www.erlang.org/workshop/2008/Sess23.pdf

Personally I'd consider OCaml/MLton (running as port program over stdio)
for that kind of task, but then I may be missing the point of this thread
(sorry, didn't follow closely).

BR,
-- Jachym
Guest
Posted: Tue Oct 04, 2011 10:39 pm Reply with quote
Guest
A further clarification on what Joe wrote about hot loading:

Whatever the current Erlang system actually does, the overhead of remote
calls need in principle be no more than the overhead of dynamic dispatch
in a language like C++.

That overhead is actually surprisingly high (and yet people *willingly*
write Java, go figure). There is an indirect cost to the indirection,
namely that dynamic calls can't be inlined. For C++ there is an answer:
link-time analysis can find calls (often lots and lots of them) that
don't actually need to be polymorphic (e.g., because the declared class
turns out not to have any subclasses that override the method in question)
and those calls can be inlined after all. In languages which allow new
code to be added at run time (like Java and Erlang) it's not that easy.

Some years ago I proposed that Erlang could distinguish between
"detachable" and "non-detachable" parts, so that a group of modules could
be bound together in such a way that they would have to be replaced _as a
unit_. The idea has not been taken up because it's very far from being
Erlang's most pressing problem.

To John Smith, what on earth does "shocking the yield curve" mean?


One thing about architecture. Joe raised an interesting question.
"Now suppose you discover a fault in your software that causes to you
buy or sell shares at a catastrophically bad rate - what do you do -
wait for everything to stop before changing the code?"

My question is, "how could you structure your system so that if it
TRIES to buy or sell at a catastrophically bad rate it CAN'T?" A couple
of years ago I came up with an idea for a potential PhD candidate who
ended up going somewhere else. That was inspired by a true event here,
where an electricity company cut off supply to a house where there was
an extremely sick woman who depended on some machine to keep her alive
(I forget what kind). Needless to say, she died. And of course it was
one of those stories where the computer noticed the bill hadn't been
paid recently and sent out a notice to a technician who dutifully went
out and turned the power off without asking any awkward questions. So
what can we do to stop that? (The customer had informed the company of
their special needs.) The answer I came up with turns out to be
quite similar in spirit to Joe's UBF.

You have a GENERATOR of actions,
a CRITIC of actions, and
an EFFECTOR of actions.

(Come to think of it, there's a link here to Dorothy L. Sayers' "The
Mind of the Maker.") The generator of actions receives inputs and
decides on things to do, but doesn't actually do them. It passes
its proposals on to the critic, which watches out for bad stuff.
Things that the critic is happy with are passed on to the effector to
be carried out.

In the electricity case, the critic would use rules like
"If the proposal is to disconnect supply
and the customer has registered a special need
and there is no record of a court order
REJECT"
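
The generator / critic / effector split and the disconnection rule above could be sketched in Erlang roughly as follows. Everything here is an assumption made for illustration: the module name critic_demo, the {Action, Customer, Facts} shape of a proposal, and the atoms special_need and court_order standing in for whatever records a real billing system would consult.

```erlang
-module(critic_demo).
-export([run/1, critic/1]).

%% A proposal from the generator is {Action, Customer, Facts},
%% where Facts is a list of atoms known about the customer.

%% The critic only vetoes; it never proposes or changes an action.
critic({disconnect, _Customer, Facts}) ->
    SpecialNeed = lists:member(special_need, Facts),
    CourtOrder = lists:member(court_order, Facts),
    case SpecialNeed andalso not CourtOrder of
        true -> reject;
        false -> accept
    end;
critic(_OtherProposal) ->
    accept.

%% Pass a proposal through the critic; only accepted ones reach the effector.
run(Proposal) ->
    case critic(Proposal) of
        accept -> {done, effector(Proposal)};
        reject -> {rejected, Proposal}
    end.

%% Stand-in effector: returns what it would have carried out.
effector({Action, Customer, _Facts}) ->
    {Action, Customer}.
```

With this sketch, critic_demo:run({disconnect, cust42, [special_need]}) comes back as a rejection, while the same proposal with court_order added to the facts goes through to the effector. In a running system the three roles would more likely be separate processes, which is where spare cores come in.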

In the trading case, the critic's rules would say something about the
amount of money.

The generator should not rely on the critic; if everything is working
well you won't be able to tell if the critic is there or not.

A rejection by the critic indicates an error in the generator
requiring corrective programming. This is where it gets similar
to UBF: UBF contract checking isn't there to make good things
happen normally, it's there to stop bad things happening and make
sure they're noticed.

This is one way to use multicore: spend some of the extra cores doing
more checking.

Guest
Posted: Wed Oct 05, 2011 12:48 am Reply with quote
Guest
Hi Richard,

Here's an example: imagine you've plotted a curve of US Treasury
yields and their maturities:

http://en.wikipedia.org/wiki/File:USD_yield_curve_09_02_2005.JPG

You do this for 360 months (30 years) and have a yield for every
month. Now obviously there aren't data points for every month (there
are no 12.5-year Treasuries) so you have to come up with data points
for those months (but we can ignore that detail).

Now you've constructed your yield curve and you want to shock it. What
that means is you either shift the curve up or down by a fixed amount
of basis points for every yield point. If you shock the curve 100
basis points up (100 basis points equals 1 percent), you move every
yield point up by 100 basis points, and now you have your shocked
yield curve (shocking normally occurs at major intervals, e.g., 25,
50, 100). You can then evaluate how your portfolio would fare in this
environment.
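
As a small illustration of the parallel shift described above, the sketch below stores a curve as {MaturityMonths, YieldBps} pairs with yields held as integer basis points, so a shock is plain integer addition. The module name yield_shock, the function names and the sample numbers in the usage note are invented for the example; revaluing a portfolio against the shocked curve is not shown.

```erlang
-module(yield_shock).
-export([shock/2, scenarios/2]).

%% Shift every point of the curve by ShiftBps basis points.
%% A negative ShiftBps shocks the curve down.
shock(Curve, ShiftBps) ->
    [{Months, YieldBps + ShiftBps} || {Months, YieldBps} <- Curve].

%% Build one shocked curve per shift, e.g. for [-100, -50, 50, 100].
scenarios(Curve, Shifts) ->
    [{Shift, shock(Curve, Shift)} || Shift <- Shifts].
```

For instance, shocking the two-point curve [{12, 350}, {120, 425}] (3.50% at one year, 4.25% at ten years) up by 100 basis points moves those points to 450 and 525. Each {Shift, Curve} pair returned by scenarios/2 is independent of the others, so each could be handed to its own process.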
Guest
Posted: Wed Oct 05, 2011 7:24 am Reply with quote
Guest
On Wed, Oct 5, 2011 at 12:17 AM, Richard O'Keefe <ok@cs.otago.ac.nz> wrote:
>
> On 5/10/2011, at 5:37 AM, Peer Stritzinger wrote:
> [how about a linear algebra library built on top of binaries, not entirely
> unlike NumPy]
>
> Hasn't something like this already been done?
Guest
Posted: Wed Oct 05, 2011 8:26 am Reply with quote
Guest
On Wed, Oct 5, 2011 at 12:38 AM, Richard O'Keefe <ok@cs.otago.ac.nz> wrote:
> A further clarification on what Joe wrote about hot loading:
>
>

All times are GMT