Erlang ~ The null pointer problem
| jz87 |
Posted: Fri Jan 11, 2008 7:50 pm |
|
|
|
Joined: 10 Jan 2008
Posts: 3
|
So I run into this problem a lot. I'm working with a bunch of processes, then some bug in my code crashes a process. The other processes that work with it don't know that it crashed and keep sending messages to it. If they are using synchronous rpc calls, then this locks up the other processes that depend on this dead process. This is basically the null pointer problem you get in languages like Java. What makes this a nasty problem is that you can't even check whether a process is alive with is_process_alive in guard expressions.
I can restart a crashed process with a supervisor, but it won't have the same Pid. So I need some way of notifying every other process that depends on it that this process crashed, and of distributing the new Pid. This basically turns a small problem into a big problem. It creates a whole slew of dependencies between processes. To make a simple rpc reliable, you have to build a whole Pid distribution network. Isn't there some way of intercepting messages sent to the old Pid and forwarding them, or notifying the callers to update their address books?
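For illustration, here is a minimal sketch of one way to keep a synchronous call from hanging on a dead process. It uses erlang:monitor/2, which is not mentioned in the thread: the caller monitors the target before sending, so a dead or dying target yields a 'DOWN' message instead of an endless wait. The module name safe_rpc and the {request, ...} / {reply, ...} message shapes are hypothetical, chosen only for this example.

```erlang
-module(safe_rpc).
-export([call/3]).

%% Send Request to Pid and wait for a reply, without blocking forever.
%% Assumed (hypothetical) protocol: the server answers
%% {request, From, Ref, Request} with {reply, Ref, Result}.
call(Pid, Request, Timeout) ->
    %% If Pid is already dead, the 'DOWN' message arrives immediately
    %% with reason noproc.
    MRef = erlang:monitor(process, Pid),
    Pid ! {request, self(), MRef, Request},
    receive
        {reply, MRef, Result} ->
            %% Drop the monitor and any 'DOWN' message already queued.
            erlang:demonitor(MRef, [flush]),
            {ok, Result};
        {'DOWN', MRef, process, Pid, Reason} ->
            {error, {down, Reason}}
    after Timeout ->
        %% The target is alive but did not answer in time.
        erlang:demonitor(MRef, [flush]),
        {error, timeout}
    end.
```

The caller gets {error, {down, Reason}} or {error, timeout} back and can decide what to do, for example look up the restarted process and retry.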
| bluefly |
Posted: Fri Jan 11, 2008 8:30 pm |
|
|
|
User
Joined: 06 Jan 2008
Posts: 10
|
I am looking at this problem in two ways: one is a spiffy trick for registering process names, and the other is an architecture adjustment.
With this first trick, I am assuming you are spawning anonymous processes that do not deserve a global registered name. Why not give it a registered name that is unique but memorable? You can do that with this kind of call:
Code:
    CPPid = spawn(...), % spawn your crashing process
    UniqueIdentifier = ...,
    register(
        erlang:list_to_atom(
            "crashing_process_" ++ UniqueIdentifier),
        CPPid % the crashing process pid
    )
Then, when you restart your process, you just give it the same generated name. The generated name could be, for example, pid_to_list(PidOfSomeMasterProcess).
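The registered-name trick above can be sketched as follows. This is only an illustration under assumptions: the module name named_worker, the ping/pong protocol, and the send/2 helper are hypothetical. It relies on the facts that register/2 binds an atom to a pid, that the name is released when the process dies, and that whereis/1 returns undefined for an unregistered name.

```erlang
-module(named_worker).
-export([start/1, send/2, loop/0]).

%% Spawn a worker and register it under Name (an atom).
%% Calling start/1 again after a crash reuses the same name,
%% so callers never need to learn the new pid.
start(Name) ->
    Pid = spawn(?MODULE, loop, []),
    true = register(Name, Pid),
    Pid.

%% Send by name. A plain `Name ! Msg` raises badarg when nothing is
%% registered, so look the name up first.
send(Name, Msg) ->
    case whereis(Name) of
        undefined -> {error, not_registered};
        Pid -> Pid ! Msg, ok
    end.

%% Hypothetical worker protocol: answer ping, stop on request.
loop() ->
    receive
        {ping, From} -> From ! pong, loop();
        stop -> ok
    end.
```

One caveat when generating names with list_to_atom: atoms are never garbage collected, so the set of generated names should be bounded.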
The second way is that you should always use spawn_link() and link() to associate the web of processes to each other so that they can detect when there is an odd situation and handle it appropriately. Because the processes are link()ed, they can be immediately made aware that the process has gone down. Restarting a process via some controller process is only one piece of the robustness puzzle for a given app; the other processes that are aware of that process need to safely handle the oddball situations, too.
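As a rough sketch of how a linked process learns about a crash: by default an abnormal exit of a linked process also kills its partner, so a process that wants to react instead has to set the trap_exit flag, which turns the exit signal into an {'EXIT', Pid, Reason} message. The module name link_watch and the watch/1 helper are hypothetical, made up for this example.

```erlang
-module(link_watch).
-export([watch/1]).

%% Run Fun in a linked process and report how it ended.
watch(Fun) ->
    %% Trap exits so the linked process's death arrives as a message
    %% instead of terminating the caller.
    Old = process_flag(trap_exit, true),
    Pid = spawn_link(Fun),
    Result =
        receive
            {'EXIT', Pid, normal} -> finished;
            {'EXIT', Pid, Reason} -> {crashed, Reason}
        end,
    %% Restore the caller's previous trap_exit setting.
    process_flag(trap_exit, Old),
    Result.
```

For instance, link_watch:watch(fun() -> exit(boom) end) returns {crashed, boom}, at which point the caller can clean up or fetch a new Pid rather than waiting on a dead one.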
I do not recommend the first way, as I think the entire system of interacting processes needs to be well-understood, robust, and not given temporary adjustments/controls that might produce code maintenance problems or cascading failures as the project evolves. The link() BIF is probably really what you are looking for.
| Mazen |
Posted: Sat Jan 12, 2008 8:33 am |
|
|
|
User
Joined: 20 Jul 2006
Posts: 164
Location: London
|
I agree with bluefly: you probably want to use spawn_link (or spawn and then link).
I think the point of async messages is that they enable you to create an architecture where processes have few dependencies and are allowed to crash without impacting too much on their environment. Having been involved in a few "largish" projects I can safely say that I love this.
Basically, either you care about a process dying/crashing or you don't, and 95% of the time you don't. When you do care, stick to Supervisor and think hierarchy; don't try to create a flat structure like a web where everyone knows everyone, unless you really have to, of course. If you create a web, have a look at your architecture; perhaps you can "unweb" it. Otherwise you can often use a mapping process where processes register and calling processes get the pid based on a more static id.
So in the end I think it is more of an architectural issue, tbh. You will normally have (at least) 2 types of processes, servers and workers; servers serve the workers with information and need to stay operational, but if a worker dies, the death should have no impact on anything.
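The "mapping process" idea mentioned above could look roughly like this. It is a sketch under assumptions, not a definitive design: the module name pid_registry, its function names, and the message shapes are all hypothetical. The registry keeps a map from a static id to the current pid and uses erlang:monitor/2 to forget pids that die.

```erlang
-module(pid_registry).
-export([start/0, register_id/3, lookup/2]).

%% Start the mapping process with an empty id -> pid map.
start() ->
    spawn(fun() -> loop(#{}) end).

%% Associate a static Id with Pid, replacing any earlier entry.
register_id(Reg, Id, Pid) ->
    Reg ! {register, Id, Pid},
    ok.

%% Ask the registry which pid currently serves Id.
lookup(Reg, Id) ->
    Ref = make_ref(),
    Reg ! {lookup, self(), Ref, Id},
    receive
        {Ref, Reply} -> Reply
    after 1000 ->
        {error, timeout}
    end.

loop(Map) ->
    receive
        {register, Id, Pid} ->
            %% Monitor the pid so its entry is dropped when it dies.
            erlang:monitor(process, Pid),
            loop(Map#{Id => Pid});
        {lookup, From, Ref, Id} ->
            Reply = case maps:find(Id, Map) of
                        {ok, Found} -> {ok, Found};
                        error -> {error, not_found}
                    end,
            From ! {Ref, Reply},
            loop(Map);
        {'DOWN', _MRef, process, Dead, _Reason} ->
            %% Remove only entries still pointing at the dead pid, so a
            %% replacement registered under the same id is kept.
            loop(maps:filter(fun(_Id, P) -> P =/= Dead end, Map))
    end.
```

A restarted worker simply calls register_id/3 again with the same id, and callers look the pid up by id before each request instead of holding on to an old Pid.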
All times are GMT