Created
December 9, 2011 07:49
12:01:34 <@Hinrik> this bug I'm trying to track down is just insane
12:02:49 <@Hinrik> as I remove more lines of code to make it as simple as possible, it's taking more runs of the program to trigger the bug
12:07:00 <@dngor> BinGOs: There are potential races with SIGCHLD. It can be delivered ahead of the pipe read that registers EOF.
12:07:31 <@Hinrik> sounds like something I'm experiencing
12:07:55 <@BinGOs> it wasn't so much the delivery order, but the fact that a forked process of the PWR process was holding the wheel 'open'
12:08:02 <@dngor> What happens if you add a sleep wherever you've been removing code?
12:09:49 <@BinGOs> was that for me ?
12:10:22 <@dngor> It was for Hinrik. He's seeing a race that goes away as he simplifies the test case.
12:10:33 <@BinGOs> Oh fine >:)
12:11:05 <@BinGOs> I think the forked process is holding the STDOUT/ERR filehandles open, so in my case the wheel doesn't close.
12:11:27 <@dngor> For you, I think the reliable thing to do is wait for both SIGCHLD and input EOF... but a single "hey, it's ALL DONE" callback would be nice.
12:11:47 <@Hinrik> in my case, I do get a CloseEvent
12:11:48 <@BinGOs> simple enough to add a wheel reaper.
12:12:03 <@Hinrik> it's just the sigchld which sometimes doesn't get delivered
12:12:50 <@BinGOs> 30 seconds is not an unreasonable time to wait between sig_child and not getting a CloseEvent I think, before reaping that wheel's ass.
12:13:33 <@dngor> Do you pause input on the wheel?
12:14:01 <@BinGOs> This happens in the edginess of edge cases.
12:14:19 <@dngor> If pause_stdout() or pause_stderr() is in effect, it won't trigger the sysread() that drains the pipe and registers closure.
12:14:32 <@BinGOs> I suspect it is Test::WWW::Mechanize that is fucking up being tested.
12:15:20 <@BinGOs> leaves processes around on certain perl 5.13.x versions.
12:16:42 <@BinGOs> Yeah, it's usually this test: http://cpansearch.perl.org/src/PETDANCE/Test-WWW-Mechanize-1.32/t/back_ok.t
12:17:33 <@BinGOs> it launches a HTTP::Server::Simple::CGI subclass.
12:22:06 <@Hinrik> this is so ridiculous... I started by just commenting out the exporter stuff in the module, and that makes the test case much less likely to trigger the bug
12:27:39 <@Hinrik> but why would the length of the compilation phase have any effect since that happens even before the first call to POE::Wheel::Run->new ?
12:28:13 <@dngor> Is it compiling in the parent or child process?
12:28:45 <@Hinrik> parent
12:29:27 <@Hinrik> https://gist.github.com/980630
12:30:06 <@Hinrik> at this point, it takes 5-20 seconds to trigger the condition if I run the test in a loop
12:30:33 <@Hinrik> commenting out the exporter stuff makes it take longer
12:30:41 <@Hinrik> commenting out more stuff makes it take even longer...
12:34:22 <@Hinrik> note that if I replace 'Program' with a q{perl -e ...} string which does exactly the same thing, it seems to be fine (so far)
12:35:41 -!- sh4 [[email protected]] has quit [Ping timeout: 360 seconds]
12:37:40 <@dngor> Failure means it hangs?
12:37:45 <@Hinrik> yes
12:38:04 <@dngor> Well, the good news is I'm also seeing it.
12:38:36 <@Hinrik> a few hundred runs of Program as a string, and no hangs
12:38:57 <@Hinrik> so this seems to be specific to CODE args, or at least more likely there
12:40:25 <@Hinrik> but you said there was a known race condition with SIGCHLD? so this is a known issue?
12:41:53 <@Hinrik> or was that just related to whether sigchld or CloseEvent comes first?
12:42:55 < gcola> Hinrik: To be fair my experience is mostly on Windows which is weird anyway, but CODE args are rather twitchy.
12:44:14 <@Hinrik> hopefully this can be resolved, making them less twitchy
13:13:03 <@dngor> When I turn on POE_TRACE_DEFAULT, I get: You already have a parser for (hinrik-race.pl). Perhaps you have run the same test twice. at /Library/Perl/Updates/5.10.0/TAP/Harness.pm line 521
13:13:54 <@Hinrik> huh
13:14:31 <@dngor> Seems to also occur when I do: POE_CATCH_EXCEPTIONS=0 prove hinrik-race.pl hinrik-race.pl hinrik-race.pl ....
13:15:03 <@Hinrik> why run it like that?
13:15:27 <@Hinrik> since prove obviously doesn't like getting the same argument twice
13:15:44 <@Hinrik> I just ran it in a bash for loop
13:16:14 <@Hinrik> for i in {1..1000}; do prove hinrik-race.pl; done
13:27:42 <@dngor> Hm. Not hanging so far with POE_TRACE_DEFAULT=1
13:27:57 <@Hinrik> yeah, I just noticed that too
13:28:13 <@Hinrik> silly bug
13:43:44 -!- earino [[email protected]] has quit [Quit: earino]
14:08:12 -!- sh4 [[email protected]] has joined #poe
14:13:17 <@dngor> Argh! I tried to be sneaky with $SIG{INT} = sub { confess }; # but it's cleared somewhere.
14:14:43 <@dngor> Unrelated news: http://poe.perl.org/?POE_Cookbook/DBI_Helper_Processes mentions DBIAgent but not LaDBI or SimpleDBI
14:18:37 < gcola> how about EasyDBI?
14:18:37 <+purl> EasyDBI is ready
14:20:45 < gcola> Also FWIW I think only SimpleDBI and EasyDBI work on Windows. Been a while since I did my research though so things may have changed.
14:28:28 -!- reyjrar [[email protected]] has joined #poe
14:29:04 -!- UltraDM [[email protected]] has quit [Quit: Leaving]
15:10:59 <@dngor> Well, it depends on who wants to write the examples and post them to the wiki.
15:11:44 * dngor shouts, "It's a wiki! A WIKI!" in the same way Charlton Heston did about Soylent Green.
15:15:21 -!- tmbg [[email protected]] has quit [Remote host closed the connection]
15:46:45 * GumbyNET5 CPAN Upload: App-Pocoirc-0.43 by HINRIK
15:47:16 < DrForr> Along with a hint of "Damn you all to hell" from Planet of the Apes?
15:55:32 -!- sh4 [[email protected]] has quit [Read error: Connection reset by peer]
15:56:12 -!- earino [[email protected]] has joined #poe
16:06:16 <@Hinrik> I ran the test on Perl 5.8.9, 5.10.1, 5.12.2 and 5.14.0, it hangs on all of them
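
Note on the 12:11:27 suggestion ("wait for both SIGCHLD and input EOF"): the idea can be sketched outside POE. The following is a minimal Python illustration, not POE code and not the fix that was eventually applied. The names ChildTracker and run_and_wait are hypothetical. It assumes a single child writing to stdout, and treats the child as finished only once both the exit status and the pipe EOF have been seen, in whichever order they arrive.

```python
import subprocess
import sys


class ChildTracker:
    """Hypothetical helper: 'all done' requires BOTH exit status and EOF."""

    def __init__(self):
        self.saw_eof = False
        self.exit_status = None
        self.output = b""

    def on_eof(self):
        # Analogous to the pipe read that registers closure.
        self.saw_eof = True

    def on_exit(self, status):
        # Analogous to the SIGCHLD notification.
        self.exit_status = status

    @property
    def all_done(self):
        # Order of arrival does not matter; both must have happened.
        return self.saw_eof and self.exit_status is not None


def run_and_wait(argv):
    """Run argv, drain stdout to EOF, reap the child, return the tracker."""
    tracker = ChildTracker()
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    while not tracker.saw_eof:
        chunk = proc.stdout.read(4096)
        if chunk:
            tracker.output += chunk
        else:
            tracker.on_eof()  # an empty read means EOF on the pipe
    proc.stdout.close()
    tracker.on_exit(proc.wait())  # reap the child and record its status
    return tracker


if __name__ == "__main__":
    result = run_and_wait([sys.executable, "-c", "print('hello')"])
    print(result.all_done, result.exit_status, result.output)
```

If a forked grandchild keeps the pipe open, as in the case BinGOs describes, EOF never arrives and this sketch would block forever, so real code would likely also want a timeout along the lines of the 30 seconds suggested at 12:12:50.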