All tests 'fail' intermittently

We have 31 tests running, mostly HTTP, plus a smattering of DNS, POP, IMAP, etc.

Approximately once a month (it's intermittent and there's no pattern we can see), all alerts are triggered saying the tests are down. However, checking the services and the line shows no problems.

The tests in Simon don't then recover; they all stay stuck showing the services they monitor as down. Although Simon appears to be up and running normally, we are unable to highlight or click on any particular test.

Quitting and relaunching fixes it, and all tests then immediately show as recovered.

Simon is running on Mac OS X 10.5.8 Server on a 2 x 3 GHz quad-core Intel Mac Pro. The only other things running on there are Apache and MySQL, for website development. These services (and anything else on the server) all still work fine when Simon is acting up. There are no other issues we are aware of on this machine.

I do occasionally receive reports like this. It usually comes down to a combination of factors that uses up all of the resources available to Simon, at which point it starts logging exceptions to the console. The solution when this happens is as you found: quit and relaunch Simon.

I have plans to split Simon up into multiple processes, which will avoid this situation: each test will be performed in a separate process, so there isn't one point that can get overloaded.
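To illustrate the general idea of one process per test (this is not Simon's actual design, only a minimal Python 3 sketch under stated assumptions), each check below runs in its own child process with a time limit, so a hung check is killed without affecting the others. The check names and the tiny stand-in snippets are hypothetical.

```python
import subprocess
import sys

# Hypothetical stand-ins for real checks. Each snippet runs in its own
# interpreter process and signals its result through the exit status.
CHECKS = {
    "ok-check": "import sys; sys.exit(0)",
    "failing-check": "import sys; sys.exit(1)",
    "hung-check": "import time; time.sleep(60)",
}


def run_check(code, timeout_seconds=3.0):
    """Run one check in a child process; return 'up', 'down', or 'timeout'."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            timeout=timeout_seconds,  # the child is killed if it overruns
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "up" if result.returncode == 0 else "down"


def run_all(checks, timeout_seconds=3.0):
    """Run every check in isolation and collect the results by name."""
    return {
        name: run_check(code, timeout_seconds)
        for name, code in checks.items()
    }
```

Calling `run_all(CHECKS)` reports the hung check as a timeout while the other two still complete. The checks run one after another here for simplicity; a real monitor would presumably run them concurrently.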

In the meantime, my apologies, and please quit and relaunch Simon every so often. In your case, every two weeks would be plenty.
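For anyone who would rather not do this by hand, here is a rough sketch of a scripted relaunch. It assumes a Python 3 interpreter is installed, that the application is named "Simon", and it uses the standard macOS `osascript` and `open` commands. The function names and the pause length are made up for illustration.

```python
import subprocess
import time

APP_NAME = "Simon"  # assumed application name


def quit_command(app_name):
    """Build the command that asks the application to quit."""
    return ["osascript", "-e", 'tell application "%s" to quit' % app_name]


def launch_command(app_name):
    """Build the command that launches the application again."""
    return ["open", "-a", app_name]


def relaunch(app_name=APP_NAME, pause_seconds=10, run=subprocess.call):
    """Quit the application, wait briefly, then launch it again.

    `run` is injectable so the command sequence can be checked
    without actually touching a running application.
    """
    run(quit_command(app_name))
    time.sleep(pause_seconds)  # give the application time to shut down
    run(launch_command(app_name))
```

Calling `relaunch()` from a scheduled job every two weeks would match the suggestion above. Note that the quit request is a polite one, so it may not take effect if Simon has already stopped responding.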

The system log does show an exception. I can send the relevant log if that would be of any use to you.

I should also mention that we've used Simon since 2008, and this has only started happening since we upgraded to 3.x. The number and type of tests haven't changed significantly over the period.

Yes, please send me the log; it might be helpful.

I've sent the log directly to your email.