Simon
Flexible server monitoring

"Next check" time - too long; can i increase test concurrency?

I have approx 150 tests running.
I have interval checks set to 0 seconds.
I need to test some alerts every 5 minutes and some every 15 minutes.
But once a check has been performed the time to next check comes back at approx 20 minutes.
Is there a way I can increase the concurrency of tests so I can reduce the "next check" time?

Ulf Dunkel's picture

Re: "Next check" time - too long; can i increase test ...

Are you sure you really want to have "interval checks set to 0 seconds"?

Re: "Next check" time - too long; can i increase test ...

I assume that by doing that I reduce the time to the next test to the minimum.
Is that correct?

David Sinclair's picture

Re: "Next check" time - too long; can i increase test ...

The minimum frequency is 5 seconds, if I recall correctly, though it might be longer. I recommend setting the interval at 30 seconds or so. But for that number of tests, it'd probably be better to check less frequently, to avoid overloading Simon.

Re: "Next check" time - too long; can i increase test ...

When I say I have interval checks set to 0 seconds, I mean the "interval checks" setting in the preferences, which has a maximum value of 10 seconds.

I'm more confused now, so to come back to my original question...
----------------------------------------
I have approx 150 tests running.
I have interval checks set to 0 seconds <---- in the preferences pane.
I need to test some alerts every 5 minutes and some every 15 minutes.
But once a check has been performed the time to next check comes back at approx 20 minutes.
Is there a way I can increase the concurrency of tests so I can reduce the "next check" time?
---------------------------------------------

David Sinclair's picture

Re: "Next check" time - too long; can i increase test ...

Ah, I see. The "Interval between checks" preference is how long to wait between starting each check when checking multiple tests manually. A value of 1 or higher is recommended to avoid overloading things.

Look at the check frequency in the Edit Test window for ones that come back as 20 minutes; maybe the success/failure frequency is actually set at 20 minutes.
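To make the "Interval between checks" preference concrete, here is a minimal arithmetic sketch. It assumes the preference simply spaces the start of each manually triggered check by a fixed number of seconds; the function name is made up for illustration and is not part of Simon.

```python
def stagger_start_offsets(num_tests: int, interval_seconds: float) -> list[float]:
    """Start offset (in seconds) of each test when checks begin one interval apart.

    Assumption: the first check starts immediately and each later check
    starts `interval_seconds` after the previous one.
    """
    return [i * interval_seconds for i in range(num_tests)]


offsets = stagger_start_offsets(150, 1.0)
print(f"last test starts after {offsets[-1]:.0f} s")  # last test starts after 149 s
```

So with 150 tests and the recommended value of 1 second, the last manually started check would begin about two and a half minutes after the first.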

Re: "Next check" time - too long; can i increase test ...

Sample values for a test that is currently showing "queued 43 minutes":
===============
Check Frequency:
on success: 5 minutes
on failure: 1 minute
Timeout: 3 minutes
===============
--------------
On the monitor screen, "Next check" shows: queued 43 minutes.

Also, the application becomes progressively more sluggish the longer I go between restarts of Simon.
For example, I can move the monitor window immediately, but minimising it or performing an edit has a delay of more than 30 seconds.
The server it is on is running normally and other processes are responsive, e.g. launching Terminal to run the "top" command.

Output of the "top" command:
Processes: 57 total, 3 running, 54 sleeping... 293 threads 10:35:45
Load Avg: 2.26, 1.88, 1.82 CPU usage: 65.57% user, 4.25% sys, 30.19% idle
SharedLibs: num = 2, resident = 38M code, 0 data, 2632K linkedit.
MemRegions: num = 14931, resident = 805M + 7288K private, 67M shared.
PhysMem: 236M wired, 1300M active, 18M inactive, 1556M used, 492M free.
VM: 5707M + 135M 9235977(0) pageins, 56703513(0) pageouts

PID   COMMAND %CPU TIME    #TH #PRTS #MREGS RPRVT RSHRD RSIZE VSIZE
99760 Simon   0.0% 0:00.00 1   8     648    32K   8528K 32K   268M

David Sinclair's picture

Re: "Next check" time - too long; can i increase test ...

Simon uses a queue when starting checks, to avoid starting everything at once. That queue message means the test has been added to the queue, though I agree 43 minutes is way too long. Are there lots of other tests with queued messages, or just a few that are queued and never starting?

With 150 tests running frequently, Simon is kept pretty busy. Reducing the check frequency might help.
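To illustrate how a start queue can produce long "queued" times, here is a rough simulation. It is not Simon's actual scheduler: the fixed number of concurrent slots, the fixed check duration, and the rule that a test becomes due again a fixed time after it finishes are all assumptions made for the sketch.

```python
import heapq


def simulate_queue_wait(num_tests: int, frequency: float, duration: float,
                        workers: int, horizon: float) -> float:
    """Longest time (seconds) any test waits in the queue past its due time.

    Assumptions (hypothetical model): every test is first due at time 0,
    each check takes `duration` seconds, at most `workers` checks run at
    once, and a test is due again `frequency` seconds after it finishes.
    """
    due = [(0.0, test_id) for test_id in range(num_tests)]  # (due_time, id)
    heapq.heapify(due)
    free_at = [0.0] * workers  # time at which each concurrent slot is free
    heapq.heapify(free_at)
    longest_wait = 0.0
    while due:
        due_time, test_id = heapq.heappop(due)
        if due_time >= horizon:
            break  # everything still in the heap is due even later
        slot_free = heapq.heappop(free_at)
        start = max(due_time, slot_free)  # wait for the earliest free slot
        longest_wait = max(longest_wait, start - due_time)
        finish = start + duration
        heapq.heappush(free_at, finish)
        heapq.heappush(due, (finish + frequency, test_id))
    return longest_wait


# Assumed numbers: 150 tests, 5-minute frequency, 30 s per check, 8 slots.
print(simulate_queue_wait(150, 300, 30, 8, horizon=3600))
```

Under these assumed numbers the first round alone needs 19 batches of 8 checks, so the last test starts 18 x 30 = 540 seconds after it became due. Slower checks or fewer slots push that wait up quickly.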

I do plan on making some changes in upcoming releases that should help reduce the load. In version 3.1, which I hope to have out in a couple of weeks, the data storage will be more flexible and efficient, which will help. In version 3.2, in a couple of months, I plan to split the app into more processes: it currently uses some helpers for certain operations, but in 3.2 it will launch each test in a separate process, which will greatly spread the load and avoid impacting the UI (in fact, you won't need to keep the main UI app running at all).

Re: "Next check" time - too long; can i increase test ...

On the monitor page, all jobs are queued with varying times to next check (the time obviously reduces as I watch it).

As a snapshot, I have:
3 messages queued for 30min
1 for 29min
1 for 28min
6 for 27min
6 for 26min
2 for 25min
3 for 24min
3 for 23min
and so on down to
1 at 7 seconds
7 "now: 2mins"
etc

But to be clear, ALL my check frequencies are set to either 5 minutes (mainly) or 15 minutes (some).

Incidentally, the monitor screen seems to show between 6 and 10 tests running concurrently.

Unfortunately I can't reduce the frequency of tests, as it is a stipulation that I must perform tests against some sites at 5 minute intervals (which at the moment with Simon I'm not able to do).

Is 150 tests at 5min frequency too much for Simon? I don't think the server is working hard, though the app might be.
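A back-of-the-envelope way to frame that question: with a fixed number of checks running at once, there is an upper bound on how long the average check can take before a 5-minute cycle becomes impossible. The sketch below assumes every test must start once per frequency period and that checks never overlap beyond the given concurrency; the function name is hypothetical.

```python
def max_average_check_seconds(num_tests: int, frequency_seconds: float,
                              concurrent_checks: int) -> float:
    """Longest average check duration that still lets every test run once
    per frequency period, given a fixed number of concurrent checks."""
    return concurrent_checks * frequency_seconds / num_tests


# 150 tests every 5 minutes, with the 6 to 10 concurrent checks seen on the monitor.
for slots in (6, 10):
    print(slots, max_average_check_seconds(150, 300, slots))
# 6 12.0
# 10 20.0
```

Under those assumptions the checks would need to average roughly 12 to 20 seconds each. A single test that runs all the way to the 3-minute timeout occupies a slot for 180 seconds, which uses up that budget very quickly.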

David Sinclair's picture

Re: "Next check" time - too long; can i increase test ...

That's certainly keeping Simon very busy. It should be able to cope with it, but restarting Simon occasionally would definitely help. As I said, version 3.2 should help it cope better with heavy loads.

In the meantime, another solution could be to split the load up: run multiple copies of Simon, each with a subset of the tests (you could just duplicate the data and pause some tests in each). This could be on the same machine (by duplicating the app and data folder, and pointing each app at different data), or on different machines.
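One simple way to decide which tests stay active in which copy is a round-robin split. This is only a planning sketch: the test names and the number of copies are made up, and the pausing itself would still be done by hand in each copy of Simon.

```python
def split_round_robin(test_names: list[str], num_copies: int) -> list[list[str]]:
    """Deal the tests out to the copies one at a time, like dealing cards,
    so each copy ends up with an (almost) equal share."""
    return [test_names[i::num_copies] for i in range(num_copies)]


# Hypothetical names for 150 tests, split across 5 copies of the app.
tests = [f"test-{n:03d}" for n in range(1, 151)]
groups = split_round_robin(tests, 5)
print([len(group) for group in groups])  # [30, 30, 30, 30, 30]
```

In each copy, the tests in that copy's group stay active and all the others are paused. If the 5-minute tests matter most, they could be dealt out first so that they are spread evenly.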

Re: "Next check" time - too long; can i increase test ...

Do I need additional licences?

David Sinclair's picture

Re: "Next check" time - too long; can i increase test ...

If you're running 150 tests, you must have a Platinum license. That can be used on all machines in your organization. So no additional license is needed.

Re: "Next check" time - too long; can i increase test ...

I do have a Platinum licence. I'll split them out and see how it goes. Thanks.

Re: "Next check" time - too long; can i increase test ...

I duplicated the app and data folder 5 times. I renamed each data folder, launched each duplicated copy of Simon, and pointed it at the corresponding duplicated data folder.

I then quit Simon and did the next one, 5 times in total.

When I then launch them, they all point to the same data folder (the last one that was assigned to the 5th app).

Am I missing something?

David Sinclair's picture

Re: "Next check" time - too long; can i increase test ...

You have to use different machines, or different user accounts on the same machine. The location of the data folder is stored in the preferences, which are specific to each user account.