A few times recently we have lost our outbound emails. This has been for a few different reasons and each time it has been quickly fixed, but only after we became aware that they were failing. In some cases, this took a few days.
Is there a way that we can set up a regular test email to a Sys Admin account, or some other way of alerting us if emails are failing to send?
You could schedule a daily email via the scheduled reports function.
Or if you use Outlook you could schedule an email to your service desk address every morning and check for a reply - that way you check both incoming and outgoing.
If you have a recent version the simplest way may be to put a count query (against failed send in the message queue) on any administrator's dashboards. It's a very simple query to knock up.
Cheers - A.
If you write a query against system/message recipients, set it to run every 30 mins and pop it on a dashboard, that will give you a near real-time feed on the message queue. If it gets a bit long, time to take a look.
I have a VBScript that I run as a scheduled job every 30 minutes that checks the outbound queue, and if there's anything in there older than an hour, it tries to restart the outbound email service, and then sends me an email telling me whether the restart was successful or not (and how old the oldest unsent email is).
We had several instances like yours where we didn't realise for over a day, which is when I wrote the script. It seems that if TPServices loses connection to the DB, the outbound email service doesn't always recover as cleanly as other services - restarting the service has fixed this for us on a couple of occasions, so by the time I read the email it's already fixed.
Happy to supply the script if it might be useful for you, but I will have to take out the email addresses, DB names etc. first. It's also written for the XP/Windows 2003 Windows Script Host, so if you're on Win 2008 you may have to adapt it to Windows PowerShell. (We're also still on 7.2.5, but I believe the email queue is still much the same.)
I also have scripts to monitor the event logs on our servers, and any time there's an error put in the log, it sends me an email. I see a lot of errors from users that have too many characters in the TITLE field on the Web Portal. (Really looking forward to 7.4 with longer title fields - next weekend!)
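For anyone wanting to adapt the idea in the meantime, here is a rough Python sketch of the same logic: check the age of the oldest unsent message, and restart the service if it is more than an hour old. This is not the script described above. The service name is a made-up placeholder, and reading the queue timestamps from the database and emailing the report are left out because they depend on your own schema and mail setup.

```python
import subprocess
from datetime import datetime, timedelta

MAX_AGE = timedelta(hours=1)  # "older than an hour" from the post above
SERVICE_NAME = "HypotheticalOutboundMailService"  # placeholder, not a real service name


def oldest_age(queued_at, now):
    """Return the age of the oldest unsent message, or None if the queue is empty."""
    if not queued_at:
        return None
    return now - min(queued_at)


def needs_restart(queued_at, now, max_age=MAX_AGE):
    """True when at least one unsent message has waited longer than max_age."""
    age = oldest_age(queued_at, now)
    return age is not None and age > max_age


def restart_service(name):
    """Stop and start a Windows service with `net`; True if the start succeeded."""
    # `net stop` may fail if the service is already stopped, so only the
    # result of `net start` is used to decide success.
    subprocess.run(["net", "stop", name], capture_output=True)
    started = subprocess.run(["net", "start", name], capture_output=True)
    return started.returncode == 0


def build_report(restarted_ok, age):
    """Text for the notification email: restart result and oldest message age."""
    outcome = "succeeded" if restarted_ok else "FAILED"
    return f"Outbound mail service restart {outcome}; oldest unsent email is {age} old."
```

Run from a scheduled task every 30 minutes, the job would load the queue timestamps, call `needs_restart(timestamps, datetime.now())`, and only then call `restart_service(SERVICE_NAME)` and send the text from `build_report`.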
I know this post is 2 years old, but I recently set the background process and outgoing mail process (the actual Windows services) to restart themselves 5 minutes after crashing, up to 2 times per day. We haven't had an issue since. Most of the time the crashing was caused by weird things happening in the console.
It never happened on 7.4 for us either, I only noticed it with 7.5 SP1.
I changed the following settings and haven't had an issue since. I purposely left the 3rd+ failure at no action, because if it failed more than 2 times in a day I figure there is another issue besides the console causing the service to freeze.
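The same recovery options can also be set from a script instead of the service's Recovery tab. Below is a minimal Python sketch that builds the `sc failure` command for the values described above: restart after 5 minutes on the first and second failure, no action after that. The service name is a placeholder, and the one-day reset of the failure counter is an assumption based on "up to 2 times per day". The empty third action is spelled out on the understanding that Windows otherwise repeats the last listed action for later failures; check the result in the Recovery tab after applying it.

```python
import subprocess


def recovery_command(service, delay_minutes=5, restarts=2, reset_days=1):
    """Build the `sc failure` argument list for a restart-then-give-up policy."""
    delay_ms = delay_minutes * 60 * 1000
    # One "restart/<delay in ms>" pair per automatic restart.
    actions = "/".join([f"restart/{delay_ms}"] * restarts)
    # An empty action name with a zero delay means "take no action"
    # on the next and any subsequent failure.
    actions += "//0"
    return [
        "sc", "failure", service,
        "reset=", str(reset_days * 86400),  # seconds before the failure count resets
        "actions=", actions,
    ]


def apply_recovery(service):
    """Run the command; needs an elevated prompt on the server itself."""
    result = subprocess.run(recovery_command(service), capture_output=True, text=True)
    return result.returncode == 0
```

Calling `apply_recovery("HypotheticalOutboundMailService")` once per service (background process and outgoing mail) would be enough, since the settings persist across reboots.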