
Thoughts on Transit

System Outages and Disaster Recovery:  Know What to Do

by James Barker / February 18, 2019

As a transportation manager for a transit provider, you must constantly watch many moving parts throughout the day.  They connect to produce a successful trip, a successful run, and ultimately a successful day.  You understand that any interruption in the communication network can cause a delay, and you know how to respond quickly to communication hiccups.  But what happens when there is a noticeable network communication failure, the cause is uncertain, and a resolution may take a long time to find and implement?  When this happens, you are either prepared to overcome it or prepared to fail.  Choose to be proactive.

Preparing for success

When we discuss continuity of operations, we are really talking about having an established contingency plan for an unexpected outage.  Any number of things can go wrong in the world of transportation, so it’s best to think through those possibilities and plan accordingly.  As you know, when an agency gets even a little behind schedule, it can be nearly impossible to catch up.

 

What could go wrong?  Our agency is shatterproof.

Everything.  And no, it’s not.

Cloud-hosted software has many connections, which means there are many possible points of failure.  Some of the most common causes of an outage include:

  • A network outage with the local communication provider, such as AT&T or Verizon.
  • Mobile data tablets (MDTs) that stop responding, even though the website is working properly.
  • An issue with the server, which causes it to stop transmitting updated information.
  • An issue inside the system, which causes a communication breakdown between the server and your operations.

 

Could my agency be the cause?

It’s unlikely, but it is possible.  An agency may cause a crash if it uses too much of the system’s resources at once.  For example, an agency that attempts to run a complex, comprehensive report in the middle of the day could cause a crash.  Such reports can run to hundreds or even thousands of pages, and many of the agencies that run them operate more than 15 vehicles serving hundreds of riders.  It’s best to run large reports at night or on weekends.

In most cases, the cause will be external; however, it is important to know what your agency can do to minimize the risk of an outage.
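
As an illustration of the advice about large reports, here is a minimal sketch of a check an agency could run before starting one.  The function name and the off-peak window (8 p.m. to 6 a.m., plus weekends) are assumptions made for the example, not Ecolane settings; substitute your own service hours.

```python
from datetime import datetime

# Assumed off-peak window; adjust to your agency's actual service hours.
NIGHT_START_HOUR = 20  # 8 p.m.
NIGHT_END_HOUR = 6     # 6 a.m.


def is_off_peak(moment: datetime) -> bool:
    """Return True on weekends or during the assumed nighttime window."""
    is_weekend = moment.weekday() >= 5  # Saturday is 5, Sunday is 6
    is_night = moment.hour >= NIGHT_START_HOUR or moment.hour < NIGHT_END_HOUR
    return is_weekend or is_night


if __name__ == "__main__":
    if is_off_peak(datetime.now()):
        print("Off-peak: a reasonable time to run a large report.")
    else:
        print("Peak hours: consider waiting until tonight or the weekend.")
```

A check like this is only a reminder.  It does not measure system load, so a very large report may still be worth discussing with support first.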

 

In the event of an outage

If you experience an outage, the worst thing you can do is to panic or to cause panic among staff and drivers.  Instead, here’s what you should do:

  • Collect all the information you can about the environment prior to the outage.
  • Test the software, the hardware, and the website. This will give us clues to the cause.
  • Call the HelpDesk and be prepared to answer questions. Working together will help us get to the solution more quickly.
  • Don’t alert the drivers. Unless drivers have reported an issue with the MDTs, they may be entirely unaffected for at least an hour.
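
The steps above can be sketched as a simple record to fill in before you call.  This is only an illustration: the class name and every field in it are hypothetical, chosen to mirror the checklist, and are not part of any Ecolane tool.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class OutageNotes:
    """Facts to gather before calling the help desk (all fields are illustrative)."""
    noticed_at: datetime
    activity_before_outage: str       # what was happening just before the outage
    software_working: bool
    hardware_working: bool
    website_working: bool
    drivers_reported_mdt_issue: bool = False
    extra_notes: List[str] = field(default_factory=list)

    def summary(self) -> str:
        """Return a short plain-text summary to read out on the call."""
        def status(ok: bool) -> str:
            return "working" if ok else "NOT working"

        lines = [
            f"Outage noticed at: {self.noticed_at:%Y-%m-%d %H:%M}",
            f"Before the outage: {self.activity_before_outage}",
            f"Software: {status(self.software_working)}",
            f"Hardware: {status(self.hardware_working)}",
            f"Website: {status(self.website_working)}",
            "Drivers reported MDT issue: "
            + ("yes" if self.drivers_reported_mdt_issue else "no"),
        ]
        lines += [f"Note: {note}" for note in self.extra_notes]
        return "\n".join(lines)


if __name__ == "__main__":
    notes = OutageNotes(
        noticed_at=datetime(2019, 2, 18, 9, 30),
        activity_before_outage="Dispatcher was adding a same-day trip",
        software_working=False,
        hardware_working=True,
        website_working=True,
    )
    print(notes.summary())
```

A paper form works just as well.  The point is to have the same few facts written down each time, so the call to support starts with answers rather than guesses.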

What happens during an outage?  At Ecolane, we are alerted immediately and have already begun to fix the issue before we even get your call.  In most cases, the system just needs to be reset, which takes only a few minutes.  In fact, because we constantly monitor the systems, we often fix or head off an issue before clients are aware that one exists or is approaching.


What about a catastrophe?  It’s extremely rare, but we must be prepared for the worst-case scenario.  Ecolane regularly performs disaster recovery drills to ensure we can work through major issues like inoperable equipment, communication problems, data loss, and a fire affecting the servers.  Knowing what to do before an emergency arises can go a long way, so we recommend that agencies also run disaster recovery drills at least once each year.

 

Putting it all together

In today’s complex, hyper-connected web environment, it’s more important than ever to be prepared for system failures.  Communication networks, computer programs, and hardware can all be fickle.  The causes aren’t always known and the solutions aren’t always easy to implement.  When failures occur, agencies want software partners that are proactive and responsive rather than reactive or lacking a sense of urgency.

 ---

Ecolane’s system is up 99 percent of the time for all our partners.  When an outage occurs, it generally takes no more than two minutes between the time it’s reported and the time it’s remedied.  The Ecolane support staff values preparedness and responsiveness to clients.  Additional disaster recovery information is also made available via literature and webinars.
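
To put an uptime percentage in concrete terms, it helps to convert it into hours.  The sketch below is plain arithmetic on a stated percentage, not a measurement of any actual system; the function name is hypothetical and a 365-day year is assumed.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, assuming a non-leap year


def max_downtime_hours(uptime_percent: float,
                       period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the most downtime a given uptime percentage allows in a period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_hours * (100 - uptime_percent) / 100


if __name__ == "__main__":
    for percent in (99, 99.9):
        hours = max_downtime_hours(percent)
        print(f"{percent}% uptime allows up to {hours:.2f} hours of downtime per year")
```

An uptime figure is a ceiling on downtime, not a forecast of it, so the result is best read as the worst case your contingency plan should be able to cover.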

To learn more about the advantages of using Ecolane’s DRT software, request a hassle-free demo.



About the Author:

As Director of Support Services, James oversees the on-site and remote support team to provide technical support for web-based, Android integrated software solutions that service all our clients nationwide.  This includes training, troubleshooting, issues reporting, design specification, and system customization for issues with various platforms.

James is the lead issues investigator and reporter, provides on-site training and specialized troubleshooting, creates new design specifications for development based on client needs, and has presented at several industry conferences as a representative of the company.  Some of his key achievements include earning a reputation for quickly resolving complex issues, creating custom MySQL queries and automations using Shell, and creating troubleshooting digital maps.

Tags: Transit Operations Customer Support
