Ensuring Mission Success: How Black Pearl Embraces DevSecOps and Delivers High Reliability for the Military

In today’s technology-driven military, the deployment of new software applications, updates, and security patches is critical to national security and the well-being of our servicemen and women. A DevSecOps platform that is reliable, secure, and stable can be the deciding factor between mission success and failure, which is why our nation’s DoD Software Factories and platforms must be treated as critical infrastructure.

PEO Digital, an integrated team of Navy and Marine Corps employees responsible for delivering digital and enterprise services to Sailors and Marines, relies on Black Pearl, a DevSecOps platform that facilitates the rapid and secure setup of software factories, to achieve its mission of “delivering a world-class digital experience at the speed of mission.” While the Department of Defense offers various options for DevSecOps platforms, Black Pearl distinguishes itself through its security and reliability when operating at mission scale. This is why it has received Authority to Operate (ATO) from the U.S. Navy and U.S. Marine Corps and has become the de facto standard for DevSecOps across these services. Black Pearl is accredited at the IL2 and IL5 impact levels and consistently delivers over 99% uptime across both.
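To put the 99% uptime figure in perspective, the short Python sketch below converts an uptime percentage into the downtime it permits. It is illustrative arithmetic only: the function name and the 30-day and 365-day measurement windows are assumptions, not details of how Black Pearl measures or reports uptime.

```python
def allowed_downtime_hours(uptime_percent: float, period_hours: float) -> float:
    """Return the maximum downtime (in hours) consistent with an uptime percentage."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_hours * (100.0 - uptime_percent) / 100.0


# Assumed measurement windows (hypothetical; the article does not state one).
HOURS_PER_30_DAY_MONTH = 30 * 24
HOURS_PER_YEAR = 365 * 24

monthly_budget = allowed_downtime_hours(99.0, HOURS_PER_30_DAY_MONTH)
yearly_budget = allowed_downtime_hours(99.0, HOURS_PER_YEAR)

print(f"99% uptime allows about {monthly_budget:.1f} hours of downtime per 30-day month")
print(f"99% uptime allows about {yearly_budget:.1f} hours of downtime per year")
```

Under these assumptions, 99% uptime corresponds to at most roughly 7.2 hours of downtime in a 30-day month, so "over 99%" means less than that.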

Despite a rapidly changing environment that could hinder the performance of other DevSecOps systems, Black Pearl maintains continuous operations with high levels of reliability and security, excelling in a variety of areas:

1. Prevent outages before they occur – Developing systems and architectures that are resilient to failure is the first step in maintaining critical infrastructure uptime.

  • Horizontally scaled services can survive individual node failures through redundancy
  • Infrastructure as Code (IaC) with stringent change control processes can catch failures before production
  • Staging environments test changes before they reach production
  • Proper resource management prevents system failures due to lack of resources

2. Continuously monitor metrics and other key performance indicators – Continuous feedback and monitoring are critical to understanding the overall reliability of a system, providing the data necessary to make intelligent and targeted decisions about where bottlenecks and reliability soft spots exist.

3. Develop systems that can automatically heal themselves in the event of a failure – Leveraging cloud-based constructs like elastic load balancers, auto scaling groups, and container orchestrators such as Kubernetes allows the Black Pearl Party Barge environment to automatically take erroring resources offline and spin up new resources based on known good configurations. Many outages occurring in the Party Barge environment are automatically resolved with no human intervention in a matter of minutes, and often seconds.

4. Implement processes and procedures for swift acknowledgement and response to issues – Supported by continuous monitoring and auto-heal capabilities, Black Pearl often resolves common issues swiftly and without manual intervention. Automated alerting quickly notifies team members of issues, and each alert is treated with urgency. During work hours, the average time from alert notification to team acknowledgment is less than 2 minutes, and our engineers work around the clock to address outages when they appear.

5. Minimize maintenance window impacts – Updates, patches, and new features are critical to maintaining the security of the Black Pearl Party Barge environment, but often require maintenance windows or planned downtime. Black Pearl minimizes these windows and their potential impact by:

  • Implementing smaller and more frequent changes outside of business hours to minimize impact, rather than batching changes until issues reach critical levels and require significant effort to remedy.
  • Leveraging IaC to further reduce potential downtime by allowing teams to quickly roll back to known good states in the event of an issue and return to testing/staging environments to address it before attempting the change in production again.

6. Prepare for disaster recovery – The Black Pearl team regularly backs up every area of the system and tests the backup and restore procedures to ensure not only that backups are working correctly, but also that the team can quickly restore from them during critical moments. Hope for the best, but plan for the worst.

Service monitoring and uptime dashboard for Black Pearl Party Barge environment

7. Focused Platform Engineering Teams – Our focus on building a quality modern software development platform starts with staffing the best platform DevSecOps engineers. Our engineering teams are focused on building a highly secure, highly available platform. Black Pearl is the silent partner propelling your projects forward. When your team doesn’t have to worry about the platform, they can focus their effort on mission-specific outcomes.
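As a rough illustration of the self-healing pattern described in item 3, the Python sketch below shows the general idea of a reconcile step: unhealthy resources are dropped from the pool and replacements are launched from a known good configuration until the desired count is restored. This is a simplified model and not Black Pearl's implementation. In practice, auto scaling groups and Kubernetes controllers perform this work themselves, and the `Instance`, `reconcile`, and `launch_from_known_good` names here are hypothetical.

```python
import itertools
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Instance:
    """Hypothetical stand-in for a node, pod, or virtual machine."""
    name: str
    healthy: bool = True


def reconcile(
    instances: List[Instance],
    desired_count: int,
    launch: Callable[[], Instance],
) -> List[Instance]:
    """Drop unhealthy instances and launch replacements up to desired_count."""
    # Take erroring resources out of the pool.
    healthy = [inst for inst in instances if inst.healthy]
    # Spin up new resources until the desired capacity is restored.
    while len(healthy) < desired_count:
        healthy.append(launch())
    return healthy


_replacement_ids = itertools.count(1)


def launch_from_known_good() -> Instance:
    """Hypothetical launcher; a real one would apply a known good configuration."""
    return Instance(name=f"replacement-{next(_replacement_ids)}")


pool = [Instance("node-a"), Instance("node-b", healthy=False), Instance("node-c")]
healed = reconcile(pool, desired_count=3, launch=launch_from_known_good)
print([inst.name for inst in healed])  # ['node-a', 'node-c', 'replacement-1']
```

Running a step like this on a schedule, driven by health checks, is what allows a failed resource to be replaced without a person being paged first.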

In an ever-evolving environment, supporting the warfighter with critical software is vital for mission success. The Black Pearl team understands this imperative, which is why they lead in performance and reliability metrics. They are also continuously improving these metrics and providing transparent reporting for greater oversight and accountability to PEO Digital. To learn more, visit https://cloud.navy.mil/bp or www.peodigital.navy.mil.