How Much Storage Do I Need to Back Up My Video Surveillance Footage?

Video surveillance requires a ton of primary and backup storage capacity for many reasons. Here are a few things to consider when you’re deciding how to use cloud storage.

A decorative image showing a surveillance camera connected to the cloud.

We all have things we want to protect, and if you’re responsible for physically or virtually protecting a business from all types of threats, you probably have some kind of system in place to monitor your physical space. If you’ve ever dealt with video surveillance footage, you know managing it can be a monumental task. Ensuring the safety and security of monitored spaces relies on collecting, storing, and analyzing large amounts of data from cameras, sensors, and other devices. The requirements to back up and retain footage to support investigations are only getting more stringent. Anyone dealing with surveillance data, whether in business or security, needs to ensure that it is not only backed up, but also protected and accessible.

In this post, we’ll talk through why you should back up surveillance footage, the factors you should consider to understand how much storage you need, and how you can use cloud storage in your video surveillance backup strategy. 

The Importance of Backing Up Video Surveillance Footage

Backup storage plays a critical part in maintaining the security of video surveillance footage. Here’s why it’s so important:

  1. Risk Reduction: Without backup storage, surveillance system data stored on a single hard drive or storage device is susceptible to crashes, corruption, or theft. Having a redundant copy ensures that critical footage is not lost in case of system failures or data corruption.
  2. Fast Recovery: Video surveillance systems rely on continuous recording to monitor and record all activities. In the event of system failures, backup storage enables swift recovery, minimizing downtime and ensuring uninterrupted surveillance.
  3. Compliance and Legal Requirements: Many industries, including security, have legal obligations to retain surveillance footage for a specified duration. Backup storage ensures compliance with these requirements and provides evidence when needed.
  4. Verification and Recall: Backup recordings allow you to verify actions, recall events, and keep track of activities. Having access to historical footage is valuable for potential investigations and future decision making.

Each of these requirements affects how much space your video files take up and, consequently, how much storage you need. Let’s walk through the general requirements so you don’t end up underestimating how much backup storage you’ll need.

Video Surveillance Storage Considerations

When you’re implementing a backup strategy for video surveillance, there are several factors that can impact your choices. The number and resolution of cameras, frame rates, retention periods, and more can influence how you design your backup storage system. Consider the following factors when thinking about how much storage you’ll need for your video surveillance footage:

  • Placement and Coverage: When it comes to video surveillance camera placement, strategic positioning is crucial for optimal security and regulatory compliance. Consider ground floor doors and windows, main stairs or hallways, common areas, and driveways. Install cameras both inside and outside entry points. Generally, the wider the field of view, the fewer cameras you’ll likely need overall. The FBI provides extensive recommendations for setting up your surveillance system properly.
  • Resolution: Resolution, measured in pixels (px), determines the clarity of the video footage. A higher resolution means more pixels and a sharper image. While there’s no universal minimum for admissible surveillance footage, a resolution of 640 x 480 px is often recommended. However, mandated minimums can differ based on local regulations and specific use cases. Note that some regulations may not specify a minimum resolution, and some minimum requirements may not meet the intended purpose of surveillance. Often, it’s better to go with a camera that can record at a higher resolution than the mandated minimum.
  • Frame Rate: All videos are made up of individual frames. A higher frame rate—measured in frames per second (FPS)—means a smoother, less clunky image. This is because there are more frames being packed into each second. Like your cameras’ resolution, there are no universal requirements specified by regulations. However, it’s better to go with a camera that can record at a higher FPS so that you have more images to choose from if there’s ever an open investigation.
  • Recording Length: Surveillance cameras are required to run all day, every day, which takes a lot of storage. To reduce the amount of uninteresting footage stored, some cameras come with artificial intelligence (AI) tools that record only when they identify something of interest, such as movement or a vehicle. But if you’re protecting a business with heavy activity, this use of AI may be moot.
  • Retention Length: Video surveillance retention requirements can vary significantly based on local, state, and federal regulations. These laws dictate how long companies must store their video surveillance footage. For example, medical marijuana dispensaries in Ohio require 24/7 security video to be retained for a minimum of six months, and the footage must be made available to the licensing board upon request. The required length can be prolonged even further if a piece of footage is required for an ongoing investigation. Additionally, backing up your video footage (i.e., saving a copy of it in another area) is different from archiving it for long-term use. You’ll want to be sure that you select a storage system that helps you meet those requirements—more on that later.

Each point on this list affects how much storage capacity you need. More cameras mean more footage being generated, which means more video files. Additionally, a higher resolution and frame rate mean larger file sizes. Multiply this by the round-the-clock operation of surveillance cameras and the required retention length, and you’ll likely have more video data than you know what to do with.

Scoping Video Surveillance Storage: An Example

To illustrate how much footage you can expect to collect, it can be helpful to see an example of how a given business’s video surveillance may operate. Note that this example may not apply to you specifically. You should review your local area’s regulations and consult with an industry professional to make sure you are compliant. 

Let’s say that you need to install surveillance cameras for a bank. Customers enter through the lobby and wait for the next available bank teller to assist them at a teller station. No items are put on display, only the exchange of paper and cash between the teller and the customer. Only authorized employees are allowed within the teller area. After customers complete their transactions, they walk back through the lobby area and exit via the building’s front entry.

As an estimate, let’s say you need at least 11 cameras around your building: one for the entrance; another for the lobby; eight more to cover the general back area, including the door to the teller terminals, the teller terminals themselves, the door to the safe, and the inside of the safe; and, of course, one for the room where the surveillance equipment is housed. You may need more to cover the exterior of your building plus your ATM and drive-through banking, but for the sake of an example, we’ll leave it at 11.

Now, suppose all your cameras record at 1080p resolution (1920 x 1080 px), 15 FPS, and a color depth of 14 bits (basically, how many colors the camera captures). For one 24-hour recording on one camera, you’re looking at 4.703 terabytes (TB). Over 30 days, a single camera’s footage can grow to 141.1TB. In other words, if the average person today needs a 2TB hard disk for their PC, it would take more than 70 PCs to hold all the information from just one camera.

We’re deliberately oversimplifying this example for demonstration purposes. Keep in mind that video compression algorithms can significantly reduce these storage requirements and often play a large role in determining them. There are several types of video compression, and different surveillance companies use the technology in different ways. The compression format you choose can impact both the size of your video files and their quality, so keep this in mind when you need to meet mandated minimum resolutions in case of an open investigation.
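To make the arithmetic concrete, here’s a back-of-the-envelope sketch in Python. The default parameters mirror the example above; the compression ratio is a placeholder assumption, since real-world savings depend on your codec and camera vendor.

```python
def surveillance_storage_tb(
    width_px: int = 1920,
    height_px: int = 1080,
    color_depth_bits: int = 14,
    fps: int = 15,
    days: float = 30,
    cameras: int = 1,
    compression_ratio: float = 1.0,  # 1.0 = raw/uncompressed; real codecs achieve far more
) -> float:
    """Estimate surveillance storage in decimal terabytes (TB)."""
    bits_per_frame = width_px * height_px * color_depth_bits
    total_frames = fps * 86_400 * days * cameras  # 86,400 seconds in a day
    return bits_per_frame * total_frames / 8 / compression_ratio / 1e12

print(f"{surveillance_storage_tb(days=1):.3f} TB")   # one camera, one day: ~4.703 TB
print(f"{surveillance_storage_tb(days=30):.1f} TB")  # one camera, 30 days: ~141.1 TB
print(f"{surveillance_storage_tb(days=30, cameras=11):.0f} TB")  # the full bank example
```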

How Cloud Storage Can Help Back Up Surveillance Footage

Backing up surveillance footage is essential for ensuring data security and accountability. It provides a reliable record of events, aids in investigations, and helps prevent wrongdoing by acting as a deterrent. But the right backup strategy is key to preserving your footage.

The 3-2-1 backup strategy is a widely accepted foundational structure that recommends keeping three copies of all important data (one primary copy and two backup copies) on two different media types (to diversify risk) and storing at least one copy off-site. Because surveillance systems generate data at high capacity, adhering to the 3-2-1 rule is important for ensuring footage remains accessible in case of an investigation. The 3-2-1 rule mitigates single points of failure, enhances data availability, and protects against corruption. By adhering to it, you increase the resilience of your surveillance footage, making it easier to recover even after unexpected events or disasters.

An on-site backup copy is a great start for the 3-2-1 backup strategy, but an off-site backup is a key component of a complete one. A backup copy in the cloud provides an easy-to-maintain, reliable off-site copy, safeguarding against a host of potential data losses including:

  • Natural Disasters: If your business is harmed by a natural disaster, the devices you use for your primary storage or on-site backup may be damaged, resulting in a loss of data.
  • Tampering and Theft: Even if no one tries to steal or manipulate your surveillance footage, an employee can still move, change, or delete files accidentally. You’ll need to safeguard footage with proper security protocols, such as authorization codes, data immutability, and encryption keys. These protocols can require constant, professional IT administration and maintenance, which are often built into the cloud automatically.
  • Lack of Backup and Archive Protocols: Unless your primary storage system uses specialized software to automatically save copies of your footage or move them to long-term storage, any of your data may be lost.

The cloud has transformed backup strategies and made it easy to ensure the integrity of large data sets, like surveillance footage. Here’s how the cloud helps you achieve the 3-2-1 backup strategy affordably:

  • Scalability: With the cloud, your backup storage space is no longer limited by the servers you can afford. The cloud provider continues to build and deploy new servers to keep up with customer demand, meaning you can simply rent storage space and pay for more as needed.
  • Reliability: Most cloud providers share information on their durability and reliability and are heavily invested in building systems and processes to mitigate the impact of failures. Their systems are built to be fault-tolerant. 
  • Security: Cloud providers protect data you store with them with enterprise-grade security measures and offer features like access controls and encryption to allow users the ability to better protect their data.
  • Affordability: Cloud storage helps you use your storage budgets effectively by not paying to provision and maintain physical off-site backup locations yourself.
  • Disaster Recovery: If unexpected disasters occur, such as natural disasters, theft, or hardware failure, you’ll know exactly where your data lives in the cloud and how to restore it so you can get back up and running quickly.
  • Compliance: Cloud solutions adhere to sets of standards and regulations, helping you meet compliance requirements and ensuring data stored and managed in the cloud is protected and used responsibly.

Protect the Footage You Invested In to Protect Yourself

No matter the size, operation, or location of your business, it’s critical to remain compliant with all industry laws and regulations—especially when it comes to surveillance. Protect your business by partnering with a cloud provider that understands your unique business requirements, offering scalable, reliable, and secure services at a fraction of the cost compared with other platforms.

As a leading specialized cloud provider, Backblaze B2 Cloud Storage can secure your surveillance footage for both primary backup and long-term protection. B2 offers a range of security options—from encryption to Object Lock to Cloud Replication and access management controls—to help you protect your data and achieve industry compliance. Learn more about Backblaze B2 for surveillance data or contact our Sales Team today.

Cloud Storage for Higher Education: Benefits & Best Practices

Higher education institutions create and store large amounts of data with a diverse set of needs. Cloud storage provides flexible solutions. Here are a few things to think about.

A decorative image showing files and graduation caps being thrown in the air towards the Backblaze cloud.

Universities and colleges lead the way in educating future professionals and conducting ground-breaking research. Altogether, higher education generates hundreds of terabytes—even petabytes—of data. But, higher education also faces significant data risks. It is one of the most targeted industries for ransomware, with 79% of institutions reporting they were hit with ransomware in the past year.

While higher education institutions often have robust data storage systems that can even include their own off-site disaster recovery (DR) centers, cloud storage can provide several benefits that legacy storage systems cannot match. In particular, cloud storage allows schools to protect from ransomware with immutability, easily grow their datasets without constant hardware outlays, and protect faculty, student, and researchers’ computers with cloud-based endpoint backups.

Cloud storage is also a promising alternative to cloud drives, traditionally a popular option for higher education institutions. While cloud drives provide easy storage across campus, both Google and Microsoft have announced the end of their unlimited storage tiers for education. Faced with changes to the original service, many higher education institutions are looking for alternatives. Plus, cloud drives do not provide true, incremental backup, do not adequately protect from ransomware, and have limited options for recovery.   

Ultimately, cloud storage better protects your school from local disasters and ransomware with a secure, off-site copy of your data. And, with the right cloud service provider, it can be much more affordable than you think. In this article, we’ll look at the benefits of cloud storage for higher education, study some popular use cases, and explore best practices and provisioning considerations.

The Benefits of Cloud Storage in Higher Education

Cloud storage solutions present a host of benefits for organizations in any industry, but many of these benefits are particularly relevant for higher education institutions. Let’s take a look:

1. Enhanced Security

Higher education institutions have emerged as one of ransomware attackers’ favorite targets—63% of higher education CISOs say a cyber attack is likely within the next year. Data backups are a core part of any organization’s security posture, and that includes keeping those backups protected and secure in the cloud. Using cloud storage to store backups strengthens backup programs by keeping copies off-site and geographically distanced, which adheres to the 3-2-1 backup strategy (more on that later). Cloud storage can also be made immutable using tools like Object Lock, meaning data can’t be modified or deleted. This feature is often unavailable in existing data storage hardware.

2. Cost-Effective Storage

Higher education generates huge volumes of data each year. Keeping costs low without sacrificing in other areas is a key priority for these institutions, across both active data and archival data stores. Cloud storage helps higher education institutions use their storage budgets effectively by not paying to provision and maintain on-premises infrastructure they don’t need. It can also help higher education institutions migrate away from linear tape-open (LTO) which can be costly to manage.

3. Improved Scalability

As digital data continues to grow, it’s important for those institutions to be able to easily scale with their storage needs. Cloud storage allows higher education institutions to avoid potentially over-provisioning infrastructure with the ability to affordably tier off data to the cloud. 

4. Data Accessibility

Making data easily accessible is important for many aspects of higher education. From the impact of scientific researchers to the ongoing work of attracting students to the university, the increasing quantities of data that higher education creates needs to be easy to access, use, and manage. Cloud storage makes data accessible from anywhere, and with hot cloud storage, there are no access delays like there can be with cold cloud storage or LTO tape.

5. Supports Cybersecurity Insurance Requirements

It’s increasingly common to utilize cyber insurance to offset potential liabilities incurred by a cyber attack. Many of those applications ask if the covered entity has off-site backups or immutable backups. Sometimes they even specify the backup has to be held somewhere other than the organization’s own locations. (We’ve seen other organizations outside of higher ed adding cloud storage for this reason as well). Cloud storage provides a pathway to meeting cyber insurance requirements universities may face.

How Higher Ed Institutions Can Use Cloud Storage Effectively

There are many ways higher education institutions can make effective use of cloud storage solutions. The most common use case is cloud storage for backup and archive systems. Transitioning from on-premises storage to cloud-based solutions—even if an organization is only transitioning a part of their total data footprint while retaining on-premises systems—is a powerful way for higher education institutions to protect their most important data. To illustrate, here are some common use cases with real-life examples:

LTO Replacement

It’s no surprise that maintaining tape is a pain. While it’s the only true physical air-gap solution, it’s also a time suck, and those are precious hours that your IT team should be spending on strategic initiatives. This is particularly applicable in projects that generate huge amounts of data, like scientific research. Cloud storage provides the same off-site protection as LTO with far fewer maintenance hours. 

Off-Site Backups

As mentioned, higher ed institutions often keep an off-site copy of their data, but it’s commonly a few miles down the road—perhaps at a different branch’s campus. Transitioning to cloud storage allowed Coast Community College District (CCCD) to quit chauffeuring physical tapes to an off-site backup center about five miles away and instead implement a virtualized, multi-cloud solution with truly geographically distanced backups.

Protection From Ransomware

A ransomware attack is not a matter of if, but when. Cloud storage provides immutable ransomware protection with Object Lock, which creates a “virtual” air gap. Pittsburg State University, for example, leverages cloud storage to protect university data from ransomware threats. They strengthened their protection four-fold by adding immutable off-site data backups, and are now able to manage data recovery and data integrity with a single robust solution (that doesn’t multiply their expenses).

Computer Backup

While S3 compatible object storage provides a secure destination for data from servers, virtual machines (VMs), and network attached storage (NAS), it’s important to remember to back up faculty, staff, student, and researchers’ computers as well. Workstation backup is particularly important for organizations that are leveraging cloud drives, as these platforms are only designed to capture data stored in their respective clouds, leaving local files vulnerable to loss. But, one thing you don’t want is a drain on your IT resources—you want a solution that’s easy to implement, easy to manage over time, and simple enough to serve users of varying tech savviness.

Best Practices for Data Backup and Management in the Cloud

Higher education institutions (and anyone, really!) should follow basic best practices to get the most out of their cloud storage solutions. Here are a few key points to keep in mind when developing a data backup and management strategy for higher education:

The 3-2-1 Backup Strategy

This widely accepted foundational structure recommends keeping three copies of all important data (one primary copy and two backup copies) on two different media types (to diversify risk) and storing at least one copy off-site. While colleges and universities frequently have high-capacity data storage systems, they don’t always adhere to the 3-2-1 rule. For instance, a school may have an off-site disaster recovery site, but their backups are not on two different media types. Or, they may be meeting the two-media-type rule but their media are not wholly off-site. Keeping your backups at a different campus location does not constitute a true off-site backup if you’re in the same region, for instance—the closer your data storage sites are, the more likely they’ll be subject to the same risks, like network outages, natural disasters, and so on.

Regular Data Backups

You’re only as strong as your last backup. Maintaining a frequent and regular backup schedule is a tried and true way to ensure that your institution’s data is as protected as possible. Schools that have historically relied on Google Drive, Dropbox, OneDrive, and other cloud drive systems are particularly vulnerable to this gap in their data protection strategy. Cloud drives provide sync functionality; they are not a true backup. While many now have the ability to restore files, restore periods are limited and not customizable, and services often only back up certain file types—so, your documents, but not your email or user data, for instance. Especially for larger organizations with complex file management and high compliance needs, they don’t provide adequate protection from ransomware. Speaking of ransomware…

Ransomware Protection

Educational institutions (including both K-12 and higher ed) are more frequently targeted by ransomware today than ever before. When you’re using cloud storage, you can enable security features like Object Lock to offer “air gapped” protection and data immutability in the cloud. When you add endpoint backup, you’re ensuring that all the data on a workstation is backed up—closing a gap in cloud drives that can leave certain types of data vulnerable to loss.

Disaster Recovery Planning

Incorporating cloud storage into your disaster recovery strategy is the best way to plan for the worst. If unexpected disasters occur, you’ll know exactly where your data lives and how to restore it so you can get back to work quickly. Schools will often use cross-site replication as their disaster recovery solution, but such methods can fail the 3-2-1 test (see above) and it’s not a true backup since replication functions much the same way as sync. If ransomware invades your primary dataset, it can be replicated across all your copies. Cloud storage allows you to fortify your disaster recovery strategy and plug the gaps in your data protection.

Regulatory Compliance

Universities work with and store many diverse kinds of information, including highly regulated data types like medical records and research data. It’s important for higher education to use cloud storage solutions that help them remain in compliance with data privacy laws and federal or international regulations. Providers like Backblaze that frequently work with higher education institutions will usually have a HECVAT questionnaire available so you can better understand a vendor’s compliance and security stance, and they undergo regular audits and certifications, such as StateRAMP authorization or SOC 2 reports.

Comprehensive Protection

While it’s obvious that data systems like servers, virtual machines, and network attached storage (NAS) should be backed up, consider the other important sources of data that should be included in your protection strategy. For instance, your Microsoft 365 data should be backed up because you cannot rely on Microsoft to provide adequate backups. Under the shared responsibility model, Microsoft and other SaaS providers state that your data is your responsibility to back up—even if it’s stored on their cloud. And don’t forget about your faculty, student, staff, and researchers’ computers. These devices can hold incredibly valuable work and having a native endpoint backup solution is critical.

The Importance of Cloud Storage for Higher Education Institutions

Institutions of higher education were already on the long road toward digital transformation before the pandemic hit, but 2020 forced any reluctant parties to accept that the future was upon us. The combination of schools’ increasing quantities of sensitive and protected data and the growing threat of ransomware in the higher education space reinforce the need for secure and robust cloud storage solutions. As time has gone on, it’s clear that the diverse needs of higher education institutions need flexible, scalable, affordable solutions, and that current and legacy solutions have room for improvement. 

Universities that leverage best practices like designing 3-2-1 backup strategies, conducting frequent and regular backups, and developing disaster recovery plans before they’re needed will be well on their way toward becoming more modern, digital-first organizations. And with the right cloud storage solutions in place, they’ll be able to move the needle with measurable business benefits like cost effectiveness, data accessibility, increased security, and scalability.

Navigating Cloud Storage: What is Latency and Why Does It Matter?

Latency is an important factor impacting performance and user experience. Let’s talk about what it is and some ways to get better performance.

A decorative image showing a computer and a server with arrows moving between them, and a stopwatch indicating time.

In today’s bandwidth-intensive world, latency is an important factor that can impact performance and the end-user experience for modern cloud-based applications. For many CTOs, architects, and decision-makers at growing small and medium-sized businesses (SMBs), understanding and reducing latency is not just a technical need but also a strategic play.

Latency, or the time it takes for data to travel from one point to another, affects everything from how snappy or responsive your application may feel to content delivery speeds to media streaming. As infrastructure increasingly relies on cloud object storage to manage terabytes or even petabytes of data, optimizing latency can be the difference between success and failure. 

Let’s get into the nuances of latency and its impact on cloud storage performance.

Upload vs. Download Latency: What’s the Difference?

In the world of cloud storage, you’ll typically encounter two forms of latency: upload latency and download latency. Each can impact the responsiveness and efficiency of your cloud-based application.

Upload Latency

Upload latency refers to the delay when data is sent from a client or user’s device to the cloud. Live streaming applications, backup solutions, or any application that relies heavily on real-time data uploading will experience hiccups if upload latency is high, leading to buffering delays or momentary stream interruptions.

Download Latency

Download latency, on the other hand, is the delay when retrieving data from the cloud to the client or end user’s device. Download latency is particularly relevant for content delivery applications, such as on demand video streaming platforms, e-commerce, or other web-based applications. Reducing download latency, creating a snappy web experience, and ensuring content is swiftly delivered to the end user will make for a more favorable user experience.

Ideally, you’ll want to optimize for latency in both directions, but, depending on your use case and the type of application you are building, it’s important to understand the nuances of upload and download latency and their impact on your end users.

Decoding Cloud Latency: Key Factors and Their Impact

When it comes to cloud storage, how good or bad the latency is can be influenced by a number of factors, each having an impact on the overall performance of your application. Let’s explore a few of these key factors.

Network Congestion

Like traffic on a freeway, packets of data can experience congestion on the internet. This can lead to slower data transmission speeds, especially during peak hours, leading to a laggy experience. Internet connection quality and the capacity of networks can also contribute to this congestion.

Geographical Distance

Often overlooked, the physical distance from the client or end user’s device to the cloud origin store can have an impact on latency. The farther the distance from the client to the server, the farther the data has to traverse and the longer it takes for transmission to complete, leading to higher latency.

Infrastructure Components

The quality of infrastructure, including routers, switches, and cables, may affect network performance and latency numbers. Modern hardware, such as fiber-optic cables, can reduce latency, unlike outdated systems that don’t meet current demands. Often, you don’t have full control over all of these infrastructure elements, but awareness of potential bottlenecks may be helpful, guiding upgrades wherever possible.

Technical Processes

  • TCP/IP Handshake: Connecting a client and a server involves a handshake process, which may introduce a delay, especially if it’s a new connection.
  • DNS Resolution: The time it takes to resolve a domain name to its IP address adds to latency; faster DNS resolution shaves a small amount off the total.
  • Data Routing: Data does not necessarily travel a straight line from its source to its destination. Latency can be influenced by the effectiveness of routing algorithms and the number of hops the data must make.

For businesses that rely on frequent access to data stored in the cloud, reducing latency and improving application performance are important goals. Achieving them may involve selecting providers with strategically positioned data centers, fine-tuning network configurations, and understanding how internet infrastructure affects the latency of your applications.
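If you want to see where your own latency comes from, the factors above can be measured individually. Below is a rough Python sketch that times DNS resolution, the TCP and TLS handshakes, and time to first byte for a single HTTPS request. The hostname is a placeholder, and a real benchmark would average many samples from the locations your users actually connect from.

```python
import socket
import ssl
import time

def measure_latency(host: str, port: int = 443, path: str = "/") -> dict:
    """Time each stage of a single HTTPS request, in milliseconds."""
    timings = {}

    t0 = time.perf_counter()
    ip = socket.getaddrinfo(host, port)[0][4][0]  # DNS resolution
    timings["dns_ms"] = (time.perf_counter() - t0) * 1000

    t0 = time.perf_counter()
    sock = socket.create_connection((ip, port), timeout=10)  # TCP handshake
    timings["tcp_ms"] = (time.perf_counter() - t0) * 1000

    t0 = time.perf_counter()
    tls = ssl.create_default_context().wrap_socket(sock, server_hostname=host)  # TLS handshake
    timings["tls_ms"] = (time.perf_counter() - t0) * 1000

    request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    t0 = time.perf_counter()
    tls.sendall(request.encode())
    tls.recv(1)  # wait for the first byte of the response (TTFB)
    timings["ttfb_ms"] = (time.perf_counter() - t0) * 1000

    tls.close()
    return timings

print(measure_latency("www.example.com"))  # placeholder host
```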

Minimizing Latency With Content Delivery Networks (CDNs)

Further reducing latency in your application may be achieved by layering a content delivery network (CDN) in front of your origin storage. CDNs help reduce the time it takes for content to reach the end user by caching data in distributed servers that store content across multiple geographic locations. When your end-user requests or downloads content, the CDN delivers it from the nearest server, minimizing the distance the data has to travel, which significantly reduces latency.

Backblaze B2 Cloud Storage integrates with multiple CDN solutions, including Fastly, Bunny.net, and Cloudflare, providing a performance advantage. And, Backblaze offers the additional benefit of free egress between where the data is stored and the CDN’s edge servers. This not only reduces latency, but also optimizes bandwidth usage, making it cost effective for businesses building bandwidth intensive applications such as on demand media streaming. 

To get slightly into the technical weeds, CDNs essentially cache content at the edge of the network, meaning that once content is stored on a CDN server, subsequent requests do not need to go back to the origin server to request data. 

This reduces the load on the origin server and reduces the time needed to deliver the content to the user. For companies using cloud storage, integrating CDNs into their infrastructure is an effective configuration to improve the global availability of content, making it an important aspect of cloud storage and application performance optimization.

Case Study: Musify Improves Latency and Reduces Cloud Bill by 70%

To illustrate the impact of reduced latency on performance, consider the example of music streaming platform Musify. By moving from Amazon S3 to Backblaze B2 and leveraging the partnership with Cloudflare, Musify significantly improved its service offering. Musify egresses about 1PB of data per month, which, under traditional cloud storage pricing models, can lead to significant costs. Because Backblaze and Cloudflare are both members of the Bandwidth Alliance, Musify now has no data transfer costs, contributing to an estimated 70% reduction in cloud spend. And, thanks to the high cache hit ratio, 90% of the transfer takes place in the CDN layer, which helps maintain high performance, regardless of the location of the file or the user.
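Using the case study’s own numbers, a quick back-of-the-envelope calculation shows why the cache hit ratio matters so much: with 90% of transfers served at the edge, only a tenth of that monthly petabyte is ever fetched from origin storage.

```python
monthly_egress_tb = 1_000  # ~1PB served to listeners per month
cache_hit_ratio = 0.90     # 90% of transfers happen in the CDN layer

edge_tb = monthly_egress_tb * cache_hit_ratio
origin_tb = monthly_egress_tb - edge_tb
print(f"Served from CDN edge: {edge_tb:.0f} TB/month")   # 900 TB
print(f"Fetched from origin:  {origin_tb:.0f} TB/month") # 100 TB
```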

Latency Wrap Up

As we wrap up our look at the role latency plays in cloud-based applications, it’s clear that understanding and strategically reducing latency is a necessary approach for CTOs, architects, and decision-makers building many of the modern applications we all use today. There are several factors that impact upload and download latency, and it’s important to understand the nuances to effectively improve performance.

Additionally, Backblaze B2’s integrations with CDNs like Fastly, bunny.net, and Cloudflare offer a cost-effective way to improve performance and reduce latency. The strategic decisions Musify made demonstrate how reducing latency with a CDN can significantly improve content delivery while saving on egress costs, and reducing overall business OpEx.

For additional information and guidance on reducing latency, improving time to first byte (TTFB), and boosting overall performance, the insights shared in “Cloud Performance and When It Matters” offer a deeper, technical look.

If you’re keen to explore further into how an object storage platform may support your needs and help scale your bandwidth-intensive applications, read more about Backblaze B2 Cloud Storage.

What’s Wrong With Google Drive, Dropbox, and OneDrive? More Than You Think

Many organizations may think that their data is secure when they use cloud drives like Google Drive, Dropbox, and OneDrive. Here’s what you need to consider to fully protect your data.


Cloud drives like Google Drive, Dropbox, Box, and OneDrive have become the go-to data management solution for countless individuals and organizations. Their appeal lies in the initial free storage offering, user-friendly interfaces, and robust file-sharing and collaboration tools, making it easy to access files from anywhere with an internet connection.

However, recent developments in the cloud drives space have posed significant challenges for businesses and organizations. Both Google and Microsoft, leading providers in this space, have announced the discontinuation of some unlimited storage plans, such as those for educational institutions.

Additionally, it’s essential to note that cloud drives, which are primarily sync services, do not offer comprehensive data protection. Today, we’re exploring how organizations can recognize the limitations of cloud drives and strategize accordingly to safeguard their data without breaking the bank. 

Attention Higher Ed

Higher education institutions have embraced platforms like Google Drive, Dropbox, Box, and OneDrive to store vast amounts of data—sometimes reaching into the petabytes. With unlimited plans out the window, they now face the dilemma of either finding alternative storage solutions or deleting data to avoid steep fees. In fact, the education sector reported the highest rates of ransomware attacks with 80% of secondary education providers and 79% of higher education providers hit by ransomware in 2023. If you manage IT for a higher ed institution, read on for more on how you can protect your data.

Sync vs. Backup: Why Cloud Drives Fall Short on Full Data Security

Cloud Sync

Cloud drives offer users an easy way to store and protect files online, and it might seem like these services back up your data. But, they don’t. These services sync (short for “synchronize”) files or folders on your computer to your other devices running the same application, ensuring that the most up-to-date version of each file is available on every device.

The “live update” feature of cloud drives is a double-edged sword. On one hand, it ensures you’re always working on the latest version of a document. On the other, if you need to go back to a specific version of a file from two weeks ago, you might be out of luck depending on your service plan, how far back you need to recover the file from, your organization’s retention settings, and other factors often written in fine print.

Another important item to note is that if cloud drives are shared with others, those users can often make changes to the content, which can result in data being changed or deleted without other users being notified. With the complexity of larger organizations, this presents a potential vulnerability, even with well-meaning users and proactive management of drive permissions.

Cloud Backup

Unlike cloud sync tools, backup solutions are all about historical data preservation. They utilize block-level backup technology, which offers granular protection of your data. After an initial full backup, these systems only save the incremental changes that occur in the dataset. This means if you need to recover a file (or an entire system) as it existed at a specific point in time, you can do so with precision. This approach is not only more efficient in terms of storage space but also crucial for data recovery scenarios.

For organizations where data grows exponentially but is also critically important and sensitive, the difference between sync and backup is a crucial divide between being vulnerable and being secure. While cloud drives offer ease of access and collaboration, they fall short in providing the comprehensive data protection that comes from true backup solutions, highlighting the need to identify the gap and choose a solution that better fits your data storage and security goals. A full-scale backup solution will typically include backup software like Veeam, Commvault, and Rubrik, and a storage destination for that data. The backup software allows you to configure the frequency and types of backups, and the backup data is then stored on-premises and/or off-premises. Ideally, at least one copy is stored in the cloud, like Backblaze B2, to provide true off-site, geographically distanced protection.

Lack of Protection Against Ransomware

Ransomware payments hit a record high of $1 billion in 2023. It shouldn’t be news to anyone in IT that you need to defend against the evolving threat of ransomware with immutable backups now more than ever. However, cloud drives fall short when it comes to protecting against ransomware.

The Absence of Object Lock

Object Lock serves as a digital vault, making data immutable for a specified period. It creates a virtual air gap, protecting data from modification, manipulation, or deletion, effectively shielding it from ransomware attacks that seek to encrypt files for ransom. Unfortunately, most cloud drives do not incorporate this technology. 

Without Object Lock, if a piece of data or a document becomes infected with ransomware before it’s uploaded to the cloud, the version saved on a cloud drive can be compromised as well. This replication of infected files across the cloud environment can escalate a localized ransomware attack into a widespread data disaster. 
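As a sketch of what immutability looks like in practice, here’s how you might write a locked object to an S3-compatible store such as Backblaze B2 using boto3. The endpoint, bucket, key, and credentials are placeholders, and the bucket must have been created with Object Lock enabled.

```python
from datetime import datetime, timedelta, timezone

import boto3

# Placeholder endpoint and credentials for an S3-compatible, Object Lock-enabled bucket
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.us-west-004.backblazeb2.com",
    aws_access_key_id="YOUR_KEY_ID",
    aws_secret_access_key="YOUR_APPLICATION_KEY",
)

with open("backup-2024-02-23.tar.gz", "rb") as f:
    s3.put_object(
        Bucket="my-immutable-backups",  # placeholder bucket name
        Key="backups/backup-2024-02-23.tar.gz",
        Body=f,
        ObjectLockMode="COMPLIANCE",  # the lock can't be shortened or removed
        ObjectLockRetainUntilDate=datetime.now(timezone.utc) + timedelta(days=30),
    )
```

Until the retention date passes, that object version can’t be deleted or overwritten, even by an attacker holding valid credentials, which is what makes Object Lock such an effective ransomware defense.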

Other Security Shortcomings

Beyond the absence of Object Lock, cloud drives may also lag in other critical security measures. While many offer some level of encryption, the robustness of this encryption and its effectiveness in protecting data at rest and in transit can vary significantly. Additionally, the implementation of two-factor authentication (2FA) and other access control measures is not always standard. These gaps in security protocols can leave the door open for unauthorized access and data breaches.

Navigating the Shared Responsibility Model

The shared responsibility model of cloud computing outlines who is responsible for what when it comes to cloud security. However, this model often leads to a sense of false security. Under this model, cloud drives typically take responsibility for the security “of” the cloud, including the infrastructure that runs all of the services offered in the cloud. On the other hand, the customers are responsible for security “in” the cloud. This means customers must manage the security of their own data. 

What’s the difference? Let’s use an example. If a user inadvertently uploads a ransomware-infected file to a cloud drive, the service might protect the integrity of the cloud infrastructure, ensuring the malware doesn’t spread to other users. However, the responsibility to prevent the upload of the infected file in the first place, and managing its consequences, falls directly on the user. In essence, while cloud drives provide a platform for storing your data, relying solely on them without understanding the nuances of the shared responsibility model could leave gaps in your data protection strategy. 

It’s also important to understand that Google, Microsoft, and Dropbox may not back up your data as often as you’d like, in the format you need, or provide timely, accessible recovery options. 

The Limitations of Cloud Drives in Computer Failures

Cloud drives, such as iCloud, Google Drive, Dropbox, and OneDrive, synchronize your files across multiple devices and the cloud, ensuring that the latest version of a file is accessible from anywhere. However, this synchronization does not equate to a full backup of your computer’s data. In the event of a computer failure, only the files you’ve chosen to sync would be recoverable. Other data stored on the computer (but not in the sync folder) would be lost, and cloud drives typically do not back up things like emails, user data, or any of the deeper data you might need to rebuild your computer or system from scratch. 

While some cloud drives offer versioning, which allows you to recover previous versions of files, this feature is often limited in scope and time. It’s not designed to recover all types of files after a hardware failure, which a comprehensive backup solution would allow.

Additionally, users often have to select which folders of files are synchronized, potentially overlooking important data. This selective sync means that not all critical information is protected automatically, unlike with a backup solution that can be set to automatically back up all data.

The Challenges of Data Sprawl in Cloud Drives

Cloud drives make it easy to provision storage for a wide array of end users. From students and faculty in education institutions to teams in corporations, the ease with which users can start storing data is unparalleled. However, this convenience comes with its own set of challenges—and one of the most notable culprits is data sprawl. 

Data sprawl refers to the rapid expansion and scattering of data without a cohesive management strategy. It is the accumulation of vast amounts of data to the point where organizations no longer know what data they have or what is happening with that data. Organizations often struggle to get a clear picture of who is storing what, how much space it’s taking up, and whether certain data remains accessed or has become redundant. This can lead to inefficient use of storage resources, increased costs, and potential security risks as outdated or unnecessary information piles up. The lack of sophisticated tools within cloud drive platforms for analyzing and understanding storage usage can significantly complicate data governance and compliance efforts. 

The Economic Hurdles of Cloud Drive Pricing

The pricing structure of cloud drive solutions presents a significant barrier to achieving both cost efficiency and operational flexibility. The sticker price is only the tip of the iceberg, especially for sprawling organizations like higher education institutions or large enterprises with unique challenges that make the standard pricing models of many cloud drive services less than ideal. Some of the main challenges are:

  1. User-Based Pricing: Cloud drive platforms base their pricing on the number of users, an approach that quickly becomes problematic for large institutions and businesses. With staff and end user turnover, predicting the number of active users at any given time can be a challenge. This leads to overpaying for unused accounts or constantly adjusting pricing tiers to match the current headcount, both of which are administrative headaches. 
  2. The High Cost of Scaling: The initial promise of free storage tiers or low-cost entry points fades quickly as institutions hit their storage limits. Beyond these thresholds, prices can escalate dramatically, making budget planning a nightmare. This pricing model is particularly problematic for businesses where data is continually growing. As these data sets expand, the cost to store them grows exponentially, straining already tight budgets. 
  3. Limitations of Storage and Users: Most cloud drive platforms come with limits on storage capacity and a cap on the number of users. Upgrading to higher tier plans to accommodate more users or additional storage can be expensive. This often forces organizations into a cycle of constant renegotiation and plan adjustments. 

We’re Partial to an Alternative: Backblaze

While cloud drives excel in collaboration and file sharing, they often fall short in delivering the comprehensive data security and backup that businesses and organizations need. However, you are not without options. Cloud storage platforms like Backblaze B2 Cloud Storage secure business and educational data and budgets with immutable, set-and-forget, off-site backups and archives at a fraction of the cost of legacy providers. And, with Universal Data Migration, you can move large amounts of data from cloud drives or any other source to B2 Cloud Storage at no cost to you. 

For those who appreciate the user-friendly interfaces of services like Dropbox or Google Drive, Backblaze provides integrations that deliver comparable front-end experiences for ease of use without compromising on security. However, if your priority lies in securing data against threats like ransomware, you can integrate Backblaze B2 with popular backup tools, including Veeam, Rubrik, and Commvault, for immutable, virtually air-gapped backups to defend against cyber threats. Backblaze also offers free egress for up to three times your data stored—or unlimited free egress between many of our compute or CDN partners—which means you don’t have to worry about the costs of downloading data from the cloud when necessary.

Beyond Cloud Drives: A Secure, Cost-Effective Approach to Data Storage

In summary, cloud drives offer robust file sharing and collaboration tools, yet businesses and organizations looking for a more secure, reliable, and cost-effective data storage solution have options. By recognizing the limitations of cloud drives and by leveraging the advanced capabilities of cloud backup services, organizations can not only safeguard their data against emerging threats but also ensure it remains accessible and within budget. 

Your AI Toolbox: 16 Must Have Products

With all the AI/ML tools on the market, businesses have a lot of things to sort through. Here’s a list of some of the biggest tools out there and how they’re being used.

A decorative image showing a chip networked to several tech icon images, including a computer and a cloud, with a box that says AI above the image.

Folks, it’s an understatement to say that the explosion of AI has been a wild ride. And, like any new, high-impact technology, the market initially floods with new companies. The normal lifecycle, of course, is that money is invested, companies are built, and then there will be winners and losers as the market narrows. Exciting times. 

That said, we thought it was a good time to take you back to the practical side of things. One of the most pressing questions these days is how businesses may want to use AI in their existing or future processes, what options exist, and which strategies and tools are likely to survive long term. 

We can’t predict who will sink or swim in the AI race—we might be able to help folks predict drive failure, but the Backblaze Crystal Ball (™) is not on our roadmap—so let’s talk about what we know. Things will change over time, and some of the tools we’ve included on this list will likely go away. And, as we fully expect all of you to have strong opinions, let us know what you’re using, which tools we may have missed, and why we’re wrong in the comments section.

Tools Businesses Can Implement Today (and the Problems They Solve)

As AI has become more accessible, we’ve seen it touted as either standalone tools or incorporated into existing software. It’s probably easiest to think about these tools in terms of the problems they solve, so here is a non-exhaustive list.

The Large Language Model (LLM) “Everything Bot”

LLMs are useful in generative AI tasks because they work largely on a model of association. They intake huge amounts of data, use that to learn associations between ideas and words, and then use those learnings to perform tasks like creating copy or natural language search. That makes them great for a generalized use case (an “everything bot”) but it’s important to note that it’s not the only—or best—model for all AI/ML tasks. 

These generative AI models are designed to be talked to in whatever way suits the querier best, and are generally accessed via browser. That’s not to say that the models behind them aren’t being incorporated elsewhere in things like chat bots or search, but that they stand alone and can be identified easily. 

ChatGPT

In many ways, ChatGPT is the tool that broke the dam. It’s a large language model (LLM) whose multi-faceted capabilities were easily apparent and translatable across both business and consumer markets. Never say it came from nowhere, however: OpenAI and Microsoft Azure have been in cahoots for years creating the tool that (ahem) broke the internet. 

Google Gemini, née Google Bard

It’s undeniable that Google has been on the front lines of AI/ML for quite some time. Some experts even say that their networks are the best poised to build a sustainable AI architecture. So why is OpenAI’s ChatGPT the tool on everyone’s mind? Simply put, Google has had difficulty commercializing their AI product—until, that is, they announced Google Gemini, and folks took notice. Google Gemini represents a strong contender for the type of function that we all enjoy from ChatGPT, powered by all the infrastructure and research they’re already known for.

Machine Learning (ML)

ML tasks cover a wide range of possibilities. When you’re looking to build an algorithm yourself, however, you don’t have to start from ground zero. There are robust, open source communities that offer pre-trained models, community support, integration with cloud storage, access to large datasets, and more. 

  • TensorFlow: TensorFlow was originally developed by Google for internal research and production. It supports various programming languages like C++, Python, and Java, and is designed to scale easily from research to development.  
  • PyTorch: PyTorch, on the other hand, is built for rapid prototyping and experimentation, and is primarily built for Python. That makes the learning curve for most devs much shorter, and lots of folks will layer it with Keras for additional API support (without sacrificing the speed and lower-level control of PyTorch). 

Given the amount of flexibility in having an open source library, you see all sorts of things being built. A photo management company might grab a facial recognition algorithm, for instance, or use another model to help tune the parameters and hyperparameters of its own algorithm. Think of it like wanting to build a table: you buy the hammer and nails instead of making your own.
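As a quick illustration of the pre-trained model point, here’s a minimal PyTorch sketch that loads an off-the-shelf image classifier from torchvision instead of training one from scratch. The image path is a placeholder, and a relatively recent torchvision (0.13 or later) is assumed for the weights API.

```python
import torch
from PIL import Image
from torchvision import models

# Load a classifier that someone else already trained on ImageNet
weights = models.ResNet18_Weights.DEFAULT
model = models.resnet18(weights=weights)
model.eval()

preprocess = weights.transforms()  # the exact preprocessing the model was trained with

image = Image.open("photo.jpg").convert("RGB")  # placeholder image
batch = preprocess(image).unsqueeze(0)          # add a batch dimension

with torch.no_grad():
    logits = model(batch)

label = weights.meta["categories"][logits.argmax().item()]
print(label)
```

Swapping in a different pre-trained model, or fine-tuning this one on your own photos, takes only a few more lines, which is exactly the buy-the-hammer-and-nails trade-off described above.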

Building Products With AI

You may also want or need to invest more resources—maybe you want to add AI to your existing product. In that scenario, you might hire an AI consultant to help you design, build, and train the algorithm, buy processing power from CoreWeave or Google, and store your data on-premises or in cloud storage.

In reality, most companies will likely do a mix of things depending on how they operate and what they offer. The biggest thing I’m trying to get at by presenting these scenarios, however, is that most people likely won’t set up their own large-scale infrastructure, relying instead on inference tools. And there’s a distinction to be made between using tools designed to create efficiencies in your business and creating or incorporating AI/ML into your products.

Data Analytics

Without being too contentious, data analytics is one of the most powerful applications of AI/ML. While we measly humans may still need to provide context to make sense of the identified patterns, computers are excellent at identifying them more quickly and accurately than we could ever dream. If you’re looking to crunch serious numbers, these two tools will come in handy.

  • Snowflake: Snowflake is a cloud-based data as a service (DaaS) company that specializes in data warehouses, data lakes, and data analytics. It provides a flexible, integration-friendly platform with options for both developing your own data tools and using built-out options. Loved by devs and business leaders alike, Snowflake is a powerhouse platform that supports big names and diverse customers such as AT&T, Netflix, Capital One, Canva, and Bumble. (A short connection sketch follows this list.) 
  • Looker: Looker is a business intelligence (BI) platform powered by Google. It’s a good example of a platform that takes the core functionalities of a product we’re already used to and layers on AI to make them more powerful. So, while BI platforms have long had robust data management and visualization capabilities, they can now do things like offer natural language search or automated data insights.
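
As promised above, here’s a minimal sketch of querying Snowflake from Python with its official connector. The account identifier, credentials, warehouse, and table names are all placeholders.

    # A minimal, illustrative Snowflake query via the official Python
    # connector (pip install snowflake-connector-python). The account,
    # credential, and table names below are placeholders.
    import snowflake.connector

    conn = snowflake.connector.connect(
        account="YOUR_ACCOUNT_IDENTIFIER",
        user="YOUR_USER",
        password="YOUR_PASSWORD",
        warehouse="ANALYTICS_WH",
        database="SALES",
        schema="PUBLIC",
    )

    cur = conn.cursor()
    cur.execute("SELECT region, COUNT(*) FROM orders GROUP BY region")
    for region, order_count in cur.fetchall():
        print(region, order_count)

    cur.close()
    conn.close()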

Development and Security

It’s no secret that one of the biggest pain points in the world of tech is having enough developers, and enough high-quality ones at that. It’s pushed the tech industry to work internationally, driven the creation of coding schools that train folks within six months, and compelled people to come up with codeless or low-code platforms that users of different skill levels can use. That also makes development one of the prime opportunities for AI assistance. 

  • GitHub Copilot: Even if you’re not in tech or working as a developer, you’ve likely heard of GitHub. Started in 2007 and officially launched in 2008, it’s a bit hard to imagine coding before it existed as the de facto center to find, share, and collaborate on code in a public forum. Now, they’re responsible for GitHub Copilot, which allows devs to generate code with a simple query. As with all generative tools, however, users should double check for accuracy and bias, and make sure to consider privacy, legal, and ethical concerns while using the tool. 

Customer Experience and Marketing

Customer relationship management (CRM) tools assist businesses in effectively communicating with their customers and audiences. You use them to glean insights as broad as trends in how you’re finding and converting leads to customers, or as granular as a single user’s interactions with marketing emails. A well-honed CRM means being able to serve your target and existing customers effectively. 

  • HubSpot and Salesforce Einstein: Two of the largest CRM platforms on the market, these tools are designed to make everything from marketing emails to lead scoring to customer service interactions easy. AI has started popping up in almost every function offered, including social media post generation, support ticket routing, website personalization suggestions, and more.

Operations, Productivity, and Efficiency

These kinds of tools take onerous everyday tasks and make them easy. Internally, they can represent massive savings to your OpEx budget, letting you use your resources more effectively. And, given that some of them also smooth out processes external to your org (like scheduling meetings with new leads), they can also contribute to new and ongoing revenue streams. 

  • Loom: Loom is a specialized tool designed to make screen recording and subsequent video editing easy. Given how much time it takes to make video content, Loom’s targeting of this once-difficult task has certainly saved time and increased collaboration. Loom includes things like filler word and silence removal, auto-generating chapters with timestamps, summarizing the video, and so on. All features are designed for easy sharing and ingesting of data across video and text mediums.  
  • Calendly: Speaking of collaboration, remember how many emails it used to take to schedule a meeting, particularly if the person was external to your company? How about when you were working a conference and wanted to give a new lead an easy way to get on your calendar? And, of course, there’s the joy of managing multiple inboxes. (Thanks, Calendly. You changed my life.) Moving into the AI future, Calendly is doing similar small but mighty things: predicting your availability, detecting time zones, automating meeting schedules based on team member availability or round robin scheduling, cancellation insights, and more.  
  • Slack: Ah, Slack. Business experts have been trying for years to summarize the effect it’s had on workplace communication, and while it’s not the only tool on the market, it’s definitely a leader. Slack has been adding a variety of AI functions to its platform, including the ability to summarize channels, organize unreads, search and summarize messages—and then there’s all the work they’re doing with integrations rumored to be on the horizon, like creating meeting invite suggestions purely based on your mentioning “putting time on the calendar” in a message. 

Creative and Design 

Like coding and developer tools, creative work of all kinds—image, video, copy—has long been resource intensive. These skills are not traditionally suited to corporate structures, and measuring creative quality is a complex, though important, process. Generative AI, again like above, is giving teams the ability to create first drafts, or even train libraries, and then move the human oversight to a higher, more skilled tier of work. 

  • Adobe and Figma: Both Adobe and Figma are reputable design collaboration tools. Though a merger was recently called off by both sides, both are incorporating AI to make it much, much easier to create images and video for all sorts of purposes. Generative AI means that large swaths of canvas can be filled by a generative tool that predicts background, for instance, or add stock versions of things like buildings with enough believability to fool a discerning eye. Video tools are still in beta, but early releases are impressive, to say the least. With the preview of OpenAI’s text-to-video model Sora making waves to the tune of a 7% drop in Adobe’s stock, video is the space to watch at the moment.
  • Jasper and Copy.ai: Just like the image generation tools above, these bots create usable copy for tasks of all kinds. And, just like all generative tools, AI copywriters deliver a baseline level of quality best suited to some human oversight. How much oversight they’ll need as time goes on remains to be seen.

Tools for Today; Build for Tomorrow

At the end of this roundup, it’s worth noting that there are plenty of tools on the market, and we’ve just presented a few of the bigger names. Honestly, we had trouble narrowing the field of what to include—this very easily could have been a much longer article, or even a series of articles delving into what we’re seeing within each use case. As we talked about in AI 101: Do the Dollars Make Sense? (and as you can clearly see here), there’s a great diversity of use cases, technological demands, and unexplored potential in the AI space—which means that companies have a variety of strategic options when deciding how to implement AI or machine learning.

Most businesses will find it easier and more in line with their business goals to adopt software as a service (SaaS) solutions that are either sold as a whole package or integrated into existing tools. These types of tools are great because they’re almost plug and play—you can skip training the model and go straight to using them for whatever task you need. 

But, when you’re a hyperscaler talking about building infrastructure to support the processing and storage demands of the AI future, it’s a different scenario than when another type of business is using or building an AI tool or algorithm specific to its internal strategy or products. We’ve already seen that hyperscalers are going for broke in building data centers and processing hubs, investing in companies that are taking on different parts of the tech stack, and, of course, doing longer-term research and experimentation as well.

So, with a brave new world at our fingertips—being built as we’re interacting with it—the best thing for businesses to remember is that periods of rapid change offer opportunity, as long as you’re thoughtful about implementation. And, there are plenty of companies creating tools that make it easy to do just that. 

The post Your AI Toolbox: 16 Must Have Products appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

]]>
https://www.backblaze.com/blog/your-ai-toolbox-16-must-have-products/feed/ 0
Kubernetes Data Protection: How to Safeguard Your Containerized Applications https://www.backblaze.com/blog/kubernetes-data-protection-how-to-safeguard-your-containerized-applications/ https://www.backblaze.com/blog/kubernetes-data-protection-how-to-safeguard-your-containerized-applications/#respond Thu, 15 Feb 2024 17:05:00 +0000 https://www.backblaze.com/blog/?p=110890 Kubernetes is a useful and widely used tool to deploy and manage applications at scale. Here are some things to consider when backing up your Kubernetes environment.

The post Kubernetes Data Protection: How to Safeguard Your Containerized Applications appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

]]>
A decorative image showing the Kubernetes and Backblaze logos.

Kubernetes, originally embraced by DevOps teams for its seamless application deployment, has become the go-to for businesses looking to deploy and manage applications at scale. This powerful container orchestration platform brings many benefits, but it’s not without risks—data loss, misconfigurations, and systems failures can happen when you least expect them. 

That’s why implementing a comprehensive backup strategy is essential for protecting against potential failures that could result in significant downtime, end-user dissatisfaction, and financial losses. However, backing up Kubernetes can be difficult: the environment’s dynamic nature, with containers constantly being created and destroyed, presents a unique set of challenges. 

Traditional backup solutions might falter in the face of Kubernetes’s complexities, highlighting the need for specialized approaches to back up the data and the state of your containerized applications. In this guide, let’s explore how to effectively back up and protect your Kubernetes environments against a wide range of threats, from misconfigurations to ransomware.

Understanding Kubernetes Architecture

Kubernetes has a fairly straightforward architecture that is designed to automate the deployment, scaling, and management of application containers across clusters of hosts. Understanding this architecture is essential not only for deploying and managing applications, but also for implementing effective security measures. Here’s a breakdown of Kubernetes’ hierarchical components and concepts. 

A chart describing cluster and node organizations within Kubernetes.

Containers: The Foundation of Kubernetes

Containers are lightweight, isolated environments designed to run application code. They encapsulate an application’s code, libraries, and dependencies into a single object. This makes containerized applications easy to deploy, scale, and manage across different environments.

Pods: The Smallest Deployable Units

Pods are often described as logical hosts that can contain one or multiple containers that share storage, network, and specifications on how to run the containers. They are ephemeral by nature: a pod’s local storage is temporary and is wiped out when the pod is stopped or restarted.

Nodes: The Workhorses of Kubernetes

Nodes represent the physical or virtual machines that run the containerized applications. Each node is managed by the control plane and contains the services necessary to run pods. 

Cluster: The Heart of Kubernetes

A cluster is a collection of nodes that run containerized applications. Clusters provide the high-level structure within which Kubernetes manages those applications, enabling it to orchestrate their deployment, scaling, and management across multiple nodes seamlessly.

Control Plane: The Brain Behind the Operation

The control plane is responsible for managing the worker nodes and the pods in the cluster. It includes several components, such as the Kubernetes API server, scheduler, controller manager, and etcd (a key-value store for cluster data). The control plane makes global decisions about the cluster, and therefore its security is paramount: it’s the central point of management for the cluster. 
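
If you want to see this hierarchy for yourself, the official Kubernetes Python client can walk it programmatically. Here’s a minimal sketch, assuming you have a valid kubeconfig on the machine and have installed the kubernetes package, that lists each node in a cluster and the pods scheduled on it.

    # A minimal sketch using the official Kubernetes Python client
    # (pip install kubernetes) to walk the cluster -> node -> pod
    # hierarchy described above.
    from kubernetes import client, config

    config.load_kube_config()  # authenticate using your local kubeconfig
    v1 = client.CoreV1Api()    # core API group: nodes, pods, secrets, etc.

    for node in v1.list_node().items:
        print(f"Node: {node.metadata.name}")

    for pod in v1.list_pod_for_all_namespaces().items:
        print(f"Pod: {pod.metadata.namespace}/{pod.metadata.name} "
              f"on node {pod.spec.node_name}")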

What Needs to be Protected in Kubernetes?

In Kubernetes, securing your environment is not just about safeguarding the data; it’s about protecting the entire ecosystem that interacts with and manages the data. Here’s an overview of the key components that require protection.

Workloads and Applications

  • Containers and Pods: Protecting containers involves securing the container images from vulnerabilities and ensuring runtime security. For pods, it’s crucial to manage security contexts and network policies effectively to prevent unauthorized access and ensure that sensitive data isn’t exposed to other pods or services unintentionally.
  • Deployments and StatefulSets: These are higher-level constructs that manage the deployment and scaling of pods. Protecting these components involves ensuring that only authorized users can create, update, or delete deployments.

Data and Storage

  • Persistent Volumes (PVs) and Persistent Volume Claims (PVCs): Persistent storage in Kubernetes is managed through PVs and PVCs, and protecting them is essential to ensure data integrity and confidentiality. This includes securing access to the data they contain, encrypting data at rest and in transit, and properly managing storage access permissions.
  • ConfigMaps and Secrets: While ConfigMaps might contain general configuration settings, secrets are used to store sensitive data such as passwords, OAuth tokens, and SSH keys, so they need encryption at rest and tightly scoped access (see the short sketch after this list). 
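
To illustrate how programmatic the secrets surface is, and why access to it needs to be locked down, here’s a short sketch that creates an Opaque secret with the official Kubernetes Python client. The secret name and value are made up for the example.

    # A short, illustrative sketch creating a Kubernetes Secret via the
    # official Python client. The name and value are placeholders; data
    # values must be base64-encoded strings.
    import base64
    from kubernetes import client, config

    config.load_kube_config()
    v1 = client.CoreV1Api()

    secret = client.V1Secret(
        metadata=client.V1ObjectMeta(name="db-credentials"),  # hypothetical name
        type="Opaque",
        data={"password": base64.b64encode(b"s3cr3t-value").decode()},
    )
    v1.create_namespaced_secret(namespace="default", body=secret)

Anyone with equivalent API access can read that secret back just as easily, which is exactly why the access controls below matter.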

Network Configuration

  • Services and Ingress: Services in Kubernetes provide a way to expose an application on a set of pods as a network service. Ingress, on the other hand, manages external access to the services within a cluster, typically over HTTP. Protecting these components involves securing the communication channels, implementing network policies to restrict access to and from the services, and ensuring that only authorized services are exposed to the outside world.
  • Network Policies: Network policies define how groups of pods are allowed to communicate with each other and with other network endpoints. Securing them is essential for creating a controlled, secure networking environment within your Kubernetes cluster.

Access Controls and User Management

  • Role-Based Access Control (RBAC): RBAC in Kubernetes helps define who can access what within a cluster. It allows administrators to regulate access to Kubernetes resources and namespaces based on the roles assigned to users. Protecting your cluster with RBAC means ensuring users and applications have only the access they need, minimizing the potential impact of compromised credentials or insider threats.
  • Service Accounts: Service accounts provide an identity for processes that run in a pod, allowing them to interact with the Kubernetes API. Managing and securing these accounts is crucial to prevent unauthorized API access, which could lead to data leakage or unauthorized modifications of the cluster state.

Cluster Infrastructure

  • Nodes and the Control Plane: The nodes run the containerized applications and are controlled by the control plane, which includes the API server, scheduler, controller manager, and etcd database. Securing the nodes involves hardening the underlying operating system (OS), ensuring secure communication between the nodes and the control plane, and protecting control plane components from unauthorized access and tampering.
  • Kubernetes Secrets Management: Managing secrets securely in Kubernetes is critical for protecting sensitive data. This includes implementing best practices for secrets encryption, both at rest and in transit, and limiting secrets exposure to only those pods that require access.

Protecting these components is crucial for maintaining both the security and operational integrity of your Kubernetes environment. A breach in any of these areas can compromise your entire cluster, leading to data loss, service disruption, and financial damage. Implementing a layered security approach that addresses the vulnerabilities of the Kubernetes architecture is essential for building a resilient, secure deployment.

Challenges in Kubernetes Data Protection

Securing the Kubernetes components we discussed above poses unique challenges due to the platform’s dynamic nature and the diverse types of workloads it supports. Understanding these challenges is the first step toward developing effective strategies for safeguarding your applications and data. Here are some of the key challenges:

Dynamic Nature of Container Environments

Kubernetes’s fluid landscape, with containers constantly being created and destroyed, makes traditional data protection methods less effective. The rapid pace of change demands backup solutions that can adapt just as quickly to avoid data loss. 

Statelessness vs. Statefulness

  • Stateless Applications: These don’t retain data themselves, which shifts the protection burden to the external persistent storage they rely on. 
  • Stateful Applications: Managing data across sessions involves intricate handling of PVs and PVCs, which can be challenging in a system where pods and nodes are frequently changing.

Data Consistency

Maintaining data consistency across distributed replicas in Kubernetes is complex, especially for stateful sets with persistent data needs. Strategies such as consistent snapshots or application-specific replication are vital to ensuring integrity.

Scalability Concerns

The scalability of Kubernetes, while a strength, introduces data protection complexities. As clusters grow, ensuring efficient and scalable backup solutions becomes critical to prevent performance degradation and data loss.

Security and Regulatory Compliance

Ensuring compliance with the appropriate standards—GDPR, HIPAA, or SOC 2, for instance—always requires keeping track of how sensitive data is stored and managed. In a dynamic environment like Kubernetes, which allows for frequent creation and destruction of containers, enforcing persistent security measures can be a challenge. Also, the sensitive data that needs to be encrypted and protected may be spread in portions across multiple containers. Therefore, it’s important not only to track what currently exists but also to anticipate possible iterations of the environment by ensuring continuous monitoring and implementing robust data management practices.

As you can see, Kubernetes data protection requires navigating its dynamic nature and the dichotomy of stateless and stateful applications while addressing the consistency and scalability challenges. A strategic approach to leveraging Kubernetes-native solutions and best practices is essential for effective data protection.

Choosing the Right Kubernetes Backup Solution: Strategies and Considerations

When it comes to protecting your Kubernetes environments, selecting the right backup solution is important. Kasten by Veeam, Rubrik, and Commvault are some of the top container backup solutions offering robust support for Kubernetes. 

Here are some essential strategies and considerations for choosing a solution that supports your needs. 

  • Assess Your Workload Types: Different applications demand different backup strategies. Stateful applications, in particular, require backup solutions that can handle persistent storage effectively. 
  • Evaluate Data Consistency Needs: Opt for backup solutions that offer consistent backup capabilities, especially for databases and applications requiring strict data consistency. Look for features that support application-consistent backups, ensuring that data is in a usable state when restored. 
  • Scalability and Performance: The backup solution should seamlessly scale with your Kubernetes deployment without impacting performance. Consider solutions that offer efficient data deduplication, compression, and incremental backup capabilities to handle growing data volumes.
  • Recovery Objectives: Define clear recovery objectives. Look for solutions that offer granular recovery options, minimizing downtime by allowing for precise restoration of applications or data, aligning with recovery time objectives (RTOs) and recovery point objectives (RPOs). 
  • Integration and Automation: Choose a backup solution that integrates well or natively with Kubernetes, offering automation capabilities for backup schedules, policy management, and recovery processes. This integration simplifies operations and enhances reliability. 
  • Vendor Support and Community: Consider the vendor’s reputation, the level of support provided, and the solution’s community engagement. A strong support system and active community can be invaluable for troubleshooting and best practices.

By considering the above strategies and the unique features offered by backup solutions, you can ensure your Kubernetes environment is not only protected against data loss but also aligned with your operational dynamics and business objectives. 

Leveraging Cloud Storage for Comprehensive Kubernetes Data Protection

After choosing a Kubernetes backup application, integrating cloud storage such as Backblaze B2 with your application offers a flexible, secure, scalable approach to data protection. By leveraging cloud storage solutions, organizations can enhance their Kubernetes data protection strategy, ensuring data durability and availability across a distributed environment. This integration facilitates off-site backups, which are essential for disaster recovery and compliance with data protection policies, providing a robust layer of security against data loss, configuration errors, and breaches. 
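
Because Backblaze B2 is S3 compatible, most Kubernetes backup tools can point at it directly, and you can verify connectivity yourself with a few lines of Python. In this sketch the endpoint region, key IDs, bucket, and file names are placeholders; substitute the endpoint shown for your own bucket.

    # An illustrative sketch uploading a backup archive to Backblaze B2
    # through its S3-compatible API with boto3 (pip install boto3).
    # Endpoint, credentials, bucket, and file names are placeholders.
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="https://s3.us-west-004.backblazeb2.com",  # your bucket's endpoint
        aws_access_key_id="YOUR_KEY_ID",
        aws_secret_access_key="YOUR_APPLICATION_KEY",
    )

    s3.upload_file(
        Filename="etcd-snapshot-2024-02-15.db",  # e.g., an etcd snapshot
        Bucket="my-k8s-backups",
        Key="snapshots/etcd-snapshot-2024-02-15.db",
    )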

Protect Your Kubernetes Data

In summary, understanding the intricacies of Kubernetes components, acknowledging the challenges in Kubernetes backup, selecting the appropriate backup solution, and effectively integrating cloud storage are pivotal steps in crafting a comprehensive Kubernetes backup strategy. These measures ensure data protection, operational continuity, and compliance. The right backup solution, tailored to Kubernetes’s distinctive needs, coupled with the scalability and resiliency of cloud storage, provides a robust framework for safeguarding against data loss or breaches. This multi-faceted approach not only safeguards critical data but also supports the agility and scalability that modern IT environments demand. 

The post Kubernetes Data Protection: How to Safeguard Your Containerized Applications appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

]]>
https://www.backblaze.com/blog/kubernetes-data-protection-how-to-safeguard-your-containerized-applications/feed/ 0
Backblaze Drive Stats for 2023 https://www.backblaze.com/blog/backblaze-drive-stats-for-2023/ https://www.backblaze.com/blog/backblaze-drive-stats-for-2023/#comments Tue, 13 Feb 2024 14:00:00 +0000 https://www.backblaze.com/blog/?p=110853 Read the 2023 Drive Stats Report and the latest insights on drive failure from Andy Klein.

The post Backblaze Drive Stats for 2023 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

]]>
A decorative image displaying the words 2023 Year End Drive Stats

As of December 31, 2023, we had 274,622 drives under management. Of that number, there were 4,400 boot drives and 270,222 data drives. This report will focus on our data drives. We will review the hard drive failure rates for 2023, compare those rates to previous years, and present the lifetime failure statistics for all the hard drive models active in our data center as of the end of 2023. Along the way we share our observations and insights on the data presented and, as always, we look forward to you doing the same in the comments section at the end of the post.

2023 Hard Drive Failure Rates

As of the end of 2023, Backblaze was monitoring 270,222 hard drives used to store data. For our evaluation, we removed 466 drives from consideration, which we’ll discuss later on. This leaves us with 269,756 hard drives covering 35 drive models to analyze for this report. The table below shows the Annualized Failure Rates (AFRs) for 2023 for this collection of drives.

A chart displaying the failure rates of Backblaze hard drives.

Notes and Observations

One zero for the year: In 2023, only one drive model had zero failures, the 8TB Seagate (model: ST8000NM000A). In fact, that drive model has had zero failures in our environment since we started deploying it in Q3 2022. That “zero” does come with some caveats: We have only 204 drives in service and the drive has a limited number of drive days (52,876), but zero failures over 18 months is a nice start.

Failures for the year: There were 4,189 drives which failed in 2023. Doing a little math, over the last year on average, we replaced a failed drive every two hours and five minutes. If we limit hours worked to 40 per week, then we replaced a failed drive every 30 minutes.
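
If you’d like to check our math, the arithmetic is simple. Here’s a quick sketch reproducing both figures:

    # Reproducing the replacement-cadence math above.
    failures = 4189  # failed drives in 2023

    hours_per_year = 365 * 24  # 8,760 hours
    print(hours_per_year / failures)  # ~2.09 hours, i.e., about 2 hours 5 minutes

    work_hours_per_year = 40 * 52  # 2,080 working hours
    print(work_hours_per_year / failures)  # ~0.50 hours, i.e., about 30 minutes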

More drive models: In 2023, we added six drive models to the list while retiring zero, giving us a total of 35 different models we are tracking. 

Two of the models have been in our environment for a while but finally reached 60 drives in production by the end of 2023.

  1. Toshiba 8TB, model HDWF180: 60 drives.
  2. Seagate 18TB, model ST18000NM000J: 60 drives.

Four of the models were new to our production environment and have 60 or more drives in production by the end of 2023.

  1. Seagate 12TB, model ST12000NM000J: 195 drives.
  2. Seagate 14TB, model ST14000NM000J: 77 drives.
  3. Seagate 14TB, model ST14000NM0018: 66 drives.
  4. WDC 22TB, model WUH722222ALE6L4: 2,442 drives.

The drives for the three Seagate models are used to replace failed 12TB and 14TB drives. The 22TB WDC drives are a new model added primarily as two new Backblaze Vaults of 1,200 drives each.

Mixing and Matching Drive Models

There was a time when we purchased extra drives of a given model to have on hand so we could replace a failed drive with the same drive model. For example, if we needed 1,200 drives for a Backblaze Vault, we’d buy 1,300 to get 100 spares. Over time, we tested combinations of different drive models to ensure there was no impact on throughput and performance. This allowed us to purchase drives as needed, like the Seagate drives noted previously. This saved us the cost of buying drives just to have them hanging around for months or years waiting for the same drive model to fail.

Drives Not Included in This Review

We noted earlier there were 466 drives we removed from consideration in this review. These drives fall into three categories.

  • Testing: These are drives of a given model that we monitor and collect Drive Stats data on, but are in the process of being qualified as production drives. For example, in Q4 there were four 20TB Toshiba drives being evaluated.
  • Hot Drives: These are drives that were exposed to high temperatures while in operation. We have removed them from this review, but are following them separately to learn more about how well drives take the heat. We covered this topic in depth in our Q3 2023 Drive Stats Report. 
  • Less than 60 drives: This is a holdover from when we used a single storage server of 60 drives to store a blob of data sent to us. Today we divide that same blob across 20 servers, i.e., a Backblaze Vault, dramatically improving the durability of the data. For 2024 we are going to review the 60-drive criterion and most likely replace it with a minimum number of drive days in a given period of time. 

Regardless, in the Q4 2023 Drive Stats data you will find these 466 drives along with the data for the 269,756 drives used in the review.

Comparing Drive Stats for 2021, 2022, and 2023

The table below compares the AFR for each of the last three years. The table includes just those drive models which had over 200,000 drive days during 2023. The data for each year is inclusive of that year only for the operational drive models present at the end of each year. The table is sorted by drive size and then AFR.

A chart showing the failure rates of hard drives from 2021, 2022, and 2023.

Notes and Observations

What’s missing?: As noted, a drive model required 200,000 drive days or more in 2023 to make the list. Drives like the 22TB WDC model, with 126,956 drive days, and the 8TB Seagate, with zero failures but only 52,876 drive days, didn’t qualify. Why 200,000? Each quarter we use 50,000 drive days as the minimum number to qualify as statistically relevant. It’s not a perfect metric, but it minimizes the volatility sometimes associated with drive models that have a lower number of drive days.

The 2023 AFR was up: The AFR for all drive models listed was 1.70% in 2023. This compares to 1.37% in 2022 and 1.01% in 2021. Throughout 2023 we have seen the AFR rise as the average age of the drive fleet has increased. There are currently nine drive models with an average age of six years or more; together they make up nearly 20% of the drives in production. Since Q2, we have accelerated the migration from older drive models, typically 4TB in size, to new drive models, typically 16TB in size. This program will continue throughout 2024 and beyond.

Annualized Failure Rates vs. Drive Size

Now, let’s dig into the numbers to see what else we can learn. We’ll start by looking at the quarterly AFRs by drive size over the last three years.

A chart showing hard drive failure rates by drive size from 2021 to 2023.

To start, the AFR for the 10TB drives (gold line) is obviously increasing, as are those for the 8TB drives (gray line) and the 12TB drives (purple line). Each of these groups finished at an AFR of 2% or higher in Q4 2023 after starting from an AFR of about 1% in Q2 2021. On the other hand, the AFR for the 4TB drives (blue line) rose initially, peaked in 2022, and has decreased since. The remaining three drive sizes—6TB, 14TB, and 16TB—have oscillated around 1% AFR for the entire period. 

Zooming out, we can look at the change in AFR by drive size on an annual basis. If we compare the annual AFR results for 2022 to 2023, we get the table below. The results for each year are based only on the data from that year.

At first glance it may seem odd that the AFR for 4TB drives is going down, especially given that the average age of each of the 4TB drive models is over six years and climbing. The reason is likely related to our focus in 2023 on migrating from 4TB drives to 16TB drives. In general we migrate the oldest drives first, that is, those more likely to fail in the near future. This process of culling out the oldest drives appears to mitigate the expected rise in failure rates as a drive ages. 

But not all drive models play along. The 6TB Seagate drives are over 8.6 years old on average and, for 2023, have the lowest AFR of any drive size group, potentially making a mockery of the age-is-related-to-failure theory, at least over the last year. Let’s see if that holds true for the lifetime failure rate of our drives.

Lifetime Hard Drive Stats

We evaluated 269,756 drives across 35 drive models for our lifetime AFR review. The table below summarizes the lifetime drive stats data from April 2013 through the end of Q4 2023. 

A chart showing lifetime annualized failure rates for 2023.

The current lifetime AFR for all of the drives is 1.46%. This is up from the end of last year (Q4 2022), when it was 1.39%. This makes sense given the quarterly rise in AFR over 2023 as documented earlier. This is also the highest the lifetime AFR has been since Q1 2021 (1.49%). 

The table above contains all of the drive models active as of 12/31/2023. To declutter the list, we can remove those models which don’t have enough data to be statistically relevant. This does not mean the AFR shown above is incorrect; it just means we’d like more data to be confident about the failure rates we are listing. To that end, the table below only includes those drive models which have two million drive days or more over their lifetime, which gives us a manageable list of 23 drive models to review.

A chart showing the 2023 annualized failure rates for drives with more than 2 million drive days in their lifetimes.

Using the table above we can compare the lifetime drive failure rates of different drive models. In the charts below, we group the drive models by manufacturer, and then plot the drive model AFR versus average age in months of each drive model. The relative size of each circle represents the number of drives in each cohort. The horizontal and vertical scales for each manufacturer chart are the same.

A chart showing annualized failure rates by average age and drive manufacturer.
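
If you’d like to recreate this style of chart from our public data, a bubble plot takes only a few lines of matplotlib. The three data points below are illustrative placeholders, not our actual figures.

    # An illustrative sketch of the AFR vs. average age bubble chart.
    # The data points are placeholders, not real Drive Stats numbers.
    import matplotlib.pyplot as plt

    ages_months = [30, 55, 80]         # average age of each drive model
    afr_percent = [0.9, 1.4, 2.1]      # lifetime AFR of each drive model
    drive_counts = [20000, 8000, 900]  # drives per model; sets bubble size

    plt.scatter(ages_months, afr_percent,
                s=[n / 50 for n in drive_counts],  # scale counts to point sizes
                alpha=0.5)
    plt.xlabel("Average age (months)")
    plt.ylabel("Annualized failure rate (%)")
    plt.title("AFR vs. average age by drive model")
    plt.show()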

Notes and Observations

Drive migration: When selecting drive models to migrate, we could just replace the oldest drive models first; in this case, the 6TB Seagate drives. Given there are only 882 drives—that’s less than one Backblaze Vault—the impact on failure rates would be minimal. That aside, the chart makes it clear that we should continue to migrate our 4TB drives, as we discussed in our recent post on which drives reside in which storage servers. As that post notes, there are other factors, such as server age, server size (45 vs. 60 drives), and server failure rates, which help guide our decisions. 

HGST: The chart on the left below shows the AFR trendline (second-order polynomial) for all of our HGST models. It does not appear that drive failure consistently increases with age. The chart on the right shows the same data with the HGST 4TB drive models removed. The results are more in line with what we’d expect: that drive failure increases over time. While the 4TB drives perform great, they don’t appear to be the AFR benchmark for newer/larger drives.

One other potential factor not explored here is that, beginning with the 8TB drive models, helium was used inside the drives and the drives were sealed. Prior to that they were air-cooled and not sealed. So did switching to helium inside a drive affect the failure profile of the HGST drives? Interesting question, but with the data we have on hand, I’m not sure we can answer it—or that it matters much anymore, as helium is here to stay.

Seagate: The chart on the left below shows the AFR trendline (second-order polynomial) for our Seagate models. As with the HGST models, it does not appear that drive failure continues to increase with age. For the chart on the right, we removed the drive models that were greater than seven years old (average age).

Interestingly, the trendline for the two charts is basically the same up to the six-year point. If we attempt to project past that for the 8TB and 12TB drives, there is no clear direction. Muddying things up even more is the fact that the three models we removed because they are older than seven years are all consumer drive models, while the remaining drive models are all enterprise drive models. Will that make a difference in the failure rates of the enterprise drive models when they get to seven or eight or even nine years of service? Stay tuned.

Toshiba and WDC: As for the Toshiba and WDC drive models, there is a little over three years’ worth of data, and no discernible patterns have emerged. All of the drives from each of these manufacturers are performing well to date.

Drive Failure and Drive Migration

One thing we’ve seen above is that drive failure projections are typically drive model dependent. But we don’t migrate drive models as a group; instead, we migrate all of the drives in a storage server or Backblaze Vault, and the drives in a given server or Vault may not all be the same model. How we choose which servers and Vaults to migrate will be covered in a future post, but for now we’ll just say that drive failure isn’t everything.

The Hard Drive Stats Data

The complete data set used to create the tables and charts in this report is available on our Hard Drive Test Data page. You can download and use this data for free for your own purpose. All we ask are three things: 1) you cite Backblaze as the source if you use the data, 2) you accept that you are solely responsible for how you use the data, and 3) you do not sell this data itself to anyone; it is free.
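
As an example of what you can do with the data, here’s a short sketch that computes an annualized failure rate per model with pandas. It assumes you’ve concatenated the daily CSVs (one row per drive per day, with at least the model and failure columns the dataset provides) into a single file.

    # An illustrative sketch computing AFR per drive model from the
    # public Drive Stats data. Assumes the daily CSVs have been
    # concatenated into drive_stats.csv.
    import pandas as pd

    df = pd.read_csv("drive_stats.csv", usecols=["model", "failure"])

    stats = df.groupby("model").agg(
        drive_days=("failure", "size"),  # one row per drive per day
        failures=("failure", "sum"),     # failure is 1 on the day a drive fails
    )
    stats["afr_percent"] = stats["failures"] / (stats["drive_days"] / 365) * 100

    print(stats.sort_values("afr_percent").head(10))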

Good luck, and let us know if you find anything interesting.

The post Backblaze Drive Stats for 2023 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

]]>
https://www.backblaze.com/blog/backblaze-drive-stats-for-2023/feed/ 7
Backblaze Commits to Routing Security With MANRS Participation https://www.backblaze.com/blog/backblaze-commits-to-routing-security-with-manrs-participation/ https://www.backblaze.com/blog/backblaze-commits-to-routing-security-with-manrs-participation/#respond Thu, 08 Feb 2024 16:29:06 +0000 https://www.backblaze.com/blog/?p=110843 Backblaze is now a recognized Mutually Agreed Norms for Routing Security (MANRS) participant. What does that mean? Read on to find out.

The post Backblaze Commits to Routing Security With MANRS Participation appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

]]>
A decorative image displaying the MANRS logo.

They say good manners are better than good looks. When it comes to being a good internet citizen, we have to agree. And when someone else tells you that you have good manners (or MANRS in this case), even better. 

If you hold your cloud partners to a higher standard, and if you think it’s not asking too much that they make the internet a better, safer place for everyone, then you’ll be happy to know that Backblaze is now recognized as a Mutually Agreed Norms for Routing Security (MANRS) participant (aka MANRS Compliant). 

What Is MANRS?

MANRS is a global initiative with over 1,095 participants that are enacting network policies and controls to help reduce the most common routing threats. At a high level, we’re setting up filters to check that the network routing information we receive from peers is valid, ensuring that the networks we advertise to the greater internet are marked as owned by Backblaze, and making sure that data leaving our network is legitimate and can’t be spoofed.

You can view a full list of MANRS participants here.

What Our (Good) MANRS Mean For You

The biggest benefit for customers is that network traffic to and from Backblaze’s connection points where we exchange traffic with our peering partners is more secure and more trustworthy. All of the changes that we’ve implemented (which we get into below) are on our side—so, no action is necessary from Backblaze partners or users—and will be transparent for our customers. Our Network Engineering team has done the heavy lifting. 

MANRS Actions

Backblaze falls under the MANRS category of CDN and Cloud Providers, and as such, we’ve implemented solutions or processes for each of the five actions stipulated by MANRS:

  1. Prevent propagation of incorrect routing information: Ensure that traffic we receive is coming from known networks.
  2. Prevent traffic of illegitimate source IP addresses: Prevent malicious traffic coming out of our network.
  3. Facilitate global operational communication and coordination: Keep our records with third-party sites like PeeringDB.com up to date, as other operators use them to validate our connectivity details.
  4. Facilitate validation of routing information on a global scale: Digitally sign our network objects using the Resource Public Key Infrastructure (RPKI) standard.
  5. Encourage MANRS adoption: By telling the world, just like in this post!

Digging Deeper Into Filtering and RPKI

Let’s go over the filtering and RPKI details, since they are very valuable for ensuring the security and validity of our network traffic.

Filtering: Sorting Out the Good Networks From the Bad

One major action for MANRS compliance is to validate that the routes we receive from peers are valid. When we connect to other networks, each side tells the other about its networks in order to build a routing table that identifies the optimal path for sending traffic.

We can blindly trust what the other party is telling us, or we can reach out to an external source to validate. We’ve implemented automated internal processes to help us apply these filters to our edge routers (the devices that connect us externally to other networks).

If you’re a more visual learner, like me, here’s a quick conversational bubble diagram of what we have in place.

Externally verifying routing information we receive.

Every edge device that connects to an external peer now has validation steps to ensure that the networks we receive and use to send out traffic are valid. We have automated processes that periodically check for updates to these lists and deploy them.

What Is RPKI?

RPKI is a public key infrastructure framework designed to secure the internet’s routing infrastructure, specifically the Border Gateway Protocol (BGP). RPKI provides a way to connect internet number resource information (such as IP addresses) to a trust anchor. In layman’s terms, RPKI allows us, as a network operator, to securely identify whether other networks that interact with ours are legitimate or malicious.

RPKI: Signing Our Paperwork

Much like going to a notary to have a form validated, we can perform the same action digitally with the list of networks that we advertise to the greater internet. The RPKI framework allows us to stamp our networks as owned by us.

It also allows us to digitally sign records of the networks we own, allowing external parties to confirm that the networks they see from us are valid. If another party comes along and tries to claim to be us, a peering partner using RPKI will refuse to send data to the false Backblaze network.

You can check the status of our RPKI signed route objects on the MANRS statistics website.
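
You can run the same kind of check yourself against any public route. As one example, the RIPEstat Data API exposes an RPKI validation endpoint; in this sketch the ASN and prefix are reserved documentation values, and the response layout is our assumption based on RIPEstat’s public docs, so double-check it before relying on it.

    # An illustrative sketch querying the public RIPEstat Data API to
    # check whether an origin ASN is RPKI-valid for a prefix. The ASN
    # and prefix are reserved documentation values; the response shape
    # is an assumption based on RIPEstat's published documentation.
    import requests

    resp = requests.get(
        "https://stat.ripe.net/data/rpki-validation/data.json",
        params={"resource": "AS64500", "prefix": "203.0.113.0/24"},
        timeout=10,
    )
    resp.raise_for_status()
    status = resp.json()["data"]["status"]  # e.g., "valid", "invalid", or "unknown"
    print(f"RPKI validation status: {status}")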

What does the process of peering and advertising networks look like without RPKI validation?

A diagram that imagines IP address requests for ownership without RPKI standards. Bad actors would be able to claim traffic directed towards IP addresses that they don't own.
Bad actor claiming to be a Backblaze network without RPKI validation.

Now, with RPKI, we’ve dotted our I’s and crossed our T’s. A third party certificate holder serves as a validator for the digital certificates that we used to sign our network objects. If anyone else claims to be us, they will be marked as invalid and the peer will not accept the routing information, as you can see in the diagram below.

A diagram that imagines networking requests for ownership with RPKI standards properly applied. Bad actors would attempt to claim traffic towards an owned or valid IP address, but be prevented because they don't have the correct credentials.
With RPKI validation, the bad actor is denied the ability to claim to be a Backblaze network.

Mind Your MANRS

Our first value as a company is to be fair and good. It reads: “Be good. Trust is paramount. Build a good product. Charge fairly. Be open, honest, and accepting with customers and each other.” Almost sounds like Emily Post wrote it—that’s why our MANRS participation fits right in with the way we do business. We believe in an open internet, and participating in MANRS is just one way that we can contribute to a community that is working towards good for all.

The post Backblaze Commits to Routing Security With MANRS Participation appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

]]>
https://www.backblaze.com/blog/backblaze-commits-to-routing-security-with-manrs-participation/feed/ 0
Object Storage Simplified: Introducing Powered by Backblaze https://www.backblaze.com/blog/powered-by-announcement-2024/ https://www.backblaze.com/blog/powered-by-announcement-2024/#respond Tue, 06 Feb 2024 14:00:00 +0000 https://www.backblaze.com/blog/?p=110833 Announcing Powered By Backblaze, offering companies a way to incorporate easy, affordable cloud storage within their branded user experience.

The post Object Storage Simplified: Introducing Powered by Backblaze appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

]]>
A decorative image showing the Backblaze logo on a cloud hovering over a power button.

Today, we announced our new Powered by Backblaze program to give platform providers the ability to offer cloud storage without the burden of building scalable storage infrastructure (something we know a little bit about). 

If you’re an independent software vendor (ISV), technology partner, or any company that wants to incorporate easy, affordable data storage within your branded user experience, Powered by Backblaze will give you the tools to do so without complex code, capital outlay, or massive expense.

Read on to learn more about Powered by Backblaze and how it can help you enhance your platforms and services. Or, if you’d like to get started ASAP, contact our Sales Team for access.  

Benefits of Powered by Backblaze

  • Business Growth: Adding cloud services to your product portfolio can generate new revenue streams and/or grow your existing margin.
  • Improved Customer Experience: Take the complexity out of object storage and deliver the best experience by incorporating a proven cloud object storage solution.
  • Simplified Billing: Reduce complex billing by providing customers with a single bill from a single provider. 
  • Build Your Brand: Meet customer expectations by providing cloud storage under your own company name for consistency and brand identity.

What Is Powered by Backblaze?

Powered by Backblaze offers companies the ability to incorporate B2 Cloud Storage into their products so they can sell more services or enhance their user experience with no capital investment. Today, this program offers two solutions that support the provisioning of B2 Cloud Storage: Custom Domains and the Backblaze Partner API.

How Can I Leverage Custom Domains?

Custom Domains, launched today, lets you serve content to your end users from the web domain or URL of your choosing, with no need for complex code or proxy servers. Backblaze manages the heavy lifting of cloud storage on the back end.

Custom Domains functionality combines CNAME records with Backblaze B2 Object Storage, letting you use your preferred domain name in your files’ URLs instead of the domain name that Backblaze automatically assigns.

We’ve chosen Backblaze so we can have a reliable partner behind our new Edge Storage solution. With their Custom Domain feature, we can implement the security needed to serve data from Backblaze to end users from Azion’s Edge Platform, improving user experience.

—Rafael Umann, CEO, Azion, a full stack platform for developers

How Can I Leverage the Backblaze Partner API?

The Backblaze Partner API automates the provisioning and management of Backblaze B2 Cloud Storage accounts within a platform. It allows for managing accounts, running reports, and creating a bundled solution or managed service for a unified user experience.

We wrote more about the Backblaze Partner API here, but briefly: We created this solution by exposing existing API functionality in a manner that allows partners to automate tasks essential to provisioning users with seamless access to storage.

The Backblaze Partner API calls allow you to:

  • Create accounts (add Group members)
  • Organize accounts in Groups
  • List Groups
  • List Group members
  • Eject Group members

If you’d like to get into the details, you can dig deeper in our technical documentation.
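
To give a feel for the shape of such an integration, here’s a hypothetical sketch of adding a Group member over HTTPS from Python. The endpoint name, field names, and schema below are illustrative placeholders only, not the real Partner API surface; refer to the official Backblaze documentation for the actual calls.

    # A hypothetical sketch of a Partner API call. The endpoint path
    # and JSON fields are illustrative placeholders, not Backblaze's
    # documented API; consult the official docs for the real schema.
    import requests

    API_BASE = "https://api.backblazeb2.com/b2api/v2"  # standard B2 API base

    def add_group_member(auth_token: str, group_id: str, email: str) -> dict:
        # "b2_add_group_member" is a placeholder name for illustration.
        response = requests.post(
            f"{API_BASE}/b2_add_group_member",
            headers={"Authorization": auth_token},
            json={"groupId": group_id, "emailAddress": email},
        )
        response.raise_for_status()
        return response.json()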

Our customers produce thousands of hours of content daily and, with the shift to leveraging cloud services like ours, they need a place to store both their original and transcoded files. The Backblaze Partner API allows us to expand our cloud services and eliminate complexity for our customers—giving them time to focus on their business needs, while we focus on innovations that drive more value.

—Murad Mordukhay, CEO, Qencode

How to Get Started With Powered by Backblaze

To get started with Powered by Backblaze, contact our Sales Team. They will work with you to understand your use case and how you can best utilize Powered by Backblaze. 

What’s Next?

We’re looking forward to adding more to the Powered by Backblaze program as we continue investing in the tools you need to bring performant cloud storage to your users in an easy, seamless fashion.

The post Object Storage Simplified: Introducing Powered by Backblaze appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

]]>
https://www.backblaze.com/blog/powered-by-announcement-2024/feed/ 0