Data Center Archives - Backblaze Blog | Cloud Storage & Cloud Backup

Making the Data Center Choice (the Work Has Just Begun)
https://www.backblaze.com/blog/picking-right-eu-data-center-partner/ | Fri, 30 Aug 2019
We select our European data center.

Globe with Europe

Imagine a globe spinning (or simply look at the top of this blog post). When you start out on a data center search, you could consider almost any corner of the globe. For Backblaze, we knew we wanted to find an anchor location in the European Union, and for a variety of reasons we quickly zeroed in on Amsterdam, Brussels, and Dublin as the most likely locations. We generated a list of 40 qualified locations, narrowed it to ten for physical visits, and then cut it again to three finalists, but one question remained: How would we choose our ultimate partner? Data center searches have changed a lot since 2012, when we circulated our RFP for a previous expansion.

The good news is we knew our top line requirements would be met. Thinking back to the 2×2 that our Chief Cloud Officer, Tim Nufire, had drawn on the board at the early stages of our search, we felt good that we had weighed the tradeoffs appropriately.

EU data center cost risk quadrant
Cost vs risk

As with hiring an employee, after the screening and the interviews come the reference checks. In the case of data centers, that means both validating certain assertions and going into the gory details on certain operational capabilities. For example, in our second post in the EU DC series, we mentioned environmental risks. If you're looking to reduce the probability of catastrophe, making sure your DC sits outside of a flood zone is generally advisable. Of course, the best environmental risk factor reports are much more nuanced and account for changes in the environment.

To help us investigate those sorts of issues, we partnered with PTS Consulting. By engaging with third-party experts, we got dispassionate, unbiased, thorough reporting about the locations we were considering. Based on PTS's reporting, we eliminated one of our finalists. To be clear, there was nothing inherently wrong with the finalist, but it was unlikely that particular location would sustainably meet our long-term requirements without significant infrastructure upgrades on their end.

In our prior posts, we mentioned another partner, UpStack. Their platform helped us with sourcing and narrowing down to a list of finalists. Importantly, their advisory services were crucial in this final stage of diligence. Specifically, UpStack brought in electrical engineering expertise to give us a deep, detailed assessment of each facility's electrical and mechanical single-line diagrams. For those less versed in the details of DC power, that means UpStack was able to examine, in incredible granularity, the reliability and durability of the power systems feeding our candidate DCs.

Ultimately, it came down to two finalists:

  • DC 3: Interxion Amsterdam
  • DC 4: The pre-trip favorite

DC 4 had a lot going for it. Its pricing was the most affordable and its facility had more modern features and functionality. The biggest downsides were open questions around sourcing and training what would become our remote hands team.

Which brings us back to our matrix of tradeoffs. While more expensive than DC 4, the Interxion facility graded out equally well during diligence. Ultimately, the people at Interxion, and our confidence in the ability to build out a sturdy remote hands team there, made the choice of Interxion clear.

Cost vs risk and result
Cost vs risk and result

Looking back at Tim’s 2×2, DC 4 presented as financially more affordable, but operationally a little more risky (since we had questions about our ability to operate effectively on a day-to-day basis).

Interxion, while somewhat more expensive, reduced our operational risk. For our anchor location in Europe, that felt like the right tradeoff to be making.

Ready, Set, More Work!

The site selection only represented part of the journey. In parallel, our sourcing team has had to learn how to get pods and drives into Europe. Our Tech Ops & Engineering teams have worked through any number of issues around latency, performance, and functionality. Finance & Legal has worked through the implications of having a physical international footprint. And that’s just to name a few things.

Interxion - Backblaze data center floor plan
EU data center floor plan

If you’re in the EU, we’ll be at IBC 2019 in Amsterdam from September 13 to September 17. If you’re interested in making an appointment to chat further, use our form to reserve a time at IBC, or drop by stand 7.D67 at IBC (our friends from Cantemo are hosting us). Or, if you prefer, feel free to leave any questions in the comments below!

The Logistics of Finding the Right Data Center: The Great European (Non) Vacation
https://www.backblaze.com/blog/data-center-due-diligence/ | Thu, 29 Aug 2019
Join Backblaze on a European road trip to visit ten data center candidates in three countries in three days.

EU data center search map

Ten locations, three countries, three days. Even the hardest-working person in show business wouldn't take on a challenge like that. But that's exactly what our COO, John Tran, and UpStack's CEO, Chris Trapp, decided to do.

In yesterday’s post, we discussed the path to getting 40 bids from vendors that could meet our criteria for our new European data center (DC). This was a remarkable accomplishment in itself, but still only part way to our objective of actually opening a DC. We needed to narrow down the list.

With help from UpStack, we began to filter the list based on some qualitative characteristics: vendor reputation, vendor business focus, etc. Chris managed to get us down to a list of 10. The wonders of technology today, like the UpStack platform, help people get more information and cast wider nets than at any other time in human history. The downside is that you get a lot of information on paper, which is a poor substitute for what you can gather in person. If you're looking for a good, long-term partner, then understanding things like how they operate and their company DNA is imperative to finding the right match. So, to find our newest partner, we needed to take a trip.

Chris took the lead on booking appointments. The majority of the shortlist clustered in the Netherlands and Ireland; the others were in Belgium, and with the magic of Google Maps one could begin to envision an efficient trip covering all three countries. The feeling was it could all be done with just three days on the ground in Europe. Going in, they knew it would be a compressed schedule and that they would be on the move. As experienced travelers, they brought small bags that fit easily in the overhead bin, plus the right power adapters.

Hitting the Road

On July 23rd, 2018, John left San Francisco International Airport (SFO) at 7:40 a.m. on a non-stop to Amsterdam. Taking into account the 5,448 miles between the two cities and the time change, John landed at Amsterdam Airport Schiphol (AMS) at 7:35 a.m. on July 24th. He would land back home on July 27th at 6:45 p.m.

Tuesday (Day One)

The first day officially started when John's redeye touched down in Amsterdam at 7:35 a.m. local. Thankfully, Chris's flight from New York's La Guardia was also on time, so the two were able to meet at the airport: literally meet, as they had never met in person before.

Both adjourned to the airport men’s room to change out of their travel clothes and into their suits — choosing a data center is serious business, after all. While airport bathroom changes are best left for spy novels, John and Chris made short work of it and headed to the rental car area.

That day, they ended up touring four DCs. One of the biggest takeaways of the trip was that visiting data centers is similar to wine tasting. While some of the differences can be divined from the specs on paper, when trying to figure out the difference between A and B, it's very helpful to compare side by side. Also similar to wine tasting, there's a fine line between appreciating the nuances of multiple things and having it all start to blend together. In both cases, after a full day of doing it, you feel like you probably shouldn't operate heavy machinery.

On day one, our team saw a wide range of options. The physical plant is itself one area of differentiation. While we have requirements for things like power, bandwidth, and security, there’s still a lot of room for tradeoffs among those DCs that exceed the requirement. And that’s just the physical space. The first phase of successful screening (discussed in our prior post) is being effective at examining non-emotional decision variables — specs, price, reputation — but not the people. Every DC is staffed by human beings and cultural fit is important with any partnership. Throughout the day, one of the biggest differences we noticed was the culture of each specific DC.

The third stop of the day was Interxion Amsterdam. While we didn't know it at the time, they would end up being our partner of choice. On paper, it was clear that Interxion would be a contender. Its facility meets all our requirements and, by happenstance, had a footprint available almost exactly to the spec of what we were looking for. During our visit, the facility was as impressive as expected. But the connection we felt with the team there would ultimately prove to be the difference.

After leaving the last DC tour around 7 p.m., our team drove from Amsterdam to Brussels. Day two would be another early start and, after arriving in Brussels a little after 9 p.m., they had earned some rest!

Insider Tip: Grand Place, Brussels
Earlier in his career, John had spent a good amount of time in Europe and, specifically, Brussels. One of his favorite spots is the Grand Place (Brussels' central market square). If in the neighborhood, he recommends you go and enjoy a Belgian beer at one of the restaurants in the market. The smart move is to take the advice. Chris, newer to Brussels, gave John's tour a favorable TripAdvisor rating.

Wednesday (Day Two)

After a well-deserved couple of hours of sleep, the day officially started with an 8:30 a.m. meeting at the first DC of the day. Major DC operators generally have multiple locations, and DCs five and six were operated by companies that also operate sites visited on day one. It was remarkable, culturally, to compare the teams and operational variability across multiple locations. Even within the same company, teams at different locations have unique personalities and operating styles, which all serves to reinforce the need to physically visit your proposed partners before making a decision.

After two morning DC visits, John and Chris hustled to the Brussels airport to catch their flight to Dublin. At some point during the drive, it was realized that tickets to Dublin hadn’t actually been purchased. Smartphones and connectivity are transformative on road trips like this.

The flight itself was uneventful. When they landed, they got to the rental car area and their car was waiting for them. Oh, and one minor detail: the steering wheel was on the wrong side of the car! Chris buckled in tightly and John had flashbacks of driver's ed, having never driven from the right side of a car. Shortly after leaving the airport, it was realized that one also drives on the left side of the road in Ireland. Smartphones and connectivity were not required for this discovery. Thankfully, the drive was uneventful and the hotel was reached without incident. After work and family check-ins, another day was put on the books.

Brazen Head Pub in Dublin from Flickr, https://www.flickr.com/photos/chadlewis/5272488408

Our team checked into their hotel and headed over to the Brazen Head Pub for dinner. Ireland's oldest pub is worth the visit. It's here that we came across our "it really is a small world" moment of the trip. After starting a conversation with their neighbors at dinner, our team was asked what they were doing in Dublin. John introduced himself as Backblaze's COO and the conversation seemed to cool a bit. It turned out their neighbor worked for another large cloud storage provider. Apparently, not all companies like sharing information as much as we do.

Thursday (Day Three)

The day again started with an 8:30 a.m. hotel departure. Bear in mind, during all of this, John and Chris both had their day jobs and families back home to stay in touch with. Today would feature four DC tours. One interesting note about the trip: operating a data center requires a fair amount of infrastructure. In a perfect world, power and bandwidth come in at multiple locations from multiple vendors. This often causes DCs to cluster around infrastructure hubs. Today’s first two DCs were across the street from one another. We’re assuming, but could not verify, a fierce inter-company football rivalry.

While walking across the street was interesting, in the case of the final two DCs, they literally shared the same space, with the smaller provider subleasing space from the larger. Here, again, the operating personalities differentiated the companies. It's not necessarily that one was worse than the other; it's a question of who you think will be a better partnership match for your own style. In this case, the smaller of the two providers stood out because of the passion and enthusiasm we felt from the team there, and it didn't hurt that they are long-time Hard Drive Stats enthusiasts (flattery will get you everywhere!).

While the trip, and this post, were focused on finding our new DC location, opening up our first physical operations outside of the U.S. had any number of business ramifications. As such, John made sure to swing by the local office of our global accounting firm to take the opportunity to get to know them.

The meeting wrapped up just in time for Chris and John to make it to the Guinness factory by 6:15 p.m. Upon arrival, it was then realized that the last entry into the Guinness factory is 6 p.m. Smartphones and connectivity really can be transformative on road trips like this. All that said, without implicating any of the specific actors, our fearless travelers managed to finagle their way in and could file a report home that they were able to grab a pint or two at St. James's Gate.

Guinness sign

Guinness glass

The team would leave for their respective homes early the next morning. John made it back to California in time for a (late) dinner with his family and a well-earned weekend.

After a long, productive trip, we had our list of three finalists. Tomorrow, we'll discuss how we narrowed it down from three to one. Until then, sláinte (cheers)!

Getting Ready to Go to Europe
https://www.backblaze.com/blog/getting-ready-to-go/ | Wed, 28 Aug 2019
On Tuesday August 20th, we announced the opening of our first European data center. This post is the first in our three-part series on why we wanted an EU data center and how we selected one.

EU data center cost risk quadrant

There’s an old saying, “How do you eat an elephant? One bite at a time.” The best way to tackle big problems is to simplify as much as you can.

In our case, with almost an exabyte of customer data under management and customers in over 160 countries, expanding the geographic footprint of our data centers (DCs) has been a frequently discussed topic. Prior to opening up EU Central, we had three DCs, but all in the western U.S. The topic of opening a DC in Europe is not a new one within Backblaze, but going from idea to storing customer data can be a long journey.

As our team gathered to prioritize the global roadmap, the first question was an obvious one: Why do we want to open a DC in Europe? The answer was simple: Customer demand.

While nearly 15 percent of our existing customer base already resides in Europe, the requests for an EU DC came from customers around the globe. Why?

  • Customers like keeping data in multiple geographies. Doing so is in line with the best practices of backup (long before there was a cloud, there was still 3-2-1).
  • Geopolitical/regulatory concerns. For any number of reasons, customers may prefer or be required to store data in certain physical locations.
  • Performance concerns. While we enjoy a debate about the effects of latency for most storage use cases, the reality is many customers want a copy of their data as physically close as possible to where it's being used.

With the need established, the next question was predictably obvious: How are we going to go about this? Our three existing DCs are all in the same timezone as our California headquarters. Logistically, opening and operating a DC that has somewhere around an eight hour time difference from our headquarters felt like a significant undertaking.

Organizing the Search for the Right Data Center

To help get us organized, our co-founder and Chief Cloud Officer, Tim Nufire, drew the following on a whiteboard.

Expense/Risk chart for data center location
Cost vs risk and result

This basic matrix frames the challenge well. If one were willing to accept infinite risk (have customers write their data to scrolls and “upload” via sealed bottles transported across the ocean), we'd have low financial and effort outlays to open the data center. However, we're not in the business of accepting infinite risk. So we wanted to achieve a low-risk environment for data storage while sustaining our cost advantage for our customers.

But things get much more nuanced once you start digging in.

Risks

There are multiple risk factors to consider when selecting a DC. Some of the leading ones are:

  • Environmental: One could choose a DC in the middle of a floodplain, but, with few exceptions, most DCs don't work well underwater. We needed to find an area that minimizes exposure to adverse environmental events.
  • Political: DCs are physical places. Physical places are governed by some form of nation-state. Some customers want (or need) their data to be stored within certain regulatory or diplomatic parameters. In the case of the requests for opening a DC in Europe, many of our customers want their data to be inside of the European Union (EU). That requirement struck Switzerland off our list. For similar reasons, another requirement we imposed was operating inside of a country that is a NATO member. Regrettably, that eliminated any location inside of Finland. Our customers want the EU, not just Europe.
  • Financial: By opening a DC in Europe, we will be conducting business with a partner that expects to be paid in euros. As an American company, we primarily operate in dollars. So now the simple timing of when we pay our bills may change the cost (depending on exchange rate fluctuations).

Costs

The other dimension on the board was cost, expressed as Affordable to Expensive. Costs can be thought of as both financial and effort-based:

  • Operating Efficiency: Generally speaking, the climate of the geography will have an effect on the heating/cooling costs. We needed to understand climate nuances across a broad geographic area.
  • Cost of Inputs: Power costs vary widely, often because fuel sources have different availability at a local level. For example, nuclear power is generally cheaper than fossil fuel, but may not be available in a given region. Complicating things, a given power source may cost one thing in one country, but something totally different in the next. Our DC negotiations may be for physical space, but we needed to understand our total cost of ownership.
  • Staffing: Some DCs provide remote hands (contract labor) while others expect us to provide our own staffing. We needed to get up to speed on labor laws and talent pools in desired regions.

Trying to Push Forward

We’re fortunate to have a great team of Operations people that have earned expertise in the field. So with the desire to find a DC in the EU, a working group formed to explore our options. A little while later, when the internal memo circulated, the summary in the body of the email jumped out:

“It could take 6-12 months from project kick-off to bring a new EU data center online.”

That’s a significant project for any company. In addition, the time range was sufficiently wide to indicate the number of unknowns in play. We were faced with a difficult decision: How can we move forward on a project with so many unknowns?

While this wouldn't be our first data center search, prior experience told us we had many more unknowns in front of us. Our most recent facility searches mainly involved coordinating with known vendors to obtain facility reports and pricing for comparison. Even with known vendors, this process involved significant resources from Backblaze to relay requirements to various DC sales reps and to turn disparate quotes into some sort of comparison. All DCs will quote you dollars per kilowatt-hour ($/kWh), but there is no standard definition of what is and isn't included in that. Generally speaking, a DC contract has unit costs that decline as usage goes up. So is the $/kWh in a given quote the blended lifetime cost? Year one? Year five? Adding to this complexity would be all the variables discussed above (and more).
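To make that ambiguity concrete, here is a small, hypothetical illustration (every number below is invented for the example, not taken from any actual quote): with tiered pricing and a usage ramp, the year-one rate and the blended lifetime rate for the very same contract can differ noticeably, which is why quotes have to be normalized before they can be compared.

```python
# Hypothetical illustration (all numbers invented) of why "the $/kWh in a
# quote" is ambiguous: with tiered pricing and a multi-year usage ramp, the
# year-one rate and the blended lifetime rate differ for the same contract.
TIERS = [(100, 0.14), (250, 0.12), (float("inf"), 0.10)]  # (kW up to, $/kWh)

def rate(kw):
    """Return the $/kWh tier that applies at a given committed load."""
    return next(price for limit, price in TIERS if kw <= limit)

usage_kw = {1: 80, 2: 180, 3: 300, 4: 320, 5: 340}  # load ramp over a 5-year term
HOURS_PER_YEAR = 24 * 365

cost = {yr: kw * HOURS_PER_YEAR * rate(kw) for yr, kw in usage_kw.items()}
year_one_rate = cost[1] / (usage_kw[1] * HOURS_PER_YEAR)
blended_rate = sum(cost.values()) / (sum(usage_kw.values()) * HOURS_PER_YEAR)

print(f"year-one rate: ${year_one_rate:.3f}/kWh")   # $0.140/kWh
print(f"blended rate:  ${blended_rate:.3f}/kWh")    # ~$0.106/kWh -- same contract
```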

Interested in learning more about the initial assessment of the project? Here is a copy of the internal memo referenced. Because of various privacy agreements, we needed to redact small pieces of the original. Very little has been changed and, if you’re interested in the deep dive, we hope you’ll enjoy!

Serendipity Strikes: UpStack

Despite the obstacles in our path, our team committed to finding a location inside the EU that makes sense for both our customers' needs and our business model. We have an experienced team that has demonstrated the ability to source and vet DCs. That said, our experienced team was already quite busy with their day jobs. This project looked to come at a significant opportunity cost, as it would fully occupy a number of people for an extended period of time.

At the same time as we were trying to work through the internal resource planning, our CEO happened across an interesting article from our friends at Data Center Knowledge; they were covering a startup called UpStack (“Kayak for data center services”). The premise was intriguing — the UpStack platform is designed to gather and normalize quotes from qualified vendors for relevant opportunities. Minimizing friction for bidding DCs and Backblaze would enable both sides to find the right fit. Intrigued, we reached out to their CEO, Chris Trapp.

UpStack Logo
UpStack is a free, vendor-neutral data center sourcing platform that allows businesses to analyze and compare level-set pricing and specifications in markets around the world. Find them at upstack.com.

We were immediately impressed with how easy the user experience was on our side. Knowing how much effort goes into normalizing the data from various DCs, having a DC shopping experience comparable to searching for plane tickets was mind-blowing. With a plane ticket, you might search by number of stops and layover airports. With UpStack, we were able to search for connectivity to existing bandwidth providers, compliance certifications, and location before asking for pricing.

Once vendors returned pricing, UpStack's application made it easy to compare specifications and pricing on an apples-to-apples basis. This price normalization was a huge advantage for us, as it saved many hours of work usually spent converting quotes into pricing models simply for comparison's sake. We have the expertise to do what UpStack does, but we also know how much time that takes us. Being able to leverage a trusted partner was a tremendous value add for Backblaze.

UpStack data center search map
Narrowing down the DC possibilities with UpStack

Narrowing Down The Options

With the benefit of the UpStack platform, we were able to cast a much wider net than would have been viable hopping on phone calls from California.

We specified our load ramp. There's a finite amount of data that will flow into the new DC on day one, and it only grows from there. So part of the pricing negotiation is agreeing to deploy a minimum number of racks on day one, a minimum by the end of year one, and so on. In return for the guaranteed revenue, the DCs return pricing based on those deployments. Based on the forecasted storage needs, UpStack's tool then translates that into estimated power needs so vendors can return bids based on estimated usage. This is an important change from how things are usually done; many quotes otherwise price based on the top estimated usage or a vendor-imposed minimum. By basing quotes off of one common forecast, we could get pricing that fits our needs.

There are many more efficiencies that UpStack provides us, and we'd encourage you to visit their site at https://upstack.com to learn more. The punchline is that we were able to create a shortlist of the DCs that fit our requirements; we received 40 quotes from 40 data centers in 10 markets for evaluation. This was a blessing and a curse: we were able to cast a wider net and learn about more qualified vendors than we thought possible, but a list of 40 needed to be narrowed down.

Based on our cost/risk framework, we narrowed it down to the 10 DCs that we felt gave us our best shot at ending up with a low-cost, low-risk partner. With all the legwork done, it was time to go visit. To learn more about our three-country trip to 10 facilities in less than 72 hours, tune in tomorrow. Same bat time, same bat station.

Announcing Our First European Data Center
https://www.backblaze.com/blog/announcing-our-first-european-data-center/ | Tue, 27 Aug 2019
We have big news. Starting today, our first European data center, in Amsterdam, is open and accepting customer data!

city view of Amsterdam, Netherlands

Big news: Our first European data center, in Amsterdam, is open and accepting customer data!

This is our fourth data center (DC) location and the first outside of the western United States. As longtime readers know, we have two DCs in the Sacramento, California area and one in the Phoenix, Arizona area. As part of this launch, we are also introducing the concept of regions.

When creating a Backblaze account, customers can choose whether that account’s data will be stored in the EU Central or US West region. The choice made at account creation time will dictate where all of that account’s data is stored, regardless of product choice (Computer Backup or B2 Cloud Storage). For customers wanting to store data in multiple regions, please read this knowledge base article on how to control multiple Backblaze accounts using our (free) Groups feature.

Whether you choose EU Central or US West, your pricing for our products will be unchanged:

  • For B2 Cloud Storage — it’s $0.005/GB/Month. For comparison, storing your data in Amazon S3’s Ireland region will cost ~4.5x more
  • For Computer Backup — $60/Year/Computer is the yearly cost of our industry leading, unlimited data backup for desktops/laptops

Later this week we will be publishing more details on the process we undertook to get to this launch. Here’s a sneak preview:

  • Wednesday, August 28: Getting Ready to Go (to Europe). How do you even begin to think about opening a DC that isn’t within any definition of driving distance? For the vast majority of companies on the planet, simply figuring out how to get started is a massive undertaking. We’ll be sharing a little more on how we thought about our requirements, gathered information, and the importance of NATO in the whole equation.
  • Thursday, August 29: The Great European (Non) Vacation. With all the requirements done, research gathered, and preliminary negotiations held, there comes a time when you need to jump on a plane and go meet your potential partners. For John & Chris, that meant 10 data center tours in 72 hours across three countries — not exactly a relaxing summer holiday, but vitally important!
  • Friday, August 30: Making a Decision. After an extensive search, we are very pleased to have found our partner in Interxion! We’ll share a little more about the process of narrowing down the final group of candidates and selecting our newest partner.
If you’re interested in learning more about the physical process of opening up a data center, check out our post on the seven days prior to opening our Phoenix DC.

New Data Center FAQs:

Q: Does the new DC mean Backblaze has multi-region storage?
A: Yes, by leveraging our Groups functionality. When creating an account, users choose where their data will be stored. The default option will store data in US West, but to choose EU Central, simply select that option in the pull-down menu.

Region selector
Choose EU Central for data storage

If you create a new account with EU Central selected and have an existing account that’s in US West, you can put both of them in a Group, and manage them from there! Learn more about that in our Knowledge Base article.

Q: I’m an existing customer and want to move my data to Europe. How do I do that?
A: At this time, we do not support moving existing data between Backblaze regions. While it is on our roadmap, we do not have an estimated release date for that functionality. However, any customer can create a new account in the EU Central region and upload data to it; customers with multiple accounts can administer them via our Groups feature, and can then either keep or delete the previous Backblaze account in US West. For more details on how to do that, please see this Knowledge Base article.

Q: Finally! I’ve been waiting for this and am ready to get started. Can I use your rapid ingest device, the B2 Fireball?
A: Yes! However, as of the publication of this post, all Fireballs will ship back to one of our U.S. facilities for secure upload (regardless of account location). By the end of the year, we hope to offer Fireball support natively in Europe (so a Fireball with a European customer’s data will never leave the EU).

Q: Does this mean that my data will never leave the EU?
A: Any data uploaded by the customer does not leave the region it was uploaded to unless at the explicit direction of the customer. For example, restores and snapshots of data stored in Europe can be downloaded directly from Europe. However, customers requesting an encrypted hard drive with their data on it will have that drive prepared from a secure U.S. location. In addition, certain metadata about customer accounts (e.g. email address for your account) reside in the U.S. For more information on our privacy practices, please read our Privacy Policy.

Q: What are my payment options?
A: All payments to Backblaze are made in U.S. dollars. To get started, you can enter your credit card within your account.

Q: What’s next?
A: We're actively working on region selection for individual B2 buckets (instead of Backblaze region selection on an account basis), which should open up a lot more interesting workflows! For example, customers who want to can create geographic redundancy for data within one B2 account (and those who don't can sleep well knowing that Backblaze durability is calculated at 11 nines).

We like to develop the features and functionality that our customers want. The decision to open up a data center in Europe is directly related to customer interest. If you have requests or questions, please feel free to put them in the comment section below.

Backblaze Vaults: Zettabyte-Scale Cloud Storage Architecture
https://www.backblaze.com/blog/vault-cloud-storage-architecture/ | Tue, 18 Jun 2019
Backblaze introduced our Vault architecture in 2015. It's four years later and Vaults still provide the foundation for Backblaze's highly durable and cost-efficient cloud storage.

A lot has changed in the four years since Brian Beach wrote a post announcing Backblaze Vaults, our software architecture for cloud data storage. Just looking at how the major statistics have changed, we now have over 100,000 hard drives in our data centers instead of the 41,000 mentioned in the original post. We have three data centers (soon four) instead of one. We're approaching one exabyte of data stored for our customers (almost seven times the 150 petabytes back then), and we've recovered over 41 billion files for our customers, up from the 10 billion in the 2015 post.

In the original post, we discussed having durability of seven nines. Shortly thereafter, it was upped to eight nines. In July of 2018, we took a deep dive into the calculation and found our durability closer to eleven nines (and went into detail on the calculations used to arrive at that number). And, as followers of our Hard Drive Stats reports will be interested in knowing, we've just started testing our first 16 TB drives, which are twice the size of the biggest drives we used back at the time of this post — then a whopping 8 TB.

We’ve updated the details here and there in the text from the original post that was published on our blog on March 11, 2015. We’ve left the original 135 comments intact, although some of them might be non sequiturs after the changes to the post. We trust that you will be able to sort out the old from the new and make sense of what’s changed. If not, please add a comment and we’ll be happy to address your questions.

— Editor

Storage Vaults form the core of Backblaze’s cloud services. Backblaze Vaults are not only incredibly durable, scalable, and performant, but they dramatically improve availability and operability, while still being incredibly cost-efficient at storing data. Back in 2009, we shared the design of the original Storage Pod hardware we developed; here we’ll share the architecture and approach of the cloud storage software that makes up a Backblaze Vault.

Backblaze Vault Architecture for Cloud Storage

The Vault design follows the overriding design principle that Backblaze has always followed: keep it simple. As with the Storage Pods themselves, the new Vault storage software relies on tried and true technologies used in a straightforward way to build a simple, reliable, and inexpensive system.

A Backblaze Vault is the combination of the Backblaze Vault cloud storage software and the Backblaze Storage Pod hardware.

Putting The Intelligence in the Software

Another design principle for Backblaze is to anticipate that all hardware will fail and build intelligence into our cloud storage management software so that customer data is protected from hardware failure. The original Storage Pod systems provided good protection for data and Vaults continue that tradition while adding another layer of protection. In addition to leveraging our low-cost Storage Pods, Vaults take advantage of the cost advantage of consumer-grade hard drives and cleanly handle their common failure modes.

Distributing Data Across 20 Storage Pods

A Backblaze Vault is composed of 20 Storage Pods, with the data evenly spread across all 20 pods. Each Storage Pod in a given vault has the same number of drives, and the drives are all the same size.

Drives in the same drive position in each of the 20 Storage Pods are grouped together into a storage unit we call a tome. Each file is stored in one tome and is spread out across the tome for reliability and availability.

Twenty hard drives (one per Storage Pod) create one tome that shares the parts of a file.

Every file uploaded to a Vault is divided into pieces before being stored. Each of those pieces is called a shard. Parity shards are computed to add redundancy, so that a file can be fetched from a vault even if some of the pieces are not available.

Each file is stored as 20 shards: 17 data shards and three parity shards. Because those shards are distributed across 20 Storage Pods, the Vault is resilient to the failure of a Storage Pod.

Files can be written to the Vault when one pod is down and still have two parity shards to protect the data. Even in the extreme and unlikely case where three Storage Pods in a Vault lose power, the files in the vault are still available because they can be reconstructed from the 17 pods that remain available.
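To restate that availability rule in executable form (the 17/3/20 figures are the ones described in this post; the code itself is just an illustrative sketch, not Backblaze's software):

```python
# Sketch of the Vault availability rule described above: 17 data shards plus
# 3 parity shards, one shard per Storage Pod across the 20 pods of a Vault.
DATA_SHARDS = 17
PARITY_SHARDS = 3
TOTAL_PODS = DATA_SHARDS + PARITY_SHARDS   # 20

def can_read(pods_up):
    # Any 17 of the 20 shards are enough to reconstruct a file.
    return pods_up >= DATA_SHARDS

def can_write(pods_up):
    # Per the write policy described in this post, new files are accepted
    # while at most one pod is down, so every new file still has two
    # parity shards protecting it.
    return pods_up >= TOTAL_PODS - 1

for pods_down in range(5):
    pods_up = TOTAL_PODS - pods_down
    print(f"{pods_down} pods down -> read: {can_read(pods_up)}, "
          f"write: {can_write(pods_up)}")
# 0-1 pods down: read and write; 2-3 pods down: read only;
# 4+ pods down: files unavailable until a pod comes back.
```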

Storing Shards

Each of the drives in a Vault has a standard Linux file system, ext4, on it. This is where the shards are stored. There are fancier file systems out there, but we don’t need them for Vaults. All that is needed is a way to write files to disk and read them back. Ext4 is good at handling power failure on a single drive cleanly without losing any files. It’s also good at storing lots of files on a single drive and providing efficient access to them.

Compared to a conventional RAID, we have swapped the layers here by putting the file systems under the replication. Usually, RAID puts the file system on top of the replication, which means that a file system corruption can lose data. With the file system below the replication, a Vault can recover from a file system corruption because a single corrupt file system can lose at most one shard of each file.

Creating Flexible and Optimized Reed-Solomon Erasure Coding

Just like RAID implementations, the Vault software uses Reed-Solomon erasure coding to create the parity shards. But, unlike Linux software RAID, which offers just one or two parity blocks, our Vault software allows for an arbitrary mix of data and parity. We are currently using 17 data shards plus three parity shards, but this could be changed on new vaults in the future with a simple configuration update.

Vault Row of Storage Pods

For Backblaze Vaults, we threw out the Linux RAID software we had been using and wrote a Reed-Solomon implementation from scratch, which we wrote about in “Backblaze Open-sources Reed-Solomon Erasure Coding Source Code.” It was exciting to be able to use our group theory and matrix algebra from college.

The beauty of Reed-Solomon is that we can then re-create the original file from any 17 of the shards. If one of the original data shards is unavailable, it can be re-computed from the other 16 original shards, plus one of the parity shards. Even if three of the original data shards are not available, they can be re-created from the other 17 data and parity shards. Matrix algebra is awesome!
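Here is a toy sketch of why "any 17 of the 20" works. It is deliberately not Backblaze's implementation (the real, open-sourced code is a matrix-based Reed-Solomon coder operating on bytes); instead it uses exact fractions and polynomial interpolation, which rests on the same idea: 17 values determine a unique polynomial of degree 16, every shard is that polynomial evaluated at a different point, so any 17 surviving shards pin the polynomial down and let you recompute the rest.

```python
# Toy illustration of the "any 17 of 20 shards" property. This is NOT
# Backblaze's implementation (real Reed-Solomon works on bytes in a finite
# field); exact fractions are used here purely to keep the math visible.
from fractions import Fraction

def lagrange_eval(points, x_target):
    """Evaluate, at x_target, the unique polynomial passing through `points`."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= Fraction(x_target - xj, xi - xj)
        total += term
    return total

def encode(data, n_total):
    """Return n_total shards: the data values themselves plus parity points."""
    data_shards = [(x, Fraction(d)) for x, d in enumerate(data, start=1)]
    parity_shards = [(x, lagrange_eval(data_shards, x))
                     for x in range(len(data) + 1, n_total + 1)]
    return data_shards + parity_shards

# 17 data values (standing in for 17 data shards) plus 3 parity shards.
data = [314, 27, 182, 8, 500, 61, 99, 3, 777, 42, 250, 11, 64, 905, 120, 33, 48]
shards = encode(data, n_total=20)

# Lose any three shards -- say the pods holding shards 2, 9, and 20 fail.
surviving = [s for s in shards if s[0] not in (2, 9, 20)]
assert len(surviving) == 17

# Any 17 surviving shards determine the polynomial, so the lost data shards
# can be recomputed exactly.
print(lagrange_eval(surviving, 2))   # 27  (the original value of shard 2)
print(lagrange_eval(surviving, 9))   # 777 (the original value of shard 9)
```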

Handling Drive Failures

The reason for distributing the data across multiple Storage Pods and using erasure coding to compute parity is to keep the data safe and available. How are different failures handled?

If a disk drive just up and dies, refusing to read or write any data, the Vault will continue to work. Data can be written to the other 19 drives in the tome, because the policy setting allows files to be written as long as there are two parity shards. All of the files that were on the dead drive are still available and can be read from the other 19 drives in the tome.

Building a Backblaze Vault Storage Pod

When a dead drive is replaced, the Vault software will automatically populate the new drive with the shards that should be there; they can be recomputed from the contents of the other 19 drives.

A Vault can lose up to three drives in the same tome at the same moment without losing any data, and the contents of the drives will be re-created when the drives are replaced.

Handling Data Corruption

Disk drives try hard to correctly return the data stored on them, but once in a while they return the wrong data, or are just unable to read a given sector.

Every shard stored in a Vault has a checksum, so that the software can tell if it has been corrupted. When that happens, the bad shard is recomputed from the other shards and then re-written to disk. Similarly, if a shard just can’t be read from a drive, it is recomputed and re-written.

Conventional RAID can reconstruct a drive that dies, but does not deal well with corrupted data because it doesn’t checksum the data.
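The verify-and-heal flow can be sketched as follows. This is a conceptual illustration only, not Backblaze's code: the in-memory `shards` and `checksums` dictionaries stand in for data on disk, and the `rebuild` helper simply regenerates the payload where the real system would recompute the shard from the other shards in the tome via Reed-Solomon.

```python
# Conceptual sketch (not Backblaze's code) of checksummed shards with
# self-healing reads: a checksum mismatch triggers a rebuild and re-write
# instead of returning corrupt data.
import hashlib

shards = {i: f"shard-{i} payload".encode() for i in range(20)}      # "on disk"
checksums = {i: hashlib.sha256(b).hexdigest() for i, b in shards.items()}

def rebuild(shard_id):
    # Stand-in for Reed-Solomon reconstruction from the other 19 shards.
    return f"shard-{shard_id} payload".encode()

def read_verified(shard_id):
    data = shards[shard_id]
    if hashlib.sha256(data).hexdigest() != checksums[shard_id]:
        data = rebuild(shard_id)       # recompute the bad shard...
        shards[shard_id] = data        # ...and re-write it to disk
    return data

shards[7] = b"bit rot!"                          # simulate silent corruption
assert read_verified(7) == b"shard-7 payload"    # healed transparently
```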

Scaling Horizontally

Each vault is assigned a number. We carefully designed the numbering scheme to allow for a lot of vaults to be deployed, and designed the management software to handle scaling up to that level in the Backblaze data centers.

The overall design scales very well because file uploads (and downloads) go straight to a vault, without having to go through a central point that could become a bottleneck.

There is an authority server that assigns incoming files to specific Vaults. Once that assignment has been made, the client then uploads data directly to the Vault. As the data center scales out and adds more Vaults, the capacity to handle incoming traffic keeps going up. This is horizontal scaling at its best.

We could deploy a new data center with 10,000 Vaults holding 16TB drives and it could accept uploads fast enough to reach its full capacity of 160 exabytes in about two months!
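As a back-of-the-envelope check of that claim, using only numbers quoted in this post (20 pods of 60 drives per Vault, a 17-of-20 data-to-parity split, and 20 Gbps of ingest per Vault), our own arithmetic lands in the same neighborhood:

```python
# Back-of-the-envelope check of the "10,000 Vaults of 16TB drives, roughly
# 160 exabytes, filled in about two months" claim, using figures from this post.
vaults           = 10_000
drives_per_vault = 20 * 60          # 20 pods x 60 drives = 1,200 drives
drive_tb         = 16
data_fraction    = 17 / 20          # 17 data shards out of every 20 written

usable_eb = vaults * drives_per_vault * drive_tb * data_fraction / 1e6
ingest_tb_per_sec = vaults * (20e9 / 8) / 1e12      # 20 Gbps per Vault
days_to_fill = (usable_eb * 1e6) / ingest_tb_per_sec / 86_400

print(f"usable capacity ~ {usable_eb:.0f} EB")       # ~163 EB
print(f"time to fill    ~ {days_to_fill:.0f} days")  # ~76 days, roughly two months
```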

Backblaze Vault Benefits

The Backblaze Vault architecture has six benefits:

1. Extremely Durable

The Vault architecture is designed for 99.999999% (eight nines) annual durability (now 11 nines — Editor). At cloud-scale, you have to assume hard drives die on a regular basis, and we replace about 10 drives every day. We have published a variety of articles sharing our hard drive failure rates.

The beauty with Vaults is that not only does the software protect against hard drive failures, it also protects against the loss of entire Storage Pods or even entire racks. A single Vault can have three Storage Pods — a full 180 hard drives — die at the exact same moment without a single byte of data being lost or even becoming unavailable.

2. Infinitely Scalable

A Backblaze Vault is composed of 20 Storage Pods, each with 60 disk drives, for a total of 1,200 drives. Depending on the size of the hard drive, each vault will hold:

12TB hard drives => 12.1 petabytes/vault (Deploying today.)
14TB hard drives => 14.2 petabytes/vault (Deploying today.)
16TB hard drives => 16.2 petabytes/vault (Small-scale testing.)
18TB hard drives => 18.2 petabytes/vault (Announced by WD & Toshiba)
20TB hard drives => 20.2 petabytes/vault (Announced by Seagate)

Backblaze Data Center

At our current growth rate, Backblaze deploys one to three Vaults each month. As the growth rate increases, the deployment rate will also increase. We can incrementally add more storage by adding more and more Vaults. Without changing a line of code, the current implementation supports deploying 10,000 Vaults per location. That’s 160 exabytes of data in each location. The implementation also supports up to 1,000 locations, which enables storing a total of 160 zettabytes (also known as 160,000,000,000,000 GB)!

3. Always Available

Data backups have always been highly available: if a Storage Pod was in maintenance, the Backblaze online backup application would contact another Storage Pod to store data. Previously, however, if a Storage Pod was unavailable, some restores would pause. For large restores this was not an issue since the software would simply skip the Storage Pod that was unavailable, prepare the rest of the restore, and come back later. However, for individual file restores and remote access via the Backblaze iPhone and Android apps, it became increasingly important to have all data be highly available at all times.

The Backblaze Vault architecture enables both data backups and restores to be highly available.

With the Vault arrangement of 17 data shards plus three parity shards for each file, all of the data is available as long as 17 of the 20 Storage Pods in the Vault are available. This keeps the data available while allowing for normal maintenance and rare expected failures.

4. Highly Performant

The original Backblaze Storage Pods could individually accept 950 Mbps (megabits per second) of data for storage.

The new Vault pods have more overhead, because they must break each file into pieces, distribute the pieces across the local network to the other Storage Pods in the vault, and then write them to disk. In spite of this extra overhead, the Vault is able to achieve 1,000 Mbps of data arriving at each of the 20 pods.

Backblaze Vault Networking

This capacity required a new type of Storage Pod that could handle this volume. The net of this: a single Vault can accept a whopping 20 Gbps of data.

Because there is no central bottleneck, adding more Vaults linearly adds more bandwidth.

5. Operationally Easier

When Backblaze launched in 2008 with a single Storage Pod, many of the operational analyses (e.g. how to balance load) could be done on a simple spreadsheet and manual tasks (e.g. swapping a hard drive) could be done by a single person. As Backblaze grew to nearly 1,000 Storage Pods and over 40,000 hard drives, the systems we developed to streamline and operationalize the cloud storage became more and more advanced. However, because our system relied on Linux RAID, there were certain things we simply could not control.

With the new Vault software, we have direct access to all of the drives and can monitor their individual performance and any indications of upcoming failure. And, when those indications say that maintenance is needed, we can shut down one of the pods in the Vault without interrupting any service.

6. Astoundingly Cost Efficient

Even with all of these wonderful benefits that Backblaze Vaults provide, if they raised costs significantly, it would be nearly impossible for us to deploy them since we are committed to keeping our online backup service affordable for completely unlimited data. However, the Vault architecture is nearly cost neutral while providing all these benefits.

Backblaze Vault Cloud Storage

When we were running on Linux RAID, we used RAID6 over 15 drives: 13 data drives plus two parity. That’s 15.4% storage overhead for parity.

With Backblaze Vaults, we wanted to be able to do maintenance on one pod in a vault and still have it be fully available, both for reading and writing. And, for safety, we weren’t willing to have fewer than two parity shards for every file uploaded. Using 17 data plus three parity drives raises the storage overhead just a little bit, to 17.6%, but still gives us two parity drives even in the infrequent times when one of the pods is in maintenance. In the normal case when all 20 pods in the Vault are running, we have three parity drives, which adds even more reliability.
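For readers who want to see where those percentages come from, they fall directly out of the parity-to-data ratios quoted above:

```python
# Parity overhead for the old RAID6 layout vs. the Vault layout.
raid6_overhead = 2 / 13     # 2 parity drives per 13 data drives
vault_overhead = 3 / 17     # 3 parity shards per 17 data shards
print(f"RAID6 (13+2): {raid6_overhead:.1%}")   # 15.4%
print(f"Vault (17+3): {vault_overhead:.1%}")   # 17.6%
```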

Summary

Backblaze's cloud storage Vaults deliver 99.999999% (eight nines) annual durability (now 11 nines — Editor), horizontal scalability, and 20 Gbps of per-Vault performance, while being operationally efficient and extremely cost-effective. Driven by the same mindset that we brought to the storage market with Backblaze Storage Pods, Backblaze Vaults continue our singular focus on building the most cost-efficient cloud storage available anywhere.

•  •  •

Note: This post was updated from the original version posted on March 11, 2015.

These Aren't Your Ordinary Data Centers
https://www.backblaze.com/blog/these-arent-your-ordinary-data-centers/ | Thu, 30 May 2019
Most of us likely think of data centers as rather predictable places with big buildings full of racks, cables, cooling, and other equipment, but we learned that there are exceptions to this stereotype that literally are out of this world.

Barcelona Supercomputing Center

Many of us would concede that buildings housing data centers are generally pretty ordinary places. They’re often drab and bunker-like with few or no windows, and located in office parks or in rural areas. You usually don’t see signs out front announcing what they are, and, if you’re not in information technology, you might be hard pressed to guess what goes on inside.

If you're observant, you might notice cooling towers for air conditioning and signs of heavy electrical usage as clues to their purpose. For most people, though, data centers go unnoticed and out of mind. Data center managers like it that way, because the data stored in and passing through these facilities is the lifeblood of business, research, finance, and our modern, digital lives.

That’s why the exceptions to low-key and meh data centers are noteworthy. These unusual centers stand out for their design, their location, what the building was previously used for, or perhaps how they approach energy usage or cooling.

Let’s take a look at a handful of data centers that certainly are outside of the norm.

The Underwater Data Center

Microsoft’s rationale for putting a data center underwater makes sense. Most people live near water, they say, and their submersible data center is quick to deploy, and can take advantage of hydrokinetic energy for power and natural cooling.

Project Natick has produced an experimental, shipping-container-size prototype designed to process data workloads on the seafloor near Scotland’s Orkney Islands. It’s part of a years-long research effort to investigate manufacturing and operating environmentally sustainable, prepackaged datacenter units that can be ordered to size, rapidly deployed, and left to operate independently on the seafloor for years.

Microsoft's Project Natick
Microsoft’s Project Natick at the launch site in the city of Stromness on Orkney Island, Scotland on Sunday May 27, 2018. (Photography by Scott Eklund/Red Box Pictures)
Natick Brest
Microsoft’s Project Natick in Brest, France

The Supercomputing Center in a Former Catholic Church

One might be forgiven for mistaking Torre Girona for any normal church, but this deconsecrated 20th-century church currently houses the Barcelona Supercomputing Center, home of the MareNostrum (Latin for "our sea," the Roman name for the Mediterranean) supercomputer. As part of the Polytechnic University of Catalonia, this supercomputer is used for a range of research projects, from climate change to cancer research, biomedicine, weather forecasting, and fusion energy simulations.

Torre Girona, a former Catholic church in Barcelona
The Barcelona Supercomputing Center, home of the MareNostrum supercomputer

The Under-a-Mountain Bond Supervillain Data Center

Most data centers don't have the extreme protection or history of The Bahnhof Data Center, which is located inside Pionen, an ultra-secure former nuclear bunker in Stockholm, Sweden. It is buried 100 feet below ground inside the White Mountains and secured behind 15.7-inch-thick metal doors. It prides itself on its self-described Bond villain ambiance.

We previously wrote about this extraordinary data center in our post, The Challenges of Opening a Data Center — Part 1.

The Bahnhof Data Center under White Mountain in Stockholm, Sweden
The Bahnhof Data Center under White Mountain in Stockholm, Sweden

The Data Center That Can Survive a Category 5 Hurricane

Sometimes the location of the center comes first and the facility is hardened to withstand anticipated threats. Such is the case with Equinix's NAP of the Americas data center in Miami, one of the largest single-building data centers on the planet (six stories and 750,000 square feet), which is built 32 feet above sea level and designed to withstand Category 5 hurricane winds.

The MI1 facility provides connectivity to the Caribbean, South and Central America, and more than 148 countries worldwide, and is the primary network exchange between Latin America and the U.S., according to Equinix. Any outage in this data center could potentially cripple businesses passing information between these locations.

The center was put to the test in 2017 when Hurricane Irma, a Category 5 hurricane in the Caribbean, made landfall in Florida as a Category 4 storm. The storm caused extensive damage in Miami-Dade County, but the Equinix center survived.

Equinix NAP of the Americas Data Center in Miami
Equinix NAP of the Americas Data Center in Miami

The Data Center Cooled by Glacier Water

Located on Norway's west coast, the Lefdal Mine Datacenter is built 150 meters into a mountain in what was formerly an underground mine for excavating olivine (also known as the gemstone peridot), a green, high-density mineral used in steel production. The data center is powered exclusively by renewable energy produced locally, while being cooled by water from the second largest fjord in Norway, which is 565 meters deep and fed by the water from four glaciers. Because it's in a mine, the data center sits below sea level, eliminating the need for expensive high-capacity pumps to lift the fjord's water to the cooling system's heat exchangers and contributing to the center's power efficiency.

The Lefdal Mine Data Center in Norway
The Lefdal Mine Datacenter in Norway

The World’s Largest Data Center

The Tahoe Reno 1 data center in The Citadel Campus in Northern Nevada, with 7.2 million square feet of data center space, is the world's largest data center. Not only is it big, it's powered by 100% renewable energy with up to 650 megawatts of power.

The Switch Core Campus in Nevada
The Switch Core Campus in Nevada
Tahoe Reno Switch Data Center
Tahoe Reno Switch Data Center

An Out of This World Data Center

If the cloud isn't far enough above us to satisfy your data needs, Cloud Constellation Corporation plans to put your data into orbit. A constellation of eight low earth orbit (LEO) satellites, called SpaceBelt, will offer up to five petabytes of space-based secure data storage and services and will use laser communication links between the satellites to transmit data between different locations on Earth.

CCC isn’t the only player talking about space-based data centers, but it is the only one so far with $100 million in funding to make their plan a reality.

Cloud Constellation’s SpaceBelt

A Cloud Storage Company’s Modest Beginnings

OK, so our current data centers are not that unusual (with the possible exception of our now iconic Storage Pod design), but there was a time when Backblaze was just getting started and was figuring out how to make data storage work while keeping costs as low as possible for our customers. It’s a long way from these modest beginnings to almost one exabyte (one billion gigabytes) of customer data stored today.

The photo below is not exactly a data center, but it is the first data storage structure used by Backblaze to develop its storage infrastructure before going live with customer data. It was on the patio behind the Palo Alto apartment that Backblaze used for its first office.

Shed used for very early (pre-customer) data storage testing

The photos below (front and back) are of the very first data center cabinet that Backblaze filled with customer data. This was in 2009 in San Francisco, just before we moved to a data center in Oakland where there was room to grow. Note the Storage Pod at the top of the cabinet. Yes, it’s made out of wood. (You have to start somewhere.)

Backblaze’s first data storage cabinet to hold customer data (2009) (front)
Backblaze’s first data storage cabinet to hold customer data (2009) (back)

Do You Know of Other Unusual Data Centers?

Do you know of another data center that should be on this list? Please tell us in the comments.

The post These Aren’t Your Ordinary Data Centers appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

]]>
https://www.backblaze.com/blog/these-arent-your-ordinary-data-centers/feed/ 2
The Challenges of Opening a Data Center — Part 2 https://www.backblaze.com/blog/factors-for-choosing-data-center/ https://www.backblaze.com/blog/factors-for-choosing-data-center/#comments Fri, 02 Mar 2018 14:00:52 +0000 https://www.backblaze.com/blog/?p=81188 In Part 2 of our series on data centers we continue to look at factors that need to be considered both by those interested in a dedicated data center and those seeking to colocate in an existing center.

The post The Challenges of Opening a Data Center — Part 2 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

]]>
Rows of storage pods in a data center

This is part two of a series on the factors that an organization needs to consider when opening a data center and the challenges that must be met in the process.

In Part 1 of this series, we looked at the different types of data centers, the importance of location in planning a data center, data center certification, and the single most expensive factor in running a data center, power.

In Part 2, we continue to look at factors that need to be considered both by those interested in a dedicated data center and those seeking to colocate in an existing center.

Power (continued from Part 1)

In part 1, we began our discussion of the power requirements of data centers.

As we discussed, redundancy and failover are primary requirements for data center power. A redundantly designed power supply system is also a necessity for maintenance, as it enables repairs to be performed on one power path, for example, without having to turn off servers, databases, or electrical equipment.

Power Path

The common critical components of a data center’s power flow are:

  • Utility Supply
  • Generators
  • Transfer Switches
  • Distribution Panels
  • Uninterruptible Power Supplies (UPS)
  • PDUs

Utility Supply is the power that comes from one or more utility grids. While most of us consider the grid to be our primary power supply (hats off to those of you who manage to live off the grid), politics, economics, and distribution make utility supply power susceptible to outages, which is why data centers must have autonomous power available to maintain availability.

Generators are used to supply power when the utility supply is unavailable. They convert mechanical energy, usually produced by diesel or natural gas engines, into electrical energy.

Transfer Switches are used to transfer electric load from one source or electrical device to another, such as from one utility line to another, from a generator to a utility, or between generators. The transfer could be manually activated or automatic to ensure continuous electrical power.

Distribution Panels get the power where it needs to go, taking a power feed and dividing it into separate circuits to supply multiple loads.

A UPS, as we touched on earlier, ensures that continuous power is available even when the main power source isn’t. It often consists of batteries that can come online almost instantaneously when the primary power fails. The power from a UPS does not have to last long, as it is an emergency bridge until the generators spin up or the main power source is restored. Another function of the UPS is to filter and stabilize the power from the main supply.
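To make the “does not have to last long” point concrete, here is a minimal sketch, using assumed battery capacity and load figures rather than any real facility’s specifications, of estimating how long a battery UPS can carry a load while the generators start:

# A rough sketch with assumed numbers: how long a battery UPS can bridge the gap
# between a utility outage and the generators picking up the load.

def ups_runtime_minutes(usable_kwh: float, load_kw: float, efficiency: float = 0.9) -> float:
    """Approximate runtime of a battery UPS at a constant load."""
    return usable_kwh * efficiency / load_kw * 60

# Example: 200 kWh of usable battery behind a 500 kW IT load
print(round(ups_runtime_minutes(200, 500), 1))  # ≈ 21.6 minutes of runtime

Even a few minutes of runtime is usually enough, since standby generators typically come online within seconds to a couple of minutes.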

Data center UPSs

PDU stands for Power Distribution Unit; it is the device that distributes power to the individual pieces of equipment.

Network

After power, the networking connections to the data center are of prime importance. Can the data center obtain and maintain high-speed networking connections to the building? With networking, as with all aspects of a data center, availability is a primary consideration. Data center designers think of all possible ways service can be interrupted or lost, even briefly. Details such as the vulnerabilities in the route the network connections make from the core network (the backhaul) to the center, and where network connections enter and exit a building, must be taken into consideration in network and data center design.

Routers and switches are used to transport traffic between the servers in the data center and the core network. Just as with power, network redundancy is a prime factor in maintaining availability of data center services. Two or more upstream service providers are required to ensure that availability.

How fast a customer can transfer data to a data center is affected by: 1) the speed of the connections the data center has with the outside world, 2) the quality of the connections between the customer and the data center, and 3) the distance of the route from the customer to the data center. The longer the route and the greater the number of packets that must be transferred, the more latency becomes the dominant factor in the transfer. Latency is the delay before a transfer of data begins following an instruction for its transfer. Generally, latency, not raw link speed, is the most significant factor in transferring data to and from a data center. Packets sent using the TCP/IP protocol suite, the set of communications protocols used on the internet and similar computer networks, must be acknowledged when received (ACK’d), and each acknowledgement requires a communications round trip. If data is sent in larger packets, fewer ACKs are required, so latency becomes a smaller factor in the overall transfer speed.
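To see how much the round trip matters, here is a minimal sketch, with illustrative window and latency values rather than measured ones, of the well-known bound that a single TCP stream cannot exceed roughly its window size divided by the round-trip time:

# A minimal sketch: estimating how round-trip latency caps single-stream TCP
# throughput. The window size and RTT values below are illustrative assumptions.

def max_tcp_throughput_mbps(window_bytes: float, rtt_ms: float) -> float:
    """A single TCP stream is limited to roughly window_size / round_trip_time."""
    rtt_s = rtt_ms / 1000.0
    return (window_bytes * 8) / rtt_s / 1_000_000  # bits per second -> megabits per second

# Example: a 64 KiB receive window over a 40 ms round trip
print(round(max_tcp_throughput_mbps(64 * 1024, 40), 1))  # ≈ 13.1 Mbps, regardless of link speed
# The same window over a 5 ms round trip, e.g. a nearby data center
print(round(max_tcp_throughput_mbps(64 * 1024, 5), 1))   # ≈ 104.9 Mbps

Modern operating systems scale the window well beyond 64 KiB, but the relationship holds: the longer the round trip, the larger the window (or the more parallel streams) needed to fill a given link.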

Those interested in testing the overall speed and latency of their connection to Backblaze’s data centers can use the Check Your Bandwidth tool on our website.

Latency generally will be less significant for data storage transfers than for cloud computing. Optimizations such as multi-threading, which is used in Backblaze’s Cloud Backup service, will generally improve overall transfer throughput if sufficient bandwidth is available.
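As an illustration of that kind of multi-threading, here is a minimal sketch, not Backblaze’s actual implementation, of uploading a file in chunks over several concurrent connections; upload_chunk is a hypothetical placeholder for whatever storage API call is in use:

# A minimal sketch (not Backblaze code): overlapping per-connection latency by
# sending file chunks over several concurrent connections.
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB per chunk (illustrative)

def read_chunks(path):
    """Yield (index, bytes) chunks of the file at path."""
    with open(path, "rb") as f:
        index = 0
        while chunk := f.read(CHUNK_SIZE):
            yield index, chunk
            index += 1

def upload_chunk(index, data):
    # Hypothetical placeholder: send one chunk to the storage service.
    ...
    return index

def parallel_upload(path, threads=8):
    """Upload all chunks of a file using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [pool.submit(upload_chunk, i, c) for i, c in read_chunks(path)]
        return [f.result() for f in futures]

In practice you would bound how many chunks are held in memory at once, but the principle is the same: keeping multiple streams in flight hides each stream’s round-trip delays.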

Data center telecommunications equipment. Image originally by Gary Stevens of Hosting Canada. Image from Wikimedia.

Data center under floor cable runs

Cooling

Computer, networking, and power generation equipment generates heat, and a number of solutions are employed to rid a data center of that heat. The location and climate of the data center are of great importance to the data center designer because climatic conditions dictate to a large degree which cooling technologies can be deployed, which in turn affects how much power is used and what that power costs. The power required, and the cost of that power, to manage a data center in a warm, humid climate will vary greatly from that needed in a cool, dry climate. Innovation is strong in this area, and many new approaches to efficient and cost-effective cooling are used in the latest data centers.

Switch’s uninterruptible, multi-system, HVAC Data Center Cooling Units

There are three primary ways data center cooling can be achieved:

Room Cooling cools the entire operating area of the data center. This method can be suitable for small data centers, but becomes more difficult and inefficient as IT equipment density and center size increase.

Row Cooling concentrates on cooling a data center on a row by row basis. In its simplest form, hot aisle/cold aisle data center design involves lining up server racks in alternating rows with cold air intakes facing one way and hot air exhausts facing the other. The rows composed of rack fronts are called cold aisles. Typically, cold aisles face air conditioner output ducts. The rows the heated exhausts pour into are called hot aisles. Typically, hot aisles face air conditioner return ducts.

Rack Cooling tackles cooling on a rack by rack basis. Air-conditioning units are dedicated to specific racks. This approach allows for maximum densities to be deployed per rack. This works best in data centers with fully loaded racks, otherwise there would be too much cooling capacity, and the air-conditioning losses alone could exceed the total IT load.
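Whichever of these approaches is used, cooling capacity ultimately has to match the heat the IT load produces. A common back-of-the-envelope rule of thumb puts the required airflow in cubic feet per minute at roughly 3.16 times the load in watts divided by the temperature rise in degrees Fahrenheit; the rack load and temperature rise below are assumptions for illustration:

# A back-of-the-envelope sketch using the common rule of thumb
# CFM ≈ 3.16 × watts / ΔT(°F) for the airflow needed to carry away IT heat.

def required_airflow_cfm(it_load_watts: float, delta_t_f: float) -> float:
    """Approximate airflow (cubic feet per minute) needed to remove it_load_watts
    of heat with a delta_t_f rise between cold-aisle intake and hot-aisle exhaust."""
    return 3.16 * it_load_watts / delta_t_f

# Example: a 10 kW rack with a 20 °F intake-to-exhaust temperature rise
print(round(required_airflow_cfm(10_000, 20)))  # ≈ 1,580 CFM for that one rack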

Security

Data centers are high-security facilities, as they house business, government, and other data containing personal, financial, and other sensitive information about businesses and individuals.

This list contains the physical-security considerations when opening or colocating in a data center:

Layered Security Zones. Systems and processes are deployed to allow only authorized personnel in certain areas of the data center. Examples include keycard access, alarm systems, mantraps, secure doors, and staffed checkpoints.

Physical Barriers. Physical barriers, fencing, and reinforced walls are used to protect facilities. In a colocation facility, one customer’s racks and servers are often inaccessible to other customers colocating in the same data center.

Backblaze racks secured in the data center

Monitoring Systems. Advanced surveillance technology monitors and records activity on approaching driveways, building entrances, exits, loading areas, and equipment areas. These systems also can be used to monitor and detect fire and water emergencies, providing early detection and notification before significant damage results.

Top-tier providers evaluate their data center security and facilities on an ongoing basis. Technology becomes outdated quickly, so providers must stay on top of new approaches and technologies in order to protect valuable IT assets.

Passing into the high-security areas of a data center requires going through a security checkpoint where credentials are verified.

Data centers are careful to control access to all critical and sensitive areas of the site

Facilities and Services

As competition increases among data center colocation providers, expect to see more and more value-added services and facilities. These might include conference rooms, offices, and access to office equipment.

Providers also might offer break rooms, kitchen facilities, storage, and secure loading docks and freight elevators.

Moving into A Data Center

Moving into a data center is a major job for any organization. We wrote a post last year, Desert to Data in Seven Days—Our Phoenix Data Center, about what it was like to move into our new data center in Phoenix, Arizona.

Desert to Data in Seven Days—Our Phoenix Data Center

Visiting a Data Center

Our Director of Product Marketing Andy Klein wrote a popular post last year on what it’s like to visit a data center called A Day in the Life of a Data Center.

A Day in the Life of a Data Center

Would you Like to Know More about The Challenges of Opening and Running a Data Center?

That’s it for part 2 of this series. If readers are interested, we could write a post about some of the new technologies and trends affecting data center design and use. Please let us know in the comments.

Here’s a tip on finding all the posts tagged with data center on our blog. Just follow https://www.backblaze.com/blog/tag/data-center/.

The post The Challenges of Opening a Data Center — Part 2 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

]]>
https://www.backblaze.com/blog/factors-for-choosing-data-center/feed/ 2
The Challenges of Opening a Data Center — Part 1 https://www.backblaze.com/blog/choosing-data-center/ https://www.backblaze.com/blog/choosing-data-center/#comments Tue, 27 Feb 2018 16:00:17 +0000 https://www.backblaze.com/blog/?p=81073 In this series we’ll talk in general terms about the factors that an organization needs to consider when opening a data center and the challenges that must be met in the process. In Part 1, we look at the different types of data centers, the importance of location in planning a data center, data center certification, and the single most expensive factor in running a data center, power.

The post The Challenges of Opening a Data Center — Part 1 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

]]>
Backblaze storage pod in new data center

This is part one of a series. You can find the second part here.

Though most of us have never set foot inside of a data center, as citizens of a data-driven world we nonetheless depend on the services that data centers provide almost as much as we depend on a reliable water supply, the electrical grid, and the highway system. Every time we send a tweet, post to Facebook, check our bank balance or credit score, watch a YouTube video, or back up a computer to the cloud, we are interacting with a data center.

In this series, The Challenges of Opening a Data Center, we’ll talk in general terms about the factors that an organization needs to consider when opening a data center and the challenges that must be met in the process. Many of the factors to consider will be similar for opening a private data center or seeking space in a public data center, but we’ll assume for the sake of this discussion that our needs are more modest than requiring a data center dedicated solely to our own use (i.e. we’re not Google, Facebook, or China Telecom).

Data center technology and management are changing rapidly, with new approaches to design and operation appearing every year. This means we won’t be able to cover everything happening in the world of data centers in this series; however, we hope our brief overview proves useful.

What is a Data Center?

A data center is the structure that houses a large group of networked computer servers typically used by businesses, governments, and organizations for the remote storage, processing, or distribution of large amounts of data.

While many organizations will have computing services in the same location as their offices that support their day-to-day operations, a data center is a structure dedicated to 24/7 large-scale data processing and handling.

Depending on how you define the term, there are anywhere from half a million to many millions of data centers in the world. While an organization’s on-site servers and data storage can arguably be called a data center, in this discussion we are using the term data center to refer to facilities that are expressly dedicated to housing computer systems and associated components, such as telecommunications and storage systems. The facility might be a private center, which is owned or leased by one tenant only, or a shared data center that offers what are called “colocation services,” renting space, services, and equipment to multiple tenants.

A large, modern data center operates around the clock, placing a priority on providing secure and uninterrupted service, and generally includes redundant or backup power systems or supplies, redundant data communication connections, environmental controls, fire suppression systems, and numerous security devices. Such a center is an industrial-scale operation, often using as much electricity as a small town.

Types of Data Centers

There are a number of ways to classify data centers according to how they are set up and used. These factors include:

  1. Whether they are owned or used by one or multiple organizations
  2. Whether and how they fit into a topology of other data centers
  3. Which technologies and management approaches they use for computing, storage, cooling, power, and operations
  4. How green they are, which has become more important and visible in recent years

Data centers can be loosely classified into three types according to who owns them and who uses them.

Exclusive Data Centers are facilities wholly built, maintained, operated and managed by the business for the optimal operation of its IT equipment. Some of these centers are well-known companies such as Facebook, Google, or Microsoft, while others are less public-facing big telecoms, insurance companies, or other service providers.

Managed Hosting Providers are data centers managed by a third party on behalf of a business. The business does not own the data center or space within it. Rather, the business rents the IT equipment and infrastructure it needs instead of purchasing it outright.

Colocation Data Centers are usually large facilities built to accommodate multiple businesses within the center. The business rents its own space within the data center and subsequently fills the space with its IT equipment, or possibly uses equipment provided by the data center operator.

Backblaze, for example, doesn’t own its own data centers but colocates in data centers owned by others. As Backblaze’s storage needs grow, Backblaze increases the space it uses within a given data center and/or expands to other data centers in the same or different geographic areas.

Availability is Key

When designing or selecting a data center, an organization needs to decide what level of availability is required for its services. The type of business or service it provides likely will dictate this. Any organization that provides real-time and/or critical data services will need the highest level of availability and redundancy, as well as the ability to rapidly failover (transfer operation to another center) when and if required. Some organizations require multiple data centers not just to handle the computer or storage capacity they use, but to provide alternate locations for operation if something should happen temporarily or permanently to one or more of their centers.

Organizations that can’t afford any downtime at all will typically operate a mirrored site that can take over if something happens to the primary site, or run a second site in parallel with the first one. These data center topologies are called Active/Passive and Active/Active, respectively. Should disaster or an outage occur, disaster mode would dictate immediately moving all of the primary data center’s processing to the second data center.

While some data center topologies are spread throughout a single country or continent, others extend around the world. In practice, data transmission latency puts a cap on how far apart centers can be and still operate in parallel with the appearance of simultaneous operation. Linking two data centers located relatively close to each other (say, no more than 60 miles apart, to limit latency issues) with dark fiber (leased fiber optic cable) can enable both data centers to be operated as if they were in the same location, reducing staffing requirements yet providing immediate failover to the secondary data center if needed.
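The 60 mile figure follows from how fast light travels in fiber. Here is a minimal sketch using illustrative distances; the two-thirds-of-c propagation speed is the standard approximation for optical fiber:

# A rough sketch of why tens of miles is the practical limit for operating two
# sites as one over dark fiber. Light in fiber travels at roughly 200,000 km/s.

SPEED_IN_FIBER_KM_S = 200_000  # approximate propagation speed in optical fiber

def round_trip_ms(distance_km: float) -> float:
    """One round trip over the fiber, ignoring switching and protocol overhead."""
    return 2 * distance_km / SPEED_IN_FIBER_KM_S * 1000

print(round(round_trip_ms(96.6), 2))  # ~60 miles ≈ 96.6 km -> ≈ 0.97 ms round trip
print(round(round_trip_ms(800), 2))   # ~500 miles -> ≈ 8 ms round trip

Sub-millisecond round trips let data be written synchronously to both sites with little visible performance penalty; at hundreds of miles the added delay becomes noticeable for synchronous replication.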

This redundancy of facilities and ensured availability is of paramount importance to those needing uninterrupted data center services.

Active/Passive Data Centers

Active/Active Data Centers

LEED Certification

Leadership in Energy and Environmental Design (LEED) is a rating system devised by the United States Green Building Council (USGBC) for the design, construction, and operation of green buildings. Facilities can achieve ratings of certified, silver, gold, or platinum based on criteria within six categories: sustainable sites, water efficiency, energy and atmosphere, materials and resources, indoor environmental quality, and innovation and design.

Green certification has become increasingly important in data center design and operation as data centers require great amounts of electricity and often cooling water to operate. Green technologies can reduce costs for data center operation, as well as make the arrival of data centers more amenable to environmentally-conscious communities.

The ACT, Inc. data center in Iowa City, Iowa was the first data center in the U.S. to receive LEED-Platinum certification, the highest level available.

ACT Data Center exterior

ACT Data Center interior

Factors to Consider When Selecting a Data Center

There are numerous factors to consider when deciding to build or to occupy space in a data center. Aspects such as proximity to available power grids, telecommunications infrastructure, networking services, transportation lines, and emergency services can affect costs, risk, security and other factors that need to be taken into consideration.

The size of the data center will be dictated by the business requirements of the owner or tenant. A data center can occupy one room of a building, one or more floors, or an entire building. Most of the equipment takes the form of servers mounted in 19 inch rack cabinets, which are usually placed in single rows forming corridors (so-called aisles) between them. This allows staff access to the front and rear of each cabinet. Servers differ greatly in size, from 1U servers (i.e., one “U” or “RU” rack unit, measuring 44.50 millimeters or 1.75 inches), to Backblaze’s Storage Pod design that fits a 4U chassis, to large freestanding storage silos that occupy many square feet of floor space.
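As a quick sketch of how rack units translate into capacity, here is a small example; the rack height, drive count per chassis, and drive size are assumptions for illustration, not Backblaze’s actual layout:

# A quick sketch with assumed numbers of how rack units translate into capacity.

RACK_UNITS = 42          # assumed usable height of a standard rack
CHASSIS_UNITS = 4        # a 4U chassis, as mentioned above
DRIVES_PER_CHASSIS = 60  # assumption for illustration
DRIVE_TB = 16            # assumption for illustration

pods_per_rack = RACK_UNITS // CHASSIS_UNITS                      # 10 chassis, with 2U left over
raw_tb_per_rack = pods_per_rack * DRIVES_PER_CHASSIS * DRIVE_TB
print(pods_per_rack, raw_tb_per_rack)                            # 10 chassis, 9,600 TB raw per rack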

Location

Location will be one of the biggest factors to consider when selecting a data center and encompasses many other factors that should be taken into account, such as geological risks, neighboring uses, and even local flight paths. Access to suitable power at the right price point is often the most critical factor and the longest lead-time item, followed by broadband service availability.

With more and more data centers available providing varied levels of service and cost, the choices increase each year. Data center brokers can be employed to find a data center, just as one might use a broker for home or other commercial real estate.

Websites listing available colocation space, such as UpStack.com, or entire data centers for sale or lease, are widely used. A common practice is for a customer to publish its data center requirements, and the vendors compete to provide the most attractive bid in a reverse auction.

Business and Customer Proximity

The center’s closeness to a business or organization may or may not be a factor in the site selection. The organization might wish to be close enough to manage the center or supervise the on-site staff from a nearby business location. The location of customers might be a factor, especially if data transmission speeds and latency are important, or the business or customers have regulatory, political, tax, or other considerations that dictate areas suitable or not suitable for the storage and processing of data.

Climate

Local climate is a major factor in data center design because the climatic conditions dictate what cooling technologies should be deployed. In turn, this impacts uptime and the costs associated with cooling, which can total 50% or more of a center’s power costs. The cooling approach and the cost of managing a data center in a warm, humid climate will vary greatly from managing one in a cool, dry climate. Nevertheless, data centers are located in both extremely cold regions and extremely hot ones, with innovative approaches used in both extremes to maintain desired temperatures within the center.

Geographic Stability and Extreme Weather Events

An obvious major factor in locating a data center is the stability of the site itself with regard to seismic activity and the likelihood of extreme weather events such as hurricanes, as well as fire or flooding.

Backblaze’s Sacramento data center describes its location as one of the most stable geographic locations in California, outside fault zones and floodplains.

Sacramento Data Center

Sometimes the location of the center comes first and the facility is hardened to withstand anticipated threats, such as Equinix’s NAP of the Americas data center in Miami, one of the largest single-building data centers on the planet (six stories and 750,000 square feet), which is built 32 feet above sea level and designed to withstand category 5 hurricane winds.

Equinix “NAP of the Americas” Data Center in Miami

Most data centers don’t have the extreme protection or history of the Bahnhof data center, which is located inside the ultra-secure former nuclear bunker Pionen, in Stockholm, Sweden. It is buried 100 feet below ground inside the White Mountains and secured behind 15.7 in. thick metal doors. It prides itself on its self-described “Bond villain” ambiance.

Bahnhof Data Center under White Mountain in Stockholm

Usually, the data center owner or tenant will want to take into account the balance between cost and risk in the selection of a location. The Ideal quadrant below is obviously favored when making this compromise.

Cost vs Risk in selecting a data center

Cost = Construction/lease, power, bandwidth, cooling, labor, taxes
Risk = Environmental (seismic, weather, water, fire), political, economic

Risk mitigation also plays a strong role in pricing. The extent to which providers must implement special building techniques and operating technologies to protect the facility will affect price. When selecting a data center, organizations must make note of the data center’s certification level on the basis of regulatory requirements in the industry. These certifications can ensure that an organization is meeting necessary compliance requirements.

Power

Electrical power usually represents the largest cost in a data center. The cost a service provider pays for power will be affected by the source of the power, the regulatory environment, the facility size, and any rate concessions offered by the utility. At higher tier levels, battery backup, generators, and redundant power grids are a required part of the picture.

Fault tolerance and power redundancy are absolutely necessary to maintain uninterrupted data center operation. Parallel redundancy is a safeguard that ensures an uninterruptible power supply (UPS) system is in place to provide electrical power when needed. A UPS can be based on batteries, stored kinetic energy (such as flywheels), or some type of generator using diesel or another fuel. The center operates through one UPS system while a second UPS system stands by in parallel; if a power outage occurs, the backup system picks up the load.

Many data centers require the use of independent power grids, with service provided by different utility companies or services, to prevent against loss of electrical service no matter what the cause. Some data centers have intentionally located themselves near national borders so that they can obtain redundant power from not just separate grids, but from separate geopolitical sources.

Higher redundancy levels required by a company will invariably lead to higher prices. If one requires high availability backed by a service-level agreement (SLA), one can expect to pay more than a company with less demanding redundancy requirements.
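To put the availability figures in such SLAs into concrete terms, here is a minimal sketch converting the usual “nines” into allowed downtime per year; the percentages shown are simply common examples:

# A small sketch relating SLA "nines" to allowed downtime per year.

MINUTES_PER_YEAR = 365.25 * 24 * 60

def downtime_minutes_per_year(availability_pct: float) -> float:
    """Minutes of allowed downtime per year at a given availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)

for pct in (99.0, 99.9, 99.99, 99.999):
    print(f"{pct}% availability -> {downtime_minutes_per_year(pct):.1f} minutes/year")
# 99%     -> about 5,260 minutes (roughly 3.7 days)
# 99.9%   -> about 526 minutes (roughly 8.8 hours)
# 99.99%  -> about 53 minutes
# 99.999% -> about 5.3 minutes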

Continue with Part 2 of The Challenges of Opening a Data Center

That’s it for part 1. Read the second part here where we’ll take a look at some other factors to consider when moving into a data center such as network bandwidth, cooling, and security. In future posts, we’ll take a look at what is involved in moving into a new data center. We’ll also investigate what it takes to keep a data center running, and some of the new technologies and trends affecting data center design and use.

•      •      •

Here’s a tip on finding all the posts tagged with data center on our blog. Just follow https://www.backblaze.com/blog/tag/data-center/.

The post The Challenges of Opening a Data Center — Part 1 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

]]>
https://www.backblaze.com/blog/choosing-data-center/feed/ 5
Desert to Data in Seven Days—Our Phoenix Data Center https://www.backblaze.com/blog/data-center-design/ https://www.backblaze.com/blog/data-center-design/#comments Wed, 28 Jun 2017 15:00:33 +0000 https://www.backblaze.com/blog/?p=75915 We are pleased to announce that Backblaze is now storing some of our customers’ data in our newest data center in Phoenix. Let’s take you through the process of getting the Phoenix data center up and running.

The post Desert to Data in Seven Days—Our Phoenix Data Center appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

]]>

We are pleased to announce that Backblaze is now storing some of our customers’ data in our newest data center in Phoenix. Our Sacramento facility was slated to store about 500PB of data and was starting to fill up, so it was time to expand. After visiting multiple locations in the U.S. and Canada, we selected Phoenix as it had the right combination of power, networking, price, and more that we were seeking. Let’s take you through the process of getting the Phoenix data center up and running.

Day 0: Designing the Data Center

After we selected the Phoenix location as our next data center, we had to negotiate the contract. We’re going to skip that part of the process because unless you’re a lawyer, it’s a long, boring process. Let’s just say we wanted to be ready to move in once the contract was signed. That meant we had to gather up everything we needed and order a bunch of other things like networking equipment, racks, Storage Pods, cables, etc. We decided to use our Sacramento data center as the staging point and started gathering what was going to be needed in Phoenix.

In actuality, for some items we had started the process several months earlier, as lead times for things like network switches, Storage Pods, and even hard drives can be measured in months, and delays are normal. For example, depending on our move-in date, the network providers we wanted would only be able to provide limited bandwidth at first, so we had to prepare for that possibility. It helps to have a procurement person who knows what they are doing, can work the schedule, and is creatively flexible—thanks, Amanda.

So by day 0, we had amassed multiple pallets of cabinets, network gear, power distribution units (PDUs), tools, hard drives, carts, Guido, and more. And yes, for all you Guido fans, he is still with us and he now resides in Phoenix. Everything was wrapped and loaded into a 53ft semi-truck that was driven the 755 miles (1,215km) from Sacramento, California to Phoenix, Arizona.

Day 1: Move in Day

We sent a crew of five people to Phoenix with the goal of going from empty space to being ready to accept data in one week. The truck from Sacramento arrived mid-morning, and work started on unloading and marshaling the pallets and boxes into one area while the racks were placed near their permanent locations on the data center floor.

Day 2: Building the Racks

Day 2 was spent primarily working with the racks. First they were positioned in their precise locations on the data center floor. They were then anchored down and tied together. We started with two rows of 22 racks each, with 20 being for Storage Pods and two being for networking equipment. By the end of the week, four rows of racks would be installed.

Day 3: Networking and Power, Part One

While one team continued to work on the racks, another team began the process of getting the racks connected to electricity and running the network cables to the network distribution racks. Once that was done, networking gear and rack-based PDUs were installed in the racks.

Day 4: Rack Storage Pods

The truck from Sacramento brought 100 Storage Pods, a combination of 45 drive and 60 drive systems. Why did we use 45 drive units here? It has to do with the size (in racks and power) of the initial installation commitment and the ramp (increase) of installations over time. Contract stuff: boring, yes; important, yes. Basically, to optimize our spend we wanted to use as much of the initial space we were allotted as possible. Since we had a number of empty 45 drive chassis available in Sacramento, we decided to put them to use.

Day 5: Drive Day

Our initial setup goal was to build out five Backblaze Vaults. Each Vault is composed of 20 Storage Pods. Four of the Vaults were filled with 45 drive Storage Pods and one was filled with 60 drive Storage Pods. That’s 4,800 hard drives to install—thank goodness we don’t use those rubber bands around the drives anymore.
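The 4,800-drive figure comes straight from the Vault arithmetic described above, spelled out here:

# The drive count above, spelled out: five Vaults of 20 Storage Pods each,
# four Vaults built from 45-drive pods and one from 60-drive pods.
pods_per_vault = 20
total_drives = pods_per_vault * (4 * 45 + 1 * 60)
print(total_drives)  # 20 * (180 + 60) = 4,800 drives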

Day 6: Networking and Power, Part Two

With the Storage Pods in place, day 6 was spent routing network and power cables to the individual Pods. A critical part of the process is to label every wire so you know where it comes from and where it goes. Once labeled, wires are bundled together and secured to the racks in a standard pattern. Not only does this make things look neat, it standardizes where you’ll find each cable across the hundreds of racks in the data center.

Day 7: Test, Repair, Test, Ready

With all the power and networking finished, it was time to test the installation. Most of the Storage Pods lit up with no problem, but a few failed. These failures were quickly dealt with, and one by one each Backblaze Vault was registered into our monitoring and administration systems. By the end of the day, all five Vaults were ready.

Moving Forward

The Phoenix data center was ready for operation except that the network carriers we wanted to use could only provide a limited amount of bandwidth to start. It would take a few more weeks before the final network lines would be provisioned and operational. Even with the limited bandwidth, we kicked off the migration of customer data from Sacramento to Phoenix to help balance out the workload. A few weeks later, once the networking was sorted out, we started accepting external customer data.

We’d like to thank our data center build team for documenting their work in pictures and allowing us to share some of them with our readers.

Questions About Our New Data Center

Now that we have a second data center, you might have a few questions, such as whether you can store your data there, and so on. Here’s the status of things today…

Q: Does the new data center mean Backblaze has multi-region storage?
A: Not yet. Right now we consider the Phoenix data center and the Sacramento data center to be in the same region.
Q: Will you ever provide multi-region support?
A: Yes, we expect to provide multi-region support in the future, but we don’t have a date for that capability yet.
Q: Can I pick which data center will store my data?
A: Not yet. This capability is part of our plans when we provide multi-region support.
Q: Which data center is my data being stored in?
A: Chances are that your data is in the Sacramento data center, given that it currently stores about 90% of our customers’ data.
Q: Will my data be split across the two data centers?
A: It is possible that one portion of your data will be stored in the Sacramento data center and another portion of your data will be stored in the Phoenix data center. This will be completely invisible to you and you should see no difference in storage or data retrieval times.
Q: Can my data be replicated from one data center to the other?
A: Not today. As noted above, your data will be in one data center or the other. That said, files uploaded to the Backblaze Vaults in either data center are stored redundantly across 20 Backblaze Storage Pods within that data center. This has been calculated at 99.999999% durability for the data stored this way (a rough sketch of that kind of calculation appears after these questions). [7/17/2018—Updated annual durability calculated at 99.999999999%. See Backblaze Durability Calculates at 99.999999999%—And Why It Doesn’t Matter.—Editor]
Q: Do you plan on opening more data centers?
A: Yes. We are actively looking for new locations.
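For the curious, here is a minimal sketch of how a durability figure like the one quoted above can be modeled. The shard scheme and per-shard loss probability are illustrative assumptions, not Backblaze’s published parameters: the model assumes a file is split into 20 shards, one per Storage Pod, any 17 of which can reconstruct it, and that each shard independently has a 0.1% chance of being lost before it can be repaired.

# A minimal sketch of modeling durability for data spread across 20 Storage Pods.
# The 20/17 shard scheme and the 0.1% per-shard loss probability are assumptions
# for illustration, not Backblaze's published parameters.
from math import comb

def durability(n: int, k: int, p_loss: float) -> float:
    """Probability that at least k of n shards survive, assuming independent losses."""
    return sum(
        comb(n, survivors) * (1 - p_loss) ** survivors * p_loss ** (n - survivors)
        for survivors in range(k, n + 1)
    )

print(durability(n=20, k=17, p_loss=0.001))  # ≈ 0.9999999952, better than eight nines

Real durability models are considerably more involved (they account for repair times, correlated failures, and measured drive failure rates), but the basic idea is the same: spreading data across many Pods means a file is lost only if several of them fail at once.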

If you have any additional questions, please let us know in the comments or on social media. Thanks.

The post Desert to Data in Seven Days—Our Phoenix Data Center appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

]]>
https://www.backblaze.com/blog/data-center-design/feed/ 28