
Worth Bookmarking | Understanding the Difference Between Predictive and Preventive Maintenance in Data Centers

深知社


Predictive and Preventive Maintenance in Data Centers

April 30, 2020, by Kyle Bittner

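Translator's Note

Every data center should be designed from the outset with ease of operations and maintenance and room for expansion in mind. All proactive predictive maintenance requires installing a large number of sensors, so the low-voltage design should allow for redundancy in the transmission cabling and for the maintainability of sensors placed in cable trays. In addition, products on the market use machine learning on historical data to predict when a failure is likely to occur; this needs to be combined with maintenance plans grounded in staff experience to make the whole system more reliable.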

In an age when data, data center accessibility, and security are top concerns for companies large and small, it is more imperative than ever to keep data center facilities fully optimized and avoid downtime. Predictive and preventive maintenance play a key role, because companies simply cannot afford downtime.

For one, downtime is financially costly. In 2014, Gartner estimated that downtime costs IT companies $5,600 per minute (more than $300,000 per hour).
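The per-minute figure quoted above can be turned into a rough outage estimate with simple arithmetic. The sketch below is only an illustration: the function name `downtime_cost` is hypothetical, and the rate is the 2014 Gartner average, not a figure for any particular facility.

```python
# Gartner's 2014 estimate quoted in the article: average cost per minute of downtime.
COST_PER_MINUTE_USD = 5_600


def downtime_cost(minutes: float, cost_per_minute: float = COST_PER_MINUTE_USD) -> float:
    """Return the estimated cost in USD of an outage lasting `minutes`."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * cost_per_minute


# Print the estimate for a few hypothetical outage lengths.
for minutes in (1, 15, 60):
    print(f"{minutes:>3} min outage: ${downtime_cost(minutes):,.0f}")
```

At this rate a full hour costs 60 × $5,600 = $336,000, which is where the "more than $300,000 per hour" figure comes from.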

Data traffic has only increased in the last four years, so it is safe to assume those costs have risen as well.

If a facility's data is client-facing, potential downtime could also cost a company future business.

At the same time, it could damage an otherwise spotless reputation.

Yes, some downtime is simply unavoidable and a natural part of the tech industry. But for too long, companies have been complacent about maintenance.

Unfortunately, many companies take a reactive approach: when something goes wrong, they shut down and fix it.

Fortunately, times have changed. Technology and our own understanding of proper data center maintenance have advanced, and companies are now implementing much more proactive solutions.

That starts with the right predictive and preventive maintenance strategies.

Predictive Maintenance

Predictive maintenance usually refers to a means of reducing and avoiding delays in data delivery.

In particular, it is about fixing or eliminating any interruption in the delivery of data to an individual application.

Examples include customer-facing transactions such as order processing in sales or withdrawals and deposits at a bank. For many companies, flash storage has replaced traditional hard disk drive storage.

The speed and efficiency gains of flash storage improve overall performance and cut down on delays in data delivery.

Fortunately, many new flash storage units come equipped with advanced artificial intelligence, with predictive analytics capabilities already built in.

If your hardware does not, plenty of software is readily available to integrate with existing systems.

Predictive storage uses a combination of millions of sensors and AI telemetry to constantly record, predict, and prevent major problems or interruptions before they occur.
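To make the "record, predict, prevent" idea concrete, here is a deliberately simple sketch. It is not how any vendor's predictive analytics works; commercial products apply machine learning to far richer telemetry. It only fits a least-squares line to a handful of sensor readings and extrapolates when the trend would reach an alarm threshold. The function name, the sample readings, and the threshold are all hypothetical.

```python
from typing import Optional, Sequence


def hours_until_threshold(
    hours: Sequence[float],
    readings: Sequence[float],
    threshold: float,
) -> Optional[float]:
    """Fit a least-squares line to the readings and estimate how many hours
    after the last sample the rising trend reaches `threshold`.

    Returns None when the trend is flat or falling, and 0.0 when the
    fitted line is already at or past the threshold.
    """
    n = len(hours)
    if n < 2 or n != len(readings):
        raise ValueError("need at least two paired samples")
    mean_t = sum(hours) / n
    mean_r = sum(readings) / n
    var_t = sum((t - mean_t) ** 2 for t in hours)
    if var_t == 0:
        raise ValueError("timestamps must not all be identical")
    slope = sum((t - mean_t) * (r - mean_r) for t, r in zip(hours, readings)) / var_t
    intercept = mean_r - slope * mean_t
    if slope <= 0:
        return None  # not rising toward the (upper) threshold
    crossing_time = (threshold - intercept) / slope
    return max(0.0, crossing_time - hours[-1])


# Hypothetical hourly temperature samples (degrees C) rising one degree per hour.
sample_hours = [0, 1, 2, 3, 4]
sample_temps = [30.0, 31.0, 32.0, 33.0, 34.0]
print(hours_until_threshold(sample_hours, sample_temps, threshold=40.0))
```

A maintenance team could use an estimate like this to schedule an inspection before the threshold is reached rather than after an alarm fires, which is the proactive stance the article argues for.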

Benefits of Predictive Maintenance

Save money. Decrease your maintenance fees and the personnel costs of reactive fixes, and reduce data center failures and downtime costs.

Increase efficiency and productivity. Because problems are predicted, prevented, or solved in real time, maintenance happens before any issue can escalate too far.

Preserve company reputation. You won't be associated with app-data gaps, short data center downtimes, or interruptions, and you need not fear a public relations nightmare caused by a catastrophic failure.

Easy integration. Getting the right software is easier now than ever.

Maybe it is a matter of upgrading your hardware. But investing now to save time and money later makes more sense than accepting the day-to-day risks of not actively using predictive analytics.

If you decide to upgrade the network equipment in your data center, make sure you know how to sell your used hard drives, so that you get the most money for any computer parts.

In hindsight, predictive maintenance seems like it should be a standard part of every company dealing in major data storage and delivery. However, it is a relatively new idea and practice.

Technology and data storage trends will only keep driving forward. Make sure your company stays modern.

Don't get left behind. Fully optimize your data centers to perform at their highest capabilities. Data center maintenance is a multi-faceted, ever-changing issue, and integrating predictive analytics is only part of the battle.

Preventive Maintenance

If predictive maintenance is used to predict and prevent issues in data delivery, then what does preventive maintenance look like?

In this case, preventive maintenance simply means physically taking care of all the important, expensive network equipment in your data center.

More often than not, major outages, data interruptions, and downtime are directly caused by the lack of simple, properly established physical maintenance and care. Sometimes the cause is human error and neglect of procedures. Other times there is no actual preventive maintenance plan in place.

A standard preventive maintenance plan helps extend the life of the equipment in your data center and reduces its repair costs.

Again, it serves organizations better to be proactive than reactive, and doing so improves the overall efficiency of your data center.

Equipment and systems that require proper maintenance and attention include:

Uninterruptible power supply (UPS)
Batteries
Generators
HVAC
Cables

You can have your in-house staff perform many of the major maintenance checks and standard inspections annually.

When was the last time that air filter was changed? Are cables tangled or damaged? How old are your company's generators? Left unattended, these small items can eventually catch up with you and cause a major problem.
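Questions like these are easy to turn into a routine check. The following is a minimal sketch, assuming you record a last-serviced date per item; the equipment names follow the list above, but the interval values are hypothetical placeholders, not recommendations, and should come from manufacturer guidance and your own maintenance plan.

```python
from datetime import date, timedelta

# Hypothetical inspection intervals in days (assumptions for illustration only).
INTERVAL_DAYS = {
    "UPS": 365,
    "Batteries": 180,
    "Generators": 365,
    "HVAC air filter": 90,
    "Cables": 365,
}


def overdue_items(last_serviced: dict[str, date], today: date) -> list[str]:
    """Return equipment names whose last service is older than its interval."""
    overdue = []
    for name, interval in INTERVAL_DAYS.items():
        last = last_serviced.get(name)
        # Equipment with no recorded service date is treated as overdue.
        if last is None or today - last > timedelta(days=interval):
            overdue.append(name)
    return overdue


# Hypothetical service history.
history = {
    "UPS": date(2020, 1, 10),
    "Batteries": date(2019, 6, 1),
    "Generators": date(2019, 12, 1),
    "HVAC air filter": date(2020, 3, 1),
    "Cables": date(2020, 2, 1),
}
print(overdue_items(history, today=date(2020, 4, 30)))
```

Treating a missing date as overdue is a deliberate choice here: an item nobody has a record for is exactly the kind of neglected check the paragraph above warns about.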

Maybe you are not confident in your company's ability to handle inspections or maintenance internally. That's OK! Plenty of third-party companies are more than capable of handling standard preventive maintenance procedures.

An external company can help you create an optimal preventive maintenance plan.

They can schedule everything annually, from general reviews of your data center site's condition to more in-depth examinations of all your equipment and hardware.

The right preventive maintenance plan also frees your employees to focus solely on their own work.

Whichever route you choose, be sure to keep accurate documentation of all past and current preventive maintenance procedures and history.
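As one possible way to keep such documentation consistent, the sketch below models a maintenance log entry and exports the history as CSV text. The record fields, names, and sample entries are assumptions made for illustration; a real facility would more likely use its existing ticketing or asset management system.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date


@dataclass
class MaintenanceRecord:
    """One documented maintenance action (hypothetical field set)."""
    performed_on: date
    equipment: str
    procedure: str
    performed_by: str
    notes: str = ""


def records_to_csv(records: list[MaintenanceRecord]) -> str:
    """Serialize records to CSV text with a header row, oldest first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[f.name for f in fields(MaintenanceRecord)]
    )
    writer.writeheader()
    for record in sorted(records, key=lambda r: r.performed_on):
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Hypothetical entries, deliberately out of order to show the sorting.
log = [
    MaintenanceRecord(date(2020, 3, 1), "HVAC", "Replaced air filter", "in-house"),
    MaintenanceRecord(date(2020, 1, 15), "Generators", "Annual inspection", "third party"),
]
print(records_to_csv(log))
```

Sorting by date on export keeps the history readable as a timeline regardless of the order in which entries were added.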

Be Proactive, Not Reactive

The moral of the story is to do your best to be proactive rather than reactive.

Network issues with data delivery can happen, just as a physical system can fail. When that time comes, it is important to have the right staff in place to get your network back on its feet quickly and restore all your services.

But you can reduce the likelihood of small interruptions, long downtimes, and full breakdowns.

To that end, all you need to do is implement simple, effective predictive and preventive maintenance techniques.

Both may seem like expensive investments in the present. But they'll double their value later on through the time and money saved on repairs.

Any discussion of downtime risk should also cover network attacks: keep up to date with the best tools for protecting your data against them.

And for any network equipment upgrade or change of space that requires a full data center decommission, be sure to enlist the help of a certified IT asset disposition company.

深知社

Translation:

Plato Deng

Senior data center researcher at 深知社 / founding member of the DKV program

Proofreading:

Eric

Founding member of the DKV (DeepKnowledge Volunteer) program

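Public account statement:

This article is not an officially endorsed Chinese version of the original. It is provided solely for the study and reference of readers in China and must not be used for any commercial purpose. The English original prevails. Do not reproduce the Chinese version without written authorization from the DeepKnowledge public account.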
