What you'll learn:
- The impact of AI and inference on advancing the IoT.
- Why real-time data at the edge is becoming a top priority in IoT design.
- The rise of breakthrough synthetic aperture radar (SAR) technology.
The Internet of Things has advanced significantly in recent years, with billions of connected devices worldwide. But for IoT to fully deliver on its promise, more work is required in several areas. One is in improving the pace of data processing to produce real-time intelligence.
Let me explain. IoT is generating massive amounts of data, and more is added to the mix every day. The World Economic Forum estimated that there were 44 zettabytes of data in the world at the start of 2020. The same group estimates that the amount of data generated each day will reach 463 exabytes globally in 2025. In our move to space, sensor processing gets even more difficult. A lack of high-speed communications channels and limited storage means that satellites may flush data daily, used or not.
In many cases, data centers in the cloud are being used to store IoT information and then process it. Often in its raw form, this data is transmitted and stored in data lakes. The problem is that for real-time applications, data processing in the cloud doesn’t work nearly fast enough.
What kind of real-time applications am I talking about? One example is an automated customs line that checks travelers against a database of all passports. Another is a medical facility where facial recognition is used to grant access to medical professionals while restricting patients for their own safety.
Historically, CPUs have been the mainstay of data processing. More recently, GPUs entered the picture and have sped up training to derive greater patterns from data. However, even with GPUs, one old problem persists: namely, fast inference at the edge.
Intelligent Inference
Training in AI entails teaching the system to perform a prescribed job. Inference is the AI’s ability to then apply what it’s learned to that specific task. GPUs are great for training, which involves many parallel multiply-accumulate operations on large swaths of data. But inference, meaning using the trained model to make critical decisions in real time, is a single query.
Yes, you can batch up the processing, but only as much as your system has extra time between getting multiple inputs and needing to perform an action on the first one. The training/inference difference mirrors life: You go to school for several years to become an expert, then use that learned capability to make intelligent decisions case by case in real-time situations.
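The batching limit described above can be made concrete with a small sketch. It assumes inputs arrive at a fixed interval and that compute time does not grow with batch size, which is a simplification; the function name and parameters are hypothetical and the numbers are illustrative, not measurements.

```python
def max_batch_size(arrival_interval_ms, deadline_ms, compute_ms):
    """Largest batch for which the FIRST input still meets its deadline.

    The first input waits for (batch - 1) later arrivals and then for
    one compute pass. Assumes compute time is constant per batch.
    """
    slack_ms = deadline_ms - compute_ms
    if slack_ms < 0:
        return 0  # even a single query would miss the deadline
    # Number of extra arrivals that fit in the slack, plus the first input.
    return int(slack_ms // arrival_interval_ms) + 1
```

With inputs arriving every 10 ms, a 50-ms deadline, and a 20-ms compute pass, at most four inputs can be batched. Once the deadline approaches the compute time, the batch collapses to a single query.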
If my inference workload is based on just a little bit of data, then the GPU is perfectly suited to the job. There are lots of minor applications where you can shrink down the training data because you’re not overly worried about accuracy and you’re not looking for a lot of detail. Take a traffic camera that’s trained to only identify motion of cars. During a red light, are cars stopped? If not, take a photo.
Other jobs require full data sets to have a finer granularity of distinction and a high level of accuracy. For instance, looking for a particular car associated with an Amber Alert.
The Data-Heavy SAR
One critical job that requires the processing of massive amounts of data is synthetic aperture radar (SAR). According to NASA, “SAR is a type of active data collection where a sensor produces its own energy and then records the amount of that energy reflected back after interacting with the Earth.” This is done with finely tuned radar and can be used to create high-resolution, two-dimensional images or three-dimensional reconstructions of objects, such as landscapes. The field of view and resolution can be modified by the frequency of the samples and the number of samples in a time window.
SAR is a breakthrough technology because, unlike an aircraft-mounted or satellite-mounted optical camera, it can capture images at night and see through clouds and smoke (see figure). This makes SAR more effective for a number of use cases, including volcano, earthquake, and fire monitoring; oceanography; treaty adherence; and military surveillance.
The problem, though, is that SAR collects massive amounts of data as it scans the Earth. This can be from an airplane-mounted system or from a satellite. What generally happens on a SAR mission is that a plane or satellite captures samples from the desired location and that data is processed over a period of days or weeks.
GSI estimates that to process a 5- × 5-km sample at 0.5-m resolution in one second, a CPU data center would require 23 cabinets of Xeon-Gold-based servers. GPUs can speed up the processing. However, the estimate is that it would still take five cabinets of NVIDIA V100 GPUs to process the SAR data in one second.
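To get a feel for the scale of that sample, here is a rough sketch that counts the resolution cells in the output image. The helper name is hypothetical, and the count covers only output cells; the raw radar samples that have to be processed to form each cell are not included.

```python
def pixel_count(width_km, height_km, resolution_m):
    """Number of resolution cells in a rectangular scene."""
    cols = round(width_km * 1000 / resolution_m)   # cells across the scene
    rows = round(height_km * 1000 / resolution_m)  # cells along the scene
    return cols * rows
```

For the 5- × 5-km scene at 0.5-m resolution, `pixel_count(5, 5, 0.5)` gives 10,000 × 10,000, or 100 million cells, all of which must be produced within the one-second budget.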
Paradigm Shift: Real-Time Data at the Edge
Why is real-time intelligence so important? Say you’re flying over a forest and using SAR to take images of the ground below. Lightning struck the ground a few hours earlier and you’re looking for signs of smoke or smoldering branches. If the data needs to be returned to base and it takes days to process and run anomaly detection, that could be too late.
However, if you’re able to immediately determine there’s something unusual, then do a second higher-resolution pass and immediately conclude that a tiny area in your field of view is really smoldering, you can take corrective action right away and possibly prevent a wildfire, such as the Windy Fire, which was started by lightning in California.
The real paradigm shift is moving to real-time data processing and analysis at the edge. That’s where GSI’s new Gemini APU comes in. This custom processor can accelerate similarity search for a broad range of applications, and some pre-processing and data processing workloads such as SAR.
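For readers unfamiliar with similarity search, the sketch below shows the basic query-by-query operation in plain Python: compare one query vector against every stored vector and return the closest matches. This is a brute-force baseline for illustration only, not GSI’s API or the APU’s method; the function names and the choice of cosine similarity are assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def top_k(query, database, k=1):
    """Return the keys of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in database.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:k]]
```

Each query touches every stored vector, so the cost grows with the size of the database. That per-query scan over a large trained data set is the kind of workload the article is describing.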
The APU is based on a new architecture that can act as both memory and compute-in-memory on a single chip, and it’s optimized for real-time, query-by-query inference on massive amounts of trained data. The GSI APU can complete the above-mentioned SAR processing in one second in a 1/3-sized cabinet.
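The cabinet estimates quoted in this article can be put side by side with a few lines of arithmetic. The sketch below uses only those figures (23 cabinets, five cabinets, and one-third of a cabinet); the dictionary keys and function name are hypothetical labels.

```python
from fractions import Fraction

# Cabinets needed to process the SAR sample in one second, per the estimates above.
CABINETS = {
    "cpu": Fraction(23),
    "gpu": Fraction(5),
    "apu": Fraction(1, 3),
}

def footprint_ratio(platform, baseline="apu"):
    """How many times more cabinet space a platform needs than the baseline."""
    return CABINETS[platform] / CABINETS[baseline]
```

On these numbers, the CPU configuration takes 69 times the cabinet space of the APU, and the GPU configuration takes 15 times.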
Why is this a paradigm shift? Because of its portable size and 1.2-kW power draw, it can be powered and run real-time workloads in an airplane, a vehicle, or an edge server at a medical center. Going back to our earlier comparison of CPUs and GPUs, even in a data center over a five-year period, the utilization versus the APU (applying an equal multiplier for facility overhead) would be like that shown in the table.
The great promise of IoT is to change behavior for the better. That only starts to happen when we can use IoT data to make intelligent decisions in a matter of seconds, not days or weeks. And once we get true intelligence from data, we can start to use it in ways to make life better, safer, and more convenient for everyone. That’s when IoT becomes a real game-changer.