Machine Vision
Assembly Line
GE Aerospace, Waygate Technologies to Deliver new AI-assisted Commercial Jet Engine Borescope Inspection Solution to Enhance Defect Recognition
GE Aerospace and Waygate Technologies, a Baker Hughes business, announced they have jointly developed a new, AI-assisted commercial engine borescope solution that will be available to Waygate Technologies customers and introduced to GE Aerospace’s MRO network later this year. The development represents the successful completion of their first development program under a Joint Technology Development Agreement between the two companies announced in May of 2023.
Through this joint development effort, GE Aerospace provided Waygate Technologies with a comprehensive dataset of engine inspection videos, which resulted in thousands of new representative images used for training Waygate Technologies’ Gas Power-assist ADR model. GE’s Services Technology Acceleration Center (STAC) and GE Aerospace Research brought subject matter expertise to ensure accurate and complete data labeling was performed. Waygate Technologies then leveraged this data and applied cutting-edge AI techniques, including a compute-optimized, state-of-the-art object detection algorithm and a novel temporal smoothing algorithm.
Key technical advancements, as compared to the program starting point (Gas Power-assist ADR model v4.1), include:
- Increased True Positive Rate: Model recall rates realized a 33.6% increase, indicating a dramatic improvement in identifying HPC defects.
- Decreased False Positive Rate: Model precision rates realized a 13.5% increase, indicating a reduction in previous falsely identified defects. This improvement was achieved both by an increased training dataset and the temporal smoothing algorithm used for detection confirmation.
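To make the two metrics above concrete, and to show how a temporal smoothing step might confirm detections across video frames, here is a minimal Python sketch. The announcement does not describe the actual Waygate algorithm, so the majority-vote filter, the window length, the vote threshold, and all counts below are illustrative assumptions.

```python
from collections import deque

def precision_recall(tp, fp, fn):
    """Return (precision, recall) from raw detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def temporal_smooth(frame_hits, window=5, min_votes=3):
    """Confirm a detection only when at least `min_votes` of the last
    `window` frames flagged it (a simple majority-vote filter, assumed here)."""
    recent = deque(maxlen=window)
    confirmed = []
    for hit in frame_hits:
        recent.append(bool(hit))
        confirmed.append(sum(recent) >= min_votes)
    return confirmed

# Hypothetical per-frame detector output: one spurious hit, then a sustained defect.
raw = [0, 1, 0, 0, 0, 1, 1, 1, 1, 0]
smoothed = temporal_smooth(raw)

# Hypothetical counts, not figures from the announcement.
p, r = precision_recall(tp=80, fp=20, fn=20)
```

In this toy run the isolated hit in the second frame is never confirmed, while the sustained run of hits is. A filter of this kind suppresses single-frame false positives at the cost of a short confirmation delay.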
The new AI-assisted features will be integrated and available for deployment through a software update to customers for Waygate Technologies’ Mentor Visual iQ+ borescope later this year. In addition, GE Aerospace will be introducing the model to its MRO network for use in High Pressure Compressor inspections for its GEnx and CFM LEAP engines.
The Role of AI-Powered Machine Vision Systems in Textile Quality Control
Integrating AI with machine vision enables systems to learn from past data and improve defect detection over time. For example, Robro Systems’ Kiara Web Inspection System (KWIS) uses AI-driven algorithms to enhance detection capabilities, adapting to new defect patterns that may emerge during production.
With KWIS, we saw a 25% improvement in defect detection accuracy compared to manual inspection methods. For instance, in a batch of conveyor belt fabric, the system detected micro-tears that manual inspection would have missed, allowing us to correct the issue early and avoid downstream quality failures. This reduced our material waste and ensured that only high-quality products reached our customers.
Implementing machine vision technology has also translated into significant cost savings for manufacturers. According to a study by the International Journal of Advanced Manufacturing Technology, machine vision can reduce defect-related production costs by up to 30%. For manufacturers, this has meant reducing the costs associated with rework and waste and minimizing customer returns and complaints.
Machine Vision for medical device manufacturing - Pentagon Automation Assembly & Cognex
Combining next-token prediction and video diffusion in computer vision and robotics
Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have proposed a simple change to the diffusion training scheme that makes this sequence denoising considerably more flexible.
When applied to fields like computer vision and robotics, the next-token and full-sequence diffusion models have capability trade-offs. Next-token models can spit out sequences that vary in length. However, they make these generations while being unaware of desirable states in the far future — such as steering its sequence generation toward a certain goal 10 tokens away — and thus require additional mechanisms for long-horizon (long-term) planning. Diffusion models can perform such future-conditioned sampling, but lack the ability of next-token models to generate variable-length sequences.
Researchers from CSAIL want to combine the strengths of both models, so they created a sequence model training technique called “Diffusion Forcing.” The name comes from “Teacher Forcing,” the conventional training scheme that breaks down full sequence generation into the smaller, easier steps of next-token generation (much like a good teacher simplifying a complex concept).
Diffusion Forcing found common ground between diffusion models and teacher forcing: They both use training schemes that involve predicting masked (noisy) tokens from unmasked ones. In the case of diffusion models, they gradually add noise to data, which can be viewed as fractional masking. The MIT researchers’ Diffusion Forcing method trains neural networks to cleanse a collection of tokens, removing different amounts of noise within each one while simultaneously predicting the next few tokens. The result: a flexible, reliable sequence model that resulted in higher-quality artificial videos and more precise decision-making for robots and AI agents.
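The idea of noise as fractional masking, with a different noise level per token, can be sketched in a few lines of NumPy. This is only an illustration of the noising step, not the researchers' training code. The variance-preserving mixing formula is a common diffusion convention assumed here, and the function name and toy shapes are hypothetical.

```python
import numpy as np

def noise_tokens(tokens, noise_levels, rng):
    """Mix each token with Gaussian noise according to its own level.

    tokens: (T, D) array of clean token vectors.
    noise_levels: (T,) array in [0, 1]; 0 keeps a token clean,
    1 replaces it entirely with noise.
    """
    k = noise_levels[:, None]
    eps = rng.standard_normal(tokens.shape)
    return np.sqrt(1.0 - k) * tokens + np.sqrt(k) * eps

rng = np.random.default_rng(0)
tokens = rng.standard_normal((6, 4))  # toy sequence: 6 tokens, 4 dimensions

# Full-sequence diffusion: one shared noise level for every token.
shared = np.full(6, 0.5)
# Teacher-forcing-like masking: past tokens clean, the last token fully noised.
causal = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 1.0])
# Independent per-token levels, the regime the article describes.
independent = rng.uniform(0.0, 1.0, size=6)

noisy = noise_tokens(tokens, independent, rng)
clean_prefix = noise_tokens(tokens, causal, rng)
```

Seen this way, next-token prediction and full-sequence diffusion are two special cases of the same noising scheme, which is the common ground the paragraph above refers to.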
Feline eye–inspired artificial vision for enhanced camouflage breaking under diverse light conditions
Biologically inspired artificial vision research has led to innovative robotic vision systems with low optical aberration, wide field of view, and compact form factor. However, challenges persist in object detection and recognition against complex backgrounds and varied lighting. Inspired by the feline eye, which features a vertically elongated pupil and tapetum lucidum, this study introduces an artificial vision system designed for superior object detection and recognition in a monocular framework. Using a slit-like elliptical aperture and a patterned metal reflector beneath a hemispherical silicon photodiode array, the system reduces excessive light and enhances photosensitivity. This design achieves clear focus under bright light and enhanced sensitivity in dim conditions. Theoretical and experimental analyses demonstrate the system’s ability to filter redundant information and detect camouflaged objects in diverse lighting, representing a substantial advancement in monocular camera technology and the potential of biomimicry in optical innovations.
Optimizing waste handling with interactive AI: Prompt-guided segmentation of construction and demolition waste using computer vision
Optimized and automated methods for handling construction and demolition waste (CDW) are crucial for improving the resource recovery process in waste management. Automated waste recognition is a critical step in this process, and it relies on robust image segmentation techniques. Prompt-guided segmentation methods provide promising results for specific user needs in image recognition. However, the current state-of-the-art segmentation methods trained for generic images perform unsatisfactorily on CDW recognition tasks, indicating a domain gap. To address this gap, a user-guided segmentation pipeline is developed in this study that leverages prompts such as bounding boxes, points, and text to segment CDW in cluttered environments. The adopted approach achieves a class-wise performance of around 70 % in several waste categories, surpassing the state-of-the-art algorithms by 9 % on average. This method allows users to create accurate segmentations by drawing a bounding box, clicking, or providing a text prompt, minimizing the time spent on detailed annotations. Integrating this human–machine system as a user-friendly interface into material recovery facilities enhances the monitoring and processing of waste, leading to better resource recovery outcomes in waste management.
How to Train an Object Detection Model for Visual Inspection with Synthetic Data
Edge Impulse is an integrated development platform that empowers developers to create and deploy AI models for edge devices. It supports data collection, preprocessing, model training, and deployment, helping users integrate AI capabilities into their applications effectively.
With NVIDIA Omniverse Replicator, a core extension of NVIDIA Omniverse, users can produce physically accurate and photorealistic, synthetically generated annotated images in Universal Scene Description, known as OpenUSD. These images can then be used for training an object detection model on the Edge Impulse platform.
Taking a data-centric approach, where you create more data around the failure points of the model, is crucial to solving ML problems. Additional training and fine-tuning of parameters can enable a model to generalize well across different orientations, materials, and other relevant conditions.
How Schaeffler Amplifies Electric Vehicle Production with Cognex Machine Vision
Strapping and stretch hooding of concrete blocks
AF-FTTSnet: An end-to-end two-stream convolutional neural network for online quality monitoring of robotic welding
Online welding quality monitoring (WQM) is crucial for intelligent welding, and deep learning approaches considering spatiotemporal features for WQM tasks show great potential. However, one of the important challenges for existing approaches is to balance the spatiotemporal representation learning capability and computational efficiency, which makes it challenging to adapt welding processes with complex and drastic molten pool dynamic behavior. This paper proposes a novel approach for WQM using molten pool visual sensing and deep learning considering spatiotemporal features, the proposed deep learning network called attention fusion based frame-temporality two-stream network (AF-FTTSnet). Firstly, a passive vision sensor is used to acquire continuous dynamic molten pool images. Meanwhile, temporal difference images are computed to provide novel features and temporal representations. Then, a two-stream feature extraction module is designed to concurrently extract rich spatiotemporal features from molten pool images and temporal difference images. Finally, an attention fusion module with the ability to automatically identify and weight the most relevant features is designed to achieve optimal fusion of the two-stream features. The shop welding experimental results indicate that the proposed AF-FTTSnet model can effectively and robustly recognize five typical welding states during helium arc welding, with an accuracy of 99.26%. This model has been demonstrated to exhibit significant performance improvements compared to mainstream temporal sequence models.
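Two of the steps described in the abstract, computing temporal difference images and fusing two feature streams with attention weights, can be illustrated with a small NumPy sketch. This is not the AF-FTTSnet architecture. The random frames stand in for molten pool images, the per-frame mean stands in for learned features, and the fixed scores stand in for weights a real attention module would learn.

```python
import numpy as np

def temporal_difference(frames):
    """Absolute difference between consecutive frames: (T, H, W) -> (T-1, H, W)."""
    frames = frames.astype(np.float32)  # avoid uint8 wraparound on subtraction
    return np.abs(frames[1:] - frames[:-1])

def attention_fuse(spatial_feat, temporal_feat, scores):
    """Weight two feature vectors by softmax-normalized scores and sum them."""
    w = np.exp(scores - np.max(scores))
    w = w / w.sum()
    return w[0] * spatial_feat + w[1] * temporal_feat, w

rng = np.random.default_rng(0)
frames = rng.integers(0, 256, size=(8, 32, 32), dtype=np.uint8)
diffs = temporal_difference(frames)

# Placeholder "features": one scalar per frame from each stream.
spatial_feat = frames[1:].astype(np.float32).mean(axis=(1, 2))
temporal_feat = diffs.mean(axis=(1, 2))
fused, weights = attention_fuse(spatial_feat, temporal_feat, np.array([0.2, 1.0]))
```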
Vuzix CEO talks about the AR Space, AI, Warehouse Picking with Smart Glasses
Unveiling Databricks power in analyzing electrical grid assets using computer vision
Data is ingested from an EPRI dataset consisting of images of distribution assets along with labels for each object. These are ingested into Delta tables and transformed through the medallion architecture in order to produce a dataset that is ready for model training.
After data loading has been completed, the training can begin. In the age of GenAI, large GPUs are scarce, leaving only smaller ones, which can significantly lengthen training and experimentation times. To combat this, Databricks allows you to run distributed GPU training using features like PytorchDistributor. This accelerator uses that capability to train our model on a cluster of commodity GPUs, which brings the training time down almost linearly.
AI-powered 3D inspection system for factory automation - In-Sight L38 Series from Cognex
Cognex Launches the World's First 3D Vision System with AI
Cognex Corporation released the In-Sight® L38 3D Vision System, which combines AI, 2D, and 3D vision technologies to solve a range of inspection and measurement applications. The system creates unique projection images that combine 3D information into an easy-to-label 2D image for simplified training and reveals features not visible with traditional 2D imaging. AI tools detect variable or undefined features, while rule-based algorithms provide 3D measurements to deliver reliable inspection results.
The In-Sight L38 greatly simplifies the process of configuring 3D systems thanks to embedded AI technology that uses pre-trained models with domain-specific data. Example-based training replaces complex programming steps, which previously required combining many traditional rule-based tools, to streamline application development. The unique AI-powered 3D tools can be set up in minutes, requiring as few as 5 to 10 labeled images to automate a task. With one tool, users can detect challenging defects, gauge variances in three dimensions, and get results in real-world units.
3D Vision-Guided Racking of Reflective Sheet Metal Parts
Easily integrate Machine Vision into production with apps from the Industrial Edge Ecosystem
Quality control is critical in modern industry. Machine vision makes it less error-prone, time-consuming, and costly. By adding offerings from industry leaders Basler and MVTec to the Siemens Industrial Edge ecosystem, new scalable machine vision solutions can be efficiently and seamlessly integrated into production automation.
One-Stop Shop for Machine Vision: Teledyne Technologies and Edmund Optics Partnership
Defining the Future of Supply Chain & Manufacturing with Fictiv CEO Dave Evans
Mech-Mind AI + 3D Vision-Guided Applications in Automotive Industry
High-Accuracy Inline Measurement of Automotive Part with Mech-Mind 3D Vision System
Smart factory deployment with Schneider Electric & Cognex machine vision
Label inspection for pharmaceutical manufacturing - Why OCTUM & HERMA trust Cognex Machine Vision
Autaza Vision AI
Mech-Mind's Industrial 3D Camera Mech-Eye: Empowering Robotic Integrators
AI Driven Vision Inspection Automation for Engine Tappets
How AI helps this contract manufacturer to stand out on product quality
Increase manufacturing processes by 25% with AI, Opcenter and Retrocausal, a Siemens Partner
Cone Ice cream Inspection using Machine Vision
Basler AG: Innovation Leaders
How OSARO used Cognex to solve a tricky barcode reading challenge for Zenni Optical
📷 Making automated visual-inspection systems practical
Using supervised learning to train anomaly localization models has major drawbacks: compared to images of defect-free products, images of defective products are scarce, and labeling defective-product images is expensive. Consequently, our benchmarking framework doesn’t require any anomalous images in the training phase. Instead, from the defect-free examples, the model learns a distribution of typical image features.
We have released our benchmark in the hope that other researchers will expand on it, to help bridge the gap between the impressive progress on anomaly localization in research and the challenges of real-world implementation.
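One simple way to learn a distribution of typical features from defect-free examples is to fit a Gaussian to them and score new samples by Mahalanobis distance. The sketch below assumes that approach for illustration only; the excerpt does not say which models the benchmark uses. The synthetic vectors stand in for features that would normally come from a pretrained network.

```python
import numpy as np

def fit_normal_model(features):
    """Estimate mean and inverse covariance from defect-free feature vectors (N, D)."""
    mean = features.mean(axis=0)
    # A small ridge term keeps the covariance matrix invertible.
    cov = np.cov(features, rowvar=False) + 1e-3 * np.eye(features.shape[1])
    return mean, np.linalg.inv(cov)

def anomaly_score(x, mean, cov_inv):
    """Mahalanobis distance of each row of x (M, D) from the defect-free distribution."""
    d = x - mean
    return np.sqrt(np.einsum("md,de,me->m", d, cov_inv, d))

rng = np.random.default_rng(0)
good = rng.normal(0.0, 1.0, size=(500, 8))   # synthetic defect-free features
mean, cov_inv = fit_normal_model(good)

typical_patch = np.zeros((1, 8))      # close to the training distribution
odd_patch = np.full((1, 8), 6.0)      # far from anything seen in training
scores = anomaly_score(np.vstack([typical_patch, odd_patch]), mean, cov_inv)
```

No defective sample is needed to fit the model. Applying the score per image patch rather than per image would give a localization map.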
🧠🦾 Google’s Robotic Transformer 2: More Than Meets the Eye
Google DeepMind’s Robotic Transformer 2 (RT2) is an evolution of vision language model (VLM) software. Trained on images from the web, RT2 software employs robotics datasets to manage low-level robotics control. Traditionally, VLMs have been used to combine inputs from both visual and natural language text datasets to accomplish more complex tasks. Of course, ChatGPT is at the front of this trend.
Google researchers identified a gap in how current VLMs were being applied in the robotic space. They note that current methods and approaches tend to focus on high-level robotic theory such as strategic state machine models. This leaves a void in the lower-level execution of robotic action, where the majority of control engineers execute work. Thus, Google is attempting to bring the power and benefits of VLMs down into the control engineers’ domain of programming robotics.
🛣️ America’s Bridges, Factories and Highways Are in Dire Need of Repairs. Bring in the Robots.
These days, Shell is able to keep the plant running, and keep repair personnel on the ground and at a safe distance as they operate wall-climbing robots that inspect things like steel holding tanks at millimeter resolution, says Steven Treviño, a robotics engineer at Shell. Using a variety of sensors, the robots can look for both corrosion and cracking. This helps the team shorten the list of things they have to take care of when a full shutdown occurs. The magnetic wall climbers Shell is using are made by a Pittsburgh-based startup called, appropriately, Gecko Robotics. After testing the Gecko robots at Geismar, Shell plans to expand their use to offshore facilities.
“There are hundreds of types of corrosion,” says Jake Loosararian, CEO of Gecko Robotics, “and we’ve been developing technology and software to analyze what kind of damage is happening.” Gecko began as a robotics company, but has since expanded into creating software to process the data its robots gather. The startup makes systems that are now used to track more than 60,000 assets across the globe, including power plants, pipelines, oil refineries, dams, U.S. Navy vessels and other military equipment.
When it comes to inspections, “often the data you need is literally in plain sight, it’s just hard to collect it,” says Bry, of Skydio.
AI Transformer Models Enable Machine Vision Object Detection
Machine vision is another key technology, and today AI and machine vision interact in a few ways. “First, machine vision output is fed to an AI engine to perform functions such as people counting, object recognition, etc., to make decisions,” said Arm’s Zyazin. “Second, AI is used to provide better quality images with AI-based de-noising, which then assists with decision-making. An example could be an automotive application where a combination of AI and machine vision can recognize a speed limit sign earlier and adjust the speed accordingly.”
“There are a few main directions for machine vision, including cloud computing to scale deep-learning solutions, automated ML architectures to improve the ML pipeline, transformer architectures that optimize computer vision (a superset of machine vision), and mobile devices incorporating computer vision technology on the edge,” Synopsys’ Andersen said.
🧠🦾 RT-2: New model translates vision and language into action
Robotic Transformer 2 (RT-2) is a novel vision-language-action (VLA) model that learns from both web and robotics data, and translates this knowledge into generalised instructions for robotic control.
High-capacity vision-language models (VLMs) are trained on web-scale datasets, making these systems remarkably good at recognising visual or language patterns and operating across different languages. But for robots to achieve a similar level of competency, they would need to collect robot data, first-hand, across every object, environment, task, and situation.
In our paper, we introduce Robotic Transformer 2 (RT-2), a novel vision-language-action (VLA) model that learns from both web and robotics data, and translates this knowledge into generalised instructions for robotic control, while retaining web-scale capabilities.
📐 UCLA Researchers Propose PhyCV: A Physics-Inspired Computer Vision Python Library
In the latest innovation, Jalali-Lab @ UCLA has developed a new Python library called PhyCV, which is the first Physics-based Computer vision Python library. This unique library uses algorithms based on the laws and equations of physics to analyze pictorial data. These algorithms imitate how light passes through several physical materials and are based on mathematical equations rather than a series of hand-crafted rules. The algorithms in PhyCV are built on the principles of a rapid data acquisition method called the photonic time stretch.
The three algorithms included in PhyCV are the Phase-Stretch Transform (PST) algorithm, the Phase-Stretch Adaptive Gradient-Field Extractor (PAGE) algorithm, and the Vision Enhancement via Virtual diffraction and coherent Detection (VEViD) algorithm.
Behind the A.I. tech making BMW vehicle assembly more efficient
Vision retrofits are the quick automation wins you should do now
A vision retrofit installs the latest AI-powered vision technologies to optimize the performance of a robotic cell. These technologies can significantly improve productivity by allowing robots to operate faster in a wider range of operating conditions and account for randomness intelligently. By upgrading existing 3D camera or fixtured setups with this AI-powered solution, companies can improve their operations and gain a competitive edge.
With AI-powered vision you only need to add an extrusion above the cell for 2D cameras (or just remove the existing 3D camera). With a focused investment under $50K, a few hours of downtime and a project time under six weeks, you could have a cell performing faster and more reliably.
Meta-Transformer: A Unified Framework for Multimodal Learning
Multimodal learning aims to build models that can process and relate information from multiple modalities. Despite years of development in this field, it still remains challenging to design a unified network for processing various modalities (e.g. natural language, 2D images, 3D point clouds, audio, video, time series, tabular data) due to the inherent gaps among them. In this work, we propose a framework, named Meta-Transformer, that leverages a frozen encoder to perform multimodal perception without any paired multimodal training data. In Meta-Transformer, the raw input data from various modalities are mapped into a shared token space, allowing a subsequent encoder with frozen parameters to extract high-level semantic features of the input data. Composed of three main components: a unified data tokenizer, a modality-shared encoder, and task-specific heads for downstream tasks, Meta-Transformer is the first framework to perform unified learning across 12 modalities with unpaired data. Experiments on different benchmarks reveal that Meta-Transformer can handle a wide range of tasks including fundamental perception (text, image, point cloud, audio, video), practical application (X-Ray, infrared, hyperspectral, and IMU), and data mining (graph, tabular, and time-series). Meta-Transformer indicates a promising future for developing unified multimodal intelligence with transformers.
🧠📹 What Sets Toshiba’s Ceramic Balls Apart? The AI Quality Inspection System
Bearings cannot be easily replaced once a vehicle is assembled. In the U.S., bearings used in EVs are expected to be of high enough quality to withstand long distances. One issue that can occur with EVs, however, is the “electric corrosion” of the bearings that mount the various vital parts of the vehicle onto the motor—a serious issue, as it can lead to the breakdown of the vehicle. High-performance bearings would drive the widespread use of EVs, and contribute to the push towards carbon neutrality. The electrical corrosion phenomenon had hampered these efforts, but not anymore—therein lies the beauty of Toshiba’s ceramic balls.
“Our ceramic balls go through slight changes about every year and a half due to changes in material and other factors. To keep up the accuracy of the quality inspections, we have to continually update the AI system itself. The MLOps system automates that process,” says Kobatake.
“We’ve been able to dramatically reduce the time spent on these inspections. Ceramic balls are expensive compared to their metal counterparts. They have so many different strengths, and yet they haven’t been able to replace the metal ones precisely because of this particular issue. If we’re able to reduce the cost through AI quality inspection, we’ll be able to lower the price of the products themselves,” says Yamada.
Apera AI & Mitsubishi Electric Automation Making Robotic Vision Simple
ImageBind: One Embedding Space To Bind Them All
We present ImageBind, an approach to learn a joint embedding across six different modalities - images, text, audio, depth, thermal, and IMU data. We show that all combinations of paired data are not necessary to train such a joint embedding, and only image-paired data is sufficient to bind the modalities together. ImageBind can leverage recent large scale vision-language models, and extends their zero-shot capabilities to new modalities just by using their natural pairing with images. It enables novel emergent applications ‘out-of-the-box’ including cross-modal retrieval, composing modalities with arithmetic, cross-modal detection and generation. The emergent capabilities improve with the strength of the image encoder and we set a new state-of-the-art on emergent zero-shot recognition tasks across modalities, outperforming specialist supervised models. Finally, we show strong few-shot recognition results outperforming prior work, and that ImageBind serves as a new way to evaluate vision models for visual and non-visual tasks.
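Cross-modal retrieval and composing modalities with arithmetic both reduce to vector operations once every modality lives in one embedding space. The toy sketch below uses hand-made 3-D vectors in place of real encoder outputs. It does not call the ImageBind model, and the labels attached to each vector are purely illustrative.

```python
import numpy as np

def normalize(v):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def retrieve(query, gallery):
    """Return the index of the gallery embedding most similar to the query."""
    return int(np.argmax(normalize(gallery) @ normalize(query)))

# Made-up embeddings standing in for encoder outputs in a shared space.
image_gallery = np.array([
    [1.0, 0.0, 0.0],   # image of a dog
    [0.0, 1.0, 0.0],   # image of a beach
    [0.7, 0.7, 0.0],   # image of a dog on a beach
])
audio_bark = np.array([0.9, 0.1, 0.0])   # audio embedding near "dog"
text_beach = np.array([0.1, 0.9, 0.0])   # text embedding near "beach"

# Audio query against an image gallery.
audio_to_image = retrieve(audio_bark, image_gallery)
# Adding two embeddings from different modalities composes their meaning.
combined = retrieve(normalize(audio_bark) + normalize(text_beach), image_gallery)
```

Here the bark audio retrieves the dog image, and the sum of the audio and text embeddings retrieves the image that contains both concepts.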
Improving Image Resolution At The Edge
AI vision for print quality inspection on bottles
Basler Lens selector: Match the right lens for your camera
How is 3D machine vision transforming manufacturing processes?
3D machine vision employs 3D cameras that provide robots with data and information pertaining to particular parts. These three-dimensional cameras can be installed at various locations to create 360-degree, multi-angle images for surface and volume inspection.
The topographical map results from reflected laser displacement. Taking images from two distinct angles makes it possible to recover 3D data for the scene. The separation between each perspective in 3D space is then computed. Installed software handles the substantial image processing and analysis. To evaluate an object with machine vision software, a PC-based machine vision system is hardwired to vision cameras and image capture boards.
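For the two-view case, the standard relation between depth and disparity in a rectified stereo pair gives a feel for how 3D data comes out of two images. The sketch assumes an ideal pinhole camera model; the focal length, baseline, and disparity values are hypothetical and do not describe any particular product.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth Z = f * B / d for a rectified stereo pair (pinhole camera model).

    focal_px: focal length in pixels.
    baseline_m: distance between the two camera centers in meters.
    disparity_px: horizontal shift of the same point between the two images.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Hypothetical rig: 1200 px focal length, cameras 0.10 m apart.
near = depth_from_disparity(1200.0, 0.10, 60.0)   # larger disparity, closer point
far = depth_from_disparity(1200.0, 0.10, 24.0)    # smaller disparity, farther point
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters far more for distant points than for near ones.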
AI Driven Vision Inspection Automation for Bevel Gears
Edge Learning: AI for Industrial Machine Vision Made Easy
Plastic Bottles Defect Inspection Using Omron FH Vision System with AI
Artificial intelligence for stable processes in industry
Cerrion’s solution can be used in any manufacturing process where the process is visible. What makes it special is the combination of video technology, which can be integrated directly into existing processes, and state-of-the-art AI. This way it is possible to monitor new processes immediately without having to first import a lot of data into the system, as is often the case with conventional solutions. The AI-based computer vision technology learns what a process looks like in normal operation and can detect deviations in real time.
The AI-based software can be connected to a commercially available camera and instantly provides reliable process insights. The system helps to detect problems in the process at an early stage, i.e. before they arise and lead to failures or losses. The technology can be used for a wide range of applications. Cerrion has already gained experience in glass bottle production, plant and machine manufacturing, tool manufacturing, pharmaceutical packaging, and wood construction. The AI can detect and track problems in automated processes such as jams on the production line in real time. It can also analyse whether the assigned time for the manual assembly of a component is in line with planning or whether the defined process needs to be optimised. Hazards for employees and equipment can also be detected and eliminated at an early stage. This leads to more safety in operations.
How machine vision works in RIBE Anlagentechnik’s camera-monitored assembly facility
The German company RIBE Anlagentechnik develops innovative assembly systems, including inspection systems, for bumpers. SICK’s machine vision helps to identify the individual components, and it also monitors each work operation. This particular system concept could prove revolutionary for other manufacturers and suppliers as well.
As the level of individualization in production areas increases, so does the importance of special-purpose systems with innovative potential. RIBE Anlagentechnik specializes in delivering added value to its end customers. The company has demonstrated its specific strengths in technologies associated with assembly and inspection systems for vehicle interiors/exteriors and related components. Managing Director Dietmar Heckel regards the cobot and robot technologies with innovative Industry 4.0 solutions and digitalization concepts not only as a supporting pillar of RIBE Anlagentechnik, but also as a cross-sectoral growth field.
Where Four-Legged Robot Dogs Are Finding Work
High-Performance Machine Vision: Versatile lighting for subtle surface defects
Fabs Drive Deeper Into Machine Learning
For the past couple of decades, semiconductor manufacturers have relied on computer vision, which is one of the earliest applications of machine learning in semiconductor manufacturing. Referred to as Automated Optical Inspection (AOI), these systems use signal processing algorithms to identify macro and micro physical deformations.
Defect detection provides a feedback loop for fab processing steps. Wafer test results produce bin maps (good or bad die), which also can be analyzed as images. Their data granularity is significantly larger than the pixelated data from an optical inspection tool. Yet test results from wafer maps can match the splatters generated during lithography and scratches produced from handling that AOI systems can miss. Thus, wafer test maps give useful feedback to the fab.
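Treating a bin map as an image means ordinary image-analysis tools apply to it. As a minimal sketch, the code below groups failing die into connected clusters, so that an elongated cluster (a possible scratch) can be told apart from isolated random failures. The toy map and the choice of 8-connectivity are assumptions for illustration, not a method taken from the article.

```python
from collections import deque

def fail_clusters(bin_map):
    """Group failing die (value 1) into 8-connected clusters; return sizes, largest first."""
    rows, cols = len(bin_map), len(bin_map[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if bin_map[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search over neighboring failing die.
            queue, size = deque([(r, c)]), 0
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and bin_map[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            sizes.append(size)
    return sorted(sizes, reverse=True)

# Toy 6x6 bin map: 0 = good die, 1 = bad die. The diagonal run mimics a scratch.
bin_map = [
    [0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
sizes = fail_clusters(bin_map)
```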
3D Vision Technology Advances to Keep Pace With Bin Picking Challenges
When a bin has one type of object with a fixed shape, bin picking is straightforward, as CAD models can easily recognize and localize individual items. But randomly positioned objects can overlap or become entangled, presenting one of the greatest challenges in bin picking. Identifying objects with varying shapes, sizes, colors, and materials poses an even larger challenge, but by deploying deep learning algorithms, it is possible to find and match objects that do not conform to one single geometrical description but belong to a general class defined by examples, according to Andrea Pufflerova, Public Relations Specialist at Photoneo.
“A well-trained convolutional neural network (CNN) can recognize and classify mixed and new types of objects that it has never come across before.”
Vision Cameras Inspect Disk Drive Assemblies
Once manufactured, an HDD is carefully fitted and sealed in a metal or plastic case. The case ensures that all drive components are perfectly secured in place and their mechanics work well over the lifetime of the product. It also protects the sensitive disks from dust, humidity, shock and vibration.
An HDD case must be defect-free and have perfectly machined thread holes to perform these functions, according to Somporn Kornwong, a manager at Flexon. In 2019 his company developed Visual Machine Inspection (VMI) for a manufacturer so it can quickly and thoroughly inspect each case it produces.
Simplify Deep Learning Systems with Optimized Machine Vision Lighting
Deep learning cannot compensate for or replace quality lighting. This experiment’s results would hold true over a wide variety of machine vision applications. Poor lighting configurations will result in poor feature extraction and increased defect detection confusion (false positives).
Several rigorous studies show that classification accuracy reduces with image quality distortions such as blur and noise. In general, while deep neural networks perform better than or on par with humans on quality images, a network’s performance is much lower than a human’s when using distorted images. Lighting improves input data, which greatly increases the ability of deep neural network systems to compare and classify images for machine vision applications. Smart lighting — geometry, pattern, wavelength, filters, and more — will continue to drive and produce the best results for machine vision applications with traditional or deep learning systems.
Perceiver: General Perception with Iterative Attention
Biological systems perceive the world by simultaneously processing high-dimensional inputs from modalities as diverse as vision, audition, touch, proprioception, etc. The perception models used in deep learning on the other hand are designed for individual modalities, often relying on domain-specific assumptions such as the local grid structures exploited by virtually all existing vision models. These priors introduce helpful inductive biases, but also lock models to individual modalities. In this paper we introduce the Perceiver - a model that builds upon Transformers and hence makes few architectural assumptions about the relationship between its inputs, but that also scales to hundreds of thousands of inputs, like ConvNets. The model leverages an asymmetric attention mechanism to iteratively distill inputs into a tight latent bottleneck, allowing it to scale to handle very large inputs. We show that this architecture is competitive with or outperforms strong, specialized models on classification tasks across various modalities: images, point clouds, audio, video, and video+audio. The Perceiver obtains performance comparable to ResNet-50 and ViT on ImageNet without 2D convolutions by directly attending to 50,000 pixels. It is also competitive in all modalities in AudioSet.
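As a rough illustration of the asymmetric attention described in the abstract, the sketch below lets a small latent array query a much larger input array, so the attention matrix has shape (latents × inputs) rather than (inputs × inputs). This is a minimal NumPy sketch, not the authors' implementation: the latent size, feature widths, random weights, and the `cross_attend` helper are all assumptions for illustration, and the latent self-attention layers and positional features that the full model relies on are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(latents, inputs, w_q, w_k, w_v):
    """Latents (N, D) query the inputs (M, C); the result keeps shape (N, D)."""
    q = latents @ w_q                          # (N, D) queries come from the latents
    k = inputs @ w_k                           # (M, D) keys come from the raw inputs
    v = inputs @ w_v                           # (M, D) values come from the raw inputs
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (N, M): cost is linear in M, not quadratic
    return latents + softmax(scores, axis=-1) @ v  # residual update of the latents

rng = np.random.default_rng(0)
M, C = 50_000, 3   # e.g. 50,000 pixels with 3 channels each
N, D = 64, 32      # hypothetical latent bottleneck size

inputs = rng.normal(size=(M, C))
latents = rng.normal(size=(N, D))
w_q = rng.normal(size=(D, D)) * 0.1   # random stand-ins for learned weights
w_k = rng.normal(size=(C, D)) * 0.1
w_v = rng.normal(size=(C, D)) * 0.1

# Iterative distillation: the latents attend to the same inputs repeatedly.
# The same weights are reused on each pass here purely for brevity.
for _ in range(2):
    latents = cross_attend(latents, inputs, w_q, w_k, w_v)

print(latents.shape)
```

Because only the latents carry state from one pass to the next, the amount of computation after each cross-attention step depends on the latent size, not on the number of input elements.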
Tilling AI: Startup Digs into Autonomous Electric Tractors for Organics
Ztractor offers tractors that can be configured to work on 135 different types of crops. They rely on the NVIDIA Jetson edge AI platform for computer vision tasks to help farms improve plant conditions, increase crop yields and achieve higher efficiency.
AI Vision for Monitoring Applications in Manufacturing and Industrial Environments
In traditional industrial and manufacturing environments, monitoring worker safety, enhancing operator efficiency, and improving quality assurance were physical tasks. Today, AI-enabled machine vision technologies replace many of these inefficient, labor-intensive operations for greater reliability, safety, and efficiency. This article explores how, by deploying AI smart cameras, further performance improvements are possible since the data used to empower AI machine vision comes from the camera itself.
Tools Move up the Value Chain to Take the Mystery Out of Vision AI
Intel DevCloud for the Edge and Edge Impulse offer cloud-based platforms that take most of the pain points away with easy access to the latest tools and software. Meanwhile, Xilinx and others have started offering complete systems-on-module with production-ready applications that can be deployed with tools at a higher level of abstraction, removing the need for some of the more specialist skills.
How the USPS Is Finding Lost Packages More Quickly Using AI Technology from Nvidia
In one of its latest technology innovations, the USPS got AI help from Nvidia to fix a problem that has long confounded existing processes – how to better track packages that get lost within the USPS system so they can be found in hours instead of in several days. In the past, it took eight to 10 people several days to locate and recover lost packages within USPS facilities. Now it is done by one or two people in a couple of hours using AI.
Hyperspectral imaging aids precision farming
Remote sensing techniques have evolved rapidly thanks to technological progress and the spread of multispectral cameras. Hyperspectral imaging is the capture and processing of an image at a very large number of wavelengths. While multispectral imaging evaluates a scene with three or four bands (red, green, blue and near infrared), hyperspectral imaging splits the image into tens or hundreds of bands. By using spectroscopy, a technique that identifies materials based on how light behaves when it hits a subject, hyperspectral imaging obtains more spectral data for each pixel in the image of a scene.
Unlike radiography, hyperspectral imaging is a non-destructive, non-contact technology that can be used without damaging the object being analyzed. For example, a drone with a hyperspectral camera can detect plant diseases, weeds, and soil erosion problems, and can also estimate crop yields.
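To make the per-pixel spectrum idea concrete, the sketch below builds a multispectral and a hyperspectral image cube and compares one pixel's spectrum against a reference spectrum. Everything here is an assumption for illustration: the data is random, the band counts and image size are arbitrary, and the spectral angle comparison is one common way to match spectra, not necessarily the method used in the work described above.

```python
import numpy as np

rng = np.random.default_rng(0)
height, width = 4, 5

# Synthetic cubes: each pixel holds one value per spectral band.
multispectral = rng.random((height, width, 4))     # red, green, blue, near infrared
hyperspectral = rng.random((height, width, 200))   # hypothetical 200-band sensor

def pixel_spectrum(cube, row, col):
    """Return the full spectrum recorded at one pixel."""
    return cube[row, col, :]

def spectral_angle(a, b):
    """Angle in radians between two spectra; 0 means the same spectral shape."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

reference = pixel_spectrum(hyperspectral, 0, 0)    # stand-in for a known material
candidate = pixel_spectrum(hyperspectral, 2, 3)

print(pixel_spectrum(multispectral, 0, 0).shape)   # 4 values per pixel
print(reference.shape)                             # 200 values per pixel
print(spectral_angle(reference, candidate))
```

The spectral angle ignores overall brightness, so a pixel that is simply lit more strongly than the reference still matches it; only a change in the shape of the spectrum increases the angle.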
John Deere and Audi Apply Intel’s AI Technology
Identifying defects in welds is a common quality control process in manufacturing. To make these inspections more accurate, John Deere is applying computer vision, coupled with Intel’s AI technology, to automatically spot common defects in the automated welding process used in its manufacturing facilities.
At Audi, automated welding applications range from spot welding to riveting. The widespread automation in Audi factories is part of the company’s goal of creating Industrie 4.0-level smart factories. A key aspect of this goal involves Audi’s recognition that creating customized hardware and software to handle individual use cases is not preferable. Instead, the company focuses on developing scalable and flexible platforms that allow it to more broadly apply advanced digital capabilities such as data analytics, machine learning, and edge computing.
F-16s Are Now Getting Washed By Robots
The Wilder Systems solution actually leverages technology previously developed for robotic drilling in commercial aircraft manufacturing and converts these components and subsystems into an automated washing system. The main changes have involved the development and addition of robot end-effectors to provide the water and soap spray, waterproofing of the robots themselves, and a robot motion path, which is dependent on the type of aircraft to be cleaned.
Machine learning optimizes real-time inspection of instant noodle packaging
During the production process, various factors can lead to seasoning sachets slipping between two noodle blocks and being cut open by the cutting machine, or being packed separately in two packets side by side. Such defective products would result in consumer complaints and damage to the company’s reputation, so delivery of such products to dealers should be reduced as far as possible. Since the machine type upgraded by Tianjin FengYu already had a very low error rate, another aspect of quality control is critical: the system must reliably sort out only the defective products and not the defect-free ones.
Tractor Maker John Deere Using AI on Assembly Lines to Discover and Fix Hidden Defective Welds
John Deere performs gas metal arc welding at 52 factories where its machines are built around the world, and it has proven difficult to find defects in automated welds using manual inspections, according to the company.
That’s where the successful pilot program between Intel and John Deere has been making a difference, using AI and computer vision from Intel to “see” welding issues and get things back on track to keep John Deere’s pilot assembly line humming along.
Harvesting AI: Startup’s Weed Recognition for Herbicides Grows Yield for Farmers
In 2016, the former dorm-mates at École Nationale Supérieure d’Arts et Métiers, in Paris, founded Bilberry. The company today develops weed recognition powered by the NVIDIA Jetson edge AI platform for precision application of herbicides at corn and wheat farms, offering as much as a 92 percent reduction in herbicide usage.
Driven by advances in AI and pressures on farmers to reduce their use of herbicides, weed recognition is starting to see its day in the sun.
Analysing fruit data in the supply chain has never been more important for business efficiency
Fruit and production data can be used in ways it has never been used before to improve a company’s efficiency and boost profits, according to global packhouse equipment and automation supplier Tomra Food.
He added that there are several different useful data types at play in a packhouse: production and traceability level data, performance level data, quality data and auditing data. This data can be used to optimise the supply chain and to guide decisions about the next big thing that needs to be done. But consumer trends will constantly change the requirements of automation.