Machine Vision
Assembly Line
Zebra Technologies to Acquire Photoneo, Expanding Its Portfolio of 3D Machine Vision Solutions
Zebra Technologies (NASDAQ: ZBRA), a leading digital solution provider enabling businesses to intelligently connect data, assets, and people, announced it intends to acquire Photoneo, a leading developer and manufacturer of 3D machine vision solutions. The 3D segment of the Machine Vision market is the fastest growing, and this acquisition will further accelerate Zebra’s presence in the category.
Photoneo’s intelligent sensors are particularly effective within the vision-guided robotic (VGR) segment. They are certified to interface with many of the largest robotic manufacturers for a variety of use cases including robot-arm applications for bin picking. Photoneo differentiates itself through parallel structured light technology in complex 3D applications, which provides a faster, more accurate, higher-resolution and more robust solution comprising both hardware and software.
Zebra has made strategic investments in the Machine Vision market, most recently in the acquisition of Matrox Imaging in June 2022 to augment its portfolio of fixed industrial scanners and machine vision sensors. By acquiring Matrox Imaging, Zebra accelerated its position as a leading provider of machine vision hardware and a broad range of software development libraries and apps now unified within the Zebra Aurora software suite.
Vision-guided cobot automates paint process for DENSO
DENSO, a leading global automotive parts manufacturer, partnered with CapSen Robotics and systems integrator Invent Automation to automate a physically challenging and repetitive tote-handling task. The task involved loading and unloading large stacks of heavy totes to and from a paint booth. The new system uses a six-axis collaborative robot with an Intel RealSense 3D RGB depth camera to visually identify and measure the height of the totes for picking. The CapSen PiC 2.0 software allows the robot to plan its motion, locate, pick, and manipulate the tote, and move it toward another conveyor headed to the paint booth station. The system has resulted in zero drops and zero missed picks, freeing up employees to perform more impactful work.
Automated 3D Inspection System Detects Weld Defects
Since the assembly process involves more than 20 steps, quality control must be comprehensive to ensure that defects are identified at an early stage and reworked. If problems are identified too late in the process, the part must be scrapped.
To guarantee structural integrity, each weld must be inspected. Afterwards, the height of each rivet must be checked with an accuracy of ±0.1 millimeter, since the height of the raised round rivet heads determines how much pressure must be applied when the sheets are joined. Finally, various holes, slots and mounts must be inspected to comply with specifications.
To accomplish these tasks, the carmaker turned to Bluewrist Inc. in Ann Arbor, MI. Bluewrist developed an inline 3D vision inspection system that can continually monitor production quality. Working with the OEM, Bluewrist designed a system that uses two six-axis robots from FANUC and two 3D laser profile cameras from LMI Technologies. As the robots carry out high-speed inspection of all the critical features, the results are recorded and a 3D point cloud is processed and verified against specifications to identify any defects.
The profilers capture detailed surface characteristics of each weld. The weld profile can be inspected in its entirety or broken down into individual sections for analysis and comparison against blueprints to guarantee conformity to length, width, throat thickness, and leg length. The system can inspect fillet, lap, butt, corner, T-joint and plug welds.
Given the complexity of the battery tray, Bluewrist tested the system extensively with feasibility studies before deploying it on the production line. In production, the system performs all the inspections in less than 200 seconds.
Imagia announces ‘breakthrough’ in optical feature detection
Imagia, a developer of optical metasurface technology, has announced a “breakthrough” that allows for optically-accelerated feature detection, enabling image processing operations without power consumption or code.
The new technology, called Processing Optics™, allows for the identification of complex features like a human hand or face by means of an array of microscopic optical filters rather than standard, algorithmic on-chip image processing. For power-limited and latency-critical applications like edge computing on wearable devices, the ability to extract features of interest quickly and efficiently presents an immense opportunity to enable AI applications at the edge and push the boundaries of device design.
“Processing Optics is a step change in the way we think about extracting information from the world around us,” commented Greg Kress, CEO of Imagia. “Searching for objects or patterns within images has traditionally been a slow and computationally intensive task. Now, we get the signal we want at the speed of light, for very little power.”
The technology works by applying a set of mathematical convolutions in an array of optical filters. Light passing through an Imagia metalens is steered and transformed by billions of nanoscale components that impart a hard-coded pattern recognition algorithm to the signal. Imagia has demonstrated a hand and gesture detector that works with only eight pixels of information and with a response time of only 80 microseconds. By contrast, traditional optics and processing typically take 30-40 milliseconds to process the millions of pixels for digital algorithmic approaches.
By processing the image directly in the optics, Imagia is able to realize a 500x reduction in detection latency for a fraction of the power compared to the traditional method of capturing an image and then processing that data in downstream software. Running at a comparable frame rate to a standard image processing system, the Imagia solution consumes less than 1% of the power.
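The latency claim can be cross-checked with simple arithmetic. The sketch below uses only the figures quoted in the announcement (80 microseconds versus 30-40 milliseconds); the variable names are illustrative.

```python
# Figures quoted in the announcement.
optical_latency_s = 80e-6             # 80 microseconds
digital_latency_s = (30e-3, 40e-3)    # 30-40 milliseconds

# Ratio of conventional latency to optical latency at both ends of the range.
speedups = [round(t / optical_latency_s) for t in digital_latency_s]
print(speedups)  # [375, 500]
```

The quoted 500x figure matches the upper end of the 30-40 millisecond range; at 30 milliseconds the ratio is 375x.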
Applications like artificial intelligence and active feature detection in laptops and AR/VR headsets are set to receive outsized benefit from the innovation, which could extend battery life of these devices by 20% or more.
ZEISS and Aechelon Join Forces to Unveil VELVET 4K SIM 240Hz SIM with Unprecedented 5M:1 Contrast
ZEISS and Aechelon Technology Inc. proudly announce the debut of a revolutionary simulation system integrating ZEISS’ cutting-edge dual-DMD projector, delivering 4K resolution at 240Hz with a stunning 5,000,000:1 native contrast ratio.
“Our partnership with ZEISS exemplifies Aechelon’s commitment to fostering innovation through collaboration with Germany’s industrial fabric,” said Javier Castellar, Co-Founder & Chief Strategy Officer at Aechelon Technology, Inc. “This integration sets a new standard for simulation technology, offering unparalleled fidelity and realism.”
The demonstration system integrates Aechelon’s Image Generator with a high-resolution worldwide 3D visual and sensor database, which includes extensive coverage of domestic and foreign areas of interest. This global database is already operational in Department of Defense (DoD) and Allied Nations programs, ensuring seamless compatibility and commonality.
DeepX DX-M1 AI Chip Hits 90% Yield with Samsung’s 5nm Technology
DeepX, a South Korean AI semiconductor company, has announced a major milestone for its DX-M1 chip, achieving a 90% yield rate through Samsung Foundry’s 5nm manufacturing process.
The DX-M1 is engineered for real-time AI processing in energy-constrained environments, making it ideal for physical security, robotics, and smart manufacturing. Its ability to process over 16 video channels simultaneously at 30 frames per second highlights its suitability for intelligent video analytics, machine vision, and industrial automation.
By supporting advanced AI models such as YOLOv9 and vision transformers, the chip caters to a wide range of inference tasks, from object detection to image recognition.
DeepX has emphasized its unique contributions to edge AI, leveraging proprietary technologies like IQ8 compression, which enhances model efficiency without sacrificing accuracy.
Additionally, Smart Memory Access technology significantly reduces memory usage, enabling high performance with Low-Power Double Data Rate (LPDDR) memory while avoiding the costs associated with high-bandwidth memory (HBM).
BYD Clutch Cover Vision Inspection Equipment Combining Contact and Non-Contact Technologies
Your Next Computer Vision Model Might be an LLM
Sony Semiconductor Solutions to Release an Industrial CMOS Image Sensor with Global Shutter for High-Speed Processing and High Pixel Count
Sony Semiconductor Solutions Corporation (SSS) announced the upcoming release of the IMX925 stacked CMOS image sensor with back-illuminated pixel structure and global shutter. This new product offers high-speed processing at 394 fps and a high effective pixel count of 24.55 megapixels, and is optimized for industrial equipment imaging.
With factory automation progressing, demand continues to grow for machine vision cameras capable of fast, high-quality imaging for a variety of objects in the industrial equipment domain. By employing a global shutter capable of capturing moving subjects free of distortion together with a proprietary back-illuminated pixel structure, SSS’s global-shutter CMOS image sensors deliver superb pixel characteristics, including high sensitivity and saturation capacity. They are mainly being used to recognize and inspect precision components such as electronic devices.
PLC based laser scanning system for conveyor belt surface monitoring
This paper presents the design, implementation, and testing of an advanced conveyor belt surface monitoring system, specifically engineered for harsh and complex industrial environments. The system integrates multiple cutting-edge technologies, including programmable logic controllers (PLC), laser scanning, industrial-grade cameras, and deep learning algorithms, particularly YOLOv7, to achieve real-time, high-precision monitoring of conveyor belt conditions. Key innovations include optimized detection location based on failure modes, advanced PLC integration for seamless automation, and intelligent dust-proof features to maintain accuracy in challenging conditions. Through strategic placement of detection devices and multi-mode control strategies (local, remote, and automatic), the system offers unparalleled adaptability and responsiveness. The system leverages robust data management for trend analysis and predictive maintenance, enhancing operational efficiency. The hardware architecture comprises PLC-based control systems, high-resolution industrial cameras, and laser emitters, while the software features a two-tier structure combining human-machine interaction (HMI) with real-time data processing capabilities. Experimental results show that the system is highly effective in detecting common belt defects such as foreign objects, tears, and shallow scratches, ensuring optimal operational efficiency and minimizing downtime. The system’s scalability, robust data management, and adaptability to low-light and dusty conditions make it ideal for deployment in large-scale industrial operations, where continuous monitoring and early fault detection are critical to maintaining productivity and safety.
GE Aerospace, Waygate Technologies to Deliver new AI-assisted Commercial Jet Engine Borescope Inspection Solution to Enhance Defect Recognition
GE Aerospace and Waygate Technologies, a Baker Hughes business, announced they have jointly developed a new, AI-assisted commercial engine borescope solution that will be available to Waygate Technologies customers and introduced to GE Aerospace’s MRO network later this year. The development represents the successful completion of their first development program under a Joint Technology Development Agreement between the two companies announced in May of 2023.
Through this joint development effort, GE Aerospace provided Waygate Technologies with a comprehensive dataset of engine inspection videos, which resulted in thousands of new representative images used for training Waygate Technologies’ Gas Power-assist ADR model. GE’s Services Technology Acceleration Center (STAC) and GE Aerospace Research brought subject matter expertise to ensure accurate and complete data labeling was performed. Waygate Technologies then leveraged this data and applied cutting-edge AI techniques, including a compute-optimized, state-of-the-art object detection algorithm and a novel temporal smoothing algorithm.
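The announcement does not describe how the temporal smoothing algorithm works. As a rough illustration of the general idea of confirming detections across video frames, the sketch below uses a simple k-of-n sliding-window vote. The function name, window size, and threshold are assumptions made for this example and are not taken from Waygate Technologies' model.

```python
from collections import deque

def confirm_detections(frame_hits, window=5, min_hits=3):
    """Confirm a detection only when at least min_hits of the last
    `window` frames contain a raw hit (hypothetical k-of-n voting)."""
    recent = deque(maxlen=window)   # raw hits for the most recent frames
    confirmed = []
    for hit in frame_hits:
        recent.append(bool(hit))
        confirmed.append(sum(recent) >= min_hits)
    return confirmed

# One isolated raw hit (frame 1) followed by a sustained run of hits.
raw = [0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0]
smoothed = confirm_detections(raw)
```

In this toy sequence the isolated hit at frame 1 is never confirmed, while the sustained run starting at frame 4 is confirmed from frame 5 onward. Suppressing single-frame hits in this way is one route by which smoothing can reduce false positives.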
Key technical advancements, as compared to the program starting point (Gas Power-assist ADR model v4.1), include:
- Increased True Positive Rate: Model recall rates realized a 33.6% increase, indicating a dramatic improvement in identifying HPC defects.
- Decreased False Positive Rate: Model precision rates realized a 13.5% increase, indicating a reduction in previous falsely identified defects. This improvement was achieved both by an increased training dataset and the temporal smoothing algorithm used for detection confirmation.
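For readers less familiar with the two metrics, the sketch below shows the standard definitions of precision and recall in terms of true positives, false positives, and false negatives. The counts are made up for illustration and are not data from the program. The announcement does not state whether the reported increases are relative or absolute.

```python
def precision_recall(tp, fp, fn):
    """Standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical counts: 80 defects found, 20 false alarms, 40 defects missed.
p, r = precision_recall(tp=80, fp=20, fn=40)
print(f"precision={p:.3f} recall={r:.3f}")
```

Fewer false alarms raise precision, and fewer missed defects raise recall, which is why the two bullets above pair each metric with the corresponding error type.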
The new AI-assisted features will be integrated and available for deployment through a software update to customers for Waygate Technologies’ Mentor Visual iQ+ borescope later this year. In addition, GE Aerospace will be introducing the model to its MRO network for use in High Pressure Compressor inspections for its GEnx and CFM LEAP engines.
Exotec | Client Sites | Renault Group
The Role of AI-Powered Machine Vision Systems in Textile Quality Control
Integrating AI with machine vision enables systems to learn from past data and improve defect detection over time. For example, Robro Systems’ Kiara Web Inspection System (KWIS) uses AI-driven algorithms to enhance detection capabilities, adapting to new defect patterns that may emerge during production.
With KWIS, we saw a 25% improvement in defect detection accuracy compared to manual inspection methods. For instance, in a batch of conveyor belt fabric, the system detected micro-tears that manual inspection would have missed, allowing us to correct the issue early and avoid downstream quality failures. This reduced our material waste and ensured that only high-quality products reached our customers.
Implementing machine vision technology has also translated into significant cost savings for manufacturers. According to a study by the International Journal of Advanced Manufacturing Technology, machine vision can reduce defect-related production costs by up to 30%. For manufacturers, this has meant reducing the costs associated with rework and waste and minimizing customer returns and complaints.
Machine Vision for medical device manufacturing - Pentagon Automation Assembly & Cognex
Combining next-token prediction and video diffusion in computer vision and robotics
Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have proposed a simple change to the diffusion training scheme that makes this sequence denoising considerably more flexible.
When applied to fields like computer vision and robotics, the next-token and full-sequence diffusion models have capability trade-offs. Next-token models can spit out sequences that vary in length. However, they make these generations while being unaware of desirable states in the far future — such as steering its sequence generation toward a certain goal 10 tokens away — and thus require additional mechanisms for long-horizon (long-term) planning. Diffusion models can perform such future-conditioned sampling, but lack the ability of next-token models to generate variable-length sequences.
Researchers from CSAIL want to combine the strengths of both models, so they created a sequence model training technique called “Diffusion Forcing.” The name comes from “Teacher Forcing,” the conventional training scheme that breaks down full sequence generation into the smaller, easier steps of next-token generation (much like a good teacher simplifying a complex concept).
Diffusion Forcing found common ground between diffusion models and teacher forcing: They both use training schemes that involve predicting masked (noisy) tokens from unmasked ones. In the case of diffusion models, they gradually add noise to data, which can be viewed as fractional masking. The MIT researchers’ Diffusion Forcing method trains neural networks to cleanse a collection of tokens, removing different amounts of noise within each one while simultaneously predicting the next few tokens. The result: a flexible, reliable sequence model that resulted in higher-quality artificial videos and more precise decision-making for robots and AI agents.
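The core idea of giving each token its own noise level can be sketched in a few lines. This is a minimal illustration of the noising step only, assuming a standard variance-preserving noise rule and a made-up linear schedule. The denoising network and training loss are omitted, and none of the names below come from the CSAIL work.

```python
import math
import random

def noise_tokens_independently(tokens, alpha_bar, rng):
    """Corrupt every token with its own, independently sampled noise level.

    tokens:    list of T token vectors (lists of floats)
    alpha_bar: signal-retention schedule; index 0 means no noise
    """
    noisy, levels = [], []
    for token in tokens:
        k = rng.randrange(len(alpha_bar))   # noise level for this token only
        a = alpha_bar[k]
        noisy.append([math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
                      for x in token])
        levels.append(k)
    return noisy, levels

rng = random.Random(0)
alpha_bar = [1.0 - 0.11 * k for k in range(10)]   # assumed schedule: clean to mostly noise
tokens = [[float(t), float(t) + 0.5] for t in range(6)]
noisy, levels = noise_tokens_independently(tokens, alpha_bar, rng)
```

Full-sequence diffusion would use one shared level for all tokens, and next-token prediction corresponds to clean past tokens with a fully masked future. Sampling a level per token covers the cases in between.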
Feline eye–inspired artificial vision for enhanced camouflage breaking under diverse light conditions
Biologically inspired artificial vision research has led to innovative robotic vision systems with low optical aberration, wide field of view, and compact form factor. However, challenges persist in object detection and recognition against complex backgrounds and varied lighting. Inspired by the feline eye, which features a vertically elongated pupil and tapetum lucidum, this study introduces an artificial vision system designed for superior object detection and recognition in a monocular framework. Using a slit-like elliptical aperture and a patterned metal reflector beneath a hemispherical silicon photodiode array, the system reduces excessive light and enhances photosensitivity. This design achieves clear focus under bright light and enhanced sensitivity in dim conditions. Theoretical and experimental analyses demonstrate the system’s ability to filter redundant information and detect camouflaged objects in diverse lighting, representing a substantial advancement in monocular camera technology and the potential of biomimicry in optical innovations.
Potato Sorting Grading & Sizing Using AI
Optimizing waste handling with interactive AI: Prompt-guided segmentation of construction and demolition waste using computer vision
Optimized and automated methods for handling construction and demolition waste (CDW) are crucial for improving the resource recovery process in waste management. Automated waste recognition is a critical step in this process, and it relies on robust image segmentation techniques. Prompt-guided segmentation methods provide promising results for specific user needs in image recognition. However, the current state-of-the-art segmentation methods trained for generic images perform unsatisfactorily on CDW recognition tasks, indicating a domain gap. To address this gap, a user-guided segmentation pipeline is developed in this study that leverages prompts such as bounding boxes, points, and text to segment CDW in cluttered environments. The adopted approach achieves a class-wise performance of around 70 % in several waste categories, surpassing the state-of-the-art algorithms by 9 % on average. This method allows users to create accurate segmentations by drawing a bounding box, clicking, or providing a text prompt, minimizing the time spent on detailed annotations. Integrating this human–machine system as a user-friendly interface into material recovery facilities enhances the monitoring and processing of waste, leading to better resource recovery outcomes in waste management.
How to Train an Object Detection Model for Visual Inspection with Synthetic Data
Edge Impulse is an integrated development platform that empowers developers to create and deploy AI models for edge devices. It supports data collection, preprocessing, model training, and deployment, helping users integrate AI capabilities into their applications effectively.
With NVIDIA Omniverse Replicator, a core extension of NVIDIA Omniverse, users can produce physically accurate and photorealistic, synthetically generated annotated images in Universal Scene Description, known as OpenUSD. These images can then be used for training an object detection model on the Edge Impulse platform.
Taking a data-centric approach, where you create more data around the failure points of the model, is crucial to solving ML problems. Additional training and fine-tuning of parameters can enable a model to generalize well across different orientations, materials, and other relevant conditions.
How Schaeffler Amplifies Electric Vehicle Production with Cognex Machine Vision
Strapping and stretch hooding of concrete blocks
AF-FTTSnet: An end-to-end two-stream convolutional neural network for online quality monitoring of robotic welding
Online welding quality monitoring (WQM) is crucial for intelligent welding, and deep learning approaches considering spatiotemporal features for WQM tasks show great potential. However, one of the important challenges for existing approaches is to balance the spatiotemporal representation learning capability and computational efficiency, which makes it challenging to adapt welding processes with complex and drastic molten pool dynamic behavior. This paper proposes a novel approach for WQM using molten pool visual sensing and deep learning considering spatiotemporal features, the proposed deep learning network called attention fusion based frame-temporality two-stream network (AF-FTTSnet). Firstly, a passive vision sensor is used to acquire continuous dynamic molten pool images. Meanwhile, temporal difference images are computed to provide novel features and temporal representations. Then, a two-stream feature extraction module is designed to concurrently extract rich spatiotemporal features from molten pool images and temporal difference images. Finally, an attention fusion module with the ability to automatically identify and weight the most relevant features is designed to achieve optimal fusion of the two-stream features. The shop welding experimental results indicate that the proposed AF-FTTSnet model can effectively and robustly recognize five typical welding states during helium arc welding, with an accuracy of 99.26%. This model has been demonstrated to exhibit significant performance improvements compared to mainstream temporal sequence models.
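The temporal difference images that feed the second stream can be illustrated with a short sketch. The abstract does not say whether signed or absolute differences are used, so the absolute difference between consecutive frames below is an assumption, as are the function name and the tiny 2x2 frames.

```python
def temporal_difference(frames):
    """Return absolute pixel-wise differences between consecutive frames."""
    diffs = []
    for prev, curr in zip(frames, frames[1:]):
        diffs.append([[abs(c - p) for p, c in zip(prev_row, curr_row)]
                      for prev_row, curr_row in zip(prev, curr)])
    return diffs

# Three toy grayscale frames (nested lists of pixel intensities).
frames = [
    [[10, 10], [10, 10]],
    [[10, 12], [10, 10]],
    [[10, 20], [15, 10]],
]
diffs = temporal_difference(frames)
print(diffs)  # [[[0, 2], [0, 0]], [[0, 8], [5, 0]]]
```

A sequence of N frames yields N - 1 difference images. Static regions go to zero, so the second stream sees mainly the parts of the molten pool that changed between frames.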
Vuzix CEO talks about the AR Space, AI, Warehouse Picking with Smart Glasses
Unveiling Databricks power in analyzing electrical grid assets using computer vision
Data is ingested from an EPRI dataset consisting of images of distribution assets along with labels for each object. These are ingested into Delta tables and transformed through the medallion architecture in order to produce a dataset that is ready for model training.
After data loading has been completed, the training can begin. In the age of GenAI, large GPUs are scarce, leaving only smaller ones, which can significantly lengthen training and experimentation times. To combat this, Databricks allows you to run distributed GPU training using features like PytorchDistributor. This accelerator takes advantage of that capability, using a cluster of commodity GPUs to train our model, which brings the training time down almost linearly.
AI-powered 3D inspection system for factory automation - In-Sight L38 Series from Cognex
Cognex Launches the World's First 3D Vision System with AI
Cognex Corporation released the In-Sight® L38 3D Vision System, which combines AI, 2D, and 3D vision technologies to solve a range of inspection and measurement applications. The system creates unique projection images that combine 3D information into an easy-to-label 2D image for simplified training and reveals features not visible with traditional 2D imaging. AI tools detect variable or undefined features, while rule-based algorithms provide 3D measurements to deliver reliable inspection results.
The In-Sight L38 greatly simplifies the process of configuring 3D systems thanks to embedded AI technology that uses pre-trained models with domain-specific data. Example-based training replaces complex programming steps, which previously required combining many traditional rule-based tools, to streamline application development. The unique AI-powered 3D tools can be set up in minutes, requiring as few as 5 to 10 labeled images to automate a task. With one tool, users can detect challenging defects, gauge variances in three dimensions, and get results in real-world units.
3D Vision-Guided Racking of Reflective Sheet Metal Parts
Easily integrate Machine Vision into production with apps from the Industrial Edge Ecosystem
Quality control is critical in modern industry. Machine vision makes it less error-prone, time-consuming, and costly. By adding offerings from industry leaders Basler and MVTec to the Siemens Industrial Edge ecosystem, new scalable machine vision solutions can be efficiently and seamlessly integrated into production automation.
One-Stop Shop for Machine Vision: Teledyne Technologies and Edmund Optics Partnership
Defining the Future of Supply Chain & Manufacturing with Fictiv CEO Dave Evans
Mech-Mind AI + 3D Vision-Guided Applications in Automotive Industry
High-Accuracy Inline Measurement of Automotive Part with Mech-Mind 3D Vision System
Smart factory deployment with Schneider Electric & Cognex machine vision
Emirates Global Aluminium: An interview with Carlo Nizam, Chief Digital Officer
EGA’s digital transformation is driven by a dual-track strategy, designed to deliver both short-term impact and long-term scalability. Carlo emphasizes that the digital factory is the heart of this transformation, executing practical use cases that deliver measurable results. “The digital factory delivers quarterly waves of use cases, which are implemented in agile sprints. Each wave involves multiple business areas, from supply chain and marketing to operations, HR, and finance,” Carlo says. This approach allows the company to demonstrate tangible results quickly while working toward more complex, large-scale initiatives. Since its inception, EGA’s digital factory has delivered over 100 million USD in impact, with more than 80 use cases, ranging from AI applications for real-time quality checks to predictive tools for market movements.
For example, at EGA, AI-powered cameras are being used to enhance operational efficiency and ensure precision in critical processes. One such application is the Smart Cranes system, which oversees the replacement of anodes, the large carbon blocks essential for electricity conduction during smelting. EGA’s real-time vision AI model helps supervisors monitor compliance during the 9 million crane tasks performed annually by over 5,000 operators. Integrated with the industrial operations mobile platform, the AI system generates alarms in case of non-compliance, enabling immediate corrective action.
In the carbon anode facility, AI cameras are also revolutionizing quality control. Previously, only 2% of the anodes produced were manually inspected for defects. With the installation of cameras at the inspection station, EGA now uses a neural network machine learning model to analyze nearly 4 million images each year. This system automatically inspects 100% of anode surfaces, detecting defects in real time with greater accuracy than the human eye, significantly improving the quality of the aluminium produced.
Label inspection for pharmaceutical manufacturing - Why OCTUM & HERMA trust Cognex Machine Vision
Autaza Vision AI
Mech-Mind's Industrial 3D Camera Mech-Eye: Empowering Robotic Integrators
AI Driven Vision Inspection Automation for Engine Tappets
How AI helps this contract manufacturer to stand out on product quality
Increase manufacturing processes by 25% with AI, Opcenter and Retrocausal, a Siemens Partner
Cone Ice cream Inspection using Machine Vision
Basler AG: Innovation Leaders
How OSARO used Cognex to solve a tricky barcode reading challenge for Zenni Optical
📷 Making automated visual-inspection systems practical
Using supervised learning to train anomaly localization models has major drawbacks: compared to images of defect-free products, images of defective products are scarce, and labeling defective-product images is expensive. Consequently, our benchmarking framework doesn’t require any anomalous images in the training phase. Instead, from the defect-free examples, the model learns a distribution of typical image features.
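To make "learning a distribution of typical image features" concrete, the sketch below fits a per-dimension mean and standard deviation to defect-free feature vectors and scores new vectors by their largest z-score. This diagonal Gaussian is a deliberately simple stand-in, not one of the methods in the benchmark, and the two-dimensional feature vectors are made up. In practice the features would come from a pretrained network.

```python
import statistics

def fit_normal_model(features):
    """Estimate per-dimension mean and standard deviation from defect-free features."""
    columns = list(zip(*features))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) or 1e-6 for c in columns]   # avoid division by zero
    return means, stds

def anomaly_score(feature, model):
    """Largest absolute z-score across dimensions; higher means more unusual."""
    means, stds = model
    return max(abs(x - m) / s for x, m, s in zip(feature, means, stds))

# Hypothetical feature vectors extracted from defect-free images only.
normal = [[0.9, 0.1], [1.0, 0.2], [1.1, 0.1], [1.0, 0.0]]
model = fit_normal_model(normal)

typical_score = anomaly_score([1.0, 0.1], model)
defect_score = anomaly_score([1.0, 0.9], model)
```

No defective example is used to fit the model. A feature vector far from the learned distribution simply receives a higher score, and applying the same scoring per image patch would give a localization map.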
We have released our benchmark in the hope that other researchers will expand on it, to help bridge the gap between the impressive progress on anomaly localization in research and the challenges of real-world implementation.
🧠🦾 Google’s Robotic Transformer 2: More Than Meets the Eye
Google DeepMind’s Robotic Transformer 2 (RT2) is an evolution of vision language model (VLM) software. Trained on images from the web, RT2 software employs robotics datasets to manage low-level robotics control. Traditionally, VLMs have been used to combine inputs from both visual and natural language text datasets to accomplish more complex tasks. Of course, ChatGPT is at the front of this trend.
Google researchers identified a gap in how current VLMs were being applied in the robotic space. They note that current methods and approaches tend to focus on high-level robotic theory such as strategic state machine models. This leaves a void in the lower-level execution of robotic action, where the majority of control engineers execute work. Thus, Google is attempting to bring the power and benefits of VLMs down into the control engineers’ domain of programming robotics.
🛣️ America’s Bridges, Factories and Highways Are in Dire Need of Repairs. Bring in the Robots.
These days, Shell is able to keep the plant running, and keep repair personnel on the ground and at a safe distance as they operate wall-climbing robots that inspect things like steel holding tanks at millimeter resolution, says Steven Treviño, a robotics engineer at Shell. Using a variety of sensors, the robots can look for both corrosion and cracking. This helps the team shorten the list of things they have to take care of when a full shutdown occurs. The magnetic wall climbers Shell is using are made by a Pittsburgh-based startup called, appropriately, Gecko Robotics. After testing the Gecko robots at Geismar, Shell plans to expand their use to offshore facilities.
“There are hundreds of types of corrosion,” says Jake Loosararian, CEO of Gecko Robotics, “and we’ve been developing technology and software to analyze what kind of damage is happening.” Gecko began as a robotics company, but has since expanded into creating software to process the data its robots gather. The startup makes systems that are now used to track more than 60,000 assets across the globe, including power plants, pipelines, oil refineries, dams, U.S. Navy vessels and other military equipment.
When it comes to inspections, “often the data you need is literally in plain sight, it’s just hard to collect it,” says Bry, of Skydio.
AI Transformer Models Enable Machine Vision Object Detection
Machine vision is another key technology, and today AI and machine vision interact in a few ways. “First, machine vision output is fed to an AI engine to perform functions such as people counting, object recognition, etc., to make decisions,” said Arm’s Zyazin. “Second, AI is used to provide better quality images with AI-based de-noising, which then assists with decision-making. An example could be an automotive application where a combination of AI and machine vision can recognize a speed limit sign earlier and adjust the speed accordingly.”
“There are a few main directions for machine vision, including cloud computing to scale deep-learning solutions, automated ML architectures to improve the ML pipeline, transformer architectures that optimize computer vision (a superset of machine vision), and mobile devices incorporating computer vision technology on the edge,” Synopsys’ Andersen said.
🧠🦾 RT-2: New model translates vision and language into action
Robotic Transformer 2 (RT-2) is a novel vision-language-action (VLA) model that learns from both web and robotics data, and translates this knowledge into generalised instructions for robotic control.
High-capacity vision-language models (VLMs) are trained on web-scale datasets, making these systems remarkably good at recognising visual or language patterns and operating across different languages. But for robots to achieve a similar level of competency, they would need to collect robot data, first-hand, across every object, environment, task, and situation.
In our paper, we introduce Robotic Transformer 2 (RT-2), a novel vision-language-action (VLA) model that learns from both web and robotics data, and translates this knowledge into generalised instructions for robotic control, while retaining web-scale capabilities.
📐 UCLA Researchers Propose PhyCV: A Physics-Inspired Computer Vision Python Library
In the latest innovation, Jalali-Lab @ UCLA has developed a new Python library called PhyCV, which is the first Physics-based Computer vision Python library. This unique library uses algorithms based on the laws and equations of physics to analyze pictorial data. These algorithms imitate how light passes through several physical materials and are based on mathematical equations rather than a series of hand-crafted rules. The algorithms in PhyCV are built on the principles of a rapid data acquisition method called the photonic time stretch.
The three algorithms included in PhyCV are the Phase-Stretch Transform (PST), the Phase-Stretch Adaptive Gradient-Field Extractor (PAGE), and Vision Enhancement via Virtual diffraction and coherent Detection (VEViD).
Behind the A.I. tech making BMW vehicle assembly more efficient
Vision retrofits are the quick automation wins you should do now
A vision retrofit installs the latest AI-powered vision technologies to optimize the performance of a robotic cell. These technologies can significantly improve productivity by allowing robots to operate faster in a wider range of operating conditions and account for randomness intelligently. By upgrading existing 3D camera or fixtured setups with this AI-powered solution, companies can improve their operations and gain a competitive edge.
With AI-powered vision you only need to add an extrusion above the cell for 2D cameras (or just remove the existing 3D camera). With a focused investment under $50K, a few hours of downtime and a project time under six weeks, you could have a cell performing faster and more reliably.
Meta-Transformer: A Unified Framework for Multimodal Learning
Multimodal learning aims to build models that can process and relate information from multiple modalities. Despite years of development in this field, it still remains challenging to design a unified network for processing various modalities (e.g. natural language, 2D images, 3D point clouds, audio, video, time series, tabular data) due to the inherent gaps among them. In this work, we propose a framework, named Meta-Transformer, that leverages a frozen encoder to perform multimodal perception without any paired multimodal training data. In Meta-Transformer, the raw input data from various modalities are mapped into a shared token space, allowing a subsequent encoder with frozen parameters to extract high-level semantic features of the input data. Composed of three main components: a unified data tokenizer, a modality-shared encoder, and task-specific heads for downstream tasks, Meta-Transformer is the first framework to perform unified learning across 12 modalities with unpaired data. Experiments on different benchmarks reveal that Meta-Transformer can handle a wide range of tasks including fundamental perception (text, image, point cloud, audio, video), practical application (X-Ray, infrared, hyperspectral, and IMU), and data mining (graph, tabular, and time-series). Meta-Transformer indicates a promising future for developing unified multimodal intelligence with transformers.
🧠📹 What Sets Toshiba’s Ceramic Balls Apart? The AI Quality Inspection System
Bearings cannot be easily replaced once a vehicle is assembled. In the U.S., bearings used in EVs are expected to be of high enough quality to withstand long distances. One issue that can occur with EVs, however, is the “electric corrosion” of the bearings that mount the various vital parts of the vehicle onto the motor—a serious issue, as it can lead to the breakdown of the vehicle. High-performance bearings would drive the widespread use of EVs, and contribute to the push towards carbon neutrality. The electrical corrosion phenomenon had hampered these efforts, but not anymore—therein lies the beauty of Toshiba’s ceramic balls.
“Our ceramic balls go through slight changes about every year and a half due to changes in material and other factors. To keep up the accuracy of the quality inspections, we have to continually update the AI system itself. The MLOps system automates that process,” says Kobatake.
“We’ve been able to dramatically reduce the time spent on these inspections. Ceramic balls are expensive compared to their metal counterparts. They have so many different strengths, and yet they haven’t been able to replace the metal ones precisely because of this particular issue. If we’re able to reduce the cost through AI quality inspection, we’ll be able to lower the price of the products themselves,” says Yamada.
Apera AI & Mitsubishi Electric Automation Making Robotic Vision Simple
ImageBind: One Embedding Space To Bind Them All
We present ImageBind, an approach to learn a joint embedding across six different modalities - images, text, audio, depth, thermal, and IMU data. We show that all combinations of paired data are not necessary to train such a joint embedding, and only image-paired data is sufficient to bind the modalities together. ImageBind can leverage recent large scale vision-language models, and extends their zero-shot capabilities to new modalities just by using their natural pairing with images. It enables novel emergent applications ‘out-of-the-box’ including cross-modal retrieval, composing modalities with arithmetic, cross-modal detection and generation. The emergent capabilities improve with the strength of the image encoder and we set a new state-of-the-art on emergent zero-shot recognition tasks across modalities, outperforming specialist supervised models. Finally, we show strong few-shot recognition results outperforming prior work, and that ImageBind serves as a new way to evaluate vision models for visual and non-visual tasks.
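The retrieval and "arithmetic" ideas in the abstract can be illustrated with a toy sketch. It assumes every modality has already been mapped into one shared vector space; the three-dimensional embeddings and the names `image_gallery`, `audio_bark`, and `image_beach` are made up for illustration and do not come from the ImageBind model or its code.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity between two embeddings."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def retrieve(query, gallery):
    """Return the gallery key whose embedding is most similar to the query."""
    return max(gallery, key=lambda k: cosine(query, gallery[k]))

# Hypothetical 3-D embeddings; a real system would get these from trained encoders.
image_gallery = {
    "dog_on_beach": [0.9, 0.1, 0.8],
    "dog_indoors": [0.9, 0.1, 0.0],
    "waves": [0.0, 0.2, 0.9],
}
audio_bark = [1.0, 0.0, 0.1]   # pretend audio embedding of a dog bark
image_beach = [0.0, 0.1, 1.0]  # pretend image embedding of a beach

# Cross-modal retrieval: an audio query searched against an image gallery.
best_for_bark = retrieve(audio_bark, image_gallery)

# Composition by arithmetic: add two normalized embeddings, then retrieve.
combined = [a + b for a, b in zip(normalize(audio_bark), normalize(image_beach))]
best_for_combined = retrieve(combined, image_gallery)

print(best_for_bark, best_for_combined)
```

With these toy numbers the bark alone retrieves the indoor dog, while bark plus beach retrieves the dog on the beach.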
Improving Image Resolution At The Edge
AI vision for print quality inspection on bottles
Basler Lens selector: Match the right lens for your camera
How is 3D machine vision transforming manufacturing processes?
3D machine vision employs 3D cameras that provide robots with data and information pertaining to particular parts. These three-dimensional cameras can be installed at various locations to create 360-degree, multi-angle images for surface and volume inspection.
The topographical map results from reflected laser displacement. Taking images from two distinct angles makes it possible to recover the 3D data of the scene; the separation between the two perspectives in 3D space is then computed. Installed software performs substantial image processing and analysis. To evaluate an object with machine vision software, a PC-based machine vision system is hardwired to vision cameras and image capture boards.
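As a rough illustration of how two views yield 3D data, the sketch below uses the textbook relation for a rectified stereo pair: depth equals focal length times baseline divided by disparity. It assumes an ideal pinhole camera; the function names and all numeric values are hypothetical and not taken from any particular vision system.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in metres for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px

def back_project(u, v, depth_m, focal_px, cx, cy):
    """Convert a pixel (u, v) with known depth into camera-frame X, Y, Z in metres."""
    x = (u - cx) * depth_m / focal_px
    y = (v - cy) * depth_m / focal_px
    return x, y, depth_m

# Hypothetical setup: 1200 px focal length, cameras 10 cm apart,
# and a feature that shifts 60 px between the left and right images.
z = depth_from_disparity(focal_px=1200.0, baseline_m=0.10, disparity_px=60.0)
point = back_project(u=700.0, v=360.0, depth_m=z, focal_px=1200.0, cx=640.0, cy=360.0)
print(z, point)
```

Because depth is inversely proportional to disparity, distant points produce small pixel shifts, which is one reason depth resolution falls off with range.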
AI Driven Vision Inspection Automation for Bevel Gears
Edge Learning: AI for Industrial Machine Vision Made Easy
Plastic Bottles Defect Inspection Using Omron FH Vision System with AI
Artificial intelligence for stable processes in industry
Cerrion’s solution can be used in any manufacturing process where the process is visible. What makes it special is the combination of video technology, which can be integrated directly into existing processes, and state-of-the-art AI. This way it is possible to monitor new processes immediately without having to first import a lot of data into the system, as is often the case with conventional solutions. The AI-based computer vision technology learns what a process looks like in normal operation and can detect deviations in real time.
The AI-based software can be connected to a commercially available camera and instantly provides reliable process insights. The system helps to detect problems in the process at an early stage, i.e. before they arise and lead to failures or losses. The technology can be used for a wide range of applications. Cerrion has already gained experience in glass bottle production, plant and machine manufacturing, tool manufacturing, pharmaceutical packaging, and wood construction. The AI can detect and track problems in automated processes such as jams on the production line in real time. It can also analyse whether the assigned time for the manual assembly of a component is in line with planning or whether the defined process needs to be optimised. Hazards for employees and equipment can also be detected and eliminated at an early stage. This leads to more safety in operations.
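The idea of learning what normal operation looks like and flagging deviations can be sketched with a deliberately simple statistical baseline. This is not Cerrion's method: it assumes each video frame has already been reduced to a single number (a hypothetical "motion score"), and the class name, threshold, and sample values are illustrative only.

```python
import statistics

class NormalOperationModel:
    """Learns the typical range of a per-frame score and flags outliers."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold  # allowed deviation, in standard deviations
        self.mean = None
        self.std = None

    def fit(self, normal_values):
        """Estimate mean and spread from frames recorded during normal operation."""
        self.mean = statistics.fmean(normal_values)
        self.std = statistics.stdev(normal_values)

    def is_deviation(self, value):
        """True when a new frame's score falls outside the learned normal band."""
        return abs(value - self.mean) > self.threshold * self.std

# Hypothetical motion scores from a line running normally.
model = NormalOperationModel(threshold=3.0)
model.fit([10.1, 9.9, 10.0, 10.2, 9.8])

normal_frame = model.is_deviation(10.3)  # small wobble, inside the band
jam_frame = model.is_deviation(14.0)     # large jump, e.g. a jam on the line
print(normal_frame, jam_frame)
```

A production system would learn far richer visual features than one number per frame, but the fit-on-normal, flag-the-rest structure is the same.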
Zebra Technologies to Acquire Matrox Imaging, Broadening Its Portfolio of Machine Vision Solutions
Zebra Technologies (NASDAQ: ZBRA), an innovator at the front line of business with solutions and partners that deliver a performance edge, announced it intends to acquire Matrox Imaging (Matrox Electronic Systems Ltd.), a proven developer of advanced machine vision components and systems. This acquisition further expands Zebra’s offerings in the fast-growing automation and vision technology solution space. Last year, Zebra introduced its fixed industrial scanning and machine vision portfolio and acquired Adaptive Vision and Fetch Robotics.
Matrox Imaging offers platform-independent software, software development kits (SDKs), smart cameras, 3D sensors, vision controllers, input/output (I/O) cards, and frame grabbers which are used to capture, inspect, assess, and record data from industrial vision systems in factory automation, electronics and pharmaceutical packaging, semiconductor inspection, and more. These capabilities enable industrial customers to lower their cost to manufacture products, improve product quality, and increase compliance and yield.
The acquisition of Matrox Imaging expands the portfolio of machine vision products, software and services Zebra can offer customers to help them thrive in the on-demand economy that is constrained by both labor shortages and limited supply of upstream goods and materials. Matrox Imaging’s solutions complement Zebra’s recently launched fixed industrial scanning and machine vision portfolio as well as significantly augment Zebra’s growing expertise in software, machine learning and deep learning.
How machine vision works in RIBE Anlagentechnik’s camera-monitored assembly facility
The German company RIBE Anlagentechnik develops innovative assembly systems, including inspection systems, for bumpers. SICK’s machine vision helps to identify the individual components, and it also monitors each work operation. This particular system concept could prove revolutionary for other manufacturers and suppliers as well.
As the level of individualization in production areas increases, so does the importance of special-purpose systems with innovative potential. RIBE Anlagentechnik specializes in delivering added value to its end customers. The company has demonstrated its specific strengths in technologies associated with assembly and inspection systems for vehicle interiors/exteriors and related components. Managing Director Dietmar Heckel regards the cobot and robot technologies with innovative Industry 4.0 solutions and digitalization concepts not only as a supporting pillar of RIBE Anlagentechnik, but also as a cross-sectoral growth field.
Where Four-Legged Robot Dogs Are Finding Work
High-Performance Machine Vision: Versatile lighting for subtle surface defects
Fabs Drive Deeper Into Machine Learning
For the past couple decades, semiconductor manufacturers have relied on computer vision, which is one of the earliest applications of machine learning in semiconductor manufacturing. Referred to as Automated Optical Inspection (AOI), these systems use signal processing algorithms to identify macro and micro physical deformations.
Defect detection provides a feedback loop for fab processing steps. Wafer test results produce bin maps (good or bad die), which also can be analyzed as images. Their data granularity is significantly larger than the pixelated data from an optical inspection tool. Yet test results from wafer maps can match the splatters generated during lithography and scratches produced from handling that AOI systems can miss. Thus, wafer test maps give useful feedback to the fab.
3D Vision Technology Advances to Keep Pace With Bin Picking Challenges
When a bin has one type of object with a fixed shape, bin picking is straightforward, as CAD models can easily recognize and localize individual items. But randomly positioned objects can overlap or become entangled, presenting one of the greatest challenges in bin picking. Identifying objects with varying shapes, sizes, colors, and materials poses an even larger challenge, but by deploying deep learning algorithms, it is possible to find and match objects that do not conform to one single geometrical description but belong to a general class defined by examples, according to Andrea Pufflerova, Public Relations Specialist at Photoneo.
“A well-trained convolutional neural network (CNN) can recognize and classify mixed and new types of objects that it has never come across before,”
Vision Cameras Inspect Disk Drive Assemblies
Once manufactured, an HDD is carefully fitted and sealed in a metal or plastic case. The case ensures that all drive components are perfectly secured in place and their mechanics work well over the lifetime of the product. It also protects the sensitive disks from dust, humidity, shock and vibration.
An HDD case must be defect-free and have perfectly machined thread holes to perform these functions, according to Somporn Kornwong, a manager at Flexon. In 2019 his company developed Visual Machine Inspection (VMI) for a manufacturer so it can quickly and thoroughly inspect each case it produces.
Simplify Deep Learning Systems with Optimized Machine Vision Lighting
Deep learning cannot compensate for or replace quality lighting. This experiment’s results would hold true over a wide variety of machine vision applications. Poor lighting configurations will result in poor feature extraction and increased defect detection confusion (false positives).
Several rigorous studies show that classification accuracy declines with image quality distortions such as blur and noise. In general, while deep neural networks perform better than or on par with humans on quality images, a network’s performance is much lower than a human’s when using distorted images. Lighting improves input data, which greatly increases the ability of deep neural network systems to compare and classify images for machine vision applications. Smart lighting (geometry, pattern, wavelength, filters, and more) will continue to drive and produce the best results for machine vision applications with traditional or deep learning systems.
Machine Vision System for Detection of Edge Cracks in Packaging Industry
Perceiver: General Perception with Iterative Attention
Biological systems perceive the world by simultaneously processing high-dimensional inputs from modalities as diverse as vision, audition, touch, proprioception, etc. The perception models used in deep learning on the other hand are designed for individual modalities, often relying on domain-specific assumptions such as the local grid structures exploited by virtually all existing vision models. These priors introduce helpful inductive biases, but also lock models to individual modalities. In this paper we introduce the Perceiver - a model that builds upon Transformers and hence makes few architectural assumptions about the relationship between its inputs, but that also scales to hundreds of thousands of inputs, like ConvNets. The model leverages an asymmetric attention mechanism to iteratively distill inputs into a tight latent bottleneck, allowing it to scale to handle very large inputs. We show that this architecture is competitive with or outperforms strong, specialized models on classification tasks across various modalities: images, point clouds, audio, video, and video+audio. The Perceiver obtains performance comparable to ResNet-50 and ViT on ImageNet without 2D convolutions by directly attending to 50,000 pixels. It is also competitive in all modalities in AudioSet.
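The asymmetric attention described in the abstract can be sketched in a few lines. A small set of latent vectors acts as queries over a much larger input array, so the cost grows with latents × inputs rather than inputs × inputs. This is a simplified illustration: it leaves out the learned query/key/value projections and the latent self-attention layers of the actual model, and all values are made up.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attend(latents, inputs):
    """Each latent (query) becomes a softmax-weighted average of the inputs."""
    dim = len(latents[0])
    out = []
    for q in latents:
        scores = [sum(a * b for a, b in zip(q, x)) / math.sqrt(dim) for x in inputs]
        weights = softmax(scores)
        out.append([sum(w * x[d] for w, x in zip(weights, inputs)) for d in range(dim)])
    return out

# Six hypothetical input elements (think pixels or audio samples), two latents.
inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 0.0], [2.0, 1.0]]
latents = [[1.0, 0.0], [0.0, 1.0]]

# Iterative distillation: the latents repeatedly re-read the same inputs.
distilled = latents
for _ in range(3):
    distilled = cross_attend(distilled, inputs)
print(distilled)
```

However many inputs are supplied, the output keeps the shape of the latent array, which is the bottleneck that lets the approach scale to very large inputs.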
Tilling AI: Startup Digs into Autonomous Electric Tractors for Organics
Ztractor offers tractors that can be configured to work on 135 different types of crops. They rely on the NVIDIA Jetson edge AI platform for computer vision tasks to help farms improve plant conditions, increase crop yields and achieve higher efficiency.
AI Vision for Monitoring Applications in Manufacturing and Industrial Environments
In traditional industrial and manufacturing environments, monitoring worker safety, enhancing operator efficiency, and improving quality assurance were physical tasks. Today, AI-enabled machine vision technologies replace many of these inefficient, labor-intensive operations for greater reliability, safety, and efficiency. This article explores how, by deploying AI smart cameras, further performance improvements are possible since the data used to empower AI machine vision comes from the camera itself.
Zebra Technologies Introduces Intuitive, Flexible Industrial Machine Vision and Fixed Scanning Solutions
Zebra Technologies Corporation (NASDAQ: ZBRA), an innovator at the front line of business with solutions and partners that deliver a performance edge, announced it has entered the fixed industrial scanning (FIS) and machine vision (MV) markets with a new portfolio of solutions that enable track and trace capabilities and quality inspection of manufacturing work in process. Zebra’s suite of machine vision smart cameras and fixed industrial scanners is unlocked by Zebra Aurora™, a unified software platform that can easily set up, deploy and run both cameras and scanners, meeting businesses’ need for simplicity, speed, productivity and efficiency.
As part of its move into the FIS and MV markets, Zebra has acquired Adaptive Vision, a leading provider of graphical MV software for manufacturing and other industries. Adaptive Vision’s comprehensive set of tools and algorithms help power-users easily create complex MV applications while assisting customers who are relatively new to MV produce full-featured applications without coding. The addition of Adaptive Vision’s MV software, deep learning expertise and team of machine vision engineers will provide manufacturers greater visibility into the status and condition of their goods and assets through visual-based sensing and analytics capabilities.
The fixed industrial scanners improve track and trace capabilities throughout the supply chain with flawless decoding of every part and package moving through production, storage and fulfillment. Capable of reading 1D/2D barcodes, direct part marks (DPM) and optical character recognition (OCR) text, Zebra’s fixed industrial scanners help improve productivity and automate the movement of goods, enhancing the efficiency of warehouse, shipping and returns processes. Zebra’s MV smart cameras are ideal for automating quality inspections in a variety of discrete manufacturing processes, reducing defects as well as validating assembly and tracking information to improve productivity and quality.
Tools Move up the Value Chain to Take the Mystery Out of Vision AI
Intel DevCloud for the Edge and Edge Impulse offer cloud-based platforms that take most of the pain points away with easy access to the latest tools and software. Meanwhile, Xilinx and others have started offering complete systems-on-module with production-ready applications that can be deployed with tools at a higher level of abstraction, removing the need for some of the more specialist skills.
How the USPS Is Finding Lost Packages More Quickly Using AI Technology from Nvidia
In one of its latest technology innovations, the USPS got AI help from Nvidia to fix a problem that has long confounded existing processes – how to better track packages that get lost within the USPS system so they can be found in hours instead of in several days. In the past, it took eight to 10 people several days to locate and recover lost packages within USPS facilities. Now it is done by one or two people in a couple hours using AI.
Hyperspectral imaging aids precision farming
Remote sensing techniques have evolved rapidly with the spread of multispectral cameras. Hyperspectral imaging is the capture and processing of an image at a very high number of wavelengths. While multispectral imaging evaluates a scene with three or four colors (red, green, blue and near infrared), hyperspectral imaging splits the image into tens or hundreds of colors. By using the technique of spectroscopy, which identifies materials based on how light behaves when it hits a subject, hyperspectral imaging obtains far more spectral data for each pixel in the image of a scene.
Unlike radiography, hyperspectral imaging is a non-destructive, non-contact technology that can be used without damaging the object being analyzed. For example, a drone with a hyperspectral camera can detect plant diseases, weeds, soil erosion problems, and can also estimate crop yields.
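One common way to use a per-pixel spectrum for material identification is the spectral angle: treat each spectrum as a vector and pick the reference whose direction is closest. The sketch below is a minimal illustration of that idea, not the method from the article; the five-band reflectance values and the reference names are hypothetical.

```python
import math

def spectral_angle(a, b):
    """Angle in radians between two spectra; small means similar material."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    cos = max(-1.0, min(1.0, dot / (norm_a * norm_b)))  # clamp rounding error
    return math.acos(cos)

def classify(pixel, references):
    """Return the reference label with the smallest spectral angle to the pixel."""
    return min(references, key=lambda name: spectral_angle(pixel, references[name]))

# Hypothetical reflectance per band, ordered from visible to near infrared.
references = {
    "healthy_leaf": [0.05, 0.10, 0.04, 0.50, 0.55],
    "dry_soil": [0.15, 0.20, 0.25, 0.30, 0.35],
}
pixel = [0.06, 0.12, 0.05, 0.45, 0.52]

label = classify(pixel, references)
print(label)
```

Because the angle ignores vector length, the same pixel under brighter or dimmer illumination maps to the same label, which is useful for drone imagery taken in changing light.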
John Deere and Audi Apply Intel’s AI Technology
Identifying defects in welds is a common quality control process in manufacturing. To make these inspections more accurate, John Deere is applying computer vision, coupled with Intel’s AI technology, to automatically spot common defects in the automated welding process used in its manufacturing facilities.
At Audi, automated welding applications range from spot welding to riveting. The widespread automation in Audi factories is part of the company’s goal of creating Industrie 4.0-level smart factories. A key aspect of this goal involves Audi’s recognition that creating customized hardware and software to handle individual use cases is not preferable. Instead, the company focuses on developing scalable and flexible platforms that allow them to more broadly apply advanced digital capabilities such as data analytics, machine learning, and edge computing.
F-16s Are Now Getting Washed By Robots
The Wilder Systems solution actually leverages technology previously developed for robotic drilling in commercial aircraft manufacturing and converts these components and subsystems into an automated washing system. The main changes have involved the development and addition of robot end-effectors to provide the water and soap spray, waterproofing of the robots themselves, and a robot motion path, which is dependent on the type of aircraft to be cleaned.
Machine learning optimizes real-time inspection of instant noodle packaging
During the production process, various factors can lead to the seasoning sachets slipping between two noodle blocks and being cut open by the cutting machine, or being packed separately in two packets side by side. Such defective products would result in consumer complaints and damage to the company’s reputation, so delivery of such products to dealers should be reduced as far as possible. Since the machine type upgraded by Tianjin FengYu already had a very low error rate, another aspect of quality control is critical: it must be ensured that only the defective products, and not the defect-free ones, are reliably sorted out.
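The two failure modes in this passage, good packets wrongly rejected and defective packets that slip through, can be tracked as separate rates. The sketch below shows one way to compute them from labelled inspection outcomes; the function name and the sample data are hypothetical and not from the Tianjin FengYu system.

```python
def inspection_rates(results):
    """Compute (false_reject_rate, escape_rate) from inspection outcomes.

    results: iterable of (is_defective, was_rejected) boolean pairs.
    """
    good = [rejected for defective, rejected in results if not defective]
    bad = [rejected for defective, rejected in results if defective]
    # Share of defect-free products that were sorted out anyway.
    false_reject_rate = sum(good) / len(good) if good else 0.0
    # Share of defective products that were not sorted out.
    escape_rate = sum(not r for r in bad) / len(bad) if bad else 0.0
    return false_reject_rate, escape_rate

# Hypothetical batch: 8 good packets (1 wrongly rejected), 2 defective (both caught).
batch = [(False, False)] * 7 + [(False, True)] + [(True, True)] * 2
false_reject_rate, escape_rate = inspection_rates(batch)
print(false_reject_rate, escape_rate)
```

When defects are already rare, the false reject rate tends to dominate the cost of inspection, which is why the passage stresses sorting out only the defective products.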
Tractor Maker John Deere Using AI on Assembly Lines to Discover and Fix Hidden Defective Welds
John Deere performs gas metal arc welding at 52 factories where its machines are built around the world, and it has proven difficult to find defects in automated welds using manual inspections, according to the company.
That’s where the successful pilot program between Intel and John Deere has been making a difference, using AI and computer vision from Intel to “see” welding issues and get things back on track to keep John Deere’s pilot assembly line humming along.
Harvesting AI: Startup’s Weed Recognition for Herbicides Grows Yield for Farmers
In 2016, the former dorm-mates at École Nationale Supérieure d’Arts et Métiers, in Paris, founded Bilberry. The company today develops weed recognition powered by the NVIDIA Jetson edge AI platform for precision application of herbicides at corn and wheat farms, offering as much as a 92 percent reduction in herbicide usage.
Driven by advances in AI and pressures on farmers to reduce their use of herbicides, weed recognition is starting to see its day in the sun.
Analysing fruit data in the supply chain has never been more important for business efficiency
Fruit and production data can be used in ways that it has never been done before to improve a company’s efficiency and boost profits, according to global packhouse equipment and automation supplier Tomra Food.
He added that there are several different useful data types at play in a packhouse: production and traceability level data, performance level data, quality data and auditing data. This data can be used to optimise the supply chain and to decide what needs to be done next. But consumer trends will constantly change the requirements of automation.
One-Shot Recognition of Manufacturing Defects in Steel Surfaces
Quality control is an essential process in manufacturing to make the product defect-free as well as to meet customer needs. The automation of this process is important to maintain high quality along with the high manufacturing throughput. With recent developments in deep learning and computer vision technologies, it has become possible to detect various features from the images with near-human accuracy. However, many of these approaches are data intensive. Training and deployment of such a system on manufacturing floors may become expensive and time-consuming. The need for large amounts of training data is one of the limitations of the applicability of these approaches in real-world manufacturing systems. In this work, we propose the application of a Siamese convolutional neural network to do one-shot recognition for such a task. Our results demonstrate how one-shot learning can be used in quality control of steel by identification of defects on the steel surface. This method can significantly reduce the requirements of training data and can also be run in real-time.
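The one-shot setup can be sketched as follows: the same embedding function is applied to both images (the "Siamese" part), and a query is assigned the class of the single closest support example. The embedding here is a hand-written stand-in for a trained convolutional branch, and the 4×4 images and class names are hypothetical, so this shows the inference structure only, not the paper's network or results.

```python
import math

def embed(img):
    """Stand-in for the shared CNN branch: mean intensity plus mean absolute
    horizontal and vertical gradients of a 2-D list of pixel values."""
    h, w = len(img), len(img[0])
    mean = sum(map(sum, img)) / (h * w)
    gx = sum(abs(img[r][c + 1] - img[r][c]) for r in range(h) for c in range(w - 1)) / (h * (w - 1))
    gy = sum(abs(img[r + 1][c] - img[r][c]) for r in range(h - 1) for c in range(w)) / ((h - 1) * w)
    return [mean, gx, gy]

def distance(img_a, img_b):
    """Euclidean distance between embeddings produced by the same function."""
    ea, eb = embed(img_a), embed(img_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ea, eb)))

def one_shot_classify(query, support):
    """Pick the class whose single support example is closest to the query."""
    return min(support, key=lambda label: distance(query, support[label]))

# One hypothetical example per defect class.
support = {
    "vertical_scratch": [[0, 1, 0, 0] for _ in range(4)],
    "horizontal_scratch": [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]],
}
# Query: a vertical scratch in a different column than the support example.
query = [[0, 0, 1, 0] for _ in range(4)]

predicted = one_shot_classify(query, support)
print(predicted)
```

Adding a new defect class then only requires one labelled example in `support`, rather than retraining on a large dataset.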