Reinforcement Learning (RL) | Exponential Industry

Fourier Trains Humanoid Robots for Real-World Roles Using NVIDIA Isaac Gym

📅 Date: November 6, 2024

🔖 Topics: Partnership, Humanoid, Reinforcement learning

🏢 Organizations: Fourier, NVIDIA

Fourier, a Shanghai-based robotics company, is doing the heavy lifting by developing advanced humanoid robots that can be integrated into real-world applications where precision and agility are critical. The company announced the expansion of its GRx humanoid robot series with the launch of GR-2 in late September. Building on the previous-generation GR-1, the world’s first mass-produced humanoid robot, GR-2 features an upgraded hardware design, greater adaptability, advanced dexterity and a humanlike range of motion.

To develop and test GR-2, the Fourier team turned to NVIDIA Isaac Gym (now deprecated) for reinforcement learning. They are currently porting their workflows to the recently launched NVIDIA Isaac Lab, an open-source modular framework for robot learning designed to simplify how robots adapt to new skills.

While training GR-2 for the floor-to-stand maneuver, Fourier simulated the physical demands required for completing tasks at different levels of elevation. By replicating the GR-2 model, they tested how it performs under various settings and completed 3,000 iterations in around 15 hours, a notable reduction compared to traditional training methods. When transferred directly to GR-2’s physical controls, the model’s action tensors achieved an 89% success rate.

Crossing the Sim2Real Gap With NVIDIA Isaac Lab

📅 Date: October 31, 2024

🔖 Topics: Partnership, Sim2Real, Reinforcement Learning, Humanoid

🏢 Organizations: Agility Robotics, NVIDIA

We’ve been able to demonstrate this in areas like step-recovery, where physics are particularly hard to model. In situations where Digit loses its footing, it’s often a result of an environment where we don’t have a good model of what’s going on – there might be something pushing on or caught on Digit, or its feet might be slipping on the ground in an unexpected way. Digit might not even be able to tell which issue it’s having. But we can train a controller to be robust to many of these disturbances with reinforcement-learning, training it on many possible ways that the robot might fall until it comes up with a controller that works well in many situations. In the following chart, you can see how big of a difference that training can make.

Early last year, we started using NVIDIA Isaac Lab to train these types of models. Working with NVIDIA, we were able to make some basic policies that allowed Digit to walk around.

The net result has been a huge step forward in our RL software stack. Instead of a pile of stacked reward functions over everything from “stop wiggling your foot” to “stand up straighter”, we have a handful of rewards around things like energy consumption and symmetry that are not only simpler, but follow our basic intuitions about how Digit should move. Investing the time to understand why the simulation differed has taught us a lot more about why we want Digit to move a certain way in the first place. And most importantly, coupled with fast NVIDIA Isaac Sim, a reference application built on NVIDIA Omniverse for simulating an testing AI-driven robots, it’s enabled us to explore the impact of different physical characteristics that we might want in future generations of Digit.

Multi-agent reinforcement learning for integrated manufacturing system-process control

📅 Date: September 1, 2024

✍️ Authors: Chen Li, Qing Chang, Hua-Tzu Fan

🔖 Topics: Reinforcement Learning, Process Control

🏢 Organizations: University of Virginia, General Motors

The increasing complexity, adaptability, and interconnections inherent in modern manufacturing systems have spurred a demand for integrated methodologies to boost productivity, improve quality, and streamline operations across the entire system. This paper introduces a holistic system-process modeling and control approach, utilizing a Multi-Agent Reinforcement Learning (MARL) based integrated control scheme to optimize system yields. The key innovation of this work lies in integrating the theoretical development of manufacturing system-process property understanding with enhanced MARL-based control strategies, thereby improving system dynamics comprehension. This, in turn, enhances informed decision-making and contributes to overall efficiency improvements. In addition, we present two innovative MARL algorithms: the credit-assigned multi-agent actor-attention-critic (C-MAAC) and the physics-guided multi-agent actor-attention-critic (P-MAAC), each designed to capture the individual contributions of agents within the system. C-MAAC extracts global information via parallel-trained attention blocks, whereas P-MAAC embeds system dynamics through permanent production loss (PPL) attribution. Numerical experiments underscore the efficacy of our MARL-based control scheme, particularly highlighting the superior training and execution performance of C-MAAC and P-MAAC. Notably, P-MAAC achieves rapid convergence and exhibits remarkable robustness against environmental variations, validating the proposed approach’s practical relevance and effectiveness.

Read more at Journal of Manufacturing Systems

Manufacturing process optimization for real-time quality control in multi-regime conditions: Tire tread production use case

📅 Date: August 14, 2024

✍️ Authors: Katarina Stanković, Dea Jelić, Nikola Tomašević, Aleksandra Krstić

🔖 Topics: Process Control, Suffix Tree Search, Reinforcement Learning

🏭 Vertical: Plastics and Rubber

🏢 Organizations: University of Belgrade

The high-stake nature of most manufacturing processes empowers the importance of real-time quality control and assurance. In the event of a failure in production, a decision-making process can be time-consuming for the human and prevent timely actions. The agility can be boosted with a decision-support system based on artificial intelligence. Particularly, multi-objective process optimization can be employed to select the optimal control settings in real-time, and thus enhance relevant key performance indicators, concurrently.

Based on process and quality parameters being streamed from the production plant in real-time, the optimizer can act in timely critical and quality-threatening situations and generate immediate corrective actions. The multi-regime operation of the plant and design space dimensionality can impact the convergence rate and add to execution time. Therefore, production regimes recognition and greedy search of suffix tree-based models of the process have been engaged, aiding in a better-focused and faster space search at an early phase of the algorithm run.

Beyond simply reviewing the outputs, the user can leave feedback, which is utilized by the optimizer’s reinforcement learning mechanisms. Validated in this real-world scenario, the solution produced a rise from 81.83% to 90.91% in the tread quality.

Read more at Journal of Manufacturing Systems

Precision Home Robotics w/Real-to-Sim-to-Real

Automating Circuit Board Design Using Reinforcement Learning

Explainable generative design in manufacturing for reinforcement learning based factory layout planning

📅 Date: November 28, 2023

✍️ Authors: Matthias Klar, Patrick Ruediger, Maik Schuermann, Goren Tobias Gören, Moritz Glatt, Bahram Ravani, Jan C. Aurich

🔖 Topics: Generative AI, Generative Design, Facility Design, Reinforcement Learning

🏢 Organizations: RPTU Kaiserslautern

Generative design can be an effective approach to generate optimized factory layouts. One evolving topic in this field is the use of reinforcement learning (RL)-based approaches. Existing research has focused on the utilization of the approach without providing additional insights into the learned metrics and the derived policy. This information, however, is valuable from a layout planning perspective since the planner needs to ensure the trustworthiness and comprehensibility of the results. Furthermore, a deeper understanding of the learned policy and its influencing factors can help improve the manual planning process that follows as well as the acceptance of the results. These gaps in the existing approaches can be addressed by methods categorized as explainable artificial intelligence methods which have to be aligned with the properties of the problem and its audience. Consequently, this paper presents a method that will increase the trust in layouts generated by the RL approach. The method uses policy summarization and perturbation together with the state value evaluation. The method also uses explainable generative design for analyzing interrelationships between state values and actions at a feature level. The result is that the method identifies whether the RL approach learns the problem characteristics or if the solution is a result of random behavior. Furthermore, the method can be used to ensure that the reward function is aligned with the overall optimization goal and supports the planner in further detailed planning tasks by providing insights about the problem-defining interdependencies. The applicability of the proposed method is validated based on an industrial application scenario considering a layout planning case of 43 functional units. The results show that the method allows evaluation of the trustworthiness of the generated results by preventing randomly generated solutions from being considered in a detailed manual planning step. The paper concludes with a discussion of the results and a presentation of future research directions which also includes the transfer potentials of the proposed method to other application fields in RL-based generative design.

Read more at Journal of Manufacturing Systems

Eureka! NVIDIA Research Breakthrough Puts New Spin on Robot Learning

📅 Date: October 20, 2023

✍️ Author: Angie Lee

🔖 Topics: Generative AI, Large Language Model, Industrial Robot, Reinforcement Learning

🏢 Organizations: NVIDIA

A new AI agent developed by NVIDIA Research that can teach robots complex skills has trained a robotic hand to perform rapid pen-spinning tricks — for the first time as well as a human can. The Eureka research, published today, includes a paper and the project’s AI algorithms, which developers can experiment with using NVIDIA Isaac Gym, a physics simulation reference application for reinforcement learning research. Isaac Gym is built on NVIDIA Omniverse, a development platform for building 3D tools and applications based on the OpenUSD framework. Eureka itself is powered by the GPT-4 large language model.

🧠🎛️ Multi-objective reinforcement learning in process control: A goal-oriented approach with adaptive thresholds

📅 Date: August 30, 2023

✍️ Authors: Dazi Li, Wentao Gu, Tianheng Song

🔖 Topics: Reinforcement Learning, Fermentation

🏢 Organizations: Beijing University of Chemical Technology, Lenovo

In practical control problems with multiple conflicting objectives, multi-objective optimization (MOO) problems must be simultaneously addressed. To tackle these challenges, scholars have extensively studied multi-objective reinforcement learning (MORL) in recent years. However, due to the complexity of the system and the difficulty in determining preferences between objectives, complex continuous control processes involving MOO problems still require further research. In this study, an innovative goal-oriented MORL algorithm is proposed. The agent is better guided for optimization through adaptive thresholds and goal selection strategy. Additionally, the reward function is refined based on the chosen objective. To validate the approach, a comprehensive environment for the fermentation process is designed. Experimental results show that our proposed algorithm surpasses other benchmark algorithms in most performance metrics. Moreover, the Pareto solution set found by our algorithm is closer to the true Pareto frontier of fermentation problems.

Read more at Journal of Process Control

🦾 Transferring Industrial Robot Assembly Tasks from Simulation to Reality

📅 Date: May 26, 2023

✍️ Authors: Bingjie Tang, Yashraj Narang

🔖 Topics: Industrial Robot, Simulation, Reinforcement Learning

🏢 Organizations: NVIDIA, Franka Emika

By lessening the complexity of the hardware architecture, we can significantly increase the capabilities and ways of using the equipment that makes it financially efficient even for low-volume tasks. Moreover, the further development of the solution can be mostly in the software part, which is easier, faster and cheaper than hardware R&D. Having chipset architecture allows us to start using AI algorithms - a huge prospective. To use RL for challenging assembly tasks and address the reality gap, we developed IndustReal. IndustReal is a set of algorithms, systems, and tools for robots to solve assembly tasks in simulation and transfer these capabilities to the real world.

We introduce the simulation-aware policy update (SAPU) that provides the simulated robot with knowledge of when simulation predictions are reliable or unreliable. Specifically, in SAPU, we implement a GPU-based module in NVIDIA Warp that checks for interpenetrations as the robot is learning how to assemble parts using RL.

We introduce a signed distance field (SDF) reward to measure how closely simulated parts are aligned during the assembly process. An SDF is a mathematical function that can take points on one object and compute the shortest distances to the surface of another object. It provides a natural and general way to describe alignment between parts, even when they are highly symmetric or asymmetric.

We also propose a policy-level action integrator (PLAI), a simple algorithm that reduces steady-state (that is, long-term) errors when deploying a learned skill on a real-world robot. We apply the incremental adjustments to the previous instantaneous target pose to produce the new instantaneous target pose. Mathematically (akin to the integral term of a classical PID controller), this strategy generates an instantaneous target pose that is the sum of the initial pose and the actions generated by the robot over time. This technique can minimize errors between the robot’s final pose and its final target pose, even in the presence of physical complexities.

🧠 Data-Driven Wind Farm Control via Multiplayer Deep Reinforcement Learning

📅 Date: May 1, 2023

✍️ Authors: Hongyang Dong, Xiaowei Zhao

🔖 Topics: Machine Learning, Reinforcement Learning

🏢 Organizations: University of Warwick

This brief proposes a novel data-driven control scheme to maximize the total power output of wind farms subject to strong aerodynamic interactions among wind turbines. The proposed method is model-free and has strong robustness, adaptability, and applicability. Particularly, distinct from the state-of-the-art data-driven wind farm control methods that commonly use the steady-state or time-averaged data (such as turbines’ power outputs under steady wind conditions or from steady-state models) to carry out learning, the proposed method directly mines in-depth the time-series data measured at turbine rotors under time-varying wind conditions to achieve farm-level power maximization. The control scheme is built on a novel multiplayer deep reinforcement learning method (MPDRL), in which a special critic–actor–distractor structure, along with deep neural networks (DNNs), is designed to handle the stochastic feature of wind speeds and learn optimal control policies subject to a user-defined performance metric. The effectiveness, robustness, and scalability of the proposed MPDRL-based wind farm control method are tested by prototypical case studies with a dynamic wind farm simulator (WFSim). Compared with the commonly used greedy strategy, the proposed method leads to clear increases in farm-level power generation in case studies.

🦾♻️ Robotic deep RL at scale: Sorting waste and recyclables with a fleet of robots

📅 Date: April 13, 2023

✍️ Authors: Sergey Levine, Alexander Herzog

🔖 Topics: Recycling, Robot Picking, Reinforcement Learning

🏢 Organizations: Google

In “Deep RL at Scale: Sorting Waste in Office Buildings with a Fleet of Mobile Manipulators”, we discuss how we studied this problem through a recent large-scale experiment, where we deployed a fleet of 23 RL-enabled robots over two years in Google office buildings to sort waste and recycling. Our robotic system combines scalable deep RL from real-world data with bootstrapping from training in simulation and auxiliary object perception inputs to boost generalization, while retaining the benefits of end-to-end training, which we validate with 4,800 evaluation trials across 240 waste station configurations.

A new intelligent fault diagnosis framework for rotating machinery based on deep transfer reinforcement learning

📅 Date: March 10, 2023

✍️ Authors: Daoguang Yang, Hamid Reza Karimi, Marek Pawelczyk

🔖 Topics: Bearing, Reinforcement Learning, Machine Health, Convolutional Neural Network

🏢 Organizations: Politecnico di Milano, Silesian University of Technology

The advancement of artificial intelligence algorithms has gained growing interest in identifying the fault types in rotary machines, which is a high-efficiency but not a human-like module. Hence, in order to build a human-like fault identification module that could learn knowledge from the environment, in this paper, a deep reinforcement learning framework is proposed to provide an end-to-end training mode and a human-like learning process based on an improved Double Deep Q Network. In addition, to improve the convergence properties of the Deep Reinforcement Learning algorithm, the parameters of the former layers of the convolutional neural networks are transferred from a convolutional auto-encoder under an unsupervised learning process. The experiment results show that the proposed framework could efficiently extract the fault features from raw time-domain data and have higher accuracy than other deep learning models with balanced samples and better performance with imbalanced samples.

Read more at Control Engineering Practice

AI and the chocolate factory

📅 Date: February 15, 2023

✍️ Author: Aenne Barnard

🔖 Topics: Autonomous Production, Reinforcement Learning

🏭 Vertical: Food

🏢 Organizations: Siemens

“After about 72 hours of training with the digital twin (on a standard computer; about 24 hours on computer clusters in the cloud), the AI is ready to control the real machine. That’s definitely much faster than humans developing these control algorithms,” Bischoff says. Using reinforcement learning, the AI has developed a solution strategy in which all the chocolate bars on the front conveyor belts are transported on as quickly as possible and the exact speed is only controlled on the last conveyor belt - is interestingly quite different from that of a conventional control system.

The researchers led by Martin Bischoff were able to make their approach even more practical by compressing and compiling the trained control models in such a way that they run cycle-synchronously on the Siemens Simatic controllers in real time. Thomas Menzel, who is responsible for the department Digital Machines and Innovation within the business segment Production Machines, sees great potential in the methodology of letting AI learn complex control tasks independently on the digital twin: “Under the name AI Motion Trainer, this method is now helping several co-creation partners to develop application-specific optimized controls in a much shorter time. Production machines are now no longer limited to tasks for which a PLC control program has already been developed but can realize all tasks that can be learned by AI. The integration with our SIMATIC portfolio makes the use of this technology particularly industry-grade.”

Table Tennis: A Research Platform for Agile Robotics

📅 Date: October 18, 2022

✍️ Authors: Avi Singh, Laura Graesser

🔖 Topics: Reinforcement Learning

🏢 Organizations: Google

Robot learning has been applied to a wide range of challenging real world tasks, including dexterous manipulation, legged locomotion, and grasping. It is less common to see robot learning applied to dynamic, high-acceleration tasks requiring tight-loop human-robot interactions, such as table tennis. There are two complementary properties of the table tennis task that make it interesting for robotic learning research. First, the task requires both speed and precision, which puts significant demands on a learning algorithm. At the same time, the problem is highly-structured (with a fixed, predictable environment) and naturally multi-agent (the robot can play with humans or another robot), making it a desirable testbed to investigate questions about human-robot interaction and reinforcement learning. These properties have led to several research groups developing table tennis research platforms.

Could Reinforcement Learning play a part in the future of wafer fab scheduling?

📅 Date: September 22, 2022

✍️ Author: Marcus Vitelli

🔖 Topics: Reinforcement Learning

🏭 Vertical: Semiconductor

🏢 Organizations: Flexciton

However, as the use of RL for JSS problems is still a novelty, it is not yet at the level of sophistication that the semiconductor industry would require. So far, the approaches can handle standard small problem scenarios but cannot handle flexible problems or batching decisions. Many constraints need to be obeyed in wafer fabs (e.g., timelinks and reticle availability) and it is not easily guaranteed that the agent will adhere to them. The objective set for the agent must be defined ahead of training, which means that any change made afterwards will require a repeat of training before new decisions can be obtained. This is less problematic for solving the instance proposed by Tassel et al., although their approach relies on a specifically modelled reward function which would not easily adapt to changing objectives.

Yokogawa and DOCOMO Successfully Conduct Test of Remote Control Technology Using 5G, Cloud, and AI

📅 Date: May 30, 2022

🔖 Topics: Autonomous Production, 5G, Reinforcement Learning, AI

🏢 Organizations: Yokogawa, DOCOMO, Nara Institute of Science and Technology

Yokogawa Electric Corporation and NTT DOCOMO, INC. announced today that they have conducted a proof-of-concept test (PoC) of a remote control technology for industrial processing. The PoC test involved the use in a cloud environment of an autonomous control AI, the Factorial Kernel Dynamic Policy Programming (FKDPP) algorithm developed by Yokogawa and the Nara Institute of Science and Technology, and a fifth-generation (5G) mobile communications network provided by DOCOMO. The test, which successfully controlled a simulated plant processing operation, demonstrated that 5G is suitable for the remote control of actual plant processes.

Read more at Yokogawa Press Releases

In a World First, Yokogawa and JSR Use AI to Autonomously Control a Chemical Plant for 35 Consecutive Days

📅 Date: March 22, 2022

🔖 Topics: Autonomous Factory, Reinforcement Learning, Artificial Intelligence

🏭 Vertical: Chemical

🏢 Organizations: Yokogawa, JSR, Nara Institute of Science and Technology

Yokogawa Electric Corporation (TOKYO: 6841) and JSR Corporation (JSR, TOKYO: 4185) announce the successful conclusion of a field test in which AI was used to autonomously run a chemical plant for 35 days, a world first. This test confirmed that reinforcement learning AI can be safely applied in an actual plant, and demonstrated that this technology can control operations that have been beyond the capabilities of existing control methods (PID control/APC) and have up to now necessitated the manual operation of control valves based on the judgements of plant personnel. The initiative described here was selected for the 2020 Projects for the Promotion of Advanced Industrial Safety subsidy program of the Japanese Ministry of Economy, Trade and Industry.

The AI used in this control experiment, the Factorial Kernel Dynamic Policy Programming (FKDPP) protocol, was jointly developed by Yokogawa and the Nara Institute of Science and Technology (NAIST) in 2018, and was recognized at an IEEE International Conference on Automation Science and Engineering as being the first reinforcement learning-based AI in the world that can be utilized in plant management.

Given the numerous complex physical and chemical phenomena that impact operations in actual plants, there are still many situations where veteran operators must step in and exercise control. Even when operations are automated using PID control and APC, highly-experienced operators have to halt automated control and change configuration and output values when, for example, a sudden change occurs in atmospheric temperature due to rainfall or some other weather event. This is a common issue at many companies’ plants. Regarding the transition to industrial autonomy, a very significant challenge has been instituting autonomous control in situations where until now manual intervention has been essential, and doing so with as little effort as possible while also ensuring a high level of safety. The results of this test suggest that this collaboration between Yokogawa and JSR has opened a path forward in resolving this longstanding issue.

Action-limited, multimodal deep Q learning for AGV fleet route planning

📅 Date: March 18, 2022

✍️ Author: Hang Liu

🔖 Topics: Automated Guided Vehicle, Reinforcement Learning

🏢 Organizations: Hitachi

In traditional operating models, a navigation system completes all calculations i.e., the shortest path planning in a static environment, before the AGVs start moving. However, due to constant incoming offers, changes in vehicle availability, etc., this creates a huge and intractable optimization problem. Meanwhile, an optimal navigation strategy for an AGV fleet cannot be achieved if it fails to consider the fleet and delivery situation in real-time. Such dynamic route planning is more realistic and must have the ability to autonomously learn the complex environments. Deep Q network (DQN), that inherits the capabilities of deep learning and reinforcement learning, provides a framework that is well prepared to make decisions for discrete motion sequence problems.

Improving PPA In Complex Designs With AI

📅 Date: February 14, 2022

✍️ Author: John Koon

🔖 Topics: Reinforcement Learning, Generative Design

🏭 Vertical: Semiconductor

🏢 Organizations: Google, Cadence, Synopsys

The goal of chip design always has been to optimize power, performance, and area (PPA), but results can vary greatly even with the best tools and highly experienced engineering teams. AI works best in design when the problem is clearly defined in a way that AI can understand. So an IC designer must first see if there is a problem that can be tied to a system’s ability to adapt to, learn, and generalize knowledge/rules, and then apply these knowledge/rules to an unfamiliar scenario.

Read more at Semiconductor Engineering

Bridge the gap between Process Control and Reinforcement Learning with QuarticGym

📅 Date: February 2, 2022

🔖 Topics: Autonomous Production, Reinforcement Learning

🏢 Organizations: Quartic AI, OpenAI

Modern process control algorithms are the key to the success of industrial automation. The increased efficiency and quality create value that benefits everyone from the producers to the consumers. The question then is, could we further improve it?

From AlphaGo to robot-arm control, deep reinforcement learning (DRL) tackled a variety of tasks that traditional control algorithms cannot solve. However, it requires a large and compactly sampled dataset or a lot of interactions with the environment to succeed. In many cases, we need to verify and test the reinforcement learning in a simulator before putting it into production. However, there are few simulations for industrial-level production processes that are publicly available. In order to pay back the research community and encourage future works on applying DRL to process control problems, we built and published a simulation playground with data for every interested researcher to play around with and benchmark their own controllers. The simulators are all written in the easy-to-use OpenAI Gym format. Each of the simulations also has a corresponding data sampler, a pre-sampled d4rl-style dataset to train offline controllers, and a set of preconfigured online and offline Deep Learning algorithms.

Chip floorplanning with deep reinforcement learning

Artificial intelligence optimally controls your plant

📅 Date: October 11, 2021

🔖 Topics: energy consumption, reinforcement learning, machine learning, industrial control system

🏢 Organizations: Siemens

Until now, heating systems have mainly been controlled individually or via a building management system. Building management systems follow a preset temperature profile, meaning they always try to adhere to predefined target temperatures. The temperature in a conference room changes in response to environmental influences like sunlight or the number of people present. Simple (PI or PID) controllers are used to make constant adjustments so that the measured room temperature is as close to the target temperature values as possible.

We believe that the best alternative is learning a control strategy by means of reinforcement learning (RL). Reinforcement learning is a machine learning method that has no explicit (learning) objective. Instead, an “agent” with as complete a knowledge of the system state as possible learns the manipulated variable changes that maximize a “reward” function defined by humans. Using algorithms from reinforcement learning, the agent, meaning the control strategy, can be trained from both current and recorded system data. This requires measurements for the manipulated variable changes that have been carried out, for the (resulting) changes to the system state over time, and for the variables necessary for calculating the reward.

Getting Industrial About The Hybrid Computing And AI Revolution

📅 Date: July 21, 2021

✍️ Author: Jeffrey Burt

🔖 Topics: IIoT, machine learning, reinforcement learning

🏭 Vertical: Petroleum and Coal

🏢 Organizations: Beyond Limits

Beyond Limits is applying such techniques as deep reinforcement learning (DRL), using a framework to train a reinforcement learning agent to make optimal sequential recommendations for placing wells. It also uses reservoir simulations and novel deep convolutional neural networks to work. The agent takes in the data and learns from the various iterations of the simulator, allowing it to reduce the number of possible combinations of moves after each decision is made. By remembering what it learned from the previous iterations, the system can more quickly whittle the choices down to the one best answer.

Toward Generalized Sim-to-Real Transfer for Robot Learning

📅 Date: June 3, 2021

✍️ Authors: Daniel Ho, Kanishka Rao

🔖 Topics: reinforcement learning, AI, robotics, imitation learning, generative adversarial networks

🏢 Organizations: Google

A limitation for their use in sim-to-real transfer, however, is that because GANs translate images at the pixel-level, multi-pixel features or structures that are necessary for robot task learning may be arbitrarily modified or even removed.

To address the above limitation, and in collaboration with the Everyday Robot Project at X, we introduce two works, RL-CycleGAN and RetinaGAN, that train GANs with robot-specific consistencies — so that they do not arbitrarily modify visual features that are specifically necessary for robot task learning — and thus bridge the visual discrepancy between sim and real.

Multi-Task Robotic Reinforcement Learning at Scale

📅 Date: April 19, 2021

✍️ Authors: Karol Hausman, Yevgen Chebotar

🔖 Topics: reinforcement learning, robotics, AI, machine learning

🏢 Organizations: Google

For general-purpose robots to be most useful, they would need to be able to perform a range of tasks, such as cleaning, maintenance and delivery. But training even a single task (e.g., grasping) using offline reinforcement learning (RL), a trial and error learning method where the agent uses training previously collected data, can take thousands of robot-hours, in addition to the significant engineering needed to enable autonomous operation of a large-scale robotic system. Thus, the computational costs of building general-purpose everyday robots using current robot learning methods becomes prohibitive as the number of tasks grows.

Using tactile-based reinforcement learning for insertion tasks

📅 Date: March 24, 2021

✍️ Authors: Alan Sullivan, Diego Romeres, Radu Corcodel

🔖 Topics: AI, cobot, reinforcement learning, robotics

🏢 Organizations: MIT, Mitsubishi Electric

A paper entitled “Tactile-RL for Insertion: Generalization to Objects of Unknown Geometry” was submitted by MERL and MIT researchers to the IEEE International Conference on Robotics and Automation (ICRA) in which reinforcement learning was used to enable a robot arm equipped with a parallel jaw gripper having tactile sensing arrays on both fingers to insert differently shaped novel objects into a corresponding hole with an overall average success rate of 85% with 3-4 tries.

Way beyond AlphaZero: Berkeley and Google work shows robotics may be the deepest machine learning of all

📅 Date: March 1, 2021

✍️ Author: @TiernanRayTech

🔖 Topics: AI, machine learning, robotics, reinforcement learning

🏢 Organizations: Google

With no well-specified rewards and state transitions that take place in a myriad of ways, training a robot via reinforcement learning represents perhaps the most complex arena for machine learning.

Multi-agent deep reinforcement learning for multi-echelon supply chain optimization

📅 Date: June 10, 2020

✍️ Author: Ilya Katsov

🔖 Topics: Supply Chain Optimization, Reinforcement Learning

🏢 Organizations: Grid Dynamics

In this article, we explore how the problem can be approached from the reinforcement learning (RL) perspective that generally allows for replacing a handcrafted optimization model with a generic learning algorithm paired with a stochastic supply network simulator. We start by building a simple simulation environment that includes suppliers, factories, warehouses, and retailers, as depicted in the animation below; we then develop a deep RL model that learns how to optimize inventory and pricing decisions.

Our first step is to develop an environment that can be used to train supply chain management policies using deep RL. We choose to create a relatively small-scale model with just a few products and facilities but implement a relatively rich set of features including transportation, pricing, and competition. This environment can be viewed as a foundational framework that can be extended and/or adapted in many ways to study various problem formulations. Henceforth, we refer to this environment as the World of Supply (WoS).

Scalable reinforcement learning for plant-wide control of vinyl acetate monomer process

📅 Date: February 10, 2020

✍️ Authors: Lingwei Zhu, Yunduan Cui, Go Takami, Hiroaki Kanokogi, Takamitsu Matsubara

🔖 Topics: Reinforcement Learning, Autonomous Production, Factorial Kernel Dynamic Policy Programming

🏭 Vertical: Chemical

🏢 Organizations: Nara Institute of Science and Technology, Yokogawa

This paper explores a reinforcement learning (RL) approach that designs automatic control strategies in a large-scale chemical process control scenario as the first step for leveraging an RL method to intelligently control real-world chemical plants. The huge number of units for chemical reactions as well as feeding and recycling the materials of a typical chemical process induces a vast amount of samples and subsequent prohibitive computation complexity in RL for deriving a suitable control policy due to high-dimensional state and action spaces. To tackle this problem, a novel RL algorithm: Factorial Fast-food Dynamic Policy Programming (FFDPP) is proposed. By introducing a factorial framework that efficiently factorizes the action space, Fast-food kernel approximation that alleviates the curse of dimensionality caused by the high dimensionality of state space, into Dynamic Policy Programming (DPP) that achieves stable learning even with insufficient samples. FFDPP is evaluated in a commercial chemical plant simulator for a Vinyl Acetate Monomer (VAM) process. Experimental results demonstrate that without any knowledge of the model, the proposed method successfully learned a stable policy with reasonable computation resources to produce a larger amount of VAM product with comparative performance to a state-of-the-art model-based control.

Read more at Control Engineering Practice

Equipment Health Indicator Learning using Deep Reinforcement Learning

📅 Date: July 29, 2019

✍️ Author: Chi Zhang

🔖 Topics: machine health, reinforcement learning, predictive maintenance

🏢 Organizations: Hitachi

We propose a machine learning based method to solve health indicator learning problem. Our key insight is that HIL can be modeled as a credit assignment problem which can then be solved using Deep Reinforcement Learning (DRL). The life of equipment can be thought as a series of state transitions from a state that is healthy at the beginning to a state that is completely unhealthy when it fails. Reinforcement learning learns from failures by naturally backpropagating the credit of failures into intermediate states.

Read more at Hitachi Industrial AI Blog

Assembly Line

Fourier Trains Humanoid Robots for Real-World Roles Using NVIDIA Isaac Gym

Crossing the Sim2Real Gap With NVIDIA Isaac Lab

Multi-agent reinforcement learning for integrated manufacturing system-process control

Manufacturing process optimization for real-time quality control in multi-regime conditions: Tire tread production use case

Precision Home Robotics w/Real-to-Sim-to-Real

Automating Circuit Board Design Using Reinforcement Learning

Explainable generative design in manufacturing for reinforcement learning based factory layout planning

Eureka! NVIDIA Research Breakthrough Puts New Spin on Robot Learning

🧠🎛️ Multi-objective reinforcement learning in process control: A goal-oriented approach with adaptive thresholds

🦾 Transferring Industrial Robot Assembly Tasks from Simulation to Reality

🧠 Data-Driven Wind Farm Control via Multiplayer Deep Reinforcement Learning

🦾♻️ Robotic deep RL at scale: Sorting waste and recyclables with a fleet of robots

A new intelligent fault diagnosis framework for rotating machinery based on deep transfer reinforcement learning

AI and the chocolate factory

Table Tennis: A Research Platform for Agile Robotics

Could Reinforcement Learning play a part in the future of wafer fab scheduling?

Yokogawa and DOCOMO Successfully Conduct Test of Remote Control Technology Using 5G, Cloud, and AI

In a World First, Yokogawa and JSR Use AI to Autonomously Control a Chemical Plant for 35 Consecutive Days

Action-limited, multimodal deep Q learning for AGV fleet route planning

Improving PPA In Complex Designs With AI

Bridge the gap between Process Control and Reinforcement Learning with QuarticGym

Chip floorplanning with deep reinforcement learning

Artificial intelligence optimally controls your plant

Getting Industrial About The Hybrid Computing And AI Revolution

Toward Generalized Sim-to-Real Transfer for Robot Learning

Multi-Task Robotic Reinforcement Learning at Scale

Using tactile-based reinforcement learning for insertion tasks

Way beyond AlphaZero: Berkeley and Google work shows robotics may be the deepest machine learning of all

Multi-agent deep reinforcement learning for multi-echelon supply chain optimization

Scalable reinforcement learning for plant-wide control of vinyl acetate monomer process

Equipment Health Indicator Learning using Deep Reinforcement Learning