Large Language Model
Assembly Line
China’s Baowu Launches Self-Developed AI Tool for Steel Industry
Chinese steel giant China Baowu Group yesterday unveiled its first large language model for the steel sector, a tool it says increases efficiency and refines operations across key links of the steel industry chain, raising the bar for vertical artificial intelligence models in the country.
xIn³Plat has a three-tier architecture comprising a foundational model, an industry-specific vertical model and an application-scenario domain model, the Shanghai-based firm said on its WeChat account yesterday.
It covers key areas in the R&D, production, operations, and services of the steel industry, it said. This includes lean manufacturing, refined management of operations, precise services in production and sales, intelligent maintenance services as well as green, low-carbon and energy-saving scenarios.
Baowu’s AI tool has achieved a 30 percent increase in R&D efficiency. In lean manufacturing, the annual efficiency gains on a production line where the LLM was adopted have topped CNY10 million (USD1.4 million), significantly better than the results of manual processes.
MIT researchers use large language models to flag problems in complex systems
In a new study, MIT researchers found that large language models (LLMs) hold the potential to be more efficient anomaly detectors for time-series data. Importantly, these pretrained models can be deployed right out of the box.
The researchers developed a framework, called SigLLM, which includes a component that converts time-series data into text-based inputs an LLM can process. A user can feed these prepared data to the model and ask it to start identifying anomalies. The LLM can also be used to forecast future time-series data points as part of an anomaly detection pipeline.
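The paper's exact serialization isn't reproduced here; below is a minimal sketch of the general idea, assuming a simple quantize-and-join encoding and a hypothetical `call_llm` helper standing in for any chat-completion API: a numeric series is turned into a text prompt, and the model's reply is parsed back into anomaly indices.

```python
# Minimal sketch of the SigLLM idea: serialize a time series as text, ask an
# LLM to flag anomalous positions, and parse the reply. The encoding and
# prompt wording are illustrative assumptions, not the paper's exact format.

def serialize_series(values, scale=100):
    """Quantize floats to integers and join them into a token-friendly string."""
    return ",".join(str(int(round(v * scale))) for v in values)

def build_prompt(series_text):
    return (
        "The following is a comma-separated series of sensor readings:\n"
        f"{series_text}\n"
        "Reply with the zero-based indices of any anomalous values as a "
        "comma-separated list, or 'none' if there are no anomalies."
    )

def parse_anomalies(reply):
    reply = reply.strip().lower()
    if reply == "none":
        return []
    return [int(tok) for tok in reply.split(",") if tok.strip().isdigit()]

# Usage with a placeholder LLM call:
# series = [0.51, 0.49, 0.50, 3.20, 0.52]          # index 3 is the outlier
# indices = parse_anomalies(call_llm(build_prompt(serialize_series(series))))
```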
While LLMs could not beat state-of-the-art deep learning models at anomaly detection, they did perform as well as some other AI approaches. If researchers can improve the performance of LLMs, this framework could help technicians flag potential problems in equipment like heavy machinery or satellites before they occur, without the need to train an expensive deep-learning model.
In the future, an LLM may also be able to provide plain language explanations with its predictions, so an operator could be better able to understand why an LLM identified a certain data point as anomalous.
Could reading instruction manuals become a thing of the past?
Simon Bennett, Aveva’s head of AI innovation, says the AI can locate where there has been, say, a power failure. It then delves into “a monster PDF manual”. From this, the AI generates, via a computer screen, different ideas of what the problem might be. It can also produce a 3D image of the affected machinery, such as a turbine, with Mr Bennett noting that engineers appreciate such visual responses to their questions.
Dozuki’s AI-powered system CreatorPro can automatically create a user guide from a video of an engineer talking through and carrying out a process. “The user uploads the video, and a step-by-step instruction guide is automatically created,” says Allen Yeung, Dozuki’s vice president of product. “The AI chooses the text that accompanies each step, and it can automatically translate that into other languages.”
How Chevron is using gen AI to strike oil
Oil and gas operations generate an enormous amount of data — a seismic survey in New Mexico, for instance, can provide a file that is a petabyte all by itself. “To turn that into an image that you can make a decision with is a 100 exaflop operation,” Bill Braun, Chevron CIO, told the audience at this year’s VB Transform. “It’s an incredible amount of compute.”
This can be helpful, for instance, with wells, which can be several miles long. Other companies might be working in areas around those wells, and gen AI could alert to interference so that human users can proactively reach out to prevent disruption to either party, Braun explained.
Chevron also uses large language models (LLMs) to craft engineering standards, specifications and safety bulletins and other alerts, he said, and AI scientists are constantly fine-tuning models.
Enhancing Audit Efficiency at Hapag-Lloyd with Generative AI
Building a generative AI reservoir simulation assistant with Stone Ridge Technology
In the field of reservoir simulation, accurate modeling is paramount for understanding and predicting the behavior of subsurface flow through geological formations. However, the complexities involved in creating, implementing, and optimizing these models often pose significant challenges, even for experienced professionals. Fortunately, the integration of artificial intelligence (AI) and large language models (LLMs) offers a transformative solution to streamline and enhance the reservoir simulation workflow. This post describes our efforts in developing an intelligent simulation assistant powered by Amazon Bedrock, Anthropic’s Claude, and Amazon Titan LLMs, aiming to revolutionize the way reservoir engineers approach simulation tasks.
Although not covered in this architecture, two key elements enhance this workflow significantly and are the topic of future exploration: 1) simulation execution using natural language by orchestration through a generative AI agent, and 2) multimodal generative AI (vision and text) analysis and interpretation of reservoir simulation results such as well production logs and 3D depth slices for pressure and saturation evolution. As future work, automating aspects of our current architecture is being explored using an agentic workflow framework as described in this AWS HPC post.
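The post's full assistant layers retrieval over simulation decks and documentation, but its calls to Claude on Amazon Bedrock bottom out in the runtime API. A minimal sketch, with an illustrative model ID, region, and prompt:

```python
import json
import boto3

# Minimal sketch: ask a Claude model on Amazon Bedrock a reservoir-simulation
# question. Model ID, region, and prompt are illustrative choices, not the
# post's exact configuration.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 512,
    "messages": [
        {"role": "user",
         "content": "Explain the WCONPROD keyword in a reservoir simulation deck."},
    ],
}

response = bedrock.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    body=json.dumps(body),
)
print(json.loads(response["body"].read())["content"][0]["text"])
```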
Mech-GPT: Enabling Robots to Understand Linguistic Instructions and Handle Complex Tasks
AMGPT: a Large Language Model for Contextual Querying in Additive Manufacturing
Generalized large language models (LLMs) such as GPT-4 may not provide specific answers to queries formulated by materials science researchers. These models may produce a high-level outline but lack the capacity to return detailed instructions on manufacturing and material properties of novel alloys. Enhancing a smaller model with specialized domain knowledge may provide an advantage over large language models which cannot be retrained quickly enough to keep up with the rapid pace of research in metal additive manufacturing (AM). We introduce “AMGPT,” a specialized LLM text generator designed for metal AM queries. The goal of AMGPT is to assist researchers and users in navigating the extensive corpus of literature in AM. Instead of training from scratch, we employ a pre-trained Llama2-7B model from Hugging Face in a Retrieval-Augmented Generation (RAG) setup, utilizing it to dynamically incorporate information from ∼50 AM papers and textbooks in PDF format. Mathpix is used to convert these PDF documents into TeX format, facilitating their integration into the RAG pipeline managed by LlamaIndex. Expert evaluations of this project highlight that specific embeddings from the RAG setup accelerate response times and maintain coherence in the generated text.
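A minimal LlamaIndex retrieval pipeline in the spirit of AMGPT might look like the sketch below. It assumes a local folder of documents (e.g., the Mathpix-converted TeX files) and the library's default embedding and generation settings; AMGPT itself pairs the retrieval with a pre-trained Llama2-7B generator.

```python
# Minimal RAG sketch in the spirit of AMGPT: index a folder of AM papers and
# textbooks with LlamaIndex, then query it. The folder path and query are
# illustrative; the defaults require an embedding/LLM API key to be configured.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("./am_papers").load_data()
index = VectorStoreIndex.from_documents(documents)   # chunk, embed, and index

query_engine = index.as_query_engine()
response = query_engine.query(
    "What process parameters matter most for laser powder bed fusion of Ti-6Al-4V?"
)
print(response)
```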
LLMatDesign: Autonomous Materials Discovery with Large Language Models
Discovering new materials can have significant scientific and technological implications but remains a challenging problem today due to the enormity of the chemical space. Recent advances in machine learning have enabled data-driven methods to rapidly screen or generate promising materials, but these methods still depend heavily on very large quantities of training data and often lack the flexibility and chemical understanding desired in materials discovery. We introduce LLMatDesign, a novel language-based framework for interpretable materials design powered by large language models (LLMs). LLMatDesign utilizes LLM agents to translate human instructions, apply modifications to materials, and evaluate outcomes using provided tools. By incorporating self-reflection on its previous decisions, LLMatDesign adapts rapidly to new tasks and conditions in a zero-shot manner. A systematic evaluation of LLMatDesign on several materials design tasks, in silico, validates LLMatDesign’s effectiveness in developing new materials with user-defined target properties in the small data regime. Our framework demonstrates the remarkable potential of autonomous LLM-guided materials discovery in the computational setting and towards self-driving laboratories in the future.
OpenVLA: An Open-Source Vision-Language-Action Model
Large policies pretrained on a combination of Internet-scale vision-language data and diverse robot demonstrations have the potential to change how we teach robots new skills: rather than training new behaviors from scratch, we can fine-tune such vision-language-action (VLA) models to obtain robust, generalizable policies for visuomotor control. Yet, widespread adoption of VLAs for robotics has been challenging as 1) existing VLAs are largely closed and inaccessible to the public, and 2) prior work fails to explore methods for efficiently fine-tuning VLAs for new tasks, a key component for adoption. Addressing these challenges, we introduce OpenVLA, a 7B-parameter open-source VLA trained on a diverse collection of 970k real-world robot demonstrations. OpenVLA builds on a Llama 2 language model combined with a visual encoder that fuses pretrained features from DINOv2 and SigLIP. As a product of the added data diversity and new model components, OpenVLA demonstrates strong results for generalist manipulation, outperforming closed models such as RT-2-X (55B) by 16.5% in absolute task success rate across 29 tasks and multiple robot embodiments, with 7x fewer parameters. We further show that we can effectively fine-tune OpenVLA for new settings, with especially strong generalization results in multi-task environments involving multiple objects and strong language grounding abilities, outperforming expressive from-scratch imitation learning methods such as Diffusion Policy by 20.4%. We also explore compute efficiency; as a separate contribution, we show that OpenVLA can be fine-tuned on consumer GPUs via modern low-rank adaptation methods and served efficiently via quantization without a hit to downstream success rate. Finally, we release model checkpoints, fine-tuning notebooks, and our PyTorch codebase with built-in support for training VLAs at scale on Open X-Embodiment datasets.
New C.H. Robinson Technology Breaks a Decades-Old Barrier to Automation in the Logistics Industry
In another industry-leading innovation, C.H. Robinson has automated transactions that many shippers still conduct by email. It breaks a long-standing barrier to automation and gives shippers who use email the same speed-to-market and cost savings as shippers who are more digitally connected.
Using artificial intelligence, C.H. Robinson’s new technology classifies incoming email, reads it and replicates the steps a person would take to fulfill a customer’s request. For example, shippers often still choose to send an email asking for a price quote rather than log into a digital platform. On an average business day, the global logistics company receives over 11,000 emails from customers and carriers requesting pricing on truckload freight. The technology is already replying to 2,000 customer quote requests a day, and it opens the door to automating other transactions shippers and carriers choose to do by email. The large language model (LLM) the technology uses can be trained to identify an email about a load tender, a pickup appointment or a shipment tracking update.
“Our customers can get instant price quotes through our Navisphere platform or any of the 35 largest TMS or ERP systems we’re integrated with. But for someone like a busy warehouse manager with unexpected spot freight or freight in a new lane, an email can just feel easier. Email works the same for everybody. It doesn’t ask for your password. There are no fields to fill in,” said Mark Albrecht, Vice President for Artificial Intelligence. “Before generative AI, replying to that email request defied automation. Customers had to wait for a human just to pass along a quote from our Dynamic Pricing Engine. Now, our new technology reads the email and supplies the quote in an average of 2 minutes 13 seconds. C.H. Robinson is doing this at scale, leaving our people more time to help those same customers with more complex requests.”
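C.H. Robinson's production system is proprietary; purely as an illustration of the classify-and-route pattern the article describes, here is a sketch using the OpenAI Python client, with a placeholder model and a label set drawn from the transaction types named above:

```python
from openai import OpenAI

# Illustrative sketch of LLM email classification, not C.H. Robinson's system.
# Labels mirror the transaction types named in the article.
LABELS = ["quote_request", "load_tender", "pickup_appointment",
          "tracking_update", "other"]

client = OpenAI()  # assumes OPENAI_API_KEY is set

def classify_email(subject: str, body: str) -> str:
    prompt = (
        "Classify this freight email into exactly one of these labels: "
        + ", ".join(LABELS)
        + f"\n\nSubject: {subject}\nBody: {body}\n\nLabel:"
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    label = resp.choices[0].message.content.strip()
    return label if label in LABELS else "other"

# classify_email("Rate needed", "Can I get a truckload quote, Dallas to Atlanta?")
# -> "quote_request", which would then be routed to the pricing engine
```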
C.H. Robinson Introduces AI Technology to Automate Email Interactions
Logistics management company C.H. Robinson has automated email transactions with shippers using generative artificial intelligence large language models (LLMs) to offer shippers who use email the same speed-to-market and cost savings as shippers who are more digitally connected to the company.
The technology classifies incoming email, reads it and replicates the steps a person would take to fulfill a customer’s request. For example, shippers often still choose to send an email asking for a price quote rather than log into a digital platform. On an average business day, the global logistics company receives over 11,000 emails from customers and carriers requesting pricing on truckload freight.
Large language model based agent for process planning of fiber composite structures
Process planning is a crucial activity, connecting product development and manufacturing of fiber composite structures. Recently published Large Language Models (LLMs) promise more flexible and autonomous workflows compared with state-of-the-art automation methods. An autonomous agent for process planning of fiber composite structures is implemented with the LangChain framework, based on OpenAI’s GPT-4 language model. The agent is equipped with deterministic tools which encode a-priori process planning knowledge. It can handle different process planning problems, such as cycle time estimation and resource allocation. Combinations thereof are solved through executing a multi-step solution path.
The agent is supposed to solve these problems autonomously (a minimal tool-calling sketch follows the list):
- Time Estimation - Estimate the cycle time, i.e., duration from start to end, for a manufacturing task.
- Process Chains - Determine which tasks are required in which order to manufacture a specific component.
- Resource Allocation - Identify the resources, e.g. machines, required to manufacture a specific component.
- Integrated Planning - Estimate the total cycle time for a chain of tasks required to manufacture a component.
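A minimal sketch of the deterministic-tools pattern using LangChain: the tool body and its durations are invented for illustration, and the model decides when to call it.

```python
# Sketch of a tool-equipped agent in the spirit of the paper: a deterministic
# tool encodes a-priori process-planning knowledge; the LLM decides when to
# call it. Task names and durations are invented for illustration.
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

@tool
def estimate_cycle_time(task: str) -> float:
    """Return the cycle time in minutes for a composite manufacturing task."""
    lookup = {"layup": 45.0, "curing": 120.0, "trimming": 15.0}  # illustrative
    return lookup.get(task, 30.0)

llm = ChatOpenAI(model="gpt-4", temperature=0)
llm_with_tools = llm.bind_tools([estimate_cycle_time])

msg = llm_with_tools.invoke("How long does the curing step take?")
for call in msg.tool_calls:                          # model-requested tool calls
    print(call["name"], call["args"])                # estimate_cycle_time {'task': 'curing'}
    print(estimate_cycle_time.invoke(call["args"]))  # run the deterministic tool
```

Integrated planning then chains such calls: determine the process chain first, then sum the per-task cycle times over it.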
Customize large language models with oil and gas terminology using Amazon Bedrock
The Norwegian multinational energy company Equinor has made the Volve dataset, a set of drilling reports, available for research, study, and development purposes. (When using external data, be sure to abide by the license the data is offered under.) The dataset contains 1,759 daily drilling reports (each containing both hourly comments and a daily summary) from the Volve field in the North Sea. Drilling rig supervisors tend to use domain-specific terminology and grammar when describing operations in both the hourly comments and the daily summary. This terminology is standard in the industry, which is why fine-tuning a foundation model on these reports is likely to improve summarization accuracy by enhancing the LLM’s ability to understand jargon and speak like a drilling engineer.
Generative AI has the potential to improve efficiency by automating time-consuming tasks even in domains that require deep knowledge of industry-specific nomenclature and acronyms. Having a custom model that provides drilling engineers with a draft of daily activities has the potential to save hours of work every week. Model customization can also help energy and utilities customers in other applications that involve the generation of highly technical content, as is the case of geological analyses, maintenance reports, and shift handover reports.
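Once report pairs are prepared as JSONL lines of the form {"prompt": ..., "completion": ...}, a Bedrock customization job can be started from the API. A sketch follows; every name, ARN, S3 URI, and hyperparameter value below is a placeholder:

```python
import boto3

# Sketch of launching an Amazon Bedrock fine-tuning (model customization) job
# on drilling-report training data. All identifiers below are placeholders.
bedrock = boto3.client("bedrock", region_name="us-east-1")

bedrock.create_model_customization_job(
    jobName="volve-daily-summary-ft",
    customModelName="drilling-report-summarizer",
    roleArn="arn:aws:iam::123456789012:role/BedrockCustomizationRole",
    baseModelIdentifier="amazon.titan-text-express-v1",
    trainingDataConfig={"s3Uri": "s3://my-bucket/volve/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-bucket/volve/output/"},
    hyperParameters={"epochCount": "2", "batchSize": "1",
                     "learningRate": "0.00001"},
)
```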
Integrating LLMs for Explainable Fault Diagnosis in Complex Systems
This paper introduces an integrated system designed to enhance the explainability of fault diagnostics in complex systems, such as nuclear power plants, where operator understanding is critical for informed decision-making. By combining a physics-based diagnostic tool with a Large Language Model, we offer a novel solution that not only identifies faults but also provides clear, understandable explanations of their causes and implications. The system’s efficacy is demonstrated through application to a molten salt facility, showcasing its ability to elucidate the connections between diagnosed faults and sensor data, answer operator queries, and evaluate historical sensor anomalies. Our approach underscores the importance of merging model-based diagnostics with advanced AI to improve the reliability and transparency of autonomous systems.
Introducing Materials.AI: Your AI Assistant for Material Selection
With advances in material science and manufacturing technologies like 3D printing, it can be overwhelming (not to mention time-consuming) to find the right material for your project needs. That’s why we created Materials.AI: a first-of-its-kind artificial intelligence assistant, powered by ChatGPT and Fictiv’s expansive manufacturing database, to help you navigate the complex landscape of plastic and metal materials.
Quality Execution System® – two use cases in the European metals industries
Data from various automation levels is consolidated to represent each coil, bridging the gap between the physical and digital realms. This requires data to be transmitted flawlessly, enabling the virtual coil to mirror the physical coil. As the coil progresses through the production route, a digital counterpart is created at each stage of the process. The Quality Execution System (QES®) is designed to gather, combine, and examine all the data pertaining to a coil, thereby establishing the foundation for its digital twin.
Speira, a leading European aluminum rolling and recycling company, is expanding the QES® application ‘Automatic Coil Grading & Release and Genealogy’ to two of its strip coating line routes as part of a long-term digitalization initiative launched earlier. Speira’s aluminum rolling mill in Grevenbroich, Germany, produces high-quality automotive, beverage can, foil and lithographic products.
TATA Steel, the second-largest European steel manufacturer, is also expanding its cooperation with SMS as part of a long-term digitalization initiative that began early. TATA started with automated coil release at the cold mill in 2012; one of the main goals was to improve the utilization and post-processing of surface inspection data. TATA invested in automatic coil release for the DSP (direct sheet plant) and Hot Mill shortly after, and the Cold Mill has now started implementing the PDW part of the DataFactory. Wouter Overgaauw, Manager Quality Assurance Cold Rolling Mill IJmuiden, states: “The amount of measurement data is steadily increasing, the possibilities for data-driven applications are improving, and the PDW gives us the possibility to make better use of both data and applications.”
A Unified Industrial Large Knowledge Model Framework in Smart Manufacturing
The recent emergence of large language models (LLMs) shows the potential for artificial general intelligence, revealing new opportunities in Industry 4.0 and smart manufacturing. However, a notable gap exists in applying these LLMs in industry, primarily because they are trained on general rather than domain-specific knowledge. Such specialized domain knowledge is vital for effectively addressing the complex needs of industrial applications. To bridge this gap, this paper proposes an Industrial Large Knowledge Model (ILKM) framework, emphasizing its potential to revolutionize smart manufacturing. In addition, ILKMs and LLMs are compared from eight perspectives. Finally, the “6S Principle” is proposed as a guideline for the development of ILKMs in smart manufacturing.
Can ChatGPT Create Usable G-Code Programs?
Mike Wearne, an educational content creator at CAMInstructor, has a take on the GPT-3 G-code. “If we use a basic program that’s a drill-four-holes sort of thing, and compare this to someone who’s just learning G-code, I would say it’s not bad,” he says. “I would give it a low B or a high C.” The overall structure was there: it put the right codes in the right places, such as G20 and G21 to switch between imperial and metric units, and G90 for absolute positioning at the top of the program. “If you’re new to G-code programming, those are usually the tough things to remember and to get in the right spot,” he notes. However, it was missing some elements, such as tool changes and spindle speeds.
Wearne also noticed a marked improvement in the G-code GPT-4 produces. “It’s like GPT-4 can think more about its answers and GPT-3.5 just spits out whatever it comes up with as quick as it can,” he explains. With its most recent update, Wearne says it can program simple parts almost perfectly. Whereas GPT-3 was getting a high C or low B as a grade for its code, “For the simple parts, if we’re in G-code 101, GPT-4 is getting an A,” he says.
Creative Robot Tool Use with Large Language Models
We introduce RoboTool, which enables robots to use tools creatively with large language models, solving long-horizon hybrid discrete-continuous planning problems under environment- and embodiment-related constraints.
In this work, we are interested in solving language-instructed long-horizon robotics tasks with implicitly activated physical constraints. By providing LLMs with adequate numerical semantic information in natural language, we observe that LLMs can identify the activated constraints induced by the spatial layout of objects in the scene and the robot’s embodiment limits, suggesting that LLMs may maintain knowledge and reasoning capability about the 3D physical world. Furthermore, our comprehensive tests reveal that LLMs are not only adept at employing tools to transform otherwise unfeasible tasks into feasible ones but also display creativity in using tools beyond their conventional functions, based on their material, shape, and geometric features.
LLM-based Control Code Generation using Image Recognition
LLM-based code generation could save significant manual effort in industrial automation, where control engineers manually produce control logic for sophisticated production processes. Previous attempts at control logic code generation lacked methods to interpret schematic drawings from process engineers. Recent LLMs now combine image recognition, trained domain knowledge, and coding skills. We propose a novel LLM-based code generation method that generates IEC 61131-3 Structured Text control logic source code from Piping-and-Instrumentation Diagrams (P&IDs) using image recognition. We have evaluated the method in three case studies with industrial P&IDs and provide first evidence of the feasibility of such code generation, along with experiences of image recognition glitches.
AI for industry: Schaeffler and Siemens bring Industrial Copilot to shopfloor
To support engineers with various automation tasks, the AI-powered assistant is connected to Siemens’ engineering framework Totally Integrated Automation (TIA) Portal via the open API TIA Portal Openness. The Industrial Copilot helps Schaeffler’s automation engineers to generate code faster for programmable logic controllers (PLC), the devices that control most machines throughout the world’s factories. Engineering teams can significantly reduce time, effort, and the probability of errors by generating PLC code through natural language inputs.
Siemens Industrial Copilot has access to all relevant documentation, guidelines and manuals to assist shopfloor workers with identifying possible errors. These capabilities enable maintenance teams to identify errors and generate step-by-step solutions more quickly. This will help to significantly reduce machine downtime, make industrial companies more efficient and thus support sustainability efforts.
TwinCAT Chat integrates LLMs into the automation environment
Generative AI for Process Systems Engineering
Unleashing the Potential of Large Language Models in Robotics: RoboDK’s Virtual Assistant
The RoboDK Virtual Assistant is the first step towards a comprehensive generalized assistant for RoboDK. At its core is OpenAI’s GPT-3.5-turbo-0613 model. The model is provided with additional context about RoboDK in the form of an indexed database containing the RoboDK website, documentation, forum threads, blog posts, and more. The indexing process is done with LlamaIndex, a specialized data framework designed for this purpose. Thanks to this integration, the Virtual Assistant can swiftly provide valuable technical support for over 75% of user queries on the RoboDK forum, reducing the time spent manually searching the website and documentation. Users can expect to have an answer to their question in 5 seconds or less.
Fast and efficient PLC code generation and more with artificial intelligence
TwinCAT Chat was developed to offer users a clear advantage over the conventional use of, for example, ChatGPT in the web browser. The key added value lies in its deep integration, especially with regard to the specialized requirements of the automation industry. The core features include the direct integration of the chat function into the development environment (IDE), which greatly simplifies the development process, as communication and code exchange are seamlessly integrated. Furthermore, the basic initialization of the model has been tailored specifically to TwinCAT requests: you can ask your specific questions directly without having to tell the model that you are using TwinCAT and expect the code examples in Structured Text.

Another highlight is the ability to easily adopt generated code. This not only saves developers time, but also reduces the human errors that can occur during manual transfers. Interaction with TwinCAT Chat has been designed so that the need to type commands is reduced to a minimum. Instead, the user can simply click on pre-tested requests that are specifically designed to improve their workflow. These requests include actions such as the following (a hypothetical sketch of such request templates appears below):
- Optimize: The system can make suggestions to increase the performance or improve the efficiency of the code.
- Document: TwinCAT Chat helps to create comments and documentation so that the code is easier for other team members to understand.
- Complete: If code fragments are missing or incomplete, our system can generate suggestions to complete them to ensure functionality.
- Refactoring: TwinCAT Chat can refactor code according to certain guidelines and policies, bringing it more in line with company standards.
Overall, this system provides an efficient and intuitive user interface that greatly facilitates the development process.
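Beckhoff does not publish TwinCAT Chat's internals; purely as a hypothetical illustration, pre-tested requests like those above could be wrapped as prompt templates around the user's Structured Text code:

```python
# Hypothetical illustration only, not Beckhoff's implementation: mapping
# pre-tested request types to prompt templates around Structured Text code.
REQUEST_TEMPLATES = {
    "optimize": "Suggest performance improvements for this TwinCAT "
                "Structured Text code:\n{code}",
    "document": "Add comments and documentation to this TwinCAT "
                "Structured Text code:\n{code}",
    "complete": "Complete the missing parts of this TwinCAT "
                "Structured Text code:\n{code}",
    "refactor": "Refactor this TwinCAT Structured Text code to follow "
                "the given style guide:\n{code}",
}

def build_request(action: str, st_code: str) -> str:
    """Turn a one-click action into a ready-to-send prompt."""
    return REQUEST_TEMPLATES[action].format(code=st_code)

# build_request("document", "nCount := nCount + 1;") -> prompt string
```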
Silicon Volley: Designers Tap Generative AI for a Chip Assist
The work demonstrates how companies in highly specialized fields can train large language models (LLMs) on their internal data to build assistants that increase productivity.
The paper details how NVIDIA engineers created a custom LLM for internal use, called ChipNeMo, trained on the company’s internal data to generate and optimize software and assist human designers. Long term, engineers hope to apply generative AI to each stage of chip design, potentially reaping significant gains in overall productivity, said Ren, whose career spans more than 20 years in EDA. After surveying NVIDIA engineers for possible use cases, the research team chose three to start: a chatbot, a code generator and an analysis tool.
On chip-design tasks, custom ChipNeMo models with as few as 13 billion parameters match or exceed performance of even much larger general-purpose LLMs like LLaMA2 with 70 billion parameters. In some use cases, ChipNeMo models were dramatically better.
Eureka! NVIDIA Research Breakthrough Puts New Spin on Robot Learning
A new AI agent developed by NVIDIA Research that can teach robots complex skills has trained a robotic hand to perform rapid pen-spinning tricks — for the first time as well as a human can. The Eureka research, published today, includes a paper and the project’s AI algorithms, which developers can experiment with using NVIDIA Isaac Gym, a physics simulation reference application for reinforcement learning research. Isaac Gym is built on NVIDIA Omniverse, a development platform for building 3D tools and applications based on the OpenUSD framework. Eureka itself is powered by the GPT-4 large language model.
New Foundations: Controlling robots with natural language
The integration of Large Language Models (LLMs) in robotics is a rapidly evolving field, with numerous projects pushing the boundaries of what’s possible. These projects are not just isolated experiments, but pieces of a larger puzzle that collectively paint a picture of a future where robots are more intelligent, adaptable and interactive.
SayCan and Code as Policies are two early papers that indicate how an LLM can understand a task in natural language and create actions from it. “Code as Policies” leverages the ability of LLMs to output code and demonstrates how the language model can produce the actual code to perform a robotic action.
Instruct2Act connects the sense-making ability with vision capabilities. This way the robotic application (in this case a simulation) can identify, localize, and segment (define object outlines for the best grabbing position) known or unknown objects according to the task. Similarly, NL-MAP connects the “SayCan” project with a mapping step, where the robot scans a room for objects before it can output tasks. The TidyBot research project focuses on a real-world application for LLMs and robotics. A team at Princeton University developed a robot that can tidy up a room. It adapts to personal preferences (“socks in 3rd drawer on the right”) and benefits from general language understanding. For example, it knows that trash should go into the trash bin because it was trained on internet-scale language data.
Interactive Language achieves robotic actions from spoken commands by training a neural network on demonstrated moves connected with language and vision data.
While much of the work related to this technology is still in its early stages and limited to lab research, some applications, such as PickGPT from logistics company Sereact, are starting to show the vast commercial potential.
Making Conversation: Using AI to Extract Intel from Industrial Machinery and Equipment
What if your machine could talk? This is the question Ron Di Carlantonio has grappled with since he founded iNAGO in 1998. iNAGO was onboard when the Government of Canada supported a lighthouse project led by the Automotive Parts Manufacturers’ Association (APMA) to design, engineer, and build a connected and autonomous zero-emissions vehicle (ZEV) concept car and its digital twin to validate and integrate autonomous technologies. The electric SUV is equipped with a dual-motor powertrain with a total output of 550 hp and 472 lb-ft of torque.
The general use of AI-based solutions in the automotive industry stretches across the lifecycle of a vehicle, from design and manufacturing to sales and aftermarket care. AI-powered chatbots, in particular, deliver instant, personalized virtual driver assistance, are on call 24/7, and can evolve with the preferences of tech-savvy drivers. Di Carlantonio now sees an opportunity to extend the use of the intelligent assistant platform to the smart factory by making industrial equipment (CNC machines, presses, conveyors, industrial robots) talk.
Solution Accelerator: LLMs for Manufacturing
In this solution accelerator, we focus on item (3) above, which is the use case of augmenting field service engineers with a knowledge base in the form of an interactive, context-aware Q/A session. The challenge that manufacturers face is how to build and incorporate data from proprietary documents into LLMs. Training LLMs from scratch is a very costly exercise, costing hundreds of thousands if not millions of dollars.
Instead, enterprises can tap into pre-trained foundational LLM models (like MPT-7B and MPT-30B from MosaicML) and augment and fine-tune these models with their proprietary data. This brings the cost down to tens, if not hundreds, of dollars, effectively a 10,000x cost saving.
The treacherous path to trustworthy Generative AI for Industry
Despite the striking first impression ChatGPT made, and the significant efficiency gains programming copilots are already delivering to developers, making LLMs serve non-developers (the vast majority of the workforce) by translating natural language prompts into API or database queries that yield readily usable analytics outputs is not so straightforward. Three primary challenges are:
- Inconsistency of prompts to completions (no deterministic reproducibility between LLM inputs and outputs)
- Nearly impossible to audit or explain LLM answers (once trained, LLMs are black boxes)
- Coverage gap on niche domain areas that typically matter most to enterprise users (LLMs are trained on large corpora of internet data, heavily biased towards more generalist topics)
Lumafield Introduces Atlas, an AI Co-Pilot for Engineers
Lumafield today unveiled Atlas, a groundbreaking AI co-pilot that helps engineers work faster by answering questions and solving complex engineering and manufacturing challenges using plain language. Atlas is a new tool in Voyager, Lumafield’s cloud-based software for analyzing 3D scan and industrial CT scan data. Along with Atlas, Lumafield announced a major expansion of Voyager’s capabilities, including the ability to upload, analyze, and share data from any 3D scanner.
Cadence Design Is Working With Renesas To Build The World’s First LLM Tool For Up-Front Chip Design
Cadence has been aggressively rolling out reinforcement learning-based tools to help chip design teams accelerate the processes of digital design, debugging, verification, PCB layout, and multi-physics optimization. Customers have been eating it up, especially the physical design optimizer “Cerebrus” and the underlying cross-platform consolidated database, “JedAI.”
Now, the company has focused on the most challenging part of designing a chip: defining the specs and creating the first clean version of the design that drives the rest of the entire workflow. Renesas and Cadence have collaborated on a novel approach to this up-front design work that leverages LLMs, significantly reducing the time and effort from specification to final design. The chip design verification, debugging, and implementation phases remain the same today. They call this an accelerated “Correct by Construction” design methodology.
Using an LLM, the team can interrogate the design plan for compliance with specifications and other design and project documents, in areas such as IP connections for data, control, and test, and other requirements specified in the IP- and chip-level specifications. These steps of cleaning the design code can cost individual engineers and the team weeks of design time and hundreds of meetings spent reducing the number of bugs encountered during the simulation and implementation stages of the project. By using an LLM, Cadence hopes to significantly streamline this process.
🦾 Doosan Robotics to develop GPT-based collaborative robots
Doosan Robotics, a subsidiary of South Korea’s Doosan Group specializing in robot solutions, is venturing into the development of collaborative robot solutions using AI-based GPT (generative pre-trained transformer) technology to enhance its software capabilities.
Doosan Robotics announced it has entered into a business agreement with Microsoft and Doosan Digital Innovation to develop a “GPT-based robot control system” utilizing Microsoft’s Azure OpenAI Service, which provides cloud access to cutting-edge OpenAI models, including GPT.
Doosan Robotics plans to apply GPT to its collaborative robots, enabling them to autonomously correct errors and perform tasks. Once the solution is developed, programming time will be reduced, leading to improved operational efficiency and utility.
🖨️ AI and 3D printing: Ai Build’s Daghan Cam and Luke Rogers on simplifying large-format 3D printing with AI
Ai Build has already partnered with a number of leading 3D printer hardware manufacturers, including Hans Weber Maschinenfabrik, Meltio, KUKA, Evo3D, CEAD, and Massive Dimension. Through these partnerships, the company incorporates a wide range of large-format 3D printers into their Ai Lab workshop. Here, the hardware is used to test, develop, verify, and integrate Ai Build’s software for a growing range of applications. Whilst Cam could not disclose too many names, global engineering solutions firm Weir Group and aerospace manufacturer Boeing were pinpointed as key customers employing AiSync software.
Ai Build’s key product is its AiSync software, an AI-driven toolpath optimization and quality control platform. Regarding toolpath optimization, it was announced earlier this year that Ai Build had developed a process which allows users to create advanced 3D printing toolpaths using natural language prompts. This feature, called Talk to AiSync, allows users to input simple text, such as “slice the part with 2mm layer height.” This text is then translated into machine instructions to produce the desired 3D printed part.
Key to this feature are large language models. AiSync uses OpenAI on the back end, with GPT-4 running the software’s natural language processing. “With the addition of large language models, we are able to translate simple English words, plain sentences, into a stack of workflow that we create on our software,” explained Cam. “The goal is to make it super accessible to inexperienced users by making the user experience really smooth.”
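AiSync's interface is proprietary; as a sketch of the general text-to-machine-instruction pattern, OpenAI function calling can map a sentence like “slice the part with 2mm layer height” onto structured slicing parameters. The function name and schema below are hypothetical:

```python
import json
from openai import OpenAI

# Sketch of the text-to-machine-instruction pattern via function calling.
# The set_slicing_parameters schema is hypothetical, not AiSync's interface.
client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "set_slicing_parameters",
        "description": "Set toolpath slicing parameters for a 3D print job.",
        "parameters": {
            "type": "object",
            "properties": {
                "layer_height_mm": {"type": "number"},
                "print_speed_mm_s": {"type": "number"},
            },
            "required": ["layer_height_mm"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user",
               "content": "Slice the part with 2mm layer height."}],
    tools=tools,
    tool_choice={"type": "function",
                 "function": {"name": "set_slicing_parameters"}},
)
call = resp.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
# -> set_slicing_parameters {'layer_height_mm': 2}
```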
Retentive Network: A Successor to Transformer for Large Language Models
In this work, we propose Retentive Network (RetNet) as a foundation architecture for large language models, simultaneously achieving training parallelism, low-cost inference, and good performance. We theoretically derive the connection between recurrence and attention. Then we propose the retention mechanism for sequence modeling, which supports three computation paradigms, i.e., parallel, recurrent, and chunkwise recurrent. Specifically, the parallel representation allows for training parallelism. The recurrent representation enables low-cost O(1) inference, which improves decoding throughput, latency, and GPU memory without sacrificing performance. The chunkwise recurrent representation facilitates efficient long-sequence modeling with linear complexity, where each chunk is encoded in parallel while the chunks are summarized recurrently. Experimental results on language modeling show that RetNet achieves favorable scaling results, parallel training, low-cost deployment, and efficient inference. These intriguing properties make RetNet a strong successor to Transformer for large language models.
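In simplified single-head form, the parallel and recurrent computation paradigms can be written as follows (paraphrased from the paper; the full formulation adds multi-head treatment, normalization, and gating):

```latex
% Parallel form (training):
\mathrm{Retention}(X) = \left(QK^{\top} \odot D\right)V,
\qquad
D_{nm} =
\begin{cases}
  \gamma^{\,n-m}, & n \ge m,\\
  0, & n < m.
\end{cases}

% Equivalent recurrent form (O(1) per-token inference):
S_n = \gamma\, S_{n-1} + K_n^{\top} V_n,
\qquad
\mathrm{Retention}(X_n) = Q_n S_n.
```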
LongNet: Scaling Transformers to 1,000,000,000 Tokens
Scaling sequence length has become a critical demand in the era of large language models. However, existing methods struggle with either computational complexity or model expressivity, rendering the maximum sequence length restricted. To address this issue, we introduce LongNet, a Transformer variant that can scale sequence length to more than 1 billion tokens, without sacrificing performance on shorter sequences. Specifically, we propose dilated attention, which expands the attentive field exponentially as the distance grows. LongNet has significant advantages: 1) it has linear computation complexity and a logarithmic dependency between any two tokens in a sequence; 2) it can be served as a distributed trainer for extremely long sequences; 3) its dilated attention is a drop-in replacement for standard attention, which can be seamlessly integrated with existing Transformer-based optimization. Experimental results demonstrate that LongNet yields strong performance on both long-sequence modeling and general language tasks. Our work opens up new possibilities for modeling very long sequences, e.g., treating a whole corpus or even the entire Internet as a sequence.
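A small sketch of the sparsification behind dilated attention: the sequence is split into segments of length w, every r-th position is kept within each segment, and the (w, r) pairs grow geometrically so that coverage widens as distance grows. The values here are illustrative, not the paper's configuration.

```python
import numpy as np

# Illustrative sketch of dilated-attention sparsification: segment the
# sequence, keep every r-th position per segment, and grow (w, r)
# geometrically across attention groups.
def dilated_indices(seq_len: int, segment_len: int, dilation: int):
    """Return, per segment, the positions that attend to one another."""
    segments = []
    for start in range(0, seq_len, segment_len):
        end = min(start + segment_len, seq_len)
        segments.append(np.arange(start, end, dilation))
    return segments

for w, r in [(4, 1), (8, 2), (16, 4)]:   # geometric (w, r) pairs
    kept = sum(len(s) for s in dilated_indices(16, w, r))
    print(f"w={w:2d} r={r} -> {kept} of 16 positions attended per group")
```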
Training ChatGPT on Omniverse Visual Scripting Using Prompt Engineering
Palantir AIP | Defense and Military
What does it take to talk to your Industrial Data in the same way we talk to ChatGPT?
The vast data set used to train LLMs is curated in various ways to provide clean, contextualized data. Contextualized data includes explicit semantic relationships within the data that can greatly affect the quality of the model’s output. Contextualizing the data we provide as input to an LLM ensures that the text consumed is relevant to the task at hand. For example, when prompting an LLM to provide information about operating industrial assets, the data provided to the LLM should include not only the data and documents related to those assets but also the explicit and implicit semantic relationships across different data types and sources.
An LLM is trained by parceling text data into smaller collections, or chunks, that can be converted into embeddings. An embedding is simply a sophisticated numerical representation of the ‘chunk’ of text that takes into consideration the context of surrounding or related information. This makes it possible to perform mathematical calculations to compare similarities, differences, and patterns between different ‘chunks’ to infer relationships and meaning. These mechanisms enable an LLM to learn a language and understand new data that it has not seen previously.
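A minimal sketch of the chunk-and-embed mechanics described above, using the sentence-transformers library; the model choice and the tiny hand-written chunks are illustrative.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Sketch of chunk embeddings and similarity: related chunks score closer
# than unrelated ones. Model and example chunks are illustrative.
model = SentenceTransformer("all-MiniLM-L6-v2")

chunks = [
    "Pump P-101 vibration exceeded 8 mm/s during the night shift.",
    "Compressor C-204 completed scheduled maintenance without issues.",
    "Abnormal vibration readings were recorded on the feed pump.",
]
embeddings = model.encode(chunks)  # one vector per chunk

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Chunks 0 and 2 (both about pump vibration) should score higher
# than chunks 0 and 1.
print(cosine(embeddings[0], embeddings[2]), cosine(embeddings[0], embeddings[1]))
```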
How ChatGPT Programmed an Industrial Robot
Our initial challenge for ChatGPT involved programming the Yaskawa robot to perform a wire cut. This is a very simple task. However, ChatGPT isn’t intrinsically familiar with the INFORM programming language, which is integral to Yaskawa robots. As such, our first step was to delineate the fundamental commands of this language.
Furthermore, ChatGPT had no understanding of the physical robot, its movements, or the typical process of wire-cutting. To address this, we established several coordinates using the robot’s teach pendant and outlined the basic principles of operation.
With these prerequisites met, we put forward our request for ChatGPT to create the required program. The AI successfully rose to the challenge, generating a program that we then transferred to the robot for a test run. The outcome was encouraging, with the robot effectively performing the wire-cutting task as directed.
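The article does not publish its prompts; below is a sketch of the approach it describes, packing INFORM command summaries and taught positions into the system message before requesting a program. The command summaries and points are illustrative, not the article's actual prompts.

```python
# Sketch of the prompting approach described above. The INFORM primer and
# taught positions are illustrative placeholders.
INFORM_PRIMER = """You write programs in Yaskawa INFORM.
Key commands: MOVJ (joint move), MOVL (linear move), DOUT (set digital
output), TIMER (wait). Positions are referenced as P000, P001, ...
"""

TAUGHT_POINTS = {
    "P000": "home position",
    "P001": "approach point above the wire",
    "P002": "cut position at the wire",
}

def build_messages(task: str) -> list[dict]:
    points = "\n".join(f"{name}: {desc}" for name, desc in TAUGHT_POINTS.items())
    return [
        {"role": "system",
         "content": INFORM_PRIMER + "\nTaught positions:\n" + points},
        {"role": "user", "content": task},
    ]

messages = build_messages("Write an INFORM program that performs a wire cut.")
# `messages` can be sent to any chat-completion API; the reply is then
# transferred to the robot controller for a supervised test run.
```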
How Large-Language Models Can Revolutionize Military Planning
What happens when you give military planners access to large-language models and other artificial intelligence and machine-learning applications? Will the planner embrace the ability to rapidly synthesize diffuse data streams or ignore the tools in favor of romanticized views of military judgment as a coup d’œil? Can a profession still grappling to escape its industrial-age iron cage and bureaucratic processes integrate emerging technologies and habits of mind that are more inductive than deductive?
Our team, which includes a professor from Marine Corps University and a portfolio manager from Scale AI, shares its efforts to bridge new forms of data synthesis with foundational models of military decision-making. Based on this pilot effort, we see clear and tangible ways to integrate large-language models into the planning process. This effort will require more than just buying software. It will require revisiting how we approach epistemology in the military profession. The results suggest a need to expand the use of large-language models alongside new methods of instruction that help military professionals understand how to ask questions and interrogate the results. Skepticism is a virtue in the 21st century.
Will Generative AI finally turn data swamps into contextualized operations insight machines?
Generative AI, such as ChatGPT/GPT-4, has the potential to put industrial digital transformation into hyperdrive. Whereas a process engineer might spend several hours performing “human contextualization” manually (at an hourly rate of $140 or more), again and again, contextualized industrial knowledge graphs provide the trusted data relationships that enable Generative AI to accurately navigate and interpret data for Operators without requiring data engineering or coding competencies.
Can Large Language Models Enhance Efficiency In Industrial Robotics?
One of the factors that slows down the penetration of industrial robots into manufacturing is the complexity of human-to-machine interfaces. This is where large language models, such as ChatGPT developed by OpenAI, come in. Large language models are a cutting-edge artificial intelligence technology that can understand and respond to human language, at times almost indistinguishably from human conversation. Their versatility has been proven in applications ranging from chatbots to language translation and even creative writing.
It turns out that large language models are quite effective at generating teach pendant programs for a variety of industrial robots, such as KUKA, FANUC, Yaskawa, ABB and others.
ChatGPT for Robotics: Design Principles and Model Abilities
ChatGPT unlocks a new robotics paradigm, allowing a (potentially non-technical) user to sit on the loop, providing high-level feedback to the large language model (LLM) while monitoring the robot’s performance. By following our set of design principles, ChatGPT can generate code for robotics scenarios. Without any fine-tuning, we leverage the LLM’s knowledge to control different robot form factors for a variety of tasks. In our work we show multiple examples of ChatGPT solving robotics puzzles, along with complex robot deployments in the manipulation, aerial, and navigation domains.
Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance
Last year Google Research announced our vision for Pathways, a single model that could generalize across domains and tasks while being highly efficient. An important milestone toward realizing this vision was to develop the new Pathways system to orchestrate distributed computation for accelerators. In “PaLM: Scaling Language Modeling with Pathways”, we introduce the Pathways Language Model (PaLM), a 540-billion parameter, dense decoder-only Transformer model trained with the Pathways system, which enabled us to efficiently train a single model across multiple TPU v4 Pods. We evaluated PaLM on hundreds of language understanding and generation tasks, and found that it achieves state-of-the-art few-shot performance across most tasks, by significant margins in many cases.