Inside PyTouch, Facebook’s ML library for touch processing
Facebook AI recently launched an open source machine learning library, PyTouch, to process touch sensing signals. It provides cutting-edge touch processing capabilities as a service to unify the touch sensing community and help create scalable, proven, and performance-validated modules. The library is currently available on GitHub.
With the increased availability of touch sensors, the sense of touch is becoming a new paradigm in robotics and machine learning. However, out-of-the-box touch processing software is limited, which is a high barrier to entry for budding developers. Processing raw sensor measurements into high-level functionality is a challenge.
Computer vision, by contrast, has mature algorithmic and programmatic methods for understanding images and videos. Popular open source libraries such as Google’s TensorFlow, PyTorch, Caffe, and OpenCV have further accelerated research by providing unified interfaces, algorithms, and platforms.
Even though tools such as PyTorch and Caffe can be used for tactile processing, the precursors needed to support such algorithms for experimentation and research still have to be developed. PyTouch provides an entry point here. The library has been designed to support beginners as well as experts.
With PyTouch, Facebook aims to help researchers develop machine learning models that transparently process tactile sensing signals. “Sensing the world through touch opens up exciting new challenges and opportunities for measuring, understanding and interacting with the world around us,” Facebook said.
“We believe that, as with computer vision, the availability of open source and maintained software libraries for processing tactile readings would lower the barrier of entry to tactile tasks, experimentation and research in the field of touch sensing,” Facebook said.
In a paper titled “PyTouch: A Machine Learning Library for Touch Processing”, co-authored by Mike Lambeta, Huazhe Xu, Jingwei Xu, Po-Wei Chou, Shaoxiong Wang, Trevor Darrell and Roberto Calandra, the researchers describe the library’s architecture choices and demonstrate its capabilities and advantages through several experiments.
The image represents the PyTouch architecture, where touch processing is delivered to the end application “as a service” via published pre-trained models. (Source: arXiv.org)
As shown in the image above, the library modularizes a set of commonly used touch processing functions that are valuable for various downstream tasks such as tactile manipulation, touch-based object recognition, and slip detection. With this architecture, PyTouch steps up efforts to standardize research in robotics and machine learning, enabling better benchmarks and more reproducible results.
More importantly, the library aims to standardize the design of touch experiments and to reduce the amount of one-off software that researchers develop, positioning PyTouch as the basis for future research applications.
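The “as a service” idea described above can be sketched in a few lines: high-level tasks are looked up in a registry of pre-trained models, so the end application calls a task by name instead of assembling its own processing pipeline. All names below (`TaskRegistry`, `register`, `run`, `toy_touch_detector`) are hypothetical illustrations, not PyTouch’s actual API.

```python
# Illustrative sketch of "touch processing as a service": applications
# request a task by name; the registry dispatches to a pre-trained model.
from typing import Callable, Dict, List


class TaskRegistry:
    """Maps task names (e.g. 'touch_detect') to pre-trained model callables."""

    def __init__(self) -> None:
        self._tasks: Dict[str, Callable[[List[float]], object]] = {}

    def register(self, name: str, model: Callable[[List[float]], object]) -> None:
        self._tasks[name] = model

    def run(self, name: str, frame: List[float]) -> object:
        if name not in self._tasks:
            raise KeyError(f"No pre-trained model registered for task '{name}'")
        return self._tasks[name](frame)


# A stand-in "pre-trained model": flags contact if mean intensity is high.
def toy_touch_detector(frame: List[float]) -> bool:
    return sum(frame) / len(frame) > 0.5


registry = TaskRegistry()
registry.register("touch_detect", toy_touch_detector)

print(registry.run("touch_detect", [0.9, 0.8, 0.7]))  # True
print(registry.run("touch_detect", [0.1, 0.0, 0.2]))  # False
```

The design choice this mirrors is that model loading and inference stay behind a stable task interface, so swapping in a better pre-trained model does not change application code.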
- PyTouch is built on top of the PyTorch machine learning framework.
- Built on a library of pre-trained models, PyTouch provides real-time tactile processing functionality.
- Provides functions such as contact classification, slip detection, and contact area estimation, along with interfaces for training and transfer learning.
- The library can train models using data from other vision-based or non-vision-based touch sensors.
- PyTouch enables benchmarking the performance of real experiments, creating baselines for tactile tasks.
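To make one of the listed capabilities concrete, here is a minimal, library-agnostic sketch of contact area estimation: threshold a tactile intensity image and report the contact region’s pixel count and centroid. This illustrates the general idea only; it is not PyTouch’s implementation, and the function name and threshold are assumptions.

```python
# Toy contact area estimation on a 2D tactile intensity frame:
# pixels above a threshold are treated as the contact region.
from typing import List, Tuple


def estimate_contact_area(
    frame: List[List[float]], threshold: float = 0.5
) -> Tuple[int, Tuple[float, float]]:
    """Return (area in pixels, centroid (row, col)) of the contact region."""
    hits = [
        (r, c)
        for r, row in enumerate(frame)
        for c, value in enumerate(row)
        if value > threshold
    ]
    if not hits:
        return 0, (float("nan"), float("nan"))
    area = len(hits)
    centroid = (
        sum(r for r, _ in hits) / area,
        sum(c for _, c in hits) / area,
    )
    return area, centroid


# A 4x4 frame with a bright 2x2 contact patch in the lower-right corner.
frame = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.1, 0.2, 0.0],
    [0.0, 0.2, 0.9, 0.8],
    [0.0, 0.1, 0.8, 0.7],
]
area, centroid = estimate_contact_area(frame)
print(area, centroid)  # 4 (2.5, 2.5)
```

A real sensor pipeline would also denoise the frame and calibrate the threshold per sensor, which is exactly the kind of boilerplate a shared library can absorb.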
“Finally, alongside the framework, we’ve released a set of pre-trained models that PyTouch uses in the background for touch tasks,” Facebook said.
Facebook evaluated the performance of machine learning models trained on different vision-based touch sensors, including DIGIT, OmniTact, and GelSight.
The table above shows touch detection classification accuracy [%] (mean and standard deviation) under k-fold cross-validation (k = 5). Joint models are trained with data from all three sensors: DIGIT, OmniTact, and GelSight. Cross-validation accuracy as the training set size varies, for both single-sensor and joint models, is shown below.
Experiments showed that training a joint model on data from multiple sensors (DIGIT, OmniTact, and GelSight) yields better performance than training on the same amount of data from a single sensor.
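The evaluation protocol described above can be sketched as two steps: pool samples from several sensors into one joint training set, then split it into k = 5 cross-validation folds. The dataset contents below are fabricated placeholders; this mirrors the protocol, not the paper’s code.

```python
# Pool per-sensor datasets and build k-fold cross-validation splits.
from typing import Dict, List, Tuple


def pool_sensors(datasets: Dict[str, List[Tuple[str, int]]]) -> List[Tuple[str, int]]:
    """Concatenate per-sensor datasets into one joint dataset."""
    joint: List[Tuple[str, int]] = []
    for name in sorted(datasets):
        joint.extend(datasets[name])
    return joint


def kfold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Return (train_indices, val_indices) pairs for k-fold cross-validation."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i in range(k):
        val = folds[i]
        train = [j for f in range(k) if f != i for j in folds[f]]
        splits.append((train, val))
    return splits


# Placeholder per-sensor data: (sensor name, binary touch label) pairs.
datasets = {
    "DIGIT": [("DIGIT", i % 2) for i in range(10)],
    "OmniTact": [("OmniTact", i % 2) for i in range(10)],
    "GelSight": [("GelSight", i % 2) for i in range(10)],
}
joint = pool_sensors(datasets)
splits = kfold_indices(len(joint), k=5)
print(len(joint), len(splits), len(splits[0][1]))  # 30 5 6
```

Training one model on the pooled set and separate models on each single-sensor subset, with identical folds, is what makes the joint-versus-single comparison fair.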
Examples of the data used to train touch prediction models. The dataset includes data from multiple DIGIT, OmniTact, and GelSight sensors, showing different lighting conditions and objects at different spatial resolutions. (Source: arXiv.org)
The road ahead
Facebook is looking to create an extensible library for tactile processing, similar to what PyTorch and OpenCV are for computer vision.
PyTouch is still in its infancy. With several pre-trained models in place, it will allow researchers to focus on rapid prototyping. “We believe this would have a beneficial impact on the robotics and machine learning community by enabling new capabilities and speeding up research,” Facebook concluded.