Machine Perception
Articles tagged with Machine Perception
KTU researchers develop a model that improves machine understanding of the real world
A new model combines multiple ways of analysing 3D data, integrating local and global perspectives to interpret complex environments more reliably. The system improves detection of small or partially visible objects in real-world situations, enhancing safety in autonomous systems.
AI-based system for real-time detection of whip sounds in horse racing
Researchers developed an AI-based system that accurately detects whip sounds in horse racing, achieving detection rates of up to 70% in audio data. The system's ability to process audio in real time and its reliance on high-frequency components make it a promising tool for improving animal welfare and fair competition.
New AI tool can take a cattle’s temperature with only a photo
CattleFever, a new AI tool, uses thermal cameras to estimate cattle body temperature from a photo. The system automatically determines an animal's body temperature to within 1 degree of a thermometer reading.
Can AI read humans’ minds? A new model shows it’s shockingly good at it
A breakthrough AI system called OmniPredict can predict pedestrian behavior with unprecedented accuracy, which could revolutionize self-driving cars and urban mobility. The model combines visual cues with contextual information to anticipate pedestrians' next moves, which could reduce the risk of accidents and improve traffic safety.
Purdue innovation to be evaluated in international study for earlier identification of preeclampsia risk
Researchers at Purdue University are testing a computer-vision method that analyzes smartphone photos of pregnant women's eyes to predict preeclampsia risk. The two-year study aims to reduce maternal mortality in Africa and could save thousands of lives.
AI at the Eyelid: Glasses that track health through your blinks
Researchers developed AI-powered BlinkWise glasses that track blinking patterns to assess fatigue, mental workload, and eye-related health issues. The device uses radio signals to detect minute eyelid movements with unprecedented detail, preserving privacy and using minimal power.
Researchers create multimodal sentiment analysis method that improves detection of human emotions while reducing computational cost
Researchers developed a novel approach called R3DG that analyzes representations at varying granularities to capture nuanced emotional fluctuations and reduce computational complexity. This framework demonstrates superior performance in multiple multimodal tasks, including sentiment analysis, emotion recognition, and humor detection.
AI vision, reinvented: The power of synthetic data
Researchers developed CoSyn, a new approach to train open-source models using AI-generated scientific figures and charts. The resulting dataset, CoSyn-400K, includes over 400,000 synthetic images and 2.7 million sets of corresponding instructions. CoSyn-trained models match or outperform proprietary peers in various benchmark tests.
New tool gives anyone the ability to train a robot
MIT engineers developed a versatile demonstration interface that allows users to teach robots new skills in three intuitive ways: remote control, physical manipulation, or demonstration. This innovation broadens the range of users and 'teachers' who can interact with robots, enabling robots to learn a wider set of skills.
New attack can make AI ‘see’ whatever you want
Researchers have demonstrated a new technique, RisingAttacK, that can manipulate all widely used AI computer vision systems, letting an attacker control what the AI 'sees'. The attack is effective at influencing the AI's ability to detect top targets, such as cars, pedestrians, or stop signs.
Self-trained vision transformers mimic human gaze with surprising precision
Researchers from The University of Osaka have demonstrated that vision transformers can spontaneously develop human-like visual attention patterns without specific training. This breakthrough showcases the potential of self-supervised learning for advancing AI applications and modeling biological vision.
Study explores how to use AI to listen to the ‘heartbeat’ of a city
University of Missouri researchers created a digital sentiment map, using AI to analyze public Instagram posts and link their emotional tone to real-life features. The tool aims to improve city services, identify areas of concern, and inform emergency response decisions.
Could crowdsourcing hold the key to early wildfire detection?
A new crowdsourcing system, FireLoc, uses a network of low-cost mobile phones to detect wildfires minutes—even seconds—after they ignite. The system prioritizes privacy and accurately maps wilderness fires to within 180 feet of their origin.
Real-time descriptions of surroundings for people who are blind
WorldScribe, a new software tool, uses generative AI to provide real-time text and audio descriptions of surroundings for people who are blind or have low vision. The tool can adjust the level of detail based on user commands or camera frame time.
Study offers improvements to food quality computer predictions
A study from the University of Arkansas System Division of Agriculture has improved computer predictions of food quality by using human perception data. The researchers trained a computer model to mimic human adaptation to environmental conditions, resulting in more consistent predictions under different lighting conditions.
Recent development of multimodal sentiment recognition and understanding
Researchers have made significant strides in multimodal sentiment recognition, leveraging self-supervised learning and large models to capture correlations between modalities and emotional information. The study emphasizes the importance of addressing data scarcity and exploring transfer learning methods to develop robust models.
Penn Engineers recreate Star Trek’s Holodeck using ChatGPT and video game assets
Researchers created a system called Holodeck that generates interactive 3D environments and is controlled through language models like ChatGPT. In evaluations of realism and accuracy, the system outperformed earlier tools, with human evaluators preferring its outputs across various indoor environments.
Innovations in depth from focus/defocus pave the way to more capable computer vision systems
A new depth from focus/defocus approach, DDFS, combines model-based and learning-based strategies to achieve notable improvements in performance and applicability. The proposed method outperformed state-of-the-art methods on various metrics across several image datasets.
MethaneMapper is poised to solve the problem of underreported methane emissions
MethaneMapper is an artificial intelligence-powered hyperspectral imaging tool that can detect methane emissions in real time and trace them to their sources. With an accuracy of 91%, it has the potential to revolutionize the way we monitor oil and gas operations and curb climate change.
New method improves efficiency of ‘vision transformer’ AI systems
Researchers at North Carolina State University have developed a new methodology called Patch-to-Cluster attention (PaCa) that addresses the challenges of vision transformers (ViTs). PaCa improves a ViT's ability to identify, classify, and segment objects in images while reducing computational demands and enhancing model interpretability.
When should someone trust an AI assistant’s predictions?
MIT researchers developed a teaching phase that guides humans in understanding an AI's strengths and weaknesses, enabling more accurate decisions and faster conclusions. The technique helps humans build a mental model of the AI agent, reducing reliance on biased assumptions.
A step towards natural interaction between robots and animals
Researchers at Beijing Institute of Technology created a robot that can track fast-moving rats for extended periods using real-time localization and movement analysis. The robotic rat's built-in stereo vision system enables it to characterize typical behaviors of actual rats, promoting autonomy and reproducibility in behavior research.
A robot that finds lost items
Researchers at MIT developed RFusion, a robotic system that uses data from a camera and a radio-frequency antenna to locate and retrieve lost items. The system relies on RFID tags and machine learning algorithms to optimize the robot's trajectory and grasp the object.
Paint the town
A team of scientists from Osaka University developed a machine learning method that classifies a building's type and primary façade color by applying deep learning models to street-level images. This work may help foster neighborhood cohesion and support urban renewal by providing tailored street-view datasets.