List of Accepted Papers
The following is the list of accepted ICIP 2025 papers, sorted by paper title. You can use the search feature of your web browser to find your paper number. Notifications have also been sent to all authors by email. If you have not received your notification by email, please contact us at papers@2025.ieeeicip.org.
Paper Number | Paper Title |
---|---|
2583 | 3D Magnetic Inverse Routine for Single-Segment Magnetic Field Images |
2489 | A 3D MESH CONVOLUTION-BASED AUTOENCODER FOR GEOMETRY COMPRESSION |
1208 | A BENCHMARK AND EVALUATION FOR REAL-WORLD OUT-OF-DISTRIBUTION DETECTION USING VISION-LANGUAGE MODELS |
2563 | A BENCHMARK DATASET FOR AUTOMATED DIAGNOSIS AND TREATMENT PLANNING OF CLASS III MALOCCLUSION USING X-RAYS AND PROFILE PHOTOS |
2026 | A CONFIDENCE-BASED SAMPLING STRATEGY FOR DENSE TEMPORAL TOKEN LEARNING IN THERMAL INFRARED OBJECT TRACKING |
2209 | A DEBIASING FRAMEWORK FOR ATTRIBUTE BINDING IN DIFFUSION-BASED TEXT-TO-IMAGE GENERATION |
2702 | A Distortion-Guided Fine-Tuning Network for Blind Image Quality Assessment |
1969 | A DUAL BRANCH GRAPHIC TEXT DETECTION NETWORK BASED ON PROGRESSIVE DOMAIN ADAPTATION |
2488 | A Generative Diffusion Model to solve Inverse Problems for Robust in-NICU Neonatal MRI |
1579 | A GENERATIVE FACE VIDEO CODING FRAMEWORK WITH DISENTANGLED AND CONSISTENT BACKGROUND |
2252 | A GREEN LEARNING APPROACH TO LDCT IMAGE RESTORATION |
2279 | A MULTI-LAYER END-TO-END 360 IMAGE COMPRESSION |
2329 | A MULTISCALE ATTENTION-BASED DEEP LEARNING METHOD FOR DCE-MRI BREAST TUMOR SEGMENTATION |
1286 | A NEW MULTI-SOURCE DISTRIBUTED TRANSFER LEARNING FRAMEWORK |
2172 | A NOISE-TO-NOISE TRAINING APPROACH FOR ROBUST MOTION-COMPENSATED PROCESSING IN CARDIAC-GATED IMAGES |
2420 | A NOVEL AI FRAMEWORK FOR BREAST CANCER MOLECULAR BIOMARKER RESPONSE SCORE DETECTION ON CELLS LEVEL USING MARKER-BASED WATERSHED SEGMENTATION AND MACHINE LEARNING CLASSIFIERS |
2195 | A Novel Automated System for Pathological Lung Segmentation using Modified Local Binary Patterns and Hierarchical Transformers |
1478 | A Novel Downsampling Strategy Based on Information Complementarity for Medical Image Segmentation |
2657 | A NOVEL EXPLAINABLE AI-BASED SYSTEM FOR IMPROVED PREDICTION OF BREAST CANCER RESPONSE TO NEOADJUVANT CHEMOTHERAPY |
2156 | A NOVEL GAME GRAPHICS QUALITY EVALUATION MODEL USING SALIENCY AND RESOLUTION INFORMATION |
1207 | A NOVEL METHOD AND DATASET FOR DEPTH-GUIDED IMAGE DEBLURRING FROM SMARTPHONE LIDAR |
2691 | A PHYSICS-GUIDED SMOOTHING METHOD FOR MATERIAL MODELING WITH DIGITAL IMAGE CORRELATION (DIC) MEASUREMENTS |
1637 | A PRINCIPLED DIFFUSION POSTERIOR SAMPLING FOR INVERSE PROBLEM WITH MIXED POISSON-GAUSSIAN NOISE |
1960 | A RETINEX-BASED VARIATIONAL MODEL WITH A NONLOCAL GRADIENT-TYPE CONSTRAINT FOR LOW-LIGHT IMAGE ENHANCEMENT |
2648 | A SCALABLE IMAGE COMPRESSION USING CONDITIONAL DIFFUSION MODEL IN HUMAN-MACHINE HYBRID VISION |
2237 | A TRANSPARENT AND LIGHTWEIGHT TUMOR-AWARE MRI SUPER-RESOLUTION FRAMEWORK TO ENHANCE PROSTATE CANCER DETECTION |
2476 | A TREE OF SHAPES COMPUTATION ALGORITHM FOR MASSIVELY PARALLEL ARCHITECTURES |
1288 | A Unified Transformer-Based Framework with Pretraining For Whole Body Grasping Motion Generation |
2509 | A Warmer Start to Active Learning with Adaptive Gaussian Mixture Models for Skin Lesion Segmentation |
2112 | ABAS-RAL: ADAPTIVE BATCH SIZE USING REINFORCED ACTIVE LEARNING |
2284 | ACC: ALTERNATING COMPLEMENTARY COLORS FOR DISPLAY ENERGY REDUCTION |
1152 | Adapting Foundation Features via Cross-View Contrastive Learning for Unseen Object Pose Estimation |
2456 | ADAPTIVE HIERARCHICAL FEATURE DIFFERENCE AUTO-ENCODER FOR ROBUST RGB-T OBJECT TRACKING |
1195 | ADAPTIVE HIGH-FREQUENCY PREPROCESSING FOR VIDEO CODING |
1198 | ADAPTIVE MULTIMODAL FUSION VIA ATTENTION-GUIDED FEATURE SELECTION FOR HISTOPATHOLOGY IMAGE CLASSIFICATION |
2028 | ADAPTIVE SMOOTHING OF NON-RECTANGULAR PREDICTION BLOCK EDGES IN THE WEDGE MODE OF AVM |
2587 | ADAPTIVE VOXELIZATION FOR TRANSFORM CODING OF 3D GAUSSIAN SPLATTING DATA |
2516 | ADVANCEMENTS IN MEDICAL IMAGE CLASSIFICATION THROUGH FINE-TUNING NATURAL DOMAIN FOUNDATION MODELS |
2496 | Advancing Limited-Angle CT Reconstruction Through Diffusion-Based Sinogram Completion |
1476 | ADVERSARIAL IMAGE PURIFICATION BY EXPLAINING ADVERSARIAL DETECTORS |
2532 | AFMUNet: Adaptive Filter-Based Frequency Modulation UNet for OCTA Segmentation |
2243 | Aggressive Rejection with Adaptive Gradient for Contaminated Data |
1692 | AMFNVD: ADDRESSING TWO KINDS OF THE DISCREPANCY PROBLEMS OF FAKE NEWS VIDEO DETECTION |
2795 | ANATOMICAL ATTENTION ALIGNMENT REPRESENTATION FOR RADIOLOGY REPORT GENERATION |
2485 | ANCHOR-BASED GRAVITY ALIGNMENT FOR PANORAMAS |
2718 | ANCHOR-VIT: SPATIALLY-FOCUSED VISION TRANSFORMER FOR DISTRACTED DRIVING DETECTION |
2666 | ANTI-FT: TOWARDS PRACTICAL DEEP LEAKAGE FROM GRADIENTS |
1426 | ARaBIQA: A Novel Blind Image Quality Assessment model for Augmented Reality |
2841 | Assessing Urban Environments with Vision-Language Models: A Comparative Analysis of AI-Generated Ratings and Human Volunteer Evaluations |
2749 | ASTROPHOTOGRAPHY TURBULENCE MITIGATION VIA GENERATIVE MODELS |
2676 | Attribute-Specified Generation and Style-Transfer Diffusion for Face Recognition Enhancement |
2633 | AUDIO VISUAL SEGMENTATION THROUGH TEXT EMBEDDINGS |
2076 | AUTOREGRESSION-FREE VIDEO PREDICTION USING DIFFUSION MODEL FOR MITIGATING ERROR PROPAGATION |
2047 | Avoiding Bias While Pruning Neural Networks: The Case of Image Classification |
1552 | BAIT: A NEW DNN BACKDOOR ATTACK USING INPAINTED TRIGGERS |
1567 | Batch-Aware Active Learning for Object Detection |
1755 | BAYESIAN SURPRISE FOR SMALL AND SUB-PIXEL MOVING TARGET DETECTION |
2789 | BD OPEN LULC MAP: HIGH-RESOLUTION LAND USE LAND COVER DATASET & BENCHMARK RESULTS FOR DEVELOPING CITY: DHAKA, BD |
2793 | Beta Wavelet Induced Multi-Scale Kernel Clustering: A Frequency-Aware Framework For Complex Data Analysis |
2779 | BEVANET: BILATERAL EFFICIENT VISUAL ATTENTION NETWORK FOR REAL-TIME SEMANTIC SEGMENTATION |
1832 | BicycleDualNet: BicycleGAN-powered Dual Encoder Network for Single Image 3D Reconstruction |
1952 | Bidirectional Flow Fields for Sparse Input Novel View Synthesis of Dynamic Scenes |
1675 | BIOVL-QR: EGOCENTRIC BIOCHEMICAL VISION-AND-LANGUAGE DATASET USING MICRO QR CODES |
1824 | Bits-to-Photon: End-to-End Learned Scalable Point Cloud Compression for Direct Rendering |
2370 | BLAZE: A DATASET FOR WILDFIRE AND BURNT AREA UAV IMAGE CLASSIFICATION AND SEGMENTATION |
2078 | BLIND DENOISING USING DENSE IN DENSE NETWORK WITH ATTENTION MODULE |
1914 | BLIND MULTI-MODE PTYCHOGRAPHY USING A DISTRIBUTED PROBE ESTIMATE |
2649 | BOOSTED AFFINE MOTION COMPENSATION FOR GEOMETRIC PARTITIONING MODE |
1036 | BOOSTING TEXT-TO-IMAGE PERSON RE-IDENTIFICATION WITH GENERATIVE HARD NEGATIVE |
2675 | BreathAI: Transfer Learning-Based Thermal Imaging for Automated Breathing Pattern Recognition |
2479 | BRIDGING DOMAIN SHIFTS THROUGH SELF-CONTRASTIVE LEARNING AND DISTRIBUTION ALIGNMENT |
1899 | BSRPCA: A SIMPLIFIED BLIND SUPER-RESOLVED RPCA-BASED APPROACH FOR ENHANCING BLOOD FLOW ESTIMATION |
2686 | CADOT: CITYSCAPE AERIAL IMAGE DATASET FOR OBJECT DETECTION |
1804 | CAG: Context-conditional 2D Affordance Generation |
2830 | CAN LARGE LANGUAGE MODELS CHALLENGE CNNS IN MEDICAL IMAGE ANALYSIS? |
2508 | CATEGORY-DEPENDENT LEARNED IMAGE COMPRESSION FOR SMARTPHONE PHOTOGRAPHY WITH STANDARD-COMPLIANT DECODERS |
2556 | CERTAINTY AND UNCERTAINTY GUIDED ACTIVE DOMAIN ADAPTATION |
2784 | CGD-MAE: CLIP DISTILLATION-DRIVEN PRE-TRAINING FRAMEWORK FOR VEHICLE RE-IDENTIFICATION |
1582 | CHTMAE: Cross-modal Hierarchical Temporal-spatial Masked Autoencoder Model for Micro-expression Recognition |
2632 | CHUG: CROWDSOURCED USER-GENERATED HDR VIDEO QUALITY DATASET |
2410 | CLIP-AE: CLIP-assisted Cross-view Audio-Visual Enhancement for Unsupervised Temporal Action Localization |
1368 | CLIP-FSQAE: CLIP-guided Finite Scalar Quantized AutoEncoder for Few-Shot Anomaly Detection |
2646 | Cloud Optical Thickness Retrievals Using Angle Invariant Attention Based Deep Learning Models |
1181 | Cluster Contrast for Unsupervised Visual Representation Learning |
1701 | CMP: COMPOSABLE META PROMPT FOR SAM-BASED CROSS-DOMAIN FEW-SHOT SEGMENTATION |
2069 | CMTM: CROSS-MODAL TOKEN MODULATION FOR UNSUPERVISED VIDEO OBJECT SEGMENTATION |
1322 | CODING-INFORMATION BASED IMPROVEMENT FOR IN-LOOP FILTERS BEYOND VVC |
2132 | ColorGPT: Automatic Colorization with Generative Prompts and Transformer |
1863 | Combination Test of NNVC Tools and NN-Inter in VVC |
1624 | COMMON AND UNIQUE REPRESENTATION DEEP EMBEDDED CLUSTERING |
1394 | COMPACT LATENT REPRESENTATION FOR IMAGE COMPRESSION (CLRIC) |
1196 | Compressing Human Body Video with Interactive Semantics: A Generative Approach |
1449 | Compressing multi-scale features with a channel-shrinked single-branch architecture |
2104 | Conditional Diffusion Transformer for Unified Distortion Correction and Rectification |
1717 | CONFIDENCE-AWARE AGGLOMERATION CLASSIFICATION AND SEGMENTATION OF 2D MICROSCOPIC FOOD CRYSTAL IMAGES |
1881 | CONSISTENT VIEW SYNTHESIS WITH BIDIRECTIONAL EPIPOLAR ATTENTION AND RECONSTRUCTION |
2366 | Constrained GAN-Generated X-ray CT Data for Self-Supervised and Foundation-Model Segmentation of Concrete Microstructures |
2321 | CONTEXTLOSS: CONTEXT INFORMATION FOR TOPOLOGY-PRESERVING SEGMENTATION |
1818 | CONTINUOUS ACTION UNIT INTENSITY MODELING FOR MICRO-EXPRESSION RECOGNITION |
2197 | ConvFuse: A Progressive ConvFormer Network for Context-Aware Multisensor Image Fusion |
2317 | CorDis: A novel correlation-based disentanglement measure |
2468 | COT-AD: COTTON ANALYSIS DATASET |
2534 | CrossDR: Bridging 2D and 3D Features for Diabetic Retinopathy Classification Using Context-Aware Cross-Attention |
2465 | CROSS-MODAL ATTENTION WITH ADAPTIVE AND HIERARCHICAL FUSION FOR ROBUST RGB-T IMAGE SEGMENTATION FOR SAFE DRIVING |
1615 | CROSS-MODALITY ABDOMINAL MULTI-ORGAN SEGMENTATION VIA SOURCE-FREE UNSUPERVISED DOMAIN ADAPTATION |
1642 | CTU-LEVEL RATE CONTROL WITH λ OPTIMIZATION BASED ON VISUAL GAZE MECHANISM FOR 360-DEGREE VERSATILE VIDEO CODING |
1477 | CURVE: CLIP-UTILIZED REINFORCEMENT LEARNING FOR VISUAL IMAGE ENHANCEMENT VIA SIMPLE IMAGE PROCESSING |
1382 | Cwc-DNeRF: Compact Dynamic Neural Radiance Field via Discrete Wavelet Transform and Learnable Codebooks |
1565 | CYTOFUSION: A LATENT DIFFUSION-BASED FRAMEWORK FOR CYTOLOGY CLASSIFICATION |
1541 | D2TR: Sea Clutter Suppression via Dynamic Dual-tree Complex Wavelet Selection and Target-Guided Regularization |
2109 | DAAFNET: DOMAIN ADAPTIVE AUGMENTED FEATURE NETWORK FOR BIOSIGNAL BASED EMOTION RECOGNITION |
1559 | DARK COUNT REMOVAL IN PHOTON-COUNTING SPAD ARRAYS |
2005 | DARTs: Deformable Animation Ready Templates for Clothing Humans |
1784 | DATA-DRIVEN RECURSIVE INTRA PREDICTION |
1253 | DBF-Net: A Dual-Branch Network with Feature Fusion for Ultrasound Image Segmentation |
2822 | DCM-VIDEONET: A DENSELY-CONNECTED MODULATED DECODER FRAMEWORK FOR IMPLICIT NEURAL VIDEO COMPRESSION |
1916 | DEEP OBJECT RECOGNITION-BASED ANALYSIS OF DIVERSE CULINARY LANDSCAPES |
2689 | DEEP UNFOLDING-BASED IMAGE RECONSTRUCTION FOR QUANTA IMAGE SENSORS |
1781 | DEEP UNSUPERVISED DESPECKLING WITH UNBIASED RISK ESTIMATION |
2703 | Deformable Shape Registration from Inexact Correspondences |
1309 | DEFORMABLE SPHERICAL GEOMETRY TRANSFORMER FOR PANORAMIC SEMANTIC SEGMENTATION |
2427 | DEPTH-AWARE SCORING AND HIERARCHICAL ALIGNMENT FOR MULTIPLE OBJECT TRACKING |
1387 | DETECTING AND MITIGATING INCOHERENT INPUT OF LATENT DIFFUSION MODELS |
2300 | DETECTION OF PAVEMENT DEFECTS ON ROADS USING A MULTIMODAL YOLOV8 WITH IMAGE AND IMU DATA |
1650 | DETECTION OF SCREEN USAGE DURING EATING EVENTS AMONG PRESCHOOL-AGED CHILDREN |
2611 | DFT GAZE: DISTILLED AND FINE-TUNED GAZE ESTIMATION FOR PERSONALIZATION ON TINY DEVICES |
2325 | DGRGaze: A difference-guided gaze estimation framework based on 6D rotation matrix representation |
2036 | DIFFDEMORPH: EXTENDING REFERENCE-FREE DEMORPHING TO UNSEEN FACES |
2090 | Diffuse and Refine Latent Prior with Transformers for Neural ISP |
2507 | DIFFUSE2ADAPT: CONTROLLED DIFFUSION FOR SYNTHETIC-TO-REAL DOMAIN ADAPTATION |
2289 | DIFFUSION BASED SHAPE-AWARE LEARNING WITH MULTI-SCALE CONTEXT FOR SEGMENTATION OF TIBIOFEMORAL KNEE JOINT TISSUES: AN END-TO-END APPROACH |
2021 | DIFFUSION PRETRAINING FOR GAIT RECOGNITION IN THE WILD |
2569 | Diffusion to Confusion: Naturalistic Adversarial Patch Generation Based on Diffusion Model for Object Detector |
2163 | DIFFUSION-BASED CT IMAGE SEGMENTATION FOR INTRACEREBRAL HEMORRHAGE |
1963 | Direction-Emphasizing Transformer for Road Extraction from Optical Remote Sensing Imagery |
1048 | DISCO: A Diffusion Model for Spatial Transcriptomics Data Completion |
2011 | DISCRETE DIFFUSION PROPAGATED TRANSFORMER FOR FLEXIBLE URETEROSCOPIC SEMANTIC SEGMENTATION |
1335 | DISENEMO: LEARNING DISENTANGLED EMOTIONAL REPRESENTATION FROM FACIAL MOTION FOR 3D TALKING HEAD GENERATION |
1416 | DIVA-VQA: Detecting Inter-frame Variations in UGC Video Quality |
2196 | DIVERSIFYING HUMAN POSE IN SYNTHETIC DATA FOR AERIAL-VIEW HUMAN DETECTION |
1455 | DIVERSITY-DRIVEN GENERATIVE DATASET DISTILLATION BASED ON DIFFUSION MODEL WITH SELF-ADAPTIVE MEMORY |
2357 | DIVIDE-AND-SUMMARIZE: ENHANCING DEEP NEURAL VIDEO SUMMARIZATION |
2275 | DLP-YOLOV9: MODEL WITH FEWER PARAMETERS AND HIGHER PRECISION BASED ON IMPROVED YOLOV9 IN DRONE-CAPTURED-SCENARIOS |
1345 | DMSO: A DYNAMIC MOMENTUM-SMOOTHING OPTIMIZER FOR LEARNED IMAGE COMPRESSION |
2387 | DOMAIN TRANSFER GENERATIVE MODEL FOR NEW FACE GENERATION |
2272 | DPM-CLIP: ZERO-SHOT MULTIMODAL EGOCENTRIC ACTIVITY RECOGNITION BASED ON DUAL-PREDICTION MECHANISM |
1653 | DP-Net: A 3D Dilated Projection Framework for Precise Fetal Brain Tissue Segmentation |
2438 | DSFACE: CONDITIONAL DIFFUSION INPAINTING FOR SKETCH-TO-FACE SYNTHESIS |
2397 | DTLS-Inpaint: Yet Another Efficient Image Inpainting with Domain Transfer |
1975 | DUAL STREAM NETWORKS FOR 3D HUMAN POSE AND SHAPE ESTIMATION |
2683 | DUAL-STREAM SPATIO-TEMPORAL ACCIDENT ANTICIPATION AND DETECTION |
1891 | DYNAMIC 3D GAUSSIAN RECONSTRUCTION WITH SPECULAR REFLECTION |
2497 | DYNAMIC MESH CODING USING EDGE LENGTH-BASED ADAPTIVE SUBDIVISION |
1609 | DYNAMIC MESH CODING WITH TEMPORALLY CONSISTENT UV ATLAS GENERATION |
2060 | Edge-Guided Monocular Absolute Depth Estimation with Diffusion-Based Refinement |
1917 | EF2LANE: ENHANCED FEATURE FUSION 2D LANE DETECTION NETWORK IN 3D POINT CLOUD |
1737 | Efficient Asymmetric Shared Low-Rank Adaptation based on Selective Scanning Vision Mamba for Medical Imaging Analysis |
2645 | EFFICIENT ATLAS GENERATION FOR MEDICAL IMAGING VIA GROUPWISE LATENT DIFFUSION MODELS |
2295 | EFFICIENT CONSTRAINING OF TRANSCODING IN DNA-BASED IMAGE STORAGE |
2833 | EFFICIENT FEATURE-GUIDED APPROACH FOR IMAGE RESTORATION |
1511 | EFFICIENT IMPLICIT NEURAL REPRESENTATIONS FOR VIDEOS WITH FEATURE MODULATION |
2447 | EFFICIENT LEAF DISEASE CLASSIFICATION AND SEGMENTATION USING MIDPOINT NORMALIZATION TECHNIQUE AND ATTENTION MECHANISM |
2627 | EFFICIENT RANDOM ACCESS METHOD USING SEED AND INTER-KEY FRAMES FOR NEXT GENERATION VIDEO CODEC |
1403 | Efficient Text-to-Image Generation: An Adaptive Step Schedule Controller for Diffusion Models |
2086 | EmoMamba: Advancing Dynamic Facial Expression Recognition with Visual and Textual Fusion |
1635 | Enabling Controllable, Identity Preserving, Non-Rigid Edits in Human-Centric Images |
1338 | ENACT: ENTROPY-BASED CLUSTERING OF ATTENTION INPUT FOR REDUCING THE COMPUTATIONAL NEEDS OF OBJECT DETECTION TRANSFORMERS |
2333 | Energy Efficiency of Video Quality Assessment Metrics |
2529 | Energy-Based Generative Models with Morphological Attention Networks for Hyperspectral Image Classification: A Unified Framework |
2158 | ENHANCED FRAME CONTEXT INITIALIZATION FOR VIDEO CODING BEYOND AV1 |
2622 | ENHANCED MULTI-SCALE NETWORK FOR SINGLE IMAGE SUPER RESOLUTION |
2694 | ENHANCING 3D SCENE REPRESENTATION WITH STRUCTURAL DISSIMILARITY-AWARE LEARNING |
2250 | ENHANCING ADVERSARIAL ROBUSTNESS OF FOUNDATION MODELS WITHOUT DATA CENTRALIZATION |
1370 | Enhancing Autonomous Driving Perception under Complex Weather Conditions through CycleGAN-based Driving Scene Generation |
1538 | Enhancing Image Deraining through VLM-Based Data Refinement and Classification |
2443 | Enhancing Medical Vision-Language Models with Rich Textual Descriptions and Multiple Alignments for Chest X-Ray Diagnosis |
1663 | ENHANCING MULTISCALE FEATURE REPRESENTATION FOR OBJECT-LEVEL RECOGNITION IN MASKED IMAGE MODELING |
2740 | ENHANCING MULTI-TASK LEARNING WITH ATTENTION MECHANISMS |
2797 | Enhancing Unsupervised Domain Adaptation in Semantic Segmentation through Selective Consensus and Gaussian Mixture Model-Based Pseudo-Labeling |
1812 | Enhancing Visual Question Answering via Clustered In-Context Sequence Configuration |
1104 | Enhancing Visual Re-ranking through Denoising Nearest Neighbor Graph via Continuous CRF |
1709 | EQUR: Equivariant Uncertainty Quantification and Refinement for Point Cloud Registration |
1954 | ERP-AWARE TEXT-TO-360 PANORAMA DIFFUSION MODEL |
2816 | ERPGS: EQUIRECTANGULAR IMAGE RENDERING ENHANCED WITH 3D GAUSSIAN REGULARIZATION |
2344 | Error correction for DNA-based image storage |
1222 | ESTIMATING VIRTUAL CAMERA FOV TO REDUCE PERSPECTIVE SHAPE DISTORTION IN 2D-TO-3D FACE RECONSTRUCTION |
1761 | ESwinDNet: Image Demoiréing Using Multiscale Swin Transformer Layers |
2579 | EVENT DENOISING BASED ON ITERATIVE TREE-STRUCTURED INFORMATION AGGREGATION |
2550 | EVENT-BASED EGOCENTRIC HUMAN POSE ESTIMATION IN DYNAMIC ENVIRONMENT |
1965 | EventEgoHands: Event-based Egocentric 3D Hand Mesh Reconstruction |
2320 | Event-Guided Motion Deblurring with Wavelet-Based Cross-Modal Feature Fusion |
2570 | ExDF: Explainable Deepfake Detection with Vision-Language Model |
1733 | Exploring Effective Unfolding Covering Prompt Tuning for Vision Mamba |
2266 | EXPLORING THE POTENTIAL OF VISION-LANGUAGE MODELS FOR PURE-IMAGE AND TEXT-GUIDED-IMAGE SALIENCY PREDICTION |
2023 | EXTENSION OF SEMI-DECOUPLED PARTITIONING IN INTER FRAMES |
2230 | EXTENSION OF SOUND FIELD IMAGE DENOISING TO HIGH-FREQUENCY SOUND FIELDS BY CONSIDERING WAVENUMBER SPECTRAL LOSS |
1704 | EXTENSIONS OF MORPHOLOGICAL GRADIENT FOR HYPERSPECTRAL IMAGES |
2765 | EYES AND EARS: AUTOMATED ANNOTATION OF AUDIO DATA USING COMPUTER VISION |
1754 | F2T2-HiT: A U-Shaped FFT Transformer and Hierarchical Transformer for Reflection Removal |
2234 | FACELIVT: FACE RECOGNITION USING LINEAR VISION TRANSFORMER WITH STRUCTURAL REPARAMETERIZATION FOR MOBILE DEVICE |
2274 | FACIAL IDENTITY EDITING: TOWARDS EFFECTIVE DE-IDENTIFICATION |
2500 | Facilitate and scale up the creation of 3D meshes, 6D category-based datasets and grasping with generative models: GenVegeFruits3D |
2547 | FAST AND ACCURATE OUTLIER-AWARE LIDAR SUPER-RESOLUTION FOR SLAM APPLICATIONS |
1218 | Fast Bounding Box Hierarchy |
2073 | FAST IMAGE VECTOR QUANTIZATION USING SPARSE OBLIQUE REGRESSION TREES |
2705 | Fast Iterative Enhancement for Image Signal Processing |
2734 | FC-RENDER: ADAPTIVE FONT- AND COLOR-AWARE TEXT DIFFUSION MODEL |
2135 | FEW-SHOT CLASS-INCREMENTAL LEARNING FOR EFFICIENT SAR AUTOMATIC TARGET RECOGNITION |
2449 | FGA-NN: Film Grain Analysis Neural Network |
1772 | FINE-GRAINED SPATIAL-TEMPORAL PERCEPTION FOR GAS LEAK SEGMENTATION |
2102 | F-LGAM: Enhancing Single Domain Generalized Object Detection Through Fourier-based Local and Global Amplitude MixUp |
1189 | FMG-DET: FOUNDATION MODEL GUIDED ROBUST OBJECT DETECTION |
2155 | FOUNDATION MODEL-BASED DEFORMABLE REGISTRATION OF MULTI-MODAL REMOTE SENSING IMAGES |
1970 | FPW: Frequency-domain Pixel-by-Pixel Watermarking against unauthorized images used on training generative model |
1868 | Frequency Aware Learned Image Compression Using Swin Transformer and Discrete Wavelet Transform |
1983 | FREQUENCY-GUIDED CONTEXTUAL IMAGE CAPTIONING |
1049 | Frozen Network Few-Shot Object Detection |
2106 | FSAC-IA: A HIERARCHICAL CONSTRUCTED SAC-IA ALGORITHM FOR POINT CLOUD ALIGNMENT ACCELERATION |
1775 | GARMENT DE-WARPING FOR VIRTUAL TRY-ON IN THE WILD |
2055 | GEE-UOD: AN UNDERWATER OBJECT DETECTION NETWORK BASED ON GLOBAL AND EDGE INFORMATION ENHANCEMENT |
2543 | Generative Face Video Compression Using Depth Estimation and Compressed Sensing |
2794 | Generative Personalized Blind Face Restoration Enhanced by Physical Identity |
2327 | GEOMETRIC MEAN IMPROVES LOSS FOR FEW-SHOT LEARNING |
1612 | GEOMETRY PARAMETRIZATION STABILIZATION FOR DYNAMIC MESH CODING |
1630 | GEOMETRY REGULARIZED POINT CLOUD AUTOENCODER |
1855 | GEOSCALER: GEOMETRY AND RENDERING-AWARE DOWNSAMPLING OF 3D MESH TEXTURES |
2773 | GHS-VDG: Graph and Hybrid Spatio-Temporal Attention for Video Diffusion Generation |
1629 | GIP: Gated Interaction Prompt for Parameter Efficient Vision-Language Fine-Tuning |
1557 | GlioSurvNet: Multimodal Survival Prediction for Glioblastoma Using Deep Learning and Clinical Variables from Brain MRI |
1605 | GMAR: GRADIENT-DRIVEN MULTI-HEAD ATTENTION ROLLOUT FOR VISION TRANSFORMER INTERPRETABILITY |
1866 | GMD: A MULTIMODAL FRAMEWORK FOR AI-GENERATED MISINFORMATION DETECTION |
2004 | GMOT-Mamba: Mamba-Based Model Prediction for Generic Multiple Object Tracking |
1179 | GOP-LEVEL ADAPTIVE RESAMPLING WITH CNN-BASED SUPER RESOLUTION |
1780 | GRAFT-XPCI: Dataset of synchrotron X-ray images for detection of acute cellular rejection after heart transplantation |
2531 | GRAPH CONVOLUTIONAL NETWORK AGGREGATION FOR BROAD-SPECTRAL OBJECT DETECTION |
2000 | GRID-LOGAT: GRID BASED LOCAL AND GLOBAL AREA TRANSCRIPTION FOR VIDEO QUESTION ANSWERING |
2524 | GROUP JOINT INDEPENDENT COMPONENT ANALYSIS (GROUP jICA): A NOVEL METHOD TO JOINTLY DECOMPOSE AND LINK SIMULTANEOUS EEG AND FMRI |
1163 | GTA-Crime: A Synthetic Dataset and Generation Framework for Fatal Violence Detection with Adversarial Snippet-Level Domain Adaptation |
1295 | GUIDED DETAIL FILTER FOR AVM |
1724 | GUIDED DIFFUSION FOR CLASS-CONDITIONED SYNTHESIS & CLASSIFICATION OF MICROSCOPIC BLOOD CELL IMAGES |
2007 | Handling Multiple Hypotheses in Coarse-to-Fine Dense Image Matching |
2204 | HARDWARE FRIENDLY MULTI-HYPOTHESIS CROSS COMPONENT PREDICTION |
2074 | HARNESSING FEATURE DISTRIBUTION CONSISTENCY FOR FEDERATED LEARNING WITH NOISY LABELS |
1838 | HARNESSING THE POWER OF LLMS FOR IMAGE AESTHETICS ASSESSMENT THROUGH SEMANTIC AND CONTEXTUAL UNDERSTANDING |
2623 | HIGH-FREQUENCY SEMANTIC ENHANCEMENT IN COMPRESSED SCENARIOS FOR ROBUST VISUAL AND MACHINE VISION APPLICATIONS |
2687 | Holism to Atomism: Enhancing the Vision-Language Alignment for Cross-Modal Few-Shot Learning |
2595 | Holistic Coreset Selection for Data Efficient Image Quality Assessment |
1710 | HSBS: COMPREHENSIVE BOOSTING OF FACIAL EXPRESSION RECOGNITION VIA HIERARCHICAL SEMANTIC AND BATCH-WISE SIMILARITY |
1859 | ICP-3DGS: SfM-Free 3D Gaussian Splatting for large-scale unbounded scenes |
2165 | ID-TTA: Classifier-Free Test Time Adaptation for Metric Learning |
2368 | iHDR: Iterative HDR Imaging with Arbitrary Number of Exposures |
1993 | Illumination Spectrum Estimation for Multispectral Images using Illuminant Prior |
1377 | IMAGE MOTION BLUR REMOVAL IN THE TEMPORAL DIMENSION WITH VIDEO DIFFUSION MODELS |
1504 | IMPLICIT OBJECT RECOGNITION VIA REINFORCEMENT LEARNING IN OUT-OF-DOMAIN SCENARIOS |
2293 | IMPROVE REAL-TIME FLOOD SEGMENTATION BY ENCODING AND DISTILLING FOREGROUND INFORMATION |
1495 | IMPROVED CERVICAL CELL DETECTION MODEL BASED ON HYBRID-DOMAIN FEATURE PYRAMID NETWORK |
1786 | IMPROVED UNET++ BASED ON KOLMOGOROV-ARNOLD CONVOLUTIONS |
1756 | IMPROVING NOVEL VIEW SYNTHESIS OF 360° SCENES IN EXTREMELY SPARSE VIEWS BY JOINTLY TRAINING HEMISPHERE SAMPLED SYNTHETIC IMAGES |
1164 | IMPROVING OPEN-WORLD CLASS-AGNOSTIC OBJECT DETECTORS VIA FEATURE DISTILLATION WITH STUDENT-AWARE ADAPTATION |
2787 | IMPROVING PSEUDO-LABELS SELECTION USING DOMAIN PRIORS FOR SEMI-SUPERVISED DETECTION IN CAPSULE ENDOSCOPY |
1932 | IMPROVING THE PERFORMANCE OF COMPRESSIVE SPECTRAL IMAGING WITH BAYER COLOR FILTER ARRAY |
1751 | IMPROVING YOLOV8 FOR FAST FEW-SHOT OBJECT DETECTION BY DINOV2 DISTILLATION |
2116 | IN2OUT: FINE-TUNING VIDEO INPAINTING MODEL FOR VIDEO OUTPAINTING USING HIERARCHICAL DISCRIMINATOR |
2461 | Investigating Data Replication in Medical Synthetic Image Generation with Diffusion Models |
2451 | INVESTIGATING ROBUSTNESS OF UNSUPERVISED STYLEGAN IMAGE RESTORATION |
2441 | IS PERTURBATION-BASED IMAGE PROTECTION DISRUPTIVE TO IMAGE EDITING? |
1344 | ITERATIVE SELF-IMPROVEMENT OF VISION LANGUAGE MODELS FOR IMAGE SCORING AND SELF-EXPLANATION |
2108 | IterDiff: Training-Free Iterative Face Editing via Efficient CLIP-guided Memory Bank |
1759 | J-CAPA: JOINT CHANNEL AND PYRAMID ATTENTION IMPROVES MEDICAL IMAGE SEGMENTATION |
1536 | JOINT GEOMETRY-ATTRIBUTE POINT CLOUD COMPRESSION WITH SPATIAL CONTEXT MINING AND DUAL-CLASS ATTRIBUTE LOSS |
2661 | Joint optimization of primary and secondary transforms using rate-distortion optimized transform design |
1034 | JUDGING FROM SUPPORT-SET: A NEW WAY TO UTILIZE FEW-SHOT SEGMENTATION FOR SEGMENTATION REFINEMENT PROCESS |
2091 | KNOWLEDGE IS WHAT YOU NEED FOR ACTIVE OBJECT TRACKING |
2660 | KNOWLEDGE REFINEMENT FOR UNSUPERVISED LIFELONG PERSON RE-IDENTIFICATION |
2035 | LANGVISION-LORA-NAS: NEURAL ARCHITECTURE SEARCH FOR VARIABLE LORA RANK IN VISION LANGUAGE MODELS |
2799 | LAPLACE-MAMBA: MAMBA-BASED LAPLACE PYRAMID ENHANCEMENT NETWORK FOR UNPAIRED ULTRA-HIGH-DEFINITION IMAGE |
2655 | LARGE VISION-LANGUAGE MODELS ARE GENERALIST SOLVERS FOR PATHOLOGY TASKS |
2770 | LARGESCENEGAUSSIAN: HIGH-EFFICIENCY 3D GAUSSIAN SPLATTING FOR LARGE-SCALE SCENE RECONSTRUCTION |
2398 | LEARNED HYBRID VIDEO CODING FOR HUMAN PERCEPTION AND MULTIPLE MACHINE VISION TASKS |
2094 | LEARNED VIDEO COMPRESSION WITH SPATIAL CORRELATION PRIORS AND HIERARCHICAL TEMPORAL ATTENTION |
2184 | LEARNING FROM PU DATA USING DISENTANGLED REPRESENTATIONS |
2348 | Learning Geometry-Aware Representation for Gaze Estimation |
2472 | LEMORE: LEARN MORE DETAILS FOR LIGHTWEIGHT SEMANTIC SEGMENTATION |
2065 | LEVERAGING COMPLEMENTARY ATTENTION MAPS IN VISION TRANSFORMERS FOR OCT IMAGE ANALYSIS |
2771 | LEVERAGING DEPTH FOUNDATION MODELS IN SELF SUPERVISED MONOCULAR DEPTH ESTIMATION |
2395 | LIFT-PCAC: LIFTING BASED POINT CLOUD ATTRIBUTE COMPRESSION |
2599 | LIGHTWEIGHT IMAGE SUPER-RESOLUTION PREPROCESSOR FOR JPEG COMPRESSION |
2831 | LIGHTWEIGHT TEMPORAL CONTEXTUAL FINE-TUNING METHOD OF LARGE MULTIMODAL MODEL FOR VIDEO MOMENT RETRIEVAL |
2453 | LINEA: FAST AND ACCURATE LINE DETECTION USING SCALABLE TRANSFORMERS |
2198 | LONG-SHORT EXPOSURE FUSION WITH EVENT DATA FOR LOW-LIGHT VIDEO ENHANCEMENT |
2650 | LORENTZ TRANSFORMATION NEURAL NETWORK |
1936 | LOW-RANK ADAPTATION OF PRE-TRAINED VISION BACKBONES FOR ENERGY-EFFICIENT IMAGE CODING FOR MACHINES |
2696 | Machine Learning-Based Decoding Energy Modeling for VVC Streaming |
2711 | MAMBA-BASED GLOBAL CORRELATION LEARNING FOR LIGHT FIELD SPATIAL SUPER-RESOLUTION |
1293 | Mamba-SF: Monocular Scene Flow Learning with State Space Models |
2087 | MCM: A MULTI-AGENT COLLABORATIVE MULTIMODAL FRAMEWORK FOR TRADITIONAL CHINESE MEDICINE DIAGNOSIS |
2471 | MEASURING DISTORTION STRENGTH WITH DEWARPING DIFFUSION MODELS IN ANOMALY DETECTION |
1141 | MEDKI: KNOWLEDGE DUAL INJECTIONS FOR MEDICAL VISUAL QUESTION ANSWERING |
2342 | METAH2: A SNAPSHOT METASURFACE HDR HYPERSPECTRAL CAMERA |
1193 | METALWORK: A SYNTHETIC DATASET AND BASELINE FOR STEREO MATCHING OF METAL WORKPIECES |
1895 | METAREG: ROBUST CAMERA PARAMETER ESTIMATION BY LEVERAGING NOISY CAMERA EXTRINSICS |
1769 | MFA-NET: MOTION FIELD ADAPTIVE NETWORK FOR SKELETON-BASED ACTION RECOGNITION |
1138 | MFB-SAC: A MULTI-SCALE FREQUENCY AND BOUNDARY-ENHANCED SAM FOR CELL SEGMENTATION |
2151 | MGP-KAD: Multimodal Geometric Priors and Kolmogorov-Arnold Decoder for Single-View 3D Reconstruction in Complex Scenes |
2324 | Mirror Feature-Aware Generative Adversarial Network for RGB-T Salient Object Detection |
2157 | MM-IML: MULTI-MODAL IMAGE FORGERY DETECTION AND LOCALIZATION |
1202 | MMP-2k: A Benchmark Multi-labeled Macro Photography Image Quality Assessment Database |
1423 | MODALITY-AWARE DIFFUSION DISTILLATION NETWORK FOR SENTIMENT ANALYSIS IN MISSING MODALITIES |
2067 | MONSTR: Model-Oriented Neutron Strain Tomographic Reconstruction |
2020 | Mosaic-SR: An Adaptive Multi-step Super-Resolution Method for Low-Resolution 2D Barcodes |
2408 | MOTION-AWARE RECONSTRUCTION FOR VIDEO SNAPSHOT COMPRESSIVE IMAGING |
2139 | MOVING FORWARD WITH BWC: THE FALEB DATASET FOR MULTIMODAL IMAGE ANALYSIS |
1852 | MPEG EDGEBREAKER: AN EFFICIENT STATIC AND DYNAMIC MESH CODEC IN MPEG V-DMC |
2154 | MS-RAFT-3D: A Multi-Scale Architecture for Recurrent Image-based Scene Flow |
1316 | Multi-Branch Clothes-Agnostic Feature Learning for Cloth-Changing Person Re-identification |
2631 | MULTI-CLASS PART PARSING BASED ON MULTI-CLASS BOUNDARIES |
1687 | Multi-class Smoothed Hinge Loss Function in Pre-training for Transfer Learning |
1301 | MULTI-LEVEL AND MULTI-MODAL ACTION ANTICIPATION |
2419 | MULTI-LEVEL STATISTICAL MODEL GUIDANCE IMPROVES GENERALIZATION FOR BIOMETRIC SYNTHETIC FACE DETECTION |
1750 | MULTIMAE MEETS EARTH OBSERVATION: PRE-TRAINING MULTI-MODAL MULTI-TASK MASKED AUTOENCODERS FOR EARTH OBSERVATION TASKS |
1844 | Multimodal Cell Context Instruction Tuning for Conditional DNA Regulatory Sequence Generation with Large Language Models |
1170 | MULTIMODAL RE-RANKING FOR HETEROGENEOUS FACE RE-IDENTIFICATION |
2680 | Multimodal-LLM Agent for Text-driven Multi-Attribute Face Editing |
1674 | MULTI-RES-3DGS: MULTI-RESOLUTION 3D GAUSSIAN SPLATTING BOUND WITH A SUBDIVIDED MESH SEQUENCE |
2144 | Multi-scale Spatial-Frequency Features Representation and Learnable Cross Modal Feature Fusion in Deepfake Detection |
1695 | Multi-Teacher Knowledge Distillation for Efficient Object Segmentation |
2658 | MULTI-VIEW AMODAL INSTANCE SEGMENTATION BASED ON 3D REPRESENTATION |
1839 | NAR-DIFF: A NOISE-ADAPTIVE REFLECTANCE DIFFUSION MODEL FOR LOW-LIGHT IMAGE ENHANCEMENT |
1658 | NASSBLIF: NO-REFERENCE LIGHT FIELD IMAGE QUALITY ASSESSMENT VIA NEIGHBORHOOD ATTENTION AND SCALE SWIN |
1698 | NEIGHBOR-AWARE FEATURE-DRIVEN MOTION COMPENSATION FOR LEARNED VIDEO COMPRESSION |
2114 | NOISY LABEL REFINEMENT WITH SEMANTICALLY RELIABLE SYNTHETIC IMAGES |
2138 | NONLINEAR MODIFICATIONS OF TRANSFORM COEFFICIENTS IN VVC INTRA CODING |
2477 | NON-LOCAL N2V: IMPROVING N2V NETWORKS FOR SPATIALLY CORRELATED NOISE |
1875 | NON-RIGID MOTION CORRECTION FOR MRI RECONSTRUCTION VIA COARSE-TO-FINE DIFFUSION MODELS |
2593 | NON-UNIFORM ILLUMINATION IMAGE RESTORATION FOR DEEP-SEA EXPLORATION WITH A NEW SCATTERING MODEL |
1251 | NO-REFERENCE TEXTURED MESH QUALITY ASSESSMENT USING GRAPH-BASED FEATURES |
2653 | OBJECT DETECTION AND FRUIT TREE GROWTH STAGE IDENTIFICATION VIA YOLO WITH INVERTED AND SWIN TRANSFORMER BLOCKS |
1165 | Objective, Absolute and Hue-aware Metrics for Intrinsic Image Decomposition on Real-World Scenes: A Proof of Concept |
2227 | OBLIQUE DECISION TREES AS AN IMAGE MODEL FOR CUBIST IMAGE RESTYLING |
1968 | On The Discovery of Novel Targets for Compounds Using Cell Painting Imagery and Zero-Shot Learning |
2495 | ONE-STAGE FRAMEWORK FOR THYROID NODULE DETECTION WITH MIXUP AND NEGATIVE SAMPLE UTILIZATION |
1853 | Online Continual Learning of Diffusion Models: Multi-Mode Adaptive Generative Distillation |
1760 | OPENRR-1K: A SCALABLE DATASET FOR REAL-WORLD REFLECTION REMOVAL |
1871 | OPTIMAL TRANSPORT-BASED DOMAIN ALIGNMENT AS A PREPROCESSING STEP FOR FEDERATED LEARNING |
1519 | OPTIMIZED LEARNED IMAGE COMPRESSION FOR FACIAL EXPRESSION RECOGNITION |
2736 | ORGANOID-ICLIP: CLASS IMBALANCE-AWARE VISION-LANGUAGE LEARNING FOR ORGANOID MITOSIS CLASSIFICATION |
1320 | ORIENTED OBJECT DETECTION BASED ON COMPOSITE TRIGONOMETRIC FUNCTION CODER |
2278 | OUT-OF-DISTRIBUTION SAMPLE SELECTION GENERATED BY DIFFUSION MODEL TOWARD MODEL GENERALIZATION |
2571 | Overlooked Factors in Continual Zero-Shot Learning: Inflexible Semantic Prototypes, Simplistic Loss Functions, and SGD Noise |
2222 | PARALLEL-BASED FAST CODING MODE DECISION FOR INTRA CODING IN VVC SCC |
1173 | Patch-Wise Framework for Event-Based Monocular Depth Estimation |
2835 | PAWPRINT: WHOSE FOOTPRINTS ARE THESE? IDENTIFYING ANIMAL INDIVIDUALS BY THEIR FOOTPRINTS |
2008 | PDD-AGENT: MULTIMODAL LARGE LANGUAGE MODEL-DRIVEN AI AGENT FOR ENHANCED PLANT DISEASE DIAGNOSIS |
1874 | PERFACE: METRIC LEARNING IN PERCEPTUAL FACIAL SIMILARITY FOR ENHANCED FACE ANONYMIZATION |
2517 | PETSRS - A DATASET AND BENCHMARK FOR PET RECOGNITION ON A CLIMATE DISASTER SCENARIO |
2446 | PIT-QMM: A Large Multimodal Model for No-Reference Point Cloud Quality Assessment |
2052 | PIXELSHUFFLER: A SIMPLE IMAGE TRANSLATION THROUGH PIXEL REARRANGEMENT |
1102 | PLUG-AND-PLAY PRIORS AS A SCORE-BASED METHOD |
2737 | POLARIZATION DENOISING AND DEMOSAICKING: DATASET AND BASELINE METHOD |
2111 | POLICY GRADIENT-BASED OPTIMAL SUBSET SELECTION FOR FEW-SHOT VISION-LANGUAGE LEARNING |
1768 | POSE ESTIMATION OF ARTWORK CHARACTERS WITH SERIES AND PARALLEL DILATED CONVOLUTION AND STYLE CHANNEL ATTENTION |
2560 | POSE-FREE 3D GAUSSIAN SPLATTING VIA SHAPE-RAY ESTIMATION |
1607 | POSE-INVARIANT FACE RECOGNITION VIA FEATURE-SPACE POSE FRONTALIZATION |
1928 | POWER COST COMPARISON OF NEURAL-NETWORK COMPRESSION METHODS FOR SATELLITE IMAGERY |
1803 | PRIVACY-PRESERVING FACE RECOGNITION SCHEME BASED ON SECURE DATA STORAGE AND SECRET SPLITTING |
2415 | PROBABILISTIC SAMPLING WITH FROBENIUS NORM FOR ACTION RECOGNITION |
2181 | PSF-SRDN: Point Spread Function-Aware Speckle Reducing Diffusion Network |
1265 | QUANTA DIFFUSION |
2203 | QUANTA-SLOMO: SINGLE PHOTON CAMERA GUIDED 100X VIDEO FRAME INTERPOLATION |
2038 | QUANTUM-ENHANCED CANCER DETECTION FOR HISTOPATHOLOGIC IMAGES |
1797 | RADIUS-ALIGNED TRAINING AND ROTATED IOU METRICS FOR PEDESTRIAN DETECTION IN TOP-VIEW FISHEYE IMAGES |
2580 | RAPID OBJECT MODELING INITIALIZATION FOR VECTOR QUANTIZED-VARIATIONAL AUTOENCODER |
2568 | RATE-DISTORTION OPTIMIZATION WITH NON-REFERENCE METRICS FOR UGC COMPRESSION |
1447 | RATE-DISTORTION OPTIMIZED CHROMA QUANTIZATION FOR POINT CLOUD COMPRESSION |
1883 | RAVEN: RETHINKING ADVERSARIAL VIDEO GENERATION WITH EFFICIENT TRI-PLANE NETWORKS |
2523 | RAW: Region Attention-Weighted Guided Network with Inter-Region Exchange for AMD Grading |
1826 | ReACT: Reference-based Anime Colorization Transformer |
2548 | READING BETWEEN THE LINES: HOW EYE-TRACKING DATA CAN INFORM READING STRATEGIES FOR LARGE LANGUAGE MODELS |
2614 | REALISTIC SKIN TROUBLE SIMULATION VIA IMAGE GENERATION MODELS |
1410 | Real-Time Semantic Video Communication with Temporally Consistent and Controllable Diffusion Models |
2545 | REAL-TIME TRAFFIC ACCIDENT ANTICIPATION WITH FEATURE REUSE |
2186 | RECOVERING AND CLASSIFYING UPPER LIMB IMPAIRMENT TRAJECTORIES AFTER STROKE |
2089 | REMOTE SENSING TARGET DETECTOR WITH MULTI SCALE ATTENTION MECHANISM |
2205 | RE-PURPOSING SEGMENT ANYTHING FOR SKELETON ACTION LOCALIZATION |
1990 | RETHINKING IMAGE HISTOGRAM MATCHING FOR IMAGE CLASSIFICATION |
2731 | Reverse Distillation based Detection of Anomalies on a newly developed fabric dataset |
1254 | Reward Backdoor Attack on Text-to-Image Model Alignment |
1666 | REWARD-ADAPTATION: A NOVEL TEST-TIME ADAPTATION METHOD WITH REWARD MODEL |
2823 | RGC-BENT: A NOVEL DATASET FOR BENT RADIO GALAXY CLASSIFICATION |
2625 | Ridgeformer: Multi-Stage Contrastive Training For Fine-grained Cross-Domain Fingerprint Recognition |
2031 | RN-SAM: ROAD NETWORK-AIDED SAM OPTIMIZATION FOR ROAD SEGMENTATION IN SATELLITE IMAGERY |
1662 | ROBUST CHARACTER STROKE SEGMENTATION FOR DIVERSE FONTS VIA CONTOUR MATCHING AND CHAIN PROPAGATION |
2635 | ROBUST ESTIMATION OF BUMP HEIGHT FOR WAFER-LEVEL PACKAGING USING OPTICAL TRIANGULATION |
2638 | Robust Multi-Label Learning with Human-Guided and Foundation Model-Aided Crowd Framework |
1285 | Robust Multimodal Representation Learning with Information Bottleneck and Balanced Fusion for Alzheimer’s Disease Classification |
1065 | ROLLOUT-GUIDED TOKEN PRUNING FOR EFFICIENT VIDEO UNDERSTANDING |
2817 | RPPG-NDCL: UNSUPERVISED REMOTE PHYSIOLOGICAL MEASUREMENT VIA NOISE-DISENTANGLED CONTRASTIVE LEARNING |
2107 | RT-X Net: RGB-Thermal cross attention network for Low-Light Image Enhancement |
2304 | S3VD SELF-SUPERVISED SPATIAL VIDEO DOWNSAMPLING LOSS: A METHOD FOR TRAINING VIDEO FPN DENOISING NETWORKS |
2046 | SAM 2-DRIVEN SELF-TRAINING FOR MAMMOGRAM SEGMENTATION: ZERO-SHOT MASK GENERATION VIA PSEUDO-VIDEO |
1861 | SAW-MONODETR: SHAPE-AWARE ADAPTIVE WEIGHTED TRANSFORMER FOR MONOCULAR 3D OBJECT DETECTION |
2015 | Scalable Multi-view Clustering via Bipartite Graph Consensus Filtering |
1858 | SCIGS: 3D GAUSSIANS SPLATTING FROM A SNAPSHOT COMPRESSIVE IMAGE |
2428 | SCL-GAN: Spatially-Correlative Lightweight GAN for Efficient and High-Fidelity Thermal-Visible Face Synthesis |
1915 | SCRIBBLE-GUIDED DIFFUSION FOR TRAINING-FREE TEXT-TO-IMAGE GENERATION |
1258 | SDFCNET: A SPATIAL-DOMAIN AND FREQUENCY-DOMAIN COLLABORATIVE NETWORK FOR BUILDING EXTRACTION IN HIGH-RESOLUTION REMOTE SENSING IMAGES |
2511 | Segmentation for Early Tumor Detection in Mammograms via Temporal Discrepancy Analysis and Dynamic Loss Weighting |
1951 | Segment-Attention Augmented Dual-Contrastive Aggregation Learning for Unsupervised Visible-Infrared Person Re-identification |
1510 | SEMANTIC CONTEXT RE-MINING FOR MULTIMODAL GUIDED HUMAN-OBJECT INTERACTION DETECTION |
2276 | SEMANTIC PROTOTYPE-GUIDED SAMPLING FOR LONG-TAILED GENERALIZED CATEGORY DISCOVERY |
2032 | SEMANTICS-GUIDED GENERATIVE IMAGE COMPRESSION |
1171 | SEMI-SUPERVISED INFRARED MEIBOMIAN GLAND SEGMENTATION WITH INTRA-PATIENT REGISTRATION AND FEATURE SUPERVISION |
2382 | SENSOR DISTANCE LEARNING FOR CROSS-CAMERA COLOR CONSTANCY |
1807 | Session Class Prototype Incremental Learning (SCPIL): Mitigating Catastrophic Forgetting with Distance-Based Prototype Learning |
1543 | SFS-NERF: ENHANCING GEOMETRY CONSISTENCY IN FEW-SHOT NOVEL VIEW SYNTHESIS THROUGH SURFACE-AWARE NEURAL RENDERING |
1217 | SF-VQA: Saliency Fragments No-Reference Video Quality Assessment |
2255 | Shape Reconstruction of Foreground and Background in Scenes with Translucent Objects Based on Coding Curves |
2226 | Shuffle PatchMix Augmentation with Confidence-Margin Weighted Pseudo-Labels for Enhanced Source-Free Domain Adaptation |
1879 | SIAVATAR: ANIMATABLE 3D GAUSSIAN AVATAR FROM A SINGLE IMAGE |
2350 | SIGNWRITING FOR HANDSHAPE RECOGNITION IN SIGN LANGUAGE |
2566 | SIMILARITY NORMALIZATION AND STRONG GEOMETRIC AUGMENTATION FOR LOCAL FEATURE MATCHING UNDER LARGE SCALE AND ROTATION CHANGES |
1900 | Simple Zero-Shot Image Dehazing |
2411 | Single Snapshot Distillation for Phase Coded Mask Design in Phase Retrieval |
2692 | SINOGRAM INPAINTING WITH PHYSICS-GUIDED LATENT DIFFUSION MODEL FOR SYNCHROTRON LIGHT SOURCES |
2717 | SKETCH TO STYLIZED-IMAGE: TRAINING-FREE TWO-STAGE APPROACH FOR ARTISTIC IMAGE GENERATION FROM SKETCH |
2618 | SKIN CANCER CLASSIFICATION USING EXTENDED 5 CHANNEL (I-RGB-U) IMAGES GENERATED FROM RGB IMAGES |
2306 | SLICE: SYNTHETIC CAPTION-TRAINED LIGHTWEIGHT IMAGE CAPTIONER FOR EDGE DEVICES |
1546 | Sparse R-CNN OBB: Ship Target Detection in SAR Images Based on Oriented Sparse Learnable Proposals |
2806 | SPARSE2DGS: SPARSE-VIEW SURFACE RECONSTRUCTION USING 2D GAUSSIAN SPLATTING WITH DENSE POINT CLOUD |
1767 | SPARSITY-DRIVEN PARALLEL IMAGING CONSISTENCY FOR IMPROVED SELF-SUPERVISED MRI RECONSTRUCTION |
1125 | SPATIAL-SPECTRAL CONSISTENCY: A SEMI-SUPERVISED APPROACH FOR MULTISPECTRAL SCENE CLASSIFICATION |
2604 | SPC TO 3D: NOVEL VIEW SYNTHESIS FROM BINARY SPC VIA I2I TRANSLATION |
2642 | SPECTRAL MIXING AUGMENTATION FOR PREVENTING FALSE POSITIVES FROM HYPERSPECTRAL ANOMALY DETECTION |
1073 | SPECTRAL-AWARE GLOBAL FUSION FOR RGB-THERMAL SEMANTIC SEGMENTATION |
2506 | Splitter: Faster Inference Through Channel Partitioning and Feature Fusion |
2432 | STABLE-INVERTIBLE GRAPH CONVOLUTIONAL NETWORKS FOR LABEL-EFFICIENT SKELETON-BASED RECOGNITION |
1691 | STENCIL: SUBJECT-DRIVEN GENERATION WITH CONTEXT GUIDANCE |
1906 | ST-GRIT: SPATIO-TEMPORAL GRAPH TRANSFORMER FOR INTERNAL ICE LAYER THICKNESS PREDICTION |
2664 | STRUCTURED INSTRUCTION PARSING AND SCENE ALIGNMENT FOR UAV VISION-LANGUAGE NAVIGATION |
2364 | SWINSCALE-LFVS: PARALLEL FEATURE INTEGRATION FOR LIGHT FIELD VIEW SYNTHESIS |
2409 | TAGSIM: TOPIC-INFORMED ATTENTION GUIDED SIMILARITY METRIC FOR IMAGE CAPTION COMPARISON |
1919 | TARGET DRIVEN ADAPTIVE LOSS FOR INFRARED SMALL TARGET DETECTION |
1681 | TASK-SPECIFIC SPATIOTEMPORAL CONTEXT-AWARE DECOUPLING FOR OCCLUDED VIDEO OBJECT DETECTION |
1484 | TEACH ME SIGN: STEPWISE PROMPTING LLM FOR SIGN LANGUAGE PRODUCTION |
1589 | TERRAFLY-FORENSICS: A DATASET FOR FORENSIC DETECTION OF GENERATED MAP IMAGES WITH QUALITY ASSESSMENT OF GENERATIVE MODELS |
2403 | TerraScope: A Natural Land Cover Segmentation Dataset for High-Resolution Satellite Images with a Multi-Layer Cross Attention Hybrid Transformer |
1576 | TEST-TIME VOCABULARY ADAPTATION FOR LANGUAGE-DRIVEN OBJECT DETECTION |
2463 | Texture- and Shape-based Adversarial Attacks for Overhead Image Vehicle Detection |
1428 | TEXTURING ENDOSCOPIC 3D STOMACH VIA NEURAL RADIANCE FIELD UNDER UNEVEN LIGHTING |
2257 | TIME-EFFICIENT UNCERTAINTY ESTIMATION BASED ON TARGET NETWORKS IN DEEP REINFORCEMENT LEARNING |
1819 | TOROIDAL ADAPTIVE INTENSITY AND SPECTRUM UPDATING IMAGE RECONSTRUCTION FOR FOURIER PTYCHOGRAPHIC MICROSCOPY |
2435 | TOWARDS ALL-TIME, ALL-WEATHER FOD DETECTION THROUGH GENERATIVE AI |
1758 | TOWARDS CERTIFIED OBJECT DETECTORS: CERTIFIED RUNWAY DETECTION USING YOLO |
1045 | TOWARDS CONTROLLABLE REAL IMAGE DENOISING WITH CAMERA PARAMETERS |
2469 | Towards Dark-Field X-ray Microscopy Through Coherent Encoding |
1154 | TOWARDS EFFECTIVE AND ROBUST UNLEARNABLE EXAMPLES AGAINST OBJECT DETECTION |
2282 | Towards Image Copy Detection at E-commerce Scale |
2626 | TOWARDS ROBUST TEXT-GUIDED IMAGE COMPRESSION UNDER MODALITY MISSING |
2448 | Towards Test Time Adaptation in Low Dose Computed Tomography Denoising via Bias Modulation |
2248 | TRAINING A PHASE DETECTION AUTOFOCUS MODEL USING HYBRID LABELS |
1604 | TRANSDUCTIVE ONE-SHOT LEARNING MEET SUBSPACE DECOMPOSITION |
2590 | TRANSFORM SET MERGING FOR NEURAL NETWORK-BASED INTRA PREDICTION IN BEYOND VVC |
2786 | Transformer Augmented Multi-Resolution Hash Encoding in Diffusion Model for 3D Point Cloud Denoising |
2246 | TRIQA: IMAGE QUALITY ASSESSMENT BY CONTRASTIVE PRETRAINING ON ORDERED DISTORTION TRIPLETS |
2764 | TURBIT: GENERATING TURBID UNDERWATER IMAGES WITH DIFFUSION AND DIFFERENTIAL TRANSFORMERS |
1659 | TWO-STAGE FRAMEWORK FOR ENHANCED HYPERSPECTRAL ANOMALY DETECTION |
2776 | ULTRAFAST HIGH-FLUX SINGLE-PHOTON LIDAR SIMULATOR VIA NEURAL MAPPING |
1726 | UNLOCKING A NEW PARADIGM IN ROBUSTNESS FOR MULTI-STEP FACIAL FORGERY DETECTION |
2385 | Unraveling Vanishing Point and Calibrating Tiny Objects for Semantic Scene Completion |
2239 | UNROLLING NONCONVEX GRAPH TOTAL VARIATION FOR IMAGE DENOISING |
1668 | UNSUPERVISED DEEP SEMANTIC-PRESERVING TRIPLET HASHING VIA EFFICIENT DISTILLATION |
1930 | USER-IN-THE-LOOP VIEW SAMPLING WITH ERROR PEAKING VISUALIZATION |
1390 | VARIABLE RATE LEARNED WAVELET VIDEO CODING USING TEMPORAL LAYER ADAPTIVITY |
1828 | Veta-GS: View-dependent deformable 3D Gaussian Splatting for thermal infrared Novel-view Synthesis |
2079 | VIDA: Unsupervised Visible-to-Infrared Domain Adaptation for Object Detection Using Large Vision Language Model |
1115 | VIDEO INDIVIDUAL COUNTING WITH IMPLICIT ONE-TO-MANY MATCHING |
2219 | VIEWPOINT-DEPENDENT 3D VISUAL GROUNDING FOR MOBILE ROBOTS |
2834 | Visible-Infrared Person Re-Identification via Multi-Level Triple-Branch Learning |
1443 | Vision Language Model Interpretability with Concept Guided Decoding |
2561 | VISIONSCORES - A SYSTEM-SEGMENTED IMAGE SCORE DATASET FOR DEEP LEARNING TASKS |
2088 | VISUAL ENCODERS FOR GENERALIZED CHROMOSOME RECOGNITION |
2070 | VISUAL PROMPT AIDED SINGLE SHOT OBJECT PART SEGMENTATION |
1689 | VISUAL PROMPTING THROUGH IMAGE MINES |
1054 | VITA-PAR: VISUAL AND TEXTUAL ATTRIBUTE ALIGNMENT WITH ATTRIBUTE PROMPTING FOR PEDESTRIAN ATTRIBUTE RECOGNITION |
1716 | Watermarking Diffusion Models by Constructing Generative Classifiers |
1595 | WaveE2VID: FREQUENCY-AWARE EVENT-BASED VIDEO RECONSTRUCTION |
2137 | WAVELET PACKING FOR SELF-SUPERVISED MONOCULAR DEPTH ESTIMATION |
1776 | WEAKLY SUPERVISED DEFECT LOCALIZATION WITH RESIDUAL FEATURES |
1592 | WEAKLY-SUPERVISED NUCLEI SEGMENTATION INTEGRATING HYBRID DECODER AND GRAPH-BASED SPATIAL MODELING |
1434 | WEIGHTED AVERAGE PREDICTION FOR REGION ADAPTIVE HIERARCHICAL TRANSFORM IN SOLID GEOMETRY POINT CLOUD COMPRESSION |
1981 | When 512x512 Is Not Enough: Local Degradation-Aware Multi-Diffusion for Extreme Image Super-Resolution |
2549 | X265-PVMAF: A REAL-TIME PERCEPTUAL VIDEO QUALITY METRIC FOR HEVC VIDEO ENCODING |
1518 | YOLO-VG: ENHANCING MULTI-STAGE FEATURE INTERACTION FOR VISUAL GROUNDING |
2788 | ZERO-SHOT PSEUDO LABELS GENERATION USING SAM AND CLIP FOR SEMI-SUPERVISED SEMANTIC SEGMENTATION |
2194 | ρ-NERF: LEVERAGING ATTENUATION PRIORS IN NEURAL RADIANCE FIELD FOR 3D COMPUTED TOMOGRAPHY RECONSTRUCTION |