Ps and Qs: Quantization-Aware Pruning for Efficient Low Latency Neural Network Inference
Efficient machine learning implementations optimized for inference in hardware have wide-ranging benefits, depending on the application, from lower inference latency to higher data throughput and reduced energy consumption. Two popular techniques for reducing computation in neural networks are pruning and quantization. […]
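As a quick illustration of the two techniques named in the abstract, the sketch below combines PyTorch's built-in magnitude pruning with a hand-rolled uniform fake-quantizer so a layer is trained and evaluated with both effects active. This is not the authors' implementation; the layer name, 6-bit width, and 50% sparsity are illustrative assumptions only.

```python
# Minimal sketch (not the paper's code) of pruning + fake quantization in PyTorch.
# Bit width and sparsity below are arbitrary assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune


class FakeQuant(torch.autograd.Function):
    """Uniform symmetric fake quantization with a straight-through gradient."""

    @staticmethod
    def forward(ctx, w, bits):
        qmax = 2 ** (bits - 1) - 1
        scale = w.abs().max().clamp(min=1e-8) / qmax
        return torch.round(w / scale).clamp(-qmax, qmax) * scale

    @staticmethod
    def backward(ctx, grad_output):
        # Pass gradients straight through the rounding step.
        return grad_output, None


class QuantPrunedLinear(nn.Linear):
    """Linear layer whose (possibly pruned) weights are fake-quantized on the fly."""

    def __init__(self, in_features, out_features, bits=6):
        super().__init__(in_features, out_features)
        self.bits = bits

    def forward(self, x):
        w_q = FakeQuant.apply(self.weight, self.bits)
        return nn.functional.linear(x, w_q, self.bias)


layer = QuantPrunedLinear(16, 64, bits=6)
# Mask out the 50% smallest-magnitude weights; the mask persists through
# further training, so pruning and quantization are learned together.
prune.l1_unstructured(layer, name="weight", amount=0.5)

x = torch.randn(8, 16)
out = layer(x)  # forward pass sees pruned and quantized weights
print(out.shape, float((layer.weight == 0).float().mean()))
```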
Main authors: Hawks, Benjamin; Duarte, Javier; Fraser, Nicholas J.; Pappalardo, Alessandro; Tran, Nhan; Umuroglu, Yaman
Format: Online Article Text
Language: English
Published: Frontiers Media S.A., 2021
Online access:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8299073/
https://www.ncbi.nlm.nih.gov/pubmed/34308339
http://dx.doi.org/10.3389/frai.2021.676564
Similar items
- QONNX: Representing Arbitrary-Precision Quantized Neural Networks
  by: Pappalardo, Alessandro, et al.
  Published: (2022)
- A Synaptic Pruning-Based Spiking Neural Network for Hand-Written Digits Classification
  by: Faghihi, Faramarz, et al.
  Published: (2022)
- Random pruning: channel sparsity by expectation scaling factor
  by: Sun, Chuanmeng, et al.
  Published: (2023)
- A lightweight intrusion detection method for IoT based on deep learning and dynamic quantization
  by: Wang, Zhendong, et al.
  Published: (2023)
- Supply forecasting and profiling of urban supermarket chains based on tensor quantization exponential regression for social governance
  by: Li, Dazhou, et al.
  Published: (2022)